Dr. Christopher Kunz on nostr

Degraded data redundancy: 240248/360372 objects degraded (66.667%), 129 pgs degraded, 129 pgs undersized

you do not, in fact, love to see it. Thankfully I have backups for the important stuff.

Dr. Christopher Kunz @Dr. Christopher Kunz 1776413395

@nprofile1q... Holy crap, that gives me instant PTSD. Back in 2014-ish, when Ceph was still relatively new, we had a large outage in our production Ceph cluster that resulted in a similar outcome (fck those Intel SSDs). Luckily, we had an Inktank support contract, and Sage Weil personally wrote a Python script that helped reconstruct the data from the one remaining replica. It took about a week until all VMs in the cluster were running again, and I had customers on the phone who were literally in tears.