• catloaf@lemm.ee
    English
    arrow-up
    8
    ·
    2 months ago

    That’s a pretty common failure scenario in SANs. If you buy a bunch of drives, they’re almost guaranteed to come from the same batch, meaning they’re likely to fail around the same time. The extra load of a rebuild can kill drives that are already close to failure.

    That’s why SANs keep hot spares that can be allocated the moment a drive fails. You should also use a RAID level with enough redundancy to meet your reliability needs. And RAID is not a backup; you still need backups too.

    • kalleboo@lemmy.world
      English
      arrow-up
      1
      ·
      2 months ago

      That’s also why you need to schedule periodic parity scrubs: they exercise the “extra load of a rebuild” regularly, so weak drives are found long before a rebuild is needed.
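
      For anyone wondering what scheduling a scrub looks like in practice, here is a minimal sketch. It assumes Linux software RAID (md), which nobody in this thread specified; hardware RAID controllers and SAN appliances expose scrubs through their own tools. The function names are made up for illustration, and `/sys/block/md0/md` is a placeholder for your array's sysfs directory. It relies on the md sysfs files `sync_action` and `mismatch_cnt`.

```python
from pathlib import Path


def start_scrub(md_sysfs_dir: str) -> bool:
    """Request a read-only parity check on a Linux md array.

    md_sysfs_dir is the array's sysfs directory, for example
    /sys/block/md0/md (md0 is a placeholder, not a real recommendation).
    Returns True if a check was requested, False if the array was busy.
    """
    action = Path(md_sysfs_dir) / "sync_action"
    # Anything other than "idle" means a resync, recovery, or check is
    # already running, so leave it alone.
    if action.read_text().strip() != "idle":
        return False
    # Writing "check" asks md to read every stripe and verify redundancy
    # without rewriting anything.
    action.write_text("check\n")
    return True


def mismatch_count(md_sysfs_dir: str) -> int:
    """Return the mismatch counter md reports for the array."""
    return int((Path(md_sysfs_dir) / "mismatch_cnt").read_text().strip())


# Hypothetical usage, run as root from a cron job or systemd timer:
#   start_scrub("/sys/block/md0/md")
```

      Run it on whatever schedule suits the array (monthly is a common choice, but that is a judgment call), and look at the mismatch count and the drives' SMART data once the check finishes.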