Due to individual unreliable commodity components, failures are common in large-scale distributed storage systems. Erasure codes are widely deployed in practical storage systems to provide fault tolerance with low storage overhead. However, the …
Disk failures are very common in modern storage systems due to the large number of inexpensive disks. As a result, it takes a long time to recover a failed disk due to its large capacity and limited I/O. To speed up the recovery process and maintain …
Nowadays many clustered file systems adopt asynchronous encoding which transforms replicated data into erasure coding to maintain data availability with bounded storage overhead. Existing implementations of asynchronous encoding construct coding …
Redundant array of independent disk (RAID) offers a good option to provide device-level fault tolerance for solid-state drives (SSDs). However, parity update with either read-modify-write or read-reconstruct-write may introduce a lot of extra I/Os …
It is inevitable to scale RAID systems with the increasing demand of storage capacity and I/O throughput. When scaling RAID systems, we will always need to update parity to maintain the reliability of the storage systems. There are two schemes, …
Parity-based RAID poses a design trade-off issue for large-scale SSD storage systems: it improves reliability against SSD failures through redundancy, yet its parity updates incur extra I/Os and garbage collection operations, thereby degrading the …
RAID provides a good option to provide device-level fault tolerance. Conventional RAID usually updates parities with read-modify-write or read-reconstruct-write, which may introduce a lot of extra I/Os and thus significantly degrade SSD RAID …
Network coding allows intermediate nodes to encode data packets to improve network throughput and robustness. However, it increases the propagation speed of polluted data packets if a malicious node injects fake data packets into the network, which …