"Fault tolerant systems"

D3: Deterministic Data Distribution for Efficient Data Reconstruction in Erasure-Coded Distributed Storage Systems

Because individual commodity components are unreliable, failures are common in large-scale distributed storage systems. Erasure codes are widely deployed in practical storage systems to provide fault tolerance with low storage overhead. However, the …

A Hierarchical RAID Architecture Towards Fast Recovery and High Reliability

Disk failures are very common in modern storage systems because of the large number of inexpensive disks. Recovering a failed disk also takes a long time because of its large capacity and limited I/O. To speed up the recovery process and maintain …

DSC: Dynamic stripe construction for asynchronous encoding in clustered file system

Many clustered file systems now adopt asynchronous encoding, which transforms replicated data into erasure-coded data to maintain data availability with bounded storage overhead. Existing implementations of asynchronous encoding construct coding …

Workload-Aware Elastic Striping With Hot Data Identification for SSD RAID Arrays

Redundant array of independent disks (RAID) is a good option for providing device-level fault tolerance for solid-state drives (SSDs). However, parity update with either read-modify-write or read-reconstruct-write may introduce many extra I/Os …

Efficient Parity Update for Scaling RAID-like Storage Systems

Scaling RAID systems is inevitable given the increasing demand for storage capacity and I/O throughput. Scaling a RAID system always requires updating parity to maintain the reliability of the storage system. There are two schemes, …

Elastic Parity Logging for SSD RAID Arrays

Parity-based RAID poses a design trade-off issue for large-scale SSD storage systems: it improves reliability against SSD failures through redundancy, yet its parity updates incur extra I/Os and garbage collection operations, thereby degrading the …

Grouping-Based Elastic Striping with Hotness Awareness for Improving SSD RAID Performance

RAID is a good option for providing device-level fault tolerance. Conventional RAID usually updates parities with read-modify-write or read-reconstruct-write, which may introduce many extra I/Os and thus significantly degrade SSD RAID …