DSC: Dynamic stripe construction for asynchronous encoding in clustered file system

Abstract

Many clustered file systems (CFSes) now adopt asynchronous encoding, which transforms replicated data into erasure-coded data to maintain data availability with bounded storage overhead. Existing implementations of asynchronous encoding construct coding stripes from logically sequential data blocks, which incurs heavy cross-rack traffic and necessitates data block redistribution. Recent work [12] solves this problem by carefully distributing replicated data blocks among racks at the time they are written, but it is not applicable when existing systems have different data layouts or when the data layout changes. In this paper, we propose Dynamic Stripe Construction (DSC) to transform N-way replication into erasure coding. DSC does not induce any cross-rack traffic for encoding, and it does not require data block redistribution after encoding. Moreover, DSC is general enough to be applied to any existing CFS with various erasure codes, and it can be deployed on a distributed file system in a hot-pluggable manner. To validate the effectiveness of DSC, we implement it on HDFS. Through extensive testbed experiments in a real storage cluster, we show that DSC can significantly increase the encoding throughput and reduce the foreground user response time compared with the traditional approach.

Publication
IEEE INFOCOM 2017 - IEEE Conference on Computer Communications

Related