Rack-Aware Regenerating Codes with Multiple Erasure Tolerance

Liyang Zhou, Zhifang Zhang

In a modern distributed storage system, storage nodes are organized in racks, and cross-rack communication dominates the system bandwidth. In this paper, we focus on the rack-aware storage system. The original setting assumed that every single node failure is repaired immediately. However, multiple node failures are frequent, and some systems even wait for several node failures to accumulate before repairing them in order to keep costs down. So that repair remains possible when multiple failures occur, we relax the repair model of the rack-aware storage system. In the repair process, both the cross-rack connections (i.e., the number of helper racks connected for repair, called the repair degree) and the intra-rack connections (i.e., the number of helper nodes in the rack that contains the failed node) are reduced. We focus on minimizing the cross-rack bandwidth in the rack-aware storage system with multiple erasure tolerance. First, we establish the fundamental tradeoff between the repair bandwidth and the storage size for functional repair, and obtain the two extreme points corresponding to minimum storage and minimum cross-rack repair bandwidth. Second, we give explicit constructions corresponding to these two points. Both constructions have the minimum sub-packetization level (i.e., the number of symbols stored in each node) and a small repair degree. In addition, the size of the underlying finite field is approximately the block length of the code. Finally, for convenience in practical use, we also establish a transformation that converts our codes into systematic codes.
