# RDP: Row-Diagonal Parity for Double Disk Failure Correction

## One-line Summary

RDP (Row-Diagonal Parity) is an algorithm that protects RAID arrays against double disk failures. It is also known as RAID-DP and is one form of RAID-6 (there are other RAID-6 approaches to handling two disk failures, but RDP is the most intuitive).

## Paper Structure Outline

1. Introduction
2. Related Work
3. Double Disk Failure Models and Analysis
4. Row-Diagonal Parity Algorithm
5. Proof of Correctness
6. Performance Analysis
7. Algorithm Extensions
8. Implementation Experience
9. Measured Performance
10. Conclusions
11. Acknowledgments

## Background & Motivation

Individual disks can fail in two ways: whole-disk failure, whereby all the data on the disk becomes temporarily or permanently inaccessible, and media failure, whereby a small portion of the data on a disk becomes temporarily or permanently inaccessible. Previous RAID designs only consider whole-disk failures.

Multiple disk errors are likely: the authors give a detailed analysis of why this is the case in Section 3 (which I'm not going to get into).

## Design and Implementation

RDP can be built on RAID-4 or RAID-5. The paper focuses on RAID-4: dedicated data disks plus one row parity disk, to which RDP adds one diagonal parity disk.

![In this case, p = 5. We have (p+1) disks and (p-1) data disks.](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MP_lk50bIxK-nm6qXyJ%2F-MP_pPBNrgNRl3VyOsz7%2FScreen%20Shot%202020-12-27%20at%202.13.28%20PM.png?alt=media\&token=5d5f822c-4463-438c-bb05-800f1d624e61)

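XOR is still used for parity. The number in each block of the figure is the diagonal that block belongs to. In the example above, if data disks 1 and 3 both suffer whole-disk failures, the data can still be recovered: some diagonal is missing only one block, so that block is rebuilt from diagonal parity, which lets row parity rebuild the other missing block in the same row, and so on, alternating between diagonal and row parity.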
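The parity layout and double-failure recovery can be sketched in Python. This is only a minimal illustration, not the paper's implementation: blocks are modeled as small integers, disks are numbered from 0 (data disks `0..p-2`, row parity on disk `p-1`, diagonal parity on disk `p`), and the function names `parity_sets`, `encode`, and `recover` are made up for this example. Recovery uses a generic "find a parity set with exactly one missing block and solve it" loop, which is a simplification of the row/diagonal chain walk described in the paper.

```python
from functools import reduce
from operator import xor


def parity_sets(p):
    """List every parity set as (row, disk) cells whose XOR is zero."""
    # Row sets: data disks 0..p-2 followed by the row parity disk p-1.
    sets = [[(r, c) for c in range(p)] for r in range(p - 1)]
    # Diagonal d holds the cells with (row + disk) % p == d.
    # Diagonal p-1 is the "missing" diagonal: its parity is not stored.
    for d in range(p - 1):
        cells = [(r, c) for r in range(p - 1) for c in range(p) if (r + c) % p == d]
        sets.append(cells + [(d, p)])  # (d, p) is the diagonal parity block
    return sets


def encode(data, p):
    """Append row parity (disk p-1) and diagonal parity (disk p) to the data."""
    grid = [list(row) + [0, 0] for row in data]
    # Row sets come first, so row parity exists before diagonals use it.
    for cells in parity_sets(p):
        *members, (pr, pc) = cells  # last cell of each set is its parity block
        grid[pr][pc] = reduce(xor, (grid[r][c] for r, c in members), 0)
    return grid


def recover(grid, failed, p):
    """Rebuild all blocks on the failed disks, in place."""
    missing = {(r, c) for r in range(p - 1) for c in failed}
    for r, c in missing:
        grid[r][c] = None  # simulate the lost blocks
    sets = parity_sets(p)
    while missing:
        progress = False
        for cells in sets:
            unknown = [cell for cell in cells if cell in missing]
            if len(unknown) == 1:  # exactly one hole: XOR of the rest fills it
                r, c = unknown[0]
                known = (grid[i][j] for i, j in cells if (i, j) != (r, c))
                grid[r][c] = reduce(xor, known, 0)
                missing.remove((r, c))
                progress = True
        if not progress:
            raise ValueError("not recoverable with these parity sets")
    return grid


# Example with p = 5: 4 data disks, 4 rows, 6 disks in total.
p = 5
data = [[(7 * r + 3 * c + 1) % 251 for c in range(p - 1)] for r in range(p - 1)]
original = encode(data, p)
damaged = [row[:] for row in original]
recover(damaged, failed=(1, 3), p=p)  # lose data disks 1 and 3
assert damaged == original
```

Because the loop only ever solves sets with a single unknown, it also handles the case where one of the failed disks is a parity disk: row parity rebuilds the data first, and the parity blocks are then recomputed.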

RDP can also be extended to encompass multiple RAID-4 or RAID-5 disk arrays in a single RDP disk array.

## Evaluation

* Read performance is unaffected.
* Sequential write: Write p-1 stripes at once for best performance (update row and diagonal parity at the same time).
* Partial stripe writes: Writing d blocks by the subtractive method requires 2d+4 I/Os (d+2 reads, d+2 writes), and writing d blocks by the additive method requires n I/Os (n-d-2 reads, d+2 writes), where n is the number of disks in the array. Thus, a combination of the additive and subtractive methods is used, depending on which needs fewer I/Os.
* Proof of correctness and optimality is covered in the paper.
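The partial stripe write trade-off can be checked with a few lines of Python. This is a small sketch using only the I/O counts quoted in the list above, assuming `n` is the total number of disks in the array and `d` is the number of data blocks being written. The function names are hypothetical.

```python
def partial_write_ios(d, n):
    """Return (subtractive, additive) I/O counts for writing d blocks."""
    subtractive = (d + 2) + (d + 2)    # d+2 reads, then d+2 writes
    additive = (n - d - 2) + (d + 2)   # n-d-2 reads, then d+2 writes
    return subtractive, additive


def cheaper_method(d, n):
    """Pick the method with fewer I/Os; ties go to additive."""
    subtractive, additive = partial_write_ios(d, n)
    return "subtractive" if subtractive < additive else "additive"


# With 16 disks, small writes favor subtraction and large ones favor addition.
for d in (1, 4, 8):
    print(d, partial_write_ios(d, 16), cheaper_method(d, 16))
```

Under these counts, the subtractive method wins exactly when 2d+4 < n, i.e. when the write touches fewer than roughly half of the disks.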

![Write performance measured: RDP gives much better reliability for the same cost and performance.](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MP_lk50bIxK-nm6qXyJ%2F-MP_rZejj5RKgRwnPV0F%2FScreen%20Shot%202020-12-27%20at%202.22.53%20PM.png?alt=media\&token=47122f1d-d505-4262-a430-1cd8e8033e8e)

{% hint style="info" %}

* G: number of separate RAID groups connected to the filer
* d: number of data disks per RAID group
* p: number of parity disks per RAID group
  {% endhint %}

## New Vocabulary

* NetApp: A cloud data services and data management company.

## Links

* [Paper PDF](https://www.usenix.org/legacy/publications/library/proceedings/fast04/tech/corbett/corbett.pdf)

{% file src="<https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MPjfz7Ai6Qw1AukP6dr%2F-MPkIt9XdUzLx2b-40Q4%2FL3%2BL4%2BL5-RAID%2BRDP%2BiBench.pptx?alt=media&token=a6336a8a-cb1e-4279-a697-ecab5978df03>" %}
Prof. Andrea's slides on RAID and RDP
{% endfile %}
