Ceph erasure code jerasure plugin benchmarks
On an Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz processor (and all SIMD capable Intel processors), the Reed Solomon Vandermonde technique of the jerasure plugin, which is the default in Ceph Firefly, performs better than the Cauchy technique.
The chart shows the decoding performance for erasure coded objects. The Y axis is in GB/s and the X axis is labeled K/M/erasures. For instance, 10/3/2 means K=10, M=3 and 2 erasures: each object is sliced into K=10 equal chunks, M=3 parity chunks are computed, and the jerasure plugin is used to recover from the loss of two chunks (i.e. 2 erasures).
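To make the K/M notation concrete, here is a minimal sketch of the arithmetic behind it. The helper names `chunk_size` and `raw_bytes` are hypothetical and are not part of Ceph or bench.sh, and the rounding assumes the object is simply padded so that the K chunks are equal in size; the alignment actually used by the plugin may differ.

```shell
#!/bin/sh
# Illustrative helpers only; they are not part of Ceph or bench.sh.

# chunk_size OBJECT_BYTES K
# Bytes per data chunk, rounded up so that K equal chunks cover the object
# (assumption: plain padding, no plugin-specific alignment).
chunk_size() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# raw_bytes OBJECT_BYTES K M
# Total bytes stored once the M parity chunks are added to the K data chunks.
raw_bytes() {
    echo $(( $(chunk_size "$1" "$2") * ($2 + $3) ))
}

object=$((1024 * 1024))   # a 1MB object
echo "K=10 M=3 chunk size: $(chunk_size "$object" 10) bytes"
echo "K=10 M=3 stored:     $(raw_bytes "$object" 10 3) bytes"
```

With K=10 and M=3, any 10 of the 13 chunks are enough to rebuild the object, which is why the chart stops at 3 erasures for that pair.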
Benchmark reports
The bench.sh output is rendered with Flot in a standalone HTML page. The following commands are run from the root directory of the source tree:
TOTAL_SIZE=$((4 * 1024 * 1024 * 1024)) \
CEPH_ERASURE_CODE_BENCHMARK=src/ceph_erasure_code_benchmark \
PLUGIN_DIRECTORY=src/.libs \
qa/workunits/erasure-code/bench.sh fplot jerasure

PARAMETERS='--parameter jerasure-variant=generic' \
TOTAL_SIZE=$((4 * 1024 * 1024 * 1024)) \
CEPH_ERASURE_CODE_BENCHMARK=src/ceph_erasure_code_benchmark \
PLUGIN_DIRECTORY=src/.libs \
qa/workunits/erasure-code/bench.sh fplot jerasure
Results interpretation
The benchmarks are presented in two charts, one for encoding performance and another for decoding performance. The Y axis is the amount of data processed in GB/s: more is better.
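As a rough illustration of how a Y axis value can be derived, the sketch below divides a byte count by an elapsed time. The function name `gb_per_second` and the 5 second duration are hypothetical, and 1 GB is assumed to mean 2^30 bytes; this is not how bench.sh itself computes the figure.

```shell
#!/bin/sh
# Illustrative helper only; it is not part of Ceph or bench.sh.

# gb_per_second BYTES SECONDS
# Throughput in GB/s, assuming 1 GB = 1024 * 1024 * 1024 bytes.
gb_per_second() {
    awk -v bytes="$1" -v seconds="$2" \
        'BEGIN { printf "%.2f\n", bytes / seconds / (1024 * 1024 * 1024) }'
}

total_size=$((4 * 1024 * 1024 * 1024))   # same value as TOTAL_SIZE in the commands above
gb_per_second "$total_size" 5.0          # hypothetical elapsed time of 5 seconds
```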
The X axis has one K/M pair for each point, ordered from the simplest on the left (K=2, M=1, which is also the default in Firefly) to the one requiring the most effort on the right (K=10, M=4).
The X axis of the chart for decoding performance is further divided to show the cost of recovering from an increasing number of erasures. For instance, the 4/3/1 point for Reed Solomon shows that an object encoded with K=4, M=3 that has lost one chunk (one erasure) can be decoded at a rate over 0.75 GB/s. The next point, 4/3/2, shows that when there are two erasures, the rate falls under 0.75 GB/s. The points that share the same K/M pair are connected with a line.
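The X axis labels of the decoding chart can be enumerated with a short loop. This is a sketch that assumes the number of erasures runs from 1 up to M for each pair, since no more than M lost chunks can be recovered. The function name `decode_labels` is hypothetical, and the list contains only the K/M pairs mentioned in this text, not necessarily every pair that bench.sh runs.

```shell
#!/bin/sh
# Illustrative helper only; it is not part of Ceph or bench.sh.

# decode_labels K M
# Print one K/M/erasures label per erasure count, from 1 to M.
decode_labels() {
    k=$1
    m=$2
    erasures=1
    while [ "$erasures" -le "$m" ]; do
        echo "$k/$m/$erasures"
        erasures=$((erasures + 1))
    done
}

# K/M pairs taken from the text; $pair is left unquoted on purpose
# so that it splits into the two arguments K and M.
for pair in "2 1" "4 3" "10 3" "10 4"; do
    decode_labels $pair
done
```

The labels sharing a K/M prefix, such as 4/3/1, 4/3/2 and 4/3/3, correspond to the points that are connected with a line in the chart.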
SIMD improvements and previous benchmarks
jerasure version 2 can use SIMD to accelerate encoding and decoding. Without SIMD, the Cauchy technique performs better than the Reed Solomon Vandermonde technique with 1MB objects.
With SIMD the Reed Solomon Vandermonde technique is faster.
The previous jerasure benchmarks were run on version one, and they also show that the Cauchy technique is faster. However, these benchmarks were conducted before erasure coded pools were implemented. The stripe size actually used is 4KB, and the 1MB results are only included for comparison with the previous results.