SSE optimization for erasure code in Ceph
The jerasure library is the default erasure code plugin of Ceph. Its companion library, gf-complete, supports SSE optimizations that are enabled at compile time when the compiler provides them (-msse4.2 etc.). The jerasure plugin, together with gf-complete, is compiled multiple times with different levels of SSE features:
- jerasure_sse4 uses SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE
- jerasure_sse3 uses SSSE3, SSE3, SSE2, SSE
- jerasure_generic uses no SSE instructions
When an OSD loads the jerasure plugin, the CPU features are probed and the appropriate plugin is selected depending on their availability.
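The selection step can be sketched as follows. This is a minimal illustration, not Ceph's actual probing code: the function names `select_jerasure_flavor` and `probe_jerasure_flavor` are hypothetical, and the probe assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available. On other targets it falls back to the generic flavor.

```c
/* Hypothetical helper (not taken from Ceph): map the probed feature levels
 * to the most capable plugin flavor from the list above. */
static const char *select_jerasure_flavor(int sse4_level, int sse3_level)
{
    if (sse4_level)
        return "jerasure_sse4";
    if (sse3_level)
        return "jerasure_sse3";
    return "jerasure_generic";
}

/* Probe the running CPU. Assumes GCC or Clang; on non-x86 targets the
 * builtins do not exist, so the generic flavor is returned. */
static const char *probe_jerasure_flavor(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    /* jerasure_sse3 needs SSE, SSE2, SSE3 and SSSE3. */
    int sse3_level = __builtin_cpu_supports("sse") &&
                     __builtin_cpu_supports("sse2") &&
                     __builtin_cpu_supports("sse3") &&
                     __builtin_cpu_supports("ssse3");
    /* jerasure_sse4 additionally needs SSE4.1 and SSE4.2. */
    int sse4_level = sse3_level &&
                     __builtin_cpu_supports("sse4.1") &&
                     __builtin_cpu_supports("sse4.2");
    return select_jerasure_flavor(sse4_level, sse3_level);
#else
    return select_jerasure_flavor(0, 0);
#endif
}
```

A loader built this way would call `probe_jerasure_flavor()` once and then open the shared object with the matching name.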
The gf-complete source code is cleanly divided into functions that take advantage of specific SSE features. It should be easy to use the ifunc attribute to semi-manually select each function individually, at runtime and without performance penalty (because the choice is made the first time the function is called and recorded for later calls). With such fine-grained selection, there would be no need to compile three plugins, because each function would be compiled with exactly the set of flags it needs.
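A rough sketch of what per-function selection with ifunc could look like is shown here. It is not gf-complete code: `xor_region` and its two variants are hypothetical stand-ins for a real kernel, and the sketch assumes GCC or Clang on an x86 ELF platform whose C library supports ifunc (for example Linux with glibc). The `target` attribute lets a single function be compiled with the instruction set it needs, without passing -msse2 to the whole file.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Portable variant: uses no SSE instructions explicitly. */
static void xor_region_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

/* SSE2 variant: only this function is compiled with SSE2 enabled. */
__attribute__((target("sse2")))
static void xor_region_sse2(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
    }
    for (; i < n; i++)  /* remaining tail bytes */
        dst[i] ^= src[i];
}

typedef void xor_region_fn(uint8_t *, const uint8_t *, size_t);

/* Resolver: run once by the dynamic linker when the symbol is bound.
 * It may run before constructors, hence the explicit __builtin_cpu_init. */
static xor_region_fn *resolve_xor_region(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") ? xor_region_sse2
                                          : xor_region_generic;
}

/* Callers use xor_region(); it is bound to the variant the resolver chose. */
void xor_region(uint8_t *dst, const uint8_t *src, size_t n)
    __attribute__((ifunc("resolve_xor_region")));
```

Callers invoke `xor_region(dst, src, n)` like any other function. After the symbol is bound, each call goes straight to the chosen variant, so no feature check is repeated per call.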