Deep Scrub Distribution

syndicated

To verify the integrity of data, Ceph uses a mechanism called deep scrubbing which browse all your data once per week for each placement group. This can be the cause of overload when all osd running deep scrubbing at the same time.

You can easly see if a deep scrub is current running (and how many) with ceph status `ceph -w`. But the most interesting is to see the distribution over a week to anticipate the load :

Get the distribution for the week

$ for date in `ceph pg dump | grep active | awk '{print $20}'`; do date +%A -d $date; done | sort | uniq -c
    239 monday
     41 tuesday
      2 saturday
    410 sunday

We can easy see when are done the deep scrub (in my case sunday and monday).

The default interval for deep scrubbing is every week but it could be tunable with parameter ‘osd deep scrub interval’ : http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing

Distribution per hours

$ for date in `ceph pg dump | grep active | awk '{print $21}'`; do date +%H -d $date; done | sort | uniq -c
     11 00
     26 01
     34 02
     37 03
     26 04
     25 05
     14 06
      1 08
     28 09
     49 10
     52 11
     46 12
     37 13
     31 14
     29 15
     45 16
     34 17
     33 18
     27 19
     33 20
     27 21
     25 22
     22 23