RadosGW Big Index
$ rados -p .default.rgw.buckets.index listomapkeys .dir.default.1970130.1 | wc -l
166768275
With each key taking between 100 and 250 bytes, this makes a very big object for RADOS (several GB)… Especially when migrating it from one OSD to another (this will lock all writes); moreover, the OSD holding this object will use a lot of memory…
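To spot which index objects are getting big, one way is to count the omap keys of every object in the index pool, for example (same pool name as above; this is just a plain shell loop):

$ for obj in $(rados -p .default.rgw.buckets.index ls); do
    echo "$(rados -p .default.rgw.buckets.index listomapkeys $obj | wc -l) $obj"
  done | sort -n | tail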
Since the Hammer release it is possible to shard the bucket index. You cannot shard the index of an existing bucket, but you can set it up for new buckets. This is a very good thing for scalability.
Setting up index max shards
You can specify the default number of shards for new buckets:
- Per zone, in the region map:
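For example, something along these lines (a sketch: region.json is only an illustrative file name, 8 is an example value, and the exact JSON layout depends on your version):

$ radosgw-admin region get > region.json
  (edit region.json and set "bucket_index_max_shards": 8 in each zone entry)
$ radosgw-admin region set --infile region.json
$ radosgw-admin regionmap update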
- In the radosgw section of ceph.conf (this overrides the per-zone value):
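For example (assuming the gateway runs under a section named [client.radosgw.gateway]; adapt it to your instance name and restart the radosgw afterwards):

[client.radosgw.gateway]
rgw override bucket index max shards = 8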
Verification:
$ radosgw-admin metadata get bucket:mybucket | grep bucket_id
"bucket_id": "default.1970130.1"
$ radosgw-admin metadata get bucket.instance:mybucket:default.1970130.1 | grep num_shards
"num_shards": 8,
$ rados -p .rgw.buckets.index ls | grep default.1970130.1
.dir.default.1970130.1.0
.dir.default.1970130.1.1
.dir.default.1970130.1.2
.dir.default.1970130.1.3
.dir.default.1970130.1.4
.dir.default.1970130.1.5
.dir.default.1970130.1.6
.dir.default.1970130.1.7
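To verify that the keys are actually spread across the shards, the same key count as before can be run on each shard object (same pool and bucket_id as above, shards 0 to 7 since num_shards is 8):

$ for i in $(seq 0 7); do
    echo "$(rados -p .rgw.buckets.index listomapkeys .dir.default.1970130.1.$i | wc -l) .dir.default.1970130.1.$i"
  done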
Bucket listing impact:
A simple test with ~200k objects in a bucket:

num_shards | time (s)
-----------|---------
0          | 25
8          | 36
128        | 109
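The listing time can be measured with whatever S3 client you use, for example with s3cmd (assuming it is configured against this gateway and that the bucket is named mybucket):

$ time s3cmd ls s3://mybucket > /dev/null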
So, do not use thousands of shards on a bucket if you do not need them, because bucket listing will become very slow…
Link to the blueprint:
https://wiki.ceph.com/Planning/Blueprints/Hammer/rgw%3A_bucket_index_scalability