This week marked the very first Ceph Developer Summit where the community gathered to discuss development efforts focused on the next stable release ‘Dumpling.’ There was quite a turnout for such a boutique event! We hit over 50 concurrent participants in the live video stream and had almost 400 unique visitors to the relatively new Ceph wiki during that window. Participants included folks from all over the world:
There was a ton of work proposed by the community, and almost all of it was accepted and discussed for inclusion in Dumpling. We were incredibly pleased with both the turnout and the general caliber of the participants. Having an awesome community makes it really easy to stay excited about what we do.
Below you will find each of the session videos split out with a brief description and links to the blueprint, etherpad, and irc logs as they appeared during the session. The original summit page has also been updated with the appropriate links for posterity. We plan to leave these pages up so that people can look back at the history of Ceph development. If you have questions or feedback, please email the community team.
We will be doing a developer summit for each stable release (quarterly) so if you are interested in participating feel free to post a blueprint on the wiki for consideration. The sessions for each developer summit are selected directly from submitted blueprints.
If you are interested in contributing to Ceph on a smaller scale feel free to dive right in, clone our github repository and submit a pull request for any changes you make.
Now, on to the summit!
Sage kicked off the event with a few slides that offered a summary of the event, how Cuttlefish development went, and what the next steps were. This event actually marks the first time we have promoted an open roadmap discussion and really pushed for community participation. Since the event was entirely virtual, some discussion of the tools and how we planned to run the event was also in order.
The goal of this event was ultimately to make sure that all proposed development work for Dumpling was viable, coordinated, and appropriately understood. Blueprint ownership was established, implementation questions were discussed, and next steps were established. Inktank folks really want to make sure the community has all the tools at their disposal to participate fully and are happy to act as a resource for anyone pushing code to the main repository.
There were actually quite a few sessions for such a small event. Thirteen blueprints were discussed in total. Inktank shared the work they are planning to contribute to the Dumpling release. This work includes geo-replication, a management API, RBD support for OpenStack Havana, and several others. The community contributed plans for many things including a modularization of the RADOS gateway, Ceph stats and reporting work, erasure encoding, and inline data support to name a few. Read below for a quick summary of each session and the associated video, irc chatlog, and etherpad.
Inktank’s Dan Mick kicked off the first session with an overview of Inktank’s plans for building a management API that will provide a RESTful endpoint to perform monitoring and management tasks. The goal here is to allow complete integration into whatever management tool or dashboard is desired. Specific use case examples might be inclusion in OpenStack Horizon or a similar technology.
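To make the idea concrete, here is a hypothetical sketch of what a RESTful management endpoint could look like: a tiny HTTP server exposing cluster status as JSON, queried by a client. The paths and fields here are illustrative assumptions, not the real API, which was still being designed at the time of the summit.

```python
# Hypothetical sketch of a RESTful management endpoint; the /api/v1/status
# path and the status fields are made up for illustration.
import json, threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

CLUSTER_STATUS = {"health": "HEALTH_OK", "osds": {"up": 24, "in": 24}}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/api/v1/status":
            body = json.dumps(CLUSTER_STATUS).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Serve on an ephemeral port in a background thread, then query it
# the way a dashboard like Horizon might.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
resp = json.load(urlopen(f"http://127.0.0.1:{port}/api/v1/status"))
print(resp["health"])  # HEALTH_OK
server.shutdown()
```

Any real implementation would of course add authentication and write operations; the point is simply that a plain HTTP/JSON surface lets any external tool integrate without linking against Ceph itself.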
Loic Dachary and Christopher Liljenstolpe jointly kicked off the next session with an ambitious look at erasure encoding.
“For a class of users, there is a requirement for very durable data, but the cost of meeting that durability by using xN replication becomes cost prohibitive if the size of the data to be stored is large. An example use case where this is an issue is genomic data. By using a 3x replication, a 20PB genomic repository requires spinning 60+PB of disk. However, the other features of Ceph are very attractive (such as using a common storage infrastructure for objects, blocks, and filesystems, CRUSH, self-healing infrastructure, non-disruptive scaling, etc.).”
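The arithmetic behind that quote is easy to check. A minimal sketch, using the 20PB figure from the use case above and an assumed (k, m) = (10, 4) erasure code chosen purely for illustration:

```python
# Storage-overhead comparison: 3x replication vs. a hypothetical
# (k, m) erasure code. The (10, 4) parameters are an assumption for
# illustration, not a Ceph default.

def replicated_raw_pb(data_pb, copies=3):
    """Raw disk needed when every object is stored `copies` times."""
    return data_pb * copies

def erasure_coded_raw_pb(data_pb, k=10, m=4):
    """Raw disk needed when data is split into k data chunks plus m parity chunks."""
    return data_pb * (k + m) / k

data = 20  # PB of genomic data from the quoted use case
print(replicated_raw_pb(data))     # 60 PB with 3x replication
print(erasure_coded_raw_pb(data))  # 28 PB with a (10, 4) code
```

With those assumed parameters the cluster could survive the loss of any four chunks while storing less than half the raw bytes of 3x replication, which is exactly the cost argument the session was making.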
Inktank’s Yehuda Sadeh led the next session which discussed the work he is doing with the RADOS Gateway (RGW) for the purposes of geo-replication. The current Ceph model expects a high-bandwidth, low-latency connection between all nodes. This makes WAN scale replication impractical. This session showed how Inktank plans to take the first pass at geo-replication for disaster recovery and geographic eventual consistency using RGW.
The afternoon tracks were designed to be small, concentrated sessions and were split into two tracks. Track 1 saw more RGW discussion, CephFS security, inline data support, and fallocate/hole punching. Track 2 saw discussions on orchestration, testing, stats and monitoring, RADOS namespaces, and a hook framework for Ceph FS.
42on founder Wido den Hollander is championing the efforts behind moving the RADOS Gateway (RGW) out of the core code and making it more modular. This would allow for a number of improvements to flexibility and would make it possible to stand up RGW without requiring FastCGI and Apache.
While there was no blueprint for this session, Inktank’s Alexandre Marangone and Dan Mick hosted a session to talk about some of the orchestration work, including consolidating the Chef recipes and some of the work that has been done on ceph-deploy. Discussions included many orchestration tools that Ceph is currently deployed with, including Juju, Salt, Puppet, and others. The focus was how deployment evolved from ceph-deploy to Chef, and how this could be scaled to other tools.
Inktank’s Yehuda Sadeh made another appearance to discuss the feasibility of bucket-level quotas in the Ceph Object Gateway. The focus here is how to simplify, optimize, and ensure accurate record keeping across multiple gateways without incurring huge overhead costs.
Sage took the helm for another blueprint-less session to discuss how to build and test Ceph automatically using the Teuthology framework and how this framework could evolve. Work items coming out of this session include the creation of a large cluster test suite, a qemu gitbuilder, performance regressions, possible chart.io graphs, and several others.
Currently access to CephFS is somewhat of an “all or nothing” model. Several folks were interested in more robust security measures being built in for user- and/or tree-level security. Mike Kelly was leading the charge on tackling this issue and may end up combining some of the RADOS namespace discussion with his solution.
Sage tackled several blueprints at once with this session that included deep-level discussion on both RADOS and CRUSH. The focus was on extending both to be able to handle new or more complex tasks.
NUDT’s Li Wang has stepped up to tackle inline data support. The hope here is to allow a mount option that stores a small file’s data together with its metadata, as an extended attribute.
“Inline data is a good feature for accelerating small file access, which is present in mainstream local file systems, for example, ext4, btrfs etc. It should be beneficial to let Ceph implement this optimization, since it could save the client the calculation of object location and communication with the OSDs. It hopefully will receive a good IO speedup for small files traffic.
For a typical Ceph file access traffic, client first asks mds for metadata, then communicates with osd for file data. If a file is very small, its data can be stored together with the metadata, as an extended attribute. While opening a small file, osd will receive file metadata as well as data from mds, the calculation of object location as well as communication with osd are saved.”
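The decision logic described in the quote can be sketched in a few lines. This is a hypothetical illustration of the inline-data idea, not Ceph’s actual code: the class names, the threshold, and the object-naming scheme are all assumptions.

```python
# Hypothetical sketch of inline data: files at or below a size threshold
# keep their bytes with the metadata (an extended-attribute-style field),
# larger files go out to object storage. Names are illustrative only.

INLINE_THRESHOLD = 4096  # bytes; an assumed cutoff for "small" files

class Inode:
    def __init__(self, name):
        self.name = name
        self.inline_data = None   # small-file bytes stored with the metadata
        self.object_ids = []      # object locations for large files

def write_file(inode, data, object_store):
    if len(data) <= INLINE_THRESHOLD:
        inode.inline_data = data          # served by the MDS alone
    else:
        oid = f"{inode.name}.0"           # compute object location...
        object_store[oid] = data          # ...and write to the OSDs
        inode.object_ids = [oid]

def read_file(inode, object_store):
    if inode.inline_data is not None:
        return inode.inline_data          # no OSD round trip needed
    return b"".join(object_store[oid] for oid in inode.object_ids)
```

The speedup argument is visible in the read path: a small file comes back with the metadata reply itself, skipping both the object-location calculation and the extra network hop to the OSDs.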
While Ceph tracks some things internally it also exposes a wealth of knowledge that can be assimilated by other tools. DreamHost’s Kyle Bader discussed some of the work that they have been doing with tools like Graphite and Nagios and how some of this data can be shared with the community. The hope is to create some shared community knowledge around monitoring and stats for a Ceph cluster.
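As a taste of what such integration involves, here is a minimal sketch of shipping a statistic to Graphite, whose plaintext protocol accepts lines of the form “path value timestamp” on its carbon listener. The metric path and value below are made up for illustration; in practice the numbers would come from Ceph’s status output or a daemon’s perf counters.

```python
# Minimal sketch of pushing one Ceph statistic to Graphite's plaintext
# protocol. The metric name "ceph.cluster.osd_up" and its value are
# illustrative assumptions, not names Ceph emits itself.
import socket, time

def graphite_line(path, value, timestamp=None):
    """Format one metric in Graphite plaintext form: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"

def send_metric(host, port, line):
    """Push a formatted line to a Graphite carbon listener (plaintext port is 2003 by default)."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(line.encode("ascii"))

line = graphite_line("ceph.cluster.osd_up", 24, timestamp=1367000000)
print(line, end="")  # ceph.cluster.osd_up 24 1367000000
```

The simplicity of that line format is much of the appeal: any script that can open a TCP socket can feed a Ceph dashboard, which is why sharing metric conventions across the community is worthwhile.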
NUDT’s Li Wang took the reins for another session to talk about how to implement hole punching. This was a short discussion, since it should be a relatively small change.
In the last session of the day, UCSC’s Yasuhiro Ohara presented a proposal for a hook framework that would enable execution of callback functions for CephFS operations. There was quite a bit of discussion around implementation decisions of fork/exec vs something a bit less heavyweight.
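The shape of the proposal is easy to sketch: rather than fork/exec’ing an external program per event, callbacks could be registered in-process and fired when an operation occurs. Everything below is an illustrative assumption, not a CephFS API.

```python
# Hypothetical sketch of a hook framework: in-process callbacks keyed by
# filesystem operation, as a lighter-weight alternative to fork/exec.
# Names and operation strings are illustrative only.
hooks = {}

def register_hook(operation, callback):
    """Attach a callback to a filesystem operation, e.g. 'create' or 'unlink'."""
    hooks.setdefault(operation, []).append(callback)

def fire_hooks(operation, path):
    """Invoke every callback registered for this operation."""
    for cb in hooks.get(operation, []):
        cb(operation, path)

# A hook that records events instead of spawning a process per file.
events = []
register_hook("create", lambda op, p: events.append((op, p)))
fire_hooks("create", "/ceph/newfile")
print(events)  # [('create', '/ceph/newfile')]
```

The fork/exec-vs-lightweight debate from the session maps directly onto that last step: an in-process callback costs a function call, while spawning an external handler per operation costs a process, which matters at filesystem event rates.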
While there were a few technical hiccups using the Google Hangouts, overall it was a great event with some amazing community participation. Now that the implementation decisions and next steps have been established, the development begins! Each of these blueprints is designed to be a living document with notes, updates, and tasks recorded as needed. If you are interested in participating in an existing blueprint or future development please contact the owner or the community team and get started! Thanks again to everyone who participated and helped make this a great first Ceph Developer Summit.