Through this article I’m delighted to introduce a new category on the Ceph blog called User Story. The aim of this category is to gather user feedback in one central place. The structure of the article is as follows:
Now that you have the gist of the article, let’s start!
For those who don’t read my blog: I’ve been blogging for almost 2 years now, but I’ve been really active and focused since March 2012. That date is no coincidence; it’s when my final-year internship started. My interests currently center on OpenStack, Ceph and Pacemaker.
Stone-IT is a reliable service provider for organizations that run their business on Linux and open source software. They design and build highly available Linux platforms for websites and applications, and they manage large Linux infrastructures. 2012 was a critical year for Stone-IT: a company called Smile recently bought them out. Merging, as you can imagine, is not always a trivial affair, especially for the internal management of the organizations concerned and for inter-company communication. On top of this, the existing Linux-Cloud had reached its limits in terms of available resources, both compute and storage. Thanks to Smile they also acquired some new customers, which dramatically increased the load on their cloud. I was recruited in February 2012, so I was aware of these problems early on, and I quickly started working on solutions. Basically, they needed a new core engine for their cloud platform; this would lead to Linux-Cloud 2.0.
I finally started my internship in April 2012 and have been doing a lot of research and development on several subjects:
The mission was well defined: investigate those two topics, select a technology, design a viable concept, and then build the next production platform based on this research.
Since most of our business is based on hosting and maintaining websites, we need a distributed filesystem to provide web servers with consistent data. Our setup works like this: developers have admin virtual machines where they can push new code and test it in a pre-production environment. When the tests pass, they deliver their updates to the NFS share, which makes the new files ‘instantly’ available on every web virtual machine. This is how things are managed at Stone-IT, and I guess it’s a fairly standard method when working with a lot of web servers. As I mentioned earlier, I had to investigate both CMP and storage solutions. Ideally the two should be compatible in terms of features and integration, with one able to take advantage of the other. NFS has been a viable solution for ages, and the original plan wasn’t to drop it, but as an innovative company we wanted to open up our options and try something new. So I evaluated quite a lot of distributed filesystems. I will spare you the deep technical details, but the main candidates were:
For the CMP part, it was pretty clear that we would use OpenStack. To make things ideal I had a look at the drivers available for (the old) nova-volume, which is where I came across the Ceph driver. I then started to evaluate Ceph, which eventually led to my article Introducing Ceph to OpenStack, for which I received a lot of nice reviews and feedback. The original idea was to use CephFS to get rid of NFS; I tried it and, as the article attests, I got some nice things working with it, like KVM live migration. However, even though things went fine for me, I couldn’t take the risk of putting something into production that the core developers didn’t yet recommend. My thinking didn’t really change, so I ended up with a little temporary workaround that let us keep using Ceph. Basically we have our Ceph backend and, in front of it, two servers. On those two servers we map RBD devices, which are re-exported over NFS, and each web virtual machine mounts the share. This might sound like overkill and tricky (OK, it is tricky), but it’s perfectly stable and performance is not bad at all. This setup is more or less what I described in my article NFS over RBD.
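For illustration, the workaround described above can be sketched roughly as follows. This is a simplified outline, not our exact production configuration: the pool, image, mount point and hostname are all hypothetical examples.

```shell
# On one of the two gateway servers sitting in front of the Ceph cluster:

# Create an RBD image (name and size are examples)
rbd create webdata --size 102400 --pool rbd

# Map it to a local block device (shows up as /dev/rbd0)
rbd map rbd/webdata

# Put a filesystem on it and mount it locally
mkfs.ext4 /dev/rbd0
mkdir -p /srv/webdata
mount /dev/rbd0 /srv/webdata

# Re-export the mount point over NFS; add a line like this to /etc/exports:
#   /srv/webdata  10.0.0.0/24(rw,sync,no_subtree_check)
exportfs -ra

# Then, on each web virtual machine, mount the share:
mount -t nfs nfs-gateway:/srv/webdata /var/www
```

The second gateway server exists for failover: since the data lives in RBD, the image can be mapped and re-exported from either machine.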
Our infrastructure is not that big, because we are not a big company and we didn’t have the budget of a big company either!
On the performance front, I ran a lot of benchmarks. Really, a lot. Maybe too many. After all those tests, the only thing I can say about Ceph performance is that you should not worry about it. If you know the design, the numbers will not surprise you; they are just as expected. Ceph won’t be the bottleneck: it delivers the maximum performance your hardware and network can offer.
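If you want a point of reference on your own cluster, a simple way to measure raw throughput is the `rados bench` tool that ships with Ceph (the pool name and durations below are just examples):

```shell
# Create a throwaway pool for benchmarking (name and pg count are examples)
ceph osd pool create bench 128

# 60-second write benchmark; keep the objects so we can read them back
rados bench -p bench 60 write --no-cleanup

# 60-second sequential read benchmark against the objects written above
rados bench -p bench 60 seq

# Clean up the benchmark objects and the pool
rados -p bench cleanup
ceph osd pool delete bench bench --yes-i-really-really-mean-it
```

Run it from a client on the same network as your real workload, otherwise you are benchmarking the wrong link.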
Now let’s bring some details about the hardware we use:
4 storage nodes:
Eventually before going live you might like those little tips/best practices:
It has been 6 months since I started working with Ceph, and the first thing that pops into my mind is: “what a damn amazing project!” And it didn’t take any threatening to get me to say that, not even from Ross.
To conclude this first user story, I’d like to thank the people at Inktank who gave me this opportunity. Many thanks to the community as well. I’m not a great developer, so I can’t really contribute code, but I’ve been quite active in testing, bug reports, feedback and blogging about the project.