Aggregated news from external sources
Although it is extremely unlikely to lose an object stored in Ceph, it is not impossible. When it happens to a Cinder volume based on RBD, knowing which volume has an object missing will help with disaster recovery. The list_missing command … Continue reading →
An update on my talk submission for the OpenStack summit this year in Paris: my speech about Ceph performance analysis was not chosen by the committee for the official agenda. But at least one piece of good news: Marc’s talk will be part of t…
The Call for Speakers period for the OpenStack Summit from 03. – 07.11.2014 in Paris ended this week. Voting for the submitted talks has now started and ends at 11:59 pm CDT on August 6 (6:59 am CEST on 7. August). I’ve submitted a talk to the stor…
For more than a year, Ceph has become increasingly popular and has seen several deployments inside and outside OpenStack. For those of you who do not know, Ceph is a unified, distributed and massively scalable open source storage technology that provides several ways to access and consume your data, such as object, block and filesystem. The community and Ceph itself has greatly… Read more →
I experimented with Taobao’s fork of nginx, Tengine, in front of an object storage cluster. I was surprised by the results.
I’ve always been a fan of nginx; it was love at first sight.
I tend to use nginx first and foremost as a reverse proxy server for web
content and applications. This means that nginx sends your request to
backend servers and forwards you their response.
Some examples of backend servers I use:
Now, the cool thing is that these backend servers are good at what they
do: serve code and applications written in specific languages.
Mix an awesome, lightweight proxy with an awesome backend server and
you’re in for some serious performance.
This is in contrast to Apache, which takes a modular approach: it
tries to do everything itself – jack of all trades, master of none.
Enough of nginx, let’s talk about Tengine.
Ever heard of Taobao? I’ll be honest, I hadn’t until fairly recently.
It turns out they are number 8 on Alexa’s top websites, right in
front of Twitter.
Since China makes up almost 20% of the world’s population, even a
small market penetration is huge by any measure.
Tengine is a fork of nginx created by the team over at Taobao. There’s a
lot of features in Tengine that do not (yet) exist in nginx and some
features that upstream maintainers said they would not implement.
Some highlights include:
Long story short, object storage is a means of storing data online
and making it easily accessible with the help of APIs.
Examples of products using this technology include Dropbox, Google Drive,
Microsoft OneDrive and Amazon S3.
ownCloud is also a good open source, self-hosted alternative front
end to object storage.
They’re both similar in that you upload files to a proxy server – a
Swift proxy server or a Ceph RADOS Gateway server. These proxy servers
pass the files on to storage servers, which distribute and replicate
the data to ensure its high availability and redundancy.
It looks a bit like this:
                                 +-----------+
                            +--> |  Storage  |
                            |    +-----------+
                            |
+-----+  File  +-------+    |    +-----------+
| You | +----> | Proxy | +-----> |  Storage  |
+-----+        +-------+    |    +-----------+
                            |
                            |    +-----------+
                            +--> |  Storage  |
                                 +-----------+
Now, in a highly available and distributed environment, you might have
dozens or hundreds of storage and proxy servers. There are a lot of
load balancing options out there: you might use something like haproxy,
pound or nginx.
With a load balancer in front of your proxy servers, your setup now
looks like this:
                                         +-------+         +-----------+
                                    +--> | Proxy | +--+--> |  Storage  |
                                    |    +-------+    |    +-----------+
                                    |                 |
+-----+  File  +---------------+    |    +-------+    |    +-----------+
| You | +----> | Load Balancer | +-----> | Proxy | +-----> |  Storage  |
+-----+        +---------------+    |    +-------+    |    +-----------+
                                    |                 |
                                    |    +-------+    |    +-----------+
                                    +--> | Proxy | +--+--> |  Storage  |
                                         +-------+         +-----------+
I noticed a problem when using nginx as a load balancer in front of
servers that are the target of large and numerous uploads. nginx buffers
the body of the request, and this is something that drives a lot of
discussion on the nginx mailing lists.
This effectively means that the file is uploaded twice. You upload a
file to nginx that acts as a reverse proxy/load balancer and nginx waits
until the file is finished uploading before sending the file to one of
the available backends. The buffering happens either in memory or in an
actual file, depending on configuration.
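To see why buffering hurts, here is a rough back-of-the-envelope model in Python. It is only a sketch and not how nginx works internally: it assumes ideal links, ignores disk I/O and protocol overhead, and the function names and the example numbers (a 1 GB file over 1 Gbps links) are made up for illustration.

```python
def transfer_seconds(size_bytes: float, link_bits_per_s: float) -> float:
    """Ideal time to push size_bytes over one link (no protocol overhead)."""
    return size_bytes * 8 / link_bits_per_s


def buffered_upload(size_bytes: float, client_link: float, backend_link: float) -> float:
    """Store-and-forward: the proxy receives the whole body, then sends it on."""
    return (transfer_seconds(size_bytes, client_link)
            + transfer_seconds(size_bytes, backend_link))


def streamed_upload(size_bytes: float, client_link: float, backend_link: float) -> float:
    """Unbuffered: bytes are relayed as they arrive, so the slower hop sets the pace."""
    return max(transfer_seconds(size_bytes, client_link),
               transfer_seconds(size_bytes, backend_link))


# Hypothetical example values: a 1 GB (10^9 bytes) file over two 1 Gbps links.
one_gb = 1e9
gigabit = 1e9

print(buffered_upload(one_gb, gigabit, gigabit))  # 16.0
print(streamed_upload(one_gb, gigabit, gigabit))  # 8.0
```

In this idealized model the buffered upload takes twice as long when both hops run at the same speed. Real numbers will be less clean, since disks, TCP and the backend itself all add their own overhead.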
Tengine was recently brought up on the Ceph mailing lists as part of
the solution to this problem, so I decided to give it a try and
see what kind of impact its unbuffered requests had on performance.
I uploaded a 1GB file to an object storage cluster with nginx 1.6.0 in
front. I then swapped it out for Tengine 1.5.2 and tried again. Swapping
web servers was as simple as uninstalling nginx and installing Tengine
from a package I built. The configuration I had was 100% compatible;
I only had to add configuration to disable request buffering.
The layout looked like this:
+----+  1GB File   +---------------+       +-------+       +-----------+
| Me | +---------> | Load Balancer | +---> | Proxy | +---> |  Storage  |
+----+             +---------------+       +-------+       +-----------+
                                     1Gbps           1Gbps
With nginx, the upload took 1 minute 13 seconds.
With Tengine, the upload took 41 seconds.
That’s a difference of more than 30 seconds!
I was blown away by the difference disabling the buffering made.
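For the curious, here is the same result expressed as throughput. This is a quick sketch that assumes "1GB" means 10^9 bytes and uses only the two timings above; the variable names are mine.

```python
FILE_SIZE_MB = 1000      # assumption: 1GB taken as 10^9 bytes, i.e. 1000 MB
NGINX_SECONDS = 73       # 1 minute 13 seconds
TENGINE_SECONDS = 41


def throughput_mb_s(size_mb: float, seconds: float) -> float:
    """Average upload rate in MB/s over the whole transfer."""
    return size_mb / seconds


nginx_rate = throughput_mb_s(FILE_SIZE_MB, NGINX_SECONDS)
tengine_rate = throughput_mb_s(FILE_SIZE_MB, TENGINE_SECONDS)
saved = NGINX_SECONDS - TENGINE_SECONDS

print(f"nginx:   {nginx_rate:.1f} MB/s")    # nginx:   13.7 MB/s
print(f"Tengine: {tengine_rate:.1f} MB/s")  # Tengine: 24.4 MB/s
print(f"saved:   {saved} s ({saved / NGINX_SECONDS:.0%} less time)")  # saved:   32 s (44% less time)
```

Both rates are well below what a 1Gbps link can carry in theory (about 125 MB/s), so the links themselves were not the bottleneck in this test.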
Tengine really was a drop-in replacement for nginx, much like
MariaDB 5.5 is for MySQL.
This blog now runs Tengine. Perhaps there is also a
bright future ahead for Taobao’s team?
It might just start making waves outside of China.
Let’s wait and see.
Six months have passed since Hong Kong and it is always really exciting to see all the folks from the community gathered together in a (bit chilly) convention center. As far as I saw from the submitted and accepted talks, Ceph continues its road to the top. There is still a huge and growing interest in Ceph. On Tuesday May 13th, Josh… Read more →
OpenStack Havana Cinder volumes associated with an RBD Ceph pool are bound to a host.
cinder service-list --host bm0014.the.re@rbd-ovh
+---------------+-----------------------+------+---------+-------+
| Binary        | Host                  | Zone | Status  | State |
+---------------+-----------------------+------+---------+-------+
| cinder-volume | bm0014.the.re@rbd-ovh | ovh  | enabled … Continue reading →
A few non-profit organizations (April, FSF France, tetaneutral.net…) and volunteers constantly research how to get compute, storage and bandwidth that are 100% Free Software, content neutral, low maintenance, reliable and cheap. The latest setup, in use since October 2013, is … Continue reading →
If you’re in Atlanta on the evening of Sunday, May 11th, 2014, for the OpenStack summit or any other reason, join us to celebrate the OpenStack Icehouse release and the Ceph Firefly release. There will be both OpenStack and Ceph developers present and … Continue reading →
Come chat with me about OpenStack, Swift and Ceph in Montreal on March 17th.
It’s with great pleasure that I accepted an invitation from my colleague
Rafael Rosa (@rafaelrosafu) to talk about Ceph in the context of OpenStack.
Our friends at eNovance will be talking about Swift, the object storage
project in OpenStack.
Should definitely be fun, it will in fact be my first public speech ever 😀
Come join us, register – time is running short:
Edit: It was fun! The presentation I did about Ceph is available on