Simplifying Ceph Object Deployments: Production-ready Ceph Object Gateway (RGW) with New Cephadm Features
Introduction
Deploying a production-ready object storage solution can be challenging, particularly when managing complex requirements including SSL/TLS encryption, optimal data placement, and multisite replication. During deployment, it’s easy to overlook configuration options that become crucial once the system is live in production.
Traditionally, configuring Ceph for high availability, security, and efficient data handling required users to manually adjust multiple parameters for features such as multisite replication, encryption, and high availability. This initial complexity made it tedious to achieve a production-ready object storage configuration.
To tackle these challenges, we have introduced several new features to Ceph's orchestrator that simplify the deployment of Ceph RGW and its associated services. Enhancing the Ceph Object Gateway and Ingress service specification files enables an out-of-the-box, production-ready RGW setup with just a few configuration steps. These enhancements include automated SSL/TLS configurations, virtual host bucket access support, erasure coding for cost-effective data storage, and more.
These improvements aim to provide administrators with a seamless deployment experience that ensures secure, scalable, and production-ready configurations for the Ceph Object Gateway and Ingress Service (load balancer).
In this blog post, we'll explore each of these new features, discuss the problems they solve, and demonstrate how they can be easily configured using cephadm spec files to achieve a fully operational Ceph Object Gateway setup in minutes.
Feature Highlights
Virtual Host Bucket Access and Self-Signed Certificates
One of the major challenges in deploying RGW is ensuring seamless access to buckets using virtual host-style URLs. For applications and users that rely on virtual host bucket access, proper SSL/TLS certificates that include the necessary Subject Alternative Names (SANs) are crucial. To simplify this, we've added the option to automatically generate self-signed certificates for the Object Gateway if the user does not provide custom certificates. These self-signed certificates include SAN entries that allow TLS/SSL to work seamlessly with virtual host bucket access.
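Once such a gateway is deployed, you can check that the generated certificate carries the expected SAN entries. A quick inspection with openssl (using the example hostname and RGW port from the walkthrough later in this post) could look like this:
# echo | openssl s_client -connect s3.cephlab.com:4443 2>/dev/null | openssl x509 -noout -ext subjectAltName
The output should list DNS entries for both the base domain and the wildcard (for example, DNS:s3.cephlab.com and DNS:*.s3.cephlab.com), which is what allows bucket.s3.cephlab.com-style URLs to pass TLS hostname validation.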
Complete TLS/SSL Client-to-RGW Encryption
Security is a top priority for any production-grade deployment, and the Ceph community has increasingly requested full TLS/SSL encryption from the client to the Object Gateway service. Previously, our ingress implementation only supported terminating SSL at the HAProxy level, which meant that communication between HAProxy and RGW could not be encrypted.
To address this, we've added configurable options that allow users to choose whether to re-encrypt traffic between HAProxy and RGW or to use passthrough mode, where the TLS connection remains intact from the client to RGW. This flexibility allows users to achieve complete end-to-end encryption, ensuring sensitive data is always protected in transit.
Multisite Replication Configuration
In the past, Ceph multisite deployments involved running many commands to configure the realm, zonegroup, and zone, and to establish the replication relationship between the participating zones. Thanks to the RGW manager module, multisite bootstrap and configuration can now be done in two steps. There is an example in the Object Storage Replication blog post.
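As a rough sketch of those two steps (the spec file names below are placeholders; see the linked post for the full procedure): on the primary cluster, ceph rgw realm bootstrap creates the realm, zonegroup, and zone and deploys the gateways, and ceph rgw realm tokens prints a realm token; on the secondary cluster, ceph rgw zone create, fed a spec that references that token, joins the realm and deploys the secondary zone.
# ceph rgw realm bootstrap -i rgw-primary.spec
# ceph rgw realm tokens
# ceph rgw zone create -i rgw-secondary.spec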
In the Squid release, we have also added the ability to dedicate Object Gateways to client traffic only, through the cephadm spec file with the RGW spec file option:
disable_multisite_sync_traffic: True
The advantages of dedicating Ceph Object Gateways to specific tasks are covered in the blog post: Ceph Object Storage Multisite Replication Series. Part Three
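As an illustration, a zone could run two RGW services, one dedicated to client traffic and one left handling multisite sync; the service IDs and placement labels below are only examples:
service_type: rgw
service_id: client-traffic
placement:
  label: rgw-client
spec:
  rgw_realm: multisite
  rgw_zone: zone1
  rgw_zonegroup: multizg
  disable_multisite_sync_traffic: True
---
service_type: rgw
service_id: sync-traffic
placement:
  label: rgw-sync
spec:
  rgw_realm: multisite
  rgw_zone: zone1
  rgw_zonegroup: multizg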
Configuring the Erasure-Coded Data Pool in the Spec File During Bootstrap
Object storage deployments often use erasure coding for the data pool to reduce the total cost of ownership (TCO) of the solution. We have included options for configuring erasure-coded (EC) pools in the spec file. This allows users to define the EC profile, device class, and failure domain for RGW data pools, providing control over data placement and storage efficiency.
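The walkthrough below sets the EC parameters through data_pool_attributes in the spec file. For reference, the resulting data pool is backed by an ordinary erasure-code profile, equivalent to one you could create by hand with something like the following (the profile name and device class are examples):
# ceph osd erasure-code-profile set zone1-data-ec k=2 m=2 crush-failure-domain=host crush-device-class=hdd
These are the same knobs (chunk counts, failure domain, device class) that the spec-file options control, so data placement can be decided without leaving the spec file.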
Ceph Object Deployment Walkthrough
If you are new to Ceph and cephadm, the Automating Ceph Cluster Deployments with Ceph: A Step-by-Step Guide Using Cephadm and Ansible (Part 1) blog post will give you a good overview of cephadm and how we can define the desired state of Ceph services in a declarative YAML spec file to deploy and configure Ceph.
Below, we'll walk through the CLI commands required to deploy a production-ready RGW setup using the new features added to the cephadm orchestrator.
Enabling the RGW Manager Module
The first step is to enable the RGW manager module, which is required to manage RGW services through cephadm.
# ceph mgr module enable rgw
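To confirm that the module is active before moving on, list the manager modules and look for rgw:
# ceph mgr module ls | grep rgw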
Next, we create a spec file for the Object Gateway service. This spec file includes the realm, zone, and zonegroup settings, the SSL/TLS configuration, the EC profile for the data pool, and more.
# cat << EOF > /root/rgw-client.spec
service_type: rgw
service_id: client
service_name: rgw.client
placement:
  label: rgw
  count_per_host: 1
networks:
- 192.168.122.0/24
spec:
  rgw_frontend_port: 4443
  rgw_realm: multisite
  rgw_zone: zone1
  rgw_zonegroup: multizg
  generate_cert: true
  ssl: true
  zonegroup_hostnames:
  - s3.cephlab.com
  data_pool_attributes:
    type: ec
    k: 2
    m: 2
extra_container_args:
- "--stop-timeout=120"
config:
  rgw_exit_timeout_secs: "120"
  rgw_graceful_stop: true
EOF
In this spec file we specify that the RGW service should use erasure coding with a 2+2 profile (k: 2, m: 2) for the data pool, which reduces storage costs compared to a replicated setup. We also generate a self-signed certificate (generate_cert: true) for the RGW service to ensure secure SSL/TLS communication. With zonegroup_hostnames, we enable virtual host-style bucket access under the specified domain, so a bucket named bucket is reachable at bucket.s3.cephlab.com. Finally, the rgw_graceful_stop config parameter enables graceful stopping of the Object Gateway services: during a graceful stop, the service waits until all client connections are closed (drained), subject to the 120-second timeout set by rgw_exit_timeout_secs and the matching container --stop-timeout.
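Once the gateways are deployed and DNS for s3.cephlab.com is in place, S3 clients can take advantage of virtual host-style addressing. As an illustration with the AWS CLI (the bucket name is a placeholder, credentials are assumed to be configured, and the cephadm CA must be trusted as shown later in this post):
# aws configure set default.s3.addressing_style virtual
# aws --endpoint-url https://s3.cephlab.com s3 mb s3://mybucket
# aws --endpoint-url https://s3.cephlab.com s3 ls
With virtual addressing enabled, the CLI sends requests to mybucket.s3.cephlab.com, which the wildcard SAN in the generated certificate covers.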
Bootstrapping the RGW Realm
Once the spec file is created, we bootstrap the RGW realm. This step creates the realm, zonegroup, and zone and deploys the RGW services with the configuration specified in our spec file.
# ceph rgw realm bootstrap -i rgw-client.spec
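Optionally, before checking the daemons, you can confirm that the multisite entities defined in the spec were created:
# radosgw-admin realm list
# radosgw-admin zonegroup list
# radosgw-admin zone list
The output should include the multisite realm, the multizg zonegroup, and the zone1 zone from our spec file.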
Verifying the RGW Services
The realm bootstrap command applies the configuration defined in our spec file asynchronously. Soon the RGW services will be up and running, and we can verify their status using the ceph orch ps command.
# ceph orch ps --daemon_type rgw
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
rgw.client.ceph-node-05.yquamf ceph-node-05.cephlab.com 192.168.122.175:4443 running (32m) 94s ago 32m 91.2M - 19.2.0-53.el9cp fda78a7e8502 a0c39856ddd8
rgw.client.ceph-node-06.zfsutg ceph-node-06.cephlab.com 192.168.122.214:4443 running (32m) 94s ago 32m 92.9M - 19.2.0-53.el9cp fda78a7e8502 82c21d350cb7
This output shows that the RGW services run on the specified nodes and are accessible on the configured 4443/tcp port.
Verifying the Data Pool
To verify that the RGW data pools are correctly configured with erasure coding, we can use the following command:
# ceph osd pool ls detail | grep data
pool 24 'zone1.rgw.buckets.data' erasure profile zone1_zone_data_pool_ec_profile size 4 min_size 3 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 258 lfor 0/0/256 flags hashpspool stripe_width 8192 application rgw
Viewing the Erasure Code Profile
To get more details about the erasure code profile used for the data pool, we can run the following:
# ceph osd erasure-code-profile get zone1_zone_data_pool_ec_profile
crush-device-class=
crush-failure-domain=host
crush-num-failure-domains=0
crush-osds-per-failure-domain=0
crush-root=default
jerasure-per-chunk-alignment=false
k=2
m=2
plugin=jerasure
technique=reed_sol_van
w=8
This confirms that the erasure code profile is configured with k=2 and m=2 and uses the Reed-Solomon technique.
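As a quick sanity check on the cost savings: with k=2 and m=2, each object is split into two data chunks plus two coding chunks, so the raw-to-usable ratio is (k+m)/k = 4/2 = 2.0x, while the pool still tolerates the loss of any two hosts (the failure domain above). A replica-3 pool, by comparison, consumes 3.0x raw capacity for the same two-failure tolerance.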
Configuring the Ingress Service
Finally, we must configure the ingress service to load balance traffic to multiple RGW daemons. We create a spec file for the ingress service:
# cat << EOF > rgw-ingress.yaml
service_type: ingress
service_id: rgw
placement:
  hosts:
  - ceph-node-06.cephlab.com
  - ceph-node-07.cephlab.com
spec:
  backend_service: rgw.client
  virtual_ip: 192.168.122.152/24
  frontend_port: 443
  monitor_port: 1967
  use_tcp_mode_over_rgw: True
EOF
This spec file sets up the ingress service with the virtual (floating) IP address (VIP) 192.168.122.152 and specifies TCP mode (use_tcp_mode_over_rgw) for communication with the Object Gateway, ensuring that SSL/TLS is maintained all the way from the client to RGW. With backend_service we specify which RGW service to use as the HAProxy backend, since a Ceph cluster can run multiple, unrelated RGW services.
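We then apply the spec so that cephadm deploys the HAProxy and keepalived daemons:
# ceph orch apply -i rgw-ingress.yaml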
Testing Load Balancer and SSL/TLS Configuration
Our ingress service stack uses keepalived for high availability of the VIP, while HAProxy takes care of the load balancing:
# ceph orch ps --service_name ingress.rgw
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
haproxy.rgw.ceph-node-06.vooxuh ceph-node-06.cephlab.com *:443,1967 running (58s) 46s ago 58s 5477k - 2.4.22-f8e3218 0d25561e922f 4cd458e1f6b0
haproxy.rgw.ceph-node-07.krdmsb ceph-node-07.cephlab.com *:443,1967 running (56s) 46s ago 56s 5473k - 2.4.22-f8e3218 0d25561e922f 4d18247e7615
keepalived.rgw.ceph-node-06.cwraia ceph-node-06.cephlab.com running (55s) 46s ago 55s 1602k - 2.2.8 6926947c161f 50fd6cf57187
keepalived.rgw.ceph-node-07.svljiw ceph-node-07.cephlab.com running (53s) 46s ago 53s 1598k - 2.2.8 6926947c161f aaab5d79ffdd
When we check the HAProxy configuration on ceph-node-06, where the service is running, we can confirm that TCP passthrough is used for the backend configuration of our Object Gateway services.
# ssh ceph-node-06.cephlab.com cat /var/lib/ceph/93d766b0-ae6f-11ef-a800-525400ac92a7/haproxy.rgw.ceph-node-06.vooxuh/haproxy/haproxy.cfg | grep -A 10 "frontend frontend"
...
backend backend
mode tcp
balance roundrobin
option ssl-hello-chk
server rgw.client.ceph-node-05.yquamf 192.168.122.175:4443 check weight 100 inter 2s
server rgw.client.ceph-node-06.zfsutg 192.168.122.214:4443 check weight 100 inter 2s
To verify that the SSL/TLS configuration is working correctly, we can use curl to test the endpoint. We can see that the CA is not trusted by the client system where we are running the curl command:
# curl https://192.168.122.152
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it.
To fix this, we need to add the cephadm root CA certificate to the trusted store of our client system:
# ceph orch cert-store get cert cephadm_root_ca_cert > /etc/pki/ca-trust/source/anchors/cephadm-root-ca.crt
# update-ca-trust
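Alternatively, if you prefer not to modify the system trust store, you can point curl at the extracted CA certificate directly:
# curl --cacert /etc/pki/ca-trust/source/anchors/cephadm-root-ca.crt https://s3.cephlab.com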
After updating the trusted store, we can test again:
# curl https://s3.cephlab.com
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID></Owner><Buckets></Buckets></ListAllMyBucketsResult>
This confirms that the SSL/TLS self-signed certificate configuration works correctly and that the RGW service is accessible over HTTPS. As you can see, we have configured our DNS subdomain s3.cephlab.com and the wildcard *.s3.cephlab.com to point to our VIP address 192.168.122.152. It's also worth mentioning that you can configure more than one VIP address so that not all traffic goes through a single HAProxy load-balancer node; when using a list of VIPs, use the virtual_ips_list option instead of virtual_ip, as sketched below.
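For reference, a sketch of the same ingress spec using a list of VIPs instead of a single virtual_ip (the second address is an example):
service_type: ingress
service_id: rgw
placement:
  hosts:
  - ceph-node-06.cephlab.com
  - ceph-node-07.cephlab.com
spec:
  backend_service: rgw.client
  virtual_ips_list:
  - 192.168.122.152/24
  - 192.168.122.153/24
  frontend_port: 443
  monitor_port: 1967
  use_tcp_mode_over_rgw: True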
Conclusion
These new features in the cephadm orchestrator represent significant steps forward in making Ceph RGW deployments more accessible, secure, and production-ready. By automating complex configurations, such as SSL/TLS encryption, virtual host bucket access, multisite replication, and erasure coding, administrators can now deploy an RGW setup ready for production with minimal manual intervention.
For further details on the Squid release, check Laura Flores' blog post.
Note that some features described here may not be available before the Squid 19.2.2 release.
Footnote
The authors would like to thank IBM for supporting the community by facilitating our time to create these posts.