Ceph Days NYC

Bloomberg: Fostering a vibrant Ceph community at Ceph Days NYC

On Tuesday, February 21, 2023, Bloomberg hosted Ceph Days NYC. This day-long event was held at the Bloomberg Park Avenue office in New York City, and was dedicated to sharing knowledge about Ceph. Roughly fifty members of the open source Ceph community came together for a dozen presentations related to Ceph’s roadmap, tools, and use cases.

Speakers included operators, developers, and researchers from Canonical, SoftIron, Bloomberg, IBM, Clyso, and Platina.

Frank Yang of Platina shared the lessons that he learned when he worked in partnership with a major American sports league, using Ceph to archive a large and irreplaceable cache of videotaped live event data that had been compiled over decades.

Federico Lucifredi and Sage McTaggart of the Ceph Storage team at IBM explained various methods of hardening Ceph storage.

Bloomberg was especially pleased to host Dan van der Ster, who is CERN’s Data and Architecture lead and a member of the team that won the Nobel Prize for discovering the Higgs boson. During his talk, van der Ster reviewed the past decade of his team’s use of Ceph at CERN. He provided a wide-ranging account of the challenges of integrating Ceph into research lab hardware and working through bugs in CERN’s compression library, and he recounted the story of one particularly harrowing eight-hour day when he learned the importance of spreading data across multiple Ceph clusters.

Finally, event attendees joined a networking reception dedicated to fostering the Ceph community.

“Running stuff at scale is a complex task,” van der Ster said at the conclusion of his talk, “and it’s important to collaborate with other organizations and individuals who have experience.” We couldn’t agree more.

We look forward to upcoming opportunities to bring together this varied and dedicated group of Ceph users to learn from each other and to discuss and shape the future of this open source project.

Bringing Ceph to NYC!

Come find out why leading enterprises are adopting Ceph, why Ceph offers the lowest cost per gigabyte of storage, and how easy it is to deploy your own Ceph cluster!

Event description

A full-day event dedicated to sharing Ceph’s transformative power and fostering the vibrant Ceph community.

The expert Ceph team, Ceph’s customers and partners, and the Ceph community join forces to discuss topics such as the status of the Ceph project, recent improvements and the roadmap, and Ceph community news. The day ends with a networking reception to foster more Ceph learning.

Space is limited, so register soon.

Videos

Join the Ceph announcement list or follow Ceph on social media for updates.

Important Dates

  • CFP Opens: 2022-12-01
  • CFP Closes: 2023-01-01
  • Speakers receive confirmation of acceptance: 2023-01-16
  • Schedule Announcement: 2023-01-23
  • Event Date: 2023-02-21

Hotel Recommendations

Name | Location | Website
Andaz 5th Avenue | two blocks from the office | Andaz 5th Avenue Website
Hyatt Grand Central | around the corner from the office | Hyatt Grand Central Website
Library Hotel | one block from the office | Library Hotel Website
The Westin New York Grand Central | a couple of blocks from the office | The Westin New York Grand Central Website
The Kitano Hotel New York | Park Avenue and 38th Street | The Kitano Hotel New York Website

Schedule

9:00 AM: Welcoming
Bloomberg
9:10 AM: Community Update
Mike Perez
Ceph Foundation / IBM
9:15 AM: State of the Cephalopod

In this talk, we'll provide an update on the state of the Ceph upstream project, recent development efforts, current priorities, and community initiatives. We will share details of features released across components in the latest Ceph release, Quincy, and explain how this release is different from previous Ceph releases. The talk will also provide a sneak peek into features being planned for the next Ceph release, Reef.

Neha Ojha & Josh Durgin
IBM
9:45 AM: NVMe-over-Fabrics support for Ceph

NVMe-over-Fabrics (NVMeoF) is a widely adopted, de facto standard in remote block storage access. Ceph clients use the RADOS protocol to access RBD images, but there are good reasons to enable access via NVMeoF: to allow existing NVMeoF storage users to easily migrate to Ceph and to enable the use of NVMeoF offloading hardware. This talk presents our effort to provide native NVMeoF support for Ceph. We discuss some of the challenges, including multi-pathing for fault tolerance and performance.


Jonas Pfefferle
IBM Research
10:15 AM: Ceph crossing the chasm

The new generation of hybrid cloud provides a common platform across all your cloud, on-premises, and edge environments. That means you can skill once, build once, and manage from a single pane of glass. That also implies the platform needs to support diverse workloads and different levels of maturity in management skills. In this presentation, we will cover the open source projects and proposals to enhance Ceph's consumability and manageability to enable Ceph in more environments.


Vincent Hsu
IBM
10:45 AM: Break
11:00 AM: 100 Years of Sports on Ceph

Working together with a major American sports league, we built a multi-site 40 PB active archive housing over 100 years of game video and audio assets by using Ceph as the foundational storage technology. Along the way, we learned many lessons about architecting, deploying, and operationalizing Ceph from the vantage point of a large, modern, and rapidly growing media company. We would like to share our experience and learnings with the community to help others traveling a similar road.

Frank Yang & Adam Waters
Platina
11:30 AM: Ceph Telemetry - Observability in Action

To increase product observability and robustness, Ceph’s telemetry module allows users to automatically report anonymized data about their clusters. Ceph’s telemetry backend runs tools that analyze this data to help developers understand how Ceph is used and what problems users may be experiencing. In this session we will overview the various aspects of Ceph’s upstream telemetry and its benefits for users, and explore how telemetry can be deployed independently as a tool for fleet observability.


Yaarit Hatuka
IBM
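
As a loose illustration of how an operator might inspect the module from code rather than the CLI, here is a minimal sketch using the python3-rados bindings. The configuration path and keyring are assumptions, and the sketch only checks which ceph-mgr modules are enabled; it does not exercise telemetry reporting itself.

```python
import json
import rados

# Assumes a local /etc/ceph/ceph.conf and a usable client keyring.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    # "mgr module ls" reports which manager modules (telemetry included) are enabled.
    ret, outbuf, errs = cluster.mon_command(
        json.dumps({"prefix": "mgr module ls", "format": "json"}), b""
    )
    if ret != 0:
        raise RuntimeError(f"mgr module ls failed: {errs}")
    modules = json.loads(outbuf)
    enabled = modules.get("enabled_modules", [])
    print("telemetry enabled:", "telemetry" in enabled)
finally:
    cluster.shutdown()
```
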
12:00 PM: Lunch
1:00 PM: Why We Built A “Message-Driven Telemetry System At Scale” Ceph Cluster

Ceph’s Prometheus module provides performance counter metrics via the ceph-mgr component. While this works well for smaller installations, it can be problematic to put metric workloads into ceph-mgr at scale. Ceph is just one component of our internal S3 product. We also need to gather telemetry data about space, objects per bucket, buckets per tenancy, etc., as well as telemetry from a software-defined distributed quality of service (QoS) system which is not natively supported by Ceph.


Nathan Hoad
Bloomberg
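
As a rough illustration of the kind of per-bucket telemetry the talk describes, the sketch below counts objects per bucket through the S3 API with boto3 and emits each count as a JSON message. The endpoint URL and credentials are placeholders; this is not Bloomberg's system, just a toy single-process collector that a real deployment would replace with a scalable, message-driven pipeline.

```python
import json
import boto3

# Placeholder RGW/S3 endpoint and credentials (not a real deployment).
s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:8080",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

paginator = s3.get_paginator("list_objects_v2")
for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    count = 0
    for page in paginator.paginate(Bucket=name):
        count += page.get("KeyCount", 0)
    # Emit one telemetry "message" per bucket; a real system would publish
    # these to a message bus instead of printing them.
    print(json.dumps({"metric": "objects_per_bucket", "bucket": name, "value": count}))
```
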
1:30 PM: Introducing Sibench: A New Open Source Benchmarking Tool Optimized for Ceph

Benchmarking Ceph has always been a complex task: there are lots of tools, but many have drawbacks or are written for more general-purpose use. For Ceph we need to benchmark librados, RBD, CephFS, and RGW, and each of these protocols has unique challenges and typical deployment scenarios. Moreover, Ceph works best at scale, so the benchmarking system itself must also scale and be able to generate adequate load against large clusters.


Danny Abukalam
SoftIron
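
To make the scaling point concrete, here is a deliberately naive single-client librados write loop using the python3-rados bindings. The pool name and object size are arbitrary assumptions, and this is not Sibench; a distributed tool exists precisely because one client loop like this cannot generate enough load to stress a large cluster.

```python
import time
import rados

POOL = "benchpool"              # assumed test pool
OBJECT_SIZE = 4 * 1024 * 1024   # 4 MiB objects
COUNT = 64

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx(POOL)
try:
    payload = b"\0" * OBJECT_SIZE
    start = time.perf_counter()
    for i in range(COUNT):
        ioctx.write_full(f"bench_obj_{i}", payload)
    elapsed = time.perf_counter() - start
    mib = COUNT * OBJECT_SIZE / (1024 * 1024)
    print(f"wrote {mib:.0f} MiB in {elapsed:.2f}s ({mib / elapsed:.1f} MiB/s)")
    # Clean up the benchmark objects.
    for i in range(COUNT):
        ioctx.remove_object(f"bench_obj_{i}")
finally:
    ioctx.close()
    cluster.shutdown()
```
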
2:00 PM: Optimizing RGW Object Storage Mixed Media through Storage Classes and Lua Scripting

Ceph enables flexible and scalable object storage of unstructured data for a wide variety of workloads. RGW (RADOS Gateway) deployments see a wide range of object sizes and must balance workload, cost, and performance requirements. S3 storage classes are an established way to steer data onto underlying media that meet specific resilience, cost, and performance requirements. One might, for example, define RGW back-end storage classes for SSD or HDD media, or for non-redundant, replicated, or erasure-coded pools. Diverting individual objects or entire buckets into a non-default storage class usually requires specific client action. Compliance, however, can be awkward to request and impossible to enforce, especially in multi-tenant deployments that may include paying customers as well as internal users. This work enables the RGW back end to enforce a storage class on uploaded objects based on specific criteria, without requiring client action. For example, one might define a default storage class on performant TLC or Optane media for resource-intensive small S3 objects while assigning larger objects to cost-effective QLC SSD media.


Anthony D'Atri
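
For contrast with the server-side approach described in the talk above, this is what the usual client-side action looks like: a hedged boto3 sketch that explicitly requests a non-default storage class at upload time. The endpoint, credentials, bucket, key, and the "COLD" class name are assumptions, and the storage class would have to exist in the RGW placement configuration.

```python
import boto3

# Placeholder RGW endpoint and credentials.
s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:8080",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# The client explicitly steers this object to a hypothetical "COLD" storage class.
# The Lua-based enforcement in the talk removes the need for this per-request choice.
with open("opening-day.mp4", "rb") as body:
    s3.put_object(
        Bucket="media-archive",
        Key="events/2023/opening-day.mp4",
        Body=body,
        StorageClass="COLD",
    )
```
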
2:30 PM: Ceph at CERN: A Ten-Year Retrospective

In 2013, the data storage team at CERN began investigating Ceph to solve an emerging problem: how to provide reliable, flexible, future-proof storage for our growing on-premises OpenStack cloud. Beginning with a humble 3PB cluster, the infrastructure has grown to support the entire lab, with 50PB of storage across multiple data centres serving use cases ranging from basic IT applications and databases to HPC and cloud storage.


Dan van der Ster
CERN
3:00 PM: Break
3:15 PM: An Introduction to MicroCeph

Building up a Ceph cluster can be a bit tricky and time-consuming, especially if it’s just for testing or a small home lab. To make this much easier, we’ve started working on MicroCeph. It's a snap package with a small management daemon that makes clustering multiple systems very easy and, combined with a simple bootstrap process, lets you set up a Ceph cluster in just a few minutes!

Chris MacNaughton
Canonical
3:30 PM: SQL on Ceph

Ceph was originally designed to fill a need for a distributed file system within scientific computing environments but has since grown to become a dominant unified software-defined distributed storage system. This talk will cover the new development of an SQLite Virtual File System (VFS) on top of Ceph's distributed object store (RADOS). I will show how SQL can now be run on Ceph for both its internal use and for new application storage requirements.


Patrick Donnelly
IBM
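
As a small sketch of what this can look like from an application, the snippet below loads the libcephsqlite extension into Python's sqlite3 module and opens a database stored in a RADOS pool. The pool name, the extension file name, and the pool:namespace/filename URI form are assumptions based on how libcephsqlite is commonly described, so check the Ceph documentation for your release before relying on them.

```python
import sqlite3

# Register the Ceph VFS by loading the libcephsqlite extension (assumed to be
# installed and resolvable by the dynamic loader as "libcephsqlite.so").
boot = sqlite3.connect(":memory:")
boot.enable_load_extension(True)
boot.load_extension("libcephsqlite.so")
boot.enable_load_extension(False)

# Open a database stored directly in RADOS: assumed pool "mypool", empty
# namespace, database file "telemetry.db".
db = sqlite3.connect("file:///mypool:/telemetry.db?vfs=ceph", uri=True)
db.execute("CREATE TABLE IF NOT EXISTS readings (ts INTEGER, value REAL)")
db.execute("INSERT INTO readings VALUES (strftime('%s','now'), 42.0)")
db.commit()
print(db.execute("SELECT COUNT(*) FROM readings").fetchone())
db.close()
```
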
4:00 PM: Data Security and Storage Hardening in Rook and Ceph

We explore the security model exposed by Rook with Ceph, the leading software-defined storage platform of the open source world. Digging progressively deeper into the stack, we examine options for hardening Ceph storage that are appropriate for a variety of threat profiles.


Federico Lucifredi & Sage McTaggart
IBM
4:30 PM: Dynamic multi-cluster management with Rook for cloud-native IaaS providers in private clouds

Over the last few years, we have been gaining experience with Rook in production. One of our challenges was to implement dynamic resource management across more than 50 Ceph clusters. Kubernetes events drive the redistribution of load and capacity between Ceph clusters dynamically and fully automatically: single or multiple Ceph nodes are removed from a cluster while data integrity is ensured at all times, and the released nodes are then integrated into other Ceph clusters as needed.


Joachim Kraftmayer
Clyso
5:00 PM: Evening Event