Ceph is a storage system designed to be used at scale, with Ceph clusters in excess of 40 petabytes deployed today. At LinuxCon Europe, Allen Samuels, Engineering Fellow at Western Digital, says that Ceph has been proven to scale out reasonably well. Samuels says, “the most important thing that a storage management system does in the clustered world is to give you availability and durability,” and much of the technology in Ceph focuses on controlling the availability and the durability of your data. In his presentation, Samuels talks not just about some of the performance advantages of deploying Ceph on Flash, but also about what they are doing to optimize Ceph in future releases.
The most common way that people use Flash with Ceph today is to put the journal on Flash. Samuels mentions that this “significantly improves your write latencies because the first thing that Ceph is going to do, is to take your transaction and put it into the journal. By putting the journal on Flash, you’re able to get high performance and short latency.” Another option is to put the key-value store, the metadata, on Flash, but you may or may not get much of a performance improvement depending on your specific usage. In some cases, where you have very small objects, moving the metadata to Flash can have a significant benefit, but for larger objects, you may get very little improvement.
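The journal-first write path Samuels describes can be illustrated with a toy model. The sketch below is not Ceph code: the class name `JournaledStore` and its methods are hypothetical, and plain in-memory structures stand in for the Flash journal and the slower data device. It only shows why journal latency dominates write latency: the caller is acknowledged as soon as the transaction is in the journal, and the backing store is updated later.

```python
class JournaledStore:
    """Toy write-ahead journal (hypothetical, not Ceph's implementation)."""

    def __init__(self):
        self.journal = []   # stands in for the fast journal device (e.g. Flash)
        self.backing = {}   # stands in for the slower data device

    def write(self, key, value):
        # The transaction is recorded in the journal first; the caller is
        # acknowledged at this point, before the backing store is touched.
        self.journal.append((key, value))
        return "ack"

    def flush(self):
        # Later, journaled transactions are applied to the backing store
        # in the order they were received.
        while self.journal:
            key, value = self.journal.pop(0)
            self.backing[key] = value


store = JournaledStore()
store.write("obj1", b"hello")
pending = len(store.journal)   # journaled, but not yet applied
store.flush()
```

In this model the time a client waits is the time `write` takes, which depends only on the journal, so a faster journal device shortens the acknowledged write latency even if the data device stays the same.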
“Over the last couple of years … we’ve developed, together with the community, about a 15X performance boost,” Samuels says. Unfortunately, they’ve reached the point where they need to break compatibility to make further improvements, because the basic architecture of FileStore has become an issue. Samuels outlines a number of specific issues with FileStore, which can be found in the video of his talk, but the key takeaway is that it is being replaced by BlueStore. The good news is that for now the two can be intermixed within a cluster, so your new nodes can be set up to use BlueStore without breaking or needing to upgrade your existing FileStore nodes. However, Samuels points out that “if you update your software to the latest version and you expect it to suddenly start running better, you’ll be a little disappointed,” since you won’t see this improvement until you actively start using BlueStore. BlueStore is still under active development, but it is expected to be at least twice as fast as FileStore for write operations and to outperform FileStore for read operations, too. Additional functionality coming with BlueStore includes checksums on reads. Currently, Ceph relies on your hardware to provide data integrity, which can be a bit dangerous at scale.
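To make the idea of checksums on reads concrete, here is a minimal sketch of the general technique, assuming a simple in-memory store. It is not BlueStore's implementation: the names `ChecksummedStore`, `ChecksumError`, and `corrupt` are hypothetical, and `zlib.crc32` is used only because it ships with Python, not because it is the checksum Ceph uses. A checksum is stored alongside the data at write time and verified on every read, so silent corruption is reported instead of being returned to the caller.

```python
import zlib


class ChecksumError(Exception):
    """Raised when stored data no longer matches its recorded checksum."""


class ChecksummedStore:
    """Toy store that verifies a CRC on every read (hypothetical sketch)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        # Record the checksum together with the data at write time.
        self._blobs[key] = (bytes(data), zlib.crc32(data))

    def get(self, key):
        data, expected = self._blobs[key]
        # Recompute on read and compare against the stored value.
        if zlib.crc32(data) != expected:
            raise ChecksumError(f"checksum mismatch for {key!r}")
        return data

    def corrupt(self, key):
        # Test helper: flip one bit of the first byte to simulate
        # silent corruption in the underlying media.
        data, crc = self._blobs[key]
        self._blobs[key] = (bytes([data[0] ^ 0x01]) + data[1:], crc)
```

Without the check in `get`, the flipped bit would be handed back as valid data, which is the risk of relying on hardware alone for integrity.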
To get more details about how to improve the performance of Ceph using Flash or to hear more about additional improvements coming in future versions of Ceph with BlueStore, watch the video from LinuxCon Europe.