January 17, 2017

OpenStack Swift: Scalable and Durable Object Storage

In his LinuxCon Europe talk, Christian Schwede from Red Hat talked about how Swift is deployed at large enterprise companies.

OpenStack Swift is modeled after Alpine swifts, birds that can stay in the air for months at a time without coming down and that even eat and drink while flying. Like the birds, OpenStack Swift is designed for maximum uptime: it should keep serving data to your users without stopping, even if parts of your cluster are down. With Swift, you should still be able to store new data, and even upgrade your cluster in production, without downtime.

In his LinuxCon Europe talk, Christian Schwede from Red Hat described how Swift is deployed at large enterprises, many of which operate at a scale of multiple petabytes. The biggest deployment is at Rackspace, the original founders of the project, where they run a system of more than 100 petabytes; the second biggest is at OVH, a French hosting provider.

Swift’s highly available, durable, and scalable object storage lets you retrieve existing data and store new data even when part of your cluster fails. It does this by replicating your data across a variety of servers, zones, and regions, so that copies are spread over different disks, servers, power supplies, buildings, data centers, and geographical areas. There are also several checks in place to help make sure that the data is properly stored and hasn’t disappeared or degraded over time. Schwede mentioned that one method relies on the checksum that Swift computes and stores along with your object. If one copy of an object fails that check, Swift can serve another replica instead, so that only a good object is returned. When it finds a bad copy, Swift can replace it with a valid replica.
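The checksum idea can be illustrated with a small sketch. This is not Swift's own code: the function names, the in-memory dictionary standing in for storage nodes, and the choice of MD5 as the hash are assumptions made for illustration only.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex digest used as the checksum in this sketch."""
    return hashlib.md5(data).hexdigest()


def read_and_repair(replicas: dict, expected: str) -> bytes:
    """Return a copy whose checksum matches and overwrite any bad copies.

    `replicas` maps a hypothetical node name to the bytes stored there;
    `expected` is the checksum recorded when the object was written.
    """
    # Find the first replica that still matches the stored checksum.
    good = next(
        (data for data in replicas.values() if md5_hex(data) == expected),
        None,
    )
    if good is None:
        raise IOError("no valid replica found")

    # Replace every corrupted copy with the known-good bytes.
    for node, data in replicas.items():
        if md5_hex(data) != expected:
            replicas[node] = good
    return good
```

A caller would record `md5_hex(payload)` at write time and pass it as `expected` on later reads. A real cluster audits and repairs copies in the background rather than on every read, which this sketch does not attempt to model.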

While Swift provides the tools to manage your data replication, you still need an operator to help Swift decide where to store your data and when to create new copies. Schwede provided this example: if a storage node goes missing, Swift doesn’t know if this is routine maintenance where the node will re-appear in a few minutes or a disaster that caused total loss of the node. However, Swift still keeps everything balanced and running as smoothly as possible until it has instructions for how to handle the issue.

Schwede went on to talk about the Swift proxy server, which is the gateway to your cluster and the way your users access it. The proxy server has built-in middleware for things like container sync, bulk operations, authentication, large objects, and more. If a feature you need is missing, you can also write your own middleware. The last part of Schwede’s talk includes a demo of how to get started using Swift along with a few dos and don’ts for using it.
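As a rough idea of what writing your own middleware might look like, here is a minimal sketch in plain WSGI style, the interface Swift's proxy pipeline is built on. The `RequestCounter` class and the demo application are hypothetical and use no Swift imports; the `filter_factory` function follows the paste.deploy filter convention, and how you would register it in a proxy configuration is not shown here.

```python
class RequestCounter:
    """Hypothetical middleware that counts requests per HTTP method."""

    def __init__(self, app, conf=None):
        self.app = app          # next WSGI application in the pipeline
        self.conf = conf or {}  # options passed in from configuration
        self.counts = {}

    def __call__(self, environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        self.counts[method] = self.counts.get(method, 0) + 1
        # Hand the request on unchanged to the wrapped application.
        return self.app(environ, start_response)


def filter_factory(global_conf, **local_conf):
    """paste.deploy-style factory returning a function that wraps an app."""
    conf = dict(global_conf, **local_conf)

    def factory(app):
        return RequestCounter(app, conf)

    return factory


def demo_app(environ, start_response):
    """Stand-in for the rest of the pipeline (illustration only)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]
```

Wrapping is then a matter of `wrapped = filter_factory({})(demo_app)`; each call to `wrapped` updates `wrapped.counts` before passing the request along.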

Watch the full video of this talk for more details and the demo!

