OpenStack Object Storage is Great For…

Soon, the OpenStack Object Storage software will be released. It’s available now as a Developer Preview if you would like to contribute, or perhaps if you’re just curious. The first release is expected later this month. This is a fantastic piece of software that really hits the mark for scalability, high availability, and performance.

About OpenStack Object Storage
OpenStack Object Storage was originally developed by Rackspace, and was released as Open Source Software earlier this year as part of the OpenStack Project. It was written for hosting the Rackspace Cloud Files service. Its original project code name was swift, so you may see references to that in various documentation.

OpenStack Object Storage aggregates commodity servers to work together in clusters for reliable, redundant, and large-scale storage of static objects. Objects are written to multiple hardware devices in the datacenter, with the OpenStack software responsible for ensuring data replication and integrity across the cluster. Storage clusters can scale horizontally by adding new nodes, which are automatically configured. Should a node fail, OpenStack works to replicate its content from other active nodes. Because OpenStack uses software logic to ensure data replication and distribution across different devices, inexpensive commodity hard drives and servers can be used in lieu of more expensive equipment. [1]

The system uses a flat namespace, and has the concepts of an account (how you access the system), a container (like a directory) and an object (like a file). You can have an arbitrary number of accounts, each with an arbitrary number of containers. Each container can hold an arbitrary number of objects.

OpenStack Object Storage is very good at storing unstructured data using an object name as a lookup key (like a filename). You access your data from a web client using the web service REST API, not like a filesystem. You download an object (like a file) using an HTTP GET request, fetch object metadata with an HTTP HEAD request, delete an object with an HTTP DELETE request, etc. There are multiple language bindings so you can access your files in OpenStack Object Storage from your favorite language natively (Java, Python, Perl, PHP, .NET, etc.).
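To make the request pattern above concrete, here is a minimal Python sketch that builds (but does not send) GET, HEAD, and DELETE requests for a single object. The storage URL, account name, and token are hypothetical placeholders, and the use of the X-Auth-Token header is an assumption about how your deployment hands out storage tokens. In practice one of the language bindings would do this for you.

```python
import urllib.parse
import urllib.request

# Hypothetical values for illustration only; your deployment supplies its own.
STORAGE_URL = "https://storage.example.com/v1/AUTH_demo"
AUTH_TOKEN = "example-token"


def build_request(method, container, obj, token=AUTH_TOKEN, base=STORAGE_URL):
    """Build an HTTP request for one object: <account URL>/<container>/<object>."""
    url = "{}/{}/{}".format(
        base, urllib.parse.quote(container), urllib.parse.quote(obj)
    )
    request = urllib.request.Request(url, method=method)
    # Assumption: the storage token travels in the X-Auth-Token header.
    request.add_header("X-Auth-Token", token)
    return request


download = build_request("GET", "photos", "beach.jpg")     # fetch the object
metadata = build_request("HEAD", "photos", "beach.jpg")    # fetch metadata only
remove = build_request("DELETE", "photos", "beach.jpg")    # delete the object

# To actually send one of them against a live cluster:
#     with urllib.request.urlopen(download) as response:
#         data = response.read()
```

Note that the only thing that changes between the three operations is the HTTP method. The URL is simply the account, the container, and the object name.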

The system has no central point of failure, so it’s extremely fault tolerant, and the data and related metadata are distributed throughout the system, so there are no central scalability constraints. You can store arbitrary amounts of data in the system in both large and small sizes. It performs very well, even under very high levels of concurrency. It keeps multiple replicas of each object, so it’s reliable, and the storage is very durable, without any expensive hardware. You don’t need any RAID on any of the servers unless you want it for additional performance.

Use OpenStack Object Storage For…
Here are some good use cases for OpenStack Object Storage:

Storing media libraries (photos, music, videos, etc.)
Archiving video surveillance files
Archiving phone call audio recordings
Archiving compressed log files
Archiving backups (<5GB each object)
Storing and loading of OS Images, etc.
Storing file populations that grow continuously on a practically infinite basis.
Storing small files (<50 KB). OpenStack Object Storage is great at this.
Storing billions of files.
Storing Petabytes (millions of Gigabytes) of data.

Recognize the Limitations
Objects must be <5GB

This is an arbitrary size limit, but it cannot be set to an unlimited value because of the system design. If you want to store a backup or anything else larger than 5GB, you’ll need a way of breaking it up into chunks, and of storing a manifest of the parts so you can join them back together later when you want to download the data and use it again.
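One way to approach the chunk-and-manifest idea is sketched below in Python. This is only an illustration, not something the system provides: the part naming scheme, the manifest layout, and the put_object/get_object callbacks are all hypothetical stand-ins for whatever upload and download calls your language binding offers.

```python
import hashlib
import io


def upload_in_chunks(stream, base_name, chunk_size, put_object):
    """Read `stream` in pieces of at most `chunk_size` bytes, store each piece
    with put_object(name, data), and return a manifest listing the parts in order."""
    manifest = []
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        name = "{}.part{:06d}".format(base_name, index)  # hypothetical naming scheme
        put_object(name, data)
        manifest.append(
            {"name": name, "size": len(data), "md5": hashlib.md5(data).hexdigest()}
        )
        index += 1
    return manifest


def download_from_chunks(manifest, get_object, out):
    """Fetch every part named in the manifest, verify its checksum,
    and write the parts to `out` in their original order."""
    for entry in manifest:
        data = get_object(entry["name"])
        if hashlib.md5(data).hexdigest() != entry["md5"]:
            raise ValueError("checksum mismatch for " + entry["name"])
        out.write(data)


# Usage with a plain dict standing in for a container:
store = {}
manifest = upload_in_chunks(io.BytesIO(b"abcdefghij"), "backup.tar", 4, store.__setitem__)
restored = io.BytesIO()
download_from_chunks(manifest, store.__getitem__, restored)
```

The manifest is an ordinary list of dictionaries, so it can be serialized (for example as JSON) and stored as one more small object next to the parts.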

Not a Filesystem

It uses a REST API, or a language binding that consumes the REST API. It does not use the typical POSIX filesystem semantics like open(), read(), write(), seek(), and close().

No User Quotas

There are no maximums that can be configured on a per-user basis to limit how much storage is used.

No Directory Hierarchies

You can create an arbitrary number of containers, but there is no nested container capability. You can simulate a directory structure using creative object names, but an object name is limited to a maximum string length. If you only need a shallow hierarchy, or don’t have long directory names, this might be fine. Just remember that I warned you this is generally a bad idea.
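If you do decide to simulate directories with object names, the bookkeeping might look like the Python sketch below. It works purely on the client side, on a flat list of object names you have already fetched from a container listing. The function name and the choice of "/" as the separator are assumptions made for the example.

```python
def list_directory(object_names, prefix, delimiter="/"):
    """Return the immediate children of a simulated directory.

    `prefix` is the simulated directory path, ending in the delimiter
    (or "" for the top level). Sub-directories are returned with a
    trailing delimiter so they can be told apart from plain objects.
    """
    children = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue  # the name is exactly the prefix itself
        head, sep, _ = rest.partition(delimiter)
        children.add(head + sep)
    return sorted(children)


names = [
    "photos/2010/beach.jpg",
    "photos/2010/city.jpg",
    "photos/2011/snow.jpg",
    "photos/index.html",
    "music/song.mp3",
]
top_level = list_directory(names, "")            # ['music/', 'photos/']
in_photos = list_directory(names, "photos/")     # ['2010/', '2011/', 'index.html']
```

Every level of nesting makes the object name longer, which is where the string length limit starts to bite.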

No writing to a byte offset in a file

The only way to update a file is to essentially overwrite it. The system creates a new version of an object each time you upload one with the same name.

No ACLs

Per-container ACLs will probably be added in a later release. Per-object ACLs will probably not be supported, but maybe.

No Append Support

It’s possible that this may be added at a later time using a versioning trick.

No File Locking

Most filesystems integrate with the kernel to offer advisory locking. This is not possible with OpenStack Object Storage.

Eventual Consistency

Don’t expect version consistency between multiple nodes when data is being updated.

If you upload a new version of an object, and immediately GET that object from another client, you may get a previous version of the file. There is no way to know which version of a given object the system is responding with, unless you set version metadata on each object yourself. If there is any problem with the network, you may get outdated versions of objects, or still see objects that have been deleted because the node answering your request does not yet know about the deletion.
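Setting your own version metadata could look something like the following Python sketch. It assumes you store an integer version in a custom metadata header on every upload (the header name X-Object-Meta-Version is our own choice for this example) and that fetch is a hypothetical callable, wrapping your binding's GET, which returns the response headers and body.

```python
import time

# Assumption: each upload sets this custom metadata header to an integer version.
VERSION_HEADER = "X-Object-Meta-Version"


def fetch_at_least(fetch, min_version, attempts=5, delay=0.0):
    """Call fetch() until it returns a copy whose version is >= min_version.

    fetch() must return a (headers, body) pair, where headers is a dict.
    Returns (version, body), or raises RuntimeError if only stale copies
    were seen within the given number of attempts.
    """
    for attempt in range(attempts):
        headers, body = fetch()
        version = int(headers.get(VERSION_HEADER, 0))
        if version >= min_version:
            return version, body
        if attempt + 1 < attempts:
            time.sleep(delay)  # give replication a moment before retrying
    raise RuntimeError("only stale copies seen after {} attempts".format(attempts))


# Usage with canned responses standing in for a cluster that answers
# with an old copy first and the new copy second:
responses = iter([
    ({VERSION_HEADER: "1"}, b"old contents"),
    ({VERSION_HEADER: "2"}, b"new contents"),
])
version, body = fetch_at_least(lambda: next(responses), min_version=2)
```

This only helps a reader who already knows which version to expect. It does not make the system itself consistent.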

No Support for Data Encryption

You must encrypt the data yourself. The current version does not have SSL support either. Use an SSL proxy to work around this by terminating the SSL sessions on the same network where the OpenStack Object Storage system runs.

Not Compatible With Web Browsers

You must supply a storage token header to authorize each request. Regular web browsers can’t do this. This can be solved by placing a proxy between the client and the system to handle token authentication. This is not a problem if you are using one of the language bindings. They will take care of this when you integrate your web app with the system.

Not a Database

It supports no querying or processing of data on the servers. All you can do is list the objects within a given container. There is no way to search based on object metadata. You need to keep your own external search indexes.
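As a rough illustration of an external search index, the Python sketch below keeps object metadata in a SQLite table that your application updates whenever it uploads an object. The table layout and function names are hypothetical, and any database or search engine could play the same role.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the metadata index. One row per object per metadata key."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS object_meta ("
        "container TEXT, name TEXT, key TEXT, value TEXT, "
        "PRIMARY KEY (container, name, key))"
    )
    return db


def record_object(db, container, name, metadata):
    """Replace the indexed metadata for one object. Call this after each upload."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "DELETE FROM object_meta WHERE container = ? AND name = ?",
            (container, name),
        )
        db.executemany(
            "INSERT INTO object_meta VALUES (?, ?, ?, ?)",
            [(container, name, key, value) for key, value in metadata.items()],
        )


def find_objects(db, key, value):
    """Return (container, name) pairs whose metadata has key == value."""
    rows = db.execute(
        "SELECT container, name FROM object_meta "
        "WHERE key = ? AND value = ? ORDER BY container, name",
        (key, value),
    )
    return rows.fetchall()


db = open_index()
record_object(db, "calls", "2010-10-01/a.wav", {"agent": "alice", "lang": "en"})
record_object(db, "calls", "2010-10-01/b.wav", {"agent": "bob", "lang": "en"})
by_alice = find_objects(db, "agent", "alice")
```

The index lives outside the storage system, so it is up to your application to keep it in step with uploads and deletes.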

Don’t try to frequently update large objects.

All updates produce a new version of an object, because objects are immutable.

Don’t store unlimited objects per container

You can store as many objects in a container as you wish. However, your per-object upload latency will increase considerably once you reach a certain point. I found the optimal number of objects per container to be just under one million. This number will vary depending on your equipment, and how heavy a workload it’s subjected to.
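One way to keep each container below that point is to spread objects over several containers, choosing the container from a hash of the object name. The Python sketch below shows the idea. The container naming scheme and the shard count are arbitrary choices for illustration, not anything the system prescribes.

```python
import hashlib


def container_for(object_name, base="media", shards=16):
    """Pick a container for an object by hashing its name.

    The same name always maps to the same container, so a reader can
    recompute where an object lives without any lookup table.
    """
    digest = hashlib.md5(object_name.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shards
    return "{}-{:03d}".format(base, shard)


target = container_for("photos/2010/beach.jpg")  # one of media-000 .. media-015
```

Keep in mind that changing the number of shards later changes where most names map to, so pick a count with room to grow.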

Changing Swift Into a Filesystem

You might think of using FUSE to access objects and containers in OpenStack Object Storage as files and directories with a filesystem interface, but you’ll quickly discover that this is only really good for very simple use cases. Most of the things you need to implement what we think of as a filesystem are missing.

If you are a developer, and you are thinking of building a filesystem on top of OpenStack Object Storage using objects as blocks, that could possibly work, but would probably not perform very well compared to existing alternatives that are actually designed for distributed block storage. The blocks would need to be pretty large to keep the network/protocol overhead down. Frequent writing is not likely to work well. Most users of filesystems are not expecting eventual consistency behavior. They want strong data consistency. You would also want some strategy to handle read/write concurrency with some locking capability. Plus, you would need to have a way to keep track of the blocks like a filesystem does in some data structure or database. Frankly speaking, OpenStack Object Storage is probably not the right tool for the job.

Conclusion

You should probably only use OpenStack Object Storage for the use cases it’s intended for. If what you really want is a clustered filesystem, you’re probably better off looking at other solutions like Lustre, GlusterFS, GFS, OCFS, etc. Keep in mind that each of these has its own strengths and weaknesses. Pay particular attention to what they are designed for, and use them accordingly. If you want to use OpenStack Object Storage for something that it was designed for, then you will probably be very happy with it. Keep in mind that it’s a blob storage system. It’s not a filesystem, not a file server, not a database, etc. To learn more about OpenStack Object Storage, please check out the Developer Documentation.
