一个Ceph客户端,通过librados直接与OSD交互,来存储和取出数据。为了与OSD交互,客户端应用必须直接调用librados,连接一个Ceph Monitor。一旦连接好以后,librados会从Monitor处取回一个Cluster map。当客户端的应用想读或者取数据的时候,它会创建一个I/O上下文并且与一个pool绑定。通过这个I/O上下文,客户端将Object的名字提供给librados,然后librados会根据Object的名字和Cluster map计算出相应的PG和OSD的位置。然后客户端就可以读或者写数据。客户端的应用无需知道这个集群的拓扑结构。
Ceph Storage Cluster handle封装了如下客户端配置,包括:
1. 给rados_create()使用的User ID和给rados_create2()使用的user name,其中后者更受推荐
2. Cephx authentication key
3. Monitor的ID和IP地址
4. 日志级别
5. 调试级别
所以,从你的app使用集群的步骤如下:1. 创建一个cluster handle,它将被用来连接集群;2. 使用这个handle来连接集群。为了连接集群,这个app必须提供一个monitor地址,一个用户名,还有一个authentication key。
步骤一 配置Ceph cluster handle
那么到底该如何去配置一个handle?Rados提供了一系列方法让你来设置所需要的设置的值。对于monitor和加密key的设置,一个简单的方法就是在ceph.conf文件中设置monitor的地址和keyring的path。举个例子:
[global]
mon host = 192.168.1.155
keyring = /etc/ceph/ceph.client.admin.keyring
一旦你创造了一个handle,那么可以通过读取ceph的配置文件来配置handle,也可以通过读取命令行参数或者环境参数来配置handle。
当连接好之后,客户端的app可以调用很多函数来影响整个集群,比如可以做以下事情:
* Get cluster statistics
* Use Pool Operation (exists, create, list, delete)
* Get and set the configuration
Ceph有一个非常强大的能力就是绑定不同的pool。每个pool可能拥有不同数量的PG,对象拷贝数量和拷贝策略。举个例子,一个使用SSD的pool可以被设置为hot pool来存储热数据,一个使用HDD的pool可以被设置用来存储冷数据。
下面是一个配置handle并连接ceph集群的C代码示例:
#include <stdio.h>
#include <string.h>
#include <rados/librados.h>
int main (int argc, char argv**)
{
/* Declare the cluster handle and required arguments. */
rados_t cluster;
char cluster_name[] = "ceph";
char user_name[] = "client.admin";
uint64_t flags;
/* Initialize the cluster handle with the "ceph" cluster name and the "client.admin" user */
int err;
err = rados_create2(&cluster, cluster_name, user_name, flags);
if (err < 0) {
fprintf(stderr, "%s: Couldn't create the cluster handle! %s\n", argv[0], strerror(-err));
exit(EXIT_FAILURE);
} else {
printf("\nCreated a cluster handle.\n");
}
/* Read a Ceph configuration file to configure the cluster handle. */
err = rados_conf_read_file(cluster, "/etc/ceph/ceph.conf");
if (err < 0) {
fprintf(stderr, "%s: cannot read config file: %s\n", argv[0], strerror(-err));
exit(EXIT_FAILURE);
} else {
printf("\nRead the config file.\n");
}
/* Read command line arguments */
err = rados_conf_parse_argv(cluster, argc, argv);
if (err < 0) {
fprintf(stderr, "%s: cannot parse command line arguments: %s\n", argv[0], strerror(-err));
exit(EXIT_FAILURE);
} else {
printf("\nRead the command line arguments.\n");
}
/* Connect to the cluster */
err = rados_connect(cluster);
if (err < 0) {
fprintf(stderr, "%s: cannot connect to cluster: %s\n", argv[0], strerror(-err));
exit(EXIT_FAILURE);
} else {
printf("\nConnected to the cluster.\n");
}
}
步骤二 创建I/O上下文并应用
当你的应用有一个cluster handle之后并且连接上了Ceph集群之后,你可以创建一个I/O上下文来读 或者写数据。一个I/O上下文绑定到一个特定pool的连接。I/O上下文的功能包括如下:
1. Write/read data and extended attributes
2. List and iterate over objects and extended attributes
3. Snapshot pools, list snapshots, etc.
Rados允许你同步或者异步地与cluster交互。
来看一段C代码的例子
#include <stdio.h>
#include <string.h>
#include <rados/librados.h>
int main (int argc, const char argv**)
{
/*
* Continued from previous C example, where cluster handle and
* connection are established. First declare an I/O Context.
*/
rados_ioctx_t io;
char *poolname = "data";
err = rados_ioctx_create(cluster, poolname, &io);
if (err < 0) {
fprintf(stderr, "%s: cannot open rados pool %s: %s\n", argv[0], poolname, strerror(-err));
rados_shutdown(cluster);
exit(EXIT_FAILURE);
} else {
printf("\nCreated I/O context.\n");
}
/* Write data to the cluster synchronously. */
err = rados_write(io, "hw", "Hello World!", 12, 0);
if (err < 0) {
fprintf(stderr, "%s: Cannot write object \"hw\" to pool %s: %s\n", argv[0], poolname, strerror(-err));
rados_ioctx_destroy(io);
rados_shutdown(cluster);
exit(1);
} else {
printf("\nWrote \"Hello World\" to object \"hw\".\n");
}
char xattr[] = "en_US";
err = rados_setxattr(io, "hw", "lang", xattr, 5);
if (err < 0) {
fprintf(stderr, "%s: Cannot write xattr to pool %s: %s\n", argv[0], poolname, strerror(-err));
rados_ioctx_destroy(io);
rados_shutdown(cluster);
exit(1);
} else {
printf("\nWrote \"en_US\" to xattr \"lang\" for object \"hw\".\n");
}
/*
* Read data from the cluster asynchronously.
* First, set up asynchronous I/O completion.
*/
rados_completion_t comp;
err = rados_aio_create_completion(NULL, NULL, NULL, &comp);
if (err < 0) {
fprintf(stderr, "%s: Could not create aio completion: %s\n", argv[0], strerror(-err));
rados_ioctx_destroy(io);
rados_shutdown(cluster);
exit(1);
} else {
printf("\nCreated AIO completion.\n");
}
/* Next, read data using rados_aio_read. */
char read_res[100];
err = rados_aio_read(io, "hw", comp, read_res, 12, 0);
if (err < 0) {
fprintf(stderr, "%s: Cannot read object. %s %s\n", argv[0], poolname, strerror(-err));
rados_ioctx_destroy(io);
rados_shutdown(cluster);
exit(1);
} else {
printf("\nRead object \"hw\". The contents are:\n %s \n", read_res);
}
/* Wait for the operation to complete */
rados_wait_for_complete(comp);
/* Release the asynchronous I/O complete handle to avoid memory leaks. */
rados_aio_release(comp);
char xattr_res[100];
err = rados_getxattr(io, "hw", "lang", xattr_res, 5);
if (err < 0) {
fprintf(stderr, "%s: Cannot read xattr. %s %s\n", argv[0], poolname, strerror(-err));
rados_ioctx_destroy(io);
rados_shutdown(cluster);
exit(1);
} else {
printf("\nRead xattr \"lang\" for object \"hw\". The contents are:\n %s \n", xattr_res);
}
err = rados_rmxattr(io, "hw", "lang");
if (err < 0) {
fprintf(stderr, "%s: Cannot remove xattr. %s %s\n", argv[0], poolname, strerror(-err));
rados_ioctx_destroy(io);
rados_shutdown(cluster);
exit(1);
} else {
printf("\nRemoved xattr \"lang\" for object \"hw\".\n");
}
err = rados_remove(io, "hw");
if (err < 0) {
fprintf(stderr, "%s: Cannot remove object. %s %s\n", argv[0], poolname, strerror(-err));
rados_ioctx_destroy(io);
rados_shutdown(cluster);
exit(1);
} else {
printf("\nRemoved object \"hw\".\n");
}
}
步骤三 关闭session
当你的APP完成了所有要完成的工作以后,这个app应该能够关闭连接并且关闭这个handle。对于异步I/O操作,app还应该保证所有的异步操作已经完成。
rados_ioctx_destroy(io);
rados_shutdown(cluster);