This document describes the format of the entries in XenStore, what they are used for, and how third-party applications should use XenStore as a management interface.
XenStore is a hierarchical namespace (similar to sysfs or Open Firmware) which is shared between domains. The interdomain communication primitives exposed by Xen are very low-level (virtual IRQ and shared memory). XenStore is implemented on top of these primitives and provides some higher level operations (read a key, write a key, enumerate a directory, notify when a key changes value).
XenStore is a database, hosted by domain 0, that supports transactions and atomic operations. It is accessible via a Unix domain socket in Domain-0, a kernel-level API, or an ioctl interface through /proc/xen/xenbus. XenStore should always be accessed through the functions defined in <xs.h>. XenStore is used to store information about the domains during their execution and as a mechanism for creating and controlling Domain-U devices.
XenBus is the in-kernel API used by virtual IO drivers to interact with XenStore.
There are three main paths in XenStore:
/vm - stores configuration information about domain
/local/domain - stores information about the domain on the local node (domid, etc.)
/tool - stores information for the various tools
The /vm path stores configuration information for a domain. This information doesn't change and is indexed by the domain's UUID. A /vm entry contains the following information:
ssidref - ssid reference for domain
uuid - uuid of the domain (somewhat redundant)
on_reboot - the action to take on a domain reboot request (destroy or restart)
on_poweroff - the action to take on a domain halt request (destroy or restart)
on_crash - the action to take on a domain crash (destroy or restart)
vcpus - the number of allocated vcpus for the domain
memory - the amount of memory (in megabytes) for the domain (note: appears to sometimes be empty for domain-0)
vcpu_avail - the number of active vcpus for the domain (vcpus - number of disabled vcpus)
name - the name of the domain
The /vm/<uuid>/image path is only available for Domain-Us and contains:
ostype - identifies the builder type (linux or vmx)
kernel - path to kernel on domain-0
cmdline - command line to pass to domain-U kernel
ramdisk - path to ramdisk on domain-0
The /local path currently only contains one directory, /local/domain, which is indexed by domain id and contains the running domain's information. The reason for having two storage areas is that during migration the uuid doesn't change, but the domain id does. The /local/domain directory can be created and populated before finalizing the migration, enabling localhost=>localhost migration.
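The split between the two areas can be sketched as follows (the uuid and domain ids below are made-up values; in a real system they come from the toolstack):

```python
# Sketch: why XenStore keeps two areas for one guest. The /vm entry is
# keyed by the (stable) UUID, while /local/domain is keyed by the
# (changeable) domain id. All values are hypothetical.

def vm_path(uuid):
    """Configuration area: survives migration unchanged."""
    return "/vm/%s" % uuid

def domain_path(domid):
    """Runtime area: rebuilt under the new domid on migration."""
    return "/local/domain/%d" % domid

uuid = "32af3f7b-0000-0000-0000-000000000001"   # hypothetical
before = (vm_path(uuid), domain_path(3))        # before migration
after = (vm_path(uuid), domain_path(7))         # after localhost migration

assert before[0] == after[0]    # the /vm path is stable
assert before[1] != after[1]    # the /local/domain path changes
```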
This path contains:
cpu_time - xend start time (this is only around for domain-0)
handle - private handle for xend
name - see /vm
on_reboot - see /vm
on_poweroff - see /vm
on_crash - see /vm
vm - the path to the VM directory for the domain
domid - the domain id (somewhat redundant)
running - indicates that the domain is currently running
memory/ - a directory for memory information
target - target memory size for the domain (in kilobytes)
cpu - the current CPU the domain is pinned to (empty for domain-0?)
cpu_weight - the weight assigned to the domain
vcpu_avail - a bitmap telling the domain whether it may use a given VCPU
online_vcpus - how many vcpus are currently online
vcpus - the total number of vcpus allocated to the domain
console/ - a directory for console information
ring-ref - the grant table reference of the console ring queue
port - the event channel being used for the console ring queue (local port)
tty - the current tty the console data is being exposed on
limit - the limit (in bytes) of console data to buffer
backend/ - a directory containing all backends the domain hosts
vbd/ - a directory containing vbd backends
<domid>/ - a directory containing vbd's for domid
<virtual-device>/ - a directory for a particular virtual-device on domid
frontend-id - domain id of frontend
frontend - the path to the frontend domain
physical-device - backend device number
sector-size - backend sector size
sectors - backend number of sectors
info - device information flags. 1=cdrom, 2=removable, 4=read-only
domain - name of frontend domain
params - parameters for device
type - the type of the device
dev - frontend virtual device (as given by the user)
node - backend device node (output from block creation script)
hotplug-status - connected or error (output from block creation script)
state - communication state across XenBus to the frontend. 0=unknown, 1=initialising, 2=init. wait, 3=initialised, 4=connected, 5=closing, 6=closed
vif/ - a directory containing vif backends
<domid>/ - a directory containing vif's for domid
<vif number>/ - a directory for each vif
frontend-id - the domain id of the frontend
frontend - the path to the frontend
mac - the mac address of the vif
bridge - the bridge the vif is connected to
handle - the handle of the vif
script - the script used to create/stop the vif
domain - the name of the frontend
hotplug-status - connected or error (output from the vif creation script)
state - communication state across XenBus to the frontend. 0=unknown, 1=initialising, 2=init. wait, 3=initialised, 4=connected, 5=closing, 6=closed
device/ - a directory containing the frontend devices for the domain
vbd/ - a directory containing vbd frontend devices for the domain
<virtual-device>/ - a directory containing the vbd frontend for virtual-device
virtual-device - the device number of the frontend device
device-type - the device type ("disk", "cdrom", "floppy")
backend-id - the domain id of the backend
backend - the path of the backend in the store (/local/domain path)
ring-ref - the grant table reference for the block request ring queue
event-channel - the event channel used for the block request ring queue
state - communication state across XenBus to the backend. 0=unknown, 1=initialising, 2=init. wait, 3=initialised, 4=connected, 5=closing, 6=closed
vif/ - a directory containing vif frontend devices for the domain
<id>/ - a directory for vif id frontend device for the domain
backend-id - the backend domain id
mac - the mac address of the vif
handle - the internal vif handle
backend - a path to the backend's store entry
tx-ring-ref - the grant table reference for the transmission ring queue
rx-ring-ref - the grant table reference for the receiving ring queue
event-channel - the event channel used for the two ring queues
state - communication state across XenBus to the backend. 0=unknown, 1=initialising, 2=init. wait, 3=initialised, 4=connected, 5=closing, 6=closed
device-misc/ - miscellaneous information for devices
vif/ - miscellaneous information for vif devices
nextDeviceID - the next device id to use
store/ - per-domain information for the store
port - the event channel used for the store ring queue
ring-ref - the grant table reference used for the store's communication channel
image - private xend information
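The state and info values that recur in the tree above are small integers. A sketch of decoding them, using the numbering given in this document (the state names mirror the XenbusState enum in Xen's public io/xenbus.h header):

```python
# Decode the small integer codes used in the XenStore tree above.
# The state numbering follows the listing in this document; the info
# bits follow the vbd backend's "info" entry (1=cdrom, 2=removable,
# 4=read-only).

XENBUS_STATES = {
    0: "Unknown",
    1: "Initialising",
    2: "InitWait",
    3: "Initialised",
    4: "Connected",
    5: "Closing",
    6: "Closed",
}

VBD_INFO_FLAGS = {1: "cdrom", 2: "removable", 4: "read-only"}

def state_name(code):
    """Human-readable name for a XenBus state code."""
    return XENBUS_STATES.get(code, "Invalid")

def decode_vbd_info(info):
    """Return the names of the flags set in a vbd 'info' value."""
    return [name for bit, name in sorted(VBD_INFO_FLAGS.items())
            if info & bit]

print(state_name(4))        # Connected
print(decode_vbd_info(5))   # ['cdrom', 'read-only']
```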
The XenStore interface provides transaction-based reads and writes to points in the XenStore hierarchy. Watches can be set at points in the hierarchy, and an individual watch will be triggered when anything at or below that point in the hierarchy changes. A watch is registered with a callback function and a "token"; the "token" can be a pointer to any piece of data. The callback function is invoked with the path of the changed node and the "token".
The interface is centered around the idea of a central polling loop that reads watch events, each of which provides the path and token, and invokes the registered callback.
These code snippets should provide a helpful starting point.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <sys/select.h>
#include <xs.h>

struct xs_handle *xs;
unsigned int domid = 0; /* domain of interest; 0 used here as a placeholder */
xs_transaction_t th;
char *path;
int fd;
fd_set set;
int er;
struct timeval tv = {.tv_sec = 0, .tv_usec = 0 };
char **vec;
unsigned int num_strings;
char * buf;
unsigned int len;
/* Get a connection to the daemon */
xs = xs_daemon_open();
if ( xs == NULL ) error();
/* Get the local domain path */
path = xs_get_domain_path(xs, domid);
if ( path == NULL ) error();
/* Make space for our node on the path */
path = realloc(path, strlen(path) + strlen("/mynode") + 1);
if ( path == NULL ) error();
strcat(path, "/mynode");
/* Create a watch on /local/domain/%d/mynode. */
er = xs_watch(xs, path, "mytoken");
if ( er == 0 ) error();
/* We are notified of read availability on the watch via the
* file descriptor.
*/
fd = xs_fileno(xs);
while (1)
{
/* TODO (TimPost), show a simpler example with poll()
* in a modular style, using a simple callback. Most
* people think 'inotify' when they see 'watches'. */
FD_ZERO(&set);
FD_SET(fd, &set);
/* select() may modify the timeout, and a zero timeout would
 * busy-spin, so re-initialise it on each iteration. */
tv.tv_sec = 5;
tv.tv_usec = 0;
/* Poll for data. */
if ( select(fd + 1, &set, NULL, NULL, &tv) > 0
     && FD_ISSET(fd, &set))
{
/* num_strings will be set to the number of elements in vec
* (typically, 2 - the watched path and the token) */
vec = xs_read_watch(xs, &num_strings);
if ( !vec ) error();
printf("vec contents: %s|%s\n", vec[XS_WATCH_PATH],
vec[XS_WATCH_TOKEN]);
/* Prepare a transaction and do a read. */
th = xs_transaction_start(xs);
buf = xs_read(xs, th, vec[XS_WATCH_PATH], &len);
xs_transaction_end(xs, th, false);
free(vec);
if ( buf )
{
printf("buflen: %u\nbuf: %s\n", len, buf);
free(buf);
}
/* Prepare a transaction and do a write. */
th = xs_transaction_start(xs);
er = xs_write(xs, th, path, "somestuff", strlen("somestuff"));
xs_transaction_end(xs, th, false);
if ( er == 0 ) error();
}
}
/* Cleanup. The file descriptor belongs to the handle, so it is
 * closed by xs_daemon_close() rather than by us. */
xs_daemon_close(xs);
free(path);
# xsutil provides access to xshandle(), which gives you something close
# to the C-style API; however, it does not support polling in the same
# manner.
from xen.xend.xenstore.xsutil import *
# xswatch provides a callback interface for the watches. A similar
# interface exists for C within xenbus.
from xen.xend.xenstore.xswatch import *
xs = xshandle()  # From xsutil
path = xs.get_domain_path(domid) + "/mynode"  # domid: id of the domain of interest
# Watch functions take the path as the first argument;
# all other arguments passed via xswatch are also included.
def watch_func(path, xs):
    # Read the data
    th = xs.transaction_start()
    buf = xs.read(th, path)
    xs.transaction_end(th)
    log.info("Got %s" % buf)  # assumes xend's logger is in scope
    # Write it back
    th = xs.transaction_start()
    xs.write(th, path, "somestuff")
    xs.transaction_end(th)
    return True  # keep the watch registered
mywatch = xswatch(path, watch_func, xs)
You can use direct Read/Write or gather calls via xstransact.
By default the python xsutil.xshandle() is a shared global handle. xswatch uses this handle with a blocking read_watch call. Because the read_watch function is protected by a per-handle mutex, multiple callers will be serialised against one another, which is probably not the behaviour you want. If you would like a blocking mechanism, you might instead introduce a semaphore in the callback function that waiting code can block on. Be sure to handle failure cases and not block indefinitely; for instance, the "@releaseDomain" watch will be triggered on domain destruction for watches within the /local/domain/* trees.
It is also possible -- currently only indirectly -- to get a fresh XenStore handle within python and block on read_watch in the main execution path. This may be necessary if you want to block waiting for a XenStore node value in a code path initiated by an xswatch callback.
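One way to implement the suggested blocking mechanism is with a threading primitive that supports a timeout. A minimal sketch of the pattern (xswatch itself is not used here; the callback firing is simulated with a timer, and the path is a made-up example):

```python
# Sketch of the blocking pattern described above: the main path waits
# on an Event that the watch callback sets, with a timeout so a watch
# that never fires (e.g. because the domain was destroyed) cannot
# block us forever. The xswatch callback is simulated with a Timer.
import threading

node_written = threading.Event()

def watch_func(path):
    # In real code this would be the xswatch callback; it should not
    # block, just signal the waiter and return.
    node_written.set()
    return True  # keep the watch registered

# Simulate the watch firing 0.1s later.
threading.Timer(0.1, watch_func, args=["/local/domain/0/mynode"]).start()

# Main execution path: block, but never indefinitely.
if node_written.wait(timeout=5.0):
    print("node changed")
else:
    print("timed out; handle the failure case")
```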
N.B.: Changes subject to http://wiki.xensource.com/xenwiki/XenStoreReference