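Recently, while using etcd as the storage system for a distributed service, I needed locking to keep different processes from racing when they read and write the same file. Checking the documentation, I found that the official lock module API was deprecated back in version 0.4, and for some reason newer versions never added it back. After some searching I found python-etcd on GitHub, a client that wraps the etcd API and additionally implements a lock module. I read through it briefly. Its distributed lock design is modeled on ZooKeeper and I found it instructive, so I made a few small modifications on top of it. This post writes up the idea behind the implementation as study notes.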
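The implementation uses /_locks to hold all locks; it is the root directory of every lock. Two locks named key1 and key2, for instance, correspond to the two directories /_locks/key1 and /_locks/key2. Each lock object also generates a unique uuid as its own identifier and writes it into its lock directory under a sequential key (note that in the source, the acquire function writes with append=True).
For example: etcd is a key/value store; suppose it holds the following key/value pair: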
key: /_locks/file1/000000002 value: 123456
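This means file1 is currently locked. The lock's id (its uuid) is 123456, and 000000002 is the increasing sequence number that etcd's POST method generates automatically. When a new lock request arrives, say from a lock object whose uuid is 123455, a new record is inserted: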
key: /_locks/file1/000000003 value: 123455
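The new requester then lists all child nodes under /_locks/file1/ and picks out two of them: the smallest child node, and the child node immediately before its own. In this example both are 000000002. Because the smallest child is not its own sequence number, it watches the node immediately before its own for changes, and repeats the check until the smallest child equals its own sequence number, at which point it holds the lock. Releasing the lock simply deletes the node for its own sequence number. The sequence number is generated by etcd at insert time. A lock object that does not know its sequence number has to recursively read every node under the lock directory when releasing, compare each value with its uuid, and use the match to recover the key to delete.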
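The selection step described above can be sketched in isolation. This is a minimal illustration rather than python-etcd's own code: `holder_and_watch_target` is a hypothetical helper that takes the child keys of a lock directory as plain strings, and it assumes the sequence parts are zero-padded to the same width so that string sorting matches numeric order.

```python
def holder_and_watch_target(child_keys, my_key):
    """Return (holder_key, key_to_watch) for the waiter that owns my_key.

    child_keys is a hypothetical input: the keys currently found under the
    lock directory. key_to_watch is None when my_key is the smallest key,
    which means we hold the lock.
    """
    ordered = sorted(child_keys)
    if my_key not in ordered:
        raise LookupError("our key is gone; the lock entry probably expired")
    position = ordered.index(my_key)
    holder = ordered[0]
    # Watch only the key directly before ours, not the holder, so that one
    # release wakes up exactly one waiter.
    watch = ordered[position - 1] if position > 0 else None
    return holder, watch


keys = ["/_locks/file1/000000003", "/_locks/file1/000000002"]
holder, watch = holder_and_watch_target(keys, "/_locks/file1/000000003")
print(holder)  # /_locks/file1/000000002
print(watch)   # /_locks/file1/000000002
```

With only two entries the holder and the watched key coincide, as in the example in the text. With three or more waiters they differ for everyone except the second in line.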
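The key point of the implementation is that the keys written by processes requesting the lock must be generated in request order. The etcd API provides a POST method that, in a given directory, automatically generates a key larger than every key created there before, which guarantees that the lock is granted in the same order it was requested. It also provides a watch method for monitoring changes to a given key. These two features are what make the distributed lock design work.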
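To see why in-order keys give first-come, first-served locking, here is a small in-memory simulation. It does not talk to etcd: `FakeLockDir` is a hypothetical stand-in whose `append` only imitates the effect of a POST to a directory (each new entry gets a larger, zero-padded sequence number), and `holder` returns the owner stored under the smallest key.

```python
import itertools


class FakeLockDir:
    """In-memory stand-in for one lock directory (not an etcd client)."""

    def __init__(self, path):
        self.path = path
        self.entries = {}                   # key -> owner id
        self._counter = itertools.count(1)  # imitates a server-side index

    def append(self, owner):
        """Store owner under the next sequence number, like an in-order POST."""
        key = "{}/{:09d}".format(self.path, next(self._counter))
        self.entries[key] = owner
        return key

    def delete(self, key):
        del self.entries[key]

    def holder(self):
        """Owner stored under the smallest key, or None if nobody is queued."""
        if not self.entries:
            return None
        return self.entries[min(self.entries)]


lock_dir = FakeLockDir("/_locks/file1")
key_a = lock_dir.append("worker-a")
key_b = lock_dir.append("worker-b")
key_c = lock_dir.append("worker-c")

order = []
while lock_dir.holder() is not None:
    order.append(lock_dir.holder())
    # The holder releases by deleting its own key; the next waiter takes over.
    lock_dir.delete(min(lock_dir.entries))

print(order)  # ['worker-a', 'worker-b', 'worker-c']
```

In the real system each waiter would block on a watch instead of looping, but the order in which the lock is handed over is decided by the keys alone.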
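One more difference between a distributed lock and an operating-system lock deserves attention: the distributed lock is written to storage as a file (a key in etcd), so it is not released automatically when the program crashes. Be sure to set a sensible time-to-live (TTL) on the file (the lock), so that locks left behind by a failed program do not linger in the system and cause trouble.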
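The effect of a TTL on a crashed holder can be illustrated without a server. The sketch below is an in-memory model, not etcd: `TtlStore` and its `now` argument are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
class TtlStore:
    """Toy key store in which every entry may carry an absolute expiry time."""

    def __init__(self):
        self._entries = {}  # key -> (value, expires_at or None)

    def write(self, key, value, now, ttl=None):
        """Store value; ttl=None means the entry never expires."""
        expires_at = None if ttl is None else now + ttl
        self._entries[key] = (value, expires_at)

    def live_keys(self, now):
        """Sorted keys whose TTL has not elapsed at time `now`."""
        return sorted(
            key for key, (_, expires_at) in self._entries.items()
            if expires_at is None or now < expires_at
        )


store = TtlStore()
# worker-a takes the lock with a 30 second TTL, then crashes without releasing.
store.write("/_locks/file1/000000002", "worker-a", now=0, ttl=30)
# worker-b queues up behind it with no TTL.
store.write("/_locks/file1/000000003", "worker-b", now=5, ttl=None)

print(store.live_keys(now=10))  # both keys: worker-a's entry still blocks worker-b
print(store.live_keys(now=31))  # only worker-b's key: the stale lock has expired
```

The trade-off is that a TTL shorter than the protected work lets the lock expire while its holder is still running. In the code listed further down, calling `acquire` again with a `lock_ttl` rewrites the existing key, which renews it.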
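In a real project, identifying the lock by uuid turned out to be inconvenient: it forces acquire and release to go through the same object. In addition, uuid4 values are random, so a collision is possible in principle, even if the probability is tiny, which means uniqueness is not strictly guaranteed. In our application I replaced the uuid with the node name and reworked the code accordingly. Seen from outside, locking and unlocking now operate at the granularity of a node in the distributed system, while inside each node the lock is represented as a queue of requests. This addresses the shortcomings of the original implementation reasonably well.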
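The modified code is not shown in this post, so the following is only one guess at what "node name as identity plus a per-node request queue" could look like. `NodeLockQueue` and all of its methods are hypothetical names, and the sketch covers only the in-node bookkeeping, not the etcd calls.

```python
import collections


class NodeLockQueue:
    """Per-node queue of lock requests, identified by node name (hypothetical)."""

    def __init__(self, node_name):
        # The node name would be stored as the lock value instead of a uuid.
        self.node_name = node_name
        self._pending = collections.deque()  # requests waiting inside this node

    def request(self, requester):
        """Queue a local request.

        Returns True if this is the first pending request, meaning the node
        would now have to acquire the etcd lock.
        """
        self._pending.append(requester)
        return len(self._pending) == 1

    def current(self):
        """The local requester currently served, or None."""
        return self._pending[0] if self._pending else None

    def done(self):
        """Finish the current request.

        Returns True if no request is left, meaning the node would now
        release the etcd lock.
        """
        self._pending.popleft()
        return not self._pending


queue = NodeLockQueue("node-1")
need_acquire = queue.request("thread-a")  # True: first request on this node
queued = queue.request("thread-b")        # False: node-1 already owns the lock
first = queue.current()                   # "thread-a"
release_now = queue.done()                # False: thread-b is still waiting
second = queue.current()                  # "thread-b"
release_last = queue.done()               # True: queue is empty
```

Because the lock value is the node name, any object on the node can find and release the lock, which removes the same-object restriction mentioned above.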
The relevant code of the original implementation is as follows:
import logging
import uuid

import etcd

_log = logging.getLogger(__name__)


class Lock(object):
    """
    Locking recipe for etcd, inspired by the kazoo recipe for zookeeper
    """

    def __init__(self, client, lock_name):
        self.client = client
        self.name = lock_name
        # props to Netflix Curator for this trick. It is possible for our
        # create request to succeed on the server, but for a failure to
        # prevent us from getting back the full path name. We prefix our
        # lock name with a uuid and can check for its presence on retry.
        self._uuid = uuid.uuid4().hex
        self.path = "/_locks/{}".format(lock_name)
        self.is_taken = False
        self._sequence = None
        _log.debug("Initiating lock for %s with uuid %s", self.path, self._uuid)

    @property
    def uuid(self):
        """
        The unique id of the lock
        """
        return self._uuid

    # The setter must carry the same name as the property; the original
    # listing named it set_uuid, which left `uuid` read-only.
    @uuid.setter
    def uuid(self, value):
        old_uuid = self._uuid
        self._uuid = value
        if not self._find_lock():
            _log.warning("The hand-set uuid was not found, refusing")
            self._uuid = old_uuid
            raise ValueError("Inexistent UUID")

    @property
    def is_acquired(self):
        """
        tells us if the lock is acquired
        """
        if not self.is_taken:
            _log.debug("Lock not taken")
            return False
        try:
            self.client.read(self.lock_key)
            return True
        except etcd.EtcdKeyNotFound:
            _log.warning("Lock was supposedly taken, but we cannot find it")
            self.is_taken = False
            return False

    def acquire(self, blocking=True, lock_ttl=3600, timeout=0):
        """
        Acquire the lock.
        :param blocking Block until the lock is obtained, or timeout is reached
        :param lock_ttl The duration of the lock we acquired, set to None for eternal locks
        :param timeout The time to wait before giving up on getting a lock
        """
        # First of all try to write, if our lock is not present.
        if not self._find_lock():
            _log.debug("Lock not found, writing it to %s", self.path)
            # append=True makes etcd create an in-order key under self.path
            res = self.client.write(self.path, self.uuid, ttl=lock_ttl, append=True)
            self._set_sequence(res.key)
            _log.debug("Lock key %s written, sequence is %s", res.key, self._sequence)
        elif lock_ttl:
            # Renew our lock if already here!
            self.client.write(self.lock_key, self.uuid, ttl=lock_ttl)
        # now get the owner of the lock, and the next lowest sequence
        return self._acquired(blocking=blocking, timeout=timeout)

    def release(self):
        """
        Release the lock
        """
        if not self._sequence:
            self._find_lock()
        try:
            _log.debug("Releasing existing lock %s", self.lock_key)
            self.client.delete(self.lock_key)
        except etcd.EtcdKeyNotFound:
            _log.info("Lock %s not found, nothing to release", self.lock_key)
        finally:
            self.is_taken = False

    def __enter__(self):
        """
        You can use the lock as a contextmanager
        """
        # lock_ttl=None asks for a lock without expiry, as the acquire()
        # docstring describes (the original listing passed 0 here).
        self.acquire(blocking=True, lock_ttl=None)
        # Return the lock so that `with Lock(...) as lock:` binds it.
        return self

    def __exit__(self, type, value, traceback):
        self.release()

    def _acquired(self, blocking=True, timeout=0):
        locker, nearest = self._get_locker()
        self.is_taken = False
        if self.lock_key == locker:
            _log.debug("Lock acquired!")
            # We own the lock, yay!
            self.is_taken = True
            return True
        else:
            self.is_taken = False
            if not blocking:
                return False
            # Let's look for the lock
            watch_key = nearest
            _log.debug("Lock not acquired, now watching %s", watch_key)
            t = max(0, timeout)
            while True:
                try:
                    r = self.client.watch(watch_key, timeout=t)
                    _log.debug("Detected variation for %s: %s", r.key, r.action)
                    return self._acquired(blocking=True, timeout=timeout)
                except etcd.EtcdKeyNotFound:
                    _log.debug("Key %s not present anymore, moving on", watch_key)
                    return self._acquired(blocking=True, timeout=timeout)
                except etcd.EtcdException:
                    # TODO: log something...
                    # Any other etcd error is swallowed and the watch is retried.
                    pass

    @property
    def lock_key(self):
        if not self._sequence:
            raise ValueError("No sequence present.")
        return self.path + '/' + str(self._sequence)

    def _set_sequence(self, key):
        # Keep only the part of the key that follows the lock directory.
        self._sequence = key.replace(self.path, '').lstrip('/')

    def _find_lock(self):
        if self._sequence:
            # We know our key: just check that it still exists.
            try:
                res = self.client.read(self.lock_key)
                self._uuid = res.value
                return True
            except etcd.EtcdKeyNotFound:
                return False
        elif self._uuid:
            # No sequence yet: scan the directory for an entry with our uuid.
            try:
                for r in self.client.read(self.path, recursive=True).leaves:
                    if r.value == self._uuid:
                        self._set_sequence(r.key)
                        return True
            except etcd.EtcdKeyNotFound:
                pass
        return False

    def _get_locker(self):
        results = [res for res in
                   self.client.read(self.path, recursive=True).leaves]
        if not self._sequence:
            self._find_lock()
        keys = sorted([r.key for r in results])
        _log.debug("Lock keys found: %s", keys)
        try:
            i = keys.index(self.lock_key)
            if i == 0:
                _log.debug("No key before our one, we are the locker")
                return (keys[0], None)
            else:
                _log.debug("Locker: %s, key to watch: %s", keys[0], keys[i-1])
                return (keys[0], keys[i-1])
        except ValueError:
            # Something very wrong is going on, most probably
            # our lock has expired
            raise etcd.EtcdLockExpired(u"Lock not found")
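
A short usage sketch may help tie the listing together. `run_with_lock` is a hypothetical helper, not part of python-etcd; it relies only on the `acquire(blocking=..., lock_ttl=...)` and `release()` methods shown above. `RecordingLock` is a stub that stands in for a real lock so the call order can be seen without an etcd server.

```python
def run_with_lock(lock, work, lock_ttl=60):
    """Run work() while holding lock, and always release afterwards.

    lock is any object exposing acquire(blocking=..., lock_ttl=...) and
    release(), such as the Lock class above.
    """
    if not lock.acquire(blocking=True, lock_ttl=lock_ttl):
        raise RuntimeError("could not acquire lock")
    try:
        return work()
    finally:
        # Release even if work() raises, so the key does not wait for its TTL.
        lock.release()


class RecordingLock:
    """Stub used only to show the call order; it never contacts etcd."""

    def __init__(self):
        self.calls = []

    def acquire(self, blocking=True, lock_ttl=3600, timeout=0):
        self.calls.append(("acquire", lock_ttl))
        return True

    def release(self):
        self.calls.append(("release",))


stub = RecordingLock()
result = run_with_lock(stub, lambda: "wrote file1", lock_ttl=30)
print(result)      # wrote file1
print(stub.calls)  # [('acquire', 30), ('release',)]
```

Against a real cluster, the stub would be replaced by something like `etcd.Lock(client, 'file1')`, where `client` is an `etcd.Client` configured for your deployment.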
The python-etcd author's GitHub repository: python-etcd
The relevant source code is in lock.py