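In this lab you will implement a simple locking-based transaction system in SimpleDB. You need to add lock and unlock calls at the appropriate places in the code, along with code that tracks the locks held by each transaction and grants locks to transactions as they need them.
The rest of this document describes what supporting transactions involves and provides a basic outline of the code.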
$ cd simple-db-hw
$ git pull upstream master
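Before starting, make sure you understand what a transaction is and how two-phase locking works.
The rest of this section briefly reviews these concepts and discusses how they apply to SimpleDB.
A transaction is a group of database operations that executes atomically: either all of the operations complete or none of them do, and to an outside observer the group appears as a single operation.
To help you understand how transaction management works in SimpleDB, here is a brief review of how it ensures that the ACID properties are satisfied: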
Atomicity: Rigorous two-phase locking and buffer management ensure atomicity.
Consistency: The database is transaction consistent by virtue of atomicity.
Isolation: Rigorous two-phase locking provides isolation.
Durability: A FORCE buffer management policy ensures durability.
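To simplify the implementation, we recommend a NO STEAL/FORCE buffer management policy. This means:
- Do not evict dirty (updated) pages from the buffer pool while the transaction that modified them is still uncommitted (this is NO STEAL).
- On transaction commit, force the transaction's dirty pages to disk, i.e. write them out (this is FORCE).
To simplify things further, you may assume that SimpleDB will not crash while processing a transactionComplete command. Taken together, these three points mean that you do not need to implement log-based recovery in this lab: you never need to undo any work (you never evict dirty pages), and you never need to redo any work (you force updates on commit and will not crash during commit processing).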
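You will need to add calls to SimpleDB (for example, in BufferPool) that let a caller request or release a lock on a specific object on behalf of a specific transaction. We recommend locking at page granularity; tuple-level locking is also possible, but do not implement table-level locking. The rest of this document and the tests assume page-level locking.
You will need to create data structures that track which locks each transaction holds and that check whether a lock should be granted to a transaction that requests it.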
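You will need to implement shared locks and exclusive locks, which work as follows:
- Before a transaction can read an object, it must hold a shared lock on it.
- Before a transaction can write an object, it must hold an exclusive lock on it.
- Multiple transactions can hold a shared lock on the same object at once.
- Only one transaction may hold an exclusive lock on an object.
- If transaction t is the only transaction holding a shared lock on an object o, t may upgrade its lock on o to an exclusive lock.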
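If a transaction requests a lock that it should not be granted, your code should block, making that transaction wait until the lock becomes available.
Be careful to avoid race conditions when acquiring and releasing locks.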
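The shared/exclusive rules and the "block until available" behaviour can be sketched with a small monitor-based lock table. This is only an illustration under stated assumptions: the class name LockTableSketch and the use of plain strings for transaction and page ids are hypothetical, and it blocks with wait()/notifyAll(), which is a different technique from polling in a loop. All state changes happen inside synchronized methods, which is one way to avoid the race conditions mentioned above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical page-level lock table; transaction and page ids are plain strings. */
class LockTableSketch {
    // page -> transactions holding a shared lock on it
    private final Map<String, Set<String>> sharedHolders = new HashMap<>();
    // page -> the single transaction holding an exclusive lock on it
    private final Map<String, String> exclusiveHolder = new HashMap<>();

    /** Non-blocking attempt; returns true if the lock was granted. */
    public synchronized boolean tryAcquire(String tid, String page, boolean exclusive) {
        String owner = exclusiveHolder.get(page);
        if (owner != null) {
            // The exclusive owner may re-request any lock; everyone else is refused.
            return owner.equals(tid);
        }
        Set<String> readers = sharedHolders.computeIfAbsent(page, k -> new HashSet<>());
        if (!exclusive) {
            readers.add(tid);
            return true;
        }
        // Exclusive request: refused while any OTHER transaction holds a shared lock.
        // A sole shared holder passes this check, which is the upgrade case.
        boolean othersReading =
                readers.size() > 1 || (readers.size() == 1 && !readers.contains(tid));
        if (othersReading) {
            return false;
        }
        readers.remove(tid);
        exclusiveHolder.put(page, tid);
        return true;
    }

    /** Blocks the calling thread until the lock can be granted. */
    public synchronized void acquire(String tid, String page, boolean exclusive)
            throws InterruptedException {
        while (!tryAcquire(tid, page, exclusive)) {
            wait(); // releases the monitor until some lock is released
        }
    }

    /** Releases whatever lock tid holds on page and wakes up all waiters. */
    public synchronized void release(String tid, String page) {
        if (tid.equals(exclusiveHolder.get(page))) {
            exclusiveHolder.remove(page);
        }
        Set<String> readers = sharedHolders.get(page);
        if (readers != null) {
            readers.remove(tid);
        }
        notifyAll();
    }

    public synchronized boolean holdsLock(String tid, String page) {
        Set<String> readers = sharedHolders.get(page);
        return tid.equals(exclusiveHolder.get(page))
                || (readers != null && readers.contains(tid));
    }
}
```

Note that acquire() as written waits forever if two transactions block each other, so a real lock manager still needs some form of deadlock handling on top of it.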
Exercise 1.
Write the methods that acquire and release locks in BufferPool. Assuming you are using page-level locking, you will need to complete the following:
- Modify getPage() to block and acquire the desired lock before returning a page.
- Implement releasePage(). This method is primarily used for testing, and at the end of transactions.
- Implement holdsLock() so that logic in Exercise 2 can determine whether a page is already locked by a transaction.
You may find it helpful to define a class that is responsible for maintaining state about transactions and locks, but the design decision is up to you.
You may need to implement the next exercise before your code passes the unit tests in LockingTest.
The code in BufferPool.java is as follows:
package simpledb;
import java.io.*;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
/**
 * BufferPool manages the reading and writing of pages into memory from
 * disk. Access methods call into it to retrieve pages, and it fetches
 * pages from the appropriate location.
 *
 * The BufferPool is also responsible for locking; when a transaction fetches
 * a page, BufferPool checks that the transaction has the appropriate
 * locks to read/write the page.
 *
 * @Threadsafe, all fields are final
 */
public class BufferPool {
    /** Bytes per page, including header. */
    private static final int DEFAULT_PAGE_SIZE = 4096;
    private static int pageSize = DEFAULT_PAGE_SIZE;
    /** Default number of pages passed to the constructor. This is used by
    other classes. BufferPool should use the numPages argument to the
    constructor instead. */
    public static final int DEFAULT_PAGES = 50;
    private final int numPages;
    // pid -> counter value when the page was loaded; a smaller value means an older page
    private final ConcurrentHashMap<PageId,Integer> pageAge;
    // pid -> page currently cached in the buffer pool
    private final ConcurrentHashMap<PageId,Page> pageStore;
    private int age;
    private PageLockManager lockManager;
    private class Lock{
        TransactionId tid;
        int lockType; // 0 for shared lock and 1 for exclusive lock
        public Lock(TransactionId tid,int lockType){
            this.tid = tid;
            this.lockType = lockType;
        }
    }
    private class PageLockManager{
        // pid -> locks currently held on that page
        ConcurrentHashMap<PageId,Vector<Lock>> lockMap;
        public PageLockManager(){
            lockMap = new ConcurrentHashMap<PageId,Vector<Lock>>();
        }
        // non-blocking: returns true if the lock was granted, false if the caller has to retry
        public synchronized boolean acquireLock(PageId pid,TransactionId tid,int lockType){
            // if no lock held on pid
            if(lockMap.get(pid) == null){
                Lock lock = new Lock(tid,lockType);
                Vector<Lock> locks = new Vector<>();
                locks.add(lock);
                lockMap.put(pid,locks);
                return true;
            }
            // if some Tx holds lock on pid
            // locks.size() won't be 0 because releaseLock will remove 0 size locks from lockMap
            Vector<Lock> locks = lockMap.get(pid);
            // if tid already holds lock on pid
            for(Lock lock:locks){
                if(lock.tid.equals(tid)){
                    // already hold that lock
                    if(lock.lockType == lockType)
                        return true;
                    // already hold exclusive lock when acquire shared lock
                    if(lock.lockType == 1)
                        return true;
                    // already hold shared lock, upgrade to exclusive lock
                    // (only allowed when tid is the sole lock holder)
                    if(locks.size()==1){
                        lock.lockType = 1;
                        return true;
                    }else{
                        return false;
                    }
                }
            }
            // if the lock is an exclusive lock
            if (locks.get(0).lockType ==1){
                assert locks.size() == 1 : "exclusive lock can't coexist with other locks";
                return false;
            }
            // if no exclusive lock is held, there could be multiple shared locks
            if(lockType == 0){
                Lock lock = new Lock(tid,0);
                locks.add(lock);
                lockMap.put(pid,locks);
                return true;
            }
            // can not acquire an exclusive lock when there are shared locks on pid
            return false;
        }
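        public synchronized boolean releaseLock(PageId pid,TransactionId tid){
            // if not a single lock is held on pid
            assert lockMap.get(pid) != null : "page not locked!";
            Vector<Lock> locks = lockMap.get(pid);
            for(int i=0;i<locks.size();++i){
                Lock lock = locks.get(i);
                if(lock.tid.equals(tid)){
                    locks.remove(lock);
                    // drop the entry once the last lock is released, so that
                    // lockMap never maps a page to an empty vector
                    if(locks.size() == 0)
                        lockMap.remove(pid);
                    return true;
                }
            }
            // tid holds no lock on pid
            return false;
        }
        public synchronized boolean holdsLock(PageId pid,TransactionId tid){
            // if not a single lock is held on pid
            if(lockMap.get(pid) == null)
                return false;
            Vector<Lock> locks = lockMap.get(pid);
            // check if a tid exist in pid's vector of locks
            for(Lock lock:locks){
                if(lock.tid.equals(tid)){
                    return true;
                }
            }
            return false;
        }
    }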
    /**
     * Creates a BufferPool that caches up to numPages pages.
     *
     * @param numPages maximum number of pages in this buffer pool.
     */
    public BufferPool(int numPages) {
        // some code goes here
        this.numPages = numPages;
        pageStore = new ConcurrentHashMap<>();
        pageAge = new ConcurrentHashMap<>();
        age = 0;
        lockManager = new PageLockManager();
    }
    /**
     * Retrieve the specified page with the associated permissions.
     * Will acquire a lock and may block if that lock is held by another
     * transaction.
     *
     * The retrieved page should be looked up in the buffer pool. If it
     * is present, it should be returned. If it is not present, it should
     * be added to the buffer pool and returned. If there is insufficient
     * space in the buffer pool, a page should be evicted and the new page
     * should be added in its place.
     *
     * @param tid the ID of the transaction requesting the page
     * @param pid the ID of the requested page
     * @param perm the requested permissions on the page
     */
    public Page getPage(TransactionId tid, PageId pid, Permissions perm)
            throws TransactionAbortedException, DbException {
        // some code goes here
        // READ_ONLY maps to a shared lock (0), anything else to an exclusive lock (1)
        int lockType;
        if(perm == Permissions.READ_ONLY){
            lockType = 0;
        }else{
            lockType = 1;
        }
        // retry until the lock manager grants the lock (no deadlock handling yet)
        boolean lockAcquired = false;
        while(!lockAcquired){
            lockAcquired = lockManager.acquireLock(pid,tid,lockType);
        }
        if(!pageStore.containsKey(pid)){
            int tabId = pid.getTableId();
            DbFile file = Database.getCatalog().getDatabaseFile(tabId);
            Page page = file.readPage(pid);
            if(pageStore.size()==numPages){
                evictPage();
            }
            pageStore.put(pid,page);
            pageAge.put(pid,age++);
            return page;
        }
        return pageStore.get(pid);
    }
    /**
     * Releases the lock on a page.
     * Calling this is very risky, and may result in wrong behavior. Think hard
     * about who needs to call this and why, and why they can run the risk of
     * calling it.
     *
     * @param tid the ID of the transaction requesting the unlock
     * @param pid the ID of the page to unlock
     */
    public void releasePage(TransactionId tid, PageId pid) {
        // some code goes here
        // not necessary for lab1|lab2
        lockManager.releaseLock(pid,tid);
    }
    /**
     * Release all locks associated with a given transaction.
     *
     * @param tid the ID of the transaction requesting the unlock
     */
    public void transactionComplete(TransactionId tid) throws IOException {
        // some code goes here
        // not necessary for lab1|lab2
        transactionComplete(tid,true);
    }
    /** Return true if the specified transaction has a lock on the specified page */
    public boolean holdsLock(TransactionId tid, PageId p) {
        // some code goes here
        // not necessary for lab1|lab2
        return lockManager.holdsLock(p,tid);
    }
    // abort helper: replace every page dirtied by tid with its on-disk version
    private synchronized void restorePages(TransactionId tid) {
        for (PageId pid : pageStore.keySet()) {
            Page page = pageStore.get(pid);
            TransactionId dirtier = page.isDirty();
            if (dirtier != null && dirtier.equals(tid)) {
                int tabId = pid.getTableId();
                DbFile file = Database.getCatalog().getDatabaseFile(tabId);
                Page pageFromDisk = file.readPage(pid);
                pageStore.put(pid, pageFromDisk);
            }
        }
    }
}
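Next you need to implement rigorous two-phase locking. This means that a transaction should acquire the appropriate lock on an object before accessing it, and should not release any lock until after the transaction commits.
Fortunately, the SimpleDB design makes it possible to lock a page inside BufferPool.getPage(). So rather than adding locking calls at every place that calls getPage(), we recommend acquiring locks inside getPage().
You need to acquire a shared lock before reading any page (or tuple) and an exclusive lock before writing any page (or tuple). You may have noticed that BufferPool already uses Permissions objects; the Permissions object indicates which type of lock the caller wants on the object being accessed.
Note that your implementations of HeapFile.insertTuple() and HeapFile.deleteTuple(), as well as the iterator returned by HeapFile.iterator(), should access pages through BufferPool.getPage(). Double-check that each of these calls to getPage() passes the correct Permissions object (e.g., Permissions.READ_WRITE or Permissions.READ_ONLY). Also double-check that BufferPool.insertTuple() and BufferPool.deleteTuple() call markDirty() on every page they access.
After acquiring locks, you need to think about when to release them. To guarantee rigorous 2PL, you should release all of a transaction's locks only after it has committed or aborted. However, there are cases where releasing a lock before the transaction ends is useful, for example releasing a shared lock on a page after scanning it for empty slots.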
Exercise 2.
Ensure that you acquire and release locks throughout SimpleDB. Some (but not necessarily all) actions that you should verify work properly:
- Reading tuples off of pages during a SeqScan (if you implemented locking in BufferPool.getPage(), this should work correctly as long as your HeapFile.iterator() uses BufferPool.getPage().)
- Inserting and deleting tuples through BufferPool and HeapFile methods (if you implemented locking in BufferPool.getPage(), this should work correctly as long as HeapFile.insertTuple() and HeapFile.deleteTuple() use BufferPool.getPage().)
You will also want to think especially hard about acquiring and releasing locks in the following situations:
- Adding a new page to a HeapFile. When do you physically write the page to disk? Are there race conditions with other transactions (on other threads) that might need special attention at the HeapFile level, regardless of page-level locking?
- Looking for an empty slot into which you can insert tuples. Most implementations scan pages looking for an empty slot, and will need a READ_ONLY lock to do this. Surprisingly, however, if a transaction t finds no free slot on a page p, t may immediately release the lock on p. Although this apparently contradicts the rules of two-phase locking, it is ok because t did not use any data from the page, such that a concurrent transaction t' which updated p cannot possibly affect the answer or outcome of t.
At this point, your code should pass the unit tests in LockingTest.
The code in BufferPool.java related to acquiring and releasing locks:
    /**
     * Retrieve the specified page with the associated permissions.
     * Will acquire a lock and may block if that lock is held by another
     * transaction.
     *
     * The retrieved page should be looked up in the buffer pool. If it
     * is present, it should be returned. If it is not present, it should
     * be added to the buffer pool and returned. If there is insufficient
     * space in the buffer pool, a page should be evicted and the new page
     * should be added in its place.
     *
     * @param tid the ID of the transaction requesting the page
     * @param pid the ID of the requested page
     * @param perm the requested permissions on the page
     */
    public Page getPage(TransactionId tid, PageId pid, Permissions perm)
            throws TransactionAbortedException, DbException {
        // some code goes here
        // READ_ONLY maps to a shared lock (0), anything else to an exclusive lock (1)
        int lockType;
        if(perm == Permissions.READ_ONLY){
            lockType = 0;
        }else{
            lockType = 1;
        }
        // retry until the lock manager grants the lock (no deadlock handling yet)
        boolean lockAcquired = false;
        while(!lockAcquired){
            lockAcquired = lockManager.acquireLock(pid,tid,lockType);
        }
        if(!pageStore.containsKey(pid)){
            int tabId = pid.getTableId();
            DbFile file = Database.getCatalog().getDatabaseFile(tabId);
            Page page = file.readPage(pid);
            if(pageStore.size()==numPages){
                evictPage();
            }
            pageStore.put(pid,page);
            pageAge.put(pid,age++);
            return page;
        }
        return pageStore.get(pid);
    }
    /**
     * Commit or abort a given transaction; release all locks associated to
     * the transaction.
     *
     * @param tid the ID of the transaction requesting the unlock
     * @param commit a flag indicating whether we should commit or abort
     */
    public void transactionComplete(TransactionId tid, boolean commit)
            throws IOException {
        // some code goes here
        // not necessary for lab1|lab2
        if(commit){
            flushPages(tid);
        }else{
            restorePages(tid);
        }
        // walk the lock table rather than pageStore, so that locks on pages
        // that were evicted while still locked are released as well
        for(PageId pid:lockManager.lockMap.keySet()){
            if(holdsLock(tid,pid))
                releasePage(tid,pid);
        }
    }
The code in HeapFile.java related to acquiring and releasing locks:
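    // see DbFile.java for javadocs
    public ArrayList<Page> insertTuple(TransactionId tid, Tuple t)
            throws DbException, IOException, TransactionAbortedException {
        // some code goes here
        HeapPage page = null;
        // find a non full page
        // (assumption: the body of this search follows the Exercise 2 description,
        // using READ_WRITE directly so that no lock upgrade is needed)
        for(int i=0;i<numPages();++i){
            HeapPageId pid = new HeapPageId(getId(),i);
            page = (HeapPage)Database.getBufferPool().getPage(tid,pid,Permissions.READ_WRITE);
            if(page.getNumEmptySlots() != 0)
                break;
            // nothing on a full page was used, so its lock can be released early
            Database.getBufferPool().releasePage(tid,pid);
            page = null;
        }
        // every existing page is full: append an empty page to the file
        if(page == null){
            HeapPageId pid = new HeapPageId(getId(),numPages());
            writePage(new HeapPage(pid,HeapPage.createEmptyPageData()));
            page = (HeapPage)Database.getBufferPool().getPage(tid,pid,Permissions.READ_WRITE);
        }
        page.insertTuple(t);
        // return res
        ArrayList<Page> res = new ArrayList<>();
        res.add(page);
        return res;
    }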
    // see DbFile.java for javadocs
    public ArrayList<Page> deleteTuple(TransactionId tid, Tuple t) throws DbException,
            TransactionAbortedException {
        // some code goes here
        RecordId rid = t.getRecordId();
        PageId pid = rid.getPageId();
        // delete the tuple under an exclusive lock; the caller marks the returned page dirty
        HeapPage page = (HeapPage)Database.getBufferPool().getPage(tid,pid,Permissions.READ_WRITE);
        page.deleteTuple(t);
        // return res
        ArrayList<Page> res = new ArrayList<>();
        res.add(page);
        return res;
    }
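Running ant runtest -Dtest=LockingTest reports BUILD SUCCESSFUL for the unit tests.
Modifications made by a transaction are written to disk only after it commits. This means a transaction can be aborted by discarding its dirty pages and rereading them from disk. For this to work, dirty pages must never be evicted; this policy is called NO STEAL.
You now need to modify the evictPage method in BufferPool so that it never evicts a dirty page. If your existing eviction policy would pick a dirty page, you have to find an alternative page to evict. If every page in the buffer pool is dirty, throw a DbException.
Note that under NO STEAL it is fine to evict a clean page that is locked by a running transaction, as long as the lock manager keeps its information about evicted pages and none of your operator implementations keep references to Page objects that have been evicted.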
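The NO STEAL rule can be illustrated in isolation with a tiny cache that evicts the least recently used clean page and refuses to evict when every page is dirty. This is a minimal sketch under stated assumptions: the class NoStealCacheSketch, the string page ids and the boolean dirty flag are hypothetical stand-ins, and it throws IllegalStateException where SimpleDB would throw a DbException.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical cache that never evicts a dirty page (NO STEAL). */
class NoStealCacheSketch {
    // pageId -> dirty flag, iterated from least to most recently accessed
    private final LinkedHashMap<String, Boolean> pages =
            new LinkedHashMap<>(16, 0.75f, true);
    private final int capacity;

    NoStealCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Caches a page, evicting a clean page first if the cache is full. */
    void put(String pageId, boolean dirty) {
        if (!pages.containsKey(pageId) && pages.size() >= capacity) {
            evict();
        }
        pages.put(pageId, dirty);
    }

    /** Removes the least recently used clean page and returns its id. */
    String evict() {
        Iterator<Map.Entry<String, Boolean>> it = pages.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Boolean> entry = it.next();
            if (!entry.getValue()) {          // clean page: safe to drop
                String victim = entry.getKey();
                it.remove();
                return victim;
            }
        }
        // Every cached page is dirty, so nothing may be evicted.
        throw new IllegalStateException("cannot evict: every cached page is dirty");
    }

    boolean contains(String pageId) {
        return pages.containsKey(pageId);
    }
}
```

With a capacity of 2, caching a dirty page "a" and then a clean page "b" means that adding "c" evicts "b" even though "a" is older, because "a" is dirty.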
Exercise 3.
Implement the necessary logic for page eviction without evicting dirty pages in the evictPage method in BufferPool.
The code in BufferPool.java:
    /**
     * Discards a page from the buffer pool.
     * Only clean pages are evicted (NO STEAL), so nothing has to be flushed here.
     */
    private synchronized void evictPage() throws DbException {
        // some code goes here
        // not necessary for lab1
        assert numPages == pageStore.size() : "Buffer pool is not full, no need to evict a page";
        PageId pageId = null;
        int oldestAge = -1;
        // find the oldest page to evict (which is not dirty)
        for (PageId pid: pageAge.keySet()) {
            Page page = pageStore.get(pid);
            // skip dirty page
            if (page.isDirty() != null)
                continue;
            if (pageId == null) {
                pageId = pid;
                oldestAge = pageAge.get(pid);
                continue;
            }
            if (pageAge.get(pid) < oldestAge) {
                pageId = pid;
                oldestAge = pageAge.get(pid);
            }
        }
        if (pageId == null)
            throw new DbException("failed to evict page: all pages are dirty");
        // evict page
        pageStore.remove(pageId);
        pageAge.remove(pageId);
    }
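In SimpleDB, a TransactionId object is created at the beginning of each query. This object is passed to every operator involved in the query, and when the query completes, the BufferPool method transactionComplete is called.
Calling this method either commits or aborts the transaction, as specified by the commit parameter. At any point during execution, an operator may throw a TransactionAbortedException, which indicates that an internal error or a deadlock has occurred. The provided tests create the appropriate TransactionId objects, pass them to the operators in the appropriate way, and call transactionComplete when a query finishes. TransactionId itself is already implemented for you.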
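Under NO STEAL/FORCE, the difference between committing and aborting comes down to which copy of a page wins: the cached one is written out on commit, the on-disk one is reloaded on abort. The sketch below models this with in-memory maps only; the class CommitAbortSketch, its fields and the string page contents are hypothetical and not part of SimpleDB.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of commit (FORCE) and abort (discard dirty pages). */
class CommitAbortSketch {
    final Map<String, String> disk = new HashMap<>();      // pageId -> contents on disk
    final Map<String, String> cache = new HashMap<>();     // pageId -> contents in the buffer pool
    final Map<String, String> dirtiedBy = new HashMap<>(); // pageId -> transaction that dirtied it

    /** Modifies a page in the cache only and remembers who dirtied it. */
    void write(String tid, String pageId, String contents) {
        cache.put(pageId, contents);
        dirtiedBy.put(pageId, tid);
    }

    void transactionComplete(String tid, boolean commit) {
        // Iterate over a copy because entries are removed along the way.
        for (Map.Entry<String, String> entry : new HashMap<>(dirtiedBy).entrySet()) {
            if (!entry.getValue().equals(tid)) {
                continue; // page was dirtied by another transaction
            }
            String pageId = entry.getKey();
            if (commit) {
                // FORCE: the committed version must reach disk.
                disk.put(pageId, cache.get(pageId));
            } else if (disk.containsKey(pageId)) {
                // Abort: throw away the change and reload the on-disk version.
                cache.put(pageId, disk.get(pageId));
            } else {
                // Abort of a page that never existed on disk.
                cache.remove(pageId);
            }
            dirtiedBy.remove(pageId);
        }
        // A real implementation would also release all locks held by tid here.
    }
}
```

Because dirty pages are never evicted, the on-disk copy is always the last committed version, which is why an abort needs no undo log.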
Exercise 4.
Implement the transactionComplete() method in BufferPool. Note that there are two versions of transactionComplete, one which accepts an additional boolean commit argument, and one which does not. The version without the additional argument should always commit and so can simply be implemented by calling transactionComplete(tid, true).
When you commit, you should flush dirty pages associated to the transaction to disk. When you abort, you should revert any changes made by the transaction by restoring the page to its on-disk state.
Whether the transaction commits or aborts, you should also release any state the BufferPool keeps regarding the transaction, including releasing any locks that the transaction held.
At this point, your code should pass the TransactionTest unit test and the AbortEvictionTest system test. You may find the TransactionTest system test illustrative, but it will likely fail until you complete the next exercise.
    /**
     * Release all locks associated with a given transaction.
     *
     * @param tid the ID of the transaction requesting the unlock
     */
    public void transactionComplete(TransactionId tid) throws IOException {
        // some code goes here
        // not necessary for lab1|lab2
        transactionComplete(tid,true);
    }
    /**
     * Commit or abort a given transaction; release all locks associated to
     * the transaction.
     *
     * @param tid the ID of the transaction requesting the unlock
     * @param commit a flag indicating whether we should commit or abort
     */
    public void transactionComplete(TransactionId tid, boolean commit)
            throws IOException {
        // some code goes here
        // not necessary for lab1|lab2
        if(commit){
            // FORCE: write the transaction's dirty pages to disk
            flushPages(tid);
        }else{
            // abort: discard the transaction's dirty pages
            restorePages(tid);
        }
        // walk the lock table rather than pageStore, so that locks on pages
        // that were evicted while still locked are released as well
        for(PageId pid:lockManager.lockMap.keySet()){
            if(holdsLock(tid,pid))
                releasePage(tid,pid);
        }
    }
    // abort helper: replace every page dirtied by tid with its on-disk version
    private synchronized void restorePages(TransactionId tid) {
        for (PageId pid : pageStore.keySet()) {
            Page page = pageStore.get(pid);
            TransactionId dirtier = page.isDirty();
            if (dirtier != null && dirtier.equals(tid)) {
                int tabId = pid.getTableId();
                DbFile file = Database.getCatalog().getDatabaseFile(tabId);
                Page pageFromDisk = file.readPage(pid);
                pageStore.put(pid, pageFromDisk);
            }
        }
    }
    /** Write all pages of the specified transaction to disk.
     */
    public synchronized void flushPages(TransactionId tid) throws IOException {
        // some code goes here
        // not necessary for lab1|lab2
        for (PageId pid : pageStore.keySet()) {
            Page page = pageStore.get(pid);
            TransactionId dirtier = page.isDirty();
            if (dirtier != null && dirtier.equals(tid)) {
                flushPage(pid);
            }
        }
    }
ant runtest -Dtest=TransactionTest
The unit test reports BUILD SUCCESSFUL.
ant runsystest -Dtest=AbortEvictionTest
The system test reports BUILD SUCCESSFUL.
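Transactions in SimpleDB can deadlock. You need to detect this situation and throw a TransactionAbortedException.
There are many possible ways to detect deadlock. For example, you can implement a simple timeout policy that aborts a transaction if it has not completed within a given period of time. Alternatively, you can implement cycle detection on a dependency graph data structure: whenever you attempt to grant a lock, check the graph for a cycle, and abort something if one exists.
After detecting a deadlock, you need to decide how to resolve it. Suppose the deadlock is detected while transaction t is waiting for a lock. If you are feeling ruthless, you can abort every transaction that t is waiting for; this may throw away a large amount of work, but it guarantees that t can make progress. Alternatively, you can abort t to give the other transactions a chance to make progress, which means t has to be retried later.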
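The dependency-graph alternative can be sketched as a wait-for graph that refuses any edge that would close a cycle. This is an illustration only: the class WaitForGraphSketch, its method names and the string transaction ids are assumptions. A lock manager would call addWait just before blocking and, on a false return, abort the requesting transaction (the "abort t" choice described above).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical wait-for graph: an edge a -> b means "a waits for a lock held by b". */
class WaitForGraphSketch {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    /**
     * Records that waiter is blocked on holder. Returns false, without adding
     * the edge, if holder already (transitively) waits for waiter, because the
     * new edge would then close a cycle, i.e. a deadlock.
     */
    synchronized boolean addWait(String waiter, String holder) {
        if (waiter.equals(holder)) {
            return true; // a transaction never waits for itself; nothing to record
        }
        if (reaches(holder, waiter, new HashSet<>())) {
            return false;
        }
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
        return true;
    }

    /** Drops all outgoing edges of waiter, e.g. once it gets its lock or aborts. */
    synchronized void clearWaits(String waiter) {
        waitsFor.remove(waiter);
    }

    /** Depth-first search: is target reachable from start along wait-for edges? */
    private boolean reaches(String start, String target, Set<String> visited) {
        if (start.equals(target)) {
            return true;
        }
        if (!visited.add(start)) {
            return false; // already explored this node
        }
        for (String next : waitsFor.getOrDefault(start, Collections.<String>emptySet())) {
            if (reaches(next, target, visited)) {
                return true;
            }
        }
        return false;
    }
}
```

Compared with a timeout, this detects a deadlock as soon as it forms and never aborts a transaction that is merely slow, at the cost of maintaining the graph on every blocked lock request.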
Exercise 5.
Implement deadlock detection and resolution in src/simpledb/BufferPool.java. Most likely, you will want to check for a deadlock whenever a transaction attempts to acquire a lock and finds another transaction is holding the lock (note that this by itself is not a deadlock, but may be symptomatic of one.) You have many design decisions for your deadlock resolution system, but it is not necessary to do something complicated. Please describe your choices in the lab writeup.
You should ensure that your code aborts transactions properly when a deadlock occurs, by throwing a TransactionAbortedException exception. This exception will be caught by the code executing the transaction (e.g., TransactionTest.java), which should call transactionComplete() to clean up after the transaction. You are not expected to automatically restart a transaction which fails due to a deadlock; you can assume that higher level code will take care of this.
We have provided some (not-so-unit) tests in test/simpledb/DeadlockTest.java. They are actually a bit involved, so they may take more than a few seconds to run (depending on your policy). If they seem to hang indefinitely, then you probably have an unresolved deadlock. These tests construct simple deadlock situations that your code should be able to escape.
Note that there are two timing parameters near the top of DeadlockTest.java; these determine the frequency at which the test checks if locks have been acquired and the waiting time before an aborted transaction is restarted. You may observe different performance characteristics by tweaking these parameters if you use a timeout-based detection method. The tests will output TransactionAbortedExceptions corresponding to resolved deadlocks to the console.
Your code should now pass the TransactionTest system test (which may also run for quite a long time).
At this point, you should have a recoverable database, in the sense that if the database system crashes (at a point other than transactionComplete()) or if the user explicitly aborts a transaction, the effects of any running transaction will not be visible after the system restarts (or the transaction aborts.) You may wish to verify this by running some transactions and explicitly killing the database server.
The code in BufferPool.java:
    /**
     * Retrieve the specified page with the associated permissions.
     * Will acquire a lock and may block if that lock is held by another
     * transaction.
     *
     * The retrieved page should be looked up in the buffer pool. If it
     * is present, it should be returned. If it is not present, it should
     * be added to the buffer pool and returned. If there is insufficient
     * space in the buffer pool, a page should be evicted and the new page
     * should be added in its place.
     *
     * @param tid the ID of the transaction requesting the page
     * @param pid the ID of the requested page
     * @param perm the requested permissions on the page
     */
    public Page getPage(TransactionId tid, PageId pid, Permissions perm)
            throws TransactionAbortedException, DbException {
        // some code goes here
        // READ_ONLY maps to a shared lock (0), anything else to an exclusive lock (1)
        int lockType;
        if(perm == Permissions.READ_ONLY){
            lockType = 0;
        }else{
            lockType = 1;
        }
        boolean lockAcquired = false;
        long start = System.currentTimeMillis();
        // random timeout of roughly 1 to 3 seconds, so that transactions stuck
        // in the same deadlock do not all give up at the same moment
        long timeout = new Random().nextInt(2000) + 1000;
        while(!lockAcquired){
            long now = System.currentTimeMillis();
            if(now-start > timeout){
                // waiting this long is treated as a deadlock: abort this
                // transaction to give the others a chance. The caller catches
                // TransactionAbortedException and calls transactionComplete()
                // to abort the transaction.
                throw new TransactionAbortedException();
            }
            lockAcquired = lockManager.acquireLock(pid,tid,lockType);
        }
        if(!pageStore.containsKey(pid)){
            int tabId = pid.getTableId();
            DbFile file = Database.getCatalog().getDatabaseFile(tabId);
            Page page = file.readPage(pid);
            if(pageStore.size()==numPages){
                evictPage();
            }
            pageStore.put(pid,page);
            pageAge.put(pid,age++);
            return page;
        }
        return pageStore.get(pid);
    }
ant runtest -Dtest=DeadlockTest
The unit test reports BUILD SUCCESSFUL.
ant runsystest -Dtest=TransactionTest
The system test reports BUILD SUCCESSFUL.