Preface
Back when I was reading LruCache I wanted to look at a disk-based version of it. Having read LruCache, if I were to write one myself I would probably still keep the keys in an LRU structure just like LruCache does, and only move the values to disk. That still leaves the mapping between key and value to sort out. If each key maps to its own file, things are easy: derive the file name from the key, then read or write that file. But the number of files then grows linearly with the number of keys, which may not perform well. If several keys share one file, you have to track which file a key's value lives in and at what offset. These are exactly the problems a key-value database has to solve, except that a database's entries normally never expire (Redis, of course, does support expiration).
That gave me a sudden urge to study how an in-memory database such as Redis is implemented. I remember it uses a skip list internally, which keeps insertion and deletion within a bounded time (a red-black tree would also do); the core question is how to find a key's metadata quickly and from there locate its data. You might also want part of the data in memory and the rest on disk, so that the database's capacity is effectively the disk's capacity. Following the 80/20 rule, hot data stays in memory and cold data goes to disk, much like an operating system's memory management, and LRU can be used to manage the in-memory part. That is slower than keeping everything in memory, but as long as most of the data is in memory it is far faster than hitting the disk on every access. With these questions in mind, let's look at Jake Wharton's DiskLruCache.
What is DiskLruCache?
It is a small and compact key-value cache that evicts entries according to the LRU rule. Values can only be strings, so whatever you want to cache has to be converted to a string first. That limits the library to string data, but it is the price of staying small.
How is it used?
// Create an instance; cacheDir is the directory that holds the cache files
DiskLruCache cache = DiskLruCache.open(cacheDir, appVersion, 2, Integer.MAX_VALUE);
// Create or update an entry (valueCount is 2, so every entry has two values)
DiskLruCache.Editor creator = cache.edit("k1");
creator.set(0, "A");
creator.set(1, "B");
// commit() must be called, otherwise the edit is never published
creator.commit();
// Read an entry
DiskLruCache.Snapshot snapshot = cache.get("k1");
String data = snapshot.getString(0);
snapshot.close();
// Close the cache when you are done with it
cache.close();
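In real use the key usually comes from something like a URL. edit() calls validateKey (not listed in this article), which rejects keys containing illegal characters (at minimum spaces and newlines; newer versions of the library restrict keys to lower-case letters, digits, '-' and '_', at most 120 characters), so callers typically hash the original string into a safe key first. A minimal sketch, using a hypothetical helper named hashKey:
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
// Hypothetical helper, not part of DiskLruCache: hash an arbitrary string (e.g. a URL)
// into lower-case hex so that it is always a legal DiskLruCache key.
static String hashKey(String raw) {
    try {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(raw.getBytes(StandardCharsets.UTF_8));
        return String.format("%032x", new BigInteger(1, digest));
    } catch (NoSuchAlgorithmException e) {
        // MD5 is available on every JVM/Android runtime; fall back to hashCode just in case.
        return Long.toString(raw.hashCode() & 0xffffffffL);
    }
}
// Usage: cache.edit(hashKey("https://example.com/image.png"));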
Source code analysis
Fields
DiskLruCache works with three journal-related files: the journal file itself, which records every operation we perform; a temporary journal file (journalFileTmp), used while the journal is being rebuilt (value data that has not been committed yet lives in per-entry .tmp files instead); and a backup journal file, from which the journal can be recovered. lruEntries is the set of keys, kept in a LinkedHashMap in access order, just like LruCache. redundantOpCount counts the redundant operation records in the journal; once it passes 2000 (and outnumbers the live entries) the journal is rebuilt (see the sketch after the field list), while trimToSize() is what evicts entries when the cache grows past maxSize.
private final File directory;
private final File journalFile;
private final File journalFileTmp;
private final File journalFileBackup;
private final int appVersion;
private long maxSize;
private final int valueCount;
private long size = 0;
private Writer journalWriter;
private final LinkedHashMap<String, Entry> lruEntries =
        new LinkedHashMap<String, Entry>(0, 0.75f, true);
private int redundantOpCount;
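The rebuild threshold mentioned above is checked by journalRebuildRequired(); in the library the check looks roughly like this: the journal is only rebuilt once there are at least 2000 redundant records and they outnumber the live entries.
// Roughly the check used to decide whether the journal should be rebuilt.
private boolean journalRebuildRequired() {
    final int redundantOpCompactThreshold = 2000;
    return redundantOpCount >= redundantOpCompactThreshold
            && redundantOpCount >= lruEntries.size();
}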
The journal file format: the first few lines form the header (a magic string, the journal version, the app version, valueCount, and a blank line); every line after that is one operation record of the form STATE key [lengths], where STATE is DIRTY, CLEAN, READ or REMOVE.
* This cache uses a journal file named "journal". A typical journal file
* looks like this:
* libcore.io.DiskLruCache
* 1
* 100
* 2
*
* CLEAN 3400330d1dfc7f3f7f4b8d4d803dfcf6 832 21054
* DIRTY 335c4c6028171cfddfbaae1a9c313c52
* CLEAN 335c4c6028171cfddfbaae1a9c313c52 3934 2342
* REMOVE 335c4c6028171cfddfbaae1a9c313c52
* DIRTY 1ab96a171faeeee38496d8b330771a7a
* CLEAN 1ab96a171faeeee38496d8b330771a7a 1600 234
* READ 335c4c6028171cfddfbaae1a9c313c52
* READ 3400330d1dfc7f3f7f4b8d4d803dfcf6
Now let's see how an instance is created. open() first checks whether a backup journal exists; if it does and a journal file also exists, the backup is simply deleted, otherwise the backup is renamed to become the journal. On first creation none of these files exist, so execution falls through to directory.mkdirs() followed by rebuildJournal(). (When a journal does exist it is read and replayed; a sketch of processJournal(), which is called right after readJournal(), follows the code.)
/**
* Opens the cache in {@code directory}, creating a cache if none exists
* there.
*
* @param directory a writable directory
* @param valueCount the number of values per cache entry. Must be positive.
* @param maxSize the maximum number of bytes this cache should use to store
* @throws IOException if reading or writing the cache directory fails
*/
public static DiskLruCache open(File directory, int appVersion, int valueCount, long maxSize)
throws IOException {
if (maxSize <= 0) {
throw new IllegalArgumentException("maxSize <= 0");
}
if (valueCount <= 0) {
throw new IllegalArgumentException("valueCount <= 0");
}
// If a bkp file exists, use it instead.
File backupFile = new File(directory, JOURNAL_FILE_BACKUP);
if (backupFile.exists()) {
File journalFile = new File(directory, JOURNAL_FILE);
// If journal file also exists just delete backup file.
if (journalFile.exists()) {
backupFile.delete();
} else {
renameTo(backupFile, journalFile, false);
}
}
// Prefer to pick up where we left off.
DiskLruCache cache = new DiskLruCache(directory, appVersion, valueCount, maxSize);
if (cache.journalFile.exists()) {
try {
cache.readJournal();
cache.processJournal();
return cache;
} catch (IOException journalIsCorrupt) {
System.out
.println("DiskLruCache "
+ directory
+ " is corrupt: "
+ journalIsCorrupt.getMessage()
+ ", removing");
cache.delete();
}
}
// Create a new empty cache.
directory.mkdirs();
cache = new DiskLruCache(directory, appVersion, valueCount, maxSize);
cache.rebuildJournal();
return cache;
}
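processJournal(), called above after a successful readJournal(), is not listed in full in this article. Roughly, it deletes any leftover temporary journal, sums the value lengths of every committed entry to compute the initial size, and discards entries that were left DIRTY, because an interrupted edit means their files cannot be trusted:
// Roughly what processJournal() does once the journal has been replayed into lruEntries.
private void processJournal() throws IOException {
    deleteIfExists(journalFileTmp);
    for (Iterator<Entry> i = lruEntries.values().iterator(); i.hasNext(); ) {
        Entry entry = i.next();
        if (entry.currentEditor == null) {
            // Committed entry: its value files count toward the cache size.
            for (int t = 0; t < valueCount; t++) {
                size += entry.lengths[t];
            }
        } else {
            // The entry was mid-edit when the cache last stopped: drop it entirely.
            entry.currentEditor = null;
            for (int t = 0; t < valueCount; t++) {
                deleteIfExists(entry.getCleanFile(t));
                deleteIfExists(entry.getDirtyFile(t));
            }
            i.remove();
        }
    }
}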
rebuildJournal() is synchronized. It first writes the header into a temporary journal file, then writes one line per entry, and finally sorts out the real journal: if a journal file already exists it is renamed to the backup first, then the temporary file is renamed to become the journal, and the backup is deleted. The goal is always to end up with exactly one valid journal file. Finally a new journalWriter is opened on the journal in append mode. (An illustration of a freshly rebuilt journal's contents follows the code.)
/**
* Creates a new journal that omits redundant information. This replaces the
* current journal if it exists.
*/
private synchronized void rebuildJournal() throws IOException {
if (journalWriter != null) {
journalWriter.close();
}
Writer writer = new BufferedWriter(
new OutputStreamWriter(new FileOutputStream(journalFileTmp), Util.US_ASCII));
try {
writer.write(MAGIC);
writer.write("\n");
writer.write(VERSION_1);
writer.write("\n");
writer.write(Integer.toString(appVersion));
writer.write("\n");
writer.write(Integer.toString(valueCount));
writer.write("\n");
writer.write("\n");
// an entry that is currently being edited has a non-null currentEditor
for (Entry entry : lruEntries.values()) {
if (entry.currentEditor != null) {
writer.write(DIRTY + ' ' + entry.key + '\n');
} else {
writer.write(CLEAN + ' ' + entry.key + entry.getLengths() + '\n');
}
}
} finally {
writer.close();
}
if (journalFile.exists()) {
renameTo(journalFile, journalFileBackup, true);
}
renameTo(journalFileTmp, journalFile, false);
journalFileBackup.delete();
journalWriter = new BufferedWriter(
new OutputStreamWriter(new FileOutputStream(journalFile, true), Util.US_ASCII));
}
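After a rebuild the journal contains only the header plus one line per live entry. Reusing the example keys from the journal excerpt above, a rebuilt file would look something like this (illustrative contents, assuming two committed entries and one entry with an edit in progress):
libcore.io.DiskLruCache
1
100
2

CLEAN 3400330d1dfc7f3f7f4b8d4d803dfcf6 832 21054
CLEAN 1ab96a171faeeee38496d8b330771a7a 1600 234
DIRTY 335c4c6028171cfddfbaae1a9c313c52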
// renameTo replaces the destination file by renaming the source (a move, not a copy)
private static void renameTo(File from, File to, boolean deleteDestination) throws IOException {
if (deleteDestination) {
deleteIfExists(to);
}
if (!from.renameTo(to)) {
throw new IOException();
}
}
Now look at Entry: it mainly holds the key, one length per value (valueCount of them), and currentEditor, which is non-null while an edit is in progress. (The on-disk layout this implies is illustrated after the class.)
private final class Entry {
private final String key;
/** Lengths of this entry's files. */
private final long[] lengths;
/** True if this entry has ever been published. */
private boolean readable;
/** The ongoing edit or null if this entry is not being edited. */
private Editor currentEditor;
/** The sequence number of the most recently committed edit to this entry. */
private long sequenceNumber;
private Entry(String key) {
this.key = key;
this.lengths = new long[valueCount];
}
public String getLengths() throws IOException {
StringBuilder result = new StringBuilder();
for (long size : lengths) {
result.append(' ').append(size);
}
return result.toString();
}
/** Set lengths using decimal numbers like "10123". */
private void setLengths(String[] strings) throws IOException {
if (strings.length != valueCount) {
throw invalidLengths(strings);
}
try {
for (int i = 0; i < strings.length; i++) {
lengths[i] = Long.parseLong(strings[i]);
}
} catch (NumberFormatException e) {
throw invalidLengths(strings);
}
}
private IOException invalidLengths(String[] strings) throws IOException {
throw new IOException("unexpected journal line: " + java.util.Arrays.toString(strings));
}
public File getCleanFile(int i) {
return new File(directory, key + "." + i);
}
public File getDirtyFile(int i) {
return new File(directory, key + "." + i + ".tmp");
}
}
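For intuition: with valueCount = 2, as in the usage example, an entry keyed "k1" owns the clean files k1.0 and k1.1 (the committed values) and, while an edit is in progress, the dirty files k1.0.tmp and k1.1.tmp. A tiny hypothetical helper that mirrors getCleanFile()/getDirtyFile() above makes the layout explicit:
// Hypothetical helper, only to illustrate the on-disk layout implied by Entry.
static void printEntryFiles(File directory, String key, int valueCount) {
    for (int i = 0; i < valueCount; i++) {
        System.out.println("clean: " + new File(directory, key + "." + i));          // e.g. k1.0
        System.out.println("dirty: " + new File(directory, key + "." + i + ".tmp")); // e.g. k1.0.tmp
    }
}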
Now look at how an edit starts. edit() fetches the existing Entry for the key or creates a new one, binds an Editor to it, and writes a DIRTY record to the journal.
public Editor edit(String key) throws IOException {
return edit(key, ANY_SEQUENCE_NUMBER);
}
private synchronized Editor edit(String key, long expectedSequenceNumber) throws IOException {
checkNotClosed();
validateKey(key);
Entry entry = lruEntries.get(key);
if (expectedSequenceNumber != ANY_SEQUENCE_NUMBER && (entry == null
|| entry.sequenceNumber != expectedSequenceNumber)) {
return null; // Snapshot is stale.
}
if (entry == null) {
entry = new Entry(key);
lruEntries.put(key, entry);
} else if (entry.currentEditor != null) {
return null; // Another edit is in progress.
}
Editor editor = new Editor(entry);
entry.currentEditor = editor;
// Flush the journal before creating files to prevent file leaks.
journalWriter.write(DIRTY + ' ' + key + '\n');
journalWriter.flush();
return editor;
}
The Editor implementation: each entry has at most one Editor at a time, and it is through the Editor that we set and read values. (A defensive usage pattern for it is sketched after the class.)
/** Edits the values for an entry. */
public final class Editor {
private final Entry entry;
private final boolean[] written;
private boolean hasErrors;
private boolean committed;
private Editor(Entry entry) {
this.entry = entry;
this.written = (entry.readable) ? null : new boolean[valueCount];
}
/**
* Returns an unbuffered input stream to read the last committed value,
* or null if no value has been committed.
*/
public InputStream newInputStream(int index) throws IOException {
synchronized (DiskLruCache.this) {
if (entry.currentEditor != this) {
throw new IllegalStateException();
}
if (!entry.readable) {
return null;
}
try {
return new FileInputStream(entry.getCleanFile(index));
} catch (FileNotFoundException e) {
return null;
}
}
}
/**
* Returns the last committed value as a string, or null if no value
* has been committed.
*/
public String getString(int index) throws IOException {
InputStream in = newInputStream(index);
return in != null ? inputStreamToString(in) : null;
}
/** Sets the value at {@code index} to {@code value}. */
public void set(int index, String value) throws IOException {
Writer writer = null;
try {
writer = new OutputStreamWriter(newOutputStream(index), Util.UTF_8);
writer.write(value);
} finally {
Util.closeQuietly(writer);
}
}
/**
* Commits this edit so it is visible to readers. This releases the
* edit lock so another edit may be started on the same key.
*/
public void commit() throws IOException {
if (hasErrors) {
completeEdit(this, false);
remove(entry.key); // The previous entry is stale.
} else {
completeEdit(this, true);
}
committed = true;
}
/**
* Aborts this edit. This releases the edit lock so another edit may be
* started on the same key.
*/
public void abort() throws IOException {
completeEdit(this, false);
}
public void abortUnlessCommitted() {
if (!committed) {
try {
abort();
} catch (IOException ignored) {
}
}
}
private class FaultHidingOutputStream extends FilterOutputStream {
private FaultHidingOutputStream(OutputStream out) {
super(out);
}
@Override public void write(int oneByte) {
try {
out.write(oneByte);
} catch (IOException e) {
hasErrors = true;
}
}
@Override public void write(byte[] buffer, int offset, int length) {
try {
out.write(buffer, offset, length);
} catch (IOException e) {
hasErrors = true;
}
}
@Override public void close() {
try {
out.close();
} catch (IOException e) {
hasErrors = true;
}
}
@Override public void flush() {
try {
out.flush();
} catch (IOException e) {
hasErrors = true;
}
}
}
}
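Because FaultHidingOutputStream swallows write errors and only records hasErrors, a failed write does not throw; it surfaces as the entry being silently removed when commit() runs. A defensive write pattern therefore looks like this (a sketch, assuming cache, key and value are in scope; remember that edit() returns null when another edit on the same key is in progress):
DiskLruCache.Editor editor = cache.edit(key);
if (editor != null) {
    try {
        editor.set(0, value);
        editor.commit();
    } finally {
        // No-op if commit() already ran; otherwise it releases the edit lock.
        editor.abortUnlessCommitted();
    }
}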
newOutputStream() returns the stream for writing one value. Note that each index is a separate file and must be smaller than valueCount. The method synchronizes on the whole cache, and writes go to the dirty (.tmp) file; only when commit() succeeds is that temporary file renamed to the real (clean) file.
/**
* Returns a new unbuffered output stream to write the value at
* {@code index}. If the underlying output stream encounters errors
* when writing to the filesystem, this edit will be aborted when
* {@link #commit} is called. The returned output stream does not throw
* IOExceptions.
*/
public OutputStream newOutputStream(int index) throws IOException {
if (index < 0 || index >= valueCount) {
throw new IllegalArgumentException("Expected index " + index + " to "
+ "be greater than 0 and less than the maximum value count "
+ "of " + valueCount);
}
synchronized (DiskLruCache.this) {
if (entry.currentEditor != this) {
throw new IllegalStateException();
}
if (!entry.readable) {
written[index] = true;
}
File dirtyFile = entry.getDirtyFile(index);
FileOutputStream outputStream;
try {
outputStream = new FileOutputStream(dirtyFile);
} catch (FileNotFoundException e) {
// Attempt to recreate the cache directory.
directory.mkdirs();
try {
outputStream = new FileOutputStream(dirtyFile);
} catch (FileNotFoundException e2) {
// We are unable to recover. Silently eat the writes.
return NULL_OUTPUT_STREAM;
}
}
return new FaultHidingOutputStream(outputStream);
}
}
When the edit is committed (or aborted), completeEdit() runs and locks the whole instance.
private synchronized void completeEdit(Editor editor, boolean success) throws IOException {
Entry entry = editor.entry;
// omitted
....
// promote the dirty files to the clean (committed) files
for (int i = 0; i < valueCount; i++) {
File dirty = entry.getDirtyFile(i);
if (success) {
if (dirty.exists()) {
File clean = entry.getCleanFile(i);
dirty.renameTo(clean);
long oldLength = entry.lengths[i];
long newLength = clean.length();
entry.lengths[i] = newLength;
size = size - oldLength + newLength;
}
} else {
deleteIfExists(dirty);
}
}
// one more journal operation
redundantOpCount++;
entry.currentEditor = null;
if (entry.readable | success) {
entry.readable = true;
journalWriter.write(CLEAN + ' ' + entry.key + entry.getLengths() + '\n');
if (success) {
entry.sequenceNumber = nextSequenceNumber++;
}
} else {
lruEntries.remove(entry.key);
journalWriter.write(REMOVE + ' ' + entry.key + '\n');
}
journalWriter.flush();
// evict entries / rebuild the journal if needed
if (size > maxSize || journalRebuildRequired()) {
executorService.submit(cleanupCallable);
}
}
The cleanup task that does the eviction:
/** This cache uses a single background thread to evict entries. */
final ThreadPoolExecutor executorService =
new ThreadPoolExecutor(0, 1, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
private final Callable<Void> cleanupCallable = new Callable<Void>() {
public Void call() throws Exception {
synchronized (DiskLruCache.this) {
if (journalWriter == null) {
return null; // Closed.
}
trimToSize();
if (journalRebuildRequired()) {
rebuildJournal();
redundantOpCount = 0;
}
}
return null;
}
};
// Accessed entries are moved to the tail of the linked list, so the head holds the oldest data;
// new entries are appended at the tail, and every access moves an entry back to the tail.
private void trimToSize() throws IOException {
while (size > maxSize) {
Map.Entry<String, Entry> toEvict = lruEntries.entrySet().iterator().next();
remove(toEvict.getKey());
}
}
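trimToSize() evicts the eldest entries through remove(key), which is not listed elsewhere in this article. Roughly, remove() deletes all of the entry's clean files, subtracts their lengths from size, appends a REMOVE record to the journal and drops the key from lruEntries; an entry that is currently being edited cannot be removed:
// Roughly what remove(key) does.
public synchronized boolean remove(String key) throws IOException {
    checkNotClosed();
    validateKey(key);
    Entry entry = lruEntries.get(key);
    if (entry == null || entry.currentEditor != null) {
        return false; // unknown key, or an edit is in progress
    }
    for (int i = 0; i < valueCount; i++) {
        File file = entry.getCleanFile(i);
        if (file.exists() && !file.delete()) {
            throw new IOException("failed to delete " + file);
        }
        size -= entry.lengths[i];
        entry.lengths[i] = 0;
    }
    redundantOpCount++;
    journalWriter.write(REMOVE + ' ' + key + '\n');
    lruEntries.remove(key);
    if (journalRebuildRequired()) {
        executorService.submit(cleanupCallable);
    }
    return true;
}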
Now back to open(): if a journal file already exists, readJournal() is called. It first validates the header, then replays every operation record. redundantOpCount = lineCount - lruEntries.size() is the number of redundant records; once it passes 2000 (and exceeds the number of live entries) the journal is rebuilt, which drops the stale records and keeps the journal file from growing without bound. (readJournalLine(), which replays a single record, is sketched after the code.)
private void readJournal() throws IOException {
StrictLineReader reader = new StrictLineReader(new FileInputStream(journalFile), Util.US_ASCII);
try {
String magic = reader.readLine();
String version = reader.readLine();
String appVersionString = reader.readLine();
String valueCountString = reader.readLine();
String blank = reader.readLine();
if (!MAGIC.equals(magic)
|| !VERSION_1.equals(version)
|| !Integer.toString(appVersion).equals(appVersionString)
|| !Integer.toString(valueCount).equals(valueCountString)
|| !"".equals(blank)) {
throw new IOException("unexpected journal header: [" + magic + ", " + version + ", "
+ valueCountString + ", " + blank + "]");
}
int lineCount = 0;
while (true) {
try {
readJournalLine(reader.readLine());
lineCount++;
} catch (EOFException endOfJournal) {
break;
}
}
redundantOpCount = lineCount - lruEntries.size();
// If we ended on a truncated line, rebuild the journal before appending to it.
if (reader.hasUnterminatedLine()) {
rebuildJournal();
} else {
journalWriter = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream(journalFile, true), Util.US_ASCII));
}
} finally {
Util.closeQuietly(reader);
}
}
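readJournalLine(), called in the loop above, replays a single record onto lruEntries. A simplified sketch of its behavior (the real implementation parses the line with indexOf instead of split, but the effect is the same): a REMOVE line drops the key, a CLEAN line marks the entry readable and records its value lengths, a DIRTY line attaches a fresh Editor, and a READ line only touches the key so it moves to the tail of the access-ordered map.
// Simplified sketch of readJournalLine(String line).
private void readJournalLine(String line) throws IOException {
    String[] parts = line.split(" ");
    if (parts.length < 2) {
        throw new IOException("unexpected journal line: " + line);
    }
    String key = parts[1];
    if (parts[0].equals(REMOVE) && parts.length == 2) {
        lruEntries.remove(key);
        return;
    }
    // get() also moves the entry to the tail of the access-ordered LinkedHashMap.
    Entry entry = lruEntries.get(key);
    if (entry == null) {
        entry = new Entry(key);
        lruEntries.put(key, entry);
    }
    if (parts[0].equals(CLEAN) && parts.length == 2 + valueCount) {
        entry.readable = true;
        entry.currentEditor = null;
        entry.setLengths(java.util.Arrays.copyOfRange(parts, 2, parts.length));
    } else if (parts[0].equals(DIRTY) && parts.length == 2) {
        entry.currentEditor = new Editor(entry);
    } else if (parts[0].equals(READ) && parts.length == 2) {
        // Nothing else to do: the lruEntries.get(key) call above already recorded the access.
    } else {
        throw new IOException("unexpected journal line: " + line);
    }
}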
Summary
Core idea: the same as LruCache. The keys are kept in a LinkedHashMap (which provides the LRU ordering) while the values live on disk; each value of a key gets its own file, so entries do not interfere with one another.