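1. The TrashPolicy class: every trash policy must extend this class. The default HDFS implementation is TrashPolicyDefault; a different implementation can be selected through fs.trash.classname.
2. The TrashPolicy and TrashPolicyDefault classes are shown below. Only part of the code is excerpted here: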
/**
 * This interface is used for implementing different Trash policies.
 * Provides factory method to create instances of the configured Trash policy.
 */
@InterfaceAudience.Public
@InterfaceStability.Evolving
public abstract class TrashPolicy extends Configured {
  protected FileSystem fs; // the FileSystem
  protected Path trash; // path to trash directory
  protected long deletionInterval; // deletion interval for Emptier
......
public class TrashPolicyDefault extends TrashPolicy {
  private static final Logger LOG =
      LoggerFactory.getLogger(TrashPolicyDefault.class);
  private static final Path CURRENT = new Path("Current");
  private static final FsPermission PERMISSION =
      new FsPermission(FsAction.ALL, FsAction.NONE, FsAction.NONE);
  private static final DateFormat CHECKPOINT = new SimpleDateFormat("yyMMddHHmmss");
  /** Format of checkpoint directories used prior to Hadoop 0.23. */
  private static final DateFormat OLD_CHECKPOINT =
      new SimpleDateFormat("yyMMddHHmm");
  private static final int MSECS_PER_MINUTE = 60*1000;
  private long emptierInterval;
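Notes on a few key fields and methods:
protected Path trash; // the trash directory
protected long deletionInterval; // a checkpoint is deleted once (current time - deletionInterval) is later than the checkpoint's timestamp
private long emptierInterval; // how often the emptier runs; each run deletes expired checkpoints and then creates a new one, i.e. deleteCheckpoint() followed by createCheckpoint()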
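If deletionInterval and emptierInterval are not configured, both default to 0, which disables the trash feature. deletionInterval is read from fs.trash.interval and emptierInterval from fs.trash.checkpoint.interval; both settings are given in minutes and converted to milliseconds (hence MSECS_PER_MINUTE). If emptierInterval is not configured, it takes the value of deletionInterval; if it is larger than deletionInterval, it is also set to deletionInterval.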
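The rules above can be sketched as a small standalone class. This is only an illustration under stated assumptions: TrashIntervals and its members are hypothetical names, not part of Hadoop, and the two constructor arguments stand in for the values of fs.trash.interval and fs.trash.checkpoint.interval in minutes.

```java
// Hypothetical helper (not a Hadoop class): mirrors the interval rules described above.
final class TrashIntervals {
    static final long MSECS_PER_MINUTE = 60 * 1000;

    final long deletionIntervalMs; // derived from fs.trash.interval
    final long emptierIntervalMs;  // derived from fs.trash.checkpoint.interval

    TrashIntervals(double trashIntervalMinutes, double checkpointIntervalMinutes) {
        deletionIntervalMs = (long) (trashIntervalMinutes * MSECS_PER_MINUTE);
        long emptier = (long) (checkpointIntervalMinutes * MSECS_PER_MINUTE);
        // Not configured (0) or larger than the deletion interval:
        // fall back to the deletion interval.
        if (emptier <= 0 || emptier > deletionIntervalMs) {
            emptier = deletionIntervalMs;
        }
        emptierIntervalMs = emptier;
    }

    // A deletion interval of 0 means the trash feature is disabled.
    boolean trashEnabled() {
        return deletionIntervalMs > 0;
    }

    public static void main(String[] args) {
        TrashIntervals t = new TrashIntervals(1440, 60); // keep 1 day, checkpoint hourly
        System.out.println("enabled=" + t.trashEnabled()
            + " deletionMs=" + t.deletionIntervalMs
            + " emptierMs=" + t.emptierIntervalMs);
    }
}
```

With both values left at 0 the sketch reports the trash as disabled, and a checkpoint interval larger than the deletion interval is clamped down to it.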
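public void createCheckpoint() throws IOException {
Creates a checkpoint: the Current directory is renamed to the current time in yyMMddHHmmss format.
public void deleteCheckpoint() throws IOException {
Deletes checkpoints: every checkpoint directory whose yyMMddHHmmss timestamp is older than deletionInterval is removed.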
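To make the naming and expiry rules concrete, here is a minimal sketch that works on plain strings and timestamps instead of HDFS paths. CheckpointNames and its methods are hypothetical and only imitate the behaviour described above; the real methods rename and delete directories through FileSystem.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper (not a Hadoop class): checkpoint naming and expiry on plain values.
final class CheckpointNames {
    private static final String PATTERN = "yyMMddHHmmss";

    // Name that the Current directory would be renamed to at the given time.
    static String nameFor(Date when) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is used per call.
        return new SimpleDateFormat(PATTERN).format(when);
    }

    // Timestamp (ms since epoch) encoded in a checkpoint directory name.
    static long parseMs(String checkpointName) {
        try {
            return new SimpleDateFormat(PATTERN).parse(checkpointName).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a checkpoint name: " + checkpointName, e);
        }
    }

    // A checkpoint is expired when (now - deletionInterval) is later than its timestamp.
    static boolean isExpired(String checkpointName, long nowMs, long deletionIntervalMs) {
        return nowMs - deletionIntervalMs > parseMs(checkpointName);
    }

    public static void main(String[] args) {
        String name = nameFor(new Date());
        System.out.println("checkpoint name: " + name);
        System.out.println("expired now? "
            + isExpired(name, System.currentTimeMillis(), 60 * 60 * 1000L));
    }
}
```

Because the name has only one-second resolution, parsing it back yields the creation time truncated to the second.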
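The main logic of the emptier thread is: sleep until the next emptierInterval boundary (end = ceiling(now, emptierInterval)), then delete the expired checkpoints, then create a new checkpoint.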
protected class Emptier implements Runnable {
  @Override
  public void run() {
    if (emptierInterval == 0)
      return; // trash disabled
    long now = Time.now();
    long end;
    while (true) {
      end = ceiling(now, emptierInterval);
      try { // sleep for interval
        Thread.sleep(end - now);
      } catch (InterruptedException e) {
        break; // exit on interrupt
      }
      try {
        now = Time.now();
        if (now >= end) {
          Collection<FileStatus> trashRoots;
          trashRoots = fs.getTrashRoots(true); // list all trash dirs
          for (FileStatus trashRoot : trashRoots) { // dump each trash
            if (!trashRoot.isDirectory())
              continue;
            try {
              TrashPolicyDefault trash = new TrashPolicyDefault(fs, conf);
              trash.deleteCheckpoint(trashRoot.getPath());
              trash.createCheckpoint(trashRoot.getPath(), new Date(now));
            } catch (IOException e) {
              LOG.warn("Trash caught: "+e+". Skipping " +
                  trashRoot.getPath() + ".");
            }
          }
        }
      } catch (Exception e) {
        LOG.warn("RuntimeException during Trash.Emptier.run(): ", e);
      }
    }
    try {
      fs.close();
    } catch(IOException e) {
      LOG.warn("Trash cannot close FileSystem: ", e);
    }
  }
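The sleep computation in run() can be looked at in isolation. The excerpt does not show ceiling(), so the version below is an assumption: it rounds the time down to a multiple of the interval and then adds one interval, which means the emptier wakes up on interval boundaries rather than a fixed delay after it started. EmptierSchedule and sleepMillis are hypothetical names used only for this sketch.

```java
// Hypothetical helper (not a Hadoop class): how long the emptier would sleep.
final class EmptierSchedule {

    // Assumed behaviour of ceiling(): next multiple of intervalMs strictly after timeMs.
    static long ceiling(long timeMs, long intervalMs) {
        return (timeMs / intervalMs) * intervalMs + intervalMs;
    }

    // Equivalent of (end - now) in run(): time left until the next boundary.
    static long sleepMillis(long nowMs, long intervalMs) {
        return ceiling(nowMs, intervalMs) - nowMs;
    }

    public static void main(String[] args) {
        long intervalMs = 60 * 60 * 1000L; // example: hourly checkpoints
        long now = System.currentTimeMillis();
        System.out.println("next run in " + sleepMillis(now, intervalMs) + " ms");
    }
}
```

Under this assumption, a thread started 90 seconds into a 60-second cycle sleeps 30 seconds, and one started exactly on a boundary sleeps a full interval.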