A brief look at how Java itself decides whether a file is a directory

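This started while reading Apache Commons IO: the class comment of FilenameUtils says that FilenameUtils treats a name as a directory (folder) only when it ends with a path separator. So how does Java itself decide whether a file is a directory?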

     * Note that this class works best if directory filenames end with a separator.
     * If you omit the last separator, it is impossible to determine if the filename
     * corresponds to a file or a directory. As a result, we have chosen to say
     * it corresponds to a file.
    ……
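
The convention in that note can be illustrated with a purely string-based check. This is only a minimal sketch and not Commons IO code: the class and method names (`TrailingSeparatorDemo`, `looksLikeDirectory`) are made up for illustration, and the check never touches the file system.

```java
class TrailingSeparatorDemo {
    /**
     * Hypothetical helper: treats a name as a directory only when it ends
     * with a separator, which is the convention the note describes.
     */
    static boolean looksLikeDirectory(String filename) {
        if (filename == null || filename.isEmpty()) {
            return false;
        }
        char last = filename.charAt(filename.length() - 1);
        return last == '/' || last == '\\';
    }

    public static void main(String[] args) {
        System.out.println(looksLikeDirectory("a/b/c/")); // trailing separator
        System.out.println(looksLikeDirectory("a/b/c"));  // no trailing separator
    }
}
```

A name alone carries no more information than this, which is why a real answer has to come from the file system.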

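Normally `File#isDirectory` is used to check whether a `File` is a directory. Internally, `File#isDirectory` first calls `SecurityManager#checkRead` (when a security manager is installed) to check that the program has read permission, then rejects paths that contain invalid characters (currently only the NUL character), and finally calls `FileSystem#getBooleanAttributes` to fetch the file's attributes and tests the `BA_DIRECTORY` bit.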
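
As a quick illustration of that behavior, the sketch below calls `File#isDirectory` on a directory, a regular file, a missing path and a path containing a NUL character. The class name `IsDirectoryDemo` and the temp-file names are arbitrary choices for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

class IsDirectoryDemo {
    /**
     * Returns isDirectory() for, in order: a temp directory, a temp file,
     * a path that does not exist, and a path containing a NUL character.
     */
    static boolean[] probe() {
        try {
            File dir = Files.createTempDirectory("isdir-demo").toFile();
            File file = File.createTempFile("isdir-demo", ".txt", dir);
            File missing = new File(dir, "does-not-exist");
            File invalid = new File(dir, "bad\0name"); // rejected by isInvalid()
            boolean[] result = {
                dir.isDirectory(), file.isDirectory(),
                missing.isDirectory(), invalid.isDirectory()
            };
            // Clean up the temporary entries.
            file.delete();
            dir.delete();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probe())); // [true, false, false, false]
    }
}
```

Note that the missing path and the invalid path both simply yield `false`; `File#isDirectory` never reports why.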

    /**
     * Tests whether the file denoted by this abstract pathname is a
     * directory.
     *
     * Where it is required to distinguish an I/O exception from the case
     * that the file is not a directory, or where several attributes of the
     * same file are required at the same time, then the {@link
     * java.nio.file.Files#readAttributes(Path,Class,LinkOption[])
     * Files.readAttributes} method may be used.
     *
     * @return true if and only if the file denoted by this
     *         abstract pathname exists and is a directory;
     *         false otherwise
     *
     * @throws SecurityException
     *         If a security manager exists and its {@link
     *         java.lang.SecurityManager#checkRead(java.lang.String)}
     *         method denies read access to the file
     */
    public boolean isDirectory() {
        SecurityManager security = System.getSecurityManager();
        if (security != null) {
            security.checkRead(path);
        }
        if (isInvalid()) {
            return false;
        }
        return ((fs.getBooleanAttributes(this) & FileSystem.BA_DIRECTORY) != 0);
    }

    /**
     * Check if the file has an invalid path. Currently, the inspection of
     * a file path is very limited, and it only covers Nul character check.
     * Returning true means the path is definitely invalid/garbage. But
     * returning false does not guarantee that the path is valid.
     *
     * @return true if the file path is invalid.
     */
    final boolean isInvalid() {
        if (status == null) {
            status = (this.path.indexOf('\u0000') < 0) ? PathStatus.CHECKED
                                                       : PathStatus.INVALID;
        }
        return status == PathStatus.INVALID;
    }
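
The Javadoc points to `Files.readAttributes` for cases where "not a directory" must be told apart from an I/O error. A minimal sketch of that alternative follows; the class name `ReadAttributesDemo` and the three result strings are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

class ReadAttributesDemo {
    /**
     * Returns "directory", "not a directory" or "missing".
     * Any other I/O failure is rethrown instead of being folded into false.
     */
    static String classify(Path path) {
        try {
            BasicFileAttributes attrs =
                    Files.readAttributes(path, BasicFileAttributes.class);
            return attrs.isDirectory() ? "directory" : "not a directory";
        } catch (NoSuchFileException e) {
            return "missing";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println(classify(tmp));
        System.out.println(classify(tmp.resolve("no-such-entry-" + System.nanoTime())));
    }
}
```

Unlike `File#isDirectory`, this keeps the three outcomes (directory, something else, nothing there) separate and lets other failures surface as exceptions.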

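`FileSystem` here is an abstract class (not an interface) with a different implementation for each operating system. On Windows 10 the implementation is `WinNTFileSystem`, since Windows 10 is built on the NT kernel. Looking at the source shipped with a Windows JDK, `WinNTFileSystem` is also the only subclass of `FileSystem`, because each JDK build includes only the implementation for its own platform (Unix-like builds ship `UnixFileSystem` instead).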

    /* Constants for simple boolean attributes */
    @Native public static final int BA_EXISTS    = 0x01;
    @Native public static final int BA_REGULAR   = 0x02;
    @Native public static final int BA_DIRECTORY = 0x04; 
    /**
     * Return the simple boolean attributes for the file or directory denoted
     * by the given abstract pathname, or zero if it does not exist or some
     * other I/O error occurs.
     */
    public abstract int getBooleanAttributes(File f);
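
To make the bit-flag encoding concrete, here is a small sketch that decodes such an attribute value. The constants are local copies of the values shown above (`java.io.FileSystem` is package-private and cannot be referenced directly), and the class and helper names are invented for the example.

```java
class BooleanAttributesDemo {
    // Local copies of the values from java.io.FileSystem, for illustration only.
    static final int BA_EXISTS    = 0x01;
    static final int BA_REGULAR   = 0x02;
    static final int BA_DIRECTORY = 0x04;

    /** Same test that File#isDirectory applies to the returned attributes. */
    static boolean isDirectory(int attrs) {
        return (attrs & BA_DIRECTORY) != 0;
    }

    static boolean isRegularFile(int attrs) {
        return (attrs & BA_REGULAR) != 0;
    }

    static boolean exists(int attrs) {
        return (attrs & BA_EXISTS) != 0;
    }

    public static void main(String[] args) {
        int dirAttrs = BA_EXISTS | BA_DIRECTORY; // 0x05: an existing directory
        int noAttrs = 0;                         // missing file or I/O error
        System.out.println(isDirectory(dirAttrs));
        System.out.println(isDirectory(noAttrs));
    }
}
```

Because a missing file and an I/O error both come back as zero, every one of these tests is false in either case.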

`WinNTFileSystem#getBooleanAttributes` overrides `FileSystem#getBooleanAttributes` and is declared `native`, which shows that `File#isDirectory` is ultimately implemented by a native method.

/**
 * Unicode-aware FileSystem for Windows NT/2000.
 *
 * @author Konstantin Kladko
 * @since 1.4
 */
class WinNTFileSystem extends FileSystem{
    ....
    @Override
    public native int getBooleanAttributes(File f); 

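Since Sun Java itself is not open source, the native methods can only be read in the OpenJDK source. Native methods depend on the host operating system and go through JNI: C/C++ code is compiled into a DLL that the JVM loads and calls. For background, see the article "Mixed Java and C/C++ programming with JNI (1): calling a native C/C++ library from Java" (使用JNI进行Java与C/C++语言混合编程(1)–在Java中调用C/C++本地库). The native methods of `WinNTFileSystem` are defined in WinNTFileSystem_md.c under openjdk\jdk\src\windows\native\java\io.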
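
Whether a given method is native can also be checked from Java through reflection. The sketch below uses `Object#hashCode`, which is declared `native` in the JDK, as the example. The helper name `isNative` is made up, and which methods of the internal `java.io` file system classes are native varies between JDK versions, so they are not probed here.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {
    /** Reports whether the named declared method carries the native modifier. */
    static boolean isNative(Class<?> type, String name, Class<?>... params) {
        try {
            Method m = type.getDeclaredMethod(name, params);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isNative(Object.class, "hashCode")); // declared native
        System.out.println(isNative(String.class, "length"));   // plain Java method
    }
}
```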


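The native counterpart of `WinNTFileSystem#getBooleanAttributes` is `Java_java_io_WinNTFileSystem_getBooleanAttributes`. It first converts the `java.io.File` object into a path string with `fileToNTPath`, then uses `isReservedDeviceNameW` to exclude paths that name a reserved device (for example `CON`, a DOS reserved name that cannot be used as a folder name), and finally calls `getFinalAttributes` to fetch the file attributes. If `FILE_ATTRIBUTE_DIRECTORY` is set, the `java_io_FileSystem_BA_DIRECTORY` bit is added to the return value. `getFinalAttributes` itself calls `GetFileAttributesExW` (falling back to `FindFirstFileW` on a sharing violation); only when the result is a reparse point, such as a symbolic link, does it continue through `getFileInformation` to `GetFileInformationByHandle` from windows.h.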
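
The name comparison inside `isReservedDeviceNameW` can be sketched in Java as follows. This is an illustrative port of the comparison only: it takes the device name that follows the `\\.\` prefix as input and does not call `GetFullPathNameW`. The class and method names are invented for the example.

```java
import java.util.Locale;

class ReservedDeviceNameDemo {
    /**
     * Returns true for CON, PRN, AUX, NUL, COM1-COM9 and LPT1-LPT9,
     * ignoring case, mirroring the checks in the C function.
     */
    static boolean isReservedDeviceName(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        switch (upper) {
            case "CON":
            case "PRN":
            case "AUX":
            case "NUL":
                return true;
            default:
                break;
        }
        // COMn / LPTn with exactly one trailing digit in the range 1..9.
        if (upper.length() == 4
                && (upper.startsWith("COM") || upper.startsWith("LPT"))) {
            char digit = upper.charAt(3);
            return digit >= '1' && digit <= '9';
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isReservedDeviceName("con"));
        System.out.println(isReservedDeviceName("COM0"));
    }
}
```

The length check stands in for the `retLen == BUFSIZE - 1 || retLen == BUFSIZE - 2` test in the C code, which only admits names of three or four characters after the prefix.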

JNIEXPORT jint JNICALL
Java_java_io_WinNTFileSystem_getBooleanAttributes(JNIEnv *env, jobject this,
                                                  jobject file)
{
    jint rv = 0;
    jint pathlen;

    WCHAR *pathbuf = fileToNTPath(env, file, ids.path);
    if (pathbuf == NULL)
        return rv;
    if (!isReservedDeviceNameW(pathbuf)) {
        DWORD a = getFinalAttributes(pathbuf);
        if (a != INVALID_FILE_ATTRIBUTES) {
            rv = (java_io_FileSystem_BA_EXISTS
                | ((a & FILE_ATTRIBUTE_DIRECTORY)
                    ? java_io_FileSystem_BA_DIRECTORY
                    : java_io_FileSystem_BA_REGULAR)
                | ((a & FILE_ATTRIBUTE_HIDDEN)
                    ? java_io_FileSystem_BA_HIDDEN : 0));
        }
    }
    free(pathbuf);
    return rv;
}

/* Check whether or not the file name in "path" is a Windows reserved
   device name (CON, PRN, AUX, NUL, COM[1-9], LPT[1-9]) based on the
   returned result from GetFullPathName, which should be in thr form of
   "\\.\[ReservedDeviceName]" if the path represents a reserved device
   name.
   Note1: GetFullPathName doesn't think "CLOCK$" (which is no longer
   important anyway) is a device name, so we don't check it here.
   GetFileAttributesEx will catch it later by returning 0 on NT/XP/
   200X.

   Note2: Theoretically the implementation could just lookup the table
   below linearly if the first 4 characters of the fullpath returned
   from GetFullPathName are "\\.\". The current implementation should
   achieve the same result. If Microsoft add more names into their
   reserved device name repository in the future, which probably will
   never happen, we will need to revisit the lookup implementation.

static WCHAR* ReservedDEviceNames[] = {
    L"CON", L"PRN", L"AUX", L"NUL",
    L"COM1", L"COM2", L"COM3", L"COM4", L"COM5", L"COM6", L"COM7", L"COM8", L"COM9",
    L"LPT1", L"LPT2", L"LPT3", L"LPT4", L"LPT5", L"LPT6", L"LPT7", L"LPT8", L"LPT9",
    L"CLOCK$"
};
 */
static BOOL isReservedDeviceNameW(WCHAR* path) {
#define BUFSIZE 9
    WCHAR buf[BUFSIZE];
    WCHAR *lpf = NULL;
    DWORD retLen = GetFullPathNameW(path,
                                   BUFSIZE,
                                   buf,
                                   &lpf);
    if ((retLen == BUFSIZE - 1 || retLen == BUFSIZE - 2) &&
        buf[0] == L'\\' && buf[1] == L'\\' &&
        buf[2] == L'.' && buf[3] == L'\\') {
        WCHAR* dname = _wcsupr(buf + 4);
        if (wcscmp(dname, L"CON") == 0 ||
            wcscmp(dname, L"PRN") == 0 ||
            wcscmp(dname, L"AUX") == 0 ||
            wcscmp(dname, L"NUL") == 0)
            return TRUE;
        if ((wcsncmp(dname, L"COM", 3) == 0 ||
             wcsncmp(dname, L"LPT", 3) == 0) &&
            dname[3] - L'0' > 0 &&
            dname[3] - L'0' <= 9)
            return TRUE;
    }
    return FALSE;
}

/**
 * Take special cases into account when retrieving the attributes
 * of path
 */
DWORD getFinalAttributes(WCHAR *path)
{
    DWORD attr = INVALID_FILE_ATTRIBUTES;

    WIN32_FILE_ATTRIBUTE_DATA wfad;
    WIN32_FIND_DATAW wfd;
    HANDLE h;

    if (GetFileAttributesExW(path, GetFileExInfoStandard, &wfad)) {
        attr = getFinalAttributesIfReparsePoint(path, wfad.dwFileAttributes);
    } else if (GetLastError() == ERROR_SHARING_VIOLATION &&
               (h = FindFirstFileW(path, &wfd)) != INVALID_HANDLE_VALUE) {
        attr = getFinalAttributesIfReparsePoint(path, wfd.dwFileAttributes);
        FindClose(h);
    }
    return attr;
}

/**
 * If the given attributes are the attributes of a reparse point, then
 * read and return the attributes of the special cases.
 */
DWORD getFinalAttributesIfReparsePoint(WCHAR *path, DWORD a)
{
    if ((a != INVALID_FILE_ATTRIBUTES) &&
        ((a & FILE_ATTRIBUTE_REPARSE_POINT) != 0))
    {
        BY_HANDLE_FILE_INFORMATION finfo;
        BOOL res = getFileInformation(path, &finfo);
        a = (res) ? finfo.dwFileAttributes : INVALID_FILE_ATTRIBUTES;
    }
    return a;
}
/**
 * Retrieves file information for the specified file. If the file is
 * symbolic link then the information on fully resolved target is
 * returned.
 */
static BOOL getFileInformation(const WCHAR *path,
                               BY_HANDLE_FILE_INFORMATION *finfo)
{
    BOOL result;
    DWORD error;
    HANDLE h = CreateFileW(path,
                           FILE_READ_ATTRIBUTES,
                           FILE_SHARE_DELETE |
                               FILE_SHARE_READ | FILE_SHARE_WRITE,
                           NULL,
                           OPEN_EXISTING,
                           FILE_FLAG_BACKUP_SEMANTICS,
                           NULL);
    if (h == INVALID_HANDLE_VALUE)
        return FALSE;
    result = GetFileInformationByHandle(h, finfo);
    error = GetLastError();
    if (CloseHandle(h))
        SetLastError(error);
    return result;
}
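
From the Java side, the effect of resolving reparse points is that `File#isDirectory` follows symbolic links. The sketch below compares it with `Files.isDirectory` using `LinkOption.NOFOLLOW_LINKS`. It assumes the process is allowed to create symbolic links (on Windows this may require extra privileges) and returns `null` when it is not; the class name and temp-file names are arbitrary.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

class SymlinkDirectoryDemo {
    /**
     * Returns { File#isDirectory on the link,
     *           Files.isDirectory(link, NOFOLLOW_LINKS) },
     * or null if a symbolic link cannot be created in this environment.
     */
    static boolean[] probe() {
        try {
            Path target = Files.createTempDirectory("link-target");
            Path link = target.resolveSibling(target.getFileName() + "-link");
            try {
                Files.createSymbolicLink(link, target);
            } catch (UnsupportedOperationException | IOException | SecurityException e) {
                Files.delete(target);
                return null; // symbolic links are not available here
            }
            boolean[] result = {
                link.toFile().isDirectory(),                         // follows the link
                Files.isDirectory(link, LinkOption.NOFOLLOW_LINKS)   // looks at the link itself
            };
            Files.delete(link);
            Files.delete(target);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println(r == null ? "symlinks unavailable" : r[0] + " " + r[1]);
    }
}
```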

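So for Java on Windows 10, `File#isDirectory` decides based on attributes obtained from the Windows API: `GetFileAttributesExW` in the common case, and `GetFileInformationByHandle` when the path is a reparse point. How those APIs are implemented is a matter for the operating system and the file system; for background, see the articles "[Original] Viewing the FAT32 file system structure of an SD card with WinHex" (【原创】用WinHex查看SD卡FAT32文件系统结构) and "Analyzing the internal structure of the NTFS file system" (分析NTFS文件系统内部结构).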
