The Linux Programming Interface 05 File I/O: Further Details

(01) Atomic operations

Atomicity is essential to the successful completion of some operations. In particular, it allows us to avoid race conditions.

(02) Important: look up error codes in the manual pages with man errno

   ENOENT          No such file or directory (POSIX.1)

(03) A faulty, non-atomic example: two processes both claim to have created the file t.txt exclusively

#include <sys/stat.h>
#include <fcntl.h>
#include "tlpi_hdr.h"

int main(int argc, char *argv[]) {
    int fd;
    fd = open(argv[1], O_WRONLY);       /* Check: does the file exist? */
    if (fd != -1) {
        printf("[PID %ld] File \"%s\" already exists\n",
            (long) getpid(), argv[1]);
    } else {
        if (errno != ENOENT) {  /* Failed for unexpected reason */
            errExit("open");
        } else {
            printf("[PID %ld] File \"%s\" doesn't exist yet\n",
                (long) getpid(), argv[1]);
            if (argc > 2) {     /* Delay between check and create */
                sleep(30);
                printf("[PID %ld] Done sleeping\n", (long) getpid());
            }
            /* WINDOW FOR FAILURE */
            fd = open(argv[1], O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR);
            if (fd == -1)
                errExit("open");

            printf("[PID %ld] Create file \"%s\" exclusively\n",
                (long) getpid(), argv[1]);
        }
    }
}

Output:

wang@wang:~/test/tlpi-dist/lib$ [PID 8567] File "t.txt" doesn't exist yet
wang@wang:~/test/tlpi-dist/lib$ ./bad_exclusive_open t.txt
[PID 8586] File "t.txt" doesn't exist yet
[PID 8586] Create file "t.txt" exclusively
wang@wang:~/test/tlpi-dist/lib$
wang@wang:~/test/tlpi-dist/lib$ [PID 8567] Done sleeping
[PID 8567] Create file "t.txt" exclusively
[1]+  Exit 43                 ./bad_exclusive_open t.txt sleep

Using a single open() call that specifies the O_CREAT and O_EXCL flags prevents this possibility by guaranteeing that the check and creation steps are carried out as a single atomic operation.

(04) The fcntl() function

One use of fcntl() is to retrieve or modify the access mode and open file status flags of an open file.

int flags, accessMode;
flags = fcntl(fd, F_GETFL);
if (flags == -1)
    errExit("fcntl");

if (flags & O_SYNC)
    printf("writes are synchronized\n");

(05) Checking the file access mode

Checking the access mode of the file is slightly more complex, since the O_RDONLY(0), O_WRONLY(1), and O_RDWR(2) constants don't correspond to single bits in the open file status flags.

Therefore, to make this check, we mask the flags value with the constant O_ACCMODE, and then test for equality with one of the constants:

accessMode = flags & O_ACCMODE;
if (accessMode == O_WRONLY || accessMode == O_RDWR)
    printf("file is writable\n");

(06) Modifying the open file status flags

We can use the fcntl() F_SETFL command to modify some of the open file status flags.

The flags that can be modified are O_APPEND, O_NONBLOCK, O_NOATIME, O_ASYNC, and O_DIRECT.

To modify the open file status flags, we use fcntl() to retrieve a copy of the existing flags, then modify the bits we wish to change, and finally make a further call to fcntl() to update the flags.

int flags;
flags = fcntl(fd, F_GETFL);
if (flags == -1)
    errExit("fcntl");
flags |= O_APPEND;
if (fcntl(fd, F_SETFL, flags) == -1)
    errExit("fcntl");

(07) Relationship between file descriptors and open files

It is possible and useful to have multiple descriptors referring to the same open file.

Three data structures are maintained by the kernel:

(1) the per-process file descriptor table;

(2) the system-wide table of open file descriptions; and

(3) the file system i-node table.

An open file description stores all information relating to an open file, including:

the current file offset (as updated by read() and write(), or explicitly modified using lseek())

...

the file access mode (read-only, write-only, or read-write)

[Figure: relationship between file descriptors, open file descriptions, and i-nodes]

In process A, descriptors 1 and 20 both refer to the same open file description. This situation may arise as a result of a call to dup(), dup2(), or fcntl().

Descriptor 2 of process A and descriptor 2 of process B refer to a single open file description (73). This scenario could occur after a call to fork().

Two processes can also each call open() on the same file. They then hold separate open file descriptions that refer to the same i-node.

(08) dup() and dup2(): duplicating file descriptors

The dup() call takes oldfd, an open file descriptor, and returns a new descriptor that refers to the same open file description.

int dup(int oldfd);

int dup2(int oldfd, int newfd);

(09) The dup3() function

The dup3() system call performs the same task as dup2(), but adds an additional argument, flags, that is a bit mask that modifies the behavior of the system call.

int dup3(int oldfd, int newfd, int flags);

(10) The pread() and pwrite() system calls operate just like read() and write(), except that the file I/O is performed at the location specified by offset, rather than at the current file offset. The file offset is left unchanged by these calls.

ssize_t pread(int fd, void *buf, size_t count, off_t offset);

ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);

(11) Using readv()

#include <sys/stat.h>
#include <sys/uio.h>
#include <fcntl.h>
#include "tlpi_hdr.h"

int main(int argc, char *argv[]) {
    int fd;
    struct iovec iov[3];
    struct stat myStruct;   /* First buffer */
    int x;                  /* Second buffer */
#define STR_SIZE 100
    char str[STR_SIZE];     /* Third buffer */
    ssize_t numRead, totalRequired;

    if (argc != 2 || strcmp(argv[1], "--help") == 0)
        usageErr("%s file\n", argv[0]);

    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        errExit("open");

    /* Describe the three buffers and add up their sizes */
    totalRequired = 0;
    iov[0].iov_base = &myStruct;
    iov[0].iov_len = sizeof(struct stat);
    totalRequired += iov[0].iov_len;

    iov[1].iov_base = &x;
    iov[1].iov_len = sizeof(x);
    totalRequired += iov[1].iov_len;

    iov[2].iov_base = str;
    iov[2].iov_len = STR_SIZE;
    totalRequired += iov[2].iov_len;

    /* A single call fills the buffers in order */
    numRead = readv(fd, iov, 3);
    if (numRead == -1)
        errExit("readv");
    if (numRead < totalRequired)
        printf("Read fewer bytes than requested\n");

    printf("total bytes requested: %ld: bytes read: %ld\n",
        (long) totalRequired, (long) numRead);
    exit(EXIT_SUCCESS);
}

Output:

wang@wang:~/test/tlpi-dist/lib$ ./t_readv text.txt
Read fewer bytes than requested
total bytes requested: 248: bytes read: 33

(12) preadv() and pwritev()

The preadv() and pwritev() system calls perform the same task as readv() and writev(), but perform the I/O at the file location specified by offset.

(13) Truncating a file: truncate() and ftruncate()

int ftruncate(int fd, off_t length);

(14) Nonblocking I/O

O_NONBLOCK is generally ignored for regular files, because the kernel buffer cache ensures that I/O on regular files does not block.

(15) Large files

In order to access a large file, we simply use the 64-bit version of the function.

fd = open64(name, O_CREAT | O_RDWR, mode);
if (fd == -1)
    errExit("open64");

(16) Large file example

#define _LARGEFILE64_SOURCE
#include <sys/stat.h>
#include <fcntl.h>
#include "tlpi_hdr.h"

int main(int argc, char *argv[]) {
    int fd;
    off64_t off;

    if (argc != 3 || strcmp(argv[1], "--help") == 0)
        usageErr("%s pathname offset\n", argv[0]);

    fd = open64(argv[1], O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
        errExit("open64");

    /* Seek to the (possibly > 2 GB) offset given on the command line */
    off = atoll(argv[2]);
    if (lseek64(fd, off, SEEK_SET) == -1)
        errExit("lseek64");

    if (write(fd, "test", 4) == -1)
        errExit("write");
    exit(EXIT_SUCCESS);
}

Output:

wang@wang:~/test/tlpi-dist/lib$ gcc large_file.c error_functions.c -o large_file
wang@wang:~/test/tlpi-dist/lib$ ./large_file  x 10111222333
wang@wang:~/test/tlpi-dist/lib$ ls -l x
-rw------- 1 wang wang 10111222337 Mar  7 09:23 x

Using the _FILE_OFFSET_BITS macro is not required by the LFS specification, which merely mentions this macro as an optional method of specifying the size of the off_t data type.

(17) The /dev/fd directory

For each process, the kernel provides the special virtual directory /dev/fd. This directory contains filenames of the form /dev/fd/n, where n is a number corresponding to one of the open file descriptors for the process.


(18) Opening one of the files in the /dev/fd directory is equivalent to duplicating the corresponding file descriptor.

fd = open("/dev/fd/1", O_WRONLY);

fd = dup(1);

(19) Creating temporary files

Typically, a temporary file is unlinked (deleted) soon after it is opened, using the unlink() system call.

int fd;
char template[] = "/tmp/somestringXXXXXX";
fd = mkstemp(template);
if (fd == -1)
    errExit("mkstemp");
printf("Generated filename was: %s\n", template);
/* name disappears immediately, but the file is removed only after close() */
unlink(template);

(20) Summary

In the course of this chapter, we introduced the concept of atomicity, which is crucial to the correct operation of some system calls. In particular, the open() O_EXCL flag allows the caller to ensure that it is the creator of a file, and the open() O_APPEND flag ensures that multiple processes appending data to the same file don't overwrite each other's output.

    The fcntl() system call performs a variety of file control operations, including changing open file status flags and duplicating file descriptors. Duplicating file descriptors is also possible using dup() and dup2().

    We looked at the relationship between file descriptors, open file descriptions, and file i-nodes, and noted that different information is associated with each of these three objects. Duplicate file descriptors refer to the same open file description, and thus share open file status flags and the file offset.

    We described a number of system calls that extend the functionality of the conventional read() and write() system calls. The pread() and pwrite() system calls perform I/O at a specified file location without changing the file offset. The readv() and writev() calls perform scatter-gather I/O. The preadv() and pwritev() calls combine scatter-gather I/O functionality with the ability to perform I/O at a specified file location.

    The truncate() and ftruncate() system calls can be used to decrease the size of a file, discarding the excess bytes, or to increase the size, padding with a zero-filled file hole.

    We briefly introduced the concept of nonblocking I/O, and we'll return to it in later chapters.

    The LFS specification defines a set of extensions that allow processes running on 32-bit systems to perform operations on files whose size is too large to be represented in 32 bits.

    The numbered files in the /dev/fd virtual directory allow a process to access its own open files via file descriptor numbers, which can be particularly useful in shell commands.

    The mkstemp() and tmpfile() functions allow an application to create temporary files.

(21) Exercises

Exercise 1 answer: creating a large file using the _FILE_OFFSET_BITS macro

#include <sys/stat.h>
#include <fcntl.h>
#include "tlpi_hdr.h"

int main(int argc, char *argv[]) {
    int fd;
    off_t off;      /* 64 bits wide when built with -D_FILE_OFFSET_BITS=64 */

    if (argc != 3 || strcmp(argv[1], "--help") == 0)
        usageErr("%s pathname offset\n", argv[0]);

    fd = open(argv[1], O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
        errExit("open");

    off = atoll(argv[2]);
    if (lseek(fd, off, SEEK_SET) == -1)
        errExit("lseek");

    if (write(fd, "test", 4) == -1)
        errExit("write");
    exit(EXIT_SUCCESS);
}

Output:

wang@wang:~/test/tlpi-dist/lib$ gcc -D_FILE_OFFSET_BITS=64 large_file.c error_functions.c -o large_file1
wang@wang:~/test/tlpi-dist/lib$ ./large_file1 x 10111222333
wang@wang:~/test/tlpi-dist/lib$ ls -l x
-rw------- 1 wang wang 10111222337 Mar  7 15:07 x

This shows that the rewritten program works.

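Exercise 2

Although the program seeks to the start of the file, the file was opened with the O_APPEND flag, so the write still lands at the end of the file: with O_APPEND, every write() first moves the offset to the end of the file.

The steps are shown below.

Source code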

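#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    int fd;
    char buf[7] = "hello\n";

    if (argc < 2)
        return 1;
    fd = open(argv[1], O_RDWR | O_APPEND);
    if (fd == -1)
        return 1;
    lseek(fd, 0, SEEK_SET);     /* Does not change where O_APPEND writes land */
    if (write(fd, buf, sizeof(buf) - 1) == -1)
        return 1;
    return 0;
}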

Contents of text.txt before running

The Linux Programming Interface

After running

The Linux Programming Interface
hello

Exercise 3

The source code is provided by the book's author and is included here for reference.

#include <sys/stat.h>
#include <fcntl.h>
#include "tlpi_hdr.h"

int main(int argc, char *argv[]) {
    int numBytes, j, flags, fd;
    Boolean useLseek;

    if (argc < 3 || strcmp(argv[1], "--help") == 0)
        usageErr("%s file num-bytes [x]\n"
            "   'x' means use lseek() instead of O_APPEND\n", argv[0]);

    useLseek = argc > 3;
    flags = useLseek ? 0 : O_APPEND;
    numBytes = getInt(argv[2], 0, "num-bytes");

    fd = open(argv[1], O_RDWR | O_CREAT | flags, S_IRUSR | S_IWUSR);
    if (fd == -1)
        errExit("open");

    /* Write one byte at a time, at the end of the file */
    for (j = 0; j < numBytes; j++) {
        if (useLseek)
            if (lseek(fd, 0, SEEK_END) == -1)
                errExit("lseek");
        if (write(fd, "x", 1) != 1)
            fatal("write() failed");
    }
    printf("%ld done\n", (long) getpid());
    exit(EXIT_SUCCESS);
}

Verification results

wang@wang:~/test/tlpi-dist/lib$ ./atomic_append f1 1000000 & ./atomic_append f1 1000000

wang@wang:~/test/tlpi-dist/lib$ ./atomic_append f2 1000000 x & ./atomic_append f2 1000000 x

wang@wang:~/test/tlpi-dist/lib$ ls -l f1 f2
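-rw------- 1 wang wang 2000000 Mar  7 15:59 f1
-rw------- 1 wang wang 1007188 Mar  7 16:00 f2

The second file is smaller than the first, which shows that some writes overwrote each other. Without O_APPEND, the lseek() and the write() are two separate steps, so the other process can extend the file between them.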

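Exercise 4

Ignoring error handling, dup() and dup2() can be written simply as the following two calls. For dup2(), newfd must be closed first if it is open, since F_DUPFD returns the lowest unused descriptor greater than or equal to its argument.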

fd = fcntl(oldfd, F_DUPFD, 0);

fd = fcntl(oldfd, F_DUPFD, newfd);

...

Exercises 5, 6, 7


