5-Standard I/O Library

Please indicate the source: http://blog.csdn.net/gaoxiangnumber1

Welcome to my github: https://github.com/gaoxiangnumber1

5.1 Introduction

5.2 Streams and FILE Objects

  • With the I/O routines of Chapter 3, opening a file returns a file descriptor, and that descriptor is then used for all subsequent I/O operations. With the standard I/O library, opening or creating a file associates a stream with the file.
  • ASCII character set: a single character is represented by a single byte. International character sets: a character can be represented by more than one byte. Standard I/O file streams can be used with both single-byte and multibyte character sets. A stream’s orientation determines whether the characters that are read and written are single byte or multibyte.
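  • When a stream is created, it has no orientation. If a multibyte I/O function (declared in <wchar.h>) is used on a stream without orientation, the stream becomes wide oriented; if a byte I/O function is used instead, the stream becomes byte oriented. The fwide function can set the orientation of a stream that has none: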
#include <stdio.h>
#include <wchar.h>
int fwide(FILE *fp, int mode);
Returns: positive if stream is wide oriented, negative if stream is byte oriented, or 0 if stream has no orientation
  • fwide will not change the orientation of a stream that is already oriented. If mode
    1. < 0: fwide will try to make the specified stream byte oriented.
    2. > 0: fwide will try to make the specified stream wide oriented.
    3. = 0: fwide will not set the orientation, but will return a value identifying the stream’s orientation.
  • When we open a stream, fopen(Section 5.5) returns a pointer to a FILE object which is a structure that contains all the information required by the standard I/O library to manage the stream: the file descriptor used for actual I/O, a pointer to a buffer for the stream, the size of the buffer, a count of the number of characters currently in the buffer, an error flag, and the like. We refer to a pointer to a FILE object, the type FILE *, as a file pointer.
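The orientation rules above can be checked with a small sketch. The helper names stream_orientation, orientation_after, and orientation_after_fputs are hypothetical, and the sketch assumes tmpfile succeeds; it is an illustration, not part of the original notes.

```c
#include <stdio.h>
#include <wchar.h>

/* Report a stream's orientation without changing it:
 * 1 = wide oriented, -1 = byte oriented, 0 = no orientation yet. */
int stream_orientation(FILE *fp)
{
    int r = fwide(fp, 0);           /* mode 0 only queries */
    return (r > 0) - (r < 0);
}

/* Apply two fwide requests in a row to a fresh temporary stream and
 * report the result. The second request has no effect once the first
 * one has set an orientation. Returns -2 if no stream could be opened. */
int orientation_after(int first_mode, int second_mode)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -2;
    fwide(fp, first_mode);
    fwide(fp, second_mode);
    int result = stream_orientation(fp);
    fclose(fp);
    return result;
}

/* A byte I/O function used on an unoriented stream makes it byte oriented. */
int orientation_after_fputs(void)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -2;
    fputs("x", fp);
    int result = stream_orientation(fp);
    fclose(fp);
    return result;
}
```

Calling orientation_after(1, -1) shows that a stream made wide oriented stays wide oriented even when byte orientation is requested afterwards.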

5.3 Standard Input, Standard Output, and Standard Error

  • Three streams are predefined and automatically available to a process: standard input, standard output, and standard error. These streams refer to the same files as the file descriptors STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO, respectively.
  • These three standard I/O streams are referenced through the predefined file pointers stdin, stdout, and stderr. The file pointers are defined in the <stdio.h> header.

5.4 Buffering

  • The goal of the buffering provided by the standard I/O library is to use the minimum number of read and write calls. This library tries to do its buffering automatically for each I/O stream.
  • Three types of buffering are provided:
    1. Fully buffered.
  • Actual I/O takes place when the standard I/O buffer is filled. Files residing on disk are normally fully buffered by the standard I/O library. The buffer used is usually obtained by malloc(Section 7.8) the first time I/O is performed on a stream.
  • The term flush describes the writing of a standard I/O buffer. A buffer can be flushed automatically by the standard I/O routines, such as when a buffer fills, or we can call the function fflush to flush a stream.
  • In UNIX, flush means two different things.
    1. For the standard I/O library: it means writing out the contents of a buffer, which may be partially filled.
    2. For the terminal driver: it means to discard the data that’s already stored in a buffer.
    2. Line buffered.
  • The standard I/O library performs I/O when a newline character is encountered on input or output. This allows us to output a single character at a time with the fputc function, knowing that actual I/O will take place only when we finish writing each line. Line buffering is typically used on a stream when it refers to a terminal (standard input/output).
  • Line buffering comes with two caveats.
    • First, the size of the buffer that the standard I/O library uses to collect each line is fixed, so I/O might take place if we fill this buffer before writing a newline.
    • Second, whenever input is requested through the standard I/O library from either
      (a) an unbuffered stream or
      (b) a line-buffered stream (that requires data to be requested from the kernel),
      all line-buffered output streams are flushed. The reason for the qualifier on (b) is that the requested data may already be in the buffer, which doesn’t require data to be read from the kernel. Any input from an unbuffered stream, item (a), requires data to be obtained from the kernel.
    3. Unbuffered.
  • The standard I/O library does not buffer the characters. If we write 15 characters with the standard I/O fputs function, we expect these 15 characters to be output as soon as possible, probably with the write function.
  • The standard error stream is normally unbuffered so that any error messages are displayed as quickly as possible, regardless of whether they contain a newline.
  • ISO C requires the following buffering characteristics.
    1. Standard input and standard output are fully buffered, if and only if they do not refer to an interactive device.
    2. Standard error is never fully buffered.
  • Most implementations default to the following types of buffering:
    1. Standard error is always unbuffered.
    2. All other streams are line buffered if they refer to a terminal device; otherwise, they are fully buffered.
  • Linux 3.2.0: standard error is unbuffered, streams open to terminal devices are line buffered, and all other streams are fully buffered.
  • If we don’t like these defaults for any given stream, we can change the buffering by calling either the setbuf or setvbuf function.
#include <stdio.h>
void setbuf(FILE *restrict fp, char *restrict buf );
int setvbuf(FILE *restrict fp, char *restrict buf, int mode, size_t size);
Returns: 0 if OK, nonzero on error
  • These functions must be called after the stream has been opened but before any other operation is performed on the stream.
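  • setbuf: can turn buffering on or off.
    1. To enable buffering, buf must point to a buffer of length BUFSIZ (a constant defined in <stdio.h>).
    2. To disable buffering, set buf to NULL.
  • setvbuf: the mode argument selects the type of buffering explicitly: _IOFBF (fully buffered), _IOLBF (line buffered), or _IONBF (unbuffered).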
#include <stdio.h>
int fflush(FILE *fp);
Returns: 0 if OK, EOF on error
  • The fflush function causes any unwritten data for the stream to be passed to the kernel. If fp is NULL, fflush causes all output streams to be flushed.
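As a rough illustration of full buffering and fflush, the sketch below writes five bytes to a fully buffered temporary stream and asks the kernel (through fstat) how large the file is. The function name size_after_write is hypothetical, and the sketch assumes a POSIX system where tmpfile is backed by a regular file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Write "hello" to a fully buffered temporary stream and return the file
 * size the kernel reports, optionally flushing first. Returns -1 on error. */
long size_after_write(int do_flush)
{
    static char buf[BUFSIZ];        /* must stay valid while the stream uses it */
    struct stat sb;
    long size = -1;

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* setvbuf is called after opening and before any other operation. */
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) == 0) {
        fputs("hello", fp);         /* stays in buf: far fewer than BUFSIZ bytes */
        if (do_flush)
            fflush(fp);             /* hands the buffered bytes to the kernel */
        if (fstat(fileno(fp), &sb) == 0)
            size = (long)sb.st_size;
    }
    fclose(fp);
    return size;
}
```

Without the flush the kernel still sees an empty file, because the five bytes are sitting in the standard I/O buffer; after fflush the size becomes 5.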

5.5 Opening a Stream

  • The fopen, freopen, and fdopen functions open a standard I/O stream.
#include <stdio.h>
FILE *fopen(const char *restrict pathname, const char *restrict type);
FILE *freopen(const char *restrict pathname, const char *restrict type, FILE *restrict fp);
FILE *fdopen(int fd, const char *type);
All three return: file pointer if OK, NULL on error
  • fopen opens a specified file.
  • freopen opens a specified file on a specified stream, closing the stream first if it is already open. If the stream previously had an orientation, freopen clears it. This function is typically used to open a specified file as one of the predefined streams: standard input/output/error.
  • fdopen takes an existing file descriptor, which we could obtain from the open, dup, dup2, fcntl, pipe, socket, socketpair, or accept functions, and associates a standard I/O stream with the descriptor. This function is often used with descriptors that are returned by the functions that create pipes and network communication channels. Because these special types of files cannot be opened with the standard I/O fopen function, we have to call the device-specific function to obtain a file descriptor, and then associate this descriptor with a standard I/O stream using fdopen.
  • ISO C specifies 15 values for the type argument, shown in Figure 5.2. Using the character b as part of the type allows the standard I/O system to differentiate between a text file and a binary file.

  • fdopen: The descriptor has already been opened, so opening for writing does not truncate the file. For example, if the descriptor was created by the open function and the file already existed, the O_TRUNC flag would control whether the file was truncated. The fdopen function cannot truncate any file it opens for writing. The standard I/O append mode cannot create the file since the file has to exist if a descriptor refers to it.
  • When a file is opened with a type of append, each write will take place at the current end of file. If multiple processes open the same file with the standard I/O append mode, the data from each process will be correctly written to the file.
  • When a file is opened for reading and writing, two restrictions apply.
    1. Output cannot be directly followed by input without an intervening fflush, fseek, fsetpos, or rewind.
    2. Input cannot be directly followed by output without an intervening fseek, fsetpos, or rewind, or an input operation that encounters an end of file.
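A minimal sketch of these two restrictions, using a temporary file opened for update. The function name update_in_place is hypothetical, and returning a static buffer is only a convenience for the example.

```c
#include <stdio.h>
#include <string.h>

/* Write, read, and overwrite on one update stream, placing a positioning
 * call at every switch between output and input. Returns the final file
 * contents in a static buffer (not reentrant), or NULL on failure. */
const char *update_in_place(void)
{
    static char result[16];
    char head[3];

    FILE *fp = tmpfile();               /* temporary file opened for update */
    if (fp == NULL)
        return NULL;

    fputs("abcdef", fp);                /* output ... */
    rewind(fp);                         /* ... then a positioning call before input */
    if (fread(head, 1, sizeof head, fp) != sizeof head) {   /* consumes "abc" */
        fclose(fp);
        return NULL;
    }
    fseek(fp, 0L, SEEK_CUR);            /* positioning call before output */
    fputs("XY", fp);                    /* overwrites "de" */

    rewind(fp);                         /* output followed by input again */
    size_t n = fread(result, 1, sizeof result - 1, fp);
    result[n] = '\0';
    fclose(fp);
    return result;
}
```

The fseek(fp, 0L, SEEK_CUR) call does not move the position; it is there only to satisfy the rule that input cannot be directly followed by output.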

  • If a new file is created by specifying a type of either w or a, we are not able to specify the file’s access permission bits, as we were able to do with the open function and the creat function in Chapter 3. POSIX.1 requires implementations to create the file with the following permission bits set:
    S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH
    Section 4.8: we can restrict these permissions by adjusting our umask value.
  • By default, the stream that is opened is fully buffered, unless it refers to a terminal device(line buffered). Once the stream is opened, but before we do any other operation on the stream, we can change the buffering with setbuf or setvbuf.
  • An open stream is closed by calling fclose.
#include <stdio.h>
int fclose(FILE *fp);
Returns: 0 if OK, EOF on error
  • Any buffered output data is flushed before the file is closed. Any input data that may be buffered is discarded. If the standard I/O library had automatically allocated a buffer for the stream, that buffer is released.
  • When a process terminates normally(calling exit() directly or returning from main), all standard I/O streams with unwritten buffered data are flushed and all open standard I/O streams are closed.
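The fdopen case can be sketched with a pipe, which cannot be opened through fopen. The function name pipe_roundtrip is hypothetical; the sketch assumes the line is short enough to fit in the pipe and in the 128-byte buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Send one line through a pipe using standard I/O on both ends.
 * Returns 1 if the same line is read back, 0 if not, -1 on setup error. */
int pipe_roundtrip(const char *line)
{
    int fds[2];
    char buf[128];

    if (pipe(fds) < 0)
        return -1;

    FILE *in = fdopen(fds[0], "r");     /* wrap the read end */
    FILE *out = fdopen(fds[1], "w");    /* wrap the write end */
    if (in == NULL || out == NULL) {
        if (in != NULL) fclose(in); else close(fds[0]);
        if (out != NULL) fclose(out); else close(fds[1]);
        return -1;
    }

    fputs(line, out);
    fclose(out);                        /* flushes the buffer and closes the descriptor */

    int same = fgets(buf, sizeof buf, in) != NULL && strcmp(buf, line) == 0;
    fclose(in);
    return same;
}
```

Closing the output stream matters here: fclose flushes the buffered line into the pipe and closes the write end, so the reader sees the data followed by end of file.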

5.6 Reading and Writing a Stream

  • Once we open a stream, we can choose from among three types of unformatted I/O.
    1. Character-at-a-time I/O.
      We can read or write one character at a time, with the standard I/O functions handling all the buffering, if the stream is buffered.
    2. Line-at-a-time I/O.
      If we want to read or write a line at a time, we use fgets and fputs. Each line is terminated with a newline character, and we have to specify the maximum line length that we can handle when we call fgets.
    3. Direct I/O.
      This type of I/O is supported by the fread and fwrite functions. For each I/O operation, we read or write some number of objects, where each object is of a specified size. These two functions are often used for binary files where we read or write a structure with each operation.
  • The term direct I/O is also known as: binary I/O, object-at-a-time I/O, record-oriented I/O, or structure-oriented I/O. This feature is unrelated to the O_DIRECT open flag supported by Linux.

Input Functions

  • Three functions allow us to read one character at a time.
#include <stdio.h>
int getc(FILE *fp);
int fgetc(FILE *fp);
int getchar(void);
All return: next character if OK, EOF on end of file or error
  • Difference between getc and fgetc: getc can be implemented as a macro, fgetc cannot. This means three things.
    1. The argument to getc should not be an expression with side effects, because it could be evaluated more than once.
    2. fgetc is guaranteed to be a function and we can pass the address of fgetc as an argument to another function.
    3. Calls to fgetc probably take longer than calls to getc, as it usually takes more time to call a function.
  • getchar is defined to be equivalent to getc(stdin).
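  • These three functions return the next character as an unsigned char converted to an int. The constant EOF in <stdio.h> is required to be a negative value, so the return value must be stored in an int rather than a char to keep it distinct from every valid character. Because these functions return EOF both on error and at end of file, we call ferror or feof to tell the two apart.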
#include <stdio.h>
int ferror(FILE *fp);
int feof(FILE *fp);
Both return: nonzero(true) if condition is true, 0(false) otherwise
void clearerr(FILE *fp);
  • Two flags are maintained for each stream in the FILE object:
    • An error flag
    • An end-of-file flag
    Both flags are cleared by calling clearerr.
  • After reading from a stream, we can push back characters by calling ungetc.
#include <stdio.h>
int ungetc(int c, FILE *fp);
Returns: c if OK, EOF on error
  • The characters that are pushed back are returned by subsequent reads on the stream in reverse order of their pushing. Although ISO C allows an implementation to support any amount of push-back, an implementation is required to provide only a single character of push-back. We should not count on more than a single character.
  • The character that we push back does not have to be the same character that was read. We are not able to push back EOF. When we reach the end of file, we can push back a character. The next read will return that character, and the read after that will return EOF. This works because a successful call to ungetc clears the end-of-file indication for the stream.
  • When we push characters back with ungetc, they are not written back to the underlying file or device. Instead, they are kept in memory in the standard I/O library’s buffer for the stream.
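A short sketch of the input functions together: reading with getc into an int, checking ferror after EOF, and peeking with ungetc. The helpers stream_from_string, count_bytes, and peek_char are hypothetical names introduced for the example.

```c
#include <stdio.h>

/* Hypothetical helper: a temporary stream positioned at the start of text. */
FILE *stream_from_string(const char *text)
{
    FILE *fp = tmpfile();
    if (fp != NULL) {
        fputs(text, fp);
        rewind(fp);
    }
    return fp;
}

/* Count the bytes remaining in fp. The result of getc is kept in an int
 * so that the byte 0xFF is not mistaken for EOF. Returns -1 on error. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;

    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)
        n++;
    /* EOF means end of file or error; ferror tells them apart. */
    return ferror(fp) ? -1 : n;
}

/* Look at the next character without consuming it: read one character,
 * then push it back. Returns EOF if nothing is left. */
int peek_char(FILE *fp)
{
    if (fp == NULL)
        return EOF;
    int c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);      /* a single character of push-back is guaranteed */
    return c;
}
```

Peeking twice in a row returns the same character, since the pushed-back character is handed out again by the next read.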

Output Functions

#include <stdio.h>
int putc(int c, FILE *fp);
int fputc(int c, FILE *fp);
int putchar(int c);
All three return: c if OK, EOF on error
  • putc can be implemented as a macro, fputc cannot.
  • putchar(c) is equivalent to putc(c, stdout).
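Combining getc and putc gives the classic character-at-a-time copy loop. The sketch below is one possible form; copy_chars and the driver copy_text are hypothetical names, and the driver uses temporary files only so that the example is self-contained.

```c
#include <stdio.h>

/* Copy in to out one character at a time.
 * Returns the number of characters copied, or -1 on a read or write error. */
long copy_chars(FILE *in, FILE *out)
{
    long n = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;
        n++;
    }
    return ferror(in) ? -1 : n;
}

/* Hypothetical driver: copy text between two temporary streams and
 * confirm that the output position matches the number of characters. */
long copy_text(const char *text)
{
    FILE *in = tmpfile();
    FILE *out = tmpfile();
    long n = -1;

    if (in != NULL && out != NULL) {
        fputs(text, in);
        rewind(in);
        n = copy_chars(in, out);
        if (n >= 0 && ftell(out) != n)
            n = -1;
    }
    if (in != NULL)
        fclose(in);
    if (out != NULL)
        fclose(out);
    return n;
}
```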

5.7 Line-at-a-Time I/O

#include <stdio.h>
char *fgets(char *restrict buf, int n, FILE *restrict fp);
char *gets(char *buf );
Both return: buf if OK, NULL on end of file or error
  • buf is the address of the buffer to read the line into. gets reads from standard input, fgets reads from fp.
  • fgets: n is the size of the buffer. This function reads up through and including the next newline, but no more than n − 1 characters, into the buffer. The buffer is terminated with a null byte. If the line, including the terminating newline, is longer than n − 1, only a partial line is returned, but the buffer is always null terminated. Another call to fgets will read what follows on the line.
  • gets doesn’t store the newline in the buffer and should never be used because it doesn’t allow the caller to specify the buffer size.
#include <stdio.h>
int fputs(const char *restrict str, FILE *restrict fp);
int puts(const char *str);
Both return: non-negative value if OK, EOF on error
  • fputs writes the null-terminated string to fp. The null byte at the end is not written. This need not be line-at-a-time output, since the string need not contain a newline. Usually the last non-null character is a newline, but it’s not required.
  • The puts function writes the null-terminated string to the standard output, without writing the null byte. But puts then writes a newline character to the standard output.
  • We avoid using puts to prevent having to remember whether it appends a newline. If we always use fgets and fputs, we know that we always have to deal with the newline character at the end of each line.
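The partial-line behavior of fgets can be seen with a deliberately small buffer. In this sketch, count_lines and count_lines_in are hypothetical names, and the 8-byte buffer is chosen only to force long lines to arrive in several pieces.

```c
#include <stdio.h>
#include <string.h>

/* Count complete lines in fp. With an 8-byte buffer, fgets returns at most
 * 7 characters per call, so a long line arrives in pieces; only the piece
 * that ends with '\n' completes a line. Returns -1 on a read error. */
long count_lines(FILE *fp)
{
    char buf[8];
    long lines = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);       /* buf is always null terminated */
        if (len > 0 && buf[len - 1] == '\n')
            lines++;
    }
    return ferror(fp) ? -1 : lines;
}

/* Hypothetical driver: run count_lines over a string via a temporary file. */
long count_lines_in(const char *text)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    fputs(text, fp);
    rewind(fp);
    long n = count_lines(fp);
    fclose(fp);
    return n;
}
```

A final piece of text with no trailing newline is read successfully but is not counted, which is why code built on fgets has to check for the newline explicitly.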

5.8 Standard I/O Efficiency

  • The exit function will flush any unwritten data and then close all open streams.
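  • We compare the timing of three standard I/O programs that copy standard input to standard output (using getc and putc, fgetc and fputc, and fgets and fputs) with the timing data from Figure 3.6. Figure 5.6 shows this data when operating on the same file (98.5 MB with 3 million lines).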

  • For each of the three standard I/O versions, the user CPU time is larger than the best read version from Figure 3.6, because the character-at-a-time standard I/O versions have a loop that is executed 100 million times, and the loop in the line-at-a-time version is executed 3,144,984 times. In the read version, its loop is executed only 25,224 times(for a buffer size of 4096). This difference in clock times stems from the difference in user times and the difference in the times spent waiting for I/O to complete, as the system times are comparable.
  • The system CPU time is about the same as before, because roughly the same number of kernel requests are being made. One advantage of using the standard I/O routines is that we don’t have to worry about buffering or choosing the optimal I/O size.
  • The version using line-at-a-time I/O is almost twice as fast as the version using character-at-a-time I/O because the line-at-a-time functions are implemented using memccpy(3).
  • The fgetc version is much faster than the BUFFSIZE=1 version from Figure 3.6. Both involve the same number of function calls, about 200 million. The difference is that the version using read executes 200 million function calls, which in turn execute 200 million system calls. With fgetc, we still execute 200 million function calls, but this translates into only 25,224 system calls. System calls are usually much more expensive than ordinary function calls.

5.9 Binary I/O

#include <stdio.h>
size_t fread(void *restrict ptr, size_t size, size_t nobj, FILE *restrict fp);
size_t fwrite(const void *restrict ptr, size_t size, size_t nobj, FILE *restrict fp);
Both return: number of objects read or written
  • These functions have two common uses:
    1. Read or write a binary array. For example, to write elements 2 through 5 of a floating-point array, we could write
float data[10];
if (fwrite(&data[2], sizeof(float), 4, fp) != 4)
    err_sys("fwrite error");
    2. Read or write a structure. For example, we could write
struct
{
    short count;
    long total;
    char name[NAMESIZE];
} item;
if (fwrite(&item, sizeof(item), 1, fp) != 1)
    err_sys("fwrite error");
  • To read or write an array of structures: size = the sizeof the structure, nobj = the number of elements in the array.
  • Both fread and fwrite return the number of objects read or written.
    For the read case, this number can be less than nobj if an error occurs or if the end of file is encountered. In this situation, ferror or feof must be called.
    For the write case, if the return value is less than the requested nobj, an error has occurred.
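The return-value rules can be sketched by writing an array of structures and then asking fread for more objects than the file holds. The structure layout, MAX_ITEMS, and the function items_read_back are hypothetical choices made for this example.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ITEMS 8

struct item {
    short count;
    long total;
    char name[16];
};

/* Write `written` items to a temporary file, then try to read `requested`
 * items back. Returns the number of items actually read, or -1 on error. */
long items_read_back(size_t written, size_t requested)
{
    struct item src[MAX_ITEMS], dst[MAX_ITEMS];

    if (written > MAX_ITEMS || requested > MAX_ITEMS)
        return -1;

    memset(src, 0, sizeof src);
    for (size_t i = 0; i < written; i++) {
        src[i].count = (short)i;
        src[i].total = 100L * (long)i;
        snprintf(src[i].name, sizeof src[i].name, "item%zu", i);
    }

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* size = size of one structure, nobj = number of array elements */
    if (fwrite(src, sizeof src[0], written, fp) != written) {
        fclose(fp);
        return -1;                      /* a short write is always an error */
    }
    rewind(fp);

    size_t got = fread(dst, sizeof dst[0], requested, fp);
    /* A short read is either an error or end of file; ferror decides. */
    int failed = got < requested && ferror(fp);
    fclose(fp);
    return failed ? -1 : (long)got;
}
```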
  • A fundamental problem with binary I/O is that it can be used to read only data that has been written on the same system. These two functions may not work across different systems for two reasons.
    1. The offset of a member within a structure can differ between compilers and systems because of different alignment requirements.
    2. The binary formats used to store multibyte integers and floating-point values differ among machine architectures.
  • The real solution for exchanging binary data among different systems is to use an agreed-upon canonical format.
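One simple form of an agreed-upon format is to fix the byte order of each integer on disk. The sketch below stores a 32-bit value most significant byte first; the function names are hypothetical and big-endian is only one possible convention, not something the notes prescribe.

```c
#include <stdint.h>
#include <stdio.h>

/* Write v as four bytes, most significant byte first, regardless of the
 * byte order of the host. Returns 0 if OK, -1 on error. */
int write_u32_be(FILE *fp, uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)(v >> 24), (unsigned char)(v >> 16),
        (unsigned char)(v >> 8),  (unsigned char)v
    };
    return fwrite(b, 1, sizeof b, fp) == sizeof b ? 0 : -1;
}

/* Read four bytes written by write_u32_be. Returns 0 if OK, -1 on error. */
int read_u32_be(FILE *fp, uint32_t *v)
{
    unsigned char b[4];
    if (fread(b, 1, sizeof b, fp) != sizeof b)
        return -1;
    *v = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
         ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    return 0;
}

/* Hypothetical driver: write v, read it back, return what was read (0 on failure). */
uint32_t roundtrip_u32(uint32_t v)
{
    uint32_t out = 0;
    FILE *fp = tmpfile();
    if (fp == NULL)
        return 0;
    if (write_u32_be(fp, v) == 0) {
        rewind(fp);
        if (read_u32_be(fp, &out) != 0)
            out = 0;
    }
    fclose(fp);
    return out;
}

/* Hypothetical driver: return the first byte stored in the file for v. */
int leading_byte(uint32_t v)
{
    int c = EOF;
    FILE *fp = tmpfile();
    if (fp == NULL)
        return EOF;
    if (write_u32_be(fp, v) == 0) {
        rewind(fp);
        c = getc(fp);
    }
    fclose(fp);
    return c;
}
```

Because the bytes are produced with shifts rather than by writing the integer's memory image, the file layout does not depend on the machine that wrote it.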

5.10 Positioning a Stream

  • There are three ways to position a standard I/O stream.
    1. The two functions ftell and fseek. They assume that a file’s position can be stored in a long integer.
    2. The two functions ftello and fseeko. They replace the long integer with the off_t data type.
    3. The two functions fgetpos and fsetpos. They use an abstract data type, fpos_t, that records a file’s position. This data type can be made as big as necessary to record a file’s position. When porting applications to non-UNIX systems, use fgetpos and fsetpos.
#include <stdio.h>
long ftell(FILE *fp);
Returns: current file position indicator if OK, −1L on error
int fseek(FILE *fp, long offset, int whence);
Returns: 0 if OK, −1 on error
void rewind(FILE *fp);
  • For binary files, a file’s position indicator is measured in bytes from the beginning of the file. The value returned by ftell for a binary file is this byte position. To position a binary file using fseek, we must specify a byte offset and indicate how that offset is interpreted.
  • whence specifies how the offset is interpreted:
    1. SEEK_SET: from the beginning of the file,
    2. SEEK_CUR: from the current file position,
    3. SEEK_END: from the end of file.
  • To position a text file, whence has to be SEEK_SET, and only 2 values for offset are allowed:
    1. 0: rewind the file to its beginning
    2. A value that was returned by ftell for that file.
  • A stream can also be set to the beginning of the file with the rewind function.
  • The ftello function is the same as ftell, and the fseeko function is the same as fseek, except that the type of the offset is off_t instead of long.
#include <stdio.h>
off_t ftello(FILE *fp);
Returns: current file position indicator if OK, (off_t)−1 on error
int fseeko(FILE *fp, off_t offset, int whence);
Returns: 0 if OK, −1 on error
int fgetpos(FILE *restrict fp, fpos_t *restrict pos);
int fsetpos(FILE *fp, const fpos_t *pos);
Both return: 0 if OK, nonzero on error
  • The fgetpos function stores the current value of the file’s position indicator in the object pointed to by pos. This value can be used in a later call to fsetpos to reposition the stream to that location.
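A small sketch of the positioning functions on a binary stream. The names stream_size, size_of_text, and char_at are hypothetical; seeking to SEEK_END to measure a file is assumed to work as it does on POSIX systems.

```c
#include <stdio.h>

/* Return the size of a binary stream in bytes, leaving the file position
 * where it was. Returns -1 on error. */
long stream_size(FILE *fp)
{
    fpos_t saved;

    if (fgetpos(fp, &saved) != 0)       /* remember where we are */
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)   /* offset 0 from the end of file */
        return -1;
    long size = ftell(fp);              /* byte position = file size */
    if (fsetpos(fp, &saved) != 0)       /* go back */
        return -1;
    return size;
}

/* Hypothetical driver: size of a temporary file holding text. */
long size_of_text(const char *text)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    fputs(text, fp);
    long size = stream_size(fp);
    fclose(fp);
    return size;
}

/* Hypothetical driver: the character found at a byte offset from the
 * beginning of the file, or EOF if the offset cannot be read. */
int char_at(const char *text, long offset)
{
    int c = EOF;
    FILE *fp = tmpfile();
    if (fp == NULL)
        return EOF;
    fputs(text, fp);
    if (fseek(fp, offset, SEEK_SET) == 0)
        c = getc(fp);
    fclose(fp);
    return c;
}
```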

5.11 Formatted I/O

Formatted Output

#include <stdio.h>
int printf(const char *restrict format, ...);
int fprintf(FILE *restrict fp, const char *restrict format, ...);
int dprintf(int fd, const char *restrict format, ...);
All three return: number of characters output if OK, negative value if output error
int sprintf(char *restrict buf, const char *restrict format, ...);
Returns: number of characters stored in array if OK, negative value if encoding error
int snprintf(char *restrict buf, size_t n, const char *restrict format, ...);
Returns: number of characters that would have been stored in array if buffer was large enough, negative value if encoding error
  • The printf function writes to the standard output, fprintf writes to the specified stream, dprintf writes to the specified file descriptor, and sprintf places the formatted characters in the array buf. The sprintf function automatically appends a null byte at the end of the array, but this null byte is not included in the return value.
  • It’s possible for sprintf to overflow the buffer pointed to by buf. The caller is responsible for ensuring that the buffer is large enough.
  • With snprintf, the size of the buffer is an explicit parameter; any characters that would have been written past the end of the buffer are discarded instead. The snprintf function returns the number of characters that would have been written to the buffer had it been big enough. The return value doesn’t include the terminating null byte. If snprintf returns a positive value less than the buffer size n, then the output was not truncated. If an encoding error occurs, snprintf returns a negative value.
  • Using dprintf removes the need to call fdopen to convert a file descriptor into a file pointer for use with fprintf.
  • format controls how the remainder of the arguments will be encoded and ultimately displayed. Each argument is encoded according to a conversion specification that starts with a percent sign(%). Other characters are copied unmodified.
  • A conversion specification has four optional components:
    %[flags][field_width][precision][length_modifier]convtype
    1. flags are summarized in Figure 5.7.

    2. field_width specifies a minimum field width for the conversion. If the conversion results in fewer characters, it is padded with spaces. The field width is a non-negative decimal integer or an asterisk.
    3. precision specifies the minimum number of digits to appear for integer conversions, the minimum number of digits to appear to the right of the decimal point for floating-point conversions, or the maximum number of bytes for string conversions. The precision is a period (.) followed by an optional non-negative decimal integer or an asterisk.
      • Either the field width or precision (or both) can be an asterisk. In this case, an integer argument specifies the value to be used. The argument appears directly before the argument to be converted.
    4. length_modifier specifies the size of the argument, summarized in Figure 5.8.

    5. convtype controls how the argument is interpreted, summarized in Figure 5.9.

  • Conversions are applied to the arguments in the order they appear after the format argument. An alternative conversion specification syntax allows the arguments to be named explicitly with the sequence %n$ representing the nth argument. The two syntaxes can’t be mixed in the same format specification. With the alternative syntax, arguments are numbered starting at one. If either the field width or precision is to be supplied by an argument, the asterisk syntax is modified to *m$, where m specifies the position of the argument supplying the value.
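The snprintf return value and the asterisk field width can be sketched as follows. The function names and the "name=value" format are hypothetical choices for the example.

```c
#include <stdio.h>

/* Number of characters "name=value" needs, not counting the null byte.
 * With a size of 0 nothing is stored, so buf may be NULL. */
int required_length(const char *name, int value)
{
    return snprintf(NULL, 0, "%s=%d", name, value);
}

/* Format "count=42" (8 characters) into a buffer of n bytes.
 * Returns 1 if it fit, 0 if it was truncated, -1 on error. */
int fits_in(size_t n)
{
    char buf[32];

    if (n > sizeof buf)
        return -1;
    int needed = snprintf(buf, n, "%s=%d", "count", 42);
    if (needed < 0)
        return -1;
    /* Not truncated only if the result is strictly less than n,
     * because one byte is needed for the terminating null. */
    return (size_t)needed < n;
}

/* Field width supplied by an argument: the int before the converted value. */
int padded_length(int width, int value)
{
    char buf[64];
    return snprintf(buf, sizeof buf, "%*d", width, value);
}
```

Note that a buffer of exactly 8 bytes is too small for the 8-character result, since the terminating null byte needs the ninth.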
#include <stdarg.h>
#include <stdio.h>
int vprintf(const char *restrict format, va_list arg);
int vfprintf(FILE *restrict fp, const char *restrict format, va_list arg);
int vdprintf(int fd, const char *restrict format, va_list arg);
All three return: number of characters output if OK, negative value if output error
int vsprintf(char *restrict buf, const char *restrict format, va_list arg);
Returns: number of characters stored in array if OK, negative value if encoding error
int vsnprintf(char *restrict buf, size_t n, const char *restrict format, va_list arg);
Returns: number of characters that would have been stored in array if buffer was large enough, negative value if encoding error
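The v-variants exist so that our own variadic functions can forward their arguments to the printf machinery. A minimal sketch, with the hypothetical helpers formatted_length and log_line (the "[log] " prefix is likewise just an illustration):

```c
#include <stdarg.h>
#include <stdio.h>

/* Return the number of characters the formatted output would occupy,
 * not counting the terminating null byte. */
int formatted_length(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(NULL, 0, fmt, ap);    /* size 0: measure only */
    va_end(ap);
    return n;
}

/* Write "[log] ", the formatted message, and a newline to fp.
 * Returns the total number of characters written, or a negative value
 * on an output error. */
int log_line(FILE *fp, const char *fmt, ...)
{
    va_list ap;

    int prefix = fprintf(fp, "[log] ");
    if (prefix < 0)
        return prefix;

    va_start(ap, fmt);
    int body = vfprintf(fp, fmt, ap);       /* forward the caller's arguments */
    va_end(ap);
    if (body < 0)
        return body;

    if (putc('\n', fp) == EOF)
        return -1;
    return prefix + body + 1;
}
```

A call such as log_line(stderr, "code %d", 7) would write one prefixed line to standard error.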

Formatted Input

#include <stdio.h>
int scanf(const char *restrict format, ...);
int fscanf(FILE *restrict fp, const char *restrict format, ...);
int sscanf(const char *restrict buf, const char *restrict format, ...);
All three return: number of input items assigned, EOF if input error or end of file before any conversion
  • The arguments following the format contain the addresses of the variables to initialize with the results of the conversions.
  • format controls how the arguments are converted for assignment.
  • The percent sign(%) indicates the beginning of a conversion specification. Except for the conversion specifications and white space, other characters in the format have to match the input. If a character doesn’t match, processing stops, leaving the remainder of the input unread.
  • A conversion specification has four optional components before the required convtype:
    %[*][field_width][m][length_modifier]convtype
    1. Leading asterisk is used to suppress conversion. Input is converted as specified by the rest of the conversion specification, but the result is not stored in an argument.
    2. field_width specifies the maximum field width in characters.
    3. The optional m character is called the assignment-allocation character. It can be used with the %c, %s, and %[ conversion specifiers to force a memory buffer to be allocated to hold the converted string. In this case, the corresponding argument should be the address of a pointer to which the address of the allocated buffer will be copied. If the call succeeds, the caller is responsible for freeing the buffer by calling the free function when the buffer is no longer needed.
    4. length_modifier component specifies the size of the argument to be initialized with the result of the conversion. The same length modifiers supported by the printf family of functions are supported by the scanf family of functions(Figure 5.8).
    5. convtype is similar to “convtype” in the printf family except that results that are stored in unsigned types can optionally be signed on input. E.g., −1 will scan as 4294967295 into an unsigned integer. Figure 5.10 summarizes the conversion types supported by the scanf family of functions.

  • Alternative conversion specification syntax: the sequence %n$ represents the nth argument. With the scanf family of functions, the behavior is undefined if the same numbered argument is referenced in the format string more than once.
#include <stdio.h>
#include <stdarg.h>
int vscanf(const char *restrict format, va_list arg);
int vfscanf(FILE *restrict fp, const char *restrict format, va_list arg);
int vsscanf(const char *restrict buf, const char *restrict format, va_list arg);
All three return: number of input items assigned, EOF if input error or end of file before any conversion
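A brief sketch of checking the scanf return value, matching a literal character, limiting the field width, and suppressing a conversion. The functions minutes_since_midnight and second_field are hypothetical examples.

```c
#include <stdio.h>

/* Parse "HH:MM". The ':' in the format must match the input literally,
 * and each %2d reads at most two characters.
 * Returns minutes since midnight, or -1 if the input is not valid. */
int minutes_since_midnight(const char *s)
{
    int h, m;

    /* Two items must be assigned; anything less is a failed parse. */
    if (sscanf(s, "%2d:%2d", &h, &m) != 2)
        return -1;
    if (h < 0 || h > 23 || m < 0 || m > 59)
        return -1;
    return h * 60 + m;
}

/* Skip the first integer with %*d (converted but not stored) and return
 * the second one, or -1 if it is missing. */
int second_field(const char *s)
{
    int v;

    /* Suppressed conversions are not counted in the return value. */
    if (sscanf(s, "%*d %d", &v) != 1)
        return -1;
    return v;
}
```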

5.12 Implementation Details

  • Under UNIX, the standard I/O library ends up calling the I/O routines in Chapter 3. Each standard I/O stream has an associated file descriptor, and we can obtain the descriptor for a stream by calling fileno.
#include <stdio.h>
int fileno(FILE *fp);
Returns: the file descriptor associated with the stream
  • We need this function if we want to call the dup or fcntl functions, for example.
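For instance, fileno lets us pass a stream's descriptor to fcntl. The sketch below uses it to check the access mode behind a stream; stream_is_writable is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Return 1 if the descriptor behind fp is open for writing, 0 if it is
 * read-only, -1 on error. */
int stream_is_writable(FILE *fp)
{
    int fd = fileno(fp);                /* descriptor associated with the stream */
    if (fd < 0)
        return -1;

    int flags = fcntl(fd, F_GETFL);     /* file status flags of that descriptor */
    if (flags < 0)
        return -1;

    int mode = flags & O_ACCMODE;
    return mode == O_WRONLY || mode == O_RDWR;
}
```

For the three predefined streams, fileno returns STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO respectively.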

  • This program prints the buffering for the three standard streams and for a stream that is associated with a regular file.
  • Note that we perform I/O on each stream before printing its buffering status, since the first I/O operation usually causes the buffers to be allocated for a stream. The structure members and the constants used in this example are defined by the implementations of the standard I/O library used on the four platforms described in this book. Be aware that implementations of the standard I/O library vary, and programs like this example are nonportable, since they embed knowledge specific to particular implementations.
  • If we run the program in Figure 5.11 twice, once with the three standard streams connected to the terminal and once with the three standard streams redirected to files, we get the following result:

  • We can see that the default for this system is to have standard input and standard output line buffered when they’re connected to a terminal. The line buffer is 1,024 bytes. Note that this doesn’t restrict us to 1,024-byte input and output lines; that’s just the size of the buffer. Writing a 2,048-byte line to standard output will require two write system calls. When we redirect these two streams to regular files, they become fully buffered, with buffer sizes equal to the preferred I/O size—the st_blksize value from the stat structure—for the file system. We also see that the standard error is always unbuffered, as it should be, and that a regular file defaults to fully buffered.

5.13 Temporary Files

  • The ISO C standard defines two functions that are provided by the standard I/O library to assist in creating temporary files.
#include <stdio.h>
char *tmpnam(char *ptr);
Returns: pointer to unique pathname
FILE *tmpfile(void);
Returns: file pointer if OK, NULL on error
  • The tmpnam function generates a string that is a valid pathname and that does not match the name of any existing file. This function generates a different pathname each time it is called, up to TMP_MAX times. TMP_MAX is defined in <stdio.h>.
$ ./a.out
/tmp/fileT0Hsu6
/tmp/filekmAsYQ
  • The standard technique often used by the tmpfile function is to create a unique pathname by calling tmpnam, then create the file, and immediately unlink it. Recall from Section 4.15 that unlinking a file does not delete its contents until the file is closed.
  • This way, when the file is closed, either explicitly or on program termination, the contents of the file are deleted.
  • The Single UNIX Specification defines two additional functions as part of the XSI option for dealing with temporary files: mkdtemp and mkstemp.
  • Older versions of the Single UNIX Specification defined the tempnam function as a way to create a temporary file in a caller-specified location. It is marked obsolescent in SUSv4.
#include <stdlib.h>
char *mkdtemp(char *template);
Returns: pointer to directory name if OK, NULL on error
int mkstemp(char *template);
Returns: file descriptor if OK, −1 on error
  • The mkdtemp function creates a directory with a unique name, and the mkstemp function creates a regular file with a unique name. The name is selected using the template string. This string is a pathname whose last six characters are set to XXXXXX.
  • The functions replace these placeholders with different characters to create a unique pathname. If successful, these functions modify the template string to reflect the name of the temporary file.
  • The directory created by mkdtemp is created with the following access permission bits set: S_IRUSR | S_IWUSR | S_IXUSR. Note that the file mode creation mask of the calling process can restrict these permissions further. If directory creation is successful, mkdtemp returns the name of the new directory.
  • The mkstemp function creates a regular file with a unique name and opens it. The file descriptor returned by mkstemp is open for reading and writing. The file created by mkstemp is created with access permissions S_IRUSR | S_IWUSR.
  • Unlike tmpfile, the temporary file created by mkstemp is not removed automatically for us. If we want to remove it from the file system namespace, we need to unlink it ourselves.
  • Use of tmpnam and tempnam does have at least one drawback: a window exists between the time that the unique pathname is returned and the time that an application creates a file with that name. During this timing window, another process can create a file of the same name. The tmpfile and mkstemp functions should be used instead, as they don’t suffer from this problem.
  • The program in Figure 5.13 shows how to use (and how not to use) the mkstemp function.
  • If we execute the program in Figure 5.13, we get
$ ./a.out
trying to create first temp file...
temp name = /tmp/dirUmBT7h
file exists
trying to create second temp file...
Segmentation fault
  • The difference in behavior comes from the way the two template strings are declared. For the first template, the name is allocated on the stack, because we use an array variable. For the second name, however, we use a pointer. In this case, only the memory for the pointer itself resides on the stack; the compiler arranges for the string to be stored in the read-only segment of the executable. When the mkstemp function tries to modify the string, a segmentation fault occurs.
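  • The safe pattern can be sketched as follows, assuming /tmp is writable. The helper name make_temp_fd and the prefix "demo" are hypothetical. The template is declared as an array so that mkstemp can overwrite the XXXXXX placeholder, and the file is unlinked at once so that it disappears when the descriptor is closed.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX.1-2008 declarations */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not from the book): creates a temporary file,
 * copies its generated name into name_out, and unlinks it immediately.
 * Returns the open descriptor, or -1 on error. */
int make_temp_fd(char *name_out, size_t name_size)
{
    /* An array gives mkstemp writable storage to modify.
     * char *path = "/tmp/demoXXXXXX"; would point at a string literal,
     * which mkstemp must not modify. */
    char path[] = "/tmp/demoXXXXXX";

    int fd = mkstemp(path);     /* replaces XXXXXX and opens the file read/write */
    if (fd < 0)
        return -1;

    if (name_out != NULL && name_size > 0)
        snprintf(name_out, name_size, "%s", path);

    unlink(path);               /* name is removed; data lives until close(fd) */
    return fd;
}
```

  • The caller can keep using the returned descriptor for reading and writing and must close it when done.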

5.14 Memory Streams

  • The standard I/O library buffers data in memory, so operations such as character-at-a-time I/O and line-at-a-time I/O are more efficient. We can also provide our own buffer for the library to use by calling setbuf or setvbuf.
  • Memory streams are standard I/O streams for which there are no underlying files, although they are still accessed with FILE pointers. All I/O is done by transferring bytes to and from buffers in main memory.
  • Three functions are available to create memory streams. The first is fmemopen.
#include <stdio.h>
FILE *fmemopen(void *restrict buf, size_t size, const char *restrict type);
Returns: stream pointer if OK, NULL on error
  • fmemopen allows the caller to provide a buffer to be used for the memory stream: the buf argument points to the beginning of the buffer and the size argument specifies the size of the buffer in bytes. If buf is null, then fmemopen allocates a buffer of size bytes. In this case, the buffer will be freed when the stream is closed.
  • The type argument controls how the stream can be used, summarized in Figure 5.14.

  • The type values correspond to the ones for file-based standard I/O streams, but memory streams differ in three ways:
    1. First, whenever a memory stream is opened for append, the current file position is set to the first null byte in the buffer. If the buffer contains no null bytes, then the current position is set to one byte past the end of the buffer. When a stream is not opened for append, the current position is set to the beginning of the buffer. Because the append mode determines the end of the data by the first null byte, memory streams aren’t well suited for storing binary data (which might contain null bytes before the end of the data).
    2. Second, if the buf argument is a null pointer, it makes no sense to open the stream for only reading or only writing. Because the buffer is allocated by fmemopen, there is no way to find the buffer’s address, so to open the stream only for writing means we could never read what we’ve written. To open the stream only for reading means we can only read the contents of a buffer into which we can never write.
    3. Third, a null byte is written at the current position in the stream whenever we increase the amount of data in the stream’s buffer and call fclose, fflush, fseek, fseeko, or fsetpos.

  • The program in Figure 5.15 seeds the buffer with a known pattern to show how writes to the stream behave. It illustrates the policy for flushing memory streams and appending null bytes.
  • A null byte is appended automatically whenever we write to a memory stream and advance the stream’s notion of the size of the stream’s contents (as opposed to the size of the buffer, which is fixed). The size of the stream’s contents is determined by how much we write to it.
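  • A minimal sketch of the behavior described above (not the book's Figure 5.15): the buffer is filled with 'x' characters, opened as a "w+" memory stream, and written to. The helper name memstream_write is hypothetical, and the sketch assumes the text fits in the buffer with room for the null byte.

```c
#define _POSIX_C_SOURCE 200809L   /* fmemopen is specified by POSIX.1-2008 */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the book): seeds buf with a known
 * pattern, writes text through a memory stream, and flushes.
 * Returns 0 on success, -1 on error. */
int memstream_write(char *buf, size_t size, const char *text)
{
    memset(buf, 'x', size);             /* known pattern, no null bytes */

    FILE *fp = fmemopen(buf, size, "w+");
    if (fp == NULL)
        return -1;

    if (fputs(text, fp) == EOF || fflush(fp) == EOF) {
        fclose(fp);
        return -1;
    }
    /* After the flush, buf holds text followed by a null byte. */
    fclose(fp);
    return 0;
}
```

  • Until fflush (or fclose, fseek, fseeko, fsetpos) is called, the written data may still sit in the stream's own buffering and not yet be visible in buf.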
#include <stdio.h>
FILE *open_memstream(char **bufp, size_t *sizep);
#include <wchar.h>
FILE *open_wmemstream(wchar_t **bufp, size_t *sizep);
Both return: stream pointer if OK, NULL on error
  • open_memstream creates a stream that is byte oriented, and open_wmemstream creates a stream that is wide oriented. These two functions differ from fmemopen in several ways:
    1. The stream created is only open for writing.
    2. We can’t specify our own buffer, but we can get access to the buffer’s address and size through the bufp and sizep arguments, respectively.
    3. We need to free the buffer ourselves after closing the stream.
    4. The buffer will grow as we add bytes to the stream.
  • We must follow rules regarding the use of the buffer address and its length.
    1. First, the buffer address and length are only valid after a call to fclose or fflush.
    2. Second, these values are only valid until the next write to the stream or a call to fclose. Because the buffer can grow, it may need to be reallocated. If this happens, then we will find that the value of the buffer’s memory address will change the next time we call fclose or fflush.
  • Memory streams are well suited for creating strings, because they prevent buffer overflows. They can also provide a performance boost for functions that take standard I/O stream arguments used for temporary files, because memory streams access only main memory instead of a file stored on disk.
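  • As an illustration of building a string without choosing a buffer size in advance, here is a minimal sketch using open_memstream. The helper name build_pair and the "name=value" format are hypothetical. Note that the buffer address and size are read only after fclose, as the rules above require, and that the caller frees the buffer.

```c
#define _POSIX_C_SOURCE 200809L   /* open_memstream is specified by POSIX.1-2008 */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the book): formats "name=value" into a
 * buffer that grows as needed. Returns the buffer (caller must free it)
 * or NULL on error; the string length is stored in *len_out if non-null. */
char *build_pair(const char *name, int value, size_t *len_out)
{
    char *buf = NULL;
    size_t size = 0;

    FILE *fp = open_memstream(&buf, &size);
    if (fp == NULL)
        return NULL;

    if (fprintf(fp, "%s=%d", name, value) < 0) {
        fclose(fp);
        free(buf);
        return NULL;
    }
    if (fclose(fp) == EOF) {    /* buf and size are up to date only after this */
        free(buf);
        return NULL;
    }
    if (len_out != NULL)
        *len_out = size;        /* length excludes the terminating null byte */
    return buf;
}
```

  • Compared with sprintf into a fixed array, the stream grows the buffer itself, so a long name cannot overflow it.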

5.15 Alternatives to Standard I/O

  • One inefficiency in the standard I/O library is the amount of data copying that takes place. When we use the line-at-a-time functions, fgets and fputs, the data is usually copied twice: once between the kernel and the standard I/O buffer (when the corresponding read or write is issued) and again between the standard I/O buffer and our line buffer.

5.16 Summary

Exercises 2

Type in the program that copies a file using line-at-a-time I/O (fgets and fputs) from Figure 5.5, but use a MAXLINE of 4. What happens if you copy lines that exceed this length? Explain what is happening.

  • The fgets function reads up through and including the next newline or until the buffer is full (leaving room, of course, for the terminating null). Also, fputs writes everything in the buffer until it hits a null byte; it doesn’t care whether a newline is in the buffer. So, if MAXLINE is too small, both functions still work; they’re just called more often than they would be if the buffer were larger.
  • If either of these functions removed or added the newline (as gets and puts do), we would have to ensure that our buffer was big enough for the largest line.
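  • To see this, here is a sketch of the copy loop with MAXLINE set to 4. It is not the book's Figure 5.5; the function name copy_lines and the call counter are additions for illustration. With a 4-byte buffer, fgets returns at most 3 characters per call, so an 8-character line plus its newline is copied in three calls.

```c
#include <stdio.h>

#define MAXLINE 4   /* deliberately tiny: 3 characters plus the null byte */

/* Hypothetical helper (not from the book): copies in to out with fgets
 * and fputs. Returns the number of fgets calls that returned data,
 * or -1 on a write error. */
int copy_lines(FILE *in, FILE *out)
{
    char buf[MAXLINE];
    int calls = 0;

    while (fgets(buf, MAXLINE, in) != NULL) {
        /* fputs writes up to the null byte; no newline is required */
        if (fputs(buf, out) == EOF)
            return -1;
        calls++;
    }
    return calls;
}
```

  • The output is identical to the input; only the number of library calls changes.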

Exercises 3

What does a return value of 0 from printf mean?

  • The function call printf(""); returns 0, since no characters are output.

Exercises 4

The following code works correctly on some machines, but not on others. What could be the problem?

#include <stdio.h>
int main(void)
{
    char c;
    while((c = getchar()) != EOF)
    {
        putchar(c);
    }
}
  • This is a common error. The return value from getc and getchar is an int, not a char. EOF is often defined to be −1, so if the system uses signed characters, the code normally works. But if the system uses unsigned characters, after the EOF returned by getchar is stored as an unsigned character, the character’s value no longer equals −1, so the loop never terminates. The four platforms described in this book all use signed characters, so the example code works on these platforms.
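  • A corrected sketch declares the variable as int, so it can hold every possible byte value as well as EOF. The function name copy_chars is hypothetical, and it takes explicit streams (using getc and putc rather than getchar and putchar) so that it can be exercised on any pair of files. Note that with a signed char the original code has a second problem: an input byte of 0xFF compares equal to an EOF of −1 and ends the loop early.

```c
#include <stdio.h>

/* Hypothetical helper (not from the book): copies in to out one
 * character at a time. Returns the number of characters copied,
 * or -1 on a write error. */
long copy_chars(FILE *in, FILE *out)
{
    int c;          /* int, not char: must distinguish EOF from byte 0xFF */
    long n = 0;

    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;
        n++;
    }
    return n;
}
```

  • Calling copy_chars(stdin, stdout) gives the behavior the exercise's program intended.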

Exercises 5

How would you use the fsync function (Section 3.13) with a standard I/O stream?

  • Call fsync after each call to fflush. The argument to fsync is obtained with the fileno function. Calling fsync without calling fflush might do nothing if all the data were still in memory buffers.
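  • The two steps can be sketched as one small function. The name flush_and_sync is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno and fsync declarations */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the book): pushes a stream's data to
 * the kernel, then asks the kernel to write it to disk.
 * Returns 0 on success, -1 on error. */
int flush_and_sync(FILE *fp)
{
    if (fflush(fp) == EOF)      /* standard I/O buffer -> kernel */
        return -1;
    return fsync(fileno(fp));   /* kernel buffers -> disk */
}
```

  • The order matters: fsync only sees data that fflush has already handed to the kernel.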

Exercises 6

In the programs in Figures 1.7 and 1.10, the prompt that is printed does not contain a newline, and we don’t call fflush. What causes the prompt to be output?

  • Standard input and standard output are both line buffered when a program is run interactively. When fgets is called, standard output is flushed automatically.

Exercises 7

BSD-based systems provide a function called funopen that allows us to intercept read, write, seek, and close calls on a stream. Use this function to implement fmemopen for FreeBSD and Mac OS X.

  • Figure C.4
