http://wpollock.com/CPlus/PrintfRef.htm
©2001 by Wayne Pollock, Tampa Florida USA. All Rights Reserved.
printf and scanf are the two standard C programming language functions for console input and output. Variations of these functions (fprintf and fscanf) also allow I/O to files, and others (sprintf and sscanf) allow I/O to strings. (sscanf is especially useful.) All these, as well as many other I/O functions, are declared in the standard header file <stdio.h>.
Using scanf is trickier than printf. The reason is argument promotion: any integer type smaller than an int is promoted to an int when passed to printf, and floats are promoted to double. This means there is a single code for most integers and one for most floating point types. There is no such promotion with scanf arguments, which are actually pointers, so all types must be exactly specified. In addition there are security concerns with input that don't apply to output. (These will be discussed below. In particular, never use the gets() function in production-quality C code!)
A call to printf looks like this:
printf( "format-string", expression, ... );
Or you can use fprintf to send the output to the screen even when standard output has been redirected, like this:
fprintf( stderr, "format-string", expression, ... );
The format-string can contain regular characters which are simply printed out, and format specifications or place-holders. For each place-holder in the format-string there must be one matching expression. The expressions are converted to strings according to the instructions in the corresponding place-holder and are mixed with the regular text in the format-string. Then the whole string is output. Here's an example:
printf( "%i + %i = %i\n", 2, 3, (2+3) );
will produce the following output (by converting the three integer arguments to strings using default formatting):
2 + 3 = 5
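The same conversion can be captured in a string instead of printed, which makes the result easy to inspect. The following is a minimal sketch, assuming the C99 function snprintf is available; the helper name format_sum is hypothetical and not part of the original text.

```c
#include <stdio.h>

/* Hypothetical helper: formats "a + b = sum" into out (at most n bytes,
   including the terminating '\0') and returns out. */
char *format_sum( char *out, size_t n, int a, int b )
{
    /* Three %i place-holders, so three matching int expressions. */
    snprintf( out, n, "%i + %i = %i", a, b, a + b );
    return out;
}
```

Calling format_sum( buf, sizeof buf, 2, 3 ) with a 32-byte buf should leave the text 2 + 3 = 5 in buf.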
The following table shows the different format letters you can use with printf. Each letter corresponds to a different type of argument expression. It is important to use the correct letter that matches the type of the expression. The use of any other letter results in undefined behavior. (Note that %a, %A, and %F are only available in C99; the others should be available with any standard C compiler.) The arguments can be any expression of the correct type (such as literals), but usually are variables whose values were computed earlier.
In the examples, remember that 17 is an int literal, 17L is a long int literal, 017 is an octal int literal (with the decimal value 1*8 + 7 = 15), 0x17 and 0X1A are hexadecimal int literals with decimal values of 1*16 + 7 = 23 and 1*16 + A = 16 + 10 = 26, 17u is an unsigned decimal integer literal, 'A' is a char literal with a decimal value of 65, 3.14 and 0.314E1 are double literals, 3.14f is a float literal, and finally 3.14L is a long double literal.
Letter | Type of Matching Argument | Example | Output |
---|---|---|---|
% | none ( See note) | printf( "%%" ); | % |
d, i | int ( See note) | printf( "%i", 17 ); | 17 |
u | unsigned int (Converts to decimal) | printf( "%u", 17u ); | 17 |
o | unsigned int (Converts to octal) | printf( "%o", 17 ); | 21 |
x | unsigned int (Converts to lower-case hex) | printf( "%x", 26 ); | 1a |
X | unsigned int (Converts to upper-case hex) | printf( "%X", 26 ); | 1A |
f, F | double ( See note) | printf( "%f", 3.14 ); | 3.140000 |
e, E | double ( See note) | printf( "%e", 31.4 ); | 3.140000e+01 |
g, G | double ( See note) | printf( "%g, %g", 3.14, 0.0000314 ); | 3.14, 3.14e-05 |
a, A | double ( See note) | printf( "%a", 31.0 ); | 0x1.fp+4 |
c | int (See note) | printf( "%c", 65 ); | A |
s | string (See note) | printf( "%s", "Hello" ); | Hello |
p | void* (See note) | int a = 1; printf( "%p", &a ); | 0064FE00 |
n | int* (See note) | int a; printf( "ABC%n", &a ); | ABC (a==3) |
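Several of the letters in the table can be tried together. This is only a sketch: the helper name show_letters is hypothetical, snprintf (C99) is assumed, and the short and float variables are included to show the promotions described earlier (short to int, float to double).

```c
#include <stdio.h>

/* Hypothetical helper: exercises several conversion letters at once and
   writes the result into out (at most n bytes). */
void show_letters( char *out, size_t n )
{
    short small = 17;     /* promoted to int when passed to snprintf    */
    float ratio = 3.14f;  /* promoted to double when passed to snprintf */

    snprintf( out, n, "%i %o %x %X %c %s %f %e %g",
              small, 17u, 26u, 26u, 'A', "Hello",
              ratio, 31.4, 0.0000314 );
}
```

With a buffer of 128 bytes this should produce 17 21 1a 1A A Hello 3.140000 3.140000e+01 3.14e-05 on an implementation that prints two exponent digits.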
To control the appearance of the converted arguments, any or all (or none) of the following format controls may be used between the % and the final letter of the conversion specification. Note these must appear (if at all) in the sequence shown here. A · is used to indicate a space in the example output where spacing is not obvious.
% flags minimum-field-width .precision length Letter
Format Control | Description | Example | Output | |
---|---|---|---|---|
flags | The flag characters may appear in any order and have the following meanings: | |||
- left-justify within the field. ( See note) | printf( "|%3i|%-3i|", 12, 12); | |·12|12·| | ||
+ Forces positive numbers to include a leading plus sign. | printf( "%+i", 17); | +17 | ||
space Forces positive number to include a leading space. ( See note) | printf( "|% i|", 12); | |·12| | ||
# This flag forces the output to be in some alternate form. ( See note) | printf( "%#X", 26); | 0X1A | ||
0 Pad with zeros rather than spaces. ( See note) | printf( "|%04i|", 12); | |0012| | ||
minimum field-width | After converting any value to a string, the field width represents the minimum number of characters in the resulting string. (See note.) If the converted value has fewer characters, the result is padded with spaces on the left by default, with spaces on the right if the - flag is used, or with leading zeros if the 0 flag is used. | printf( "|%5s|", "ABC"); | |··ABC| | |
Sometimes the minimum field width isn't known at compile-time, and must be computed at run-time. (For example, printing a table where the width of a column depends on the widest column value in the input.) In this case the field width can be specified as an asterisk ("*"), which acts like a place-holder for an int value used for the field width. The value appears in the argument list before the expression being converted. | printf( "|%-*s|", 5, "ABC" ); | |ABC··| | ||
.precision | A period by itself implies a precision of zero. A precision may be replaced with an asterisk ("*"), which works exactly the same as for an asterisk minimum field width described above. The meaning of a precision depends on the type of conversion done. Only the conversions listed below are defined: | |||
When used with floating-point conversion letters (a, A, e, E, f, F, g, and G) the precision specifies how many digits will appear to the right of the decimal point. The default precision is six. (For conversion letters g and G, the precision is actually the maximum number of significant digits.) The value displayed is always rounded, but note this doesn't change the matching expression in any way. If the precision is zero, no decimal point appears at all (but see "#" flag above). | printf( "|%5.2f|", 3.147 ); printf( "|%5.2G|", 3.147 ); | |·3.15| |··3.1| | ||
When used with integer conversion letters (d, i, o, u, x, and X) the precision specifies the minimum number of digits to appear. Leading zeros are added as needed. | printf( "|%6.4i|", 17 ); | |··0017| | ||
When used with string conversions (letter "s") the precision specifies the maximum number of bytes written. (See note.) If the string is too long it will be truncated. | printf( "|%-5.3s|", "ABCD" ); | |ABC··| | ||
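length | A length modifier is used to exactly specify the type of the matching argument. Since most types are promoted to int or double, a length modifier is rarely used. However, it is needed for long and other types that don't have an explicit conversion letter of their own. Note that specific length modifiers only make sense in combination with specific conversion letters. Using undefined combinations causes unpredictable results. The length modifiers and their meanings are: | |||
hh With integer conversions, the argument is a signed char or unsigned char. | printf( "%hhi", 300 ); (See note) | 44 | ||
h With integer conversions, the argument is a short or unsigned short. | printf( "%hi", 300 ); | 300 | ||
l With integer conversions, the argument is a long or unsigned long; with c or s, a wide character or wide string. | long a = 300, b = (long) 1.0E+14; printf( "%li\n%li", a, b ); printf( "%lc:%ls", L'A', L"ABC" ); | 300 100000000000000 A:ABC (assuming a 64-bit long) | ||
ll With integer conversions, the argument is a long long or unsigned long long. | printf( "%#llX", 300ULL ); | 0X12C | ||
j With integer conversions, the argument is an intmax_t or uintmax_t. | printf( "%ji", (intmax_t) 17 ); | 17 | ||
z With integer conversions, the argument is a size_t. | printf( "%zu", sizeof(int) ); | 4 | ||
t With integer conversions, the argument is a ptrdiff_t. | char a[5] = "abcd"; printf( "%ti", &(a[3]) - &(a[1]) ); | 2 | ||
L With floating-point conversions, the argument is a long double. | printf( "%Lf", 3.14L ); | 3.140000 |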
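The format controls in the table can be combined in one call. The sketch below is illustrative only: the helper names show_controls and show_lengths are hypothetical, snprintf (C99) is assumed, and the long long and size_t arguments are given exactly the types their length modifiers name.

```c
#include <stdio.h>

/* Hypothetical helper: applies flags, field width, and precision. */
void show_controls( char *out, size_t n )
{
    snprintf( out, n, "|%3i|%-3i|%+i|%04i|%#X|%5.2f|%6.4i|%-*.*s|",
              12, 12, 17, 12, 26u, 3.147, 17,
              5, 3, "ABCD" );  /* 5 and 3 fill in the two asterisks */
}

/* Hypothetical helper: length modifiers name the exact argument type. */
void show_lengths( char *out, size_t n )
{
    long big = 300L;
    unsigned long long huge = 300ULL;

    snprintf( out, n, "%li %#llX %zu %Lf",
              big, huge, sizeof(char), 3.14L );
}
```

The first helper should yield | 12|12 |+17|0012|0X1A| 3.15|  0017|ABC  | and the second 300 0X12C 1 3.140000, since sizeof(char) is always 1.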
A call to scanf looks like this:
scanf( "conversion-string", &variable, ... );
Or, reading from a file using a file handle (stdin is a predefined file handle but you can define your own via fopen) looks like this:
fscanf( stdin, "conversion-string", &variable, ... );
The conversion-string can contain three types of directives:
Regular characters | This is text that must be matched character by character with the input. Such entries are rarely used for interactive programs, but can be handy when working with formatted input files. (See below for an example.) |
white-space characters | A blank, tab, or other white-space character will match any amount (including none) of any white-space. (So a single space will match any string of white-space, including newlines.) Note that it is legal for this to match no input at all (if there isn't a blank or tab, it is ok). |
Conversion Specifiers | Similar to printf conversion specifiers but just different enough to cause many errors. They all begin with a percent and end with a letter indicating the type of conversion. In between can be some special conversion controls, including the length. Unlike printf, failing to use the exact type and length for the conversion will result in unpredictable errors. Since few compilers will check the conversion-string for argument mis-matches, the result is a runtime (logic) error that can be hard to find. These conversion specifiers match a string of characters in the input, convert to the specified type (and length), and store the result in the RAM address provided by the corresponding argument. (The most common error with scanf is not using the address-of operator in front of a variable name for the argument.) |
scanf returns a useful error code. The return value is an int which indicates the number of conversions requested that (1) matched some input, (2) were converted without error, and (3) were assigned without any problems. (Matching only, or matching and converting only, doesn't count in the return value.) Depending on the error encountered the return value may be zero, EOF (a symbolic constant usually defined to be -1), or some other integer less than the number of requested conversions. Because so many problems in programs are a result of bad user input, it is common practice in production-quality code to always check the return value of scanf.
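The return-value rules can be observed with sscanf, which follows the same rules as scanf but reads from a string. This is a minimal sketch; the helper name count_two_ints is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: tries to read two ints from text and returns
   whatever sscanf returns, so the caller can see how far it got. */
int count_two_ints( const char *text )
{
    int a = 0, b = 0;
    return sscanf( text, "%i %i", &a, &b );
}
```

Under these assumptions "12 34" gives 2, "12 abc" gives 1 (only the first conversion was assigned), "abc" gives 0, and an empty string gives EOF because the input ran out before any conversion.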
Here's an example use of scanf that attempts to read in two integers from an input file called foo that is formatted with lines like this:
Height: 12, Width: 34
The C code fragment to read the numbers into variables height and width should look something like this:
int height, width;
if ( fscanf( foo, "Height: %i, Width: %i", &height, &width ) != 2 ) {
    fprintf( stderr, "###Error with Scanf: bad input data.\a\n" );
    // Do error processing, maybe just "continue" or "break".
}
Here the fscanf is requesting two conversions, so if all goes well the return value should be 2. Note how the fscanf uses all three types of entries (regular text, white-space, and conversion specifiers). Although text such as "Height:" and "," are matched, they don't count toward the return value.
The system keeps track of which input has been seen so far. Every call to scanf picks up from where the last one stopped matching input. This means that if an error occurred with the previous scanf, the input it failed to match is still left unread, as if the user typed ahead. If care isn't taken to discard error input, and a loop is used to read the input, your program can get caught in an infinite loop. (See below for an example and further discussion.)
For example, consider the program fragment below that reads in an age. If the input is "help" instead of a number, this will cause the scanf to fail when attempting to match an integer ("%i"), and the word help is left unread. So the next time through the loop, the scanf doesn't wait for fresh user input; it tries to convert help again.
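The same conversion-string can be tried on a line already held in memory. This is a sketch using sscanf in place of fscanf; the helper name parse_dims is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parses one "Height: 12, Width: 34" record.
   Returns 1 on success and 0 if either number failed to convert. */
int parse_dims( const char *line, int *height, int *width )
{
    /* Both conversions must be assigned, so the count must be 2. */
    return sscanf( line, "Height: %i, Width: %i", height, width ) == 2;
}
```

Because each blank in the conversion-string matches any amount of white-space, including none, a record written as Height:12,Width:34 is accepted too, while Height: 12; Width: 34 fails at the semicolon after the first number has been assigned.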
Similarly if the input were "17.5", the "%i" will match the first two characters only (the 17), leaving the .5 as unread input for the next call to scanf.
Even if the input is correct, as "29", the newline that ended the input is still left unread. Normally that isn't a problem since most conversions automatically skip leading white-space such as the trailing newline from the previous line. However some conversions ("%c" and "%[") don't skip any leading white-space so you have to do it manually.
Note that all input functions that read from stdin share the same input buffer, so if a call to scanf("%i", &anInt); is followed by a call to getchar(), the newline left unread by scanf is read in now. This is not usually what is wanted.
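The effect of the unread newline on "%c" can be reproduced with sscanf. In this sketch the helper name char_after_int and its skip_space parameter are hypothetical; a blank in front of %c is the manual white-space skip mentioned above.

```c
#include <stdio.h>

/* Hypothetical helper: reads an int and then one character.
   With skip_space == 0 the format is "%i%c", so %c takes whatever
   character follows the number (often the newline).
   With skip_space != 0 the format is "%i %c"; the blank consumes any
   white-space before %c. Returns the character, or '\0' on failure. */
char char_after_int( const char *text, int skip_space )
{
    int  number = 0;
    char ch = '\0';
    int  got = skip_space ? sscanf( text, "%i %c", &number, &ch )
                          : sscanf( text, "%i%c", &number, &ch );
    return got == 2 ? ch : '\0';
}
```

For the input "29\nQ" the first form returns the newline and the second returns Q.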
A final warning: Some older compilers will not match any regular text after the last conversion specifier in the conversion-string. This bug would prevent the example for "%%" below from working correctly.
Letter | Type of Matching Argument | Auto-skip Leading White-Space | Example | Sample Matching Input |
---|---|---|---|---|
% | % (a literal, matched but not converted or assigned) | no | int anInt; scanf("%i%%", &anInt); | 23% |
d | int (See note) | yes | int anInt; long l; scanf("%d %ld", &anInt, &l); | -23 200 |
i | int (See note) | yes | int anInt; scanf("%i", &anInt); | 0x23 |
o | unsigned int (See note) | yes | unsigned int aUInt; scanf("%o", &aUInt); | 023 |
u | unsigned int (See note) | yes | unsigned int aUInt; scanf("%u", &aUInt); | 23 |
x | unsigned int (See note) | yes | unsigned int aUInt; scanf("%x", &aUInt); | 1A |
a, e, f, g | float or double (See note) | yes | float f; double d; scanf("%f %lf", &f, &d); | 1.2 3.4 |
c | char (See note) | no | char ch; scanf(" %c", &ch); | Q |
s | array of char (See note) | yes | char s[30]; scanf("%29s", s); | hello |
p | void* (See note) | yes | int* pi; void* ptr; scanf("%p", &ptr); pi = (int*) ptr; | 0064FE00 |
n | int (See note) | no | int x, cnt; scanf("X: %d%n", &x, &cnt); | X: 123 (cnt==6) |
[ | array of char (See note) | no | char s1[64], s2[64]; scanf(" %63[^\n]", s1); scanf("%63[^\t] %63[^\t]", s1, s2); | Hello World field1 field2 |
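Two of the less familiar letters, "%[" and "%n", are easier to follow in a small example. This is a sketch only: the helper names split_tab_fields and read_hex are hypothetical, sscanf stands in for scanf, and the field widths of 63 are chosen to fit the 64-byte arrays.

```c
#include <stdio.h>

/* Hypothetical helper: splits "key<TAB>value" into two bounded strings.
   Returns the sscanf result (2 when both fields were stored). */
int split_tab_fields( const char *line, char key[64], char value[64] )
{
    /* %63[^\t] stops at the first tab; the blank skips the tab itself;
       %63[^\n] takes the rest of the line up to the newline. */
    return sscanf( line, "%63[^\t] %63[^\n]", key, value );
}

/* Hypothetical helper: reads a hex number with %x and uses %n to report
   how many characters were consumed. Returns 1 on success. */
int read_hex( const char *text, unsigned int *value, int *consumed )
{
    /* %n stores a count but does not add to the return value. */
    return sscanf( text, "%x%n", value, consumed ) == 1;
}
```

For the line "field1\tfield2\n" the first helper should store field1 and field2; for "1A" the second should store 26 and report 2 characters consumed.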
The control of input conversion is much simpler than for output conversions. Any, all, or none of the following format modifiers may be used between the % and the final letter of the conversion specification. Note these must appear (if at all) in the sequence shown here. A · is used to indicate a space in the example output where spacing is not obvious.
% * maximum-field-width length Letter
Conversion Modifier | Description | Example | Matching Input | Results |
---|---|---|---|---|
* | Assignment Suppression. This modifier causes the corresponding input to be matched and converted, but not assigned (no matching argument is needed). | int anInt; scanf("%*s %i", &anInt); | Age:·29 | anInt==29, return value==1 |
maximum field-width | This is the maximum number of characters to read from the input. Any remaining input is left unread. (Always use this with "%s" and "%[...]" in all production quality code! (No exceptions!) You should use one less than the size of the array used to hold the result.) | int anInt; scanf("%2i", &anInt); | 2345 | anInt==23, return value==1 |
| | char s[10]; scanf("%9s", s); | VeryLongString | s=="VeryLongS", return value==1 |
length modifier | This specifies the exact type of the matching argument. These length codes are the same as the printf length modifiers, except as noted below: | | | |
l | With the floating-point letters (a, e, f, g), the matching argument is a pointer to double rather than to float. | double d; scanf("%lf", &d); | 3.14 | d==3.14, return value==1 |
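The assignment suppression and maximum field-width modifiers can be checked against the sample inputs from the table. This is a minimal sketch with hypothetical helper names (read_labeled_int and read_short_word), using sscanf so the input is a plain string.

```c
#include <stdio.h>

/* Hypothetical helper: skips a leading label such as "Age:" with %*s
   and stores only the number. Returns 1 on success. */
int read_labeled_int( const char *text, int *value )
{
    return sscanf( text, "%*s %i", value ) == 1;
}

/* Hypothetical helper: stores at most 9 characters of the first word.
   The array holds 10 bytes, so the width is one less than its size. */
int read_short_word( const char *text, char word[10] )
{
    return sscanf( text, "%9s", word ) == 1;
}
```

The suppressed "%*s" does not count toward the return value, which is why a successful call to the first helper sees a count of 1 rather than 2.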
The scanf call:
int i; float x; char name[50]; scanf( "%2d%f%*d %[0123456789]", &i, &x, name );
With this input:
56789 0123 56a72
will assign to i the value 56 and to x the value 789.0, will skip 0123, and assign to name the sequence 56\0 (the string "56"). The next character to be read from the input will be a.
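The walk-through above can be verified by running the call on a string. This sketch makes two additions that are not part of the original call: a field width of 49 on the "%[" conversion to protect the 50-byte array, and a trailing %n that records where matching stopped. The helper name run_example is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: applies the conversion-string to text and returns
   the offset of the first unread character, or -1 if fewer than three
   values were assigned. */
int run_example( const char *text, int *i, float *x, char name[50] )
{
    int next = -1;

    if ( sscanf( text, "%2d%f%*d %49[0123456789]%n",
                 i, x, name, &next ) != 3 )
        return -1;
    return next;
}
```

For the input 56789 0123 56a72 this should assign 56, 789.0, and "56", and return 13, the position of the letter a.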
A simple (and common) example reads an int from a user this way:
int age;
for ( ;; ) {
    fprintf( stderr, "Please enter your age: " );
    if ( scanf( "%i", &age ) == 1 )
        break;
    // Do some sort of error processing:
    fprintf( stderr, "\nError reading in the age, please try again.\n" );
}
printf( "You are %i years old.\n", age );
Note the use of fprintf to send the output to the screen even if output was redirected. Some common error processing is to reset the variables and try again, using a loop as shown here. Sometimes a count of attempts is kept and the user is only given a certain number of attempts before the program gives up.
A problem with scanf is that it leaves any unmatched input unread. This may be a problem for applications that expect line oriented input. When each line (or record) of input is to be processed independently, an error such as bad data on one line can cause errors when attempting to read the following line.
Consider the code above to read in an age. If the input entered by the user was non-numeric such as the word "help", the "h" would not match the "%i" and would be left unread. When the for loop repeated, the scanf would encounter the "h" again and immediately fail. This would cause an infinite loop!
A similar problem would exist if the user entered "29.5" for their age. The first time through the loop the scanf would read the 29. If the next input expected was a person's name or ID or whatever, the ".5" will be read next.
Another common problem with this approach is mixing scanf with getchar or getc. The scanf typically leaves the newline unread, so a call to read the next character retrieves that instead of the character the programmer expected to get. (This problem may be worse on DOS based systems, which have two characters to mark the end of lines.)
The solution is to use fgets (see note) to read input a line at a time into a buffer, then use the sscanf function to parse the contents of the buffer. With this approach each input operation consumes (reads) an entire line of input, even if it had errors. The next input operation starts fresh with the next line of input.
Here's an example to illustrate the technique:
char buf[BUFSIZ];  /* Buffer for a line of input. */
int age;
fprintf( stderr, "Please enter your age: " );
while ( fgets( buf, sizeof(buf), stdin ) != NULL ) {
    if ( sscanf( buf, "%i", &age ) == 1 )
        break;
    // Do some sort of error processing:
    fprintf( stderr, "\nError reading in the age, please try again.\n" );
}
The call to fgets reads all input up to and including a newline. It then copies that line of input into buf, adding a '\0' at the end to form a valid C string. The terminating newline is also copied into buf. On EOF, fgets returns NULL. (EOF and NULL are defined in <stdio.h>.) If the input is larger than the size of the buffer, then only the input that will fit is consumed (read). Note fgets is smart enough to reserve space for the '\0' from the size given. In this case the maximum input read would be BUFSIZ-1.
The sscanf works just like scanf or fscanf. The first argument to sscanf is the string to read from (instead of stdin as for scanf). If the fgets doesn't detect EOF but the sscanf fails to match any input using "%s", the input must have been a blank line. When using "%i" and not a "%s", the return value doesn't tell if the input was a blank line or some other error.
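The line-at-a-time technique can be packaged as a function that takes the input stream as a parameter, which also makes it possible to try it on a file instead of the keyboard. This is a sketch under that assumption; the name read_age and its return convention are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads lines from in until one starts with a valid
   int. Returns 1 and stores the value on success, or 0 when the input
   ends (fgets returned NULL) before a valid line was seen. */
int read_age( FILE *in, int *age )
{
    char buf[BUFSIZ];  /* Buffer for one line of input. */

    while ( fgets( buf, sizeof buf, in ) != NULL ) {
        if ( sscanf( buf, "%i", age ) == 1 )
            return 1;  /* Good input: stop reading. */
        /* Bad or blank line: fgets already consumed it, so the next
           pass starts fresh with the following line. */
    }
    return 0;
}
```

Given the three lines help, an empty line, and 29, the function skips the first two and returns with 29; a further call then returns 0 because the input is exhausted.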
The line-at-a-time example above works well but doesn't detect all errors. Consider what would happen if the user entered 29.5 for an age, or 32,500 for some numeric value (such as a person's income in dollars). While the fgets will read the whole line, the sscanf will only read 29 in the first case and 32 in the second. In both cases sscanf will return 1 and the errors would go undetected.
In order to detect such extra input on the line, sscanf must attempt to match it, convert it, and assign it to a variable. Then the return value will be 2 if extra input was present. There are two ways to do this. If extra white space is not considered an error you can use "%s" instead of the "%[" used below:
char buf[BUFSIZ], junk[BUFSIZ];
int income;
fprintf( stderr, "Please enter your income: " );
while ( fgets( buf, sizeof(buf), stdin ) != NULL ) {
    if ( sscanf( buf, "%i%[^\n]", &income, junk ) == 1 )
        break;
    // Do some sort of error processing:
    fprintf( stderr, "\nError reading your income, please try again.\n" );
}
Here the sscanf will skip leading white space (%i does this automatically), match digits until the first non-digit, convert the matched string to an int and assign the result to income. sscanf then matches any remaining characters (up to but not including a newline) and stores the string in junk. If there were no input errors, the %i would succeed but the %[^\n] would fail to match any input. The return value would therefore be 1. However, if any extra input was encountered, the %[^\n] would match it and assign the string to junk. This would cause sscanf to return 2. If the user input was "$32500", the %i would fail to match anything and the return value would be 0. This technique will therefore catch any input errors and consume the entire line (record) whether or not errors were present.
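The extra-input check can be isolated in a function that validates one line. In this sketch the name parse_income is hypothetical, and the junk array is shrunk to two bytes with a field width of 1, on the assumption that it is enough to know whether any extra character exists.

```c
#include <stdio.h>

/* Hypothetical helper: accepts a line only if it holds one int and
   nothing else before the newline. Returns 1 if valid, 0 otherwise. */
int parse_income( const char *line, int *income )
{
    char junk[2];  /* Room for one stray character plus the '\0'. */

    /* A result of exactly 1 means %i matched and %1[^\n] found nothing;
       2 means extra input; 0 or EOF means no number at all. */
    return sscanf( line, "%i%1[^\n]", income, junk ) == 1;
}
```

Under these assumptions "32500\n" is accepted, while "32,500\n", "29.5\n", "$32500\n", and an empty string are all rejected. A trailing blank is rejected as well, as it would be with the "%[" form above.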
Below is a table of character constants. These can be used individually (as a char literal by surrounding the constant with single quotes) or as part of a double-quoted string literal.
Constant | Meaning |
---|---|
\' | A single quote |
\" | A double quote |
\? | A question mark |
\\ | A backslash |
\a | Alert sound |
\b | A backspace |
\f | A form-feed |
\n | A newline |
\r | A carriage return |
\t | A tab |
\v | A vertical tab |
\ooo | Octal constant (up to three octal digits) |
\xHH | Hexadecimal constant (one or more hex digits) |
\uHHHH | Unicode constant (four hex digits) |
\UHHHHHHHH | Long Unicode constant (eight hex digits) |
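A short example may help show that each constant in the table stands for a single character. This is a sketch with hypothetical helper names; the second helper assumes an ASCII-compatible character set, where 'A' has the value 65.

```c
#include <string.h>

/* Hypothetical helper: length of a string literal made only of escapes. */
size_t escaped_length( void )
{
    /* Tab, double quote, backslash, newline: four characters. */
    return strlen( "\t\"\\\n" );
}

/* Hypothetical helper: 'A' written as an octal and as a hex constant.
   Assumes an ASCII-compatible execution character set. */
int escapes_match( void )
{
    return '\101' == 'A' && '\x41' == 'A';
}
```

Here escaped_length() returns 4 even though the literal takes eight characters to type, and escapes_match() returns 1 on ASCII-based systems.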
The C standard doesn't use the term Unicode character much; the authors prefer the term universal character. It's pretty clear that they mean Unicode though, since it is referred to by its official name of "ISO/IEC 10646" in several places. Unicode is actually a 4 byte per character encoding; however, most non-Asian characters are found in the lower half of the character set, so the 2 byte form is common. (Actually Unicode is often stored in files in a form called UTF-8.)
Constants for such wide characters have the type wchar_t. You can make a literal for one as follows: L'\u00A9' (which is the same character as '\xA9', namely the "©" symbol). You can also form wide strings such as: L"\u00A92001", which translates to the string "©2001".
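The wide string from the text can be measured with wcslen from <wchar.h>, which counts wide characters rather than bytes. This is a small sketch; the helper names wide_length and second_char are hypothetical.

```c
#include <wchar.h>

/* Hypothetical helper: number of wide characters in the string above. */
size_t wide_length( void )
{
    /* \u00A9 uses exactly four hex digits, so the digits 2001 that
       follow are four separate characters. */
    return wcslen( L"\u00A92001" );
}

/* Hypothetical helper: the wide character right after the copyright sign. */
wchar_t second_char( void )
{
    return L"\u00A92001"[1];
}
```

The length is 5 (one copyright sign plus four digits), and the second character is L'2'.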
This information was extracted from ISO/IEC 9899, second edition (the C99 Standard), mostly from sections 7.19.6.1 and 7.19.6.2.