1、Files can be classified as either text or binary.
2、A file that can be processed (read, created, or modified) using a text editor such as Notepad on Windows or vi on UNIX is called a text file.
3、All the other files are called binary files.
4、For example, Java source programs are text files and can be read by a text editor, but Java class files are binary files and are read by the JVM.
5、Java offers many classes for performing file input and output. These can be categorized as text I/O classes and binary I/O classes.
Text data are read using the Scanner class and written using the PrintWriter class.
An input class contains the methods to read data, and an output class contains the methods to write data. PrintWriter is an example of an output class, and Scanner is an example of an input class. The following code creates an input object for the file temp.txt and reads data from the file.
Scanner input = new Scanner(new File("temp.txt"));
System.out.println(input.nextLine());
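The pairing of PrintWriter for output and Scanner for input can be sketched as a round trip. This is a minimal illustration rather than code from the original notes: the class name TextRoundTripDemo, the roundTrip method, and the scratch file created with File.createTempFile are all assumptions made for the example, and IOException is wrapped in UncheckedIOException only to keep the demo short.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.util.Scanner;

class TextRoundTripDemo {
    /** Writes one line to a scratch file with PrintWriter, then reads it back with Scanner. */
    static String roundTrip(String line) {
        try {
            File file = File.createTempFile("temp", ".txt"); // scratch file; the name is arbitrary
            file.deleteOnExit();
            try (PrintWriter output = new PrintWriter(file)) {
                output.println(line); // text output: characters are encoded to bytes
            }
            try (Scanner input = new Scanner(file)) {
                return input.nextLine(); // text input: bytes are decoded to characters
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Hello, text I/O"));
    }
}
```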
Figure 17.1 illustrates Java I/O programming. An input object reads a stream of data from a file, and an output object writes a stream of data to a file. An input object is also called an input stream and an output object an output stream.
Binary I/O does not involve encoding or decoding and thus is more efficient than text I/O.
1、Computers do not differentiate between binary files and text files. All files are stored in binary format, and thus all files are essentially binary files.
2、Text I/O is built upon binary I/O to provide a level of abstraction for character encoding and decoding, as shown in Figure 17.2a.
3、The JVM converts Unicode to a file-specific encoding when writing a character, and it converts a file-specific encoding to Unicode when reading a character.
4、Binary files are independent of the encoding scheme on the host machine and thus are portable.
5、Java programs on any machine can read a binary file created by a Java program. This is why Java class files are binary files. Java class files can run on a JVM on any machine.
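The difference between text I/O and binary I/O can be seen by comparing the bytes each one produces for the same value. The following is a minimal sketch, not part of the original listings: the class TextVsBinaryDemo and its helper methods are hypothetical, an in-memory ByteArrayOutputStream stands in for a file, and the text case assumes the US-ASCII encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class TextVsBinaryDemo {
    /** Writes the value as text: each decimal digit is encoded as one character. */
    static byte[] asText(int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintWriter out = new PrintWriter(
                new OutputStreamWriter(sink, StandardCharsets.US_ASCII))) {
            out.print(value);
        }
        return sink.toByteArray();
    }

    /** Writes the value as one raw byte; only the low 8 bits are kept. */
    static byte[] asBinary(int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        sink.write(value);
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("text bytes:   " + asText(199).length);
        System.out.println("binary bytes: " + asBinary(199).length);
    }
}
```

For the value 199, the text form is the three characters '1', '9', '9' (bytes 0x31 0x39 0x39), while the binary form is the single byte 0xC7.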
The abstract InputStream is the root class for reading binary data, and the abstract OutputStream is the root class for writing binary data. Figures 17.4 and 17.5 list all the methods in the classes InputStream and OutputStream.
Note
All the methods in the binary I/O classes are declared to throw java.io.IOException or a subclass of java.io.IOException.
1、FileInputStream/FileOutputStream is for reading/writing bytes from/to files.
2、All the methods in these classes are inherited from InputStream and OutputStream.
3、FileInputStream/FileOutputStream does not introduce new methods. To construct a FileInputStream, use the constructors shown in Figure 17.6.
4、A java.io.FileNotFoundException will occur if you attempt to create a FileInputStream with a nonexistent file.
5、To construct a FileOutputStream, use the constructors shown in Figure 17.7.
If the file does not exist, a new file will be created. If the file already exists, the first two constructors will delete the current content of the file. To retain the current content and append new data into the file, use the last two constructors and pass true to the append parameter.
6、Almost all the methods in the I/O classes throw java.io.IOException. Therefore, you have to declare to throw java.io.IOException in the method or place the code in a try-catch block, as shown below:
LISTING 17.1 TestFileStream.java
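import java.io.*;
public class TestFileStream {
  public static void main(String[] args) throws IOException {
    // Create an output stream to the file; it is closed automatically
    try (FileOutputStream output = new FileOutputStream("temp.dat")) {
      // Write the values 1 through 10 as single bytes
      for (int i = 1; i <= 10; i++)
        output.write(i);
    }
    // Create an input stream for the file
    try (FileInputStream input = new FileInputStream("temp.dat")) {
      // Read bytes until read() returns -1 at the end of the file
      int value;
      while ((value = input.read()) != -1)
        System.out.print(value + " ");
    }
  }
}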
Output:
1 2 3 4 5 6 7 8 9 10
Program notes:
1、The program uses the try-with-resources statement to declare and create input and output streams so that they will be automatically closed after they are used. The java.io.InputStream and java.io.OutputStream classes implement the AutoCloseable interface. The AutoCloseable interface defines the close() method that closes a resource. Any object of the AutoCloseable type can be used with the try-with-resources syntax for automatic closing.
2、The file temp.dat created in this example is a binary file. It can be read from a Java program but not from a text editor, as shown in Figure 17.8.
Tip
When a stream is no longer needed, always close it using the close() method or automatically close it using a try-with-resources statement. Not closing streams may cause data corruption in the output file, or other programming errors.
Note
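A relative file name is resolved against the current working directory, that is, the directory from which the JVM is started (available as the system property user.dir). For the example in this book, that directory is c:\book, so the file temp.dat is located at c:\book. If you wish to place temp.dat in a specific directory, replace the statement that creates the FileOutputStream in Listing 17.1 with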
FileOutputStream output = new FileOutputStream ("directory/temp.dat");
Note
An instance of FileInputStream can be used as an argument to construct a Scanner, and an instance of FileOutputStream can be used as an argument to construct a PrintWriter. You can create a PrintWriter to append text into a file using
new PrintWriter(new FileOutputStream("temp.txt", true));
If temp.txt does not exist, it is created. If temp.txt already exists, new data are appended to the file.
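The append behavior described in this note can be sketched as follows. This is an illustration under assumptions, not code from the original: the AppendDemo class, its appendLine and demo methods, and the scratch file are hypothetical names introduced for the example.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.List;

class AppendDemo {
    /** Appends one line to the file; the file is created if it does not exist. */
    static void appendLine(File file, String line) {
        // Passing true as the second argument opens the file in append mode
        try (PrintWriter output = new PrintWriter(new FileOutputStream(file, true))) {
            output.println(line);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Appends two lines in two separate opens and returns the lines in the file. */
    static List<String> demo() {
        try {
            File file = File.createTempFile("append", ".txt"); // scratch file
            file.deleteOnExit();
            appendLine(file, "first");
            appendLine(file, "second"); // the earlier line is retained
            return Files.readAllLines(file.toPath());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Had the second open used new FileOutputStream(file) without the append flag, the file would contain only the line "second".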
Filter streams are streams that filter bytes for some purpose.
1、Using a filter class enables you to read integers, doubles, and strings instead of bytes and characters. FilterInputStream and FilterOutputStream are the base classes for filtering data.
2、When you need to process primitive numeric types, use DataInputStream and DataOutputStream to filter bytes.
DataInputStream reads bytes from the stream and converts them into appropriate primitive-type values or strings. DataOutputStream converts primitive-type values or strings into bytes and outputs the bytes to the stream.
DataInputStream extends FilterInputStream and implements the DataInput interface, as shown in Figure 17.9. DataOutputStream extends FilterOutputStream and implements the DataOutput interface, as shown in Figure 17.10.
DataInputStream implements the methods defined in the DataInput interface to read primitive data-type values and strings. DataOutputStream implements the methods defined in the DataOutput interface to write primitive data-type values and strings.
1、A Java char is a 16-bit Unicode (UTF-16) code unit, so it consists of two bytes.
2、The writeChar(char c) method writes the Unicode of character c to the output.
3、The writeChars(String s) method writes the Unicode for each character in the string s to the output.
4、The writeBytes(String s) method writes the lower byte of the Unicode for each character in the string s to the output.
5、The high byte of the Unicode is discarded.
6、The writeBytes method is suitable for strings that consist of ASCII characters, since an ASCII code is stored only in the lower byte of a Unicode. If a string consists of non-ASCII characters, you have to use the writeChars method to write the string.
7、The writeUTF(String s) method writes two bytes of length information to the output stream, followed by the modified UTF-8 representation of every character in the string s.
8、UTF-8 is a coding scheme that allows systems to operate with both ASCII and Unicode. Java represents characters in Unicode, and the ASCII character set is a subset of the Unicode character set. Since many applications need only the ASCII character set, it is a waste to represent an ASCII character, which fits in one byte, as a 16-bit Unicode character.
The modified UTF-8 scheme stores a character using one, two, or three bytes. Characters are coded in one byte if their code is in the range 0x01 to 0x7F, in two bytes if their code is 0x00 or is greater than 0x7F and less than or equal to 0x7FF, or in three bytes if their code is greater than 0x7FF.
9、The writeUTF(String s) method converts a string into a series of bytes in the modified UTF-8 format and writes them into an output stream. The readUTF() method reads a string that has been written using the writeUTF method.
The UTF-8 format has the advantage of saving a byte for each ASCII character, because a Unicode character takes up two bytes and an ASCII character in UTF-8 only one byte. If most of the characters in a long string are regular ASCII characters, using UTF-8 is more efficient.
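The size differences among writeChars, writeBytes, and writeUTF can be checked by counting the bytes each method produces. The following is a small sketch under assumptions: the StringEncodingSizes class and its bytesWritten helper are hypothetical, and an in-memory ByteArrayOutputStream replaces the file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class StringEncodingSizes {
    /** Returns the number of bytes the named DataOutputStream method writes for s. */
    static int bytesWritten(String method, String s) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            switch (method) {
                case "writeChars": out.writeChars(s); break; // two bytes per char
                case "writeBytes": out.writeBytes(s); break; // low byte of each char only
                case "writeUTF":   out.writeUTF(s);   break; // 2-byte length + modified UTF-8
                default: throw new IllegalArgumentException(method);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        for (String m : new String[] {"writeChars", "writeBytes", "writeUTF"})
            System.out.println(m + "(\"ABC\") -> " + bytesWritten(m, "ABC") + " bytes");
    }
}
```

For "ABC" the three methods write 6, 3, and 5 bytes respectively; the 5 is the two-byte length prefix plus one byte per ASCII character.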
DataInputStream/DataOutputStream are created using the following constructors (see Figures 17.9 and 17.10):
public DataInputStream(InputStream instream)
public DataOutputStream(OutputStream outstream)
The following statements create data streams. The first statement creates an input stream for the file in.dat; the second statement creates an output stream for the file out.dat.
DataInputStream input = new DataInputStream(new FileInputStream("in.dat"));
DataOutputStream output = new DataOutputStream(new FileOutputStream("out.dat"));
LISTING 17.2 TestDataStream.java
import java.io.*;
public class TestDataStream {
  public static void main(String[] args) throws IOException {
    try ( // Create an output stream for file temp.dat
      DataOutputStream output = new DataOutputStream(new FileOutputStream("temp.dat"));
    ) {
      // Write student names and scores to the file
      output.writeUTF("John");
      output.writeDouble(85.5);
      output.writeUTF("Jim");
      output.writeDouble(185.5);
      output.writeUTF("George");
      output.writeDouble(105.25);
    }
    try ( // Create an input stream for file temp.dat
      DataInputStream input = new DataInputStream(new FileInputStream("temp.dat"));
    ) {
      // Read student names and scores in the order they were written
      System.out.println(input.readUTF() + " " + input.readDouble());
      System.out.println(input.readUTF() + " " + input.readDouble());
      System.out.println(input.readUTF() + " " + input.readDouble());
    }
  }
}
John 85.5
Jim 185.5
George 105.25
DataInputStream and DataOutputStream read and write Java primitive-type values and strings in a machine-independent fashion, thereby enabling you to write a data file on one machine and read it on another machine that has a different operating system or file structure.
DataInputStream filters data from an input stream into appropriate primitive-type values or strings. DataOutputStream converts primitive-type values or strings into bytes and outputs the bytes to an output stream. You can view DataInputStream/FileInputStream and DataOutputStream/FileOutputStream working in a pipeline as shown in Figure 17.11.
Caution
You have to read data in the same order and format in which they are stored. For example, since names are written in UTF-8 using writeUTF, you must read names using readUTF.
If you keep reading data at the end of a DataInputStream, an EOFException will occur. This exception can be used to detect the end of a file, as shown in Listing 17.3.
LISTING 17.3 DetectEndOfFile.java
import java.io.*;
public class DetectEndOfFile {
  public static void main(String[] args) {
    try {
      try (DataOutputStream output = new DataOutputStream(new FileOutputStream("test.dat"))) {
        output.writeDouble(4.5);
        output.writeDouble(43.25);
        output.writeDouble(3.2);
      }
      try (DataInputStream input = new DataInputStream(new FileInputStream("test.dat"))) {
        // Keep reading until readDouble() throws an EOFException
        while (true)
          System.out.println(input.readDouble());
      }
    }
    catch (EOFException ex) {
      System.out.println("All data were read");
    }
    catch (IOException ex) {
      ex.printStackTrace();
    }
  }
}
4.5
43.25
3.2
All data were read
BufferedInputStream/BufferedOutputStream can be used to speed up input and output by reducing the number of disk reads and writes. Using BufferedInputStream, the whole block of data on the disk is read into the buffer in the memory once. The individual data are then delivered to your program from the buffer, as shown in Figure 17.12a.
Using BufferedOutputStream, the individual data are first written to the buffer in the memory. When the buffer is full, all data in the buffer are written to the disk once, as shown in Figure 17.12b.
BufferedInputStream/BufferedOutputStream does not contain new methods. All the methods in BufferedInputStream/BufferedOutputStream are inherited from the InputStream/OutputStream classes. BufferedInputStream/BufferedOutputStream manages a buffer behind the scenes and automatically reads/writes data from/to disk on demand.
You can wrap a BufferedInputStream/BufferedOutputStream on any InputStream/OutputStream using the constructors shown in Figures 17.13 and 17.14.
If no buffer size is specified, a default size chosen by the implementation is used (typically 8192 bytes in common JDK implementations). You can improve the performance of the TestDataStream program in Listing 17.2 by adding buffers to the streams where the DataOutputStream and DataInputStream are created, as follows:
DataOutputStream output = new DataOutputStream( new BufferedOutputStream(new FileOutputStream("temp.dat")));
DataInputStream input = new DataInputStream(new BufferedInputStream(new FileInputStream("temp.dat")));
Tip
You should always use buffered I/O to speed up input and output. For small files, you may not notice performance improvements. However, for large files (over 100 MB), you will see substantial improvements using buffered I/O.
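How a BufferedOutputStream holds data back until the buffer is full or flushed can be observed with an in-memory sink. This is a sketch under assumptions: the BufferFlushDemo class and its sizesAroundFlush method are hypothetical, the ByteArrayOutputStream plays the role of the disk, and the 16-byte buffer size is an arbitrary choice for the demonstration.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class BufferFlushDemo {
    /** Returns the number of bytes in the sink before and after flush(). */
    static int[] sizesAroundFlush() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the disk
        try (BufferedOutputStream out = new BufferedOutputStream(sink, 16)) {
            out.write(1);
            out.write(2);
            out.write(3);
            int before = sink.size(); // the three bytes are still held in the buffer
            out.flush();
            int after = sink.size();  // flush() pushes them to the sink in one write
            return new int[] {before, after};
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesAroundFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

Closing the stream also flushes it, which is one reason the try-with-resources form is preferred for buffered output.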
This section develops a useful utility for copying files.
In this section, you will learn how to write a program that lets users copy files. The user needs to provide a source file and a target file as command-line arguments using the command:
java Copy source target
The program copies the source file to the target file and displays the number of bytes in the file. The program should alert the user if the source file does not exist or if the target file already exists. A sample run of the program is shown in Figure 17.15.
LISTING 17.4 Copy.java
import java.io.*;
public class Copy {
  /** Main method
   * @param args[0] for source file
   * @param args[1] for target file
   */
  public static void main(String[] args) throws IOException {
    // Check command-line parameter usage
    if (args.length != 2) {
      System.out.println("Usage: java Copy sourceFile targetFile");
      System.exit(1);
    }
    // Check if source file exists
    File sourceFile = new File(args[0]);
    if (!sourceFile.exists()) {
      System.out.println("Source file " + args[0] + " does not exist");
      System.exit(2);
    }
    // Check if target file exists
    File targetFile = new File(args[1]);
    if (targetFile.exists()) {
      System.out.println("Target file " + args[1] + " already exists");
      System.exit(3);
    }
    try (
      // Create an input stream
      BufferedInputStream input = new BufferedInputStream(new FileInputStream(sourceFile));
      // Create an output stream
      BufferedOutputStream output = new BufferedOutputStream(new FileOutputStream(targetFile));
    ) {
      // Continuously read a byte from input and write it to output
      int r, numberOfBytesCopied = 0;
      while ((r = input.read()) != -1) {
        output.write((byte)r);
        numberOfBytesCopied++;
      }
      System.out.println(numberOfBytesCopied + " bytes copied");
    }
  }
}
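Listing 17.4 copies one byte per call. A common variation, sketched here as an alternative rather than as the book's method, reads and writes a block of bytes at a time using the read(byte[]) and write(byte[], int, int) methods inherited from InputStream and OutputStream. The BlockCopy class, its copy method, and the 4096-byte block size are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class BlockCopy {
    /** Copies all bytes from input to output in blocks and returns the byte count. */
    static long copy(InputStream input, OutputStream output) {
        byte[] block = new byte[4096]; // block size is an arbitrary choice
        long total = 0;
        try {
            int n;
            while ((n = input.read(block)) != -1) {
                output.write(block, 0, n); // write only the bytes actually read
                total += n;
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10000];
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(data), target) + " bytes copied");
    }
}
```

The same method works with file streams, for example copy(new FileInputStream(sourceFile), new FileOutputStream(targetFile)) inside a try-with-resources statement.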
ObjectInputStream/ObjectOutputStream classes can be used to read/write serializable objects.
DataInputStream/DataOutputStream enables you to perform I/O for primitive-type values and strings.
ObjectInputStream/ObjectOutputStream enables you to perform I/O for objects in addition to primitive-type values and strings. Since ObjectInputStream/ObjectOutputStream contains all the functions of DataInputStream/DataOutputStream, you can replace DataInputStream/DataOutputStream completely with ObjectInputStream/ObjectOutputStream. ObjectInputStream extends InputStream and implements ObjectInput and ObjectStreamConstants, as shown in Figure 17.16.
ObjectInput is a subinterface of DataInput (DataInput is shown in Figure 17.9). ObjectStreamConstants contains the constants to support ObjectInputStream/ObjectOutputStream.
ObjectOutputStream extends OutputStream and implements ObjectOutput and ObjectStreamConstants, as shown in Figure 17.17. ObjectOutput is a subinterface of DataOutput (DataOutput is shown in Figure 17.10).
You can wrap an ObjectInputStream/ObjectOutputStream on any InputStream/OutputStream using the following constructors:
// Create an ObjectInputStream
public ObjectInputStream(InputStream in)
// Create an ObjectOutputStream
public ObjectOutputStream(OutputStream out)
Listing 17.5 writes student names, scores, and the current date to a file named object.dat.
LISTING 17.5 TestObjectOutputStream.java
import java.io.*;
public class TestObjectOutputStream {
  public static void main(String[] args) throws IOException {
    try ( // Create an output stream for file object.dat
      ObjectOutputStream output = new ObjectOutputStream(new FileOutputStream("object.dat"));
    ) {
      // Write a string, a double value, and an object to the file
      output.writeUTF("John");
      output.writeDouble(85.5);
      output.writeObject(new java.util.Date());
    }
  }
}
To improve performance, you may add a buffer in the stream by replacing the statement that creates the ObjectOutputStream in Listing 17.5 with the following:
ObjectOutputStream output = new ObjectOutputStream( new BufferedOutputStream(new FileOutputStream("object.dat")));
Multiple objects or primitives can be written to the stream. The objects must be read back from the corresponding ObjectInputStream with the same types and in the same order as they were written. Java’s safe casting should be used to get the desired type. Listing 17.6 reads data from object.dat.
LISTING 17.6 TestObjectInputStream.java
import java.io.*;
public class TestObjectInputStream {
  public static void main(String[] args) throws ClassNotFoundException, IOException {
    try ( // Create an input stream for file object.dat
      ObjectInputStream input = new ObjectInputStream(new FileInputStream("object.dat"));
    ) {
      // Read a string, a double value, and an object in the order they were written
      String name = input.readUTF();
      double score = input.readDouble();
      java.util.Date date = (java.util.Date)(input.readObject());
      System.out.println(name + " " + score + " " + date);
    }
  }
}
Not every object can be written to an output stream. Objects that can be so written are said to be serializable. A serializable object is an instance of the java.io.Serializable interface, so the object’s class must implement Serializable.
The Serializable interface is a marker interface. Since it has no methods, you don’t need to add additional code in your class that implements Serializable. Implementing this interface enables the Java serialization mechanism to automate the process of storing objects and arrays.
Suppose you wish to store an ArrayList object. To do this you need to store all the elements in the list. Each element is an object that may contain other objects. As you can see, this would be a very tedious process. Fortunately, you don’t have to go through it manually. Java provides a built-in mechanism to automate the process of writing objects. This process is referred to as object serialization, which is implemented in ObjectOutputStream.
In contrast, the process of reading objects is referred to as object deserialization, which is implemented in ObjectInputStream.
Many classes in the Java API implement Serializable. All the wrapper classes for primitive type values, java.math.BigInteger, java.math.BigDecimal, java.lang.String, java.lang.StringBuilder, java.lang.StringBuffer, java.util.Date, and java.util.ArrayList implement java.io.Serializable. Attempting to store an object that does not support the Serializable interface would cause a NotSerializableException.
When a serializable object is stored, the class of the object is encoded; this includes the class name and the signature of the class, the values of the object’s instance variables, and the closure of any other objects referenced by the object. The values of the object’s static variables are not stored.
Note
Nonserializable fields
If an object is an instance of Serializable but contains nonserializable instance data fields, can it be serialized? The answer is no. To enable the object to be serialized, mark these data fields with the transient keyword to tell the JVM to ignore them when writing the object to an object stream. Consider the following class:
public class C implements java.io.Serializable {
  private int v1;
  private static double v2;
  private transient A v3 = new A();
}
class A { } // A is not serializable
When an object of the C class is serialized, only variable v1 is serialized. Variable v2 is not serialized because it is a static variable, and variable v3 is not serialized because it is marked transient. If v3 were not marked transient, a java.io.NotSerializableException would occur.
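The effect of the transient keyword can be checked with a round trip through an in-memory object stream. This is a sketch under assumptions: the TransientDemo and Account names and the roundTrip helper are hypothetical and do not appear in the original, and a ByteArrayOutputStream stands in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class TransientDemo {
    /** A serializable class with one ordinary field and one transient field. */
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner;              // serialized
        transient String password; // ignored when the object is written

        Account(String owner, String password) {
            this.owner = owner;
            this.password = password;
        }
    }

    /** Serializes the account to memory and deserializes a copy of it. */
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("alice", "secret"));
        System.out.println(copy.owner + " " + copy.password);
    }
}
```

After the round trip the owner field is restored, while the transient password field holds its default value, null.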
Note
Duplicate objects
If an object is written to an object stream more than once, will it be stored in multiple copies? No, it will not. When an object is written for the first time, a serial number is created for it. The JVM writes the complete contents of the object along with the serial number into the object stream. After the first time, only the serial number is stored if the same object is written again. When the objects are read back, their references are the same since only one object is actually created in the memory.
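The shared-reference behavior described in this note can be verified by writing the same object twice and comparing what is read back. The following is a minimal sketch; the SharedReferenceDemo class and its method name are hypothetical, and the stream is kept in memory instead of in a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class SharedReferenceDemo {
    /** Writes one array object twice and reports whether both reads yield the same object. */
    static boolean sameObjectAfterRead() {
        int[] numbers = {1, 2, 3};
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(numbers); // full contents plus a serial number
                out.writeObject(numbers); // only a reference to the earlier copy
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Object first = in.readObject();
                Object second = in.readObject();
                return first == second; // compares references, not contents
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println("Same object after reading: " + sameObjectAfterRead());
    }
}
```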
An array is serializable if all its elements are serializable. An entire array can be saved into a file using writeObject and later can be restored using readObject. Listing 17.7 stores an array of five int values and an array of three strings and reads them back to display on the console.
LISTING 17.7 TestObjectStreamForArray.java
import java.io.*;
public class TestObjectStreamForArray {
  public static void main(String[] args) throws ClassNotFoundException, IOException {
    int[] numbers = {1, 2, 3, 4, 5};
    String[] strings = {"John", "Susan", "Kim"};
    try ( // Create an output stream for file array.dat (existing content is replaced)
      ObjectOutputStream output = new ObjectOutputStream(new FileOutputStream("array.dat"));
    ) {
      // Write arrays to the object output stream
      output.writeObject(numbers);
      output.writeObject(strings);
    }
    try ( // Create an input stream for file array.dat
      ObjectInputStream input = new ObjectInputStream(new FileInputStream("array.dat"));
    ) {
      int[] newNumbers = (int[])(input.readObject());
      String[] newStrings = (String[])(input.readObject());
      // Display arrays
      for (int i = 0; i < newNumbers.length; i++)
        System.out.print(newNumbers[i] + " ");
      System.out.println();
      for (int i = 0; i < newStrings.length; i++)
        System.out.print(newStrings[i] + " ");
    }
  }
}
1 2 3 4 5
John Susan Kim
Java provides the RandomAccessFile class to allow data to be read from and written to any location in a file.
All of the streams you have used so far are known as read-only or write-only streams. These streams are called sequential streams. A file that is opened using a sequential stream is called a sequential-access file. The contents of a sequential-access file cannot be updated.
A file that is opened using the RandomAccessFile class is known as a random-access file.
The RandomAccessFile class implements the DataInput and DataOutput interfaces, as shown in Figure 17.18. The DataInput interface (see Figure 17.9) defines the methods for reading primitive-type values and strings (e.g., readInt, readDouble, readChar, readBoolean, readUTF) and the DataOutput interface (see Figure 17.10) defines the methods for writing primitive-type values and strings (e.g., writeInt, writeDouble, writeChar, writeBoolean, writeUTF).
When creating a RandomAccessFile, you specify an access mode; the two most commonly used modes are r and rw. Mode r means that the stream is read-only, and mode rw indicates that the stream allows both read and write. For example, the following statement creates a new stream, raf, that allows the program to read from and write to the file test.dat:
RandomAccessFile raf = new RandomAccessFile("test.dat", "rw");
If test.dat already exists, raf is created to access it; if test.dat does not exist, a new file named test.dat is created, and raf is created to access the new file. The method raf.length() returns the number of bytes in test.dat at any given time. If you append new data into the file, raf.length() increases.
Tip
If the file is not intended to be modified, open it with the r mode. This prevents unintentional modification of the file.
A random-access file consists of a sequence of bytes. A special marker called a file pointer is positioned at one of these bytes. A read or write operation takes place at the location of the file pointer. When a file is opened, the file pointer is set at the beginning of the file. When you read or write data to the file, the file pointer moves forward to the next data item. For example, if you read an int value using readInt(), the JVM reads 4 bytes from the file pointer, and now the file pointer is 4 bytes ahead of the previous location, as shown in Figure 17.19.
For a RandomAccessFile raf, you can use the raf.seek(position) method to move the file pointer to a specified position. raf.seek(0) moves it to the beginning of the file, and raf.seek(raf.length()) moves it to the end of the file. Listing 17.8 demonstrates RandomAccessFile. A large case study of using RandomAccessFile to organize an address book is given in Supplement VI.D.
LISTING 17.8 TestRandomAccessFile.java
import java.io.*;
public class TestRandomAccessFile {
  public static void main(String[] args) throws IOException {
    try ( // Create a random access file
      RandomAccessFile inout = new RandomAccessFile("inout.dat", "rw");
    ) {
      // Clear the file to destroy the old contents if any exist
      inout.setLength(0);
      // Write new integers to the file
      for (int i = 0; i < 200; i++)
        inout.writeInt(i);
      // Display the current length of the file
      System.out.println("Current file length is " + inout.length());
      // Retrieve the first number
      inout.seek(0); // Move the file pointer to the beginning
      System.out.println("The first number is " + inout.readInt());
      // Retrieve the second number
      inout.seek(1 * 4); // Move the file pointer to the second number
      System.out.println("The second number is " + inout.readInt());
      // Retrieve the tenth number
      inout.seek(9 * 4); // Move the file pointer to the tenth number
      System.out.println("The tenth number is " + inout.readInt());
      // Modify the eleventh number (the file pointer is already there)
      inout.writeInt(555);
      // Append a new number
      inout.seek(inout.length()); // Move the file pointer to the end
      inout.writeInt(999);
      // Display the new length
      System.out.println("The new length is " + inout.length());
      // Retrieve the new eleventh number
      inout.seek(10 * 4); // Move the file pointer to the eleventh number
      System.out.println("The eleventh number is " + inout.readInt());
    }
  }
}
Current file length is 800
The first number is 0
The second number is 1
The tenth number is 9
The new length is 804
The eleventh number is 555
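The seek arithmetic in Listing 17.8 (position = index * 4 for int values) can be wrapped in small helper methods so that the file behaves like an array of fixed-size records. This is a sketch under assumptions: the IntRecordFile class, its readAt, writeAt, and demo methods, and the scratch file are hypothetical and not part of the original listing.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class IntRecordFile {
    static final int INT_SIZE = 4; // an int occupies 4 bytes in the file

    /** Reads the int stored at the given zero-based index. */
    static int readAt(RandomAccessFile raf, int index) throws IOException {
        raf.seek((long) index * INT_SIZE); // move the file pointer to the record
        return raf.readInt();
    }

    /** Overwrites the int stored at the given zero-based index. */
    static void writeAt(RandomAccessFile raf, int index, int value) throws IOException {
        raf.seek((long) index * INT_SIZE);
        raf.writeInt(value);
    }

    /** Writes 0..9 to a scratch file, replaces the value at index 5, and reads it back. */
    static int demo() {
        try {
            File file = File.createTempFile("inout", ".dat"); // scratch file
            file.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                for (int i = 0; i < 10; i++)
                    raf.writeInt(i);
                writeAt(raf, 5, 555);
                return readAt(raf, 5);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println("Value at index 5 is " + demo());
    }
}
```

Fixed-size records are what make this arithmetic possible; variable-length data such as strings written with writeUTF would need an index or padding to be located the same way.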
1. I/O can be classified into text I/O and binary I/O. Text I/O interprets data in sequences of characters. Binary I/O interprets data as raw binary values. How text is stored in a file depends on the encoding scheme for the file. Java automatically performs encoding and decoding for text I/O.
2. The InputStream and OutputStream classes are the roots of all binary I/O classes. FileInputStream/FileOutputStream associates a file for input/output. BufferedInputStream/BufferedOutputStream can be used to wrap any binary I/O stream to improve performance. DataInputStream/DataOutputStream can be used to read/write primitive values and strings.
3. ObjectInputStream/ObjectOutputStream can be used to read/write objects in addition to primitive values and strings. To enable object serialization, the object’s defining class must implement the java.io.Serializable marker interface.
4. The RandomAccessFile class enables you to read data from and write data to any location in a file. You can open a file with the r mode to indicate that it is read-only or with the rw mode to indicate that it is updateable. Since the RandomAccessFile class implements DataInput and DataOutput interfaces, many methods in RandomAccessFile are the same as those in DataInputStream and DataOutputStream.