Java-10 I/O Streams

"I/O stream" is short for Input/Output Stream.

The commonly used I/O stream types all live in the java.io package.

Type	Input stream	Output stream
Byte streams	`InputStream`	`OutputStream`
Character streams	`Reader`	`Writer`
Buffered streams	`BufferedInputStream`, `BufferedReader`	`BufferedOutputStream`, `BufferedWriter`
Data streams	`DataInputStream`	`DataOutputStream`
Object streams	`ObjectInputStream`	`ObjectOutputStream`

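Character Sets

In computing, a Chinese character is one character, an English letter is one character, and each Arabic numeral or punctuation mark is also one character.

Character set (charset for short): a collection of characters.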

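Common character sets include:

ASCII: 128 characters, including upper- and lowercase English letters, Arabic numerals, and so on
ISO-8859-1: supports some European languages; also called Latin-1 in some environments
GB2312: supports Chinese (6,763 Chinese characters)
BIG5: supports Traditional Chinese (13,053 Chinese characters)
GBK: extends GB2312 and also covers the characters in BIG5 (21,003 Chinese characters); supports Chinese, Japanese, and Korean characters
GB18030: an extension of GBK (27,484 Chinese characters)
Unicode: aims to cover every character in the world's writing systems

Every character set listed after ASCII contains all of the ASCII characters.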

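Character Encoding

Every character set has one or more character encodings, which determine how each character is converted to binary for storage in the computer.

ASCII: single-byte encoding, range 0x00 to 0x7F (0 to 127)
- Every character is converted to one byte, so every character takes up one byte
ISO-8859-1: single-byte encoding, range 0x00 to 0xFF
- 0x00 to 0x7F match ASCII, 0x80 to 0x9F are control characters, 0xA0 to 0xFF are letters and symbols
GB2312, BIG5, GBK: use two bytes for each Chinese character
GB18030: uses one, two, or four bytes per character, depending on the character
Unicode: has several encodings, including UTF-8, UTF-16, and UTF-32; UTF-8 is the most commonly used
- UTF-8 uses one, two, three, or four bytes per character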
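
The variable length of UTF-8 can be checked directly. The following is a minimal sketch; the class name Utf8Lengths and the helper utf8Length are made up for illustration, and the sample characters are written as Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
  // Hypothetical helper: number of bytes UTF-8 needs for the given text.
  static int utf8Length(String text) {
    return text.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    System.out.println(utf8Length("a"));            // 1 (U+0061, ASCII letter)
    System.out.println(utf8Length("\u00E9"));       // 2 (U+00E9, accented Latin letter)
    System.out.println(utf8Length("\u6728"));       // 3 (U+6728, Chinese character)
    System.out.println(utf8Length("\uD83D\uDE00")); // 4 (U+1F600, outside the Basic Multilingual Plane)
  }
}
```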

  public static void main(String[] args) throws Exception {
    String name = "lwy木子李";
    System.out.println(Arrays.toString(name.getBytes("ASCII")));
    System.out.println(Arrays.toString(name.getBytes("ISO-8859-1")));
    System.out.println(Arrays.toString(name.getBytes("GB2312")));
    System.out.println(Arrays.toString(name.getBytes("BIG5")));
    byte[] bytes = name.getBytes("GBK");
    System.out.println(Arrays.toString(bytes));
    System.out.println(Arrays.toString(name.getBytes("GB18030")));
    System.out.println(Arrays.toString(name.getBytes("UTF-8")));
  }

Output:

[108, 119, 121, 63, 63, 63] // ASCII
[108, 119, 121, 63, 63, 63] // ISO-8859-1
[108, 119, 121, -60, -66, -41, -45, -64, -18] // GB2312
[108, 119, 121, -92, -20, -92, 108, -89, -11] // BIG5
[108, 119, 121, -60, -66, -41, -45, -64, -18] // GBK
[108, 119, 121, -60, -66, -41, -45, -64, -18] // GB18030
[108, 119, 121, -26, -100, -88, -27, -83, -112, -26, -99, -114] // UTF-8

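As you can see, the first three characters are encoded identically in every case.
Characters in the ASCII character set produce the same result under every encoding shown here, and each takes one byte. Characters that ASCII cannot represent (such as Chinese characters) are encoded as 63, the code for `?`. ISO-8859-1 behaves the same way and also turns characters it cannot represent into 63.

With GBK, each character of lwy takes one byte: 108, 119, 121. Each Chinese character after that takes two bytes, so "木" is -60, -66 and "子" is -41, -45.

GB18030 is an extension of GBK, so characters the two share are encoded identically.

UTF-8 uses three bytes for each of these Chinese characters: in the example above, -26, -100, -88 is "木" and -27, -83, -112 is "子".

When this data is finally stored in a file, an encoding has to be chosen, and what ends up in the file is the encoded data (the numbers above, stored in binary form).

UTF-8 is the most widely used encoding today.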
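
To see that the chosen encoding decides what lands in the file, the sketch below writes the same text to a temporary file with two charsets and compares the file sizes. The class name EncodedFileSize and the helper writeAndMeasure are made up for illustration, and the sketch assumes the JDK in use ships the GBK charset.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class EncodedFileSize {
  // Hypothetical helper: writes text to a temporary file with the given
  // charset and returns the resulting file size in bytes.
  static long writeAndMeasure(String text, Charset charset) throws IOException {
    Path file = Files.createTempFile("encoding-demo", ".txt");
    try {
      Files.write(file, text.getBytes(charset));
      return Files.size(file);
    } finally {
      Files.deleteIfExists(file);
    }
  }

  public static void main(String[] args) throws IOException {
    // Same text as the example above, written with Unicode escapes
    String name = "lwy\u6728\u5B50\u674E";
    System.out.println(writeAndMeasure(name, StandardCharsets.UTF_8));  // 12 = 3 + 3 * 3
    System.out.println(writeAndMeasure(name, Charset.forName("GBK")));  // 9 = 3 + 3 * 2
  }
}
```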

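When String.getBytes is called without an argument, it uses the JVM's default charset. That default comes from the file.encoding system property or the platform locale (since JDK 18 it is UTF-8 unless overridden). IDEs often set it to match the project's file encoding, which is why it tends to follow the encoding of the file that contains the main method.

   // Uses the default charset
   name.getBytes();

You can click through to the source (the exact code varies between JDK versions):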

    public byte[] getBytes() {
        return StringCoding.encode(coder(), value);
    }
    // StringCoding.encode
    static byte[] encode(byte coder, byte[] val) {
        Charset cs = Charset.defaultCharset();
        if (cs == UTF_8) {
            return encodeUTF8(coder, val, true);
        }
        if (cs == ISO_8859_1) {
            return encode8859_1(coder, val);
        }
        if (cs == US_ASCII) {
            return encodeASCII(coder, val);
        }
        StringEncoder se = deref(encoder);
        if (se == null || !cs.name().equals(se.cs.name())) {
            se = new StringEncoder(cs, cs.name());
            set(encoder, se);
        }
        return se.encode(coder, val);
    }

You can check the default charset with the Charset.defaultCharset method.
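
A minimal sketch of that check follows. The class name DefaultCharsetDemo and the helper matchesDefault are made up for illustration; the printed charset name depends on the JDK version and how the JVM was started, so no expected value is shown for it.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class DefaultCharsetDemo {
  // Hypothetical helper: true if getBytes() without arguments produces the
  // same bytes as getBytes(Charset.defaultCharset()).
  static boolean matchesDefault(String text) {
    return Arrays.equals(text.getBytes(), text.getBytes(Charset.defaultCharset()));
  }

  public static void main(String[] args) {
    // The name printed here varies by environment
    System.out.println(Charset.defaultCharset().name());
    System.out.println(matchesDefault("lwy")); // true
  }
}
```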

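Garbled Text

Converting a string to binary is called encoding (Encode). Converting binary back to a string is called decoding (Decode).

Encoding and decoding must use the same character encoding; otherwise the result is garbled text.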

  public static void main(String[] args) throws Exception {
    String name = "lwy木子李";
    byte[] bytes = name.getBytes("UTF-8");
    String decodeResult = new String(bytes, "GBK");
    System.out.println(decodeResult); 
    // Output:
    // lwy鏈ㄥ瓙鏉�
  }
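
For contrast, here is a small sketch that compares a matching encode/decode pair with a mismatched one. The class name RoundTrip and its helper are made up for illustration, and ISO-8859-1 is used as the mismatched decoder only because it is available through StandardCharsets on every JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTrip {
  // Hypothetical helper: encodes text with one charset and decodes the bytes with another.
  static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
    return new String(text.getBytes(encodeWith), decodeWith);
  }

  public static void main(String[] args) {
    String name = "lwy\u6728\u5B50\u674E";
    // Same charset on both sides: the original text comes back
    System.out.println(name.equals(
        roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8))); // true
    // Different charsets: the Chinese characters are garbled
    System.out.println(name.equals(
        roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1))); // false
  }
}
```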

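Byte Streams

Characteristics of byte streams:

They read or write one byte at a time
They all ultimately inherit from InputStream or OutputStream

The most commonly used byte streams are FileInputStream and FileOutputStream.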

  public static void main(String[] args) throws Exception {
    // Creates the file if it does not exist
    // FileOutputStream fos = new FileOutputStream("/Users/apple/Desktop/1.txt");
    // Passing true appends the written data instead of overwriting the file
    FileOutputStream fos = new FileOutputStream("/Users/apple/Desktop/1.txt", true);
    // Write the bytes for "lwy" to the file
    fos.write(108);
    fos.write(119);
    fos.write(121);
    fos.close();
  }

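  public static void main(String[] args) throws Exception {
    FileInputStream fis = new FileInputStream("/Users/apple/Desktop/1.txt");
    // read() reads one byte per call
    int byte1 = fis.read(); // l
    int byte2 = fis.read(); // w
    int byte3 = fis.read(); // y
    System.out.println(byte1);
    System.out.println(byte2);
    System.out.println(byte3);
    // 108
    // 119
    // 121

    byte[] bytes = new byte[100];
    // The returned len is the number of valid bytes: the file may hold fewer or more than 100 bytes,
    // and len is -1 if the end of the file has already been reached
    int len = fis.read(bytes);
    if (len != -1) {
      // Decode only the bytes that were actually read
      String str = new String(bytes, 0, len);
      System.out.println(str + "_");
    }

    fis.close();
  }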
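
When the data may be longer than the array, read(byte[]) is usually called in a loop until it returns -1. The sketch below shows that pattern; the class name CopyLoop and the helper copy are made up for illustration, and in-memory streams stand in for FileInputStream and FileOutputStream so the example runs anywhere.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyLoop {
  // Hypothetical helper: copies everything from in to out and returns the byte count.
  static long copy(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[100];
    long total = 0;
    int len;
    // read(byte[]) returns -1 once the end of the stream is reached
    while ((len = in.read(buffer)) != -1) {
      // Only the first len bytes of the buffer are valid for this round
      out.write(buffer, 0, len);
      total += len;
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[250]; // longer than the buffer, so several reads are needed
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    System.out.println(copy(new ByteArrayInputStream(data), out)); // 250
    System.out.println(out.size());                                // 250
  }
}
```

The same loop works unchanged when the arguments are a FileInputStream and a FileOutputStream.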

A simple wrapper:

  public static void main(String[] args) throws Exception {
    write("123456".getBytes(), new File("/Users/apple/Desktop/1.txt"));
  }

  public static void write(byte[] bytes, File file) {
    if (bytes == null || file == null)
      return;
    // The file is opened in overwrite mode, so return early if it already exists
    if (file.exists())
      return;
    FileOutputStream fos = null;
    try {
      fos = new FileOutputStream(file);
      fos.write(bytes);
    } catch (FileNotFoundException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    } finally {
      try {
        // fos is still null if the constructor threw
        if (fos != null)
          fos.close();
      } catch (Exception e) {
        e.printStackTrace();
      }
    }
  }

try-with-resources

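Java 7 introduced the try-with-resources statement, which can be used without catch or finally clauses.

    try (resource1; resource2; ...) {
      
    } catch (Exception e) {
    
    }

One or more resources can be declared in the parentheses after try.

Any instance that implements the java.lang.AutoCloseable interface can be called a resource.

Whether the statements in the try block complete normally or abruptly, the close method of every resource is called automatically, in the reverse of the order in which the resources were declared.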
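
The close order can be observed with a small sketch. The class CloseOrder and its nested Res resource are made up for illustration; Res only records when close is called.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrder {
  // Hypothetical resource that logs its own close() call.
  static class Res implements AutoCloseable {
    private final String name;
    private final List<String> log;

    Res(String name, List<String> log) {
      this.name = name;
      this.log = log;
    }

    @Override
    public void close() {
      log.add("close " + name);
    }
  }

  // Declares two resources and returns the order of events.
  static List<String> run() {
    List<String> log = new ArrayList<>();
    try (Res a = new Res("A", log); Res b = new Res("B", log)) {
      log.add("body");
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run()); // [body, close B, close A]
  }
}
```

Applied to the wrapper from the byte stream section, declaring the stream as `try (FileOutputStream fos = new FileOutputStream(file))` would make the finally block and the manual close call unnecessary.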

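Character Streams

Characteristics of character streams:

They read or write one character at a time
They all ultimately inherit from Reader or Writer

The most commonly used character streams are FileReader and FileWriter.

These two classes are only suitable for text files, such as .txt and .java files.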

  public static void main(String[] args) throws Exception {
    String path = "/Users/apple/Desktop/1.txt";
    File file = new File(path);
    FileReader reader = new FileReader(file);
    // Read one character
    reader.read();
    // Read the next character
    reader.read();
    reader.close();
  }

Because each read returns a character, the number of bytes consumed per read varies and is not always one byte.
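
The difference between byte count and character count can be sketched as follows. The class name CharCount and the helper countChars are made up for illustration; an InputStreamReader over an in-memory stream stands in for FileReader (which is itself a subclass of InputStreamReader) so the charset is explicit and no file is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class CharCount {
  // Hypothetical helper: decodes UTF-8 bytes with a Reader and counts
  // how many read() calls return a character before the end is reached.
  static int countChars(byte[] utf8Bytes) throws IOException {
    int count = 0;
    try (Reader reader = new InputStreamReader(
        new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
      while (reader.read() != -1) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    byte[] bytes = "lwy\u6728\u5B50\u674E".getBytes(StandardCharsets.UTF_8);
    System.out.println(bytes.length);      // 12 bytes
    System.out.println(countChars(bytes)); // 6 characters
  }
}
```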

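Buffered Streams

The byte streams and character streams above are unbuffered I/O streams: every read or write is handled directly by the underlying operating system. Each of these operations typically triggers disk access, so a large number of reads and writes can make a program much less efficient.

To reduce this overhead, Java implements buffered I/O streams.

Buffered input streams: read data from a buffer, and call the native input API only when the buffer is empty
Buffered output streams: write data to a buffer, and call the native output API only when the buffer is full

The native API is the low-level system API, which is what accesses the disk.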

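	Buffered input stream	Buffered output stream
Buffered byte streams	`BufferedInputStream`	`BufferedOutputStream`
Buffered character streams	`BufferedReader`	`BufferedWriter`

The four buffered streams in the table above have a default buffer size of 8192 (8 KB for the byte streams, 8192 chars for the character streams). A different size can be passed to the constructor.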

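Using Buffered Streams

The usual way to use a buffered stream is to pass an unbuffered stream to the buffered stream's constructor (wrapping the unbuffered stream in a buffered one).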

  public static void main(String[] args) throws Exception {
    String path = "/Users/apple/Desktop/1.txt";
    File file = new File(path);
    InputStream is = new FileInputStream(file);
    BufferedInputStream bis = new BufferedInputStream(is, 16384);
    // Closing the buffered stream is enough; it closes the wrapped unbuffered stream internally
    bis.close();
  }
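
The effect of the buffer can be made visible by counting how often the wrapped stream is actually asked for data. This is only a sketch: CountingInputStream is a made-up helper, not a JDK class, and an in-memory stream stands in for a FileInputStream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedReadCount {
  // Hypothetical wrapper that counts the read calls reaching the wrapped stream.
  static class CountingInputStream extends FilterInputStream {
    int calls = 0;

    CountingInputStream(InputStream in) {
      super(in);
    }

    @Override
    public int read() throws IOException {
      calls++;
      return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      calls++;
      return super.read(b, off, len);
    }
  }

  // Reads the data one byte at a time and returns how many calls reached the source.
  static int underlyingCalls(byte[] data, boolean buffered) throws IOException {
    CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(data));
    try (InputStream in = buffered ? new BufferedInputStream(counting) : counting) {
      while (in.read() != -1) {
        // discard the byte; only the number of calls matters here
      }
    }
    return counting.calls;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[20000];
    // Unbuffered: one call per byte plus the final call that reports the end
    System.out.println(underlyingCalls(data, false));
    // Buffered: only a handful of bulk reads reach the source
    System.out.println(underlyingCalls(data, true));
  }
}
```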

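Buffered Streams - close, flush

Calling flush on a buffered output stream forces a call to the native output API, which actually writes the buffered data to the file.

The close method of a buffered output stream flushes the buffer internally.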

  public static void main(String[] args) throws Exception {
    String path = "/Users/apple/Desktop/1.txt";
    BufferedWriter writer = new BufferedWriter(new FileWriter(path));
    writer.write("123456");
    writer.close();
  }

  // Source of BufferedWriter.close
    @SuppressWarnings("try")
    public void close() throws IOException {
        synchronized (lock) {
            if (out == null) {
                return;
            }
            try (Writer w = out) {
                flushBuffer();
            } finally {
                out = null;
                cb = null;
            }
        }
    }
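
The buffering can be observed by checking what the wrapped writer has received before and after flush. In this sketch a StringWriter stands in for the FileWriter so the result can be inspected without touching the disk; the class name FlushDemo and its helper are made up for illustration, and the "empty before flush" result assumes the text is smaller than the buffer.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;

public class FlushDemo {
  // Hypothetical helper: returns what the wrapped writer holds
  // before flush (index 0) and after flush (index 1).
  static String[] beforeAndAfterFlush(String text) throws IOException {
    StringWriter target = new StringWriter(); // stand-in for a FileWriter
    BufferedWriter writer = new BufferedWriter(target);
    writer.write(text);
    String before = target.toString(); // the text is still in the buffer
    writer.flush();
    String after = target.toString();  // the text has now been passed on
    writer.close();
    return new String[] { before, after };
  }

  public static void main(String[] args) throws IOException {
    String[] result = beforeAndAfterFlush("123456");
    System.out.println("before flush: [" + result[0] + "]"); // before flush: []
    System.out.println("after flush: [" + result[1] + "]");  // after flush: [123456]
  }
}
```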

Data Streams

There are two data streams, DataInputStream and DataOutputStream, which support I/O of primitive types and strings.

  public static void main(String[] args) throws Exception {
    int age = 18;
    int money = 1000;
    double height = 1.70;
    String name = "Jack";

    String path = "/Users/apple/Desktop/1.txt";
    DataOutputStream dos = new DataOutputStream(new FileOutputStream(path));
    dos.writeInt(age);
    dos.writeInt(money);
    dos.writeDouble(height);
    dos.writeUTF(name);
    dos.close();

    // Unarchive the values by reading them back in the order they were written
    DataInputStream dis = new DataInputStream(new FileInputStream(path));
    System.out.println(dis.readInt());
    System.out.println(dis.readInt());
    System.out.println(dis.readDouble());
    System.out.println(dis.readUTF());
    dis.close();
  }
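
Data streams write a fixed binary layout, which is why the values must be read back in the same order and with the same types. The sketch below writes the same four values into memory and checks the total size; the class name DataLayout and the helper encode are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class DataLayout {
  // Hypothetical helper: writes the four values and returns the raw bytes.
  static byte[] encode(int age, int money, double height, String name) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream dos = new DataOutputStream(bytes)) {
      dos.writeInt(age);       // 4 bytes
      dos.writeInt(money);     // 4 bytes
      dos.writeDouble(height); // 8 bytes
      dos.writeUTF(name);      // 2-byte length prefix + encoded characters
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] data = encode(18, 1000, 1.70, "Jack");
    System.out.println(data.length); // 22 = 4 + 4 + 8 + (2 + 4)

    try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
      System.out.println(dis.readInt()); // 18
      System.out.println(dis.readInt()); // 1000
      dis.readDouble();                  // skip the height
      System.out.println(dis.readUTF()); // Jack
    }
  }
}
```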

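Object Streams

There are two object streams, ObjectInputStream and ObjectOutputStream, which support I/O of reference types.

Only classes that implement the java.io.Serializable interface can be used with object streams.

Serializable is a marker interface (Marker Interface); it does not require implementing any methods.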

public class Person implements Serializable {
  private int age;
  private String name;
  private double height;

  public Person(String name, int age, double height){
    this.name = name;
    this.age = age;
    this.height = height;
  }
  @Override
  public String toString() {
    return "Person [age=" + age + ", name=" + name + ", height=" + height + "]";
  }
}

  public static void main(String[] args) throws Exception {
    String path = "/Users/apple/Desktop/p.txt";

    // Person p = new Person("Jack", 20, 1.77);
    // ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(path));
    // oos.writeObject(p);
    // oos.close();

    ObjectInputStream ois = new ObjectInputStream(new FileInputStream(path));
    Person p = (Person) ois.readObject();
    System.out.println(p); // Person [age=20, name=Jack, height=1.77]
    ois.close();
  }

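Object Serialization and Deserialization

Serialization: converting an object into data that can be stored or transmitted. ObjectOutputStream can be used to serialize objects.

Deserialization: restoring an object from serialized data. ObjectInputStream can be used to deserialize objects.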

transient

Instance variables marked transient are not serialized.

  // age is not serialized
  private transient int age;
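
A minimal sketch of the effect: after a round trip through the object streams, a transient field comes back with its default value. The class TransientDemo, its nested User class, and the roundTrip helper are made up for illustration, and the object is serialized into memory rather than a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {
  // Hypothetical class with one transient and one regular field.
  static class User implements Serializable {
    private static final long serialVersionUID = 1L;
    transient int age; // skipped during serialization
    String name;

    User(String name, int age) {
      this.name = name;
      this.age = age;
    }
  }

  // Serializes the object into memory and reads it back as a new object.
  static User roundTrip(User user) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
      oos.writeObject(user);
    }
    try (ObjectInputStream ois =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (User) ois.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    User copy = roundTrip(new User("Jack", 20));
    System.out.println(copy.name); // Jack
    System.out.println(copy.age);  // 0, the default value for int
  }
}
```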

serialVersionUID

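Every serializable class has a serialVersionUID, which acts as the class's version number. By default its value is computed from the details of the class, and the result may differ between compiler implementations.

It is recommended that every class define its own serialVersionUID.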

  private static final long serialVersionUID = 1L;
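
The value the serialization mechanism uses for a class can be inspected with ObjectStreamClass. The sketch below is illustrative only: the classes Explicit and Implicit and the helper uidOf are made up, and the computed value for Implicit is not shown because it depends on the class details.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class VersionUidDemo {
  // Hypothetical class that declares its own version number.
  static class Explicit implements Serializable {
    private static final long serialVersionUID = 1L;
    int age;
  }

  // Hypothetical class that relies on the computed default.
  static class Implicit implements Serializable {
    int age;
  }

  // Returns the serialVersionUID that serialization uses for the given class.
  static long uidOf(Class<?> type) {
    return ObjectStreamClass.lookup(type).getSerialVersionUID();
  }

  public static void main(String[] args) {
    System.out.println(uidOf(Explicit.class)); // 1
    // Computed from the class details; changes to the fields or method
    // signatures can change this value
    System.out.println(uidOf(Implicit.class));
  }
}
```

Declaring the value explicitly keeps previously written data readable after compatible changes to the class, since a mismatch between the stored and current serialVersionUID causes deserialization to fail with an InvalidClassException.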