NIO: High Performance File Copying

In a previous tip, I discussed a simple file copy algorithm in the context of finding the best way to move a directory of files (see IO: Moving a Directory). The algorithm I posted went something like this:

public static void copyFile(File source, File dest) throws IOException {
    if (!dest.exists()) {
        dest.createNewFile();
    }

    InputStream in = null;
    OutputStream out = null;
    try {
        in = new FileInputStream(source);
        out = new FileOutputStream(dest);

        // Transfer bytes from in to out, one buffer at a time
        byte[] buf = new byte[1024];
        int len;
        while ((len = in.read(buf)) > 0) {
            out.write(buf, 0, len);
        }
    }
    finally {
        if (in != null) {
            in.close();
        }
        if (out != null) {
            out.close();
        }
    }
}

One thing to note about this algorithm is the verbosity and explicitness of the code. It explicitly allocates a byte[] buffer, sets its size to 1 kilobyte, and then copies the file a kilobyte at a time. The first potential problem with this is that the optimal amount of buffering isn't necessarily 1 kilobyte. Beyond that, for this code to work, Java IO must read data from the file system, bring it up into JVM memory, and then push it back down to the file system through Java IO.
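
The buffer size is the obvious knob to turn here. As a minimal sketch, the same loop with the size pulled out for tuning might look like this (the 64 KB figure is an arbitrary illustrative value on my part, not a measured optimum for any platform):

    // Same copy loop with a tunable buffer; 64 KB is an arbitrary
    // illustrative size, not a measured optimum
    byte[] buf = new byte[64 * 1024];
    int len;
    while ((len = in.read(buf)) > 0) {
        out.write(buf, 0, len);
    }

Even with a well-chosen buffer size, though, every byte still makes the round trip through JVM memory.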

We all remember when Java 1.4 came out that it brought the java.nio package with it - most of us also remember that that was pretty much it: afterwards there wasn't a whole lot of noise regarding 'NIO'. It's really a shame - Java NIO has the potential to really improve performance in a lot of areas, and file copying is just one of them. Here is the basic file-to-file copy algorithm re-implemented using NIO:

public static void copyFile(File sourceFile, File destFile) throws IOException {
    if (!destFile.exists()) {
        destFile.createNewFile();
    }

    FileChannel source = null;
    FileChannel destination = null;
    try {
        source = new FileInputStream(sourceFile).getChannel();
        destination = new FileOutputStream(destFile).getChannel();

        // Hand the whole copy over to the channel implementation
        destination.transferFrom(source, 0, source.size());
    }
    finally {
        if (source != null) {
            source.close();
        }
        if (destination != null) {
            destination.close();
        }
    }
}



The first thing you'll notice about this implementation is the difference in the core copying logic:

    byte[] buf = new byte[1024];
    int len;
    while ((len = in.read(buf)) > 0) {
        out.write(buf, 0, len);
    }



... becomes:

 destination.transferFrom(source, 0, source.size());



Note that there is no reference to the buffering used or to the implementation of the actual copy algorithm. This is key to the potential performance advantage of this approach. The 'transferFrom' method can be optimized to a level most of us wouldn't want to attempt by hand. For one thing, given the design of the interacting objects, chances are good that on most platforms the copy request can be deferred directly to the underlying operating system, and in most cases the OS will be faster at copying files than Java. Even when it can't defer directly to the OS, transferFrom is tied to the underlying channel implementation, so it can be optimized for the platform, context, and channel type, use native method calls, and do many other fancy things. Long story short, transferFrom can be optimized and optimized and optimized (and is).
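
One caveat worth flagging, which isn't in the original tip: transferFrom is documented to return the number of bytes actually transferred, which may be fewer than requested on some platforms. A more defensive sketch of that one-liner loops until the whole file has moved:

    // Sketch: loop because a single transferFrom call may move
    // fewer bytes than requested on some platforms
    long count = source.size();
    long position = 0;
    while (position < count) {
        position += destination.transferFrom(source, position, count - position);
    }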

Just to verify, I filled a folder with a ton of small and large files and ran both versions to see what would happen. On average, there was about a 33% improvement in performance (yes, a full third!) over the rather simple copy algorithm above. Not too shabby!
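
If you want to try a comparison like this yourself, a rough sketch might look like the following. Here streamCopy and nioCopy are hypothetical names standing in for the two copyFile implementations above, and the file names are made up; this is a quick illustration, not a rigorous benchmark:

    // Rough timing sketch, not a rigorous benchmark. streamCopy and
    // nioCopy stand in for the two copyFile implementations above;
    // the file names are hypothetical.
    File src = new File("bigfile.dat");

    long start = System.currentTimeMillis();
    streamCopy(src, new File("copy-stream.dat"));
    System.out.println("stream copy: " + (System.currentTimeMillis() - start) + " ms");

    start = System.currentTimeMillis();
    nioCopy(src, new File("copy-nio.dat"));
    System.out.println("NIO copy:    " + (System.currentTimeMillis() - start) + " ms");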


http://www.javalobby.org/java/forums/t17036.html
