Let's imagine a situation where we want to write a pure Java application that must download files from a remote computer running an FTP server. We also want to filter downloads on the basis of remote file information like name, date, or size.
Although it is possible, and maybe fun, to write a protocol handler for FTP from scratch, doing so is also hard, time-consuming, and potentially risky. Since we'd rather not spend the time, effort, or money writing a handler on our own, we prefer to reuse an existing software component, and plenty of libraries are available on the World Wide Web. With an FTP client library, downloading a file can be written in Java as simply as:
FTPClient ftpClient = new FTPClient();
ftpClient.connect("ftp.foo.com", "user01", "pass1234");
ftpClient.download("C:\\Temp\\", "README.txt");
// Eventually other operations here ...
ftpClient.disconnect();
Looking for a quality Java FTP client library that matches our needs is not as simple as it seems; it can be quite painful. It takes some time to find a Java FTP client library. Then, after we find all the existing libraries, which one do we select? Each library addresses different needs. The libraries are unequal in quality, and their designs differ fundamentally. Each offers a different set of features and uses different types of jargon to describe them.
Thus, evaluating and comparing FTP client libraries can prove difficult and confusing. Reusing existing components is a commendable process, but in this case, starting out can be discouraging. And this is a shame: after choosing a good FTP library, the rest is routine.
This article aims to make that selection process short, easy, and worthwhile. I first list all available FTP client libraries. Then I define and describe a list of relevant criteria that the libraries should address in some way. Finally, I present an overview matrix that gives a quick view of how the libraries stack up against each other. All this information provides everything we need to make a fast, reliable, and long-lasting decision.
The reference specification for FTP is Request for Comments: 959 (RFC959). Sun Microsystems provides an RFC959 implementation in the JDK, but it is internal, undocumented, and no source is provided. While RFC959 lies in the shadows, it is actually the back end of a public interface implementing RFC1738, the URL specification, as illustrated in Figure 1.
Figure 1. FTP support in JDK.
An implementation of RFC1738 is offered as standard in the JDK. It does a reasonable job for basic FTP transfer operations. It is public and documented, and source code is provided. To use it, we write the following:
URL url = new URL("ftp://user01:pass1234@ftp.foo.com/README.txt;type=i");
URLConnection urlc = url.openConnection();
InputStream is = urlc.getInputStream(); // To download
OutputStream os = urlc.getOutputStream(); // To upload
FTP client support in JDK strictly follows the standard recommendation, but it has several downsides:
The URL and URLConnection classes only open streams for communication. The Sun library offers no direct support for structuring the raw FTP server responses into more usable Java objects like String, File, RemoteFile, or Calendar. So we have to write more code just to write data into a file or to exploit a directory listing.
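As a rough illustration of that extra code, the sketch below copies the stream returned by URLConnection into a local file. The class name UrlFtpDownload, the helper methods, and the buffer size are assumptions made for this example; only URL, URLConnection, and the stream classes come from the JDK.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper class, not part of the JDK FTP support.
class UrlFtpDownload {

    // Copies every byte from in to out and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Downloads the resource behind an ftp:// URL into a local file.
    static long download(String ftpUrl, String localPath) throws IOException {
        URLConnection connection = new URL(ftpUrl).openConnection();
        try (InputStream in = connection.getInputStream();
             OutputStream out = new FileOutputStream(localPath)) {
            return copy(in, out);
        }
    }
}
```

Even this small helper has to be written and tested by hand, and it still says nothing about remote file names, sizes, or dates.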
For all or any of these reasons, using a third-party library is preferable. The following section lists the available third-party alternatives.
The list below outlines the libraries I compare throughout this article. They all follow the reference FTP specification. Below, I mention the provider name and the library name (in italics). Resources includes links to each product Website. To jumpstart library use, I also mention the main FTP client class.
com.jscape.inet.ftp.Ftp
ipworks.Ftp
com.enterprisedt.net.ftp.FTPClient
com.ibm.network.ftp.protocol.FTPProtocol
net.sf.jftp.net.FtpConnection
org.apache.commons.net.ftp.FTPClient
jshop.jnet.FTPClient
sun.net.ftp.FtpClient
com.cqs.ftp.FTP
cz.dhl.ftp.Ftp
org.globus.io.ftp.FTPClient
Notes:
So far, I have introduced the context and listed the available libraries. Now, I list the relevant criteria against which each library will be evaluated. I enumerate possible values for each criterion, along with the abbreviation (in bold) used in the final comparison matrix.
The libraries provide support to users through product documentation, compiled Javadocs, sample code, and an example application that can include comments and explanations. Additional support can be offered to users through forums, mailing lists, a contact email address, or an online bug tracking system. /n software offers extensive support for an additional fee.
A support administrator's motivation is an important factor for fast support. Support administrators can be:
For commercial projects, a product license is an important matter to consider from the beginning. Some libraries can be freely redistributed in commercial products and others cannot. For instance, GPL (GNU General Public License) is a strong, limiting license, while the Apache Software license only requires a mention in redistributed products.
Commercial licenses limit the number of development workstations programming with the library, but distribution of the library itself is not restricted.
For noncommercial projects, the license is more a matter of philosophy; a free product is welcome.
Licenses can be:
Some library providers provide alternate, less-restrictive licenses on demand.
A closed-source, black-box software library can be irritating. Having source code can be more comfortable for the following reasons:
Libraries have been tested, debugged, and supported since their first public release. As version numbering varies among libraries, I base this criterion on the year of the earliest public release.
Retrieving remote file information (name, size, date) from the server is important in most applications. The FTP protocol offers the NLST command to retrieve the file names only; the NLST command is explicitly designed to be exploited by programs. The LIST command offers more file information; as RFC959 notes, "Since the information on a file may vary widely from system to system, this information may be hard to use automatically in a program, but may be quite useful to a human user." No other standard method retrieves file information; therefore, client libraries try to exploit the LIST response. But this is not an easy task: since no authoritative recommendation is available for the LIST response format, FTP servers have adopted various formats:
drwxr-xr-x 1 user01 ftp 512 Jan 29 23:32 prog
drwxr-xr-x 1 user01 ftp 512 Jan 29 1997 prog
drwxr-xr-x 1 1 1 512 Jan 29 23:32 prog
lrwxr-xr-x 1 user01 ftp 512 Jan 29 23:32 prog -> prog2000
drwxr-xr-x 1 usernameftp 512 Jan 29 23:32 prog
01-29-97 11:32PM <DIR> prog
drwxr-xr-x folder 0 Jan 29 23:32 prog
0 DIR 01-29-97 23:32 PROG
Unix style, then MS-DOS style, are the most widespread formats.
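To give a feel for what parsing the most common layout involves, here is a minimal sketch that handles only Unix-style lines such as the first samples above. The class UnixListEntry, its fields, and the regular expression are assumptions for illustration and are not taken from any of the libraries; the date is deliberately left as raw text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value class and parser for Unix-style LIST lines.
class UnixListEntry {
    final boolean directory;
    final boolean link;
    final long size;
    final String rawDate; // unparsed: either "Jan 29 23:32" or "Jan 29 1997"
    final String name;

    private UnixListEntry(boolean directory, boolean link, long size, String rawDate, String name) {
        this.directory = directory;
        this.link = link;
        this.size = size;
        this.rawDate = rawDate;
        this.name = name;
    }

    // type+permissions, link count, owner, group, size, date, name
    private static final Pattern UNIX = Pattern.compile(
            "^([-dl])[-rwxsStT]{9}\\s+\\d+\\s+(\\S+)\\s+(\\S+)\\s+(\\d+)\\s+"
            + "([A-Za-z]{3}\\s+\\d{1,2}\\s+(?:\\d{4}|\\d{1,2}:\\d{2}))\\s+(.+)$");

    // Returns the parsed entry, or null when the line does not follow the expected layout.
    static UnixListEntry parse(String line) {
        Matcher m = UNIX.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        boolean link = m.group(1).equals("l");
        String name = m.group(6);
        if (link) {
            // Symbolic links are listed as "name -> target"; keep only the name.
            int arrow = name.indexOf(" -> ");
            if (arrow >= 0) {
                name = name.substring(0, arrow);
            }
        }
        return new UnixListEntry(m.group(1).equals("d"), link,
                Long.parseLong(m.group(4)), m.group(5), name);
    }
}
```

With this sketch, the sample line whose owner and group columns run together (usernameftp) and the MS-DOS line both yield null, which is exactly the kind of case a real library has to handle with additional formats or fallbacks.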
Java FTP client libraries try to understand and auto-detect as many formats as possible. In addition, they offer various alternatives for handling unexpected format answers:
Most libraries parse LIST responses and structure raw file information into Java objects. For example, with JScape iNet Factory, the following code retrieves and exploits file information received in a directory listing:
java.util.Enumeration files = ftpClient.getDirListing();
while (files.hasMoreElements()) {
    FtpFile ftpFile = (FtpFile) files.nextElement();
    System.out.println(ftpFile.getFilename());
    System.out.println(ftpFile.getFilesize());
    // etc. other helpful methods are detailed in Javadoc
}
Section "Solutions for Remaining Problems" further considers directory listings.
In many cases, we are interested in a remote file's latest modification timestamp. Unfortunately, no RFC introduces a standard FTP command to retrieve this information. Two de facto methods exist:
Retrieve the timestamp from the LIST response by parsing the server answer. Unfortunately, as the previous section showed, the LIST response varies among FTP servers, and the timestamp information is sometimes incomplete. In the Unix format, imprecision occurs when the remote file is more than one year old: only the date and year are given, not the hours or minutes.
Use the nonstandard MDTM command, which specifically retrieves a remote file's last modification timestamp. Unfortunately, not all FTP servers implement this command.
When a library has no built-in MDTM support, a more laborious alternative is to send a raw MDTM command and parse the response ourselves. Most libraries provide a method for sending a raw FTP command, something like:
String timeStampString = ftpClient.command("MDTM README.txt");
Another possible concern is that FTP servers return time information in GMT (Greenwich Mean Time). If the server time zone is known apart from FTP communication, the java.util.TimeZone.getOffset() method can help adjust a date between time zones. See JDK documentation for further information about this method.
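Parsing the raw answer could look like the sketch below. It assumes the reply has the commonly seen form "213 YYYYMMDDhhmmss" expressed in GMT; the class name MdtmReply and both method names are hypothetical, and a server that answers differently would need different handling.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for raw MDTM replies; assumes "213 YYYYMMDDhhmmss" in GMT.
class MdtmReply {

    // Converts the reply into a Date, or throws ParseException if the shape is unexpected.
    static Date parse(String reply) throws ParseException {
        String trimmed = reply.trim();
        if (!trimmed.startsWith("213 ")) {
            throw new ParseException("Not a 213 reply: " + reply, 0);
        }
        String stamp = trimmed.substring(4).trim();
        if (stamp.length() < 14) {
            throw new ParseException("Timestamp too short: " + reply, 4);
        }
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmss");
        format.setLenient(false);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        // Anything after the first 14 digits (for example fractional seconds) is ignored.
        return format.parse(stamp.substring(0, 14));
    }

    // Offset in milliseconds between GMT and the given zone at that instant.
    static int offsetMillis(Date instant, String zoneId) {
        return TimeZone.getTimeZone(zoneId).getOffset(instant.getTime());
    }
}
```

The offsetMillis helper simply wraps TimeZone.getOffset(long), so the zone identifier must be known from somewhere other than the FTP conversation.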
Section "Solutions for Remaining Problems" further considers file timestamp retrieval.
Typically, a firewall is placed between a private enterprise network and a public network such as the Internet. Access is managed from the private network to the public network, but access is denied from the public network to the private network.
Socks is a publicly available protocol developed for use as a firewall gateway for the Internet. The JDK supports Socks 4 and Socks 5 proxies, which can be controlled by some of the libraries. As an alternative, the JVM command line can set the Socks proxy parameters:
java -DsocksProxyPort=1080 -DsocksProxyHost=socks.foo.com -Djava.net.socks.username=user01 -Djava.net.socks.password=pass1234 ...
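The same proxy settings can also be applied from code. The sketch below shows two options: setting the socksProxyHost and socksProxyPort system properties at runtime, and building a java.net.Proxy object for a single socket. Note that java.net.Proxy only exists in JDK 5 and later, and that the class SocksSetup and its method names are assumptions made for this example.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;

// Hypothetical helper showing two ways to route sockets through a Socks proxy.
class SocksSetup {

    // JVM-wide setting, equivalent to -DsocksProxyHost=... -DsocksProxyPort=...
    static void useSocksForWholeJvm(String host, int port) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", String.valueOf(port));
    }

    // Per-connection setting (JDK 5+); the host name is not resolved here.
    static Proxy socksProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }

    // Creates an unconnected socket that will go through the given proxy once connected.
    static Socket newSocketThrough(Proxy proxy) {
        return new Socket(proxy);
    }
}
```

Whether a given FTP library honors either setting depends on how it creates its sockets, so this should be checked against the library's documentation.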
Another common alternative to Socks proxy support is to "socksify" the underlying TCP/IP layer on the client machine. A product like Hummingbird can do that job.
The JDK also supports HTTP tunnels. These widespread proxies do not allow FTP uploads. /n software's IP*Works allows you to set HTTP tunnel parameters.
Most libraries support both active and passive connections: passive connection is useful when the client is behind a firewall that inhibits incoming connections to higher ports. RFC1579 discusses this firewall-friendly functionality in more detail. Some product documentation refers to active and passive connections as PORT and PASV commands, respectively.
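To make the passive case concrete: in reply to PASV, the server answers with code 227 and six comma-separated numbers (h1,h2,h3,h4,p1,p2), and the client then opens the data connection to host h1.h2.h3.h4 on port p1 * 256 + p2. The following sketch decodes such a reply; the class name PasvReply and the sample address are assumptions for illustration.

```java
import java.net.InetSocketAddress;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical decoder for a "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" reply.
class PasvReply {
    private static final Pattern ADDRESS = Pattern.compile(
            "(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3})");

    // Returns the (unresolved) data-connection address announced by the server.
    static InetSocketAddress parse(String reply) {
        if (!reply.startsWith("227")) {
            throw new IllegalArgumentException("Not a 227 reply: " + reply);
        }
        Matcher m = ADDRESS.matcher(reply);
        if (!m.find()) {
            throw new IllegalArgumentException("No address in reply: " + reply);
        }
        String host = m.group(1) + "." + m.group(2) + "." + m.group(3) + "." + m.group(4);
        // The port is sent as two bytes: high-order byte first.
        int port = Integer.parseInt(m.group(5)) * 256 + Integer.parseInt(m.group(6));
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

Because the client initiates this data connection, nothing has to connect back through the firewall, which is the point of passive mode.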
In a desktop application, when a transfer starts in the main single thread, everything freezes. Some libraries automatically service the event loop for parallel transfers in separate threads so we do not have to create and manage our own threads.
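When a library does not manage threads for us, a small wrapper around the java.util.concurrent executors (JDK 5 and later) is one way to keep transfers off the main thread. This is only a sketch: the class BackgroundTransfers is hypothetical, and the Callable passed in stands for whatever blocking download or upload call the chosen library provides.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical wrapper that runs blocking transfers on worker threads.
class BackgroundTransfers {
    private final ExecutorService pool;

    // parallelTransfers is the maximum number of transfers running at the same time.
    BackgroundTransfers(int parallelTransfers) {
        this.pool = Executors.newFixedThreadPool(parallelTransfers);
    }

    // Starts the transfer on a worker thread and returns immediately.
    <T> Future<T> start(Callable<T> transfer) {
        return pool.submit(transfer);
    }

    // Lets already submitted transfers finish, then releases the worker threads.
    void shutdown() {
        pool.shutdown();
    }
}
```

The caller can poll Future.isDone() from the user interface thread or call Future.get() where blocking is acceptable.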
Some libraries implement the JavaBean specification. JavaBean compliance allows visual programming, which is featured in major Java IDEs.
The /n software IP*Works JavaBean design is event-based (for example, see the ipworks.Ftp.listDirectory() method). Although it remains synchronous and is perfectly safe, some programmers may find it odd or awkward in server-side applications.
Some libraries implement progress monitoring. Progress monitoring support makes it easy to implement event listeners that track any FTP transfer's progress. This feature is useful when developing a friendly user interface.
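The idea behind such listeners can be sketched with a plain stream copy that reports the running byte count after each chunk. The interface TransferListener and the class MonitoredCopy are hypothetical names; each library defines its own listener types and callbacks.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical callback: receives the total number of bytes transferred so far.
interface TransferListener {
    void bytesTransferred(long totalSoFar);
}

// Hypothetical copy loop that notifies a listener after every chunk.
class MonitoredCopy {

    static long copy(InputStream in, OutputStream out, TransferListener listener)
            throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
            listener.bytesTransferred(total); // e.g. update a progress bar here
        }
        return total;
    }
}
```

In a Swing application the listener would typically hand the value to the event dispatch thread rather than touch components directly.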
RFC959 section 3.1.1 specifies several transmission types, among which two are common: ASCII nonprint (default) and image (also called binary). Some libraries can be set in auto mode, according to the file extension. Such a method is rarely useful in modern information systems. Other transmission types have become obsolete and are not supported by any of the Java libraries.
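For completeness, an extension-based auto mode amounts to little more than a lookup table, as in this sketch. The class name TransferTypeGuess and the list of text extensions are arbitrary assumptions; defaulting to binary is a design choice made here because binary mode transfers the bytes unchanged.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical extension-based choice between ASCII and binary (image) transfer.
class TransferTypeGuess {
    enum Type { ASCII, BINARY }

    // Arbitrary sample list; a real application would make this configurable.
    private static final Set<String> TEXT_EXTENSIONS = new HashSet<String>(
            Arrays.asList("txt", "htm", "html", "xml", "csv", "java"));

    // Falls back to BINARY for unknown or missing extensions.
    static Type forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Type.BINARY;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TEXT_EXTENSIONS.contains(extension) ? Type.ASCII : Type.BINARY;
    }
}
```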
All libraries run on JDK 1.2.x and later; most should run on JDK 1.1.x, and perhaps JDK 1.0.x.
All libraries are pure Java.
The comparison matrix lists other obvious criteria.
Now comes the final comparison matrix. It displays libraries on top against criteria on the left. In cells, Y means Yes; other abbreviations are explained in the criterion lists above (see letters in bold) and in the table's key.
FTP Comparison Matrix
When choosing a library, I have a few recommendations:
Most likely, at some point in our project, especially at the end when thorough testing occurs, we might want to change our library. Such a change affects all our calling code: our classes do not compile anymore, and some application parts must be recoded to match different method names and the new library's different design.
Since managing such a change can prove annoying, especially at a project's end when time is a critical resource, we should limit changes to one single class. Typically, we can apply the Façade pattern, with the FTP library as the back end, as illustrated in Figure 2.
Figure 2. The Façade pattern applied to an FTP library
A beneficial side effect of applying the Façade pattern to the FTP library is that we can add value to the library itself. For instance, we can write a Façade method that downloads an entire remote directory tree into a local zip file or a method that implements any basic feature lacking in the library.
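A minimal sketch of such a Façade might look like the following. The names FtpBackend and FtpFacade, and the two back-end methods, are assumptions chosen for this example; in a real project, one small adapter class would implement the back-end interface on top of the selected library, and only that class would change when the library changes.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical back-end contract: the only place that touches a concrete FTP library.
interface FtpBackend {
    List<String> listNames(String remoteDir) throws IOException;
    byte[] fetch(String remotePath) throws IOException;
}

// Façade used by the rest of the application.
class FtpFacade {
    private final FtpBackend backend;

    FtpFacade(FtpBackend backend) {
        this.backend = backend;
    }

    // Plain delegation to the library behind the back end.
    byte[] download(String remotePath) throws IOException {
        return backend.fetch(remotePath);
    }

    // Added value: returns only the names in remoteDir that end with the given suffix.
    List<String> listBySuffix(String remoteDir, String suffix) throws IOException {
        List<String> matches = new ArrayList<String>();
        for (String name : backend.listNames(remoteDir)) {
            if (name.endsWith(suffix)) {
                matches.add(name);
            }
        }
        return matches;
    }
}
```

Because the back end is an interface, it can also be replaced by an in-memory fake in unit tests, which helps build the detailed test cases needed when switching libraries.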
Finally, two libraries that have the same signature do not necessarily have the same runtime behavior. Thus, switching from one library to another can also affect our application runtime. Such an impact is unpleasant and uncomfortable because discovering runtime differences is much more difficult, although detailed test cases can help.
In the explanations of the above criteria, I briefly described several unsolved problems. In this section, I further discuss and address them; I suggest both long-term solutions and short-term workarounds.
The lack of any authoritative specification for the LIST response has led to many different FTP server implementations. This diversity is the biggest problem for FTP client programmers and is still an open issue.
As the problem's root lies in the protocol definition, I recommend that the concerned authoritative entity, the Internet Engineering Task Force (IETF), define the LIST response structure specification in a new reference document (an RFC).
This process can be long. In the meantime, the most flexible solution is to use a library offering a framework for pluggable format parsers.
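The core of such a framework can be very small: a parser interface, plus a registry that tries each registered parser in turn. The sketch below is an assumption about how this could be structured, not the design of any particular library; the two sample parsers are deliberately simplified and extract only the file name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plug-in contract: returns the file name, or null if the line
// is not in the format this parser understands.
interface ListLineParser {
    String parseName(String line);
}

// Hypothetical registry that tries parsers in registration order.
class ListParserRegistry {
    private final List<ListLineParser> parsers = new ArrayList<ListLineParser>();

    ListParserRegistry register(ListLineParser parser) {
        parsers.add(parser);
        return this;
    }

    // The first parser that recognizes the line wins; null means no parser did.
    String parseName(String line) {
        for (ListLineParser parser : parsers) {
            String name = parser.parseName(line);
            if (name != null) {
                return name;
            }
        }
        return null;
    }

    // Simplified Unix-style plug-in: nine whitespace-separated columns, name last.
    static final ListLineParser UNIX = line -> {
        if (!line.matches("^[-dl][-rwxsStT]{9}\\s.*")) {
            return null;
        }
        String[] fields = line.trim().split("\\s+", 9);
        return fields.length == 9 ? fields[8] : null;
    };

    // Simplified MS-DOS-style plug-in: date, time, <DIR> or size, name.
    static final ListLineParser MSDOS = line -> {
        if (!line.matches("^\\d{2}-\\d{2}-\\d{2}\\s+\\d{1,2}:\\d{2}[AP]M\\s.*")) {
            return null;
        }
        String[] fields = line.trim().split("\\s+", 4);
        return fields.length == 4 ? fields[3] : null;
    };
}
```

Supporting a new server format then means registering one more ListLineParser, without touching the registry or the calling code.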
As I discussed earlier, no standard method retrieves a remote file's last modification timestamp through FTP. I suggest two long-term solutions for transferring that timestamp from the server to the client:
A standardized LIST response
A standardized MDTM command and response
For both solutions, server time zone should be considered in the communication.
Again, as the root of the problem lies in the protocol definition, I recommend that the IETF define one or both of the above solutions as an authoritative specification.
In the meantime, the most generic workaround is to use a library supporting both LIST and MDTM response parsing and exploit a combination of these two features.
In the related section above, I recommended the Façade pattern to reduce change effort in case of library replacement. As I mentioned, the pattern does not serve as a panacea, because the diversity in behaviors among libraries can still affect our entire application at runtime, which is difficult to control.
As this concern is a pure programming matter, I recommend that Sun publish a standard well-designed API, defining precise method signatures and behaviors. Anyone, including Sun, could implement it. Programmers could use the interface methods and back them up with their preferred implementation. And any switch from one library to another would have minimal impact on the rest of the application. JavaMail and JDBC (Java Database Connectivity) APIs are exemplary precedents.
The Java FTP API Standardization project aims to organize a consortium of users, developers, and providers to introduce a Request For Enhancement as a Java Specification Request in the Java Community Process. Your support would certainly be useful to this project, the homepage of which can be found in Resources.
In this article, I explained how to write FTP client code in Java and presented FTP client support in the JDK and third-party libraries. I presented important criteria to consider when evaluating various libraries and compared the criteria across libraries. I hope decision-makers facing the choice of a Java FTP client library find useful indications in this objective study to make the best decision.
Finally, I presented different problems common to all FTP libraries and suggested short-term workarounds as well as long-term solutions that could be adopted by authoritative entities like IETF and Sun. I hope these leads and actions will help forge the future of Java FTP client libraries.
Jean-Pierre Norguet holds an engineering degree in computer science from the Universite Libre de Bruxelles and a Socrates European master's degree from the Ecole Centrale Paris. After three years of full-time Java development with IBM on mission-critical e-business applications, as team leader and coach, his areas of expertise grew to include the entire application development life cycle. He now works as a research fellow in Brussels, Belgium, writing a PhD thesis about Internet audience analysis. His outside interests include artistic drawing, French theater acting, and well-being massage.
Resources