Unlike PHP, Java Servlet and JSP do not have built-in mechanisms for handling form-based file uploads. One solution is to write the code yourself to extract the uploaded files from an HTTP request. However, a better choice is to use a third-party library that handles file uploads for us.
One robust library available is the Apache Jakarta Commons FileUpload package. It is open-source and can be downloaded free of charge over the Internet. We will demonstrate how to use the Apache Jakarta Commons FileUpload package to extract uploaded files submitted from a form. The techniques are the same for HTML and XHTML. If you are not familiar with JSP or Java Servlet, you may want to read some introductory tutorials before going through this section.
At the time of writing, the most up-to-date version of the Apache Jakarta Commons FileUpload library is 1.1.1. So, we assume you are using Commons FileUpload 1.1.1 in this tutorial. For other versions of Commons FileUpload, the procedures may be slightly different but the principle is the same.
To download the Apache Jakarta Commons FileUpload library, go to the home page of the Apache Jakarta Commons FileUpload project and navigate to the download section. The binaries are available in two file formats: zip format and tar-gzip format. Download either one of them and uncompress the file. Then go to the home page of the Apache Jakarta Commons IO project and repeat the same steps. We need the Commons IO library since Commons FileUpload uses it internally. Now you have two JAR files, "commons-fileupload-version.jar" and "commons-io-version.jar", where version is the version number. At the time of writing, the latest version of the Commons FileUpload library and that of the Commons IO library are 1.1.1 and 1.2 respectively. So, the JAR files we obtain are "commons-fileupload-1.1.1.jar" and "commons-io-1.2.jar".
Next, we need to install the Apache Jakarta Commons FileUpload library and the Apache Jakarta Commons IO library into a Servlet/JSP container such as Apache Tomcat. To do this, copy the JAR files "commons-fileupload-1.1.1.jar" and "commons-io-1.2.jar" to the /WEB-INF/lib/ directory in the document root of your web application.
Note that JAR libraries stored in /WEB-INF/lib/ will be available to the containing web application only. If you want to share the libraries among all web applications installed in Tomcat (suppose you are using Tomcat 5 or Tomcat 4), the JAR files should be copied to the $CATALINA_HOME/shared/lib/ directory, where $CATALINA_HOME is the root of your Tomcat installation.
Now that you have installed the Apache Jakarta Commons FileUpload library, you can start writing the code. First, we have to make sure the HTTP request is encoded in multipart format. This can be done using the static method isMultipartContent() of the ServletFileUpload class of the org.apache.commons.fileupload.servlet package:
if (ServletFileUpload.isMultipartContent(request)){
// Parse the HTTP request...
}
In the above Java code snippet, request is a javax.servlet.http.HttpServletRequest object that encapsulates the HTTP request. It should be very familiar to you if you know Java Servlet or JSP.
Second, we will parse the form data contained in the HTTP request. Parsing the form data is very straightforward with the Apache Jakarta Commons FileUpload library:
ServletFileUpload servletFileUpload = new ServletFileUpload(new DiskFileItemFactory());
List fileItemsList = servletFileUpload.parseRequest(request);
(In the above Java code snippet, DiskFileItemFactory is a class contained in the org.apache.commons.fileupload.disk package and List is an interface contained in the java.util package.)
If everything works fine, fileItemsList will contain a list of file items that are instances of FileItem of the org.apache.commons.fileupload package. A file item may contain an uploaded file or a simple name-value pair of a form field. (More details about FileItem will be provided later.)
By default, the ServletFileUpload instance created by the above Java code uses the following values when parsing the HTTP request:
· Size threshold = 10,240 bytes. If the size of a file item is smaller than the size threshold, it will be stored in memory. Otherwise it will be stored in a temporary file on disk.
· Maximum HTTP request body size = -1, which means the server will accept HTTP request bodies of any size.
· Repository = System default temp directory, whose value can be found by the Java code System.getProperty("java.io.tmpdir"). Temporary files will be stored there.
If you do not like the default settings, you can change them using the methods setSizeThreshold() and setRepository() of the DiskFileItemFactory class and the setSizeMax() method of the ServletFileUpload class, like this:
DiskFileItemFactory diskFileItemFactory = new DiskFileItemFactory();
diskFileItemFactory.setSizeThreshold(40960); /* the unit is bytes */
File repositoryPath = new File("/temp");
diskFileItemFactory.setRepository(repositoryPath);
ServletFileUpload servletFileUpload = new ServletFileUpload(diskFileItemFactory);
servletFileUpload.setSizeMax(81920); /* the unit is bytes */
(In the above Java code snippet, File is a class of the java.io package.)
If the size of the HTTP request body exceeds the maximum you set, the SizeLimitExceededException exception (fully qualified name: org.apache.commons.fileupload.FileUploadBase.SizeLimitExceededException) will be thrown when you call the parseRequest() method:
try {
List fileItemsList = servletFileUpload.parseRequest(request);
/* Process file items... */
}
catch (SizeLimitExceededException ex) {
/* The size of the HTTP request body exceeds the limit */
}
Third, we will iterate through the file items and process each of them. The isFormField() method of the FileItem interface is used to determine whether a file item contains a simple name-value pair of a form field or an uploaded file:
Iterator it = fileItemsList.iterator();
while (it.hasNext()){
FileItem fileItem = (FileItem)it.next();
if (fileItem.isFormField()){
/* The file item contains a simple name-value pair of a form field */
}
else{
/* The file item contains an uploaded file */
}
}
(In the above Java code snippet, Iterator is an interface in the java.util package and FileItem is an interface in the org.apache.commons.fileupload package.)
If a file item contains a simple name-value pair of an ordinary form field, we can retrieve its name and value using the getFieldName() method and the getString() method respectively:
String name = fileItem.getFieldName();
String value = fileItem.getString();
For example, suppose there is a text field in an HTML/XHTML form:
and you enter "Welcome to our JSP / Servlet file upload tutorial" in the text field. After the execution of the previous two lines of Java code, the name variable should contain the value "text_field" (the name attribute value of the text field) and the value variable should contain the value "Welcome to our JSP / Servlet file upload tutorial".
If a file item contains an uploaded file, we can use a number of methods to obtain some information about the uploaded file before we decide what to do with it:
/* Get the name attribute value of the file upload form element. */
String fieldName = fileItem.getFieldName();
/* Get the size of the uploaded file in bytes. */
long fileSize = fileItem.getSize();
/* Get the name of the uploaded file at the client-side. Some browsers such as IE 6 include the whole path here (e.g. e:/files/myFile.txt), so you may need to extract the file name from the path. This information is provided by the client browser, which means you should be cautious since it may be a wrong value provided by a malicious user. */
String fileName = fileItem.getName();
/* Get the content type (MIME type) of the uploaded file. This information is provided by the client browser, which means you should be cautious since it may be a wrong value provided by a malicious user. */
String contentType = fileItem.getContentType();
Nokia cell phones such as the Nokia 6230 determine the content type (MIME type) of the file to be uploaded from its file extension. The following table lists some of the file extensions recognized by the Nokia 6230. The table appeared earlier and is repeated here for your convenience.
File extension | Content type / MIME type
.jpg | image/jpeg
.gif | image/gif
.png | image/png
.wbmp | image/vnd.wap.wbmp
.txt | text/plain
If the Nokia 6230 cell phone does not recognize a file extension, it will specify "application/octet-stream" as the content type / MIME type of the file in the HTTP request.
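To make the point concrete, here is a minimal sketch of an extension-based lookup like the one described above. It is not the phone's actual logic. The class name ExtensionMimeTypes and the method guess() are hypothetical, and treating extensions as case-insensitive is our own assumption. The sketch shows why the content type in the request says nothing about what the file really contains: renaming a file is enough to change it.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper that mirrors the extension table above. */
final class ExtensionMimeTypes {
    private static final String DEFAULT_TYPE = "application/octet-stream";

    // The five extensions listed in the table above.
    private static final Map<String, String> TYPES = Map.of(
            "jpg", "image/jpeg",
            "gif", "image/gif",
            "png", "image/png",
            "wbmp", "image/vnd.wap.wbmp",
            "txt", "text/plain");

    /** Returns the MIME type for the file name's extension, or the default type. */
    static String guess(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return DEFAULT_TYPE; // no extension at all
        }
        // Assumption: extensions are compared case-insensitively.
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TYPES.getOrDefault(extension, DEFAULT_TYPE);
    }

    public static void main(String[] args) {
        System.out.println(guess("photo.JPG"));   // image/jpeg
        System.out.println(guess("archive.zip")); // application/octet-stream
    }
}
```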
In some situations, you just want to store the uploaded file in the file system without regard to what it contains. The FileItem interface provides a method called write() that makes this easy:
File saveTo = new File("/upload_files/myFile.txt");
fileItem.write(saveTo);
(In the above Java code snippet, File is a class of the java.io package.)
If everything works fine, the uploaded file will be saved to "/upload_files/myFile.txt". Otherwise the write() method will throw a java.lang.Exception exception.
Note that calling the write() method more than once may cause an error. When the uploaded file is stored as a temporary file on disk, the write() method first tries to rename the temporary file to the new location rather than copying its contents, which is faster. On a second call the temporary file no longer exists, so the call fails. However, if the uploaded file is held in memory, you can call the write() method multiple times.
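The rename behavior can be illustrated with the JDK alone. The sketch below is only an analogy and is not the code used inside Commons FileUpload. The class MoveTwiceDemo and the method moveToEach() are hypothetical names. It moves one temporary file to two destinations in turn. The first move succeeds, and the second fails because the source file is no longer there.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Hypothetical demo: a file that has been moved away cannot be moved again. */
final class MoveTwiceDemo {
    /** Tries to move source to each target in turn; returns how many moves succeeded. */
    static int moveToEach(Path source, Path... targets) throws IOException {
        int succeeded = 0;
        for (Path target : targets) {
            try {
                Files.move(source, target);
                succeeded++;
            } catch (NoSuchFileException e) {
                // The source was already renamed away by an earlier move.
            }
        }
        return succeeded;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("upload-demo");
        Path temp = Files.write(dir.resolve("upload.tmp"), new byte[] {1, 2, 3});
        int ok = moveToEach(temp, dir.resolve("first.bin"), dir.resolve("second.bin"));
        System.out.println(ok + " of 2 moves succeeded");
    }
}
```

If you need the uploaded data in two places, one option is to call write() once and then copy the saved file.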
If you do not want to save the uploaded file directly but want to process it, the get() and getInputStream() methods can help you. The get() method returns the uploaded file as a byte array:
byte[] fileData = fileItem.get();
However, if the uploaded file is large in size, you will not want to load the whole file into memory. The getInputStream() method can help you in this case. It returns the uploaded file as a stream:
InputStream fileStream = fileItem.getInputStream();
(InputStream is a class of the java.io package.)
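As a rough sketch of stream-based processing, the helper below copies a stream in fixed-size chunks so that only one buffer is held in memory at a time. The class StreamCopy, the method copy() and the 8 KB buffer size are our own choices for illustration. In the upload script the input would be the stream returned by fileItem.getInputStream(); here a ByteArrayInputStream stands in for it so that the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that copies a stream chunk by chunk. */
final class StreamCopy {
    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // only this much data is in memory at once
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for fileItem.getInputStream().
        byte[] data = "hello upload".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied + " bytes copied"); // 12 bytes copied
    }
}
```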
The JSP file upload script below prints out the name-value pair received from the earlier XHTML MP document and saves the uploaded file to a certain location on the WAP server. Remember to change the action attribute of the
(file_upload.jsp)
<%@ page import="org.apache.commons.fileupload.*, org.apache.commons.fileupload.servlet.ServletFileUpload, org.apache.commons.fileupload.disk.DiskFileItemFactory, org.apache.commons.io.FilenameUtils, java.util.*, java.io.File, java.lang.Exception" %>
<% response.setContentType("application/vnd.wap.xhtml+xml"); %>
Data Received at the Server
<%
if (ServletFileUpload.isMultipartContent(request)){
ServletFileUpload servletFileUpload = new ServletFileUpload(new DiskFileItemFactory());
List fileItemsList = servletFileUpload.parseRequest(request);
String optionalFileName = "";
FileItem fileItem = null;
Iterator it = fileItemsList.iterator();
while (it.hasNext()){
FileItem fileItemTemp = (FileItem)it.next();
if (fileItemTemp.isFormField()){
%>
Name-value Pair Info:
Field name: <%= fileItemTemp.getFieldName() %>
Field value: <%= fileItemTemp.getString() %>
<%
if (fileItemTemp.getFieldName().equals("filename"))
optionalFileName = fileItemTemp.getString();
}
else
fileItem = fileItemTemp;
}
if (fileItem!=null){
String fileName = fileItem.getName();
%>
Uploaded File Info:
Content type: <%= fileItem.getContentType() %>
Field name: <%= fileItem.getFieldName() %>
File name: <%= fileName %>
File size: <%= fileItem.getSize() %>
<%
/* Save the uploaded file if its size is greater than 0. */
if (fileItem.getSize() > 0){
if (optionalFileName.trim().equals(""))
fileName = FilenameUtils.getName(fileName);
else
fileName = optionalFileName;
String dirName = "/file_uploads/";
File saveTo = new File(dirName + fileName);
try {
fileItem.write(saveTo);
%>
The uploaded file has been saved successfully.
<%
}
catch (Exception e){
%>
An error occurred when we tried to save the uploaded file.
<%
}
}
}
}
%>
The following screenshots show what you will see in the Nokia 6230 cell phone:
The above JSP script is very straightforward. Most of the code has been covered earlier. The lines that may be unfamiliar to you are explained below.
The line:
<% response.setContentType("application/vnd.wap.xhtml+xml"); %>
is used to set the MIME type of the JSP document. "application/vnd.wap.xhtml+xml" is the MIME type of XHTML MP.
The line:
fileName = FilenameUtils.getName(fileName);
is used to extract the file name from a path. (FilenameUtils is a class of the org.apache.commons.io package in the Apache Jakarta Commons IO library.) For example, both FilenameUtils.getName("/files/myFile.txt") and FilenameUtils.getName("myFile.txt") return the string "myFile.txt". This line is necessary in our JSP script because some browsers provide the full path of the uploaded file in the HTTP request, so after the following line executes, the fileName variable may contain a full path rather than just a file name.
String fileName = fileItem.getName();
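For readers who want to see the idea behind this step, here is a JDK-only sketch that keeps whatever follows the last path separator. It is not the Commons IO implementation. The class ClientFileNames and the method baseName() are hypothetical names. Both "/" and "\" are treated as separators because the path comes from the client, which may be running a different operating system than the server.

```java
/** Hypothetical helper that strips any directory part from a client-supplied path. */
final class ClientFileNames {
    /** Returns the text after the last '/' or '\', or "" for a null path. */
    static String baseName(String clientPath) {
        if (clientPath == null) {
            return "";
        }
        int lastSeparator = Math.max(clientPath.lastIndexOf('/'), clientPath.lastIndexOf('\\'));
        // lastSeparator is -1 when there is no separator, so the whole string is kept.
        return clientPath.substring(lastSeparator + 1);
    }

    public static void main(String[] args) {
        System.out.println(baseName("e:/files/myFile.txt")); // myFile.txt
        System.out.println(baseName("myFile.txt"));          // myFile.txt
    }
}
```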
As XHTML MP is compatible with HTML/XHTML, the resulting XHTML MP document generated by the JSP script can also be viewed on web browsers such as Microsoft Internet Explorer and Mozilla Firefox. The only thing you need to do is to remove the following line from the JSP script:
<% response.setContentType("application/vnd.wap.xhtml+xml"); %>
This is because unlike WAP 2.0 browsers on cell phones, Internet Explorer 6 and Mozilla Firefox 2.0 do not understand the MIME type of XHTML MP. Instead of displaying the XHTML MP document, they will pop up a dialog box asking you to select a program to open the document or save the document on disk.
The following screenshots show the result on Mozilla Firefox 2.0:
Before enabling HTTP file upload on your server, one important thing that you must consider is security, as improper design and configuration will make your server vulnerable to attacks.
For example, the PHP file upload script and JSP file upload script that were covered earlier are not secure. One problem is that we have not checked what the user entered in the optional filename text box. This gives malicious users the chance to modify the server's files (e.g. system files or password files). For example, if a malicious user enters a path such as "../password/password.dat" in the optional filename text box, our PHP and JSP scripts will save the uploaded file to "/file_uploads/../password/password.dat", which resolves to the path "/password/password.dat".
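One possible way to guard against this in the JSP script is to resolve the destination path first and check that it still lies inside the upload directory. The following is only a sketch under our own assumptions. The class SafeUploadPath and the method resolveInside() are hypothetical names. It does not deal with symbolic links, and it still accepts names that contain subdirectories inside the upload directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that refuses file names pointing outside the upload directory. */
final class SafeUploadPath {
    /** Resolves fileName under uploadDir; throws if the result escapes uploadDir. */
    static Path resolveInside(Path uploadDir, String fileName) {
        Path base = uploadDir.toAbsolutePath().normalize();
        // normalize() collapses ".." segments so that the real destination is compared.
        Path target = base.resolve(fileName).normalize();
        if (!target.startsWith(base) || target.equals(base)) {
            throw new IllegalArgumentException("Unsafe file name: " + fileName);
        }
        return target;
    }

    public static void main(String[] args) {
        Path uploadDir = Paths.get("/file_uploads");
        System.out.println(resolveInside(uploadDir, "photo.jpg"));
        try {
            resolveInside(uploadDir, "../password/password.dat");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In the JSP script, the returned path could be converted with toFile() and passed to fileItem.write().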
Here are a few security tips that may be useful to you. We will only provide some brief descriptions here. For more details, please refer to other sources.
· Check all information provided by the client to ensure that it is safe. For example:
o The HTTP request received includes a MIME type that describes what the uploaded file contains. A malicious user can provide a wrong value to trick you into thinking that the uploaded file is of another type. Hence, you should not rely on the MIME type included in the HTTP request but should perform your own check at the server-side. For instance, the photo album example covered earlier does not perform any checks to ensure the uploaded files are really image files. To enhance security, we can include a check on the uploaded files using the PHP function getimagesize() at the server-side. If getimagesize() returns false, that means the uploaded file is not a valid image file and it should be rejected.
o The HTTP request received includes the uploaded file's original file name at the client-side. A malicious user can provide an unsafe value to trick you into modifying system or password files. This problem is similar to the one described in the second paragraph of this section, so we will not describe it again.
In addition, you should prepare for the situation that the file name contains special characters that are not allowed to appear in file names or non-English characters. Make sure your WAP/web application will not crash or be left in an erroneous state when such situations occur.
· Set a file size limit so that the user cannot upload files that are too large or too small.
· Do not run web servers or application servers with the administrator account. Create and configure an account that is specifically for their use. Limit the file access permissions of the account so that even if your WAP/web application has security holes, the OS will not allow it to work with system files or files of other users.
· Make sure your WAP/web application does not reveal too much information to the user when an error occurs. The information revealed can help a malicious user find ways to attack your system.
· Log the details (such as the time, the client's IP address and the user name) of file uploads and other related events. Although the logs only tell you what has happened, they can help you check what types of attacks have been made against your server and whether any of them succeeded.
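The first tip in the list above recommends checking the file contents at the server-side instead of trusting the MIME type sent by the client, and it uses the PHP function getimagesize() as an example. For the JSP script, a rough counterpart is to compare the first few bytes of the uploaded file with the well-known signatures of the JPEG, GIF and PNG formats. The sketch below does only that. The class ImageSignatures and the method detect() are hypothetical names. A matching signature does not prove that the rest of the file is a valid image, so treat this as a first filter and not as complete validation.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that recognizes a few image formats by their leading bytes. */
final class ImageSignatures {
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] GIF87 = "GIF87a".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] GIF89 = "GIF89a".getBytes(StandardCharsets.US_ASCII);

    /** Returns the MIME type suggested by the leading bytes, or null if none matches. */
    static String detect(byte[] head) {
        if (startsWith(head, PNG)) return "image/png";
        if (startsWith(head, JPEG)) return "image/jpeg";
        if (startsWith(head, GIF87) || startsWith(head, GIF89)) return "image/gif";
        return null;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] gifHead = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        byte[] textHead = "hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(gifHead));  // image/gif
        System.out.println(detect(textHead)); // null
    }
}
```

In the upload script, the leading bytes could be read from fileItem.getInputStream(), or taken from fileItem.get() when the file is small.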