A First Look at Uploading Files in Java Without an Upload Component

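      With some free time on my hands recently, I became curious about file uploads and decided to write a Java file-upload utility class without relying on a third-party upload package. After studying the HTTP protocol for a while, I finally have some modest results.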

      Here is the form from the page:

        <form action="gb/test.do" method="post" enctype="multipart/form-data">
        <h1>Hello World!</h1>
        &nbsp;&nbsp;name:<input type="text" name="name" value="" /><br>
        &nbsp;&nbsp;address:<input type="text" name="address" value="" /><br>
        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;age:<input type="text" name="age" value="" /><br>
        photo1:<input type="file" name="f1"><br>
        photo2:<input type="file" name="f2"><br>
        captcha:<input type="text" name="image"><img src="RandomImageServlet">
        <input type="submit">
        </form>

 After the form is filled in and submitted, HttpWatch shows the following data:

POST /testWeb/gb/test.do HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-ms-application, application/x-ms-xbap, application/vnd.ms-xpsdocument, application/xaml+xml, */*
Referer: http://localhost:8084/testWeb/
Accept-Language: zh-cn
Content-Type: multipart/form-data; boundary=---------------------------7db1e4183026c
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.3; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Host: localhost:8084
Content-Length: 7647320
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: JSESSIONID=FEF31749D9D8CBE22C2A49C7A6799A15

-----------------------------7db1e4183026c
Content-Disposition: form-data; name="name"

小强
-----------------------------7db1e4183026c
Content-Disposition: form-data; name="address"

辽宁大连
-----------------------------7db1e4183026c
Content-Disposition: form-data; name="age"

23
-----------------------------7db1e4183026c
Content-Disposition: form-data; name="f1"; filename="C:\Documents and Settings\mark\桌面\JavaEE_CN.chm"
Content-Type: application/octet-stream

ITSF ... (binary file content, truncated)
-----------------------------7db1e4183026c
Content-Disposition: form-data; name="f2"; filename="C:\Documents and Settings\mark\桌面\pietty0327.exe"
Content-Type: application/octet-stream

MZ ... This program cannot be run in DOS mode. ... (binary file content, truncated)
-----------------------------7db1e4183026c
Content-Disposition: form-data; name="image"

7hti
-----------------------------7db1e4183026c--

 Let us look closely at the data above, starting with this header:

Content-Type: multipart/form-data; boundary=---------------------------7db1e4183026c

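It carries a parameter named boundary. As the name suggests, its value

---------------------------7db1e4183026c

is a separator that splits the form data into parts. Look closely, though, and you will see that each delimiter line in the body is two bytes longer than this value: it has two extra "-" characters in front. Each delimiter line in the body also ends with a carriage return/line feed pair (HTTP specifies that lines end with CRLF), which takes another two bytes. The last delimiter, the terminator, has two more "-" characters than the delimiters before it. Note, however, that if "delimiter line" means the delimiter together with its trailing CRLF, then

       terminator != delimiter line + "--";

      but rather

      terminator = delimiter line - CRLF + "--" + CRLF

Many people probably get this wrong on the first attempt.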
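
The relationship between the header value, the delimiter line, and the terminator can be written down in a few lines of Java. This is only an illustrative sketch: the class name MultipartBoundaries and its methods are made up for this example and are not part of the upload class shown later.

```java
final class MultipartBoundaries {
    static final String CRLF = "\r\n";

    /** Returns the boundary parameter of a Content-Type header value, or null if absent. */
    static String boundaryOf(String contentType) {
        String key = "boundary=";
        int start = contentType.indexOf(key);
        if (start < 0) {
            return null;
        }
        String value = contentType.substring(start + key.length());
        int end = value.indexOf(';');
        if (end >= 0) {
            value = value.substring(0, end);
        }
        value = value.trim();
        // Some clients quote the boundary; strip the quotes if present.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    /** A delimiter line in the body: two dashes, the boundary, then CRLF. */
    static String delimiterLine(String boundary) {
        return "--" + boundary + CRLF;
    }

    /** The terminator: the "--" suffix goes before the CRLF, not after it. */
    static String terminatorLine(String boundary) {
        return "--" + boundary + "--" + CRLF;
    }

    public static void main(String[] args) {
        String boundary = boundaryOf(
                "multipart/form-data; boundary=---------------------------7db1e4183026c");
        // Two leading dashes plus CRLF: prints 4
        System.out.println(delimiterLine(boundary).length() - boundary.length());
        // Appending "--" to the delimiter line does not give the terminator: prints false
        System.out.println(terminatorLine(boundary).equals(delimiterLine(boundary) + "--"));
    }
}
```

Keeping the CRLF out of the stored boundary and adding it only when a full line is needed avoids the off-by-two mistakes described above.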

      We also notice that the part for an uploaded file looks like this:

Content-Disposition: form-data; name="f1"; filename="C:\Documents and Settings\mark\桌面\JavaEE_CN.chm"
Content-Type: application/octet-stream

    For data entered in an ordinary text box, by contrast, there is no filename="..." and no Content-Type line below it:

Content-Disposition: form-data; name="address"
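
The difference between the two kinds of part can be checked directly on the Content-Disposition line. The sketch below is a hypothetical helper (the class name PartHeader and its methods are invented for illustration) that uses regular expressions rather than the substring arithmetic of the class shown later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PartHeader {
    // The leading "[; ]" keeps the name pattern from matching inside "filename=".
    private static final Pattern NAME = Pattern.compile("[; ]name=\"([^\"]*)\"");
    private static final Pattern FILENAME = Pattern.compile("[; ]filename=\"([^\"]*)\"");

    private static String find(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    /** The form field name, or null if the line has no name parameter. */
    static String fieldName(String contentDisposition) {
        return find(NAME, contentDisposition);
    }

    /** The submitted file name (possibly empty), or null for an ordinary field. */
    static String fileName(String contentDisposition) {
        return find(FILENAME, contentDisposition);
    }

    /** True when the part comes from an input of type "file". */
    static boolean isFilePart(String contentDisposition) {
        return fileName(contentDisposition) != null;
    }

    public static void main(String[] args) {
        String file = "Content-Disposition: form-data; name=\"f1\"; filename=\"C:\\dir\\JavaEE_CN.chm\"";
        String text = "Content-Disposition: form-data; name=\"address\"";
        System.out.println(fieldName(file) + " " + isFilePart(file));
        System.out.println(fieldName(text) + " " + isFilePart(text));
    }
}
```

A file input that was left empty still counts as a file part here, with an empty file name, so the caller must check for an empty name before writing anything to disk.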

    Our task is to process this data stream and extract the parameter names and values from it. Here is the Java code that processes the stream:

 

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.HashMap;
import javax.servlet.ServletInputStream;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 *
 * @author mark
 */
public class UploadFile {

    private static Log log = LogFactory.getLog(UploadFile.class);

    /**
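     * File upload helper. A servlet that uses this method must first call request.setCharacterEncoding() to set the encoding, which must match the encoding of the page.
     * @param sis the request input stream
     * @param encoding the character encoding; it must match the encoding of the JSP page, otherwise text will be garbled
     * @param length the length of the stream in bytes
     * @param upLoadPath the directory path under which uploaded files are saved
     * @return a map from parameter name to field value, or to the file name for file fields
     * @throws FileNotFoundException
     * @throws IOException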
     */
    public static HashMap<String, String> uploadFile(ServletInputStream sis, String encoding, int length, String upLoadPath) throws IOException {
        HashMap<String, String> paramMap = new HashMap<String, String>();

        boolean isFirst = true;
        String boundary = null;// The delimiter line, including its trailing CRLF.
        byte[] tmpBytes = new byte[4096];// Holds the bytes of the line that was just read.
        int[] readBytesLength = new int[1];// readBytesLength[0] holds the number of bytes actually read by readLine().
        int readStreamlength = 0;// Number of bytes read from the stream so far.
        String tmpString = null;

        tmpString = readLine(tmpBytes, readBytesLength, sis, encoding);
        readStreamlength = readStreamlength + readBytesLength[0];
        // Also stop when readLine() returns null (end of stream or read error).
        while (tmpString != null && readStreamlength < length) {
            if (isFirst) {
                boundary = tmpString;
                isFirst = false;
            }
            if (tmpString.equals(boundary)) {
                String contentDisposition = readLine(tmpBytes, readBytesLength, sis, encoding);
                readStreamlength = readStreamlength + readBytesLength[0];
                String contentType = readLine(tmpBytes, readBytesLength, sis, encoding);
                readStreamlength = readStreamlength + readBytesLength[0];
                // For a file part the line after Content-Disposition is a Content-Type line, so it is not blank.
                if (contentType != null && contentType.trim().length() != 0) {
                    String paramName = getPramName(contentDisposition);
                    String fileName = getFileName(getFilePath(contentDisposition));

                    paramMap.put(paramName, fileName);

                    // Skip the empty line.
                    readLine(tmpBytes, readBytesLength, sis, encoding);
                    readStreamlength = readStreamlength + readBytesLength[0];

                    /*
                     * A non-empty file name means a file was actually uploaded.
                     */
                    if (fileName != null && fileName.trim().length() != 0) {
                        fileName = upLoadPath + fileName;

                        // Start reading the file content.
                        byte[] cache = new byte[4096];
                        int flag = 0;
                        FileOutputStream fos = new FileOutputStream(fileName);
                        tmpString = readLine(tmpBytes, readBytesLength, sis, encoding);
                        readStreamlength = readStreamlength + readBytesLength[0];
                        /*
                         * The terminator looks like a delimiter with "--" appended, but it is not:
                         * a delimiter is "-----------------------------45931489520280" followed by an invisible CRLF (0D 0A),
                         * while the terminator is "-----------------------------45931489520280--" followed by a CRLF (0D 0A).
                         * The trailing CRLF is therefore cut from the delimiter before searching for it.
                         */
                        String delimiter = boundary.substring(0, boundary.length() - 2);
                        while (tmpString != null && tmpString.indexOf(delimiter) == -1) {
                            // Hold the current line back: its trailing CRLF belongs to the
                            // delimiter if the next line turns out to be one.
                            System.arraycopy(tmpBytes, 0, cache, 0, readBytesLength[0]);
                            flag = readBytesLength[0];
                            tmpString = readLine(tmpBytes, readBytesLength, sis, encoding);
                            readStreamlength = readStreamlength + readBytesLength[0];
                            if (tmpString == null || tmpString.indexOf(delimiter) == -1) {
                                fos.write(cache, 0, flag);
                            } else {
                                // Drop the CRLF that precedes the delimiter.
                                fos.write(cache, 0, flag - 2);
                            }
                        }
                        fos.close();
                    } else {
                        // Skip the empty line.
                        readLine(tmpBytes, readBytesLength, sis, encoding);
                        readStreamlength = readStreamlength + readBytesLength[0];

                        // Read the next delimiter or the terminator.
                        tmpString = readLine(tmpBytes, readBytesLength, sis, encoding);
                        readStreamlength = readStreamlength + readBytesLength[0];
                    }
                } // Not a file upload: an ordinary form field.
                else {
                    String paramName = getPramName(contentDisposition);
                    String value = readLine(tmpBytes, readBytesLength, sis, encoding);
                    // Strip the trailing CRLF (the last two characters).
                    value = value.substring(0, value.length() - 2);
                    
                    readStreamlength = readStreamlength + readBytesLength[0];
                    paramMap.put(paramName, value);
                    tmpString = readLine(tmpBytes, readBytesLength, sis, encoding);
                    readStreamlength = readStreamlength + readBytesLength[0];
                }
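            } else {
                // The current line is not a delimiter, so no further part can be read; stop instead of looping forever.
                break;
            }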

        }
        sis.close();
        return paramMap;
    }

    /**
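     * Reads one line from the stream.
     * @param bytes buffer that receives the bytes read from the stream
     * @param index a one-element int array; index[0] receives the number of bytes actually read
     * @param sis the input stream
     * @param encoding the encoding used to build the string
     * @return the bytes read, decoded with the given encoding, or null at end of stream or on error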
     */
    private static String readLine(byte[] bytes, int[] index, ServletInputStream sis, String encoding) {
        try {
            index[0] = sis.readLine(bytes, 0, bytes.length);// readLine() stores what it reads in bytes, starting at offset 0 and reading at most bytes.length bytes; it returns the number of bytes actually read.
            if (index[0] < 0) {
                return null;
            }
        } catch (IOException e) {
            log.error("read line ioexception");
            return null;
        }
        if (encoding == null) {
            return new String(bytes, 0, index[0]);
        } else {
            try {
                return new String(bytes, 0, index[0], encoding);
            } catch (UnsupportedEncodingException ex) {
                log.error("Unsupported Encoding");
                return null;
            }
        }

    }

    private static String getPramName(String contentDisposition) {
        String s = contentDisposition.substring(contentDisposition.indexOf("name=\"") + 6);
        s = s.substring(0, s.indexOf('\"'));
        return s;
    }

    private static String getFilePath(String contentDisposition) {
        String s = contentDisposition.substring(contentDisposition.indexOf("filename=\"") + 10);
        s = s.substring(0, s.indexOf('\"'));
        return s;
    }

    private static String getFileName(String filePath) {
        String rtn = null;
        if (filePath != null) {
            int index = filePath.lastIndexOf("/");// Whether the path contains "/" is used to guess the browser.
            if (index != -1)// Contains "/": assume the file was uploaded from Firefox.
            {
                rtn = filePath.substring(index + 1);// Extract the file name.
            } else// No "/": assume the file was uploaded from IE.
            {
                index = filePath.lastIndexOf("\\");
                if (index != -1) {
                    rtn = filePath.substring(index + 1);// Extract the file name.
                } else {
                    rtn = filePath;
                }
            }
        }
        return rtn;
    }
}
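
The class above decodes every line of file content into a String just to search it for the delimiter. As a possible alternative, the comparison can be done on raw bytes, so binary data is never passed through a character decoder. The following is a minimal sketch under stated assumptions, not a drop-in replacement: the class name ByteLineDemo and all of its methods are hypothetical, a plain InputStream stands in for ServletInputStream, and the CRLF before a delimiter is assumed not to be split across two buffer-sized chunks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ByteLineDemo {
    /** Reads up to and including the next '\n', or at most buf.length bytes; returns -1 at end of stream. */
    static int readLine(InputStream in, byte[] buf) throws IOException {
        int n = 0;
        int c;
        while (n < buf.length && (c = in.read()) != -1) {
            buf[n++] = (byte) c;
            if (c == '\n') {
                break;
            }
        }
        return n == 0 ? -1 : n;
    }

    /** True if the first len bytes of line begin with the given prefix. */
    static boolean startsWith(byte[] line, int len, byte[] prefix) {
        if (len < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (line[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Copies one part body to out, stopping at the next delimiter line and dropping the CRLF before it. */
    static void copyBody(InputStream in, OutputStream out, String boundary) throws IOException {
        byte[] dashBoundary = ("--" + boundary).getBytes(StandardCharsets.ISO_8859_1);
        byte[] line = new byte[4096];
        byte[] held = new byte[4096];
        int heldLen = 0;
        int len;
        boolean atLineStart = true; // a delimiter can only begin at the start of a line
        while ((len = readLine(in, line)) != -1
                && !(atLineStart && startsWith(line, len, dashBoundary))) {
            out.write(held, 0, heldLen);            // the previous chunk is now known to be content
            System.arraycopy(line, 0, held, 0, len);
            heldLen = len;
            atLineStart = line[len - 1] == '\n';
        }
        if (len == -1) {
            out.write(held, 0, heldLen);            // stream ended without a delimiter
        } else {
            out.write(held, 0, Math.max(0, heldLen - 2)); // drop the CRLF owned by the delimiter
        }
    }

    /** Convenience wrapper for in-memory data. */
    static byte[] extractBody(byte[] raw, String boundary) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copyBody(new ByteArrayInputStream(raw), out, boundary);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

For example, calling extractBody on the bytes of "ab\r\ncd\r\n--XyZ\r\n" with the boundary "XyZ" yields the bytes of "ab\r\ncd": the line break inside the content is kept and only the CRLF in front of the delimiter is removed.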
