File Downloads: Handling the Content-Disposition Header

Background

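File names in downloads can be almost anything, and the two most common problems are garbled Chinese characters and punctuation that is not handled correctly. With a Chinese name, for example, the file name may show up percent-encoded in the %xx form, or a name containing a space may be truncated at the space. Likewise, a file name that contains an ASCII comma or exclamation mark can cause the download request to fail.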

Causes

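  1. When a file is downloaded over HTTP, the server has to set Content-Type and Content-Disposition in the HTTP response headers. The former describes the file type; the latter specifies the name the file is saved under and how that name is encoded.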

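  2. According to RFC 3986, special characters in a URL are escaped to the "%xx" form (a percent sign followed by two hexadecimal digits). The relevant text reads: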

A percent-encoding mechanism is used to represent a data octet in a component when that octet’s corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component. A percent-encoded octet is encoded as a character triplet, consisting of the percent character “%” followed by the two hexadecimal digits representing that octet’s numeric value. For example, “%20” is the percent-encoding for the binary octet “00100000” (ABNF: %x20), which in US-ASCII corresponds to the space character (SP). Section 2.4 describes when percent-encoding and decoding is applied.
pct-encoded = "%" HEXDIG HEXDIG
The uppercase hexadecimal digits ‘A’ through ‘F’ are equivalent to the lowercase digits ‘a’ through ‘f’, respectively. If two URIs differ only in the case of hexadecimal digits used in percent-encoded octets, they are equivalent. For consistency, URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings.
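To make the quoted rule concrete, here is a minimal sketch of percent-encoding in Java. It assumes the input is first converted to UTF-8 octets (RFC 3986 itself only defines how an octet is written) and treats only the RFC 3986 "unreserved" characters as safe. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class PercentEncodingSketch {
    // "Unreserved" characters from RFC 3986 section 2.3; everything else is escaped here.
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Hypothetical helper: writes each UTF-8 octet either as itself or as "%" HEXDIG HEXDIG.
    static String percentEncode(String input) {
        StringBuilder out = new StringBuilder();
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF; // view the signed byte as an unsigned octet
            if (octet < 0x80 && UNRESERVED.indexOf((char) octet) >= 0) {
                out.append((char) octet);
            } else {
                // Uppercase hex digits, as the RFC recommends for consistency.
                out.append('%').append(String.format("%02X", octet));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncode("a b,c!.txt"));       // a%20b%2Cc%21.txt
        System.out.println(percentEncode("\u4e2d\u6587.txt")); // %E4%B8%AD%E6%96%87.txt
    }
}
```

The space, comma and exclamation mark each become one triplet, while each Chinese character becomes three triplets because it occupies three octets in UTF-8.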

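  3. HTTP header values have historically been interpreted as ISO-8859-1, so characters outside that range, such as Chinese, cannot be placed in a header directly. ISO-8859-1 is a single-byte encoding that is backward compatible with ASCII. Its range is 0x00-0xFF: 0x00-0x7F is identical to ASCII, 0x80-0x9F holds control characters, and 0xA0-0xFF holds letters and symbols. RFC 7230 describes the situation as follows: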

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.
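The effect of that limitation can be sketched in a few lines of Java. The example below assumes a header value is written to the wire as ISO-8859-1 bytes and read back the same way; it is only an illustration of the charset behavior, not of any particular server's code, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class Latin1HeaderSketch {
    // Hypothetical helper: simulates sending a header value as ISO-8859-1 and reading it back.
    static String roundTripLatin1(String value) {
        // Characters that ISO-8859-1 cannot represent are replaced with '?' by getBytes.
        byte[] wire = value.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // True when the value uses only US-ASCII, which is what new header fields should stick to.
    static boolean isUsAscii(String value) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(value);
    }

    public static void main(String[] args) {
        System.out.println(roundTripLatin1("report.txt"));       // report.txt
        System.out.println(roundTripLatin1("\u4e2d\u6587.txt")); // ??.txt
        System.out.println(isUsAscii("\u4e2d\u6587.txt"));       // false
    }
}
```

An ASCII name survives unchanged, while the Chinese name is lost, which is why a non-ASCII file name has to be encoded into ASCII before it goes into the header.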

Solutions

1. Specify the encoding in the header. URLEncoder escapes a space as "+", so each "+" in its output has to be converted to "%20".

response.addHeader(HttpHeaders.CONTENT_DISPOSITION, "attachment;filename*=UTF-8''" + URLEncoder.encode(filename, "UTF-8").replaceAll("\\+", "%20"));
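The same header value can be built in a small standalone helper, which makes it easy to check what actually ends up in the header. This is only a sketch of the line above with the servlet response left out; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class ContentDispositionSketch {
    // Hypothetical helper: builds the Content-Disposition value used in the line above.
    static String attachmentHeaderValue(String filename) {
        try {
            // URLEncoder writes a space as "+"; a literal "+" in the name is already
            // escaped as %2B, so replacing "+" afterwards only touches former spaces.
            String encoded = URLEncoder.encode(filename, "UTF-8").replace("+", "%20");
            return "attachment;filename*=UTF-8''" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(attachmentHeaderValue("my report,v1!.txt"));
        // attachment;filename*=UTF-8''my%20report%2Cv1%21.txt
        System.out.println(attachmentHeaderValue("\u4e2d\u6587.txt"));
        // attachment;filename*=UTF-8''%E4%B8%AD%E6%96%87.txt
    }
}
```

The resulting value contains only ASCII characters, so it fits within what a header can carry, and the space, comma and exclamation mark from the Background section are all escaped.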

2. Specify the Tomcat encoding

  

3. Add a character-encoding filter to the application

      
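    <filter>
        <filter-name>springUtf8Encoding</filter-name>
        <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
        <init-param>
            <param-name>forceEncoding</param-name>
            <param-value>true</param-value>
        </init-param>
    </filter>
    <filter-mapping>
        <filter-name>springUtf8Encoding</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>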

References:
blog.robotshell.org/2012/deal-with-http-header-encoding-for-file-download/
borninsummer.com/2016/12/07/http-charset/
