Using Jsoup to Prevent XSS Attacks in Spring Boot 2.x

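Backend applications routinely accept user-supplied text such as comments and replies. Apart from scenarios where a specific set of rich-text tags and attributes is acceptable (for example b, ul, li, h1, h2, h3...), dangerous characters and tags must be filtered out to prevent XSS attacks.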

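I. What is XSS?

XSS (cross-site scripting) is an attack in which malicious script is injected into a page and then executed in the browsers of users who view it. With that, you should have a general idea of the concept.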

II. Principles

  1. Never trust user input or request parameters (text, uploads, everything).
  2. See rule 1.

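III. Implementation

Filter the content according to the specific business scenario. Here we use Jsoup.

jsoup is an HTML parser for Java. Its Whitelist class filters text content by removing disallowed tags and attributes while keeping the rich-text formatting you need.
For example, if the whitelist allows the b tag (and does not allow any attributes on it), a piece of HTML is transformed by the filter as follows: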

Before filtering:

<b style="xxx" onclick="<script>alert(0);</script>">abc</b>

After filtering:

<b>abc</b>
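As a rough illustration of the b-only whitelist described above, the sketch below assumes jsoup 1.13.1 (the version declared in the pom.xml of this post) is on the classpath. The class name BoldOnlyDemo and the sample input are made up for illustration. Newer jsoup releases rename Whitelist to Safelist, so the import may need adjusting there.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

class BoldOnlyDemo {

    /** Cleans HTML with a whitelist that permits only the b tag, with no attributes. */
    static String cleanBoldOnly(String html) {
        // none() starts from "text only"; addTags("b") then allows <b> without any attributes.
        Whitelist boldOnly = Whitelist.none().addTags("b");
        return Jsoup.clean(html, boldOnly);
    }

    public static void main(String[] args) {
        String dirty = "<b style=\"xxx\" onclick=\"alert(0);\">abc</b>";
        System.out.println(cleanBoldOnly(dirty)); // prints <b>abc</b>
    }
}
```

Tags that are not on the whitelist are dropped while their text is kept, so an i element around the same text would come out as plain abc.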

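Main Whitelist methods

Method | Description
addAttributes(String tag, String... attributes) | Allows attributes on a tag. tag is the tag name and attributes are the names of the attributes to allow. For example, addAttributes("a", "href", "class") allows the a tag to carry the href and class attributes. To allow a set of attributes on every tag, use :all. For example, addAttributes(":all", "class") allows the class attribute on every tag.
addEnforcedAttribute(String tag, String attribute, String value) | Adds an enforced attribute to a tag; if the tag already has that attribute, the existing value is overwritten. tag is the tag name, attribute is the attribute name, and value is the attribute value. For example, addEnforcedAttribute("a", "rel", "nofollow") means every a tag in the output carries rel="nofollow".
addProtocols(String tag, String key, String... protocols) | Sets the protocols allowed for a URL attribute. For example, addProtocols("a", "href", "ftp", "http", "https") means the href attribute of the a tag may point to ftp, http, or https URLs.
addTags(String... tags) | Adds tags to the Whitelist.
basic() | Allowed tags: a, b, blockquote, br, cite, code, dd, dl, dt, em, i, li, ol, p, pre, q, small, strike, strong, sub, sup, u, ul, plus appropriate attributes. Links in a tags may use http, https, ftp, or mailto, and rel=nofollow is enforced on them in the output. Images are not allowed.
basicWithImages() | Same as basic(), plus the img tag with src pointing to http or https image URLs.
none() | Keeps text only; all HTML is removed.
preserveRelativeLinks(boolean preserve) | false (default): relative URLs are not preserved; true: relative URLs are preserved.
relaxed() | Allowed tags: a, b, blockquote, br, caption, cite, code, col, colgroup, dd, dl, dt, em, h1, h2, h3, h4, h5, h6, i, img, li, ol, p, pre, q, small, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, u, ul. The rel=nofollow attribute is not enforced; add it manually if needed.
simpleText() | Allows only b, em, i, strong, u.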
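The methods above can be chained to build a custom whitelist. The following is a minimal sketch, again assuming jsoup 1.13.1; the class name CommentWhitelistDemo, the chosen tag set, and the sample links are hypothetical and only meant to show how addTags, addAttributes, addProtocols, and addEnforcedAttribute combine.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

class CommentWhitelistDemo {

    /** A hypothetical whitelist for comment text: a few inline tags plus http(s) links. */
    static Whitelist commentWhitelist() {
        return Whitelist.none()
                .addTags("b", "i", "a")                        // tags that survive cleaning
                .addAttributes("a", "href")                    // <a> may carry href only
                .addProtocols("a", "href", "http", "https")    // href must be an http(s) URL
                .addEnforcedAttribute("a", "rel", "nofollow"); // always emitted on <a>
    }

    static String clean(String html) {
        return Jsoup.clean(html, commentWhitelist());
    }

    public static void main(String[] args) {
        // onclick is not whitelisted and is removed; rel="nofollow" is added.
        System.out.println(clean("<a href=\"http://example.com/\" onclick=\"alert(1)\">link</a>"));
        // The javascript: protocol is not allowed, so the href attribute is dropped.
        System.out.println(clean("<a href=\"javascript:alert(1)\">link</a>"));
    }
}
```

Starting from none() and adding only what a given field needs tends to be easier to reason about than starting from relaxed() and removing tags, although the right starting point depends on the business scenario.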

IV. Example

Based on Spring Boot.

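pom.xml dependencies

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.13.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
        </dependency>

        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.6</version>
        </dependency>
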
HtmlFilter filter class

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;

import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.List;

/**
 * HtmlFilter
 *
 * @author 撸小鱼
 * Created on 2020-04-12
 */
public class HtmlFilter {

    /**
     * Defaults to relaxed().
     * Allowed tags: a, b, blockquote, br, caption, cite, code, col, colgroup, dd, dl, dt, em, h1, h2, h3, h4, h5, h6, i, img, li, ol, p, pre, q, small, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, u, ul. The rel=nofollow attribute is not enforced; add it manually if needed.
     */
    private Whitelist whiteList;


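    /**
     * Output settings for the cleaned HTML; pretty-printing is off unless requested.
     */
    private Document.OutputSettings outputSettings;


    private HtmlFilter() {
    }

    /**
     * Static factory method for HtmlFilter.
     * @param whiteList whitelist of allowed tags; relaxed() is used when null
     * @param pretty whether to pretty-print the output
     * @return HtmlFilter
     */
    public static HtmlFilter create(Whitelist whiteList, boolean pretty) {
        HtmlFilter filter = new HtmlFilter();
        // Use the caller's whitelist when given; fall back to relaxed() only when it is null.
        filter.whiteList = (whiteList != null) ? whiteList : Whitelist.relaxed();
        filter.outputSettings = new Document.OutputSettings().prettyPrint(pretty);
        return filter;
    }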

    /**
     * Static factory method for HtmlFilter, using the default relaxed() whitelist.
     * @return HtmlFilter
     */
    public static HtmlFilter create() {
        return create(null, false);
    }

    /**
     * Static factory method for HtmlFilter.
     * @param whiteList whitelist of allowed tags
     * @return HtmlFilter
     */
    public static HtmlFilter create(Whitelist whiteList) {
        return create(whiteList, false);
    }

    /**
     * Static factory method for HtmlFilter.
     * @param excludeTags exception tags, allowed in addition to the defaults
     * @param includeTags tags to filter out
     * @param pretty      whether to pretty-print the output
     * @return HtmlFilter
     */
    public static HtmlFilter create(List<String> excludeTags, List<String> includeTags, boolean pretty) {
        HtmlFilter filter = create(null, pretty);
        // Tags to filter out: remove them from the whitelist
        if (includeTags != null && !includeTags.isEmpty()) {
            String[] tags = includeTags.toArray(new String[0]);
            filter.whiteList.removeTags(tags);
        }
        // Exception tags: add them to the whitelist
        if (excludeTags != null && !excludeTags.isEmpty()) {
            String[] tags = excludeTags.toArray(new String[0]);
            filter.whiteList.addTags(tags);
        }
        return filter;
    }


    /**
     * Static factory method for HtmlFilter.
     * @param excludeTags exception tags, allowed in addition to the defaults
     * @param includeTags tags to filter out
     * @return HtmlFilter
     */
    public static HtmlFilter create(List<String> excludeTags, List<String> includeTags) {
        // Pass the arguments in the same order as the three-argument overload expects.
        return create(excludeTags, includeTags, false);
    }

    /**
     * @param content content to filter
     * @return the filtered String
     */
    public String clean(String content) {
        return Jsoup.clean(content, "", this.whiteList, this.outputSettings);
    }

    public static void main(String[] args) {
        String text = "<b style=\"xxx\" onclick=\"<script>alert(0);</script>\">abc</b>";
        System.out.println(HtmlFilter.create().clean(text));
    }


}
XssFilter servlet filter

import org.apache.commons.lang3.StringUtils;

import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * XssFilter
 *
 * @author 撸小鱼
 * Created on 2020-04-12
 */
public class XssFilter implements Filter {

    /**
     * URL patterns excluded from filtering
     */
    private List<String> excludeUrls = new ArrayList<>();

    /**
     * Exception tags (allowed)
     */
    private List<String> excludeTags = new ArrayList<>();

    /**
     * Tags to filter out
     */
    private List<String> includeTags = new ArrayList<>();

    /**
     * On/off switch
     */
    private boolean enabled = false;

    /**
     * Character encoding
     */
    private String encoding = "UTF-8";

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        String enabledStr = filterConfig.getInitParameter("enabled");
        // Despite its name, the "urlPatterns" init parameter holds the URLs to EXCLUDE from filtering.
        String excludeUrlStr = filterConfig.getInitParameter("urlPatterns");
        String excludeTagStr = filterConfig.getInitParameter("excludes");
        String includeTagStr = filterConfig.getInitParameter("includes");
        String encodingStr = filterConfig.getInitParameter("encoding");

        // Each list parameter is a comma-separated string.
        if (StringUtils.isNotEmpty(excludeUrlStr)) {
            String[] urls = excludeUrlStr.split(",");
            Collections.addAll(this.excludeUrls, urls);
        }

        if (StringUtils.isNotEmpty(excludeTagStr)) {
            String[] tags = excludeTagStr.split(",");
            Collections.addAll(this.excludeTags, tags);
        }

        if (StringUtils.isNotEmpty(includeTagStr)) {
            String[] tags = includeTagStr.split(",");
            Collections.addAll(this.includeTags, tags);
        }

        if (StringUtils.isNotEmpty(enabledStr)) {
            this.enabled = Boolean.parseBoolean(enabledStr);
        }

        if (StringUtils.isNotEmpty(encodingStr)) {
            this.encoding = encodingStr;
        }

    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        HttpServletResponse resp = (HttpServletResponse) response;
        if (handleExcludeUrls(req, resp)) {
            chain.doFilter(request, response);
            return;
        }

        XssHttpServletRequestWrapper xssRequest = new XssHttpServletRequestWrapper((HttpServletRequest) request, encoding, excludeTags, includeTags );
        chain.doFilter(xssRequest, response);

    }

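    /**
     * Decides whether the request bypasses XSS filtering.
     *
     * @return true if the filter is disabled or the servlet path matches an excluded pattern
     */
    private boolean handleExcludeUrls(HttpServletRequest request, HttpServletResponse response) {
        if (!enabled) {
            return true;
        }
        if (excludeUrls == null || excludeUrls.isEmpty()) {
            return false;
        }
        String url = request.getServletPath();
        for (String pattern : excludeUrls) {
            // Each entry is treated as a regular expression anchored at the start of the path.
            Pattern p = Pattern.compile("^" + pattern);
            Matcher m = p.matcher(url);
            if (m.find()) {
                return true;
            }
        }
        return false;
    }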
}
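The exclusion check in handleExcludeUrls treats every configured entry as a regular expression anchored at the start of the servlet path. The sketch below reproduces that matching with the JDK only, so its behaviour can be checked in isolation. The class name ExcludeUrlDemo and the sample paths are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ExcludeUrlDemo {

    /** Returns true when any pattern matches at the start of the path, mirroring handleExcludeUrls. */
    static boolean isExcluded(String servletPath, List<String> excludePatterns) {
        for (String pattern : excludePatterns) {
            // "^" anchors the match at the start; find() does not require the whole path to match.
            if (Pattern.compile("^" + pattern).matcher(servletPath).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList("/static/", "/api/upload");
        System.out.println(isExcluded("/static/app.js", patterns));     // true
        System.out.println(isExcluded("/api/upload/avatar", patterns)); // true (prefix match)
        System.out.println(isExcluded("/demo/th/xss", patterns));       // false
        // A servlet-style pattern is not a regex prefix: "^/*" means "zero or more slashes",
        // which matches every path.
        System.out.println(isExcluded("/demo/th/xss", Arrays.asList("/*"))); // true
    }
}
```

The last case is worth keeping in mind: passing a servlet mapping such as /* as an exclusion entry would exclude every request from filtering.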
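In most cases, arguments are passed as request parameters.
In some scenarios, however, they arrive in the request body (JSON, for example). What should we do then?
In that case we need to filter the request's inputStream.

One thing to watch out for:
A servlet inputStream can be read only once. After the XSS filter has read the stream, any later logic that reads the inputStream again will fail with an exception. So we need a way to put the already-read content back into the request.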
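The buffer-and-replay idea can be tried without a servlet container. The sketch below uses only the JDK: a ByteArrayInputStream stands in for the one-shot request stream, its bytes are cached once, and each later consumer gets a fresh stream over the cached bytes. The class name ReplayableBodyDemo and the sample JSON are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReplayableBodyDemo {

    /** Reads a stream to the end and returns everything that was read. */
    static byte[] drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the request body stream, which can be consumed only once.
        InputStream oneShot = new ByteArrayInputStream(
                "{\"name\":\"abc\"}".getBytes(StandardCharsets.UTF_8));

        byte[] cached = drain(oneShot);             // the only read of the original stream
        System.out.println(drain(oneShot).length);  // prints 0: the original is exhausted

        // Every consumer receives its own stream over the cached bytes.
        String first = new String(drain(new ByteArrayInputStream(cached)), StandardCharsets.UTF_8);
        String second = new String(drain(new ByteArrayInputStream(cached)), StandardCharsets.UTF_8);
        System.out.println(first.equals(second));   // prints true
    }
}
```

The request wrapper that follows applies the same idea, with the filtering step inserted between reading the body and serving it again.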

import org.apache.commons.io.IOUtils;
import org.apache.commons.lang3.StringUtils;

import javax.servlet.ReadListener;
import javax.servlet.ServletInputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;


/**
 * Request wrapper that applies XSS filtering
 * @author 撸小鱼
 */
public class XssHttpServletRequestWrapper extends HttpServletRequestWrapper{

    HttpServletRequest orgRequest;

    String encoding;

    HtmlFilter htmlFilter;

    private final static String JSON_CONTENT_TYPE =  "application/json";

    private final static String CONTENT_TYPE = "Content-Type";


    /**
     * @param request  HttpServletRequest
     * @param encoding character encoding
     * @param excludeTags exception tags (allowed)
     * @param includeTags tags to filter out
     */
    public XssHttpServletRequestWrapper( HttpServletRequest request, String encoding, List<String> excludeTags, List<String> includeTags ){
        super( request );
        orgRequest = request;
        this.encoding = encoding;
        this.htmlFilter = HtmlFilter.create( excludeTags, includeTags );
    }

    /**
     *
     * @param request HttpServletRequest
     * @param encoding character encoding
     */
    public XssHttpServletRequestWrapper( HttpServletRequest request, String encoding ){
        this( request, encoding, null, null  );
    }

    private String xssFilter( String input ){
        return htmlFilter.clean( input );
    }

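    @Override
    public ServletInputStream getInputStream() throws IOException{
        // Non-JSON bodies pass through untouched. A prefix check is used so that
        // values such as "application/json;charset=UTF-8" are also recognised.
        if( !StringUtils.startsWithIgnoreCase( super.getHeader( CONTENT_TYPE ), JSON_CONTENT_TYPE ) ){
            return super.getInputStream();
        }
        InputStream in = super.getInputStream();
        String body = IOUtils.toString( in, encoding );
        IOUtils.closeQuietly( in );

        // Blank body: nothing to filter, but the original stream is already consumed,
        // so serve the bytes that were read instead of the exhausted stream.
        if( StringUtils.isBlank( body ) ){
            return new RequestCachingInputStream( body.getBytes( encoding ) );
        }

        // XSS filtering
        body = xssFilter( body );
        return new RequestCachingInputStream( body.getBytes( encoding ) );

    }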

    @Override
    public String getParameter( String name ){
        String value = super.getParameter( xssFilter( name ) );
        if( StringUtils.isNotBlank( value ) ){
            value = xssFilter( value );
        }
        return value;
    }

    @Override
    public String[] getParameterValues( String name ){
        String[] parameters = super.getParameterValues( name );
        if( parameters == null || parameters.length == 0 ){
            return null;
        }

        for( int i = 0; i < parameters.length; i++ ){
            parameters[i] = xssFilter( parameters[i] );
        }
        return parameters;
    }

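    @Override
    public Map<String, String[]> getParameterMap(){
        Map<String, String[]> map = new LinkedHashMap<>();
        Map<String, String[]> parameters = super.getParameterMap();
        for( Map.Entry<String, String[]> entry : parameters.entrySet() ){
            // Filter a copy so the container's own arrays are left unmodified.
            String[] values = entry.getValue().clone();
            for( int i = 0; i < values.length; i++ ){
                values[i] = xssFilter( values[i] );
            }
            map.put( entry.getKey(), values );
        }
        return map;
    }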

    @Override
    public String getHeader( String name ){
        String value = super.getHeader( xssFilter( name ) );
        if( StringUtils.isNotBlank( value ) ){
            value = xssFilter( value );
        }
        return value;
    }

    /**
     * Returns the original, unwrapped request.
     */
    public HttpServletRequest getOrgRequest(){
        return orgRequest;
    }

    /**
     * Returns the original, unwrapped request.
     *
     * @param request HttpServletRequest
     */
    public static HttpServletRequest getOrgRequest( HttpServletRequest request ){
        if( request instanceof XssHttpServletRequestWrapper ){
            return ((XssHttpServletRequestWrapper) request).getOrgRequest();
        }
        return request;
    }

    /**
     * A servlet inputStream can be read only once and cannot be read again afterwards.
     * After the body has been XSS-filtered, this class serves it again as a ServletInputStream.
     */
    private static class RequestCachingInputStream extends ServletInputStream {

        private final ByteArrayInputStream inputStream;

        public RequestCachingInputStream(byte[] bytes) {
            inputStream = new ByteArrayInputStream(bytes);
        }

        @Override
        public int read() throws IOException {
            return inputStream.read();
        }

        @Override
        public boolean isFinished() {
            return inputStream.available() == 0;
        }

        @Override
        public boolean isReady() {
            return true;
        }

        @Override
        public void setReadListener( ReadListener readListener ){
        }
    }
}

Registering the Filter in Spring Boot 2.2.4.RELEASE

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import javax.servlet.DispatcherType;
import java.util.HashMap;
import java.util.Map;

@Configuration
public class XssFilterConfig {

    @Value("${xss.enabled:true}")
    private String enabled;

    @Value("${xss.excludes:}")
    private String excludes;

    @Value("${xss.includes:}")
    private String includes;

    @Value("${xss.urlPatterns:/*}")
    private String urlPatterns;

    @Bean
    public FilterRegistrationBean<XssFilter> xssFilterRegistrationBean() {
        FilterRegistrationBean<XssFilter> registration = new FilterRegistrationBean<>();
        registration.setDispatcherTypes(DispatcherType.REQUEST);
        registration.setFilter(new XssFilter());
        // URL patterns the filter is mapped to
        registration.addUrlPatterns(urlPatterns.split(","));
        registration.setName("XssFilter");
        registration.setOrder(Integer.MAX_VALUE);
        Map<String, String> initParameters = new HashMap<>();
        initParameters.put("excludes", excludes);
        initParameters.put("includes", includes);
        initParameters.put("enabled", enabled);
        registration.setInitParameters(initParameters);
        return registration;
    }
}
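The excludes and includes values are comma-separated strings that XssFilter.init splits with a plain split(","), so surrounding whitespace is kept as part of each tag name. If property values such as "div, span" are expected, a more tolerant parser may be useful. The following is only a sketch; parseCsv and CsvInitParamDemo are hypothetical names, not part of the code above.

```java
import java.util.ArrayList;
import java.util.List;

class CsvInitParamDemo {

    /** Splits a comma-separated value, trimming whitespace and dropping empty entries. */
    static List<String> parseCsv(String raw) {
        List<String> result = new ArrayList<>();
        if (raw == null) {
            return result;
        }
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseCsv("div, span ,,section")); // prints [div, span, section]
        System.out.println(parseCsv(""));                    // prints []
    }
}
```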

Test

http://localhost:8080/demo/th/xss?abc=%3Ca%20href=%22http://www.baidu.com/a%22%20onclick=%22alert(1);%22%3Eabc%3C/a%3E%3Cscript%3Ealert(0);%3C/script%3E&abc=%3Cb%20style=%22xxx%22%20onclick=%22%3Cscript%3Ealert(0);%3C/script%3E%22%3Eabc%3C/%3E
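
The query string above carries two URL-encoded values for the parameter abc. To read the payloads in plain form, they can be decoded with the JDK's URLDecoder, as in the sketch below. The class name DecodeTestUrlDemo is illustrative; the two encoded strings are copied from the test URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeTestUrlDemo {

    /** URL-decodes a query-string value as UTF-8. */
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String first = "%3Ca%20href=%22http://www.baidu.com/a%22%20onclick=%22alert(1);%22%3Eabc%3C/a%3E"
                + "%3Cscript%3Ealert(0);%3C/script%3E";
        String second = "%3Cb%20style=%22xxx%22%20onclick=%22%3Cscript%3Ealert(0);%3C/script%3E%22%3Eabc%3C/%3E";
        // prints <a href="http://www.baidu.com/a" onclick="alert(1);">abc</a><script>alert(0);</script>
        System.out.println(decode(first));
        // prints <b style="xxx" onclick="<script>alert(0);</script>">abc</>
        System.out.println(decode(second));
    }
}
```

With the filter enabled, the controller should receive these values with the script element and the onclick and style attributes removed.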
