Guessing the encoding from garbled text

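When garbled text (mojibake) shows up in Java, it helps to know which charset the text was encoded with and which charset it was then decoded with, because that lets you fix the encoding settings quickly. But garbled text all looks like garbage, so how do you tell which two charsets were involved? I had some spare time today, so I wrote a small demo to try it out. The code and the results are below: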

    import java.nio.charset.Charset;

    import org.apache.commons.lang3.StringUtils;

    public class CharsetDemo {
        public static void main(String[] args) {
            String source = "中文测试";
            Charset gbkCharset = Charset.forName("gbk");
            Charset utf8Charset = Charset.forName("utf-8");
            Charset iso88591Charset = Charset.forName("iso-8859-1");

            Charset defaultCharset = Charset.defaultCharset();
            System.out.printf("defaultCharset:%s%n", defaultCharset);
            System.out.println(StringUtils.repeat("=", 20));

            // Each label reads "A=>B": encode the source with A, then decode those bytes with B.
            String str1 = StringUtils.toEncodedString(source.getBytes(gbkCharset), utf8Charset);
            System.out.printf("gbk=>utf-8:%s%n", str1);

            String str4 = StringUtils.toEncodedString(source.getBytes(utf8Charset), gbkCharset);
            System.out.printf("utf-8=>gbk:%s%n", str4);

            // ISO-8859-1 cannot represent Chinese characters, so getBytes substitutes '?' for each one.
            String str2 = StringUtils.toEncodedString(source.getBytes(iso88591Charset), utf8Charset);
            System.out.printf("iso8859-1=>utf-8:%s%n", str2);

            String str5 = StringUtils.toEncodedString(source.getBytes(utf8Charset), iso88591Charset);
            System.out.printf("utf-8=>iso8859-1:%s%n", str5);

            String str3 = StringUtils.toEncodedString(source.getBytes(gbkCharset), iso88591Charset);
            System.out.printf("gbk=>iso8859-1:%s%n", str3);

            String str6 = StringUtils.toEncodedString(source.getBytes(iso88591Charset), gbkCharset);
            System.out.printf("iso8859-1=>gbk:%s%n", str6);
        }
    }

Output:

    defaultCharset:UTF-8
    =========================
    gbk=>utf-8:���IJ���
    utf-8=>gbk:涓枃娴嬭瘯
    iso8859-1=>utf-8:????
    utf-8=>iso8859-1:ä¸ææµè¯
    gbk=>iso8859-1:ÖÐÎÄ²âÊÔ
    iso8859-1=>gbk:????
    =========================
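
To see why each combination garbles the way it does, it can help to look at the raw bytes each charset produces for the test string. The following is a minimal sketch using only the JDK; the class name `EncodedBytesDemo` and the helper `hex` are made up for illustration, and the test string is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedBytesDemo {
    // Render the bytes produced by encoding `text` with `charset` as space-separated hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The article's four-character test string, written as Unicode escapes.
        String source = "\u4e2d\u6587\u6d4b\u8bd5";
        System.out.println("gbk:        " + hex(source, Charset.forName("GBK")));
        System.out.println("utf-8:      " + hex(source, StandardCharsets.UTF_8));
        System.out.println("iso-8859-1: " + hex(source, StandardCharsets.ISO_8859_1));
    }
}
```

GBK uses two bytes per character here (`D6 D0 CE C4 B2 E2 CA D4`) and UTF-8 uses three (`E4 B8 AD E6 96 87 E6 B5 8B E8 AF 95`). ISO-8859-1 cannot represent these characters at all, so `getBytes` substitutes `3F` (`?`) for each one. That is why both `iso8859-1=>` results are `????`: the original bytes are already gone, and no choice of decoding charset can bring them back.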

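I wrote the demo in IDEA, and the project's source files default to UTF-8. In this run each combination leaves a recognizable fingerprint: GBK bytes read as UTF-8 turn mostly into the replacement character `�`, UTF-8 bytes read as GBK turn into rare-looking Chinese characters, anything read as ISO-8859-1 turns into accented Latin letters and symbols, and anything encoded as ISO-8859-1 collapses to one `?` per character. So can we use the look of the garbled characters to roughly work out the charset used for encoding and the one used for decoding, and then adjust the encoding accordingly?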
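
One way to test a guess is to run the garbling in reverse: turn the garbled text back into bytes with the charset that was wrongly used to decode it, then decode those bytes with the charset that was originally used to encode. The sketch below tries every pair from the same three charsets and keeps the candidates that pass a crude filter. The class `MojibakeGuesser`, its method names, and the filter (no `?`, no U+FFFD, at least one Han character) are all assumptions made for illustration, not an established detection algorithm. The filter can let false positives through, so a person still has to read the candidates.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class MojibakeGuesser {
    static final String[] CANDIDATES = {"GBK", "UTF-8", "ISO-8859-1"};

    // Undo one wrong decode: re-encode with the charset that was wrongly used
    // for decoding, then decode with the charset the bytes were originally in.
    static String repair(String garbled, Charset wronglyDecodedAs, Charset originallyEncodedAs) {
        return new String(garbled.getBytes(wronglyDecodedAs), originallyEncodedAs);
    }

    // Crude filter (an assumption): reject text containing '?' or U+FFFD,
    // and require at least one Han character.
    static boolean looksLikeChinese(String s) {
        if (s.indexOf('\uFFFD') >= 0 || s.indexOf('?') >= 0) {
            return false;
        }
        return s.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    // Try every (encoded, decoded) pair and return the candidates that pass the
    // filter, labelled "encoded=>decoded" like the demo output in the article.
    static List<String> guess(String garbled) {
        List<String> hits = new ArrayList<>();
        for (String enc : CANDIDATES) {
            for (String dec : CANDIDATES) {
                if (enc.equals(dec)) {
                    continue;
                }
                String repaired = repair(garbled, Charset.forName(dec), Charset.forName(enc));
                if (looksLikeChinese(repaired)) {
                    hits.add(enc + "=>" + dec + ": " + repaired);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // GBK bytes of the article's test string read as ISO-8859-1, as Unicode escapes.
        String garbled = "\u00D6\u00D0\u00CE\u00C4\u00B2\u00E2\u00CA\u00D4";
        guess(garbled).forEach(System.out::println);
    }
}
```

The reversal only works when the wrong decode lost no information. Reading bytes as ISO-8859-1 is lossless because every byte value maps to a character, so those cases can be recovered. Once a step has produced `?` or `�`, the original bytes have been replaced and the text cannot be reconstructed.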

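Disclaimer:

I have not written systematic unit tests for this, and I do not know how reliable the approach is, so treat it as a reference only.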

Reposted from: https://juejin.im/post/5abf1122f265da23986763d3
