Garbled Output When Calling a Python Script from Java Process: A Pitfall Diary

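Background:

Our project needs Java to call a Python script to perform some operations, so I used Process to invoke a shell script, and the shell script in turn runs the Python script.

The problem:

One day, out of the blue, our tester reported that all the Chinese text in the Python log output was garbled, and so...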

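Approach:

1. Why the text is garbled:

Garbled text always comes down to one thing: the encoding used to write the bytes does not match the encoding used to read them (for example, writing in GB2312 and reading as UTF-8).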
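To make the mismatch concrete, here is a minimal sketch. The class name EncodingMismatchDemo and the helper roundTrip are made up for illustration, and ISO-8859-1 stands in for "the wrong charset" only because every JVM is guaranteed to ship it. The sample text is written with Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Encode text with one charset, then decode the resulting bytes with another.
    // (Hypothetical helper, for illustration only.)
    static String roundTrip(String text, Charset writeCharset, Charset readCharset) {
        byte[] bytes = text.getBytes(writeCharset);
        return new String(bytes, readCharset);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character Chinese word for "Chinese".
        String original = "\u4e2d\u6587";

        // Same charset on both sides: the text survives unchanged.
        String matched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(matched.equals(original));

        // Different charsets: the same bytes are misread and the text is garbled.
        String mismatched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mismatched.equals(original));
    }
}
```

The bytes never change between the two calls; only the charset used to interpret them does, which is why the fix has to happen at whichever end is using the wrong charset.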

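2. Solutions

1) Since the encodings don't match, the obvious fix is to convert the garbled text to the right encoding. But a conversion requires knowing both the encoding before and the encoding after. The Python script is encoded in UTF-8, but there is no way to specify the encoding of the stream that Java's Process gives you. The source is as follows: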

    /**
     * Returns the input stream connected to the normal output of the
     * process.  The stream obtains data piped from the standard
     * output of the process represented by this {@code Process} object.
     *
     * <p>If the standard output of the process has been redirected using
     * {@link ProcessBuilder#redirectOutput(Redirect)
     * ProcessBuilder.redirectOutput}
     * then this method will return a
     * null input stream.
     *
     * <p>Otherwise, if the standard error of the process has been
     * redirected using
     * {@link ProcessBuilder#redirectErrorStream(boolean)
     * ProcessBuilder.redirectErrorStream}
     * then the input stream returned by this method will receive the
     * merged standard output and the standard error of the process.
     *
     * <p>Implementation note: It is a good idea for the returned
     * input stream to be buffered.
     *
     * @return the input stream connected to the normal output of the
     *         process
     */
    public abstract InputStream getInputStream();

    /**
     * Returns the input stream connected to the error output of the
     * process.  The stream obtains data piped from the error output
     * of the process represented by this {@code Process} object.
     *
     * <p>If the standard error of the process has been redirected using
     * {@link ProcessBuilder#redirectError(Redirect)
     * ProcessBuilder.redirectError} or
     * {@link ProcessBuilder#redirectErrorStream(boolean)
     * ProcessBuilder.redirectErrorStream}
     * then this method will return a
     * null input stream.
     *
     * <p>Implementation note: It is a good idea for the returned
     * input stream to be buffered.
     *
     * @return the input stream connected to the error output of
     *         the process
     */
    public abstract InputStream getErrorStream();

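So I specified the encoding on a reader wrapped around the InputStream, like this:

BufferedReader reader = new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8)); // decode as UTF-8

The result: still garbled. Thinking it over, the reason is this: the Python script itself is UTF-8, but what the reader produced was already garbled when printed to the console, which means the bytes arriving were no longer UTF-8. Forcing them to be decoded as UTF-8 was bound to give garbage again. This looked like a dead end, so I tried a different approach.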
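For reference, the attempt above can be sketched as a small helper. The names ProcessOutputReader and readLines are hypothetical, and the ByteArrayInputStream in main is only a stand-in for process.getInputStream() so that the sketch runs without launching anything. It shows the limit of this approach: decoding as UTF-8 gives the right text only if the bytes coming out of the child really are UTF-8.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ProcessOutputReader {

    // Read every line from a stream, decoding the bytes with the given charset.
    // (Hypothetical helper; in real use the stream would be process.getInputStream().)
    static List<String> readLines(InputStream stream, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a child process that really does write UTF-8 bytes.
        byte[] utf8Bytes = "\u4e2d\u6587\nok\n".getBytes(StandardCharsets.UTF_8);
        List<String> lines = readLines(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        System.out.println(lines.size());
    }
}
```

If the child writes its output in some other encoding, no choice of charset on the Java side can be correct for UTF-8 text, so the fix has to be made where the bytes are produced.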

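2) Reading the comments in the Process source, I found a very useful piece of information, quoted below (my English is not great, so forgive the rough translation):

By default, the created process does not have its own terminal or console. All its standard I/O (i.e. stdin, stdout, stderr) operations will be redirected to the parent process, where they can be accessed via the streams obtained using the methods

Roughly: the created process has no console of its own. All of its standard I/O (input, output, and error output) is redirected to its parent process, where it can be accessed through streams. Parent process, parent process... aha, so the encoding Process ends up with is determined by the parent process. The project is deployed in Docker, so I went into the Docker container and ran locale: sure enough, no encoding was set. I changed how the Docker image is built so that the default encoding is UTF-8, and the garbled text was gone.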
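The fix I actually used was in the Docker image, but since a child process inherits its environment from the parent, the same idea can probably be applied from the Java side as well. The following is only a sketch of that alternative, not what I deployed: the class Utf8ChildEnvironment, the helper utf8Builder, and the script name run_python.sh are all made up, and it assumes the C.UTF-8 locale exists in the image (not every image ships it). PYTHONIOENCODING is the environment variable CPython reads to choose the encoding of its standard streams.

```java
import java.util.Map;

public class Utf8ChildEnvironment {

    // Build (but do not start) a ProcessBuilder whose child environment asks for UTF-8.
    // (Hypothetical helper, for illustration only.)
    static ProcessBuilder utf8Builder(String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        Map<String, String> env = builder.environment();
        // Assumption: the C.UTF-8 locale is installed where the child runs.
        env.put("LANG", "C.UTF-8");
        // Ask CPython to use UTF-8 for stdin, stdout and stderr.
        env.put("PYTHONIOENCODING", "utf-8");
        // Merge stderr into stdout so a single reader sees everything.
        builder.redirectErrorStream(true);
        return builder;
    }

    public static void main(String[] args) {
        // "run_python.sh" is a placeholder for the shell script that runs the Python script.
        ProcessBuilder builder = utf8Builder("sh", "run_python.sh");
        System.out.println(builder.environment().get("LANG"));
        System.out.println(builder.environment().get("PYTHONIOENCODING"));
    }
}
```

Note that if LC_ALL is already set in the inherited environment it takes precedence over LANG, so setting the default locale once in the image is still the simpler place to do it.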
