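I recently worked on a project that needed the base form (lemma) of English words. Searching online, I found StanfordNLP (Stanford CoreNLP), a toolkit that handles many NLP tasks and gives good results, so I used it for lemmatization.
The toolkit is written in Java, while my whole project is in Python. I looked on GitHub for a Python version, but the Python versions I tried were far too slow. So what now?
Python is often called a glue language, so in theory it should be able to bind to other languages. I went looking for a way to run Java code from Python and found one: the JPype package. Once you import it, you can start a JVM from Python and call Java code directly. In PyCharm you can switch freely between the Python and the Java side without mixing up the two syntaxes, which is very satisfying.
As the saying goes, to do a good job you must first sharpen your tools. So before using it, let's get the environment set up, without further ado.
JPype is a little fiddly to install, and py2 and py3 differ slightly. If you are on py2, install it with: pip install JPype; if you are on py3, use: pip install JPype1. (Current JPype releases are published on PyPI as JPype1 and support Python 3 only, so on a modern setup pip install JPype1 is the one to use.)
Even though the JVM is called from inside Python, it is still a real JVM, so a JDK has to be installed as well, and the JDK and Python must have the same bitness. Check both before you start (when installing the JDK on Windows, if the default path is under an x86 directory such as Program Files (x86), it is usually 32-bit; otherwise it is 64-bit).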
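To make the bitness check concrete, here is a minimal sketch. The helper names python_bits and looks_like_32bit_jdk are my own, made up for illustration, and the second one only encodes the path rule of thumb described above, so treat it as a hint rather than a reliable test.

```python
import platform
import struct


def python_bits():
    """Return the pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def looks_like_32bit_jdk(java_home):
    """Rule of thumb from the text: a JDK under an '(x86)' directory is usually 32-bit.

    Hypothetical helper: it only inspects the path string, nothing else.
    """
    return "(x86)" in java_home.lower()


if __name__ == "__main__":
    # platform.architecture() reports the same information as a string such as '64bit'.
    print("Python:", python_bits(), "bit,", platform.architecture()[0])
```

For the JDK side, running java -version on the command line is the more dependable check; a 64-bit JVM normally says so in its output.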
Everything is ready, so let's write some code!
First, here is the Java code that uses StanfordNLP for lemmatization:
package part_of_speech_reduction;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;
import edu.stanford.nlp.util.StringUtils;

public class get_result {

    public static void main(String[] args) throws IOException, InterruptedException {
        try {
            String sentence = "Many people who work in London prefer to live outside it, and to go in to their offices or schools every day by train, car or bus, even though this means they have to get up early in the morning and reach home late in the evening._ One advantage of living outside London is that houses are cheaper. Even a small flat in London without a garden costs quite a lot to rent. With the same money, one can get a little house in the country with a garden of one's own. Then, in the country one can really get away from the noise and hurry of busy working lives. Even though one has to get up earlier and spend more time in trains or buses, one can sleep better at night and during weekends and on summer evenings, one can enjoy the fresh, clean air of the country. If one likes gardens, one can spend one's free time digging, planting, watering and doing the hundred and one other jobs which are needed in a garden. Then, when the flowers and vegetables come up, one has got the reward together with those who have shared the secret of nature. Some people, however, take no interest in country things: for them, happiness lies in the town, with its cinemas and theatres, beautiful shops and busy streets, dance-halls and restaurants. Such people would feel that their life was not worth living if they had to live it outside London. An occasional walk in one of the parks and a fortnight's (two weeks) visit to the sea every summer is all the country they want: the rest of the country they are quite prepared to spend with those who are glad to get away from London every night.\r\n";
            String result = deal_file(sentence);
            // Clear the console before printing (Windows only).
            new ProcessBuilder("cmd", "/c", "cls").inheritIO().start().waitFor();
            System.out.println(result);
        } catch (Exception e) {
            // Report the failure instead of silently swallowing it.
            e.printStackTrace();
        }
    }

    // Lemmatizes the text and joins the lemmas with single spaces.
    public static String deal_file(String sentence) {
        List<String> words = getlema(sentence);
        return StringUtils.join(words, " ");
    }

    // Runs the CoreNLP pipeline and returns one lemma per token.
    public static List<String> getlema(String text) {
        List<String> wordslist = new ArrayList<>();
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        Annotation document = new Annotation(text);
        pipeline.annotate(document);
        // Typed as List<CoreMap>; with a raw List the for-each loop below does not compile.
        List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
        for (CoreMap sentence : sentences) {
            for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
                wordslist.add(token.get(CoreAnnotations.LemmaAnnotation.class));
            }
        }
        return wordslist;
    }
}
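Once the Java program is written, package it as a jar and give it whatever name you like. Look up how to build a jar if you have not done it before; when exporting, choose the runnable jar option, which saves some configuration later.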
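Before wiring the jar into Python, it can help to confirm that the class really ended up inside it. A jar is an ordinary zip archive, so the standard zipfile module can look inside. This is only a sketch: jar_contains_class is a name I made up, and the path in the usage comment is the example path used in this post.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has a .class entry for the fully qualified class name."""
    # Inside a jar, package dots become directory separators.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Example usage (path taken from this post, adjust to your own jar):
# jar_contains_class(r"H:\company\cpit_cixinghuanyuan_v2.5.jar",
#                    "part_of_speech_reduction.get_result")
```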
With the lemmatizer packaged, we can call the Java program directly from PyCharm through JPype. Enough talk, here is the code:
import jpype

# Path to the jar exported in the previous step
jar_path = r'H:\company\cpit_cixinghuanyuan_v2.5.jar'
# Directory containing the jars that the Java program depends on
ext_jar_path = r'H:\company\extend_jars'
# Note: java.ext.dirs only works on Java 8 and earlier; newer JVMs refuse to start with it
ext_jar = "-Djava.ext.dirs=" + ext_jar_path
Djava = "-Djava.class.path=" + jar_path
jvm_path = jpype.getDefaultJVMPath()
jpype.startJVM(jvm_path, Djava, ext_jar)
# Name of the Java package that contains the program
JPackge = jpype.JPackage("part_of_speech_reduction")
texts = "Many people who work in London prefer to live outside it, and to go in to their offices or school."
# Call the static Java method
difference = JPackge.get_result.deal_file(texts)
print(difference)
jpype.shutdownJVM()
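The java.ext.dirs option was removed in Java 9, so on a newer JDK the dependency jars have to go on the classpath instead. The following is a sketch of one way to do that, not the setup I actually ran: build_classpath and start_jvm_with_classpath are hypothetical helpers, and they assume all dependency jars sit directly in one directory.

```python
import glob
import os


def build_classpath(main_jar, ext_dir):
    """Return the main jar followed by every .jar file found directly in ext_dir."""
    return [main_jar] + sorted(glob.glob(os.path.join(ext_dir, "*.jar")))


def start_jvm_with_classpath(main_jar, ext_dir):
    """Start the JVM with an explicit classpath instead of -Djava.ext.dirs."""
    import jpype  # imported lazily so build_classpath works without JPype installed

    if not jpype.isJVMStarted():
        # startJVM accepts a list of paths through its classpath keyword argument.
        jpype.startJVM(classpath=build_classpath(main_jar, ext_dir))


# Example usage with the paths from this post:
# start_jvm_with_classpath(r"H:\company\cpit_cixinghuanyuan_v2.5.jar",
#                          r"H:\company\extend_jars")
```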
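Note: within one Python process the JVM can only be started once and shut down once. Starting it a second time, or restarting it after a shutdown, raises an error.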
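A small guard makes the start-only-once rule harder to break when several modules need the JVM. This is a minimal sketch: ensure_jvm is my own helper name, and the jpype_module parameter exists only so the function can be exercised without a real JVM. It relies on jpype.isJVMStarted() and jpype.startJVM(), both part of JPype.

```python
def ensure_jvm(jvm_args=(), jpype_module=None):
    """Start the JVM only if it is not already running.

    Returns True if this call started the JVM, False if it was already up.
    """
    if jpype_module is None:
        import jpype as jpype_module  # the real JPype, imported on first use

    if jpype_module.isJVMStarted():
        return False
    jpype_module.startJVM(*jvm_args)
    return True


# Example usage (the classpath value is the jar from this post):
# ensure_jvm((r"-Djava.class.path=H:\company\cpit_cixinghuanyuan_v2.5.jar",))
```

The guard does not help after shutdownJVM() has been called, so in a long-running program it is simplest to leave the JVM up until the process exits.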
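While searching online I also came across another approach: have Python run a DOS command line, and let that command line launch the JVM. I have only skimmed the idea and have not looked into it in depth.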
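For reference, the command-line route could look roughly like the sketch below. I have not tested this against the jar in this post. It assumes a hypothetical Java main class that takes the text as its first argument and prints the lemmas to standard output, which the get_result.main shown earlier does not do, and it assumes java is on the PATH.

```python
import subprocess


def build_java_command(classpath, main_class, text):
    """Build the argument list for launching a Java main class with one text argument."""
    return ["java", "-cp", classpath, main_class, text]


def lemmatize_via_subprocess(classpath, main_class, text):
    """Run the (hypothetical) Java entry point and return what it printed."""
    completed = subprocess.run(
        build_java_command(classpath, main_class, text),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if java exits with a non-zero status
    )
    return completed.stdout.strip()
```

Every call starts a fresh JVM and reloads the models, so this route is likely to be much slower per call than keeping one JVM alive through JPype.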