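Note: please credit the source when reposting, thanks: https://www.jianshu.com/p/1fd987de980f
For more of my regularly updated study notes, follow:
Zhihu: https://www.zhihu.com/people/yuquanle/columns
WeChat official account: StudyForAI
CSDN: http://blog.csdn.net/m0_37306360
I wrote up a few examples of named entity recognition (NER) tools; I hope they help.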
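Named entity recognition with Stanford CoreNLP
Install: pip install stanfordcorenlp
Install from a mirror in China: pip install stanfordcorenlp -i https://pypi.tuna.tsinghua.edu.cn/simple
Using stanfordcorenlp for named entity recognition
First download the CoreNLP package and models (the code below loads the unzipped stanford-corenlp-full-2018-02-27 directory). Download page: https://nlp.stanford.edu/software/corenlp-backup-download.html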
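Before starting the wrapper it can help to confirm that the download landed where the code expects it. The sketch below is only a convenience check: the helper name find_model_jars is hypothetical, and the jar naming pattern (for example stanford-chinese-corenlp-2018-02-27-models.jar) is an assumption based on how the model jars are usually named.

```python
from pathlib import Path


def find_model_jars(corenlp_dir, lang="chinese"):
    """Return the names of language-model jars found in an unzipped CoreNLP directory.

    The glob pattern is an assumption about the usual jar naming,
    e.g. stanford-chinese-corenlp-2018-02-27-models.jar.
    """
    root = Path(corenlp_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob(f"stanford-{lang}-corenlp-*-models.jar"))


if __name__ == "__main__":
    jars = find_model_jars("stanford-corenlp-full-2018-02-27")
    print(jars or "no Chinese models jar found in the CoreNLP directory")
```

If the list comes back empty, the Chinese example will most likely fail to start, so copy the Chinese models jar into the CoreNLP directory first.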
Entity recognition on Chinese text
from stanfordcorenlp import StanfordCoreNLP
zh_model = StanfordCoreNLP(r'stanford-corenlp-full-2018-02-27', lang='zh')
s_zh = '我爱自然语言处理技术!'
ner_zh = zh_model.ner(s_zh)
s_zh1 = '我爱北京天安门!'
ner_zh1 = zh_model.ner(s_zh1)
print(ner_zh)
print(ner_zh1)
zh_model.close()  # shut down the background CoreNLP Java server to free memory
[('我爱', 'O'), ('自然', 'O'), ('语言', 'O'), ('处理', 'O'), ('技术', 'O'), ('!', 'O')]
[('我爱', 'O'), ('北京', 'STATE_OR_PROVINCE'), ('天安门', 'FACILITY'), ('!', 'O')]
Entity recognition on English text
eng_model = StanfordCoreNLP(r'stanford-corenlp-full-2018-02-27')
s_eng = 'I love natural language processing technology!'
ner_eng = eng_model.ner(s_eng)
s_eng1 = 'I love Beijing Tiananmen!'
ner_eng1 = eng_model.ner(s_eng1)
print(ner_eng)
print(ner_eng1)
eng_model.close()  # shut down the background CoreNLP Java server to free memory
[('I', 'O'), ('love', 'O'), ('natural', 'O'), ('language', 'O'), ('processing', 'O'), ('technology', 'O'), ('!', 'O')]
[('I', 'O'), ('love', 'O'), ('Beijing', 'CITY'), ('Tiananmen', 'LOCATION'), ('!', 'O')]
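The ner() results above are per-token (token, tag) pairs, so an entity that spans several tokens shows up as several pairs. Here is a minimal sketch of one way to collapse them into entity strings. The helper merge_entities is my own illustration, not part of stanfordcorenlp, and it assumes that adjacent tokens with the same tag belong to one entity, which can wrongly join two neighbouring entities of the same type.

```python
from itertools import groupby


def merge_entities(tagged, sep=" "):
    """Merge runs of adjacent tokens that share the same non-'O' tag."""
    entities = []
    for tag, run in groupby(tagged, key=lambda pair: pair[1]):
        if tag != "O":
            entities.append((sep.join(tok for tok, _ in run), tag))
    return entities


# Output copied from the English example above.
ner_eng1 = [('I', 'O'), ('love', 'O'), ('Beijing', 'CITY'), ('Tiananmen', 'LOCATION'), ('!', 'O')]
print(merge_entities(ner_eng1))
# [('Beijing', 'CITY'), ('Tiananmen', 'LOCATION')]
```

For Chinese output, pass sep='' so the merged tokens are joined without spaces.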
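Named entity recognition with HanLP
Install: pip install pyhanlp
Install from a mirror in China: pip install pyhanlp -i https://pypi.tuna.tsinghua.edu.cn/simple
Recognizing entities with the CRF segmenter
from pyhanlp import *
# CRF segmentation; each term prints as word/part-of-speech tag (ns marks a place name)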
CRFnewSegment = HanLP.newSegment("crf")
term_list = CRFnewSegment.seg("我爱北京天安门!")
print(term_list)
[我/r, 爱/v, 北京/ns, 天安门/ns, !/w]
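HanLP reports entities through part-of-speech tags rather than separate NER labels: in its tag set nr marks person names, ns place names, and nt organization names. A small sketch of filtering the segmenter output down to entities follows. The helper entities_from_terms and the PERSON/LOCATION/ORGANIZATION names are hypothetical choices of mine, and the sample pairs are copied from the output above so the block runs without HanLP installed.

```python
# HanLP part-of-speech prefixes that mark named entities.
# The label names on the right are illustrative, not HanLP's own.
ENTITY_TAGS = {"nr": "PERSON", "ns": "LOCATION", "nt": "ORGANIZATION"}


def entities_from_terms(pairs):
    """Keep (word, tag) pairs whose tag starts with an entity prefix."""
    found = []
    for word, nature in pairs:
        label = ENTITY_TAGS.get(nature[:2])  # also covers subtags such as nrf or nsf
        if label:
            found.append((word, label))
    return found


# With pyhanlp the pairs could be built from the segmenter result as:
#   pairs = [(t.word, str(t.nature)) for t in term_list]
pairs = [("我", "r"), ("爱", "v"), ("北京", "ns"), ("天安门", "ns"), ("!", "w")]
print(entities_from_terms(pairs))
# [('北京', 'LOCATION'), ('天安门', 'LOCATION')]
```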
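Named entity recognition with NLTK (part-of-speech tagging followed by chunking)
Install: pip install nltk
Install from a mirror in China: pip install nltk -i https://pypi.tuna.tsinghua.edu.cn/simple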
import nltk
s = 'I love natural language processing technology!'
s_token = nltk.word_tokenize(s)
s_tagged = nltk.pos_tag(s_token)
s_ner = nltk.chunk.ne_chunk(s_tagged)
print(s_ner)
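ne_chunk returns a tree: ordinary tokens stay as (word, tag) tuples at the top level, while each recognized entity becomes a labelled subtree. A minimal sketch of pulling the entities out of such a tree is shown here. The helper chunk_entities is hypothetical and relies only on the label() and leaves() methods that nltk.Tree provides, so it does not import nltk itself.

```python
def chunk_entities(tree):
    """Collect (text, label) pairs from the top-level children of a chunk tree.

    Plain tokens are (word, tag) tuples; entity chunks are subtrees that
    expose label() and leaves(), as nltk.Tree does.
    """
    entities = []
    for node in tree:
        if hasattr(node, "label"):
            text = " ".join(word for word, _tag in node.leaves())
            entities.append((text, node.label()))
    return entities


# Usage with the variable from the example above:
#   print(chunk_entities(s_ner))
```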
Named entity recognition with spaCy
Install: pip install spacy
Install from a mirror in China: pip install spacy -i https://pypi.tuna.tsinghua.edu.cn/simple
import spacy
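eng_model = spacy.load('en')  # 'en' is a spaCy 2.x shortcut; spaCy 3 needs a full model name such as 'en_core_web_sm'
s = 'I want to Beijing learning natural language processing technology!'
# Named entity recognition
s_ent = eng_model(s)
for ent in s_ent.ents:
    print(ent, ent.label_, ent.label)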
Beijing GPE 382
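In the output above, ent.label_ is the readable label string (GPE) and ent.label is its integer ID. For longer texts it is often handier to collect entities by label. A small sketch, assuming the entities have already been reduced to (text, label) pairs; group_by_label is a hypothetical helper and needs nothing from spaCy itself.

```python
from collections import defaultdict


def group_by_label(pairs):
    """Group entity texts by their label, keeping the original order."""
    groups = defaultdict(list)
    for text, label in pairs:
        groups[label].append(text)
    return dict(groups)


# With spaCy the pairs could come from:
#   group_by_label((ent.text, ent.label_) for ent in s_ent.ents)
print(group_by_label([("Beijing", "GPE")]))
# {'GPE': ['Beijing']}
```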