NER

What is the difference between NER and POS tagging?

Part-of-speech (POS) tagging and named entity recognition (NER) are two different problems.

Part-of-speech tagging aims to identify which grammatical category a word belongs to (NOUN, ADJECTIVE, VERB, ADVERB, etc.) based on its context. This means it looks at relationships within the sentence and assigns each word in the sentence the corresponding tag.
A popular library that supports POS tagging is spaCy.

Named entity recognition, on the other hand, tries to determine whether or not a word is part of a named entity. Named entities are persons, locations, organizations, time expressions, etc. The problem can be broken down into detecting the names and then classifying each name into the corresponding category.

So a word that NER recognizes as part of a named entity will most often be tagged as a noun by a POS tagger.

POS tagging is more of a global problem, since there can be relationships between the first and the last word of a sentence. NER is rather local: a named entity is not spread across the sentence and mostly consists of a unigram, bigram, or trigram.
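The contrast can be made concrete with a small hand-labelled sentence. The sketch below is only an illustration: the tags are written by hand rather than produced by a model, the BIO tagging scheme (B- begins an entity, I- continues it, O is outside) is an assumption that the text above does not mention, and `extract_entities` is a hypothetical helper name. It shows that every token gets a POS tag, while the named entities are a few short spans that are first detected and then carry a category label.

```python
# Hand-labelled toy sentence; tags are illustrative, not model output.
tokens = ["Tim", "Cook", "leads", "Apple", "in", "California"]

# POS tagging: one grammatical tag for every token in the sentence.
pos_tags = ["PROPN", "PROPN", "VERB", "PROPN", "ADP", "PROPN"]

# NER in the (assumed) BIO scheme: most of the sentence is "O",
# and each entity is a short contiguous span.
ner_tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]


def extract_entities(tokens, tags):
    """Group BIO tags into (entity text, category) pairs."""
    entities = []
    current_tokens, current_label = [], None

    def close_span():
        # Store the entity collected so far, if there is one.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; finish the previous one first.
            close_span()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_label:
            # Continue the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" (or an I- tag with no matching open entity) ends the span.
            close_span()
            current_tokens, current_label = [], None
    close_span()
    return entities


entities = extract_entities(tokens, ner_tags)
for text, label in entities:
    print(f"{label}: {text} ({len(text.split())}-gram)")
```

Note that all four entity tokens are tagged PROPN by the hand-written POS column, which matches the observation that entity words are usually nouns.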

NER evaluation metrics

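Precision: P = number of correctly recognized entities / number of recognized entities

Recall: R = number of correctly recognized entities / number of entities in the sample

Both values lie between 0 and 1; the closer a value is to 1, the higher the precision or recall. Precision and recall sometimes conflict with each other, in which case their weighted harmonic mean, the F-measure, has to be considered. The most commonly used variant is F1; a high F1 indicates that the method under test is effective. F1 is defined as follows:

F1 = (2 * P * R) / (P + R)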
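As a rough worked example of these three formulas, the sketch below computes them over sets of entities. It assumes exact-match scoring, meaning a predicted entity counts as correct only if its span and category both equal a gold entity; the text above does not specify the matching rule. The `(start, end, label)` tuple format and the function name `precision_recall_f1` are hypothetical choices made for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Return (P, R, F1) for two collections of (start, end, label) entities."""
    predicted, gold = set(predicted), set(gold)

    # Correctly recognized entities: present in both sets (exact match).
    correct = len(predicted & gold)

    # Guard against division by zero when a set is empty.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold annotation: three entities in the sample.
gold = {(0, 2, "PER"), (3, 4, "ORG"), (5, 6, "LOC")}

# System output: two entities, one of them with the wrong category.
predicted = {(0, 2, "PER"), (3, 4, "LOC")}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

Here one of the two predictions is correct, so P = 1/2, and one of the three gold entities is found, so R = 1/3, which gives F1 = 0.4.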
