Sentiment Analysis

The definitions below are taken from Section 2.1 of Sentiment Analysis and Opinion Mining.
Definition (Opinion): An opinion is a quadruple,
(g, s, h, t),
where g is the opinion (or sentiment) target, which can be any entity or aspect of an entity; s is the sentiment about the target; h is the opinion holder or opinion source; and t is the time when the opinion was expressed.
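
As a minimal sketch, the quadruple can be held in a small record type. The field names and the example values below are assumptions made for illustration; they do not come from the book.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion quadruple (g, s, h, t)."""
    target: str      # g: the entity or aspect the opinion is about
    sentiment: str   # s: the sentiment about the target, e.g. "negative"
    holder: str      # h: the opinion holder or source
    time: date       # t: when the opinion was expressed


# Hypothetical values, made up purely for illustration.
example = Opinion(
    target="battery life",
    sentiment="negative",
    holder="reviewer_1",
    time=date(2015, 1, 1),
)
```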

Definition (Entity): An entity e is a product, service, topic, issue, person, organization, or event. It is described with a pair, e: (T, W), where T is a hierarchy of parts, sub-parts and so on, and W is a set of attributes of e. Each part or sub-part also has its own set of attributes.
We simplify the hierarchy to two levels and use the term aspects to denote both parts and attributes. In the simplified tree, the root node is still the entity itself, but the second-level (also the leaf-level) nodes are the different aspects of the entity.
Example (from http://alt.qcri.org/semeval2015/task12/):
(1) It fires up in the morning in less than 30 seconds and I have never had any issues with it freezing. → {LAPTOP#OPERATION_PERFORMANCE}
(2) Sometimes you will be moving your finger and the pointer will not even move. → {MOUSE#OPERATION_PERFORMANCE}
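
The labels in these examples can be grouped into the simplified two-level tree. The sketch below assumes each label has the form ENTITY#ATTRIBUTE, with the part before "#" used as the root and the part after it as an aspect; the function names are hypothetical.

```python
from collections import defaultdict


def parse_category(label):
    """Split an 'ENTITY#ATTRIBUTE' label into (entity, attribute)."""
    entity, sep, attribute = label.strip("{}").partition("#")
    if not sep or not entity or not attribute:
        raise ValueError(f"not an ENTITY#ATTRIBUTE label: {label!r}")
    return entity, attribute


def build_aspect_tree(labels):
    """Group labels into a two-level tree: entity -> set of aspects."""
    tree = defaultdict(set)
    for label in labels:
        entity, attribute = parse_category(label)
        tree[entity].add(attribute)
    return dict(tree)


# The two labels from the examples above.
labels = ["{LAPTOP#OPERATION_PERFORMANCE}", "{MOUSE#OPERATION_PERFORMANCE}"]
tree = build_aspect_tree(labels)
```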

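This task covers problems such as entity extraction, clustering, and ranking.
Methods that can be used for extraction:
1. Rule-based extraction, which can exploit the relations between sentiment words and entities
2. Sequence models
3. Topic models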
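
To illustrate the first method, here is a minimal sketch of rule-based extraction over dependency relations. It assumes the sentence has already been parsed into (head, relation, dependent) triples; the two rules, the tiny lexicon, and the hand-written triples are illustrative assumptions, not the rules of any particular system.

```python
# Tiny hypothetical sentiment lexicon, for illustration only.
SENTIMENT_WORDS = {"great", "good", "terrible", "slow"}


def extract_targets(dependencies, sentiment_words=SENTIMENT_WORDS):
    """Return (target, sentiment_word) pairs found by two simple rules.

    `dependencies` is a list of (head, relation, dependent) triples.
    """
    pairs = []
    for head, relation, dependent in dependencies:
        # Rule 1: amod(target, sentiment_word), as in "great screen".
        if relation == "amod" and dependent in sentiment_words:
            pairs.append((head, dependent))
        # Rule 2: nsubj(sentiment_word, target), as in "the keyboard is terrible".
        elif relation == "nsubj" and head in sentiment_words:
            pairs.append((dependent, head))
    return pairs


# Hand-written triples for "It has a great screen, but the keyboard is terrible."
deps = [
    ("screen", "det", "a"),
    ("screen", "amod", "great"),
    ("terrible", "nsubj", "keyboard"),
    ("terrible", "cop", "is"),
]
result = extract_targets(deps)
```

In practice the triples would come from a dependency parser, and the rule set would be considerably larger.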

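The Stanford parser is used to analyze dependency relations. Grammatical rules are then designed to extract the expressions that modify each aspect (the aspects are already annotated in the training set), and an SVM is trained on the result. The features used are:
1. POS: the part of speech of each word
2. The grammatical relations mentioned above
3. The polarity of sentiment words. A sentiment lexicon was built from SentiWordNet, MPQA, and eBLR. Because the polarity of some sentiment words is domain-dependent, a corpus-based method is also used: if a word appears only as positive in the training set and its frequency exceeds a threshold, it is added to the positive list; the negative list is built the same way. If a word appears as both positive and negative, it is treated as positive when its positive frequency is higher than its negative frequency.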
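
The corpus-based step in feature 3 can be sketched as follows. The function name, the `threshold` parameter, and the sample data are assumptions. The notes only state the rule for mixed words whose positive count is higher, so treating the opposite case as negative and leaving ties out of both lists are also assumptions.

```python
from collections import Counter


def build_domain_lexicon(observations, threshold=2):
    """Build positive/negative word lists from (word, polarity) observations.

    `polarity` is "positive" or "negative"; `threshold` is the count that a
    single-polarity word must exceed to be added to a list.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for word, polarity in observations:
        if polarity == "positive":
            pos_counts[word] += 1
        elif polarity == "negative":
            neg_counts[word] += 1

    positive, negative = set(), set()
    for word in set(pos_counts) | set(neg_counts):
        p, n = pos_counts[word], neg_counts[word]
        if n == 0:
            # Only ever seen as positive: require enough occurrences.
            if p > threshold:
                positive.add(word)
        elif p == 0:
            # Only ever seen as negative: same rule, mirrored.
            if n > threshold:
                negative.add(word)
        elif p > n:
            # Seen with both polarities: the more frequent one wins.
            positive.add(word)
        elif n > p:
            # Assumption: the mirrored case is treated as negative.
            negative.add(word)
        # Assumption: ties are left out of both lists.
    return positive, negative


# Made-up observations, for illustration only.
observations = (
    [("fast", "positive")] * 3
    + [("laggy", "negative")] * 3
    + [("light", "positive")] * 2
    + [("light", "negative")]
    + [("rare", "positive")]
)
positive, negative = build_domain_lexicon(observations, threshold=2)
```

Here "rare" ends up in neither list because it occurs only once, while "light" is kept as positive because its positive count outweighs its negative count.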
