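This task counts positive, negative, and neutral wording in English reviews. It relies on a few libraries (pandas and TextBlob) and little else.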
import pandas as pd
from textblob import TextBlob
test=pd.read_excel('爬虫结果.xls')
test.head()
                                                text
0  These are great but not much better then gen1....
1  Everyone is posting that there isn’t a differe...
2  These AirPods are amazing they automatically p...
3  My son really wanted airpods but his parents t...
4  Poor quality microphone. Not suitable for a re...
def function(x):
    # TextBlob polarity ranges from -1 (negative) to 1 (positive)
    a = TextBlob(x).sentiment.polarity
    if a < -0.5:
        return '消极'  # negative
    elif a > 0.5:
        return '积极'  # positive
    else:
        return '中立'  # neutral

# Label each review by its polarity score
test['laber'] = test['text'].apply(function)
test.head()
                                                text laber
0  These are great but not much better then gen1....    中立
1  Everyone is posting that there isn’t a differe...    中立
2  These AirPods are amazing they automatically p...    中立
3  My son really wanted airpods but his parents t...    中立
4  Poor quality microphone. Not suitable for a re...    中立
test['laber'].value_counts()
中立 2496
积极 1044
消极 20
Name: laber, dtype: int64
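The counts above are dominated by 中立 (neutral) because every polarity from -0.5 to 0.5 inclusive falls into that band. The following is a minimal sketch of the same thresholding with the cut-off exposed as a parameter, so the effect of a narrower band can be inspected. `label_from_polarity` and `threshold` are hypothetical names that do not appear in the original code, and the scores are made up for illustration rather than taken from the review data.

```python
def label_from_polarity(polarity, threshold=0.5):
    """Map a polarity score in [-1, 1] to one of the three labels."""
    if polarity < -threshold:
        return '消极'  # negative
    if polarity > threshold:
        return '积极'  # positive
    return '中立'      # neutral, boundaries included

# Illustrative scores only (assumption), not taken from the review data
scores = [-0.8, -0.5, 0.0, 0.3, 0.5, 0.9]
labels_wide = [label_from_polarity(s) for s in scores]
labels_narrow = [label_from_polarity(s, threshold=0.1) for s in scores]
print(labels_wide)
print(labels_narrow)
```

With the default band only the two extreme scores leave the neutral class; a narrower band moves most of them out. Which cut-off suits the review data is a judgment call that this sketch does not settle.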
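# Concatenate all review text within each label
# (summing strings joins them with no separator between reviews)
rawgrp = test.groupby('laber')
chapter = rawgrp.agg(sum)
# The index holds the label strings, so this filter keeps every row
chapter = chapter[chapter.index != 0]
chapter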
# Lower-case the concatenated text so phrase matching ignores case
chapter['text'] = chapter['text'].str.lower()
chapter
                                                    text
laber
中立     these are great but not much better then gen1....
消极     estuvieron funcionando bien pero la batería no...
积极     excellent, pretty useful... easy to use and re...
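Summing a string column per group concatenates the reviews with no separator, so the last word of one review is fused with the first word of the next. A standard-library sketch of the same grouping step follows, joining with a newline so that words are neither fused nor merged into a phrase that spans two reviews. `group_text`, `sep`, and the sample rows are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

def group_text(rows, sep='\n'):
    """Concatenate lower-cased texts per label, joined by `sep`."""
    grouped = defaultdict(list)
    for label, text in rows:
        grouped[label].append(text.lower())
    return {label: sep.join(texts) for label, texts in grouped.items()}

# Made-up (label, text) pairs standing in for the labelled reviews
rows = [
    ('积极', 'Easy to use'),
    ('中立', 'Works'),
    ('中立', 'Fine so far'),
    ('积极', 'High quality'),
]
joined = group_text(rows)            # newline between reviews
fused = group_text(rows, sep='')     # same as summing the strings
spaced = group_text(rows, sep=' ')   # space between reviews
print(joined['中立'].count('works fine'))
print(spaced['中立'].count('works fine'))
```

In this sample a space separator produces a match for 'works fine' that no single review contains, while the newline separator does not, which is why the separator choice can shift phrase counts at review boundaries.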
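# Occurrences of each neutral phrase in the 中立 (neutral) text
n = []
a = ['works fine', 'describe honestly', 'commonly speed', 'general speed', 'general speed']
for i in a:
    # .iloc[0] selects the first row by position (the index holds labels)
    n.append(chapter.text.iloc[0].count(i))
n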
[3, 0, 0, 0, 0]
# Occurrences of each negative phrase in the 消极 (negative) text
n = []
a = ['poor quality', 'unclearly', 'rough', 'slow delivery', 'over time', 'wrong address', 'no reply', 'impatient', 'ineffective']
for i in a:
    n.append(chapter.text.iloc[1].count(i))
n
[0, 0, 0, 0, 0, 0, 0, 0, 0]
# Occurrences of each positive phrase in the 积极 (positive) text
n = []
a = ['high grade', 'high quality', 'easy to use', 'quick delivery', 'good packaging', 'wrong address', 'intact', 'return in time', 'friendly', 'effective']
for i in a:
    n.append(chapter.text.iloc[2].count(i))
n
[0, 2, 20, 2, 1, 0, 0, 0, 2, 1]
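The three counting loops above differ only in the phrase list and the row they read. A small helper can return the counts keyed by phrase, which makes a result such as the list above easier to read. This is a sketch only: `count_phrases` is a hypothetical name, and the sample sentence is invented rather than drawn from the reviews.

```python
def count_phrases(text, phrases):
    """Return {phrase: number of non-overlapping occurrences in text}."""
    lowered = text.lower()
    # dict.fromkeys drops duplicate phrases while keeping their order
    return {p: lowered.count(p.lower()) for p in dict.fromkeys(phrases)}

# Invented sample text (assumption), not taken from the review data
sample = 'Easy to use and high quality. Really easy to use!'
counts = count_phrases(sample, ['easy to use', 'high quality', 'friendly', 'easy to use'])
print(counts)
```

Note that `str.count` matches substrings, so a short phrase such as 'rough' would also be counted inside 'through'; matching on word boundaries would need a different approach.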