I. Libraries and functions
pip install wordcloud
Documentation: wordcloud.WordCloud — wordcloud 1.8.1 documentation
1. Creating the word cloud object
class wordcloud.WordCloud(font_path=None, width=400, height=200, margin=2, ranks_only=None, prefer_horizontal=0.9, mask=None, scale=1, color_func=None, max_words=200, min_font_size=4, stopwords=None, random_state=None, background_color='black', max_font_size=None, font_step=1, mode='RGB', relative_scaling='auto', regexp=None, collocations=True, colormap=None, normalize_plurals=True, contour_width=0, contour_color='black', repeat=False, include_numbers=False, min_word_length=0, collocation_threshold=30)
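font_path: the path to the font file used to draw the words. On Windows, fonts live in C:\Windows\Fonts; copy the full path of the font file you want and pass it in. For Chinese text, choose a font that contains Chinese glyphs, because the default font cannot draw them.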
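As a minimal sketch of the font_path advice above, the helper below returns the first font file that actually exists from a list of candidates. The name first_existing_font and every path other than STLITI.TTF are assumptions for illustration, not part of wordcloud; adjust the list to the fonts installed on your machine.

```python
from pathlib import Path

# Candidate font files. Only the first path comes from this article;
# the others are assumptions about typical installs and may not exist.
CANDIDATE_FONTS = [
    r"C:\Windows\Fonts\STLITI.TTF",
    r"C:\Windows\Fonts\msyh.ttc",
    "/System/Library/Fonts/PingFang.ttc",
    "/usr/share/fonts/truetype/wqy/wqy-microhei.ttc",
]


def first_existing_font(candidates=CANDIDATE_FONTS):
    """Return the first candidate path that is an existing file, else None."""
    for candidate in candidates:
        if Path(candidate).is_file():
            return candidate
    return None
```

Pass the result as WordCloud(font_path=...). If the helper returns None, wordcloud falls back to its bundled default font, which does not cover Chinese.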
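2. Generating the word cloud image
fit_words(frequencies)  # build the word cloud from word frequencies
wordcloud = WordCloud(font_path=r"C:\Windows\Fonts\STLITI.TTF").fit_words({'人': 5, '我们': 8, '哈哈': 1, '早上好': 100})
generate(text)  # build the word cloud from raw text, as in the simple word cloud example below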
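The frequency dictionary passed to fit_words does not have to be typed by hand. The sketch below counts tokens with collections.Counter and hands the result to fit_words. The helper names word_frequencies and render_frequencies and the min_length cutoff are assumptions made for this example, not part of wordcloud.

```python
from collections import Counter


def word_frequencies(words, min_length=2):
    """Count tokens, skipping blanks and tokens shorter than min_length."""
    stripped = (w.strip() for w in words)
    return dict(Counter(w for w in stripped if len(w) >= min_length))


def render_frequencies(frequencies, font_path, out_path):
    """Draw a word cloud from a {word: count} mapping and save it."""
    # Imported here so the counting helper works without wordcloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(font_path=font_path).fit_words(frequencies)
    cloud.to_file(out_path)
    return cloud
```

For example, render_frequencies(word_frequencies(tokens), r"C:\Windows\Fonts\STLITI.TTF", "freq.png") draws a cloud from a token list named tokens.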
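pip install jieba
Repository: GitHub - fxsjy/jieba: 结巴中文分词
Jieba ("结巴") is a convenient Chinese word segmentation component. It supports accurate mode, full mode, search engine mode, and paddle mode.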
def cut(self, sentence, cut_all=False, HMM=True, use_paddle=False)
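1. jieba.lcut: takes the text to segment and returns the tokens as a list. Keyword arguments choose whether to use full mode (cut_all), the HMM model (HMM), and paddle mode (use_paddle).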
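To make the modes concrete, here is a small sketch that wraps jieba.lcut and jieba.lcut_for_search behind one function. The wrapper name segment and its mode strings are assumptions for this example. Paddle mode is left out because it needs extra dependencies.

```python
VALID_MODES = ("accurate", "full", "search")


def segment(text, mode="accurate"):
    """Segment Chinese text with jieba and return a list of tokens."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    import jieba  # third-party: pip install jieba

    if mode == "full":
        # Full mode lists every word it can find, including overlaps.
        return jieba.lcut(text, cut_all=True)
    if mode == "search":
        # Search engine mode further splits long words for indexing.
        return jieba.lcut_for_search(text)
    # Accurate mode is the default behaviour of jieba.lcut.
    return jieba.lcut(text)
```

Calling segment(txt) matches the jieba.lcut(txt) call used in the word cloud examples, while segment(txt, mode="full") generally returns more, overlapping tokens.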
II. Word clouds
1. A simple word cloud
import jieba
from wordcloud import WordCloud
txt = '弱小的人才习惯嘲讽和否定而内心强大的人从不吝啬赞美和鼓励我们就是后浪奔涌吧后浪奔涌吧'
words = jieba.lcut(txt)  # accurate-mode segmentation
newtxt = ' '.join(words)  # join the tokens with spaces
wordcloud = WordCloud(font_path=r"C:\Windows\Fonts\STLITI.TTF").generate(newtxt)
wordcloud.to_file('中文词云图.jpg')
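Common function words tend to dominate a cloud built this way. One possible refinement, sketched below, is to drop stopwords before joining the tokens. The helper join_without_stopwords and the tiny CHINESE_STOPWORDS set are assumptions for illustration; a real stopword list would be much longer.

```python
# Hypothetical stopword list; replace it with one suited to your corpus.
CHINESE_STOPWORDS = {"的", "和", "而", "就是"}


def join_without_stopwords(words, stopwords=CHINESE_STOPWORDS):
    """Drop stopwords and blank tokens, then join the rest with spaces."""
    kept = [w for w in words if w.strip() and w not in stopwords]
    return " ".join(kept)
```

The result of join_without_stopwords(jieba.lcut(txt)) can be passed to generate in place of newtxt.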
2. A word cloud in a given shape
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Masked wordcloud
================
Using a mask you can generate wordclouds in arbitrary shapes.
"""
from os import path
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import os
from wordcloud import WordCloud, STOPWORDS
# get data directory (using getcwd() is needed to support running example in generated IPython notebook)
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()
# Read the whole text.
text = open(path.join(d, 'alice.txt'),encoding='UTF-8').read()
# read the mask image
# taken from
# http://www.stencilry.org/stencils/movies/alice%20in%20wonderland/255fk.jpg
alice_mask = np.array(Image.open(path.join(d, "alice_mask.png")))
stopwords = set(STOPWORDS)
stopwords.add("said")
wc = WordCloud(background_color="white", max_words=2000, mask=alice_mask,
               stopwords=stopwords, contour_width=3, contour_color='steelblue')
# generate word cloud
wc.generate(text)
# store to file
wc.to_file(path.join(d, "alice.png"))
# show
plt.imshow(wc, interpolation='bilinear')
plt.axis("off")
plt.figure()
plt.imshow(alice_mask, cmap=plt.cm.gray, interpolation='bilinear')
plt.axis("off")
plt.show()
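A mask does not have to come from an image file. As a rough sketch, the code below builds a circular mask directly with NumPy, relying on wordcloud's rule that pure white (255) pixels are masked out and all other pixels can be drawn on. The function names circle_mask and render_in_circle and the default sizes are assumptions for this example.

```python
import numpy as np


def circle_mask(size=300, radius=130):
    """Return a size x size uint8 mask: 0 inside the circle, 255 outside."""
    center = size // 2
    y, x = np.ogrid[:size, :size]
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    return (255 * outside).astype(np.uint8)


def render_in_circle(text, out_path, font_path=None):
    """Generate a word cloud confined to the circle and save it."""
    # Imported here so the mask helper works without wordcloud installed.
    from wordcloud import WordCloud

    wc = WordCloud(background_color="white", mask=circle_mask(),
                   font_path=font_path)
    wc.generate(text)
    wc.to_file(out_path)
    return wc
```

For Chinese text, pass a space-joined jieba result as text and a Chinese-capable font as font_path.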
More examples: Gallery of Examples — wordcloud 1.8.1 documentation
Reference article: python制作词云图_moshanghuali的博客-CSDN博客_python词云图