wordcloud word clouds: an important way to visualize results after data analysis in Python

1. Packages to install

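import numpy as np  # numerical data handling
import matplotlib.pyplot as plt  # plotting
from wordcloud import WordCloud  # word cloud generator
import jieba  # Chinese word segmentation
from imageio import imread  # image reading; install further packages as your own task requires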
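Before running the examples, it can help to confirm that these packages are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is illustrative, and the install command in the comment uses the usual pip distribution names (for example `Pillow` for the `PIL` module), which is an assumption about your environment.

```python
from importlib.util import find_spec

# Import names used in this post (PIL is used by the later mask example).
REQUIRED = ["numpy", "matplotlib", "wordcloud", "jieba", "imageio", "PIL"]


def missing_packages(names):
    """Return the import names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        # Typical fix (assumed names):
        # pip install numpy matplotlib wordcloud jieba imageio Pillow
        print("Missing packages:", ", ".join(missing))
    else:
        print("All packages are importable.")
```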

2. A few problems I ran into

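If the file being read contains Chinese text, reading it can raise an error. This is an encoding problem.

Solution: pass encoding="utf-8" to the open function.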

with open("./xxx.txt", "r", encoding="utf-8") as f:
    text = f.read()  # the with block closes the file, so no f.close() is needed
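If the encoding of a file is not known in advance, one option is to try a few candidates in order. This is a minimal sketch; the helper `read_text` and the candidate list (UTF-8 first, then GBK, which is common for Chinese text saved on Windows) are assumptions, not part of the original post.

```python
def read_text(path, encodings=("utf-8", "gbk")):
    """Read a text file, trying each encoding in turn."""
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as f:
                return f.read()
        except UnicodeDecodeError:
            # This encoding does not fit the bytes; try the next one.
            continue
    raise ValueError(f"could not decode {path!r} with any of {encodings}")
```

Note that a wrong encoding can sometimes decode without raising an error and produce garbled text, so it is still worth checking the result by eye.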

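When the word cloud is displayed, Chinese characters may fail to render and show up as empty boxes instead.

Solution: choose a font that supports Chinese. For example, pick a Chinese font from C:\Windows\Fonts\ on your computer, such as font = r'C:\Windows\Fonts\simfang.ttf', and then pass it to WordCloud with the parameter font_path=font.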
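The font fix can be sketched as follows. Only the `simfang.ttf` path comes from the text above; the other candidate paths and the helper names `pick_font` and `chinese_cloud` are illustrative assumptions, and font locations differ between machines.

```python
import os

# Candidate font files. Only the first path is from the post; the others
# are assumed examples and may not exist on your machine.
CANDIDATE_FONTS = [
    r"C:\Windows\Fonts\simfang.ttf",
    r"C:\Windows\Fonts\simhei.ttf",
    "/usr/share/fonts/truetype/wqy/wqy-microhei.ttc",
]


def pick_font(candidates=CANDIDATE_FONTS):
    """Return the first candidate path that exists, or None."""
    for font in candidates:
        if os.path.isfile(font):
            return font
    return None


def chinese_cloud(text, font_path):
    """Build a word cloud rendered with the given font file."""
    from wordcloud import WordCloud  # imported here so pick_font works without it
    return WordCloud(font_path=font_path, background_color="white").generate(text)
```

If `pick_font()` returns None, no candidate was found and a font path has to be supplied by hand.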

A few simple examples:

import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud

text = "square"  # the text to render

# Build a circular mask: entries equal to 255 are masked out,
# so words are drawn only inside the circle.
x, y = np.ogrid[:300, :300]
mask = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2
mask = 255 * mask.astype(int)

# repeat=True reuses the word until the shape is filled.
wc = WordCloud(background_color="white", repeat=True, mask=mask)
wc.generate(text)

plt.axis("off")
plt.imshow(wc, interpolation="bilinear")
plt.show()

Single-word content
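In the single-word example, mask entries equal to 255 (white) are treated as masked out, so the words land inside the circle. A small sketch that wraps the same construction in a reusable helper; the name `circle_mask` and its parameters are hypothetical.

```python
import numpy as np


def circle_mask(size=300, radius=130):
    """Return a size x size mask: 0 inside a centered circle, 255 outside."""
    x, y = np.ogrid[:size, :size]
    center = size // 2
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    return 255 * outside.astype(int)
```

With the defaults this reproduces the 300 x 300 mask used above, and the result can be passed as `mask=circle_mask()` to `WordCloud`.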


import os
from os import path
from wordcloud import WordCloud

# get data directory (using getcwd() is needed to support running example in generated IPython notebook)
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()

# Read the whole text.
text = open(path.join(d, 'constitution.txt')).read()

# Generate a word cloud image
wordcloud = WordCloud().generate(text)

# Display the generated image:
# the matplotlib way:
import matplotlib.pyplot as plt
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")

# lower max_font_size
wordcloud = WordCloud(max_font_size=40).generate(text)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()

Multi-word content, read from a file on the local computer
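When word counts have already been computed during analysis, they can be passed to the word cloud directly instead of raw text. A minimal sketch, assuming whitespace-separated text; `word_frequencies` and `cloud_from_counts` are illustrative helpers, while `generate_from_frequencies` is the WordCloud method that accepts a word-to-count mapping.

```python
from collections import Counter


def word_frequencies(text, min_length=2):
    """Count lowercase words of at least `min_length` characters."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return Counter(w for w in words if len(w) >= min_length)


def cloud_from_counts(counts):
    """Render a word cloud whose word sizes follow the given counts."""
    from wordcloud import WordCloud  # imported lazily; requires wordcloud
    return WordCloud(background_color="white").generate_from_frequencies(counts)
```

This route skips WordCloud's own tokenization and stopword handling, so any filtering has to happen while the counts are built.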



from os import path
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import os
from wordcloud import WordCloud, STOPWORDS

# get data directory (using getcwd() is needed to support running example in generated IPython notebook)
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()

# Read the whole text.
text = open(path.join(d, 'alice.txt')).read()

# read the mask image
# taken from
# http://www.stencilry.org/stencils/movies/alice%20in%20wonderland/255fk.jpg
alice_mask = np.array(Image.open(path.join(d, "alice_mask.png")))

stopwords = set(STOPWORDS)
stopwords.add("said")

wc = WordCloud(background_color="white", max_words=2000, mask=alice_mask,
               stopwords=stopwords, contour_width=3, contour_color='steelblue')

# generate word cloud
wc.generate(text)

# store to file
wc.to_file(path.join(d, "alice.png"))

# show
plt.imshow(wc, interpolation='bilinear')
plt.axis("off")
plt.figure()
plt.imshow(alice_mask, cmap=plt.cm.gray, interpolation='bilinear')
plt.axis("off")
plt.show()

Using an image to shape the word cloud
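The image example loads alice_mask.png as the mask directly, which works because its background is pure white. For an image whose background is only nearly white, one option is to threshold it first, since only entries equal to 255 are masked out. A hedged sketch; the helpers `binarize_mask` and `load_mask` and the threshold of 200 are assumptions for illustration.

```python
import numpy as np


def binarize_mask(gray, threshold=200):
    """Map a grayscale array to a mask: 255 (masked out) where bright, else 0."""
    gray = np.asarray(gray)
    return np.where(gray >= threshold, 255, 0).astype(np.uint8)


def load_mask(image_path, threshold=200):
    """Load an image, convert it to grayscale, and binarize it."""
    from PIL import Image  # imported lazily; requires Pillow
    return binarize_mask(Image.open(image_path).convert("L"), threshold)
```

The returned array can be passed as `mask=` to `WordCloud` in place of `alice_mask`.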

Example images

[Figure 1: a specific image used as the word cloud background]
[Figure 2: a word cloud composed of many words]
[Figure 3: a word cloud composed of a single word, example 1]
[Figure 4: a word cloud composed of a single word, example 2]
[Figure 5: a word cloud with Chinese and English text]
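The post does not list code for the Chinese and English word cloud. The following is a minimal sketch of one common approach, not necessarily the author's: segment the text with jieba, join the tokens with spaces so that WordCloud can split them, and pass a font that supports Chinese. The helper names and the `cut` parameter (a hook for substituting another tokenizer) are hypothetical.

```python
def segment(text, cut=None):
    """Split text into tokens and join them with single spaces."""
    if cut is None:
        import jieba  # imported lazily; requires jieba
        cut = jieba.lcut
    tokens = (token.strip() for token in cut(text))
    return " ".join(token for token in tokens if token)


def mixed_cloud(text, font_path):
    """Build a word cloud from text containing Chinese and English."""
    from wordcloud import WordCloud  # imported lazily; requires wordcloud
    wc = WordCloud(font_path=font_path, background_color="white")
    return wc.generate(segment(text))
```

Here `font_path` should point to a Chinese-capable font such as the simfang.ttf file mentioned earlier; otherwise the Chinese words render as boxes.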

For more information, see the official wordcloud site:

https://amueller.github.io/word_cloud/

It has more examples, and the content above was also compiled from that site.

See also:

https://blog.csdn.net/xiemanR/article/details/72796739?utm_source=blogxgwz7
