wordcloud usage notes

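While using wordcloud I ran into this error:

ValueError: We need at least 1 word to plot a word cloud, got 0.
It came up when processing Chinese text with wordcloud, yet passing the output of jieba segmentation to wordcloud.generate() worked fine. Printing the jieba result on its own showed that it was unicode, so I tried converting my text to unicode before passing it to generate(), and the error went away.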
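The conversion described above can be sketched as follows. This is a minimal illustration, not the exact code I used: to_unicode is a hypothetical helper name, and UTF-8 is an assumed source encoding that should be changed to match the real data.

```python
# -*- coding: utf-8 -*-


def to_unicode(text, encoding="utf-8"):
    """Return `text` as a unicode string, decoding byte strings if needed.

    Hypothetical helper; `encoding` is an assumption about the input data.
    """
    if isinstance(text, bytes):
        return text.decode(encoding)
    return text


# A byte string, such as the content of a file opened in binary mode
raw = u"古天乐 郭富城".encode("utf-8")

# Decode once, then hand the unicode string to wordcloud.generate()
text = to_unicode(raw)
print(repr(text))
```

Strings that are already unicode pass through unchanged, so the helper can be applied to every input before calling generate().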

Here is the full code:

# coding=utf-8
from os import path

import numpy as np
from PIL import Image
from wordcloud import WordCloud

# Directory containing this script
d = path.dirname(__file__)

# Unicode strings; byte strings triggered the ValueError above
words = [
    u'古天乐',
    u'郭富城',
    u'刘德华',
    u'周杰伦',
]

# Loaded for the shaped cloud described below; not passed to WordCloud here
mask = np.array(Image.open(path.join(d, "xxx.jpg")))

# A font with Chinese glyphs is required to render Chinese text
font = r'C:\Windows\Fonts\simfang.ttf'
wordcloud = WordCloud(font_path=font, width=500, height=600, margin=5,
                      background_color="white").generate(" ".join(words))
wordcloud.to_file(path.join(d, "sb_mask.png"))

The result looks like this:
sb_mask.png

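By default the output is a rectangular image. You can also supply a background image of your own so that the cloud takes on the shape drawn in it. Note that everything in the background image outside the desired shape must be pure white (255,255,255).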

mask = np.array(Image.open(path.join(d, "xxx.jpg")))

Pass mask to WordCloud (mask=mask) and the cloud is drawn in the shape of the mask.
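Since the mask here comes from a JPEG, lossy compression can leave the background slightly off-white, and such pixels no longer count as pure white. The sketch below shows one way to snap near-white pixels to exactly 255 before using the array as a mask. whiten_background is a hypothetical helper and the threshold of 250 is an assumed value, not something wordcloud itself provides.

```python
import numpy as np


def whiten_background(mask, threshold=250):
    """Return a copy of `mask` with near-white pixels set to exactly 255.

    Hypothetical helper; `threshold` is an assumed cutoff to tune per image.
    """
    cleaned = np.array(mask, dtype=np.uint8, copy=True)
    if cleaned.ndim == 3:
        # Colour image: a pixel is near-white only if R, G and B all are
        near_white = (cleaned[..., :3] >= threshold).all(axis=-1)
    else:
        # Greyscale image: compare the single channel directly
        near_white = cleaned >= threshold
    cleaned[near_white] = 255
    return cleaned


# Tiny 2x2 RGB stand-in for np.array(Image.open(...))
demo = np.array([[[255, 255, 255], [252, 251, 253]],
                 [[0, 0, 0], [252, 120, 30]]], dtype=np.uint8)
cleaned = whiten_background(demo)
print(cleaned.tolist())
```

The cleaned array can then be passed as WordCloud(mask=cleaned). Only pixels whose colour channels are all at or above the threshold are changed, so coloured parts of the shape are left alone.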

wordcloud command-line parameters

--text
specify file of words to build the word cloud (default: stdin)

Default: -

--regexp
override the regular expression defining what constitutes a word

--stopwords
specify file of stopwords (containing one word per line) to remove from the given text after parsing

--imagefile
file the completed PNG image should be written to (default: stdout)

Default: -

--fontfile
path to font file you wish to use (default: DroidSansMono)

--mask
mask to use for the image form

--colormask
color mask to use for image coloring

--contour_width
if greater than 0, draw mask contour (default: 0)

Default: 0

--contour_color
use given color as mask contour color - accepts any value from PIL.ImageColor.getcolor

Default: “black”

--relative_scaling
scaling of words by frequency (0 - 1)

Default: 0

--margin
spacing to leave around words

Default: 2

--width
define output image width

Default: 400

--height
define output image height

Default: 200

--color
use given color as coloring for the image - accepts any value from PIL.ImageColor.getcolor

--background
use given color as background color for the image - accepts any value from PIL.ImageColor.getcolor

Default: “black”

--no_collocations
do not add collocations (bigrams) to word cloud (default: add unigrams and bigrams)

Default: True

--include_numbers
include numbers in wordcloud?

Default: False

--min_word_length
only include words with more than X letters

Default: 0

--prefer_horizontal
ratio of times to try horizontal fitting as opposed to vertical

Default: 0.9

--scale
scaling between computation and drawing

Default: 1

--colormap
matplotlib colormap name

Default: “viridis”

--mode
use RGB or RGBA for transparent background

Default: “RGB”

--max_words
maximum number of words

Default: 200

--min_font_size
smallest font size to use

Default: 4

--max_font_size
maximum font size for the largest word

--font_step
step size for the font

Default: 1

--random_state
random seed

--no_normalize_plurals
whether to remove trailing ‘s’ from words

Default: True

--repeat
whether to repeat words and phrases

Default: False

--version
show program’s version number and exit
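To show how the options above fit together, here is a small sketch that assembles a command line from Python. It assumes the command-line tool is installed under the name wordcloud_cli; build_cli_args, words.txt and cloud.png are hypothetical names used only for illustration, and the command is built but not executed.

```python
def build_cli_args(text_file, image_file, **options):
    """Build an argument list for the wordcloud command-line tool.

    Hypothetical helper. Options set to True are emitted as bare flags
    (for example --repeat); other values are emitted as "--name value".
    """
    args = ["wordcloud_cli", "--text", text_file, "--imagefile", image_file]
    for name in sorted(options):
        value = options[name]
        if value is True:
            args.append("--" + name)
        elif value is not False and value is not None:
            args.extend(["--" + name, str(value)])
    return args


# Same settings as the Python example: 500x600, margin 5, white background
args = build_cli_args("words.txt", "cloud.png",
                      width=500, height=600, margin=5,
                      background="white", no_collocations=True)
print(" ".join(args))
# To actually run it: subprocess.run(args, check=True)
```

Options are sorted by name only to make the output predictable; the order of options on the command line does not matter here.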
