20220213memo#艽野尘梦#reading notes

Blue boxes mark military-related words, brown boxes mark Tibet-related words,
and gray boxes mark personal names.

Generating word-cloud images with Python to speed-read 《艽野尘梦》, the autobiography of 陈渠珍 (Chen Quzhen).

The two images above are both word clouds generated from 艽野尘梦.txt; only the background image differs.




Source of 《艽野尘梦》.txt:

https://d3.qinkan.net/d/txt/89/%E3%80%8A%E8%89%BD%E9%87%8E%E5%B0%98%E6%A2%A6%E3%80%8B(%E5%87%BA%E4%B9%A6%E7%89%88)_qinkan.net.txt  

About 陈渠珍 (Chen Quzhen):

https://baike.baidu.com/item/%E9%99%88%E6%B8%A0%E7%8F%8D/2576492    

Python script that generates the word cloud

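(Besides 艽野尘梦.txt, the script also needs 背景图片.png, the background image, and 剔除词列表.txt, the list of words to exclude, in order to generate the image.)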
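Since the script stops partway through if any of these inputs is missing, a quick pre-flight check may be useful. The following is a minimal sketch and not part of the original script: the helper name `missing_inputs` is hypothetical, and it assumes the three files sit in the working directory under the names the script opens.

```python
from pathlib import Path

# File names the script opens; adjust them if yours differ.
REQUIRED_INPUTS = ["艽野尘梦.txt", "背景图片.png", "剔除词列表.txt"]


def missing_inputs(folder=".", names=REQUIRED_INPUTS):
    """Return the required file names that do not exist in `folder`."""
    base = Path(folder)
    return [name for name in names if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```

Running it before the main script shows which of the inputs still has to be prepared.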

import pickle

import jieba as jb
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator


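# Segment the book with jieba and join the tokens with spaces so that
# WordCloud can tell the individual words apart.
# open() uses the platform default encoding here; pass encoding=... if
# the text is decoded incorrectly.
text = ''
with open(r'艽野尘梦.txt') as fin:
    for line in fin:
        line = line.strip('\n')
        text += ' '.join(jb.cut(line))
        text += ' '

# Cache the segmented text on disk (a pickle file, despite the .txt name).
with open('book2.txt', 'wb') as fout:
    pickle.dump(text, fout)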

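# Stop words: WordCloud's built-in English list plus the custom list
# in 剔除词列表.txt (one word per line). The original assignment replaced
# the built-in set instead of extending it.
sw = set(STOPWORDS)
with open('剔除词列表.txt') as fin:
    sw.update(line.rstrip() for line in fin if line.rstrip())

# Reload the cached segmented text.
with open('book2.txt', 'rb') as fr:
    text = pickle.load(fr)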

# The background image is used as the mask here and as the color source later.
backgroud_Image = np.array(Image.open('背景图片.png'))

wc = WordCloud(
    background_color='white',
    mask=backgroud_Image,
    max_words=800,
    stopwords=sw,
    # msyh.ttf (Microsoft YaHei) for Chinese; comic.TTF for English.
    font_path='C:/Windows/fonts/msyh.ttf',
    max_font_size=100,
    prefer_horizontal=1.0,  # draw every word horizontally
)

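wc.generate(text)

# Recolor the words with colors sampled from the background image.
image_colors = ImageColorGenerator(backgroud_Image)
wc.recolor(color_func=image_colors)

# Save the result first, then preview it on screen.
wc.to_file("输出照片.png")

plt.imshow(wc)
plt.axis('off')
plt.show()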
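For speed-reading, a plain ranked list of the most frequent words can complement the image. The following is a rough sketch using only the standard library. `top_words` is a hypothetical helper, and it assumes its input is a space-separated string like the `text` built by the script, with `sw` passed as the stop words. It simply counts whitespace-separated tokens, so the ranking may differ from the sizes WordCloud assigns, because WordCloud applies its own tokenization.

```python
from collections import Counter


def top_words(segmented_text, stopwords=(), n=20, min_length=2):
    """Count whitespace-separated tokens and return the n most frequent.

    Tokens found in `stopwords` and tokens shorter than `min_length`
    characters (mostly punctuation and particles) are skipped.
    """
    stop = set(stopwords)
    counts = Counter(
        token
        for token in segmented_text.split()
        if len(token) >= min_length and token not in stop
    )
    return counts.most_common(n)


if __name__ == "__main__":
    # Tiny hand-segmented sample, for illustration only (not from the book).
    sample = "西藏 的 军队 进入 西藏 , 军队 出发"
    for word, count in top_words(sample, stopwords={"出发"}, n=3):
        print(word, count)
```

In the script, calling `top_words(text, sw)` after the segmented text has been loaded would list the candidates for the cloud, which can also help decide what to add to 剔除词列表.txt.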
