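Python third-party libraries can be installed in three ways, which differ in flexibility and difficulty: installation with the pip tool, custom installation, and installation from a file.
pip install <name of the library to install>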
:\>pip install pygame
...
Installing collected packages: pygame
Successfully installed pygame-1.9.2b1
http://www.numpy.org/
http://www.scipy.org/scipylib/download.html
http://www.lfd.uci.edu/~gohlke/pythonlibs/
:\>pip install D:\pycodes\scipy-0.17.1-cp35-cp35m-win32.whl
Processing d:\pycodes\scipy-0.17.1-cp35-cp35m-win32.whl
Installing collected packages: scipy
Successfully installed scipy-0.17.1
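Whichever installation route is used, it can help to confirm from inside Python that the library really is present. The following is a minimal sketch using the standard-library importlib.metadata module (Python 3.8 or later); the helper name installed_version is made up for this illustration and is not part of pip.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report on the two libraries installed in the console sessions above.
    for name in ("pygame", "scipy"):
        found = installed_version(name)
        print(name, "->", found if found else "not installed")
```

The lookup reads installation metadata only, so it does not import the library itself.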
:\>pip -h
Usage:
pip <command> [options]
Commands:
install Install packages.
download Download packages.
uninstall Uninstall packages.
freeze Output installed packages in requirements format.
list List installed packages.
show Show information about installed packages.
search Search PyPI for packages.
wheel Build wheels from your requirements.
hash Compute hashes of package archives.
completion A helper command used for command completion
help Show help for commands.
pip uninstall <name of the library to uninstall>
pip list
pip show <name of the library to look up>
pip download
pip search <keyword to search for>
:\>pip search installer
winbrew (1.1.7) - Native package installer for Windows
pygitflow-avh (1.2.0) - Pythonic Installer for Git Flow
(AVH Edition).
notouch (0.3) - Notouch Physical Machine
Installer Automation Service
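The same subcommands can also be driven from a script. The sketch below assumes pip is available to the interpreter running the script; pip_command and run_pip are hypothetical helper names. Calling pip as a module of sys.executable targets the same interpreter that runs the script. At the time of writing, PyPI no longer serves the API behind pip search, so that subcommand is expected to fail against PyPI.

```python
import subprocess
import sys


def pip_command(subcommand, *args):
    """Build the argument list for `python -m pip <subcommand> ...`."""
    return [sys.executable, "-m", "pip", subcommand, *args]


def run_pip(subcommand, *args):
    """Run a pip subcommand and return its standard output as text."""
    result = subprocess.run(
        pip_command(subcommand, *args),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pip exits with an error
    )
    return result.stdout
```

For example, run_pip("list") returns the text that pip list prints, and run_pip("show", "jieba") corresponds to pip show jieba.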
:\>pip install PyInstaller
:\>PyInstaller <Python source file name>
:\>PyInstaller -F <Python source file name>
:\>PyInstaller -F SnowView.py
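Option | Function |
---|---|
-h, --help | Show help |
--clean | Remove the temporary files produced during packaging |
-D, --onedir | Default; generates a dist directory |
-F, --onefile | Generate only a single standalone packaged file in the dist folder |
-i <icon file name.ico> | Specify the icon file used by the packaged program |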
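The options in the table can also be assembled in Python and passed to PyInstaller in-process. This is a sketch under stated assumptions: pyinstaller_args and build are hypothetical helper names, and the icon name snow.ico in the usage note is invented. It relies on PyInstaller.__main__.run, the entry point PyInstaller documents for running it from Python code.

```python
def pyinstaller_args(script, onefile=False, clean=False, icon=None):
    """Translate the options from the table into a PyInstaller argument list."""
    args = [script]
    if onefile:
        args.append("--onefile")  # long form of -F
    if clean:
        args.append("--clean")
    if icon is not None:
        args.extend(["--icon", icon])  # long form of -i <file.ico>
    return args


def build(script, **options):
    """Run PyInstaller in-process; requires PyInstaller to be installed."""
    # Imported lazily so that pyinstaller_args works without PyInstaller.
    import PyInstaller.__main__

    PyInstaller.__main__.run(pyinstaller_args(script, **options))
```

With these helpers, build("SnowView.py", onefile=True) corresponds to the command PyInstaller -F SnowView.py, and build("SnowView.py", onefile=True, icon="snow.ico") adds an icon.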
:\>pip install jieba
>>>import jieba
>>>jieba.lcut("全国计算机等级考试")
Building prefix dict from the default dictionary ...
Loading model from cache C:\AppData\Local\Temp\jieba.cache
Loading model cost 1.001 seconds.
Prefix dict has been built succesfully.
['全国', '计算机', '等级', '考试']
>>>import jieba
>>>ls = jieba.lcut("全国计算机等级考试Python科目")
>>>print(ls)
['全国', '计算机', '等级', '考试', 'Python', '科目']
>>>import jieba
>>>ls = jieba.lcut("全国计算机等级考试Python科目", cut_all=True)
>>>print(ls)
['全国', '国计', '计算', '计算机', '算机', '等级', '考试',
'Python', '科目']
>>>import jieba
>>>ls = jieba.lcut_for_search("全国计算机等级考试Python科目")
>>>print(ls)
['全国', '计算', '算机', '计算机', '等级', '考试', 'Python', '科目']
>>>import jieba
>>>jieba.add_word("Python科目")
>>>ls = jieba.lcut("全国计算机等级考试Python科目")
>>>print(ls)
['全国', '计算机', '等级', '考试', 'Python科目']
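The sessions above call jieba in three ways: lcut for exact segmentation, lcut with cut_all=True for full segmentation, and lcut_for_search for search-engine style segmentation. A small sketch that runs all three side by side can make the differences easier to compare. The function name segment_all_modes and its segmenter parameter are assumptions introduced here; the parameter exists only so the helper can be exercised without jieba installed.

```python
def segment_all_modes(text, segmenter=None):
    """Segment text in exact, full, and search mode and return all three results."""
    if segmenter is None:
        # Imported lazily; requires `pip install jieba`.
        import jieba
        segmenter = jieba
    return {
        "exact": segmenter.lcut(text),
        "full": segmenter.lcut(text, cut_all=True),
        "search": segmenter.lcut_for_search(text),
    }
```

Calling segment_all_modes("全国计算机等级考试Python科目") with jieba installed returns a dictionary holding the three lists shown in the sessions above.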
:\>pip install wordcloud
>>>from wordcloud import WordCloud
>>>txt='I like python. I am learning python'
>>>wordcloud = WordCloud().generate(txt)
>>>wordcloud.to_file('testcloud.png')
import jieba
from wordcloud import WordCloud
txt = '程序设计语言是计算机能够理解和识别用户操作意图的一种交互体系,它按照特定规则组织计算机指令,使计算机能够自动进行各种运算处理。'
words = jieba.lcut(txt) # exact-mode segmentation
newtxt = ' '.join(words) # join the words with spaces
wordcloud = WordCloud(font_path="msyh.ttc").generate(newtxt)
wordcloud.to_file('词云中文例子图.png') # save the image
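Parameter | Function |
---|---|
font_path | Full path to the font file; default None |
width | Width of the generated image; default 400 pixels |
height | Height of the generated image; default 200 pixels |
mask | Shape of the word cloud; default None, i.e. a rectangular image |
min_font_size | Smallest font size used in the word cloud; default 4 |
font_step | Step between font sizes; default 1 |
max_font_size | Largest font size used in the word cloud; default None, adjusted automatically from the height |
max_words | Maximum number of words in the word cloud; default 200 |
stopwords | Collection of excluded words; excluded words are not shown in the word cloud |
background_color | Background color of the image; default black |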
Method | Function |
---|---|
generate(text) | Generate a word cloud from the text |
to_file(filename) | Save the word cloud image to a file named filename |
from wordcloud import WordCloud
import numpy as np
from PIL import Image
# load the mask image as an array (scipy.misc.imread is no longer available in SciPy 1.2+)
mask = np.array(Image.open('AliceMask.png'))
with open('AliceInWonderland.txt', 'r', encoding='utf-8') as file:
    text = file.read()
wordcloud = WordCloud(background_color="white",
                      width=800,
                      height=600,
                      max_words=200,
                      max_font_size=80,
                      mask=mask,
                      ).generate(text)
# save the image
wordcloud.to_file('AliceInWonderland.png')
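When no mask image such as AliceMask.png is at hand, a mask can be generated in code. The sketch below builds a circular one; according to the wordcloud documentation, pure white (255) entries in the mask are treated as masked out, so words are drawn only where the value is 0. The helper names circle_mask_rows and circular_wordcloud are invented for this example.

```python
def circle_mask_rows(size, radius):
    """Return a size x size grid: 0 inside a centred circle, 255 (masked out) outside."""
    centre = (size - 1) / 2
    return [
        [0 if (x - centre) ** 2 + (y - centre) ** 2 <= radius ** 2 else 255
         for x in range(size)]
        for y in range(size)
    ]


def circular_wordcloud(text, size=300):
    """Draw text into a circular word cloud; requires numpy and wordcloud."""
    # Imported lazily so that circle_mask_rows works without these libraries.
    import numpy as np
    from wordcloud import WordCloud

    mask = np.array(circle_mask_rows(size, size // 2 - 10), dtype=np.uint8)
    return WordCloud(background_color="white", mask=mask).generate(text)
```

The returned object can be saved with to_file, exactly as in the example above.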
# CalStoryOfStone.py
import jieba
# assumes 红楼梦.txt is saved as UTF-8
with open("红楼梦.txt", "r", encoding="utf-8") as f:
    txt = f.read()
words = jieba.lcut(txt)
counts = {}
for word in words:
    if len(word) == 1:  # skip single-character segmentation results
        continue
    else:
        counts[word] = counts.get(word, 0) + 1
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)  # sort by count, highest first
for i in range(15):
    word, count = items[i]
    print("{0:<10}{1:>5}".format(word, count))
>>>
宝玉 3748
什么 1613
一个 1451
贾母 1228
我们 1221
那里 1174
凤姐 1100
王夫人 1011
你们 1009
如今 999
说道 973
知道 967
老太太 966
起来 949
姑娘 941
# CalSto…
excludes = {"什么","一个","我们","那里","你们","如今",
            "说道","知道","老太太","起来","姑娘","这里",
            "出来","他们","众人","自己","一面","太太",
            "只见","怎么","奶奶","两个","没有","不是",
            "不知","这个","听见"}
for word in excludes:
    counts.pop(word, None)  # drop the word if present; no error if it is absent
>>>
宝玉 3748
贾母 1228
凤姐 1100
王夫人 1011
贾琏 670
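The counting and exclusion steps can be written more compactly with collections.Counter from the standard library. This is an alternative sketch, not the listing's own method: top_words is a hypothetical helper, and the sample word list in the demo is made up rather than taken from the novel.

```python
from collections import Counter


def top_words(words, n=15, excludes=()):
    """Count words longer than one character, skip excluded ones, return the top n."""
    counts = Counter(
        word for word in words
        if len(word) > 1 and word not in excludes
    )
    return counts.most_common(n)


if __name__ == "__main__":
    # Invented sample standing in for the output of jieba.lcut(txt).
    sample = ["宝玉", "说道", "宝玉", "的", "贾母", "说道", "宝玉"]
    for word, count in top_words(sample, n=2, excludes={"说道"}):
        print("{0:<10}{1:>5}".format(word, count))
```

Filtering while counting avoids deleting keys afterwards, so a word in excludes that never occurs in the text causes no error.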
import jieba
from wordcloud import WordCloud
excludes = {"什么","一个","我们","那里","你们","如今",
            "说道","知道","老太太","起来","姑娘","这里",
            "出来","他们","众人","自己","一面","太太",
            "只见","怎么","奶奶","两个","没有","不是",
            "不知","这个","听见"}
# assumes 红楼梦.txt is saved as UTF-8
with open("红楼梦.txt", "r", encoding="utf-8") as f:
    txt = f.read()
words = jieba.lcut(txt)
newtxt = ' '.join(words)
wordcloud = WordCloud(background_color="white",
                      width=800,
                      height=600,
                      font_path="msyh.ttc",
                      max_words=200,
                      max_font_size=80,
                      stopwords=excludes,
                      ).generate(newtxt)
wordcloud.to_file('红楼梦基本词云.png')
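Passing space-joined text to generate makes WordCloud count the words again on its own. When a dictionary of counts already exists, such as the counts built in CalStoryOfStone.py, the frequencies can instead be handed over directly with WordCloud's generate_from_frequencies method. The sketch below assumes such a dictionary; top_frequencies and cloud_from_counts are hypothetical helper names, and the font msyh.ttc is carried over from the listing above.

```python
def top_frequencies(counts, n=5, excludes=()):
    """Keep the n highest counts whose keys are not excluded, as a new dict."""
    kept = {word: count for word, count in counts.items() if word not in excludes}
    ranked = sorted(kept.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])


def cloud_from_counts(counts, filename, font_path="msyh.ttc"):
    """Render a word cloud from a {word: count} dict; requires wordcloud."""
    # Imported lazily so that top_frequencies works without wordcloud.
    from wordcloud import WordCloud

    cloud = WordCloud(background_color="white", font_path=font_path)
    cloud.generate_from_frequencies(counts)
    cloud.to_file(filename)
```

Selecting the top entries before drawing gives direct control over which words appear, independently of the max_words setting.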
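This topic introduced the modular programming approach behind working with Python third-party libraries and how to understand and use the computing ecosystem, and went on to explain how to segment Chinese documents with the jieba library and then count word frequencies.
This topic centers on Python third-party libraries: it explains how to obtain and install them and covers three specific libraries in detail, namely PyInstaller for packaging programs, jieba for Chinese word segmentation, and wordcloud for word cloud visualization. The worked examples, counting character appearances in Dream of the Red Chamber and rendering the results as word clouds, help readers become proficient with these three libraries.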
…0 parameter is changed to max_words=5, which gives a word cloud made up of the five characters with the most appearances.
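There is no shortage of classics and famous works, Chinese and foreign. Beyond Dream of the Red Chamber, which texts interest you? Word frequency counts, character counts, word clouds: try combining all three!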