Learning Python: Day 3

I. String sequences

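- Slicing
1. First, define a sequence

name = 'abcdefg'
print(name[1:4])

The output is bcd
A slice is read as [start:stop:step]; the element at the stop index is not included, and the step defaults to 1.

print(name[0:7:2])
Result: aceg

When slicing the whole sequence, the start and stop positions can be omitted

print(name[::2])
Result: aceg

Get the number of elements in a sequence

print(len(name))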
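
A few more slice forms follow from the [start:stop:step] rule above. This is only a small sketch; the variable names other than `name` are made up for illustration.

```python
# Same sample string as in the notes above
name = 'abcdefg'

first_three = name[:3]      # start omitted: begins at index 0
from_index_4 = name[4:]     # stop omitted: runs to the end
last_two = name[-2:]        # negative indices count from the end
reversed_name = name[::-1]  # a step of -1 walks the string backwards

print(first_three, from_index_4, last_two, reversed_name)
```

Slicing never modifies `name`; every slice is a new string.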

Replacing

price = '$999'
price = price.replace('$','¥')
print(price)
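
Strings are immutable, so replace() does not change the original string; it returns a new one, which is why the result is assigned back to price above. A short sketch with a made-up price string, also showing the optional count argument:

```python
price = '$999 or $1999'

# replace() returns a new string; the original is left unchanged
converted = price.replace('$', '¥')

# The optional third argument limits how many occurrences are replaced
first_only = price.replace('$', '¥', 1)

print(price)
print(converted)
print(first_only)
```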

Turning a list into a string: join

li = ['a', 'b', 'c', 'd']
a = ','.join(li)
print(a)
print(type(a))
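
join() only works when every element is already a string. The sketch below, using a made-up list of numbers, converts the elements first and then uses split() to go back from the string to a list.

```python
numbers = [1, 2, 3]

# join() only accepts strings, so convert each element first
joined = ','.join(str(n) for n in numbers)

# split() is the inverse: it turns the string back into a list of strings
parts = joined.split(',')

print(joined)
print(parts)
```

Note that split() gives back strings, not the original integers.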

II. Tuples

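a = ('zhangsan', 'lisi', 'wangwu', 1000)
b = ('lisi',)
c = (1000,)
# Tuples are similar to strings: indexing starts at 0, and they can be sliced and concatenated,
# but the elements of a tuple cannot be modified
# A tuple with only one element needs a comma after that element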
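
The three comments above can be checked directly. A minimal sketch; the names `first`, `middle`, `combined` and so on are only for illustration.

```python
a = ('zhangsan', 'lisi', 'wangwu', 1000)

first = a[0]            # indexing starts at 0
middle = a[1:3]         # slicing returns a new tuple
combined = a + ('x',)   # concatenation builds a new tuple

# Assigning to an element raises TypeError because tuples are immutable
try:
    a[0] = 'changed'
    modified = True
except TypeError:
    modified = False

# Without the trailing comma, the parentheses are just grouping
not_a_tuple = (1000)
one_element = (1000,)

print(first, middle, combined, modified)
print(type(not_a_tuple), type(one_element))
```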

III. Dictionaries

- Definition

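# General form: d = {key1: value1, key2: value2}
# Keys are unique: if a key is repeated, the later value replaces the earlier one. Values do not need to be unique.

info = {'a': 1, 'b': 2, 'c': '3'}
# Values can be of any data type, but keys must be immutable, such as strings, numbers, or tuples.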
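
Both rules about keys can be seen in a small experiment. The dictionaries `d` and `grid` below are made-up examples, not part of the original notes.

```python
# A repeated key keeps only the last value
d = {'a': 1, 'b': 2, 'a': 3}

# Tuples are hashable when all their elements are, so they work as keys
grid = {(0, 0): 'origin', (1, 2): 'point'}

# A list is mutable and therefore unhashable, so it cannot be a key
try:
    bad = {[1, 2]: 'value'}
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(d)
print(grid[(1, 2)])
print(list_key_ok)
```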

# Access
print(info['a'])

# Modify
info['c'] = '8'

# Add
info['d'] = '5'

# Get all keys in the dictionary
print(info.keys())

# Get all values in the dictionary
print(info.values())

# Get all key-value pairs in the dictionary
print(info.items())

# Iterate over the dictionary
for k, v in info.items():
    print(k, v)
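
Indexing with a key that is not in the dictionary raises a KeyError. When a key might be missing, the `in` operator and get() are safer. A minimal sketch; the default value 'unknown' is chosen only for illustration.

```python
info = {'a': 1, 'b': 2, 'c': '3'}

# 'name' is not a key, so info['name'] would raise KeyError
has_name = 'name' in info

# get() returns a default instead of raising
name_or_default = info.get('name', 'unknown')
existing = info.get('a', 'unknown')

print(has_name, name_or_default, existing)
```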

IV. Sorting a list

# Define a list
stu_info = [
    {"name":'zhangsan', "age":18},
    {"name":'lisi', "age":30},
    {"name":'wangwu', "age":99},
    {"name":'tiaqi', "age":3},
]

# Custom key function
# def function_name(parameters):
#     function body
# reverse=True sorts in descending order
def sort_by_age(x):
    return x['age']

stu_info.sort(key=sort_by_age, reverse=True)
print('After sorting:', stu_info)
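
list.sort() reorders the list in place. If the original order should be kept, the built-in sorted() returns a new list instead, and the key function can be written as a lambda or with operator.itemgetter. A sketch with a shortened copy of the list:

```python
from operator import itemgetter

stu_info = [
    {"name": 'zhangsan', "age": 18},
    {"name": 'lisi', "age": 30},
    {"name": 'wangwu', "age": 99},
]

# sorted() returns a new list and leaves stu_info untouched
by_age_desc = sorted(stu_info, key=lambda x: x['age'], reverse=True)

# itemgetter('age') builds the same kind of key function
by_age_asc = sorted(stu_info, key=itemgetter('age'))

oldest_first = [s['name'] for s in by_age_desc]
youngest_first = [s['name'] for s in by_age_asc]
print(oldest_first)
print(youngest_first)
```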

V. Reading local files

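# Python uses the built-in open() function to read files
# file is the file path, mode='r' opens it read-only, encoding sets the character encoding
# The with statement closes the file automatically when the block ends
with open(file='./novel/threekingdom.txt', mode='r', encoding='utf-8') as f:
    content = f.read()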


# Write
txt = 'i like python'
with open('python.txt','w', encoding='utf-8') as f:
    f.write(txt)
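
To see writing and reading together, the sketch below writes a file, appends to it, and reads it back line by line. It uses a temporary directory (an assumption made here so that no real files are touched); the file name and the second line of text are made up.

```python
import os
import tempfile

txt = 'i like python'

# Use a temporary directory so the sketch does not touch real files
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'python.txt')

    # 'w' creates the file, or truncates it if it already exists
    with open(path, 'w', encoding='utf-8') as f:
        f.write(txt)

    # 'a' appends to the end instead of overwriting
    with open(path, 'a', encoding='utf-8') as f:
        f.write('\nsecond line')

    # Iterating over the file object yields one line at a time
    with open(path, 'r', encoding='utf-8') as f:
        lines = [line.rstrip('\n') for line in f]

print(lines)
```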

- Word segmentation with the jieba module

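# Import the jieba word segmentation library
import jieba
# Three segmentation modes
seg = "我来到北京清华大学"
# Precise mode: segments the sentence as accurately as possible
seg_list = jieba.lcut(seg)
print(seg_list)
# Full mode: finds all possible words, so the result is highly redundant
seg_list1 = jieba.lcut(seg, cut_all=True)
print(seg_list1)
# Search engine mode
seg_list2 = jieba.lcut_for_search(seg)
print(seg_list2)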


Output:
['我', '来到', '北京', '清华大学']
['我', '来到', '北京', '清华', '清华大学', '华大', '大学']
['我', '来到', '北京', '清华', '华大', '大学', '清华大学']

- Displaying a word cloud

# Display a word cloud
from wordcloud import WordCloud
# Draw the word cloud
text = 'He was an old man who fished alone in a skiff in the Gulf Stream and he had gone eighty-four days now without taking a fish. In the first forty days a boy had been with him. But after forty days without a fish the boy’s parents had told him that the old man was now definitely and finally salao, which is the worst form of unlucky, and the boy had gone at their orders in another boat which caught three good fish the first week. It made the boy sad to see the old man come in each day with his skiff empty and he always went down to help him carry either the coiled lines or the gaff and harpoon and the sail that was furled around the mast. The sail was patched with flour sacks and, furled, it looked like the flag of permanent defeat.'
wc = WordCloud().generate(text)
wc.to_file('老人与海.png')

VI. Example: word segmentation of Dream of the Red Chamber (红楼梦)

import jieba

with open('./novel/all.txt', 'r', encoding='utf-8') as f:
    words = f.read()

    counts = {}
    excludes = {"什么","一个","我们","你们","如今","说道","知道","姑娘","起来","这里","出来","众人","那里","奶奶","自己",
                "老太太","太太","一面","只见","两个","没有","怎么","不是","不知","这个","听见","这样","进来","咱们","就是",
                "东西","告诉","回来","只是","大家","老爷","只得","丫头","这些","他们","不敢","出去","所以","不过","凤姐儿",
                "不过","不好","姐姐","的话","一时","过来","不能","心里","二爷","她们","如此","银子","今日","二人","答应",
                "几个","这么","还有","只管","说话","那边","一回","这话","外头","自然","打发","哪里","今儿","罢了","那些",
                "哪些","屋里","问道","小丫头","如何","听说","人家","看见","媳妇","不用"}

    words_list = jieba.lcut(words)
    # print(words_list)

    for word in words_list:
        if len(word) <= 1:
            continue
        else:
            counts[word] = counts.get(word, 0) + 1

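    # Merge the different names used for the same character.
    # get() avoids a KeyError if a name never appears in the text.
    counts['贾母'] = counts.get('贾母', 0) + counts.get('老太太', 0)
    counts['黛玉'] = counts.get('黛玉', 0) + counts.get('林黛玉', 0)
    counts['宝玉'] = counts.get('宝玉', 0) + counts.get('贾宝玉', 0)
    counts['宝钗'] = counts.get('宝钗', 0) + counts.get('薛宝钗', 0)
    counts['贾政'] = counts.get('老爷', 0) + counts.get('贾政', 0)
    counts['王夫人'] = counts.get('王夫人', 0) + counts.get('太太', 0)
    counts['凤姐'] = counts.get('凤姐儿', 0) + counts.get('凤姐', 0) + counts.get('王熙凤', 0)

    # Remove the excluded words; pop() with a default ignores words that were never counted
    for word in excludes:
        counts.pop(word, None)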

    items = list(counts.items())
    # print(items)

    def sort_by_count(x):
        return x[1]
    items.sort(key=sort_by_count, reverse=True)

    li = []
    lo = []
    for i in range(20):
        role, count = items[i]
        li.append(role)
        lo.append(count)
    print(li)
    print(lo)
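
The same counting idea can also be sketched with collections.Counter from the standard library. This is not the approach used above, only an alternative: aliases are folded into one name while counting, so alias entries such as '林黛玉' do not remain in the result. The token list, the alias table and the short excludes set below are hypothetical stand-ins; in the real script the tokens would come from jieba.lcut(words).

```python
from collections import Counter

# Hypothetical token list standing in for the output of jieba.lcut(words)
words_list = ['宝玉', '说', '黛玉', '贾宝玉', '什么', '宝玉', '林黛玉', '什么']

# Hypothetical alias table: alias -> canonical name
aliases = {'贾宝玉': '宝玉', '林黛玉': '黛玉'}
excludes = {'什么', '一个'}

counts = Counter()
for word in words_list:
    if len(word) <= 1:
        continue                      # skip single-character tokens
    word = aliases.get(word, word)    # fold aliases into one name
    if word in excludes:
        continue                      # drop excluded words up front
    counts[word] += 1

# most_common(n) returns up to n (word, count) pairs, highest count first
top = counts.most_common(20)
print(top)
```

most_common(20) also copes with texts that contain fewer than 20 distinct words, where indexing items[i] in a range(20) loop would raise an IndexError.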
