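I. Using Sets

A set is an unordered collection of unique elements. Its main uses are:

  • De-duplication: converting a list to a set removes duplicates automatically
  • Relationship testing: computing the intersection, difference, union, etc. between two groups of data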
list_1 = [1, 4, 5, 7, 3, 6, 7, 9]
list_1 = set(list_1)

print(list_1, type(list_1))

list_2 = set([2, 6, 0, 66, 22, 8, 4])
print(list_1, list_2)

print(list_1.intersection(list_2))             # intersection
print(list_1.union(list_2))                    # union
print(list_1.difference(list_2))               # difference: in list_1 but not in list_2
print(list_2.difference(list_1))               # difference: in list_2 but not in list_1
print(list_1.issubset(list_2))                 # is list_1 a subset of list_2?
print(list_1.issuperset(list_2))               # is list_1 a superset of list_2?

list_3 = set([1, 3, 7])
print(list_3.issubset(list_1))                 # list_3 is a subset of list_1
print(list_1.issuperset(list_3))               # list_1 is a superset of list_3
print(list_1.symmetric_difference(list_2))     # symmetric difference: elements that are in exactly one of the two sets

list_4 = set([5, 6, 8])
print(list_3.isdisjoint(list_4))               # True when the intersection is empty

# Using operators
print(list_1 & list_2)                         # intersection
print(list_1 | list_2)                         # union
print(list_1 - list_2)                         # difference: in list_1 but not in list_2
print(list_1 ^ list_2)                         # symmetric difference

# Basic operations
list_1.add(999)                                # add one element
print(list_1)

list_1.update([888, 777, 555])                 # add several elements
print(list_1)

list_1.remove(888)                             # remove a given element
print(list_1)

print(list_1.pop())                            # remove and return an arbitrary element

list_1.discard("ddd")                          # remove the element if present, otherwise do nothing (remove() raises KeyError when it is missing)

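x = 5
x in list_4                                    # is x a member of list_4?
x not in list_4                                # is x not a member of list_4?
list_4.issubset(list_2)                        # is every element of list_4 also in list_2?
list_4 <= list_2                               # same test, operator form

list_4.issuperset(list_2)                      # is every element of list_2 also in list_4?
list_4 >= list_2                               # same test, operator form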
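To make the difference between remove() and discard() concrete, and to show one common way to de-duplicate while keeping the original order (which set() alone does not promise), here is a minimal sketch. The names nums, unique and ordered_unique are illustrative and not part of the notes above.

```python
nums = [1, 4, 5, 7, 3, 6, 7, 9]

# set() removes duplicates but does not promise any particular order.
unique = set(nums)

# dict keys keep insertion order (Python 3.7+), so this de-duplicates
# while preserving the order of first appearance.
ordered_unique = list(dict.fromkeys(nums))
print(ordered_unique)

# discard() silently ignores a missing element ...
unique.discard("ddd")

# ... while remove() raises KeyError for a missing element.
try:
    unique.remove("ddd")
except KeyError as exc:
    missing = exc.args[0]
    print("remove() raised KeyError for", missing)
```

If order matters, the dict.fromkeys form is usually the safer choice; if only membership matters, a plain set is enough.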

II. File Operations

1. Open the file, obtain a file handle and assign it to a variable

2. Operate on the file through the handle

3. Close the file
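The three steps can be sketched as follows. This is only an illustration: the scratch file demo.txt in a temporary directory is an assumption made so the snippet can run anywhere, and try/finally is one way to make sure step 3 always happens.

```python
import os
import tempfile

# Hypothetical scratch file, used only so the sketch is runnable.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w", encoding="utf-8")   # 1. open and keep the handle
try:
    f.write("hello\n")                  # 2. work through the handle
finally:
    f.close()                           # 3. close, even if step 2 fails

print(f.closed)
```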

The sample file (named yesterday) is shown below

Somehow, it seems the love I knew was always the most destructive kind
不知为何,我经历的爱情总是最具毁灭性的的那种
Yesterday when I was young
昨日当我年少轻狂
The taste of life was sweet
生命的滋味是甜的
As rain upon my tongue
就如舌尖上的雨露
I teased at life as if it were a foolish game
我戏弄生命 视其为愚蠢的游戏
The way the evening breeze
就如夜晚的微风
May tease the candle flame
逗弄蜡烛的火苗
The thousand dreams I dreamed
我曾千万次梦见
The splendid things I planned
那些我计划的绚丽蓝图
I always built to last on weak and shifting sand
但我总是将之建筑在易逝的流沙上
I lived by night and shunned the naked light of day
我夜夜笙歌 逃避白昼赤裸的阳光
And only now I see how the time ran away
事到如今我才看清岁月是如何匆匆流逝
Yesterday when I was young
昨日当我年少轻狂
So many lovely songs were waiting to be sung
有那么多甜美的曲儿等我歌唱
So many wild pleasures lay in store for me
有那么多肆意的快乐等我享受
And so much pain my eyes refused to see
还有那么多痛苦 我的双眼却视而不见
I ran so fast that time and youth at last ran out
我飞快地奔走 最终时光与青春消逝殆尽
I never stopped to think what life was all about
我从未停下脚步去思考生命的意义
And every conversation that I can now recall
如今回想起的所有对话
Concerned itself with me and nothing else at all
除了和我相关的 什么都记不得了
The game of love I played with arrogance and pride
我用自负和傲慢玩着爱情的游戏
And every flame I lit too quickly, quickly died
所有我点燃的火焰都熄灭得太快
The friends I made all somehow seemed to slip away
所有我交的朋友似乎都不知不觉地离开了
And only now I'm left alone to end the play, yeah
只剩我一个人在台上来结束这场闹剧
Oh, yesterday when I was young
噢 昨日当我年少轻狂
So many, many songs were waiting to be sung
有那么那么多甜美的曲儿等我歌唱
So many wild pleasures lay in store for me
有那么多肆意的快乐等我享受
And so much pain my eyes refused to see
还有那么多痛苦 我的双眼却视而不见
There are so many songs in me that won't be sung
我有太多歌曲永远不会被唱起
I feel the bitter taste of tears upon my tongue
我尝到了舌尖泪水的苦涩滋味
The time has come for me to pay for yesterday
终于到了付出代价的时间 为了昨日
When I was young
当我年少轻狂
data = open("yesterday",encoding="utf-8").read()  
f = open("yesterday",'r',encoding="utf-8")  # file handle; r is read-only mode
data = f.read()
data2 = f.read()                            # the pointer is already at the end of the file
print(data)
print('--------data2----%s---' %data2)      # data2 is printed as an empty string
f.close()
f = open("yesterday2",'w',encoding='utf-8')  # w creates a file for writing and overwrites an existing file
f.write("我爱北京天安门,\n")
f.write("天安门上太阳升")
f.close()
f = open("yesterday2",'a',encoding="utf-8")  # a is append mode; the file cannot be read
f.write("\n我爱北京天安门,\n天安门上太阳升")
f.close()
f = open("yesterday","r",encoding="utf-8")   # read the first five lines
for i in range(5):
    print(f.readline())                      # each readline() call reads one line
f.close()
f = open("yesterday","r",encoding="utf-8")
print(f.readlines())                         # readlines() reads all lines into a list; suitable for small files
f.close()
f = open("yesterday","r",encoding="utf-8")
for line in f.readlines():                   # loop over every line
    print(line.strip())
f.close()

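Inefficient loop: readlines() loads the whole file into memory first

f = open("yesterday","r",encoding="utf-8")
for index,line in enumerate(f.readlines()):                   # print a separator instead of the 10th line (index 9)
    if index == 9:
        print("----separator---")
        continue
    print(line.strip())
f.close()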

Preferred loop: fastest, because f is read one line at a time and is not a list

f = open("yesterday","r",encoding="utf-8")
count = 0
for line in f:
    if count == 9:
        print("----separator----")
        count += 1
        continue
    print(line)
    count += 1
f.close()
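The manual counter above can also be replaced by enumerate applied directly to the file object, which still reads one line at a time. The sketch below assumes a hypothetical 12-line scratch file instead of the yesterday file so that it is self-contained.

```python
import os
import tempfile

# Hypothetical sample file with 12 short lines.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w", encoding="utf-8") as out:
    for n in range(12):
        out.write("line %d\n" % n)

shown = []
with open(path, "r", encoding="utf-8") as f:
    # The file object yields one line at a time, so only the current
    # line is held in memory; enumerate supplies the counter.
    for index, line in enumerate(f):
        if index == 9:
            shown.append("----separator----")
            continue
        shown.append(line.rstrip("\n"))

print("\n".join(shown))
```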
f = open("yesterday", "r", encoding="utf-8")
print(f.tell())                 # current position of the file pointer
print(f.readline())
print(f.readline())
print(f.readline())
print(f.tell())
f.seek(0)                       # back to the start of the file
print(f.readline())
f.seek(10)                      # move to offset 10
print(f.readline())
print(f.encoding)               # the file's encoding
print(f.fileno())               # the file descriptor number
print(f.name)                   # the file name
print(f.flush())                # flush buffered data to disk (flush() returns None)
f.close()
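The positions reported by tell() and accepted by seek() are byte offsets, not character counts, which matters for files that contain Chinese text. A small sketch in binary mode makes this visible; the scratch file pos.txt and its contents are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pos.txt")
# newline="\n" keeps line endings identical on every platform.
with open(path, "w", encoding="utf-8", newline="\n") as out:
    out.write("abc\n你好\n")

with open(path, "rb") as f:
    start = f.tell()            # nothing read yet
    f.readline()                # b"abc\n" is 4 bytes
    after_first = f.tell()
    f.readline()                # each Chinese character is 3 bytes in UTF-8
    after_second = f.tell()
    f.seek(0)                   # jump back to the beginning
    first_again = f.readline()

print(start, after_first, after_second, first_again)
```

In text mode, seeking to an offset that falls in the middle of a multi-byte character can fail on the next read, so offsets are best taken from an earlier tell() call.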

Progress bar

import sys,time

for i in range(20):
    sys.stdout.write("#")
    sys.stdout.flush()
    time.sleep(0.1)
f = open("yesterday2", "a", encoding="utf-8")
f.seek(10)                      # has no effect on where truncate(10) cuts
f.truncate(10)                  # truncate: keep the first 10 bytes and drop the rest
f.close()
f = open("yesterday2", "r+", encoding="utf-8")      # r+ is read/write mode
print(f.readline())
print(f.readline())
print(f.readline())
print(f.tell())
f.write("--------diao--------")                     # ends up appended at the very end of the file rather than at the current read position
print(f.readline())
f.close()
f = open("yesterday2", "w+", encoding="utf-8")      # w+ is write/read mode: write first, then read
f.write("------diao------1\n")
f.write("------diao------1\n")
f.write("------diao------1\n")
f.write("------diao------1\n")
print(f.tell())
f.seek(10)
print(f.tell())
print(f.readline())
f.write("should be at the beginning of the second line")     # actually written at the end of the file
f.close()
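f = open("yesterday2", "a+", encoding="utf-8")          # a+ is append-and-read mode
f.close()
f = open("yesterday2", "rb")              # read in binary mode, so no encoding argument; Python 3 network transfer requires bytes (Python 2 also accepted str); also used for files such as video
print(f.readline())
f.close()
f = open("yesterday2", "wb")                # write in binary mode
f.write("hello binary\n".encode())          # encode() turns the str into bytes using the default encoding, utf-8
f.close()
f = open("yesterday2", "ab")                  # append in binary mode
f.close()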

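Summary

  • File opening modes

    • r, read-only mode (default)
    • w, write-only mode. [not readable; created if missing; contents erased if it exists]
    • a, append mode. [not readable; created if missing; writes are only appended if it exists]
  • "+" means the file can be both read and written

    • r+, read/write. [readable; writable; can append]
    • w+, write/read
    • a+, readable and appendable
  • "U" means \r, \n and \r\n are all converted to \n when reading (used together with r or r+). This is a Python 2 era flag: Python 3 text mode already does this by default, and the flag was removed in Python 3.11

    • rU
    • r+U
  • "b" means the file is handled as binary (e.g. sending or uploading an ISO image over FTP). In Python 2 it could be omitted on Linux but had to be given on Windows; in Python 3 it always means reading and writing bytes instead of str
    • rb
    • wb
    • ab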
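The mode table can be checked with a short experiment. The sketch below uses a hypothetical scratch file modes.txt; the variable names are illustrative only.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "w", encoding="utf-8") as f:   # w creates the file
    f.write("first\n")

with open(path, "a", encoding="utf-8") as f:   # a appends but cannot read
    f.write("second\n")
    try:
        f.read()
        append_readable = True
    except io.UnsupportedOperation:
        append_readable = False

with open(path, "a+", encoding="utf-8") as f:  # a+ appends and can read
    f.seek(0)                                  # the pointer starts at the end
    content = f.read()

with open(path, "w", encoding="utf-8") as f:   # w empties an existing file
    pass
size_after_w = os.path.getsize(path)

print(append_readable, repr(content), size_after_w)
```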

Modifying a string in a file: read the original file and write the result to a new file

f = open("yesterday","r",encoding="utf-8")
f_new = open("yesterday2.bak","w",encoding="utf-8")
for line in f:
    if "肆意的快乐等我享受" in line:
        line=line.replace("肆意的快乐等我享受","肆意的快乐等Alex享受")
    f_new.write(line)
f.close()
f_new.close()
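If the goal is for the original file name to end up with the modified content, one possible follow-up step is to swap the new file into place with os.replace once both files are closed. The sketch below is an assumption about how that could look, not something the notes above do; the file names song.txt and song.txt.bak are hypothetical.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "song.txt")        # hypothetical file names
tmp = os.path.join(workdir, "song.txt.bak")

with open(src, "w", encoding="utf-8") as f:
    f.write("pleasures lay in store for me\nso much pain\n")

with open(src, "r", encoding="utf-8") as f, \
        open(tmp, "w", encoding="utf-8") as f_new:
    for line in f:
        # replace() returns the line unchanged when there is no match
        f_new.write(line.replace("for me", "for Alex"))

# Both files are closed here, so the new file can take the old name.
os.replace(tmp, src)

with open(src, "r", encoding="utf-8") as f:
    result = f.read()
print(result)
```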

The with statement: the file is closed automatically, so there is no need to call close() each time

with open("yesterday2","r",encoding="utf-8") as f:
    for line in f:
        print(line)

Opening several files at once

with open("yesterday2","r",encoding="utf-8") as f,\
    open("yesterday","r",encoding="utf-8") as f2:
    for line in f:
        print(line)

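III. Character Encoding and Transcoding

1. In Python 2 the default encoding is ASCII; in Python 3 the default encoding is UTF-8 and every str is a Unicode string

2. Unicode text can be encoded as UTF-32 (4 bytes per character), UTF-16 (2 or 4 bytes per character) or UTF-8 (1 to 4 bytes per character). UTF-16 is widely used as an in-memory representation (for example in Java and the Windows API), but files are usually stored as UTF-8 because it saves space for mostly-ASCII text

3. In Python 3, encode converts the text and also turns the str into bytes; decode converts back and turns the bytes into a str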
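A short Python 3 sketch of these points: the same two characters take a different number of bytes in each encoding, and bytes must be decoded with the codec that produced them. The "-le" codec variants are used here only to leave out the byte order mark so that the lengths are easy to compare.

```python
s = "你好"

utf8_bytes = s.encode("utf-8")        # str -> bytes
gbk_bytes = s.encode("gbk")
utf16_bytes = s.encode("utf-16-le")   # "-le" omits the byte order mark
utf32_bytes = s.encode("utf-32-le")

print(type(utf8_bytes), len(utf8_bytes))
print(len(gbk_bytes), len(utf16_bytes), len(utf32_bytes))

# bytes -> str needs the matching codec.
round_trip = gbk_bytes.decode("gbk")

# Decoding with the wrong codec fails for this input.
try:
    gbk_bytes.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
```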

Encoding conversion in Python 2

#-*- coding:utf8 -*-
import sys

print(sys.getdefaultencoding())  # print the interpreter's default encoding
s="你好"
print(s)

# a utf-8 byte string must first be decoded to unicode
s_to_unicode=s.decode("utf-8")
print(s_to_unicode,type(s_to_unicode))
# unicode to gbk
s_to_gbk=s_to_unicode.encode("gbk")
print(s_to_gbk)

# gbk to utf-8: decode gbk to unicode first, then encode as utf-8
gbk_to_utf8=s_to_gbk.decode("gbk").encode("utf-8")
print(gbk_to_utf8)

t=u"你好"               # the u prefix makes t a unicode string already, so no decode step is needed
t_to_gbk= t.encode("gbk")
print(t_to_gbk)

In Python 3 the default encoding is utf-8

import sys
print(sys.getdefaultencoding())

s = "你哈"
s_gbk=s.encode("gbk")

print(s_gbk)
print(s.encode())

gbk_to_utf8= s_gbk.decode("gbk").encode("utf-8")
print("utf8",gbk_to_utf8)

The source file's encoding is declared as gbk, but strings inside the program are still unicode

#!/usr/bin/env python3
# -*- coding:gbk -*-
# Author: Erick Zhang

import sys
print(sys.getdefaultencoding())

s="你哈"
print(s.encode("gbk"))
print(s.encode("utf-8"))
print(s.encode("utf-8").decode("utf-8").encode("gb2312").decode("gb2312"))

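IV. Functions

Three programming styles

Object-oriented: the Huashan school ---> class ---> class

Procedural: the Shaolin school ---> procedure ---> def

Functional programming: the Xiaoyao school ---> function ---> def

How programming languages define a function:

A function is a programming method that structures logic and makes it procedural

How to define a function in Python: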

def test(x):
    """the function definitions"""
    x+=1
    return x

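def: the keyword that defines a function
test: the function name
(): formal parameters can be defined inside the parentheses
""" """: the docstring (optional, but strongly recommended for every function)
x+=1: stands for any code block or processing logic
return: defines the return value

In Python, a procedure (a function without a return statement) returns None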

# function
def func1():
    """testing"""
    print('in the func1')
    return 0

# procedure
def func2():
    """testing2"""
    print('in the func2')

x = func1()
y = func2()
print('from func1 return is %s' % x)
print('from func2 return is %s' % y)

Output:
in the func1
in the func2
from func1 return is 0
from func2 return is None

What functions are for

  • Reducing duplicated code
  • Making the program extensible
  • Making the program easier to maintain
import time

def logger():
    time_format = '%Y-%m-%d %X'
    time_current = time.strftime(time_format)
    with open('a.txt','a+') as f:
        f.write('%s end action\n' % time_current)

def test1():
    print('in the test1')
    logger()

def test2():
    print('in the test2')
    logger()

def test3():
    print('in the test3')
    logger()

test1()
test2()
test3()

The return value

def test():
    print('in the test1')
    return 0

x=test()
print(x)

Three forms of return

def test1():
    print('in the test1')

def test2():
    print('in the test2')
    return 0

def test3():
    print('in the test3')
    return 1,'hello',['alex','wupeiqi'],{'name':'alex'}

x=test1()
y=test2()
z=test3()
print(x)
print(y)
print(z)
Output:
None
0
(1, 'hello', ['alex', 'wupeiqi'], {'name': 'alex'})

Summary:

  • Number of return values = 0: returns None
  • Number of return values = 1: returns that object
  • Number of return values > 1: returns a tuple
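Because several return values come back as one tuple, the caller can either keep the tuple or unpack it into separate names. A minimal sketch, using a made-up helper min_max:

```python
def min_max(values):
    """Return the smallest and largest item of a non-empty sequence."""
    return min(values), max(values)   # packed into one tuple


result = min_max([3, 1, 4, 1, 5])     # keep the tuple as a whole
low, high = min_max([3, 1, 4, 1, 5])  # or unpack it into two names

print(result, type(result))
print(low, high)
```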

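Function parameters

1. Positional arguments

Actual arguments are the values that really exist in memory; they must match the formal parameters one to one

# x and y are formal parameters; 1 and 2 are actual arguments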
def test(x,y):
    print(x)
    print(y)

test(1,2)

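2. Keyword arguments

Independent of the order of the formal parameters; keyword arguments must come after positional arguments, and the same parameter cannot be given a value twice

def test(x,y):
    print(x)
    print(y)

test(y=2,x=3)
test(3,y=2)
# test(3,x=2)   # TypeError: x receives two values
# test(x=2,3)   # SyntaxError: positional argument follows keyword argument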

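3. Default parameters

A default parameter does not have to be passed when the function is called; default parameters must be defined after the positional parameters; a default value should normally be an immutable type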

def test(x,y=2):
    print(x)
    print(y)

test(1)
test(1,3)
test(1,y=1)
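The advice to use immutable defaults exists because a default value is created once, when the function is defined, and then shared by every call. A small sketch of the pitfall and the usual workaround; the function names are made up for illustration.

```python
def append_bad(item, bucket=[]):      # the list is created only once
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):   # None is immutable
    if bucket is None:
        bucket = []                   # a fresh list on every call
    bucket.append(item)
    return bucket


bad_first = list(append_bad(1))       # copy to record the state now
bad_second = list(append_bad(2))      # the earlier item is still there
good_first = append_good(1)
good_second = append_good(2)

print(bad_first, bad_second)
print(good_first, good_second)
```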

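4. Variable-length parameters

Variable-length means that the number of actual arguments is not fixed. Arguments can be passed by position or by keyword, so there are two matching kinds of formal parameter to collect them: *args and **kwargs

def test(*args):
    print(args)

test(1,2,3,4,5,5)
test(*[1,2,3,4,5,5])    # the list is unpacked, so args = tuple([1,2,3,4,5,5])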

def test1(x,*args):
    print(x)
    print(args)

test1(1,2,4,5,6,7)

def test2(**kwargs):
    print(kwargs)

test2(name='alex',age=8,sex='F')
test2(**{'name':'alex','age':8})

def test3(name,**kwargs):
    print(name)
    print(kwargs)

test3('alex',age=18,sex='m')

def test4(name,age=18,**kwargs):
    print(name)
    print(age)
    print(kwargs)

test4('alex',sex='m',hobby='tesla',age=3)

def test5(name,age=18,*args,**kwargs):
    print(name)
    print(age)
    print(args)
    print(kwargs)

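test5('alex',34,'a',sex=18,hobby='tesla')    # 34 binds to age and 'a' goes to args; passing age=34 after a second positional value would raise TypeError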
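How the arguments are distributed among name, age, *args and **kwargs can be seen by returning them instead of printing them. This is a sketch with a made-up function describe that has the same signature as test5.

```python
def describe(name, age=18, *args, **kwargs):
    return name, age, args, kwargs


# Only the required parameter: the default age is used.
a = describe('alex')

# Extra positional values fill age first, then land in args.
b = describe('alex', 34, 'a', 'b')

# Unknown keywords are collected in kwargs; age keeps its default.
c = describe('alex', sex='m', hobby='tesla')

print(a)
print(b)
print(c)
```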

5. Global and local variables

A variable defined inside a subroutine is a local variable; its scope is the subroutine that defines it

def change_name(name):
    print("before change",name)
    name = "Alex li"
    age = 23
    print("after change", name)

name = "alex"
change_name(name)
print("age", age)

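Output:
before change alex
after change Alex li
Traceback (most recent call last):
  File "/Users/erick/PycharmProjects/oldboy_python/day2/局部变量.py", line 14, in <module>
    print("age", age)
NameError: name 'age' is not defined

A variable defined at the top level of the program is a global variable; its scope is the whole program

When a global variable and a local variable share a name, the local variable is used inside the subroutine that defines it, and the global variable is used everywhere else.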

school = "Oldboy edu"

def change_name(name):
    school = "Mage Linux"
    print("before change",name,school)
    name = "Alex li"
    print("after change", name)

name = "alex"
change_name(name)
print(school)

Output:
before change alex Mage Linux
after change Alex li
Oldboy edu

With the global keyword, a subroutine can declare that a name refers to the global variable and then rebind it

school = "Oldboy edu"

def change_name(name):
    global school
    school = "Mage Linux"
    print("before change",name,school)
    name = "Alex li"
    print("after change", name)

name = "alex"
change_name(name)
print("school",school)

Output:
before change alex Mage Linux
after change Alex li
school Mage Linux
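One related detail worth sketching: global is only needed to rebind a global name. Changing the contents of a mutable global object, such as a list, works without it. The names below are made up for illustration.

```python
names = ["alex", "jack"]
counter = 0


def rename_first():
    names[0] = "Alex Li"      # mutates the existing list; no global needed


def bump_wrong():
    counter = 1               # creates a new local name; the global is untouched
    return counter


def bump_right():
    global counter
    counter += 1              # rebinds the global name


rename_first()
bump_wrong()
after_wrong = counter         # still the original value
bump_right()
after_right = counter

print(names, after_wrong, after_right)
```

Relying on functions that silently modify global state tends to make programs harder to follow, so it is usually kept to a minimum.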