BourneA

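A Collection of Commonly Used Python 3 Code Snippets

This article collects code snippets that I use often while learning and working with Python 3. All of the code has been tested and works.

1. Basic Data Types

Strings

String basics:

A double quote inside a string can be escaped with a backslash (\"), or the whole string can be wrapped in single quotes instead, for example: 'I l"o"ve fishc.com'
Strings support slicing, for example Str1[:6] returns the first 6 characters (indices 0-5).

String methods are called with dot notation and return a new string; the original is unchanged. For a string s, for example, s.capitalize() returns a new string.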

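# String concatenation
>>> print("nihao"+"a")
nihaoa

# Multiplying a string by an integer repeats it, here 3 times (the same as adding it to itself 3 times)
>>> print("nihao\n"*3)
nihao
nihao
nihao

# end= is printed after the string instead of the default newline; handy in loops, e.g. to end with a space rather than a new line
>>> print("never-ending ", end="love")
never-ending love

# Get the length of a string
>>> len("chilema")
7

# Insert a string between every character of another string
>>> str1 = "sh"
>>> str1.join("12345")
'1sh2sh3sh4sh5'

Base conversion

# Decimal to binary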
>>> bin(10)
'0b1010'
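
The same idea extends to other bases. The following is a minimal sketch using only built-ins; the variable names are chosen here for illustration and are not from the original notes.

```python
# Convert a decimal integer to other bases (the prefixes 0b, 0o, 0x mark the base)
n = 10
as_binary = bin(n)   # '0b1010'
as_octal = oct(n)    # '0o12'
as_hex = hex(n)      # '0xa'

# format() gives the digits without the prefix
plain_binary = format(n, "b")   # '1010'

# int() with an explicit base parses a string back into an integer
from_binary = int("1010", 2)    # 10
from_hex = int("0xa", 16)       # 10

print(as_binary, as_octal, as_hex, plain_binary, from_binary, from_hex)
```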

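Random numbers

Python's built-in random module can draw samples from many distributions, including the beta, exponential, gamma, Gaussian, log-normal, Pareto, and Weibull distributions. For details, see the official documentation page for random ("Generate pseudo-random numbers").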

Basic samples

>>> from random import *
>>> random()                             # Random float:  0.0 <= x < 1.0
0.37444887175646646

>>> uniform(2.5, 10.0)                   # Random float:  2.5 <= x < 10.0
3.1800146073117523

>>> expovariate(1 / 5)                   # Interval between arrivals averaging 5 seconds
5.148957571865031

>>> randrange(10)                        # Integer from 0 to 9 inclusive
7

>>> randrange(0, 101, 2)                 # Even integer from 0 to 100 inclusive
26

>>> choice(['win', 'lose', 'draw'])      # Single random element from a sequence
'draw'

>>> deck = 'ace two three four'.split()
>>> shuffle(deck)                        # Shuffle a list
>>> deck
['four', 'two', 'ace', 'three']

>>> sample([10, 20, 30, 40, 50], k=4)    # Four samples without replacement
[40, 10, 50, 30]

Simulations

>>> # Six roulette wheel spins (weighted sampling with replacement)
>>> choices(['red', 'black', 'green'], [18, 18, 2], k=6)
['red', 'green', 'black', 'black', 'red', 'black']

>>> # Deal 20 cards without replacement from a deck of 52 playing cards
>>> # and determine the proportion of cards with a ten-value
>>> # (a ten, jack, queen, or king).
>>> import collections
>>> deck = collections.Counter(tens=16, low_cards=36)
>>> seen = sample(list(deck.elements()), k=20)
>>> seen.count('tens') / 20
0.15

>>> # Estimate the probability of getting 5 or more heads from 7 spins
>>> # of a biased coin that settles on heads 60% of the time.
>>> def trial():
...     return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5
...
>>> sum(trial() for i in range(10_000)) / 10_000
0.4169

>>> # Probability of the median of 5 samples being in middle two quartiles
>>> def trial():
...     return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500
...
>>> sum(trial() for i in range(10_000)) / 10_000
0.7958

Simulation of arrival times and service deliveries for a multiserver queue

from heapq import heappush, heappop
from random import expovariate, gauss
from statistics import mean, median, stdev

average_arrival_interval = 5.6
average_service_time = 15.0
stdev_service_time = 3.5
num_servers = 3

waits = []
arrival_time = 0.0
servers = [0.0] * num_servers  # time when each server becomes available
for i in range(100_000):
    arrival_time += expovariate(1.0 / average_arrival_interval)
    next_server_available = heappop(servers)
    wait = max(0.0, next_server_available - arrival_time)
    waits.append(wait)
    service_duration = gauss(average_service_time, stdev_service_time)
    service_completed = arrival_time + wait + service_duration
    heappush(servers, service_completed)

print(f'Mean wait: {mean(waits):.1f}.  Stdev wait: {stdev(waits):.1f}.')
print(f'Median wait: {median(waits):.1f}.  Max wait: {max(waits):.1f}.')

MD5 hash

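import hashlib  # import the hashlib module

md = hashlib.md5()  # create an MD5 hash object (MD5 is a hash function, not encryption)
md.update('how to use md5 in hashlib?'.encode('utf-8'))  # feed in the bytes to be hashed
print(md.hexdigest())  # the digest as a hexadecimal string

Check the type of a variable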

>>> tmp = [1,2,3]
>>> isinstance(tmp, list)
# Out: True

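2. Loops

Breaking out of nested loops

for … else … break

The else clause of a for loop runs only after the loop has finished normally. If a break is executed inside the loop, the else clause is skipped.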

for i in range(5):
  for j in range(5):
    for k in range(5):
      if i == j == k == 3:
        break
      else:
        print(i, '----', j, '----', k)
    else: continue
    break
  else: continue
  break

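The program above leaves all three loops as soon as i = j = k = 3 and prints nothing further.

Using a flag variable

a = [[1, 2, 3], [5, 5, 6], [7, 8, 9]]
flag = True  # stays True until the target value is found
for i in range(3):
  for j in range(3):
    if a[i][j] == 5:
      flag = False
      break
  if not flag:
    break

Custom exception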

class StopLoopError(Exception): pass
try:
  for i in range(5):
    for j in range(5):
      for k in range(5):
        if i == j == k == 3:
          raise StopLoopError()
        else:
          print(i, '----', j, '----', k)
except StopLoopError:
  pass
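
A further option, not covered above, is to move the nested loops into a function and use return, which leaves every loop at once. This is only a sketch; the function name find_first_triple and its parameters are made up for illustration.

```python
def find_first_triple(limit, target):
    """Return the first (i, j, k) with i == j == k == target, or None."""
    for i in range(limit):
        for j in range(limit):
            for k in range(limit):
                if i == j == k == target:
                    return (i, j, k)  # leaves all three loops at once
    return None  # the loops finished without a match


result = find_first_triple(5, 3)
print(result)
```

Compared with a flag or a custom exception, this keeps the exit condition in one place, at the cost of introducing an extra function.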

3. Functions

Default argument values: def my_func(a, b=5, c=10):
Keyword (named) arguments: my_func(a=1, c=2)
*args scoops up any number of remaining positional arguments into a tuple. The parameter name can be anything, not just args. No more positional parameters can follow *args; any parameter after it must be passed by keyword, e.g. def func1(a, b, *args, d): func1(1, 2, 3, 4, d=30)
**kwargs scoops up any number of remaining keyword arguments into a dictionary. Unlike keyword-only arguments, it can be specified even if the positional arguments have not been exhausted. No parameters can come after **kwargs.

def func1(a, b, *args):
    print(a, b, args)

func1(1,2) # if nothing is passed for *args, it is an empty tuple
# out: 1 2 ()
l = [1,2,3,4,5]
func1(*l) # unpack a list as arguments
# out: 1 2 (3, 4, 5)

# Average of the arguments
# a and b: returns a if a is falsy, otherwise returns b
# a or b: returns a if a is truthy, otherwise returns b
def avg(*args):
    count = len(args)
    total = sum(args)
    return count and total/count # "and" returns 0 when no arguments are given, avoiding a division by zero
    
# to forbid positional arguments: callers can only pass keyword arguments
def func(*, d):
    pass  # code

# a bare * marks the end of the positional parameters
def func(a, b, *, d): # at most two positional arguments; d is a keyword-only parameter
    pass  # code

def func(*, d, **kwargs):
    print(d, kwargs)
func(d=1, a=2, b=3, c=4)
#out: 1 {'a': 2, 'b': 3, 'c': 4}

# use *args and **kwargs together
def func(*args, **kwargs):
    print(args, kwargs)

func(1, 2, b=3, c=4, d=5)	
#out: (1, 2) {'b': 3, 'c': 4, 'd': 5}

# cached factorial: results that were already computed are looked up instead of recalculated
# (the mutable default argument cache={} is created once and shared between calls)
def factorial(n, cache={}):
    if n < 1:
        return 1
    elif n in cache:
        return cache[n]
    else:
        print("calculating {0}".format(n))
        result = n * factorial(n-1)
        cache[n] = result
        return result

Lambda Expression (Anonymous Function)

# lambda with one input
>>> g = lambda x: 3*x + 1
>>> g(3)
10

# lambda with multiple inputs (two or more), e.g. combining first name and last name
# strip() removes leading and trailing whitespace
# title() capitalizes the first letter of each word
>>> full_name = lambda fn, ln: fn.strip().title() + " " + ln.strip().title()
>>> full_name("  ZHAng  ", "sAN")
'Zhang San'

# sort a list by key using lambda (the key is the leading number, compared as a string)
>>> list_example = ["9 jiu", "1 yi", "5 wu", "3 san"]
>>> list_example.sort(key = lambda word: word.split(" ")[0])
>>> list_example
['1 yi', '3 san', '5 wu', '9 jiu']

#function returns function, e.g. Quadratic Functions f(x) = ax^2 +bx + c
>>> def build_quadratic_function(a, b, c):
...	return lambda x: a*x**2 + b*x + c
...
>>> f = build_quadratic_function(1, 3, 2)
>>> f(0)
2
>>> f(1)
6

Reducing function arguments (partial function)

A partial function fixes some of the arguments of the original function, so fewer need to be passed when it is called. This is useful because some higher-order functions only accept a one-parameter function as their argument, as in the following example.

# sort some points in the x-y plane by their distance from the origin
origin = (0, 0)
l = [(1,1), (-3, -2), (-2, 1), (0, 0)]
dist = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2  # squared distance, which is enough for ordering

# dist needs two arguments, but the key parameter of sorted() only accepts a one-parameter function, so one argument has to be fixed in advance

from functools import partial
f = partial(dist, origin)

print(sorted(l, key=f))

# you can also use lambda function
print(sorted(l, key=lambda x: dist(x, origin)))

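4. Containers and Their Operations

Sets

# modify sets
>>> example1 = set()
>>> example1.add("yi") # add an element
>>> example1.add("er")
>>> example1.update([1,4],[5,6]) # update can add several elements at once
>>> example2 = set([28, True, 3.14, "nihao", "yi", "er"])
>>> len(example2)
6

# Removing elements (done on a copy, so example2 stays intact for the checks below)
>>> tmp = example2.copy()
>>> tmp.remove("er") # remove "er" from the set; raises KeyError if the element is missing
>>> tmp.discard("Facebook")  # no error if the element is missing
>>> x = tmp.pop() # remove an arbitrary element and assign it to x
>>> tmp.clear() # empty the set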

# evaluate union and intersection of two sets
>>> example1.union(example2)
>>> example1.intersection(example2)
>>> "nihao" in example2 # 查看元素是否在集合内
True
>>> "nihao" not in example2
False

# Operations between two sets
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a                                  
{'a', 'r', 'b', 'c', 'd'}
>>> a - b                              # elements in a but not in b
{'r', 'd', 'b'}
>>> a | b                              # elements in a or b (or both)
{'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
>>> a & b                              # elements in both a and b
{'a', 'c'}
>>> a ^ b                              # elements in a or b, but not in both
{'r', 'd', 'b', 'm', 'z', 'l'}

>>> example1.isdisjoint(example2)	# True if the two sets have no elements in common, otherwise False
>>> example1.issubset(example2)	# True if every element of example1 is also in example2

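Tuples

Once defined, a tuple cannot be changed: its elements cannot be reassigned or deleted with del; only the whole tuple can be deleted.
Tuples are sliced the same way as lists.
Create an empty tuple: temp = ()
Create a one-element tuple: temp = (1,). The comma is required; the parentheses are optional.
To "change" a tuple, build a new one. For example, with temp = (1, 2, 3, 4), temp = temp[:2] + (6,) + temp[2:] gives (1, 2, 6, 3, 4). This is tuple concatenation, and the same approach works for strings.

>>> temp = 1,2,3
>>> temp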
(1, 2, 3)

>>> 8*(8,)
(8, 8, 8, 8, 8, 8, 8, 8)

Unzip a list of tuples

zipped_list = [(1, 'a'), (2, 'b'), (3, 'c')]
list_a, list_b = zip(*zipped_list)
print(list_a)
# out: (1, 2, 3)
print(list_b)
# out: ('a', 'b', 'c')

An iterator yields its elements one at a time, so len() cannot be used on it. To get the actual list, loop over the zip object or pass it to list().

list_a = [1, 2, 3] 
list_b = [4, 5, 6]

zipped = zip(list_a, list_b) # a zip object (an iterator)
# len(zipped)  -> TypeError: object of type 'zip' has no len()
# zipped[0]    -> TypeError: 'zip' object is not subscriptable
list_c = list(zipped) # [(1, 4), (2, 5), (3, 6)]
list_d = list(zipped) # [] because the line above already exhausted the iterator

Named tuples

Named tuples subclass tuple and add a layer that assigns property names to the positional elements. namedtuple is located in the collections standard library module. Named tuples are also regular tuples, so we can still handle them like any other tuple (by index, by slice, or by iteration). Named tuples are immutable.

from collections import namedtuple
'''it is a function(class factory) which generates(return) a new class that
inherits from tuple. The new class provides named properties to access
elements of the tuple and an instance of that class is still a tuple'''

'''namedtuple needs a few things to generate this class:
1.the class name we want to use
2.a sequence(list, tuple) of field names(strings) we want to assign, in the order of the elements in that tuple
'''

Point2D = namedtuple('Point2D', ['x', 'y']) # the variable name is capitalized because it holds a class returned by the function
#the following three have the same effect
#Point2D = namedtuple('Point2D', ('x', 'y'))
#Point2D = namedtuple('Point2D', 'x, y')
#Point2D = namedtuple('Point2D', 'x y')
'''in fact, the __new__ method of the generated class uses the field names we provided as param names'''

# we can easily find out the field names in a named tuple generated class
>>> Point2D._fields
('x', 'y')
>>> print(Point2D._source)
... # prints the source of the generated class (Python 3.6 and earlier only; _source was removed in 3.7)

>>> pt = Point2D(10, 20)
>>> isinstance(pt, tuple)
True

# extract named tuple values to a dictionary by using an instance method.
# the keys keep the field order (an OrderedDict up to Python 3.7, a plain dict from 3.8)
>>> pt._asdict()
OrderedDict([('x', 10), ('y', 20)])
# to make it a normal dictionary
>>> dict(pt._asdict())
{'x': 10, 'y': 20}

# we can handle it just like a normal tuple
x, y = pt
x = pt[0]
for e in pt: print(e)

# in addition, we can also access the data using the field name
>>> pt.x # note: you cannot assign a value to it, since named tuples are immutable
10
>>> pt.y
20

# modify named tuples (create a new one)
>>> Stock = namedtuple('Stock', 'symbol year month day open high low close')
>>> djia = Stock('DJIA', 2018, 1, 25, 26_313, 26_458, 26_260, 26_393)
>>> djia
Stock(symbol='DJIA', year=2018, month=1, day=25, open=26313, high=26458, low=26260, close=26393)

>>> djia = djia._replace(year = 2017, open = 10000)
>>> djia
Stock(symbol='DJIA', year=2017, month=1, day=25, open=10000, high=26458, low=26260, close=26393)

>>> Stock._make(djia[:7] + (1000, )) # _make can take a tuple as parameter
Stock(symbol='DJIA', year=2017, month=1, day=25, open=10000, high=26458, low=26260, close=1000)

# extend named tuples 
Stock = namedtuple('Stock', Stock._fields + ('newOne', ))

# set default values by using __defaults__
>>> Stock = namedtuple('Stock', 'symbol year month day open high low close')
>>> Stock.__new__.__defaults__ = (0, 0, 0) # defaults are matched to the fields from the right: high, low, close
>>> djia = Stock(1, 2, 3, 4, 5)
>>> djia
Stock(symbol=1, year=2, month=3, day=4, open=5, high=0, low=0, close=0)

# update defaults
>>> Stock.__new__.__defaults__ = (-10, -10, -10)
>>> djia = Stock(1, 2, 3, 4, 5)
>>> djia
Stock(symbol=1, year=2, month=3, day=4, open=5, high=-10, low=-10, close=-10)
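
On Python 3.7 and later, namedtuple also accepts a defaults argument, which may be a tidier alternative to assigning __new__.__defaults__ by hand. The sketch below reuses the Stock fields from above; the sample values are illustrative.

```python
from collections import namedtuple

# defaults= applies to the rightmost fields, here high, low and close
Stock = namedtuple(
    "Stock",
    "symbol year month day open high low close",
    defaults=(0, 0, 0),
)

djia = Stock("DJIA", 2018, 1, 25, 26_313)  # high, low, close fall back to 0
print(djia)
print(Stock._field_defaults)  # mapping of field name -> default value
```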

# return multiple values using a named tuple
# here the function returns a random color
from random import randint, random
from collections import namedtuple

Color = namedtuple('Color', 'red green blue alpha')

def random_color():
    red = randint(0, 255)
    green = randint(0, 255)
    blue = randint(0, 255)
    alpha = round(random(), 2) # rounded to two decimal places
    return Color(red, green, blue, alpha)
    
# convert a list of dictionaries to named tuples

def tuplify_dicts(dicts):
    keys = {key for dict_ in dicts for key in dict_.keys()}
    Struct = namedtuple('Struct', sorted(keys), rename=True)
    Struct.__new__.__defaults__ = (None, ) * len(Struct._fields)
    return [Struct(**dict_) for dict_ in dicts]

data_list = [
    {'key2': 2, 'key1': 1},
    {'key1': 3,'key2': 4},
    {'key1': 5, 'key2': 6, 'key3': 7},
    {'key2': 100}
]

tuple_list = tuplify_dicts(data_list)

>>> tuple_list
[Struct(key1=1, key2=2, key3=None),
 Struct(key1=3, key2=4, key3=None),
 Struct(key1=5, key2=6, key3=7),
 Struct(key1=None, key2=100, key3=None)]

'''If you only read a lot of key-value pairs, a namedtuple can be used instead of a dictionary for efficiency.
And if your class only holds values and doesn't need mutability, a namedtuple is preferred because it saves space'''

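Lists

Find runs of consecutive numbers in a list and group them

When the numbers in the list are already in ascending order

from itertools import groupby
lst = [1,2,3,5,6,7,8,11,12,13,19]

func = lambda x: x[1] - x[0]  # value minus index is constant within a consecutive run
for k, g in groupby(enumerate(lst), func):
    l1 = [j for i, j in g]
    if len(l1) > 1:
        scop = str(min(l1)) + '_' + str(max(l1))
    else:
        scop = l1[0]
    print("Consecutive range: {}".format(scop))

When the numbers in the list are not sorted, sort them first

lst = [4, 2, 1, 5, 6, 7, 8, 11, 12, 13, 19]

# hand-written exchange sort into ascending order; the built-in lst.sort() gives the same result
for i in range(len(lst)):
    for j in range(len(lst)):
        if lst[i] < lst[j]:
            lst[i], lst[j] = lst[j], lst[i]
print("Sorted list: {}".format(lst))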
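
The two steps, sorting and then grouping, can be combined into one helper. This is a sketch under a few assumptions: the function name consecutive_ranges is made up, it uses the built-in sorted() rather than a hand-written sort, and it drops duplicate values before grouping.

```python
from itertools import groupby


def consecutive_ranges(numbers):
    """Group integers into runs of consecutive values, returned as (first, last) pairs."""
    ordered = sorted(set(numbers))  # sort and drop duplicates first
    ranges = []
    # value - index is constant within a run of consecutive integers
    for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        run = [value for _, value in group]
        ranges.append((run[0], run[-1]))
    return ranges


ranges = consecutive_ranges([4, 2, 1, 5, 6, 7, 8, 11, 12, 13, 19])
print(ranges)
```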

Permutations and combinations of list elements

Permutations (with repetition, via the Cartesian product)

from itertools import product
l = [1, 2, 3]
print(list(product(l, l)))
# out: [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]
print(list(product(l, repeat=3)))
# out: [(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1), (1, 2, 2), (1, 2, 3), (1, 3, 1), (1, 3, 2), (1, 3, 3), (2, 1, 1), (2, 1, 2), (2, 1, 3), (2, 2, 1), (2, 2, 2), (2, 2, 3), (2, 3, 1), (2, 3, 2), (2, 3, 3), (3, 1, 1), (3, 1, 2), (3, 1, 3), (3, 2, 1), (3, 2, 2), (3, 2, 3), (3, 3, 1), (3, 3, 2), (3, 3, 3)]
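
product allows an element to appear more than once in a tuple. For orderings without repetition, itertools.permutations is probably closer to what is usually meant by a permutation; a minimal sketch with illustrative variable names:

```python
from itertools import permutations

items = [1, 2, 3]
all_orderings = list(permutations(items))  # 3! = 6 orderings of all elements
ordered_pairs = list(permutations(items, 2))  # ordered pairs, no element used twice

print(all_orderings)
print(ordered_pairs)
```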

Combinations

from itertools import combinations
print(list(combinations([1,2,3,4,5], 3)))
# out: [(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), (2, 3, 4), (2, 3, 5), (2, 4, 5), (3, 4, 5)]

Map, Filter, Reduce

Map

>>> import math
>>> def area(r):
	"""Area of a circle with radius 'r'."""
	return math.pi * (r**2)

>>> radii = [2, 5, 7.1, 0.3, 10]
>>> map(area, radii)
<map object at 0x112f870f0>
>>> list(map(area, radii))
[12.566370614359172, 78.53981633974483, 158.36768566746147, 0.2827433388230814, 314.1592653589793]

#convert Celsius to Fahrenheit
>>> temps = [("Berlin", 29), ("Beijing", 36), ("New York", 28)]
>>> c_to_f = lambda data: (data[0], (9/5)*data[1] + 32)
>>> list(map(c_to_f, temps))
[('Berlin', 84.2), ('Beijing', 96.8), ('New York', 82.4)]

Filter

In Python, {}, [], (), "", 0, 0.0, 0j, False, None are treated as False.

#filter the values above the average
>>> import statistics
>>> data = [1.3, 2.7, 0.8, 4.1, 4.3]
>>> avg = statistics.mean(data)
>>> avg
2.64
>>> filter(lambda x: x > avg, data)
<filter object at 0x112f87780>
>>> list(filter(lambda x: x > avg, data))
[2.7, 4.1, 4.3]

#remove missing values

>>> countries = ["", "China", "Brazil", "", "Germany"]
>>> list(filter(None, countries))
['China', 'Brazil', 'Germany']

Reduce

“Use functools.reduce() if you really need it; however, 99% of the time an explicit for loop is more readable.” - Guido van Rossum(Python creator)

>>> from functools import reduce
>>> data = [2, 3, 5, 7, 11]
>>> multiplier = lambda x, y: x*y
>>> reduce(multiplier, data) # use the product of first two elements to multiply the third, then use the result to multiply the fourth, and so on.
2310

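Dictionaries

A few notes:

for each in dict_name: each is the key of each item in the dictionary
.keys() returns all keys
.values() returns all values
.items() returns all items as (key, value) tuples
.get(key) returns the value for the key; if the key does not exist it returns None (which is falsy) instead of raising an error
key in dict_name is True if the key exists and False otherwise
.clear() empties the dictionary in place, so any other name bound to the same dictionary sees it emptied as well
.copy() makes a (shallow) copy that later changes to the original do not affect; this differs from plain assignment, dict2 = dict1, where changing dict2 also changes dict1
.pop(key) returns the value for the key and removes the item from the dictionary
.popitem() returns one (key, value) pair and removes it from the dictionary: the most recently inserted pair since Python 3.7, an arbitrary one before that
.setdefault(key, value) returns the existing value if the key is present; otherwise it adds the key with the given value
dict1.update(dict2) overwrites the values in dict1 for keys that also appear in dict2 and adds the keys that appear only in dict2
.fromkeys((key1, key2, key3), 'we are the same') creates a new dictionary in which every value is the same, equal to the second argument
del dict_name[key] deletes that item from the dictionary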
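
Several of the methods listed above can be seen side by side in a short sketch. The dictionary scores and its contents are made-up sample data, and the popitem behaviour shown assumes Python 3.7 or later.

```python
scores = {"alice": 90, "bob": 75}

missing = scores.get("carol")      # None, no KeyError is raised
fallback = scores.get("carol", 0)  # 0, an explicit default

added = scores.setdefault("carol", 60)  # inserts "carol": 60 and returns 60
kept = scores.setdefault("alice", 0)    # key exists, so 90 is returned unchanged

alias = scores            # plain assignment: the same object under a second name
snapshot = scores.copy()  # independent shallow copy
removed = scores.pop("bob")  # 75; visible through alias, not through snapshot

last_key, last_value = scores.popitem()  # most recently inserted pair

same_value = dict.fromkeys(("k1", "k2"), "we are the same")
print(scores, snapshot, same_value)
```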

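# dict() takes a single positional argument, so several tuples or lists must be wrapped in one more pair of parentheses. The tuples below could also be lists
>>> dict((('F',70), ('i',105), ('s',115)))
{'F': 70, 'i': 105, 's': 115}

# Keyword form: do not quote the keys. (With d[key] = value, an existing key is reassigned and a missing key is created.)
>>> dict(key1 = 1, key2 =2, key3=3)
{'key1': 1, 'key2': 2, 'key3': 3}

# Another way to assign several dictionary entries at once
>>> MyDict = {}
>>> (MyDict['id'],MyDict['name'],MyDict['sex']) = ['212','lala','man']
>>> MyDict
{'id': '212', 'name': 'lala', 'sex': 'man'}

# Turn each key and value of a dictionary into a tuple
>>> n = {1: 'a', 2: 'b', 3: 'c'}
>>> for x, y in n.items():
	print((x, y))

(1, 'a')
(2, 'b')
(3, 'c')

# Dict comprehension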
>>> b = {i: i % 2 == 0 for i in range(10)}
>>> b
{0: True, 1: False, 2: True, 3: False, 4: True, 5: False, 6: True, 7: False, 8: True, 9: False}

Sort by multiple keys in dictionary

The dictionaries in the list are sorted by "fname" first; entries that share the same "fname" are then ordered by "lname".

from operator import itemgetter

users = [
    {'fname': 'Bucky', 'lname': 'Roberts'},
    {'fname': 'Tom', 'lname': 'Roberts'},
    {'fname': 'Bernie', 'lname': 'Zunks'},
    {'fname': 'Jenna', 'lname': 'Hayes'},
    {'fname': 'Sally', 'lname': 'Jones'},
    {'fname': 'Amanda', 'lname': 'Roberts'},
    {'fname': 'Tom', 'lname': 'Williams'},
    {'fname': 'Dean', 'lname': 'Hayes'},
    {'fname': 'Bernie', 'lname': 'Barbie'},
    {'fname': 'Tom', 'lname': 'Jones'},
]

for x in sorted(users, key=itemgetter('fname', 'lname')):
    print(x)

# OUTPUT:
{'fname': 'Amanda', 'lname': 'Roberts'}
{'fname': 'Bernie', 'lname': 'Barbie'}
{'fname': 'Bernie', 'lname': 'Zunks'}
{'fname': 'Bucky', 'lname': 'Roberts'}
{'fname': 'Dean', 'lname': 'Hayes'}
{'fname': 'Jenna', 'lname': 'Hayes'}
{'fname': 'Sally', 'lname': 'Jones'}
{'fname': 'Tom', 'lname': 'Jones'}
{'fname': 'Tom', 'lname': 'Roberts'}
{'fname': 'Tom', 'lname': 'Williams'}

Getting key with maximum value in dictionary

key_with_max_value = max(stats, key=stats.get)
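
The one-liner above assumes that a dictionary named stats already exists. A self-contained sketch with made-up sample data:

```python
stats = {"a": 1000, "b": 3000, "c": 100}  # hypothetical sample data

# max() iterates over the keys and ranks them by the value stats.get returns
key_with_max_value = max(stats, key=stats.get)
key_with_min_value = min(stats, key=stats.get)

# to get the key and the value together, rank the items by their second field
max_key, max_value = max(stats.items(), key=lambda item: item[1])

print(key_with_max_value, key_with_min_value, max_key, max_value)
```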

Update

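Updating dictionary a with dictionary b through update gives two cases:

For a key present in both: the value from b, the newer dictionary, is used.
For a new key: the key and value from b are added to a directly.

>>> a = {1: 2, 2: 2}
>>> b = {1: 1, 3: 3}
>>> a.update(b)
>>> print(a)
{1: 1, 2: 2, 3: 3}

A dictionary can also be updated with keyword arguments (an iterable of key/value tuples works as well)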

d = {'x': 2}
d.update(y = 3, z = 0)
print(d)

# out
# {'x': 2, 'y': 3, 'z': 0}

5. Classes

Inheritance, magic methods, and the property decorator

class People():
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return "People('{}', {})".format(self.name, self.age)

    def __str__(self):
        return "I'm {}, and I am {} years old".format(self.name, self.age)
        
people = People("Zhang San", 24)
print(people)
print(people.__repr__()) # use Magic Method

# single inheritance
class Male(People):
    def __init__(self, name, age, hobby):
        super().__init__(name, age)
        self.hobby = hobby

class Play():
    def __init__(self, game):
        self.game = game

# multiple inheritance
class Boy(Male, Play):
    def __init__(self, name, age, hobby, game, favor_toy):
        Male.__init__(self, name, age, hobby)
        Play.__init__(self, game)
        self.favor_toy = favor_toy

    # use Property Decorator, which makes a method become a property of the instance
    @property
    def my_favor_toy(self):
        return "My favourite toy is " + self.favor_toy
    
boy = Boy('Tim', 24, 'Play video game', 'Street Fighter', 'Lego')

print(boy.name)
print(boy.hobby)
print(boy.game)
print(boy.favor_toy)
print(boy.my_favor_toy)

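Magic methods are always surrounded by double underscores, and Python calls them automatically at the appropriate moment.

The constructor __new__: when inheriting from an immutable class such as str, the value has to be changed before initialization, and __new__ is the method that runs before __init__ when an instance is created. Its first parameter can have any name, but cls is the convention. By overriding the arithmetic magic methods, you can define arithmetic operations between any objects.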
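
Both points can be illustrated briefly. The classes UpperStr and Vector below are hypothetical examples written for this sketch, not part of the original notes.

```python
class UpperStr(str):
    """A str subclass whose value is upper-cased before the object exists."""

    def __new__(cls, text):
        # str is immutable, so the value must be fixed here, not in __init__
        return super().__new__(cls, text.upper())


class Vector:
    """Hypothetical 2D vector used to show an arithmetic magic method."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # called automatically for: vector + vector
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return "Vector({}, {})".format(self.x, self.y)


shout = UpperStr("hello")
total = Vector(1, 2) + Vector(3, 4)
print(shout, total)
```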

Decorators

If you wrap some function inside another function which adds some functionality to it and executes the wrapped function, you decorated the wrapped function with the outside function. The outside function is a decorator function. A decorator function takes a function as an argument and it returns a closure.

Decorators can be stacked. If you have two decorator functions, you can just use:

@decorator1
@decorator2
def func(...):
    #code

The order of the decorators matters. The above code is equivalent to func = decorator1(decorator2(func)): decorator2 is applied first and decorator1 wraps its result, so when func is called, decorator1's wrapper runs first, from the outside in.
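
A small sketch makes the stacking order visible. The decorators bold and italic are made up for illustration.

```python
def bold(fn):
    def wrapper():
        return "<b>" + fn() + "</b>"
    return wrapper


def italic(fn):
    def wrapper():
        return "<i>" + fn() + "</i>"
    return wrapper


@bold
@italic
def greet():
    return "hi"


# same as greet = bold(italic(greet)): italic wraps first, bold wraps the result
print(greet())
```

Swapping the two decorator lines would swap the nesting of the tags in the output.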

Use a decorator to build a function to calculate Fibonacci Number Series.

from functools import lru_cache
'''lru_cache is a decorator that caches the results of a
function. The maxsize parameter sets the maximum number of 
results to cache; the default value is 128, and a power of 2 
works best'''
@lru_cache(maxsize=32)
def fib(n):
    print("calculating...{{{0}}}".format(n)) # use double curly brackets {{}} to print out {} 
    return 1 if n <= 2 else fib(n-1) + fib(n-2)
    
# we can also build a caching decorator by ourselves
def memoize_fib(fn):
    cache = dict()
    
    def inner(n):
        if n not in cache:
            cache[n] = fn(n)
        return cache[n]
    return inner
    
@memoize_fib
def fib(n):
    print("calculating...{{{0}}}".format(n))
    return 1 if n <= 2 else fib(n-1) + fib(n-2)

If you want to pass a parameter to the decorator function like @memoize_fib(reps), you can wrap the original decorator function with a new outer function, which has a parameter ‘reps’, then return the original decorator when called.
Any arguments passed to outer can be referenced (as free variables) inside our decorator. We call this outer function a decorator factory(it is a function that creates a new decorator each time it is called).
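
The decorator factory described above might look like the following sketch. The names repeat, decorator, inner and square are illustrative; functools.wraps is added so that the wrapped function keeps its name.

```python
from functools import wraps


def repeat(reps):
    """Decorator factory: returns a decorator that calls the function reps times."""
    def decorator(fn):
        @wraps(fn)  # keep the wrapped function's name and docstring
        def inner(*args, **kwargs):
            results = []
            for _ in range(reps):  # reps is a free variable from the factory
                results.append(fn(*args, **kwargs))
            return results
        return inner
    return decorator


@repeat(3)  # repeat(3) is called first and returns the actual decorator
def square(n):
    return n * n


print(square(4))
```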

Decorator class

Build a decorator using a class. You can add some parameters in __init__ function, which can act as parameters in decorator factory.

class Memoize_fib:
    
    def __init__(self):
        self.cache = dict()
    
    def __call__(self, fn):
        def inner(n):
            if n not in self.cache:
                self.cache[n] = fn(n)
            return self.cache[n]
        return inner
        
@Memoize_fib()
def fib(n):
    print("calculating...{{{0}}}".format(n))
    return 1 if n <= 2 else fib(n-1) + fib(n-2)

Decorating classes

Build a simple debugger for a class by decorator.

from datetime import datetime, timezone

def info(self):
    results = []
    results.append("time: {0}".format(datetime.now(timezone.utc)))
    results.append("Class: {0}".format(self.__class__.__name__))
    results.append("id: {0}".format(hex(id(self))))
    for k, v in vars(self).items():
        results.append("{0}: {1}".format(k, v))
    return results
    

def debug_info(cls):
    cls.debug = info
    return cls

@debug_info
class People():

    def __init__(self, name, age): # __init__ is called when an instance is created; self is the instance itself
        self.name = name  
        self.age = age # this goes through the property setter below, which does the actual initialization

    # in Python, use property instead of getter and setter methods to encapsulate variables. The two following functions can share the attribute's name
    @property
    def age(self):
        print("getting")
        return self._age

    @age.setter
    def age(self, new_age):
        if new_age <= 0:
            raise ValueError("Age must be positive.")
        else:
            print("setting")
            self._age = new_age

>>> p = People("John",5)
>>> p.debug()
['time: 2018-03-31 08:22:51.794910+00:00',
 'Class: People',
 'id: 0x104e1f780',
 'name: John',
 '_age: 5']

If you have overridden "==" and "<", you can get the remaining comparison operators such as "<=", ">", and ">=" by decorating the class. The decorator, total_ordering, is in the Python standard library (functools). As long as the class defines __eq__ and one ordering comparison (__lt__, __le__, __gt__, or __ge__), the decorator fills in the rest; "!=" is derived from __eq__ by Python itself.

from functools import total_ordering
from math import sqrt

@total_ordering
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    
    def __abs__(self):
        return sqrt(self.x**2 + self.y**2)
    
    def __eq__(self, other):
        if isinstance(other, Point):
            return self.x == other.x and self.y == other.y
        else:
            return False
    
    def __lt__(self, other):
        if isinstance(other, Point):
            return abs(self) < abs(other)
        else:
            return NotImplemented

>>> p1, p2, p3 = Point(2,3), Point(3,4), Point(3,4)
>>> p1 >= p2
False
>>> p3 == p2
True

For single-dispatch generic functions (from functools import singledispatch), check the Python documentation.
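
As a rough idea of what single dispatch looks like, here is a minimal sketch. The function describe and its registered variants are made up for illustration.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    # fallback for types without a registered implementation
    return "object: {!r}".format(value)


@describe.register(int)
def _(value):
    return "int: {}".format(value)


@describe.register(list)
def _(value):
    return "list of {} items".format(len(value))


# the implementation is chosen from the type of the first argument
print(describe(3), describe([1, 2]), describe("x"))
```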

Closures

# use a closure to build an averager that behaves the same as the class-based one

# use class
class Averager:
    def __init__(self):
        self.total = 0
        self.count = 0
    def add(self, number):
        self.total += number
        self.count += 1
        return self.total / self.count
        
# use closure
def averager():
    total = 0
    count = 0
    def add(number):
        nonlocal total # makes total inside add refer to the variable of the enclosing function instead of a new local one
        nonlocal count
        total += number
        count += 1
        return total / count
    return add
    
# make a timer, class
from time import perf_counter

class Timer:
    def __init__(self):
        self.start = perf_counter()
    def __call__(self): # call the instance of the class will call the __call__ method directly
        return perf_counter() - self.start
        
# closure
def timer():
    start = perf_counter()
    def poll():
        return perf_counter() - start
    return poll
    
# build a counter which counts the called times of the passed function
def counter(fn, counters):
    cnt = 0
    def call(*args, **kwargs):
        nonlocal cnt
        cnt += 1
        counters[fn.__name__] = cnt
        return fn(*args, **kwargs)
    return call
    
def add(a, b):
    return a + b

c = dict()
add = counter(add, c)

>>> add(2,3)
5
>>> add(3,3)
6
>>> c
{'add': 2}

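6. Program Performance

Measuring a program's run time

time

Both of the following measure wall-clock time, that is, everything that elapses between start and end (including time when the program itself is not running, such as sleeping or waiting).

import time

start = time.time()
run_func()
end = time.time()
print(end-start)

start = time.perf_counter() # time.clock() was removed in Python 3.8; perf_counter() is the high-resolution replacement
run_func()
end = time.perf_counter()
print(end-start)

datetime

This also measures the wall-clock time between start and end, like the two methods above. To count only the CPU time used by the program, use time.process_time().

import datetime
starttime = datetime.datetime.now()
run_func()
endtime = datetime.datetime.now()
print((endtime - starttime).total_seconds()) # .seconds and .days give only the individual components of the difference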
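7. I/O and Files

The open() function

Mode values:

'r+' opens for reading and writing; the file must already exist and is not truncated
'w+' opens for reading and writing; the file is created if missing and truncated if it exists
'a+' opens for reading and appending; the file is created if missing

The corresponding modes for binary files: 'rb', 'wb', 'ab', 'rb+', 'wb+', 'ab+'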

r+ Open for reading and writing. The stream is positioned at the beginning of the file.

a+ Open for reading and appending (writing at end of file). The file is created if it does not exist. The output is appended to the end of the file.

file = r'./test.txt'
with open(file, 'a+') as f:
    f.write("some text" + "\n")
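
The behaviour of the three "+" modes can be checked with a throwaway file. The sketch below writes into a temporary directory so that it does not touch real files; the file name test.txt is illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    with open(path, "w+") as f:  # creates (or truncates) the file
        f.write("first\n")
        f.seek(0)                # rewind before reading back
        after_w_plus = f.read()

    with open(path, "a+") as f:  # writes always land at the end
        f.write("second\n")
        f.seek(0)
        after_a_plus = f.read()

    with open(path, "r+") as f:  # file must exist; position starts at 0
        f.write("FIRST")         # overwrites the first five characters
        f.seek(0)
        after_r_plus = f.read()

print(repr(after_w_plus), repr(after_a_plus), repr(after_r_plus))
```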

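Remove the trailing newline '\n' from each line

# file_ is an open text file object
temp = file_.read().splitlines()
# or
temp = [line.rstrip('\n') for line in file_]
# or, for a single line (strip() also removes other leading and trailing whitespace)
temp = line.strip()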

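Recursively walking a directory

os.walk(top, topdown=True, onerror=None, followlinks=False)
For every directory in the tree rooted at top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames): the directory's path, the names of its subdirectories, and the names of its files.

topdown: optional. If True or not specified, the 3-tuple for a directory is generated before the 3-tuples of any of its subdirectories (top-down). If topdown is False, the 3-tuple for a directory is generated after those of its subdirectories (bottom-up).
onerror: optional. A function that is called with one argument, an OSError instance. It can report the error and let the walk continue, or raise an exception to abort the walk.
followlinks: if set to True, directories reached through symbolic links are visited as well.

import os 

# print the path of every file; cur_dir is the directory that contains the files in file_list
g = os.walk("/path/to/dir")  
for cur_dir, dir_list, file_list in g:  
    for file_name in file_list:  
        print(os.path.join(cur_dir, file_name) )

# print the path of every directory (os.walk returns a one-shot generator, so call it again)
for cur_dir, dir_list, file_list in os.walk("/path/to/dir"):  
    for dir_name in dir_list:
        print(os.path.join(cur_dir, dir_name))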

Concatenate files

filenames = ['file1.txt', 'file2.txt']  # ... and so on
with open('path/to/output/file', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            for line in infile:
                outfile.write(line)

# or copy the raw bytes with shutil
import shutil
with open('output_file.txt', 'wb') as wfd:
    for f in ['seg1.txt', 'seg2.txt', 'seg3.txt']:
        with open(f, 'rb') as fd:
            shutil.copyfileobj(fd, wfd)

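CSV files

Write a 2D list to a CSV file

import csv
list_of_lists = [[1,2,3],[4,5,6],[7,8,9]]
with open("out.csv", "w", newline="") as f: # newline="" lets the csv module control line endings
    writer = csv.writer(f, delimiter=" ") # set the delimiter, e.g. a comma or a space
    writer.writerows(list_of_lists) # the output is a 2D table; each sublist is a row.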
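
Reading the table back is the mirror image. The sketch below round-trips the same rows through a temporary file with the default comma delimiter; csv.reader returns every cell as a string, so the values are converted back with int().

```python
import csv
import os
import tempfile

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "out.csv")

    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

    with open(path, newline="") as f:
        # every cell comes back as a string, so convert it explicitly
        read_back = [[int(cell) for cell in row] for row in csv.reader(f)]

print(read_back)
```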

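Batch-concatenating CSV files

The code below collects the CSV files in each subfolder of a parent folder and concatenates them into one large CSV file per subfolder, skipping empty files and files of other types.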

import pandas as pd
import glob
import os

files_folder=[]
week = 1

sub_folders = glob.glob('/PATH/*')

for folder in sub_folders:
    all_files = []
    files = os.listdir(folder)
    for file in files:
        if file[-3:] == 'csv':
            all_files.append(folder +'/' + file)
    files_folder.append(all_files)

for folder in files_folder:
    tables = []
    for file in folder:
        if os.path.getsize(file) > 0:
            table = pd.read_csv(file)
            tables.append(table)
    result = pd.concat(tables, ignore_index=True)
    for row in range(result.shape[0]):
        if str(result.loc[row, 'items']).find(',') == -1:
            result = result.drop([row])
    result.to_csv('/PATH/merge_week{}.csv'.format(week), index=False)
    week += 1

JSON files

JSON data looks almost identical to a Python dictionary, and it is more compact than XML.

>>> import json
>>> json_file = open("/path/to/jsonfile", "r", encoding="utf-8")
>>> loadedJson = json.load(json_file) # json.load reads from a file object; use json.loads for a string
>>> json_file.close()

# you can access the content by key, like
>>> loadedJson["keyName"]

# convert a dictionary to a JSON string
>>> dic = {"name": "yi", "gender": "male"}
>>> json.dumps(dic)

# write it to a file
>>> file = open("/path/to/store/jsonfile", "w", encoding="utf-8")
>>> json.dump(dic, file)
>>> file.close()
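
The same round trip can be written with with blocks, so the files are closed automatically. This is a sketch using a temporary directory; the dictionary record and the file name are illustrative.

```python
import json
import os
import tempfile

record = {"name": "yi", "gender": "male", "tags": ["a", "b"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "record.json")

    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, ensure_ascii=False, indent=2)  # file object in, nothing returned

    with open(path, "r", encoding="utf-8") as f:
        loaded = json.load(f)  # file object -> Python object

text = json.dumps(record)  # Python object -> JSON string
parsed = json.loads(text)  # JSON string -> Python object
print(loaded, parsed)
```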

Pickle

The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy.

The following types can be pickled:

None, True, and False
integers, floating point numbers, complex numbers
strings, bytes, bytearrays
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module (using def, not lambda)
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable

import pickle

itemlist = ["a", "b", "c"]  # example data

# To store a list
with open('outfile', 'wb') as fp:
    pickle.dump(itemlist, fp)

# To read it back:
with open('outfile', 'rb') as fp:
    itemlist = pickle.load(fp)

# To store a dictionary

# An arbitrary collection of objects supported by pickle.
data = {
    'a': [1, 2.0, 3, 4+6j],
    'b': ("character string", b"byte string"),
    'c': {None, True, False}
}

with open('data.pickle', 'wb') as f:
    # Pickle the 'data' dictionary using the highest protocol available.
    pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)

# To read it back:  
with open('data.pickle', 'rb') as f:
    # The protocol version used is detected automatically, so we do not
    # have to specify it.
    data = pickle.load(f)
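
To illustrate the list of picklable types above, the following sketch round-trips a dictionary in memory with `pickle.dumps` and `pickle.loads`, then shows that a lambda is rejected. The sample data is an assumption for the example, and the exact exception type raised for the lambda may differ between Python versions, so two are caught.

```python
import pickle

data = {"a": [1, 2.0, 3, 4 + 6j], "b": ("character string", b"byte string")}

# dumps/loads are the in-memory counterparts of dump/load.
blob = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(restored == data, lambda_pickled)
```

Only unpickle data from sources you trust: loading a pickle can execute arbitrary code.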

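8. System operations

Files

Generating sequential, unique folder names for easier log management

The simplest approach is to distinguish folders by their creation time, i.e. a timestamp.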

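from datetime import datetime, timezone
now = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")  # datetime.utcnow() is deprecated since Python 3.12
root_logdir = "/PATH/logs"
logdir = "{}/run-{}".format(root_logdir, now)  # then use logdir to name the folder inside the loop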
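
The timestamp idea can be wrapped in a small helper that also creates the folder. This is a sketch under assumptions: the name `make_run_dir` is hypothetical, and a temporary directory stands in for `/PATH/logs`.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_run_dir(root_logdir):
    """Create and return a run directory named after the current UTC time."""
    now = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    run_dir = Path(root_logdir) / "run-{}".format(now)
    # exist_ok=True tolerates two calls within the same second.
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


root = tempfile.mkdtemp()  # hypothetical stand-in for /PATH/logs
logdir = make_run_dir(root)
print(logdir.name)
```

Because the names sort chronologically, listing the log root in alphabetical order also lists the runs in the order they were started. Two runs started within the same second would share a folder, so add a finer-grained component if that matters.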

Recursively finding the absolute paths of matching files

from pathlib import Path
for filename in Path('src').rglob('*.c'):
    print(filename.resolve())  # resolve() turns the relative match into an absolute path

Creating directories

# old method
import os
directory = "/my/directory"
if not os.path.exists(directory):
    os.makedirs(directory)

# new method
# recursively creates the directory and does not raise an
# exception if the directory already exists. If you don't need
# or want the parents to be created, skip the parents argument.
from pathlib import Path
Path("/my/directory").mkdir(parents=True, exist_ok=True)

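Other features of the Path class

from pathlib import Path
p = Path(file)  # file: a path string
Path.cwd()  # current working directory of the Python process, not the directory of the given file (cwd is a class method)
p.stat()  # file information (size, timestamps, ...) for this path
p.exists()  # whether the path exists (as a file or a directory)
p.is_dir()  # whether the path is a directory
p.is_file()  # whether the path is a regular file
p.iterdir()  # when the path is a directory, an iterator over the paths of its files and sub-directories
p.rename(target)  # rename the file or directory to target (a string or a Path); if target is in another directory, the entry is moved there
p.replace(target)  # rename to target, overwriting target if it already exists
p.parent, p.parents  # properties, not methods: parent is the containing directory, parents is the sequence of all ancestors
p.is_absolute()  # whether the path is absolute
p.rmdir()  # remove the directory; it must be empty
p.suffix  # the file extension, e.g. '.py'
p.match(pattern)  # whether the path matches the glob-style pattern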
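
Several of the Path operations listed above can be tried safely in a scratch directory. In this sketch the directory layout and file names are hypothetical.

```python
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())          # hypothetical scratch directory
src = root / "src" / "lib"
src.mkdir(parents=True, exist_ok=True)   # create nested directories
(src / "main.c").write_text("int main(void) { return 0; }\n")
(src / "notes.txt").write_text("scratch\n")

c_files = sorted(p.name for p in root.rglob("*.c"))   # recursive search
entries = sorted(p.name for p in src.iterdir())       # direct children only

main_c = src / "main.c"
target = src / "app.c"
main_c.rename(target)                    # rename within the same directory

print(c_files, entries)
print(target.suffix, target.parent.name, target.is_file(), main_c.exists())
```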

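Script and execution paths

os.getcwd() returns the current working directory, i.e. the absolute path of the directory from which the python command was launched.

sys.path[0] is the directory containing the script that was initially executed. For example, with python ./test/test.py it is the absolute path of the directory that contains test.py.

sys.argv[0] is the first argument, i.e. the script being run, as written on the command line: ./test/test.py

os.path.split(os.path.realpath(__file__))[0] returns the absolute path of the directory containing the Python file in which this expression appears, so the result depends on which file the code lives in.

Locating the current file with inspect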

import inspect
import os

aa = inspect.getfile(inspect.currentframe())
print(aa)
print(os.path.abspath(aa))
print(os.path.dirname(os.path.abspath(aa)))
print(os.path.dirname(os.path.dirname(os.path.abspath(aa))))

Output

c:\users\.spyder-py3\temp.py
c:\users\.spyder-py3\temp.py
c:\users\.spyder-py3
c:\users

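9. Exceptions

Assert statements

When the condition following the assert keyword is false, Python raises an AssertionError; if it is not caught, the program terminates.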

>>> assert 3>4
Traceback (most recent call last):
  File "", line 1, in <module>
    assert 3>4
AssertionError

# assert <condition>, <error message>
>>> assert 2 + 2 == 5, "Houston we've got a problem"
Traceback (most recent call last):
  File "", line 1, in <module>
    assert 2 + 2 == 5, "Houston we've got a problem"
AssertionError: Houston we've got a problem

In general, assert can be used to insert checkpoints into a program. It is most useful when a condition must be true for the program to work correctly. (Assert statements are a convenient way to insert debugging assertions into a program.)

def avg(marks):
    assert len(marks) != 0,"List is empty."
    return sum(marks)/len(marks)

mark2 = [55,88,78,90,79]
print("Average of mark2:",avg(mark2))

mark1 = []
print("Average of mark1:",avg(mark1))

# output:
# Average of mark2: 78.0
# AssertionError: List is empty.
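
Assert statements are removed when Python runs with the -O option, so they suit debugging checks rather than validation of user input. As a hedged alternative sketch (not part of the original example), the same function can raise an explicit exception instead:

```python
def avg(marks):
    # Explicit check: still enforced when Python runs with -O.
    if len(marks) == 0:
        raise ValueError("List is empty.")
    return sum(marks) / len(marks)


try:
    avg([])
except ValueError as exc:
    message = str(exc)

print(avg([55, 88, 78, 90, 79]))
print(message)
```

The caller can now catch ValueError and recover, which is awkward to do reliably with AssertionError.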

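10. Modules

A module is a file, with the suffix .py, that contains function and variable definitions. It must be brought in with import before use. For example, the os module helps you work with files and directories across different operating systems.

Packages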

Packages are special modules. Packages can contain modules and packages called sub-packages. If a module is a package, it must have a value set for __path__.

Packages let us break code up into smaller chunks, which makes our code:

easier to write
easier to test and debug
easier to read/understand
easier to document

After importing a module, you can check whether it is a package by inspecting its __path__ attribute: a package has one, while a plain module does not. Packages represent a hierarchy of modules and sub-packages, just as books are broken down into chapters, sections, paragraphs, etc. E.g.

pack1.mod1
pack1.pack1_1.mod1_1

On a file system we therefore have to use directories for packages. The directory name becomes the package name.

To define a package in our file system, we must:

create a directory whose name will be the package name
create a file called __init__.py inside that directory

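The __init__.py file is what tells Python that the directory is a regular package rather than an ordinary directory. (Since Python 3.3, a directory without __init__.py can still be imported as a namespace package.)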
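
The two steps above can be tried in a throwaway directory. In this sketch the names `demo_pack` and `mod1` are hypothetical, and the package is built in a temporary folder that is then added to sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk.
root = Path(tempfile.mkdtemp())
pack = root / "demo_pack"
pack.mkdir()                                    # step 1: the directory is the package name
(pack / "__init__.py").write_text("")           # step 2: __init__.py marks it as a package
(pack / "mod1.py").write_text("VALUE = 42\n")   # an ordinary module inside the package

sys.path.insert(0, str(root))                   # make the package importable
importlib.invalidate_caches()

package = importlib.import_module("demo_pack")
module = importlib.import_module("demo_pack.mod1")

print(hasattr(package, "__path__"))   # the package has a __path__ attribute
print(hasattr(module, "__path__"))    # the plain module does not
print(module.VALUE)
```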

Pip

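pip install -r requirements.txt installs the Python packages listed in the requirements.txt file in the current directory.

Third-party libraries

scipy

Reading .mat files

import scipy.io as scio
m = scio.loadmat("/path/to/your/.mat")
# m is a dictionary; list its keys with
m.keys()

# save a Python dictionary to a .mat file
# dataNew: path of the output file; data: a dictionary of arrays
scio.savemat(dataNew, {'A': data['A']})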

numpy

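Saving and loading

NumPy can also store Python dictionaries.

import numpy as np

embedding_dict = {1: 222, 2: 333}
np.save("embedding_dict.npy", embedding_dict)
# The dict is stored as a pickled 0-d object array, so loading needs
# allow_pickle=True, and .item() unwraps the dict again.
embedding_dict = np.load("embedding_dict.npy", allow_pickle=True).item()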
