1、Difference between with_std=False/True and with_mean=False/True in sklearn's StandardScaler (https://www.it1352.com/1794584.html)

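If with_mean and with_std are both set to False, the mean μ is taken to be 0 and the standard deviation to be 1, which assumes the columns/features already follow a standard normal (Gaussian) distribution (mean 0, std 1); the data is therefore left unchanged.

If with_mean and with_std are both set to True (the default), the scaler uses the data's actual μ and σ. This is the most common approach.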
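
To make the effect of the two flags concrete, here is a minimal sketch of the arithmetic in plain NumPy. It is not sklearn's actual implementation; the helper name scale and the toy data are assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])

def scale(X, with_mean=True, with_std=True):
    """Mimic the arithmetic behind the two flags (illustrative only)."""
    mu = X.mean(axis=0) if with_mean else 0.0    # False -> treat the mean as 0
    sigma = X.std(axis=0) if with_std else 1.0   # False -> treat the std as 1
    return (X - mu) / sigma

both_true = scale(X)                                    # centered and scaled
both_false = scale(X, with_mean=False, with_std=False)  # data left unchanged

print(both_true.mean(axis=0), both_true.std(axis=0))
print(np.array_equal(both_false, X))
```

With both flags True every column ends up with mean 0 and std 1; with both False the output equals the input.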

2、Using meshgrid in Python (https://blog.csdn.net/qq_30638831/article/details/84628976)


4、About NumPy's astype(bool), astype(int) and so on (https://blog.csdn.net/wuxulong123/article/details/103387222)
import numpy as np
a = [[1, 2, 1], [2, 3, 5]]
b = [[0, 0, 0], [2, 3, 5]]
c = np.array(a).astype(bool)  # every non-zero value becomes True
d = np.array(b).astype(bool)  # zeros become False
print(c)
print(d)

tight_layout automatically adjusts the subplot parameters so that the subplots fill the whole figure area.

6、Python isalpha() method (https://www.runoob.com/python/att-string-isalpha.html)

The isalpha() method checks whether a string consists only of letters.
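
A quick sketch of how isalpha() behaves on a few strings; the sample strings are made up for illustration.

```python
# Hypothetical sample strings.
samples = ["hello", "hello123", "hello world", ""]

# Map each string to the result of isalpha().
results = {s: s.isalpha() for s in samples}
print(results)
# {'hello': True, 'hello123': False, 'hello world': False, '': False}
```

Digits, spaces and the empty string all make isalpha() return False.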

7、Python List count() method (https://www.runoob.com/python/att-list-count.html)

The count() method counts how many times an element occurs in a list.
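
A minimal example of count(), using a made-up list:

```python
# Hypothetical example list.
fruits = ["apple", "pear", "apple", "kiwi", "apple"]

apple_count = fruits.count("apple")   # 3
missing_count = fruits.count("plum")  # 0: a missing element is not an error

print(apple_count, missing_count)
```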

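8、Using from __future__ import print_function (https://zhuanlan.zhihu.com/p/28641474)
First, note that this statement is a Python 2 concept: from Python 2's point of view, Python 3 is the future, so the import lets Python 2 code use Python 3's print function ahead of time.

For example:
In a Python 2.x environment with the statements below, the second statement passes the syntax check and the third fails:

1 from __future__ import print_function
2 print('you are good')
3 print 'you are good'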

10、What np.newaxis does (https://blog.csdn.net/weixin_42866962/article/details/82811082)

np.newaxis inserts a new dimension

a=np.array([1,2,3,4,5])
aa=a[:,np.newaxis]
print(aa.shape)
print (aa)

Output: (5, 1)
[[1]
 [2]
 [3]
 [4]
 [5]]

11、Fixing "Glyph xxxxx missing from current font" in matplotlib with Python on Mac

https://blog.csdn.net/fwj_ntu/article/details/105598145
https://blog.csdn.net/qiqiqi98/article/details/106732789
https://blog.csdn.net/weixin_38037405/article/details/107127610

https://blog.csdn.net/Fwuyi/article/details/123084642
plt.rcParams['font.sans-serif'] = ['SimHei']  # set the sans-serif font in the runtime configuration (rc) parameters to SimHei (a Chinese Hei typeface)

plt.rcParams['axes.unicode_minus'] = False  # make the axes render the minus sign correctly

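12、Learning NumPy: using np.c_ (https://blog.csdn.net/qq_33728095/article/details/102512600)
The c in np.c_ stands for column: it stacks two matrices column-wise, i.e. joins them side by side (left to right), which requires them to have the same number of rows.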
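
A small sketch of np.c_ on made-up arrays, first with two 2-D arrays and then with two 1-D arrays (which are treated as columns):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5],
              [6]])

# Join side by side: both inputs have 2 rows.
combined = np.c_[a, b]
print(combined)
# [[1 2 5]
#  [3 4 6]]

# 1-D inputs become the columns of the result.
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
pairs = np.c_[x, y]
print(pairs.shape)  # (3, 2)
```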

16、The xlim function in Matplotlib (https://blog.csdn.net/chongbaikaishi/article/details/108782039)

Get or set the range of values shown on the x-axis
left, right = xlim()  # return the current xlim
xlim((left, right))   # set the xlim to left, right

21、Understanding self.__class__.__name__ in Python (https://blog.csdn.net/aaa958099161/article/details/90177791)

It returns the name of the class the instance belongs to.
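
A minimal sketch with two hypothetical classes, showing that the name comes from the instance's actual class and not from the class where the method is defined:

```python
class Animal:
    def describe(self):
        # Name of the concrete class of this instance.
        return self.__class__.__name__


class Dog(Animal):
    pass


animal_name = Animal().describe()  # 'Animal'
dog_name = Dog().describe()        # 'Dog', even though describe() lives in Animal

print(animal_name, dog_name)
```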

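22、Using make_blobs in sklearn (https://blog.csdn.net/weixin_44177568/article/details/102213508)
data, label = make_blobs(n_features=2, n_samples=100, centers=3, random_state=3, cluster_std=[0.8, 2, 5])

n_features is the number of features per sample
n_samples is the number of samples
centers is the number of cluster centers, which can be read as the number of distinct labels
random_state is the random seed, which makes the generated data reproducible
cluster_std sets the standard deviation of each cluster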

23、core_indices[model.core_sample_indices_] = True
https://www.cnblogs.com/xiguapipipipi/p/10109789.html
https://blog.csdn.net/qq_30031221/article/details/116494511

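core_indices[model.core_sample_indices_] = True  # model.core_sample_indices_ holds the indices of the core samples; labels_ alone cannot tell core points from border points, so this index array is needed to identify the core points.
model.core_sample_indices_ gives the positions of the core points within labels_; a clustered (non-noise) sample whose index is not in it is a border point.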

24、Using np.unique() (https://blog.csdn.net/u012193416/article/details/79672729)

This function removes duplicate values from an array and returns the remaining values in sorted order.
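
A short sketch on a made-up array; the return_counts option is an extra not mentioned above that also reports how often each value occurs:

```python
import numpy as np

values = np.array([3, 1, 2, 3, 1, 5])

# Duplicates removed, result sorted.
uniq = np.unique(values)
print(uniq)  # [1 2 3 5]

# Optionally also get the number of occurrences of each unique value.
uniq2, counts = np.unique(values, return_counts=True)
print(counts)  # [2 1 2 1]
```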

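25、Telling axis=0 from axis=1 (https://blog.csdn.net/guoyang768/article/details/84818774)

1 is the horizontal axis, running left to right (across the columns); 0 is the vertical axis, running top to bottom (across the rows)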

26、Clustering evaluation metrics (2) | Adjusted mutual information, and homogeneity, completeness and V-measure (https://zhuanlan.zhihu.com/p/425253563)

homogeneity : float
Score between 0.0 and 1.0; 1.0 stands for perfectly homogeneous labeling.

completeness : float
Score between 0.0 and 1.0; 1.0 stands for perfectly complete labeling.

v_measure : float
Harmonic mean of the first two.

A clustering result satisfies homogeneity if all of its clusters contain only data points that are members of a single class.

A clustering result satisfies completeness if all the data points that are members of a given class are elements of the same cluster.

27、Clustering | Six clustering quality metrics implemented in Python (Rand index, mutual information, silhouette coefficient) (https://blog.csdn.net/sinat_26917383/article/details/70577710)
https://blog.csdn.net/howhigh/article/details/73928635

28、numpy.median() (https://blog.csdn.net/qq_42518956/article/details/103987722)
The median function in the numpy module:
computes the median along the specified axis
returns the median of the array elements
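
A small sketch on made-up data that contrasts the median with the mean and shows the effect of the axis argument:

```python
import numpy as np

data = np.array([[1, 2, 100],
                 [4, 5, 6]])

overall_median = np.median(data)       # over all 6 values: (4 + 5) / 2 = 4.5
overall_mean = np.mean(data)           # pulled up by the outlier 100
col_medians = np.median(data, axis=0)  # one median per column
row_medians = np.median(data, axis=1)  # one median per row

print(overall_median, overall_mean)
print(col_medians)  # [ 2.5  3.5 53. ]
print(row_medians)  # [2. 5.]
```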


29、np.logspace(): a log-spaced geometric sequence (https://wenku.baidu.com/view/2f9546020422192e453610661ed9ad51f01d54f6.html)

logspace returns a geometric sequence; the base defaults to 10
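
A minimal sketch: the start and stop arguments are exponents, so the values run from base**start to base**stop. The particular numbers are chosen only for illustration.

```python
import numpy as np

# 4 points from 10**0 to 10**3.
powers_of_ten = np.logspace(0, 3, num=4)
print(powers_of_ten)  # [   1.   10.  100. 1000.]

# Same exponents with base 2: 2**0 ... 2**3.
powers_of_two = np.logspace(0, 3, num=4, base=2)
print(powers_of_two)  # [1. 2. 4. 8.]
```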


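30、numpy.histogramdd (https://www.cjavapy.com/article/1105/)
Compute the multidimensional histogram of some data.

Returns:
H : ndarray
The multidimensional histogram of sample x. See density (called normed in older NumPy versions) and weights for the different possible semantics.

edges : list
A list of D arrays describing the bin edges for each dimension.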

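31、Two ways to remove duplicates from a Python list (https://blog.csdn.net/CHQC388/article/details/114648761)
def test2():
    lst = [1, 2, 5, 6, 3, 5, 7, 3]
    tmp = list(set(lst))
    print(tmp)  # order may change: a set does not keep insertion order
    tmp.sort(key=lst.index)
    print(tmp)  # original order of first occurrences is restored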

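32、[numpy] The argmax axis argument explained (axis=0, axis=1, axis=-1) (https://blog.csdn.net/weixin_39190382/article/details/105854567)

https://blog.csdn.net/byron123456sfsfsfa/article/details/88923085

argmax, in one sentence: it returns the index of the maximum value.
With axis=0, values are compared within each column, and the row index of the maximum is returned
With axis=1, values are compared within each row, and the column index of the maximum is returned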
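
The axis rules above can be checked on a small made-up matrix:

```python
import numpy as np

m = np.array([[1, 9, 3],
              [7, 2, 8]])

flat_index = np.argmax(m)       # index into the flattened array [1 9 3 7 2 8]
per_column = np.argmax(m, axis=0)  # row index of the maximum in each column
per_row = np.argmax(m, axis=1)     # column index of the maximum in each row

print(flat_index)  # 1
print(per_column)  # [1 0 1]
print(per_row)     # [1 2]
```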

33、Using the endswith() function in Python (https://blog.csdn.net/qq_40678222/article/details/83033587)
Checks whether a string ends with the given character or substring.
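
A short sketch with a made-up file name; it also shows two less obvious forms, a tuple of suffixes and the optional start/end positions:

```python
filename = "report.csv"  # hypothetical file name

is_csv = filename.endswith(".csv")                # True
is_excel = filename.endswith((".xls", ".xlsx"))   # False: a tuple tests several suffixes
stem_check = filename.endswith("report", 0, 6)    # True: only filename[0:6] is examined

print(is_csv, is_excel, stem_check)
```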

34、How np.array() and np.asarray() relate and differ (https://blog.csdn.net/weixin_40922744/article/details/106737424)

As the definitions show, the main difference is that np.array (by default) copies the object, while np.asarray does not copy it unless necessary.
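
A minimal sketch of the copy behaviour when the input is already an ndarray (the variable names are made up):

```python
import numpy as np

original = np.array([1, 2, 3])

copied = np.array(original)     # makes an independent copy by default
aliased = np.asarray(original)  # already an ndarray: no copy is needed

original[0] = 99  # modify the source afterwards

print(copied[0])            # 1: the copy is unaffected
print(aliased[0])           # 99: same underlying data
print(aliased is original)  # True
```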

35、np.isnan() checks whether a value is NaN (https://blog.csdn.net/tian_jiangnan/article/details/104862085)

36、np.argmax (https://blog.csdn.net/CSDNwei/article/details/109183313)

Usage: np.argmax(a)
Note: it returns the index of the maximum element of a (an index into the flattened array when no axis is given)

37、np.hstack stacks the arrays in the argument tuple horizontally (https://blog.csdn.net/G66565906/article/details/84142034)

import numpy as np

arr1 = np.array([[1,3], [2,4] ])
arr2 = np.array([[1,4], [2,6] ])
res = np.hstack((arr1, arr2))

print(res)

[[1 3 1 4]
 [2 4 2 6]]

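38、dtype=np.uint8 (https://blog.csdn.net/qq_42191914/article/details/103103460)

https://zhidao.baidu.com/question/532862991.html
uint8 is an 8-bit unsigned integer type; uint16 is a 16-bit unsigned integer type.

https://zhidao.baidu.com/question/519987520.html
uint8 is the data type that holds values from 0 to 2^8-1 = 255; it is very common in image processing.

A pitfall I hit today: in opencv-python, if you create an array to hold an image, you need to specify dtype=np.uint8; otherwise the array does contain values, but imshow cannot display it correctly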


39、The eval() function evaluates a string expression and returns its value. (https://www.runoob.com/python/python-func-eval.html)

n = 81
eval("n + 4")  # returns 85

40、Using assert in Python (https://blog.csdn.net/qq_37369201/article/details/109195257)

def zero(s):
    a = int(s)
    assert a > 0, "a is out of range"  # if a really is greater than 0, execution continues normally
    return a

zero("-2")  # if a is not greater than 0, an AssertionError is raised with the message "a is out of range"

41、Type hints: (self, nums: List[int]) -> List[int] (https://blog.csdn.net/chengyikang20/article/details/124778296)

def greeting(name: str) -> str:
    return 'Hello ' + name

In the greeting function, the parameter name has type str and the return type is also str. Subtypes are accepted as arguments too.
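
A sketch of the List[int] form from the title. The class and method names (Solution, double_all) are hypothetical; the point is that the hints document intent and are not enforced at runtime.

```python
from typing import List


class Solution:
    # Hypothetical method: takes a list of ints, returns a list of ints.
    def double_all(self, nums: List[int]) -> List[int]:
        return [n * 2 for n in nums]


result = Solution().double_all([1, 2, 3])
print(result)  # [2, 4, 6]

# Hints are not checked when the code runs: strings are accepted here too.
unchecked = Solution().double_all(["a", "b"])
print(unchecked)  # ['aa', 'bb']
```

A static checker such as mypy would flag the second call; the interpreter itself does not.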

42、Checking the installed numpy version (https://blog.csdn.net/cpick/article/details/122503241)

Enter the following in cmd, one after another:

1.python

2.import numpy

3.numpy.__version__

43、Installing and uninstalling numpy (https://blog.csdn.net/weixin_42081389/article/details/98185411/)
pip uninstall numpy
pip install numpy==1.16.4

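44、Understanding os.walk() in detail (https://blog.csdn.net/qq_37344125/article/details/107972463)

for root, dirs, files in os.walk(operate_path):
    print('root:', root)
    print('dirs:', dirs)
    print('files:', files)
    print('\n')

root: the path of the directory currently being visited (on the first iteration, the mm folder itself); it is absolute only if operate_path is absolute
dirs: a list of the names of the subdirectories directly inside root (one level only)
files: a list of the names of the files directly inside root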

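45、Using the getpixel() method in PIL (https://blog.csdn.net/qq_36430012/article/details/114303458)
getpixel() returns the pixel value (for an RGB image, the RGB color) at a given point in the image; its argument is the pixel's coordinates. The returned value differs depending on the image mode.

46、L, P, RGB, RGBA, CMYK, YCbCr, I, F: the different image modes (https://www.likecs.com/show-204895834.html)
Mode "L" is a grayscale image: each pixel is stored in 8 bits, where 0 is black, 255 is white, and the values in between are shades of gray.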

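47、Sorting a numpy.ndarray (https://www.jb51.net/article/130651.htm)
ndarray.sort(axis=-1,kind='quicksort',order=None)

Usage: a.sort()

Parameters:

axis: the axis along which to sort; 0 sorts down each column, 1 sorts along each row (the default -1 means the last axis)

kind: the sorting algorithm; quicksort, mergesort and heapsort are available

order: not the sort direction; for structured arrays it names the field(s) to compare first

Effect: sorts array a in place, so a itself is modified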
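
A small sketch on a made-up array showing the in-place behaviour and the two axes; np.sort is included for contrast because it returns a sorted copy instead.

```python
import numpy as np

a = np.array([[9, 1, 8],
              [3, 7, 2]])
b = a.copy()
c = a.copy()

ret = a.sort()   # default axis=-1: each row is sorted in place
b.sort(axis=0)   # each column is sorted in place
sorted_copy = np.sort(c)  # returns a new array, c stays unchanged

print(ret)  # None: the method modifies a and returns nothing
print(a)    # [[1 8 9]
            #  [2 3 7]]
print(b)    # [[3 1 2]
            #  [9 7 8]]
```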

48、Extracting multiple columns from a pandas DataFrame (https://blog.csdn.net/weixin_44561414/article/details/125673541)

X = df_gps_org[["latitude","longitude"]]

49、Pandas: summing by row and by column (https://blog.csdn.net/panfuyong11/article/details/115349576)

x_train[['L']].apply(lambda x:x.sum())
x_train[['温度']].apply(lambda x:x.sum())
x_train[['PH']].apply(lambda x:x.sum())

50、Converting pandas DataFrame data to a list (https://wenku.baidu.com/view/2605fc37b4360b4c2e3f5727a5e9856a57122649.html)

caohao= all_data['槽号']
print(type(caohao))
caohao_list = caohao.tolist()

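51、Converting pandas to numpy (http://t.zoukankan.com/Renyi-Fan-p-13882431.html)
1. Use the DataFrame's values attribute

df.values
2. Use the DataFrame's as_matrix() method (deprecated in pandas 0.23 and removed in 1.0; on current versions use df.to_numpy() instead)

df.as_matrix()
3. Use NumPy's array function

np.array(df)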

52、Dropping duplicates in pandas based on a column (https://blog.csdn.net/qq_43965708/article/details/109892053)

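53、Several ways to create subplots in matplotlib (https://www.dandelioncloud.cn/article/details/1498083567296163841)

import numpy as np
import matplotlib.pyplot as plt
x = np.arange(100)

# Create the figure object fig

fig = plt.figure(figsize = (12, 6))

# 221 means a grid of 2 rows and 2 columns (4 subplots in total), drawing into the 1st subplot counting from the left.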

ax1 = fig.add_subplot(221)
ax1.plot(x, x)
ax2 = fig.add_subplot(222)
ax2.plot(x, -x)
ax3 = fig.add_subplot(223)
ax3.plot(x, x ** 2)
ax4 = fig.add_subplot(224)
ax4.plot(-x, x ** 2)
plt.show()

54、sklearn: splitting a dataset (https://blog.csdn.net/qq_36387683/article/details/80468011)

x_train, x_test, y_train, y_test = train_test_split(data, label, test_size = 0.3, random_state = 7)

55、Progress bar
import time
from tqdm import tqdm

for i in tqdm(range(1000)):
    time.sleep(.01)

56、How to convert a .py file into an exe (https://blog.csdn.net/m0_54812370/article/details/124493642)
https://blog.csdn.net/a789865315/article/details/124259965

57、Decompiling: how to restore an exe packaged from Python back to .py? (https://blog.csdn.net/Csy79/article/details/125103466)
https://blog.csdn.net/weixin_49764009/article/details/120340153

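58、The difference between calling fit_transform() and calling transform() (https://blog.csdn.net/data_curd/article/details/112556315)
fit_transform combines the fit and transform methods: it computes the mean and variance of x_train and applies them to x_train.
A later call to transform then simply reuses the mean and variance computed from x_train.

fit(): calculates the parameters μ and σ and saves them as internal state.
Explanation: put simply, it computes properties inherent to the training set X, such as its mean, variance, maximum and minimum.
transform(): applies the transformation to a particular dataset using these calculated parameters.
Explanation: building on fit, it performs the actual operation, e.g. standardization, dimensionality reduction or normalization (depending on the tool used, such as PCA or StandardScaler).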
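
The fit/transform split can be sketched with a tiny pure-Python scaler. TinyScaler is a hypothetical class written for illustration, not sklearn code; it handles a single feature and uses the population standard deviation.

```python
from statistics import mean, pstdev


class TinyScaler:
    """Hypothetical one-feature scaler mimicking the fit/transform split."""

    def fit(self, values):
        # Learn the parameters from the data and store them.
        self.mean_ = mean(values)
        self.std_ = pstdev(values)
        return self

    def transform(self, values):
        # Apply the stored parameters; nothing is re-learned here.
        return [(v - self.mean_) / self.std_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
x_test = [3.0, 6.0]

scaler = TinyScaler()
train_scaled = scaler.fit_transform(x_train)  # learn from x_train, then scale it
test_scaled = scaler.transform(x_test)        # reuse x_train's mean and std

print(train_scaled)
print(test_scaled)
```

The test data is scaled with the training set's statistics, which is why transform and not fit_transform is called on it.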

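59、Python: computing variance and standard deviation (https://blog.csdn.net/weixin_46560950/article/details/104905883)
import numpy as np
arr = [1, 2, 3, 4, 5, 6]

# Variance (population variance, since ddof defaults to 0)

arr_var = np.var(arr)

# Standard deviation (sample standard deviation, since ddof=1)

arr_std = np.std(arr, ddof=1)
print("Variance: %f" % arr_var)
print("Standard deviation: %f" % arr_std)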

60、Removing newline characters from a string in Python (https://blog.csdn.net/weixin_39610759/article/details/109924504)

I. Removing whitespace

strip()

" xyz ".strip() # returns "xyz"

" xyz ".lstrip() # returns "xyz "

" xyz ".rstrip() # returns " xyz"

" x y z ".replace(' ', '') # returns "xyz"

II. Replacing: replace("space","")

Use replace("\n", ""): the second string replaces every occurrence of the first
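
A short sketch on a made-up string comparing replace with strip and splitlines for getting rid of newlines:

```python
text = "line one\nline two\r\nline three\n"  # hypothetical input

# Remove every newline (and carriage return) anywhere in the string.
no_newlines = text.replace("\r", "").replace("\n", "")

# strip() only removes whitespace at the two ends.
trimmed = text.strip()

# splitlines() splits on line boundaries and drops the line endings.
lines = text.splitlines()

print(repr(no_newlines))  # 'line oneline twoline three'
print(repr(trimmed))      # 'line one\nline two\r\nline three'
print(lines)              # ['line one', 'line two', 'line three']
```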

61、Ways to slice strings in Python (https://blog.csdn.net/luckjump/article/details/119251647)

62、How to compute the mode in Python (https://blog.csdn.net/weixin_35757704/article/details/120842651?spm=1001.2101.3001.6650.6&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7EBlogCommendFromBaidu%7ERate-6-120842651-blog-108671660.pc_relevant_default&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7EBlogCommendFromBaidu%7ERate-6-120842651-blog-108671660.pc_relevant_default&utm_relevant_index=7)

import numpy as np
import statistics

my_list = [1, 1, 2, 3, 4, 1, 2, 3, 3, 3, 3, 3]
mode1 = statistics.mode(my_list)
mode2 = np.argmax(np.bincount(my_list))
print(mode1)
print(mode2)

63、The ceil() function rounds a number up, returning the smallest integer greater than or equal to it.
import math

math.ceil( x )
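
A few concrete values, with math.floor shown for contrast:

```python
import math

up_positive = math.ceil(4.1)    # 5
up_negative = math.ceil(-4.9)   # -4: rounding up moves toward positive infinity
already_int = math.ceil(7)      # 7
down = math.floor(4.1)          # 4, for comparison

print(up_positive, up_negative, already_int, down)
```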

64、Formatted output (https://www.jb51.net/article/240789.htm)
print(f"I love {'Geeks'} for \"{'Geeks'}!\"")

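65、The randint function (here np.random.randint) generates integers within a range, but the values can repeat (https://blog.csdn.net/u012759006/article/details/108252836)

a = np.random.randint(0, 2, 10)  # 10 integers drawn from [0, 2), i.e. each is 0 or 1

print(a) # e.g. [0 0 1 1 0 0 1 0 0 0]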
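
If repeated values are a problem, one option is the standard library's random.sample, which draws without replacement. This is a sketch of that alternative and is not taken from the linked article; the seed is only there to make runs reproducible.

```python
import random

random.seed(0)  # assumption: fixed seed, only for reproducibility

# randint can return the same value more than once (both endpoints included).
with_repeats = [random.randint(0, 9) for _ in range(10)]

# sample picks distinct elements from the population.
no_repeats = random.sample(range(10), 5)

print(with_repeats)
print(no_repeats)
```

Note that random.randint(a, b) includes b, while np.random.randint(low, high) excludes high.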

66、Creating, converting and accessing two-dimensional lists in Python (https://www.jb51.net/article/246265.htm)
Build a two-dimensional list by appending one-dimensional lists
row1 = [3, 4, 5]
row2 = [1, 5, 9]
row3 = [2, 5, 8]
row4 = [7, 8, 9]
matrix = []
matrix.append(row1)
matrix.append(row2)
matrix.append(row3)
matrix.append(row4)
print(matrix)

67、How to delete elements from a list in Python (https://m.php.cn/article/471345.html)

remove: deletes a single element by value; only the first matching element is removed

lst = [1, 2, 3, 4, 5, 2, 6]

lst.remove(2)

lst

[1, 3, 4, 5, 2, 6]

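68、Check in Python whether a directory exists, and create it if it does not
import os
wjjname = input("Enter the target directory\n")  # read the target directory
if os.path.exists(wjjname):  # check whether the target directory exists
    print("Directory exists")
else:
    print("Directory does not exist")
    print("Creating the directory for you")
    os.mkdir(wjjname)  # create the target directory if it does not exist
    print("Directory created")
input("Press Enter to exit")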

69、Example: converting list elements to numbers and sorting them in Python (http://www.kaotop.com/it/22493.html)
numbers = ['2', '4', '1', '3']
numbers = list(map(int, numbers))  # [2, 4, 1, 3]
numbers.sort()  # [1, 2, 3, 4]

70、Grouping with groupby in Python (https://blog.csdn.net/qq_32618817/article/details/80587228)

for i in df.groupby(['key1','key2']):
    print(i)

Output:

(('a', 'one'), data1 data2 key1 key2
0 -0.293828 0.571930 a one
4 -1.943001 0.106842 a one)
(('a', 'two'), data1 data2 key1 key2
1 1.872765 1.085445 a two)
(('b', 'one'), data1 data2 key1 key2
2 -0.466504 1.26214 b one)
(('b', 'two'), data1 data2 key1 key2
3 -1.125619 -0.836119 b two)

71、[Python] JSON config files and a simple wrapper function (https://blog.csdn.net/AwesomeP/article/details/126832604)
import json

def readFileJson():
    # Read the configuration file
    with open('global_data.json', 'r') as f:
        data = json.load(f)
    return data

72、Two ways to read the last line(s) of a file in Python (https://blog.csdn.net/weixin_39997300/article/details/109879260)

with open(fname, 'r', encoding='utf-8') as f:  # open the file
    lines = f.readlines()  # read all lines
    first_line = lines[0]  # take the first line
    last_line = lines[-1]  # take the last line

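73、Reversing a Python list (https://www.jb51.net/article/248826.htm)

reverse() reverses the elements of the list in place, without creating a new copy to hold the result
Pros: 1. saves memory
Cons: 1. it modifies the source data directly, so if the original order is needed later you have to reverse again (an extra operation)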

mylist = [1, 2, 3, 4, 5]
print(mylist)
mylist.reverse()
print(mylist)

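74、Packaging Python files (https://blog.csdn.net/linZinan_/article/details/115573895)
https://www.php.cn/faq/415527.html

Method 1: pyinstaller -F -w --icon="absolute path to the window icon file" filename.py packages everything into a single exe file; it usually bundles the dependencies, so it is fairly large

Method 2: pyinstaller -D -w --icon="absolute path to the window icon file" filename.py packages into a folder with the exe inside it, so the exe file itself is smaller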

75、Pandas warns when reading a csv file: DtypeWarning: Columns (3) have mixed types. Specify dtype option on import or set low_memory=False.

data = pd.read_csv(f, low_memory=False)

76、Several ways to read a csv file in Python (with examples) (https://blog.csdn.net/qq_43160348/article/details/124331781)

import pandas as pd

df = pd.read_csv('../data_pro/audito_whole.csv')
print(df)

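77、Listing the files in a directory with Python

os.listdir: takes a directory path and returns the names of all subdirectories and files directly inside it

It does not return the files inside subdirectories

for file_name in os.listdir(path):
    print(file_name)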

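78、Write a Python program that generates 100 random integers between 0 and 10 and counts how many times each value occurs. (https://blog.csdn.net/jxydwhb/article/details/105418304)

import random
i = 1
d = {}
while i < 101:
    value = random.randint(0, 10)
    d[value] = d.get(value, 0) + 1
    i = i + 1
for i in range(0, 11):
    print("Value {} occurred {} times".format(i, d.get(i, 0)))  # get() avoids a KeyError if a value never came up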

79、How functions, modules, packages and libraries relate
function < module < package < library
