weixin_40009063

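# NumPy Data Analysis Exercises

These NumPy exercises are meant as a reference for learning NumPy and to take you beyond basic NumPy usage. The questions come in four difficulty levels, with L1 the easiest and L4 the hardest.

If you want to brush up on your NumPy knowledge quickly, the NumPy basics and advanced NumPy tutorials may be what you are looking for.

**Update:** A similar set of exercises on pandas is now available.

# NumPy Data Analysis Questions and Answers

# 1. Import numpy as np and see the version

**Difficulty level:** L1

**Question:** Import numpy as np and print the version number.

Answer: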

import numpy as np

print(np.__version__)

# > 1.13.3

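You must import numpy as np for the rest of the code in these exercises to work.

To install NumPy, installing Anaconda is recommended, since it already includes NumPy.

# 2. How to create a 1D array?

**Difficulty level:** L1

**Question:** Create a 1D array of numbers from 0 to 9.

Expected output:

# > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Answer: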

arr = np.arange(10)

arr

# > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# 3. How to create a boolean array?

**Difficulty level:** L1

**Question:** Create a NumPy array whose elements are all True.

Answer:

np.full((3, 3), True, dtype=bool)

# > array([[ True, True, True],

# > [ True, True, True],

# > [ True, True, True]], dtype=bool)

# Alternate method:

np.ones((3,3), dtype=bool)

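# 4. How to extract items that satisfy a given condition from a 1D array?

**Difficulty level:** L1

**Question:** Extract all odd numbers from arr.

Given:

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Expected output:

# > array([1, 3, 5, 7, 9])

Answer: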

# Input

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Solution

arr[arr % 2 == 1]

# > array([1, 3, 5, 7, 9])

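# 5. How to replace items that satisfy a condition with another value in a NumPy array?

**Difficulty level:** L1

**Question:** Replace all odd numbers in arr with -1.

Given:

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Expected output:

# > array([ 0, -1, 2, -1, 4, -1, 6, -1, 8, -1])

Answer: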

arr[arr % 2 == 1] = -1

arr

# > array([ 0, -1, 2, -1, 4, -1, 6, -1, 8, -1])

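# 6. How to replace items that satisfy a condition without affecting the original array?

**Difficulty level:** L2

**Question:** Replace all odd numbers in arr with -1 without changing arr.

Given:

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Expected output:

out

# > array([ 0, -1, 2, -1, 4, -1, 6, -1, 8, -1])

arr

# > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Answer: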

arr = np.arange(10)

out = np.where(arr % 2 == 1, -1, arr)

print(arr)

out

# > [0 1 2 3 4 5 6 7 8 9]

# > array([ 0, -1, 2, -1, 4, -1, 6, -1, 8, -1])

# 7. How to reshape an array?

**Difficulty level:** L1

**Question:** Convert a 1D array to a 2D array with 2 rows.

Given:

np.arange(10)

# > array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Expected output:

# > array([[0, 1, 2, 3, 4],

# > [5, 6, 7, 8, 9]])

Answer:

arr = np.arange(10)

arr.reshape(2, -1) # Setting to -1 automatically decides the number of cols

# > array([[0, 1, 2, 3, 4],

# > [5, 6, 7, 8, 9]])

# 8. How to stack two arrays vertically?

**Difficulty level:** L2

**Question:** Stack arrays a and b vertically.

Given:

a = np.arange(10).reshape(2,-1)

b = np.repeat(1, 10).reshape(2,-1)

Expected output:

# > array([[0, 1, 2, 3, 4],

# > [5, 6, 7, 8, 9],

# > [1, 1, 1, 1, 1],

# > [1, 1, 1, 1, 1]])

Answer:

a = np.arange(10).reshape(2,-1)

b = np.repeat(1, 10).reshape(2,-1)

# Answers

# Method 1:

np.concatenate([a, b], axis=0)

# Method 2:

np.vstack([a, b])

# Method 3:

np.r_[a, b]

# > array([[0, 1, 2, 3, 4],

# > [5, 6, 7, 8, 9],

# > [1, 1, 1, 1, 1],

# > [1, 1, 1, 1, 1]])

# 9. How to stack two arrays horizontally?

**Difficulty level:** L2

**Question:** Stack arrays a and b horizontally.

Given:

a = np.arange(10).reshape(2,-1)

b = np.repeat(1, 10).reshape(2,-1)

Expected output:

# > array([[0, 1, 2, 3, 4, 1, 1, 1, 1, 1],

# > [5, 6, 7, 8, 9, 1, 1, 1, 1, 1]])

Answer:

a = np.arange(10).reshape(2,-1)

b = np.repeat(1, 10).reshape(2,-1)

# Answers

# Method 1:

np.concatenate([a, b], axis=1)

# Method 2:

np.hstack([a, b])

# Method 3:

np.c_[a, b]

# > array([[0, 1, 2, 3, 4, 1, 1, 1, 1, 1],

# > [5, 6, 7, 8, 9, 1, 1, 1, 1, 1]])

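# 10. How to generate custom sequences in NumPy without hardcoding?

**Difficulty level:** L2

**Question:** Create the following pattern without hardcoding. Use only NumPy functions and the input array a below.

Given:

a = np.array([1,2,3])

Expected output:

# > array([1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])

Answer: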

np.r_[np.repeat(a, 3), np.tile(a, 3)]

# > array([1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])
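The answer relies on the difference between np.repeat and np.tile. A minimal sketch of the two side by side, where the names repeated, tiled and pattern are illustrative and not part of the original exercise:

```python
import numpy as np

a = np.array([1, 2, 3])

# np.repeat repeats each element in place before moving on to the next one
repeated = np.repeat(a, 3)

# np.tile repeats the whole array end to end
tiled = np.tile(a, 3)

# np.r_ concatenates the two 1D results along the first axis
pattern = np.r_[repeated, tiled]
print(pattern)
```

Swapping the two calls inside np.r_ would reverse the order of the two halves of the pattern.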

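# 11. How to get the common items between two NumPy arrays?

**Difficulty level:** L2

**Question:** Get the common items between arrays a and b.

Given:

a = np.array([1,2,3,2,3,4,3,4,5,6])

b = np.array([7,2,10,2,7,4,9,4,9,8])

Expected output:

# > array([2, 4])

Answer: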

a = np.array([1,2,3,2,3,4,3,4,5,6])

b = np.array([7,2,10,2,7,4,9,4,9,8])

np.intersect1d(a,b)

# > array([2, 4])

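# 12. How to remove from one array the items that exist in another?

**Difficulty level:** L2

**Question:** From array a, remove all items that are present in array b.

Given:

a = np.array([1,2,3,4,5])

b = np.array([5,6,7,8,9])

Expected output:

# > array([1, 2, 3, 4])

Answer: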

a = np.array([1,2,3,4,5])

b = np.array([5,6,7,8,9])

# From 'a' remove all of 'b'

np.setdiff1d(a,b)

# > array([1, 2, 3, 4])

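# 13. How to get the positions where elements of two arrays match?

**Difficulty level:** L2

**Question:** Get the positions where the elements of a and b match.

Given:

a = np.array([1,2,3,2,3,4,3,4,5,6])

b = np.array([7,2,10,2,7,4,9,4,9,8])

Expected output:

# > (array([1, 3, 5, 7]),)

Answer: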

a = np.array([1,2,3,2,3,4,3,4,5,6])

b = np.array([7,2,10,2,7,4,9,4,9,8])

np.where(a == b)

# > (array([1, 3, 5, 7]),)

# 14. How to extract all numbers within a given range from a NumPy array?

**Difficulty level:** L2

**Question:** Get all items between 5 and 10 from a.

Given:

a = np.array([2, 6, 1, 9, 10, 3, 27])

Expected output:

# > array([ 6, 9, 10])

Answer:

a = np.array([2, 6, 1, 9, 10, 3, 27])

# Method 1

index = np.where((a >= 5) & (a <= 10))

a[index]

# Method 2:

index = np.where(np.logical_and(a>=5, a<=10))

a[index]

# > array([ 6, 9, 10])

# Method 3: (thanks loganzk!)

a[(a >= 5) & (a <= 10)]
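All three methods rest on the same boolean mask. A minimal sketch that spells out the intermediate values, where the names mask, positions and in_range are illustrative:

```python
import numpy as np

a = np.array([2, 6, 1, 9, 10, 3, 27])

# Each comparison yields a boolean array; & combines them element-wise.
# The parentheses are required because & binds tighter than >= and <=.
mask = (a >= 5) & (a <= 10)

# np.where(mask) returns a tuple of index arrays, one per dimension
positions = np.where(mask)[0]

# Indexing with the mask or with the positions selects the same items
in_range = a[mask]
print(positions, in_range)
```

Indexing with the mask directly (Method 3) skips the index lookup, which is why it is the shortest of the three.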

# 15. How to make a Python function that handles scalars work on NumPy arrays?

**Difficulty level:** L2

**Question:** Convert the function maxx, which works on two scalars, to work on two arrays.

Given:

def maxx(x, y):
    """Get the maximum of two items"""
    if x >= y:
        return x
    else:
        return y

maxx(1, 5)

# > 5

Expected output:

a = np.array([5, 7, 9, 8, 6, 4, 5])

b = np.array([6, 3, 4, 8, 9, 7, 1])

pair_max(a, b)

# > array([ 6., 7., 9., 8., 9., 7., 5.])

Answer:

def maxx(x, y):
    """Get the maximum of two items"""
    if x >= y:
        return x
    else:
        return y

pair_max = np.vectorize(maxx, otypes=[float])

a = np.array([5, 7, 9, 8, 6, 4, 5])

b = np.array([6, 3, 4, 8, 9, 7, 1])

pair_max(a, b)

# > array([ 6., 7., 9., 8., 9., 7., 5.])
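np.vectorize is a convenience wrapper that still calls the Python function once per element. As a point of comparison, here is a sketch that checks the vectorized maxx against the built-in ufunc np.maximum; the names looped and builtin are illustrative:

```python
import numpy as np

def maxx(x, y):
    """Get the maximum of two items"""
    return x if x >= y else y

a = np.array([5, 7, 9, 8, 6, 4, 5])
b = np.array([6, 3, 4, 8, 9, 7, 1])

# np.vectorize wraps the scalar function in an element-wise Python-level loop
pair_max = np.vectorize(maxx, otypes=[float])
looped = pair_max(a, b)

# np.maximum is a compiled ufunc that computes the same element-wise maximum
builtin = np.maximum(a, b)

print(looped)
print(builtin)
```

When a ufunc already exists for the operation, it is usually the better choice; np.vectorize is mainly useful for scalar functions with no array equivalent.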

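# 16. How to swap two columns in a 2D NumPy array?

**Difficulty level:** L2

**Question:** Swap columns 1 and 2 (the first and second columns) in the array arr.

Given:

arr = np.arange(9).reshape(3,3)

arr

Answer: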

# Input

arr = np.arange(9).reshape(3,3)

arr

# Solution

arr[:, [1,0,2]]

# > array([[1, 0, 2],

# > [4, 3, 5],

# > [7, 6, 8]])

# 17. How to swap two rows in a 2D NumPy array?

**Difficulty level:** L2

**Question:** Swap rows 1 and 2 (the first and second rows) in the array arr.

Given:

arr = np.arange(9).reshape(3,3)

arr

Answer:

# Input

arr = np.arange(9).reshape(3,3)

# Solution

arr[[1,0,2], :]

# > array([[3, 4, 5],

# > [0, 1, 2],

# > [6, 7, 8]])

# 18. How to reverse the rows of a 2D array?

**Difficulty level:** L2

**Question:** Reverse the rows of the 2D array arr.

Given:

# Input

arr = np.arange(9).reshape(3,3)

Answer:

# Input

arr = np.arange(9).reshape(3,3)

# Solution

arr[::-1]

# > array([[6, 7, 8],

# > [3, 4, 5],

# > [0, 1, 2]])

# 19. How to reverse the columns of a 2D array?

**Difficulty level:** L2

**Question:** Reverse the columns of the 2D array arr.

Given:

# Input

arr = np.arange(9).reshape(3,3)

Answer:

# Input

arr = np.arange(9).reshape(3,3)

# Solution

arr[:, ::-1]

# > array([[2, 1, 0],

# > [5, 4, 3],

# > [8, 7, 6]])

# 20. How to create a 2D array containing random floats between 5 and 10?

**Difficulty level:** L2

**Question:** Create a 2D array of shape 5x3 that contains random decimal numbers between 5 and 10.

Answer:

# Solution Method 1:

rand_arr = np.random.randint(low=5, high=10, size=(5,3)) + np.random.random((5,3))

# print(rand_arr)

# Solution Method 2:

rand_arr = np.random.uniform(5,10, size=(5,3))

print(rand_arr)

# > [[ 8.50061025 9.10531502 6.85867783]

# > [ 9.76262069 9.87717411 7.13466701]

# > [ 7.48966403 8.33409158 6.16808631]

# > [ 7.75010551 9.94535696 5.27373226]

# > [ 8.0850361 5.56165518 7.31244004]]

# 21. How to print only 3 decimal places in a NumPy array?

**Difficulty level:** L1

**Question:** Print or show only 3 decimal places of the NumPy array rand_arr.

Given:

rand_arr = np.random.random((5,3))

Answer:

# Input: create the random array

rand_arr = np.random.random((5,3))

# Limit to 3 decimal places

np.set_printoptions(precision=3)

rand_arr[:4]

# > array([[ 0.443, 0.109, 0.97 ],

# > [ 0.388, 0.447, 0.191],

# > [ 0.891, 0.474, 0.212],

# > [ 0.609, 0.518, 0.403]])

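# 22. How to pretty print a NumPy array by suppressing the scientific notation (like 1e10)?

**Difficulty level:** L1

**Question:** Pretty print rand_arr by suppressing the scientific notation (like 1e10).

Given: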

# Create the random array

np.random.seed(100)

rand_arr = np.random.random([3,3])/1e3

rand_arr

# > array([[ 5.434049e-04, 2.783694e-04, 4.245176e-04],

# > [ 8.447761e-04, 4.718856e-06, 1.215691e-04],

# > [ 6.707491e-04, 8.258528e-04, 1.367066e-04]])

Expected output:

# > array([[ 0.000543, 0.000278, 0.000425],

# > [ 0.000845, 0.000005, 0.000122],

# > [ 0.000671, 0.000826, 0.000137]])

Answer:

# Reset printoptions to default

np.set_printoptions(suppress=False)

# Create the random array

np.random.seed(100)

rand_arr = np.random.random([3,3])/1e3

rand_arr

# > array([[ 5.434049e-04, 2.783694e-04, 4.245176e-04],

# > [ 8.447761e-04, 4.718856e-06, 1.215691e-04],

# > [ 6.707491e-04, 8.258528e-04, 1.367066e-04]])

np.set_printoptions(suppress=True, precision=6) # precision is optional

rand_arr

# > array([[ 0.000543, 0.000278, 0.000425],

# > [ 0.000845, 0.000005, 0.000122],

# > [ 0.000671, 0.000826, 0.000137]])

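# 23. How to limit the number of items printed in the output of a NumPy array?

**Difficulty level:** L1

**Question:** Limit the number of items printed for the NumPy array a to a maximum of 6 elements.

Given:

a = np.arange(15)

# > array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

Expected output:

# > array([ 0, 1, 2, ..., 12, 13, 14])

Answer:

np.set_printoptions(threshold=6)

a = np.arange(15)

a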

# > array([ 0, 1, 2, ..., 12, 13, 14])

# 24. How to print the full NumPy array without truncating?

**Difficulty level:** L1

**Question:** Print the full NumPy array a without truncating.

Given:

np.set_printoptions(threshold=6)

a = np.arange(15)

# > array([ 0, 1, 2, ..., 12, 13, 14])

Expected output:

# > array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

Answer:

# Input

np.set_printoptions(threshold=6)

a = np.arange(15)

# Solution

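# threshold=np.nan is rejected by recent NumPy versions; sys.maxsize disables truncation

import sys

np.set_printoptions(threshold=sys.maxsize)

a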

# > array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
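np.set_printoptions changes a global setting, so a scoped alternative can be handy. A minimal sketch, assuming NumPy 1.15 or later where the np.printoptions context manager is available; the names short and full are illustrative:

```python
import sys
import numpy as np

a = np.arange(15)

# Inside the context manager the threshold is lowered, so the repr is summarized
with np.printoptions(threshold=6):
    short = repr(a)

# A very large threshold disables summarization for this block only
with np.printoptions(threshold=sys.maxsize):
    full = repr(a)

print(short)
print(full)
```

Outside the with blocks, the previous print options are restored automatically.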

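# 25. How to import a dataset with numbers and text, keeping the text intact in a NumPy array?

**Difficulty level:** L2

**Question:** Import the iris dataset, keeping the text intact.

Answer: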

# Solution

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

# Print the first 3 rows

iris[:3]

# > array([[b'5.1', b'3.5', b'1.4', b'0.2', b'Iris-setosa'],

# > [b'4.9', b'3.0', b'1.4', b'0.2', b'Iris-setosa'],

# > [b'4.7', b'3.2', b'1.3', b'0.2', b'Iris-setosa']], dtype=object)

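# 26. How to extract a particular column from a 1D array of tuples?

**Difficulty level:** L2

**Question:** Extract the text column species from the 1D iris dataset imported in the previous question.

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_1d = np.genfromtxt(url, delimiter=',', dtype=None)

Answer:

# Input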

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_1d = np.genfromtxt(url, delimiter=',', dtype=None)

print(iris_1d.shape)

# Solution:

species = np.array([row[4] for row in iris_1d])

species[:5]

# > (150,)

# > array([b'Iris-setosa', b'Iris-setosa', b'Iris-setosa', b'Iris-setosa',

# > b'Iris-setosa'],

# > dtype='|S18')

# 27. How to convert a 1D array of tuples to a 2D NumPy array?

**Difficulty level:** L2

**Question:** Convert the 1D iris dataset to the 2D array iris_2d by omitting the species text field.

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_1d = np.genfromtxt(url, delimiter=',', dtype=None)

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_1d = np.genfromtxt(url, delimiter=',', dtype=None)

# Solution:

# Method 1: Convert each row to a list and get the first 4 items

iris_2d = np.array([row.tolist()[:4] for row in iris_1d])

iris_2d[:4]

# Alt Method 2: Import only the first 4 columns from source url

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

iris_2d[:4]

# > array([[ 5.1, 3.5, 1.4, 0.2],

# > [ 4.9, 3. , 1.4, 0.2],

# > [ 4.7, 3.2, 1.3, 0.2],

# > [ 4.6, 3.1, 1.5, 0.2]])

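# 28. How to compute the mean, median, and standard deviation of a NumPy array?

**Difficulty level:** L1

**Question:** Find the mean, median, and standard deviation of iris's sepallength (1st column).

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

Answer: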

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

sepallength = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0])

# Solution

mu, med, sd = np.mean(sepallength), np.median(sepallength), np.std(sepallength)

print(mu, med, sd)

# > 5.84333333333 5.8 0.825301291785

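# 29. How to normalize an array so that its values range exactly between 0 and 1?

**Difficulty level:** L2

**Question:** Create a normalized form of iris's sepallength whose values range exactly between 0 and 1, so that the minimum has the value 0 and the maximum has the value 1.

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

sepallength = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0])

Answer: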

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

sepallength = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0])

# Solution

Smax, Smin = sepallength.max(), sepallength.min()

S = (sepallength - Smin)/(Smax - Smin)

# or

S = (sepallength - Smin)/np.ptp(sepallength) # Thanks, David Ojeda! (np.ptp also works on NumPy 2.x, where the ndarray.ptp method was removed)

print(S)

# > [ 0.222 0.167 0.111 0.083 0.194 0.306 0.083 0.194 0.028 0.167

# > 0.306 0.139 0.139 0. 0.417 0.389 0.306 0.222 0.389 0.222

# > 0.306 0.222 0.083 0.222 0.139 0.194 0.194 0.25 0.25 0.111

# > 0.139 0.306 0.25 0.333 0.167 0.194 0.333 0.167 0.028 0.222

# > 0.194 0.056 0.028 0.194 0.222 0.139 0.222 0.083 0.278 0.194

# > 0.75 0.583 0.722 0.333 0.611 0.389 0.556 0.167 0.639 0.25

# > 0.194 0.444 0.472 0.5 0.361 0.667 0.361 0.417 0.528 0.361

# > 0.444 0.5 0.556 0.5 0.583 0.639 0.694 0.667 0.472 0.389

# > 0.333 0.333 0.417 0.472 0.306 0.472 0.667 0.556 0.361 0.333

# > 0.333 0.5 0.417 0.194 0.361 0.389 0.389 0.528 0.222 0.389

# > 0.556 0.417 0.778 0.556 0.611 0.917 0.167 0.833 0.667 0.806

# > 0.611 0.583 0.694 0.389 0.417 0.583 0.611 0.944 0.944 0.472

# > 0.722 0.361 0.944 0.556 0.667 0.806 0.528 0.5 0.583 0.806

# > 0.861 1. 0.583 0.556 0.5 0.944 0.556 0.583 0.472 0.722

# > 0.667 0.722 0.417 0.694 0.667 0.667 0.556 0.611 0.528 0.444]
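The same min-max scaling can be checked on a handful of numbers without downloading the dataset. A small sketch, where the array x is a hypothetical sample standing in for sepallength:

```python
import numpy as np

# Hypothetical sample standing in for the sepallength column
x = np.array([4.3, 5.1, 5.8, 6.4, 7.9])

# Subtract the minimum so the smallest value maps to 0,
# then divide by the range so the largest value maps to 1
x_min, x_max = x.min(), x.max()
scaled = (x - x_min) / (x_max - x_min)

print(scaled)
```

The ordering of the values is preserved; only their scale changes.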

# 30. How to compute the softmax score?

**Difficulty level:** L3

**Question:** Compute the softmax score of sepallength.

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

sepallength = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0])

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

sepallength = np.array([float(row[0]) for row in iris])

# Solution

def softmax(x):
    """Compute softmax values for each set of scores in x.

    https://stackoverflow.com/questions/34968722/how-to-implement-the-softmax-function-in-python
    """
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

print(softmax(sepallength))

# > [ 0.002 0.002 0.001 0.001 0.002 0.003 0.001 0.002 0.001 0.002

# > 0.003 0.002 0.002 0.001 0.004 0.004 0.003 0.002 0.004 0.002

# > 0.003 0.002 0.001 0.002 0.002 0.002 0.002 0.002 0.002 0.001

# > 0.002 0.003 0.002 0.003 0.002 0.002 0.003 0.002 0.001 0.002

# > 0.002 0.001 0.001 0.002 0.002 0.002 0.002 0.001 0.003 0.002

# > 0.015 0.008 0.013 0.003 0.009 0.004 0.007 0.002 0.01 0.002

# > 0.002 0.005 0.005 0.006 0.004 0.011 0.004 0.004 0.007 0.004

# > 0.005 0.006 0.007 0.006 0.008 0.01 0.012 0.011 0.005 0.004

# > 0.003 0.003 0.004 0.005 0.003 0.005 0.011 0.007 0.004 0.003

# > 0.003 0.006 0.004 0.002 0.004 0.004 0.004 0.007 0.002 0.004

# > 0.007 0.004 0.016 0.007 0.009 0.027 0.002 0.02 0.011 0.018

# > 0.009 0.008 0.012 0.004 0.004 0.008 0.009 0.03 0.03 0.005

# > 0.013 0.004 0.03 0.007 0.011 0.018 0.007 0.006 0.008 0.018

# > 0.022 0.037 0.008 0.007 0.006 0.03 0.007 0.008 0.005 0.013

# > 0.011 0.013 0.004 0.012 0.011 0.011 0.007 0.009 0.007 0.005]
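Two properties of softmax are worth verifying on a tiny input: the scores sum to 1, and adding a constant to every input leaves them unchanged, which is why subtracting the maximum is safe. A minimal sketch with illustrative inputs:

```python
import numpy as np

def softmax(x):
    """Softmax of a 1D array, shifted by the maximum for numerical stability."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

scores = np.array([1.0, 2.0, 3.0])
probs = softmax(scores)

# Without the shift, np.exp(1003.0) would overflow to inf;
# with it, the result matches the unshifted input
shifted = softmax(scores + 1000.0)

print(probs, probs.sum())
```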

# 31. How to find the percentile scores of a NumPy array?

**Difficulty level:** L1

**Question:** Find the 5th and 95th percentiles of iris's sepallength.

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

sepallength = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0])

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

sepallength = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0])

# Solution

np.percentile(sepallength, q=[5, 95])

# > array([ 4.6 , 7.255])

# 32. How to insert values at random positions in an array?

**Difficulty level:** L2

**Question:** Insert np.nan values at 20 random positions in the iris_2d dataset.

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='object')

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='object')

# Method 1

i, j = np.where(iris_2d)

# i, j contain the row numbers and column numbers of all 750 elements of iris_2d (150 rows x 5 columns)

np.random.seed(100)

iris_2d[np.random.choice((i), 20), np.random.choice((j), 20)] = np.nan

# Method 2

np.random.seed(100)

iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan

# Print first 10 rows

print(iris_2d[:10])

# > [[b'5.1' b'3.5' b'1.4' b'0.2' b'Iris-setosa']

# > [b'4.9' b'3.0' b'1.4' b'0.2' b'Iris-setosa']

# > [b'4.7' b'3.2' b'1.3' b'0.2' b'Iris-setosa']

# > [b'4.6' b'3.1' b'1.5' b'0.2' b'Iris-setosa']

# > [b'5.0' b'3.6' b'1.4' b'0.2' b'Iris-setosa']

# > [b'5.4' b'3.9' b'1.7' b'0.4' b'Iris-setosa']

# > [b'4.6' b'3.4' b'1.4' b'0.3' b'Iris-setosa']

# > [b'5.0' b'3.4' b'1.5' b'0.2' b'Iris-setosa']

# > [b'4.4' nan b'1.4' b'0.2' b'Iris-setosa']

# > [b'4.9' b'3.1' b'1.5' b'0.1' b'Iris-setosa']]

# 33. How to find the position of missing values in a NumPy array?

**Difficulty level:** L2

**Question:** Find the number and position of missing values in iris_2d's sepallength (1st column).

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan

# Solution

print("Number of missing values: \n", np.isnan(iris_2d[:, 0]).sum())

print("Position of missing values: \n", np.where(np.isnan(iris_2d[:, 0])))

# > Number of missing values:

# > 5

# > Position of missing values:

# > (array([ 39, 88, 99, 130, 147]),)

# 34. How to filter a NumPy array based on two or more conditions?

**Difficulty level:** L3

**Question:** Filter the rows of iris_2d that have petallength (3rd column) > 1.5 and sepallength (1st column) < 5.0.

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

# Solution

condition = (iris_2d[:, 2] > 1.5) & (iris_2d[:, 0] < 5.0)

iris_2d[condition]

# > array([[ 4.8, 3.4, 1.6, 0.2],

# > [ 4.8, 3.4, 1.9, 0.2],

# > [ 4.7, 3.2, 1.6, 0.2],

# > [ 4.8, 3.1, 1.6, 0.2],

# > [ 4.9, 2.4, 3.3, 1. ],

# > [ 4.9, 2.5, 4.5, 1.7]])

# 35. How to drop rows that contain a missing value from a NumPy array?

**Difficulty level:** L3

**Question:** Select the rows of iris_2d that do not have any nan value.

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan

# Solution

# No direct numpy function for this.

# Method 1:

no_nan_in_row = np.array([~np.any(np.isnan(row)) for row in iris_2d])

iris_2d[no_nan_in_row][:5]

# Method 2: (By Rong)

iris_2d[np.sum(np.isnan(iris_2d), axis = 1) == 0][:5]

# > array([[ 4.9, 3. , 1.4, 0.2],

# > [ 4.7, 3.2, 1.3, 0.2],

# > [ 4.6, 3.1, 1.5, 0.2],

# > [ 5. , 3.6, 1.4, 0.2],

# > [ 5.4, 3.9, 1.7, 0.4]])
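The row filter can also be written without a Python loop by reducing the nan mask along axis 1. A minimal sketch on a small hypothetical array standing in for iris_2d:

```python
import numpy as np

# Small hypothetical array standing in for iris_2d
arr = np.array([[5.1, 3.5, 1.4, 0.2],
                [4.9, np.nan, 1.4, 0.2],
                [4.7, 3.2, 1.3, 0.2],
                [np.nan, 3.1, 1.5, np.nan]])

# True for every row that contains at least one nan
has_nan = np.isnan(arr).any(axis=1)

# Invert the mask to keep only the complete rows
complete_rows = arr[~has_nan]
print(complete_rows)
```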

# 36. How to find the correlation between two columns of a NumPy array?

**Difficulty level:** L2

**Question:** Find the correlation between SepalLength (1st column) and PetalLength (3rd column) in iris_2d.

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

# Solution 1

np.corrcoef(iris[:, 0], iris[:, 2])[0, 1]

# Solution 2

from scipy.stats import pearsonr

corr, p_value = pearsonr(iris[:, 0], iris[:, 2])

print(corr)

# Correlation coef indicates the degree of linear relationship between two numeric variables.

# It can range between -1 to +1.

# The p-value roughly indicates the probability of an uncorrelated system producing

# datasets that have a correlation at least as extreme as the one computed.

# The lower the p-value (<0.01), stronger is the significance of the relationship.

# It is not an indicator of the strength.

# > 0.871754157305

# 37. How to find whether a given array has any null values?

**Difficulty level:** L2

**Question:** Find out whether iris_2d has any missing values.

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

np.isnan(iris_2d).any()

# > False

# 38. How to replace all missing values with 0 in a NumPy array?

**Difficulty level:** L2

**Question:** Replace all occurrences of nan with 0 in the NumPy array.

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

iris_2d[np.random.randint(150, size=20), np.random.randint(4, size=20)] = np.nan

# Solution

iris_2d[np.isnan(iris_2d)] = 0

iris_2d[:4]

# > array([[ 5.1, 3.5, 1.4, 0. ],

# > [ 4.9, 3. , 1.4, 0.2],

# > [ 4.7, 3.2, 1.3, 0.2],

# > [ 4.6, 3.1, 1.5, 0.2]])

# 39. How to find the count of unique values in a NumPy array?

**Difficulty level:** L2

**Question:** Find the unique values and the count of unique values in iris's species.

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

Answer:

# Import iris keeping the text column intact

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

# Solution

# Extract the species column as an array

species = np.array([row.tolist()[4] for row in iris])

# Get the unique values and the counts

np.unique(species, return_counts=True)

# > (array([b'Iris-setosa', b'Iris-versicolor', b'Iris-virginica'],

# > dtype='|S15'), array([50, 50, 50]))

# 40. How to convert a numeric array to a categorical (text) array?

**Difficulty level:** L2

**Question:** Bin the petal length (3rd column) of iris_2d to form a text array, such that if the petal length is:

Less than 3 --> 'small'

3-5 --> 'medium'

>=5 --> 'large'

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

# Bin petallength

petal_length_bin = np.digitize(iris[:, 2].astype('float'), [0, 3, 5, 10])

# Map it to respective category

label_map = {1: 'small', 2: 'medium', 3: 'large', 4: np.nan}

petal_length_cat = [label_map[x] for x in petal_length_bin]

# View

petal_length_cat[:4]

# > ['small', 'small', 'small', 'small']
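How np.digitize treats values that sit exactly on a bin edge decides whether 3.0 counts as 'small' or 'medium'. A minimal sketch with hypothetical petal lengths, using the same edges as the answer:

```python
import numpy as np

# Hypothetical petal lengths, including values exactly on a bin edge
lengths = np.array([1.4, 3.0, 4.9, 5.0, 6.7])

# With the default right=False, bin i covers edges[i-1] <= x < edges[i]
edges = [0, 3, 5, 10]
bins = np.digitize(lengths, edges)

label_map = {1: 'small', 2: 'medium', 3: 'large'}
labels = [label_map[b] for b in bins]
print(bins, labels)
```

Because the left edge is inclusive, 3.0 lands in 'medium' and 5.0 in 'large', which matches the binning rule in the question.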

# 41. How to create a new column from existing columns of a NumPy array?

**Difficulty level:** L2

**Question:** Create a new column for volume in iris_2d, where volume is (pi x petallength x sepal_length^2)/3.

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris_2d = np.genfromtxt(url, delimiter=',', dtype='object')

# Solution

# Compute volume

sepallength = iris_2d[:, 0].astype('float')

petallength = iris_2d[:, 2].astype('float')

volume = (np.pi * petallength * (sepallength**2))/3

# Introduce new dimension to match iris_2d's

volume = volume[:, np.newaxis]

# Add the new column

out = np.hstack([iris_2d, volume])

# View

out[:4]

# > array([[b'5.1', b'3.5', b'1.4', b'0.2', b'Iris-setosa', 38.13265162927291],

# > [b'4.9', b'3.0', b'1.4', b'0.2', b'Iris-setosa', 35.200498485922445],

# > [b'4.7', b'3.2', b'1.3', b'0.2', b'Iris-setosa', 30.0723720777127],

# > [b'4.6', b'3.1', b'1.5', b'0.2', b'Iris-setosa', 33.238050274980004]], dtype=object)

# 42. How to do probabilistic sampling in NumPy?

**Difficulty level:** L3

**Question:** Randomly sample iris's species such that setosa is twice as frequent as versicolor and virginica.

Given:

# Import iris keeping the text column intact

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

Answer:

# Import iris keeping the text column intact

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

# Solution

# Get the species column

species = iris[:, 4]

# Approach 1: Generate Probabilistically

np.random.seed(100)

a = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])

species_out = np.random.choice(a, 150, p=[0.5, 0.25, 0.25])

# Approach 2: Probabilistic Sampling (preferred)

np.random.seed(100)

probs = np.r_[np.linspace(0, 0.500, num=50), np.linspace(0.501, .750, num=50), np.linspace(.751, 1.0, num=50)]

index = np.searchsorted(probs, np.random.random(150))

species_out = species[index]

print(np.unique(species_out, return_counts=True))

# > (array([b'Iris-setosa', b'Iris-versicolor', b'Iris-virginica'], dtype=object), array([77, 37, 36]))

Approach 2 is preferred because it creates an index variable that can be used to sample 2D tabular data.
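The index-based idea can be shown in a smaller form. The sketch below is a variation rather than the method above: it uses cumulative class probabilities instead of the linspace grid, and np.random.default_rng instead of np.random.seed. The names classes, cum_p and draws are illustrative.

```python
import numpy as np

rng = np.random.default_rng(100)

classes = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
p = np.array([0.5, 0.25, 0.25])

# Cumulative probabilities mark the right edge of each class's interval in [0, 1)
cum_p = np.cumsum(p)

# Each uniform draw falls into exactly one interval; searchsorted finds which
draws = rng.random(10_000)
index = np.searchsorted(cum_p, draws, side='right')

sample = classes[index]
values, counts = np.unique(sample, return_counts=True)
print(dict(zip(values, counts)))
```

The index array can be reused to pick whole rows from a 2D table, which is the advantage noted for Approach 2.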

# 43. How to get the second largest value of an array when grouped by another array?

**Difficulty level:** L2

**Question:** What is the value of the second longest petallength of the species setosa?

Given:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

Answer:

# Import iris keeping the text column intact

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

# Solution

# Get the species and petal length columns

petal_len_setosa = iris[iris[:, 4] == b'Iris-setosa', [2]].astype('float')

# Get the second largest distinct value

np.unique(np.sort(petal_len_setosa))[-2]

# > 1.7

# 44. How to sort a 2D array by a column?

**Difficulty level:** L2

**Question:** Sort the iris dataset based on the sepallength column.

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

Answer:

# Sort by column position 0: SepalLength

print(iris[iris[:,0].argsort()][:20])

# > [[b'4.3' b'3.0' b'1.1' b'0.1' b'Iris-setosa']

# > [b'4.4' b'3.2' b'1.3' b'0.2' b'Iris-setosa']

# > [b'4.4' b'3.0' b'1.3' b'0.2' b'Iris-setosa']

# > [b'4.4' b'2.9' b'1.4' b'0.2' b'Iris-setosa']

# > [b'4.5' b'2.3' b'1.3' b'0.3' b'Iris-setosa']

# > [b'4.6' b'3.6' b'1.0' b'0.2' b'Iris-setosa']

# > [b'4.6' b'3.1' b'1.5' b'0.2' b'Iris-setosa']

# > [b'4.6' b'3.4' b'1.4' b'0.3' b'Iris-setosa']

# > [b'4.6' b'3.2' b'1.4' b'0.2' b'Iris-setosa']

# > [b'4.7' b'3.2' b'1.3' b'0.2' b'Iris-setosa']

# > [b'4.7' b'3.2' b'1.6' b'0.2' b'Iris-setosa']

# > [b'4.8' b'3.0' b'1.4' b'0.1' b'Iris-setosa']

# > [b'4.8' b'3.0' b'1.4' b'0.3' b'Iris-setosa']

# > [b'4.8' b'3.4' b'1.9' b'0.2' b'Iris-setosa']

# > [b'4.8' b'3.4' b'1.6' b'0.2' b'Iris-setosa']

# > [b'4.8' b'3.1' b'1.6' b'0.2' b'Iris-setosa']

# > [b'4.9' b'2.4' b'3.3' b'1.0' b'Iris-versicolor']

# > [b'4.9' b'2.5' b'4.5' b'1.7' b'Iris-virginica']

# > [b'4.9' b'3.1' b'1.5' b'0.1' b'Iris-setosa']

# > [b'4.9' b'3.1' b'1.5' b'0.1' b'Iris-setosa']]

# 45. How to find the most frequent value in a NumPy array?

**Difficulty level:** L1

**Question:** Find the most frequent value of petal length (3rd column) in the iris dataset.

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

# Solution:

vals, counts = np.unique(iris[:, 2], return_counts=True)

print(vals[np.argmax(counts)])

# > b'1.5'

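# 46. How to find the position of the first occurrence of a value greater than a given value?

**Difficulty level:** L2

**Question:** Find the position of the first occurrence of a value greater than 1.0 in the petalwidth column (4th column) of the iris dataset.

Given:

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

Answer:

# Input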

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

# Solution: (edit: changed argmax to argwhere. Thanks Rong!)

np.argwhere(iris[:, 3].astype(float) > 1.0)[0]

# > array([50])

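# 47. How to replace all values greater than a given value with a given cutoff?

**Difficulty level:** L2

**Question:** In the array a, replace all values greater than 30 with 30 and all values less than 10 with 10.

Given:

np.random.seed(100)

a = np.random.uniform(1,50, 20)

Answer: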

# Input

np.set_printoptions(precision=2)

np.random.seed(100)

a = np.random.uniform(1,50, 20)

# Solution 1: Using np.clip

np.clip(a, a_min=10, a_max=30)

# Solution 2: Using np.where

print(np.where(a < 10, 10, np.where(a > 30, 30, a)))

# > [ 27.63 14.64 21.8 30. 10. 10. 30. 30. 10. 29.18 30.

# > 11.25 10.08 10. 11.77 30. 30. 10. 30. 14.43]

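# 48. How to get the positions of the top n values from a NumPy array?

**Difficulty level:** L2

**Question:** Get the positions of the top 5 maximum values in the given array a.

Given:

np.random.seed(100)

a = np.random.uniform(1,50, 20)

Answer: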

# Input

np.random.seed(100)

a = np.random.uniform(1,50, 20)

# Solution:

print(a.argsort()[-5:])

# > [18 7 3 10 15]

# Solution 2:

np.argpartition(-a, 5)[:5]

# > [15 10 3 7 18]

# Below methods will get you the values.

# Method 1:

a[a.argsort()][-5:]

# Method 2:

np.sort(a)[-5:]

# Method 3:

np.partition(a, kth=-5)[-5:]

# Method 4:

a[np.argpartition(-a, 5)][:5]
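np.argpartition returns the top n positions in no guaranteed order. If the positions are needed from largest to smallest, only those n items need sorting. A minimal sketch on a small hypothetical array, where the names top_unsorted and top_sorted are illustrative:

```python
import numpy as np

a = np.array([12.0, 45.5, 3.2, 30.1, 49.9, 7.7, 41.0, 28.4])
n = 3

# argpartition places the n largest items in the last n slots, in no particular order
top_unsorted = np.argpartition(a, -n)[-n:]

# Sorting just those n positions by value gives them largest first
top_sorted = top_unsorted[np.argsort(a[top_unsorted])[::-1]]

print(top_sorted, a[top_sorted])
```

For large arrays this avoids a full sort, since only n items are ordered after the partition step.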

# 49. How to compute the row-wise counts of all possible values in an array?

**Difficulty level:** L4

**Question:** Compute the counts of unique values row-wise.

Given:

np.random.seed(100)

arr = np.random.randint(1,11,size=(6, 10))

arr

# > array([[ 9, 9, 4, 8, 8, 1, 5, 3, 6, 3],

# > [ 3, 3, 2, 1, 9, 5, 1, 10, 7, 3],

# > [ 5, 2, 6, 4, 5, 5, 4, 8, 2, 2],

# > [ 8, 8, 1, 3, 10, 10, 4, 3, 6, 9],

# > [ 2, 1, 8, 7, 3, 1, 9, 3, 6, 2],

# > [ 9, 2, 6, 5, 3, 9, 4, 6, 1, 10]])

Expected output:

# > [[1, 0, 2, 1, 1, 1, 0, 2, 2, 0],

# > [2, 1, 3, 0, 1, 0, 1, 0, 1, 1],

# > [0, 3, 0, 2, 3, 1, 0, 1, 0, 0],

# > [1, 0, 2, 1, 0, 1, 0, 2, 1, 2],

# > [2, 2, 2, 0, 0, 1, 1, 1, 1, 0],

# > [1, 1, 1, 1, 1, 2, 0, 0, 2, 1]]

The output contains 10 columns representing the numbers from 1 to 10. The values are the counts of those numbers in the respective rows.

For example, cell (0, 2) has the value 2, which means the number 3 occurs exactly 2 times in the first row.

Answer:

# Input

np.random.seed(100)

arr = np.random.randint(1,11,size=(6, 10))

arr

# > array([[ 9, 9, 4, 8, 8, 1, 5, 3, 6, 3],

# > [ 3, 3, 2, 1, 9, 5, 1, 10, 7, 3],

# > [ 5, 2, 6, 4, 5, 5, 4, 8, 2, 2],

# > [ 8, 8, 1, 3, 10, 10, 4, 3, 6, 9],

# > [ 2, 1, 8, 7, 3, 1, 9, 3, 6, 2],

# > [ 9, 2, 6, 5, 3, 9, 4, 6, 1, 10]])

# Solution

def counts_of_all_values_rowwise(arr2d):
    # Unique values and their counts, row by row
    num_counts_array = [np.unique(row, return_counts=True) for row in arr2d]
    # Count of every possible value in each row (0 when the value is absent)
    return [[int(b[a == i][0]) if i in a else 0 for i in np.unique(arr2d)] for a, b in num_counts_array]

# Print

print(np.arange(1,11))

counts_of_all_values_rowwise(arr)

# > [ 1 2 3 4 5 6 7 8 9 10]

# > [[1, 0, 2, 1, 1, 1, 0, 2, 2, 0],

# > [2, 1, 3, 0, 1, 0, 1, 0, 1, 1],

# > [0, 3, 0, 2, 3, 1, 0, 1, 0, 0],

# > [1, 0, 2, 1, 0, 1, 0, 2, 1, 2],

# > [2, 2, 2, 0, 0, 1, 1, 1, 1, 0],

# > [1, 1, 1, 1, 1, 2, 0, 0, 2, 1]]

# Example 2:

arr = np.array([np.array(list('bill clinton')), np.array(list('narendramodi')), np.array(list('jjayalalitha'))])

print(np.unique(arr))

counts_of_all_values_rowwise(arr)

# > [' ' 'a' 'b' 'c' 'd' 'e' 'h' 'i' 'j' 'l' 'm' 'n' 'o' 'r' 't' 'y']

# > [[1, 0, 1, 1, 0, 0, 0, 2, 0, 3, 0, 2, 1, 0, 1, 0],

# > [0, 2, 0, 0, 2, 1, 0, 1, 0, 0, 1, 2, 1, 2, 0, 0],

# > [0, 4, 0, 0, 0, 0, 1, 1, 2, 2, 0, 0, 0, 0, 1, 1]]
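The same counts can be obtained without nested list comprehensions by comparing every element against every candidate value with broadcasting. This is an alternative sketch, not the solution above; the names values, matches and counts are illustrative:

```python
import numpy as np

np.random.seed(100)
arr = np.random.randint(1, 11, size=(6, 10))

# Candidate values to count, here every distinct value in the array
values = np.unique(arr)

# Compare every element with every candidate value:
# shape (rows, cols, 1) against shape (n_values,) broadcasts to (rows, cols, n_values)
matches = arr[:, :, None] == values

# Summing over the column axis counts the matches per row and per value
counts = matches.sum(axis=1)
print(counts)
```

The intermediate boolean array has rows x cols x n_values entries, so this trades memory for the absence of Python-level loops.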

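# 50. How to convert an array of arrays into a flat 1D array?

**Difficulty level:** L2

**Question:** Convert array_of_arrays into a flat linear 1D array.

Given:

# Input

arr1 = np.arange(3)

arr2 = np.arange(3,7)

arr3 = np.arange(7,10)

# dtype=object is required by recent NumPy versions for arrays of unequal length

array_of_arrays = np.array([arr1, arr2, arr3], dtype=object)

array_of_arrays

# > array([array([0, 1, 2]), array([3, 4, 5, 6]), array([7, 8, 9])], dtype=object)

Expected output:

# > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Answer:

# Input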

arr1 = np.arange(3)

arr2 = np.arange(3,7)

arr3 = np.arange(7,10)

array_of_arrays = np.array([arr1, arr2, arr3])

print('array_of_arrays: ', array_of_arrays)

# Solution 1

arr_2d = np.array([a for arr in array_of_arrays for a in arr])

# Solution 2:

arr_2d = np.concatenate(array_of_arrays)

print(arr_2d)

# > array_of_arrays: [array([0, 1, 2]) array([3, 4, 5, 6]) array([7, 8, 9])]

# > [0 1 2 3 4 5 6 7 8 9]

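# 51. How to generate one-hot encodings for an array in numpy?

**Difficulty Level:** L4

**Q:** Compute the one-hot encodings (dummy binary variables for each unique value in the array).

Input: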

np.random.seed(101)

arr = np.random.randint(1,4, size=6)

arr

# > array([2, 3, 2, 2, 2, 1])

Desired output:

# > array([[ 0., 1., 0.],

# > [ 0., 0., 1.],

# > [ 0., 1., 0.],

# > [ 0., 1., 0.],

# > [ 0., 1., 0.],

# > [ 1., 0., 0.]])

Answer:

# Input

np.random.seed(101)

arr = np.random.randint(1,4, size=6)

arr

# > array([2, 3, 2, 2, 2, 1])

# Solution:

def one_hot_encodings(arr):

    uniqs = np.unique(arr)

    out = np.zeros((arr.shape[0], uniqs.shape[0]))

    for i, k in enumerate(arr):

        out[i, k-1] = 1  # assumes the values are the integers 1..n

    return out

one_hot_encodings(arr)

# > array([[ 0., 1., 0.],

# > [ 0., 0., 1.],

# > [ 0., 1., 0.],

# > [ 0., 1., 0.],

# > [ 0., 1., 0.],

# > [ 1., 0., 0.]])

# Method 2:

(arr[:, None] == np.unique(arr)).view(np.int8)
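The loop-based solution indexes columns with `k-1`, so it relies on the values being the integers 1..n. A minimal sketch that drops this assumption, using `np.unique(..., return_inverse=True)` to map each value to a column and `np.eye` to build the rows, could look like this. The helper name `one_hot_any_labels` and the sample labels are hypothetical.

```python
import numpy as np

def one_hot_any_labels(arr):
    """One-hot encode a 1D array whose values need not be 1..n."""
    # codes[i] is the position of arr[i] within the sorted unique values
    uniqs, codes = np.unique(arr, return_inverse=True)
    # Row j of the identity matrix is the one-hot vector for unique value j
    return uniqs, np.eye(uniqs.size, dtype=int)[codes]

labels = np.array([20, 30, 20, 20, 20, 10])
uniqs, encoded = one_hot_any_labels(labels)
print(uniqs)
print(encoded)
```

The columns follow the sorted order of the unique values, which is the same column order Method 2 produces.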

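# 52. How to create row numbers grouped by a categorical variable?

**Difficulty Level:** L3

**Q:** Create row numbers grouped by a categorical variable. Use the following sample from the iris species as input.

Input: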

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

species = np.genfromtxt(url, delimiter=',', dtype='str', usecols=4)

species_small = np.sort(np.random.choice(species, size=20))

species_small

# > array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',

# > 'Iris-setosa', 'Iris-setosa', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-virginica', 'Iris-virginica',

# > 'Iris-virginica', 'Iris-virginica', 'Iris-virginica',

# > 'Iris-virginica', 'Iris-virginica', 'Iris-virginica'],

# > dtype='<U15')

Desired output:

# > [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6, 7]

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

species = np.genfromtxt(url, delimiter=',', dtype='str', usecols=4)

np.random.seed(100)

species_small = np.sort(np.random.choice(species, size=20))

species_small

# > array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',

# > 'Iris-setosa', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-virginica', 'Iris-virginica',

# > 'Iris-virginica', 'Iris-virginica', 'Iris-virginica',

# > 'Iris-virginica'],

# > dtype='<U15')

print([i for val in np.unique(species_small) for i, grp in enumerate(species_small[species_small==val])])

[0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5]
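The list comprehension above rescans the array once per group. When the input is already sorted, as `species_small` is, a loop-free sketch is possible: subtract each group's starting index from the running position. The sample array and the name `row_numbers` are hypothetical, and the approach is only valid for sorted (grouped) input.

```python
import numpy as np

# Hypothetical sorted sample standing in for species_small
species_sorted = np.array(['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c'])

# starts[j]: index where group j first appears; counts[j]: size of group j
_, starts, counts = np.unique(species_sorted, return_index=True, return_counts=True)

# Position within the whole array minus the start of the element's own group
row_numbers = np.arange(species_sorted.size) - np.repeat(starts, counts)
print(row_numbers.tolist())
```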

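# 53. How to create group IDs based on a given categorical variable?

**Difficulty Level:** L4

**Q:** Create group IDs based on a given categorical variable. Use the following sample from the iris species as input.

Input: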

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

species = np.genfromtxt(url, delimiter=',', dtype='str', usecols=4)

species_small = np.sort(np.random.choice(species, size=20))

species_small

# > array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',

# > 'Iris-setosa', 'Iris-setosa', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-virginica', 'Iris-virginica',

# > 'Iris-virginica', 'Iris-virginica', 'Iris-virginica',

# > 'Iris-virginica', 'Iris-virginica', 'Iris-virginica'],

# > dtype='<U15')

Desired output:

# > [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

species = np.genfromtxt(url, delimiter=',', dtype='str', usecols=4)

np.random.seed(100)

species_small = np.sort(np.random.choice(species, size=20))

species_small

# > array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',

# > 'Iris-setosa', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',

# > 'Iris-versicolor', 'Iris-virginica', 'Iris-virginica',

# > 'Iris-virginica', 'Iris-virginica', 'Iris-virginica',

# > 'Iris-virginica'],

# > dtype='<U15')

# Solution:

output = [np.argwhere(np.unique(species_small) == s).tolist()[0][0] for val in np.unique(species_small) for s in species_small[species_small==val]]

# Solution: For Loop version

output = []

uniqs = np.unique(species_small)

for val in uniqs:  # unique values in group

    for s in species_small[species_small==val]:  # each element in group

        groupid = np.argwhere(uniqs == s).tolist()[0][0]  # groupid

        output.append(groupid)

print(output)

# > [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
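Both solutions above look up each element's position among the unique values one at a time. As a shorter sketch, `np.unique` can return that position directly through its `return_inverse` argument. The five-element sample below is hypothetical and stands in for `species_small`.

```python
import numpy as np

# Hypothetical small sample standing in for species_small
species_sample = np.array(['Iris-setosa', 'Iris-setosa', 'Iris-versicolor',
                           'Iris-virginica', 'Iris-virginica'])

# group_ids[i] is the index of species_sample[i] in the sorted unique values
uniqs, group_ids = np.unique(species_sample, return_inverse=True)
print(group_ids.tolist())
```

Unlike the loop version, this keeps the IDs in the original element order, so it also works when the input is not sorted.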

# 54. How to rank the items in an array using numpy?

**Difficulty Level:** L2

**Q:** Create the ranks for the given numeric array a.

Input:

np.random.seed(10)

a = np.random.randint(20, size=10)

print(a)

# > [ 9 4 15 0 17 16 17 8 9 0]

Desired output:

# > [4 2 6 0 8 7 9 3 5 1]

Answer:

np.random.seed(10)

a = np.random.randint(20, size=10)

print('Array: ', a)

# Solution

print(a.argsort().argsort())

print('Array: ', a)

# > Array: [ 9 4 15 0 17 16 17 8 9 0]

# > [4 2 6 0 8 7 9 3 5 1]

# > Array: [ 9 4 15 0 17 16 17 8 9 0]
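The double `argsort` works because the second sort inverts the permutation produced by the first. A sketch that makes this explicit, and that pins down how ties are ranked by requesting a stable sort (an addition not present in the answer above), might be:

```python
import numpy as np

a = np.array([9, 4, 15, 0, 17, 16, 17, 8, 9, 0])

# Indices that sort a; kind='stable' ranks equal values in input order
order = np.argsort(a, kind='stable')

# Invert the permutation: the element at position order[r] receives rank r
ranks = np.empty_like(order)
ranks[order] = np.arange(a.size)
print(ranks)
```

Inverting by assignment avoids the second sort, which may matter for large arrays.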

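# 55. How to rank the items in a multidimensional array using numpy?

**Difficulty Level:** L3

**Q:** Create a rank array of the same shape as the given numeric array a.

Input: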

np.random.seed(10)

a = np.random.randint(20, size=[2,5])

print(a)

# > [[ 9 4 15 0 17]

# > [16 17 8 9 0]]

Desired output:

# > [[4 2 6 0 8]

# > [7 9 3 5 1]]

Answer:

# Input

np.random.seed(10)

a = np.random.randint(20, size=[2,5])

print(a)

# Solution

print(a.ravel().argsort().argsort().reshape(a.shape))

# > [[ 9 4 15 0 17]

# > [16 17 8 9 0]]

# > [[4 2 6 0 8]

# > [7 9 3 5 1]]

# 56. How to find the maximum value in each row of a 2D numpy array?

**Difficulty Level:** L2

**Q:** Compute the maximum of each row in the given array.

Input:

np.random.seed(100)

a = np.random.randint(1,10, [5,3])

# > array([[9, 9, 4],

# > [8, 8, 1],

# > [5, 3, 6],

# > [3, 3, 3],

# > [2, 1, 9]])

Answer:

# Input

np.random.seed(100)

a = np.random.randint(1,10, [5,3])

# Solution 1

np.amax(a, axis=1)

# Solution 2

np.apply_along_axis(np.max, arr=a, axis=1)

# > array([9, 8, 6, 3, 9])

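# 57. How to compute the min-by-max ratio for each row of a 2D numpy array?

**Difficulty Level:** L3

**Q:** Compute the ratio of the minimum to the maximum of each row for the given 2D numpy array.

Input: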

np.random.seed(100)

a = np.random.randint(1,10, [5,3])

# > array([[9, 9, 4],

# > [8, 8, 1],

# > [5, 3, 6],

# > [3, 3, 3],

# > [2, 1, 9]])

Answer:

# Input

np.random.seed(100)

a = np.random.randint(1,10, [5,3])

# Solution

np.apply_along_axis(lambda x: np.min(x)/np.max(x), arr=a, axis=1)

# > array([ 0.44444444, 0.125 , 0.5 , 1. , 0.11111111])
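`np.apply_along_axis` calls the lambda once per row. As a sketch of a fully vectorized alternative, the same ratio can be computed from two axis reductions. The array literal below repeats the values shown for `a` above, and `ratio` is a hypothetical name.

```python
import numpy as np

a = np.array([[9, 9, 4],
              [8, 8, 1],
              [5, 3, 6],
              [3, 3, 3],
              [2, 1, 9]])

# Row-wise minimum divided by row-wise maximum, no Python-level loop
ratio = a.min(axis=1) / a.max(axis=1)
print(ratio)
```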

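# 58. How to find the duplicate records in a numpy array?

**Difficulty Level:** L3

**Q:** Find the duplicate entries (second occurrence onwards) in the given numpy array and mark them as True. First occurrences should be False.

Input: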

# Input

np.random.seed(100)

a = np.random.randint(0, 5, 10)

print('Array: ', a)

# > Array: [0 0 3 0 2 4 2 2 2 2]

Desired output:

# > [False True False True False False True True True True]

Answer:

# Input

np.random.seed(100)

a = np.random.randint(0, 5, 10)

## Solution

# There is no direct function to do this as of 1.13.3

# Create an all True array

out = np.full(a.shape[0], True)

# Find the index positions of unique elements

unique_positions = np.unique(a, return_index=True)[1]

# Mark those positions as False

out[unique_positions] = False

print(out)

# > [False True False True False False True True True True]

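# 59. How to find the grouped mean in numpy?

**Difficulty Level:** L3

**Q:** Find the mean of a numeric column grouped by a categorical column in a 2D numpy array.

Input: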

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

Desired output:

# > [[b'Iris-setosa', 3.418],

# > [b'Iris-versicolor', 2.770],

# > [b'Iris-virginica', 2.974]]

Answer:

# Input

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

iris = np.genfromtxt(url, delimiter=',', dtype='object')

names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')

# Solution

# No direct way to implement this. Just a version of a workaround.

numeric_column = iris[:, 1].astype('float') # sepalwidth

grouping_column = iris[:, 4] # species

# List comprehension version

[[group_val, numeric_column[grouping_column==group_val].mean()] for group_val in np.unique(grouping_column)]

# For Loop version

output = []

for group_val in np.unique(grouping_column):

output.append([group_val, numeric_column[grouping_column==group_val].mean()])

output

# > [[b'Iris-setosa', 3.418],

# > [b'Iris-versicolor', 2.770],

# > [b'Iris-virginica', 2.974]]
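The workaround above builds one boolean mask per group. A sketch of a mask-free variant, assuming the group labels can be integer-coded with `np.unique(..., return_inverse=True)`, uses `np.bincount` to accumulate sums and counts per group. The two small columns below are hypothetical and replace the iris download so the snippet runs offline.

```python
import numpy as np

groups = np.array(['a', 'b', 'a', 'c', 'b', 'a'])    # hypothetical categorical column
values = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 5.0])   # hypothetical numeric column

# codes[i] is the integer id of groups[i]
names, codes = np.unique(groups, return_inverse=True)

sums = np.bincount(codes, weights=values)   # per-group sum of values
counts = np.bincount(codes)                 # per-group number of rows
means = sums / counts
print(dict(zip(names.tolist(), means.tolist())))
```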

# 60. How to convert a PIL image to a numpy array?

**Difficulty Level:** L3

**Q:** Import the image from the following URL and convert it to a numpy array.

URL = 'https://upload.wikimedia.org/wikipedia/commons/8/8b/Denali_Mt_McKinley.jpg'

Answer:

from io import BytesIO

from PIL import Image

import requests

# Import image from URL

URL = 'https://upload.wikimedia.org/wikipedia/commons/8/8b/Denali_Mt_McKinley.jpg'

response = requests.get(URL)

# Read it as Image

I = Image.open(BytesIO(response.content))

# Optionally resize

I = I.resize((150, 150))

# Convert to numpy array

arr = np.asarray(I)

# Optionally convert it back to an image and show

im = Image.fromarray(np.uint8(arr))

im.show()

# 61. How to drop all missing values from a numpy array?

**Difficulty Level:** L2

**Q:** Drop all NaN values from a 1D numpy array.

Input:

np.array([1,2,3,np.nan,5,6,7,np.nan])

Desired output:

array([ 1., 2., 3., 5., 6., 7.])

Answer:

a = np.array([1,2,3,np.nan,5,6,7,np.nan])

a[~np.isnan(a)]

# > array([ 1., 2., 3., 5., 6., 7.])

# 62. How to compute the Euclidean distance between two arrays?

**Difficulty Level:** L3

**Q:** Compute the Euclidean distance between two arrays a and b.

Input:

a = np.array([1,2,3,4,5])

b = np.array([4,5,6,7,8])

Answer:

# Input

a = np.array([1,2,3,4,5])

b = np.array([4,5,6,7,8])

# Solution

dist = np.linalg.norm(a-b)

dist

# > 6.7082039324993694

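# 63. How to find all the local maxima (or peaks) in a 1D array?

**Difficulty Level:** L4

**Q:** Find all the peaks in a 1D numeric array a. A peak is a point surrounded by smaller values on both sides.

Input: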

a = np.array([1, 3, 7, 1, 2, 6, 0, 1])

Desired output:

# > array([2, 5])

where 2 and 5 are the positions of the peak values 7 and 6.

Answer:

a = np.array([1, 3, 7, 1, 2, 6, 0, 1])

doublediff = np.diff(np.sign(np.diff(a)))

peak_locations = np.where(doublediff == -2)[0] + 1

peak_locations

# > array([2, 5])
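The double-difference trick marks the places where the slope changes from positive to negative. A more literal sketch of the definition, comparing each interior point with its two neighbours, gives the same positions for this input. The names `interior` and `is_peak` are hypothetical.

```python
import numpy as np

a = np.array([1, 3, 7, 1, 2, 6, 0, 1])

interior = a[1:-1]                                   # points that have two neighbours
is_peak = (interior > a[:-2]) & (interior > a[2:])   # strictly larger than both sides
peak_locations = np.where(is_peak)[0] + 1            # +1 maps back to indices in a
print(peak_locations)
```

Like the original answer, this ignores the two end points and does not report flat-topped peaks where neighbouring values are equal.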

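# 64. How to subtract a 1D array from a 2D array, where each item of the 1D array is subtracted from the respective row?

**Difficulty Level:** L2

**Q:** Subtract the 1D array b_1d from the 2D array a_2d, such that each item of b_1d is subtracted from the respective row of a_2d.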

a_2d = np.array([[3,3,3],[4,4,4],[5,5,5]])

b_1d = np.array([1,2,3])

Desired output:

# > [[2 2 2]

# > [2 2 2]

# > [2 2 2]]

Answer:

# Input

a_2d = np.array([[3,3,3],[4,4,4],[5,5,5]])

b_1d = np.array([1,2,3])

# Solution

print(a_2d - b_1d[:,None])

# > [[2 2 2]

# > [2 2 2]

# > [2 2 2]]

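# 65. How to find the index of the n'th repetition of an item in an array?

**Difficulty Level:** L2

**Q:** Find the index of the 5th repetition of the number 1 in x.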

x = np.array([1, 2, 1, 1, 3, 4, 3, 1, 1, 2, 1, 1, 2])

Answer:

x = np.array([1, 2, 1, 1, 3, 4, 3, 1, 1, 2, 1, 1, 2])

n = 5

# Solution 1: List comprehension

[i for i, v in enumerate(x) if v == 1][n-1]

# Solution 2: Numpy version

np.where(x == 1)[0][n-1]

# > 8

# 66. How to convert numpy's datetime64 object to datetime's datetime object?

**Difficulty Level:** L2

**Q:** Convert numpy's datetime64 object to datetime's datetime object.

# Input: a numpy datetime64 object

dt64 = np.datetime64('2018-02-25 22:10:10')

Answer:

# Input: a numpy datetime64 object

dt64 = np.datetime64('2018-02-25 22:10:10')

# Solution

from datetime import datetime

dt64.tolist()

# or

dt64.astype(datetime)

# > datetime.datetime(2018, 2, 25, 22, 10, 10)

# 67. How to compute the moving average of a numpy array?

**Difficulty Level:** L3

**Q:** Compute the moving average with a window size of 3 for the given 1D array.

Input:

np.random.seed(100)

Z = np.random.randint(10, size=10)

Answer:

# Solution

# Source: https://stackoverflow.com/questions/14313510/how-to-calculate-moving-average-using-numpy

def moving_average(a, n=3):

    ret = np.cumsum(a, dtype=float)

    ret[n:] = ret[n:] - ret[:-n]

    return ret[n - 1:] / n

np.random.seed(100)

Z = np.random.randint(10, size=10)

print('array: ', Z)

# Method 1

moving_average(Z, n=3).round(2)

# Method 2: # Thanks AlanLRH!

# np.ones(3)/3 gives equal weights. Use np.ones(4)/4 for window size 4.

np.convolve(Z, np.ones(3)/3, mode='valid')

# > array: [8 8 3 7 7 0 4 2 5 2]

# > moving average: [ 6.33 6. 5.67 4.67 3.67 2. 3.67 3. ]
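A third option, sketched here under the assumption that NumPy 1.20 or later is installed, is to view the array as overlapping windows with `numpy.lib.stride_tricks.sliding_window_view` and average each window. The array literal repeats the values printed for `Z` above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

Z = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])

# Each row of windows is one length-3 window over Z; no data is copied
windows = sliding_window_view(Z, window_shape=3)
moving_avg = windows.mean(axis=1)
print(moving_avg.round(2))
```

This form generalizes easily to other window statistics (median, max) by swapping the reduction.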

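# 68. How to create a numpy array sequence given only the starting point, length and step?

**Difficulty Level:** L2

**Q:** Create a numpy array of length 10, starting from 5, with a step of 3 between consecutive numbers.

Answer: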

length = 10

start = 5

step = 3

def seq(start, length, step):

    end = start + (step*length)

    return np.arange(start, end, step)

seq(start, length, step)

# > array([ 5, 8, 11, 14, 17, 20, 23, 26, 29, 32])

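# 69. How to fill in missing dates in an irregular series of numpy dates?

**Difficulty Level:** L3

**Q:** Given an array of a non-continuous sequence of dates, make it a continuous sequence of dates by filling in the missing dates.

Input: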

# Input

dates = np.arange(np.datetime64('2018-02-01'), np.datetime64('2018-02-25'), 2)

print(dates)

# > ['2018-02-01' '2018-02-03' '2018-02-05' '2018-02-07' '2018-02-09'

# > '2018-02-11' '2018-02-13' '2018-02-15' '2018-02-17' '2018-02-19'

# > '2018-02-21' '2018-02-23']

Answer:

# Input

dates = np.arange(np.datetime64('2018-02-01'), np.datetime64('2018-02-25'), 2)

print(dates)

# Solution ---------------

filled_in = np.array([np.arange(date, (date+d)) for date, d in zip(dates, np.diff(dates))]).reshape(-1)

# add the last day

output = np.hstack([filled_in, dates[-1]])

output

# For loop version -------

out = []

for date, d in zip(dates, np.diff(dates)):

    out.append(np.arange(date, (date+d)))

filled_in = np.array(out).reshape(-1)

# add the last day

output = np.hstack([filled_in, dates[-1]])

output

# > ['2018-02-01' '2018-02-03' '2018-02-05' '2018-02-07' '2018-02-09'

# > '2018-02-11' '2018-02-13' '2018-02-15' '2018-02-17' '2018-02-19'

# > '2018-02-21' '2018-02-23']

# > array(['2018-02-01', '2018-02-02', '2018-02-03', '2018-02-04',

# > '2018-02-05', '2018-02-06', '2018-02-07', '2018-02-08',

# > '2018-02-09', '2018-02-10', '2018-02-11', '2018-02-12',

# > '2018-02-13', '2018-02-14', '2018-02-15', '2018-02-16',

# > '2018-02-17', '2018-02-18', '2018-02-19', '2018-02-20',

# > '2018-02-21', '2018-02-22', '2018-02-23'], dtype='datetime64[D]')
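The solution above stacks per-gap ranges into a 2D array before flattening, which relies on every gap having the same length. When the goal is simply every day between the first and last date, a shorter sketch is a single `np.arange` over the end points. It assumes the input dates are sorted, and `filled` is a hypothetical name.

```python
import numpy as np

dates = np.arange(np.datetime64('2018-02-01'), np.datetime64('2018-02-25'), 2)

# One-day steps from the first date up to and including the last date
filled = np.arange(dates[0], dates[-1] + np.timedelta64(1, 'D'))
print(filled)
```

Because it only looks at the two end points, this version also handles gaps of unequal length.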

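# 70. How to create strides from a given 1D array?

**Difficulty Level:** L4

**Q:** From the given 1D array arr, generate a 2D matrix using strides, with a window length of 4 and a stride of 2, like [[0,1,2,3], [2,3,4,5], [4,5,6,7]..]

Input: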

arr = np.arange(15)

arr

# > array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

Desired output:

# > [[ 0 1 2 3]

# > [ 2 3 4 5]

# > [ 4 5 6 7]

# > [ 6 7 8 9]

# > [ 8 9 10 11]

# > [10 11 12 13]]

Answer:

def gen_strides(a, stride_len=5, window_len=5):

    n_strides = ((a.size-window_len)//stride_len) + 1

    # return np.array([a[s:(s+window_len)] for s in np.arange(0, a.size, stride_len)[:n_strides]])

    return np.array([a[s:(s+window_len)] for s in np.arange(0, n_strides*stride_len, stride_len)])

print(gen_strides(np.arange(15), stride_len=2, window_len=4))

# > [[ 0 1 2 3]

# > [ 2 3 4 5]

# > [ 4 5 6 7]

# > [ 6 7 8 9]

# > [ 8 9 10 11]

# > [10 11 12 13]]
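`gen_strides` copies each window in a Python loop. As a sketch of a built-in alternative, assuming NumPy 1.20 or later, `sliding_window_view` produces every length-4 window as a view, and slicing with a step of 2 keeps every second one. The name `strided` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.arange(15)

# All 12 windows of length 4, then keep the windows starting at 0, 2, 4, ...
strided = sliding_window_view(arr, window_shape=4)[::2]
print(strided)
```

The result is a read-only view into `arr`, so call `.copy()` on it before modifying the values.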

To be continued...
