In NumPy, dimensions are called axes. An array's shape is a tuple of integers giving the size of the array along each axis.
By default, operations such as max and sum treat the array as though it were a flat list of numbers, regardless of its shape. By specifying the axis parameter, however, you can apply the operation along one particular axis of the array.
Example:
>>> import numpy as np
>>> a = np.arange(24).reshape(2,3,4)
>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
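The 3D array a can be viewed as 2 arrays of 3 rows and 4 columns each, placed one behind the other. Following the usual mathematical model, take "rows" as the x axis, "columns" as the y axis, and "the 2" as the z axis.
Take the maximum along the z axis (axis 0):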
>>> a.max(axis=0)
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
Comparing front to back, the result is the "back" array, which is correct.
Take the maximum along the x axis (axis 1):
>>> a.max(axis=1)
array([[ 8,  9, 10, 11],
       [20, 21, 22, 23]])
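Why is this the result? Shouldn't taking the maximum by row of a 3-row, 4-column array give this instead:
array([[ 3,  7, 11],
       [15, 19, 23]])
Why did we get the column-wise maximum? The results look swapped, don't they: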
>>> a.max(axis=2)
array([[ 3,  7, 11],
       [15, 19, 23]])
Are they really swapped? Does NumPy have a bug like that? Here is the correct explanation from the experts:
By definition, the axis number of the dimension is the index of that dimension within the array’s shape. It is also the position used to access that dimension during indexing.
For example, if a 2D array a has shape (5,6), then you can access a[0,0] up to a[4,5]. Axis 0 is thus the first dimension (the “rows”), and axis 1 is the second dimension (the “columns”). In higher dimensions, where “row” and “column” stop really making sense, try to think of the axes in terms of the shapes and indices involved.
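As a quick check of the shape/index relationship described above, here is a minimal sketch. The array name a2 and its contents are made up for illustration; only the shape (5, 6) comes from the text.

```python
import numpy as np

# Hypothetical 2D array with shape (5, 6), matching the example in the text.
a2 = np.arange(30).reshape(5, 6)

rows = a2.shape[0]   # size of axis 0 (the "rows")
cols = a2.shape[1]   # size of axis 1 (the "columns")
first = a2[0, 0]     # smallest valid index pair
last = a2[4, 5]      # largest valid index pair

# Reducing over an axis removes that entry from the shape.
shape_axis0 = a2.max(axis=0).shape   # axis 0 removed, the 6 columns remain
shape_axis1 = a2.max(axis=1).shape   # axis 1 removed, the 5 rows remain

print(rows, cols, first, last, shape_axis0, shape_axis1)
```

The axis number is simply the position in a2.shape, and it is also the position of the corresponding index inside the square brackets.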
If you do .max(axis=n), for example, then dimension n is collapsed and deleted, with each value in the new array equal to the max of the corresponding collapsed values. For example, if b has shape (5,6,7,8), and you do c = b.max(axis=2), then axis 2 (the dimension with size 7) is collapsed, and the result has shape (5,6,8). Furthermore, c[x,y,z] is equal to the max of all elements b[x,y,:,z].
In other words, if b is a NumPy array with shape (5, 6, 7, 8),
and c = b.max(axis=2),
then c has shape (5, 6, 8), because the "7" sits at axis=2 and is removed.
Moreover, c[x, y, z] = max(b[x, y, :, z]).
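The claim about b and c can be checked directly. The sketch below assumes random integer contents for b (the text only fixes its shape) and an arbitrary seed. It also shows keepdims=True, which is not mentioned in the text but is a standard option that keeps the collapsed axis with length 1.

```python
import numpy as np

rng = np.random.default_rng(0)               # seed chosen arbitrarily
b = rng.integers(0, 100, size=(5, 6, 7, 8))  # contents are illustrative only

c = b.max(axis=2)                            # collapse the size-7 axis
print(c.shape)

# Check c[x, y, z] == max(b[x, y, :, z]) for every index triple.
matches = all(
    c[x, y, z] == b[x, y, :, z].max()
    for x in range(5)
    for y in range(6)
    for z in range(8)
)
print(matches)

# keepdims=True keeps the collapsed axis as length 1 instead of deleting it.
kept_shape = b.max(axis=2, keepdims=True).shape
print(kept_shape)
```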
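If that answer still hasn't made it clear, here is an explanation from a Chinese blogger:
NumPy operates in a different direction depending on axis: if axis is not set, the operation covers all elements; with axis=0 it runs along the vertical axis; with axis=1 it runs along the horizontal axis. But that only describes a simple two-dimensional array. What about more dimensions? It can be summed up in one sentence: with axis=i, NumPy operates in the direction along which the i-th index varies. For instance, the 2D example just mentioned can be written as data = [[a00, a01], [a10, a11]], so with axis=0 the operation runs in the direction along which index 0 varies, that is a00->a10 and a01->a11, which is the vertical direction; axis=1 works the same way.
Back to the basic mathematical idea: what does "along the X axis" physically mean? It means the independent variable X keeps increasing. Now look again at the results of a.max(axis=1) and a.max(axis=0):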
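A small sketch of the 2D case, with the numbers 1 to 4 standing in for a00, a01, a10, a11 (the concrete values are an assumption made for illustration):

```python
import numpy as np

# Stand-in values for [[a00, a01], [a10, a11]].
data = np.array([[1, 2],
                 [3, 4]])

total = data.sum()          # no axis: every element is combined
down = data.sum(axis=0)     # index 0 varies: a00+a10, a01+a11
across = data.sum(axis=1)   # index 1 varies: a00+a01, a10+a11

print(total)    # 10
print(down)     # [4 6]
print(across)   # [3 7]
```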
>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> a.shape
(2, 3, 4)
>>> a.max(axis=1)
array([[ 8,  9, 10, 11],
       [20, 21, 22, 23]])
>>> a.max(axis=0)
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
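One way to see why these results are not "swapped" is to rebuild each reduction by hand from slices taken along the axis being collapsed. This is only an illustrative sketch, not how NumPy computes the result internally; the names max0, max1, and max2 are made up here.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# axis=0: compare the two (3, 4) blocks a[0] and a[1] element by element.
max0 = np.maximum(a[0], a[1])

# axis=1: index 1 varies, so compare the slices a[:, 0], a[:, 1], a[:, 2].
max1 = np.maximum(np.maximum(a[:, 0], a[:, 1]), a[:, 2])

# axis=2: index 2 varies, so take the max of each innermost length-4 row.
max2 = np.array([[row.max() for row in block] for block in a])

print(np.array_equal(max0, a.max(axis=0)))
print(np.array_equal(max1, a.max(axis=1)))
print(np.array_equal(max2, a.max(axis=2)))
```

In each case the values being compared differ only in the index at position axis, which is exactly the "direction along which the i-th index varies".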
Many NumPy functions take an axis argument; common ones include average, max, min, sum, sort, and prod.
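The sketch below applies several of these functions to a small made-up array m. Note that sort differs from the reductions: it reorders values along the given axis but leaves the shape unchanged.

```python
import numpy as np

# Small illustrative array; the values are arbitrary.
m = np.array([[3, 1, 2],
              [6, 5, 4]])

col_sum = m.sum(axis=0)            # [9, 6, 6]
row_prod = m.prod(axis=1)          # [6, 120]
col_min = m.min(axis=0)            # [3, 1, 2]
row_avg = np.average(m, axis=1)    # [2.0, 5.0]

# sort keeps the shape and orders values along the chosen axis.
sorted_rows = np.sort(m, axis=1)   # [[1, 2, 3], [4, 5, 6]]
sorted_cols = np.sort(m, axis=0)   # columns are already ascending: unchanged

print(col_sum, row_prod, col_min, row_avg)
print(sorted_rows)
print(sorted_cols)
```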
References:
1. http://blog.csdn.net/fangjian1204/article/details/53055219
2. http://stackoverflow.com/questions/17079279/how-is-axis-indexed-in-numpys-array/17079437#17079437
3. http://blog.csdn.net/vincent2610/article/details/53419297