NumPy: Commonly Used Functions

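1. numpy.savetxt('filename', array) writes an array to the file filename, and numpy.loadtxt('filename', delimiter=',', usecols=sequence, unpack=True/False) reads a file back. Here delimiter is the field separator (a comma, or whatever the file uses), usecols selects which columns to read, and unpack=True returns each column as a separate array. Both functions also work on CSV files, the format in which most data is stored.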
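As a small illustration of the round trip described above, here is a sketch that writes an array to a temporary CSV file and reads it back. The sample values, the file name demo.csv and the fmt string are assumptions made up for this example, not taken from the original data.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample data: two columns (price, volume), three rows.
data = np.array([[344.17, 344.4],
                 [345.17, 345.4],
                 [346.17, 346.4]])

# Write to a temporary CSV file; fmt controls how each number is printed.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
np.savetxt(path, data, delimiter=",", fmt="%.2f")

# Read the whole file back as a 2-D array.
loaded = np.loadtxt(path, delimiter=",")

# usecols picks columns; unpack=True returns one array per column.
price, volume = np.loadtxt(path, delimiter=",", usecols=(0, 1), unpack=True)

print(loaded.shape)
print(price, volume)
```

Without unpack=True the result is a single array with one row per line of the file; with it, the array is transposed so that it can be unpacked column by column.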

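2. numpy.average(arrayone, weights=arraytwo) returns the mean of arrayone weighted by arraytwo, while numpy.mean(array) returns the plain arithmetic mean of array. These can be used to compute, for example, a volume-weighted average price or a time-weighted average price.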

>>> c,v=numpy.loadtxt('apple.csv', delimiter=',', unpack=True)
>>> c
array([ 344.17,  345.17,  346.17,  347.17,  348.17,  349.17,  350.17,
        351.17,  352.17])
>>> v
array([ 344.4,  345.4,  346.4,  347.4,  348.4,  349.4,  350.4,  351.4,
        352.4])
>>> k=numpy.average(c,weights=v)
>>> k
348.18913509376193
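To see what the weighted mean actually computes, the following sketch compares numpy.average() with the formula sum(c*v)/sum(v) written out by hand. The prices and volumes are invented for illustration, and using numpy.arange() as "time" weights is just one simple assumption about how a time-weighted price might be defined.

```python
import numpy as np

# Hypothetical closing prices and traded volumes.
c = np.array([344.17, 345.17, 346.17])
v = np.array([100.0, 300.0, 600.0])

# Volume-weighted average price via numpy.average.
vwap = np.average(c, weights=v)

# The same quantity written out: sum(c_i * v_i) / sum(v_i).
manual = np.sum(c * v) / np.sum(v)

# A simple time-weighted price: later observations get larger weights.
t = np.arange(len(c))
twap = np.average(c, weights=t)

print(vwap, manual, twap)
```

Because the weights are normalized by their sum, only their relative sizes matter; multiplying every weight by the same constant leaves the result unchanged.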

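3. numpy.max(array) and numpy.min(array) return the maximum and minimum of array. numpy.ptp(array) returns its range, that is, the difference between the maximum and the minimum. numpy.median(array) returns the median of the sorted array. numpy.var(array) computes the variance and numpy.std(array) the standard deviation. (Note the difference between sample variance and population variance: the population variance divides the sum of squared deviations by the population size n, while the sample variance divides it by n-1, the number of degrees of freedom, so that it is an unbiased estimator. NumPy expresses this difference through the ddof parameter: by default ddof=0, so numpy.var() and numpy.std() divide by n, and passing ddof=1 divides by n-1.) For an ndarray, array.mean() also computes the mean directly.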

>>> c
array([ 1.,  1.,  1.,  5.,  1.,  1.])
>>> v
array([ 1.,  1.,  1.,  5.,  1.,  1.])
>>> numpy.max(c)
5.0
>>> numpy.max(v)
5.0
>>> numpy.min(v)
1.0
>>> numpy.ptp(c)
4.0
>>> c
array([ 1.,  1.,  1.,  5.,  1.,  1.])
>>> numpy.median(c)
1.0
>>> numpy.var(c)
2.2222222222222219
>>> numpy.var(c)==numpy.mean((c-c.mean())**2)## verify var()
True
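To make the population-versus-sample distinction concrete, here is a short sketch using the same array c as in the session above. It relies on the ddof ("delta degrees of freedom") parameter of numpy.var() and numpy.std(); the variable names are my own.

```python
import numpy as np

c = np.array([1., 1., 1., 5., 1., 1.])
n = c.size

# Sum of squared deviations from the mean.
ss = np.sum((c - c.mean()) ** 2)

# Default ddof=0: divide by n (population variance).
pop_var = np.var(c)

# ddof=1: divide by n - 1 (unbiased sample variance).
sample_var = np.var(c, ddof=1)

# The standard deviation takes the same parameter.
pop_std = np.std(c)
sample_std = np.std(c, ddof=1)

print(pop_var, ss / n)
print(sample_var, ss / (n - 1))
```

The check in the session above (var equals the mean of squared deviations) passes precisely because the default is the population form.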

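4. numpy.diff(array) returns the differences between adjacent elements of array, and numpy.log(array) returns the natural logarithm of each element. NumPy is geared toward floating-point computation, and numpy.loadtxt() parses every field as a float by default, so pay attention to its converters parameter for columns such as dates. numpy.where(array>num) returns the indices of the elements of array that are greater than num. numpy.take(array, arrayindexs) extracts the elements of array at the indices arrayindexs. numpy.argmax(array) returns the index of the maximum of array, and numpy.argmin(array) the index of the minimum. numpy.apply_along_axis() deserves a closer look, in particular whether it brings any performance gain.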
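A typical use of numpy.diff() together with numpy.log() is computing returns from a price series. The sketch below is only an illustration with made-up values; note that the logarithm requires strictly positive inputs.

```python
import numpy as np

v = np.array([1., 1., 1., 5., 1., 1.])

# Simple returns: change relative to the previous value.
simple_returns = np.diff(v) / v[:-1]

# Log returns: difference of consecutive logarithms, i.e. log(v[i+1] / v[i]).
log_returns = np.diff(np.log(v))

print(simple_returns)
print(log_returns)
```

The two are related by log_return = log(1 + simple_return), and the log returns add up to log(v[-1] / v[0]) over the whole series.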

>>> v
array([ 1.,  1.,  1.,  5.,  1.,  1.])
>>> numpy.diff(v)
array([ 0.,  0.,  4., -4.,  0.])
>>> numpy.diff(v)/v[:-1]
array([ 0. ,  0. ,  4. , -0.8,  0. ])
>>> import datetime
>>> def datestr2num(s):## date conversion function: turn a date string into a weekday number (Monday=0)
    return datetime.datetime.strptime(s,'%Y/%m/%d').date().weekday()

>>> dates,price=numpy.loadtxt('apple.csv',delimiter=',',usecols=(2,0),unpack=True,converters={2:datestr2num})
>>> dates
array([ 3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,
        1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.])
>>> price
array([ 46.5,  47.5,  48.5,  49.5,  50.5,  51.5,  52.5,  53.5,  54.5,
        55.5,  56.5,  57.5,  58.5,  59.5,  60.5,  61.5,  62.5,  63.5,
        64.5,  65.5,  66.5,  67.5,  68.5,  69.5,  70.5])
>>> numpy.zeros(5)## initialize an array
array([ 0.,  0.,  0.,  0.,  0.])
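The converters mechanism used above can be tried without a file on disk. The following sketch feeds a small in-memory CSV to numpy.loadtxt(); the two-column layout (date, price) and the use of io.StringIO are assumptions for this example. Depending on the NumPy version, the converter may receive bytes or str, so the function accepts both.

```python
import datetime
import io

import numpy as np


def datestr2num(s):
    """Map a 'YYYY/M/D' string to a weekday number (Monday=0 ... Sunday=6)."""
    # Older NumPy releases pass bytes to converters, newer ones pass str.
    if isinstance(s, bytes):
        s = s.decode("ascii")
    return datetime.datetime.strptime(s, "%Y/%m/%d").date().weekday()


# Hypothetical two-column CSV (date, price) standing in for a real file.
csv_text = "2015/1/1,46.5\n2015/1/2,47.5\n2015/1/5,48.5\n"

# Column 0 goes through the converter; column 1 is parsed as a float.
dates, price = np.loadtxt(io.StringIO(csv_text), delimiter=",",
                          converters={0: datestr2num}, unpack=True)

print(dates)
print(price)
```

The weekday numbers come back as floats because the result array has a single float dtype for all columns.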

>>> for i in range(5):
    indices = numpy.where(dates==i)
    prices=numpy.take(price,indices)
    avg = numpy.mean(prices)
    print "Day",i,'prices',prices,"average",avg


Day 0 prices [[ 48.5  53.5  58.5  63.5  68.5]] average 58.5
Day 1 prices [[ 49.5  54.5  59.5  64.5  69.5]] average 59.5
Day 2 prices [[ 50.5  55.5  60.5  65.5  70.5]] average 60.5
Day 3 prices [[ 46.5  51.5  56.5  61.5  66.5]] average 56.5
Day 4 prices [[ 47.5  52.5  57.5  62.5  67.5]] average 57.5
>>> numpy.argmax(prices)
4
>>> numpy.argmin(prices)
0
>>> 
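One detail of the loop above: numpy.where() with a single argument returns a tuple of index arrays, which is why numpy.take() produced the doubly bracketed output, and why argmax/argmin at the end only looked at the last day's prices. A sketch of a variant that stores the per-day averages and then picks the best and worst weekday follows; the shortened data and the variable names are my own assumptions.

```python
import numpy as np

# Hypothetical weekday numbers (Monday=0) and prices for ten trading days.
dates = np.array([3., 4., 0., 1., 2., 3., 4., 0., 1., 2.])
price = np.array([46.5, 47.5, 48.5, 49.5, 50.5, 51.5, 52.5, 53.5, 54.5, 55.5])

averages = np.zeros(5)
for day in range(5):
    # [0] unwraps the 1-tuple returned by where(), giving a 1-D index array.
    indices = np.where(dates == day)[0]
    averages[day] = np.take(price, indices).mean()

# Boolean-mask indexing selects the same elements without where()/take().
monday_prices = price[dates == 0]

# Index of the weekday with the highest / lowest average price.
best_day = np.argmax(averages)
worst_day = np.argmin(averages)

print(averages, best_day, worst_day)
```

Here the index returned by argmax() is itself the weekday number, because averages is indexed by weekday.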
+++++++++++++++++++++++++++++++++++++++++++++++++++++
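apple.csv (shown in aligned columns for readability; the loadtxt calls in this post assume a comma-separated file without the header row)
opendata        date        high    low     close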
46.5    765.98  2015/1/1    48.99   44.11   47.66
47.5    766.98  2015/1/2    49.99   45.11   48.66
48.5    767.98  2015/1/5    50.99   46.11   49.66
49.5    768.98  2015/1/6    51.99   47.11   50.66
50.5    769.98  2015/1/7    52.99   48.11   51.66
51.5    770.98  2015/1/8    53.99   49.11   52.66
52.5    771.98  2015/1/9    54.99   50.11   53.66
53.5    772.98  2015/1/12   55.99   51.11   54.66
54.5    773.98  2015/1/13   56.99   52.11   55.66
55.5    774.98  2015/1/14   57.99   53.11   56.66
56.5    775.98  2015/1/15   58.99   54.11   57.66
57.5    776.98  2015/1/16   59.99   55.11   58.66
58.5    777.98  2015/1/19   60.99   56.11   59.66
59.5    778.98  2015/1/20   61.99   57.11   60.66
60.5    779.98  2015/1/21   62.99   58.11   61.66
61.5    780.98  2015/1/22   63.99   59.11   62.66
62.5    781.98  2015/1/23   64.99   60.11   63.66
63.5    782.98  2015/1/26   65.99   61.11   64.66
64.5    783.98  2015/1/27   66.99   62.11   65.66
65.5    784.98  2015/1/28   67.99   63.11   66.66
66.5    785.98  2015/1/29   68.99   64.11   67.66
67.5    786.98  2015/1/30   69.99   65.11   68.66
68.5    787.98  2015/2/2    70.99   66.11   69.66
69.5    788.98  2015/2/3    71.99   67.11   70.66
70.5    789.98  2015/2/4    72.99   68.11   71.66
+++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> opendata,highdata,lowdata,closedata=numpy.loadtxt('apple.csv',delimiter=',',usecols=(0,3,4,5),unpack=True)
>>> opendata
array([ 46.5,  47.5,  48.5,  49.5,  50.5,  51.5,  52.5,  53.5,  54.5,
        55.5,  56.5,  57.5,  58.5,  59.5,  60.5,  61.5,  62.5,  63.5,
        64.5,  65.5,  66.5,  67.5,  68.5,  69.5,  70.5])
>>> highdata
array([ 48.99,  49.99,  50.99,  51.99,  52.99,  53.99,  54.99,  55.99,
        56.99,  57.99,  58.99,  59.99,  60.99,  61.99,  62.99,  63.99,
        64.99,  65.99,  66.99,  67.99,  68.99,  69.99,  70.99,  71.99,
        72.99])
>>> lowdata
array([ 44.11,  45.11,  46.11,  47.11,  48.11,  49.11,  50.11,  51.11,
        52.11,  53.11,  54.11,  55.11,  56.11,  57.11,  58.11,  59.11,
        60.11,  61.11,  62.11,  63.11,  64.11,  65.11,  66.11,  67.11,
        68.11])
>>> closedata
array([ 47.66,  48.66,  49.66,  50.66,  51.66,  52.66,  53.66,  54.66,
        55.66,  56.66,  57.66,  58.66,  59.66,  60.66,  61.66,  62.66,
        63.66,  64.66,  65.66,  66.66,  67.66,  68.66,  69.66,  70.66,
        71.66])
>>> weekdate=numpy.loadtxt('apple.csv',delimiter=',',usecols=(2,),converters={2:datestr2num})
>>> weekdate
array([ 3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,
        1.,  2.,  3.,  4.,  0.,  1.,  2.,  3.,  4.,  0.,  1.,  2.])
>>> numpy.ravel(numpy.where(weekdate==0))[0]
2
>>> numpy.ravel(numpy.where(weekdate==4))[-1]
21
>>> weekdatearray=numpy.arange(2,22)
>>> weekdatearray
array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
       19, 20, 21])
>>> weekdatearray=numpy.split(weekdatearray,4)
>>> weekdatearray
[array([2, 3, 4, 5, 6]), array([ 7,  8,  9, 10, 11]), array([12, 13, 14, 15, 16]), array([17, 18, 19, 20, 21])]
>>> def sumerize(a,o,h,l,c):
    ## a: indices of one week's trading days; o,h,l,c: open/high/low/close arrays
    monday_open=o[a[0]]
    week_high=numpy.max(numpy.take(h,a))
    week_low=numpy.min(numpy.take(l,a))
    friday_close=c[a[-1]]
    return ("apple ",monday_open,week_high,week_low,friday_close)
>>> weeksummary=numpy.apply_along_axis(sumerize,1,weekdatearray,opendata,highdata,lowdata,closedata)
>>> weeksummary
array([['apple ', '48.5', '54.99', '46.11', '53.66'],
       ['apple ', '53.5', '59.99', '51.11', '58.66'],
       ['apple ', '58.5', '64.99', '56.11', '63.66'],
       ['apple ', '63.5', '69.99', '61.11', '68.66']], 
      dtype='|S6')
>>> numpy.savetxt('applesumeray.csv',weeksummary,delimiter=',',fmt="%s")
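On the performance question raised in point 4: as far as I know, numpy.apply_along_axis() calls the Python function once per 1-D slice, so it is mainly a convenience rather than a speedup. The sketch below computes the same kind of weekly summary in two ways, once with apply_along_axis() and once with plain fancy indexing; the ten-day data set and the helper name summarize are assumptions for illustration. Returning only numbers also keeps the result as a float array instead of the string array seen above.

```python
import numpy as np

# Hypothetical data: 10 trading days = 2 full Monday-to-Friday weeks.
o = np.arange(10, dtype=float) + 48.5   # open
h = o + 2.49                            # high
l = o - 2.39                            # low
c = o + 1.16                            # close

# One row of day indices per week.
weeks = np.arange(10).reshape(2, 5)


def summarize(idx, o, h, l, c):
    """Return [Monday open, weekly high, weekly low, Friday close]."""
    return np.array([o[idx[0]], h[idx].max(), l[idx].min(), c[idx[-1]]])


# Called once per row of `weeks` (axis=1).
looped = np.apply_along_axis(summarize, 1, weeks, o, h, l, c)

# Same result using indexing and axis-wise reductions only.
vectorized = np.column_stack([
    o[weeks[:, 0]],
    h[weeks].max(axis=1),
    l[weeks].min(axis=1),
    c[weeks[:, -1]],
])

print(looped)
print(vectorized)
```

For a handful of weeks the difference is negligible; the indexing version is the one more likely to pay off on large inputs, though that would need to be measured.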


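5. numpy.maximum() and numpy.minimum() compare two arrays element by element and return the larger or the smaller value at each position.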

>>> numpy.maximum([2, 3, 4], [1, 5, 2])
array([2, 5, 4])
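It is easy to confuse numpy.maximum() with numpy.max(). The following sketch, using the same two lists as above, shows the difference: the former works element by element across two inputs, the latter reduces a single array to one value. The variable names are my own.

```python
import numpy as np

a = np.array([2, 3, 4])
b = np.array([1, 5, 2])

# Element-wise comparison of two arrays.
elementwise_max = np.maximum(a, b)
elementwise_min = np.minimum(a, b)

# A scalar is broadcast, which gives a simple lower bound ("floor") of 3.
floored = np.maximum(a, 3)

# Reduction over a single array.
overall_max = np.max(a)

print(elementwise_max, elementwise_min, floored, overall_max)
```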
