numpy 移动平均与卷积

文章目录

      • 一、滑动平均
      • 二、一维最佳实践
      • 三、二维图像平滑平均

一、滑动平均

滑动平均滤波法(又称递推平均滤波法),时把连续取N个采样值看成一个队列 ,队列的长度固定为N ,每次采样到一个新数据放入队尾,并扔掉原来队首的一次数据.(先进先出原则) 把队列中的N个数据进行算术平均运算,就可获得新的滤波结果。N值的选取:流量,N=12;压力:N=4;液面,N=4~12;温度,N=1~4

优点: 对周期性干扰有良好的抑制作用,平滑度高 适用于高频振荡的系统

缺点: 灵敏度低 对偶然出现的脉冲性干扰的抑制作用较差 不易消除由于脉冲干扰所引起的采样值偏差 不适用于脉冲干扰比较严重的场合 比较浪费RAM

numpy 移动平均与卷积_第1张图片
numpy.convolve(a, v, mode=‘full’) 详解

参数:
a:(N,)输入的一维数组
v:(M,)输入的第二个一维数组
mode:{‘full’, ‘valid’, ‘same’}参数可选
  ‘full’ 默认值,返回每一个卷积值,长度是N+M-1,在卷积的边缘处,信号不重叠,存在边际效应。
  ‘same’ 返回的数组长度为max(M, N),边际效应依旧存在。
  ‘valid’  返回的数组长度为max(M,N)-min(M,N)+1,此时返回的是完全重叠的点。边缘的点无效。

改方法和一维卷积参数类似,a就是被卷积数据,v是卷积核大小。

关于运行平均值与一维卷积
在Python的SciPy函数或NumPy函数或模块中,用于在给定特定窗口的情况下计算一维数组的运行平均值。

python实现方法。优点:无依赖。缺点:慢

mylist = [1, 2, 3, 4, 5, 6, 7]
N = 3
cumsum, moving_aves = [0], []

for i, x in enumerate(mylist, 1):
    cumsum.append(cumsum[i-1] + x)
    if i>=N:
        moving_ave = (cumsum[i] - cumsum[i-N])/N
        #can do stuff with moving_ave here
        moving_aves.append(moving_ave)

运行平均值是卷积数学运算的一种情况。对于运行平均值,您可以沿输入滑动窗口并计算该窗口内容的平均值。对于离散的一维信号,卷积是相同的事情,除了用平均值代替之外,您还可以计算任意线性组合,即将每个元素乘以相应的系数,然后将结果相加。窗口中每个位置对应的那些系数有时称为卷积核。
现在,N个值的算术平均值为(x_1 + x_2 + … + x_N) / N,因此对应的内核为(1/N, 1/N, …, 1/N),而这正是我们使用所得到的np.ones((N,))/N。

numpy计算库方法

import numpy as np
import matplotlib.pyplot as plt

modes = ['full', 'same', 'valid']
for m in modes:
    plt.plot(np.convolve(np.ones((200,)), np.ones((50,))/50, mode=m));
plt.axis([-10, 251, -.1, 1.1]);
plt.legend(modes, loc='lower center');
plt.show()
def running_mean(x, N):
    cumsum = numpy.cumsum(numpy.insert(x, 0, 0)) 
    return (cumsum[N:] - cumsum[:-N]) / float(N)


In[3]: x = numpy.random.random(100000)
In[4]: N = 1000
In[5]: %timeit result1 = numpy.convolve(x, numpy.ones((N,))/N, mode='valid')
10 loops, best of 3: 41.4 ms per loop
In[6]: %timeit result2 = running_mean(x, N)
1000 loops, best of 3: 1.04 ms per loop

panda方法
pandas.rolling_mean

pd.Series(x).rolling(window=N).mean().iloc[N-1:].values

import numpy as np
import pandas as pd

def running_mean(x, N):
    cumsum = np.cumsum(np.insert(x, 0, 0)) 
    return (cumsum[N:] - cumsum[:-N]) / N


In [4]: x = np.random.random(100000)
In [5]: N = 1000

In [6]: %timeit np.convolve(x, np.ones((N,))/N, mode='valid')
10 loops, best of 3: 172 ms per loop

In [7]: %timeit running_mean(x, N)
100 loops, best of 3: 6.72 ms per loop

In [8]: %timeit pd.rolling_mean(x, N)[N-1:]
100 loops, best of 3: 4.74 ms per loop

In [9]: np.allclose(pd.rolling_mean(x, N)[N-1:], running_mean(x, N))
Out[9]: True

二、一维最佳实践

import talib as ta
import numpy as np
import pandas as pd
import scipy
from scipy import signal
import time as t

PAIR = info.primary_pair
PERIOD = 30

def initialize():
    storage.reset()
    storage.elapsed = storage.get('elapsed', [0,0,0,0,0,0])

def cumsum_sma(array, period):
    ret = np.cumsum(array, dtype=float)
    ret[period:] = ret[period:] - ret[:-period]
    return ret[period - 1:] / period

def pandas_sma(array, period):
    return pd.rolling_mean(array, period)

def api_sma(array, period):
    # this method is native to Tradewave and does NOT return an array
    return (data[PAIR].ma(PERIOD))

def talib_sma(array, period):
    return ta.MA(array, period)

def convolve_sma(array, period):
    return np.convolve(array, np.ones((period,))/period, mode='valid')

def fftconvolve_sma(array, period):    
    return scipy.signal.fftconvolve(
        array, np.ones((period,))/period, mode='valid')    

def tick():

    close = data[PAIR].warmup_period('close')

    t1 = t.time()
    sma_api = api_sma(close, PERIOD)
    t2 = t.time()
    sma_cumsum = cumsum_sma(close, PERIOD)
    t3 = t.time()
    sma_pandas = pandas_sma(close, PERIOD)
    t4 = t.time()
    sma_talib = talib_sma(close, PERIOD)
    t5 = t.time()
    sma_convolve = convolve_sma(close, PERIOD)
    t6 = t.time()
    sma_fftconvolve = fftconvolve_sma(close, PERIOD)
    t7 = t.time()

    storage.elapsed[-1] = storage.elapsed[-1] + t2-t1
    storage.elapsed[-2] = storage.elapsed[-2] + t3-t2
    storage.elapsed[-3] = storage.elapsed[-3] + t4-t3
    storage.elapsed[-4] = storage.elapsed[-4] + t5-t4
    storage.elapsed[-5] = storage.elapsed[-5] + t6-t5    
    storage.elapsed[-6] = storage.elapsed[-6] + t7-t6        

    plot('sma_api', sma_api)  
    plot('sma_cumsum', sma_cumsum[-5])
    plot('sma_pandas', sma_pandas[-10])
    plot('sma_talib', sma_talib[-15])
    plot('sma_convolve', sma_convolve[-20])    
    plot('sma_fftconvolve', sma_fftconvolve[-25])

def stop():

    log('ticks....: %s' % info.max_ticks)

    log('api......: %.5f' % storage.elapsed[-1])
    log('cumsum...: %.5f' % storage.elapsed[-2])
    log('pandas...: %.5f' % storage.elapsed[-3])
    log('talib....: %.5f' % storage.elapsed[-4])
    log('convolve.: %.5f' % storage.elapsed[-5])    
    log('fft......: %.5f' % storage.elapsed[-6])
#========================================================
[2015-01-31 23:00:00] ticks....: 744
[2015-01-31 23:00:00] api......: 0.16445
[2015-01-31 23:00:00] cumsum...: 0.03189
[2015-01-31 23:00:00] pandas...: 0.03677
[2015-01-31 23:00:00] talib....: 0.00700  # <<< Winner!
[2015-01-31 23:00:00] convolve.: 0.04871
[2015-01-31 23:00:00] fft......: 0.22306

三、二维图像平滑平均

图像卷积核平滑卷积

# 移动卷积均值
localMean=cv2.boxFilter(Y_Channel,-1,(winSize,winSize),normalize=True)

# 移动卷积方差
localVar=cv2.boxFilter(np.multiply(Y_Channel,Y_Channel),-1,(winSize,winSize),normalize=True)-np.multiply(localMean,localMean)

# 移动卷积标准差
localStd = np.sqrt(localVar)

参考与鸣谢
https://stackoverflow.com/questions/13728392/moving-average-or-running-mean
https://www.cnblogs.com/xiaosongshine/p/10874644.html

你可能感兴趣的:(科学计算库与可视化,numpy)