How to Elegantly Average Multiple Curves & the scipy.interpolate.interp1d "below the interpolation range" Error

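To make experimental results more reliable and less subject to chance, we often repeat an experiment and average the results. But for something like an AUC (ROC) curve, each run produces a polyline whose breakpoints do not line up with those of the other runs, so how do you average several curves? My approach is to first interpolate each curve to get its y values at a fixed set of x positions, and then average the y values at each x. The code is as follows: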

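import numpy as np
from scipy import interpolate

def averageCurve(xs,ys,xnew=np.linspace(0,1,1000),kind='slinear'):
    # xs, ys: one sequence of x values and one of y values per curve.
    ynews = []
    for x,y in zip(xs,ys):
        # Resample each curve onto the shared grid xnew.
        f=interpolate.interp1d(x,y,kind=kind)
        ynew=f(xnew)
        ynews.append(ynew)
    # Average the resampled y values point by point.
    ynew = np.mean(ynews,axis=0)
    return xnew,ynew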

 

Then I ran into this error: ValueError: A value in x_new is below the interpolation range.

~/userfolder/anaconda3/lib/python3.7/site-packages/scipy/interpolate/interpolate.py in _evaluate(self, x_new)
    662         y_new = self._call(self, x_new)
    663         if not self._extrapolate:
--> 664             below_bounds, above_bounds = self._check_bounds(x_new)
    665             if len(y_new) > 0:
    666                 # Note fill_value must be broadcast up to the proper size

~/userfolder/anaconda3/lib/python3.7/site-packages/scipy/interpolate/interpolate.py in _check_bounds(self, x_new)
    691         # !! Could provide more information about which values are out of bounds
    692         if self.bounds_error and below_bounds.any():
--> 693             raise ValueError("A value in x_new is below the interpolation "
    694                              "range.")
    695         if self.bounds_error and above_bounds.any():

ValueError: A value in x_new is below the interpolation range.

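The cause is that the points you interpolate at must lie within the range of the existing points. For example, if the reference x values run from 0.2 to 0.8, you cannot interpolate at x=0.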
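The restriction can be reproduced with a tiny example. This is only a minimal sketch with made-up reference points (the arrays x and y below are illustrative, not data from the experiment). It also shows the bounds_error=False option of interp1d, which returns fill_value (NaN by default) for out-of-range points instead of raising.

```python
import numpy as np
from scipy import interpolate

# Made-up reference points that only cover x in [0.2, 0.8].
x = np.array([0.2, 0.5, 0.8])
y = np.array([0.3, 0.6, 0.9])

f = interpolate.interp1d(x, y, kind='slinear')
inside = f(0.35)  # fine: 0.35 lies inside [0.2, 0.8]

try:
    f(0.0)  # 0.0 is below the smallest reference x
    raised = False
except ValueError:
    raised = True  # the "below the interpolation range" error

# With bounds_error=False, out-of-range points get fill_value (NaN by default)
# instead of raising.
g = interpolate.interp1d(x, y, kind='slinear', bounds_error=False)
outside = g(0.0)
```

Filling with NaN would poison the mean unless np.nanmean is used, so for a curve whose starting point is known, adding that point explicitly is usually the cleaner choice.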

 

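For AUC curves, my fix is to add the reference point (fpr, tpr) = (0, 0) whenever a curve does not already start at x = 0, which is the case that triggers the error.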

 

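def averageAUCCurve(xs,ys,xnew=np.linspace(0,1,1000),kind='slinear'):
    ynews = []
    for x,y in zip(xs,ys):
        if min(x) >0:
            # Add the reference point (fpr, tpr) = (0, 0). np.concatenate builds
            # new arrays, so the caller's data is left untouched and NumPy
            # arrays work as well as lists.
            x = np.concatenate(([0],x))
            y = np.concatenate(([0],y))
        f=interpolate.interp1d(x,y,kind=kind)
        ynew=f(xnew)
        ynews.append(ynew)
    # Average the resampled y values point by point.
    ynew = np.mean(ynews,axis=0)
    return xnew,ynew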
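If plain linear interpolation is enough, the same idea can also be sketched with np.interp instead of interp1d. This is an alternative I am swapping in for illustration, not the method used above; the function name average_curves_np and the sample curves are hypothetical, and the sketch assumes each curve has distinct x values.

```python
import numpy as np

def average_curves_np(xs, ys, xnew=None):
    """Average curves by linear resampling with np.interp (hypothetical helper)."""
    if xnew is None:
        xnew = np.linspace(0, 1, 1000)
    ynews = []
    for x, y in zip(xs, ys):
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        # np.interp expects increasing x, so sort each curve first.
        order = np.argsort(x, kind='stable')
        x, y = x[order], y[order]
        if x[0] > 0:
            # Same fix as above: add the reference point (0, 0).
            x = np.concatenate(([0.0], x))
            y = np.concatenate(([0.0], y))
        ynews.append(np.interp(xnew, x, y))
    return xnew, np.mean(ynews, axis=0)

# Made-up curves: the first one starts at x = 0.2, the second at x = 0.
xs = [[0.2, 1.0], [0.0, 0.5, 1.0]]
ys = [[0.4, 1.0], [0.0, 0.8, 1.0]]
grid, mean_curve = average_curves_np(xs, ys, xnew=np.array([0.0, 0.1, 0.5, 1.0]))
```

Note that np.interp never raises for out-of-range points: it silently holds the first or last y value, so the explicit (0, 0) point is still needed to get a ramp from the origin rather than a flat segment.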

 
