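Signal Processing: Change-Point Detection Based on the Buishand U Test

The basic principle of change-point detection with the Buishand U test is as follows: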

Let X denote a normal random variate, then the following model with a single shift (change-point) can be proposed:

$$x_i = \mu + \varepsilon_i, \quad i = 1, \ldots, m$$

and

$$x_i = \mu + \delta + \varepsilon_i, \quad i = m + 1, \ldots, n$$

with $\varepsilon_i \sim N(0, \sigma)$. The null hypothesis $\delta = 0$ is tested against the alternative $\delta \neq 0$.
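
To make the single-shift model concrete, the following is a minimal sketch that simulates one series from it in base R. All numeric values (series length, shift position, mean, shift size, noise level) and the name m_shift are illustrative assumptions, not values from the original post; m_shift plays the role of m in the model and is renamed only to avoid confusion with the m argument of bu.test().

```r
set.seed(42)        # arbitrary seed, only so the sketch is reproducible
n       <- 100      # series length (illustrative value)
m_shift <- 40       # last index before the shift; called m in the model
mu      <- 10       # baseline mean (illustrative value)
delta   <- 2        # size of the shift (illustrative value)
sigma   <- 1        # noise standard deviation (illustrative value)

eps <- rnorm(n, mean = 0, sd = sigma)              # eps[i] ~ N(0, sigma)
x   <- mu + eps                                    # x[i] = mu + eps[i]
x[(m_shift + 1):n] <- x[(m_shift + 1):n] + delta   # add delta for i > m

# The two segment means should differ by roughly delta
mean_before <- mean(x[1:m_shift])
mean_after  <- mean(x[(m_shift + 1):n])
cat("mean before:", mean_before, " mean after:", mean_after, "\n")
```

Under the null hypothesis the same code with delta set to 0 produces a series without a change point.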

In the Buishand U test, the rescaled adjusted partial sums are calculated as

$$S_k = \sum_{i=1}^{k} (x_i - \bar{x}), \quad 1 \le k \le n$$

The sample standard deviation is

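$$D_x = \sqrt{n^{-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Note that the code further down uses R's sd(), which divides by n − 1 rather than n; for long series the difference is small.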

The test statistic is calculated as:

$$U = \frac{1}{n(n+1)} \sum_{k=1}^{n-1} \left( \frac{S_k}{D_x} \right)^2$$

The change-point position K is computed as:

$$K = \arg\max_{1 \le k \le n} |S_k|$$
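
As a small worked example of these quantities, consider the made-up series x = (1, 1, 1, 5, 5, 5), which is not from the original post. The arithmetic below assumes that the partial sums run up to index k, that D_x is the root mean squared deviation from the sample mean (division by n), and that U sums the squared ratios S_k / D_x, which is what the R code in this post computes apart from its use of sd().

$$
\begin{aligned}
n &= 6, \qquad \bar{x} = 3 \\
(S_1, \ldots, S_6) &= (-2, -4, -6, -4, -2, 0) \\
D_x &= \sqrt{\tfrac{1}{6}\,(6 \cdot 2^2)} = 2 \\
U &= \frac{1}{6 \cdot 7}\,\left(1^2 + 2^2 + 3^2 + 2^2 + 1^2\right) = \frac{19}{42} \approx 0.452 \\
K &= \arg\max_k |S_k| = 3
\end{aligned}
$$

The largest absolute partial sum occurs at k = 3, the last observation before the mean jumps from 1 to 5.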

The key code is as follows:

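    xmean <- mean(x)                      # sample mean
    n <- length(x)
    k <- 1:n
    # adjusted partial sums: S[k] = sum over i <= k of (x[i] - mean(x))
    Sk <- sapply(k, function(i) sum(x[1:i] - xmean))
    sigma <- sd(x)                        # sd() divides by n - 1
    # test statistic U, summed over k = 1, ..., n - 1
    U <- 1 / (n * ( n + 1)) * sum((Sk[1:(n-1)] / sigma)^2)
    Ska <- abs(Sk)
    S <- max(Ska)
    K <- k[Ska == S]                      # index (or indices) where |S[k]| is largest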

    ## standardised value
    Skk <- (Sk / sigma)
    if (is.ts(x)){
        fr <- frequency(x)
        st <- start(x)
        ed <- end(x)
        Skk <- ts(Skk, start=st, end = ed, frequency= fr)
    }    
    ...
    attr(Skk, 'nm') <- "Sk**"
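
The sapply() call above recomputes every partial sum from scratch. The following is a hedged, self-contained sketch of the same quantities using cumsum(); the function name buishand_u and its use_sd argument are hypothetical and are not part of the trend package. With use_sd = TRUE it follows the code above (sd(), division by n − 1); with use_sd = FALSE it follows the formula for D_x (division by n).

```r
# Hypothetical helper, not part of the trend package.
buishand_u <- function(x, use_sd = TRUE) {
  x  <- as.numeric(x)                 # drop ts attributes, keep the values
  n  <- length(x)
  Sk <- cumsum(x - mean(x))           # adjusted partial sums S[1], ..., S[n]
  Dx <- if (use_sd) sd(x) else sqrt(mean((x - mean(x))^2))
  U  <- sum((Sk[-n] / Dx)^2) / (n * (n + 1))   # sum over k = 1, ..., n - 1
  K  <- which.max(abs(Sk))            # first index of max |S[k]| (ties ignored)
  list(U = U, K = K, Sk = Sk)
}

# Tiny made-up series with a jump after the third value
res <- buishand_u(c(1, 1, 1, 5, 5, 5), use_sd = FALSE)
print(res$Sk)   # partial sums
print(res$U)    # test statistic
print(res$K)    # probable change point
```

One difference to keep in mind: which.max() returns only the first maximum, whereas k[Ska == S] in the code above returns every index that attains it.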

The bu.test() function is part of the R package trend and is called as follows:

bu.test(x, m = 20000)

Arguments:
x
a vector of class “numeric” or a time series object of class “ts”

m
numeric, number of Monte-Carlo replicates, defaults to 20000

Return values:
data.name
character string that denotes the input data

p.value
the p-value

statistic
the test statistic

null.value
the null hypothesis

estimates
the time of the probable change point

alternative
the alternative hypothesis

method
character string that denotes the test

data
numeric vector of Sk for plotting

The p.value is estimated with a Monte Carlo simulation using m replicates.

Critical values based on m = 19999 Monte Carlo simulations are tabulated for U by Buishand (1982, 1984).
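
To illustrate what a Monte Carlo p-value for U means, here is a minimal base-R sketch. It is an assumption about one plausible procedure, not the internal implementation of bu.test(): it draws m standard normal series of the same length, computes U for each, and reports the share of simulated values at least as large as the observed one. The names u_stat and mc_p_value are hypothetical, and m is kept small so the sketch runs quickly.

```r
# Hypothetical helpers, not part of the trend package.
u_stat <- function(x) {
  n  <- length(x)
  Sk <- cumsum(x - mean(x))                   # adjusted partial sums
  sum((Sk[-n] / sd(x))^2) / (n * (n + 1))     # Buishand U
}

mc_p_value <- function(x, m = 2000) {
  n      <- length(x)
  u_obs  <- u_stat(x)                         # observed statistic
  u_null <- replicate(m, u_stat(rnorm(n)))    # U under the null (no shift)
  mean(u_null >= u_obs)                       # share of null draws >= observed
}

set.seed(1)                                   # arbitrary seed for reproducibility
x_shift <- c(rnorm(50, mean = 0), rnorm(50, mean = 2))   # series with a shift
x_flat  <- rnorm(100)                                     # series without a shift

p_shift <- mc_p_value(x_shift)
p_flat  <- mc_p_value(x_flat)
cat("p-value with shift:", p_shift, "\n")
cat("p-value without shift:", p_flat, "\n")
```

Because U is unchanged when the series is shifted or rescaled, simulating from the standard normal is enough for the null distribution in this sketch.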

Let us try the Buishand U test on the Nile data set (the complete code is available on my GitHub).

library(trend)
data(Nile)
out <- bu.test(Nile)
print(out)

Buishand U test

data: Nile
U = 2.4764, n = 100, p-value < 2.2e-16
alternative hypothesis: true delta is not equal to 0
sample estimates:
probable change point at time K
28

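par(mfrow = c(2, 1))
start_year <- start(Nile)[1]        # first year of the series (1871)
cp <- unname(out$estimate)          # index of the probable change point
x_cp <- start_year + cp - 1         # convert the index to a calendar year
plot(Nile)
abline(v = x_cp, col = 'red')
plot(out)
abline(v = x_cp, col = 'red')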

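The result is shown below:
[Figure: the Nile series (top) and the plot of the bu.test() result (bottom), each with the probable change point marked by a red vertical line]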

As can be seen, this method in effect detects a jump (step change) in the mean.
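
The mean-shift interpretation can be checked directly without the trend package. The sketch below, which is an illustration and not part of the original post, locates the largest absolute partial sum in the Nile series with base R and compares the segment means on either side of it.

```r
data(Nile)                                # ships with R's datasets package
x  <- as.numeric(Nile)                    # plain numeric values of the series
Sk <- cumsum(x - mean(x))                 # adjusted partial sums
K  <- which.max(abs(Sk))                  # index of the probable change point
year_K <- time(Nile)[K]                   # calendar year for index K

mean_before <- mean(x[1:K])               # mean up to and including K
mean_after  <- mean(x[(K + 1):length(x)]) # mean after K

cat("K =", K, " year =", year_K, "\n")
cat("mean before:", mean_before, " mean after:", mean_after, "\n")
```

A clear gap between the two segment means is what drives the large value of U reported above.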

References

Buishand U Test for Change-Point Detection
