Square Loss Function in Frequentist and Bayesian View

Suppose we have $X_1, \dots, X_n \sim N(\theta, \sigma_0^2)$.

Loss Function: Square Loss

$$L(\delta(\vec{x}), \theta) = (\delta(\vec{x}) - \theta)^2$$

The parameter we want to estimate is $\theta$.

Under the frequentist perspective, the risk function can be written as:

$$R(\delta, \theta) = E_X\big[(\delta(\vec{x}) - \theta)^2\big]$$

(here the expectation is taken with respect to $X$). This is exactly the mean squared error (MSE), so we can decompose it into variance plus squared bias:
$$
\begin{aligned}
MSE = E_X\big(\delta(\vec{x}) - \theta\big)^2
&= E_X\big(\delta(\vec{x}) - E(\delta(\vec{x})) + E(\delta(\vec{x})) - \theta\big)^2 \\
&= E_X\big(\delta(\vec{x}) - E(\delta(\vec{x}))\big)^2 + \big[E(\delta(\vec{x})) - \theta\big]^2 \\
&= Var(\delta(\vec{x})) + \big[E(\delta(\vec{x})) - \theta\big]^2 \\
&= Var(\delta(\vec{x})) + Bias^2
\end{aligned}
$$

The cross term vanishes because $E(\delta(\vec{x})) - \theta$ is a constant and $E_X\big[\delta(\vec{x}) - E(\delta(\vec{x}))\big] = 0$.
The frequentist derivation treats the statistic $\delta(\vec{x})$ as the random variable: once we find its variance and bias, the computation is finished. A numerical check is sketched below.
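As a sanity check, here is a minimal Monte Carlo sketch of the decomposition. The shrinkage estimator $\delta(\vec{x}) = c\bar{x}$ (deliberately biased so that both terms are nonzero) and all parameter values are illustrative choices, not part of the derivation above.

```python
import numpy as np

# Monte Carlo check of MSE = Var + Bias^2 for a (deliberately biased)
# shrinkage estimator delta(x) = c * xbar of the normal mean theta.
rng = np.random.default_rng(0)
theta, sigma0, n, c = 2.0, 1.0, 10, 0.8    # illustrative values
reps = 200_000

samples = rng.normal(theta, sigma0, size=(reps, n))
delta = c * samples.mean(axis=1)           # the statistic delta(x)

mse = np.mean((delta - theta) ** 2)        # E_X[(delta(x) - theta)^2]
var = np.var(delta)                        # Var(delta(x))
bias2 = (np.mean(delta) - theta) ** 2      # Bias^2

print(mse, var + bias2)                    # the two values agree
```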

Under the Bayesian perspective, the posterior expected loss can be written as:

$$
\begin{aligned}
E_{\theta|X}\big[(\delta(\vec{x}) - \theta)^2\big]
&= E_{\theta|X}\big[\theta - E_{\theta|X}(\theta) + E_{\theta|X}(\theta) - \delta(\vec{x})\big]^2 \\
&= E_{\theta|X}\big[\theta - E_{\theta|X}(\theta)\big]^2 + \big[E_{\theta|X}(\theta) - \delta(\vec{x})\big]^2 \\
&= Var_{\theta|X}(\theta) + \big[E_{\theta|X}(\theta) - \delta(\vec{x})\big]^2
\end{aligned}
$$

Again the cross term vanishes: conditional on the data, $E_{\theta|X}(\theta) - \delta(\vec{x})$ is a constant, and $E_{\theta|X}\big[\theta - E_{\theta|X}(\theta)\big] = 0$.
The Bayesian derivation treats $\theta$ as the random variable: once we find its posterior mean and variance, the computation is done.
Note that the manipulation differs between the two scenarios. In the frequentist case, since we treat the statistic $\delta(\vec{x})$ as the random variable, we add and subtract $E(\delta(\vec{x}))$; in the Bayesian case, we add and subtract $E_{\theta|X}(\theta)$ instead.
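Since the posterior variance does not depend on $\delta(\vec{x})$, the decomposition also shows that the posterior expected loss is minimized by $\delta(\vec{x}) = E_{\theta|X}(\theta)$: under square loss, the Bayes estimator is the posterior mean. Below is a minimal Monte Carlo sketch of the Bayesian decomposition, assuming a conjugate $N(\mu, \tau^2)$ prior on $\theta$ (the prior parameters, the observed $\bar{x}$, and the fixed estimate are illustrative choices, not from the text above).

```python
import numpy as np

# Monte Carlo check of E_{theta|X}[(delta - theta)^2]
#   = Var_{theta|X}(theta) + (E_{theta|X}(theta) - delta)^2
# under a conjugate N(mu, tau^2) prior (illustrative values throughout).
rng = np.random.default_rng(1)
mu, tau, sigma0, n = 0.0, 2.0, 1.0, 10
xbar = 1.5                                  # observed sample mean
delta = 1.0                                 # any fixed estimate delta(x)

# Normal-normal conjugacy: the posterior of theta given the data is normal.
precision = 1 / tau**2 + n / sigma0**2
post_mean = (mu / tau**2 + n * xbar / sigma0**2) / precision
post_var = 1 / precision

theta_draws = rng.normal(post_mean, np.sqrt(post_var), size=200_000)
lhs = np.mean((delta - theta_draws) ** 2)   # posterior expected loss
rhs = post_var + (post_mean - delta) ** 2
print(lhs, rhs)                             # the two values agree
```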
