Understanding the Hilbert-Schmidt Independence Criterion (HSIC)

  • Problem
    Given samples $(x_1, y_1), \ldots, (x_n, y_n) \sim P(x, y)$,
    determine whether $P(x, y) = P(x) \times P(y)$,
    or measure the degree of dependence.
    Below is the formal problem setting:
    Let $P_{xy}$ be a Borel probability measure defined on a domain $\mathcal{X} \times \mathcal{Y}$, and let $P_x$ and $P_y$ be the respective marginal distributions on $\mathcal{X}$ and $\mathcal{Y}$. Given an i.i.d. sample $Z := (X, Y) = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ of size $m$ drawn according to $P_{xy}$, does $P_{xy}$ factorize as $P_x P_y$?

  • Applications

    • Independent component analysis;
    • Dimensionality reduction and feature extraction;
    • Statistical modeling.
  • Indirect Approach

    • Perform density estimate of P(x, y)
    • Check whether the estimate approximately factorizes
  • Direct Approach

    • Check properties of factorizing distributions
    • e.g. kurtosis, covariance operators.
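
The two approaches above can be contrasted with a small sketch. It is an illustration only, under stated assumptions: the density estimate is a plain 2-D histogram, the synthetic data, the bin count, and the helper name `factorization_gap` are all hypothetical choices made for this example. It estimates P(x, y) on a grid, compares it with the product of its marginals, and also shows that an ordinary (linear) covariance can stay close to zero when the variables are clearly dependent.

```python
import numpy as np

def factorization_gap(x, y, bins=8):
    """Gap between a histogram estimate of P(x, y) and the product of its
    marginals, measured as half the summed absolute difference."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    joint = joint / joint.sum()               # normalise counts to probabilities
    px = joint.sum(axis=1, keepdims=True)     # marginal estimate of P(x)
    py = joint.sum(axis=0, keepdims=True)     # marginal estimate of P(y)
    return 0.5 * np.abs(joint - px * py).sum()

rng = np.random.default_rng(0)                # fixed seed for reproducibility
n = 5000
x = rng.uniform(-1.0, 1.0, size=n)
y_indep = rng.uniform(-1.0, 1.0, size=n)      # independent of x
y_dep = x**2 + 0.05 * rng.normal(size=n)      # nonlinearly dependent on x

gap_indep = factorization_gap(x, y_indep)
gap_dep = factorization_gap(x, y_dep)
cov_dep = np.cov(x, y_dep)[0, 1]              # linear covariance of the dependent pair

print(f"gap (independent): {gap_indep:.3f}")
print(f"gap (dependent):   {gap_dep:.3f}")
print(f"linear covariance of dependent pair: {cov_dep:.4f}")
```

The histogram gap is noticeably larger for the dependent pair, while the linear covariance of that same pair is near zero. This is one motivation for covariance operators in feature space, which look at covariances between nonlinear functions of x and y rather than between x and y themselves.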

HSIC is defined as the squared Hilbert-Schmidt (HS) norm of the associated cross-covariance operator $C_{xy}$:
$$ {\rm HSIC}(P_{xy}, F, G) = \| C_{xy} \|^2_{HS}. $$
The definition of HSIC above involves two other definitions. One is the cross-covariance operator $C_{xy} : G \to F$, which is defined through
$$ \langle f, C_{xy} g \rangle_F := {\rm E}_{xy}([f(x) - {\rm E}_{x}(f(x))] [g(y) - {\rm E}_{y}(g(y))]) \quad \text{for all } f \in F,\ g \in G. $$
By construction, $C_{xy}$ is a linear operator that maps from $G$ to $F$. The operator itself can be written as
$$ C_{xy} := {\rm E}_{xy}[(\phi(x) - \mu_x) \otimes (\psi(y) - \mu_y)], $$
where $\phi$ and $\psi$ are the feature maps associated with $F$ and $G$, $\mu_x = {\rm E}_x \phi(x)$ and $\mu_y = {\rm E}_y \psi(y)$ are the mean elements, and $\otimes$ denotes the tensor product.
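
To make the operator concrete, here is a minimal sketch under simplifying assumptions: the feature spaces are finite-dimensional, so the empirical cross-covariance operator is just a matrix and its HS norm is the Frobenius norm. The polynomial feature vectors, the synthetic data, and the function names are hypothetical choices for illustration. The sketch also checks numerically that the same quantity can be computed from Gram matrices alone as $\mathrm{tr}(KHLH)/m^2$, where $H = I - \frac{1}{m}\mathbf{1}\mathbf{1}^\top$ is the centering matrix.

```python
import numpy as np

def hsic_from_features(phi, psi):
    """Squared HS (Frobenius) norm of the empirical cross-covariance matrix.
    phi: (m, dF) features of x; psi: (m, dG) features of y."""
    m = phi.shape[0]
    phi_c = phi - phi.mean(axis=0)        # subtract empirical mean element mu_x
    psi_c = psi - psi.mean(axis=0)        # subtract empirical mean element mu_y
    c_xy = phi_c.T @ psi_c / m            # (dF, dG) matrix: maps G-coordinates to F
    return float(np.sum(c_xy ** 2))

def hsic_from_kernels(k, l):
    """Same quantity computed from Gram matrices only: tr(K H L H) / m^2."""
    m = k.shape[0]
    h = np.eye(m) - np.ones((m, m)) / m   # centering matrix H
    return float(np.trace(k @ h @ l @ h) / m**2)

rng = np.random.default_rng(0)            # fixed seed for reproducibility
m = 200
x = rng.normal(size=m)
y_dep = x**2 + 0.1 * rng.normal(size=m)   # depends on x, but not linearly
y_indep = rng.normal(size=m)              # independent of x

def features(v):
    # Hypothetical feature vector (v, v^2); its kernel is k(a, b) = ab + a^2 b^2.
    return np.column_stack([v, v**2])

phi = features(x)
hsic_feat = hsic_from_features(phi, features(y_dep))
hsic_kern = hsic_from_kernels(phi @ phi.T, features(y_dep) @ features(y_dep).T)
hsic_indep = hsic_from_features(phi, features(y_indep))

print("feature-space value:", hsic_feat)
print("Gram-matrix value:  ", hsic_kern)
print("independent pair:   ", hsic_indep)
```

The feature-space and Gram-matrix values agree up to floating-point error, which is why the criterion can be evaluated with kernels even when the feature space is infinite-dimensional. Note that this sketch normalises by $m^2$; some presentations of the empirical estimator use $(m-1)^2$ instead, which differs only by a constant factor.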
