(一)连续随机量的生成-双变量方法

连续随机量的生成——双变量方法

  • 1. 双变量方法的理论
  • 2. 从 f X ( x ) = 1 C e − x 4 f_X(x)=\frac{1}{C} e^{-x^4} fX(x)=C1ex4采样的具体操作
  • 3. 用Python编程实现 f X ( x ) = 1 C e − ∣ x ∣ 5 2 f_{X}(x)=\frac{1}{C} e^{-\frac{|x|^5}{2}} fX(x)=C1e2x5采样
    • 3.1 Python程序
    • 3.2 Python程序步骤解析

1. 双变量方法的理论

If X = ( X 1 , X 2 ) X=\left(X_1, X_2\right) X=(X1,X2) has a joint density f X ( x 1 , x 2 ) f_X\left(x_1, x_2\right) fX(x1,x2) and g : ( x 1 , x 2 ) → ( y 1 , y 2 ) g_{:}\left(x_1, x_2\right) \rightarrow\left(y_1, y_2\right) g:(x1,x2)(y1,y2) is a differentiable transform with the inverse ( x 1 , x 2 ) = g − 1 ( y 1 , y 2 ) \left(x_1, x_2\right)=g^{-1}\left(y_1, y_2\right) (x1,x2)=g1(y1,y2). Then the random vector ( Y 1 , Y 2 ) = g ( X 1 , X 2 ) \left(Y_1, Y_2\right)=g\left(X_1, X_2\right) (Y1,Y2)=g(X1,X2) has a density
f Y ( y 1 , y 2 ) = f X ( g − 1 ( y 1 , y 2 ) ) J ( y 1 , y 2 ) f_Y\left(y_1, y_2\right)=f_X\left(g^{-1}\left(y_1, y_2\right)\right) J\left(y_1, y_2\right) fY(y1,y2)=fX(g1(y1,y2))J(y1,y2)
where J ( y 1 , y 2 ) J\left(y_1, y_2\right) J(y1,y2) is the Jacobi matrix J ( y 1 , y 2 ) = ∣ ∂ x 1 ∂ y 1 ∂ x 2 ∂ y 1 ∂ x 1 ∂ y 2 ∂ x 2 ∂ y 2 ∣ J\left(y_1, y_2\right)=\left|\begin{array}{ll}\frac{\partial x_1}{\partial y_1} & \frac{\partial x_2}{\partial y_1} \\ \frac{\partial x_1}{\partial y_2} & \frac{\partial x_2}{\partial y_2}\end{array}\right| J(y1,y2)= y1x1y2x1y1x2y2x2 .

Example: Let U 1 ∼ U ( [ 0 , 1 ] ) , U 2 ∼ U ( [ 0 , 1 ] ) U_1 \sim U([0,1]), U_2 \sim U([0,1]) U1U([0,1]),U2U([0,1]) be independent, define
x 1 = − 2 log ⁡ u 1 cos ⁡ ( 2 π u 2 ) , x 2 = − 2 log ⁡ u 1 sin ⁡ ( 2 π u 2 ) x_1=\sqrt{-2 \log u_1} \cos \left(2 \pi u_2\right), x_2=\sqrt{-2 \log u_1} \sin \left(2 \pi u_2\right) x1=2logu1 cos(2πu2),x2=2logu1 sin(2πu2)

i.e. ( x 1 , x 2 ) = g ( u 1 , u 2 ) = ( − 2 log ⁡ u 1 cos ⁡ ( 2 π u 2 ) , − 2 log ⁡ u 1 sin ⁡ ( 2 π u 2 ) ) \left(x_1, x_2\right)=g\left(u_1, u_2\right)=\left(\sqrt{-2 \log u_1} \cos (2 \pi u_2), \sqrt{-2 \log u_1} \sin \left(2 \pi u_2\right)\right) (x1,x2)=g(u1,u2)=(2logu1 cos(2πu2),2logu1 sin(2πu2)).
g − 1 ( x 1 , x 2 ) = ( exp ⁡ ( − x 1 2 + x 2 2 2 ) , 1 2 π tan ⁡ − 1 ( x 2 x 1 ) ) , f U ( u 1 , u 2 ) = 1  for  ( u 1 , u 2 ) ∈ [ 0 , 1 ] × [ 0 , 1 ] . J ( x 1 , x 2 ) = ∣ ∂ u 1 ∂ x 1 ∂ u 2 ∂ x 2 ∂ u 1 ∂ x 2 ∂ u 2 ∂ x 2 ∣ = ∣ ∂ g 1 − 1 ( x 1 , x 2 ) ∂ x 1 ∂ g 2 − 1 ( x 1 , x 2 ) ∂ x 1 ∂ g 1 − 1 ( x 1 , x 2 ) ∂ x 2 ∂ g 2 − 1 ( x 1 , x 2 ) ∂ x 2 ∣ = 1 2 π e − x 1 2 + x 2 2 2 \begin{aligned} g^{-1}\left(x_1, x_2\right) & =\left(\exp \left(-\frac{x_1^2+x_2^2}{2}\right), \frac{1}{2 \pi} \tan ^{-1}\left(\frac{x_2}{x_1}\right)\right), \\ f_U\left(u_1, u_2\right) & =1 \text { for }\left(u_1, u_2\right) \in[0,1] \times[0,1] . \\ J\left(x_1, x_2\right)=\left|\begin{array}{ll} \frac{\partial u_1}{\partial x_1} & \frac{\partial u_2}{\partial x_2} \\ \frac{\partial u_1}{\partial x_2} & \frac{\partial u_2}{\partial x_2} \end{array}\right| & =\left|\begin{array}{ll} \frac{\partial g_1^{-1}\left(x_1, x_2\right)}{\partial x_1} & \frac{\partial g_2^{-1}\left(x_1, x_2\right)}{\partial x_1} \\ \frac{\partial g_1^{-1}\left(x_1, x_2\right)}{\partial x_2} & \frac{\partial g_2^{-1}\left(x_1, x_2\right)}{\partial x_2} \end{array}\right|=\frac{1}{2 \pi} e^{-\frac{x_1^2+x_2^2}{2}} \end{aligned} g1(x1,x2)fU(u1,u2)J(x1,x2)= x1u1x2u1x2u2x2u2 =(exp(2x12+x22),2π1tan1(x1x2)),=1 for (u1,u2)[0,1]×[0,1].= x1g11(x1,x2)x2g11(x1,x2)x1g21(x1,x2)x2g21(x1,x2) =2π1e2x12+x22
Hence f X ( x 1 , x 2 ) = 1 2 π exp ⁡ ( − x 1 2 + x 2 2 2 ) f_X\left(x_1, x_2\right)=\frac{1}{2 \pi} \exp \left(-\frac{x_1^2+x_2^2}{2}\right) fX(x1,x2)=2π1exp(2x12+x22).

2. 从 f X ( x ) = 1 C e − x 4 f_X(x)=\frac{1}{C} e^{-x^4} fX(x)=C1ex4采样的具体操作

In data science, we often come across a distribution whose density is known up to a constant, e.g. a random variable X X X with a density
f X ( x ) = 1 C e − x 4 f_X(x)=\frac{1}{C} e^{-x^4} fX(x)=C1ex4
the constant C = ∫ − ∞ + ∞ e − x 4 d x C=\int_{-\infty}^{+\infty} e^{-x^4} d x C=+ex4dx is not known. Taking
f ∗ ( x ) = e − x 4 , f^*(x)=e^{-x^4}, f(x)=ex4,
define C f = { ( x 1 , x 2 ) : 0 ⩽ x 1 ⩽ f ∗ ( x 2 / x 1 ) } C_f=\left\{\left(x_1, x_2\right): 0 \leqslant x_1 \leqslant \sqrt{f^*\left(x_2 / x_1\right)}\right\} Cf={(x1,x2):0x1f(x2/x1) }. Let ( U 1 , U 2 ) \left(U_1, U_2\right) (U1,U2) be the random vector uniformly distributed on C f C_f Cf. Then Y = U 2 U 1 Y=\frac{U_2}{U_1} Y=U1U2 has a density f ∗ ( y ) ∫ − ∞ + ∞ f ∗ ( y ) d y \frac{f^*(y)}{\int_{-\infty}^{+\infty} f^*(y) d y} +f(y)dyf(y).

Indeed, let x = u 1 , y = u 2 u 1 x=u_1, y=\frac{u_2}{u_1} x=u1,y=u1u2, i.e.
( x , y ) = g ( u 1 , u 2 ) = ( u 1 , u 2 u 1 ) . ( u 1 , u 2 ) = g − 1 ( x , y ) = ( x , x y ) . J ( x , y ) = ∣ ∂ g 1 − 1 ( x , y ) ∂ x ∂ g 2 − 1 ( x , y ) ∂ x ∂ g 2 − 1 ( x , y ) ∂ y ∂ g 2 − 1 ( x , y ) ∂ y ∣ = ∣ 1 y 0 x ∣ = x . \begin{gathered} (x, y)=g\left(u_1, u_2\right)=\left(u_1, \frac{u_2}{u_1}\right) . \\ \left(u_1, u_2\right)=g^{-1}(x, y)=(x, x y) . \\ J(x, y)=\left|\begin{array}{ll} \frac{\partial g_1^{-1}(x, y)}{\partial x} & \frac{\partial g_2^{-1}(x, y)}{\partial x} \\ \frac{\partial g_2^{-1}(x, y)}{\partial y} & \frac{\partial g_2^{-1}(x, y)}{\partial y} \end{array}\right|=\left|\begin{array}{cc} 1 & y \\ 0 & x \end{array}\right|=x . \end{gathered} (x,y)=g(u1,u2)=(u1,u1u2).(u1,u2)=g1(x,y)=(x,xy).J(x,y)= xg11(x,y)yg21(x,y)xg21(x,y)yg21(x,y) = 10yx =x.
We have f ( X , Y ) ( x , y ) = x ⋅ k f_{(X, Y)}(x, y)=x \cdot k f(X,Y)(x,y)=xk, where k = 1  Area  ( C f ) k=\frac{1}{\text { Area }\left(C_f\right)} k= Area (Cf)1, since f ( U 1 , U 2 ) ( u 1 , u 2 ) = k f\left(U_1, U_2\right)\left(u_1, u_2\right)=k f(U1,U2)(u1,u2)=k for ( u 1 , u 2 ) ∈ C f \left(u_1, u_2\right) \in C_f (u1,u2)Cf. Now we consider the distribution of Y Y Y, which is given by
f Y ( y ) = ∫ − ∞ + ∞ f ( X , Y ) ( x , y ) d x = ∫ − ∞ + ∞ x ⋅ k 1 C f d x = ∫ 0 f ∗ ( y ) k x d x = k f ∗ ( y ) = f ∗ ( y ) ∫ − ∞ + ∞ f ∗ ( y ) d y . \begin{aligned} f_Y(y) & =\int_{-\infty}^{+\infty} f_{(X, Y)}(x, y) d x \\ & =\int_{-\infty}^{+\infty} x \cdot k1_{C_f} d x\\ &=\int_0^{\sqrt{f^*(y)}} k x d x \\ & =k f^*(y)=\frac{f^*(y)}{\int_{-\infty}^{+\infty} f^*(y) d y} . \end{aligned} fY(y)=+f(X,Y)(x,y)dx=+xk1Cfdx=0f(y) kxdx=kf(y)=+f(y)dyf(y).

3. 用Python编程实现 f X ( x ) = 1 C e − ∣ x ∣ 5 2 f_{X}(x)=\frac{1}{C} e^{-\frac{|x|^5}{2}} fX(x)=C1e2x5采样

Make a python or R R R program to generate 1000 random quantities from the probability distribution with a density f X ( x ) = 1 C e − ∣ x ∣ 5 2 f_{X}(x)=\frac{1}{C} e^{-\frac{|x|^5}{2}} fX(x)=C1e2x5 with C = ∫ − ∞ + ∞ e − ∣ x ∣ 5 2 d x C=\int_{-\infty}^{+\infty} e^{-\frac{|x|^5}{2}} d x C=+e2x5dx.

3.1 Python程序

import numpy as np
import matplotlib.pyplot as plt

# Define the probability density function f*(x)
def f_star(x):
    return np.exp(-np.abs(x)**5 / 2)

# Define the region C_f
def in_region(x1, x2):
    return (0 <= x1) and (x1 <= np.sqrt(f_star(x2 / x1)))

# Generate random quantities for U1 and U2
def generate_random_quantities(n):
    U1 = []
    U2 = []
    while len(U1) < n:
        x1 = np.random.rand()
        x2 = np.random.rand()
        if in_region(x1, x2):
            U1.append(x1)
            U2.append(x2)
    return U1, U2

# Generate random quantities of Y = U2 / U1
def generate_random_Y(U1, U2):
    Y = [u2 / u1 for u1, u2 in zip(U1, U2)]
    return Y

# Number of random quantities to generate
num_samples = 1000

# Generate random quantities for U1 and U2
U1, U2 = generate_random_quantities(num_samples)

# Calculate random quantities for Y = U2 / U1
random_Y = generate_random_Y(U1, U2)

# Draw a histogram
plt.hist(random_Y, bins=30, density=True, alpha=0.6, color='b', edgecolor='black')
plt.xlabel('Random Quantities Y')
plt.ylabel('Density')
plt.title('Histogram of Random Quantities Y')
plt.show()

3.2 Python程序步骤解析

import numpy as np
import matplotlib.pyplot as plt

These two lines import the 'nump’y and ‘matplotlib.pyplot’ libraries, which are used for numerical computations and plotting.

def f_star(x):
    return np.exp(-np.abs(x)**5 / 2)

This is a function ‘f_star(x)’ that defines the probability density function f ∗ ( x ) f^*(x) f(x).

def in_region(x1, x2):
    return (0 <= x1) and (x1 <= np.sqrt(f_star(x2 / x1)))

This function ‘in_region(x1, x2)’ checks whether ( x 1 , x 2 ) (x_1, x_2) (x1,x2) is within the region C f C_f Cf.

def generate_random_quantities(n):
    U1 = []
    U2 = []
    while len(U1) < n:
        x1 = np.random.rand()
        x2 = np.random.rand()
        if in_region(x1, x2):
            U1.append(x1)
            U2.append(x2)
    return U1, U2

This function ‘generate_random_quantities(n)’ generates random U 1 U_1 U1 and U 2 U_2 U2 while ensuring they are within the C f C_f Cf region. It’s a loop that continues until n n n points satisfying the condition are generated.

def generate_random_Y(U1, U2):
    Y = [u2 / u1 for u1, u2 in zip(U1, U2)]
    return Y

This function ‘generate_random_Y(U1, U2)’ calculates the random quantity Y = U 2 / U 1 Y = U_2 / U_1 Y=U2/U1 for each ( U 1 , U 2 ) (U_1, U_2) (U1,U2) pair.

num_samples = 1000

‘num_samples’ specifies how many random quantities should be generated.

U1, U2 = generate_random_quantities(num_samples)

This line generates ‘num_samples’ pairs of ( U 1 , U 2 ) (U_1, U_2) (U1,U2) that satisfy the condition.

random_Y = generate_random_Y(U1, U2)

This line calculates the random quantity Y = U 2 / U 1 Y = U_2 / U_1 Y=U2/U1 for each ( U 1 , U 2 ) (U_1, U_2) (U1,U2) pair.

plt.hist(random_Y, bins=30, density=True, alpha=0.6, color='b', edgecolor='black')
plt.xlabel('Random Quantities Y')
plt.ylabel('Density')
plt.title('Histogram of Random Quantities Y')
plt.show()

This part of the code plots a histogram of the random quantities Y. The ‘hist’ function is used to draw the histogram, ‘xlabel’ sets the x-axis label, ‘ylabel’ sets the y-axis label, ‘title’ sets the title of the plot, and finally, ‘show’ displays the plot.

Lastly, make sure you have installed the ‘numpy’ and ‘matplotlib’ libraries by using the following command:

pip install numpy matplotlib

Running this code will display a histogram of the generated random quantities Y based on the given distribution.

你可能感兴趣的:(数据科学和Python实现,算法,python)