$P_1(x)=0$
$P_2(x)=0$
$\vdots$
$P_m(x)=0$
$x=\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}$
This is an equation-solving problem, not an optimization problem.
In the optimization setting, by contrast, the system may be overdetermined: for example, 3 unknowns required to satisfy 1000 equations, and so on.
The hardest part is transforming a problem that is not an optimization problem into an optimization problem.
Ex:
$P_1(x)=0 \leftrightarrow P_1^2(x)=0$
$P_2(x)=0 \leftrightarrow P_2^2(x)=0$
$\vdots$
$P_m(x)=0 \leftrightarrow P_m^2(x)=0$
linear equations $\rightarrow$ quadratic equations
In effect, we solve for the minimum of $f(x)=\sum_{i=1}^m P_i^2(x)$, i.e., $\min f(x)$.
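A minimal sketch of this reformulation in Python, assuming `scipy` is available; the residuals $P_i$, the dimensions, and the starting point are invented purely for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical system of m = 3 equations in n = 2 unknowns,
# chosen only to illustrate the sum-of-squares reformulation.
def P(x):
    x1, x2 = x
    return np.array([
        x1 + x2 - 2.0,        # P_1(x) = 0
        x1 - x2,              # P_2(x) = 0
        x1**2 + x2**2 - 2.0,  # P_3(x) = 0
    ])

# f(x) = sum_i P_i(x)^2: solving the system becomes minimizing f.
def f(x):
    return np.sum(P(x) ** 2)

result = minimize(f, x0=np.zeros(2))
print(result.x)    # close to (1, 1), where all three equations hold
print(result.fun)  # close to 0: f(x*) = 0 exactly when every P_i(x*) = 0
```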
Assumption: f(x) is twice continuously differentiable.
$f(x)=\sum_{i=1}^m w_i P_i^2(x), \quad w_i>0$
After obtaining a result, the weights are adjusted and the problem is solved again; this weight tuning is, in effect, the kind of problem that artificial intelligence addresses.
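Continuing the sketch above (reusing `P` and `minimize`), the weighted objective only changes how the squared residuals are summed; the weight values here are arbitrary placeholders:

```python
# Weighted variant: w_i > 0 controls how strongly equation i is enforced.
# These particular weights are placeholders, not tuned values.
w = np.array([1.0, 10.0, 0.1])

def f_weighted(x):
    return np.sum(w * P(x) ** 2)

result_w = minimize(f_weighted, x0=np.zeros(2))
print(result_w.x)  # the weights matter most when the system is inconsistent
```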
Basic structure of a numerical algorithm for minimizing f(x)
Step 1: choose an initial point $x_0$, set a convergence tolerance $\varepsilon$, and set a counter $k=0$.
Step 2: determine a search direction $d_k$ for reducing $f(x)$ from the point $x_k$.
Step 3: determine a step size $\alpha_k$ such that $f(x_k+\alpha d_k)$ is minimized for $\alpha \geq 0$, and construct $x_{k+1}=x_k+\alpha_k d_k$.
Step 4: if $\|\alpha_k d_k\|<\varepsilon$, stop and output the solution $x_{k+1}$; else set $k := k+1$ and repeat from Step 2.
a) Steps 2 and 3 are the key steps of an optimization algorithm.
b) Different ways of accomplishing Step 2 lead to different algorithms.
c) Step 3 is a one-dimensional optimization problem and is often called a line search step; see the sketch below.
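The four steps translate almost directly into code. A minimal sketch, assuming steepest descent ($d_k=-\nabla f(x_k)$) for Step 2 and a bounded scalar minimization for the line search in Step 3; the test function and the bound on $\alpha$ are invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def descent(f, grad, x0, eps=1e-8, max_iter=1000):
    """Minimize f following Steps 1-4; grad must return the gradient of f."""
    x = np.asarray(x0, dtype=float)      # Step 1: initial point x_0, tolerance eps
    for _ in range(max_iter):
        d = -grad(x)                     # Step 2: steepest-descent direction d_k
        # Step 3: line search -- minimize f(x + alpha * d) over alpha >= 0
        # (alpha is capped at 1.0 here purely to keep the search bounded)
        alpha = minimize_scalar(lambda a: f(x + a * d),
                                bounds=(0.0, 1.0), method="bounded").x
        step = alpha * d
        x = x + step                     # construct x_{k+1} = x_k + alpha_k * d_k
        if np.linalg.norm(step) < eps:   # Step 4: stop when ||alpha_k d_k|| < eps
            break
    return x

# Example: f(x) = (x1 - 1)^2 + 2*(x2 + 3)^2, minimized at (1, -3)
f = lambda x: (x[0] - 1)**2 + 2*(x[1] + 3)**2
grad = lambda x: np.array([2*(x[0] - 1), 4*(x[1] + 3)])
print(descent(f, grad, x0=[0.0, 0.0]))   # approx [1. -3.]
```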
First-order necessary condition: if $x^*$ is a minimum point (minimizer), then $\nabla f(x^*)=0$. In other words,
if $x^*$ is a minimum point, then it must be a stationary point.
Second-order sufficient condition: if $\nabla f(x^*)=0$ and the Hessian $H(x^*)$ is a positive definite matrix, i.e., $H(x^*)>0$,
then $x^*$ is a minimum point (minimizer).
$$H(x)=\begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \frac{\partial^2 f}{\partial x_1 \partial x_3} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}\\ \frac{\partial^2 f}{\partial x_1 \partial x_2} & \frac{\partial^2 f}{\partial x_2^2} & & & \vdots\\ \vdots & & \ddots & & \\ \frac{\partial^2 f}{\partial x_1 \partial x_n} & \cdots & & & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$
$$H(x)=\nabla\left(\nabla^T f(x)\right)$$
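Both conditions can be checked numerically once the Hessian is available. A small sketch for the same illustrative quadratic $f(x)=(x_1-1)^2+2(x_2+3)^2$ used above, whose Hessian is constant:

```python
import numpy as np

# Hessian of the illustrative f(x) = (x1 - 1)^2 + 2*(x2 + 3)^2; its gradient
# vanishes at x* = (1, -3), so the first-order necessary condition holds there.
H = np.array([[2.0, 0.0],
              [0.0, 4.0]])

# A symmetric matrix is positive definite iff all its eigenvalues are > 0,
# so the second-order sufficient condition holds and x* is a minimizer.
eigvals = np.linalg.eigvalsh(H)
print(eigvals, bool(np.all(eigvals > 0)))  # [2. 4.] True
```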