Implementing the Primal-Dual Interior Point Method in Python

Primal Dual Problem

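Every optimization problem (the primal problem) has a corresponding dual problem. The dual is obtained by forming the Lagrangian function of the primal problem and minimizing it over the primal variables. For example: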

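Consider the optimization problem min \frac{1}{2}x^{T}Qx+c^{T}x subject to Ax=b, x\geq 0, where Q is assumed positive definite so that Q^{-1} exists. The Lagrangian function can be written as L(x,\lambda ,\mu )=\frac{1}{2}x^{T}Qx+c^{T}x-\mu ^{T}x+\lambda ^{T}(Ax-b). Setting its gradient with respect to x to zero, Qx+c-\mu +A^{T}\lambda =0, gives x=-Q^{-1}(c-\mu +A^{T}\lambda ). Substituting this x back into the Lagrangian function yields the dual problem: max -\frac{1}{2}(c-\mu +A^{T}\lambda )^{T}Q^{-1}(c-\mu +A^{T}\lambda)-b^{T}\lambda subject to \mu \geq 0.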

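At every iteration, the objective value of a feasible dual point is a lower bound on the objective value of the primal problem. As the iterations proceed, the primal and dual iterates move toward their respective optimal solutions. When the primal problem reaches its optimum, so does the dual, and the two optimal objective values coincide.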
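The lower-bound property can be checked numerically. The following is a minimal sketch, assuming a small positive-definite Q and arbitrarily chosen data; the problem values and the helper names `primal_objective` and `dual_objective` are illustrative and not part of the original article.

```python
import numpy as np

# Illustrative data (assumed for this sketch); Q must be positive definite
Q = np.array([[2.0, 0.0], [0.0, 1.0]])
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

def primal_objective(x):
    # f(x) = 1/2 x^T Q x + c^T x
    return 0.5 * x @ Q @ x + c @ x

def dual_objective(lam, mu):
    # g(lam, mu) = -1/2 v^T Q^{-1} v - b^T lam, with v = c - mu + A^T lam
    v = c - mu + A.T @ lam
    return -0.5 * v @ np.linalg.solve(Q, v) - b @ lam

x_feasible = np.array([0.5, 0.5])  # satisfies Ax = b and x >= 0
rng = np.random.default_rng(0)
gaps = []
for _ in range(5):
    lam = rng.normal(size=1)              # lambda is unrestricted in sign
    mu = rng.uniform(0.0, 2.0, size=2)    # any mu >= 0 is dual feasible
    gaps.append(primal_objective(x_feasible) - dual_objective(lam, mu))

min_gap = min(gaps)
print("smallest primal-dual gap over the samples:", min_gap)
```

Every sampled gap is non-negative because the dual objective is the minimum of the Lagrangian over x, and the Lagrangian at a feasible x never exceeds the primal objective when \mu \geq 0.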

Interior Point Method

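The interior point method provides an iterative algorithm for finding the optimal solutions of the primal and dual problems described above. The gap between the primal and dual objective values shrinks as the iterations proceed, and when the algorithm converges the two objective values are equal.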

Barrier Reformulation

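To make the iterative algorithm simpler to apply and to keep every iterate inside the feasible region, the constraint x\geq 0 can be replaced by the barrier term -\mu \sum_{i} ln(x_{i}) with \mu >0. A linear optimization problem then becomes: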

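min c^{T}x-\mu \sum_{i} ln(x_{i}) subject to Ax=b, whose Lagrangian function can be written as:

L(x,\pi )=c^{T}x-\mu \sum_{i} ln(x_{i})-\pi ^{T}(Ax-b)

Its optimality conditions are c-\mu X^{-1}e-A^{T}\pi =0 and Ax=b, where X=diag(x) and e is the vector of ones. Setting z=\mu X^{-1}e, so that XZe=\mu e with Z=diag(z), gives the system of equations F(x,\pi ,z)=\begin{bmatrix} A^{T}\pi +z-c\\ Ax-b\\ XZe\end{bmatrix}=\begin{bmatrix} 0\\ 0\\ \mu e\end{bmatrix}. Newton's method is used to solve this system iteratively: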

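\begin{bmatrix} 0 & A^{T} & I\\ A& 0& 0\\ Z& 0& X\end{bmatrix}\begin{bmatrix} dx\\ d\pi \\ dz\end{bmatrix}=\begin{bmatrix} 0\\ 0\\ -XZe+\mu e\end{bmatrix},    \begin{bmatrix} x^{(k+1)}\\ \pi ^{(k+1)}\\ z^{(k+1)}\end{bmatrix}=\begin{bmatrix} x^{(k)}\\ \pi ^{(k)}\\ z^{(k)}\end{bmatrix}+\alpha\begin{bmatrix} dx\\ d\pi \\ dz\end{bmatrix}. Here \alpha \in (0,1] is the step length, and the first two blocks of the right-hand side are zero because the current iterate is assumed to be primal and dual feasible. This system of equations defines a central path; following it as \mu \rightarrow 0 leads to the optimal solution, as illustrated in the figure below: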

[Figure: the central path leading to the optimal solution]
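For a very small problem the central path can be written in closed form, which makes its behavior as \mu \rightarrow 0 easy to see. The sketch below assumes an illustrative LP, min x_{1}+2x_{2} subject to x_{1}+x_{2}=1, x\geq 0, and uses the sign convention z=c-A^{T}\pi ; neither the problem nor the helper `central_path_point` comes from the original article.

```python
import numpy as np

# Illustrative LP (assumed): min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0
c = np.array([1.0, 2.0])

def central_path_point(mu):
    """Closed-form central-path point of this LP for a given mu > 0."""
    # The conditions c_i - mu/x_i - pi = 0 give x_i = mu / (c_i - pi).
    # Substituting into x1 + x2 = 1 yields a quadratic in pi; the root
    # with pi < 1 is the one that keeps both components of x positive.
    pi = 0.5 * ((3.0 - 2.0 * mu) - np.sqrt(4.0 * mu**2 + 1.0))
    z = c - pi      # dual slack
    x = mu / z      # so that x_i * z_i = mu
    return x, pi, z

for mu in [1.0, 0.1, 0.01, 1e-4]:
    x, pi, z = central_path_point(mu)
    print(f"mu={mu:g}  x={x}  x*z={x * z}")
```

As \mu decreases, the printed points approach the vertex (1, 0), which is the optimal solution of this LP, while every point stays strictly inside the feasible region.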

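To keep each iterate close to the central path, a centering parameter \tau\in (0,1) is introduced; it controls how strongly each step is pulled toward the central path. The duality measure y=\frac{x^{T}z}{n} is used to track progress toward the optimal solution. The system above then becomes: \begin{bmatrix} 0 & A^{T} & I\\ A& 0& 0\\ Z& 0& X\end{bmatrix}\begin{bmatrix} dx\\ d\pi \\ dz\end{bmatrix}=\begin{bmatrix} 0\\ 0\\ -XZe+\tau y e\end{bmatrix}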
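One centered Newton direction for the LP case can be computed by assembling this block system directly. This is a minimal sketch, assuming a primal and dual feasible iterate (so the first two right-hand-side blocks are zero); the function name `centered_newton_direction` and the sample data are hypothetical.

```python
import numpy as np

def centered_newton_direction(A, x, z, tau):
    """Solve the centered Newton system for (dx, dpi, dz) at a feasible iterate."""
    m, n = A.shape
    y = x @ z / n  # duality measure
    # Block matrix [[0, A^T, I], [A, 0, 0], [Z, 0, X]]
    K = np.block([
        [np.zeros((n, n)), A.T, np.eye(n)],
        [A, np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(z), np.zeros((n, m)), np.diag(x)],
    ])
    # Right-hand side [0, 0, -XZe + tau*y*e]
    rhs = np.concatenate([np.zeros(n), np.zeros(m), -x * z + tau * y])
    d = np.linalg.solve(K, rhs)
    return d[:n], d[n:n + m], d[n + m:]

# Illustrative data (assumed): one equality constraint, three variables
A = np.array([[1.0, 1.0, 1.0]])
x = np.array([0.2, 0.3, 0.5])
z = np.array([1.0, 0.5, 2.0])
dx, dpi, dz = centered_newton_direction(A, x, z, tau=0.5)
print("dx:", dx, "dpi:", dpi, "dz:", dz)
```

Because the first two right-hand-side blocks are zero, the direction keeps Ax=b and the dual equality intact, and only the complementarity products x_{i}z_{i} are moved toward \tau y.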

Implementing the Primal-Dual Interior Point Method in Python

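Because of rounding errors, the algorithm may never reach the optimal solution exactly, so it stops once all three of the following hold: \left \| Ax-b \right \|\leq \epsilon , \left \| -Qx+A^{T}\pi +z-c \right \|\leq \epsilon and x^{T}z\leq \epsilon , with tolerance \epsilon =10^{-8}.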
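The stopping test can be isolated into a small helper. The sketch below mirrors the three quantities the function in this article computes (primal residual, dual residual, and x^{T}z); the helper names `kkt_residuals` and `has_converged` and the sample problem are assumptions made for illustration.

```python
import numpy as np

def kkt_residuals(Q, c, A, b, x, pi, z):
    """Return (primal residual norm, dual residual norm, complementarity x^T z)."""
    primal = np.linalg.norm(A @ x - b)
    dual = np.linalg.norm(-Q @ x + A.T @ pi + z - c)
    gap = x @ z
    return primal, dual, gap

def has_converged(Q, c, A, b, x, pi, z, tol=1e-8):
    """True when all three quantities are within the tolerance."""
    return all(r <= tol for r in kkt_residuals(Q, c, A, b, x, pi, z))

# Illustrative problem (assumed): min 1/2 x^T x - x1 - x2  s.t.  x1 + x2 = 1, x >= 0
Q = np.eye(2)
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# At x = (0.5, 0.5), pi = -0.5, z = 0 all three residuals vanish
print(has_converged(Q, c, A, b, np.array([0.5, 0.5]), np.array([-0.5]), np.zeros(2)))
```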

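The algorithm includes predictor-corrector steps: the predictor step computes a descent direction without regard to the central path (the affine scaling direction), and the corrector step uses a second-order Taylor approximation to pull the direction back toward the central path.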
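Two small ingredients of the predictor-corrector step are the ratio test that keeps x and z non-negative and the centering parameter derived from the predictor step. The following sketch is a vectorized version of the logic used in the function in this article; the helper names `max_step` and `centering_parameter` are hypothetical.

```python
import numpy as np

def max_step(v, dv, eta=1.0):
    """Step length min(1, eta * min_i(-v_i / dv_i)) taken over components with dv_i < 0."""
    neg = dv < 0
    if not np.any(neg):
        return 1.0  # no component decreases, so the full step is allowed
    return min(1.0, eta * np.min(-v[neg] / dv[neg]))

def centering_parameter(x, z, dx_aff, dz_aff):
    """tau = (y_aff / y)^3, where y_aff is the duality measure after the predictor step."""
    n = len(x)
    alpha_aff = min(max_step(x, dx_aff), max_step(z, dz_aff))
    y = x @ z / n
    y_aff = (x + alpha_aff * dx_aff) @ (z + alpha_aff * dz_aff) / n
    return (y_aff / y) ** 3

# Illustrative values (assumed)
x = np.array([1.0, 1.0])
z = np.array([1.0, 1.0])
dx_aff = np.array([-0.5, -0.5])
dz_aff = np.array([-0.5, -0.5])
tau = centering_parameter(x, z, dx_aff, dz_aff)
print("tau:", tau)
```

A predictor step that already reduces the duality measure substantially produces a small \tau , so little centering is applied; a poor predictor step produces a \tau close to 1 and pulls the iterate back toward the central path.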

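The function below only handles the case of minimizing a quadratic objective function; other cases follow a derivation similar to the one above. For the quadratic objective, the top-left block of the Newton matrix becomes -Q and the dual residual becomes -Qx+A^{T}\pi +z-c.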

def primal_dual_interior_point(c,Q,A,b,x,pi,z,eta):
    """ 
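    Description:
    this function uses the primal-dual interior point method
    (with predictor-corrector steps) to solve the convex quadratic program
        min 0.5*x.T@Q@x + c.T@x  subject to  A@x = b, x >= 0
    Input:
    c: coefficient of linear part (1-D array of length n)
    Q: coefficient of quadratic part (n x n array)
    A: coefficient of linear constraint (m x n array)
    b: value of linear constraint (1-D array of length m)
    x, pi, z: 1-D arrays holding the initial (possibly infeasible) point;
              x and z must be strictly positive
    eta: damping parameter for the step length (0 < eta < 1)
    Output:
    Optimal solution of x,pi,z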
    """
    # import library
    import numpy as np
    # set tolerance
    tol=1E-8
    # number of variables
    var_num=len(x)
    # calculate stopping criteria
    c_1=np.linalg.norm(np.dot(A,x)-b)
    c_2=np.linalg.norm(-np.dot(Q,x)+np.dot(A.T,pi)+z-c)
    c_3=np.dot(x.T,z)
    # initialize counter 
    counter=0
    # print the information of initial infeasible solution 
    print("=====Iteration{}=====".format(counter))
    print('x: {}'.format(x))
    print('pi: {}'.format(pi))
    print('z: {}'.format(z))
    print('tau: {}'.format('-'))
    print('primal problem: {}'.format(np.dot(c.T,x)+0.5*np.dot(x.T,Q).dot(x)))
    print('dual problem: {}'.format(np.dot(b.T,pi)-0.5*np.dot(x.T,Q).dot(x)))
    print('Residual of primal: {}'.format(c_1))
    print('Residual of dual: {}'.format(c_2))
    print('x.T dot z: {}'.format(c_3))
    # check stopping criteria 
    while (c_1 > tol) | (c_2>tol) | (c_3>tol):
        # calculate residuals for the infeasible solution
        r_p=np.dot(A,x)-b
        r_d=-np.dot(Q,x)+np.dot(A.T,pi)+z-c
        XZe=np.dot(np.diag(x),np.diag(z).dot(np.ones((var_num))))
        # stack the coefficient matrix
        coef_1=np.vstack((-Q,A,np.diag(z)))
        coef_2=np.vstack((A.T,np.zeros((coef_1.shape[0]-var_num,A.shape[0]))))
        coef_3=np.vstack((np.identity(var_num),np.zeros((A.shape[0],var_num)),np.diag(x)))
        coef=np.hstack((coef_1,coef_2,coef_3))
        # stack the residuals
        r=np.vstack((-r_d.reshape(-1,1),-r_p.reshape(-1,1),-XZe.reshape(-1,1)))
        # solve for affine scaling directions
        d_aff=np.linalg.solve(coef,r)
        # get direction for each part
        d_x_aff=d_aff[:var_num]
        d_pi_aff=d_aff[var_num:coef.shape[0]-var_num]
        d_z_aff=d_aff[-var_num:]
        # find step length of the affine scaling direction
        # create a list to store possible selections
        selections=[1]
        # iterate through possible selections and append result to the list
        for i in range(var_num):
            if d_x_aff[i] < 0:
                selections.append((-x[i]/d_x_aff[i])[0])
            if d_z_aff[i] < 0:
                selections.append((-z[i]/d_z_aff[i])[0])
        # find the minimum value in possible selections 
        selections=np.array(selections)
        alpha_aff=np.min(selections)
        # calculate the duality measure 
        y=np.dot(x.T,z)/var_num
        y_aff=np.dot((x+alpha_aff*d_x_aff.reshape(-1)).T,(z+alpha_aff*d_z_aff.reshape(-1)))/var_num
        # calculate the centering parameter 
        tau=(y_aff/y)**3
        # stack the adjusted residuals (corrector term: -XZe - dX_aff*dZ_aff*e + tau*y*e)
        r_l=-XZe.reshape(-1,1)-np.diag(d_x_aff.reshape(-1)).dot(np.diag(d_z_aff.reshape(-1))).dot(np.ones((var_num,1)))+tau*y*np.ones((var_num,1))
        r_adjust=np.vstack((-r_d.reshape(-1,1),-r_p.reshape(-1,1),r_l))
        # solve for search direction
        d=np.linalg.solve(coef,r_adjust)
        # get direction for each part
        d_x=d[:var_num]
        d_pi=d[var_num:coef.shape[0]-var_num]
        d_z=d[-var_num:]
        # create a list to store possible selections
        selections=[1]
        # iterate through possible selections and append result to the list
        for i in range(var_num):
            if d_x[i] < 0:
                selections.append((-eta*x[i]/d_x[i])[0])
            if d_z[i] < 0:
                selections.append((-eta*z[i]/d_z[i])[0])
        # find the minimum value in possible selections 
        selections=np.array(selections)
        alpha=np.min(selections)
        # calculate new iterates
        x=x+alpha*d_x.reshape(-1) 
        pi=pi+alpha*d_pi.reshape(-1)
        z=z+alpha*d_z.reshape(-1)
        # calculate stopping criteria
        c_1=np.linalg.norm(np.dot(A,x)-b)
        c_2=np.linalg.norm(-np.dot(Q,x)+np.dot(A.T,pi)+z-c)
        c_3=np.dot(x.T,z)
        # print the information of each iteration 
        print("=====Iteration{}=====".format(counter+1))
        print('x: {}'.format(x))
        print('pi: {}'.format(pi))
        print('z: {}'.format(z))
        print('tau: {}'.format(tau))
        print('primal problem: {}'.format(np.dot(c.T,x)+0.5*np.dot(x.T,Q).dot(x)))
        print('dual problem: {}'.format(np.dot(b.T,pi)-0.5*np.dot(x.T,Q).dot(x)))
        print('Residual of primal: {}'.format(c_1))
        print('Residual of dual: {}'.format(c_2))
        print('x.T dot z: {}'.format(c_3))
        # update counter
        counter+=1
    return x,pi,z
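To sanity-check the function, its output can be compared against a reference solution on a problem small enough to solve directly. The sketch below assumes an illustrative problem whose optimum lies strictly inside x\geq 0, so that z=0 and the optimality conditions reduce to one linear system. The problem data, the starting point, and the value eta=0.95 are assumptions; the call to `primal_dual_interior_point` is left as a comment because it requires the function above to be in scope.

```python
import numpy as np

# Illustrative problem (assumed): min x1^2 + x2^2 - 2*x1 - 3*x2  s.t.  x1 + x2 = 1, x >= 0
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -3.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# Reference solution: the bounds are inactive at the optimum of this problem,
# so z = 0 and the conditions reduce to  Q x - A^T pi = -c,  A x = b.
n, m = Q.shape[0], A.shape[0]
kkt = np.block([[Q, -A.T], [A, np.zeros((m, m))]])
sol = np.linalg.solve(kkt, np.concatenate([-c, b]))
x_ref, pi_ref = sol[:n], sol[n:]
print("reference x:", x_ref, "reference pi:", pi_ref)

# A strictly positive, infeasible starting point, passed as 1-D arrays
x0, pi0, z0 = np.ones(n), np.zeros(m), np.ones(n)

# With primal_dual_interior_point in scope, the comparison would read:
# x, pi, z = primal_dual_interior_point(c, Q, A, b, x0, pi0, z0, eta=0.95)
# print(np.allclose(x, x_ref, atol=1e-6), np.allclose(pi, pi_ref, atol=1e-6))
```

The inputs are passed as 1-D arrays because the function builds diagonal matrices with np.diag(x) and np.diag(z), which only yields a diagonal matrix when given a 1-D array.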

                                                                                                                  
