Optimization in Python: scipy.optimize and CVXOPT

Notes excerpted while studying Python for quantitative finance and investing, collected here for easier lookup later.

Contents

    • Optimization problems
      • Unconstrained optimization
        • Nelder-Mead simplex method
        • Broyden-Fletcher-Goldfarb-Shanno
        • Newton conjugate gradient method
      • Constrained optimization
      • Solving quadratic programs with CVXOPT
        • Application to portfolio optimization

import numpy as np
import scipy.optimize as opt

Optimization Problems

  • All problems discussed here are convex optimization problems: the objective function is convex and the feasible set of its variables is a convex set (for rigorous definitions, see the Stanford textbook and the Cambridge lecture notes).

Unconstrained Optimization

  • An unconstrained optimization problem is one whose feasible set is simply the domain of the objective function's variables, i.e., there are no external constraints.

  • For example, the problem

    • $\text{Minimize } f(x) = x^2 - 4.8x + 1.2$
    • is an unconstrained optimization problem.
  • Adding a constraint turns it into a constrained problem (a quick numerical sketch of both follows this list):

    • $\text{min } f(x) = x^2 - 4.8x + 1.2$
    • $\text{s.t. } x \geq 0$
  • The Rosenbrock function serves as the running example:
    $f(x) = \sum^{N-1}_{i=1} 100(x_i - x_{i-1}^2)^2 + (1 - x_{i-1})^2$
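A minimal numerical sketch of the two toy problems above, using the scipy.optimize import from the top of the post (minimize falls back to L-BFGS-B when only bounds are given; the starting point 0.0 is an arbitrary choice):

# Unconstrained: minimize x^2 - 4.8x + 1.2 over all real x.
res_free = opt.minimize_scalar(lambda t: t**2 - 4.8*t + 1.2)
print(res_free.x)  # 2.4, the vertex of the parabola

# Constrained: the same objective restricted to x >= 0 via bounds.
res_bound = opt.minimize(lambda t: t[0]**2 - 4.8*t[0] + 1.2,
                         [0.0], bounds=[(0., None)])
print(res_bound.x)  # still ~2.4, since the constraint is not binding here

The Rosenbrock function defined next is the example used throughout the rest of the section.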

# Rosenbrock
def rosen(x):
    """
    The Rosenbrock Function
    """
    return sum(100.0*(x[1:] - x[:-1]**2.)**2.+(1-x[:-1])**2.)

Nelder-Mead Simplex Method

  • The simplex method of operations research is the standard algorithm for solving linear programs; the Nelder-Mead simplex method discussed here is a different algorithm that merely borrows the notion of a simplex.
  • Set the starting point to $x_0 = [0.5, 1.6, 1.1, 0.8, 1.2]$ and run the minimization.
  • xtol is the tolerance on x for declaring convergence (newer SciPy versions name this option xatol for Nelder-Mead).
  • The global minimum lies at $x_i = 1$, where $f(x) = 0$.
x_0 = np.array([0.5,1.6,1.1,0.8,1.2])
opt.minimize(rosen,x_0,method='nelder-mead',options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 386
         Function evaluations: 622

 final_simplex: (array([[1.        , 1.00000003, 1.00000007, 1.00000017, 1.0000003 ],
       [1.        , 0.99999999, 1.00000004, 1.0000001 , 1.00000022],
       [0.99999999, 0.99999999, 1.00000004, 1.00000007, 1.00000011],
       [0.99999998, 0.99999993, 0.99999991, 0.99999982, 0.9999996 ],
       [1.00000008, 1.00000016, 1.00000033, 1.00000067, 1.00000127],
       [1.00000002, 1.00000004, 1.00000015, 1.00000033, 1.0000006 ]]), array([4.14390143e-13, 4.94153342e-13, 5.19631652e-13, 5.35708759e-13,
       9.80969042e-13, 9.81943884e-13]))
           fun: 4.143901431865225e-13
       message: 'Optimization terminated successfully.'
          nfev: 622
           nit: 386
        status: 0
       success: True
             x: array([1.        , 1.00000003, 1.00000007, 1.00000017, 1.0000003 ])
  • The Rosenbrock function is well behaved, so a simple method like this handles it easily.
  • Powell's method can also be used: method='powell'.
opt.minimize(rosen,x_0,method='powell',options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 24
         Function evaluations: 1618

   direc: array([[ 4.55588497e-04,  1.36409332e-03,  2.24683410e-03,
         4.12376042e-03,  7.99776305e-03],
       [-1.91331747e-03, -3.00268845e-03, -6.76968505e-03,
        -1.34778007e-02, -2.66903472e-02],
       [-3.76306326e-02, -2.30543912e-02,  1.05016733e-02,
         3.42182501e-05,  7.33576548e-05],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         1.00000000e+00,  0.00000000e+00],
       [ 4.02021587e-06,  1.15777807e-05,  2.01895943e-05,
         5.10192097e-05,  1.09407425e-04]])
     fun: 2.051360957724131e-21
 message: 'Optimization terminated successfully.'
    nfev: 1618
     nit: 24
  status: 0
 success: True
       x: array([1., 1., 1., 1., 1.])
  • Neither of these methods uses the function's gradient, so on even moderately complex problems their convergence is slow (compare the 622 and 1618 function evaluations above). The next methods exploit gradient information in the search.

Broyden-Fletcher-Goldfarb-Shanno

  • Broyden-Fletcher-Goldfarb-Shanno (BFGS) is a quasi-Newton method.

  • First, derive the gradient of the Rosenbrock function:

    $\frac{\partial f}{\partial x_j} = \sum^{N-1}_{i=1} 200(x_i - x_{i-1}^2)(\delta_{i,j} - 2x_{i-1}\delta_{i-1,j}) - 2(1 - x_{i-1})\delta_{i-1,j} = 200(x_j - x_{j-1}^2) - 400x_j(x_{j+1} - x_j^2) - 2(1 - x_j)$

  • where $\delta_{i,j}$ is the Kronecker delta: $\delta_{i,j} = 1$ if $i = j$, and $0$ otherwise.
  • Boundary cases:
    • $\frac{\partial f}{\partial x_0} = -400x_0(x_1 - x_0^2) - 2(1 - x_0)$
    • $\frac{\partial f}{\partial x_{N-1}} = 200(x_{N-1} - x_{N-2}^2)$
def rosen_der(x):
    """Gradient of the Rosenbrock function."""
    xj = x[1:-1]     # interior components x_1 .. x_{N-2}
    xj_m1 = x[:-2]   # x_{j-1}
    xj_p1 = x[2:]    # x_{j+1}
    der = np.zeros_like(x)
    der[1:-1] = 200*(xj - xj_m1**2) - 400*xj*(xj_p1 - xj**2) - 2*(1 - xj)
    der[0] = -400*x[0]*(x[1] - x[0]**2) - 2*(1 - x[0])  # left boundary
    der[-1] = 200*(x[-1] - x[-2]**2)                    # right boundary
    return der
opt.minimize(rosen,x_0,method='BFGS',jac=rosen_der,options={'disp':True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 39
         Function evaluations: 47
         Gradient evaluations: 47

      fun: 1.569191726013783e-14
 hess_inv: array([[0.00742883, 0.01251316, 0.02376685, 0.04697638, 0.09387584],
       [0.01251316, 0.02505532, 0.04784533, 0.094432  , 0.18862433],
       [0.02376685, 0.04784533, 0.09594869, 0.18938093, 0.37814437],
       [0.04697638, 0.094432  , 0.18938093, 0.37864606, 0.7559884 ],
       [0.09387584, 0.18862433, 0.37814437, 0.7559884 , 1.51454413]])
      jac: array([-3.60424798e-06,  2.74743159e-06, -1.94696995e-07,  2.78416205e-06,
       -1.40985001e-06])
  message: 'Optimization terminated successfully.'
     nfev: 47
      nit: 39
     njev: 47
   status: 0
  success: True
        x: array([1.        , 1.00000001, 1.00000002, 1.00000004, 1.00000007])
  • The gradient information is supplied to minimize through the jac argument.
  • The iteration count drops considerably compared with the derivative-free methods; a gradient sanity check follows this list.
    • Iterations: number of iterations
    • Function evaluations: number of objective-function calls
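Before trusting a hand-coded gradient, it is worth comparing it against a finite-difference approximation; scipy.optimize provides check_grad for exactly this (what counts as "small" below is a judgment call):

err = opt.check_grad(rosen, rosen_der, x_0)  # 2-norm of (analytic - numerical) gradient
print(err)  # on the order of 1e-6 or smaller if rosen_der is correct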

Newton Conjugate Gradient Method

  • Newton-Conjugate-Gradient algorithm (method='Newton-CG')
  • A Newton-type (truncated Newton) method.
  • Among the fastest-converging methods here; the drawback is that it requires the Hessian (the matrix of second derivatives).
  • Rough idea: take the second-order Taylor approximation of the objective, and use conjugate-gradient iterations to approximately apply the inverse of the Hessian.
  • The general entry of the Hessian of the Rosenbrock function:
    $H_{i,j} = \frac{\partial^2 f}{\partial x_i \partial x_j} = 200(\delta_{i,j} - 2x_{i-1}\delta_{i-1,j}) - 400x_i(\delta_{i+1,j} - 2x_i\delta_{i,j}) - 400\delta_{i,j}(x_{i+1} - x_i^2) + 2\delta_{i,j}$

    which simplifies to

    $H_{i,j} = (202 + 1200x_i^2 - 400x_{i+1})\delta_{i,j} - 400x_i\delta_{i+1,j} - 400x_{i-1}\delta_{i-1,j}$

  • Boundary cases:
    • $\frac{\partial^2 f}{\partial x_0^2} = 1200x_0^2 - 400x_1 + 2$
    • $\frac{\partial^2 f}{\partial x_0 \partial x_1} = -400x_0$
    • $\frac{\partial^2 f}{\partial x_{N-1} \partial x_{N-2}} = -400x_{N-2}$
    • $\frac{\partial^2 f}{\partial x_{N-1}^2} = 200$
def rosen_hess(x):
    """Hessian of the Rosenbrock function."""
    # Off-diagonal bands: H[i, i+1] = H[i+1, i] = -400 * x[i]
    H = np.diag(-400*x[:-1], 1) + np.diag(-400*x[:-1], -1)
    diagonal = np.zeros_like(x)
    diagonal[0] = 1200*x[0]**2 - 400*x[1] + 2
    diagonal[-1] = 200
    diagonal[1:-1] = 202 + 1200*x[1:-1]**2 - 400*x[2:]
    H = H + np.diag(diagonal)
    return H
opt.minimize(rosen,x_0,method='Newton-CG',jac=rosen_der,hess=rosen_hess,options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 20
         Function evaluations: 22
         Gradient evaluations: 41
         Hessian evaluations: 20
         
     fun: 1.47606641102778e-19
     jac: array([-3.62847530e-11,  2.68148992e-09,  1.16637362e-08,  4.81693414e-08,
       -2.76999090e-08])
 message: 'Optimization terminated successfully.'
    nfev: 22
    nhev: 20
     nit: 20
    njev: 41
  status: 0
 success: True
       x: array([1., 1., 1., 1., 1.])
  • For large-scale problems the full Hessian can become enormous, yet Newton-CG only ever uses the product of the Hessian with a vector.
  • Supplying a function that computes this product $H(x)\,p$ for an arbitrary vector $p$ (the hessp argument) therefore avoids the cost of storing the full matrix.
def rosen_hessp(x, p):
    """Product of the Rosenbrock Hessian at x with an arbitrary vector p."""
    Hp = np.zeros_like(x)
    Hp[0] = (1200*x[0]**2 - 400*x[1] + 2)*p[0] - 400*x[0]*p[1]
    Hp[1:-1] = -400*x[:-2]*p[:-2] \
               + (202 + 1200*x[1:-1]**2 - 400*x[2:])*p[1:-1] \
               - 400*x[1:-1]*p[2:]
    Hp[-1] = -400*x[-2]*p[-2] + 200*p[-1]
    return Hp
    
opt.minimize(rosen,x_0,method='Newton-CG',jac=rosen_der,hessp=rosen_hessp,options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 20
         Function evaluations: 22
         Gradient evaluations: 41
         Hessian evaluations: 58

     fun: 1.47606641102778e-19
     jac: array([-3.62847530e-11,  2.68148992e-09,  1.16637362e-08,  4.81693414e-08,
       -2.76999090e-08])
 message: 'Optimization terminated successfully.'
    nfev: 22
    nhev: 58
     nit: 20
    njev: 41
  status: 0
 success: True
       x: array([1., 1., 1., 1., 1.])
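As a quick consistency check (a minimal sketch using the functions defined above), multiplying the explicit Hessian by a vector should reproduce rosen_hessp; the all-ones test vector is an arbitrary choice:

p_test = np.ones_like(x_0)
print(np.allclose(rosen_hess(x_0) @ p_test, rosen_hessp(x_0, p_test)))  # True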

Constrained Optimization

  • The standard form is:
    $\begin{aligned}\text{min } &f(x)\\ \text{s.t. } &g_i(x) \leq 0, \ i = 1, 2, \cdots, m\\ &Ax = b\end{aligned}$

  • where $g_1, \cdots, g_m: \mathbb{R}^n \to \mathbb{R}$ are twice continuously differentiable convex functions on $\mathbb{R}^n$, and $A$ is a $p \times n$ matrix with $\text{rank}(A) = p < n$.

  • Example:
    $\begin{aligned}\text{Minimize } &f(x, y) = 2xy + 2x - x^2 - 2y^2\\ \text{subject to } &x^3 - y = 0\\ &y - 1 \geq 0\end{aligned}$

def func(x,sign=1.):
    """
    objective function
    """
    return sign*(2*x[0]*x[1]+2*x[0] -x[0]**2-2*x[1]**2)

def func_deriv(x,sign=1.):
    """
    derivative of objective function
    """
    dfdx0 = sign * (-2*x[0]+2*x[1]+2)
    dfdx1 = sign * (-4*x[1]+2*x[0])
    return np.array([dfdx0,dfdx1])
  • The sign argument selects between minimization and maximization: passing sign = -1 through args below negates the objective, so minimizing -f maximizes f. Next, define the constraints:
cons = ({'type':'eq', 'fun':lambda x:np.array([x[0]**3. -x[1]]),\
         'jac':lambda x: np.array([3.*x[0]**2.,-1.])},\
        {'type':'ineq','fun':lambda x: np.array([x[1]-1]),\
        'jac':lambda x: np.array([0,1])})
  • Finally, solve with SLSQP (Sequential Least SQuares Programming):
opt.minimize(func,[-1.,1.],\
             args=(-1.,),\
             jac=func_deriv, \
             constraints=cons,method='SLSQP',\
             options={'disp':True})
Optimization terminated successfully.    (Exit mode 0)
            Current function value: -1.0000001831052137
            Iterations: 9
            Function evaluations: 14
            Gradient evaluations: 9

     fun: -1.0000001831052137
     jac: array([-1.99999982,  1.99999982])
 message: 'Optimization terminated successfully.'
    nfev: 14
     nit: 9
    njev: 9
  status: 0
 success: True
       x: array([1.00000009, 1.        ])
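Since args=(-1.,) flips the sign of the objective, the reported minimum of -1.0 means the maximum of f is 1, attained at (x, y) = (1, 1). A small sketch re-running the solver and checking that the optimum satisfies both constraints:

res = opt.minimize(func, [-1., 1.], args=(-1.,), jac=func_deriv,
                   constraints=cons, method='SLSQP')
print(res.x[0]**3 - res.x[1])  # equality constraint x^3 - y = 0: ~0
print(res.x[1] - 1)            # inequality constraint y - 1 >= 0: ~0 (binding here)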
print('Result of unconstrained optimization for comparison')
opt.minimize(func,[-1.,1.],\
             args=(-1.,),\
             jac=func_deriv, \
             method='SLSQP',\
             options={'disp':True})
Result of unconstrained optimization for comparison
Optimization terminated successfully.    (Exit mode 0)
            Current function value: -2.0
            Iterations: 4
            Function evaluations: 5
            Gradient evaluations: 4

     fun: -2.0
     jac: array([-0., -0.])
 message: 'Optimization terminated successfully.'
    nfev: 5
     nit: 4
    njev: 4
  status: 0
 success: True
       x: array([2., 1.])

Solving Quadratic Programs with CVXOPT

  • Besides scipy.optimize.minimize, Python has packages dedicated to optimization, such as CVXOPT.
  • The standard form of a quadratic program:

$\begin{aligned}\text{min } &\frac{1}{2} x^T P x + q^T x\\ \text{s.t. } &Gx \leq h, \ Ax = b\end{aligned}$

  • Example:
    $\begin{aligned}\text{min } &2x^2 + xy + y^2 + x + y\\ \text{s.t. } &x \geq 0, \ y \geq 0\\ &x + y = 1\end{aligned}$

  • In standard form (the remaining matrices can be read off the code below):

$P = \begin{bmatrix} 4 & 1 \\ 1 & 2 \end{bmatrix}, \quad q = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad G = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad h = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad A = \begin{bmatrix} 1 & 1 \end{bmatrix}, \quad b = 1$

from cvxopt import solvers, matrix
p = matrix([[4., 1.], [1., 2.]])  # P; cvxopt.matrix is column-major, but P is symmetric here
q = matrix([1., 1.])
G = matrix([[-1., 0.], [0., -1.]])
h = matrix([0., 0.])  # cvxopt.matrix distinguishes int from double, hence the decimal points
A = matrix([1., 1.], (1, 2))  # A must be a 1x2 matrix
b = matrix(1.)

sol = solvers.qp(p, q, G, h, A, b)
print(sol['x'])
     pcost       dcost       gap    pres   dres
 0:  1.8889e+00  7.7778e-01  1e+00  3e-16  2e+00
 1:  1.8769e+00  1.8320e+00  4e-02  2e-16  6e-02
 2:  1.8750e+00  1.8739e+00  1e-03  2e-16  5e-04
 3:  1.8750e+00  1.8750e+00  1e-05  6e-17  5e-06
 4:  1.8750e+00  1.8750e+00  1e-07  2e-16  5e-08
Optimal solution found.
[ 2.50e-01]
[ 7.50e-01]
  • The solution is $x = 0.25, y = 0.75$; substituting back gives $2(0.25)^2 + 0.25 \cdot 0.75 + 0.75^2 + 0.25 + 0.75 = 1.875$, matching the reported primal objective.
  • The solver result carries additional fields:
sol
{'x': <2x1 matrix, tc='d'>,
 'y': <1x1 matrix, tc='d'>,
 's': <2x1 matrix, tc='d'>,
 'z': <2x1 matrix, tc='d'>,
 'status': 'optimal',
 'gap': 1.0527028380515569e-07,
 'relative gap': 5.6144154514915067e-08,
 'primal objective': 1.8750000000000182,
 'dual objective': 1.8749998947297344,
 'primal infeasibility': 2.482534153247273e-16,
 'dual infeasibility': 5.3147593337403756e-08,
 'primal slack': 0.2500000952702474,
 'dual slack': 1.0000000000000035e-08,
 'iterations': 4}
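The same QP can be cross-checked against scipy's SLSQP (a sketch; the starting point [0.5, 0.5] is an arbitrary feasible guess):

qp_obj = lambda v: 2*v[0]**2 + v[0]*v[1] + v[1]**2 + v[0] + v[1]
qp_cons = ({'type': 'eq', 'fun': lambda v: v[0] + v[1] - 1},)
res_qp = opt.minimize(qp_obj, [0.5, 0.5], bounds=[(0., None), (0., None)],
                      constraints=qp_cons, method='SLSQP')
print(res_qp.x)  # ~[0.25, 0.75], agreeing with CVXOPT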

Application to Portfolio Optimization

  • Given historical return data for three assets:
    • s1, s2, b
  • The setup below finds the minimum-variance portfolio whose expected return is at least 13% (see the constraint comments in the code).
s1 = [0.,.04,.13,.19,-.15,-.27,.37,.24,-.07,.07,.19,.33,-.05,.22,.23,.06,.32,.19,.05,.17]
s2 = [.07,.13,.14,.43,.67,.64,0.,-.22,.18,.31,.59,.99,-.25,.04,-.11,-.15,-.12,.16,.22,-.02]
b = [.06,.07,.05,.04,.07,.08,.06,.04,.05,.07,.1,.11,.15,.11,.09,.1,.08,.06,.05,.07]
x = np.array([s1, s2, b])  # asset return data, one row per asset
p = matrix(np.cov(x))      # P = covariance matrix, so the objective is the portfolio variance
q = matrix([0., 0., 0.])
A = matrix([[1.], [1.], [1.]])  # equality constraint: the weights sum to 1
b = matrix([1.])
# Gx <= h encodes, row by row: -w_i <= 0 (no short selling), w_i <= 1, and
# -(mean returns) . w <= -0.13, i.e. expected portfolio return >= 13%.
# (The mean historical returns are 0.113, 0.185 and 0.0755 for s1, s2, b.)
G = matrix([[-1., 0., 0., 1., 0., 0., -.113],
            [0., -1., 0., 0., 1., 0., -.185],
            [0., 0., -1., 0., 0., 1., -.0755]])
h = matrix([0., 0., 0., 1., 1., 1., -.13])
sol = solvers.qp(p,q,G,h,A,b)
     pcost       dcost       gap    pres   dres
 0:  6.1729e-03 -3.3446e+00  1e+01  2e+00  4e-01
 1:  7.4918e-03 -1.5175e+00  2e+00  2e-02  5e-03
 2:  9.0497e-03 -7.7600e-02  9e-02  1e-03  3e-04
 3:  8.4511e-03  2.0141e-03  6e-03  7e-05  2e-05
 4:  7.6044e-03  7.1661e-03  4e-04  3e-07  8e-08
 5:  7.5351e-03  7.5293e-03  6e-06  3e-09  8e-10
 6:  7.5340e-03  7.5340e-03  6e-08  3e-11  8e-12
Optimal solution found.
print(sol['x'])
[ 5.06e-01]
[ 3.24e-01]
[ 1.69e-01]
  • So the optimal weights are roughly 50.6% in asset 1, 32.4% in asset 2, and 16.9% in asset 3.
sol
{'x': <3x1 matrix, tc='d'>,
 'y': <1x1 matrix, tc='d'>,
 's': <7x1 matrix, tc='d'>,
 'z': <7x1 matrix, tc='d'>,
 'status': 'optimal',
 'gap': 5.750887504859599e-08,
 'relative gap': 7.633272007468671e-06,
 'primal objective': 0.007534031792304567,
 'dual objective': 0.007533974289443271,
 'primal infeasibility': 3.318376136551126e-11,
 'dual infeasibility': 8.328707321687642e-12,
 'primal slack': 3.811267544311688e-08,
 'dual slack': 1.88724298640649e-09,
 'iterations': 6}
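As a final sanity check (a sketch reusing the arrays defined above), the weights can be plugged back in to confirm that the return constraint binds at 13%:

w = np.array(sol['x']).flatten()   # optimal portfolio weights
mu = x.mean(axis=1)                # mean historical returns: 0.113, 0.185, 0.0755
print(w @ mu)                      # expected return, ~0.13 (the constraint binds)
print(np.sqrt(w @ np.cov(x) @ w))  # portfolio standard deviation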
