For nonconvex optimization problems that are hard to transform into convex or quasiconvex problems, a practical first goal is to find a stationary point of the problem.
Let $f: \mathcal{C} \rightarrow \mathbb{R}$ be a continuous nonconvex function, possibly nondifferentiable, where $\mathcal{C} \subseteq \mathbb{R}^{n}$ is a closed convex set. Consider the minimization problem
$$\min _{\mathbf{x} \in \mathcal{C}} f(\mathbf{x})$$
The directional derivative of $f$ at a point $\mathbf{x}$ in the direction $\mathbf{v}$ is defined as
$$\begin{aligned} f^{\prime}(\mathbf{x} ; \mathbf{v}) & \triangleq \liminf _{\lambda \downarrow 0} \frac{f(\mathbf{x}+\lambda \mathbf{v})-f(\mathbf{x})}{\lambda} \\ &=\lim _{\lambda \rightarrow 0^{+}} \inf _{0<\mu \leq \lambda} \frac{f(\mathbf{x}+\mu \mathbf{v})-f(\mathbf{x})}{\mu} \end{aligned}$$
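A point $\mathbf{x} \in \mathcal{C}$ is called a stationary point if $f^{\prime}(\mathbf{x} ; \mathbf{v}) \geq 0$ for all $\mathbf{v}$ such that $\mathbf{x}+\mathbf{v} \in \mathcal{C}$. When $f$ is differentiable, $f^{\prime}(\mathbf{x} ; \mathbf{v})=\nabla f(\mathbf{x})^{T} \mathbf{v}$, so in the unconstrained case $\mathcal{C}=\mathbb{R}^{n}$ the condition reduces to $\nabla f(\mathbf{x})=\mathbf{0}$.
In general, a stationary point can be a local minimum, a local maximum, or a saddle point. When $f$ is convex, every stationary point is a global optimal solution of the convex problem.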
Before introducing the BSUM method, we use the definition of a stationary point to introduce the regularity of a function. Let $f: \mathbb{R}^{n} \rightarrow \mathbb{R}$ and $\mathbf{x}=\left(\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}\right) \in \operatorname{dom} f$, where $\mathbf{x}_{i} \in \mathbb{R}^{n_{i}}$ and $n_{1}+\cdots+n_{m}=n$. Define
$$\left\{\begin{array}{l} \mathbf{v} \triangleq\left(\mathbf{v}_{1}, \ldots, \mathbf{v}_{m}\right) \in \mathbb{R}^{n_{1}} \times \cdots \times \mathbb{R}^{n_{m}} \\ \boldsymbol{v}_{i} \triangleq\left(\mathbf{0}_{n_{1}}, \ldots, \mathbf{0}_{n_{i-1}}, \mathbf{v}_{i}, \mathbf{0}_{n_{i+1}}, \ldots, \mathbf{0}_{n_{m}}\right), \quad \mathbf{v}_{i} \in \mathbb{R}^{n_{i}} \end{array}\right.$$
so that $\boldsymbol{v}_{i}$ is the zero-padded direction that moves only the $i$-th block. The function $f$ is said to be regular at $\mathbf{x}$ if $f^{\prime}(\mathbf{x} ; \mathbf{v}) \geq 0$ holds for every $\mathbf{v}$ such that $f^{\prime}\left(\mathbf{x} ; \boldsymbol{v}_{i}\right) \geq 0$ for all $i=1, \ldots, m$. If $f$ is differentiable at $\mathbf{x}$, then
$$\begin{aligned} f^{\prime}(\mathbf{x} ; \mathbf{v}) &=\nabla f(\mathbf{x})^{T} \mathbf{v}=\nabla f(\mathbf{x})^{T}\left\{\sum_{i=1}^{m} \boldsymbol{v}_{i}\right\} \\ &=\sum_{i=1}^{m} f^{\prime}\left(\mathbf{x} ; \boldsymbol{v}_{i}\right) \geq 0 \end{aligned}$$
whenever $f^{\prime}\left(\mathbf{x} ; \boldsymbol{v}_{i}\right) \geq 0$ for all $i$. Hence every point $\mathbf{x}$ at which $f$ is differentiable is a regular point of $f$.
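To see why regularity is a genuine restriction for nondifferentiable functions, the following minimal Python sketch evaluates one-sided difference quotients at the origin for a hypothetical function $f(x_1, x_2) = |x_1 - x_2| - \tfrac{1}{2}(|x_1| + |x_2|)$. This function is not from the original text; it is chosen only for illustration, with $m = 2$ scalar blocks. The helper name `dir_derivative` is likewise an assumption. A single small step only approximates the $\liminf$ in general, but for this positively homogeneous function the quotient is exact up to rounding.

```python
def f(x1, x2):
    # Hypothetical nonsmooth example (not from the source text).
    return abs(x1 - x2) - 0.5 * (abs(x1) + abs(x2))

def dir_derivative(func, x, v, lam=1e-6):
    """One-sided difference quotient approximating f'(x; v) with a small step lam > 0."""
    return (func(x[0] + lam * v[0], x[1] + lam * v[1]) - func(x[0], x[1])) / lam

origin = (0.0, 0.0)

# Block-wise (coordinate) directions: only one block moves at a time.
coordinate_dirs = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
coordinate_values = [dir_derivative(f, origin, v) for v in coordinate_dirs]

# A joint direction that moves both blocks at once.
diagonal_value = dir_derivative(f, origin, (1.0, 1.0))

print(coordinate_values, diagonal_value)
```

All block-wise quotients are nonnegative while the diagonal one is negative, so this function is not regular at the origin: no single-block move decreases it, yet the origin is not a stationary point.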
Suppose $\mathcal{C}=\mathcal{C}_{1} \times \cdots \times \mathcal{C}_{m}$, where each $\mathcal{C}_{i} \subseteq \mathbb{R}^{n_{i}}$, $i=1, \ldots, m$, is a closed convex set and $\sum_{i=1}^{m} n_{i}=n$. By exploiting this block structure, BSUM updates the $m$ variable blocks iteratively in a round-robin fashion, and thereby efficiently obtains a stationary point of the problem $\min _{\mathbf{x} \in \mathcal{C}} f(\mathbf{x})$.
Specifically, given a feasible point $\overline{\mathbf{x}}=\left(\overline{\mathbf{x}}_{1}, \ldots, \overline{\mathbf{x}}_{m}\right) \in \mathcal{C}$ from iteration $(r-1)$, the $i$-th block $\overline{\mathbf{x}}_{i}$ is updated at iteration $r$, with the other blocks left unchanged, by
$$\overline{\mathbf{x}}_{i}=\arg \min _{\mathbf{x}_{i} \in \mathcal{C}_{i}} \bar{f}_{i}\left(\mathbf{x}_{i} \mid \overline{\mathbf{x}}\right)$$
where $i=((r-1) \bmod m)+1$, and $\bar{f}_{i}\left(\mathbf{x}_{i} \mid \overline{\mathbf{x}}\right)$ is an upper-bound approximation of $f(\mathbf{x})$ with respect to the $i$-th block at the reference point $\mathbf{x}=\overline{\mathbf{x}} \in \mathcal{C}$. Figure 1 illustrates one BSUM iteration for $m=n=1$.
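The round-robin update can be sketched in a few lines of Python. This is only an illustrative skeleton under stated assumptions: each block is a scalar ($n_i = 1$), and the caller supplies one hypothetical function per block, `block_updates[i]`, that returns the minimizer of the surrogate $\bar{f}_i(\cdot \mid \overline{\mathbf{x}})$ over $\mathcal{C}_i$. The names `bsum` and `block_updates` are not from the original text.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: given the current point, return the new value of one block.
BlockUpdate = Callable[[List[float]], float]

def bsum(x0: Sequence[float], block_updates: Sequence[BlockUpdate], num_iters: int) -> List[float]:
    """Cyclic block updates: at iteration r only block i = ((r - 1) mod m) + 1 changes."""
    x = list(x0)
    m = len(block_updates)
    for r in range(1, num_iters + 1):
        i = (r - 1) % m  # zero-based index of block ((r - 1) mod m) + 1
        x[i] = block_updates[i](x)  # minimizer of the i-th surrogate at reference point x
    return x

# Toy usage: f(x1, x2) = x1^2 + x2^2 + x1*x2 - 3*x1 - 3*x2 on R^2, taking the
# surrogate equal to f itself (exact block minimization, a special case).
updates = [
    lambda x: (3.0 - x[1]) / 2.0,  # argmin over x1 with x2 fixed
    lambda x: (3.0 - x[0]) / 2.0,  # argmin over x2 with x1 fixed
]
solution = bsum([0.0, 0.0], updates, num_iters=60)
print(solution)
```

In the toy usage the iterates approach $(1, 1)$, the unique minimizer of this convex quadratic. For a nonconvex $f$, the per-block functions would instead minimize a surrogate that upper-bounds $f$.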
Suppose the two premises (4.161a) and (4.161b) hold.
Then, for any $\mathbf{x}_{i} \in \mathcal{C}_{i}$, $\overline{\mathbf{x}} \in \mathcal{C}$, and any $\mathbf{v}_{i}$ with $\mathbf{x}_{i}+\mathbf{v}_{i} \in \mathcal{C}_{i}$, $\forall i$, as long as conditions (4.162a) through (4.162e)
hold, the sequence of iterates $\overline{\mathbf{x}}$ generated by the BSUM algorithm converges to a stationary point of $\min _{\mathbf{x} \in \mathcal{C}} f(\mathbf{x})$.
When $f$ and $\bar{f}_{i}$ are differentiable, the gap $\bar{f}_{i}-f$ is nonnegative and vanishes at $\mathbf{x}_{i}=\overline{\mathbf{x}}_{i}$, so $\overline{\mathbf{x}}_{i}$ minimizes the gap and the conditions simplify as follows:
$$\begin{aligned} & \overline{\mathbf{x}}_{i}=\arg \min \left\{\bar{f}_{i}\left(\mathbf{x}_{i} \mid \overline{\mathbf{x}}\right)-f\left(\overline{\mathbf{x}}_{1}, \ldots, \overline{\mathbf{x}}_{i-1}, \mathbf{x}_{i}, \overline{\mathbf{x}}_{i+1}, \ldots, \overline{\mathbf{x}}_{m}\right)\right\} \\ \Rightarrow\ & \nabla_{\mathbf{x}_{i}}\left(\bar{f}_{i}\left(\overline{\mathbf{x}}_{i} \mid \overline{\mathbf{x}}\right)-f(\overline{\mathbf{x}})\right)=\mathbf{0} \\ \Rightarrow\ & \nabla \bar{f}_{i}\left(\overline{\mathbf{x}}_{i} \mid \overline{\mathbf{x}}\right)=\nabla_{\mathbf{x}_{i}} f(\overline{\mathbf{x}}) \\ \Rightarrow\ & \nabla \bar{f}_{i}\left(\overline{\mathbf{x}}_{i} \mid \overline{\mathbf{x}}\right)^{T} \mathbf{v}_{i}=\nabla_{\mathbf{x}_{i}} f(\overline{\mathbf{x}})^{T} \mathbf{v}_{i}=\nabla f(\overline{\mathbf{x}})^{T} \boldsymbol{v}_{i} \\ \Rightarrow\ & \bar{f}_{i}^{\prime}\left(\overline{\mathbf{x}}_{i} ; \mathbf{v}_{i} \mid \overline{\mathbf{x}}\right)=f^{\prime}\left(\overline{\mathbf{x}} ; \boldsymbol{v}_{i}\right) \end{aligned}$$
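In other words, (4.162c) holds automatically in the differentiable case, so the convergence conditions of the BSUM method reduce to (4.162a), (4.162b), (4.162d), and (4.162e).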
The key to finding a stationary point of $\min _{\mathbf{x} \in \mathcal{C}} f(\mathbf{x})$ with the BSUM method is to design or find suitable approximation functions $\bar{f}_{i}\left(\mathbf{x}_{i} \mid \overline{\mathbf{x}}\right)$, $i=1, \ldots, m$, that on the one hand satisfy all the conditions in (4.162), and on the other hand allow the subproblem $\overline{\mathbf{x}}_{i}=\arg \min _{\mathbf{x}_{i} \in \mathcal{C}_{i}} \bar{f}_{i}\left(\mathbf{x}_{i} \mid \overline{\mathbf{x}}\right)$ to be solved efficiently.
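Consider a two-user SISO interference channel in which two single-antenna transmitters communicate with their respective single-antenna receivers at the same time and on the same frequency, so the two transmitter-receiver pairs interfere with each other at the receivers. The signal model of this system can be written as
$$y_{1}=x_{1}+h_{21} x_{2}+n_{1}$$
$$y_{2}=h_{12} x_{1}+x_{2}+n_{2}$$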
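Here $y_{i}$ is the signal received at receiver $i$, $x_{i}$ is the signal sent by transmitter $i$, $h_{k i} \in \mathbb{C}$ is the cross-link channel gain from transmitter $k$ to receiver $i$, and $n_{i} \sim \mathcal{CN}\left(0, \sigma_{i}^{2}\right)$ is the noise at receiver $i$. (Note: the received signal $y_{i}$ has already been normalized by $h_{ii}$, so for simplicity we take $h_{ii}=1$.) Assume the transmitted signal $x_{i}$ is Gaussian coded with zero mean and variance $p_{i}$, and that the desired signal $x_{i}$ is recovered after detection and decoding. Based on the SINR of each $y_{i}$, the achievable rates of the two transmitter-receiver pairs are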
$$\begin{aligned} r_{1}\left(p_{1}, p_{2}\right) &=\log _{2}\left(1+\frac{\mathbb{E}\left\{\left|x_{1}\right|^{2}\right\}}{\mathbb{E}\left\{\left|h_{21} x_{2}+n_{1}\right|^{2}\right\}}\right) \\ &=\log _{2}\left(1+\frac{p_{1}}{\left|h_{21}\right|^{2} p_{2}+\sigma_{1}^{2}}\right) \quad \text{bits/transmission} \end{aligned}$$
$$\begin{aligned} r_{2}\left(p_{1}, p_{2}\right) &=\log _{2}\left(1+\frac{\mathbb{E}\left\{\left|x_{2}\right|^{2}\right\}}{\mathbb{E}\left\{\left|h_{12} x_{1}+n_{2}\right|^{2}\right\}}\right) \\ &=\log _{2}\left(1+\frac{p_{2}}{\left|h_{12}\right|^{2} p_{1}+\sigma_{2}^{2}}\right) \quad \text{bits/transmission} \end{aligned}$$
To maximize the sum rate, consider the following power control problem:
$$\begin{array}{cl}\max _{p_{1}, p_{2}} & r_{1}\left(p_{1}, p_{2}\right)+r_{2}\left(p_{1}, p_{2}\right) \\ \text{s.t.} & 0 \leq p_{1} \leq P_{1} \\ & 0 \leq p_{2} \leq P_{2}\end{array}$$
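where $P_{1}$ and $P_{2}$ are the maximum transmit powers of transmitter 1 and transmitter 2, respectively.
Analysis of this problem: 1. The objective function is neither convex nor concave, so the problem is nonconvex in $(p_{1}, p_{2})$. 2. The feasible set is closed and convex.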
We solve the problem with BSUM. First, rewrite it in the standard minimization form:
$$\begin{array}{rl}\min _{p_{1}, p_{2}} & f\left(p_{1}, p_{2}\right) \triangleq-r_{1}\left(p_{1}, p_{2}\right)-r_{2}\left(p_{1}, p_{2}\right) \\ \text{s.t.} & 0 \leq p_{1} \leq P_{1} \\ & 0 \leq p_{2} \leq P_{2}\end{array}$$
From the second-order condition, $-r_{1}\left(p_{1}, p_{2}\right)$ is convex in $p_{1}$ and concave in $p_{2}$, while $-r_{2}\left(p_{1}, p_{2}\right)$ is convex in $p_{2}$ and concave in $p_{1}$. Replacing each concave part by its first-order approximation gives the desired approximation functions $\bar{f}_{1}\left(p_{1} \mid \bar{p}_{1}, \bar{p}_{2}\right)$ and $\bar{f}_{2}\left(p_{2} \mid \bar{p}_{1}, \bar{p}_{2}\right)$, both of which satisfy the conditions in (4.162):
$$\begin{aligned} \bar{f}_{1}\left(p_{1} \mid \bar{p}_{1}, \bar{p}_{2}\right) & \triangleq-r_{1}\left(p_{1}, \bar{p}_{2}\right)-r_{2}\left(\bar{p}_{1}, \bar{p}_{2}\right)+\left.\left(p_{1}-\bar{p}_{1}\right) \frac{\partial\left\{-r_{2}\left(p_{1}, \bar{p}_{2}\right)\right\}}{\partial p_{1}}\right|_{p_{1}=\bar{p}_{1}} \\ &=-r_{1}\left(p_{1}, \bar{p}_{2}\right)-r_{2}\left(\bar{p}_{1}, \bar{p}_{2}\right)+\frac{\left|h_{12}\right|^{2} \bar{p}_{2}\left(p_{1}-\bar{p}_{1}\right) / \ln 2}{\left(\bar{p}_{2}+\left|h_{12}\right|^{2} \bar{p}_{1}+\sigma_{2}^{2}\right)\left(\left|h_{12}\right|^{2} \bar{p}_{1}+\sigma_{2}^{2}\right)} \\ & \geq f\left(p_{1}, \bar{p}_{2}\right) \end{aligned}$$
and
$$\begin{aligned} \bar{f}_{2}\left(p_{2} \mid \bar{p}_{1}, \bar{p}_{2}\right) & \triangleq-r_{2}\left(\bar{p}_{1}, p_{2}\right)-r_{1}\left(\bar{p}_{1}, \bar{p}_{2}\right)+\left.\left(p_{2}-\bar{p}_{2}\right) \frac{\partial\left\{-r_{1}\left(\bar{p}_{1}, p_{2}\right)\right\}}{\partial p_{2}}\right|_{p_{2}=\bar{p}_{2}} \\ &=-r_{2}\left(\bar{p}_{1}, p_{2}\right)-r_{1}\left(\bar{p}_{1}, \bar{p}_{2}\right)+\frac{\left|h_{21}\right|^{2} \bar{p}_{1}\left(p_{2}-\bar{p}_{2}\right) / \ln 2}{\left(\bar{p}_{1}+\left|h_{21}\right|^{2} \bar{p}_{2}+\sigma_{1}^{2}\right)\left(\left|h_{21}\right|^{2} \bar{p}_{2}+\sigma_{1}^{2}\right)} \\ & \geq f\left(\bar{p}_{1}, p_{2}\right) \end{aligned}$$
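The two defining properties of the surrogate $\bar{f}_{1}$, namely that it upper-bounds $f(p_1, \bar{p}_2)$ and touches it at $p_1 = \bar{p}_1$, can be checked numerically. The following Python sketch does this on a grid. The channel values, the reference point, and the function names are assumptions made for illustration only.

```python
import math

# Assumed illustrative parameters (not from the source text).
H12_SQ, H21_SQ, S1, S2 = 0.5, 0.8, 0.1, 0.1

def r1(p1, p2):
    return math.log2(1.0 + p1 / (H21_SQ * p2 + S1))

def r2(p1, p2):
    return math.log2(1.0 + p2 / (H12_SQ * p1 + S2))

def f(p1, p2):
    return -r1(p1, p2) - r2(p1, p2)

def f1_bar(p1, p1_bar, p2_bar):
    """Surrogate for block p1: keep -r1 (convex in p1), linearize -r2 (concave in p1)."""
    a = H12_SQ * p1_bar + S2  # interference plus noise at receiver 2 at the reference point
    slope = H12_SQ * p2_bar / ((p2_bar + a) * a * math.log(2.0))
    return -r1(p1, p2_bar) - r2(p1_bar, p2_bar) + slope * (p1 - p1_bar)

p1_bar, p2_bar = 0.4, 0.7
grid = [k / 100.0 for k in range(0, 201)]  # p1 in [0, 2]
gaps = [f1_bar(p1, p1_bar, p2_bar) - f(p1, p2_bar) for p1 in grid]
gap_at_reference = f1_bar(p1_bar, p1_bar, p2_bar) - f(p1_bar, p2_bar)

print(min(gaps), gap_at_reference)
```

On this grid the gap stays nonnegative and is zero at the reference point, which is what the inequality $\bar{f}_{1} \geq f$ and the tightness requirement predict. The same check applies to $\bar{f}_{2}$ with the roles of the two users exchanged.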
where $\bar{p}_{1}$ and $\bar{p}_{2}$ are any points satisfying the power constraints. It is not hard to show that both approximation functions meet the convergence requirements of BSUM. Moreover, the two corresponding subproblems
$$\min _{0 \leq p_{1} \leq P_{1}} \bar{f}_{1}\left(p_{1} \mid \bar{p}_{1}, \bar{p}_{2}\right)$$
$$\min _{0 \leq p_{2} \leq P_{2}} \bar{f}_{2}\left(p_{2} \mid \bar{p}_{1}, \bar{p}_{2}\right)$$
are both convex problems with unique solutions (because $\bar{f}_{1}$ and $\bar{f}_{2}$ are strictly convex in $p_{1}$ and $p_{2}$, respectively). From the first-order optimality condition we obtain
$$p_{1}^{\star}=\left\{\begin{array}{ll} g_{1}\left(\bar{p}_{1}, \bar{p}_{2}\right), & \text{if } 0 \leq g_{1}\left(\bar{p}_{1}, \bar{p}_{2}\right) \leq P_{1} \\ P_{1}, & \text{if } g_{1}\left(\bar{p}_{1}, \bar{p}_{2}\right)>P_{1} \\ 0, & \text{if } g_{1}\left(\bar{p}_{1}, \bar{p}_{2}\right)<0 \end{array}\right. \tag{4.165a}$$
$$p_{2}^{\star}=\left\{\begin{array}{ll} g_{2}\left(\bar{p}_{1}, \bar{p}_{2}\right), & \text{if } 0 \leq g_{2}\left(\bar{p}_{1}, \bar{p}_{2}\right) \leq P_{2} \\ P_{2}, & \text{if } g_{2}\left(\bar{p}_{1}, \bar{p}_{2}\right)>P_{2} \\ 0, & \text{if } g_{2}\left(\bar{p}_{1}, \bar{p}_{2}\right)<0 \end{array}\right. \tag{4.165b}$$
where
$$g_{1}\left(\bar{p}_{1}, \bar{p}_{2}\right)=\frac{\left(\bar{p}_{2}+\left|h_{12}\right|^{2} \bar{p}_{1}+\sigma_{2}^{2}\right)\left(\left|h_{12}\right|^{2} \bar{p}_{1}+\sigma_{2}^{2}\right)}{\left|h_{12}\right|^{2} \bar{p}_{2}}-\left(\left|h_{21}\right|^{2} \bar{p}_{2}+\sigma_{1}^{2}\right)$$
$$g_{2}\left(\bar{p}_{1}, \bar{p}_{2}\right)=\frac{\left(\bar{p}_{1}+\left|h_{21}\right|^{2} \bar{p}_{2}+\sigma_{1}^{2}\right)\left(\left|h_{21}\right|^{2} \bar{p}_{2}+\sigma_{1}^{2}\right)}{\left|h_{21}\right|^{2} \bar{p}_{1}}-\left(\left|h_{12}\right|^{2} \bar{p}_{1}+\sigma_{2}^{2}\right)$$
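Putting the closed-form updates together, a BSUM power-control loop might look like the Python sketch below. It is a minimal illustration under stated assumptions, not a reference implementation: the channel values, the starting point at half of the power budget, the iteration count, and all function names are hypothetical. When the denominator of $g_1$ or $g_2$ is zero, the linear term of the surrogate vanishes and the surrogate is decreasing, so the sketch returns infinity and lets the clipping step select the maximum power; this handling is also an assumption of the sketch.

```python
import math

def neg_sum_rate(p1, p2, h12_sq, h21_sq, s1, s2):
    r1 = math.log2(1.0 + p1 / (h21_sq * p2 + s1))
    r2 = math.log2(1.0 + p2 / (h12_sq * p1 + s2))
    return -r1 - r2

def g1(p1b, p2b, h12_sq, h21_sq, s1, s2):
    a = h12_sq * p1b + s2
    denom = h12_sq * p2b
    if denom == 0.0:
        return math.inf  # surrogate is decreasing in p1, so full power is optimal
    return (p2b + a) * a / denom - (h21_sq * p2b + s1)

def g2(p1b, p2b, h12_sq, h21_sq, s1, s2):
    b = h21_sq * p2b + s1
    denom = h21_sq * p1b
    if denom == 0.0:
        return math.inf  # surrogate is decreasing in p2, so full power is optimal
    return (p1b + b) * b / denom - (h12_sq * p1b + s2)

def clip(value, low, high):
    # Projection onto [low, high]; implements the three cases of the closed-form update.
    return min(max(value, low), high)

def bsum_power_control(P1, P2, h12_sq, h21_sq, s1, s2, num_iters=50):
    p1, p2 = P1 / 2.0, P2 / 2.0  # assumed feasible starting point
    history = [neg_sum_rate(p1, p2, h12_sq, h21_sq, s1, s2)]
    for r in range(1, num_iters + 1):
        if (r - 1) % 2 == 0:  # block 1: update p1 with p2 fixed
            p1 = clip(g1(p1, p2, h12_sq, h21_sq, s1, s2), 0.0, P1)
        else:                 # block 2: update p2 with p1 fixed
            p2 = clip(g2(p1, p2, h12_sq, h21_sq, s1, s2), 0.0, P2)
        history.append(neg_sum_rate(p1, p2, h12_sq, h21_sq, s1, s2))
    return p1, p2, history

p1_opt, p2_opt, history = bsum_power_control(1.0, 1.0, 0.5, 0.8, 0.1, 0.1)
print(p1_opt, p2_opt, -history[-1])
```

Because each surrogate upper-bounds $f$ and is tight at the reference point, the recorded objective values should never increase from one iteration to the next, which gives a simple sanity check on any implementation.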
Since $f\left(p_{1}, p_{2}\right)$ is differentiable, it is regular at every point. Moreover, $\bar{f}_{1}\left(p_{1} \mid \bar{p}_{1}, \bar{p}_{2}\right)$ and $\bar{f}_{2}\left(p_{2} \mid \bar{p}_{1}, \bar{p}_{2}\right)$ are both strictly convex, which satisfies premise (4.161a), and the feasible set $\left\{\left(p_{1}, p_{2}\right) \mid 0 \leq p_{1} \leq P_{1}, 0 \leq p_{2} \leq P_{2}\right\}$ is compact, which satisfies premise (4.161b). Therefore Algorithm 4.3 is guaranteed to converge to a stationary point of the problem.
What is the difference between BSUM and the MM algorithm?