Winter break is almost here, although we don't really get much of one these days, and COVID outbreaks are flaring up again in many places, so please take care when you head out and travel home~
This is also my last post before going home for the 2022 Spring Festival. The content is a bit of a mixed bag, but it all lays groundwork for bigger topics later, and it borrows from some other excellent blogs, to which I'm grateful~
On to the main content~
Gram-Schmidt orthogonalization takes a set of vectors $(\alpha_1, \alpha_2, ..., \alpha_n)$ and transforms it into a set of mutually orthogonal vectors $(\beta_1, \beta_2, ..., \beta_n)$. The idea is simple: take the first orthogonal vector to be the first original vector $\alpha_1$; each later orthogonal vector is obtained by subtracting from the i-th original vector $\alpha_i$ its projections onto the previously built vectors $(\beta_1, \beta_2, ..., \beta_{i-1})$.
Let $\beta_1=\alpha_1$. To find $\beta_2$, compute the projection $d$ of $\alpha_2$ onto $\alpha_1$; then $\alpha_2-d$ gives a vector $\beta_2$ perpendicular to $\alpha_1$.
$$
\begin{aligned}
d&=||\alpha_2||\cos{\theta}\cdot\cfrac{\alpha_1}{||\alpha_1||}\\
&=||\alpha_2||\,||\alpha_1||\cos{\theta}\cdot\cfrac{\alpha_1}{||\alpha_1||^2}=\cfrac{(\alpha_2, \beta_1)}{(\beta_1, \beta_1)} \cdot \beta_1
\end{aligned}
$$
Following this idea, we get the recurrence:
$$
\beta_1=\alpha_1 \\
\beta_k=\alpha_k-\sum_{i=1}^{k-1}\cfrac{(\alpha_k, \beta_i)}{(\beta_i, \beta_i)} \cdot \beta_i
$$
Introducing normalization, i.e. $\epsilon_1=\cfrac{\beta_1}{||\beta_1||}$, this becomes:
$$
d=||\alpha_2||\cos{\theta}\cdot\cfrac{\alpha_1}{||\alpha_1||}=||\alpha_2||\,||\epsilon_1||\cos{\theta}\cdot\epsilon_1=(\alpha_2, \epsilon_1) \cdot \epsilon_1 \\
\beta_k=\alpha_k-\sum_{i=1}^{k-1}(\alpha_k, \epsilon_i) \cdot \epsilon_i
$$
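As a small concrete check (an example added here, not in the original post), take $\alpha_1=(1,1,0)^T$ and $\alpha_2=(1,0,0)^T$:
$$
\epsilon_1=\tfrac{1}{\sqrt{2}}(1,1,0)^T,\qquad
\beta_2=\alpha_2-(\alpha_2,\epsilon_1)\,\epsilon_1=(\tfrac{1}{2},-\tfrac{1}{2},0)^T,\qquad
\epsilon_2=\tfrac{1}{\sqrt{2}}(1,-1,0)^T
$$
and indeed $(\epsilon_1,\epsilon_2)=0$. The implementation below follows the normalized recurrence directly: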
@classmethod
def Gram_Schmidt_Orthogonalization(self, target):
    """Gram_Schmidt_Orthogonalization
    Args:
        target ([np.darray]): [target matrix]
    Returns:
        [np.darray]: [matrix with orthonormal columns]
    Last edited: 22-01-13
    Author: Junno
    """
    # col vectors of target are to be transformed
    assert len(target.shape) == 2, 'shape of target should be (M,N)'
    M, N = target.shape
    # matrix of epsilon (the orthonormal columns)
    orth = np.zeros_like(target)
    for i in range(N):
        # copy the column so the input matrix is not modified in place
        temp = target[:, i].copy()
        Delta = 0
        for j in range(i):
            # subtract the projection onto the j-th orthonormal column
            Delta -= np.sum(temp*orth[:, j])*orth[:, j]
        temp += Delta
        # normalize to unit length
        orth[:, i] = temp/np.linalg.norm(temp)
    return orth
>>> import numpy as np
>>> from Matrix_Solutions import Matrix_Solutions
>>> a=np.random.randint(1,10,(4,4)).astype(np.float32)
>>> a
array([[8., 2., 8., 2.],
       [1., 2., 8., 6.],
       [1., 1., 9., 6.],
       [3., 8., 4., 3.]], dtype=float32)
>>> sa=Matrix_Solutions.Gram_Schmidt_Orthogonalization(a)
>>> sa
array([[ 0.924, -0.372, -0.086, 0.03 ],
       [ 0.115, 0.205, 0.613, 0.754],
       [ 0.115, 0.061, 0.752, -0.646],
       [ 0.346, 0.903, -0.226, -0.115]], dtype=float32)
# Unitary Matrix: U^H @ U = I
>>> sa.T@sa
array([[ 1., 0., 0., -0.],
       [ 0., 1., -0., 0.],
       [ 0., -0., 1., -0.],
       [-0., 0., -0., 1.]], dtype=float32)
The approach here is to reduce the square matrix to an upper triangular matrix: if the elimination uses $s$ row swaps and ends with the upper triangular matrix $U$, then $\det(A)=(-1)^{s}\prod_{i}u_{ii}$, the product of the diagonal elements with a sign flip for every swap. Without further ado, straight to the code:
@classmethod
def det(self, target, eps=1e-6, test=False, np_check=False):
    """calculate determinant of matrix: reduce the matrix to an upper triangular matrix
    Args:
        target ([np.darray]): [target M]
        eps ([float]): numerical threshold.
        test (bool, optional): [show checking information]. Defaults to False.
        np_check (bool, optional): [show answer by numpy method to check]. Defaults to False.
    Returns:
        [float]: [det]
    Last edited: 22-01-13
    Author: Junno
    """
    assert len(target.shape) == 2
    M, N = target.shape
    assert M == N
    # fast paths for small matrices
    if M == 1:
        return target[0, 0]
    elif M == 2:
        return target[0, 0]*target[1, 1]-target[0, 1]*target[1, 0]
    # M>=3
    A = deepcopy(target)
    ans = 1
    for i in range(M-1):
        if test:
            print('During process in row:{}, col:{}'.format(i, i))
        if sum(abs(A[i:, i])) > eps:
            first_zero_ind = -1
            first_non_zero_ind = -1
            # find the first (near-)zero and first non-zero entries in this column, from row i down
            for k in range(i, M):
                if abs(A[k, i]) < eps and first_zero_ind < 0:
                    first_zero_ind = k
                    if first_non_zero_ind >= 0:
                        break
                elif abs(A[k, i]) > eps and first_non_zero_ind < 0:
                    first_non_zero_ind = k
                    if first_zero_ind >= 0:
                        break
            if first_zero_ind == i:
                # the pivot itself is zero: swap in the first non-zero row below
                # and flip the sign of the determinant
                ans *= -1
                A[[first_zero_ind, first_non_zero_ind], :] = A[[
                    first_non_zero_ind, first_zero_ind], :]
            # eliminate the entries below the pivot
            prefix = -A[i+1:, i]/A[i, i]
            temp = (np.array(prefix).reshape(-1, 1)
                    )@A[i, :].reshape((1, -1))
            A[i+1:, :] += temp
            # accumulate the diagonal (pivot) element
            ans *= A[i, i]
        else:
            # the whole column from the pivot down is zero -> singular matrix
            return 0
        if test:
            print(A)
    ans *= A[-1, -1]
    if np_check:
        print("numpy result: ", np.linalg.det(target))
        print("my result: ", ans)
    return ans
>>> import numpy as np
>>> from Matrix_Solutions import Matrix_Solutions
>>> a=np.diag([2,3,4,5]).astype(np.float32)
>>> a
array([[2., 0., 0., 0.],
       [0., 3., 0., 0.],
       [0., 0., 4., 0.],
       [0., 0., 0., 5.]], dtype=float32)
>>> Matrix_Solutions.det(a,np_check=True)
numpy result: 120.0
my result: 120.0
120.0
>>> a=np.random.randn(4,4)
>>> a
array([[ 1.018, -0.943, -0.267, -0.665],
       [-0.877, 0.881, 1.268, 0.152],
       [-1.054, -0.288, -0.147, 1.236],
       [-0.291, -0.842, -0.541, -1.097]])
>>> Matrix_Solutions.det(a,np_check=True)
numpy result: -2.4311256336989056
my result: -2.431125633698907
-2.431125633698907
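One more added check: a singular matrix (here, two identical rows) should come out as zero, and it also exercises the row-swap branch of the elimination:

b = np.array([[1., 2., 3.],
              [1., 2., 3.],
              [4., 5., 6.]])
print(Matrix_Solutions.det(b))  # expect 0.0 (one row swap happens internally)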
Let's start with the definition. If $A \in C^{m\times k}$ has full column rank, i.e. $rank(A)=k$, then A can be factored as
$$A=QR$$
where the columns of $Q \in C^{m\times k}$ form an orthonormal basis of the column space of A, and $R \in C^{k\times k}$ is an invertible upper triangular matrix.
In particular, combining this with the Gram-Schmidt orthogonalization above: if applying Gram-Schmidt to A gives the orthogonal matrix U, then the QR decomposition follows directly (writing A=UR):
$$R=U^{-1}A=U^{H}A$$
since $U^{H}U=I$ (and $U^{-1}=U^{H}$ when A is square).
As for how to compute a QR decomposition, the common approaches are the Householder transform and Givens rotations. The blog post referenced in the code comments below explains both very well, with detailed figures and worked examples; my implementation largely follows it, so I won't repeat the derivation and you can study it there~
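Since only the Householder route is implemented below, here is a minimal added sketch of what a single Givens rotation does; the helper givens_rotation is purely illustrative and is not part of Matrix_Solutions:

import numpy as np

def givens_rotation(a, b):
    # build c, s such that [[c, s], [-s, c]] @ [a, b] = [r, 0], with r = sqrt(a^2 + b^2)
    r = np.hypot(a, b)
    if r == 0:
        return 1.0, 0.0
    return a / r, b / r

c, s = givens_rotation(3.0, 4.0)
G = np.array([[c, s], [-s, c]])
print(G @ np.array([3.0, 4.0]))  # expect [5. 0.]: the rotation zeroes the second component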
@classmethod
def Householder_Transfrom(self, target, project_axis=0, eps=1e-6):
    """Householder_Transfrom
    Args:
        target ([np.darray]): [target matrix]
        project_axis (int, optional): [projection axis]. Defaults to 0.
        eps ([float], optional): [numerical threshold]. Defaults to 1e-6.
    Returns:
        [np.darray]: [the householder reflection matrix H]
    Last edited: 22-01-13
    Author: Junno
    create elementary reflection matrix: H = I - 2ww^T with w^T w = 1
    refer: https://blog.csdn.net/xfijun/article/details/109464005
    """
    # flatten to a column vector
    x = target.reshape(len(target), 1)
    # unit vector along the projection axis
    unit = np.zeros_like(x, dtype=target.dtype)
    unit[project_axis] = 1
    y = np.linalg.norm(x, ord=2)*unit
    # degenerate case: x is numerically zero, no reflection needed
    if np.sum(np.abs(x+np.sign(x[project_axis, 0])*y)) < eps:
        return np.eye(len(x))
    else:
        # the sign of x[project_axis] is used so that x + sign(x_0)*y avoids cancellation
        w = (x+np.sign(x[project_axis, 0])*y) / \
            np.linalg.norm(x+np.sign(x[project_axis, 0])*y)
        H = np.eye(len(x))-2*w@w.T
        return H
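A quick added illustration of this helper (assuming Matrix_Solutions is imported as in the earlier snippets): the reflection maps a vector onto the chosen axis while preserving its norm.

x = np.array([3., 4.])
H = Matrix_Solutions.Householder_Transfrom(x)
print(H @ x)  # expect roughly [-5.  0.]: the norm is kept, the other component is zeroed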
@classmethod
def QR_Fact(self, target, mode='householder'):
    """QR_Fact
    Args:
        target ([np.darray]): [target matrix]
        mode (str, optional): [mode of solver]. Defaults to 'householder'.
    Returns:
        [np.darray]: [Q and R]
    Last edited: 22-01-13
    Author: Junno
    refer: https://blog.csdn.net/xfijun/article/details/109464005
    """
    assert len(target.shape) == 2
    M, N = target.shape
    assert self.Check_full_rank(
        target), "only full_col_rank matrix has QR Fact"
    A = deepcopy(target)
    Q = np.eye(M, dtype=target.dtype)
    if mode == 'householder':
        for i in range(M-1):
            # generate intermediate matrix
            Qi = np.eye(M, dtype=target.dtype)
            # get the i-th col from the diagonal down
            x = A[i:, i]
            Hi = self.Householder_Transfrom(x)
            # apply the householder transform to the trailing submatrix of A
            A[i:, i:] = Hi@A[i:, i:]
            Qi[i:, i:] = Hi
            # accumulate: A(n-1)=H(n-2)H(n-3)...H(0)A
            Q = Qi@Q
        # introduce a diagonal +/-1 matrix D so that the diag elements of R are positive
        D = np.diag(np.where(np.diag(A) < 0, -1, 1))
        R = D@A
        # target = Q.T @ A = (Q.T @ D) @ (D @ A), using D^(-1) = D
        Q = Q.T@D
        return Q, R
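A small added usage sketch (again assuming the imports from the earlier examples):

a = np.random.randn(4, 4)
Q, R = Matrix_Solutions.QR_Fact(a)
print(np.allclose(Q @ R, a))            # expect True: Q @ R reconstructs a
print(np.allclose(Q.T @ Q, np.eye(4)))  # expect True: Q is orthogonal
print(np.allclose(R, np.triu(R)))       # expect True: R is upper triangular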
@classmethod
def UR_Fact_Schmidt(self, target):
    """UR_Fact_Schmidt
    Args:
        target ([np.darray]): [target matrix]
    Returns:
        [np.darray]: [U and R]
    Last edited: 22-01-13
    Author: Junno
    """
    assert len(target.shape) == 2
    M, N = target.shape
    assert self.Check_full_rank(
        target), "only invertible matrix has UR Fact"
    # Gram_Schmidt_Orthogonalization already returns normalized (orthonormal) columns
    U = self.Gram_Schmidt_Orthogonalization(target)
    # A=UR -> R=U^(-1)A=U^(T)A
    R = U.T@target
    return U, R
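And a matching added check for the Schmidt route:

a = np.random.randn(4, 4)
U, R = Matrix_Solutions.UR_Fact_Schmidt(a)
print(np.allclose(U @ R, a))            # expect True: U @ R reconstructs a
print(np.allclose(U.T @ U, np.eye(4)))  # expect True: the columns of U are orthonormal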