In this post I introduce physics-informed neural networks (PINN) for solving differential equations. I first review the basic PINN method and then use a PyTorch-based PINN framework to solve the time-dependent one-dimensional Burgers equation.
An introduction to physics-informed neural networks (PINN) and related papers
Solving differential equations with deep learning, part 1: the PINN solution framework (Poisson 1d)
Solving differential equations with deep learning, part 2: solving the Burgers equation forward problem with PINN
Solving differential equations with deep learning, part 4: solving the Burgers equation inverse problem with a PINN based on adaptive activation functions
Neural networks, as a powerful information-processing tool, have been widely applied in computer vision, biomedicine, and petroleum engineering, driving technological change across many fields. Deep networks have a very strong learning capacity: they can not only discover physical laws but also solve partial differential equations. In recent years, solving PDEs with deep learning has become a new research hotspot. The physics-informed neural network (PINN) is an application of scientific machine learning to traditional numerical computation; it can be used for a wide range of PDE-related tasks, including equation solving, parameter inversion, model discovery, and control and optimization.
The main idea of PINN is shown in Figure 1: build a neural network whose output $\hat{u}$ serves as a surrogate model of the PDE solution, and encode the PDE information as constraints in the loss function used for training. The loss function consists of four parts: the PDE residual loss (PDE loss), the boundary-condition loss (BC loss), the initial-condition loss (IC loss), and the observed-data loss (Data loss).
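As a minimal sketch of how these four terms are typically combined into one training objective (the tensor names and the equal weighting below are illustrative placeholders, not the exact code used later in this post):

import torch

mse = torch.nn.MSELoss()

def pinn_loss(residual_f, u_bc, u_ic, u_ic_target, u_data_pred, u_data_obs):
    """Combine the four PINN loss terms with equal weights (placeholder weighting)."""
    loss_pde  = mse(residual_f, torch.zeros_like(residual_f))  # PDE residual driven to 0
    loss_bc   = mse(u_bc, torch.zeros_like(u_bc))              # homogeneous boundary condition
    loss_ic   = mse(u_ic, u_ic_target)                         # initial condition
    loss_data = mse(u_data_pred, u_data_obs)                   # observed data
    return loss_pde + loss_bc + loss_ic + loss_data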
Specifically, consider the following PDE problem, where the solution $u(\mathbf{x})$ is defined on $\Omega \subset \mathbb{R}^{d}$ with $\mathbf{x}=(x_{1}, \ldots, x_{d})$:
$$f\left(\mathbf{x} ; \frac{\partial u}{\partial x_{1}}, \ldots, \frac{\partial u}{\partial x_{d}} ; \frac{\partial^{2} u}{\partial x_{1} \partial x_{1}}, \ldots, \frac{\partial^{2} u}{\partial x_{1} \partial x_{d}}\right)=0, \quad \mathbf{x} \in \Omega$$
subject to the boundary condition
$$\mathcal{B}(u, \mathbf{x})=0 \quad \text{on} \quad \partial \Omega$$
The PINN solution process mainly consists of: constructing a neural network surrogate $\hat{u}$, sampling collocation, boundary/initial, and observation points, assembling the PDE/BC/IC/Data losses, and training the network to minimize the total loss.
Note that in an inverse problem some parameters of the equation are unknown. If only the PDE and its boundary conditions are known while the PDE parameters are not, the inverse problem is ill-posed, so additional information is required, such as observed values of $u$ at some points. In this case, PINN treats the unknown parameters as trainable variables added to the optimizer, and the loss function includes the Data loss.
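For illustration, a minimal sketch of this trick (the network `net`, initial guess, and learning rates here are placeholders; the actual script appears later in this post):

import torch

# hypothetical surrogate network mapping (t, x) -> u; any nn.Module works here
net = torch.nn.Sequential(torch.nn.Linear(2, 20), torch.nn.Tanh(), torch.nn.Linear(20, 1))
# unknown PDE coefficient, initialized with a guess and marked trainable
nu = torch.tensor([0.0], requires_grad=True)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
# the inverse-problem trick: optimize nu jointly with the network weights
optimizer.add_param_group({'params': nu, 'lr': 1e-5})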
In 2019, Jagtap at Brown University proposed a PINN based on adaptive activation functions. Specifically, a trainable parameter is introduced into the activation function; since the topology of the loss landscape changes dynamically during optimization, this parameter is optimized during training to achieve the best network performance. Compared with a conventional PINN with a fixed activation function, the adaptive-activation PINN has a stronger learning ability and markedly improves both the convergence speed and the accuracy of the solution, with the speed-up being especially pronounced in the early stage of training.
Jagtap A D, Kawaguchi K, Karniadakis G E. Adaptive activation functions accelerate convergence in deep and physics-informed neural networks[J]. Journal of Computational Physics, 2020, 404: 109136.
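The core idea is simply to rescale the pre-activation by a trainable factor, i.e. use $\sigma(n\,a\,z)$ with a fixed scale $n$ and a trainable $a$. Below is a minimal illustrative sketch of one such layer (the class name and the per-layer `nn.Parameter` are my own choices; the code later in this post instead shares one global $a$ across all layers and registers it with the optimizer separately):

import torch
import torch.nn as nn

class AdaptiveTanhLayer(nn.Module):
    """Linear layer followed by tanh(n * a * z) with a trainable slope a (illustrative sketch)."""
    def __init__(self, in_features, out_features, n=10, a0=0.1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.n = n                               # fixed scale factor
        self.a = nn.Parameter(torch.tensor(a0))  # trainable slope parameter
    def forward(self, z):
        return torch.tanh(self.n * self.a * self.linear(z))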
$$\begin{aligned} u_t + u u_x &= \nu u_{xx}, \quad x \in [-1, 1],\ t > 0 \\ u(x, 0) &= -\sin(\pi x) \\ u(-1, t) &= u(1, t) = 0 \end{aligned}$$
Here $\nu$ is the unknown parameter to be identified; its exact value is $0.01/\pi$. The reference numerical solution is obtained via the Hopf-Cole transformation, as shown in Figure 2.
Task requirements: given the PDE, the initial/boundary conditions, and scattered observations of $u$, use a PINN with an adaptive activation function to identify the unknown parameter $\nu$ while learning the solution $u(x, t)$.
# net.py
import torch
import torch.nn as nn
from collections import OrderedDict


class Net(nn.Module):
    """Fully connected network with a globally adaptive activation tanh(10 * a * x)."""
    def __init__(self, seq_net, name='MLP', activation=torch.tanh, a=0.1):
        super().__init__()
        self.features = OrderedDict()
        for i in range(len(seq_net) - 1):
            self.features['{}_{}'.format(name, i)] = nn.Linear(seq_net[i], seq_net[i + 1], bias=True)
        self.features = nn.ModuleDict(self.features)
        self.active = activation
        self.a = a  # trainable slope of the adaptive activation (registered with the optimizer externally)
        # initialize all biases to zero
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        # x = x.view(-1, 2)
        length = len(self.features)
        i = 0
        for name, layer in self.features.items():
            x = layer(x)
            if i == length - 1:  # no activation after the output layer
                break
            i += 1
            x = self.active(10 * self.a * x)  # adaptive activation: tanh(10 * a * x)
        return x
from net import Net
import os
from math import pi
import scipy.io
import numpy as np
import torch
from torch.autograd import grad
def d(f, x):
    """First derivative of f with respect to x via automatic differentiation."""
    return grad(f, x, grad_outputs=torch.ones_like(f), create_graph=True, only_inputs=True)[0]


def PDE(u, t, x, nu):
    """Residual of the Burgers equation: u_t + u*u_x - nu*u_xx."""
    return d(u, t) + u * d(u, x) - nu * d(d(u, x), x)
def Ground_true(xx):
    # analytic test field from an earlier part of this series; not used in the Burgers inverse problem below
    out = 0.7 * torch.sin(pi * xx[:, 0:1]) * torch.sin(pi * xx[:, 1:]) + \
          0.45 * torch.sin(1 * pi * xx[:, 0:1]) * torch.sin(2 * pi * xx[:, 1:]) - \
          0.23 * torch.sin(1 * pi * xx[:, 0:1]) * torch.sin(3 * pi * xx[:, 1:]) - \
          0.10 * torch.sin(2 * pi * xx[:, 0:1]) * torch.sin(1 * pi * xx[:, 1:]) + \
          0.95 * torch.sin(2 * pi * xx[:, 0:1]) * torch.sin(2 * pi * xx[:, 1:]) - \
          0.95 * torch.sin(2 * pi * xx[:, 0:1]) * torch.sin(3 * pi * xx[:, 1:]) + \
          0.95 * torch.sin(3 * pi * xx[:, 0:1]) * torch.sin(1 * pi * xx[:, 1:]) - \
          0.95 * torch.sin(3 * pi * xx[:, 0:1]) * torch.sin(2 * pi * xx[:, 1:]) + \
          0.95 * torch.sin(3 * pi * xx[:, 0:1]) * torch.sin(3 * pi * xx[:, 1:])
    return out
def train():
    # domain bounds and training hyperparameters
    t_left, t_right = 0., 1.
    x_left, x_right = -1., 1.
    lr = 0.001
    epochs = 60000
    n_f, n_b_1, n_b_2 = 10000, 400, 400   # collocation / initial / boundary batch sizes
    N_train = 10000                       # number of observed data points
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
    # Adaptive activation function parameter for PINN
    a = np.array([0.1])
    a = torch.from_numpy(a).float().to(device).requires_grad_(True)
    a.grad = torch.ones((1)).to(device)
    PINN = Net(seq_net=[2, 20, 20, 20, 20, 20, 20, 1], activation=torch.tanh, a=a).to(device)
    optimizer = torch.optim.Adam(PINN.parameters(), lr)
    # Add the trainable activation parameter to the optimizer
    optimizer.add_param_group({'params': a, 'lr': 0.001})
    # Unknown problem parameter nu, initialized at 0 and trained jointly
    nu = np.array([0])
    nu = torch.from_numpy(nu).float().to(device).requires_grad_(True)
    nu.grad = torch.ones((1)).to(device)
    optimizer.add_param_group({'params': nu, 'lr': 0.00001})
    criterion = torch.nn.MSELoss()
    # load the reference solution data (burgers_shock.mat)
    data = scipy.io.loadmat('Data/burgers_shock.mat')
    Exact = np.real(data['usol']).T
    t = data['t'].flatten()[:, None]
    x = data['x'].flatten()[:, None]
    X, T = np.meshgrid(x, t)
    X_star = np.hstack((X.flatten()[:, None], T.flatten()[:, None]))  # (25600, 2): first column x, second column t
    X_star = X_star.astype(np.float32)
    X_star = torch.from_numpy(X_star).to(device).requires_grad_(True)
    u_star = Exact.flatten()[:, None]
    u_star = u_star.astype(np.float32)
    u_star = torch.from_numpy(u_star).to(device).requires_grad_(True)
    # randomly sampled observation points used for the Data loss
    N, T = 256, 100   # grid size: 256 points in x, 100 in t
    idx = np.random.choice(N * T, N_train, replace=False)
    x_train = X_star[idx, 0:1].requires_grad_(True)
    t_train = X_star[idx, 1:2].requires_grad_(True)
    u_train = u_star[idx]
    loss_history = []
    Lambda = []        # history of the identified parameter nu
    Loss_history = []
    test_loss = []
    a_history = []     # history of the adaptive activation parameter a
    for epoch in range(epochs):
        optimizer.zero_grad()
        # PDE residual on randomly sampled collocation points
        t_f = ((t_left + t_right) / 2 + (t_right - t_left) *
               (torch.rand(size=(n_f, 1), dtype=torch.float, device=device) - 0.5)
               ).requires_grad_(True)
        x_f = ((x_left + x_right) / 2 + (x_right - x_left) *
               (torch.rand(size=(n_f, 1), dtype=torch.float, device=device) - 0.5)
               ).requires_grad_(True)
        u_f = PINN(torch.cat([t_f, x_f], dim=1))
        PDE_ = PDE(u_f, t_f, x_f, nu)
        mse_PDE = criterion(PDE_, torch.zeros_like(PDE_))
        # initial condition u(x, 0) = -sin(pi*x) and boundary conditions u(-1, t) = u(1, t) = 0
        x_rand = ((x_left + x_right) / 2 + (x_right - x_left) *
                  (torch.rand(size=(n_b_1, 1), dtype=torch.float, device=device) - 0.5)
                  ).requires_grad_(True)
        t_b = (t_left * torch.ones_like(x_rand)).requires_grad_(True)
        u_b_1 = PINN(torch.cat([t_b, x_rand], dim=1)) + torch.sin(pi * x_rand)
        t_rand = ((t_left + t_right) / 2 + (t_right - t_left) *
                  (torch.rand(size=(n_b_2, 1), dtype=torch.float, device=device) - 0.5)
                  ).requires_grad_(True)
        x_b_1 = (x_left * torch.ones_like(t_rand)).requires_grad_(True)
        x_b_2 = (x_right * torch.ones_like(t_rand)).requires_grad_(True)
        u_b_2 = PINN(torch.cat([t_rand, x_b_1], dim=1))
        u_b_3 = PINN(torch.cat([t_rand, x_b_2], dim=1))
        mse_BC_1 = criterion(u_b_1, torch.zeros_like(u_b_1))
        mse_BC_2 = criterion(u_b_2, torch.zeros_like(u_b_2))
        mse_BC_3 = criterion(u_b_3, torch.zeros_like(u_b_3))
        mse_BC = mse_BC_1 + mse_BC_2 + mse_BC_3
        # Data loss on the observed points
        u_data = PINN(torch.cat([t_train, x_train], dim=1))
        mse_Data = criterion(u_data, u_train)
        # total loss (equal weights)
        loss = 1 * mse_PDE + 1 * mse_BC + 1 * mse_Data
        # test error on the full reference grid (monitoring only, not part of the loss)
        x_pred = X_star[:, 0:1]
        t_pred = X_star[:, 1:2]
        u_pred = PINN(torch.cat([t_pred, x_pred], dim=1))
        mse_test = criterion(u_pred, u_star)
        loss_history.append([mse_PDE.item(), mse_BC.item(), mse_Data.item(), mse_test.item()])
        Lambda.append([nu.item()])
        Loss_history.append([loss.item()])
        test_loss.append([mse_test.item()])
        a_history.append([a.item()])
        loss.backward(retain_graph=True)
        optimizer.step()
        if (epoch + 1) % 10000 == 0:
            print(
                'epoch:{:05d}, PDE: {:.08e}, BC: {:.08e}, loss: {:.08e}'.format(
                    epoch + 1, mse_PDE.item(), mse_BC.item(), loss.item()
                )
            )
            print('nu:{:.03e}, a:{:.03e}'.format(nu.item(), a.item()))
    # save the model parameters and the identified problem parameter nu
    # np.save('./result_data/nu({}).npy'.format(epochs), Lambda)
    # np.save('./result_data/test_loss({}).npy'.format(epochs), test_loss)
    # np.save('./result_data/training_loss({}).npy'.format(epochs), Loss_history)
    # np.save('./result_data/a({}).npy'.format(epochs), a_history)
    # torch.save(PINN.state_dict(), './result_data/PINN({}).pth'.format(epochs))


if __name__ == '__main__':
    train()
import numpy as np
import matplotlib.pyplot as plt
from math import pi
import seaborn


def Plot_training_nu():
    fig, ax = plt.subplots(1, 2, figsize=(16, 6))
    # nu and loss histories: adaptive-activation run vs. fixed-activation (a = 1) run
    lambda_PINN = np.load('./result_data/nu(60000).npy')
    loss_PINN = np.load('./result_data/training_loss(60000).npy')
    lambda_PINN_0 = np.load('./result_data/nu(PINN).npy')
    loss_PINN_0 = np.load('./result_data/training_loss(PINN).npy')
    # relative error of the identified nu against the exact value 0.01/pi
    print('PINN with variable a', (0.01 / pi - lambda_PINN[-1]) / (0.01 / pi))
    print('PINN with a = 1', (0.01 / pi - lambda_PINN_0[-1]) / (0.01 / pi))
    ax[0].plot(lambda_PINN_0 * (1e3), '-', label='PINN with $a =1$',
               color='b', lw=1.2, zorder=10)
    ax[0].plot(lambda_PINN * (1e3), label='PINN with variable a',
               color='g', linestyle='--', lw=1.2, zorder=20)
    ax[0].axhline(y=(1e3) * 0.01 / pi, color='black', linestyle='--', label='Exact', lw=1)
    # zoomed-in inset over the last 20000 epochs
    axins = ax[0].inset_axes((2 / 3, 6 / 17, 0.3, 0.3))
    axins.plot(lambda_PINN_0 * (1e3), '-', label='PINN with $a =1$',
               color='b', lw=1.2, zorder=10)
    axins.plot(lambda_PINN * (1e3), label='PINN with variable a',
               color='g', linestyle='--', lw=1.2, zorder=20)
    axins.axhline(y=(1e3) * 0.01 / pi, color='black', linestyle='--', label='Exact', lw=1)
    axins.set_xticks([40000, 50000, 60000])
    axins.set_xlim(40000, 60000)
    axins.set_ylim(3, 4)
    ax[0].set_yscale('linear')
    ax[0].legend(loc='best')
    y_ticks = np.arange(0, 13, 2)
    ax[0].set_yticks(y_ticks)
    ax[0].set_ylim(0, 13)
    ax[0].text(-6000., 13.08, r'$1e-3$', fontsize=8)
    ax[0].set_ylabel('$v$')
    ax[1].plot(loss_PINN, label='PINN with variable a',
               color='g', linestyle='--', lw=1.2)
    ax[1].plot(loss_PINN_0, '-', label='PINN with $a =1$',
               color='b', lw=1.2)
    ax[1].set_ylabel('Loss')
    ax[1].set_yscale('log')
    ax[1].set_ylim(1e-5, 1e-1)
    handles, labels = ax[1].get_legend_handles_labels()
    ax[1].legend(handles[::-1], labels[::-1], loc='best')
    # plt.savefig('./result_plot/nu_adapt.png', bbox_inches='tight', dpi=600, format='png',
    #             pad_inches=0.1)
    plt.show()


if __name__ == '__main__':
    Plot_training_nu()
The evolution of the parameter during training is shown in Figure 3. It is clear that in the early stage of training, the PINN with the adaptive activation function decreases faster and converges to the exact value.
The predicted results during training are shown in Figures 4-6.