conda create -n DL python=3.11
conda activate DL
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install jupyter
conda install matplotlib tensorboard
conda install tensorboardX
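TensorBoardX is a third-party library that lets you use TensorBoard from PyTorch. You can use it to record information such as the loss, accuracy, and histograms of model parameters during training, and then visualize it in TensorBoard.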
conda install tensorboardX
or
pip install tensorboardX
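To record the training loss with TensorBoardX in PyTorch:
from tensorboardX import SummaryWriter
# Create a SummaryWriter and point it at the directory where logs are written
writer = SummaryWriter('logs')
for epoch in range(num_epochs):
    # Record the loss inside the training loop
    # (num_epochs and train_loss come from your own training code)
    writer.add_scalar('Train/Loss', train_loss, epoch)
# Close the SummaryWriter once training has finished
writer.close()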
Starting with version 1.2, PyTorch also ships built-in TensorBoard support: you can use torch.utils.tensorboard.SummaryWriter to record information during training, in the same way as in the example above.
from torch.utils.tensorboard import SummaryWriter
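Since both tensorboardX and torch.utils.tensorboard provide a SummaryWriter with an `add_scalar(tag, scalar_value, global_step)` method, logging code can be written against that method alone. The following is a minimal sketch of that idea, not the only way to structure it. `log_epoch_metrics` and `RecordingWriter` are hypothetical names introduced here; `RecordingWriter` is just an in-memory stand-in so the snippet runs without either library installed, and the metric values are made up.

```python
def log_epoch_metrics(writer, epoch, metrics):
    """Log every entry of `metrics` as a scalar at step `epoch`.

    `writer` can be any object with an
    add_scalar(tag, scalar_value, global_step) method,
    such as a SummaryWriter from either library.
    """
    for tag, value in metrics.items():
        writer.add_scalar(tag, float(value), epoch)


class RecordingWriter:
    """Hypothetical in-memory stand-in for SummaryWriter (illustration only)."""

    def __init__(self):
        self.records = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        self.records.append((tag, scalar_value, global_step))

    def close(self):
        pass


writer = RecordingWriter()
# Made-up (loss, accuracy) pairs, purely for demonstration.
history = [(0.9, 0.60), (0.5, 0.78), (0.3, 0.85)]
for epoch, (loss, acc) in enumerate(history):
    log_epoch_metrics(writer, epoch, {"Train/Loss": loss, "Train/Accuracy": acc})
writer.close()
```

In real training, `RecordingWriter()` would be replaced by `SummaryWriter('logs')` imported from either library, and the values would come from the training loop.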
Start TensorBoard with a command of the following form (the default port is 6006):
tensorboard --logdir=path_to_your_logs
Example:
tensorboard --logdir=./Norm --port=6005
Here the log files are stored in the directory Norm, and TensorBoard runs on port 6005.
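If you launch TensorBoard from a script rather than by hand, the same command can be assembled in Python. This is only a sketch: `tensorboard_command` is a hypothetical helper name, and it uses nothing beyond the `--logdir` and `--port` flags shown above.

```python
import subprocess


def tensorboard_command(logdir, port=6006):
    """Build the argument list for the tensorboard command-line tool."""
    return ["tensorboard", f"--logdir={logdir}", f"--port={port}"]


cmd = tensorboard_command("./Norm", port=6005)
print(" ".join(cmd))

# To actually start the server (requires tensorboard on PATH), uncomment:
# process = subprocess.Popen(cmd)
```

Passing the arguments as a list avoids shell quoting problems when the log directory contains spaces.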
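import torch
from torch.utils.tensorboard import SummaryWriter
from torchvision.utils import make_grid

# num_epochs, data_loader, generator, discriminator, d_optimizer, g_optimizer,
# criterion, n_critic, batch_size, display_step, device and the two
# *_train_step helpers are defined elsewhere in the training script.

# Create a SummaryWriter for logging information to TensorBoard.
# With no arguments the logs go to ./runs/...; pass log_dir (for example
# SummaryWriter('Norm')) to match the --logdir used when launching TensorBoard.
writer = SummaryWriter()
for epoch in range(num_epochs):
    print('Starting epoch {}...'.format(epoch), end=' ')
    # Iterate through the data loader
    for i, (images, labels) in enumerate(data_loader):
        # Global step counter that keeps increasing across epochs
        step = epoch * len(data_loader) + i + 1
        real_images = images.to(device)
        labels = labels.to(device)
        generator.train()
        d_loss = 0
        # Perform multiple discriminator training steps and accumulate the
        # loss so that d_loss / n_critic below is the average over the steps
        for _ in range(n_critic):
            d_loss += discriminator_train_step(len(real_images), discriminator,
                                               generator, d_optimizer, criterion,
                                               real_images, labels,
                                               device)
        # Perform a single generator training step
        g_loss = generator_train_step(batch_size, discriminator, generator,
                                      g_optimizer, criterion, device)
        # Write the losses to TensorBoard
        writer.add_scalars('scalars', {'g_loss': g_loss, 'd_loss': d_loss / n_critic}, step)
        # Display sample images at certain steps
        if step % display_step == 0:
            generator.eval()
            with torch.no_grad():
                z = torch.randn(9, 100, device=device)
                sample_labels = torch.arange(9, device=device)
                sample_images = generator(z, sample_labels).unsqueeze(1)
            grid = make_grid(sample_images, nrow=3, normalize=True)
            writer.add_image('sample_image', grid, step)
    print('Done!')
# Flush pending events and close the writer after training
writer.close()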
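The step value passed to the writer in the loop above is a global counter: it combines the epoch index and the batch index so that the X axis in TensorBoard keeps increasing across epochs. A small sketch of that arithmetic follows. `global_step` is a hypothetical helper name, and the batch count and `display_step` value are assumed purely for illustration.

```python
def global_step(epoch, batch_index, batches_per_epoch):
    """1-based step counter that keeps increasing across epochs."""
    return epoch * batches_per_epoch + batch_index + 1


batches_per_epoch = 4  # stands in for len(data_loader); assumed value
display_step = 3       # assumed value for illustration
num_epochs = 2

steps = []
image_steps = []
for epoch in range(num_epochs):
    for i in range(batches_per_epoch):
        step = global_step(epoch, i, batches_per_epoch)
        steps.append(step)
        # Same condition the training loop uses to log sample images
        if step % display_step == 0:
            image_steps.append(step)

print(steps)        # [1, 2, 3, 4, 5, 6, 7, 8]
print(image_steps)  # [3, 6]
```

Because the counter never resets, scalars and images logged in different epochs line up on a single timeline.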
tensorboard --logdir=./Norm
Click the link printed by the command above (or enter http://localhost:6006 in your browser) to open the TensorBoard web interface:
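When visualizing a deep learning model with TensorBoard, the most commonly used features are Scalars, Images, and Time Series:
Scalars presents scalar values recorded during training, such as the loss, accuracy, and learning rate.
- With Scalars you can watch how these values change as the training steps progress;
- You can also plot several scalars together to analyze how they relate to each other and how they trend.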
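toggle y-axis log scale
(switches the Y axis between linear and logarithmic scale)
Alt+Scroll to Zoom
(hold Alt and scroll the mouse wheel to zoom)
fit domain to data
(in plain terms: restores the full view with one click after zooming)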
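To see why the log-scale toggle is useful, consider a loss that shrinks by a constant factor. On a linear axis the later improvements are too small to see, while on a log axis every drop has the same height. The sketch below uses a made-up loss curve to show this.

```python
import math

# Made-up loss curve that shrinks by a factor of 10 every 10 steps.
steps = [0, 10, 20, 30, 40]
losses = [10 ** (-s / 10) for s in steps]

# Drop between consecutive points on a linear axis...
linear_gaps = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]
# ...and on a log10 axis.
log_gaps = [
    math.log10(losses[i]) - math.log10(losses[i + 1])
    for i in range(len(losses) - 1)
]

print(linear_gaps)  # each gap is about 10x smaller than the previous one
print(log_gaps)     # each gap is about 1.0
```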
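Select the data to display (the checkboxes on the left allow multiple selections; the radio buttons on the right select a single item):
(comparing experiment results)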
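The Images feature can display images generated by the model, as well as image-like information such as intermediate-layer activations and filters.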
The RESET button on the right restores the default values.
In Pillow 10.0.0 the ANTIALIAS constant was removed from the Image module. Replace Image.ANTIALIAS with either of the following:
Image.LANCZOS
Image.Resampling.LANCZOS
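If the same code has to run across several Pillow versions, one option is to look the filter up at runtime. This is a hedged sketch rather than an official recipe. `lanczos_filter` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins for the `PIL.Image` module so the snippet runs without Pillow installed.

```python
from types import SimpleNamespace


def lanczos_filter(image_module):
    """Return the LANCZOS resampling filter from a PIL.Image-like module.

    Prefers Image.Resampling.LANCZOS and falls back to Image.LANCZOS
    when the Resampling enum is not present.
    """
    resampling = getattr(image_module, "Resampling", None)
    if resampling is not None:
        return resampling.LANCZOS
    return image_module.LANCZOS


# Hypothetical stand-ins for PIL.Image; the string values are placeholders
# that only show which attribute was picked.
with_enum = SimpleNamespace(
    Resampling=SimpleNamespace(LANCZOS="from Resampling"),
    LANCZOS="from Image",
)
without_enum = SimpleNamespace(LANCZOS="from Image")

print(lanczos_filter(with_enum))
print(lanczos_filter(without_enum))
```

With Pillow installed, the call would look like `img.resize((width, height), lanczos_filter(Image))` after `from PIL import Image`.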
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
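Before applying the first workaround, it can help to check which protobuf version is installed. The sketch below reads it with the standard library and compares it with the 3.20.x limit named in the error message. `installed_protobuf_version` and `protobuf_needs_downgrade` are hypothetical helper names, and the comparison assumes a plain dotted numeric version string such as "4.21.0".

```python
from importlib import metadata


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if it is absent."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


def protobuf_needs_downgrade(version):
    """Return True if `version` is newer than the 3.20.x series."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > (3, 20)


current = installed_protobuf_version()
if current is None:
    print("protobuf is not installed in this environment")
elif protobuf_needs_downgrade(current):
    print(f"protobuf {current} is newer than 3.20.x; workaround 1 applies")
else:
    print(f"protobuf {current} is already 3.20.x or lower")
```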
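One way to get a compatible version is to install tensorboard through conda; in the log below this pulls in protobuf 3.20.3, which is within the 3.20.x range suggested by the error message:
conda install tensorboard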
## Package Plan ##

  environment location: E:\Software\anaconda3\envs\DL

  added / updated specs:
    - tensorboard

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    werkzeug-2.3.8             |  py311haa95532_0         445 KB  defaults
    ------------------------------------------------------------
                                           Total:         445 KB

The following NEW packages will be INSTALLED:

  protobuf           anaconda/pkgs/main/win-64::protobuf-3.20.3-py311hd77b12b_0
  werkzeug           anaconda/pkgs/main/win-64::werkzeug-2.3.8-py311haa95532_0

Proceed ([y]/n)?