Debugging a torchrun-launched distributed training program in VSCode

The train.sh file launches training through torchrun as follows:

#!/bin/bash

py3clean ./
CUDA_VISIBLE_DEVICES=3 torchrun --nproc_per_node=1 --master_port=9006 tools/train.py \
                        configs/basicvsr_plusplus_vimeo90k_bd.py \
                        --seed 0

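To debug this in VSCode, the command has to be translated into a launch.json configuration like the one below. Note that this configuration runs tools/train.py directly in a single process instead of going through torchrun, so it works as long as the script does not require the environment variables that torchrun sets. The file lives at .vscode/launch.json in the workspace; if it does not exist yet, it can be created from the Run and Debug view via "create a launch.json file".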

{
    "version": "0.2.0",
    "configurations": [
        {
            "python":"${command:python.interpreterPath}",
            "name": "Debug Training Script",
            "type": "python",
            "request": "launch",
            "program": "tools/train.py",
            "args": [
                "--seed",
                "0",
                "configs/basicvsr_plusplus_vimeo90k_bd.py"
            ],
            "cwd": "${workspaceFolder}",
            "env": {
                "CUDA_VISIBLE_DEVICES": "3"
            },
            "console": "integratedTerminal",
            "stopOnEntry": false,
            "justMyCode": false
        }
    ]
}
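
The configuration above bypasses torchrun entirely. If the training script depends on the launcher (for example, it reads RANK, LOCAL_RANK, or WORLD_SIZE from the environment), one possible alternative is to debug the launcher itself. The entry below is only a sketch under assumptions, not part of the original setup: it relies on torchrun being the console entry point of the torch.distributed.run module, and on the debugger's subProcess option to follow the worker process that the launcher spawns. The configuration name is made up for illustration, and whether breakpoints inside tools/train.py are actually hit may depend on the installed PyTorch and debugger versions. It would be added as a second item in the "configurations" array (launch.json accepts // comments):

```json
{
    // Hypothetical name, chosen for illustration only
    "name": "Debug Training Script (via torchrun module)",
    "type": "python",
    "request": "launch",
    // Assumption: equivalent to invoking the torchrun command
    "module": "torch.distributed.run",
    "args": [
        "--nproc_per_node=1",
        "--master_port=9006",
        "tools/train.py",
        "configs/basicvsr_plusplus_vimeo90k_bd.py",
        "--seed",
        "0"
    ],
    "cwd": "${workspaceFolder}",
    "env": {
        "CUDA_VISIBLE_DEVICES": "3"
    },
    "console": "integratedTerminal",
    // Assumption: lets the debugger attach to the spawned worker process
    "subProcess": true,
    "justMyCode": false
}
```

The arguments mirror the torchrun line in train.sh one to one, so any change to the shell script should be repeated here. Keeping --nproc_per_node at 1 keeps the session to a single worker, which is usually easier to step through.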

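Then open the Run and Debug view (Ctrl+Shift+D), select "Debug Training Script" from the configuration dropdown, and click the green Start Debugging button (or press F5) to begin debugging.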

[Image 1: Debugging a torchrun-launched distributed training program in VSCode]
