To facilitate academic exchange, Megvii Research has open-sourced VideoAnalyst, a PyTorch-based training project centered on deep learning tasks. It currently uses the SiamFC++ algorithm as its reference project. By analyzing the structural characteristics of typical deep learning training/testing codebases, the authors developed a training/evaluation framework that emphasizes task extensibility. The system consists of five modules:
Tip: it is worth reading through this project's architecture carefully.
├── experiments # experiment configurations in yaml format (network structure, dataset settings, etc.); the whole project runs from these config files, so configuration is system construction
├── main
│ ├── train.py # training entry point, fully integrated; once the modules are built it can be run directly: python3 main/train.py (or test.py) -cfg configfile.yaml
│ └── test.py # test entry point
├── videoanalyst
│ ├── data # modules related to data
│ │ ├── dataset # data fetcher of each individual dataset
│ │ ├── sampler # data sampler, including inter-dataset and intra-dataset sampling procedures
│ │ ├── dataloader.py # data loading procedure
│ │ └── transformer # data augmentation
│ ├── engine # procedure controller, including training control / hp&model loading
│ │ ├── monitor # monitor for tasks during training, including visualization / logging / benchmarking
│ │ ├── trainer.py # train one epoch
│ │ ├── tester.py # test a model on a benchmark
│ ├── model # model builder
│ │ ├── backbone # backbone network builder
│ │ ├── common_opr # shared operator (e.g. cross-correlation)
│ │ ├── task_model # holistic model builder
│ │ ├── task_head # head network builder
│ │ └── loss # loss builder
│ ├── pipeline # pipeline builder (tracking / vos)
│ │ ├── segmenter # segmenter builder for vos
│ │ ├── tracker # tracker builder for tracking
│ │ └── utils # pipeline utils
│ ├── config # configuration manager
│ ├── evaluation # benchmark
│ ├── optim # optimization-related module (learning rate, gradient clipping, etc.)
│ │ ├── optimizer # optimizer
│ │ ├── scheduler # learning rate scheduler
│ │ └── grad_modifier # gradient-related operation (parameter freezing)
│ └── utils # useful tools
└── README.md
This is a common design that has been adopted by several mainstream deep learning codebases such as MMDetection and Detectron2.
The main idea of the registration mechanism is to build a dictionary whose keys are module names and whose values are module class objects; the whole pipeline (e.g. a pipeline's tracker/segmenter/trainer) is then built by looking up these class objects and instantiating them according to a predefined configuration file.
Below is an example of how the registration mechanism is used (the dictionary is built in XXX_base.py, and the module classes are implemented in XXX_impl/YYY.py):
# In XXX_base.py
from videoanalyst.utils import Registry
TRACK_TEMPLATE_MODULES = Registry('TRACK_TEMPLATE_MODULE')
VOS_TEMPLATE_MODULES = Registry('VOS_TEMPLATE_MODULE')
TASK_TEMPLATE_MODULES = dict(
    track=TRACK_TEMPLATE_MODULES,
    vos=VOS_TEMPLATE_MODULES,
)
# In XXX_impl/YYY.py
@TRACK_TEMPLATE_MODULES.register
@VOS_TEMPLATE_MODULES.register
class TemplateModuleImplementation(TemplateModuleBase):
    ...
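The other half of the mechanism is a builder that looks up the registered class by the name given in the config file and instantiates it. A minimal sketch is shown below, assuming a Registry behaves like a dictionary keyed by class name; the function name and the cfg.name field are illustrative, not the project's exact builder API.
# Sketch of a builder consuming the registries above (illustrative only)
def build_template_module(task: str, cfg):
    modules = TASK_TEMPLATE_MODULES[task]   # pick the per-task registry ("track" / "vos")
    module_cls = modules[cfg.name]          # look up the class registered under cfg.name
    return module_cls()                     # instantiate it with default settings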
videoanalyst defines its configuration hierarchically on top of yaml and yacs.
Default TEMPLATES are provided under video_analyst/docs/TEMPLATES.
They are meant for building your own modules: replace TemplateModule, TEMPLATE_MODULES, and template_module under TEMPLATES with your own module names.
At a minimum, the parts marked in the template need to be overwritten/completed.
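For example, a module derived from the template typically ends up looking like the sketch below; the names MyModule / MY_MODULES and the default_hyper_params / update_params members are assumptions based on the registration pattern above, not quoted from the template files.
# Hypothetical module derived from the template (illustrative only)
from videoanalyst.utils import Registry

MY_MODULES = Registry('MY_MODULE')

@MY_MODULES.register
class MyModule:
    default_hyper_params = dict(hidden_dim=256)   # assumed hyper-parameter block

    def __init__(self):
        self._hyper_params = dict(self.default_hyper_params)

    def update_params(self):
        # assumed hook: consume hyper-parameters loaded from the yaml config
        ...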
Because the framework focuses on the design and implementation of a multi-task architecture and follows the open-closed principle, video_analyst extends very well to new tasks.
For example, to add an extra ShuffleNetV2 experiment, no tracked file needs to be modified; it is enough to add a few ShuffleNetV2-related files (git status shows only untracked additions, and a sketch of the new backbone file follows the list):
Untracked files:
(use "git add ..." to include in what will be committed)
experiments/siamfcpp/train/siamfcpp_shufflenetv2x0_5-trn.yaml
experiments/siamfcpp/train/siamfcpp_shufflenetv2x1_0-trn.yaml
tools/train_test-shufflenetv2x0_5.sh
tools/train_test-shufflenetv2x1_0.sh
videoanalyst/model/backbone/backbone_impl/shufflenet_v2.py
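As a hedged sketch (not quoted from the repository), the new backbone file follows exactly the registration pattern shown earlier; the registry names TRACK_BACKBONES / VOS_BACKBONES and the ModuleBase base class are assumptions based on that pattern.
# Sketch of videoanalyst/model/backbone/backbone_impl/shufflenet_v2.py (illustrative only)
import torch.nn as nn

from videoanalyst.model.backbone.backbone_base import TRACK_BACKBONES, VOS_BACKBONES  # assumed registries
from videoanalyst.model.module_base import ModuleBase  # assumed base class

@TRACK_BACKBONES.register
@VOS_BACKBONES.register
class ShuffleNetV2_x1_0(ModuleBase):
    default_hyper_params = dict(pretrain_model_path="")  # assumed hyper-parameters

    def __init__(self):
        super().__init__()
        self.features = nn.Identity()  # placeholder for the real ShuffleNetV2 stages

    def forward(self, x):
        return self.features(x)
The two new yaml files under experiments/ then select this backbone by its registered name, and the shell scripts under tools/ launch train/test with those configs, so nothing that is already tracked needs to change.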
Trainer
├── Dataloader (pytorch) # make batches of training data
│ └── AdaptorDataset (pytorch) # adaptor class (pytorch index dataset)
│ └── Datapipeline # integrate data sampling, data augmentation, and target making process
│ ├── Sampler # define sampling strategy
│ │ ├── Dataset # dataset interface
│ │ └── Filter # define rules to filter out invalid samples
│ ├── Transformer # data augmentation
│ └── Target # target making
├── Optimizer
│ ├── Optimizer (pytorch) # pytorch optimizer
│ │ ├── lr_scheduler # learning rate scheduling
│ │ └── lr_multiplier # learning rate multiplication ratio
│ ├── Grad_modifier # grad clip, dynamic freezing, etc.
│ └── TaskModel # model, subclass of pytorch module (torch.nn.Module)
│ ├── Backbone # feature extractor
│ ├── Neck # mid-level feature map operation (e.g. cross-correlation)
│ └── Head # task head (bbox regressor, mask decoder, etc.)
└── Monitor # monitoring (e.g. pbar.set_description, tensorboard, etc.)
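In pseudocode, the training entry wires these components together through builders roughly as follows (illustrative call signatures, not the exact builder API):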
model = builder.build(model_cfg) # model_cfg.loss_cfg, model.loss
optimizer = builder.build(optim_cfg, model) # optimizer wraps the model's parameters
dataloader = builder.build(data_cfg) # batches produced by the data pipeline
trainer = builder.build(trainer_cfg, optimizer, dataloader) # drives one epoch at a time
Tester
├── Benchmark implementation # depend on concrete benchmarks (e.g. VOT / GOT-10k / LaSOT / etc.)
└── Pipeline # manipulate underlying nn model and perform pre/post-processing
└── TaskModel # underlying neural network
├── Backbone # feature extractor
├── Neck # mid-level feature map operation (e.g. cross-correlation)
└── Head # task head (bbox regressor, mask decoder, etc.)
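By analogy with the trainer, the test entry assembles these pieces roughly as follows (same pseudocode style as above, not the exact builder API):
model = builder.build(model_cfg)
pipeline = builder.build(pipeline_cfg, model) # wraps the model with pre/post-processing
tester = builder.build(tester_cfg, pipeline) # runs the pipeline on a concrete benchmark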
The loggers currently in use are listed below.
For debugging / unit testing, insert the following code at the top of the source file. Note that the file being debugged must not contain relative imports (this is why absolute imports are always encouraged). You can find this code in many of the project's path.py files.
import os.path as osp
import sys # isort:skip
module_name = "videoanalyst"
p = __file__
# walk upward from this file until the videoanalyst package directory is reached
while osp.basename(p) != module_name:
    p = osp.dirname(p)
ROOT_PATH = osp.dirname(p)  # repository root, i.e. the parent of the package
ROOT_CFG = osp.join(ROOT_PATH, 'config.yaml')
sys.path.insert(0, ROOT_PATH) # isort:skip
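With the repository root on sys.path, absolute imports such as from videoanalyst.utils import Registry keep resolving even when the file is executed directly for debugging.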
Assuming that ① the .yaml file at exp_cfg_path sits at the same level as videoanalyst and ② you have a GPU with index 0, the following snippet instantiates and configures a pipeline object that is ready to use in your own application. It supports the following API: void init(im, state) and state update(im).
import os.path as osp

import cv2
import torch
from loguru import logger  # videoanalyst uses loguru for logging

from videoanalyst.config.config import cfg as root_cfg
from videoanalyst.config.config import specify_task
from videoanalyst.model import builder as model_builder
from videoanalyst.pipeline import builder as pipeline_builder

# load the experiment configuration (parsed_args.config comes from your own argument parser)
exp_cfg_path = osp.realpath(parsed_args.config)
root_cfg.merge_from_file(exp_cfg_path)
logger.info("Load experiment configuration at: %s" % exp_cfg_path)

# resolve config
task, task_cfg = specify_task(root_cfg)
task_cfg.freeze()

# build model
model = model_builder.build_model(task, task_cfg.model)
# build pipeline
pipeline = pipeline_builder.build('track', task_cfg.pipeline, model)
pipeline.set_device(torch.device("cuda:0"))

# register your template
im_template = cv2.imread("test file")  # placeholder image path
state_template = ...
pipeline.init(im_template, state_template)

# perform tracking based on your template
im_current = cv2.imread("test file")  # placeholder image path
state_current = pipeline.update(im_current)
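For instance, a minimal per-frame loop built on this API might look like the sketch below; the video path and the (x, y, w, h) format of the initial box are assumptions for illustration, not taken from the project.
cap = cv2.VideoCapture("my_video.mp4")  # hypothetical input video
ok, frame = cap.read()
init_box = (100, 120, 60, 80)           # assumed (x, y, w, h) template box
pipeline.init(frame, init_box)          # register the template once
while True:
    ok, frame = cap.read()
    if not ok:
        break
    pred_state = pipeline.update(frame) # predicted state for the current frame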