1、Ensemble Test
python val.py --weights yolov5x.pt yolov5l6.pt --data coco.yaml --img 640 --half
2、Ensemble Inference
python detect.py --weights model1.pt model2.pt --augment
python detect.py --weights yolov5x.pt yolov5l6.pt --img 640 --source data/images
Ensembling runs multiple models, while TTA runs a single model with different augmentations. Typically I've seen the best results when merging output grids directly (e.g. ensembling YOLOv5l and YOLOv5x), rather than simply appending boxes from multiple models for NMS to sort out. This is not always possible, however: when ensembling an EfficientDet model with YOLOv5x, for example, you cannot merge grids and must use NMS or WBF (or Merge NMS) to get a final result.
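The fallback mentioned above, appending boxes from several models and letting NMS sort them out, can be sketched roughly as follows. This is a minimal pure-Python illustration, not YOLOv5's own implementation: the function names (`iou`, `nms`, `ensemble_by_nms`) and the sample detections are hypothetical, and it assumes a single class with boxes given as (x1, y1, x2, y2).

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_thres=0.5):
    """Greedy NMS over (box, score) pairs; returns kept pairs, best score first."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


def ensemble_by_nms(per_model_detections, iou_thres=0.5):
    """Pool the detections of every model, then let NMS remove duplicates."""
    pooled = [d for dets in per_model_detections for d in dets]
    return nms(pooled, iou_thres)


# Hypothetical single-class outputs from two models on the same image.
model_a = [((10, 10, 50, 50), 0.90), ((100, 100, 150, 150), 0.60)]
model_b = [((12, 11, 52, 49), 0.80), ((200, 200, 240, 240), 0.70)]
merged = ensemble_by_nms([model_a, model_b])
```

Here the two overlapping boxes near (10, 10) collapse into the higher-scoring one, while the boxes that only one model found are kept. WBF would instead average the overlapping boxes rather than discard the weaker one.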
python val.py --weights yolov5x.pt --data coco.yaml --img 832 --augment --half
python detect.py --weights yolov5s.pt --img 832 --source data/images --augment
You can customize the TTA ops applied in the YOLOv5 forward_augment() method here:
def forward_augment(self, x):
    img_size = x.shape[-2:]  # height, width
    s = [1, 0.83, 0.67]  # scales
    f = [None, 3, None]  # flips (2-ud, 3-lr)
    y = []  # outputs
    for si, fi in zip(s, f):
        xi = scale_img(x.flip(fi) if fi else x, si, gs=int(self.stride.max()))
        yi = self.forward_once(xi)[0]  # forward
        # cv2.imwrite(f'img_{si}.jpg', 255 * xi[0].cpu().numpy().transpose((1, 2, 0))[:, :, ::-1])  # save
        yi = self._descale_pred(yi, fi, si, img_size)  # map predictions back to the original image
        y.append(yi)  # collect this augmentation's predictions
    return torch.cat(y, 1), None  # augmented inference, train
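To make the de-scaling step in the loop above more concrete, here is a simplified sketch of what mapping one predicted box back to the original image involves. It is an assumption about the general idea, not the actual `_descale_pred` code: the function name `descale_box` is hypothetical, and it works on a single (x_center, y_center, w, h) tuple instead of a tensor.

```python
def descale_box(box, flip, scale, img_size):
    """Map an (x_center, y_center, w, h) box from an augmented image back to the original.

    flip: None, 2 (up-down) or 3 (left-right), matching the flips list above.
    scale: the factor the image was resized by before inference.
    img_size: (height, width) of the original image.
    """
    x, y, w, h = (v / scale for v in box)  # undo the resize first
    if flip == 2:
        y = img_size[0] - y  # undo the up-down flip
    elif flip == 3:
        x = img_size[1] - x  # undo the left-right flip
    return (x, y, w, h)


# A box centred at (100, 200) in a 640x640 image appears at (270, 100)
# after a left-right flip followed by a 0.5x resize.
restored = descale_box((270, 100, 20, 30), flip=3, scale=0.5, img_size=(640, 640))
```

Once every augmented prediction is back in the original image's coordinates, the per-augmentation outputs can be concatenated and passed through NMS as usual.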