PyTorch, sync batch norm and DistributedDataParallel (DDP)
Background
PyTorch computes batch statistics separately for each device.
The default behavior of batch norm, in PyTorch and most other frameworks, is to compute batch statistics separately for each device. Meaning that, if we use a model