When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism
Abstract
Introduction
Related Work
Shift Block
Architecture Variants
Experiment
Ablation Study
  Expansion ratio τ in the MLP
  Percentage of shifted channels
  Number of shifted pixels
  ViT-style training scheme
Conclusion
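The shift block named in the outline replaces attention with a zero-parameter spatial shift: a small fraction of channels is displaced by a few pixels in each of the four directions, while the remaining channels pass through unchanged. A minimal NumPy sketch of this idea is below; the function name, the per-direction fraction, and the default shift distance are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def shift_features(x, shift_ratio=1/12, shift_pixels=1):
    """Sketch of a partial spatial shift (illustrative, not the paper's code).

    x: array of shape (B, C, H, W).
    shift_ratio: fraction of channels shifted per direction (assumed value).
    shift_pixels: how many pixels to shift (assumed value).
    Vacated positions are zero-padded; untouched channels are copied as-is.
    """
    B, C, H, W = x.shape
    g = int(C * shift_ratio)  # channels assigned to each shift direction
    p = shift_pixels
    out = np.zeros_like(x)
    out[:, 0*g:1*g, :, p:] = x[:, 0*g:1*g, :, :-p]   # shift right
    out[:, 1*g:2*g, :, :-p] = x[:, 1*g:2*g, :, p:]   # shift left
    out[:, 2*g:3*g, p:, :] = x[:, 2*g:3*g, :-p, :]   # shift down
    out[:, 3*g:4*g, :-p, :] = x[:, 3*g:4*g, p:, :]   # shift up
    out[:, 4*g:] = x[:, 4*g:]                        # remaining channels unchanged
    return out

x = np.random.randn(1, 48, 8, 8)
y = shift_features(x)
print(y.shape)  # (1, 48, 8, 8)
```

Because the shift itself has no learnable parameters, all of the block's capacity lives in the MLP that follows it, which is why the ablations above study the MLP expansion ratio and the shifted-channel percentage.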