ENAS parameter sharing: CVPR2020-Paper-Code-Interpretation

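CVPR2020 latest information and paper download thread (Papers/Codes/Project/PaperReading/Demos/live-stream talks/paper sharing sessions, etc.)

Official website: http://cvpr2020.thecvf.com/

Dates: June 14-19, 2020, Seattle, Washington

Acceptance results announced: February 24, 2020

Related questions:

Table of contents

1. CVPR2020 accepted papers (continuously updated)

Summary by category

Contents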

Object detection

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Paper: https://arxiv.org/abs/1908.01998

AugFPN: Improving Multi-scale Feature Learning for Object Detection

Paper: https://arxiv.org/abs/1912.05384

Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection

Paper: https://arxiv.org/abs/2003.11818

Code: https://github.com/ggjy/HitDet.pytorch

Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

Paper: https://arxiv.org/abs/2003.08813

CentripetalNet: Pursuing High-quality Keypoint Pairs for Object Detection

Paper: https://arxiv.org/abs/2003.09119

Code: https://github.com/KiveeDong/CentripetalNet

Face recognition

Object tracking

3D point clouds / 3D reconstruction / 3D detection / 3D segmentation / depth estimation

3D point clouds & reconstruction

PointAugment: an Auto-Augmentation Framework for Point Cloud Classification

Paper: https://arxiv.org/abs/2002.10876

Code: https://github.com/liruihui/PointAugment/

Learning multiview 3D point cloud registration

Paper: https://arxiv.org/abs/2001.05119

C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds

Paper: https://arxiv.org/abs/1912.07009

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

Paper: https://arxiv.org/abs/1911.11236

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Paper: https://arxiv.org/abs/2002.12212

Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion

Paper: https://arxiv.org/abs/2003.01456

In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks

Paper: https://arxiv.org/pdf/1911.11924.pdf

Attentive Context Normalization for Robust Permutation-Equivariant Learning

Paper: https://arxiv.org/abs/1907.02545

Authors: Weiwei Sun, Wei Jiang, Eduard Trulls, Andrea Tagliasacchi, Kwang Moo Yi

PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes

Paper: https://arxiv.org/abs/1911.10949

SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans

Paper: https://arxiv.org/abs/1912.00036

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching

Paper: https://arxiv.org/abs/1912.06378

Code: https://github.com/alibaba/cascade-stereo

3D reconstruction

Leveraging 2D Data to Learn Textured 3D Mesh Generation

Paper: https://arxiv.org/abs/2004.04180

ARCH: Animatable Reconstruction of Clothed Humans

Paper: https://arxiv.org/abs/2004.04572

Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions

Paper: https://arxiv.org/abs/2004.03967

Image recognition

Image feature matching

Image captioning

Normalized and Geometry-Aware Self-Attention Network for Image Captioning

Paper: https://arxiv.org/abs/2003.08897

Image processing

Single Image Reflection Removal through Cascaded Refinement

Paper: https://arxiv.org/abs/1911.06634

Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data

Paper: https://arxiv.org/abs/2002.11297

RoutedFusion: Learning Real-time Depth Map Fusion

Paper: https://arxiv.org/pdf/2001.04388.pdf

Neural Contours: Learning to Draw Lines from 3D Shapes

Paper: https://arxiv.org/abs/2003.10333

Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content

Paper: https://arxiv.org/abs/2003.05863

Image classification

Image Matching across Wide Baselines: From Paper to Practice

Paper: https://arxiv.org/abs/2003.01587

Towards Robust Image Classification Using Sequential Attention Models

Paper: https://arxiv.org/abs/1912.02184

Learning in the Frequency Domain

Paper: https://arxiv.org/abs/2002.12416

Learning from Web Data with Memory Module

Paper: https://arxiv.org/abs/1906.12028

Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks

Paper: https://arxiv.org/abs/1912.09393

Image segmentation

Deep Snake for Real-Time Instance Segmentation

Paper: https://arxiv.org/abs/2001.01629

SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks

Paper: https://arxiv.org/abs/2003.00678

PolarMask: Single Shot Instance Segmentation with Polar Representation

Paper: https://arxiv.org/abs/1909.13226

Code: https://github.com/xieenze/PolarMask

xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation

Paper: https://arxiv.org/abs/1911.12676

BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation

Paper: https://arxiv.org/abs/2001.00309

Enhancing Generic Segmentation with Learned Region Representations

Paper: https://arxiv.org/abs/1911.08564

Pose estimation / action recognition

Distribution-Aware Coordinate Representation for Human Pose Estimation

Paper: https://arxiv.org/abs/1910.06278

Code: https://github.com/ilovepose/DarkPose

4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras

Paper: https://arxiv.org/abs/2002.12625

Optimal least-squares solution to the hand-eye calibration problem

Paper: https://arxiv.org/abs/2002.10838

D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry

Paper: https://arxiv.org/abs/2003.01060

Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

Paper: https://arxiv.org/abs/2001.09691

The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation

Paper: https://arxiv.org/abs/1911.07524

PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation

Paper: https://arxiv.org/abs/1911.04231

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation

Paper: https://arxiv.org/abs/2003.02824

G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features

Paper: https://arxiv.org/abs/2003.11089

Deep Image Spatial Transformation for Person Image Generation

Paper: https://arxiv.org/abs/2003.00696

Code: https://github.com/RenYurui/Global-Flow-Local-Attention

Video analysis

Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

Paper: https://arxiv.org/abs/2003.00387

Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning

Paper: https://arxiv.org/abs/2003.00392

Object Relational Graph with Teacher-Recommended Learning for Video Captioning

Paper: https://arxiv.org/abs/2002.11566

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution

Paper: https://arxiv.org/abs/2002.11616

Blurry Video Frame Interpolation

Paper: https://arxiv.org/abs/2002.12259

Hierarchical Conditional Relation Networks for Video Question Answering

Paper: https://arxiv.org/abs/2002.10698

Action Modifiers: Learning from Adverbs in Instructional Videos

Paper: https://arxiv.org/abs/1912.06617

MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask (video analysis: optical flow estimation)

Paper: https://arxiv.org/abs/2003.10955

Code: https://github.com/microsoft/MaskFlownet

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects (video prediction)

Paper: https://arxiv.org/abs/2003.12045

Code: https://ehsanik.github.io/forcecvpr2020

OCR

GAN

Few-shot / zero-shot

Weakly supervised / unsupervised / self-supervised

NestedVAE: Isolating Common Factors via Weak Supervision

Paper: https://arxiv.org/abs/2002.11576

Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation

Paper: https://arxiv.org/abs/1911.07450

Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction

Paper: https://arxiv.org/abs/2003.01460

ClusterFit: Improving Generalization of Visual Representations

Paper: https://arxiv.org/abs/1912.03330

Auto-Encoding Twin-Bottleneck Hashing

Paper: https://arxiv.org/abs/2002.11930

Learning Representations by Predicting Bags of Visual Words

Paper: https://arxiv.org/abs/2002.12247

A Characteristic Function Approach to Deep Implicit Generative Modeling

Paper: https://arxiv.org/abs/1909.07425

Pedestrian tracking / pedestrian detection / ReID

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction

Paper: https://arxiv.org/abs/2002.11927

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction

Paper: https://arxiv.org/abs/1912.06445

Neural networks / model compression / model acceleration

Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral

Paper: https://arxiv.org/abs/2003.01826

GPU-Accelerated Mobile Multi-view Style Transfer

Paper: https://arxiv.org/abs/2003.00706

Holistically-Attracted Wireframe Parsing

Paper: https://arxiv.org/abs/2003.01663

AdderNet: Do We Really Need Multiplications in Deep Learning?

Paper: https://arxiv.org/abs/1912.13200

CARS: Continuous Evolution for Efficient Neural Architecture Search

Paper: https://arxiv.org/abs/1909.04977

Code: https://github.com/huawei-noah/CARS

Π-nets: Deep Polynomial Neural Networks

Paper: https://arxiv.org/abs/2003.03828

Explaining Knowledge Distillation by Quantifying the Knowledge

Paper: https://arxiv.org/abs/2003.03622

Super-resolution

Visual commonsense / other

Scalable Uncertainty for Computer Vision with Functional Variational Inference

Paper: https://arxiv.org/abs/2003.03396

Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective

Paper: https://arxiv.org/abs/2002.10826

Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs

Paper: https://arxiv.org/abs/2003.00287

12-in-1: Multi-Task Vision and Language Representation Learning

Paper: https://arxiv.org/abs/1912.02315

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training

Paper: https://arxiv.org/abs/2002.10638

Code: https://github.com/weituo12321/PREVALENT

Unbiased Scene Graph Generation from Biased Training

Paper: https://arxiv.org/abs/2002.11949

Towards Visually Explaining Variational Autoencoders

Paper: https://arxiv.org/abs/1911.07389

BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition

Paper: http://www.weixiushen.com/publication/cvpr20_BBN.pdf

Code: https://github.com/Megvii-Nanjing/BBN

High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks

Paper: https://arxiv.org/abs/1905.13545

SAM: The Sensitivity of Attribution Methods to Hyperparameters

Paper: http://s.anhnguyen.me/sam_cvpr2020.pdf

Code: https://github.com/anguyen8/sam

Towards Backward-Compatible Representation Learning

Paper: https://arxiv.org/abs/2003.11942

On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location

Paper: https://arxiv.org/abs/2003.07064

KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations (dataset)

Paper: https://arxiv.org/abs/2002.12687

2. CVPR2020 Oral (continuously updated)

3. CVPR2020 paper interpretations

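More and more researchers are looking at how to bring causality from statistics into deep learning to improve robustness, interpretability, and so on. Most of this work, however, does not go deep into causal theory and mainly borrows a few of its concepts (such as counterfactuals); this paper aims to go one step further on that basis.

Paper: https://arxiv.org/abs/2002.12204

Code: https://github.com/Wangt-CN/VC-R-CNN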

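The authors chose facebookresearch/maskrcnn-benchmark, a popular framework in 2019, as the foundation and built Scene-Graph-Benchmark.pytorch on top of it. The code is compatible with every detector model that maskrcnn-benchmark supports and, thanks to the quality of the facebookresearch codebase, makes the SGG (scene graph generation) part considerably more readable and easier to work with.

Paper: https://arxiv.org/abs/2002.11949

Code: https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch

Paper: https://arxiv.org/abs/1911.04231

Code: https://github.com/ethnhe/PVN3D.git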

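Megvii Research proposes PVN3D, a 3D keypoint detection network based on Hough voting: it learns point-wise offsets to 3D keypoints and lets each point vote for them. Moving from 2D keypoints to 3D keypoints makes full use of the geometric constraints of rigid bodies and greatly improves the accuracy of 6DoF pose estimation. In experiments on the two public datasets YCB-Video and LineMOD, the method achieves state-of-the-art performance by a large margin.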
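
The voting idea can be illustrated with a minimal sketch. It assumes the per-point offsets have already been predicted, and it aggregates votes with a coordinate-wise median, which is a simple stand-in chosen here and not necessarily the clustering step PVN3D uses. The function name and toy data are hypothetical.

```python
from statistics import median


def vote_keypoint(points, offsets):
    """Aggregate per-point votes into one 3D keypoint estimate.

    points:  (x, y, z) points on the object
    offsets: predicted (dx, dy, dz) from each point to the keypoint
    """
    # Each point casts a vote: its own position plus its predicted offset.
    votes = [tuple(p + o for p, o in zip(pt, off))
             for pt, off in zip(points, offsets)]
    # Coordinate-wise median: an assumed, simple robust aggregate of the votes.
    return tuple(median(v[axis] for v in votes) for axis in range(3))


# Toy data (hypothetical): three consistent votes and one outlier vote.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
offsets = [(1.0, 1.0, 1.0), (0.0, 1.0, 1.0), (1.0, -1.0, 1.0), (3.0, 3.0, 3.0)]
keypoint = vote_keypoint(points, offsets)
print(keypoint)
```

With several keypoints recovered this way, the 6DoF pose would then be fitted to them; that step is omitted here.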

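Paper: https://arxiv.org/abs/2002.12489

This work addresses infrared-RGB cross-modality person re-identification. Most earlier cross-modality ReID algorithms focus only on shared feature learning and pay little attention to modality-specific features, because a specific feature does not exist in the other modality: an infrared image carries no color information, and a color image carries no thermal information. Yet anyone who has worked on ReID knows that conventional ReID performs so well largely because it "overfits" to such specific information; clothing color, for example, has always been an important cue. Starting from this observation, the paper tries to exploit specific features. The main idea is to use neighbor information: given an infrared query, when searching for color targets, first find some easy, high-confidence color samples (which are very likely positives for the infrared query) and transfer their color-specific features to the infrared query. The infrared query can then use this color information to retrieve harder color samples.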
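
The neighbor-based transfer described above might look roughly like the following sketch. It assumes each gallery image already has a shared feature and a color-specific feature, ranks gallery images by cosine similarity on the shared feature, and averages the color-specific features of the top-k neighbors. The averaging rule, the value of k, and all names are illustrative assumptions, not the paper's actual procedure.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def borrow_specific(query_shared, gallery, k=2):
    """Give an infrared query a color-specific feature from its neighbors.

    gallery: list of (shared_feature, color_specific_feature) pairs
    """
    # Rank color images by similarity of their shared features to the query.
    ranked = sorted(gallery, key=lambda g: cosine(query_shared, g[0]),
                    reverse=True)
    top = ranked[:k]  # assumed to be mostly positives for the query
    dims = len(top[0][1])
    # Average the neighbors' color-specific features (assumed rule).
    return [sum(g[1][d] for g in top) / len(top) for d in range(dims)]


# Hypothetical toy features.
query_shared = [1.0, 0.0]
gallery = [
    ([0.9, 0.1], [1.0, 0.0]),
    ([1.0, 0.2], [0.8, 0.2]),
    ([0.0, 1.0], [0.0, 1.0]),
]
borrowed = borrow_specific(query_shared, gallery, k=2)
print(borrowed)
```

The query could then be matched again using both its shared feature and the borrowed color-specific feature.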

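Paper: https://arxiv.org/abs/1911.11236

Code: https://github.com/QingyongHu/RandLA-Net

The paper proposes RandLA-Net, a network built on simple and efficient random downsampling plus local feature aggregation. It achieves very good results on large-scale point cloud segmentation datasets such as Semantic3D and SemanticKITTI and is also very efficient (e.g., nearly 200 times faster than the graph-based method SPG).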
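
A rough sketch of the two ingredients, random downsampling followed by aggregation over each kept point's neighborhood, is given below. Max-pooling over the k nearest neighbors is used as the aggregation purely for illustration; it is an assumption and not the paper's aggregation module. All names and the toy data are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_downsample(num_points, ratio, rng):
    """Pick num_points // ratio indices uniformly at random (at least one)."""
    keep = max(1, num_points // ratio)
    return rng.sample(range(num_points), keep)


def aggregate_local_features(points, features, kept, k):
    """For each kept point, max-pool the features of its k nearest neighbors."""
    pooled = []
    dims = range(len(features[0]))
    for i in kept:
        order = sorted(range(len(points)),
                       key=lambda j: squared_distance(points[i], points[j]))
        neighbors = order[:k]  # includes the point itself (distance 0)
        pooled.append([max(features[j][d] for j in neighbors) for d in dims])
    return pooled


rng = random.Random(0)
points = [(float(i), 0.0, 0.0) for i in range(8)]      # toy point cloud
features = [[float(i), float(-i)] for i in range(8)]   # toy per-point features
kept = random_downsample(len(points), ratio=4, rng=rng)
pooled = aggregate_local_features(points, features, kept, k=3)
print(kept, pooled)
```

Random sampling costs almost nothing per point, which is where the efficiency comes from; the local aggregation is what compensates for the information that random sampling may discard.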

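Paper: https://arxiv.org/abs/1908.01998

The paper proposes a new few-shot object detection algorithm whose contributions include Attention-RPN, a multi-relation detector, and a contrastive training strategy. It also builds FSOD, a few-shot detection dataset with 1000 categories; a model trained on FSOD can be applied directly to detecting novel categories without fine-tuning.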

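Paper: https://arxiv.org/abs/1909.04977

Evolutionary algorithms for neural architecture search spend too long training candidate networks. To address this, and drawing on ENAS and NSGA-III, the paper proposes continuous evolution architecture search (CARS), which makes maximal use of what has already been learned, such as the architectures and parameters from the previous generation. It first builds a supernet for parameter sharing, derives sub-networks from the supernet, and then uses a non-dominated sorting strategy to select good networks of different sizes; the whole search takes only 0.5 GPU days.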
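
To make the selection step concrete, here is a minimal sketch of picking the non-dominated (Pareto-optimal) candidates when both objectives are minimized. It shows only the first front of a plain non-dominated sort, not the specific variant used in CARS, and the candidate names and (error, size) values are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name for name, objectives in candidates.items()
        if not any(dominates(other, objectives)
                   for other_name, other in candidates.items()
                   if other_name != name)
    ]


# Hypothetical sub-networks: (validation error, parameters in millions).
candidates = {
    "net_a": (0.030, 2.4),
    "net_b": (0.028, 3.6),
    "net_c": (0.035, 3.0),  # worse than net_a on both objectives
    "net_d": (0.040, 1.9),
}
front = non_dominated_front(candidates)
print(front)
```

Keeping the whole front, instead of a single best network, is what yields good networks at several different sizes from one search.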

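Paper: https://arxiv.org/abs/2002.11359

The paper proposes pseudo supervised object localization (PSOL) to address problems in current weakly supervised object localization methods. It splits localization and classification into two independent networks and trains on pseudo ground truth generated on the training set with deep descriptor transformation (DDT), reaching state-of-the-art results overall. The paper makes three main contributions: (1) weakly supervised object localization should be divided into two independent parts, class-agnostic object localization and object classification, which leads to the PSOL algorithm; (2) although the generated bounding boxes are biased, the paper argues they should still be optimized directly, without class labels, and this reaches SOTA; (3) across different datasets, PSOL transfers its localization ability well without fine-tuning.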

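Paper: https://arxiv.org/pdf/2002.10322.pdf

In this work, we propose a new solution for 3D human pose estimation in videos. Instead of directly regressing 3D joint locations, we draw inspiration from human skeletal anatomy and decompose the task into bone direction prediction and bone length prediction, from which the 3D joint locations can be fully derived. Our motivation is that the lengths of human bones stay consistent over time. This drives us to develop effective techniques that use global information across all frames of a video for high-accuracy bone length prediction. For the bone direction prediction network, we propose a fully convolutional propagating architecture with long skip connections. Essentially, it predicts the directions of different bones hierarchically without using any time-consuming memory units (e.g., LSTM). A novel joint shift loss is further introduced to bridge the training of the bone length and bone direction prediction networks. Finally, we employ an implicit attention mechanism to feed 2D keypoint visibility scores into the model as extra guidance, which significantly mitigates depth ambiguity in many challenging poses. Our full model outperforms the previous best results on the Human3.6M and MPI-INF-3DHP datasets, and comprehensive evaluation on them validates its effectiveness.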
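
How joint positions follow from bone directions and bone lengths can be shown with a small sketch: each joint is its parent's position plus the unit bone direction scaled by the bone length. The three-joint skeleton, the root position, and the function name are hypothetical and only illustrate the composition step, not the prediction networks.

```python
import math


def joints_from_bones(parents, directions, lengths, root=(0.0, 0.0, 0.0)):
    """Compose 3D joint positions from bone directions and lengths.

    parents[j] is the parent index of joint j (-1 for the root); joints must
    be ordered so that every parent appears before its children.
    directions[j] and lengths[j] describe the bone from parents[j] to j.
    """
    joints = []
    for j, parent in enumerate(parents):
        if parent < 0:
            joints.append(root)
            continue
        d = directions[j]
        norm = math.sqrt(sum(c * c for c in d))  # normalize to a unit vector
        joints.append(tuple(p + lengths[j] * c / norm
                            for p, c in zip(joints[parent], d)))
    return joints


# Hypothetical chain: root -> joint 1 -> joint 2.
parents = [-1, 0, 1]
directions = [None, (0.0, 0.0, 2.0), (3.0, 0.0, 4.0)]
lengths = [None, 0.5, 1.0]
joints = joints_from_bones(parents, directions, lengths)
print(joints)
```

Because the lengths enter as separate inputs, one length estimate per bone can be shared across all frames of a video while the directions vary per frame.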

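Paper: https://arxiv.org/pdf/1912.13458.pdf

Microsoft Research Asia proposes a method that needs neither images produced by face swapping nor knowledge of the face-swapping algorithm: it takes an "X-ray" of an image to determine whether the face has been swapped and to show the swap boundary.

The new model, Face X-Ray, has two key properties: it generalizes to unseen face-swapping algorithms, and it provides an interpretable swap boundary. The trick behind these properties lies in the general pipeline of face-swapping algorithms. Most of them can be divided into three stages: detection, modification, and blending. Unlike earlier research, Face X-Ray aims to detect the artifacts introduced in the third stage.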
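
The summary above does not give a formula for the swap boundary. One way to express it, recalled here from the Face X-Ray paper and worth checking against it, is B = 4 · M · (1 − M) applied per pixel to a soft blending mask M with values in [0, 1]: the result is 0 where a pixel comes entirely from one image and peaks at 1 where the two images are mixed equally. The sketch below treats that formula as an assumption.

```python
def blending_boundary(mask):
    """Per-pixel boundary map B = 4 * M * (1 - M) for a soft mask M in [0, 1]."""
    return [[4.0 * m * (1.0 - m) for m in row] for row in mask]


# Hypothetical 1x5 soft mask: background (0) -> blended edge -> swapped face (1).
mask = [[0.0, 0.25, 0.5, 0.75, 1.0]]
boundary = blending_boundary(mask)
print(boundary)
```

For an image that was never blended, the mask is constant 0 or 1 everywhere, so the boundary map is all zeros.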

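Paper: https://arxiv.org/abs/1911.07524

UDP addresses the large statistical error in the standard encoding-decoding methods used by existing SOTA human pose estimation algorithms, and it also resolves the misalignment of results caused by flip testing. The method is plug-and-play and effectively improves performance with almost no increase in model complexity.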

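Paper: https://arxiv.org/abs/1911.13239

In a composite image, the foreground and background were captured under different conditions (e.g., time of day, season, lighting, weather), so they are visibly mismatched in brightness, color tone, and so on. Image harmonization aims to adjust the foreground of a composite image so that it is harmonious with the background. Traditional harmonization methods generally transfer color information from the background or from other images to the foreground, but this cannot guarantee that the adjusted foreground looks realistic and consistent with the background. In recent years a small number of works have tried deep learning for image harmonization, but paired composite and real images are extremely hard to obtain. Without such pairs, training lacks sufficiently strong supervision, and there is no ground truth against which to evaluate harmonized results. Because no large-scale public image harmonization dataset existed so far, we built and released one consisting of four sub-datasets. We also propose the concept of domain verification and experiment with an image harmonization algorithm based on it.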

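Paper: https://arxiv.org/abs/1909.13226

PolarMask builds on FCOS and brings instance segmentation into the FCN framework. FCOS is essentially an FCN-style dense-prediction detection framework whose performance is on par with anchor-based object detectors, which showed the field the potential of anchor-free methods. The next problem to solve is instance segmentation. The main contribution of this work is to turn the more complex instance segmentation problem into a task that is as complex as object detection in terms of network design and computation, making the modeling of instance segmentation simple and efficient.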
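
The "polar representation" in the paper's title can be pictured as an instance center plus distances along rays at evenly spaced angles. The sketch below converts such a description into contour points; the number of rays, the angle convention, and the function name are assumptions made for illustration.

```python
import math


def polar_to_contour(center, ray_lengths):
    """Turn a center and per-ray distances into contour points.

    Ray i is assumed to point at angle 2*pi*i/n, with n = len(ray_lengths).
    """
    cx, cy = center
    n = len(ray_lengths)
    return [
        (cx + r * math.cos(2.0 * math.pi * i / n),
         cy + r * math.sin(2.0 * math.pi * i / n))
        for i, r in enumerate(ray_lengths)
    ]


# Hypothetical instance: 4 rays around the center (10, 10).
contour = polar_to_contour((10.0, 10.0), [2.0, 3.0, 2.0, 3.0])
print(contour)
```

Under this view a mask is predicted much like a bounding box: a fixed-length vector of distances per location, which is why the task stays close to detection in complexity.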

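Paper: https://arxiv.org/abs/1911.11907

The paper introduces a new Ghost module that aims to generate more feature maps through cheap operations. Starting from a set of intrinsic feature maps, the authors apply a series of linear transformations to produce, at low cost, many "ghost" feature maps that can reveal the information underlying the intrinsic features. The Ghost module is plug-and-play: stacking Ghost modules yields the Ghost bottleneck, from which the lightweight network GhostNet is built. On ImageNet classification, GhostNet reaches 75.7% top-1 accuracy at a similar computational cost, higher than MobileNetV3's 75.2%.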
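
A back-of-the-envelope sketch shows why this is cheap. It assumes an ordinary k×k convolution produces only c_out / s intrinsic maps and that each intrinsic map yields s − 1 ghost maps through a d×d per-channel (depthwise) linear operation. The ratio s, the kernel size d, the layer sizes, and the decision to ignore biases are all illustrative assumptions.

```python
def conv_params(c_in, c_out, k):
    """Weights of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, s, d):
    """Weights of a Ghost-style module producing c_out maps (bias ignored).

    c_out // s intrinsic maps come from an ordinary convolution; each one
    generates s - 1 ghost maps with a d x d per-channel operation.
    """
    intrinsic = c_out // s
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (s - 1) * d * d
    return primary + cheap


# Hypothetical layer: 64 -> 128 channels, 1x1 primary conv, s = 2, d = 3.
ordinary = conv_params(64, 128, 1)
ghost = ghost_params(64, 128, 1, s=2, d=3)
print(ordinary, ghost, ordinary / ghost)
```

In this toy setting the Ghost-style layer uses a little more than half the weights of the ordinary convolution while producing the same number of output maps.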

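Paper: https://arxiv.org/abs/1912.02315

Much vision-and-language research focuses on a small but diverse set of independent tasks and their supporting datasets, which are usually studied in isolation; yet the visually grounded language understanding skills needed to succeed at these tasks overlap significantly. In this work, we investigate the relationships between vision-and-language tasks by developing a large-scale, multi-task training regime.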

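4. To do list

Keep the CVPR2020 reproduction code up to date

Follow up on CVPR2020 paper sharing sessions

5. Related links

6. CVPR2020 contributors WeChat Group

To make it easier for everyone to exchange ideas, 极市 has set up a contributors group and an authors WeChat group. To join, add the assistant on WeChat (cv-mart, with the note CVPR2020).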
