CVPR 2022 Paper List (Continuously Updated)

This post collects links to papers and their code.

GitHub repository: gbstack/cvpr-2022-papers (CVPR 2022 papers with code)

The CSDN Markdown editor freezes when saving long posts, so please read the list at the GitHub repository above where possible.

CVPR2022 Papers (Papers/Codes/Demos)

Categories:

1. Detection

2. Segmentation

3. Image Processing

4. Estimation

5. Image&Video Retrieval/Video Understanding

6. Face

7. 3D Vision

8. Object Tracking

9. Medical Imaging

10. Text Detection/Recognition

11. Remote Sensing Image

12. GAN/Generative/Adversarial

13. Image Generation/Image Synthesis

14. Scene Graph

15. Visual Localization

16. Visual Reasoning/VQA

17. Image Classification

18. Neural Network Structure Design

19. Model Compression

20. Model Training/Generalization

21. Model Evaluation

22. Data Processing

23. Active Learning

24. Few-shot/Zero-shot Learning

25. Continual Learning/Life-long Learning

26. Transfer Learning/Domain Adaptation

27. Metric Learning

28. Contrastive Learning

29. Incremental Learning

30. Reinforcement Learning

31. Meta Learning

32. Multi-Modal Learning

33. Vision-based Prediction

34. Dataset

35. Robotic

36. Self-supervised Learning/Semi-supervised Learning


 

Detection

2D Object Detection

Focal and Global Knowledge Distillation for Detectors
keywords: Object Detection, Knowledge Distillation
paper | code

Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild
paper | code

Localization Distillation for Dense Object Detection
keywords: Bounding Box Regression, Localization Quality Estimation, Knowledge Distillation
paper | code


Video Object Detection

Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering
paper


3D Object Detection

Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes
paper | code

A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation
keywords: 3D Object Detection with Point-based Methods, 3D Object Detection with Grid-based Methods, Cluster-free 3D Panoptic Segmentation, CenterPoint 3D Object Detection
paper

Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving
keywords: Autonomous Driving, Monocular 3D Object Detection
paper | code


Camouflaged Object Detection

Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection
paper | code


Keypoint Detection

UKPGAN: A General Self-Supervised Keypoint Detector
paper | code


Lane Detection

Rethinking Efficient Lane Detection via Curve Modeling
keywords: Segmentation-based Lane Detection, Point Detection-based Lane Detection, Curve-based Lane Detection, Autonomous Driving
paper | code
 

Segmentation

Panoptic Segmentation

Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation
keywords: Semantic and panoramic segmentation, Unsupervised domain adaptation, Transformer
paper | code


Semantic Segmentation

Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels
paper | code

Weakly Supervised Semantic Segmentation using Out-of-Distribution Data
paper | code

Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation
paper | code

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation
paper | code

Cross Language Image Matching for Weakly Supervised Semantic Segmentation
paper

Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers
paper | code

ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation
keywords: Semi-supervised learning, Semantic segmentation, Uncertainty estimation
paper | code

Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation
paper | code


Instance Segmentation

E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation
paper | code

Efficient Video Instance Segmentation via Tracklet Query and Proposal
paper

SoftGroup for 3D Instance Segmentation on Point Clouds
keywords: 3D Vision, Point Clouds, Instance Segmentation
paper | code
 

Estimation

Human Pose Estimation

Forecasting Characteristic 3D Poses of Human Actions
paper

Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation
keywords: Top-Down Pose Estimation, Limb-based Grouping, Direct Regression
paper

MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video
paper


Optical Flow/Pose/Motion Estimation

CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild
paper | code

OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation
paper | code

CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation
paper


Depth Estimation

ChiTransformer: Towards Reliable Stereo from Cues
paper

Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation and Focal Loss
paper | code

ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks
keywords: Learning-based Stereo Matching Networks, Single Domain Generalization, Shortcut Learning
paper

Attention Concatenation Volume for Accurate and Efficient Stereo Matching
keywords: Stereo Matching, Cost Volume Construction, Cost Aggregation
paper | code

Occlusion-Aware Cost Constructor for Light Field Depth Estimation
paper | [code](https://github.com/YingqianWang/OACC-Net)

NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation
keywords: Neural CRFs for Monocular Depth
paper

OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion
keywords: Monocular Depth Estimation, Transformer
paper
 

Image Processing

Super Resolution

Reflash Dropout in Image Super-Resolution
paper

Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence
paper

HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening
paper | code

HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging
keywords: HSI Reconstruction, Self-Attention Mechanism, Image Frequency Spectrum Analysis
paper


Image Restoration/Image Enhancement/Image Reconstruction

Event-based Video Reconstruction via Potential-assisted Spiking Neural Network
paper


Image Denoising/Deblurring/Deraining/Dehazing

E-CIR: Event-Enhanced Continuous Intensity Recovery
keywords: Event-Enhanced Deblurring, Video Representation
paper | code


Image Editing/Inpainting

HairCLIP: Design Your Hair by Text and Reference Image
keywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks
paper

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding
keywords: Image Inpainting, Transformer, Image Generation
paper | code


Image Translation

FlexIT: Towards Flexible Semantic Image Translation
paper

Exploring Patch-wise Semantic Relation for Contrastive Learning in Image-to-Image Translation Tasks
keywords: Image Translation, Knowledge Transfer, Contrastive Learning
paper


Style Transfer

Style-ERD: Responsive and Coherent Online Motion Style Transfer
paper

CLIPstyler: Image Style Transfer with a Single Text Condition
keywords: Style Transfer, Text-guided synthesis, Language-Image Pre-Training (CLIP)
paper
 

Face

Facial Recognition/Detection

An Efficient Training Approach for Very Large Scale Face Recognition
paper | code


Face Generation/Face Synthesis/Face Reconstruction/Face Editing

Sparse to Dense Dynamic 3D Facial Expression Generation
keywords: Facial expression generation, 4D face generation, 3D face modeling
paper


Face Forgery/Face Anti-Spoofing

Voice-Face Homogeneity Tells Deepfake
paper | code

Protecting Celebrities with Identity Consistency Transformer
paper
 

Object Tracking

TCTrack: Temporal Contexts for Aerial Tracking
paper | code

Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds
keywords: Single Object Tracking, 3D Multi-object Tracking / Detection, Spatial-temporal Learning on Point Clouds
paper

Correlation-Aware Deep Tracking
paper
 

Image&Video Retrieval/Video Understanding

BEVT: BERT Pretraining of Video Transformers
keywords: Video understanding, Vision transformers, Self-supervised representation learning, BERT pretraining
paper | code


Action/Activity Recognition/Detection/Segmentation/Localization

End-to-End Semi-Supervised Learning for Video Action Detection
paper

Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
paper

Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation
paper | code

Colar: Effective and Efficient Online Action Detection by Consulting Exemplars
keywords: Online action detection
paper


Image/Video Caption

X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning
paper
 

Medical Imaging

Adaptive Early-Learning Correction for Segmentation from Noisy Annotations
keywords: Medical-imaging segmentation, Noisy Annotations
paper | code

Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations
keywords: Self-supervised Transformer, Temporal modeling of disease progression
paper
 

GAN/Generative/Adversarial

Shadows can be Dangerous: Stealthy and Effective Physical-world Adversarial Attack by Natural Phenomenon
paper

Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-robust Makeup Transfer
paper

Adversarial Texture for Fooling Person Detectors in the Physical World
paper

Label-Only Model Inversion Attacks via Boundary Repulsion
paper
 

Image Generation/Image Synthesis/Video Generation

Dynamic Dual-Output Diffusion Models
paper

Exploring Dual-task Correlation for Pose Guided Person Image Generation
paper | code

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
paper | code

3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces
paper | code

Interactive Image Synthesis with Panoptic Layout Generation
paper

Polarity Sampling: Quality and Diversity Control of Pre-Trained Generative Networks via Singular Values
paper

Autoregressive Image Generation using Residual Quantization
paper | code
 

3D Vision

X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning
paper


Point Cloud

Shape-invariant 3D Adversarial Point Clouds
paper | code

ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation
paper

Lepard: Learning partial point cloud matching in rigid and deformable scenes
paper | code

A Unified Query-based Paradigm for Point Cloud Understanding
paper

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding
keywords: Self-Supervised Learning, Contrastive Learning, 3D Point Cloud, Representation Learning, Cross-Modal Learning
paper | code


3D Reconstruction

Neural Face Identification in a 2D Wireframe Projection of a Manifold Object
paper | [code](https://manycore-research.github.io/faceformer)

Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers
keywords: Semantic segmentation, 3D reconstruction, 3D bio-printers
paper

H4D: Human 4D Modeling by Learning Neural Compositional Representation
keywords: 4D Representation, Human Body Estimation, Fine-grained Human Reconstruction
paper


Scene Reconstruction/Novel View Synthesis

Point-NeRF: Point-based Neural Radiance Fields
paper | code

CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields
keywords: NeRF, Image Generation and Manipulation, Language-Image Pre-Training (CLIP)
paper | code

 

Model Compression

Knowledge Distillation

Focal and Global Knowledge Distillation for Detectors
keywords: Object Detection, Knowledge Distillation
paper | code


Quantization

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization
paper | code
 

Neural Network Structure Design

BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning
keywords: Sample relationship, Data scarcity learning, Contrastive Self-Supervised Learning, Long-tailed recognition, Zero-shot learning, Domain generalization, Self-supervised learning
paper | code


CNN

DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos
keywords: Sparse convolutional neural network, Video inference acceleration
paper

A ConvNet for the 2020s
paper | code


Transformer

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts
keywords: Out-of-distribution (OOD) generalization, Vision Transformers
paper | code

Mobile-Former: Bridging MobileNet and Transformer
keywords: Light-weight convolutional neural networks, Combination of CNN and ViT
paper


Neural Architecture Search (NAS)

β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search
paper


MLP

An Image Patch is a Wave: Quantum Inspired Vision MLP
paper | code
 

Data Processing

Data Augmentation

TeachAugment: Data Augmentation Optimization Using Teacher Knowledge
paper | code

3D Common Corruptions and Data Augmentation
keywords: Data Augmentation, Image restoration, Photorealistic image synthesis
paper


Anomaly Detection

Generative Cooperative Learning for Unsupervised Video Anomaly Detection
paper

Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection (paper not yet uploaded)
paper | code
 

Model Training/Generalization

Towards Efficient and Scalable Sharpness-Aware Minimization
keywords: Sharp Local Minima, Large-Batch Training
paper

CAFE: Learning to Condense Dataset by Aligning Features
keywords: Dataset condensation, Coreset selection, Generative models
paper | code

The Devil is in the Margin: Margin-based Label Smoothing for Network Calibration
paper | code

DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
keywords: Detection Transformer
paper | code


Long-Tailed Distribution

Targeted Supervised Contrastive Learning for Long-Tailed Recognition
keywords: Long-Tailed Recognition, Contrastive Learning
paper
 

Image Feature Extraction and Matching

Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences
paper | code
 

Multi-Modal Learning

Vision-language Representation Learning

L-Verse: Bidirectional Generation Between Image and Text (Oral Presentation)
paper

HairCLIP: Design Your Hair by Text and Reference Image
keywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks
paper

CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields
keywords: NeRF, Image Generation and Manipulation, Language-Image Pre-Training (CLIP)
paper | code

Vision-Language Pre-Training with Triple Contrastive Learning
keywords: Vision-language representation learning, Contrastive Learning
paper | code
 

Vision-based Prediction

How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting
keywords: Knowledge Distillation, Trajectory forecasting
paper

Motron: Multimodal Probabilistic Human Motion Forecasting
paper
 

Dataset

Kubric: A scalable dataset generator
paper | code

A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection
paper
 

Few-shot Learning/Zero-shot Learning

Learning to Affiliate: Mutual Centralized Learning for Few-shot Classification
paper

MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning
keywords: Zero-Shot Learning, Knowledge Distillation
paper | code
 

Continual Learning/Life-long Learning

On Generalizing Beyond Domains in Cross-Domain Continual Learning
paper
 

Scene Graph

Scene Graph Generation

Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs
keywords: Video Scene Graph Generation, Transformer, Video Grounding
paper | code
 

Image Classification

GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction
keywords: Multi-label classification
paper | code
 

Transfer Learning/Domain Adaptation

How Well Do Sparse Imagenet Models Transfer?
paper

A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation
paper

Weakly Supervised Object Localization as Domain Adaption
keywords: Weakly Supervised Object Localization (WSOL), Multi-instance learning based WSOL, Separated-structure based WSOL, Domain Adaption
paper | code
 

Metric Learning

Enhancing Adversarial Robustness for Deep Metric Learning
keywords: Adversarial Attack, Adversarial Defense, Deep Metric Learning
paper
 

Contrastive Learning

Selective-Supervised Contrastive Learning with Noisy Labels
paper | code

HCSC: Hierarchical Contrastive Selective Coding
keywords: Self-supervised Representation Learning, Deep Clustering, Contrastive Learning
paper | code

Crafting Better Contrastive Views for Siamese Representation Learning
paper | code
 

Meta Learning

What Matters For Meta-Learning Vision Regression Tasks?
paper
 

Robotic

IFOR: Iterative Flow Minimization for Robotic Object Rearrangement
paper
 

Self-supervised Learning/Semi-supervised Learning

Class-Aware Contrastive Semi-Supervised Learning
keywords: Semi-Supervised Learning, Self-Supervised Learning, Real-World Unlabeled Data Learning
paper

A study on the distribution of social biases in self-supervised learning visual models
paper
 

Neural Network Interpretability

Do Explanations Explain? Model Knows Best
paper

Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks
paper
 

Crowd Counting

Boosting Crowd Counting via Multifaceted Attention
paper | code
 

Federated Learning

Differentially Private Federated Learning with Local Regularization and Sparsification
paper
 

Uncategorized

L-Verse: Bidirectional Generation Between Image and Text (Vision-language Representation Learning)
paper | code
 

Backbone

MPViT: Multi-Path Vision Transformer for Dense Prediction
paper | code
 

CLIP

PointCLIP: Point Cloud Understanding by CLIP
paper | code

Blended Diffusion for Text-driven Editing of Natural Images
paper | code
 

NAS

ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior
paper | code
 

NeRF

Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
paper

NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images
paper
 

Visual Transformer

Backbone

MPViT: Multi-Path Vision Transformer for Dense Prediction
paper | code
 

Applications

Language-based Video Editing via Multi-Modal Multi-Level Transformer
paper | code

Embracing Single Stride 3D Object Detector with Sparse Transformer
paper | code
 

Data Augmentation

AlignMix: Improving representation by interpolating aligned features
paper | code
 

Instance Segmentation

Self-supervised Instance Segmentation

FreeSOLO: Learning to Segment Objects without Annotations
paper | code
 

Image Editing

Blended Diffusion for Text-driven Editing of Natural Images
paper | code
 

Low-level Vision

ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior
paper | code
 

Super-Resolution

Video Super-Resolution

BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
paper | code
 

3D Point Cloud

PointCLIP: Point Cloud Understanding by CLIP
paper | code
 

3D Object Detection

Embracing Single Stride 3D Object Detector with Sparse Transformer
paper | code
 

3D Human Pose Estimation

MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation
paper | code
 

3D Semantic Scene Completion

MonoScene: Monocular 3D Semantic Scene Completion
paper | code
 

3D Reconstruction

BANMo: Building Animatable 3D Neural Models from Many Casual Videos
paper | code
 

Depth Estimation

Monocular Depth Estimation

Toward Practical Self-Supervised Monocular Indoor Depth Estimation
paper | code
 

Crowd Counting

Leveraging Self-Supervision for Cross-Domain Crowd Counting
paper | code
 

Medical Image

BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation
paper | code
 

Scene Graph Generation

SGTR: End-to-end Scene Graph Generation with Transformer
paper | code
 

Style Transfer

StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions
paper | code
 

Watermarking

Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings
paper | code
 

Datasets

It's About Time: Analog Clock Reading in the Wild
paper | code

Toward Practical Self-Supervised Monocular Indoor Depth Estimation
paper | code
 

New Task

Language-based Video Editing via Multi-Modal Multi-Level Transformer
paper | code

It's About Time: Analog Clock Reading in the Wild
paper | code
