Official website: https://cvpr.thecvf.com/Conferences/2023
[7]NeRF-RPN: A general framework for object detection in NeRFs
paper
[6]Detecting Everything in the Open World: Towards Universal Object Detection
paper
[5]Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
paper
[4]CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
paper
[3]Enhanced Training of Query-Based Object Detection via Selective Query Recollection
paper | code
[2]DETRs with Hybrid Matching
paper | code
[1]YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
paper | code
[1]SCOTCH and SODA: A Transformer Video Shadow Detection Framework
paper
[22]Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans
paper
[21]itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection
paper
[20]Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
paper | code
[19]FrustumFormer: Adaptive Instance-aware Resampling for Multi-view 3D Detection
paper | code
[18]NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations
paper
[17]Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving
paper
[16]VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
paper | code
[15]OcTr: Octree-based Transformer for 3D Object Detection
paper
[14]MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer
paper
[13]CAPE: Camera View Position Embedding for Multi-View 3D Object Detection
paper | code
[12]Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency
paper
[11]AeDet: Azimuth-invariant Multi-view 3D Object Detection
paper | code
[10]Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection
paper
[9]PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
paper | code
[8]MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences
paper
[7]Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View
paper
[6]X3KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection
paper
[5]Virtual Sparse Convolution for Multimodal 3D Object Detection
paper | code
[4]MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection
paper | code
[3]Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection
paper | code
[2]LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion
paper | code
[1]ConQueR: Query Contrast Voxel-DETR for 3D Object Detection
paper | code
[2]Category Query Learning for Human-Object Interaction Classification
paper
[1]Detecting Human-Object Contact in Images
paper
[2]Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings
paper
[1]Texture-guided Saliency Distilling for Unsupervised Salient Object Detection
paper | code
[1]BEV-LaneDet: a Simple and Effective 3D Lane Detection Baseline
paper
[2]The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector
paper | code
[1]Iterative Next Boundary Detection for Instance Segmentation of Tree Rings in Microscopy Images of Shrub Cross Sections
paper | code
[8]SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection
paper
[7]Normalizing Flow based Feature Synthesis for Outlier-Aware Object Detection
paper
[6]Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection
paper
[5]DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection
paper
[4]Diversity-Measurable Anomaly Detection
paper
[3]Block Selection Method for Using Feature Norm in Out-of-distribution Detection
paper
[2]Lossy Compression for Robust Unsupervised Time-Series Anomaly Detection
paper
[1]Multimodal Industrial Anomaly Detection via Hybrid Fusion
paper | code
[3]Focused and Collaborative Feedback Integration for Interactive Image Segmentation
paper | code
[2]MP-Former: Mask-Piloted Transformer for Image Segmentation
paper | code
[1]Interactive Segmentation as Gaussian Process Classification
paper
[2]UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration
paper
[1]Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
paper
[20]LaserMix for Semi-Supervised LiDAR Semantic Segmentation
paper | code
[19]Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation
paper | code
[18]Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs
paper | code
[17]Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation
paper | code
[16]Reliability in Semantic Segmentation: Are We on the Right Track?
paper | code
[15]Generative Semantic Segmentation
paper | code
[14]Novel Class Discovery for 3D Point Cloud Semantic Segmentation
paper | code
[13]MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving
paper | code
[12]Side Adapter Network for Open-Vocabulary Semantic Segmentation
paper | code
[11]Multi-view Inverse Rendering for Large-scale Real-world Indoor Scenes
paper
[10]Token Contrast for Weakly-Supervised Semantic Segmentation
paper | code
[9]Delivering Arbitrary-Modal Semantic Segmentation
paper | code
[8]Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation
paper
[7]Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
paper | code
[6]Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos
paper | code
[5]SCPNet: Semantic Scene Completion on Point Cloud
paper
[4]On Calibrating Semantic Segmentation Models: Analyses and An Algorithm
paper
[3]Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
paper
[2]Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation
paper | code
[1]Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation
paper
[7]A Generalized Framework for Video Instance Segmentation
paper | code
[6]FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation
paper
[5]SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation
paper | code
[4]DynaMask: Dynamic Mask Selection for Instance Segmentation
paper | code
[3]Beyond mAP: Towards better evaluation of instance segmentation
paper
[2]ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution
paper
[1]PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
paper
[4]Two-shot Video Object Segmentation
paper
[3]Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation
paper
[2]MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation
paper
[1]InstMove: Instance Motion for Object-centric Video Segmentation
paper | code
[2]One-to-Few Label Assignment for End-to-End Dense Detection
paper | code
[1]DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction
paper
[6]A Unified Pyramid Recurrent Network for Video Frame Interpolation
paper
[5]Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior
paper | code
[4]Blind Video Deflickering by Neural Filtering with a Flawed Atlas
paper | code
[3]Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
paper | code
[2]UV Volumes for Real-time Rendering of Editable Free-view Human Performance
paper | code
[1]Exploring Discontinuity for Video Frame Interpolation
paper (arXiv:2202.07291)
[3]Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
paper
[2]Text-Visual Prompting for Efficient 2D Temporal Video Grounding
paper
[1]Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
paper | code
[7]Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
paper | code
[6]Conditional Image-to-Video Generation with Latent Flow Diffusion Models
paper
[5]3D Cinemagraphy from a Single Image
paper
[4]VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
paper | code
[3]MOSO: Decomposing MOtion, Scene and Object for Video Prediction
paper | code
[2]SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
paper | code
[1]Video Probabilistic Diffusion Models in Projected Latent Space
paper | project
[2]Structured Sparsity Learning for Efficient Video Super-Resolution
paper
[1]Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting
paper
[2]DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling
paper
[1]Rethinking Optical Flow from Geometric Matching Consistent Perspective
paper | code
[5]SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates
paper
[4]PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes
paper | code
[3]HRDFuse: Monocular 360° Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions
paper
[2]Fully Self-Supervised Depth Estimation from Defocus Clue
paper | code
[1]Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
paper | code
[11]Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation
paper
[10]3D Human Mesh Estimation from Virtual Markers
paper
[9]Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation
paper
[8]Rigidity-Aware Detection for 6D Object Pose Estimation
paper
[7]Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video
paper
[6]Markerless Camera-to-Robot Pose Estimation via Self-supervised Sim-to-Real Transfer
paper
[5]TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation
paper
[4]Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting
paper
[3]PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation
paper
[2]DistilPose: Tokenized Pose Regression with Heatmap Distillation
paper
[1]Relightable Neural Human Assets from Multi-view Gradient Illuminations
paper
[5]Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild
paper
[4]Natural Language-Assisted Sign Language Recognition
paper | code
[3]CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment
paper | code
[2]Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement
paper
[1]Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos
paper | code
[3]Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR
paper
[2]PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment
paper
[1]DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation
paper | code
[6]Activating More Pixels in Image Super-Resolution Transformer
paper | code
[5]Super-Resolution Neural Operator
paper | code
[4]Local Implicit Normalizing Flow for Arbitrary-Scale Image Super-Resolution
paper
[3]Perception-Oriented Single Image Super-Resolution using Optimal Objective Estimation
paper | code
[2]N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution
paper | code
[1]Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild
paper | project
[13]CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not
paper
[12]Instant Volumetric Head Avatars
paper
[11]Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank
paper | code
[10]ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction
paper | code
[9]Masked Image Modeling with Local Multi-Scale Reconstruction
paper | code
[8]Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective
paper | code
[7]DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration
paper
[6]Robust Unsupervised StyleGAN Image Restoration
paper
[5]Raw Image Reconstruction with Learned Compact Metadata
paper
[4]Efficient and Explicit Modelling of Image Hierarchies for Image Restoration
paper | code
[3]Imagic: Text-Based Real Image Editing with Diffusion Models
paper | project
[2]High-resolution image reconstruction with latent diffusion models from human brain activity
paper | project
[1]Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models
paper
[1]LightPainter: Interactive Portrait Relighting with Freehand Scribble
paper
[6]Masked Image Training for Generalizable Deep Image Denoising
paper | code
[5]Learning A Sparse Transformer Network for Effective Image Deraining
paper | code
[4]Uncertainty-Aware Unsupervised Image Deblurring with Deep Residual Prior
paper
[3]Polarized Color Image Denoising using Pocoformer
paper
[2]Blur Interpolation Transformer for Real-World Motion from Blur
paper | code
[1]Structured Kernel Estimation for Photon-Limited Deconvolution
paper | code
[6]SIEDOB: Semantic Image Editing by Disentangling Object and Background
paper | code
[5]CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing
paper
[4]SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
paper
[3]Interactive Cartoonization with Controllable Perceptual Factors
paper
[2]Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint
paper | code
[1]LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data
paper | code
[2]CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability
paper
[1]Quality-aware Pre-trained Models for Blind Image Quality Assessment
paper
[3]Neural Preset for Color Style Transfer
paper | code
[2]StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields
paper
[1]Fix the Noise: Disentangling Source Feature for Transfer Learning of StyleGAN
paper | code
[1]Indescribable Multi-modal Spatial Evaluator
paper | code
[3]Sibling-Attack: Rethinking Transferable Adversarial Attacks against Face Recognition
paper
[2]Local Region Perception and Relationship Learning Combined with Feature Fusion for Facial Action Unit Detection
paper
[1]Multi Modal Facial Expression Recognition with Transformer-Based Fusion Networks and Dynamic Sampling
paper
[7]SunStage: Portrait Reconstruction and Relighting using the Sun as a Light Stage
paper
[6]MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
paper | code
[5]NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images
paper
[4]Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images
paper
[3]Robust Model-based Face Reconstruction through Weakly-Supervised Outlier Segmentation
paper | code
[2]A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images
paper
[1]MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
paper | code
[3]Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment
paper
[2]Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization
paper | code
[1]Physical-World Optical Adversarial Attacks on 3D Face Recognition
paper
[6]MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking
paper
[5]Visual Prompt Multi-Modal Tracking
paper | code
[4]Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking
paper | code
[3]Focus On Details: Online Multi-object Tracking with Diverse Fine-grained Representation
paper
[2]Referring Multi-Object Tracking
paper
[1]Simple Cues Lead to a Strong Multi-Object Tracker
paper
[10]Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
paper | code
[9]NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory
paper
[8]Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
paper
[7]Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
paper
[6]Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval
paper | code
[5]Dual-path Adaptation from Image to Video Transformers
paper | code
[4]Data-Free Sketch-Based Image Retrieval
paper
[3]DAA: A Delta Age AdaIN operation for age estimation via binary code transformer
paper
[2]VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
paper | code
[1]Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
paper
[8]Box-Level Active Detection
paper
[7]Actionlet-Dependent Contrastive Learning for Unsupervised Skeleton-Based Action Recognition
paper
[6]Open Set Action Recognition via Multi-Label Evidential Learning
paper
[5]Video Test-Time Adaptation for Action Recognition
paper
[4]Post-Processing Temporal Action Detection
paper
[3]TriDet: Temporal Action Detection with Relative Boundary Modeling
paper | code
[2]Learning Discriminative Representations for Skeleton Based Action Recognition
paper
[1]Continuous Sign Language Recognition with Correlation Network
paper | code
[2]TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification
paper | code
[1]MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID
paper | code
[5]Text with Knowledge Graph Augmented Transformer for Video Captioning
paper
[4]Dual-Stream Transformer for Generic Event Boundary Captioning
paper | code
[3]ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
paper | code
[2]Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
paper
[1]Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
paper | code
[7]RepMode: Learning to Re-parameterize Diverse Experts for Subcellular Structure Prediction
paper | code
[6]Orthogonal Annotation Benefits Barely-supervised Medical Image Segmentation
paper | code
[5]Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification
paper
[4]Neuron Structure Modeling for Generalizable Remote Physiological Measurement
paper | code
[3]Unsupervised Contour Tracking of Live Cells by Mechanical and Cycle Consistency Losses
paper | code
[2]Deep Feature In-painting for Unsupervised Anomaly Detection in X-ray Images
paper | code
[1]Label-Free Liver Tumor Segmentation
paper | code
[6]Images Speak in Images: A Generalist Painter for In-Context Visual Learning
paper | code
[5]Context De-confounded Emotion Recognition
paper
[4]Joint Visual Grounding and Tracking with Natural Language Specification
paper
[3]Unifying Vision, Text, and Layout for Universal Document Processing
paper
[2]Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling
paper
[1]DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
paper | code
[7]Fine-Grained Face Swapping via Regional GAN Inversion
paper
[6]Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models
paper
[5]Graph Transformer GANs for Graph-Constrained House Generation
paper
[4]Improving GAN Training via Feature Space Shrinkage
paper | code
[3]Adversarial Attack with Raindrops
paper
[2]T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations
paper | project
[1]Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
paper | project
[22]All are Worth Words: A ViT Backbone for Diffusion Models
paper | code
[21]Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
paper | code
[20]Shifted Diffusion for Text-to-image Generation
paper | code
[19]Towards Practical Plug-and-Play Diffusion Models
paper
[18]Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis
paper
[17]Wavelet Diffusion Models are fast and scalable Image Generators
paper | code
[16]Learning 3D-aware Image Synthesis with Unknown Pose Distribution
paper
[15]Picture that Sketch: Photorealistic Image Generation from Abstract Sketches
paper
[14]3DQD: Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process
paper | code
[13]A Dynamic Multi-Scale Voxel Flow Network for Video Prediction
paper | code
[12]Regularized Vector Quantization for Tokenized Image Synthesis
paper
[11]SpaText: Spatio-Textual Representation for Controllable Image Generation
paper
[10]Unifying Layout Generation with a Decoupled Diffusion Model
paper
[9]Scaling up GANs for Text-to-Image Synthesis
paper
[8]Inversion-Based Style Transfer with Diffusion Models
paper | code
[7]Perspective Fields for Single Image Camera Calibration
paper
[6]VGFlow: Visibility guided Flow Network for Human Reposing
paper
[5]DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
paper | code
[4]Progressive Open Space Expansion for Open-Set Model Attribution
paper | code
[3]Person Image Synthesis via Denoising Diffusion Model
paper
[2]Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models
paper
[1]Parallel Diffusion Models of Operator and Image for Blind Inverse Problems
paper
[2]Learning a 3D Morphable Face Reflectance Model from Low-cost Data
paper | code
[1]Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction
paper | code
[15]CLIP2: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data
paper
[14]Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration
paper | code
[13]Deep Graph-based Spatial Consistency for Robust Non-rigid Point Cloud Registration
paper | code
[12]Controllable Mesh Generation Through Sparse Latent Point Diffusion Models
paper
[11]Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis
paper | code
[10]Rotation-Invariant Transformer for Point Cloud Matching
paper
[9]GraVoS: Voxel Selection for 3D Point-Cloud Detection
paper
[8]DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
paper | code
[7]PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees
paper
[6]ACL-SPC: Adaptive Closed-Loop system for Self-Supervised Point Cloud Completion
paper | code
[5]DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization
paper
[4]Frequency-Modulated Point Cloud Rendering with Easy Editing
paper
[3]Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
paper
[2]ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer
paper | code
[1]Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting
paper | code
[25]HexPlane: A Fast Representation for Dynamic Scenes
paper
[24]Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container
paper
[23]BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
paper
[22]Structured 3D Features for Reconstructing Controllable Avatars
paper
[21]PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°
paper
[20]Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization
paper
[19]TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision
paper | code
[18]MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
paper | code
[17]PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision
paper
[16]SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation
paper | code
[15]Masked Wavelet Representation for Compact Neural Radiance Fields
paper
[14]Decoupling Human and Camera Motion from Videos in the Wild
paper
[13]Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
paper
[12]NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images
paper
[11]Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion
paper | code
[10]MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices
paper | code
[9]Unsupervised 3D Shape Reconstruction by Part Retrieval and Assembly
paper
[8]NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
paper
[7]HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling
paper
[6]MACARONS: Mapping And Coverage Anticipation with RGB Online Self-Supervision
paper
[4]Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness
paper | code
[3]Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes
paper | code
[2]ECON: Explicit Clothed humans Obtained from Normals
paper | code
[1]Structured 3D Features for Reconstructing Relightable and Animatable Avatars
paper | project
[32]Magic3D: High-Resolution Text-to-3D Content Creation
paper
[31]DiffRF: Rendering-Guided 3D Radiance Field Diffusion
paper
[30]Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields for Controllable Scene Stylization
paper | code
[29]Interactive Segmentation of Radiance Fields
paper
[28]MAIR: Multi-view Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation
paper
[27]GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images
paper
[26]Progressively Optimized Local Radiance Fields for Robust View Synthesis
paper
[25]ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field
paper
[24]HandNeRF: Neural Radiance Fields for Animatable Interacting Hands
paper
[23]Grid-guided Neural Radiance Fields for Large Urban Scenes
paper
[22]EventNeRF: Neural Radiance Fields from a Single Colour Event Camera
paper
[21]SPARF: Neural Radiance Fields from Sparse and Noisy Poses
paper
[20]RUST: Latent Neural Scene Representations from Unposed Imagery
paper
[19]SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field
paper
[18]ShadowNeuS: Neural SDF Reconstruction by Shadow Ray Supervision
paper | code
[17]Balanced Spherical Grid for Egocentric View Synthesis
paper | code
[16]Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention
paper
[15]MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures
paper | code
[14]Robust Dynamic Radiance Fields
paper
[13]I2-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs
paper
[12]Learning Detailed Radiance Manifolds for High-Fidelity and 3D-Consistent Portrait Synthesis from Monocular Image
paper
[11]Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
paper
[10]Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields
paper
[9]DP-NeRF: Deblurred Neural Radiance Field with Physical Scene Priors
paper | code
[8]SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields
paper
[7]3D Video Loops from Asynchronous Input
paper | code
[6]NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer
paper | code
[5]NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation
paper
[4]Renderable Neural Radiance Map for Visual Navigation
paper
[3]Real-Time Neural Light Field on Mobile Devices
paper | project
[2]Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
paper | code
[1]NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior
paper | project
[1]Neural Video Compression with Diverse Contexts
paper | code
[3]Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation
paper
[2]Generic-to-Specific Distillation of Masked Autoencoders
paper | code
[1]CLIPPING: Distilling CLIP-based Models for Video-Language Understanding
paper
[2]CP3: Channel Pruning Plug-in for Point-based Networks
paper
[1]DepGraph: Towards Any Structural Pruning
paper | code
[4]Hard Sample Matters a Lot in Zero-Shot Quantization
paper
[3]Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
paper
[2]Post-training Quantization on Diffusion Models
paper | code
[1]Adaptive Data-Free Quantization
paper | code
[6]LINe: Out-of-Distribution Detection by Leveraging Important Neurons
paper
[5]Towards Scalable Neural Representation for Diverse Videos
paper
[4]Boundary Unlearning
paper
[3]Equiangular Basis Vectors
paper | code
[2]LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
paper | code
[1]Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks
paper | code
[5]Randomized Adversarial Training via Taylor Expansion
paper | code
[4]Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations
paper | code
[3]DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network
paper | code
[2]Demystify Transformers & Convolutions in Modern Image Deep Networks
paper | code
[1]InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
paper | code
[15]CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection
paper | code
[14]Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
paper
[13]POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery
paper
[12]FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER
paper
[11]Spherical Transformer for LiDAR-based 3D Recognition
paper | code
[10]MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models
paper | code
[9]Top-Down Visual Attention from Analysis by Synthesis
paper
[8]BiFormer: Vision Transformer with Bi-Level Routing Attention
paper | code
[7]Making Vision Transformers Efficient from A Token Sparsification View
paper
[6]Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves
paper
[5]Learning Imbalanced Data with Vision Transformers
paper | code
[4]SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency
paper
[3]Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers
paper | code
[2]Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR
paper | code
[1]Integrally Pre-Trained Transformer Pyramid Networks
paper | code
[2]Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks
paper
[1]From Node Interaction to Hop Interaction: New Effective and Scalable Graph Learning Paradigm
paper
[3]Polynomial Implicit Neural Representations For Large Diverse Datasets
paper | code
[2]PA&DA: Jointly Sampling PAth and DAta for Consistent NAS
paper | code
[1]Stitchable Neural Networks
paper | code
[1]ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization
paper | code
[1]Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders
paper | code
[1]TINC: Tree-structured Implicit Neural Compression
paper | code
[1]Masked Images Are Counterfactual Samples for Robust Fine-tuning
paper
[1]On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering
paper | code
[1]Context-Based Trit-Plane Coding for Progressive Image Compression
paper | code
[16]Generalist: Decoupling Natural and Robust Generalization
paper
[15]Feature Separation and Recalibration for Adversarial Robustness
paper
[14]Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck
paper
[13]FlexiViT: One Model for All Patch Sizes
paper | code
[12]Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization
paper | code
[11]Improving Generalization with Domain Convex Game
paper
[10]TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization
paper | code
[9]An Extended Study of Human-like Behavior under Adversarial Training
paper
[8]Sharpness-Aware Gradient Matching for Domain Generalization
paper | code
[7]HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining
paper
[6]Universal Instance Perception as Object Discovery and Retrieval
paper | code
[5]Practical Network Acceleration with Tiny Sets
paper | code
[4]Towards Bridging the Performance Gaps of Joint Energy-based Models
paper | code
[3]DropKey
paper
[2]Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
paper
[1]DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
paper
[2]Fine-Grained Classification with Noisy Labels
paper
[1]Combating noisy labels in object detection datasets
paper
[1]Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification
paper
[3]Referring Image Matting
paper | code
[2]Iterative Geometry Encoding Volume for Stereo Matching
paper | code
[1]Modality-Agnostic Debiasing for Single Domain Generalization
paper
[12]Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
paper
[11]CrOC: Cross-View Online Clustering for Dense Visual Representation Learning
paper | code
[10]Masked Motion Encoding for Self-Supervised Video Representation Learning
paper | code
[9]Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos
paper | code
[8]MARLIN: Masked Autoencoder for facial video Representation LearnINg
paper | code
[7]Hierarchical discriminative learning improves visual representations of biomedical microscopy
paper
[6]Fine-tuned CLIP Models are Efficient Video Learners
paper | code
[5]Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning
paper | code
[4]Open-Set Representation Learning through Combinatorial Embedding
paper
[3]NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction
paper
[2]Stare at What You See: Masked Image Modeling without Reconstruction
paper | code
[1]Switchable Representation Learning Framework with Self-compatibility
paper
[2]Physically Adversarial Infrared Patches with Learnable Shapes and Locations
paper
[1]TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
paper | code
[13]CLIP2Scene: Towards Label-Efficient 3D Scene Understanding by CLIP
paper | code
[12]MaPLe: Multi-modal Prompt Learning
paper | code
[11]Decoupled Multimodal Distilling for Emotion Recognition
paper
[10]MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
paper | code
[9]BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency
paper | code
[8]Mutilmodal Feature Extraction and Attention-based Fusion for Emotion Estimation in Videos
paper | code
[7]Emotional Reaction Intensity Estimation Based on Multimodal Data
paper
[6]Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers
paper
[5]Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning
paper
[4]Multimodal Prompting with Missing Modalities for Visual Recognition
paper | code
[3]Align and Attend: Multimodal Summarization with Dual Contrastive Losses
paper | code
[2]Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
paper | code
[1]Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
paper | code
[6]Egocentric Audio-Visual Object Localization
paper | code
[5]Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning
paper
[4]Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
paper
[3]Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring
paper | code
[2]CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective
paper | code
[1]A Light Weight Model for Active Speaker Detection
paper | code
[18]MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model
paper | code
[17]Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
paper
[16]Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
paper | code
[15]Test of Time: Instilling Video-Language Models with a Sense of Time
paper | code
[14]Accelerating Vision-Language Pretraining with Free Language Modeling
paper
[13]Task Residual for Tuning Vision-Language Models
paper | code
[12]MAGVLT: Masked Generative Vision-and-Language Transformer
paper
[11]Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding
paper | code
[10]Lana: A Language-Capable Navigator for Instruction Following and Generation
paper | code
[9]FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks
paper | code
[8]Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
paper
[7]Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
paper
[6]Connecting Vision and Language with Video Localized Narratives
paper | code
[5]Policy Adaptation from Foundation Model Feedback
paper
[4]Open-vocabulary Attribute Detection
paper
[3]Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
paper
[2]Turning a CLIP Model into a Scene Text Detector
paper | code
[1]GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods
paper
[4]TBP-Former: Learning Temporal Bird’s-Eye-View Pyramid for Joint Perception and Prediction in Vision-Centric Autonomous Driving
paper
[3]Intention-Conditioned Long-Term Human Egocentric Action Forecasting
paper | code
[2]Computational Choreography using Human Motion Synthesis
paper
[1]IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint Multi-Agent Trajectory Prediction
paper
[15]GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts
paper
[14]ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data
paper
[13]Fantastic Breaks: A Dataset of Paired 3D Scans of Real-World Broken Objects and Their Complete Counterparts
paper
[12]A Bag-of-Prototypes Representation for Dataset-Level Applications
paper
[11]Music-Driven Group Choreography
paper
[10]RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset
paper
[9]Backdoor Defense via Adaptively Splitting Poisoned Dataset
paper | code
[8]Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models
paper | code
[7]SLOPER4D: A Scene-Aware Dataset for Global 4D Human Pose Estimation in Urban Environments
paper | code
[6]A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
paper | code
[5]MVImgNet: A Large-scale Dataset of Multi-view Images
paper
[4]Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo
paper
[3]CUDA: Convolution-based Unlearnable Datasets
paper
[2]V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception
paper
[1]Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes
paper
[8]CF-Font: Content Fusion for Few-shot Font Generation
paper
[7]DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection
paper | code
[6]Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-shot Learning with Hyperspherical Embeddings
paper | code
[5]Bi-directional Distribution Alignment for Transductive Zero-Shot Learning
paper | code
[4]Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation
paper
[3]Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
paper | code
[2]NIFF: Alleviating Forgetting in Generalized Few-Shot Object Detection via Neural Instance Feature Forging
paper
[1]FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization
paper | code
[2]Computationally Budgeted Continual Learning: What Does Matter?
paper | code
[1]Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning
paper | code
[1]Probabilistic Debiasing of Scene Graphs
paper | code
[1]Prototype-based Embedding Network for Scene Graph Generation
paper
[2]SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
paper
[1]PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
paper | code
[5]Human Pose as Compositional Tokens
paper
[4]Data-efficient Large Scale Place Recognition with Graded Similarity Supervision
paper | code
[3]PSVT: End-to-End Multi-person 3D Pose and Shape Estimation with Progressive Video Transformers
paper
[2]StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition
paper
[1]PyramidFlow: High-Resolution Defect Contrastive Localization using Pyramid Normalizing Flow
paper
[7]3D Concept Learning and Reasoning from Multi-View Images
paper
[6]Abstract Visual Reasoning: An Algebraic Approach for Solving Raven’s Progressive Matrices
paper | code
[5]Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning
paper | code
[4]Generative Bias for Robust Visual Question Answering
paper
[3]MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
paper | code
[2]Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering
paper | code
[1]From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models
paper | code
[4]Rawgment: Noise-Accounted RAW Augmentation Enables Recognition in a Wide Variety of Environments
paper
[3]Semantic Prompt for Few-Shot Image Recognition
paper
[2]Boosting Verified Training for Robust Image Classifications via Abstraction
paper | code
[1]I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
paper
[11]Deep Frequency Filtering for Domain Generalization
paper
[10]Semi-Supervised Domain Adaptation with Source Label Adaptation
paper | code
[9]Unsupervised Continual Semantic Adaptation through Neural Rendering
paper
[8]MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
paper | code
[7]Patch-Mix Transformer for Unsupervised Domain Adaptation: A Game Perspective
paper
[6]Manipulating Transfer Learning for Property Inference
paper | code
[5]Trainable Projected Gradient Method for Robust Fine-tuning
paper
[4]DA-DETR: Domain Adaptive Detection Transformer with Information Fusion
paper
[3]Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection
paper | code
[2]Guiding Pseudo-labels with Uncertainty Estimation for Source-free Unsupervised Domain Adaptation
paper | code
[1]Adaptive Assignment for Geometry Aware Local Feature Matching
paper
[8]PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery
paper | code
[7]Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data
paper
[6]Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
paper
[5]Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
paper | code
[4]MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset
paper | code
[3]CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
paper | code
[2]Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation
paper | code
[1]Twin Contrastive Learning with Noisy Labels
paper | code
[2]Class-Incremental Exemplar Compression for Class-Incremental Learning
paper
[1]Dense Network Expansion for Class Incremental Learning
paper
[3]Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction
paper | code
[2]ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals
paper
[1]EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning
paper | code
[1]A Meta-Learning Approach to Predicting Performance and Data Requirements
paper
[2]Efficient Map Sparsification Based on 2D and 3D Discretized Grids
paper
[1]PyPose: A Library for Robot Learning with Physics-based Optimization
paper | code
[21]Can’t Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
paper
[20]Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation
paper | code
[19]ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning
paper
[18]Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels
paper
[17]Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching
paper | code
[16]Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data
paper
[15]Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning
paper
[14]Correlational Image Modeling for Self-Supervised Visual Pre-Training
paper
[13]Extracting Class Activation Maps from Non-Discriminative Features as well
paper | code
[12]TeSLA: Test-Time Self-Learning With Automatic Adversarial Augmentation
paper | code
[11]LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding
paper
[10]MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection
paper | code
[9]Semi-supervised Hand Appearance Recovery via Structure Disentanglement and Dual Adversarial Discrimination
paper
[8]Non-Contrastive Unsupervised Learning of Physiological Signals from Video
paper
[7]Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems
paper | code
[6]Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models
paper
[5]The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
paper | code
[4]Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning
paper | code
[3]Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
paper
[2]Siamese Image Modeling for Self-Supervised Vision Representation Learning
paper | code
[1]Cut and Learn for Unsupervised Object Detection and Instance Segmentation
paper | project
[3]OCTET: Object-aware Counterfactual Explanations
paper | code
[2]Don’t Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis
paper
[1]SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries
paper | code
[1]Zero-shot Object Counting
paper
[3]Make Landscape Flatter in Differentially Private Federated Learning
paper
[2]STDLens: Model Hijacking-resilient Federated Learning for Object Detection
paper | code
[1]Re-thinking Federated Active Learning based on Inter-class Diversity
paper | code
[1]BEVFormer v2: Adapting Modern Image Backbones to Bird’s-Eye-View Recognition via Perspective Supervision
paper
[57]Level-S2fM: Structure from Motion on Neural Level Set of Implicit Surfaces
paper
[56]FeatureBooster: Boosting Feature Descriptors with a Lightweight Neural Network
paper
[55]ARO-Net: Learning Implicit Fields from Anchored Radial Observations
paper | code
[54]Unknown Sniffer for Object Detection: Don’t Turn a Blind Eye to Unknown Objects
paper
[53]Robust Test-Time Adaptation in Dynamic Scenarios
paper
[52]LayoutFormer++: Conditional Graphic Layout Generation via Constraint Serialization and Decoding Space Restriction
paper
[51]Doubly Right Object Recognition: A Why Prompt for Visual Rationales
paper
[50]CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching
paper
[49]Marching-Primitives: Shape Abstraction from Signed Distance Function
paper
[48]Modeling Inter-Class and Intra-Class Constraints in Novel Class Discovery
paper | code
[47]ActMAD: Activation Matching to Align Distributions for Test-Time-Training
paper | code
[46]Robust Mean Teacher for Continual and Gradual Test-Time Adaptation
paper | code
[45]Planning-oriented Autonomous Driving
paper | code
[44]Explicit Visual Prompting for Low-Level Structure Segmentations
paper | code
[43]Leapfrog Diffusion Model for Stochastic Trajectory Prediction
paper | code
[42]Feature Alignment and Uniformity for Test Time Adaptation
paper
[41]Attribute-preserving Face Dataset Anonymization via Latent Code Optimization
paper | code
[40]Fix the Noise: Disentangling Source Feature for Controllable Domain Translation
paper | code
[39]Effective Ambiguity Attack Against Passport-based DNN Intellectual Property Protection Schemes through Fully Connected Layer Substitution
paper
[38]Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark
paper | code
[37]Learning a Depth Covariance Function
paper
[36]VecFontSDF: Learning to Reconstruct and Synthesize High-quality Vector Fonts via Signed Distance Functions
paper
[35]Dense Distinct Query for End-to-End Object Detection
paper | code
[34]Facial Affective Analysis based on MAE and Multi-modal Information for 5th ABAW Competition
paper
[33]Partial Network Cloning
paper | code
[32]Uncertainty-Aware Optimal Transport for Semantically Coherent Out-of-Distribution Detection
paper | code
[31]Adversarial Counterfactual Visual Explanations
paper | code
[30]A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation
paper | code
[29]Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
paper | code
[28]Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry
paper | code
[27]Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations
paper | code
[26]Backdoor Defense via Deconfounded Representation Learning
paper | code
[25]Label Information Bottleneck for Label Enhancement
paper
[24]LayoutDM: Discrete Diffusion Model for Controllable Layout Generation
paper | code
[23]Diversity-Aware Meta Visual Prompting
paper | code
[22]ABAW: Valence-Arousal Estimation, Expression Recognition, Action Unit Detection & Emotional Reaction Intensity Estimation Challenges
paper
[21]Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
paper
[20]UniHCP: A Unified Model for Human-Centric Perceptions
paper | code
[19]Where We Are and What We’re Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes
paper
[18]Revisiting Rotation Averaging: Uncertainties and Robust Losses
paper | code
[17]3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification
paper
[16]Phase-Shifting Coder: Predicting Accurate Orientation in Oriented Object Detection
paper | code
[15]Understanding and Improving Visual Prompting: A Label-Mapping Perspective
paper | code
[14]vMAP: Vectorised Object Mapping for Neural Field SLAM
paper | code
[13]EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization
paper
[12]Upcycling Models under Domain and Category Shift
paper | code
[11]Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images
paper | code
[10]Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies
paper
[9]Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples
paper | code
[8]Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation
paper
[6]Physical-World Optical Adversarial Attacks on 3D Face Recognition
paper
[5]Improving Cross-Modal Retrieval with Set of Diverse Embeddings
paper
[4]Neural Video Compression with Diverse Contexts
paper | code
[3]Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger
paper
[2]Single Image Backdoor Inversion via Robust Smoothed Classifiers
paper | code
[1]Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
paper | code
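1.CVPR2023|Dispelling the misconception about the data-scaling ability of MIM (masked image modeling)!
2.CVPR 2023|A new CLIP-based fine-tuning paradigm! Training speed and performance both reach new highs!
3.CVPR 2023|Zhejiang University proposes PyramidFlow, a fully normalizing-flow model: a new paradigm for high-resolution defect anomaly localization
4.CVPR 2023|Stable Diffusion reconstructs images from visual brain signals! "Human schemes and lies no longer exist"
5.CVPR 2023|HKUST's DA-BEV: a new SOTA for 3D object detection, a powerful method for mining depth information
6.CVPR 23|Representation learning surpasses MAE: Google and others propose MAGE, with unsupervised image generation that surpasses Latent Diffusion
7.CVPR2023|Sorry, I'm speeding up! FasterNet: higher FLOPS is the real basis for being faster and stronger
8.CVPR 2023|Amid the popularity of large models, SN-Net offers a distinctive answer
9.CVPR 2023|iTPNs: self-supervised learning combined with a feature pyramid structure
10.CVPR 2023|SQR: explorations and reflections on training DETR-family object detectors
11.CVPR 2023|New COCO record of 65.4 mAP! InternImage: injecting new mechanisms, extending DCNv3, and exploring large vision models
12.CVPR 2023|YOLOv7 accepted! After 6 years, the YOLO series returns to CVPR!
13.CVPR 2023|Google proposes Imagic: diffusion models can now edit photos using text alone!
14.CVPR 2023|Lite DETR: 60% less computation! An efficient interleaved multi-scale encoder
15.CVPR 2023|New work from Xiang Bai's team: scene text detection with the help of CLIP
16.CVPR’23|Plug-and-play series! A lightweight, efficient self-attention mechanism helps image restoration networks reach SOTA
17.CVPR 2023|NVIDIA proposes VoxFormer: a new SOTA for monocular 3D semantic scene completion
18.CVPR 2023|EMA-VFI: efficient video frame interpolation that extracts motion and appearance information via inter-frame attention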
Original link: https://github.com/extreme-assistant/CVPR2023-Paper-Code-Interpretation/blob/master/CVPR2023.md?plain=1