CVPR 2024 Papers by Selected Research Area (Continuously Updated)

Long-Tailed Distribution

  1. DeiT-LT: Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets
    Full text: DeiT-LT (rangwani-harsh.github.io)

Domain Adaptation

  1. Learning CNN on ViT: A Hybrid Model to Explicitly Class-specific Boundaries for Domain Adaptation
    Full text: Learning CNN on ViT: A Hybrid Model to Explicitly Class-specific Boundaries for Domain Adaptation | Project Page (dotrannhattuong.github.io)
  2. Source-Free Domain Adaptation with Frozen Multimodal Foundation Model

Vision-Language Pretraining

  1. A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models
    Full text: CLAP (jusiro.github.io)
  2. Improving Generalized Zero-Shot Learning by Exploring the Diverse Semantics from External Class Names
  3. FairCLIP: Harnessing Fairness in Vision-Language Learning
  4. Efficient Test-Time Adaptation of Vision-Language Models
  5. ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models
  6. Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
  7. Transductive Zero-Shot and Few-Shot CLIP
  8. LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP
  9. PeVL: Pose-Enhanced Vision-Language Model for Fine-Grained Human Action Recognition
  10. Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
  11. PromptKD: Unsupervised Prompt Distillation for Vision-Language Models
  12. JoAPR: Cleaning the Lens of Prompt Learning for Vision-Language Models
  13. Label Propagation for Zero-shot Classification with Vision-Language Models
  14. ProTeCt: Prompt Tuning for Taxonomic Open Set Classification
  15. Active Prompt Learning in Vision Language Models

Large Multimodal Models

  1. ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
    Full text: ViP-LLaVA
  2. Generative Multimodal Models are In-Context Learners
  3. Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
    Full text: Q-Instruct | ②[CVPR 2024] Low-level visual instruction tuning, with a 200K dataset and a model zoo for fine-tuned checkpoints. (q-future.github.io)

Few-Shot Learning

  1. Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning
  2. Adversarially Robust Few-shot Learning via Parameter Co-distillation of Similarity and Class Concept Learners
  3. OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning
  4. Discriminative Sample-Guided and Parameter-Efficient Feature Space Adaptation for Cross-Domain Few-Shot Learning
  5. Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning
  6. Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning
  7. Large Language Models are Good Prompt Learners for Low-Shot Image Classification

Diffusion Models

  1. GenTron: Diffusion Transformers for Image and Video Generation
  2. DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models
    Full text: DiffuseMix
  3. Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives
    Full text: Lodge (li-ronghui.github.io)
  4. TokenCompose: Text-to-Image Diffusion with Token-level Supervision
    Full text: TokenCompose: Grounding Diffusion with Token-level Supervision (mlpc-ucsd.github.io)
  5. FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
  6. LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion
  7. FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models
  8. Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models
  9. CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model
  10. Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing
  11. L-MAGIC: Language Model Assisted Generation of Images with Consistency
  12. InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Class-Incremental Learning

  1. Class Incremental Learning with Multi-Teacher Distillation
  2. Dual-Enhanced Coreset Selection with Class-wise Collaboration for Online Blurry Class Incremental Learning
  3. Generative Multi-modal Models are Good Class Incremental Learners
  4. Text-Enhanced Data-free Approach for Federated Class-Incremental Learning

Noisy Label Learning

  1. Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning
