[Paper Collection] Awesome Video Diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, NeRF, etc.

Source: https://github.com/showlab/Awesome-Video-Diffusion#

Contents

Open-source Toolboxes and Foundation Models

Video Generation

Video Editing

Long-form Video Generation and Completion

Human or Subject Motion

Video Enhancement and Restoration

3D / NeRF

Video Understanding

Healthcare and Biology


Open-source Toolboxes and Foundation Models

  • I2VGen-XL (image-to-video / video-to-video)

  • text-to-video-synthesis-colab

  • VideoCrafter: A Toolkit for Text-to-Video Generation and Editing

  • ModelScope (Text-to-video synthesis)

  • Diffusers (Text-to-video synthesis)

Video Generation

  • Hierarchical Masked 3D Diffusion Model for Video Outpainting (Sep., 2023)

  • VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation (Sep., 2023)

  • MagicAvatar: Multimodal Avatar Generation and Animation (Aug., 2023)

  • Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models (Aug., 2023)

  • ModelScope Text-to-Video Technical Report (Aug., 2023)

  • Dual-Stream Diffusion Net for Text-to-Video Generation (Aug., 2023)

  • DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory (Aug., 2023)

  • InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation (Jul., 2023)

  • Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation (Jul., 2023)

  • AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning (Jul., 2023)

  • DisCo: Disentangled Control for Referring Human Dance Generation in Real World (Jul., 2023)

  • VideoComposer: Compositional Video Synthesis with Motion Controllability (Jun., 2023)

  • Probabilistic Adaptation of Text-to-Video Models (Jun., 2023)

  • Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance (Jun., 2023)

  • Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising (May, 2023)

  • Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models (May, 2023)

  • ControlVideo: Training-free Controllable Text-to-Video Generation (May, 2023)

  • Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity (May, 2023)

  • Any-to-Any Generation via Composable Diffusion (May, 2023)

  • VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation (May, 2023)

  • Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models (May, 2023)

  • Motion-Conditioned Diffusion Model for Controllable Video Synthesis (Apr., 2023)

  • LaMD: Latent Motion Diffusion for Video Generation (Apr., 2023)

  • Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models (CVPR 2023)

  • Text2Performer: Text-Driven Human Video Generation (Apr., 2023)

  • Generative Disco: Text-to-Video Generation for Music Visualization (Apr., 2023)

  • Latent-Shift: Latent Diffusion with Temporal Shift (Apr., 2023)

  • DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion (Apr., 2023)

  • Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos (Apr., 2023)

  • Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos (CVPR 2023)

  • Seer: Language Instructed Video Prediction with Latent Diffusion Models (Mar., 2023)

  • Text2Video-Zero: Text-to-Image Diffusion Models Are Zero-Shot Video Generators (Mar., 2023)

  • Conditional Image-to-Video Generation with Latent Flow Diffusion Models (CVPR 2023)

  • Decomposed Diffusion Models for High-Quality Video Generation (CVPR 2023)

  • Video Probabilistic Diffusion Models in Projected Latent Space (CVPR 2023)

  • Learning 3D Photography Videos via Self-supervised Diffusion on Single Images (Feb., 2023)

  • Structure and Content-Guided Video Synthesis With Diffusion Models (Feb., 2023)

  • Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation (Dec., 2022)

  • MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation (CVPR 2023)

  • MAGVIT: Masked Generative Video Transformer (Dec., 2022)

  • VIDM: Video Implicit Diffusion Models (AAAI 2023)

  • Latent Video Diffusion Models for High-Fidelity Video Generation With Arbitrary Lengths (Nov., 2022)

  • SinFusion: Training Diffusion Models on a Single Image or Video (Nov., 2022)

  • MagicVideo: Efficient Video Generation With Latent Diffusion Models (Nov., 2022)

  • Imagen Video: High Definition Video Generation With Diffusion Models (Oct., 2022)

  • Make-A-Video: Text-to-Video Generation without Text-Video Data (ICLR 2023)

  • Diffusion Models for Video Prediction and Infilling (TMLR 2022)

  • MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation (NeurIPS 2022)

  • Video Diffusion Models (Apr., 2022)

  • Diffusion Probabilistic Modeling for Video Generation (Mar., 2022)

Video Editing

  • MagicProp: Diffusion-based Video Editing via Motion-aware Appearance Propagation (Sep., 2023)

  • MagicEdit: High-Fidelity and Temporally Coherent Video Editing (Aug., 2023)

  • StableVideo: Text-driven Consistency-aware Diffusion Video Editing (ICCV 2023)

  • CoDeF: Content Deformation Fields for Temporally Consistent Video Processing (Aug., 2023)

  • TokenFlow: Consistent Diffusion Features for Consistent Video Editing (Jul., 2023)

  • INVE: Interactive Neural Video Editing (Jul., 2023)

  • VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing (Jun., 2023)

  • Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation (Jun., 2023)

  • ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing (May, 2023)

  • Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts (May, 2023)

  • Soundini: Sound-Guided Diffusion for Natural Video Editing (Apr., 2023)

  • Zero-Shot Video Editing Using Off-the-Shelf Image Diffusion Models (Mar., 2023)

  • Edit-A-Video: Single Video Editing with Object-Aware Consistency (Mar., 2023)

  • FateZero: Fusing Attentions for Zero-shot Text-based Video Editing (Mar., 2023)

  • Pix2Video: Video Editing Using Image Diffusion (Mar., 2023)

  • Video-P2P: Video Editing with Cross-attention Control (Mar., 2023)

  • Dreamix: Video Diffusion Models Are General Video Editors (Feb., 2023)

  • Shape-Aware Text-Driven Layered Video Editing (Jan., 2023)

  • Speech Driven Video Editing via an Audio-Conditioned Diffusion Model (Jan., 2023)

  • Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding (CVPR 2023)

Long-form Video Generation and Completion

  • MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation (NeurIPS 2022)

  • NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation (Mar., 2023)

  • Flexible Diffusion Modeling of Long Videos (May, 2022)

Human or Subject Motion

  • Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model (CVPR 2023)

  • InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions (Apr., 2023)

  • ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model (Apr., 2023)

  • Human Motion Diffusion as a Generative Prior (Mar., 2023)

  • Can We Use Diffusion Probabilistic Models for 3D Motion Prediction? (Feb., 2023)

  • Single Motion Diffusion (Feb., 2023)

  • HumanMAC: Masked Motion Completion for Human Motion Prediction (Feb., 2023)

  • DiffMotion: Speech-Driven Gesture Synthesis Using Denoising Diffusion Model (Jan., 2023)

  • Modiff: Action-Conditioned 3D Motion Generation With Denoising Diffusion Probabilistic Models (Jan., 2023)

  • Unifying Human Motion Synthesis and Style Transfer With Denoising Diffusion Probabilistic Models (GRAPP 2023)

  • Executing Your Commands via Motion Diffusion in Latent Space (CVPR 2023)

  • Pretrained Diffusion Models for Unified Human Motion Synthesis (Dec., 2022)

  • PhysDiff: Physics-Guided Human Motion Diffusion Model (Dec., 2022)

  • BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction (Dec., 2022)

  • Listen, Denoise, Action! Audio-Driven Motion Synthesis With Diffusion Models (Nov., 2022)

  • Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model (ICASSP 2023)

  • Human Joint Kinematics Diffusion-Refinement for Stochastic Motion Prediction (Oct., 2022)

  • Human Motion Diffusion Model (ICLR 2023)

  • FLAME: Free-form Language-based Motion Synthesis & Editing (AAAI 2023)

  • MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model (Aug., 2022)

  • Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion (CVPR 2022)

Video Enhancement and Restoration

  • LDMVFI: Video Frame Interpolation with Latent Diffusion Models (Mar., 2023)

  • CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming (Nov., 2022)

3D / NeRF

  • Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields (May, 2023)

  • RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture (May, 2023)

  • NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models (CVPR 2023)

  • Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction (Apr., 2023)

  • Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions (Mar., 2023)

  • DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models (Feb., 2023)

  • NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion (Feb., 2023)

  • DiffRF: Rendering-guided 3D Radiance Field Diffusion (CVPR 2023)

Video Understanding

  • Exploring Diffusion Models for Unsupervised Video Anomaly Detection (Apr., 2023)

  • PDPP: Projected Diffusion for Procedure Planning in Instructional Videos (CVPR 2023)

  • DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion (Mar., 2023)

  • DiffusionRet: Generative Text-Video Retrieval with Diffusion Model (Mar., 2023)

  • Refined Semantic Enhancement Towards Frequency Diffusion for Video Captioning (Nov., 2022)

  • A Generalist Framework for Panoptic Segmentation of Images and Videos (Oct., 2022)

Healthcare and Biology

  • Annealed Score-Based Diffusion Model for MR Motion Artifact Reduction (Jan., 2023)

  • Feature-Conditioned Cascaded Video Diffusion Models for Precise Echocardiogram Synthesis (Mar., 2023)

  • Neural Cell Video Synthesis via Optical-Flow Diffusion (Dec., 2022)
