this post was submitted on 10 Apr 2025

Stable Diffusion


Overview

Pusa introduces a paradigm shift in video diffusion modeling through frame-level noise control: each frame is assigned its own timestep, so the model works with thousands of timestep combinations rather than the single shared schedule of one thousand timesteps used in conventional approaches. This shift was first presented in our FVDM paper. Leveraging this architecture, Pusa seamlessly supports diverse video generation tasks (text-, image-, and video-to-video) while maintaining strong motion fidelity and prompt adherence with our refined base-model adaptations. Pusa-V0.5 is an early preview built on Mochi1-Preview. We are open-sourcing this work to foster community collaboration, improve the methodology, and expand its capabilities.
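To make the frame-level idea concrete, here is a toy sketch of the difference between a conventional shared timestep and per-frame timesteps. This is not Pusa's or Mochi's actual noise schedule (real diffusion schedulers use cumulative alpha products, not the simple linear blend below); the function names and the linear noising rule are illustrative assumptions only.

```python
import random

T = 1000  # number of discrete noise levels, as in a standard diffusion schedule


def add_noise_scalar(frames, t):
    """Conventional approach: one shared timestep t noises every frame equally.

    Uses a simplified linear blend between signal and Gaussian noise.
    """
    alpha = 1.0 - t / T
    return [[alpha * x + (1 - alpha) * random.gauss(0, 1) for x in f] for f in frames]


def add_noise_per_frame(frames, ts):
    """FVDM/Pusa-style frame-level control: an independent timestep per frame.

    With N frames and T levels there are T**N possible timestep combinations,
    versus only T under a single shared timestep.
    """
    assert len(ts) == len(frames), "one timestep per frame"
    out = []
    for f, t in zip(frames, ts):
        alpha = 1.0 - t / T
        out.append([alpha * x + (1 - alpha) * random.gauss(0, 1) for x in f])
    return out


# Hypothetical usage: a 4-frame "video" with 3 values per frame.
frames = [[1.0, 2.0, 3.0]] * 4
shared = add_noise_scalar(frames, t=500)          # all frames equally noisy
per_frame_ts = [random.randrange(T) for _ in frames]
mixed = add_noise_per_frame(frames, per_frame_ts)  # each frame at its own noise level
```

Because each frame can sit at a different noise level, the same model naturally covers text-, image-, and video-to-video tasks: conditioning frames can be held near timestep 0 (clean) while the frames to be generated start fully noised.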

Model: https://huggingface.co/RaphaelLiu/Pusa-V0.5

Code: https://github.com/Yaofang-Liu/Pusa-VidGen

Training Toolkit: https://github.com/Yaofang-Liu/Mochi-Full-Finetuner

Dataset: https://huggingface.co/datasets/RaphaelLiu/PusaV0.5_Training
