Publications
PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework
Improved and Accelerated Text-to-Image Generation with Collect, Reflect, and Refine
EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing
LucidFlux: Caption-Free Universal Image Restoration via a Large-Scale Diffusion Transformer
AuroraLong: Bringing RNNs Back to Efficient Open-Ended Video Understanding
Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine
Posta: A Go-To Framework for Customized Artistic Poster Generation
SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization
GenHaze: Pioneering Controllable One-Step Realistic Haze Generation for Real-World Dehazing
GlassWizard: Harvesting Diffusion Priors for Glass Surface Detection
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice
Technical report for Hedra Inc.’s Character-3 model.
MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Few-Step Synthesis
Acceleration technique and foundational image-to-video model work for Hedra Inc.’s Character-3.
AGLLDiff: Guiding Diffusion Models Towards Unsupervised Training-free Real-world Low-light Image Enhancement
DPLUT: Unsupervised Low-Light Image Enhancement with Lookup Tables and Diffusion Priors
PromptHaze: Prompting Real-world Dehazing via Depth Anything Model
Residual Diffusion Deblurring Model for Single Image Defocus Deblurring
MovieChat+: Question-Aware Sparse Memory for Long Video Question Answering
RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models
Cross-Conditioned Diffusion Model for Medical Image-to-Image Translation
Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint
Learning Diffusion Texture Priors for Image Restoration
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
The first SDXL-level high-resolution non-autoregressive (non-AR) text-to-image (T2I) model.