Zehuan Huang (黄泽桓) 

Final-Year Master's Student @ Beihang University


Email: huangzehuan@buaa.edu.cn
[Github] [Google Scholar] [Twitter] [CV]

Biography

I am a master's student in the School of Software at Beihang University, supervised by Prof. Lu Sheng.

My prior research focused on applying deep generative models to 3D asset creation, encompassing the generation of 3D objects, scenes, textures, and animations. My current research interests lie in world models and simulation, including

 (i) Generalizable 3D Foundation Models (Generative 3D Reconstruction, Physics Modeling)
 (ii) Interactive World Models (Real-Time, Long-term Memory, Physics-Compliance)

I am grateful to all my collaborators and mentors along the way. I began doing research under the guidance of Prof. Miao Wang, and later started working on deep learning projects under the supervision of Prof. Lu Sheng. I have also interned at MiniMax, Shanghai AI Lab, and VAST, where I was fortunate to work closely with Junting Dong, Yuan-Chen Guo, and Yan-Pei Cao.

I am always open to academic and industrial collaborations. If you share the vision, please do not hesitate to contact me!

News

Selected Publications

3D Generation
MV-Adapter: Multi-view Consistent Image Generation Made Easy
Zehuan Huang, Yuan-Chen Guo, Haoran Wang, Ran Yi, Lizhuang Ma, Yan-Pei Cao, Lu Sheng
ICCV 2025
TL;DR: Versatile multi-view generation with various base models and conditions, and high-quality 3D texture generation.
MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
Zehuan Huang, Yuan-Chen Guo, Xingqiao An, Yunhan Yang, Yangguang Li, Zi-Xin Zou, Ding Liang, Xihui Liu, Yan-Pei Cao, Lu Sheng
CVPR 2025
TL;DR: MIDI-3D extends image-to-3D object generation models to multi-instance diffusion models for compositional 3D scene generation.
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
Hao Wen*, Zehuan Huang*, Yaohui Wang, Xinyuan Chen, Yu Qiao, Lu Sheng
CVPR 2025
TL;DR: Transfers the two-stage image-to-3D pipeline into a unified recursive diffusion process, reducing the data bias of each stage and improving the quality of the generated 3D assets.
TELA: Text to Layer-wise 3D Clothed Human Generation
Junting Dong, Qi Fang, Zehuan Huang, Xudong Xu, Jingbo Wang, Sida Peng, Bo Dai
ECCV 2024
EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion
Zehuan Huang*, Hao Wen*, Junting Dong*, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-Pei Cao, Ding Liang, Yu Qiao, Bo Dai, Lu Sheng
CVPR 2024
* Equal contribution. † Project lead. Corresponding author.

Leading Projects

Controllable Image Generation
Multi-Agent Amodal Completion: Direct Synthesis with Fine-Grained Semantic Guidance
Hongxing Fan, Lipeng Wang, Haohua Chen, Zehuan Huang†, Lu Sheng
ACM MM 2025
TL;DR: A multi-agent framework for high-fidelity amodal completion.
Personalize Anything for Free with Diffusion Transformer
Haoran Feng*, Zehuan Huang*†, Lin Li, Hairong Lv, Lu Sheng
Under Review
TL;DR: Customize any subject with advanced DiT without additional fine-tuning.
Parts2Whole: Generalizable Multi-Part Portrait Customization
Hongxing Fan*, Zehuan Huang*†, Lipeng Wang, Haohua Chen, Li Yin, Lu Sheng
TIP 2025
TL;DR: A unified framework for customizing human images from user-specified part images.

Open-Source Projects

Custom nodes for using MV-Adapter for multi-view synthesis in ComfyUI. (ComfyUI, PyTorch, Diffusion)
Python package for rendering 3D scenes and animations using Blender. (Blender, Python, 3D Rendering)

Honors & Awards

Education

Industrial Experience

Services

Reviewer

ICLR, NeurIPS, CVPR, ICCV, ACM MM, TCSVT

Contributor

huggingface/diffusers, the most widely-used library for diffusion models.
threestudio, a popular repository for 3D generation.

In-School

2023 Fall ~ 2025 Spring, part-time technology counselor, School of Software, Beihang University
2024 Spring, TA for Image Processing and Computer Vision, taught by Prof. Lu Sheng
© 2025 Zehuan Huang