Zehuan Huang (黄泽桓) 

Beihang University


Email: huanngzh@gmail.com
[Github] [Google Scholar] [Twitter] [CV]

Biography

I am a master's student in the School of Software at Beihang University, supervised by Prof. Lu Sheng.

My prior research focused on applying deep generative models to 3D asset creation, encompassing the generation of 3D objects, scenes, textures, and animations. My current research interests lie in interactive world models and simulation, including (i) Simulation-Ready 3D Generation and Reconstruction, (ii) Interactive World Modeling, and (iii) Native Multi-Modal Generative Models.

I am grateful to all my collaborators and mentors along the way. I first started doing research under the guidance of Prof. Miao Wang, and later began working on deep learning projects under the supervision of Prof. Lu Sheng. I have also interned at MiniMax, Shanghai AI Lab, VAST, and Tencent Hunyuan3D, and I am fortunate to have worked closely with Junting Dong, Yuan-Chen Guo, Yanpei Cao, Yang Li, Zhuo Chen, and Chunchao Guo.

I am always open to academic and industrial collaborations. If you share the vision, please do not hesitate to contact me!

News

Selected Publications

3D Generation
AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models
Zehuan Huang, Haoran Feng, Yangtian Sun, Yuanchen Guo, Yanpei Cao, Lu Sheng
SIGGRAPH Asia 2025
TL;DR: Animate any 3D skeleton with joint video-pose diffusion models.
MV-Adapter: Multi-view Consistent Image Generation Made Easy
Zehuan Huang, Yuan-Chen Guo, Haoran Wang, Ran Yi, Lizhuang Ma, Yan-Pei Cao, Lu Sheng
ICCV 2025
TL;DR: Versatile multi-view generation with various base models and conditions, and high-quality 3D texture generation.
MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
Zehuan Huang, Yuan-Chen Guo, Xingqiao An, Yunhan Yang, Yangguang Li, Zi-Xin Zou, Ding Liang, Xihui Liu, Yan-Pei Cao, Lu Sheng
CVPR 2025
TL;DR: MIDI-3D extends image-to-3D object generation models to multi-instance diffusion models for compositional 3D scene generation.
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Lin Li*, Zehuan Huang*†, Haoran Feng, Gengxiong Zhuang, Rui Chen, Chunchao Guo, Lu Sheng
3DV 2026 Oral
TL;DR: A training-free 3D editing approach that performs precise and coherent editing in native 3D latent space instead of multi-view space.

* Equal contribution. † Project lead. Corresponding author.

Open-Source Projects

Custom nodes for using MV-Adapter for multi-view synthesis in ComfyUI.
(ComfyUI, PyTorch, Diffusion)
Python package for rendering 3D scenes and animations using Blender.
(Blender, Python, 3D Rendering)

Selected Honors & Awards

Education

Industrial Experience

Services

Reviewer

ICLR, NeurIPS, CVPR, ICCV, ACM MM, AAAI, IJCV, TCSVT

In-School

2023 Fall ~ 2025 Spring, part-time technology counselor in the School of Software, Beihang University
2024 Spring, TA in Image Processing and Computer Vision, instructed by Prof. Lu Sheng
© 2026 Zehuan Huang