Yisheng He (何益升)

Yisheng He is a researcher at Alibaba. He obtained his Ph.D. at HKUST, advised by Prof. Qifeng Chen, Prof. Long Quan, and Dr. Jian Sun.

Email  /  Google Scholar  /  GitHub

🔥 We are now actively hiring research interns. The successful candidates will conduct research to publish at leading international conferences. To apply, please email your CV to: ethanheysh@gmail.com.

Research

I'm interested in 3D Computer Vision, AIGC, Embodied AI, and Digital Avatars.


( * denotes equal contribution; ^ denotes intern student; ✉ denotes corresponding author.)
Forge4D: Feed-Forward 4D Human Reconstruction and Interpolation from Uncalibrated Sparse-View Videos
Yingdong Hu*^, Yisheng He*✉, Jinnan Chen, Weihao Yuan, Kejie Qiu, Zehong Lin, Siyu Zhu, Zilong Dong, Jun Zhang
Preprint, 2025
project page / paper / code

Forge4D is the first feed-forward model for 4D human Gaussian reconstruction at real-world metric scale. It enables novel-view and novel-time synthesis from uncalibrated sparse-view videos in an efficient streaming manner.

PanoLAM: Large Avatar Model for Gaussian Full-Head Synthesis from One-shot Unposed Image
Peng Li*^, Yisheng He*✉, Yingdong Hu^, Yuan Dong, Weihao Yuan, Yuan Liu, Siyu Zhu, Gang Cheng, Zilong Dong, Yike Guo
Preprint, 2025
project page / paper / code

PanoLAM is a large avatar model for Gaussian full-head reconstruction from a single unposed image. It uses coarse-to-fine and dual-branch frameworks to create a Gaussian full head within a second.

CoProSketch: Controllable and Progressive Sketch Generation with Diffusion Model
Ruohao Zhan*, Yijin Li*, Yisheng He, Shuo Chen, Yichen Shen, Xinyu Chen, Zilong Dong, Zhaoyang Huang, Guofeng Zhang
ACMM, 2025
paper

CoProSketch provides prominent controllability and details for sketch generation with diffusion models.

LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Yisheng He*, Xiaodong Gu*, Xiaodan Ye, Chao Xu, Zhengyi Zhao, Yuan Dong, Weihao Yuan, Zilong Dong, Liefeng Bo
SIGGRAPH, 2025
project page / paper / code

LAM creates animatable Gaussian heads from one-shot images in a single forward pass. The heads can be reenacted and rendered in real time on various platforms, including mobile phones.

LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
Zhe Li^, Weihao Yuan, Yisheng He, Lingteng Qiu, Shenhao Zhu, Xiaodong Gu, Weichao Shen, Yuan Dong, Zilong Dong, Laurence T. Yang
ICLR, 2025
project page / paper / code

LaMP is a language-motion pretraining model that advances text-to-motion generation, motion-text retrieval, and motion captioning through aligned language-motion representation learning.

MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow
Zhe Li^, Yisheng He, Zhong Lei, Weichao Shen, Qi Zuo, Lingteng Qiu, Shenhao Zhu, Zilong Dong, Laurence T. Yang, Weihao Yuan
arXiv, 2025
paper

We build a bidirectional control flow between the style and the content for stylized motion generation and enable multimodal style control including text, image, and style motions.

Gaussian-Informed Continuum for Physical Property Identification and Simulation
Junhao Cai^, Yuji Yang, Weihao Yuan, Yisheng He, Zilong Dong, Liefeng Bo, Hui Cheng, Qifeng Chen
NeurIPS, 2024 (Oral Presentation)
project page / paper / code

We introduce a hybrid framework that leverages 3D Gaussian representation to advance physical property identification.

MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling
Weihao Yuan*, Yisheng He*, Weichao Shen, Yuan Dong, Xiaodong Gu, Zilong Dong, Liefeng Bo, Qixing Huang
NeurIPS, 2024
paper

We introduce a 2D joint VQVAE that quantizes each joint, rather than all joints together, into tokens. We also propose a spatial-temporal modeling framework with temporal-spatial 2D masking and 2D attention for motion generation.

Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition
Yisheng He, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qixing Huang
ECCV, 2024
project page / paper

We enable high-fidelity, transferable neural field editing with intensity control.

Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
Minglin Chen^, Longguang Wang, Weihao Yuan, Yukun Wang, Zhe Sheng, Yisheng He, Zilong Dong, Liefeng Bo, Yulan Guo
arXiv, 2024
paper

Our method synthesizes consistent 3D content with fine-grained sketch control.

OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation
Junhao Cai*^, Yisheng He*, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qifeng Chen
IEEE Robotics and Automation Letters (RA-L), 2024
project page / paper / code

We introduce a new problem, open-vocabulary 9D object pose and size estimation; a new dataset, OO3D-9D; and a new framework based on a vision foundation model to tackle this problem.

Towards Self-Supervised Category-Level Object Pose and Size Estimation
Yisheng He, Haoqiang Fan, Haibin Huang, Qifeng Chen, Jian Sun
arXiv, 2022
project page / paper

A self-supervised framework for category-level object pose and size estimation via differentiable shape deformation, registration, and rendering.

FS6D: Few-Shot 6D Pose Estimation of Novel Objects
Yisheng He, Yao Wang, Haoqiang Fan, Jian Sun, Qifeng Chen
CVPR, 2022
project page / paper / data / code

A new open-set few-shot 6D object pose estimation problem: estimating the 6D pose of an unknown object from a few support views, without CAD models or extra training. A large-scale synthetic dataset for pre-training and benchmarks for future research.

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
Yisheng He, Haibin Huang, Haoqiang Fan, Qifeng Chen, Jian Sun
CVPR, 2021 (Oral Presentation)
project page / paper / code / video (youtube) / video (bilibili)

A generic full flow bidirectional fusion framework for RGBD representation learning, applied to joint instance semantic segmentation and 3D keypoint-based 6D pose estimation.

iShape: A First Step Towards Irregular Shape Instance Segmentation
Lei Yang, Ziwei Yan, Yisheng He, Wei Sun, Zhenhang Huang, Haibin Huang, Haoqiang Fan
arXiv, 2021
project page / paper / code / dataset

A brand new dataset to promote the study of instance segmentation for objects with irregular shapes and an affinity-based algorithm to tackle it.

PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, Jian Sun
CVPR, 2020
project page / paper / code / video (youtube) / video (bilibili)

The first deep learning 3D keypoint-based 6D pose estimation algorithm and an overall framework for joint instance semantic segmentation and 3D keypoint detection.

Academic Challenge
Ranked 2nd in OCRTOC: Open Cloud Robot Table Organization Challenge, 2020
Services

  • Program Committee/Reviewers: CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, ACMM, ICRA, IROS, TPAMI, IJCV, RAL, Neurocomputing
  • Teaching Assistant @ HKUST: COMP 4201 (Spring 2019), COMP 1029 (Fall 2020), COMP 4201 (Spring 2021)

  • Last updated: October 2025.

    Thanks to Dr. Jon Barron for sharing the template code.