| 
          
            | 
                Yisheng He (何益升)
               
                Yisheng He is a researcher at Alibaba. He obtained his Ph.D. at HKUST,
                advised by Prof. Qifeng Chen, Prof. Long Quan, and Dr. Jian Sun.
               
                Email  / 
                
                
                Google Scholar  / 
                
                GitHub 
                
               
                🔥We are now actively hiring research interns. If you are interested in working with me, feel free to email me your CV.
               |   |  
            
            | Research 
                I'm interested in 3D Computer Vision, AIGC, Embodied AI, and Digital Avatar.
               ( * denotes equal contribution; ^ denotes intern student; ✉ denotes corresponding author.)
 |  
          
            |   | Forge4D: Feed-Forward 4D Human Reconstruction and Interpolation from Uncalibrated Sparse-View Videos Yingdong Hu*^, Yisheng He*✉, Jinnan Chen, Weihao Yuan, Kejie Qiu, Zehong Lin, Siyu Zhu, Zilong Dong, Jun Zhang
 Preprint, 2025
 project page /
	      	      paper /
		            code
   
                Forge4D is the first feed-forward model for 4D human Gaussian reconstruction in real world metric scale, and enables novel-view and novel-time synthesis from uncalibrated sparse-view videos in an efficient streaming manner.
               |  
            |   | PanoLAM: Large Avatar Model for Gaussian Full-Head Synthesis from One-shot Unposed Image Peng Li*^, Yisheng He*✉, Yingdong Hu^, Yuan Dong, Weihao Yuan, Yuan Liu, Siyu Zhu, Gang Cheng, Zilong Dong, Yike Guo
 Preprint, 2025
 project page /
	      	      paper /
		            code
 
                PanoLAM is a large avatar model for Gaussian full-head reconstruction from a single unposed image. It utilize coarse-to-fine and dual-branch frameworks that creates Gaussian full-head within a second.
               |  
            |   | CoProSketch: Controllable and Progressive Sketch Generation with Diffusion Model Ruohao Zhan*, Yijin Li*, Yisheng He, Shuo Chen, Yichen Shen, Xinyu Chen, Zilong Dong, Zhaoyang Huang, Guofeng Zhang
 ACMM, 2025
 paper
 
                CoProSketch provides prominent controllability and details for sketch generation with diffusion models.
               |  
            |   | LAM: Large Avatar Model for One-shot Animatable Gaussian Head Yisheng He*, Xiaodong Gu*, Xiaodan Ye, Chao Xu, Zhengyi Zhao, Yuan Dong, Weihao Yuan, Zilong Dong, Liefeng Bo
 SIGGRAPH, 2025
 project page /
	      	      paper /
		            code
   
                LAM creates animatable Gaussian heads with one-shot images in a single forward pass, which can be reenacted and rendered on various platforms (including mobile phones) in real time.
               |  
            |   | LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning Zhe Li^, Weihao Yuan, Yisheng He, Lingteng Qiu, Shenhao Zhu, Xiaodong Gu, Weichao Shen, Yuan Dong, Zilong Dong, Laurence T. Yang
 ICLR, 2025
 project page /
	      	    paper /
		          code
   
                LaMP is a language-motion pretraining model that advances text-to-motion generation, motion-text retrieval, and motion captioning through aligned language-motion representation learning.
               |  
            |   | MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow Zhe Li^, Yisheng He, Zhong Lei, Weichao Shen, Qi Zuo, Lingteng Qiu, Shenhao Zhu, Zilong Dong, Laurence T. Yang, Weihao Yuan
 Arxiv, 2025
 paper
 
                We build a bidirectional control flow between the style and the content for stylized motion generation and enable multimodal style control including text, image, and style motions.
               |  
            |   | Gaussian-Informed Continuum for Physical Property Identification and Simulation Junhao Cai^, Yuji Yang, Weihao Yuan, Yisheng He, Zilong Dong, Liefeng Bo, Hui Cheng, Qifeng Chen
 NeurIPS, 2024 (Oral Presentation)
 project page /
	      	    paper /
		          code
   
                  We introduce a hybrid framework that leverages 3D Gaussian representation to advance physical property identification.
               |  
            |   | MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling Weihao Yuan*, Yisheng He*, Weichao Shen, Yuan Dong, Xiaodong Gu, Zilong Dong, Liefeng Bo, Qixing Huang
 NeurIPS, 2024
 paper
 
                We introduce a 2D joint VQVAE to quantize each joint instead of all joints into tokens. A spatial-temporal modeling framework with temporal-spatial 2D masking and 2D attention is also proposed for motion generation.
               |  
            |   | Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition Yisheng He, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qixing Huang
 ECCV, 2024
 project page /
	            paper
 
                We enable high-fidelity, transferable, and intensity control for neural field editing. 
                
               |  
            |   | Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation Minglin Chen^, Longguang Wang, Weihao Yuan, Yukun Wang, Zhe Sheng, Yisheng He, Zilong Dong, Liefeng Bo, Yulan Guo
 Arxiv, 2024
 paper
 
                Our method synthesizes consistent 3D content with fine-grained sketch control. 
                
               |  
            |   | OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation Junhao Cai*^, Yisheng He*, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qifeng Chen,
 IEEE Robotics and Automation Letters (RA-L), 2024
 project page /
		          paper /
		          code
   
                We introduce a new problem: open-vocabulary 9D object pose and size estimation, a new dataset: OO3D-9D, and a new framework based on vision foundation model to tackle this problem.
                
               |  
            |   | Towards Self-Supervised Category-Level Object Pose and Size Estimation Yisheng He, Haoqiang Fan, Haibin Huang, Qifeng Chen, Jian Sun
 Arxiv, 2022
 project page / 
		          paper
 
                A self-supervised framework for category-level object pose and
                size estimation via differentiable shape deformation, registration, and rendering.
               |  
            |   | FS6D: Few-Shot 6D Pose Estimation of Novel Objects Yisheng He, Yao Wang, Haoqiang Fan, Jian Sun, Qifeng Chen
 CVPR, 2022
 project page / 
							paper / 
							data /
							code
   
                A new open-set few-shot 6D object pose estimation problem: 
                estimating the 6D pose of an unknown object by a few support views without CAD models and extra training. 
                A large-scale synthesis dataset for pre-training and benchmarks for future research.
               |  
            |   | FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation Yisheng He, Haibin Huang, Haoqiang Fan, Qifeng Chen, Jian Sun
 CVPR, 2021 (Oral Presentation)
 project page / 
							paper / 
							code
  /
              video (youtube) /
              video (bilibili) 
                A generic full flow bidirectional fusion framework for RGBD representation learning,
                applied to joint instance semantic segmentation and 3D keypoint-based 6D pose estimation.
               |  
            |   | iShape: A First Step Towards Irregular Shape Instance Segmentation Lei Yang, Ziwei Yan, Yisheng He, Wei Sun, Zhenhang Huang, Haibin Huang, Haoqiang Fan
 arXiv, 2021
 project page / 
							paper / 
							code / 
							dataset
 
                A brand new dataset to promote the study of instance segmentation for objects with irregular shapes and 
                an affinity-based algorithm to tackle it.
               |  
            |   | PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, Jian Sun
 CVPR, 2020
 project page /
							paper / 
							code
  /
              video (youtube) /
              video (bilibili) The first deep learning 3D keypoint-based 6D pose estimation algorithm and an overall framework for joint instance semantic segmantation and 3D keypoint detection. |  
            
                
                    | Services
                        
                        Program Committee/Reviewers: CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, ACMM, ICRA, IROS, TPAMI, IJCV, RAL, Neurocomputing 
                        Teaching Assistant @ HKUST:  COMP 4201 (Spring 2019), COMP 1029 (Fall 2020), COMP 4201 (Spring 2021) |  |