Yisheng He (何益升)
Yisheng He is an algorithm expert at Alibaba. He obtained his Ph.D. from HKUST,
advised by Prof. Qifeng Chen, Prof. Long Quan, and Dr. Jian Sun.
Email / Google Scholar / GitHub / WeChat
If you are interested in working with me as a research intern, feel free to email your CV.
Research
I'm interested in 3D Computer Vision, AIGC, Embodied AI, and Digital Avatars.
LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Yisheng He*, Xiaodong Gu*, Xiaodan Ye, Chao Xu, Zhengyi Zhao, Yuan Dong, Weihao Yuan, Zilong Dong, Liefeng Bo
SIGGRAPH, 2025
project page /
paper /
code
LAM creates animatable Gaussian heads with one-shot images in a single forward pass, which can be reenacted and rendered on various platforms (including mobile phones) in real time.
LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
Zhe Li, Weihao Yuan, Yisheng He, Lingteng Qiu, Shenhao Zhu, Xiaodong Gu, Weichao Shen, Yuan Dong, Zilong Dong, Laurence T. Yang
ICLR, 2025
project page /
paper /
code
LaMP is a language-motion pretraining model that advances text-to-motion generation, motion-text retrieval, and motion captioning through aligned language-motion representation learning.
MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow
Zhe Li, Yisheng He, Zhong Lei, Weichao Shen, Qi Zuo, Lingteng Qiu, Shenhao Zhu, Zilong Dong, Laurence T. Yang, Weihao Yuan
arXiv, 2025
paper
We build a bidirectional control flow between the style and the content for stylized motion generation and enable multimodal style control including text, image, and style motions.
Gaussian-Informed Continuum for Physical Property Identification and Simulation
Junhao Cai, Yuji Yang, Weihao Yuan, Yisheng He, Zilong Dong, Liefeng Bo, Hui Cheng, Qifeng Chen
NeurIPS, 2024 (Oral Presentation)
project page /
paper /
code
We introduce a hybrid framework that leverages 3D Gaussian representation to advance physical property identification.
MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling
Weihao Yuan*, Yisheng He*, Weichao Shen, Yuan Dong, Xiaodong Gu, Zilong Dong, Liefeng Bo, Qixing Huang
NeurIPS, 2024
paper
We introduce a 2D joint VQVAE that quantizes each joint, rather than all joints together, into tokens. We also propose a spatial-temporal modeling framework with temporal-spatial 2D masking and 2D attention for motion generation.
Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition
Yisheng He, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qixing Huang
ECCV, 2024
project page /
paper
We enable high-fidelity, transferable neural field editing with intensity control.
Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
Minglin Chen, Longguang Wang, Weihao Yuan, Yukun Wang, Zhe Sheng, Yisheng He, Zilong Dong, Liefeng Bo, Yulan Guo
arXiv, 2024
paper
Our method synthesizes consistent 3D content with fine-grained sketch control.
OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation
Junhao Cai*, Yisheng He*, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qifeng Chen
IEEE Robotics and Automation Letters (RA-L), 2024
project page /
paper /
code
We introduce a new problem, open-vocabulary 9D object pose and size estimation; a new dataset, OO3D-9D; and a new framework based on a vision foundation model to tackle this problem.
Towards Self-Supervised Category-Level Object Pose and Size Estimation
Yisheng He, Haoqiang Fan, Haibin Huang, Qifeng Chen, Jian Sun
arXiv, 2022
project page /
paper
A self-supervised framework for category-level object pose and
size estimation via differentiable shape deformation, registration, and rendering.
FS6D: Few-Shot 6D Pose Estimation of Novel Objects
Yisheng He, Yao Wang, Haoqiang Fan, Jian Sun, Qifeng Chen
CVPR, 2022
project page /
paper /
data /
code
A new open-set few-shot 6D object pose estimation problem:
estimating the 6D pose of an unknown object from a few support views, without CAD models or extra training.
A large-scale synthetic dataset for pre-training and benchmarks for future research.
FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
Yisheng He, Haibin Huang, Haoqiang Fan, Qifeng Chen, Jian Sun
CVPR, 2021 (Oral Presentation)
project page /
paper /
code /
video (youtube) /
video (bilibili)
A generic full flow bidirectional fusion framework for RGBD representation learning,
applied to joint instance semantic segmentation and 3D keypoint-based 6D pose estimation.
iShape: A First Step Towards Irregular Shape Instance Segmentation
Lei Yang, Ziwei Yan, Yisheng He, Wei Sun, Zhenhang Huang, Haibin Huang, Haoqiang Fan
arXiv, 2021
project page /
paper /
code /
dataset
A brand-new dataset to promote the study of instance segmentation for objects with irregular shapes, and
an affinity-based algorithm to tackle it.
PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, Jian Sun
CVPR, 2020
project page /
paper /
code /
video (youtube) /
video (bilibili)
The first deep learning 3D keypoint-based 6D pose estimation algorithm and an overall framework for joint instance semantic segmentation and 3D keypoint detection.
Services
Program Committee Member/Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, ACM MM, ICRA, IROS, TPAMI, IJCV, RA-L, Neurocomputing
Teaching Assistant @ HKUST: COMP 4201 (Spring 2019), COMP 1029 (Fall 2020), COMP 4201 (Spring 2021)