Yisheng He (何益升)

Yisheng He is an algorithm expert at Alibaba. He obtained his Ph.D. from HKUST, where he was advised by Prof. Qifeng Chen, Prof. Long Quan, and Dr. Jian Sun.

Email  /  Google Scholar  /  GitHub  /  WeChat

If you are interested in working with me as a research intern, feel free to email your CV.

Research

I'm interested in 3D Computer Vision, AIGC, Embodied AI, and Digital Avatars.

LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Yisheng He*, Xiaodong Gu*, Xiaodan Ye, Chao Xu, Zhengyi Zhao, Yuan Dong, Weihao Yuan, Zilong Dong, Liefeng Bo
SIGGRAPH, 2025
project page / paper / code

LAM creates an animatable Gaussian head from a single image in one forward pass. The head can be reenacted and rendered in real time on various platforms, including mobile phones.

LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
Zhe Li, Weihao Yuan, Yisheng He, Lingteng Qiu, Shenhao Zhu, Xiaodong Gu, Weichao Shen, Yuan Dong, Zilong Dong, Laurence T. Yang
ICLR, 2025
project page / paper / code

LaMP is a language-motion pretraining model that advances text-to-motion generation, motion-text retrieval, and motion captioning through aligned language-motion representation learning.

MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow
Zhe Li, Yisheng He, Zhong Lei, Weichao Shen, Qi Zuo, Lingteng Qiu, Shenhao Zhu, Zilong Dong, Laurence T. Yang, Weihao Yuan
arXiv, 2025
paper

We build a bidirectional control flow between the style and the content for stylized motion generation and enable multimodal style control including text, image, and style motions.

Gaussian-Informed Continuum for Physical Property Identification and Simulation
Junhao Cai, Yuji Yang, Weihao Yuan, Yisheng He, Zilong Dong, Liefeng Bo, Hui Cheng, Qifeng Chen
NeurIPS, 2024 (Oral Presentation)
project page / paper / code

We introduce a hybrid framework that leverages 3D Gaussian representation to advance physical property identification.

MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling
Weihao Yuan*, Yisheng He*, Weichao Shen, Yuan Dong, Xiaodong Gu, Zilong Dong, Liefeng Bo, Qixing Huang
NeurIPS, 2024
paper

We introduce a 2D joint VQVAE that quantizes each joint individually, rather than all joints together, into tokens. We also propose a spatial-temporal modeling framework with temporal-spatial 2D masking and 2D attention for motion generation.

Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition
Yisheng He, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qixing Huang
ECCV, 2024
project page / paper

We enable high-fidelity, transferable neural field editing with intensity control.

Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
Minglin Chen, Longguang Wang, Weihao Yuan, Yukun Wang, Zhe Sheng, Yisheng He, Zilong Dong, Liefeng Bo, Yulan Guo
arXiv, 2024
paper

Our method synthesizes consistent 3D content with fine-grained sketch control.

OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation
Junhao Cai*, Yisheng He*, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qifeng Chen
IEEE Robotics and Automation Letters (RA-L), 2024
project page / paper / code

We introduce a new problem, open-vocabulary 9D object pose and size estimation; a new dataset, OO3D-9D; and a new framework based on vision foundation models to tackle the problem.

Towards Self-Supervised Category-Level Object Pose and Size Estimation
Yisheng He, Haoqiang Fan, Haibin Huang, Qifeng Chen, Jian Sun
arXiv, 2022
project page / paper

A self-supervised framework for category-level object pose and size estimation via differentiable shape deformation, registration, and rendering.

FS6D: Few-Shot 6D Pose Estimation of Novel Objects
Yisheng He, Yao Wang, Haoqiang Fan, Jian Sun, Qifeng Chen
CVPR, 2022
project page / paper / data / code

A new open-set few-shot 6D object pose estimation problem: estimating the 6D pose of an unknown object from a few support views, without CAD models or extra training. A large-scale synthetic dataset for pre-training and benchmarks for future research.

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
Yisheng He, Haibin Huang, Haoqiang Fan, Qifeng Chen, Jian Sun
CVPR, 2021 (Oral Presentation)
project page / paper / code / video (youtube) / video (bilibili)

A generic full flow bidirectional fusion framework for RGBD representation learning, applied to joint instance semantic segmentation and 3D keypoint-based 6D pose estimation.

iShape: A First Step Towards Irregular Shape Instance Segmentation
Lei Yang, Ziwei Yan, Yisheng He, Wei Sun, Zhenhang Huang, Haibin Huang, Haoqiang Fan
arXiv, 2021
project page / paper / code / dataset

A brand new dataset to promote the study of instance segmentation for objects with irregular shapes and an affinity-based algorithm to tackle it.

PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, Jian Sun
CVPR, 2020
project page / paper / code / video (youtube) / video (bilibili)

The first deep learning 3D keypoint-based 6D pose estimation algorithm and an overall framework for joint instance semantic segmentation and 3D keypoint detection.

Academic Challenge
Ranked 2nd in OCRTOC: Open Cloud Robot Table Organization Challenge, 2020
Services

  • Program Committee/Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, ACMM, ICRA, IROS, TPAMI, IJCV, RA-L, Neurocomputing
  • Teaching Assistant @ HKUST: COMP 4201 (Spring 2019), COMP 1029 (Fall 2020), COMP 4201 (Spring 2021)

  • Last updated: March, 2022.

    Thanks to Dr. Jon Barron for sharing the template code.