I’m a Ph.D. student at the Institute of Artificial Intelligence, Peking University, advised by Prof. Yaodong Yang (both a good teacher and a helpful friend in my life).
I am also a visiting scholar at the Hong Kong University of Science and Technology, advised by renowned computer scientist Prof. Yike Guo.
My research focuses on Reinforcement Learning, Large Language Models, Multimodal Models and Safety Alignment, with a strong emphasis on bridging academic advances and real-world deployment. I
have contributed to the open-source and real-world deployment of several large-scale models, including
Baichuan2, the Hong Kong AI Model HKGAI-v1, the Pengcheng Brain model, and the medical triage model
MedGuide. Notably, MedGuide has been deployed in hospitals, where it actively supports doctors and nurses
in emergency triage, something I take great pride in beyond my academic achievements.
In 2025, I was honored to be selected as an Apple Scholar in AI/ML, mentored by Rin Metcalf Susa and
Natalie Mackraz. In 2024, I was among the first batch of recipients of National Natural Science Foundation
funding for the Youth Student Basic Research Project (Ph.D. track), as the sole awardee from Peking
University in the field of intelligence.
Prior to my Ph.D., I conducted research on neuromorphic computing and brain-computer interfaces with
Prof. Gang Pan at Zhejiang University. I began my research journey focusing on safe reinforcement
learning and won the championship in the NeurIPS 2022 MyoChallenge for robotic dexterous
manipulation.
AI Alignment: Because pre-training data may contain biases and discrimination, large models
(LMs) may exhibit unintended behaviors. I am interested in alignment methods (e.g.,
Reinforcement Learning from Human Feedback (RLHF)) and post-hoc alignment methods to ensure the
safety and trustworthiness of LLMs.
Theoretical Explanations and Mechanism Design for Alignment:
Effectively aligning AI systems (e.g., LLMs) so that they remain consistent with human
intentions and values (though some views may question universal values) is a significant
current challenge. I am particularly interested in ensuring that these alignment methods
are feasible in both theoretical and practical terms.
Applications (LM + X):
I am interested in applying large models in various domains, such as healthcare and
education, and in the potential impact of the rapid industry development and iteration
that large models bring about.
Honors
2025-03
Apple Scholars in AI/ML. One of only two recipients nationwide.
2024-12
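CIE-Tencent Doctoral Research Incentive Project (inaugural round of the Chinese Institute of Electronics and Tencent program). One of 17 recipients nationwide, with a research grant of 100,000 RMB.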
2024-05
Peking University President Scholarship, the highest doctoral research honor.
2024-05
National Natural Science Foundation Youth Student Basic Research Project for Ph.D. students
(first batch; the sole recipient in Peking University's intelligence discipline).