Jiaming Ji (吉嘉铭)

PhD Student at Peking University

AI Alignment
AI Safety
Large Models

Email: jiamg.ji at gmail dot com

[Google Scholar][GitHub]

About me

I'm a PhD student at the Institute of Artificial Intelligence, Peking University, advised by Prof. Yaodong Yang (both a good teacher and a helpful friend in my life). In parallel, I am a visiting scholar at the Hong Kong University of Science and Technology, advised by renowned computer scientist Prof. Yike Guo. My research focuses on reinforcement learning, large language models, multimodal models, and safety alignment, with a strong emphasis on bridging academic advances and real-world deployment. I have contributed to the open-sourcing and real-world deployment of several large-scale models, including Baichuan2, the Hong Kong AI Model HKGAI-v1, the Pengcheng Brain model, and the medical triage model MedGuide. Notably, MedGuide has been deployed in hospitals and is actively supporting doctors and nurses in emergency triage, something I take great pride in beyond my academic achievements.

In 2025, I was honored to be selected as an Apple Scholar in AI/ML, mentored by Rin Metcalf Susa and Natalie Mackraz. In 2024, I was among the first recipients of National Natural Science Foundation funding under the Youth Student Basic Research Project (PhD track), as the sole awardee from Peking University in the field of intelligence. Prior to my PhD, I conducted research on neuromorphic computing and brain-computer interfaces with Prof. Gang Pan at Zhejiang University. I began my research journey with safe reinforcement learning and won the championship in the NeurIPS 2022 MyoChallenge for robotic dexterous manipulation.

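About me (translated from Chinese)

Jiaming Ji (吉嘉铭) is a PhD student at the Institute of Artificial Intelligence, Peking University, advised by Prof. Yaodong Yang. His research focuses on reinforcement learning and the safety and value alignment of large models. He has published more than ten papers, including oral and spotlight papers, at top computer science conferences and journals. His work has received over 2,200 citations on Google Scholar, his open-source models have been downloaded 5 million times in total, and his open-source GitHub projects have earned more than 20,000 stars. He was funded in the first batch of the National Natural Science Foundation of China youth program for PhD students (the sole awardee in the intelligence discipline at Peking University for 2023) and was named an Apple Scholar (one of only two in China). He also received the President's Scholarship, Peking University's highest research award for PhD students, and was selected for the inaugural Chinese Institute of Electronics-Tencent Doctoral Research Incentive Program (17 recipients nationwide). He won the NeurIPS 2022 robotic dexterous manipulation competition. His research and models have been cited by OpenAI and Meta and covered by MIT Tech Review.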

News


Research Summary

Currently, I focus on AI safety and alignment.

Honors

Preprints

SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Safe Reinforcement Learning
Borong Zhang*, Yuhao Zhang*, Jiaming Ji*, Yingshan Lei, Josef Dai, Yuanpei Chen, Yaodong Yang
arXiv, 2025
[Project Webpage]
Robotics and VLA, Safety and Alignment
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback
Jiaming Ji*, Jiayi Zhou*, Hantao Lou, Boyuan Chen, Donghai Hong, Xuyao Wang, ..., Yaodong Yang
arXiv, 2025
[Code][Data]
Multimodal Alignment, Omni-Models
AI Alignment: A Comprehensive Survey
Jiaming Ji*, Tianyi Qiu*, Boyuan Chen*, Borong Zhang*, Hantao Lou, Kaile Wang, ..., Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao
arXiv, 2024
[Project Webpage]
Alignment, Survey

Publications (* denotes equal contribution)

Language Models Resist Alignment: Evidence From Data Compression
Jiaming Ji*, Kaile Wang*, Tianyi Qiu*, Boyuan Chen*, Jiayi Zhou*, Changye Li, Hantao Lou, Juntao Dai, Yunhuai Liu, Yaodong Yang
ACL Best Paper, 2025
[Paper]
AI Alignment, AI Safety, Reinforcement Learning
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
Jiaming Ji*, Donghai Hong*, Borong Zhang, Boyuan Chen, Josef Dai, Boren Zheng, Tianyi Qiu, Boxun Li, Yaodong Yang
ACL Main, 2025
[Paper][Data]
AI Alignment, AI Safety, Reinforcement Learning
Reward Generalization in RLHF: A Topological Perspective
Tianyi Qiu*, Fanzhi Zeng*, Jiaming Ji*, Dong Yan*, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, Yaodong Yang
ACL Findings, 2025
[Paper]
Reinforcement Learning, Large Language Models
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction
Hantao Lou, Jiaming Ji, Kaile Wang, Yaodong Yang
AAAI, 2025
[Paper]
Large Language Models, AI Alignment
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback
Jiayi Zhou*, Jiaming Ji*, Juntao Dai, Yaodong Yang
AAAI Oral, 2025
[Paper]
Large Language Models, Reinforcement Learning
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
Jiaming Ji*, Jiayi Zhou*, Borong Zhang*, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang
JMLR, 2024
Reinforcement Learning, Robotics and VLA
Aligner: Efficient Alignment by Learning to Correct
Jiaming Ji*, Boyuan Chen*, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang
NeurIPS Oral, 2024
[Code][Data]
Large Language Models, AI Safety, Reinforcement Learning
ProgressGym: Alignment with a Millennium of Moral Progress
Tianyi Qiu*, Yang Zhang*, Xuchuan Huang, Jasmine Xinze Li, Jiaming Ji, Yaodong Yang
NeurIPS Spotlight, 2024
Large Language Models, AI Safety, AI Alignment
Safe RLHF: Safe Reinforcement Learning from Human Feedback
Josef Dai*, Xuehai Pan*, Ruiyang Sun*, Jiaming Ji*, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang
ICLR Spotlight, 2024
[Code]
Large Language Models, AI Safety, AI Alignment
SafeDreamer: Safe Reinforcement Learning with World Models
Weidong Huang*, Jiaming Ji*, Chunhe Xia*, Borong Zhang, Yaodong Yang
ICLR, 2024
[Code]
Reinforcement Learning, Robotics and VLA
Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
Jiaming Ji*, Borong Zhang*, Jiayi Zhou*, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, Yaodong Yang
NeurIPS, 2023
[Paper][Code]
Reinforcement Learning, Robotics and VLA
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
Jiaming Ji*, Mickel Liu*, Juntao Dai*, Xuehai Pan, Chi Zhang, Ce Bian, Chi Zhang, Ruiyang Sun, Yizhou Wang, Yaodong Yang
NeurIPS, 2023
[Paper][Code][Data]
Large Language Models, AI Safety, AI Alignment
Baichuan 2: Open Large-scale Language Models
Jiaming Ji, Other Authors (Alphabetic Order)
arXiv (Technical Report), 2023
[Code]
Large Language Models
Constrained Update Projection Approach to Safe Policy Optimization
Long Yang*, Jiaming Ji*, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan
NeurIPS, 2022
[Paper][Code]
Reinforcement Learning, Robotics and VLA

Services

Teaching Assistant