Currently, I focus on AI Safety and Alignment.
-
AI Alignment: Given the biases and discrimination that may exist in pre-training data, large models (LMs) may exhibit unintended behaviors. I am interested in alignment methods (e.g., Reinforcement Learning from Human Feedback (RLHF)) and post-hoc alignment methods to ensure the safety and trustworthiness of LLMs.
-
Theoretical Explanations and Mechanism Design for Alignment: Aligning AI systems (e.g., LLMs) effectively to ensure consistency with human intentions and values (though some views question whether universal values exist) is a significant current challenge. I am particularly interested in establishing the feasibility of these alignment methods through both theoretical analysis and practical mechanism design.
-
Applications (LM + X): I am interested in the application of large models in various domains, such as healthcare and education, and in the potential impact of the rapid industry development and iteration brought about by large models.