Jiamin He
I am a Ph.D. student in Computing Science at the University of Alberta, supervised by Martha White. Currently, I'm interning at Google DeepMind in London, working with Diana Borsa and Hado van Hasselt on foundational reinforcement learning algorithms and their applications in science.
Previously, I received my M.Sc. (Thesis) degree in Computing Science from the University of Alberta under the supervision of Rupam Mahmood. Before that, I worked with Chongjie Zhang as a research intern.
My research interests lie in reinforcement learning, with a focus on policy optimization, off-policy learning, and representation learning.
Email / Google Scholar / GitHub / Blog
|
|
|
Investigating the Utility of Mirror Descent in Off-policy Actor-Critic.
Samuel Neumann, Jiamin He, Adam White, Martha White.
Reinforcement Learning Conference (RLC), 2025.
paper | code
|
|
Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers.
Gautham Vasan, Mohamed Elsayed, Alireza Azimi*, Jiamin He*, Fahim Shahriar, Colin Bellinger, Martha White, A. Rupam Mahmood.
Conference on Neural Information Processing Systems (NeurIPS), 2024.
paper | code
|
|
Loosely Consistent Emphatic Temporal-Difference Learning.
Jiamin He, Fengdi Che, Wan Yi, A. Rupam Mahmood.
Conference on Uncertainty in Artificial Intelligence (UAI), 2023.
paper | code
|
|
Episodic Multi-agent Reinforcement Learning with Curiosity-Driven Exploration.
Lulu Zheng*, Jiarui Chen*, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang.
Conference on Neural Information Processing Systems (NeurIPS), 2021.
paper | code
|
|
Revisiting Mixture Policies in Entropy-Regularized Actor-Critic.
Jiamin He, Samuel Neumann, Jincheng Mei, Adam White, Martha White.
Aligning Reinforcement Learning Experimentalists and Theorists Workshop at NeurIPS, 2025.
Extended version under review, 2026.
paper
|
|
Improving Reward-Based Hindsight Credit Assignment.
Aditya A. Ramesh, Jiamin He, Jürgen Schmidhuber, Martha White.
European Workshop on Reinforcement Learning, 2025.
paper
|
|
Distribution Parameter Actor-Critic: Shifting the Agent-Environment Boundary for Diverse Action Spaces.
Jiamin He, A. Rupam Mahmood, Martha White.
Finding the Frame Workshop at RLC, 2025.
Extended version under review, 2026.
paper
|
|
The Emphatic Approach to Average-Reward Policy Evaluation.
Jiamin He, Wan Yi, A. Rupam Mahmood.
Deep Reinforcement Learning Workshop at NeurIPS, 2022.
paper
|
|
Consistent Emphatic Temporal-Difference Learning.
Jiamin He.
M.Sc. Thesis, University of Alberta, 2023.
details
|
|