Jiamin He

I am a Ph.D. student in Computing Science at the University of Alberta, supervised by Martha White. I completed an internship at Google DeepMind in London, where I worked with Diana Borsa and Hado van Hasselt on foundational reinforcement learning algorithms and their applications in science.

Previously, I received my M.Sc. (Thesis) degree in Computing Science from the University of Alberta under the supervision of Rupam Mahmood. Before that, I worked with Chongjie Zhang as a research intern.

My research interests lie in reinforcement learning, with a focus on policy optimization, off-policy learning, and representation learning.

Email  /  Google Scholar  /  GitHub  /  Blog

Publications

Distributions as Actions: A Unified Framework for Diverse Action Spaces.
Jiamin He, A. Rupam Mahmood, Martha White.
Preliminary version at the Finding the Frame Workshop at RLC, 2025.
ICLR, 2026.
paper | code

Investigating the Utility of Mirror Descent in Off-policy Actor-Critic.
Samuel Neumann, Jiamin He, Adam White, Martha White.
RLC, 2025.
paper | code

Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers.
Gautham Vasan, Mohamed Elsayed, Alireza Azimi*, Jiamin He*, Fahim Shahriar, Colin Bellinger, Martha White, A. Rupam Mahmood.
NeurIPS, 2024.
paper | code

Loosely Consistent Emphatic Temporal-Difference Learning.
Jiamin He, Fengdi Che, Yi Wan, A. Rupam Mahmood.
Preliminary version in the average-reward setting at the Deep RL Workshop at NeurIPS, 2022.
UAI, 2023.
paper | code

Episodic Multi-agent Reinforcement Learning with Curiosity-Driven Exploration.
Lulu Zheng*, Jiarui Chen*, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang.
NeurIPS, 2021.
paper | code


Other Workshop Papers

Revisiting Mixture Policies in Entropy-Regularized Actor-Critic.
Jiamin He, Samuel Neumann, Jincheng Mei, Adam White, Martha White.
Aligning Reinforcement Learning Experimentalists and Theorists Workshop at NeurIPS, 2025.
Extended version under review, 2026.
paper

Improving Reward-Based Hindsight Credit Assignment.
Aditya A. Ramesh, Jiamin He, Jürgen Schmidhuber, Martha White.
European Workshop on Reinforcement Learning, 2025.
paper


Theses

Consistent Emphatic Temporal-Difference Learning.
Jiamin He.
M.Sc. Thesis, University of Alberta, 2023.
details


