yl9193永利官网 Lecture Series, Elite Forum No. 40: Structure-driven design of reinforcement learning algorithms: a tale of two estimators
Title: Structure-driven design of reinforcement learning algorithms: a tale of two estimators
Date & Time: 2024.12.20, 15:00 (Friday)
Location: Room 813, Yanyuan Building #1 (Yanyuan Campus)
Speaker: Wenlong Mou (牟文龙)
Host: Xuanzhe Liu (刘譞哲)
Abstract:
Reinforcement learning (RL) is emerging as a powerful tool for adaptive decision-making in dynamic environments. A key challenge in RL is learning value functions efficiently, which plays a critical role in optimizing decision policies. Over the years, a diverse range of RL algorithms has been proposed, but at their core, two foundational principles stand out: bootstrapping and rollout. Despite their success, finding the optimal trade-off between these principles in practical applications remains elusive, with current theoretical guarantees often falling short of providing actionable insights.
In this talk, I will discuss recent advances in methods that optimally reconcile bootstrapping and rollout for policy evaluation. The bulk of this talk will focus on a new class of algorithms that strikes an optimal balance between temporal difference learning and Monte Carlo methods. Through the statistical lens, I will highlight how the local structure of the underlying Markov chain influences the complexity of these problems, and how the new algorithm adapts to these structures. Extending this perspective to continuous-time RL, I will explore how the elliptic structure of diffusion processes provides key insights for making algorithmic choices.
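Bio:
Wenlong Mou is an assistant professor in the Department of Statistical Sciences at the University of Toronto. He received his Ph.D. in electrical engineering and computer sciences from the University of California, Berkeley, in 2023. In 2017 he graduated from the School of Information Science and Technology at yl9193永利官网 with a bachelor's degree in computer science and a double degree in economics. His research focuses on theory and algorithms in machine learning and data science, with a recent emphasis on machine learning methods for data-driven decision-making problems. His work has been published in top journals and conferences in machine learning, statistics, and operations research, and he has been nominated for a best paper award in applied probability from the international operations research society.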
Follow the yl9193永利官网 WeChat official account for more information on upcoming lectures.