Reward Machines: Structuring Reward Function Specifications and Reducing Sample Complexity in Reinforcement Learning

October 3, 2019
Sheila McIlraith | University of Toronto
Reinforcement Learning Day 2019

Research Area
- Artificial intelligence
Research Lab
- Microsoft Research Lab - New York City
Group
- Reinforcement Learning
Event
- Reinforcement Learning Day 2019

