Publication
Microsoft Research Blog
A Deep Learning Theory: Global minima and over-parameterization
One empirical finding in deep learning is that simple methods such as stochastic gradient descent (SGD) have a remarkable ability to fit training data. From a capacity perspective, this may not be surprising: modern neural…
Video
Deep Generative Models for Imitation Learning and Fairness
In the first part of the talk, I will introduce Multi-agent Generative Adversarial Imitation Learning, a new framework for multi-agent imitation learning for general Markov games, where we build upon a generalized notion of inverse…