Matthew Chang

I am a PhD student at the University of Illinois Urbana-Champaign, advised by Saurabh Gupta. I did my undergraduate and Master's degrees at MIT. I study machine vision, robotics, and reinforcement learning. I am interested in developing agents that can interact with the world around them safely and intelligently. My current focus is on combining classical robotic methods with reinforcement learning, and on using human demonstrations and other offline data to improve robotic performance.

Publications

2022
Learning Value Functions from Undirected State-only Experience
Matthew Chang, Arjun Gupta, Saurabh Gupta
International Conference on Learning Representations (ICLR), 2022
Embodied AI Workshop CVPR 2022
Workshop on Offline Reinforcement Learning NeurIPS 2021
Deep Reinforcement Learning Workshop NeurIPS 2021

This paper tackles the problem of learning value functions from undirected state-only experience (state transitions without action labels, i.e., (s, s', r) tuples). We first theoretically characterize the applicability of Q-learning in this setting. We show that tabular Q-learning in discrete Markov decision processes (MDPs) learns the same value function under any arbitrary refinement of the action space. This theoretical result motivates the design of Latent Action Q-learning (LAQ), an offline RL method that can learn effective value functions from state-only experience. LAQ learns value functions using Q-learning on discrete latent actions obtained through a latent-variable future prediction model. We show that LAQ can recover value functions that have high correlation with value functions learned using ground truth actions. Value functions learned using LAQ lead to sample-efficient acquisition of goal-directed behavior, can be used with domain-specific low-level controllers, and facilitate transfer across embodiments. Our experiments in 5 environments, ranging from a 2D grid world to 3D visual navigation in realistic environments, demonstrate the benefits of LAQ over simpler alternatives, imitation learning oracles, and competing methods.
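To make the two-stage pipeline concrete, here is a minimal tabular sketch: pseudo-label state-only transitions with discrete latent actions, then run ordinary Q-learning over those labels. The k-means clustering of transition effects is an illustrative stand-in for the paper's latent-variable future prediction model, and all names and parameters (laq_value_function, n_latent, and so on) are hypothetical. It assumes integer states where the difference s' - s is a reasonable proxy for a transition's effect, as in a flattened grid world.

# Minimal sketch of Latent Action Q-learning (LAQ) in a small discrete MDP.
# The paper's latent-variable future prediction model is replaced here by
# k-means clustering of transition effects; this is an illustrative stand-in,
# not the authors' actual model.
import numpy as np

def laq_value_function(transitions, n_states, n_latent=4,
                       gamma=0.99, lr=0.5, epochs=200, seed=0):
    """transitions: list of (s, s_next, r) tuples with integer states."""
    rng = np.random.default_rng(seed)
    # Step 1: pseudo-label each (s, s') pair with a discrete latent action by
    # clustering transition effects (here, the scalar next-state delta).
    effects = np.array([[s_next - s] for s, s_next, _ in transitions], dtype=float)
    centers = effects[rng.choice(len(effects), n_latent, replace=False)]
    for _ in range(10):  # a few k-means iterations
        labels = np.argmin(np.abs(effects - centers.T), axis=1)
        for z in range(n_latent):
            if np.any(labels == z):
                centers[z] = effects[labels == z].mean()
    # Step 2: ordinary tabular Q-learning, but over the latent actions.
    Q = np.zeros((n_states, n_latent))
    for _ in range(epochs):
        for (s, s_next, r), z in zip(transitions, labels):
            target = r + gamma * Q[s_next].max()
            Q[s, z] += lr * (target - Q[s, z])
    return Q.max(axis=1)  # V(s) = max over latent actions z of Q(s, z)

The theoretical result quoted above is what makes this sensible: tabular Q-learning is insensitive to arbitrary refinements of the action space, so a good-enough latent partition of the transitions can stand in for the true actions.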

@inproceedings{chang2022learning,
  author    = "Chang, Matthew and Gupta, Arjun and Gupta, Saurabh",
  title     = "Learning Value Functions from Undirected State-only Experience",
  booktitle = "International Conference on Learning Representations",
  year      = "2022"
}
2020
Semantic Visual Navigation by Watching YouTube Videos
Matthew Chang, Arjun Gupta, Saurabh Gupta
Neural Information Processing Systems (NeurIPS), 2020

Semantic cues and statistical regularities in real-world environment layouts can improve efficiency for navigation in novel environments. This paper learns and leverages such semantic cues for navigating to objects of interest in novel environments by simply watching YouTube videos. This is challenging because YouTube videos do not come with labels for actions or goals, and may not even showcase optimal behavior. Our method tackles these challenges through the use of Q-learning on pseudo-labeled transition quadruples (image, action, next image, reward). We show that such off-policy Q-learning from passive data is able to learn meaningful semantic cues for navigation. These cues, when used in a hierarchical navigation policy, lead to improved efficiency at the ObjectGoal task in visually realistic simulations. We observe a relative improvement of 15-83% over end-to-end RL, behavior cloning, and classical methods, while using minimal direct interaction.
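As a concrete illustration of the Q-learning step on pseudo-labeled quadruples, here is a minimal PyTorch sketch. It captures the paper's ingredients in spirit only: the inverse model that supplies action pseudo-labels is an untrained placeholder, rewards are assumed to be pseudo-labeled upstream (e.g., by an object detector firing on the goal category) and passed in directly, and all names (q_net, inverse_model, feat_dim, q_learning_step) are hypothetical.

# Minimal sketch of off-policy Q-learning on pseudo-labeled video transitions.
import torch
import torch.nn as nn

n_actions, feat_dim, gamma = 4, 128, 0.99

q_net = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())
# Placeholder inverse model: predicts the action taken from (image, next image).
inverse_model = nn.Linear(2 * feat_dim, n_actions)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def q_learning_step(img_feat, next_img_feat, reward):
    with torch.no_grad():
        # Pseudo-label the action: videos carry no action labels.
        action = inverse_model(torch.cat([img_feat, next_img_feat], -1)).argmax(-1)
        # Standard off-policy bootstrapped target.
        target = reward + gamma * target_net(next_img_feat).max(-1).values
    q = q_net(img_feat).gather(1, action.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

# Example call on a batch of precomputed image features and detector rewards.
loss = q_learning_step(torch.randn(32, feat_dim), torch.randn(32, feat_dim),
                       torch.rand(32))

In the actual method the inverse model would be trained separately (e.g., on a small amount of interaction data) before being used to pseudo-label the passive video transitions; here it is left untrained purely to keep the sketch self-contained.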

@inproceedings{chang2020semantic,
  author    = "Chang, Matthew and Gupta, Arjun and Gupta, Saurabh",
  title     = "Semantic Visual Navigation by Watching YouTube Videos",
  booktitle = "Advances in Neural Information Processing Systems",
  year      = "2020"
}