Using Imperfect Predictions to Make Good Decisions

Imagine learning to play a new video game. A natural approach would be to learn to make predictions about how the video game will respond to your actions (model learning) and to use those predictions to make decisions (planning). Model-based reinforcement learning (MBRL) is an analogous approach for artificial learning agents, where agents use their experience to create a predictive model of their environment and then use that model for planning purposes. Unfortunately, model-based learning has not been as successful in artificial agents as it is in natural ones! One major reason for this is that even tiny errors in the model can lead to catastrophic planning errors. In this project we will study this problem and potential remedies that may make MBRL more robust. We will work toward applying these ideas in the challenging domain of Atari 2600 games. Specific projects will be informed by the mutual interests and experience of student and mentor, but will likely center on questions about how to measure and represent model error/uncertainty, how to robustly make decisions using a flawed, uncertain model, and/or how to scale up MBRL techniques to complex, high-dimensional problems.

Essay Prompts:

  • What makes you interested in this particular project?
  • How do you see independent research fitting in with your broader interests/plans?
  • What prior experience do you have with machine learning techniques or ideas?
  • What questions do you have about this project or the representative publication?
Name of research group, project, or lab
L.A.C.E. Lab
Why join this research group or lab?

Learning Agents in Complex Environments (L.A.C.E.) Lab seeks to understand how artificial agents can be designed to flexibly learn to behave competently in a wide variety of complex, high-dimensional environments.

One reason to study this is that programs with this capability would be super useful! Imagine digital assistants that learn to satisfy your particular needs and preferences, "cognitive orthotics" that learn to support people with cognitive disabilities in remembering tasks and maintaining routines, or cleaning robots that learn how to tidy and clean your particular house. Flexibility and robust learning are key to many of the short- and long-term ambitions of artificial intelligence.

Another reason is that it is just so fundamentally weird that humans and other animals can just walk around in a world as complicated as this one and somehow get things done! By studying the problem of creating computational artifacts that can learn and behave flexibly, we may learn about the fundamental computational challenges that need to be overcome in order to make life in a complicated world possible.

Our work will of course not reach these grand ambitions in the short term, but every journey starts with one step...

Representative publication
Logistics Information:
Project categories
Computer Science
Artificial Intelligence
Student ranks applicable
Student qualifications

To make the most out of our collaboration, student researchers should

  • be comfortable, confident C++ programmers, most commonly demonstrated through success in CS 70 or beyond,
  • have significant experience with and interest in the fundamentals of supervised learning and/or reinforcement learning, most commonly demonstrated by success in a course that covers at least one of these subjects (though experience outside of formal coursework is also valued), and
  • have experience or strong interest in pursuing open-ended problems that require creativity, tenacity, and careful, systematic problem-solving. 
Time commitment
Summer - Full Time
Paid Research
Number of openings
Techniques learned

Students working on this project can expect to gain experience with the basic research process (posing questions, studying related literature, designing algorithms and experiments, interpreting results, and communicating findings). Students will also gain understanding of and experience with algorithms and methodologies for supervised learning and reinforcement learning.

Contact Information:
Mentor name
Erin Talvitie
Mentor email
Mentor position
Associate Professor of Computer Science
Name of project director or principal investigator
Erin Talvitie
Email address of project director or principal investigator