AMISTAD Lab: Understanding Artificial Learning from First Principles

What powers machine learning? The AMISTAD Lab opens the black box to understand how machine learning works as a form of search, governed by information-theoretic and statistical constraints. We prove formal results in learning and search. We will explore how search provides a unifying concept for machine learning and how information resources and dependence structures can be leveraged to move beyond memorization to true generalization, and we will probe the formal limits of learning processes. More information about our lab can be found here, as well as a brief news item about some of our recent publications.

In more detail, you will learn how to deconstruct and build learning algorithms from first principles. The sub-projects in the lab vary from year to year, but usually involve using proof methods and mathematics to say something about learning processes (or, more generally, about stochastic search processes). For this summer, we will likely have a project on formulating a probabilistic basis for abductive reasoning (e.g., "Why Abduction Works"), which continues a project we began last summer that resulted in a publication in ICAART 2021. We may also have projects relating the information storage capacity of learning algorithms and models to their prediction behavior and biases. Potentially, we could have a project exploring deterministic finite automata (DFAs) in relation to learning and compression, if we have students who have taken CS 81 and are interested in the intersection of theoretical computation and artificial learning.

Name of research group, project, or lab
Why join this research group or lab?

Aside from the rush of probing the truths of the universe, there are pragmatic reasons to join AMISTAD. Our lab has had a number of successes, including publishing 7 papers in peer-reviewed conferences in 2020, a summer research paper being awarded the ICAART 2020 Best Paper award (the first time an undergraduate-only institution has won any award in that conference's twelve-year history), and having 3 papers already accepted at conferences in 2021. If you're serious about going to graduate school and want publications on your CV, there is a high probability you can get one or more by joining our lab.

I work with a lot of students (18 this semester!), with many students remaining in the lab for many semesters. It is often said that people vote with their feet, so the large number of students I work with and retain is probably a strong signal that students find the lab a nurturing, fulfilling, and exciting place to be. Talk with students in my lab, and you can find out what they have to say about their experiences.

Logistics Information:
Project categories
Computer Science
Artificial Intelligence
Data Science
Machine Learning
Numerical Modeling
Student ranks applicable
Student qualifications

The ability to explore open-ended (and sometimes loosely defined!) problems, the autonomy to formulate questions and seek answers, and the perseverance to not give up when you hit roadblocks. Strong math ability is a plus (discrete math, basic linear algebra, calculus, probability theory), but students with coding ability can also find their way in the lab. We adapt projects to student abilities, so don't let math phobia steer you away from the lab. You can be successful as long as you are curious, hard-working, and coachable, regardless of your background.

Time commitment
Summer - Full Time
Paid Research
Number of openings
Techniques learned

Some subset of: proof techniques, reading primary literature, technical writing, experimental design, data analysis, and data visualization.

Contact Information:
Mentor name
George Montanez
Mentor email
Mentor position
Research Advisor (Lab Director)
Name of project director or principal investigator
George Montanez
Email address of project director or principal investigator
6 sp. | 63 appl.
Hours per week
Summer - Full Time
Project categories
Computer Science
Mathematics
Artificial Intelligence
Data Science
Machine Learning
Numerical Modeling