Computer and Human Vision Research and Development

Here in the Laboratory for Cognition and Attention in Time and Space (Lab for CATS), we study visual perception with a focus on artificial visual agents, but often draw inspiration from and compare to human vision. For Summer 2023, I am seeking applicants for two projects.

1. Attention and Feature Representation in Spatiotemporal Networks

Over the past decade, deep learning has become the predominant approach to computer vision problems. While deep neural networks have demonstrated highly impressive results in many areas, they still sometimes exhibit surprising brittleness. Often this brittleness appears to result from networks relying on unintended patterns and correlations in the training data, or optimizing over easier-to-learn but less reliable parts of the visual signal (e.g. Geirhos et al. (2019) demonstrated that deep networks tend to rely more heavily on local image texture than on more global object shape). Network behaviour in spatiotemporal processing, however, is less well explored. Our work uses techniques developed to visualize which visual information is driving a deep network’s behaviour (e.g. Grad-CAM (Selvaraju et al., 2020)) and compares the resulting activity of action recognition networks to human eye tracking data on the same videos. In this way, we can identify and quantify how deep networks operating on videos rely on visual information, and begin to develop novel training and data augmentation techniques to guide deep neural networks to more robust representations.
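To give applicants a concrete feel for this kind of comparison, here is a minimal sketch of scoring a network attention map (such as a Grad-CAM heatmap for one video frame) against human eye tracking data. It is only an illustration under stated assumptions: the function names are hypothetical, the two measures shown (linear correlation and normalized scanpath saliency) are common choices in the saliency literature rather than necessarily the ones used in the lab's code base, and the maps are assumed to be 2D arrays already resized to the same resolution.

```python
import numpy as np


def correlation_coefficient(model_map, fixation_density):
    """Pearson correlation between two same-shaped 2D maps.

    model_map: network attention map for one frame (hypothetical input).
    fixation_density: smoothed map of human fixations for the same frame.
    """
    a = np.asarray(model_map, dtype=float).ravel()
    b = np.asarray(fixation_density, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        # A constant map carries no spatial information to correlate.
        return 0.0
    return float(a @ b / denom)


def normalized_scanpath_saliency(model_map, fixation_points):
    """Mean z-scored map value at fixated (row, col) pixel locations."""
    m = np.asarray(model_map, dtype=float)
    std = m.std()
    if std == 0 or len(fixation_points) == 0:
        return 0.0
    z = (m - m.mean()) / std
    rows, cols = zip(*fixation_points)
    return float(z[list(rows), list(cols)].mean())


# Toy example: the network attends to one pixel, and the human fixates it.
toy_map = np.zeros((4, 4))
toy_map[1, 2] = 1.0
agreement = normalized_scanpath_saliency(toy_map, [(1, 2)])
```

A positive normalized scanpath saliency means the network's attention is above its own average at the locations people looked at; a value near zero or below suggests the network is relying on different parts of the frame.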

This is an ongoing project with an active code base under development, and multiple research threads being pursued. Students hired to work on this project will continue to develop the code base by adding network models and network activity visualization techniques for comparison with the ones currently implemented. Additionally, students will be involved in developing analyses and experiments to investigate and manipulate network behaviour, and in designing and testing novel training augmentations or network architectures based on our findings.

2. Computer Vision Model and Tool Exploration

Diffusion models have upended the landscape of computer-generated images, combining a powerful linguistic understanding engine with state-of-the-art generative models to achieve previously unattainable results in automated image production. Given the newness of these models, however, there are many unanswered questions. This project will involve setting up a code base of diffusion models, followed by exploratory research over this space. This is fairly open-ended; tentative plans include experimenting with forms of input beyond text prompts, developing extensions or open source implementations of human-in-the-loop tools, or probing behavioural patterns across different diffusion models.

Applying to Work in the Lab for CATS:

In your essay, please address the following questions:

  1. Which project(s) are you interested in working on?
  2. For each project mentioned in (1.), write two to three sentences on why you are interested in that specific project (e.g. personal interest or alignment with future career goals).
  3. For each project mentioned in (1.), please also provide a few sentences describing any relevant experience you have had that will assist with getting started on that project (this can be relevant course work, past research or professional experience, or other things you think are relevant).

A Note on Research Goals: While scholarly publication is typically one of the primary goals of research, it is important to note that the first topic (spatiotemporal representations) has a more explicit focus on publication. The second project is highly preliminary, and is more focused on exploring a new space out of interest.

Name of research group, project, or lab
Lab for Cognition and Attention in Time and Space (CATS)
Why join this research group or lab?

The Lab for CATS seeks to understand visual cognition, and help build more robust and unbiased artificial visual agents. There is a lot of hype in the world of computer vision and machine learning, and we seek to keep a grounded focus on fair and realistic evaluations of model behaviour with the goal of identifying when common benchmarking and evaluation practices might result in unanticipated deficits in novel or unconstrained environments.

If you want to get a better sense of the lab culture and what working in the Lab for CATS might be like, I encourage you to speak to my current and former students! I'd be happy to facilitate contact.

Logistics Information:
Project categories
Computer Science
Artificial Intelligence
Computer Vision
Machine Learning
Student ranks applicable
Sophomore
Junior
Senior
Student qualifications

For both projects, experience with git and Python will be essential. Experience with deep learning programming (and PyTorch specifically) would be a huge asset. Familiarity with the topics of computer vision and/or deep learning would be very helpful.

Time commitment
Summer - Full Time
Compensation
Paid Research
Number of openings
4
Contact Information:
Mentor
Calden Wloka
cwloka@hmc.edu
Principal Investigator
Name of project director or principal investigator
Calden Wloka
Email address of project director or principal investigator
cwloka@hmc.edu