Saliency models are a type of computer vision model designed to predict which elements of an image are most attention grabbing. Saliency models can be used in a wide variety of applications, including automatic image cropping, design feedback for user interfaces or graphical content, research into human visual attention, and pre-processing for a number of other computer vision applications.
Over the past couple of decades, a large number of saliency models have been developed, from early models which explicitly encode rules to compare image features such as intensity, color, and line orientation, to more recent models which use deep learning to train a network for saliency map prediction. With so many models available, it can be difficult to decide which one might best suit a particular task. Likewise, each model is typically implemented by an independent research group with a different set of parameters, function arguments, and assumptions about inputs and outputs, which can make it difficult and time consuming to run a fair comparison of methods. In order to better support research efforts exploring the development and use of saliency models, we developed the Saliency Model Implementation Library for Experimental Research (SMILER).
SMILER is a comprehensive tool which wraps saliency models into a common API. It is built using Python and MATLAB, and uses Docker to support models which have been implemented in a variety of additional languages and formats. Since its release in 2018, SMILER has been used in a variety of projects, including studies of zebra stripes, aerial tree detection, and saliency model benchmarking over novel datasets. Now, however, SMILER is in need of an update.
This project has three primary goals:
1.) When SMILER was first designed, Docker did not support GPU computation, and so SMILER was designed around nvidia-docker, a toolbox which extended Docker functionality to GPU computing. Docker now directly supports GPU computation, and nvidia-docker has been deprecated. In order to run easily on modern systems, therefore, SMILER needs to be updated to replace its reliance on nvidia-docker with standard calls to Docker. Students will get a chance to work with an open-source code library and gain significant practice with Docker.
2.) Saliency models continue to be built and released, and there are a number of recent models which are not yet included in SMILER. Students will get a chance to understand and test modern computer vision models such that they can be accurately wrapped into SMILER's API.
3.) In order to ensure that SMILER can continue to be of use to the research community, it needs better documentation on how to maintain and update its design. Students will get experience writing technical documentation designed to engage with a highly interdisciplinary research community which includes computer scientists, engineers, neuroscientists, and psychologists.
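To give a feel for the first goal, here is a minimal sketch of the kind of change involved: turning a legacy `nvidia-docker run ...` invocation into a standard `docker run --gpus ...` call. It is an illustration only and is not taken from SMILER's code base; the helper name `modernize_docker_command` and the image name `example/saliency-model:latest` are hypothetical, and the way SMILER actually launches its containers may differ.

```python
from typing import List


def modernize_docker_command(argv: List[str], gpus: str = "all") -> List[str]:
    """Rewrite a legacy `nvidia-docker ...` argument list for plain `docker`.

    Hypothetical helper for illustration; not part of SMILER.
    """
    # Leave anything that is not an nvidia-docker call untouched.
    if not argv or argv[0] != "nvidia-docker":
        return list(argv)

    new_argv = ["docker"] + list(argv[1:])

    # Only `docker run` needs the GPU flag, and only if it is not already set.
    is_run = len(new_argv) > 1 and new_argv[1] == "run"
    has_gpus = any(a == "--gpus" or a.startswith("--gpus=") for a in new_argv)
    if is_run and not has_gpus:
        new_argv[2:2] = ["--gpus", gpus]

    return new_argv


# Hypothetical legacy invocation of a containerized saliency model.
legacy = ["nvidia-docker", "run", "--rm", "example/saliency-model:latest"]
print(" ".join(modernize_docker_command(legacy)))
# prints: docker run --gpus all --rm example/saliency-model:latest
```

The `--gpus` option is the GPU flag built into `docker run` in current Docker releases; on the host it still relies on the NVIDIA container runtime components being installed.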
Essay prompt (address the following):
- What interests you in the project?
- What experience do you have with software design or computer vision? (It’s okay if the answer is none!)
- What do you hope to get out of this research experience?
This is an opportunity to gain an introduction to computer vision and saliency modeling while contributing to software development which supports ongoing scientific research around the world.