Large-scale systems of linear equations arise in many areas of data science, including in machine learning and as subroutines of several optimization methods. When a system is too large to be read into working memory in its entirety, iterative methods that use only a small portion of the data in each iteration are typically employed. These methods can offer a small memory footprint and good convergence guarantees. Kaczmarz methods, a classical example of this type of method, consist of sequential orthogonal projections onto the solution set of a single equation (or subsystem). There are many variants within this family of methods, often using randomized or greedy strategies to select the row (or subsystem) used in each iteration.
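To make the projection step concrete, here is a minimal sketch of one randomized variant in plain Python. It assumes a consistent system and samples each row with probability proportional to its squared norm, which is one common randomized selection rule among many. The names `randomized_kaczmarz`, `num_iters`, and `seed`, as well as the small test system, are illustrative choices and not part of the project itself.

```python
import random


def randomized_kaczmarz(A, b, num_iters=2000, seed=0):
    """Approximate a solution of a consistent linear system A x = b.

    A is a list of rows (lists of floats) and b is a list of floats.
    Each iteration touches only one row of A and one entry of b.
    """
    rng = random.Random(seed)
    n = len(A[0])
    x = [0.0] * n  # start from the zero vector

    # Squared Euclidean norm of each row, used both as sampling weights
    # and to normalize the projection step.
    sq_norms = [sum(a * a for a in row) for row in A]
    row_indices = range(len(A))

    for _ in range(num_iters):
        # Pick row i with probability proportional to its squared norm.
        i = rng.choices(row_indices, weights=sq_norms)[0]
        row = A[i]
        residual = b[i] - sum(a * xj for a, xj in zip(row, x))
        step = residual / sq_norms[i]
        # Orthogonal projection of x onto the hyperplane {z : <row, z> = b[i]}.
        x = [xj + step * a for xj, a in zip(x, row)]
    return x


# Small consistent example: three equations, two unknowns.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, 2.0]
b = [sum(a * xj for a, xj in zip(row, x_true)) for row in A]
x_hat = randomized_kaczmarz(A, b)
```

On this small, well-conditioned system the iterate `x_hat` should agree closely with `x_true`. A greedy variant would replace the random choice of `i` with, for example, the row that has the largest residual.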
Kaczmarz-type methods have been studied extensively: some work proves convergence results for different variants, some illustrates the application of the Kaczmarz method to specific problems from signal processing, network science, and machine learning, and some develops strategies for systems with adversarial corruption. In this project, we will explore, both theoretically and experimentally, potential research questions coming from these different areas of Kaczmarz-related study. Potential directions include:
- Developing and analyzing methods for systems with corruption or noise;
- Analyzing methods in the case that the system has structure coming from a graph or network; or
- Developing methods for applications in network ranking and consensus.
Student researchers in this project will develop mathematical skills in areas like numerical linear algebra, optimization, probability, and statistics. They will build and strengthen skills in literature review, scientific reading, technical writing and presentation, and programming and package development.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Interested applicants should submit a CV, the name of a reference, and answers to the following prompts:
- Why are you interested in this project? What makes you a good fit?
- What skills do you bring to this project? What skills do you hope to develop?
- What outcomes do you hope to achieve in your summer research project?
In the matH of Algorithms, Data & Decisions (HADD) research group, we consider problems motivated by the study of real-world data. We consider the mathematics of data, models for making decisions with data, and methods for training such models. Our group is made up of fun, passionate people who encourage one another and help one another develop as mathematicians, data scientists, and researchers. Students in the group work on a variety of projects and often interact with collaborators from other institutions (graduate students, postdoctoral researchers, and faculty). Students may continue their work in a senior thesis, present their findings at conferences, and/or coauthor resulting publications.
We work in areas like mathematical data science, optimization, and applied convex geometry, applying tools from probability, combinatorics, and convex geometry to problems in data science and optimization. Our group has been active recently in randomized numerical linear algebra, combinatorial methods for convex optimization, tensor decomposition for topic modeling, network consensus and ranking problems, and community detection on graphs and hypergraphs.