Kaczmarz Methods for Large-scale Data Analysis

Large-scale systems of linear equations arise in many areas of data science, including in machine learning and as subroutines of several optimization methods.  When the systems are very large and cannot be read into working memory in their entirety, iterative methods which use a small portion of the data in each iteration are typically employed.  These methods can offer a small memory footprint and good convergence guarantees. Kaczmarz methods, a classical example of these types of methods, consist of sequential orthogonal projections towards the solution set of a single equation (or subsystem).  There are many variants within this family of methods, often using randomized or greedy strategies to select the row (subsystem) used in each iteration.

There has been a lot of work on Kaczmarz-type methods; some proving convergence results for different variants, some illustrating the application of the Kaczmarz method to specific problems from signal processing, network science, and machine learning, and some developing strategies for systems with adversarial corruption.  In this project, we will explore both theoretically and experimentally potential research questions coming from these different areas of Kaczmarz-related study.

Students working on this project will develop skills in literature review, code development, numerical experiment design, theoretical analysis, technical writing, and technical presentation.  We will build on prior work to understand these methods both theoretically and empirically on synthetic and real-world linear systems.

Name of research group, project, or lab
Haddock Research Group
Why join this research group or lab?

This is exciting research that will have impact both in mathematics as well as in the fields where the Kaczmarz methods find practical application!  You will work in a small group of students throughout the summer, and will have the opportunity to connect with a broad network of the PI’s collaborators who study these methods.  This project could be continued in a senior thesis, and future results could be presented at regional, national, or international conferences or in publications.

Representative publication
Logistics Information:
Project categories
Data Science
Student ranks applicable
Student qualifications
  • The student should have completed the core course in Linear Algebra.
  • Additional linear algebra courses and courses in probability or statistics are highly desirable.
  • Experience programming, especially in Python or MATLAB, is highly desirable.
  • Courses in scientific computing, numerical analysis, or graph theory are useful (but not necessary).
Time commitment
Summer - Full Time
Paid Research
Number of openings
Techniques learned

A summer student will learn:

  • Standard techniques used in Kaczmarz-method analysis
  • Linear algebraic representations of network problems
  • Numerical experiment design and implementation
  • Code development and (if time permitting) package publication
  • Technical writing and presentation skills
Contact Information:
Mentor name
Jamie Haddock
Mentor email
Name of project director or principal investigator
Jamie Haddock
Email address of project director or principal investigator
2 sp. | 23 appl.
Hours per week
Summer - Full Time
Project categories
Optimization (+3)
MathematicsAlgorithmsData ScienceOptimization