# SIAM-NNP Minisymposium on The Matrix-Vector Complexity of Linear Algebra

At the SIAM-NNP conference, I am organizing a minisymposium on matrix-vector complexity. This is a small webpage to track the time and place of the talks, as well as the titles and abstracts of the talks.

## When and where is the event?

The talks are at NJIT on October 22nd, 2023. Specifically, they will run from 8:30am to 10:30am in the Central King Building, Room 219.

To attend, you have to register for the conference here. It's free for students and postdocs!

## Schedule

The talks are all 20 minutes long, plus 5 minutes for Q&A.

Time | Presenter | University | Talk Title (click to jump to the abstract) |
---|---|---|---|

8:30am | Shyam Narayanan | MIT | New Lower Bounds for Matrix-Vector Algorithms |

8:55am | Diana Halikias | Cornell | Data-efficient matrix recovery and PDE learning |

9:20am | William Swartworth | CMU | Spectrum Approximation via non-adaptive Queries |

9:45am | Tyler Chen | NYU | Peering into the black box: Krylov-aware stochastic trace estimation |

10:10am | Raphael Meyer | NYU | Hutchinson's Estimator is Bad at Kronecker-Trace-Estimation |

## Abstracts for the Talks

#### New Lower Bounds for Matrix-Vector Algorithms

The Matrix-Vector model is a model of computation where, given an unknown matrix $\boldsymbol{ M}$, one attempts to learn properties of $\boldsymbol{ M}$ through an oracle that receives as input a vector $\mathbf{ v}$ and outputs $\boldsymbol{ M}\mathbf{ v}$. This model captures a wide range of algorithms used in optimization and numerical linear algebra. This model can also be viewed as a sublinear-query model, where one wishes to ask a sublinear number of queries to the oracle to learn properties of $\boldsymbol{ M}$. In this talk, we demonstrate a new technique for proving lower bounds for the query complexity of various matrix-vector algorithms. The technique is based on a novel reduction that demonstrates that "block Krylov" algorithms are optimal for this problem. This allows us to prove lower bounds for two important questions:

Log-concave sampling: we show that sampling from strongly log-concave and log-smooth distributions with condition number $\kappa$ requires $\tilde{\Omega}(\min(\sqrt{\kappa} \log d, d))$ first-order queries. In fact, this bound even holds (and is tight) for the more restricted class of Gaussian distributions. Importantly, we provide the first lower bound for general log-concave distributions proving that the query complexity of general log-concave sampling cannot be dimension-free (i.e., only depend on $\kappa$).

Low-rank approximation: we show that providing a $(1 + \epsilon)$-approximate rank-1 approximation to $\boldsymbol{ M}$ (in spectral error) requires $\Omega(\log d/\sqrt{\epsilon})$ samples, which is optimal. For this question, no super-constant lower bound (either in terms of $\epsilon$ or in $d$) was known.

#### Data-efficient matrix recovery and PDE learnings

Can one recover the entries of a matrix from only matrix-vector products? If so, how many are needed? I will present randomized algorithms to recover various structured matrices, as well as theoretical results which bound the query complexity of these structured families. Moreover, a continuous generalization of query complexity describes how many pairs of forcing terms and solutions are needed to uniquely identify a Green's function corresponding to the solution operator of an elliptic PDE. I will present a recent main result, which is a theoretical guarantee on the number of input-output pairs required in elliptic PDE learning problems with high probability. The proof of this result is constructive, and leverages insights from numerical linear algebra and PDE theory.

#### Spectrum Approximation via non-adaptive Queries

Variants of Power Method are useful for approximating the top eigenvalues of a matrix, but these methods rely heavily on adaptivity and typically only learn the top portion of the spectrum. How well can one do if the goal is to approximate the entire spectrum via non-adaptive matrix-vector queries? For a symmetric matrix $\boldsymbol{ A}$, this talk will show how to recover all eigenvalues of $\boldsymbol{ A}$ to within $\varepsilon\| \boldsymbol{ A}\|_F$ additive error using $O(1/\varepsilon^2)$ matrix-vector queries. I will also discuss a lower bound showing that $\Omega(1/\varepsilon^2)$ matrix-vector queries are necessary for obtaining our guarantee, even if we allow adaptivity.

#### Peering into the black box: Krylov-aware stochastic trace estimation

Matrix-vector query models have gained increased attention, and in that case that the matrix of interest is a matrix function (e.g. the matrix inverse or matrix exponential), it is common to use Krylov subspace methods as a black box for computing matrix-vector products. In this talk, we discuss how taking a look into the black-box allows further efficiencies to be obtained.

#### Hutchinson's Estimator is Bad at Kronecker-Trace-Estimation

We study the problem of estimating the trace of a matrix $\boldsymbol{ A}$ that can only be access via Kronecker-matrix-vector products. That is, we can compute $\boldsymbol{ A}\mathbf{ x}$ for any vector $\mathbf{ x}$ that has Kronecker structure. In particular, we study how Hutchinson's Estimator performs in this setting, proving tight rates for the number of matrix-vector products this estimator needs to find a relative error approximation to the trace of $\boldsymbol{ A}$. We find an exact expression for the variance of this estimator, show this is unavoidably exponential, and conclude with evidence that a much faster non-Hutchinson algorithm may exist.