# Program

#### MONDAY, JUNE 17, WEDNESDAY, JUNE 19, and FRIDAY, JUNE 21, 2019

Stevanovich Center, MS 112

5727 S. University Avenue

Chicago, Illinois 60637

### Lectures 1 & 2: Latent factor models

"Matrix data and decompositions"

"NMF and topic models"

Speaker: David Bindel, Cornell University

Monday, June 17, 2019, at 1:30-2:30 PM and 3:00-4:00 PM

Approximate low-rank factorizations pervade matrix data analysis, often interpreted in terms of latent factor models. After discussing the ubiquitous singular value decomposition (aka PCA), we turn to factorizations such as the interpolative decomposition and the CUR factorization that offer advantages in terms of interpretability and ease of computation. We then discuss constrained approximate factorizations, particularly non-negative matrix factorizations and topic models, which are often particularly useful for decomposing data into sparse parts. Unfortunately, these decompositions may be very expensive to compute, at least in principle. But in many practical applications one can make a separability assumption that allows for relatively inexpensive algorithms. In particular, we show how the separability assumption enables efficient linear-algebra-based algorithms for topic modeling, and how linear algebraic preprocessing can be used to “clean up” the data and improve the quality of the resulting topics.
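
As an illustration of the separability idea, here is a minimal NumPy sketch of the successive projection algorithm (SPA), one standard anchor-finding method of this flavor; the matrix sizes and synthetic data below are illustrative, not taken from the lectures.

```python
import numpy as np

def spa_anchors(X, r):
    """Successive Projection Algorithm: greedily pick r 'anchor' columns.

    Under the separability assumption, every column of X is a convex
    combination of r anchor columns; SPA recovers them by repeatedly
    taking the column of largest residual norm and projecting it out.
    """
    R = X.astype(float).copy()
    anchors = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))
        anchors.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)       # project out the chosen direction
    return anchors

# Synthetic separable data: columns 0-2 are pure topics, the rest are
# convex mixtures of them.
rng = np.random.default_rng(0)
W = rng.random((20, 3))
Hmix = rng.random((3, 12))
Hmix /= Hmix.sum(axis=0)              # make the mixture weights convex
X = W @ np.hstack([np.eye(3), Hmix])
print(sorted(spa_anchors(X, 3)))      # recovers the pure columns 0-2
```

A nice feature of SPA is that it touches the data only through norms and rank-one projections, which is what makes separability-based topic modeling comparatively cheap.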

#### Seminar I: "Mathematical Analysis of Deep Learning Based Methods for Solving High-Dimensional PDEs"

Speaker: Julius Berner, University of Vienna

Monday, June 17, 2019, at 4:30-5:30 PM

Recently, methods based on empirical risk minimization (ERM) over deep neural network hypothesis classes have been applied to the numerical solution of PDEs with great success. We consider under which conditions ERM over a neural network hypothesis class approximates, with high probability, the solution of a d-dimensional Kolmogorov PDE with affine drift and diffusion coefficients up to error ε. We establish that such an approximation can be achieved with both the size of the hypothesis class and the number of training samples scaling only polynomially in d and 1/ε.

### Lectures 3 & 4: Scalable kernel methods

"Structure and interpretation of kernels"

"Making kernel methods scale"

Speaker: David Bindel, Cornell University

Wednesday, June 19, 2019, at 1:30-2:30 PM and 3:00-4:00 PM

Kernel methods are used throughout statistical modeling, data science, and approximation theory. Depending on the community, they may be introduced in many different ways: through dot products of feature maps, through data-adapted basis functions in an interpolation space, through the natural structure of a reproducing kernel Hilbert space, or through the covariance structure of a Gaussian process. We describe these various interpretations and their relation to each other, and then turn to the key computational bottleneck for all kernel methods: the solution of linear systems and the computation of (log) determinants for dense matrices whose size scales with the number of examples. Recent developments in linear algebra make it increasingly feasible to solve these problems efficiently even with millions of data points. We discuss some of these techniques, including rank-structured factorization, structured kernel interpolation, and stochastic estimators for determinants and their derivatives. We also give a perspective on some open problems and on approaches to addressing the constant challenge posed by the curse of dimensionality.
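
As one concrete instance of the stochastic estimators mentioned above, here is a hedged NumPy sketch of a Hutchinson-type estimator for tr(K⁻¹ ∂K/∂ℓ), the quantity appearing in the gradient of a Gaussian-process log determinant with respect to a length scale ℓ; the kernel, data, and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

# RBF kernel matrix plus noise
ell, sigma2 = 1.0, 0.1
Krbf = np.exp(-0.5 * D2 / ell ** 2)
K = Krbf + sigma2 * np.eye(len(X))
dK = Krbf * D2 / ell ** 3              # elementwise derivative ∂K/∂ℓ

# Exact gradient term tr(K⁻¹ ∂K/∂ℓ) via a dense solve: O(n³)
exact = float(np.trace(np.linalg.solve(K, dK)))

# Hutchinson estimator: tr(A) ≈ (1/m) Σᵢ zᵢᵀ A zᵢ with random ±1 probes.
# Only solves with K and products with ∂K/∂ℓ are needed, so in practice
# fast matvecs and iterative solvers replace the dense solve used here.
m = 200
Z = rng.choice([-1.0, 1.0], size=(len(X), m))
est = float(np.mean(np.sum(Z * np.linalg.solve(K, dK @ Z), axis=0)))
print(exact, est)
```

This is the basic trick that, combined with fast kernel matvecs (e.g., via rank-structured or interpolation-based approximations), makes hyperparameter gradients feasible at large n.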

#### Seminar II: "Universal sparsity of deep ReLU networks"

Speaker: Dennis Elbrächter, University of Vienna

Wednesday, June 19, 2019, at 4:30-5:30 PM

We consider the approximation capabilities of deep neural networks. Specifically, we introduce (or assimilate) a number of key concepts, which allow us to compare neural networks to classical representation systems (e.g., wavelets, shearlets, and Gabor systems, or more generally any system generated from some mother function through translation, dilation, and modulation). This enables us to establish that any function class is (asymptotically) at least as sparse with respect to (ReLU) neural networks as it is in any 'reasonable' classical representation system.
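
To fix notation (ours, not the abstract's), the comparison can be read in terms of best M-term approximation errors; roughly:

```latex
% Best M-term approximation error of f in a dictionary D = (g_i)_{i \in I}:
\sigma_M(f; D) \;=\; \inf_{\substack{\Lambda \subset I,\; |\Lambda| \le M \\ (c_i)_{i \in \Lambda}}}
  \Big\| f - \sum_{i \in \Lambda} c_i \, g_i \Big\|_{L^2}
% "At least as sparse" then says, roughly: if \sigma_M(f; D) decays at some
% rate M^{-r} for a classical system D, then ReLU networks with a comparable
% number of nonzero weights achieve approximation error of the same order.
```

Here "reasonable" restricts the classical systems under comparison; the precise conditions are part of the talk.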

### Lectures 5 & 6: Spectral methods in network analysis

"Network spectra, optimization, and dynamics"

"Network densities of states"

Speaker: David Bindel, Cornell University

Friday, June 21, 2019, at 1:30-2:30 PM and 3:00-4:00 PM

Linear algebra methods play a central role in modern methods for large-scale network analysis, and many of these methods share the same basic approach. First, one tells a story that associates the network with a matrix, either as the generator of a linear time-invariant dynamical process on the graph or as a quadratic form used to measure some quantity of interest. Then, one uses the eigenvalues and eigenvectors of the matrix to reason about the properties of the dynamical system or quadratic form, and from there to understand the network. We describe some of the best-known spectral network analysis methods for tasks such as bisection and partitioning, clustering and community detection, and ranking and centrality. These methods largely depend only on a few eigenvalues and eigenvectors, but we will also describe some methods that require a more global perspective, including methods that we have developed for local spectral clustering and for graph analysis via spectral densities.
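
As a small concrete instance of this quadratic-form story, here is a hedged NumPy sketch of spectral bisection via the Fiedler vector on a toy two-cluster graph; the graph and sizes are illustrative.

```python
import numpy as np

# Toy graph: two 5-node cliques joined by a single bridge edge (4, 5)
n = 10
A = np.zeros((n, n))
A[:5, :5] = 1.0
A[5:, 5:] = 1.0
np.fill_diagonal(A, 0.0)
A[4, 5] = A[5, 4] = 1.0

L = np.diag(A.sum(axis=1)) - A     # combinatorial graph Laplacian
w, V = np.linalg.eigh(L)
fiedler = V[:, 1]                  # eigenvector of the 2nd-smallest eigenvalue
part = fiedler > 0                 # the sign pattern gives the bisection
print(part)                        # one clique on each side of the cut
```

The Laplacian quadratic form x^T L x counts (weighted) edges cut by a labeling x, which is why thresholding its second eigenvector yields a good bisection.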