In a generalized linear model (GLM), the goal is to estimate a d-dimensional signal x from an n-dimensional observation of the form f(Ax, w), where A is a design matrix and w is a noise vector. Well-known examples of GLMs include phase retrieval, 1-bit compressed sensing, and logistic regression. We focus on the high-dimensional setting in which both the number of measurements n and the signal dimension d diverge, with their ratio tending to a fixed constant.

Linear and spectral methods are two popular solutions to obtain an initial estimate, which can also be used as a warm start for other algorithms. In particular, the linear estimator is a data-dependent linear combination of the columns of the design matrix, and its analysis is quite simple; the spectral estimator is the principal eigenvector of a data-dependent matrix, whose spectrum exhibits a phase transition. In this talk, I will first analyze the spectral method and prove that it leads to the information-theoretically optimal threshold for weak recovery in phase retrieval. Second, I will show how to optimally combine the linear and spectral estimators. Finally, I will consider estimators based on approximate message passing (AMP) and prove how to initialize them with the spectral method.

Based on joint work with Andrea Montanari, Christos Thrampoulidis and Ramji Venkataraman [https://arxiv.org/abs/1708.05932, https://arxiv.org/abs/2008.03326, https://arxiv.org/abs/2010.03460].