Existing nonconvex statistical optimization theory and methods crucially rely on the correct specification of the underlying "true" statistical models. To address this issue, we take a first step towards taming model misspecification by studying the high-dimensional sparse phase retrieval problem with misspecified link functions. In particular, we propose a simple variant of the thresholded Wirtinger flow algorithm that, given a proper initialization, linearly converges to an estimator with optimal statistical accuracy for a broad family of unknown link functions. We further provide extensive numerical experiments to support our theoretical findings.where β 0 denotes the number of nonzero entries in β.The nonconvex problem in (1.1) gives rise to two challenges in optimization and statistics. From the perspective of optimization, (1.1) is NP-hard in the worst case (Sahinoglou and Cabrera, 1991), Equal contribution. . 1 Here we use the shorthand [n] = {1, 2, . . . , n}. 1 arXiv:1712.06245v1 [stat.ML] 18 Dec 2017that is, under computational hardness hypotheses, no algorithm can achieve the global minimum in polynomial time. Particularly, most existing general-purpose first-order or second-order optimization methods (Ghadimi and Lan, 2013;Bolte et al., 2014;Lu and Xiao, 2015;Hong et al., 2016;Ghadimi and Lan, 2016;Xu and Yin, 2017;Gonçalves et al., 2017) are only guaranteed to converge to certain stationary points. Meanwhile, since (1.1) can also be cast as a polynomial optimization problem, we can leverage various semidefinite programming approaches (Parrilo, 2003;Kim et al., 2016;Weisser et al., 2017;Ahmadi and Parrilo, 2017). However, in real applications the problem dimension of practical interest is often large, for example, p can be of the order of millions. To the best of our knowledge, existing polynomial optimization approaches do not scale up to such large dimensions. The difficulty in optimization further leads to more challenges in statistics. From the perspective of statistics, researchers are interested in characterizing the statistical properties of β with respect to some underlying ground truth β * , for example, the estimation error β − β * 2 . Nevertheless, due to the lack of global optimality in nonconvex optimization, the statistical properties of the solutions obtained by existing algorithms remain rather difficult to analyze.Recently, Cai et al. (2016) proposed a thresholded Wirtinger flow (TWF) algorithm to tackle the problem, which essentially employs proximal-type iterations. TWF starts from a carefully specified initial point, and iteratively performs gradient descent steps. In particular, at each iteration, TWF performs a thresholding step to preserve the sparsity of the solution. Cai et al. (2016) further prove that TWF achieves a linear rate of convergence to an approximate global minimum that has optimal statistical accuracy. Note that Cai et al. (2016) can establish such strong theoretical results because their algorithm and analysis exploit the underlying "true" data generating process...