“…Unfortunately, due to the large number of $N_\omega$, $N_s$, $N_r$ and $N_z$, $N_x$, it is prohibitive to compute $H_k^{-1}$ directly with all data in industrial-scale FWI problems. In order to reduce the computational complexity of FWI, it has been widely reported that for the cases with large acquisition aperture and wide frequency bandwidth, $H_k$ is almost diagonally dominant and $H_k^{-1}$ can be approximated with a diagonal matrix [Beylkin, 1985, Shin et al, 2001, Plessix and Mulder, 2004, Jang et al, 2009, Ren et al, 2013, Pan et al, 2015]. The computational complexity can also be reduced by approximating $H_k$ with quasi-Newton methods such as the limited-memory Broyden-Fletcher-Goldfarb-Shanno (l-BFGS) algorithm [Nocedal, 1980, Nocedal and Wright, 2006].…”
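
The two approximations named in the passage can be illustrated with a minimal sketch in plain Python. It is not the implementation used in any of the cited works. The function names (`dot`, `diag_precondition`, `lbfgs_direction`), the damping parameter `eps`, and the use of Python lists for model-sized vectors are all assumptions made for illustration. The first function applies a damped diagonal approximation of $H_k^{-1}$ to a gradient; the second uses the standard l-BFGS two-loop recursion to apply an approximate inverse Hessian built from stored model differences `s_list` and gradient differences `y_list`.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def diag_precondition(grad, hess_diag, eps=1e-3):
    """Approximate H^{-1} g using only the diagonal of H.

    `eps` is a hypothetical damping factor (relative to the largest
    diagonal entry) that guards against division by near-zero values.
    """
    damp = eps * max(abs(h) for h in hess_diag)
    return [g / (h + damp) for g, h in zip(grad, hess_diag)]


def lbfgs_direction(grad, s_list, y_list, h0_diag=None):
    """Two-loop recursion: returns an approximation of H^{-1} grad.

    s_list[i] = m_{i+1} - m_i (model updates) and
    y_list[i] = g_{i+1} - g_i (gradient changes), oldest pair first.
    `h0_diag` optionally supplies a diagonal initial Hessian estimate.
    """
    q = list(grad)
    coeffs = []
    # First loop: newest pair to oldest pair.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        coeffs.append((alpha, rho))
    # Apply the initial inverse-Hessian estimate.
    if h0_diag is not None:
        r = [qi / h for qi, h in zip(q, h0_diag)]
    elif s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
        r = [gamma * qi for qi in q]
    else:
        r = q
    # Second loop: oldest pair to newest pair.
    for (s, y), (alpha, rho) in zip(zip(s_list, y_list), reversed(coeffs)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r
```

In this sketch the search direction would be the negative of the returned vector. Only a few vector pairs and, optionally, one diagonal are stored, so memory grows linearly with the number of model parameters rather than quadratically as it would for an explicit $H_k^{-1}$.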