In this work, we develop an optimization framework for problems whose solutions are well-approximated by Hierarchical Tucker (HT) tensors, an efficient structured tensor format based on recursive subspace factorizations. By exploiting the smooth manifold structure of these tensors, we construct standard optimization algorithms such as Steepest Descent and Conjugate Gradient for completing tensors from missing entries. Our algorithmic framework is fast and scalable to large problem sizes as we do not require SVDs on the ambient tensor space, as required by other methods. Moreover, we exploit the structure of the Gramian matrices associated with the HT format to regularize our problem, reducing overfitting for high subsampling ratios. We also find that the organization of the tensor can have a major impact on completion from realistic seismic acquisition geometries. These samplings are far from the idealized randomized samplings that are usually considered in the literature but are realizable in practical scenarios. Using these algorithms, we successfully interpolate large-scale seismic data sets and demonstrate the competitive computational scaling of our algorithms as the problem sizes grow.
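To make the "recursive subspace factorization" idea concrete, the following is a minimal sketch of a Hierarchical Tucker parametrization for a 4D tensor with the dimension tree {1,2,3,4} → {1,2}, {3,4}. All sizes, names, and the dense-expansion helper are hypothetical illustrations, not the paper's implementation; the point is only the storage count and the low matricization rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 10, 3  # mode size and (uniform) hierarchical rank -- toy values

# Leaf frames U1..U4 and transfer tensors for the tree {1,2,3,4} -> {1,2},{3,4}
U = [rng.standard_normal((n, r)) for _ in range(4)]
B12 = rng.standard_normal((r, r, r))   # combines modes 1,2 into a frame U12
B34 = rng.standard_normal((r, r, r))   # combines modes 3,4 into a frame U34
Broot = rng.standard_normal((r, r))    # couples the two branches at the root

def ht_expand(U, B12, B34, Broot):
    """Expand the HT parameters to the dense (n,n,n,n) tensor (checking only;
    scalable algorithms never form this)."""
    U12 = np.einsum('ia,jb,abc->ijc', U[0], U[1], B12).reshape(n * n, r)
    U34 = np.einsum('ia,jb,abc->ijc', U[2], U[3], B34).reshape(n * n, r)
    X = U12 @ Broot @ U34.T
    return X.reshape(n, n, n, n)

ht_params = 4 * n * r + 2 * r**3 + r**2   # storage of the HT representation
full_params = n**4                        # storage of the dense tensor
```

Even at these toy sizes the HT parameters (183 numbers) are far smaller than the dense tensor (10,000 numbers), and the gap widens rapidly with dimension and mode size.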
Despite recent developments in improved acquisition, seismic data often remains undersampled along source and receiver coordinates, resulting in incomplete data for key applications such as migration and multiple prediction. We interpret the missing-trace interpolation problem in the context of matrix completion and outline three practical principles for using low-rank optimization techniques to recover seismic data. Specifically, we strive for recovery scenarios wherein the original signal is low rank and the subsampling scheme increases the singular values of the matrix. We employ an optimization program that restores this low-rank structure to recover the full volume. Omitting one or more of these principles can lead to poor interpolation results, as we show experimentally. In light of this theory, we compensate for the high-rank behavior of data in the source-receiver domain by employing the midpoint-offset transformation for 2D data and a source-receiver permutation for 3D data to reduce the overall singular values. Simultaneously, in order to work with computationally feasible algorithms for large-scale data, we use a factorization-based approach to matrix completion, which significantly speeds up the computations compared to repeated singular value decompositions without reducing the recovery quality. In the context of our theory and experiments, we also show that windowing the data too aggressively can have adverse effects on the recovery quality. To overcome this problem, we carry out our interpolations for each frequency independently while working with the entire frequency slice. The result is a computationally efficient, theoretically motivated framework for interpolating missing-trace data. Our tests on realistic two- and three-dimensional seismic data sets show that our method compares favorably, both in terms of computational speed and recovery quality, to existing curvelet-based and tensor-based techniques.
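The factorization-based approach can be sketched on a toy problem: instead of repeatedly computing SVDs of the full matrix, parametrize it as L·Rᵀ and fit the factors to the observed entries only. The alternating-least-squares loop below is a simplified stand-in under hypothetical sizes and data, not the paper's solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 30, 2
# Toy rank-r "frequency slice" (a hypothetical stand-in for seismic data)
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.6          # ~60% of entries observed

def als_update(A, B, X, mask):
    """Least-squares update of factor A in X ~ A @ B.T, row by row,
    using only the observed entries of each row."""
    for i in range(X.shape[0]):
        obs = mask[i]
        A[i] = np.linalg.lstsq(B[obs], X[i, obs], rcond=None)[0]
    return A

L = rng.standard_normal((n, r))
R = rng.standard_normal((n, r))
for _ in range(30):
    L = als_update(L, R, X, mask)
    R = als_update(R, L, X.T, mask.T)

rel_err = np.linalg.norm(L @ R.T - X) / np.linalg.norm(X)
```

Each sweep costs only small per-row least-squares solves of size (observed entries × r), which is why factor-based completion scales to matrices where even one dense SVD is prohibitive.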
In statistical inverse problems, the objective is to obtain a complete statistical description of unknown parameters from noisy observations and thereby quantify their uncertainties. We consider inverse problems with partial-differential-equation (PDE) constraints, which are applicable to many seismic problems. Bayesian inference is one of the most widely used approaches to precisely quantify statistics through a posterior distribution, incorporating uncertainties in observed data, the modeling kernel, and prior knowledge of the parameters. Typically, when formulating the posterior distribution, the PDE constraints are required to be exactly satisfied, resulting in a highly nonlinear forward map and a posterior distribution with many local maxima. These drawbacks make it difficult to find an appropriate approximation for the posterior distribution. Another complicating factor is that traditional Markov chain Monte Carlo (MCMC) methods are known to converge slowly for realistically sized problems. To overcome these drawbacks, we relax the PDE constraints by introducing an auxiliary variable, which allows for Gaussian errors in the PDE and yields a bilinear posterior distribution with weak PDE constraints that is more amenable to uncertainty quantification because of its special structure. We determine that for a particular range of variance choices for the PDE misfit term, the new posterior distribution has fewer modes and can be well-approximated by a Gaussian distribution, which can then be sampled in a straightforward manner. Because it is prohibitively expensive to explicitly construct the dense covariance matrix of the Gaussian approximation for problems with more than [Formula: see text] unknowns, we have developed a method to implicitly construct it, which enables efficient sampling. We apply this framework to 2D seismic inverse problems with 1800 and 92,455 unknown parameters.
The results illustrate that our framework can produce comparable statistical quantities with those produced by conventional MCMC-type methods while requiring far fewer PDE solves, which are the main computational bottlenecks in these problems.
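The core trick behind sampling a Gaussian without its dense covariance can be illustrated with a standard identity: if the precision (inverse covariance) H is available and sparse, a Cholesky factor of H yields samples from N(0, H⁻¹) via one triangular solve. The tridiagonal H below is a hypothetical toy stand-in for a PDE-misfit Hessian, not the paper's construction; at realistic sizes one would use sparse or matrix-free factorizations behind the same identity.

```python
import numpy as np

n = 50
# Toy tridiagonal precision: discrete Laplacian plus identity, a hypothetical
# stand-in for a (PDE-misfit + prior) Hessian
H = np.diag(np.full(n, 3.0)) \
    + np.diag(np.full(n - 1, -1.0), 1) \
    + np.diag(np.full(n - 1, -1.0), -1)

C = np.linalg.cholesky(H)                # H = C C^T, C lower-triangular

def sample(rng):
    """Draw x ~ N(0, H^{-1}) without ever forming the dense covariance."""
    z = rng.standard_normal(n)
    # x = C^{-T} z  =>  Cov(x) = C^{-T} C^{-1} = (C C^T)^{-1} = H^{-1}
    return np.linalg.solve(C.T, z)
```

The dense H⁻¹ is never formed: each sample costs one standard-normal draw and one triangular solve, which is what makes large sample counts affordable.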
Large-scale parameter estimation problems are among the most computationally demanding problems in numerical analysis. An academic researcher's domain-specific expertise rarely extends to software design, which results in inversion frameworks that are technically correct but not scalable to realistically sized problems. On the other hand, the computational demands of realistic problems result in industrial codebases that are geared solely toward high performance, rather than comprehensibility or flexibility. We propose a new software design for inverse problems constrained by partial differential equations that bridges the gap between these two seemingly disparate worlds. A hierarchical and modular design reduces the cognitive burden on the user while exploiting high-performance primitives at the lower levels. Our code has the added benefit of directly reflecting the underlying mathematics of the problem, which lowers the cognitive load on the user and reduces the initial startup period before a researcher can be fully productive. We also introduce a new preconditioner for the 3D Helmholtz equation that is suitable for fault-tolerant distributed systems. Numerical experiments on a variety of 2D and 3D test problems demonstrate the effectiveness of this approach in scaling algorithms from small- to large-scale problems with minimal code changes.
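A flavor of the "code reflects the mathematics" principle can be sketched with two tiny components whose interfaces mirror the adjoint-state formulas; the class names and the linear forward map are hypothetical, chosen only to show the layering (a real implementation would hide a Helmholtz solve behind the same `apply`/`adjoint` interface).

```python
import numpy as np

class LinearForward:
    """Toy stand-in for a PDE-based forward map F(m) = A @ m; a production
    code would wrap a PDE solve behind the same interface."""
    def __init__(self, A):
        self.A = A
    def apply(self, m):
        return self.A @ m
    def adjoint(self, r):
        return self.A.T @ r

class LeastSquaresMisfit:
    """phi(m) = 0.5 * ||F(m) - d||^2, with the gradient written through the
    adjoint of F -- the code mirrors the adjoint-state formula."""
    def __init__(self, F, d):
        self.F, self.d = F, d
    def value(self, m):
        res = self.F.apply(m) - self.d
        return 0.5 * res @ res
    def gradient(self, m):
        return self.F.adjoint(self.F.apply(m) - self.d)
```

Swapping the toy forward map for a 3D Helmholtz solver changes only the lower layer; the misfit, gradient, and any optimizer built on top remain untouched, which is the scalability-with-minimal-code-changes claim in miniature.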
Conventional oil and gas fields are increasingly difficult to explore and image, resulting in the call for more complex wave-equation-based inversion algorithms that require dense, long-offset samplings. Consequently, there is an exponential growth in the size of data volumes and prohibitive demands on computational resources. We propose a method to compress and process seismic data directly in a low-rank tensor format, which drastically reduces the amount of storage required to represent the data. Seismic data exhibits low-rank structure in a particular transform domain, which can be exploited to compress fully sampled data into an extremely storage-efficient tensor format, or to interpolate data with missing entries. In either case, once our data is represented in its compressed tensor form, we propose an algorithm to extract source or receiver gathers directly from the compressed parameters. This extraction can be done on the fly directly on the compressed data and does not require scanning through the entire dataset in order to form shot gathers. We apply this shot-extraction technique in the context of stochastic full-waveform inversion as well as forming full subsurface image gathers through probing techniques, and we demonstrate the minor differences between using the full and compressed data while drastically reducing the total memory costs.
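The on-the-fly extraction idea can be sketched with a simplified Tucker-style parametrization (a hypothetical stand-in for the format used in the work, with toy sizes and names): a single shot gather falls out of one small contraction against a row of the source factor, without ever forming or scanning the dense volume.

```python
import numpy as np

rng = np.random.default_rng(0)
ns, nr, nt, r = 8, 10, 12, 3   # sources, receivers, time samples; toy rank
# Compressed parameters: factor matrices plus a small core tensor
Us, Ur, Ut = (rng.standard_normal((n, r)) for n in (ns, nr, nt))
core = rng.standard_normal((r, r, r))

def full_volume():
    """Dense reconstruction -- used here only to verify the extraction."""
    return np.einsum('abc,ia,jb,kc->ijk', core, Us, Ur, Ut)

def shot_gather(i):
    """Extract the gather of source i directly from the compressed
    parameters: contract the i-th row of Us into the core, then expand
    only the receiver and time modes."""
    return np.einsum('abc,a,jb,kc->jk', core, Us[i], Ur, Ut)
```

The extraction touches only the factors and the core, so its cost is governed by the compressed sizes (nr, nt, r), not by the full ns·nr·nt volume; this is what lets stochastic FWI draw random shots from the compressed representation.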