Earth observation from satellite sensory data poses challenging problems, where machine learning is currently a key player. In recent years, Gaussian Process (GP) regression have excelled in biophysical parameter estimation tasks from airborne and satellite observations. GP regression is based on solid Bayesian statistics, and generally yields efficient and accurate parameter estimates. However, GPs are typically used for inverse modeling based on concurrent observations and in situ measurements only. Very often a forward model encoding the well-understood physical relations between the state vector and the radiance observations is available though and could be useful to improve predictions and understanding. In this work, we review three GP models that respect and learn the physics of the underlying processes in the context of both forward and inverse modeling. After reviewing the traditional application of GPs for parameter retrieval, we introduce a Joint GP (JGP) model that combines in situ measurements and simulated data in a single GP model. Then, we present a latent force model (LFM) for GP modeling that encodes ordinary differential equations to blend data-driven modeling and physical constraints of the system governing equations. The LFM performs
Many fields of science and engineering rely on running simulations with complex and computationally expensive models to understand the involved processes in the system of interest. Nevertheless, the high cost involved hamper reliable and exhaustive simulations. Very often such codes incorporate heuristics that ironically make them less tractable and transparent. This paper introduces an active learning methodology for adaptively constructing surrogate models, i.e. emulators, of such costly computer codes in a multi-output setting. The proposed technique is sequential and adaptive, and is based on the optimization of a suitable acquisition function. It aims to achieve accurate approximations, model tractability, as well as compact and expressive simulated datasets. In order to achieve this, the proposed Active Multi-Output Gaussian Process Emulator (AMOGAPE) combines the predictive capacity of Gaussian Processes (GPs) with the design of an acquisition function that favors sampling in low density and fluctuating regions of the approximation functions. Comparing different acquisition functions, we illustrate the promising performance of the method for the construction of emulators with toy examples, as well as for a widely used remote sensing transfer code.
Solving inverse problems is central to geosciences and remote sensing. Radiative transfer models (RTMs) represent mathematically the physical laws which govern the phenomena in remote sensing applications (forward models). The numerical inversion of the RTM equations is a challenging and computationally demanding problem, and for this reason, often the application of a nonlinear statistical regression is preferred. In general, regression models predict the biophysical parameter of interest from the corresponding received radiance. However, this approach does not employ the physical information encoded in the RTMs. An alternative strategy, which attempts to include the physical knowledge, consists in learning a regression model trained using data simulated by an RTM code. In this work, we introduce a nonlinear nonparametric regression model which combines the benefits of the two aforementioned approaches. The inversion is performed taking into account jointly both real observations and RTM-simulated data. The proposed Joint Gaussian Process (JGP) provides a solid framework for exploiting the regularities between the two types of data. The JGP automatically detects the relative quality of the simulated and real data, and combines them accordingly. This occurs by learning an additional hyper-parameter w.r.t. a standard GP model, and fitting parameters through maximizing the pseudo-likelihood of the real observations. The resulting scheme is both simple and robust, i.e., capable of adapting to different scenarios. The advantages of the JGP method compared to benchmark strategies are shown considering RTM-simulated and real observations in different experiments. Specifically, we consider leaf area index (LAI) retrieval from Landsat data combined with simulated data generated by the PROSAIL model.Comment: 21 pages single column, Accepted for publication in IEEE Transactions on Geoscience and Remote Sensin
Parameter retrieval and model inversion are key problems in remote sensing and Earth observation. Currently, different approximations exist: a direct, yet costly, inversion of radiative transfer models (RTMs); the statistical inversion with in situ data that often results in problems with extrapolation outside the study area; and the most widely adopted hybrid modeling by which statistical models, mostly nonlinear and non-parametric machine learning algorithms, are applied to invert RTM simulations. We will focus on the latter. Among the different existing algorithms, in the last decade kernel based methods, and Gaussian Processes (GPs) in particular, have provided useful and informative solutions to such RTM inversion problems. This is in large part due to the confidence intervals they provide, and their predictive accuracy. However, RTMs are very complex, highly nonlinear, and typically hierarchical models, so that very often a single (shallow) GP model cannot capture complex feature relations for inversion. This motivates the use of deeper hierarchical architectures, while still preserving the desirable properties of GPs. This paper introduces the use of deep Gaussian Processes (DGPs) for bio-geo-physical model inversion. Unlike shallow GP models, DGPs account for complicated (modular, hierarchical) processes, provide an efficient solution that scales well to big datasets, and improve prediction accuracy over their single layer counterpart. In the experimental section, we provide empirical evidence of performance for the estimation of surface temperature and dew point temperature from infrared sounding data, as well as for the prediction of chlorophyll content, inorganic suspended matter, and coloured dissolved matter from multispectral data acquired by the Sentinel-3 OLCI sensor. The presented methodology allows for more expressive forms of GPs in big remote sensing model inversion problems.
The modelling of Earth observation data is a challenging problem, typically approached by either purely mechanistic or purely data-driven methods. Mechanistic models encode the domain knowledge and physical rules governing the system. Such models, however, need the correct specification of all interactions between variables in the problem and the appropriate parameterization is a challenge in itself. On the other hand, machine learning approaches are flexible datadriven tools, able to approximate arbitrarily complex functions, but lack interpretability and struggle when data is scarce or in extrapolation regimes. In this paper, we argue that hybrid learning schemes that combine both approaches can address all these issues efficiently. We introduce Gaussian process (GP) convolution models for hybrid modelling in Earth observation (EO) problems. We specifically propose the use of a class of GP convolution models called latent force models (LFMs) for EO time series modelling, analysis and understanding. LFMs are hybrid models that incorporate physical knowledge encoded in differential equations into a multioutput GP model. LFMs can transfer information across time-series, cope with missing observations, infer explicit latent functions forcing the system, and learn parameterizations which are very helpful for system analysis and interpretability. We illustrate the performance in two case studies. First, we consider time series of soil moisture from active (ASCAT) and passive (SMOS, AMSR2) microwave satellites. We show how assuming a first order differential equation as governing equation, the model automatically estimates the e-folding time or decay rate related to soil moisture persistence and discovers latent forces related to precipitation. In the second case study, we show how the model can fill in gaps of leaf area index (LAI) and Fraction of Absorbed Photosynthetically Active Radiation (fAPAR) from MODIS optical time series by exploiting their relations across different spatial and temporal domains. The proposed hybrid methodology reconciles the two main approaches in remote sensing parameter estimation by blending statistical learning and mechanistic modeling. Index Terms-Machine learning, Physics, Time series analysis, Gap filling, Gaussian process, Ordinary differential equation (ODE), Leaf area index (LAI), Fraction of Absorbed Photosynthetically Active Radiation (fAPAR), Soil moisture (SM), Moderate resolution imaging spectroradiometer (MODIS), Soil moisture and ocean salinity (SMOS), Advanced scatterometer (ASCAT), Advanced microwave scanning radiometer (AMSR-2).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.