We propose a novel approach for mitigating radio frequency interference (RFI) signals in radio data using the latest advances in deep learning. We employ a special type of Convolutional Neural Network, the U-Net, that enables the classification of clean signal and RFI signatures in 2D time-ordered data acquired from a radio telescope. We train and assess the performance of this network using the HIDE & SEEK radio data simulation and processing packages, as well as early Science Verification data acquired with the 7m single-dish telescope at the Bleien Observatory. We find that our U-Net implementation is showing competitive accuracy to classical RFI mitigation algorithms such as SEEK's SumThreshold implementation. We publish our U-Net software package on GitHub under GPLv3 license.
To shed light on the fundamental problems posed by dark energy and dark matter, a large number of experiments have been performed and combined to constrain cosmological models. We propose a novel way of quantifying the information gained by updates on the parameter constraints from a series of experiments which can either complement earlier measurements or replace them. For this purpose, we use the Kullback-Leibler divergence or relative entropy from information theory to measure differences in the posterior distributions in model parameter space from a pair of experiments. We apply this formalism to a historical series of cosmic microwave background experiments ranging from Boomerang to WMAP, SPT, and Planck. Considering different combinations of these experiments, we thus estimate the information gain in units of bits and distinguish contributions from the reduction of statistical errors and the "surprise" corresponding to a significant shift of the parameters' central values. For this experiment series, we find individual relative entropy gains ranging from about 1 to 30 bits. In some cases, e.g. when comparing WMAP and Planck results, we find that the gains are dominated by the surprise rather than by improvements in statistical precision. We discuss how this technique provides a useful tool for both quantifying the constraining power of data from cosmological probes and detecting the tensions between experiments.
Abstract. Bayesian inference is often used in cosmology and astrophysics to derive constraints on model parameters from observations. This approach relies on the ability to compute the likelihood of the data given a choice of model parameters. In many practical situations, the likelihood function may however be unavailable or intractable due to non-gaussian errors, non-linear measurements processes, or complex data formats such as catalogs and maps. In these cases, the simulation of mock data sets can often be made through forward modeling. We discuss how Approximate Bayesian Computation (ABC) can be used in these cases to derive an approximation to the posterior constraints using simulated data sets. This technique relies on the sampling of the parameter set, a distance metric to quantify the difference between the observation and the simulations and summary statistics to compress the information in the data. We first review the principles of ABC and discuss its implementation using a Population Monte-Carlo (PMC) algorithm and the Mahalanobis distance metric. We test the performance of the implementation using a Gaussian toy model. We then apply the ABC technique to the practical case of the calibration of image simulations for wide field cosmological surveys. We find that the ABC analysis is able to provide reliable parameter constraints for this problem and is therefore a promising technique for other applications in cosmology and astrophysics. Our implementation of the ABC PMC method is made available via a public code release.
We study the benefits and limits of parallelised Markov chain Monte Carlo (MCMC) sampling in cosmology. MCMC methods are widely used for the estimation of cosmological parameters from a given set of observations and are typically based on the Metropolis-Hastings algorithm. Some of the required calculations can however be computationally intensive, meaning that a single long chain can take several hours or days to calculate. In practice, this can be limiting, since the MCMC process needs to be performed many times to test the impact of possible systematics and to understand the robustness of the measurements being made. To achieve greater speed through parallelisation, MCMC algorithms need to have short auto-correlation times and minimal overheads caused by tuning and burn-in. The resulting scalability is hence influenced by two factors, the MCMC overheads and the parallelisation costs. In order to efficiently distribute the MCMC sampling over thousands of cores on modern cloud computing infrastructure, we developed a Python framework called CosmoHammer which embeds emcee, an implementation by Foreman-Mackey et al. (2012) of the affine invariant ensemble sampler by Goodman and Weare (2010). We test the performance of CosmoHammer for cosmological parameter estimation from cosmic microwave background data. While Metropolis-Hastings is dominated by overheads, CosmoHammer is able to accelerate the sampling process from a wall time of 30 hours on a dual core notebook to 16 minutes by scaling out to 2048 cores. Such short wall times for complex data sets opens possibilities for extensive model testing and control of systematics.Comment: Published version. 17 pages, 6 figures. The code is available at http://www.astro.ethz.ch/refregier/research/Software/cosmohamme
Intensity mapping of neutral hydrogen (HI) is a promising observational probe of cosmology and large-scale structure. We present wide field simulations of HI intensity maps based on N-body simulations of a 2.6 Gpc/h box with 2048 3 particles (particle mass 1.6 × 10 11 M /h). Using a conditional mass function to populate the simulated dark matter density field with halos below the mass resolution of the simulation (10 8 M /h < M halo < 10 13 M /h), we assign HI to those halos according to a phenomenological halo to HI mass relation. The simulations span a redshift range of 0.35 z 0.9 in redshift bins of width ∆z ≈ 0.05 and cover a quarter of the sky at an angular resolution of about 7 . We use the simulated intensity maps to study the impact of non-linear effects and redshift space distortions on the angular clustering of HI. Focusing on the autocorrelations of the maps, we apply and compare several estimators for the angular power spectrum and its covariance. We verify that these estimators agree with analytic predictions on large scales and study the validity of approximations based on Gaussian random fields, particularly in the context of the covariance. We discuss how our results and the simulated maps can be useful for planning and interpreting future HI intensity mapping surveys.ArXiv ePrint: 1509.01589 arXiv:1509.01589v2 [astro-ph.CO] 1 Mar 2016 1 We do note that recent measurements of the clustering of DLA systems at redshifts 2 z 3 [42] are discrepant with direct HI observations on which our model is built [21]. The resolution of this discrepancy is unclear at present, and we will return to this issue in future work.
As several large single-dish radio surveys begin operation within the coming decade, a wealth of radio data will become available and provide a new window to the Universe. In order to fully exploit the potential of these data sets, it is important to understand the systematic effects associated with the instrument and the analysis pipeline. A common approach to tackle this is to forward-model the entire system -from the hardware to the analysis of the data products. For this purpose, we introduce two newly developed, open-source Python packages: the HI Data Emulator (HIDE) and the Signal Extraction and Emission Kartographer (SEEK) for simulating and processing single-dish radio survey data. HIDE forward-models the process of collecting astronomical radio signals in a single-dish radio telescope instrument and outputs pixel-level time-ordered-data. SEEK processes the time-ordered-data, removes artifacts from Radio Frequency Interference (RFI), automatically applies flux calibration, and aims to recover the astronomical radio signal. The two packages can be used separately or together depending on the application. Their modular and flexible nature allows easy adaptation to other instruments and data sets. We describe the basic architecture of the two packages and examine in detail the noise and RFI modeling in HIDE, as well as the implementation of gain calibration and RFI mitigation in SEEK. We then apply HIDE & SEEK to forward-model a Galactic survey in the frequency range 990 -1260 MHz based on data taken at the Bleien Observatory. For this survey, we expect to cover 70% of the full sky and achieve a median signal-to-noise ratio of approximately 5 -6 in the cleanest channels including systematic uncertainties. However, we also point out the potential challenges of high RFI contamination and baseline removal when examining the early data from the Bleien Observatory. The fully documented HIDE & SEEK packages are available at http://hideseek.phys.ethz.ch/ and are published under the GPLv3 license on GitHub.
h i g h l i g h t s• We discuss the reasons for the lower execution speed of Python. • We present HOPE, a specialised Python just-in-time compiler.• The package combines the ease of Python and the speed of C++.• HOPE improves the execution speed up to a factor of 120 compared to plain Python. • The code is freely available under GPLv3 license on PyPI and GitHub. a b s t r a c tThe Python programming language is becoming increasingly popular for scientific applications due to its simplicity, versatility, and the broad range of its libraries. A drawback of this dynamic language, however, is its low runtime performance which limits its applicability for large simulations and for the analysis of large data sets, as is common in astrophysics and cosmology. While various frameworks have been developed to address this limitation, most focus on covering the complete language set, and either force the user to alter the code or are not able to reach the full speed of an optimised native compiled language.In order to combine the ease of Python and the speed of C++, we developed HOPE, a specialised Python just-in-time (JIT) compiler designed for numerical astrophysical applications. HOPE focuses on a subset of the language and is able to translate Python code into C++ while performing numerical optimisation on mathematical expressions at runtime. To enable the JIT compilation, the user only needs to add a decorator to the function definition. We assess the performance of HOPE by performing a series of benchmarks and compare its execution speed with that of plain Python, C++ and the other existing frameworks. We find that HOPE improves the performance compared to plain Python by a factor of 2 to 120, achieves speeds comparable to that of C++, and often exceeds the speed of the existing solutions. We discuss the differences between HOPE and the other frameworks, as well as future extensions of its capabilities. The fully documented HOPE package is available at http://hope.phys.ethz.ch and is published under the GPLv3 license on PyPI and GitHub.
We announce the public release of PynPoint, a Python package that we have developed for analysing exoplanet data taken with the angular differential imaging observing technique. In particular, PynPoint is designed to model the point spread function of the central star and to subtract its flux contribution to reveal nearby faint companion planets. The current version of the package does this correction by using a principal component analysis method to build a basis set for modelling the point spread function of the observations. We demonstrate the performance of the package by reanalysing publicly available data on the exoplanet β Pictoris b, which consists of close to 24,000 individual image frames. We show that PynPoint is able to analyse this typical data in roughly 1.5 minutes on a Mac Pro, when the number of images is reduced by co-adding in sets of 5. The main computational work parallelises well as a result of a reliance on SciPy and NumPy functions. For this calculation the peak memory load is 6Gb, which can be run comfortably on most workstations. A simpler calculation, by co-adding over 50, takes 3 seconds with a peak memory usage of 600 Mb. This can be performed easily on a laptop. In developing the package we have modularised the code so that we will be able to extend functionality in future releases, through the inclusion of more modules, without it affecting the users application programming interface. We distribute the PynPoint package through the central PyPi sever, and the documentation is available online (http://pynpoint.ethz.ch).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.