We present a catalog of 1,172,157 quasar candidates selected from the photometric imaging data of the Sloan Digital Sky Survey (SDSS). The objects are all point sources to a limiting magnitude of i = 21.3 from 8417 deg² of imaging from SDSS Data Release 6 (DR6). This sample extends our previous catalog by using the latest SDSS public release data and probing both UV-excess and high-redshift quasars. While the addition of high-redshift candidates reduces the overall efficiency (quasars:quasar candidates) of the catalog to ∼80%, it is expected to contain no fewer than 850,000 bona fide quasars, ∼8 times the number of our previous sample and ∼10 times the size of the largest spectroscopic quasar catalog. Cross-matching between our photometric catalog and spectroscopic quasar catalogs from both the SDSS and 2dF Surveys yields 88,879 spectroscopically confirmed quasars. For judicious selection of the most robust UV-excess sources (∼500,000 objects in all), the efficiency is nearly 97%, more than sufficient for detailed statistical analyses. The catalog's completeness to type 1 (broad-line) quasars is expected to be no worse than 70%, with most missing objects occurring at z < 0.7 and 2.5 < z < 3.0. In addition to classification information, we provide photometric redshift estimates (typically good to Δz ± 0.3 [2σ]) and cross-matching with radio, X-ray, and proper motion catalogs. Finally, we consider the catalog's utility for determining the optical luminosity function of quasars and are able to confirm the flattening of the bright-end slope of the quasar luminosity function at z ∼ 4 as compared to z ∼ 2.
We present new measurements of the quasar angular autocorrelation function from a sample of ∼80,000 photometrically classified quasars taken from the First Data Release of the Sloan Digital Sky Survey. We find a best-fit model of $\omega(\theta) = (0.066^{+0.026}_{-0.024})\,\theta^{-(0.98 \pm 0.15)}$ for the angular correlation function, consistent with estimates of the slope from spectroscopic quasar surveys. We show that only models with little or no evolution in the clustering of quasars in comoving coordinates since a median redshift of z ∼ 1.4 can recover a scale length consistent with local galaxies and active galactic nuclei (AGNs). A model with little evolution of quasar clustering in comoving coordinates is best explained in the current cosmological paradigm by rapid evolution in quasar bias. We show that quasar biasing must have changed from $b_Q \sim 3$ at a (photometric) redshift of $z_{\rm phot} = 2.2$ to $b_Q \sim 1.2$-$1.3$ by $z_{\rm phot} = 0.75$. Such a rapid increase with redshift in biasing implies that quasars at z ∼ 2 cannot be the progenitors of modern L* objects; rather they must now reside in dense environments, such as clusters. Similarly, the duration of the UVX (ultraviolet-excess) quasar phase must be short enough to explain why local UVX quasars reside in essentially unbiased structures. Our estimates of $b_Q$ are in good agreement with recent spectroscopic results, which demonstrate that the implied evolution in $b_Q$ is consistent with quasars inhabiting halos of similar mass at every redshift. Treating quasar clustering as a bivariate function of both redshift and luminosity, we find no evidence for luminosity dependence in quasar clustering, and that redshift evolution thus affects quasar clustering more than changes in quasars' luminosity. Our results are robust against a range of systematic uncertainties. We provide a new method for quantifying stellar contamination in photometrically classified quasar catalogs via the correlation function.
Subject headings: cosmology: observations - large-scale structure of universe - quasars: general - surveys
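The power-law form quoted in the abstract above, an amplitude times the angular separation raised to a negative slope, is commonly fitted as a straight line in log-log space. The following is a minimal sketch of that idea only, not the authors' fitting procedure: the function name `fit_power_law` is hypothetical, the data are synthetic and noise-free, and the units of the angular separation are an assumption because the abstract does not state them.

```python
import numpy as np

def fit_power_law(theta, omega):
    """Fit omega = A * theta**(-delta) by least squares in log-log space.

    Returns (A, delta). Assumes every omega value is positive.
    """
    # log(omega) = log(A) - delta * log(theta) is linear in log(theta).
    slope, intercept = np.polyfit(np.log(theta), np.log(omega), 1)
    return np.exp(intercept), -slope

# Synthetic, noise-free points generated from the quoted best-fit values.
theta = np.logspace(-1, 1, 20)      # separations; units are an assumption
omega = 0.066 * theta ** -0.98
amplitude, delta = fit_power_law(theta, omega)
```

With real measurements the points carry correlated errors, so an unweighted log-log fit like this one would only be a first look.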
We present evidence of a large-angle correlation between the cosmic microwave background measured by WMAP and a catalog of photometrically detected quasars from the SDSS. The observed cross-correlation is 0.30 ± 0.14 μK at zero lag, with a shape consistent with that expected for correlations arising from the integrated Sachs-Wolfe (ISW) effect. The photometric redshifts of the quasars are centered at z ∼ 1.5, making this the deepest survey in which such a correlation has been observed. Assuming this correlation is due to the ISW effect, this constitutes the earliest evidence yet for dark energy, and it can be used to constrain exotic dark energy models.
The problem of efficiently finding the best match for a query in a given set with respect to the Euclidean distance or the cosine similarity has been extensively studied in the literature. However, a closely related problem of efficiently finding the best match with respect to the inner product has never been explored in the general setting, to the best of our knowledge. In this paper we consider this general problem and contrast it with the existing best-match algorithms. First, we propose a general branch-and-bound algorithm using a tree data structure. Subsequently, we present a dual-tree algorithm for the case where there are multiple queries. Finally, we present a new data structure for increasing the efficiency of the dual-tree algorithm. These branch-and-bound algorithms involve novel bounds suited for the purpose of best-matching with inner products. We evaluate our proposed algorithms on a variety of data sets from various applications, and exhibit up to five orders of magnitude improvement in query time over the naive search technique.
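The single-query branch-and-bound idea can be sketched with a simple ball tree. This is an illustration under stated assumptions, not the paper's implementation: the class `Ball`, the functions `upper_bound` and `search`, and the median split along the widest coordinate are all hypothetical choices made here. The pruning rule relies only on the Cauchy-Schwarz inequality: for any point p inside a ball with center c and radius R, the inner product of q and p is at most q·c + R‖q‖.

```python
import numpy as np

class Ball:
    """Ball-tree node: a bounding ball over the points listed in `indices`."""

    def __init__(self, points, indices, leaf_size=8):
        self.indices = indices
        pts = points[indices]
        self.center = pts.mean(axis=0)
        self.radius = np.linalg.norm(pts - self.center, axis=1).max()
        self.left = self.right = None
        if len(indices) > leaf_size:
            # Assumed split rule: median along the coordinate of largest spread.
            dim = np.argmax(pts.max(axis=0) - pts.min(axis=0))
            order = indices[np.argsort(pts[:, dim])]
            mid = len(order) // 2
            self.left = Ball(points, order[:mid], leaf_size)
            self.right = Ball(points, order[mid:], leaf_size)

def upper_bound(q, ball):
    """Largest inner product with q that any point in the ball could reach."""
    return q @ ball.center + ball.radius * np.linalg.norm(q)

def search(q, node, points, best=(-np.inf, -1)):
    """Return (best inner product, index of the best point) for query q."""
    if upper_bound(q, node) <= best[0]:
        return best                      # prune: this ball cannot beat best
    if node.left is None:                # leaf: score its points directly
        scores = points[node.indices] @ q
        j = np.argmax(scores)
        if scores[j] > best[0]:
            best = (scores[j], node.indices[j])
        return best
    # Visit the more promising child first so that pruning starts early.
    children = sorted((node.left, node.right), key=lambda c: -upper_bound(q, c))
    for child in children:
        best = search(q, child, points, best)
    return best

rng = np.random.default_rng(0)
points = rng.normal(size=(500, 5))       # synthetic data for illustration
tree = Ball(points, np.arange(len(points)))
query = rng.normal(size=5)
score, index = search(query, tree, points)
```

The result can be checked against the naive search `np.argmax(points @ query)`; how much work the bound saves depends on the data and is not measured here.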
Background: The majority of ovarian cancer biomarker discovery efforts focus on the identification of proteins that can improve the predictive power of presently available diagnostic tests. We here show that metabolomics, the study of metabolic changes in biological systems, can also provide characteristic small molecule fingerprints related to this disease.
In this paper we develop density estimation trees (DETs), the natural analog of classification trees and regression trees, for the task of density estimation. We consider the estimation of a joint probability density function of a d-dimensional random vector X and define a piecewise constant estimator structured as a decision tree. The integrated squared error is minimized to learn the tree. We show that the method is nonparametric: under standard conditions of nonparametric density estimation, DETs are shown to be asymptotically consistent. In addition, being decision trees, DETs perform automatic feature selection. They empirically exhibit the interpretability, adaptability and feature selection properties of supervised decision trees while incurring a slight loss in accuracy relative to other nonparametric density estimators. Hence they might be able to avoid the curse of dimensionality if the true density is sparse in dimensions. We believe that density estimation trees provide a new tool for exploratory data analysis with unique capabilities.
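A piecewise constant tree estimator of this kind can be sketched as follows. The sketch rests on assumptions that the abstract does not spell out: each leaf holding n of the N samples in a box of volume V gets density n / (N V); each node is scored by -n² / (N² V), which is what the integrated squared error reduces to for a piecewise constant estimate once the term that does not depend on the estimate is dropped; splits are chosen greedily; and growth stops at a minimum leaf size instead of any pruning step. The names `node_error`, `build`, `density`, and `leaves` are hypothetical.

```python
import numpy as np

def node_error(n, volume, n_total):
    """ISE-based score of a node with n points in a box of the given volume."""
    return -(n ** 2) / (n_total ** 2 * volume)

def build(x, lo, hi, n_total, min_leaf=10):
    """Greedily grow a density tree over the box [lo, hi] containing x."""
    n, volume = len(x), float(np.prod(hi - lo))
    node = {"lo": lo, "hi": hi, "density": n / (n_total * volume), "split": None}
    parent_err = node_error(n, volume, n_total)
    best_gain, best = 0.0, None
    for d in range(x.shape[1]):
        vals = np.sort(x[:, d])
        # k is the number of points that would fall in the left child.
        for k in range(min_leaf, n - min_leaf + 1):
            if vals[k - 1] == vals[k]:
                continue                 # cannot separate equal values
            s = 0.5 * (vals[k - 1] + vals[k])
            vol_left = volume * (s - lo[d]) / (hi[d] - lo[d])
            gain = (parent_err
                    - node_error(k, vol_left, n_total)
                    - node_error(n - k, volume - vol_left, n_total))
            if gain > best_gain:
                best_gain, best = gain, (d, s)
    if best is not None:
        d, s = best
        mask = x[:, d] <= s
        hi_left, lo_right = hi.copy(), lo.copy()
        hi_left[d], lo_right[d] = s, s
        node["split"] = best
        node["left"] = build(x[mask], lo, hi_left, n_total, min_leaf)
        node["right"] = build(x[~mask], lo_right, hi, n_total, min_leaf)
    return node

def density(node, point):
    """Evaluate the piecewise constant estimate at a single point."""
    if np.any(point < node["lo"]) or np.any(point > node["hi"]):
        return 0.0                       # outside the root box
    while node["split"] is not None:
        d, s = node["split"]
        node = node["left"] if point[d] <= s else node["right"]
    return node["density"]

def leaves(node):
    """List the leaf nodes of the tree."""
    if node["split"] is None:
        return [node]
    return leaves(node["left"]) + leaves(node["right"])

rng = np.random.default_rng(0)
samples = rng.normal(size=(400, 2))      # synthetic data for illustration
lo, hi = samples.min(axis=0), samples.max(axis=0)
tree = build(samples, lo, hi, n_total=len(samples))
```

Because each leaf contributes n / N of probability mass, the estimate integrates to one over the root box, which makes a convenient sanity check.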
We study the AutoML problem of automatically configuring machine learning pipelines by jointly selecting algorithms and their appropriate hyper-parameters for all steps in supervised learning pipelines. This black-box (gradient-free) optimization with mixed integer & continuous variables is a challenging problem. We propose a novel AutoML scheme by leveraging the alternating direction method of multipliers (ADMM). The proposed framework is able to (i) decompose the optimization problem into easier sub-problems that have a reduced number of variables and circumvent the challenge of mixed variable categories, and (ii) incorporate black-box constraints alongside the black-box optimization objective. We empirically evaluate the flexibility (in utilizing existing AutoML techniques), effectiveness (against open source AutoML toolkits), and unique capability (of executing AutoML with practically motivated black-box constraints) of our proposed scheme on a collection of binary classification data sets from UCI ML & OpenML repositories. We observe that, on average, our framework provides significant gains in comparison to other AutoML frameworks (Auto-sklearn & TPOT), highlighting the practical advantages of this framework.
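For readers unfamiliar with ADMM, the generic variable-splitting iteration it is built on can be sketched on a toy convex problem. This is not the AutoML scheme described in the abstract, which deals with black-box objectives and mixed variables; it only shows how ADMM splits one problem into two easier sub-problems tied together by a dual variable. The toy problem, minimizing 0.5 (x - a)² + λ|z| subject to x = z, and the function names are assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of t * |.|)."""
    return max(v - t, 0.0) - max(-v - t, 0.0)

def admm_scalar_lasso(a, lam, rho=1.0, iters=200):
    """Scaled-form ADMM for: minimize 0.5*(x - a)**2 + lam*|z| s.t. x = z."""
    x = z = u = 0.0
    for _ in range(iters):
        # Sub-problem 1: smooth quadratic term only, solved in closed form.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # Sub-problem 2: non-smooth term only, solved by soft-thresholding.
        z = soft_threshold(x + u, lam / rho)
        # Dual update: accumulate the remaining disagreement between x and z.
        u += x - z
    return z

solution = admm_scalar_lasso(3.0, 1.0)
```

For this toy problem the minimizer is known in closed form as `soft_threshold(a, lam)`, so the iteration can be checked against it.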