T.T. Osugi scite author profile

Valid data are required to make climate assessments and to make climate-related decisions. The objective of this paper is threefold: to introduce an explicit treatment of Type I and Type II errors in evaluating the performance of quality assurance procedures, to illustrate a quality control approach that allows tailoring to regions and subregions, and to introduce a new spatial regression test. Threshold testing, step change, persistence, and spatial regression were included in a test of three decades of temperature and precipitation data at six weather stations representing different climate regimes. The magnitude of thresholds was addressed in terms of the climatic variability, and multiple thresholds were tested to determine the number of Type I errors generated. In a separate test, random errors were seeded into the data and the performance of the tests was such that most Type II errors were made in the range of ±1°C for temperature, not too different from the sensor field accuracy. The study underscores the fact that precipitation is more difficult to quality control than temperature. The new spatial regression test presented in this document outperformed all the other tests, which together identified only a few errors beyond those identified by the spatial regression test.
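The threshold test and the two error types described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's procedure: the threshold is taken as a multiple `f` of the sample standard deviation around the sample mean, the temperature series is invented, and the helper names `threshold_flags` and `count_errors` are hypothetical. A Type I error is a valid value that gets flagged; a Type II error is a seeded bad value that passes unflagged.

```python
from statistics import mean, stdev


def threshold_flags(values, f):
    """Flag values farther than f sample standard deviations from the mean.

    Assumption: mean and spread come from the series itself; a real system
    would more likely use a station's long-term climatology.
    """
    mu = mean(values)
    sigma = stdev(values)
    return [abs(v - mu) > f * sigma for v in values]


def count_errors(flags, is_bad):
    """Return (Type I, Type II) counts.

    Type I: a good value was flagged. Type II: a bad value was not flagged.
    """
    type1 = sum(1 for fl, bad in zip(flags, is_bad) if fl and not bad)
    type2 = sum(1 for fl, bad in zip(flags, is_bad) if not fl and bad)
    return type1, type2


# Invented daily temperatures in degrees C (illustrative only).
clean = [20.0, 21.0, 19.5, 20.5, 21.5, 20.0, 19.0, 20.5, 21.0, 20.0]

# Seed two errors: a large one (+8 C) and a small one (+1 C).
seeded = list(clean)
seeded[5] += 8.0
seeded[9] += 1.0
is_bad = [i in (5, 9) for i in range(len(seeded))]

# Try several threshold multipliers and report the error counts for each.
for f in (1.0, 2.0, 3.0):
    flags = threshold_flags(seeded, f)
    print(f, count_errors(flags, is_bad))
```

In this toy series the +1 °C seeded error is never flagged at any of the three multipliers, which mirrors the abstract's observation that missed errors cluster at small magnitudes. Tightening `f` trades Type II errors for Type I errors, which is why several thresholds are tried.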


SVM-based generalized multiple-instance learning via approximate box counting

Qing, Scott, Vinodchandran, et al. 2004


The multiple-instance learning (MIL) model has been very successful in application areas such as drug discovery and content-based image retrieval. Recently, a generalization of this model and an algorithm for this generalization were introduced, showing significant advantages over the conventional MIL model in certain application areas. Unfortunately, this algorithm is inherently inefficient, preventing scaling to high dimensions. We reformulate this algorithm using a kernel for a support vector machine, reducing its time complexity from exponential to polynomial. Computing the kernel is equivalent to counting the number of axis-parallel boxes in a discrete, bounded space that contain at least one point from each of two multisets P and Q. We show that this problem is #P-complete, but then give a fully polynomial randomized approximation scheme (FPRAS) for it. Finally, we empirically evaluate our kernel.
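The counting problem named in this abstract can be made concrete with a brute-force sketch. This is not the paper's algorithm or its FPRAS: it simply enumerates every axis-parallel box in a small grid, under the assumption that the discrete, bounded space is `{0, ..., size-1}^dim` and that a box is given by an inclusive interval per axis. The function name `count_boxes` is hypothetical.

```python
from itertools import combinations_with_replacement, product


def count_boxes(P, Q, size, dim):
    """Count axis-parallel boxes in {0..size-1}^dim that contain at least
    one point of P and at least one point of Q (exhaustive enumeration)."""
    # Every inclusive interval (lo, hi) with lo <= hi along a single axis.
    intervals = list(combinations_with_replacement(range(size), 2))

    def inside(point, box):
        # A point is in the box if each coordinate lies in that axis interval.
        return all(lo <= x <= hi for x, (lo, hi) in zip(point, box))

    count = 0
    # A box is one interval per axis; enumerate all combinations.
    for box in product(intervals, repeat=dim):
        if any(inside(p, box) for p in P) and any(inside(q, box) for q in Q):
            count += 1
    return count


# Two single-point multisets at opposite corners of a 2 x 2 grid:
# only the box covering the whole grid contains both points.
print(count_boxes([(0, 0)], [(1, 1)], size=2, dim=2))
```

The number of candidate boxes is `(size * (size + 1) / 2) ** dim`, so this enumeration grows exponentially with the dimension. That cost is the reason an approximation scheme is attractive for this quantity.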


An extended kernel for generalized multiple-instance learning

Qing, Scott, Vinodchandran, et al.



T.T. Osugi

Balancing Exploration and Exploitation: A New Algorithm for Active Machine Learning

Performance of Quality Assurance Procedures for an Applied Climate Information System
