Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal of finding fast algorithms for this task. We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. We test our algorithm on real high-dimensional data sets with millions of examples and show that the near linear scaling holds over several orders of magnitude. Our average case analysis suggests that much of the efficiency is because the time to process non-outliers, which are the majority of examples, does not depend on the size of the data set.
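The nested-loop idea described above can be sketched in a few lines. The following is a minimal illustration of the pruning rule, not the authors' exact implementation: an example is abandoned as soon as its distance to its k-th nearest neighbor found so far falls below the score of the weakest current top-n outlier, since it can no longer qualify. All parameter values and the Euclidean distance are assumptions for the sketch.

```python
import heapq
import random

def top_outliers(data, k=3, n=2):
    """Nested-loop distance-based outlier search with a simple pruning rule
    (a sketch of the idea, not the paper's exact implementation)."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    random.shuffle(data)   # random order is what makes pruning effective
    cutoff = 0.0           # score of the weakest current top-n outlier
    top = []               # min-heap of (score, point)
    for i, x in enumerate(data):
        neighbors = []     # max-heap (negated) of the k nearest distances so far
        pruned = False
        for j, y in enumerate(data):
            if i == j:
                continue
            d = dist(x, y)
            if len(neighbors) < k:
                heapq.heappush(neighbors, -d)
            elif d < -neighbors[0]:
                heapq.heapreplace(neighbors, -d)
            if len(neighbors) == k and -neighbors[0] < cutoff:
                pruned = True   # cannot beat the weakest top outlier: stop early
                break
        if not pruned and len(neighbors) == k:
            score = -neighbors[0]        # distance to k-th nearest neighbor
            heapq.heappush(top, (score, x))
            if len(top) > n:
                heapq.heappop(top)
            if len(top) == n:
                cutoff = top[0][0]
    return sorted(top, reverse=True)     # (score, point), strongest first
```

Because most examples are inliers, their inner loops usually terminate after a few distance computations once the cutoff is established, which is the intuition behind the near linear scaling the abstract reports.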
Integrated Systems Health Management includes fault detection, fault diagnosis (or fault isolation), and fault prognosis. We define prognosis to be detecting the precursors of a failure, and predicting how much time remains before a likely failure. Algorithms that use the data-driven approach to prognosis learn models directly from the data, rather than using a hand-built model based on human expertise. This paper surveys past work in the data-driven approach to prognosis. It also includes related work in data-driven fault detection and diagnosis, and in model-based diagnosis and prognosis, particularly as applied to space systems.
Modern space propulsion and exploration system designs are becoming increasingly sophisticated and complex. Determining the health state of these systems using traditional methods is becoming more difficult as the number of sensors and component interactions grows. Data-driven monitoring techniques have been developed to address these issues by analyzing system operations data to automatically characterize normal system behavior. The Inductive Monitoring System (IMS) is a data-driven system health monitoring software tool that has been successfully applied to several aerospace applications. IMS uses a data mining technique called clustering to analyze archived system data and characterize normal interactions between parameters. This characterization, or model, of nominal operation is stored in a knowledge base that can be used for real-time system monitoring or for analysis of archived events. Ongoing and developing IMS space operations applications include International Space Station flight control, satellite vehicle system health management, launch vehicle ground operations, and fleet supportability. As a common thread of discussion this paper will employ the evolution of the IMS data-driven technique as related to several Integrated Systems Health Management (ISHM) elements. Thematically, the projects listed will be used as case studies. The maturation of IMS via projects where it has been deployed, or is currently being integrated to aid in fault detection will be described. The paper will also explain how IMS can be used to complement a suite of other ISHM tools, providing initial fault detection support for diagnosis and recovery.
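The cluster-then-monitor pattern described above can be illustrated with a small sketch. This is a deliberate simplification, not the actual IMS algorithm: nominal parameter vectors are greedily grouped into axis-aligned clusters (per-parameter bounds), and monitoring reports a new vector's distance to the nearest cluster, with zero meaning the vector falls inside known-normal behavior. The `eps` merging tolerance and all data values are assumptions for the sketch.

```python
def build_knowledge_base(nominal, eps=0.5):
    """Sketch of an IMS-style knowledge base (a simplification, not the
    actual IMS clustering algorithm): greedily group nominal parameter
    vectors into clusters stored as per-parameter (low, high) bounds."""
    boxes = []
    for x in nominal:
        for i, (lo, hi) in enumerate(boxes):
            # grow an existing cluster if the vector lies within tolerance
            if all(l - eps <= v <= h + eps for v, l, h in zip(x, lo, hi)):
                boxes[i] = ([min(l, v) for l, v in zip(lo, x)],
                            [max(h, v) for h, v in zip(hi, x)])
                break
        else:
            boxes.append((list(x), list(x)))  # start a new cluster
    return boxes

def deviation(x, boxes):
    """Monitoring: distance from a new vector to the nearest nominal
    cluster; 0.0 means the vector matches known-normal behavior."""
    best = float("inf")
    for lo, hi in boxes:
        gap = sum(max(0.0, l - v, v - h) ** 2 for v, l, h in zip(x, lo, hi))
        best = min(best, gap ** 0.5)
    return best
```

In this framing, the knowledge base built offline from archived nominal data is what enables both real-time monitoring and after-the-fact analysis of archived events.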
This paper describes the initial results of applying two machine-learning-based unsupervised anomaly detection algorithms, Orca and GritBot, to data from two rocket propulsion testbeds. The first testbed uses historical data from the Space Shuttle Main Engine. The second testbed uses data from an experimental rocket engine test stand located at NASA Stennis Space Center. The paper describes four candidate anomalies detected by the two algorithms.
Gradient-based numerical optimization of complex engineering designs offers the promise of rapidly producing better designs. However, such methods generally assume that the objective function and constraint functions are continuous, smooth, and defined everywhere. Unfortunately, realistic simulators tend to violate these assumptions, making optimization unreliable. Several decisions that need to be made in setting up an optimization, such as the choice of a starting prototype and the choice of a formulation of the search space, can make a difference in the reliability of the optimization. Machine learning can improve gradient-based methods by making these choices based on the results of previous optimizations. This paper demonstrates this idea by using machine learning for four parts of the optimization setup problem: selecting a starting prototype from a database of prototypes, synthesizing a new starting prototype, predicting which design goals are achievable, and selecting a formulation of the search space. We use standard tree-induction algorithms (C4.5 and CART). We present results in two realistic engineering domains: racing yachts and supersonic aircraft. Our experimental results show that using inductive learning to make setup decisions improves both the speed and the reliability of design optimization.
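The "learn setup decisions from past optimizations" idea can be sketched with a toy tree inducer. The following is a minimal CART-style decision stump, not C4.5 or CART themselves, and the feature names, values, and prototype labels are all hypothetical: past setups are described by design-goal features, labeled with the starting prototype that led to a successful optimization, and the induced split then recommends a prototype for new goals.

```python
def induce_stump(X, y):
    """Minimal CART-style decision stump (a sketch, not C4.5/CART):
    pick the single feature/threshold split that best separates labels,
    measured by misclassification count under majority-label prediction."""
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            err = sum(lab != max(set(side), key=side.count)
                      for side in (left, right) for lab in side)
            if best is None or err < best[0]:
                best = (err, f, t,
                        max(set(left), key=left.count),
                        max(set(right), key=right.count))
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label

# Toy setup history (hypothetical values): features are design goals
# [target_speed, target_weight]; the label is the starting prototype
# that led to a successful optimization in a past run.
X = [[30, 5000], [32, 5200], [28, 4800],   # prototype "A" succeeded
     [45, 3000], [47, 3100], [44, 2900]]   # prototype "B" succeeded
y = ["A", "A", "A", "B", "B", "B"]
recommend = induce_stump(X, y)
```

Given new design goals, `recommend([31, 5100])` falls among the "A" cases, so the induced split recommends prototype "A" as the starting point rather than leaving the choice to the engineer.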