Introduction

Statistical process control (SPC) has been an active area of research for many decades. A broad spectrum of methods have been developed, including methods for univariate SPC such as Shewhart, moving-average (MA), exponentially weighted moving-average (EWMA), and cumulative-sum (CUSUM) charts. Methods for multivariate SPC include multivariate extensions of univariate methods, and methods that monitor latent variables obtained by combining the measured variables into a lower-dimensional space. Popular methods for reducing the dimensionality of the measured data include principal-component analysis (PCA) and partial least-squares regression (PLS). Many extensions and applications of these have been developed (Kresta et al., 1991; Ku et al., 1995; MacGregor, 1994).

Correspondence concerning this article should be addressed to B. R. Bakshi. Current addresses of: H. B. Aradhye, SRI International, Menlo Park, CA; R. A. Strauss, ExxonMobil, Fairfax, VA; J. F. Davis, University of California, Los Angeles, CA.

Most existing univariate and multivariate SPC methods operate at a fixed scale, and are best for detecting changes at a single scale. For example, Shewhart charts analyze the raw measurements at the scale of the sampling interval, that is, the finest scale, and are best for detecting large, localized changes. In contrast, MA, EWMA, and CUSUM charts inherently filter the data and therefore process measurements at a coarser scale. They are best for detecting small shifts or features at coarse scales. Tuning parameters such as window length or filter constant determine the scale at which the measurements are represented.

In contrast to the single-scale nature of SPC methods, data from most practical processes are inherently multiscale due to events occurring with different localizations in time, space, and frequency. A typical example of such data from a petrochemical process is shown in Figure 1.
Figure 1a shows data during normal operation, while Figure 1b represents unusual operation due to a drier cooling event. In Figure 1b, the process change at approximately 150 time units is at a very fine scale and localized in time, but spans a wide range of frequencies. The steady portions of the signal are at coarse scales and span a wide temporal range. Finally, the change between 425 and 675 time units consists of a small sharp change followed by a short steady section and a slow ramp at an intermediate scale.

Ideally, techniques for detecting changes at different scales, such as those shown in Figure 1b, should adapt automatically to the scale of the features. In response to this need, many heuristic or ad hoc techniques have been proposed for overcoming the single-scale nature of SPC charts. These include the Western Electric rules (Western Electric, 1956), useful for identifying patterns in data, and combined Shewhart and CUSUM charts (Lucas, 1982) for identifying large and small shifts. Other methods, such as CUSCORE charts (Box and Ramirez, 1992), may be specially designed to detect abnormal feat...
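The contrast drawn above between Shewhart charts (finest scale, best for large localized changes) and EWMA charts (filtered, best for small sustained shifts) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the synthetic signals, the filter constant `lam = 0.2`, and the limit width `L = 3` are all illustrative assumptions, and the in-control mean and standard deviation are assumed to be known.

```python
import math


def shewhart_alarms(x, mu, sigma, L=3.0):
    """Indices where a raw measurement falls outside mu +/- L*sigma."""
    return [i for i, v in enumerate(x) if abs(v - mu) > L * sigma]


def ewma_alarms(x, mu, sigma, lam=0.2, L=3.0):
    """Indices where the EWMA statistic leaves its time-varying limits.

    lam is the filter constant (assumed value); smaller lam means heavier
    smoothing, i.e. the data are viewed at a coarser scale.
    """
    alarms = []
    z = mu  # EWMA statistic initialised at the in-control mean
    for i, v in enumerate(x):
        z = lam * v + (1.0 - lam) * z
        # Variance of the EWMA statistic after i + 1 independent samples
        var = sigma ** 2 * (lam / (2.0 - lam)) * (1.0 - (1.0 - lam) ** (2 * (i + 1)))
        if abs(z - mu) > L * math.sqrt(var):
            alarms.append(i)
    return alarms


# Synthetic, noise-free signals (assumptions made purely for illustration).
# 1) One large, localized spike at sample 20.
spike = [0.0] * 100
spike[20] = 4.0
# 2) A small sustained mean shift of 1.5 sigma starting at sample 50.
shift = [0.0] * 50 + [1.5] * 50

print("spike  Shewhart:", shewhart_alarms(spike, mu=0.0, sigma=1.0))
print("spike  EWMA    :", ewma_alarms(spike, mu=0.0, sigma=1.0))
print("shift  Shewhart:", shewhart_alarms(shift, mu=0.0, sigma=1.0))
print("shift  EWMA    :", ewma_alarms(shift, mu=0.0, sigma=1.0)[:1])
```

With these signals the Shewhart chart flags only the spike and the EWMA chart flags only the sustained shift, a few samples after it begins. Each chart is tuned to one scale, which is the limitation the text describes.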
We present a system that automatically recommends tags for YouTube videos solely based on their audiovisual content. We also propose a novel framework for unsupervised discovery of video categories that exploits knowledge mined from World Wide Web text documents and searches. First, the association between video content and tags is learned by training classifiers that map audiovisual content-based features from millions of videos on YouTube.com to existing uploader-supplied tags for these videos. When a new video is uploaded, the labels provided by these classifiers are used to automatically suggest tags deemed relevant to the video. Our system has learned a vocabulary of over 20,000 tags. Second, we mined large volumes of Web pages and search queries to discover a set of possible text entity categories and a set of associated is-A relationships that map individual text entities to categories. Finally, we apply these is-A relationships mined from web text to the tags learned from audiovisual content of videos to automatically synthesize a reliable set of categories most relevant to videos, along with a mechanism to predict these categories for new uploads. We then present rigorous rating studies that establish that: (a) the average relevance of tags automatically recommended by our system matches the average relevance of the uploader-supplied tags at the same or better coverage and (b) the average precision@K of video categories discovered by our system is 70% with K=5.
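For readers unfamiliar with the metric quoted in (b), the following Python sketch shows one common definition of precision@K and its average over videos. The abstract does not state how the rating studies aggregated judgments, so this definition, the function names, and the example data are assumptions made for illustration only; the numbers are invented and are not results from the paper.

```python
from statistics import mean


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def mean_precision_at_k(predictions, judgments, k):
    """Average precision@k over all videos that have predictions."""
    return mean(
        precision_at_k(ranked, judgments.get(video, set()), k)
        for video, ranked in predictions.items()
    )


# Hypothetical predicted categories (best first) and rater judgments.
predictions = {
    "video_1": ["music", "concert", "guitar", "sports", "cooking"],
    "video_2": ["soccer", "sports", "goal", "music", "news"],
}
judgments = {
    "video_1": {"music", "concert"},
    "video_2": {"soccer", "sports", "goal", "news"},
}

for video, ranked in predictions.items():
    print(video, precision_at_k(ranked, judgments[video], k=5))
print("mean precision@5:", mean_precision_at_k(predictions, judgments, k=5))
```

Dividing by `k` rather than by the number of items returned penalizes a system that proposes fewer than K categories; other conventions exist, and the source does not say which one was used.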
This paper discusses a new method for automatic discovery and organization of descriptive concepts (labels) within large real-world corpora of user-uploaded multimedia, such as YouTube.com. Conversely, it also provides validation of existing labels, if any. During training, our method does not assume any explicit manual annotation other than the weak labels already available in the form of video title, description, and tags. Prior work on such auto-annotation assumed that a vocabulary of labels of interest (e.g., indoor, outdoor, city, landscape) is specified a priori. In contrast, the proposed method begins with an empty vocabulary. It analyzes audiovisual features of 25 million YouTube.com videos (nearly 150 years of video data), effectively searching for consistent correlation between these features and text metadata. It autonomously extends the label vocabulary as and when it discovers concepts it can reliably identify, eventually leading to a vocabulary of thousands of labels that continues to grow. We believe that this work significantly extends the state of the art in multimedia data mining, discovery, and organization, based on the technical merit of the proposed ideas as well as the enormous scale of the mining exercise in a very challenging, unconstrained, noisy domain.