One of the main challenges in interactive concept-based video search is the scarcity of relevant samples, especially for queries with complex semantics. To address this problem, we propose to use "related samples" to learn complex queries. Related samples are video segments that are irrelevant to the query but relevant to some of the query's related concepts. Unlike relevant samples, which may be rare, related samples are usually plentiful and easy to find in the search result list. Specifically, we learn a detector for the query by simultaneously leveraging the related concept detectors and user feedback, which comprises relevant, irrelevant, and related samples. The query detector is then used to predict the presence of the query in new video segments, and new search results are ranked according to the predicted query presence. Furthermore, our approach builds on an incremental learning technique, so the query detector can be updated efficiently in each feedback iteration. We conduct experiments on two real-world video datasets: TRECVID 2008 and a YouTube dataset. The experimental results demonstrate the effectiveness and efficiency of the proposed approach.
Background: It is commonly believed that including domain knowledge in a prediction model is desirable. However, representing and incorporating domain information in the learning process is, in general, a challenging problem. In this research, we consider domain information encoded by discrete or categorical attributes. A discrete or categorical attribute provides a natural partition of the problem domain, and hence divides the original problem into several non-overlapping sub-problems. In this sense, the domain information is useful if the partition simplifies the learning task. The goal of this research is to develop an algorithm to identify discrete or categorical attributes that maximally simplify the learning task.
Results: We consider restructuring a supervised learning problem via a partition of the problem space using a discrete or categorical attribute. A naive approach exhaustively searches all the possible restructured problems. It is computationally prohibitive when the number of discrete or categorical attributes is large. We propose a metric to rank attributes according to their potential to reduce the uncertainty of a classification task. It is quantified as a conditional entropy achieved using a set of optimal classifiers, each of which is built for a sub-problem defined by the attribute under consideration. To avoid high computational cost, we approximate the solution by the expected minimum conditional entropy with respect to random projections. This approach is tested on three artificial data sets, three cheminformatics data sets, and two leukemia gene expression data sets. Empirical results demonstrate that our method is capable of selecting a proper discrete or categorical attribute to simplify the problem, i.e., the performance of the classifier built for the restructured problem always beats that of the original problem.
Conclusions: The proposed conditional entropy based metric is effective in identifying good partitions of a classification problem, hence enhancing the prediction performance.
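
The attribute-ranking idea described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes that a one-dimensional threshold on a random Gaussian projection stands in for the "optimal classifier" of each sub-problem, and that an attribute's score is the size-weighted average, over its partitions, of the minimum conditional entropy averaged across projections. All function names (`entropy`, `min_split_entropy`, `expected_min_entropy`, `attribute_score`) and the parameter `n_proj` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def min_split_entropy(scores, labels):
    """Minimum of H(y | side of threshold) over all thresholds on 1-D scores.

    The unsplit entropy is the starting value, so the result never exceeds H(y).
    """
    pairs = sorted(zip(scores, labels))
    ys = [y for _, y in pairs]
    n = len(ys)
    best = entropy(ys)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal scores
        h = (i / n) * entropy(ys[:i]) + ((n - i) / n) * entropy(ys[i:])
        best = min(best, h)
    return best

def expected_min_entropy(X, y, n_proj, rng):
    """Average the minimum split entropy over random Gaussian projections."""
    if not X:
        return 0.0
    dim = len(X[0])
    total = 0.0
    for _ in range(n_proj):
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in X]
        total += min_split_entropy(scores, y)
    return total / n_proj

def attribute_score(X, y, attr, n_proj=50, seed=0):
    """Size-weighted conditional entropy after partitioning by `attr`.

    Lower scores suggest that the partition simplifies the task more.
    """
    rng = random.Random(seed)
    groups = defaultdict(lambda: ([], []))
    for x, label, a in zip(X, y, attr):
        groups[a][0].append(x)
        groups[a][1].append(label)
    n = len(y)
    return sum(
        (len(gy) / n) * expected_min_entropy(gx, gy, n_proj, rng)
        for gx, gy in groups.values()
    )

# XOR-like toy data: the label depends on whether the two signs differ.
grid = (-2, -1, 1, 2)
X = [(a, b) for a in grid for b in grid]
y = [int((a > 0) != (b > 0)) for a, b in X]
by_sign = [a > 0 for a, _ in X]   # partition that splits XOR into easier halves
constant = [0] * len(X)           # trivial partition, i.e. the original problem

score_sign = attribute_score(X, y, by_sign)
score_const = attribute_score(X, y, constant)
```

On this toy problem, partitioning by the sign of the first coordinate should receive a lower score than the trivial partition, since each half becomes separable by a single threshold for many projection directions, whereas the unpartitioned XOR problem is not.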