Abstract: With the advent and proliferation of digital cameras and computers, the number of digital photos created and stored by consumers has grown extremely large. This created increasing demand for image retrieval systems to ease interaction between consumers and personal media content. Active learning is a widely used user interaction model for retrieval systems, which learns the query concept by asking users to label a number of images at each iteration. In this paper, we study sampling strategies for active learni…
“…Considering the Corel database, we find in our experiments that only 7% of the images generates errors for all strategies. Indeed, theoretically it should be possible to decrease considerably the error rate, maybe by alternating or combining strategies like in [16], or by defining other strategies.…”
Organising a collection of images requires intensive and time-consuming human effort. We present here a framework to dynamically classify collections of images without a priori content knowledge. Our work is based on active learning techniques: unlabeled samples are selected iteratively one by one, and a knn-evidential classifier proposes a label at each step. Users can initialize, remove or merge classes and may correct the proposals. The Transferable Belief Model framework offers us a complete formal model to express jointly the classifier and different sampling strategies such as positivity, ambiguity and diversity. Our aims are to study these different sampling strategies in order to minimize the error rates as well as the user's cognitive load according to the distribution of the effort over time.
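The one-by-one selection loop in the abstract above can be illustrated with a small sketch of an ambiguity strategy. This is not the authors' method: it assumes plain k-NN vote fractions in place of the evidential k-NN classifier and the Transferable Belief Model, and it assumes ambiguity is measured as the margin between the two best class scores. The function names and toy data are hypothetical.

```python
import numpy as np

def knn_class_scores(X_labeled, y_labeled, X_pool, k=3, n_classes=2):
    """Plain k-NN vote fractions per class (a stand-in, not an evidential k-NN)."""
    # Pairwise Euclidean distances, shape (n_pool, n_labeled).
    dist = np.linalg.norm(X_pool[:, None, :] - X_labeled[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = y_labeled[nearest]  # labels of the k nearest neighbours
    return np.stack([(votes == c).mean(axis=1) for c in range(n_classes)], axis=1)

def most_ambiguous(scores):
    """Index of the pool sample whose two best class scores are closest."""
    top_two = np.sort(scores, axis=1)[:, -2:]
    margin = top_two[:, 1] - top_two[:, 0]
    return int(np.argmin(margin))

# Toy one-dimensional data: class 0 on the left, class 1 on the right.
X_labeled = np.array([[0.0], [1.0], [3.0], [6.0], [9.0], [10.0]])
y_labeled = np.array([0, 0, 0, 1, 1, 1])
X_pool = np.array([[0.5], [4.0], [9.5]])

scores = knn_class_scores(X_labeled, y_labeled, X_pool, k=3)
query_index = most_ambiguous(scores)  # sample the user would be asked to label next
```

In this toy pool the sample lying between the two classes is the one picked for labeling; a positivity or diversity strategy would replace `most_ambiguous` with a different ranking function.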
“…Cohn et al have demonstrated its usefulness in theory [8]. Wu et al define a representativeness measure for each sample according to its distance to nearby samples, and take it as a criterion of sample selection [18]. Zhang et al estimate data distribution p(x) by Kernel Density Estimation (KDE), and then take it into account in sample selection [13,20].…”
Active learning methods have been widely applied to reduce human labeling effort in multimedia annotation tasks. However, in traditional methods multiple concepts are usually sequentially annotated, i.e., each concept is exhaustively annotated before proceeding to the next, without taking the learnabilities of different concepts into consideration. Furthermore, in most of these methods only a single modality is applied. This paper presents a novel multi-concept multi-modality active learning method which exchangeably annotates multiple concepts in the context of multi-modality. It iteratively selects a concept and a batch of unlabeled samples, and then these samples are annotated with the selected concept. After that, graph-based semi-supervised learning is conducted on each modality for the selected concept. The proposed method takes into account both the learnabilities of different concepts and the potentials of different modalities. Experimental results on the TRECVID 2005 benchmark have demonstrated its effectiveness and efficiency.
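The citation statement preceding this abstract mentions estimating the data distribution p(x) by Kernel Density Estimation and taking it into account in sample selection. A minimal sketch of that idea follows, assuming a Gaussian kernel with a fixed bandwidth and assuming the density simply multiplies an uncertainty score; neither choice is confirmed by the source, and the function names and numbers are hypothetical.

```python
import numpy as np

def gaussian_kde_density(X, bandwidth=1.0):
    """Gaussian KDE estimate of p(x) at every row of X, with all rows as kernel centres."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    dim = X.shape[1]
    norm = (2.0 * np.pi * bandwidth ** 2) ** (dim / 2.0)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1) / norm

def density_weighted_choice(uncertainty, density):
    """Pick the sample maximising uncertainty * p(x) (assumed combination rule)."""
    return int(np.argmax(uncertainty * density))

# Three clustered pool samples and one isolated, highly uncertain outlier.
X_pool = np.array([[0.0], [0.1], [0.2], [5.0]])
uncertainty = np.array([0.6, 0.7, 0.6, 0.9])  # hypothetical classifier uncertainty

density = gaussian_kde_density(X_pool, bandwidth=0.5)
chosen = density_weighted_choice(uncertainty, density)
uncertainty_only = int(np.argmax(uncertainty))
```

Here uncertainty alone would select the isolated outlier, while the density-weighted score prefers an uncertain sample from the dense region, which is the motivation for using p(x) at all.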
“…Thus, in practice most active learning methods empirically adopt closest-to-boundary criterion to choose the most uncertain samples [23,24]. Zhang et al and Wu et al further proposed to incorporate the density distribution of samples into the sample selection process [29,32]. Brinker et al [4] pointed out that the selected samples should be diverse, especially when the active learning method works in a batch mode, i.e., in each round a batch of samples rather than an individual sample is selected.…”
Section: Related Work (mentioning)
confidence: 99%
“…(1) we can also see the impact of p(x). Wu et al define a representativeness measure for each sample according to its distance to nearby samples, and take it as a criterion of sample selection [29]. Zhang et al estimate data distribution p(x) by Kernel Density Estimation (KDE), and then use it in sample selection [17,32].…”
Active learning has been demonstrated to be an effective approach to reducing human labeling effort in multimedia annotation tasks. However, most of the existing active learning methods for video annotation are studied in a relatively simple context where concepts are sequentially annotated with fixed effort and only a single modality is applied. In practice, we usually have to deal with multiple modalities, and sequentially annotating concepts without preference cannot suitably assign annotation effort. To address these two issues, in this paper we propose a multi-concept multi-modality active learning method for video annotation in which multiple concepts and multiple modalities can be simultaneously taken into consideration. In each round of active learning, this method selects the concept that is expected to get the highest performance gain and a batch of suitable samples to be annotated for this concept. Then, graph-based semi-supervised learning is conducted on each modality for the selected concept. The proposed method is able to sufficiently exploit the human effort by considering both the learnabilities of different concepts and the potentials of different modalities. Experimental results on the TRECVID 2005 benchmark have demonstrated its effectiveness and efficiency.
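The Related Work statement quoted earlier contrasts the closest-to-boundary criterion with the requirement that a batch of selected samples be diverse. The sketch below shows one way the two can be combined when choosing a batch. It is an illustration under assumptions, not the method of any cited paper: the decision values are hypothetical classifier outputs, redundancy is measured by absolute cosine similarity, and the `trade_off` weight is an arbitrary choice.

```python
import numpy as np

def select_diverse_batch(X_pool, decision_values, batch_size=2, trade_off=0.5):
    """Greedily pick samples that are near the boundary and unlike those already picked."""
    norms = np.linalg.norm(X_pool, axis=1, keepdims=True)
    unit = X_pool / np.clip(norms, 1e-12, None)
    cos = np.abs(unit @ unit.T)  # pairwise absolute cosine similarity
    selected = []
    candidates = list(range(len(X_pool)))
    while candidates and len(selected) < batch_size:
        best, best_score = None, np.inf
        for i in candidates:
            # Similarity to the closest already-selected sample (0 for the first pick).
            redundancy = max((cos[i, j] for j in selected), default=0.0)
            score = trade_off * abs(decision_values[i]) + (1 - trade_off) * redundancy
            if score < best_score:
                best, best_score = i, score
        selected.append(best)
        candidates.remove(best)
    return selected

X_pool = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [1.0, 1.0]])
decision_values = np.array([0.05, 0.06, 0.3, 0.9])  # hypothetical distances to the boundary

batch = select_diverse_batch(X_pool, decision_values, batch_size=2)
closest_only = [int(i) for i in np.argsort(np.abs(decision_values))[:2]]
```

With these toy values, ranking by boundary distance alone returns two nearly identical samples, whereas the greedy rule swaps the second one for a sample pointing in a different direction.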