Abstract-Automatic semantic concept detection in video is important for effective content-based video retrieval and mining and has gained great attention recently. In this paper, we propose a general post-filtering framework to enhance robustness and accuracy of semantic concept detection using association and temporal analysis for concept knowledge discovery. Co-occurrence of several semantic concepts could imply the presence of other concepts. We use association mining techniques to discover such inter-concept association relationships from annotations. With discovered concept association rules, we propose a strategy to combine associated concept classifiers to improve detection accuracy. In addition, because video is often visually smooth and semantically coherent, detection results from temporally adjacent shots could be used for the detection of the current shot. We propose temporal filter designs for inter-shot temporal dependency mining to further improve detection accuracy. Experiments on the TRECVID 2005 dataset show our post-filtering framework is both efficient and effective in improving the accuracy of semantic concept detection in video. Furthermore, it is easy to integrate our framework with existing classifiers to boost their performance.
Sequential pattern mining is an essential data mining technique that has been widely applied to many realworld applications. However, traditional algorithms generally suffer from the scalability problem when dealing with big data. In this paper, we aim to significantly upgrade the scale and propose Sequential PAttern Mining algorithm based on MapReduce model on the Cloud (abbreviated as SPAMC). Derived from the prior SPAM algorithm, we design an iterative MapReduce framework to efficiently generate and prune candidate patterns when constructing the lexical sequence tree. This framework not only distributes the sub-tasks of tree construction to independent mappers in parallel, but also enables the parallel processing of support counting. We conduct extensive experiments on the cloud environment of 32 virtual machines with up to 12.8 million transactional sequences. Experimental results show that SPAMC can significantly reduce mining time with big data, achieve extremely high scalability, and provide perfect load balancing on the cloud cluster.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.