Multimedia content has been growing quickly, and video retrieval is regarded as one of the most prominent problems in multimedia research. To retrieve a desired video, users express their needs as queries. Queries can refer to objects, motion, texture, color, audio, and so on. Low-level representations of video differ from the higher-level concepts that a user associates with it. Therefore, queries based on semantics are more realistic and tangible for the end user. Understanding the semantics of a query has opened new directions in video retrieval and in bridging the semantic gap. However, video needs to be manually annotated in order to support queries expressed in terms of semantic concepts. Annotating the semantic concepts that appear in video shots is a challenging and time-consuming task. Moreover, it is not possible to provide annotations for every concept in the real world. In this study, an integrated semantic-based approach to similarity computation is proposed to enhance retrieval effectiveness in concept-based video retrieval. The proposed method integrates knowledge-based and corpus-based semantic word similarity measures in order to retrieve video shots for concepts whose annotations are not available to the system. The TRECVID 2005 dataset is used for evaluation, and the results of the proposed method are compared against the individual knowledge-based and corpus-based semantic word similarity measures used in previous studies in the same domain. The integrated similarity method is shown to outperform the individual measures in terms of Mean Average Precision (MAP).
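For readers unfamiliar with the evaluation metric named above, the following is a minimal sketch of how Mean Average Precision is commonly computed over ranked retrieval results. It is not taken from the study: the function names and the boolean-list input format are illustrative assumptions, and the default normalization (dividing by the number of relevant shots actually retrieved) is one of several conventions; the official TRECVID tooling may normalize differently.

```python
from typing import Optional, Sequence


def average_precision(ranked_relevance: Sequence[bool],
                      total_relevant: Optional[int] = None) -> float:
    """Average precision for one query.

    ranked_relevance[k-1] is True when the shot at rank k is relevant.
    total_relevant is an assumed optional override for the denominator
    (e.g. all relevant shots in the collection); by default the number
    of relevant shots found in the ranking is used.
    """
    hits = 0
    precision_sum = 0.0
    for k, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / k  # precision at rank k
    denominator = total_relevant if total_relevant is not None else hits
    return precision_sum / denominator if denominator else 0.0


def mean_average_precision(runs: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not runs:
        return 0.0
    return sum(average_precision(run) for run in runs) / len(runs)


# Hypothetical rankings for two concept queries.
example_runs = [[True, False, True], [False, True]]
example_map = mean_average_precision(example_runs)
```

Here each inner list stands for the ranked shots returned for one concept query, so a higher MAP means relevant shots tend to appear earlier in the rankings.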
Problem statement: Record linkage is a technique used to detect and match duplicate records that are generated during data integration. A variety of record linkage algorithms with different steps have been developed to detect such duplicates. To decide whether two records are duplicates, different studies have used supervised and unsupervised classification techniques. To apply supervised classification algorithms without spending a lot of time labeling data manually, previous studies have proposed a two-step method that selects the training data automatically. However, the effectiveness of different classification techniques must also be taken into account in record linkage systems in order to classify records more accurately. Approach: To determine and compare the effectiveness of different supervised classification techniques when they are trained without manually labeled data, several prominent classification methods are applied to duplicate record detection. Duplicate detection and record classification are evaluated on two real-world datasets, Cora and Restaurant, using Support Vector Machines, Naïve Bayes, Decision Tree and Bayesian Networks. Results: The experimental results show that Support Vector Machines performs best on the Restaurant dataset, with an F-measure of 96.27%, whereas Naïve Bayes performs best on the Cora dataset, with an F-measure of 89.7%. Conclusion/Recommendation: The results of detecting duplicate records with different classification techniques vary depending on the dataset used. Moreover, Support Vector Machines and Naïve Bayes outperform the other methods in our experiments.
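As an illustration of the metric reported above, here is a minimal sketch of computing an F-measure for duplicate detection from sets of record pairs. It assumes the balanced F1 form (the harmonic mean of precision and recall) and a pairwise evaluation; the abstract does not state which variant the study used, and the function name and input format are hypothetical.

```python
from typing import Hashable, Iterable, Tuple

Pair = Tuple[Hashable, Hashable]


def pairwise_f_measure(predicted: Iterable[Pair], actual: Iterable[Pair]) -> float:
    """Balanced F-measure (F1) over unordered duplicate pairs.

    predicted: record-id pairs the classifier labeled as duplicates.
    actual: record-id pairs that are true duplicates.
    """
    # frozenset makes (a, b) and (b, a) count as the same pair.
    predicted_pairs = {frozenset(pair) for pair in predicted}
    actual_pairs = {frozenset(pair) for pair in actual}
    true_positives = len(predicted_pairs & actual_pairs)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_pairs)
    recall = true_positives / len(actual_pairs)
    return 2 * precision * recall / (precision + recall)


# Hypothetical record ids, for illustration only.
example_f = pairwise_f_measure(
    predicted=[(1, 2), (3, 4), (5, 6)],
    actual=[(2, 1), (3, 4), (7, 8), (9, 10)],
)
```

In this made-up example two of the three predicted pairs are correct (precision 2/3) and two of the four true pairs are found (recall 1/2), so the F-measure is 4/7.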