In this paper we present a robust information integration approach to identifying images of persons in large collections such as the Web. The underlying system relies on combining content analysis, which involves face detection and recognition, with context analysis, which involves extraction of text or HTML features. Two aspects are explored to test the robustness of this approach: sensitivity of the retrieval performance to the context analysis parameters and automatic construction of a facial image database via automatic pseudofeedback. For the sensitivity testing, we reevaluate system performance while varying context analysis parameters. This is compared with a learning approach where association rules among textual feature values and image relevance are learned via the CN2 algorithm. A face database is constructed by clustering after an initial retrieval relying on face detection and context analysis alone. Experimental results indicate that the approach is robust for identifying and indexing person images.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.