The collection of concepts in a document set can be represented by a geometric structure called simplicial complex of combinatorial topology where each keyword is represented as a vertex and the relation between keywords as simplex. A simplex which consists of more than one keyword is a high-frequency keywordset. These keywords occur close to each other within a document which also occur frequently within a set of documents. The high frequent occurence of these keywords shows relations between keywords. These relations carry concepts. The relations of these keywords can be captured by Association Rule Mining and represented as simplices. The collection of all these simplices, represents the structure of concepts within a document set. Based on this topology, documents are clustered and the collection of simplices can serve as document index.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.