Abstract. A cluster labeling algorithm for creating generic titles based on external resources such as WordNet is proposed. Our method first extracts category-specific terms as cluster descriptors. These descriptors are then mapped to generic terms based on a hypernym search algorithm. The proposed method has been evaluated on a patent document collection and a subset of the Reuters-21578 collection. Experimental results revealed that our method performs as anticipated. Real-case applications of these generic terms show promise in assisting humans in interpreting the clustered topics. Our method is general enough to be easily extended to other hierarchical resources for adaptable label generation.
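The descriptor-to-generic-term step can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper: the toy hierarchy `TOY_HYPERNYMS`, the function names, and the `min_coverage` threshold are all assumptions made for illustration, and a real system would query WordNet or another hierarchical resource instead. The sketch picks the most specific hypernym that still covers a large enough share of the cluster descriptors.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent), standing in for WordNet.
TOY_HYPERNYMS = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
    "animal": "entity",
}


def hypernym_chain(term, hierarchy):
    """Return the ancestors of `term`, nearest first."""
    chain = []
    seen = {term}
    while term in hierarchy:
        term = hierarchy[term]
        if term in seen:  # guard against cycles in a malformed hierarchy
            break
        seen.add(term)
        chain.append(term)
    return chain


def depth(term, hierarchy):
    """Number of ancestors above `term`; the root has depth 0."""
    return len(hypernym_chain(term, hierarchy))


def generic_label(descriptors, hierarchy, min_coverage=0.8):
    """Pick the deepest hypernym shared by at least `min_coverage`
    of the descriptors (an assumed scoring rule, not the paper's)."""
    coverage = Counter()
    for descriptor in descriptors:
        for hypernym in set(hypernym_chain(descriptor, hierarchy)):
            coverage[hypernym] += 1

    needed = min_coverage * len(descriptors)
    candidates = [h for h, count in coverage.items() if count >= needed]
    if not candidates:
        return None
    # Prefer specific (deep) terms; break ties by coverage, then by name.
    return max(candidates, key=lambda h: (depth(h, hierarchy), coverage[h], h))


if __name__ == "__main__":
    print(generic_label(["dog", "wolf", "cat"], TOY_HYPERNYMS))
    print(generic_label(["dog", "cat", "sparrow"], TOY_HYPERNYMS))
```

Lowering `min_coverage` yields more specific labels that describe only part of the cluster, while raising it pushes the label toward the root of the hierarchy. Swapping `TOY_HYPERNYMS` for another child-to-parent mapping is the only change needed to try a different resource.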
Patent documents contain important research results. They are often collectively analyzed and organized in a visual way to support decision making. However, they are lengthy and rich in technical terminology, and thus require a lot of human effort to analyze. Automatic tools for assisting patent engineers or decision makers in patent analysis are in great demand. This paper describes a summarization method for patent surrogate extraction, intended to efficiently and effectively support patent mapping, which is an important subtask of patent analysis. Six patent maps were used to evaluate its relative usefulness. The experimental results confirm that the machine-generated summaries preserve more important content words than some other patent sections, or even the full patent texts, when only a few terms are considered for classification and mapping. The implication is that if one needs to determine a patent's category quickly from only a few terms, one could begin by reading the automatically generated section summaries.