Patents are critical intellectual assets for any competitive business. With ever-increasing patent filings, effective prior art search has become an essential task in patent retrieval, a subfield of information retrieval (IR). The goal of prior art search is to find and rank documents related to a query patent. Query formulation is a key step in prior art search, in which the patent structure is exploited to generate queries from the various fields available in the patent text. As patents span multiple technical domains, this work argues that technical domain and patent structure have a combined effect on the effectiveness of patent retrieval. The study uses International Patent Classification (IPC) codes to categorize query patents into eight technical domains and explores eighteen different combinations of patent fields to generate search queries. A total of 144 retrieval experiments (eight domains by eighteen field combinations) were carried out using the BM25 ranking algorithm. Retrieval performance is evaluated in terms of recall over the top 1000 retrieved records. The empirical results support our assumption, and a two-way analysis of variance was conducted to validate the hypotheses. The findings may help patent information retrieval professionals develop domain-specific patent retrieval systems that exploit the patent structure.
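The experimental design described above can be illustrated with a minimal sketch. It shows how eight domains crossed with eighteen field combinations yield 144 runs, and how recall over the top 1000 records could be computed for a single query. Mapping the eight domains to IPC sections A to H is an assumption here, the `combo_XX` labels are hypothetical placeholders (the abstract does not list the actual field combinations), and `recall_at_k` is an illustrative helper, not the authors' evaluation code.

```python
from itertools import product

# Assumption: the eight technical domains correspond to IPC sections A-H.
IPC_SECTIONS = ["A", "B", "C", "D", "E", "F", "G", "H"]

# Hypothetical placeholder labels; the abstract does not name the
# eighteen patent-field combinations used to build queries.
FIELD_COMBINATIONS = [f"combo_{i:02d}" for i in range(1, 19)]


def recall_at_k(ranked_ids, relevant_ids, k=1000):
    """Return the fraction of relevant documents found in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # no known prior art for this query patent
    retrieved = set(ranked_ids[:k])
    return len(retrieved & relevant) / len(relevant)


# One retrieval experiment per (domain, field combination) pair.
experiment_grid = list(product(IPC_SECTIONS, FIELD_COMBINATIONS))
```

In such a setup, each pair in `experiment_grid` would correspond to one BM25 run, and the per-query recall values would be averaged within each pair before the two-way analysis of variance.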