Effective patent management is essential for organizations to maintain their competitive advantage. The classification of patents is a critical part of patent management and industrial analysis. This study proposes a hybrid patent-classification approach that combines a novel patent-network-based classification method with three conventional classification methods to analyze query patents and predict their classes. The novel patent network contains various types of nodes that represent different features extracted from patent documents. The nodes are connected based on relationship metrics derived from the patent metadata. The proposed classification method predicts a query patent's class by analyzing all reachable nodes in the patent network and calculating their relevance to the query patent. It then classifies the query patent with a modified k-nearest-neighbor classifier. To further improve the approach, we combine it with content-based, citation-based, and metadata-based classification methods to develop a hybrid classification approach. We evaluate the performance of the hybrid approach on a test dataset of patent documents obtained from the U.S. Patent and Trademark Office, and compare its performance with that of the three conventional methods. The results demonstrate that the proposed hybrid approach yields more accurate class predictions than the conventional methods it is compared against.
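To make the network-based step concrete, the following Python sketch illustrates one way the idea could be realized. It is not the method evaluated in this study. The graph layout, the node names, the edge weights, the rule that a path's relevance is the product of its edge weights, and the relevance-weighted vote are all assumptions introduced for illustration.

```python
import heapq
from collections import defaultdict


def relevance_scores(graph, query):
    """Return the best-path relevance from `query` to every reachable node.

    `graph` maps node -> {neighbor: weight}, with weights in (0, 1].
    ASSUMPTION: the relevance of a path is the product of its edge weights,
    and a node's relevance is that of its best path from the query.
    """
    best = {query: 1.0}
    heap = [(-1.0, query)]  # max-heap via negated scores
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(node, 0.0):
            continue  # stale heap entry, a better path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = score * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    del best[query]
    return best


def classify(graph, labels, query, k=3):
    """Relevance-weighted k-nearest-neighbor vote over labeled patents.

    ASSUMPTION: the k most relevant labeled patents vote for their class,
    each with a weight equal to its relevance to the query.
    """
    scores = relevance_scores(graph, query)
    labeled = [(s, n) for n, s in scores.items() if n in labels]
    votes = defaultdict(float)
    for score, node in heapq.nlargest(k, labeled):
        votes[labels[node]] += score
    return max(votes, key=votes.get) if votes else None


# Hypothetical toy network: a query patent Q linked to labeled patents
# P1..P3 through an inventor node and an assignee node.
toy_graph = {
    "Q": {"inventor:A": 0.9, "assignee:X": 0.5},
    "inventor:A": {"Q": 0.9, "P1": 0.8},
    "assignee:X": {"Q": 0.5, "P2": 0.9, "P3": 0.6},
    "P1": {"inventor:A": 0.8},
    "P2": {"assignee:X": 0.9},
    "P3": {"assignee:X": 0.6},
}
toy_labels = {"P1": "class_A", "P2": "class_B", "P3": "class_B"}

print(classify(toy_graph, toy_labels, "Q", k=3))
```

In this toy network the single most relevant patent belongs to one class, while the two patents reached through the assignee node jointly outweigh it, so the predicted class depends on the choice of k.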
Introduction

Patents are valuable intellectual property and therefore require effective management to ensure that an organization maintains its competitive advantage (Guan & Gao, 2009; Su, Lai, Sharma, & Kuo, 2009). Because of developments in various technologies, the number of patents has increased rapidly in recent years. How to manage the constantly growing volume of patents is thus becoming an important issue. Patent classification is a key part of patent management; however, as the task is usually performed by patent analysts, categorizing new patent documents correctly is a laborious process. Hence, there is a pressing need for an effective patent-classification approach.

Basically, patent classification can be regarded as a text-categorization problem that involves assigning a patent document to a particular class. Most existing studies have considered information content to classify patent documents, and several classification algorithms have been developed based on different content features (e.g., He & Lo, 2008; Fall, Torcsvari, Benzineb, & Karetka, 2003, 2004; Kim & Choi, 2007; Larkey, 1999; Loh, He, & Shen, 2006; Trappey, Hsu, Trappey, & Lin, 2006). In addition, some approaches have utilized citation relationships to improve the performance of patent classification (Lai & Wu, 2005; Li, Chen, Zang, & Lie, 2007), while others have employed patent metadata (e.g., the inventor's name) to achieve improvements in the classification performance (Richter & MacFarlane, 2005).

Since patent metadata provides rich information that can be used to infer possible relationships between patent documents, there exists the poten...