2020
DOI: 10.1016/j.wpi.2020.101965
Patent classification by fine-tuning BERT language model

Cited by 134 publications (79 citation statements)
References 8 publications
“…Li et al [33] proposed DeepPatent, which combines a convolutional neural network (CNN) model with a word embedding model for classifying patents. Lee and Hsiang [34] fine-tuned a bidirectional encoder representations from transformers (BERT) model to classify patents and compared it with the previously mentioned DeepPatent model; the results show that its precision is 9% higher. Jun [35] proposed a method for technical integration and analysis using boosting (an ML algorithm that can be used to reduce bias in supervised learning) and ensemble learning.…”
Section: Patent Mining
confidence: 99%
“…Gururangan et al. (2020) showed that continuing the training of a model with additional domain-adaptive and task-adaptive pretraining on unlabeled data leads to performance gains in both high- and low-resource settings across numerous English domains and tasks. This is also reflected in the number of domain-adapted language models (Alsentzer et al., 2019; Adhikari et al., 2019; Lee and Hsiang, 2020; Jain and Ganesamoorty, 2020, i.a.), most notably BioBERT, which was pre-trained on biomedical PubMed articles, and SciBERT (Beltagy et al., 2019) for scientific texts.…”
Section: Domain-specific Pre-training
confidence: 94%
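The continued-pretraining idea summarized in this excerpt amounts to resuming masked-language-model training of a general checkpoint on unlabeled in-domain text before any supervised fine-tuning. A minimal sketch, assuming the Hugging Face transformers and datasets libraries and a hypothetical line-per-document corpus file domain_corpus.txt (none of these details come from the cited works):

```python
# Domain-adaptive pretraining sketch: continue masked-language-model (MLM)
# training of a general BERT checkpoint on unlabeled in-domain text.
# The corpus path and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Unlabeled in-domain corpus, one document per line (hypothetical file).
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
corpus = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=["text"],
)

# Randomly mask 15% of tokens, the standard BERT MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-dapt",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=corpus,
    data_collator=collator,
)
trainer.train()
model.save_pretrained("bert-dapt")  # later fine-tuned on the labeled task
```

The adapted checkpoint is then fine-tuned on the labeled downstream task exactly as a stock BERT model would be.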
“…BERT is pre-trained on two tasks: the masked language model and next sentence prediction (Devlin et al. (2018)). Applying BERT means fine-tuning the pre-trained BERT to a downstream task such as patent classification (Sun et al. (2019) and Lee & Hsiang (2020a)). Patents are classified by patent offices according to standards such as the international patent classification (IPC, Note 3) and the cooperative patent classification (CPC, Note 4), based on the technical features characterizing the invention.…”
Section: Related Literature
confidence: 99%
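As this excerpt notes, applying BERT to patent classification means attaching a classification head to the pre-trained encoder and fine-tuning it on labeled patent text. A minimal sketch, assuming the Hugging Face transformers and datasets libraries and using the eight top-level IPC sections (A–H) as an illustrative label set rather than the exact configuration of Lee & Hsiang (2020a):

```python
# Fine-tuning sketch: pre-trained BERT plus a classification head, trained on
# patent text labeled with one of the eight IPC sections (A-H).
# The tiny inline dataset and hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

NUM_IPC_SECTIONS = 8  # IPC sections A-H (top level of the hierarchy)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_IPC_SECTIONS)

# Hypothetical training data: patent title/abstract text with an IPC section id.
train = Dataset.from_dict({
    "text": ["A rotary engine comprising a housing and an eccentric rotor ...",
             "A pharmaceutical composition for treating inflammation ..."],
    "label": [5, 0],  # e.g. section F (mechanical engineering), section A
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

train = train.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-patent",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train,
)
trainer.train()
```

Finer-grained IPC or CPC labels would follow the same pattern, typically with a larger num_labels and, since a patent often carries several codes, a multi-label setup.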