“…In this study, we propose a novel model aimed at addressing the challenges in agricultural entity recognition, such as diverse naming methods, blurred entity boundaries, insufficient feature extraction, and inconsistent labeling of entity boundaries [32]. While BERT has shown promising results in encoding languages like Persian and Italian [33,34], its ability to encode the Chinese text is limited.…”
With the continuous advancement of information technology in agriculture, a large amount of unstructured agricultural text has been generated. This information is crucial for supporting the development of smart agriculture, which makes the application of named entity recognition in the agricultural field more urgent. To enhance the accuracy of agricultural entity recognition, this study uses the pre-trained BERT-wwm model to generate embeddings for the text. Additionally, a channel attention (CA) mechanism is introduced into the downstream BiLSTM-CRF feature extraction network to capture the contextual features of the text more comprehensively. Experimental results demonstrate that the proposed method significantly improves the performance of named entity recognition, with increased accuracy, recall, and F1 value. The method provides reliable support for downstream tasks such as agricultural knowledge graph construction and question answering systems, and it establishes a foundation for better understanding and use of agricultural textual information.
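The abstract does not say how the channel attention mechanism is built. As a rough illustration only, the sketch below assumes a squeeze-and-excitation style gate applied to a sequence of feature vectors (for example, BiLSTM outputs). The function name `channel_attention` and the weight matrices `w_down` and `w_up` are hypothetical stand-ins for learned parameters, not the authors' implementation.

```python
import math


def channel_attention(features, w_down, w_up):
    """Reweight feature channels with a squeeze-and-excitation style gate.

    features: T x C list of lists (T time steps, C channels).
    w_down:   C x R bottleneck weights (hypothetical, normally learned).
    w_up:     R x C expansion weights (hypothetical, normally learned).
    Returns the rescaled features and the per-channel gates.
    """
    steps = len(features)
    channels = len(features[0])
    reduced = len(w_down[0])

    # Squeeze: average each channel over all time steps.
    squeezed = [sum(row[c] for row in features) / steps for c in range(channels)]

    # Excite: bottleneck layer with ReLU ...
    hidden = [
        max(0.0, sum(squeezed[c] * w_down[c][r] for c in range(channels)))
        for r in range(reduced)
    ]
    # ... followed by a sigmoid gate for every channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(hidden[r] * w_up[r][c] for r in range(reduced))))
        for c in range(channels)
    ]

    # Rescale: multiply every time step by the channel gates.
    scaled = [[row[c] * gates[c] for c in range(channels)] for row in features]
    return scaled, gates
```

With all-zero weights every gate equals 0.5, so each feature is simply halved; trained weights would instead emphasize some channels over others.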
“…The crop disease-pest-related information is described by complex word formation and universal phenomena of word combination and entity embedding. To address the above problems, Wang et al. (2022) combined discourse topic and attention mechanism, and proposed the attention-based SoftLexicon with term frequency-inverse document frequency (TF-IDF) for crop disease-pest entity recognition, designed a flow chart to explain the major principles and steps, and explained the model through visual methods. The recognition accuracy of Chinese agricultural pest-diseases was improved by dividing the word sets according to the position of the characters in the word, integrating the discourse theme features into the calculation of lexical information, and introducing the attention mechanism.…”
Crop disease-pest question classification is an essential part of an intelligent question answering system for pest knowledge. A crop disease-pest question classification method, BERT-BiGRU-CapsNet with attention pooling (BBGCAP), is proposed on the basis of bidirectional encoder representations from transformers (BERT), a bidirectional gated recurrent unit (BiGRU), and a capsule network (CapsNet). In BBGCAP, the unstructured text data are vectorized using BERT, BiGRU is used to extract the deep features of the text, attention pooling is used to assign corresponding weights to the extracted deep features, and CapsNet is used to route the right alternative. BBGCAP is a composite model that integrates the advantages of BERT, BiGRU, CapsNet, and attention pooling. The experimental results on the cucumber-pest question database show that the proposed method is superior to methods based on traditional template matching, support vector machines (SVM), and convolutional neural network–long short-term memory (LSTM), and the precision, recall, and F1 are all above 902.15%. This method provides technical support for intelligent question answering systems for crop disease-pests.
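The attention pooling step, which weights the extracted features before they reach the capsule layer, can be sketched as follows. This is a minimal illustration that assumes dot-product scoring against a single query vector. The abstract does not give the scoring function, and `attention_pooling` and `query` are hypothetical names.

```python
import math


def attention_pooling(hidden_states, query):
    """Pool a sequence of feature vectors into a single vector.

    hidden_states: list of T vectors (lists of floats), e.g. BiGRU outputs.
    query: vector of the same width; hypothetical, normally a learned parameter.
    Returns the pooled vector and the attention weights.
    """
    # Score each time step by its dot product with the query vector.
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]

    # Softmax over time steps (subtract the maximum for numerical stability).
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]

    # Weighted sum of the hidden states.
    width = len(hidden_states[0])
    pooled = [
        sum(weight * state[d] for weight, state in zip(weights, hidden_states))
        for d in range(width)
    ]
    return pooled, weights
```

A zero query gives uniform weights, so the result reduces to mean pooling; a trained query would concentrate the weight on the most informative time steps.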
“…Named entity recognition in the Chinese language is used in agriculture [39], natural hazards [40], the military [41], engineering [42], chemicals [43], and mainly in medicine, covering electronic health records [43,44] and clinical texts [45,46]. Although named entity recognition has many applications in Chinese, it still has a variety of challenges.…”
Section: Named Entity Recognition For Chinese Language
Football is one of the most popular sports in the world and has given rise to a wide range of research topics related to its off- and on-the-pitch performance. Extracting football entities from football news helps to construct sports frameworks, integrate sports resources, and capture the dynamics of the sport in a timely manner through visual text mining results, including the connections among football players, football clubs, and football competitions. It also makes it convenient to observe and analyze developmental tendencies in football. Therefore, in this paper, we constructed a 1,000,000-word Chinese corpus in the field of football and proposed a BiLSTM-based model for named entity recognition. The ALBERT-BiLSTM deep learning model is used for entity extraction from football textual data. Building on the BiLSTM model, we introduced ALBERT as a pre-training model to extract character features and enhance the generalization ability of word embedding vectors. We then compared the results of two annotation schemes, BIO and BIOE, and two deep learning models, ALBERT-BiLSTM-CRF and ALBERT-BiLSTM. It was verified that BIOE tagging was superior to BIO, and that the ALBERT-BiLSTM model was more suitable for football datasets. The precision, recall, and F-score of the model were 85.4%, 83.47%, and 84.37%, respectively.
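To make the difference between the two annotation schemes concrete, the sketch below converts a BIO tag sequence to BIOE by marking the last token of each multi-token entity with an `E-` tag. It assumes that single-token entities keep their `B-` tag. The abstract does not describe the exact convention, and the labels `PLAYER` and `CLUB` are hypothetical.

```python
def bio_to_bioe(tags):
    """Convert BIO tags to BIOE by marking entity-final tokens with E-.

    Assumption: a single-token entity keeps its B- tag, since BIOE has no
    dedicated single-token tag.
    """
    converted = list(tags)
    for i, tag in enumerate(tags):
        if not tag.startswith("I-"):
            continue
        label = tag[2:]
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        if next_tag != "I-" + label:
            converted[i] = "E-" + label
    return converted


bio = ["B-PLAYER", "I-PLAYER", "I-PLAYER", "O", "B-CLUB", "I-CLUB"]
print(bio_to_bioe(bio))
# ['B-PLAYER', 'I-PLAYER', 'E-PLAYER', 'O', 'B-CLUB', 'E-CLUB']
```

The explicit end tag gives the tagger a direct signal for the right-hand entity boundary, which is one plausible reason a BIOE scheme could outperform BIO on this kind of data.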