Purpose This paper evaluates global progress in the open government data (OGD) field and explores its research areas and development trends by applying a bibliometric visualization approach to publications in the Web of Science (WOS) database.
Design/methodology/approach This paper conducted a bibliometric mapping study of OGD research publications indexed in WOS, examined from six aspects.
Findings OGD research falls into six research perspectives. European and developed countries pay more attention to the OGD movement. The 20 most cited and most influential publications were identified. The journal-level analysis highlights the interdisciplinary and cross-disciplinary character of OGD research. Six established research topics and two major emerging research priorities in OGD research were identified.
Research limitations/implications The data retrieval was limited to 180 WOS-indexed publications, which introduces a bias against research published in non-WOS sources. A fuller picture of research trends could be obtained by drawing on more extensively used electronic databases.
Practical implications Bibliometric analysis makes it possible to quantify research patterns on OGD, to review what has been done in the field and to identify the main research hotspots. It can therefore help academic researchers and practicing professionals contribute to the field more effectively and advance OGD research.
Social implications The results can also promote the study of the OGD movement in academia, government and industry, enrich OGD theory and provide new perspectives for research on OGD.
Originality/value This is the first study to quantify and evaluate global research patterns and development trends in OGD research based on the WOS database. It provides a quantitative perspective on OGD studies that may help advance the field.
The purpose of this paper is to solve the problem of big data and small samples caused by the high cost of manually annotating a military corpus. A deep learning algorithm for entity extraction in the military field was combined with bootstrapping loop iteration to study intelligent corpus annotation of military field entities. Experiments showed that, starting from a small annotated corpus of military field entities, using RoBERTa pretrained word vectors with a BiLSTM-CRF model and following the bootstrapping idea through 3 rounds of loop iteration and 10 rounds of cross-validation joint-voting model iteration, the best entity extraction model reached an F value of up to 91.5%. Finally, the best model from this round of iteration was used to complete the 60M intelligent corpus annotation application test: a total of 178,177 sentences of military field corpus were labeled automatically, with 417,734 entities that should be labeled. This is therefore an efficient way to construct and evaluate an intelligent corpus annotation model for military entity extraction. The findings of this paper provide an effective way to complete a labeled corpus. The research serves as a first step toward future work such as the construction of knowledge graphs and military intelligent Q&A.
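The bootstrapping loop with cross-validation joint voting described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `train` callback stands in for fitting a RoBERTa + BiLSTM-CRF tagger, the names `bootstrap`, `k_fold_models` and `vote` are hypothetical, and the agreement threshold `min_agree` is an assumed acceptance rule that the abstract does not specify.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

Labels = Tuple[str, ...]                      # one tag per token
Example = Tuple[str, Labels]                  # (sentence, tags)
Model = Callable[[str], Labels]               # tagger: sentence -> tags
Trainer = Callable[[List[Example]], Model]    # hypothetical training callback


def k_fold_models(labeled: List[Example], train: Trainer, k: int) -> List[Model]:
    """Train k models, each on the labeled data with one fold held out."""
    models = []
    for fold in range(k):
        subset = [ex for i, ex in enumerate(labeled) if i % k != fold]
        models.append(train(subset))
    return models


def vote(models: List[Model], sentence: str, min_agree: int) -> Optional[Labels]:
    """Return the majority tagging if at least min_agree models produce it."""
    counts = Counter(model(sentence) for model in models)
    labels, n = counts.most_common(1)[0]
    return labels if n >= min_agree else None


def bootstrap(
    labeled: List[Example],
    unlabeled: List[str],
    train: Trainer,
    rounds: int = 3,       # loop iterations, as in the abstract
    k: int = 10,           # cross-validation models that vote jointly
    min_agree: int = 8,    # assumed threshold, not taken from the paper
) -> Tuple[List[Example], List[str]]:
    """Grow the labeled set by accepting sentences the k models agree on."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        models = k_fold_models(labeled, train, k)
        remaining = []
        for sentence in pool:
            labels = vote(models, sentence, min_agree)
            if labels is None:
                remaining.append(sentence)          # keep for a later round
            else:
                labeled.append((sentence, labels))  # accept as pseudo-label
        pool = remaining
    return labeled, pool
```

Sentences on which the models disagree stay in the pool, so each round retrains on a slightly larger labeled set. In practice the accepted pseudo-labels would presumably also be spot-checked by annotators.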
The purpose of this study is to find an effective way to construct domain-specific knowledge graphs, moving from information to knowledge. We propose a deep learning approach that extracts entities and relationships from open-source intelligence using the RoBERTa-wwm-ext pretrained model, together with a knowledge fusion framework based on longest-common-attribute entity alignment, and we bring in different text similarity algorithms and classification algorithms for verification. The experiments showed, first, that the named entity recognition model using the RoBERTa-wwm-ext pretrained model achieves the best results in terms of recall and F1 value, with the F value of RoBERTa-wwm-ext + BiLSTM + CRF reaching 83.07%. Second, the RoBERTa-wwm-ext relation extraction model achieved the best results, improving on the relation extraction model based on recurrent neural networks by about 20% to 30%. Finally, the entity alignment algorithm based on the attribute similarity of the longest common subsequence achieved the best results overall. The findings of this study provide an effective way to complete knowledge graph construction from domain-specific texts. The research serves as a first step toward future work such as domain-specific intelligent Q&A.
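To make the idea of attribute similarity based on the longest common subsequence (LCS) concrete, here is a minimal sketch. It is an illustration only: the normalization `2 * LCS / (len(a) + len(b))`, the averaging over shared attribute keys, and the function names are assumptions, since the abstract does not give the exact scoring formula.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for disjoint ones."""
    if not a or not b:
        return 0.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def attribute_similarity(entity_a: dict, entity_b: dict) -> float:
    """Average LCS similarity over the attributes both entities share."""
    shared = set(entity_a) & set(entity_b)
    if not shared:
        return 0.0
    total = sum(lcs_similarity(str(entity_a[k]), str(entity_b[k])) for k in shared)
    return total / len(shared)
```

Two entity records could then be treated as alignment candidates when `attribute_similarity` exceeds a chosen threshold. The threshold itself would need tuning on labeled pairs.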