2021
DOI: 10.1016/j.jbi.2021.103683
Exploration of text matching methods in Chinese disease Q&A systems: A method using ensemble based on BERT and boosted tree models

Cited by 12 publications (9 citation statements) · References 8 publications
“…The algorithm computes the edit distance when a part of the string is moved to another position within the string, instead of detecting only single-character changes. Previous studies have confirmed its effectiveness [22], [23], [26]. The n-gram method uses a sequence of strings and measures the similarity of sub-sequences of n words from a given text.…”
Section: Lexical-Similarity Methods (mentioning)
Confidence: 96%
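The excerpt above names two lexical-similarity techniques. As a minimal sketch of the n-gram approach it describes, here is word-level n-gram similarity scored with Jaccard overlap; the function names and the Jaccard choice are illustrative assumptions, not the cited studies' exact formulation:

```python
# Minimal sketch: word-level n-gram similarity via Jaccard overlap.
# (Illustrative; the cited studies may use a different n-gram scoring variant.)
def ngrams(tokens, n):
    """Set of n-word sub-sequences of a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_similarity(text_a, text_b, n=2):
    """Jaccard similarity between the word n-gram sets of two texts."""
    a, b = ngrams(text_a.split(), n), ngrams(text_b.split(), n)
    if not a and not b:
        return 1.0  # texts too short to form any n-gram count as identical
    return len(a & b) / len(a | b)

# Example: two near-paraphrase questions share half of their bigrams.
print(ngram_similarity("what causes high blood pressure",
                       "what usually causes high blood pressure"))
```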
“…Many evaluation methods are presented in the literature, but the majority evaluate a method in terms of robustness, accuracy, and time. The most common performance metrics are matching accuracy, precision, recall, granularity, and F1-measure [8], [3], [23], [24]. In contrast to accuracy, several studies analyze performance with respect to error metrics, employing root mean square error (RMSE) and failed detection ratio (FDR) [11], [25].…”
Section: Research Issues (mentioning)
Confidence: 99%
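As a minimal sketch of the accuracy- and error-oriented metrics named in this excerpt, assuming binary match labels (granularity and FDR are omitted because their definitions vary across the cited studies):

```python
import math

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary match/no-match labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def rmse(y_true, y_score):
    """Root mean square error between gold labels and predicted scores."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(y_true, y_score))
                     / len(y_true))

p, r, f1 = precision_recall_f1([1, 0, 1, 1], [1, 0, 0, 1])
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
print(f"RMSE={rmse([1, 0, 1, 1], [0.9, 0.2, 0.4, 0.8]):.3f}")
```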
“…We can observe that, in 2021, researchers mainly concentrated on English-language data. Indeed, compared to previous years, fewer languages were covered: Chinese [3][4][5][6][7][8][9][10], Dutch [11], French [12,13], Italian [14][15][16], Japanese [17], Korean [18,19], Norwegian [20], and Spanish. Moreover, except for Chinese, very few works addressed the other languages represented in the publications.…”
Section: Languages Addressed (mentioning)
Confidence: 99%
“…Studies such as [112] used the BERT network to evaluate different methods for a Q&A system trained on Chinese medical data. SCI-BERT [10], which leveraged unsupervised pre-training on a large multi-domain corpus of scientific publications, was also introduced, and BioBERT, which was pre-trained on biomedical domain corpora (e.g., PubMed abstracts and PubMed Central full-text articles), was proposed in [54].…”
Section: Introduction (mentioning)
Confidence: 99%
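The indexed paper ensembles BERT with boosted tree models for Chinese disease Q&A matching. A minimal sketch of the BERT sentence-pair side of such a system, assuming the Hugging Face transformers API and the bert-base-chinese checkpoint (neither is named in the excerpt, and the head below is untrained, so it would need fine-tuning on labeled question pairs before its scores are meaningful):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed checkpoint; the paper's exact model and matching head may differ.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=2)  # labels: no-match / match

question = "高血压的症状有哪些？"   # "What are the symptoms of hypertension?"
candidate = "高血压有什么表现？"    # "What are the manifestations of hypertension?"

# BERT encodes the pair jointly as [CLS] question [SEP] candidate [SEP].
inputs = tokenizer(question, candidate, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
match_prob = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"match probability: {match_prob:.3f}")
```

In an ensemble of the kind the title describes, this model's match probability (or its pooled features) would be combined with boosted-tree scores, for example by averaging or stacking; that combination scheme is an assumption here, not the paper's stated method.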