2022
DOI: 10.1186/s12859-022-04780-1

A novel multiple kernel fuzzy topic modeling technique for biomedical data

Abstract: Background: Text mining in the biomedical field has received much attention and is regarded as an important research area, since a lot of biomedical data is in text format. Topic modeling is one of the popular text mining methods used to discover hidden semantic structures, so-called topics. However, discovering topics from biomedical data is a challenging task because of its sparsity, redundancy, and unstructured format. Methods: In this p…

Citations: cited by 9 publications (5 citation statements)
References: 48 publications
“…Nevertheless, the cut-off of the F1-score (∼80%) was chosen due to the fact that NLP tasks do not receive much attention in the LIB literature, and SotA scores are lacking. In contrast, the biomedical community has made significant efforts to evaluate TM and IE systems, and it has for long been discussing NLP SotA, with their highest test scores already being close to 99%. Hence, further discussions about the acceptable accuracy rates that should be considered by the LIB community and how to improve it in the future will be therefore critically needed in the future.…”
Section: Results (mentioning)
confidence: 99%
“…Rashid et al proposed a topic modeling technique for text mining by mixing inverse document frequency and fuzzy k-means clustering algorithm in machine learning, which helped discover precise topics from biomedical text documents [127], Lossio-Ventura et al trained and compared topic modeling and clustering algorithm models for health-related tweets on social software, the results show that relevant researchers can choose an appropriate model according to the characteristics of the data [128], Karami et al used fuzzy latent semantic analysis (FLSA) for topic modeling applications in health and medical corpus redundancy problems and provided a new method for estimating the number of topics, the results show that FLSA produces performance and features superior to past research [129]. Rashid et al proposed a novel Multiple kernel fuzzy topic modeling (MKFTM) technology, which uses fusion probabilistic inverse document frequency and multiple kernel fuzzy c-means clustering algorithm methods for biomedical text mining, the fusion probabilistic inverse document frequency method is used to calculate the weight of the entire medical document, and then use the BOW vector model to convert the local vector of the paragraph and the entire document vector through MKFTM and perform document clustering to model [130], in addition, in order to reduce the impact of technical terms, the study also used principal component analysis for dimensionality reduction.…”
Section: Medicine (mentioning)
confidence: 99%
“…In topic modelling approach by Rashid, et al. [34] utilized fuzzy k-means latent semantic analysis (FKLSA) on medical and health text corpora. Multiple kernel fuzzy topic modeling technique by J. Rashid, et al. [35] was proposed for biomedical text mining. Fuzzy LSA-W and FLSA-V [36] was applied to topic embeddings in text classification.…”
Section: Related Work (mentioning)
confidence: 99%