2022
DOI: 10.3389/fpubh.2022.846118

A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population

Abstract: Non-alcoholic fatty liver disease (NAFLD) is a common and serious health problem worldwide that lacks efficient medical treatment. We aimed to develop and validate machine learning (ML) models that could be used for accurate screening of large numbers of people. This study included 304,145 adults who took part in the national physical examination and used their questionnaire and physical measurement parameters as the models' candidate covariates. Least absolute shrinkage and selection operator (LASSO) was used to…

Cited by 13 publications (7 citation statements)
References 57 publications
“…As the end product of purine metabolism, high levels of serum uric acid could induce fat accumulation and then lead to the development of fatty liver [ 36 ]. Waist circumference was also a significant predictor of NAFLD, consistent finding has been revealed in previous studies [ 7 , 37 ]. Furthermore, our results also showed that direct bilirubin levels from five or more years ago can have a significant and continuous effect on the development of NAFLD, which may hint at its long-term effect on the onset of the disease.…”
Section: Discussion (supporting)
confidence: 90%
“…While cross-sectional data-based diagnostic prediction models have shown promising results, they fall short in predicting the future incidence of NAFLD and identifying high-risk populations before the onset of the disease [ 5 , 7 , 8 , 20 ]. Prognosis prediction models, on the other hand, excel in early risk estimation for diseases.…”
Section: Discussion (mentioning)
confidence: 99%
“…Non-invasive modalities to diagnose MASH and stages of liver fibrosis have promising diagnostic accuracy (with an area under the receiver operating characteristic curve [AUC] ranging between 0.76 and 0.90, predicting the risk of incorrect classification at <10-24%). Similarly, machine learning algorithms increasingly emerge to use laboratory parameters and demographic data, but their overall performance fails to meet clinical relevance due to a lack of liver-biopsy-proven MASLD/MASH as the prediction outcome or the absence of external validation cohorts (Ji et al, 2022; Chang et al, 2023; Kouvari et al, 2023b; Peng et al, 2023). Recently, a large multi-center liver biopsy-based study (n=455 total serum samples) validated, confirmed, and compared the diagnostic performance of established and novel non-invasive MASLD indices (n=12 total non-invasive testing indices) (Kouvari et al, 2023b).…”
Section: Cryptogenic Steatotic Liver Disease (mentioning)
confidence: 99%
“…Several ML techniques, such as logistic regression (LR), random forest (RF), artificial neural networks (ANNs), support vector machines, and extreme gradient boosting (xgBoost), show promise in improving predictions compared with conventional risk scoring systems. There are several previous studies that used ML methods to show a higher diagnostic value for the presence of fatty liver disease with clinical variables [ 9 , 10 , 11 , 12 , 13 , 14 , 15 ]. However, these studies utilized a limited number of datasets, and most of them did not examine with an additional testing dataset for validation.…”
Section: Introduction (mentioning)
confidence: 99%