Machine learning in project analytics: a data-driven framework and case study

Uddin, Shahadat; Ong, Sim‐Heng; Lu, Haohui

doi:10.1038/s41598-022-19728-x

Cited by 22 publications

(8 citation statements)

References 62 publications

“…It is evident in the literature that tree-based ML algorithms can handle non-linear classification datasets better [13], which could be a possible reason for their performance superiority. Uddin and Lu [41] noticed that dataset meta-level and statistical attributes do not impact the performance of tree-based MLs. However, they have a statistically significant impact on non-tree-based ML algorithms.…”

Section: Discussion (mentioning)

confidence: 99%

“…While our findings suggest tree-based algorithms outperform non-tree-based ones across multiple datasets, we recognise the importance of considering dataset-specific characteristics, such as feature distribution and complexity, that could influence algorithm performance. Uddin and Lu [41] discovered that ML algorithms exhibit varying performances when applied to datasets with distinct meta-level and statistical attributes. Moreover, an explanatory approach, combined with domain expertise, could unearth the factors contributing to the superiority of tree-based algorithms.…”

Section: Discussion (mentioning)

confidence: 99%

Confirming the statistically significant superiority of tree-based machine learning algorithms over their counterparts for tabular data

Uddin, 2024

PLoS ONE

Self Cite

Many individual studies in the literature observed the superiority of tree-based machine learning (ML) algorithms. However, the current body of literature lacks statistical validation of this superiority. This study addresses this gap by employing five ML algorithms on 200 open-access datasets from a wide range of research contexts to statistically confirm the superiority of tree-based ML algorithms over their counterparts. Specifically, it examines two tree-based ML (Decision tree and Random forest) and three non-tree-based ML (Support vector machine, Logistic regression and k-nearest neighbour) algorithms. Results from paired-sample t-tests show that both tree-based ML algorithms reveal better performance than each non-tree-based ML algorithm for the four ML performance measures (accuracy, precision, recall and F1 score) considered in this study, each at p<0.001 significance level. This performance superiority is consistent across both the model development and test phases. This study also used paired-sample t-tests for the subsets of the research datasets from disease prediction (66) and university-ranking (50) research contexts for further validation. The observed superiority of the tree-based ML algorithms remains valid for these subsets. Tree-based ML algorithms significantly outperformed non-tree-based algorithms for these two research contexts for all four performance measures. We discuss the research implications of these findings in detail in this article.
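The paired-sample t-test named in the abstract above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's code: the function name `paired_t_statistic` is hypothetical, and the accuracy values are made up for the example rather than taken from the 200 datasets. It returns the t statistic and degrees of freedom; converting these to a p-value requires a t-distribution, which is assumed to come from a statistics package.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired-sample t-test.

    Each position in scores_a and scores_b is assumed to hold the same
    performance measure (e.g. accuracy) for two algorithms on the same dataset.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up accuracies on five hypothetical datasets (NOT results from the study).
random_forest_acc = [0.91, 0.85, 0.78, 0.88, 0.93]
logistic_reg_acc = [0.86, 0.84, 0.70, 0.85, 0.90]

t_stat, dof = paired_t_statistic(random_forest_acc, logistic_reg_acc)
print(f"t = {t_stat:.3f}, df = {dof}")
```

A positive t here means the first algorithm scored higher on average across the paired datasets. Pairing by dataset matters because it removes between-dataset variation from the comparison.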

“…This algorithm incorporates regularization in its boosting process, thus mitigating overfitting and enhancing the generalizability of the results. Recognized for its outstanding performance and speed, XGBoost has become a dominant algorithm in applied machine learning 49 . A recent state‐of‐the‐art comparison of classification algorithms 50 underscores XGBoost's effectiveness across both small and large training sets—consistently outperforming more popular classifiers such as support vector machine and random forest.…”

Section: Methods (mentioning)

confidence: 99%

Decoding bilingualism from resting‐state oscillatory network organization

Amoruso, García, Pusil et al. 2024

Annals of the New York Academy of Sciences

Can lifelong bilingualism be robustly decoded from intrinsic brain connectivity? Can we determine, using a spectrally resolved approach, the oscillatory networks that better predict dual‐language experience? We recorded resting‐state magnetoencephalographic activity in highly proficient Spanish‐Basque bilinguals and Spanish monolinguals, calculated functional connectivity at canonical frequency bands, and derived topological network properties using graph analysis. These features were fed into a machine learning classifier to establish how robustly they discriminated between the groups. The model showed excellent classification (AUC: 0.91 ± 0.12) between individuals in each group. The key drivers of classification were network strength in beta (15–30 Hz) and delta (2–4 Hz) rhythms. Further characterization of these networks revealed the involvement of temporal, cingulate, and fronto‐parietal hubs likely underpinning the language and default‐mode networks (DMNs). Complementary evidence from a correlation analysis showed that the top‐ranked features that better discriminated individuals during rest also explained interindividual variability in second language (L2) proficiency within bilinguals, further supporting the robustness of the machine learning model in capturing trait‐like markers of bilingualism. Overall, our results show that long‐term experience with an L2 can be “brain‐read” at a fine‐grained level from resting‐state oscillatory network organization, highlighting its pervasive impact, particularly within language and DMN networks.
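As a rough illustration of the graph feature mentioned in the abstract above, the sketch below computes node strength from a weighted connectivity matrix. It assumes that "network strength" refers to the sum of a node's connection weights, which is a common graph-theoretic definition; the paper may define or aggregate it differently. The function name `node_strength` and the 3-node matrix are hypothetical and not taken from the study.

```python
def node_strength(weights):
    """Return each node's strength: the sum of its connection weights.

    `weights` is assumed to be a square matrix (list of lists) where
    weights[i][j] is the connectivity between nodes i and j. Diagonal
    entries (self-connections) are ignored.
    """
    n = len(weights)
    for row in weights:
        if len(row) != n:
            raise ValueError("connectivity matrix must be square")
    return [
        sum(w for j, w in enumerate(row) if j != i)
        for i, row in enumerate(weights)
    ]


# Made-up symmetric connectivity values for three hypothetical brain regions.
connectivity = [
    [0.0, 0.2, 0.5],
    [0.2, 0.0, 0.1],
    [0.5, 0.1, 0.0],
]

strengths = node_strength(connectivity)
print(strengths)
```

In a pipeline like the one described, one such vector per frequency band could serve as input features for a classifier.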

“…Changing the constraint is still the model becomes Linear Programming (LP). As a result, this type of function is generated to control the performance of the main model: 5)- (13).…”

Section: Mathematical Model Consequently It Is Necessary To Make the ... (mentioning)

confidence: 99%

A robust and resilience machine learning for forecasting agri-food production

Lotfi, Gholamrezaei, Kadłubek et al. 2022

Sci Rep

This research proposes a new framework for agri-food capacity production by considering resiliency and robustness and paying attention to disruption and risk for the first time. It is applied robust stochastic optimization by adding robustness to the constraint's objective function and resiliency situation. This research minimizes the mean absolute deviation and coefficient of standard deviation errors by linear function in the agri-food capacity production. This study suggests agri-food managers and decision-makers use this mathematical method to forecast and improve production management. The results of this research lead to better decision-making and are compared with other sine functions. The main model's Robust and Resiliency Mean Absolute Deviation (RRMAD) value is 1.28% lower than other sine-type functions. The conservativity coefficient, confidence level, weight factor, resiliency coefficient, and probability of the scenario vary. The main model's RRMAD value is 1.28% lower than other sine-type functions. Growing the weight factor will result in an increase in RRMAD and a smooth decline in R-squared. Additionally, as the resilience coefficient rises, the RRMAD function increases while the R-squared declines. By altering the probability of the scenario, the RRMAD function drops, and the R-squared goes up.
