Background: Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently emerged as a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine learning algorithms, and their performance and usage for disease risk prediction. Methods: In this study, extensive research efforts were made to identify those studies that applied more than one supervised machine learning algorithm to the prediction of a single disease. Two databases (i.e., Scopus and PubMed) were searched using different types of search items. In total, we selected 48 articles for the comparison among variants of supervised machine learning algorithms for disease prediction. Results: We found that the Support Vector Machine (SVM) algorithm is applied most frequently (in 29 studies), followed by the Naïve Bayes algorithm (in 23 studies). However, the Random Forest (RF) algorithm showed comparatively superior accuracy. Of the 17 studies where it was applied, RF showed the highest accuracy in 9 of them, i.e., 53%. This was followed by SVM, which showed the highest accuracy in 41% of the studies in which it was considered. Conclusion: This study provides a wide overview of the relative performance of different variants of supervised machine learning algorithms for disease prediction. This information on relative performance can aid researchers in selecting an appropriate supervised machine learning algorithm for their studies.
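The usage and "highest accuracy" shares reported above can be tallied along the following lines. This is a minimal sketch, not the authors' procedure: the `studies` records, the helper names `tally` and `top_share`, and the tie-breaking rule are all assumptions made for illustration. Only the 9-of-17 figure for RF comes from the abstract.

```python
from collections import Counter

# Hypothetical toy records (NOT data from the review): each study maps
# an algorithm name to the accuracy that study reported for it.
studies = [
    {"SVM": 0.81, "RF": 0.86, "NB": 0.78},
    {"SVM": 0.90, "NB": 0.84},
    {"RF": 0.75, "SVM": 0.72},
]

def tally(studies):
    """Count, per algorithm, how often it was applied and how often it was best."""
    applied, best = Counter(), Counter()
    for accuracies in studies:
        applied.update(accuracies.keys())
        # Assumption: on a tie, the first algorithm listed in the study wins.
        best[max(accuracies, key=accuracies.get)] += 1
    return applied, best

def top_share(applied, best):
    """Share of the studies using an algorithm in which it had the top accuracy."""
    return {name: best[name] / count for name, count in applied.items()}

applied, best = tally(studies)
share = top_share(applied, best)

# The one figure taken from the abstract: RF was best in 9 of 17 studies.
rf_reported_percent = round(100 * 9 / 17)
```

Dividing by the number of studies in which an algorithm was applied, rather than by all 48 articles, is what makes the shares comparable between frequently and rarely used algorithms.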
Background: The analysis of co-authorship networks aims at exploring the impact of network structure on the outcome of scientific collaborations and research publications. However, little is known about which network properties are associated with authors who have an increased number of joint publications and are highly cited. Methodology/Principal Findings: Measures of social network analysis (SNA), for example network centrality and tie strength, have been utilized extensively in the current co-authorship literature to explore different behavioural patterns of co-authorship networks. Using three SNA measures (i.e., degree centrality, closeness centrality and betweenness centrality), we explore scientific collaboration networks to understand the factors influencing the performance (i.e., citation count) and formation (i.e., tie strength between authors) of such networks. A citation count is the number of times an article is cited by other articles. We use a co-authorship dataset from the research field of ‘steel structure’ for the years 2005 to 2009. To measure the strength of scientific collaboration between two authors, we consider the number of articles co-authored by them. In this study, we examine how the citation count of a scientific publication is influenced by different centrality measures of its co-author(s) in a co-authorship network. We further analyze the impact of the network positions of authors on the strength of their scientific collaborations. We use both correlation and regression methods for data analysis, leading to statistical validation. We find that the citation count of a research article is positively correlated with the degree centrality and betweenness centrality values of its co-author(s). We also find that the degree centrality and betweenness centrality values of authors in a co-authorship network are positively correlated with the strength of their scientific collaborations. Conclusions/Significance: Authors’ network positions in co-authorship networks influence the performance (i.e., citation count) and formation (i.e., tie strength) of scientific collaborations.
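The three centrality measures and the tie-strength definition used in this abstract can be illustrated on a toy network. The sketch below uses only the Python standard library. The `papers` list and all function names are hypothetical, the normalisations are common textbook choices rather than ones confirmed by the study, and betweenness is computed with Brandes' algorithm.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical toy data: each paper is a list of its authors.
papers = [["A", "B"], ["A", "B", "C"], ["C", "D"], ["D", "E"]]

def build_network(papers):
    """Return {author: {coauthor: tie strength}}; tie strength = joint papers."""
    ties = defaultdict(lambda: defaultdict(int))
    for authors in papers:
        for u, v in combinations(sorted(set(authors)), 2):
            ties[u][v] += 1
            ties[v][u] += 1
    return ties

def degree_centrality(ties):
    """Number of distinct co-authors divided by (n - 1)."""
    n = len(ties)
    return {u: len(nbrs) / (n - 1) for u, nbrs in ties.items()}

def closeness_centrality(ties):
    """Reachable authors divided by the sum of shortest-path distances to them."""
    result = {}
    for s in ties:
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            for v in ties[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result

def betweenness_centrality(ties):
    """Brandes' algorithm, unweighted and undirected, without normalisation."""
    bc = dict.fromkeys(ties, 0.0)
    for s in ties:
        stack, preds = [], defaultdict(list)
        sigma = dict.fromkeys(ties, 0)
        sigma[s] = 1
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            stack.append(u)
            for v in ties[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = dict.fromkeys(ties, 0.0)
        while stack:
            w = stack.pop()
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {u: b / 2 for u, b in bc.items()}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

ties = build_network(papers)
degree = degree_centrality(ties)
closeness = closeness_centrality(ties)
betweenness = betweenness_centrality(ties)

# Correlate each author's degree centrality with their total tie strength.
authors = sorted(ties)
total_tie = [sum(ties[a].values()) for a in authors]
r = pearson([degree[a] for a in authors], total_tie)
```

In this toy network author C bridges the A-B pair and the D-E chain, so C has the highest betweenness, and the correlation between degree centrality and total tie strength is positive. On real data a regression step, as the abstract mentions, would follow the correlation analysis.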
Although co-authorship in scientific research has a long history, the analysis of co-authorship networks to explore scientific collaboration among authors is a relatively new research area. Current studies of co-authorship networks mostly emphasize understanding patterns of scientific collaboration, capturing collaborative statistics, and proposing valid and reliable measures for identifying prominent author(s). However, no study in the literature conducts a longitudinal analysis of co-authorship networks. Using a dataset that spans more than 20 years, this paper explores the efficiency and trends of co-authorship networks. Two scientists are considered connected if they have co-authored a paper, and these connections between pairs of scientists eventually constitute co-authorship networks. Co-authorship networks evolve among researchers over time in specific research domains as well as in interdisciplinary research areas. Scientists from diverse research areas and different geographical locations may participate in one specific co-authorship network, whereas an individual scientist may belong to different co-authorship networks. In this paper, we study a longitudinal co-authorship network of a specific scientific research area. By applying approaches to
Autism Spectrum Disorder (ASD) is a group of neurodevelopmental disabilities that are not curable but may be ameliorated by early interventions. We gathered early-detected ASD datasets relating to toddlers, children, adolescents and adults, and applied several feature transformation methods, including log, Z-score and sine functions, to these datasets. Various classification techniques were then implemented with these transformed ASD datasets and assessed for their performance. We found that SVM showed the best performance for the toddler dataset, while Adaboost gave the best results for the children dataset, Glmboost for the adolescent dataset and Adaboost for the adult dataset. The feature transformations resulting in the best classifications were the sine function for the toddler dataset and Z-score for the children and adolescent datasets. After these analyses, several feature selection techniques were used with these Z-score-transformed datasets to identify the significant ASD risk factors for the toddler, child, adolescent and adult subjects. The results of these analytical approaches indicate that, when appropriately optimised, machine learning methods can provide good predictions of ASD status. This suggests that it may be possible to apply these models for the detection of ASD in its early stages.
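The three feature transformations named in this abstract can be sketched as follows. The abstract does not give their exact definitions, so the choices here are assumptions: `log(1 + x)` for the log transform, the population standard deviation for the Z-score, and a plain element-wise sine. The function names and the sample column are hypothetical.

```python
import math
from statistics import mean, pstdev

def log_transform(values):
    """Assumed form: log(1 + x), which is defined at x = 0 (needs x > -1)."""
    return [math.log1p(v) for v in values]

def z_score(values):
    """Centre on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def sine_transform(values):
    """Element-wise sine of each feature value (radians)."""
    return [math.sin(v) for v in values]

# Hypothetical numeric feature column, for illustration only.
column = [0.0, 1.0, 2.0, 3.0, 4.0]

logged = log_transform(column)
standardised = z_score(column)
sined = sine_transform(column)
```

Each transformed column would then be passed to the classifiers in turn, so that transformation and classifier can be compared as a pair, which is the comparison the abstract describes.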
Several studies use scientific literature to compare scientific activities (e.g., productivity and collaboration). In this study, using co-authorship data over the last 40 years, we present the evolutionary dynamics of multi-level (i.e., individual, institutional and national) collaboration networks to explore the emergence of collaborations in the research field of "steel structures". The collaboration network of scientists in the field has been analyzed using author affiliations extracted from Scopus between 1970 and 2009. We have studied collaboration distribution networks at the micro-, meso- and macro-levels for the 40 years. We compared and analyzed a number of properties of these networks (i.e., density, centrality measures, the giant component and clustering coefficient) to present a longitudinal analysis and statistical validation of the evolutionary dynamics of "steel structures" collaboration networks. At all levels, the scientific collaboration network structures were centralized in terms of closeness centralization, while betweenness and degree centralization were much lower. In general, network density, connectedness, centralization and clustering coefficient were highest at the macro-level and decreased as network size grew, reaching their lowest values at the micro-level. We also find that the average distance is about two between countries, five between institutes and eight between authors, meaning that only about eight steps are necessary to get from one randomly chosen author to another.
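The network properties compared in this abstract (density, clustering coefficient, giant component and average distance) can be computed for a small undirected graph as sketched below. The `graph` is a hypothetical toy network, the function names are illustrative, and the definitions are standard textbook ones that the study may or may not have used in exactly this form.

```python
from collections import deque

# Hypothetical toy collaboration network as adjacency sets (undirected).
graph = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C"}, "E": {"F"}, "F": {"E"},
}

def density(g):
    """Existing edges divided by the number of possible edges."""
    n = len(g)
    m = sum(len(nbrs) for nbrs in g.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def local_clustering(g, u):
    """Fraction of pairs of u's neighbours that are themselves linked."""
    nbrs = list(g[u])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in g[nbrs[i]])
    return 2 * links / (k * (k - 1))

def average_clustering(g):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(g, u) for u in g) / len(g)

def bfs_distances(g, source):
    """Shortest-path length from source to every node reachable from it."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def giant_component(g):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for s in g:
        if s not in seen:
            component = set(bfs_distances(g, s))
            seen |= component
            if len(component) > len(best):
                best = component
    return best

def average_distance(g, nodes):
    """Mean shortest-path length over connected ordered pairs within nodes."""
    total = pairs = 0
    for s in nodes:
        for t, d in bfs_distances(g, s).items():
            if t != s and t in nodes:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

giant = giant_component(graph)
network_density = density(graph)
clustering = average_clustering(graph)
avg_dist = average_distance(graph, giant)
```

Restricting the average distance to the giant component is a common way to keep the measure finite when the network is disconnected. Running the same functions on yearly or per-level snapshots would give the kind of longitudinal comparison the abstract reports.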