2018
DOI: 10.7717/peerj.4349

Comparison of complex networks and tree-based methods of phylogenetic analysis and proposal of a bootstrap method

Abstract: Complex networks have been successfully applied to the characterization and modeling of complex systems in several distinct areas of Biological Sciences. Nevertheless, their utilization in phylogenetic analysis still needs to be widely tested, using different molecular data sets and taxonomic groups, and, also, by comparing complex networks approach to current methods in phylogenetic analysis. In this work, we compare all the four main methods of phylogenetic analysis (distance, maximum parsimony, maximum like…

Cited by 4 publications (11 citation statements)
References 31 publications
“…However, cautions also need to be paid during the choice of each method. For tree-based modeling, although it has the advantage of being easy to understand, being useful in data exploration, requiring less data cleaning, with no constraint on data type, and being non-parametric method, it still confronts with the challenge of over fitting, which is one of the most practical difficulties for decision tree models and can only be solved by setting constraints on model parameters and pruning [23–25]. For k-means clustering, its ease of implementation, computational efficiency and low memory consumption has kept it very popular, yet its sensitivity to the initial centroids chosen, the potential bias to create clusters of equal size, and lack of robustness to outliers require further adjustment while using this method [29, 37].…”
Section: Discussion (mentioning)
confidence: 99%
“…Having been proposed to be one of the best and mostly used supervised learning methods, tree based methods empower predictive models for both categorical and continuous input and output variables, and map both linear and non-linear relationships quite well [23–25]. Here we use interaction trees to capture treatment-subgroup interactions by recursively splitting the group of patients based on pretreatment characteristics, such that in each split the treatment-split interaction is maximized.…”
Section: Methods (mentioning)
confidence: 99%
“…To summarize the network approach applied here, we describe the main steps performed: The proposed approach used here requires less than 10 seconds to run on an Acer Intel Core i7-6700 CPU @ 3.40GHz for all data sets tested to date (<100 sequences). The scripts used here can be found on GitHub (https://github.com/deCarvalho90/network_analysis) and the software with a graphical interface is available in the work by Goés-Neto et al 51…”
Section: Network Analysis (mentioning)
confidence: 99%
“…Such a multi-dimensional set was employed to simultaneously investigate the structural topology of similarity networks inferred from these proteins, integrating signal from multiple orthologs, all supported by a bootstrap analysis to assess the confidence of the observed clustering patterns. Several contributions by some of us ( Góes-Neto et al, 2010 ; Andrade et al, 2011 ; Carvalho et al, 2015 ; Góes-Neto et al, 2018 , Carvalho, Schnable & Almeida, 2018 ) have shown that the original single-layer network version of this method is reliable and consistent. Indeed, it has been explicitly demonstrated that it produces results comparable to those obtained using distance, maximum parsimony, maximum likelihood, and Bayesian methods ( Góes-Neto et al, 2018 ).…”
Section: Introduction (mentioning)
confidence: 98%
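
The statement above refers to similarity networks whose clustering patterns are checked with a bootstrap analysis. As a rough illustration of that general idea only, and not the cited authors' method, the following Python sketch makes several simplifying assumptions: sequences are pre-aligned and of equal length, an edge joins two sequences when their fractional identity meets a hypothetical `threshold`, clusters are plain connected components (a stand-in for the community detection a real analysis would use), and support is the fraction of column-resampled replicates in which two sequences share a cluster. All function names and parameters are illustrative.

```python
import random
from itertools import combinations


def identity(a, b, cols):
    """Fraction of the selected alignment columns where a and b agree."""
    return sum(a[i] == b[i] for i in cols) / len(cols)


def components(names, edges):
    """Connected components of an undirected graph, via depth-first search."""
    adj = {n: set() for n in names}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, groups = set(), []
    for n in names:
        if n in seen:
            continue
        stack, group = [n], set()
        while stack:
            x = stack.pop()
            if x in group:
                continue
            group.add(x)
            stack.extend(adj[x] - group)
        seen |= group
        groups.append(frozenset(group))
    return groups


def similarity_network(seqs, cols, threshold):
    """Cluster sequences by linking pairs whose identity is >= threshold."""
    names = sorted(seqs)
    edges = [(u, v) for u, v in combinations(names, 2)
             if identity(seqs[u], seqs[v], cols) >= threshold]
    return components(names, edges)


def bootstrap_support(seqs, threshold=0.8, replicates=200, seed=0):
    """Share of column-resampled replicates in which each pair co-clusters."""
    rng = random.Random(seed)
    length = len(next(iter(seqs.values())))
    names = sorted(seqs)
    counts = {pair: 0 for pair in combinations(names, 2)}
    for _ in range(replicates):
        # Resample alignment columns with replacement (nonparametric bootstrap).
        cols = [rng.randrange(length) for _ in range(length)]
        for group in similarity_network(seqs, cols, threshold):
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return {pair: c / replicates for pair, c in counts.items()}
```

Calling `bootstrap_support` on a dictionary that maps sequence names to aligned strings returns one support value per pair of names; pairs that co-cluster in most replicates are the ones whose grouping is robust to resampling of the alignment columns.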
“…Nevertheless, broad consensual evidence for its utility in real-world datasets is still lacking ( Spielman & Kosakovsky Pond, 2018 ; Abadi et al, 2019 ; Spielman, 2020 ). As an alternative strategy, several authors (including most of us) have provided evidence that Network Science approaches based on Sequence Similarity Networks (SSNs) ( Bapteste et al, 2013 ) help the discovery of relationships between taxa ( Atkinson et al, 2009 ; Andrade et al, 2011 ; Larremore, Clauset & Buckee, 2013 ; Corel et al, 2016 ; Chowdhary, Löffler & Smith, 2017 ; Solís-Lemus, Bastide & Ané, 2017 ; Góes-Neto et al, 2018 ). Therefore, complex networks have been continuously adopted to explain even complex evolutionary processes, including but not restricted to, horizontal gene transfer, gene domain fusion, and gene or genome introgression ( Corel et al, 2016 ; Pathmanathan et al, 2018 ; Ocana-Pallares et al, 2019 ).…”
Section: Introduction (mentioning)
confidence: 99%