2022
DOI: 10.1186/s12859-022-04714-x

Machine learning and statistics shape a novel path in archaeal promoter annotation

Abstract: Background: Archaea are a vast and unexplored domain. Bioinformatic techniques might enlighten the path to a higher quality genome annotation in varied organisms. Promoter sequences of archaea have the action of a plethora of proteins upon it. The conservation found in a structural level of the binding site of proteins such as TBP, TFB, and TFE aids RNAP-DNA stabilization and makes the archaeal promoter prone to be explored by statistical and machine learning techniques. …

Cited by 9 publications (15 citation statements)
References 34 publications
“…First, we posit for the balance of the classification achieved by our train/test (84.55% precision) and the external sequences (81%), which is further confirmed by the likeliness of the DDS profile of these sequences. Next, we argue upon the fact that better performance has been obtained classifying archaea (Martinez et al, 2022), eukarya (Oubounyt et al, 2019), and bacteria (de Avila e Silva et al, 2011). The authors mainly employed Artificial Neural Networks for classification, which poses as a classification rationale that is mathematically more complex than the SVMs we employed (Pisner and Schnyder, 2020).…”
Section: Discussion (mentioning)
confidence: 87%
“…Promoters were derived from the transcripts of the organisms. In addition, 405 promoters in-silico predicted (Martinez et al, 2022) of A. boonei and T. pendens were considered. In brief, Martinez et al (2002) employed machine learning and statistics to validate promoter sequences of unannotated archaea.…”
Section: Methods (mentioning)
confidence: 99%
“…It has been previously benchmarked by Ref. 19 that among non-coding and shuffled promoter sequences, the shuffled version of a control dataset yielded in the worst classification. Thus, we opted to use this method of obtaining negative sequences to stress our classification model.…”
Section: Methods (mentioning)
confidence: 99%
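
The negative-set strategy described in the statement above (shuffling promoter sequences to obtain hard negatives) can be illustrated with a minimal sketch. It assumes a plain mononucleotide shuffle, which preserves each sequence's length and base composition; the cited work may have used a different shuffling scheme. The function names and the two example sequences are hypothetical and are not taken from any of the cited papers.

```python
import random


def shuffle_sequence(seq: str, rng: random.Random) -> str:
    """Return a copy of seq with its bases in random order.

    Assumption: a simple mononucleotide shuffle, which keeps length and
    base composition but destroys positional motifs.
    """
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def make_negative_set(promoters, seed=0):
    """Build one shuffled negative sequence per promoter (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the negative set is reproducible
    return [shuffle_sequence(p, rng) for p in promoters]


# Hypothetical toy promoters, for illustration only.
promoters = ["TTTATATAGGCATGCA", "AAAGCTTATATACGGT"]
negatives = make_negative_set(promoters)
```

Because the shuffled sequences share the base composition of the real promoters, a classifier cannot separate the two classes on composition alone, which is consistent with the statement's description of this control as the hardest one to classify.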
“…Many promoter predictors use ML as their form of classification. In this task, different algorithms have been proposed, such as Artificial Neural Networks [13, 15, 19], Support Vector Machines (SVM) [11], Recurrent Neural Networks [16], among others. However, due to the mathematical complexity, many of these tools will function as black-box classifiers, where one just knows the output associated with given input features.…”
Section: Introduction (mentioning)
confidence: 99%