ALE: a generic assembly likelihood evaluation framework for assessing the accuracy of genome and metagenome assemblies

Clark, Scott; Egan, Rob; Frazier, Peter I.; Zhong, Wang

doi:10.1093/bioinformatics/bts723

Cited by 168 publications

(145 citation statements)

References 39 publications

Supporting

Mentioning

139

Contrasting

Unclassified


“…This assembler applies a bi-directional De Bruijn graph, solving 'complex knots', under a range of different k-mer settings. Finally, (vi) in order to select among candidate assemblies from the SPAdes assembly an 'assembly likelihood estimation' (Clark et al 2013) is applied that calculates the likelihood of the fit of the original reads to each candidate assembly, using a model that includes parameters such as 'read quality', 'mate pair orientation', 'read alignment' and 'sequence coverage'. The ALE test therefore assures assembly quality at the read level (Clark et al 2013), rather than using mere maximum length or highest N50 as a criterion.…”

Section: A Herbarium Genomics Test-case (mentioning)

confidence: 99%

“…Finally, (vi) in order to select among candidate assemblies from the SPAdes assembly an 'assembly likelihood estimation' (Clark et al 2013) is applied that calculates the likelihood of the fit of the original reads to each candidate assembly, using a model that includes parameters such as 'read quality', 'mate pair orientation', 'read alignment' and 'sequence coverage'. The ALE test therefore assures assembly quality at the read level (Clark et al 2013), rather than using mere maximum length or highest N50 as a criterion. The assembly with the best -LnL score is selected as final assembly, to which the original reads are mapped again in order to visually check for possible anomalies, sequence breaks, coverage gaps etc.…”

Section: A Herbarium Genomics Test-case (mentioning)

confidence: 99%


Herbarium genomics: skimming and plastomics from archival specimens

Bakker

2017

Webbia
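The selection step described in the statements above (keep the candidate assembly with the best likelihood score) can be sketched as follows. This is a minimal illustration, not the citing authors' pipeline: the function names, the candidate labels and the score values are all hypothetical, and it assumes each candidate already has a total log-likelihood. Because the quoted text speaks of a "-LnL" score, a helper converts negative log-likelihoods back to log-likelihoods, under the assumption that "best" means highest likelihood.

```python
def from_negative_log_likelihoods(neg_scores):
    """Convert {name: -LnL} into {name: LnL} (assumed convention)."""
    return {name: -value for name, value in neg_scores.items()}


def best_assembly(log_likelihoods):
    """Return the name of the candidate with the highest log-likelihood."""
    if not log_likelihoods:
        raise ValueError("no candidate assemblies were supplied")
    return max(log_likelihoods, key=log_likelihoods.get)


# Hypothetical scores for three candidate assemblies built with different k-mer sizes.
candidates = {"k21": -1500.0, "k33": -1200.0, "k55": -1350.0}
chosen = best_assembly(candidates)
```

Under these made-up numbers the candidate labelled "k33" has the least negative log-likelihood, so it would be the one kept for the final read re-mapping and visual check.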


“…The ALE metric [2]. Unlike the metrics listed above, this metric requires the set of input reads, in addition to the set of contigs, in order to assess assembly quality.…”

Section: Reference-free methods (unclassified)

“…As output it returns the logarithm of the probability that the assembly is correct given the set of reads. To do this, three factors are assessed: how well the content of the reads matches the assembly, how well the a priori distances between paired reads match those resulting from the assembly, and how well the a priori depth of coverage at each position matches that resulting from the assembly, based on GC content [2].…”

Section: Reference-free methods (unclassified)
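The three factors named in the statement above can be combined into a single log score roughly as sketched below. This is a toy illustration under stated assumptions, not ALE's implementation: it assumes the factors are independent so their log-probabilities add, models paired-read distances with a normal distribution and per-position depth with a Poisson distribution, and expects the caller to supply a GC-adjusted expected depth for each position. All function names and distributional choices here are hypothetical.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution (assumed model for insert sizes)."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_poisson_pmf(k, lam):
    """Log probability of observing depth k when the expected depth is lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def placement_log_score(base_match_probs):
    """Sum of log-probabilities that each aligned read base matches the assembly."""
    return sum(math.log(p) for p in base_match_probs)


def insert_log_score(observed_inserts, mean, sd):
    """How well observed paired-read distances fit the expected distribution."""
    return sum(log_normal_pdf(x, mean, sd) for x in observed_inserts)


def depth_log_score(depths, expected_depths):
    """How well observed depth fits the (GC-adjusted) expected depth per position."""
    return sum(log_poisson_pmf(d, lam) for d, lam in zip(depths, expected_depths))


def total_log_score(base_match_probs, inserts, mean, sd, depths, expected_depths):
    """Add the three component log scores (independence is an assumption)."""
    return (placement_log_score(base_match_probs)
            + insert_log_score(inserts, mean, sd)
            + depth_log_score(depths, expected_depths))
```

Working in log space keeps the product of many small probabilities numerically stable, which is why the reported value is a logarithm rather than a raw probability.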

A new method of evaluating genome assemblies based on kmers frequencies

Romanenkov

2017

KIAM Prepr.


Kirill Vladimirovich Romanenkov. A new method of evaluating genome assemblies based on kmers frequencies. Running different genome assemblers, or one genome assembler with different parameters, on the same input data commonly leads to a great variety of results. However, there is no generally recognized method for choosing the best assembly. This article introduces a new reference-free method, based on the Jellyfish software, for evaluating genome assemblies by analysis of k-mer frequencies. The proposed method establishes a correspondence between the short reads obtained from the sequencer and the assembled genome, which allows a more accurate assessment of the assembly. The method was validated on different assemblies of the fungus Encephalitozoon cuniculi. It was found that in most cases it correlates with reference-dependent metrics and can correctly identify the best assembly. Furthermore, no relationship was observed between assembly quality and standard reference-free metrics. Keywords: k-mer frequencies, comparison of genome assemblies, genome assembly quality assessment, Encephalitozoon cuniculi.
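The general idea of relating read k-mers to assembly k-mers can be illustrated with a small sketch. It is not the method of the paper above, whose details are not given here, and it does not use Jellyfish: it counts k-mers in pure Python and reports the fraction of distinct read k-mers that also occur in the assembly. The function names and the toy sequences are hypothetical, and reverse complements and sequencing errors are ignored for brevity.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every k-mer occurring in a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def shared_kmer_fraction(read_counts, assembly_counts):
    """Fraction of distinct read k-mers that also appear in the assembly."""
    if not read_counts:
        return 0.0
    shared = sum(1 for kmer in read_counts if kmer in assembly_counts)
    return shared / len(read_counts)


# Toy data: two short reads and two candidate assemblies.
reads = count_kmers(["ACGTAC", "CGTACG"], k=3)
full = shared_kmer_fraction(reads, count_kmers(["ACGTACG"], k=3))
partial = shared_kmer_fraction(reads, count_kmers(["ACGT"], k=3))
```

With these toy inputs the longer candidate contains every read k-mer while the shorter one contains only half of them, so a score of this kind would favour the longer candidate.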


“…Other methods are available for evaluating reference genomes, e.g. amosValidate [18] and ALE [19], however, these methods only assess assembly accuracy without correcting misassemblies. The resulting reference assembly represents the consensus genome of the population of cells used to generate the material.…”

Section: Genome Evaluation (mentioning)

confidence: 99%

PEPR: pipelines for evaluating prokaryotic references

et al. 2016


The rapid adoption of microbial whole genome sequencing in public health, clinical testing, and forensic laboratories requires the use of validated measurement processes. Well-characterized, homogeneous, and stable microbial genomic reference materials can be used to evaluate measurement processes, improving confidence in microbial whole genome sequencing results. We have developed a reproducible and transparent bioinformatics tool, PEPR, Pipelines for Evaluating Prokaryotic References, for characterizing the reference genome of prokaryotic genomic materials. PEPR evaluates the quality, purity, and homogeneity of the reference material genome, and purity of the genomic material. The quality of the genome is evaluated using high coverage paired-end sequence data; coverage, paired-end read size and direction, as well as soft-clipping… Published in the topical collection featuring…



Abstract: ALE is released as open source software under the UoI/NCSA license at http://www.alescore.org. It is implemented in C and Python.
