2017
DOI: 10.1109/tcbb.2016.2586070

A Survey of Software and Hardware Approaches to Performing Read Alignment in Next Generation Sequencing

Abstract: Computational genomics is an emerging field that is enabling us to reveal the origins of life and the genetic basis of diseases such as cancer. Next Generation Sequencing (NGS) technologies have unleashed a wealth of genomic information by producing immense amounts of raw data. Before any functional analysis can be applied to this data, read alignment is applied to find the genomic coordinates of the produced sequences. Alignment algorithms have evolved rapidly with the advancement in sequencing technology, st…

Cited by 14 publications (17 citation statements)
References 76 publications
“…The variant calling bioinformatics pipeline is contained within the data pre-processing and data analysis stages of a much larger bioinformatics-based research study as illustrated in the upper panel. Data file formats at each step are presented in the lower panel ( Al Kawam et al , 2017 ; Lightbody et al , 2019 ). *Platform-specific raw sequence output either .BAM or .FASTQ or .HDF5 ( NCBI, 2019 ).…”
Section: Genome Sequencing and Genomic Data Analysis (mentioning)
confidence: 99%
“…BWT was a text compression algorithm which was later used for genomic classification applications [3]. Along with FM-index this algorithm has found applications in genomic industry.…”
Section: Burrows-Wheeler Transform with FM-index (mentioning)
confidence: 99%
“…3) The transformation result required is the last column of the strings obtained [3]. Illustrating BWT with an example considering the string "BANANA$" ($ is the string terminator): [Table 2: Sorting] Therefore, BWT(BANANA$) = ANNB$AA. BWT is particularly useful as it creates an array whose rows are formed by cyclic shifting of the target string, that may be a DNA sequence.…”
Section: Burrows-Wheeler Transform with FM-index (mentioning)
confidence: 99%
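The BANANA$ example in the statement above can be reproduced with a short sketch. This is a naive illustration that sorts all cyclic rotations, not the method of any cited paper or aligner; the function names `bwt` and `inverse_bwt` are hypothetical, and the code assumes the terminator `$` sorts before every other character (true for plain ASCII letters).

```python
def bwt(text: str, terminator: str = "$") -> str:
    """Naive Burrows-Wheeler transform via sorted cyclic rotations."""
    # Assumption: the terminator sorts before every other symbol in the text.
    if not text.endswith(terminator):
        text += terminator
    # Build every cyclic shift of the string, then sort them lexicographically.
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, terminator: str = "$") -> str:
    """Naive inverse: rebuild the sorted rotation table one column at a time."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        # Prepend the last column to each row, then re-sort the rows.
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    # The original string is the row that ends with the terminator.
    return next(row for row in table if row.endswith(terminator))


if __name__ == "__main__":
    transformed = bwt("BANANA$")
    print(transformed)               # ANNB$AA
    print(inverse_bwt(transformed))  # BANANA$
```

Sorting full rotations takes memory quadratic in the string length, so this approach only suits small examples; practical indexes over genome-scale references are typically built from a suffix array instead.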
“…Therefore, bioinformatics researchers started to think about new ways to efficiently manage and analyze such enormous amount of data. The first crucial step in the analysis of next-generation sequencing (NGS) data, posterior to quality control and filtering steps, is alignment (mapping) of generated sequencing reads to the respective reference [6]. However, this step is biased by many errors due to the following reasons [7]: (1) a reference genome is generally long (~ billions) and presents complex regions such as repetitive elements (repetitive regions are usually masked because there is no consensus about how to deal with them, yet), (2) reads are short in length (typically, 50-150 bp), causing issues with efficiency and accuracy, aligning more likely in multiple locations rather than to unique positions in the reference genome, (3) the subject genome could inherently be different from the reference genome due to acquired alterations over time.…”
Section: Introduction (mentioning)
confidence: 99%