The performance of local and global alignment algorithms constrains the alignment of sequences shorter than 30 nucleotides. Regardless of the computational approach applied, limitations accumulate as the length of the aligned sequences and the similarity between them decrease, often resulting in alignment biases: local alignments have difficulty correctly reporting the number of matches and mismatches (m/mm) flanking the seed, whereas global alignments lengthen the total alignment size and introduce gaps artificially. These biases compromise the accuracy of computational analyses of short sequences. Here we report ExtendAlign, a computational tool that corrects the aforementioned biases generated by local and global alignments. ExtendAlign provides an end-to-end report of the accurate number of m/mm for all the nucleotides that flank a local alignment of short sequences, thus eliminating the artificial lengthening of the query size, the introduction of gaps, and the failure to report flanking m/mm. Because ExtendAlign combines the refinement and strength of global and local multiple sequence alignments, it delivers exceptional accuracy in correcting the alignment of dissimilar sequences in the range of ~35-50% similarity (also known as the twilight zone), indicating that it can be adopted routinely whenever high accuracy is required for short-sequence alignments.
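The idea of extending a local hit end-to-end can be illustrated with a minimal Python sketch. This is not ExtendAlign's actual implementation: the function name `extend_flanks`, the 0-based half-open coordinates, and the choice to report query bases that run past the end of the subject as a separate `overhang` count are all assumptions made for illustration. Given the coordinates of a local alignment, the sketch pairs the unaligned query nucleotides on each side with the adjacent subject nucleotides, position by position and without gaps, and tallies matches and mismatches.

```python
def extend_flanks(query, subject, q_start, q_end, s_start, s_end):
    """Count matches/mismatches in the ungapped flanks of a local hit.

    Hypothetical helper, for illustration only. Coordinates are 0-based
    and half-open: the local alignment covers query[q_start:q_end] and
    subject[s_start:s_end].
    Returns (matches, mismatches, overhang), where overhang is the number
    of query bases with no subject base to pair with (an assumption of
    this sketch, not a documented ExtendAlign output).
    """
    matches = mismatches = overhang = 0

    # Left flank: walk outward from the start of the hit, one base at a time.
    for k in range(1, q_start + 1):
        s_pos = s_start - k
        if s_pos < 0:
            # The subject is exhausted; the remaining query bases overhang.
            overhang += q_start - k + 1
            break
        if query[q_start - k] == subject[s_pos]:
            matches += 1
        else:
            mismatches += 1

    # Right flank: walk outward from the end of the hit.
    right_len = len(query) - q_end
    for k in range(right_len):
        s_pos = s_end + k
        if s_pos >= len(subject):
            overhang += right_len - k
            break
        if query[q_end + k] == subject[s_pos]:
            matches += 1
        else:
            mismatches += 1

    return matches, mismatches, overhang


# Local hit: query[4:10] == subject[6:12] == "ACGTAC".
# The four query bases to the left of the hit are then compared without gaps.
flank_counts = extend_flanks("ACGTACGTAC", "TTACGAACGTACGG", 4, 10, 6, 12)
```

Because the flanks are compared without gaps, the total alignment length never exceeds the query length, which is the property the abstract contrasts with global alignment.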