Bio-Strings: A Relational Database Data-Type for Dealing with Large Biosequences

Lifschitz, Sérgio; Hæusler, Edward Hermann; Catanho, Marcos; Miranda, Antônio Basílio de; Armas, Elvismary Molina de; Heine, Alexandre; Moreira, Sergio G. M. P.; Tristão, Cristian

doi:10.3390/biotech11030031

Cited by 16 publications

(14 citation statements)

References 21 publications

(21 reference statements)

“…For GC content analysis, a custom R function, calculate-GC-Content, was developed using the ‘Biostrings’ package 89 . This function read the sequences from FASTA files and calculated the GC content by aggregating guanine and cytosine nucleotide counts across all sequences, measuring the genomic GC proportion of the bacteria.…”

Section: Methods (mentioning)

confidence: 99%
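
The aggregate GC calculation described in the statement above can be sketched as follows. The citing authors used R with the Biostrings package; this is a standalone Python illustration, not their calculate-GC-Content function. The names `read_fasta` and `gc_content` are hypothetical, and counting only A, C, G and T in the denominator (so that ambiguous bases such as N are ignored) is an assumption the quote does not confirm.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def gc_content(records: Iterable[Tuple[str, str]]) -> float:
    """Return the G+C fraction aggregated over all sequences.

    Assumption: only A, C, G and T count toward the total,
    so ambiguous bases such as N are ignored.
    """
    gc = total = 0
    for _, seq in records:
        s = seq.upper()
        g_c = s.count("G") + s.count("C")
        a_t = s.count("A") + s.count("T")
        gc += g_c
        total += g_c + a_t
    return gc / total if total else 0.0
```

For a file on disk, something like `with open(path) as fh: gc_content(read_fasta(fh))` would give one genome-wide proportion rather than a per-sequence value, which matches the aggregation the statement describes.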

A Caenorhabditis elegans based system for high-throughput functional phenotyping of human gut microbiota

Ambat, Valappil, Ghimire et al. 2024

Preprint

Summary: Bottoms-up approach of mono or poly colonizing microbes in germfree model is an important tool for mechanistic understanding of human gut microbiota. However, doing this in models such as germfree mouse is expensive and time consuming. To address this problem, we developed a Caenorhabditis elegans based screening system. We used a gut microbiota culture collection that represents more than 70% functional capacity of the human gut microbiome to anaerobically colonize C. elegans. We chose colonization resistance as the phenotype of the microbiome for further screening and found that half of the strains, previously identified in vitro as inhibiting Clostridioides difficile, also did so in C. elegans. When validated using germ-free mouse model, results were in concordance with that obtained from C. elegans model. Our system therefore could be used for screening large number of bacterial species to better understand host-microbiome interaction.

“… 25 https://doi.org/10.1093/bioinformatics/bty633 Biostrings, v2.66.0 Lifschitz et al. 26 https://doi.org/10.18129/B9.bioc.Biostrings GGally, v2.1.2 Schloerke et al. 27 https://ggobi.github.io/ggally/ ggseqlogo, v0.1 Wagih et al.…”

Section: Key Resources Table (mentioning)

confidence: 99%

Protocol for fast clonal family inference and analysis from large-scale B cell receptor repertoire sequencing data

Wang, Cai, Wang et al. 2024

STAR Protocols

“…Variant-transcript pairs with a PTC conforming to any of the above rules will be annotated to escape NMD, but results for all rules are reported individually by aenmd; this allows users to focus on subsets of rules, if desired. aenmd is implemented in the R programming language [45], making use of the VariantAnnotation [46] and vcfR [47] packages for importing and exporting variants from vcf files, and the Biostrings [48] and GenomicRanges [49] packages for calculating rules. An index containing all PTC-generating SNVs is pre-calculated for a given transcript set and stored in a trie data structure for lookup, using the triebeard package.…”

Section: Annotating Escape From NMD (mentioning)

confidence: 99%
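
The trie-based lookup mentioned in the statement above can be illustrated with a minimal sketch. aenmd itself relies on the R package triebeard; the Python class below is a generic character trie written for illustration only. The `Trie` class, the variant key format `chrom:pos:ref:alt`, and the stored transcript identifiers are all hypothetical and are not taken from aenmd.

```python
class Trie:
    """Minimal character trie mapping string keys to values."""

    # Multi-character marker, so it cannot collide with single-character edges.
    _END = "$end"

    def __init__(self):
        self.root = {}

    def insert(self, key, value):
        """Store value under key, creating intermediate nodes as needed."""
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node[self._END] = value

    def get(self, key, default=None):
        """Return the value stored for an exact key match, else default."""
        node = self.root
        for ch in key:
            node = node.get(ch)
            if node is None:
                return default
        return node.get(self._END, default)


# Hypothetical precomputed index: variant key -> affected transcript IDs.
index = Trie()
index.insert("chr1:100:C:T", ["TX_A"])
index.insert("chr1:100:C:A", ["TX_A", "TX_B"])

hit = index.get("chr1:100:C:T")    # exact key present in the index
miss = index.get("chr1:100:C:G")   # shares a prefix but was never inserted
```

Keys that share a prefix (here the chromosome and position) share nodes, so a lookup costs time proportional to the key length rather than to the number of indexed variants.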

aenmd: Annotating escape from nonsense-mediated decay for transcripts with protein-truncating variants

Klonowski, Liang, Akdemir et al. 2023

Preprint

Summary: DNA changes that cause premature termination codons (PTCs) represent a large fraction of clinically relevant pathogenic genomic variation. Typically, PTCs induce a transcript's degradation by nonsense-mediated mRNA decay (NMD) and render such changes loss-of-function alleles. However, certain PTC-containing transcripts escape NMD and can exert dominant-negative or gain-of-function (DN/GOF) effects. Therefore, systematic identification of human PTC-causing variants and their susceptibility to NMD contributes to the investigation of the role of DN/GOF alleles in human disease. Here we present aenmd, a software for annotating PTC-containing transcript-variant pairs for predicted escape from NMD. aenmd is user-friendly and self-contained. It offers functionality not currently available in other methods and is based on established and experimentally validated rules for NMD escape; the software is designed to work at scale, and to integrate seamlessly with existing analysis workflows. We applied aenmd to variants in the gnomAD, Clinvar, and GWAS catalog databases and report the prevalence of human PTC-causing variants in these databases, and the subset of these that could exert DN/GOF effects via NMD escape. Availability and implementation: aenmd is implemented in the R programming language. Code is available on GitHub as an R package (github.com/kostkalab/aenmd.git), and as a containerized command-line interface (github.com/kostkalab/aenmd_cli.git).
