2020
DOI: 10.1038/s41436-019-0643-6

AVADA: toward automated pathogenic variant evidence retrieval directly from the full-text literature

Abstract: The primary literature on human genetic diseases with high penetrance includes descriptions of large numbers of pathogenic variants that can be essential for clinical diagnosis. Variant databases such as ClinVar and HGMD collect pathogenic variants by manual curation of either voluntary submissions or the published literature. AVADA (Automatically curated VAriant DAtabase) represents the first automated tool designed to construct a comprehensive database of highly penetrant genetic variants directly from full-…

Cited by 25 publications
(19 citation statements)
References 43 publications
“…Researchers at HGMD have been involved for several years in attempts to automatically extract mutation data from the literature. We recently contributed towards the Automatic Variant Evidence Database (AVADA), a novel machine learning tool that uses natural language processing to automatically identify pathogenic genetic variant evidence in full-text primary literature (Birgmeier et al 2020). AVADA automatically retrieved 58% of the likely disease-causing variants deposited in HGMD.…”
Section: Automated Mutation Retrieval (mentioning)
confidence: 99%
“…Five new tracks were created to support the assessment of sequence variants in a clinical context: gnomAD Constraint Metrics (metrics of pathogenicity per-gene and transcript regions) (Variation Group) (8); gnomAD Structural Variants (allele frequencies of SVs in the common population) (Variation Group) (8); dbVar Curated Common Structural Variants (Variation Group) (9); Automatic Variant evidence Database (AVADA) variants extracted from full-text publications (Phenotype and Literature Group) (10); the ClinGen track collection, including Gene Dosage Sensitivity (haploinsufficiency and triplosensitivity) (Phenotype and Literature Group) (11), and Problematic Regions (regions known to cause short-read sequencing analysis artifacts) (Mapping and Sequencing Group).…”
Section: Annotations and Visualizations (mentioning)
confidence: 99%
“…For example, Birgmeier and co-authors developed an end-to-end machine learning tool, named AVADA, for the automatic retrieval of variant evidence directly from full-text literature. 39 Suppose we can accumulate enormous datasets of evidence-related sentences or figures, in that case, it is possible to apply machine-learning approaches in the future for evidence retrieval and to automate the remaining ACMG/AMP rules in the next version of VIP-HL. In the meantime, our interface enables curators to manually activate the relevant codes after manual literature curation.…”
Section: Discussion (mentioning)
confidence: 99%