Brian Alcock - The Comprehensive Antibiotic Resistance Database - Curating the Global Resistome

Session: April 25 - Applications 1 - 02

Abstract: The Comprehensive Antibiotic Resistance Database (CARD; card.mcmaster.ca) is an ontologically-driven knowledgebase and bioinformatics resource on the molecular biology and chemical components of antimicrobial resistance (AMR). This is achieved by integrating the Antibiotic Resistance Ontology (ARO) with the CARD Model Ontology (MO), which is used to organize AMR gene (ARG) sequences, resistance-conferring mutation data and bioinformatic parameters for in silico ARG detection by CARD’s Resistance Gene Identifier (RGI) software. To preserve integrity over time, the ARO is routinely updated by a biocuration team through, for example, the addition of novel AMR genes or gene variants or the revision of existing ontology branches for clarity, accuracy and/or computational efficiency. While manual curation of the literature and sequences is a cornerstone of CARD’s curation philosophy, the volume of AMR scientific literature makes a purely manual approach time-consuming and impractical. We therefore developed CARD*Shark, an algorithm and software for computer-assisted AMR literature triage. The current iteration, CARD*Shark 3, uses a machine-learning methodology to identify and prioritize literature, which is then reviewed by a CARD biocurator. CARD thereby integrates continuous curation from multiple approaches: computer-assisted literature triage, identification of errors and oversights through public feedback such as our GitHub repository (https://github.com/arpcard/amr_curation), and targeted curation within collaborative projects or other efforts with a specific focus. To date, the ARO includes over 6500 terms, which combine with 5000 ARG reference sequences and almost 2000 resistance-associated variants sourced from over 3000 publications to produce the current version of CARD. These manually curated data are used as a baseline for in silico prediction of resistomes and ARG prevalence for over 300 pathogens.
Here we provide an overview of CARD’s design, curation methodology and overall content scope, and illustrate how a computer-assisted curation approach improves our efficacy and accuracy.
