The determination of lineages from strain-based molecular genotyping information is an important problem in tuberculosis. Mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) typing is a commonly used molecular genotyping approach that uses counts of the number of times pre-specified loci repeat in a strain. There are three main approaches for determining lineage based on MIRU-VNTR data: one based on direct comparison to the strains in a curated database, and two based on machine learning algorithms trained on a large collection of labeled data. All existing methods have limitations. The direct approach imposes an arbitrary threshold on how much a database strain may differ from the query strain and still be considered informative. The machine learning-based approaches, on the other hand, require a substantial amount of labeled data. Notably, all three methods exhibit suboptimal classification accuracy without additional data. We explore several computational approaches to address these limitations. First, we show that eliminating the arbitrary threshold improves the performance of the direct approach. Second, we introduce RuleTB, an alternative direct method that proposes a concise set of rules for determining lineages. Lastly, we propose StackTB, a machine learning approach that requires only a fraction of the training data to exceed the accuracy of both existing machine learning methods. Our approaches demonstrate superior performance on a training dataset collected in New York City over 10 years, and the improvement in performance translates to a held-out testing set. We conclude that our methods provide opportunities for improving the determination of pathogenic lineages based on MIRU-VNTR data.
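The direct approach described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that a profile is a tuple of repeat counts, that strains are compared by the number of differing loci, and that ties are resolved by majority vote. The names `hamming`, `assign_lineage`, and `max_distance`, the six-locus profiles, and the lineage labels are all hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of loci at which two MIRU-VNTR profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    return sum(x != y for x, y in zip(a, b))


def assign_lineage(query, database, max_distance=None):
    """Return the majority lineage among the closest database strains.

    database: list of (profile, lineage) pairs.
    max_distance: optional cutoff; None means no threshold is applied.
    Returns None when a cutoff is set and no strain falls within it.
    """
    distances = [(hamming(query, profile), lineage) for profile, lineage in database]
    best = min(d for d, _ in distances)
    if max_distance is not None and best > max_distance:
        return None
    votes = Counter(lineage for d, lineage in distances if d == best)
    return votes.most_common(1)[0][0]


# Made-up six-locus profiles and labels, for illustration only.
toy_db = [
    ((2, 3, 4, 2, 5, 1), "lineage_A"),
    ((2, 3, 4, 2, 5, 2), "lineage_A"),
    ((7, 1, 1, 6, 2, 3), "lineage_B"),
]
query = (2, 3, 9, 9, 9, 1)

print(assign_lineage(query, toy_db))                  # no threshold
print(assign_lineage(query, toy_db, max_distance=2))  # with a cutoff
```

In this toy case the closest database strain differs from the query at three loci, so a cutoff of two leaves the query unclassified, whereas the threshold-free call still returns a label. This is the kind of situation the threshold-free variant is meant to handle.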
The study of host-pathogen co-evolution is fundamental to understanding the emergence and spread of infectious diseases. The obligate human pathogen of the Mycobacterium tuberculosis complex (Mtbc) separates genetically into nine lineages with distinct patterns of geographical distribution that in some cases parallel that of human subpopulations. Based on these observations, geographically restricted Mtbc lineages have been hypothesized to be niche specialists that preferentially infect particular human subpopulations, but this is yet to be confirmed while controlling for social networks and risk of disease among exposed hosts. To address this question, we used a multi-site cohort of tuberculosis index cases with pathogen sequence data and linked contacts. Our data show that strains of specialist (spec) Mtbc lineages (L1, L2spec, L3, L4spec, L5, L6) are intrinsically less transmissible than generalist Mtbc lineages (L2gen, L4gen) across Western European and North American cosmopolitan populations. Comparing transmissibility between sympatric and allopatric host-pathogen pairs, we found the first controlled evidence for co-adaptation between Mtbc strains and their human hosts; allopatric host-pathogen exposures had a 32% decrease in the odds of infection among contacts compared with sympatric exposures. We measured 10-fold decreased phagocytosis and growth rates of L6 specialist strains compared to L4gen in in vitro allopatric macrophage infections. In conclusion, long-term co-adaptation between Mtbc strains and humans has resulted in differential transmissibility between allopatric and sympatric hosts for the specialist lineages. Understanding the specific genetic and immunological underpinnings of this co-adaptation may inform rational vaccine design and TB control.
Understanding tuberculosis (TB) transmission chains can help public health staff target their resources to prevent further transmission, but currently there are few tools to automate this process. We have developed the Logically Inferred Tuberculosis Transmission (LITT) algorithm to systematize the integration and analysis of whole-genome sequencing, clinical, and epidemiological data. Based on the work typically performed by hand during a cluster investigation, LITT identifies and ranks potential source cases for each case in a TB cluster. We evaluated LITT using a diverse dataset of 534 cases in 56 clusters (size range: 2–69 cases), which were investigated locally in three different U.S. jurisdictions. Investigators and LITT agreed on the most likely source case for 145 (80%) of 181 cases. By reviewing discrepancies, we found that many of the remaining differences resulted from errors in the dataset used for the LITT algorithm. In addition, we developed a graphical user interface, user's manual, and training resources to improve LITT accessibility for frontline staff. While LITT cannot replace thorough field investigation, the algorithm can help investigators systematically analyze and interpret complex data over the course of a TB cluster investigation. Code available at: https://github.com/CDCgov/TB_molecular_epidemiology/tree/1.0; https://zenodo.org/badge/latestdoi/166261171.
BACKGROUND: We have updated the epidemiology of tuberculosis (TB) among healthcare personnel (HCP) in New York City (NYC), USA, during a period of declining TB burden. METHODS: Using routinely collected Health Department data for NYC TB cases from 2001 to 2014, we conducted a retrospective descriptive analysis. P values were calculated using Pearson's χ² or Fisher's exact test for categorical data; Wilcoxon rank-sum test was used to compare medians. We used the Cochran-Armitage test for trend and linear regression for trend analyses. RESULTS: HCP accounted for 6% of adults with TB throughout the study period and were more likely than other adults to be female (68% vs. 37%, P ≤ 0.0001), have extrapulmonary-only disease (31% vs. 23%, P ≤ 0.0001), have an isolate with multidrug resistance (4% vs. 2%, P = 0.0211), and report a previous history of latent TB infection (LTBI) (51% vs. 23%, P ≤ 0.0001). Compared to non-US-born HCP, US-born HCP were more likely to have HIV infection (18% vs. 8%, P = 0.0011) or a genotypically clustered isolate (67% vs. 37%, P ≤ 0.0001) and less likely to report history of prior LTBI (43% vs. 54%, P = 0.0128). CONCLUSIONS: Further research is needed to explore transmission and occupational risk among HCP. New approaches are needed to optimize completion of prophylaxis for HCP with LTBI.
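As an illustration of the categorical comparisons named in the methods, the following sketch computes Pearson's χ² statistic and its P value for a 2×2 table using only the standard library. The function name `chi2_2x2` and the counts are hypothetical; the counts are made up for the example and are not study data. The abstract does not say whether a continuity correction was used, so none is applied here.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and P value
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Made-up counts: rows are two groups, columns are characteristic present/absent.
stat, p = chi2_2x2(30, 70, 20, 80)
print(f"chi2 = {stat:.3f}, P = {p:.4f}")
```

When expected cell counts are small, the χ² approximation is unreliable, which is why Fisher's exact test is the usual alternative in that case.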
Improving access to TB care for foreign-born patients in NYC requires strategies that address specific social, economic and structural barriers. Improving linkages between private providers and public health initiatives is a key challenge. Health care providers' commitment to foreign-born communities is a significant resource.