Understanding how the host and microbiome interact in both health and disease.
Jonathan Golob is an assistant professor of Infectious Disease at the University of Michigan. His research interests include how the host and microbiome interact in health and disease, with a particular focus on the immunocompromised host. He also contributes to computational biology tool development and is a proponent of reproducible science via workflows and containerized software.
Infectious Disease Fellowship, 2017
University of Washington
Internal Medicine Residency, 2013
University of Washington
Medical Degree, 2011
University of Washington
PhD in Pathology, 2009
University of Washington
BSc in Biomedical Engineering and Computer Science, 2001
Johns Hopkins University
Background: Steroid-refractory acute graft-versus-host disease (GVHD) after hematopoietic cell transplantation (HCT) is highly morbid …
BACKGROUND AND METHOD: We examined specimens from 111 HIV-infected participants virally suppressed on ART for a minimum of 5 years who had donated serial peripheral blood mononuclear cell (PBMC) specimens to the University of Washington/Fred Hutch Center for AIDS Research (CFAR) Specimen Repository. We determined the HIV proviral copy number per million PBMCs, corrected for CD4 cell count, in 477 specimens collected after a minimum of 5 years of follow-up and up to 15.5 years of clinical viral suppression. Generalized estimating equation regression was used to examine the association between the reservoir size and time, age at study entry, antiretroviral regimen, and risk factors for HIV acquisition. RESULTS AND CONCLUSION: We found that the inter-participant baseline HIV DNA level varied widely, between 0.01 and 4.8 pol-copies per microgram genomic DNA and per CD4 cell number/microliter; the HIV DNA level declined with time (half-life was estimated at 12 years, 95% confidence interval of 6.2-240 years); the HIV DNA level was lower for those who achieved viral suppression at a younger age; and the HIV DNA level was not affected by the specific antiretroviral regimen used to achieve and maintain suppression.
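The wide, asymmetric confidence interval on the half-life is easier to read on the scale of the decay rate. The following is a minimal sketch, assuming a simple exponential (log-linear) decline; the function names are hypothetical, and the slopes are only back-calculated from the half-lives quoted in the abstract, not taken from the study's regression output.

```python
import math


def half_life_years(slope_per_year: float) -> float:
    """Half-life implied by a decay slope on the natural-log scale.

    Assumes exponential decay, so the slope must be negative.
    """
    if slope_per_year >= 0:
        raise ValueError("slope must be negative for a declining quantity")
    return math.log(2) / -slope_per_year


def slope_from_half_life(half_life: float) -> float:
    """Inverse of half_life_years: the log-scale slope for a given half-life."""
    if half_life <= 0:
        raise ValueError("half-life must be positive")
    return -math.log(2) / half_life


# Half-life point estimate and CI bounds quoted in the abstract (years).
for label, hl in (("lower bound", 6.2), ("estimate", 12.0), ("upper bound", 240.0)):
    slope = slope_from_half_life(hl)
    print(f"{label}: half-life {hl} y -> slope {slope:.4f} per year")
```

Because the half-life is inversely proportional to the slope, a slope bound close to zero maps to a very long half-life, which is one way an interval such as 6.2-240 years can arise.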
Short-read metagenomics uses high-throughput sequencing to reveal the functional capability of microbial communities. The computational analysis of the raw short-read data is complicated by the shared functional domains in peptides and the complex evolutionary origins of many peptides (including recombination and truncation), resulting in a given short read aligning equally well to many possible peptides. We find that a short read known to be from one peptide will on average align equally well to about 160 other peptide sequences not present in the sample. Here we describe an iterative algorithmic approach to successfully map reads to their true origin peptides and introduce a software package, FAMLI, that implements this algorithm. We demonstrate that FAMLI is able to identify peptides from a wide variety of metagenomes with a consistent precision of about 0.8 and recall of about 0.6, while retaining O(n) runtimes. This compares favorably to alternative approaches, including de novo assembly (which results in superior precision, but much inferior recall and much larger computational resource needs compared to FAMLI) and hybrid taxonomic-identification approaches (which result in less consistent performance for extremely novel communities compared to FAMLI). Short reads aligning equally well to hundreds more peptides than are truly present in a sample is a key challenge that any successful short-read metagenomics software must address. We present an effective approach to mitigate this problem and improve the accuracy of functional metagenomic analysis.
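For readers less familiar with the precision and recall figures quoted above, the following is a minimal sketch of how the two metrics are computed for a set of detected peptides against a known truth set. It is not FAMLI's code; the function name and the peptide identifiers are hypothetical and the data are made up for illustration.

```python
from typing import Iterable, Tuple


def precision_recall(predicted: Iterable[str], truth: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for predicted peptide IDs versus the true set."""
    predicted_set, truth_set = set(predicted), set(truth)
    true_positives = len(predicted_set & truth_set)
    # Precision: fraction of reported peptides that are truly present.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    # Recall: fraction of truly present peptides that were reported.
    recall = true_positives / len(truth_set) if truth_set else 0.0
    return precision, recall


# Toy example: five peptides truly present, four reported, three of them correct.
truth = {"pep1", "pep2", "pep3", "pep4", "pep5"}
predicted = {"pep1", "pep2", "pep3", "pepX"}
precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In these terms, spurious alignments to peptides absent from the sample lower precision, while overly strict filtering lowers recall.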
Background: Graft-versus-host disease (GVHD) is common after allogeneic hematopoietic cell transplantation (HCT). Risk for death from GVHD has been associated with low bacterial diversity in the stool microbiota early after transplant; however, the specific species associated with GVHD risk remain poorly defined. Methods: We prospectively collected serial weekly stool samples from 66 patients who underwent HCT, starting pre-transplantation and continuing weekly until 100 days post-transplant, a total of 694 observations in HCT recipients. We used 16S rRNA gene polymerase chain reaction with degenerate primers, followed by high-throughput sequencing to assess the relative abundance of sequence reads from bacterial taxa in stool samples over time. Results: The gut microbiota was highly dynamic in HCT recipients, with loss and appearance of taxa common on short time scales. As in prior studies, GVHD was associated with lower alpha diversity of the stool microbiota. At neutrophil recovery post-HCT, the presence of oral Actinobacteria and oral Firmicutes in stool was positively correlated with subsequent GVHD; Lachnospiraceae were negatively correlated. A gradient of bacterial species (difference of the sum of the relative abundance of positive correlates minus the sum of the relative abundance of negative correlates) was most predictive (receiver operator characteristic area under the curve of 0.83) of subsequent severe acute GVHD. Conclusions: The stool microbiota around the time of neutrophil recovery post-HCT is predictive of subsequent development of severe acute GVHD in this study.
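The gradient described above (sum of the relative abundances of positive correlates minus the sum for negative correlates) and its evaluation by ROC area under the curve can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the taxon groupings are simplified to three labels, and the abundances and outcomes are invented toy values, not study data.

```python
from typing import Dict, Iterable, Sequence


def gradient_score(rel_abund: Dict[str, float],
                   positive: Iterable[str],
                   negative: Iterable[str]) -> float:
    """Sum of relative abundance of positive correlates minus that of negative correlates."""
    pos = sum(rel_abund.get(taxon, 0.0) for taxon in positive)
    neg = sum(rel_abund.get(taxon, 0.0) for taxon in negative)
    return pos - neg


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a case outscores a control (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy samples (made-up relative abundances) with outcome 1 = severe acute GVHD.
POSITIVE = ["oral_Actinobacteria", "oral_Firmicutes"]
NEGATIVE = ["Lachnospiraceae"]
samples = [
    ({"oral_Actinobacteria": 0.20, "oral_Firmicutes": 0.15, "Lachnospiraceae": 0.05}, 1),
    ({"oral_Actinobacteria": 0.02, "oral_Firmicutes": 0.03, "Lachnospiraceae": 0.40}, 0),
    ({"oral_Actinobacteria": 0.10, "oral_Firmicutes": 0.05, "Lachnospiraceae": 0.30}, 0),
    ({"oral_Actinobacteria": 0.05, "oral_Firmicutes": 0.25, "Lachnospiraceae": 0.10}, 1),
]
scores = [gradient_score(abund, POSITIVE, NEGATIVE) for abund, _ in samples]
labels = [outcome for _, outcome in samples]
print("AUC on toy data:", roc_auc(scores, labels))
```

The pairwise AUC used here is equivalent to the Mann-Whitney U statistic scaled to the range 0 to 1; it is quadratic in the number of samples, which is acceptable for a cohort of this size.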
BACKGROUND: Microbiome studies commonly use 16S rRNA gene amplicon sequencing to characterize microbial communities. Errors introduced at multiple steps in this process can affect the interpretation of the data. Here we evaluate the accuracy of operational taxonomic unit (OTU) generation, taxonomic classification, and alpha- and beta-diversity measures for different settings in QIIME, MOTHUR and a pplacer-based classification pipeline, using a novel software package: DECARD. RESULTS: We generated in silico 100 synthetic bacterial communities approximating human stool microbiomes to be used as a gold standard for evaluating the colligative performance of microbiome analysis software. Our synthetic data closely matched the composition and complexity of actual healthy human stool microbiomes. Genus-level taxonomic classification was correct for only 50.4-74.8% of the source organisms. Miscall rates varied from 11.9 to 23.5%. Species-level classification was less successful (6.9-18.9% correct); miscall rates were comparable to those of genus-level targets (12.5-26.2%). The degree of miscall varied by clade of organism, pipeline and specific settings used. OTU generation accuracy varied by strategy (closed, de novo or subsampling), reference database, algorithm and software implementation. Shannon diversity estimation accuracy correlated generally with OTU-generation accuracy. Beta-diversity estimates with Double Principal Coordinate Analysis (DPCoA) were more robust against errors introduced in processing than Weighted UniFrac. The settings suggested in the tutorials were among the worst performing in all outcomes tested. CONCLUSIONS: Even when using the same classification pipeline, the specific OTU-generation strategy, reference database and downstream analysis methods selection can have a dramatic effect on the accuracy of taxonomic classification, and alpha- and beta-diversity estimation.
Even minor changes in settings adversely affected the accuracy of the results, bringing them far from the best-observed result. Thus, specific details of how a pipeline is used (including OTU generation strategy, reference sets, clustering algorithm and specific software implementation) should be specified in the methods section of all microbiome studies. Researchers should evaluate their chosen pipeline and settings to confirm it can adequately answer the research question rather than assuming the tutorial or standard-operating-procedure settings will be adequate or optimal.
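The link between OTU-generation accuracy and Shannon diversity noted above can be illustrated with a small sketch. This is not DECARD's code; the function name is hypothetical and the counts are invented to show how splitting one true organism into two OTUs changes the estimate.

```python
import math
from typing import Sequence


def shannon_diversity(counts: Sequence[int]) -> float:
    """Shannon diversity (natural log) from per-OTU read counts; zero counts are ignored."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Toy community: three organisms with 50, 30 and 20 reads.
true_counts = [50, 30, 20]
# The same reads, but the first organism has been erroneously split into two OTUs.
oversplit_counts = [25, 25, 30, 20]

print("true Shannon:     ", round(shannon_diversity(true_counts), 3))
print("oversplit Shannon:", round(shannon_diversity(oversplit_counts), 3))
```

Splitting an OTU can only raise the Shannon estimate and merging distinct organisms can only lower it, so errors in OTU generation carry through directly to alpha-diversity estimates.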