Discovery of a structural class of antibiotics with explainable deep learning


Data availability

Data generated from chemical screens, machine learning models and whole-genome sequencing experiments are available as Supplementary Data 1–4. Source Data are available for Figs. 4 and 5 and Extended Data Figs. 8 and 9. Data from whole-genome sequencing reads have been deposited on BioProject under accession number PRJNA1026995. A copy of model predictions for the Mcule purchasable database (ver. 200601) and the Broad Institute database used in this work is available at Source data are provided with this paper.

Code availability

Chemprop is available at The Chemprop checkpoints for the final antibiotic activity, cytotoxicity, and proton motive force-alteration models, along with a code platform for performing and adapting the analyses developed in this work, are available at and


Acknowledgements


The authors thank the past and present members of the Collins laboratory for helpful discussions; members of the Broad Institute Center for the Development of Therapeutics (CDoT) for helpful feedback; the Microbial Genome Sequencing Center for assistance with sequencing; the Harvard Center for Mass Spectrometry for assistance with LC–MS experiments; S. Gould and R. Singh for medicinal chemistry feedback; A. Vrcic and T. Dawson for assistance with compound management; A. Graveline for assistance with mouse experiments; and Z. Gitai for E. coli strains RFM795 and JW5503-KanS. F.W. was supported by the James S. McDonnell Foundation and the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under award number K25AI168451. A.K. was supported by the Swiss National Science Foundation under grant number SNSF_203071. A.M.E. and A.L.M. were supported by federal funds from the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under grant number U19AI110818 to the Broad Institute. J.M.S. was supported by the Banting Fellowships Program (393360). L.D.R. was supported by the Volkswagen Foundation. J.J.C. was supported by the Defense Threat Reduction Agency (grant number HDTRA12210032), the National Institutes of Health (grant number R01-AI146194), and the Broad Institute of MIT and Harvard. This work is part of the Antibiotics-AI Project, which is directed by J.J.C. and supported by the Audacious Project, Flu Lab, LLC, the Sea Grape Foundation, R. Zander and H. Wyss for the Wyss Foundation, and an anonymous donor.

Author information

Author notes

  1. Jonathan M. Stokes

    Present address: Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research and David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada

  2. These authors contributed equally: Felix Wong, Erica J. Zheng

Authors and Affiliations

  1. Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA

    Felix Wong, Erica J. Zheng, Jacqueline A. Valeri, Melis N. Anahtar, Satotaka Omori, Andres Cubillos-Ruiz, Aarti Krishnan, Abigail L. Manson, Ashlee M. Earl, Jonathan M. Stokes & James J. Collins

  2. Institute for Medical Engineering and Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA

    Felix Wong, Jacqueline A. Valeri, Andres Cubillos-Ruiz, Aarti Krishnan, Jonathan M. Stokes & James J. Collins

  3. Integrated Biosciences, San Carlos, CA, USA

    Felix Wong, Satotaka Omori & Alicia Li

  4. Program in Chemical Biology, Harvard University, Cambridge, MA, USA

    Erica J. Zheng

  5. Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA

    Erica J. Zheng, Jacqueline A. Valeri, Nina M. Donghia, Andres Cubillos-Ruiz & James J. Collins

  6. Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard, Cambridge, MA, USA

    Wengong Jin

  7. Leibniz Institute of Polymer Research and the Max Bergmann Center of Biomaterials, Dresden, Germany

    Jens Friedrichs, Ralf Helbig & Lars D. Renner

  8. Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA

    Behnoush Hajian, Dawid K. Fiejtek, Florence F. Wagner & Holly H. Soutter

Contributions


F.W. conceived research, designed all models and experiments, performed or directed all experiments and analysis, wrote the paper and supervised research. E.J.Z., S.O. and A.L. performed screening experiments and analysis. J.A.V. and W.J. assisted with data interpretation and analysis, and W.J. developed and implemented the MCTS rationale extraction algorithm. N.M.D., M.N.A. and A.C.-R. performed mouse experiments and analysis. M.N.A. and A.K. performed screening experiments and assisted with data interpretation. J.F. and R.H. performed cellular physiology experiments and analysis. A.L.M. and A.M.E. performed genomic analysis and assisted with data interpretation. B.H., H.H.S. and J.M.S. assisted with data interpretation. D.K.F. and F.F.W. assisted with chemical testing experiments. L.D.R. performed cellular physiology experiments and analysis and assisted with data interpretation. J.J.C. supervised research. All authors assisted with manuscript editing.

Corresponding author

Correspondence to James J. Collins.

Ethics declarations

Competing interests

J.J.C. is an academic co-founder and scientific advisory board chair of EnBiotix, an antibiotic drug discovery company, and Phare Bio, a non-profit venture focused on antibiotic drug development. J.J.C. is also an academic co-founder and board member of Cellarity and the founding scientific advisory board chair of Integrated Biosciences. J.M.S. is scientific co-founder and scientific director of Phare Bio. F.W. is a co-founder of Integrated Biosciences. S.O. and A.L. contributed to this work as employees of Integrated Biosciences, and S.O. may have an equity interest in Integrated Biosciences. F.W. and J.J.C. have filed a patent based on the results of this work. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Molecular weight distribution of the 39,312 compounds screened.

Data are from an original set of 39,312 compounds containing most known antibiotics, natural products, and structurally diverse molecules, with molecular weights between 40 Da and 4,200 Da. Frequency is shown on a log scale.

Extended Data Fig. 2 Comparison of deep learning models for predicting antibiotic activity.

a,b, Precision-recall curves for predictions of antibiotic activity, for an ensemble of 10 Chemprop models without RDKit features (a) and the best-performing random forest classifier model based on Morgan fingerprints (b), trained and tested using data from a screen of 39,312 molecules (Fig. 1 of the main text). The black dashed line represents the baseline fraction of active compounds in the training set (1.3%). Blue curves and the 95% confidence interval indicate the variation generated by bootstrapping. AUC, area under the curve.
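The quantities in this caption can be illustrated with a minimal sketch. For a classifier that assigns scores at random, the expected precision equals the fraction of active compounds, which is why that fraction serves as the baseline. The snippet below computes a step-wise estimate of the area under the precision-recall curve and a percentile bootstrap interval for it. The function names, the choice of estimator, the number of resamples and the toy data are assumptions made for illustration; they are not taken from the authors' code.

```python
import random

def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at each recovered positive
    return total / n_pos

def bootstrap_interval(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (hypothetical settings) for the estimate above."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if sum(sample_labels) == 0:
            continue  # skip resamples that contain no positives
        stats.append(average_precision(sample_labels, [scores[i] for i in idx]))
    if not stats:
        raise ValueError("no usable bootstrap resamples")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper

# Toy data (illustrative only): 1 = active, 0 = inactive.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
baseline = sum(labels) / len(labels)  # fraction of actives
auc = average_precision(labels, scores)
interval = bootstrap_interval(labels, scores, n_resamples=200)
```

A model is informative to the extent that its curve, and the lower end of the bootstrap interval, sit above the baseline fraction.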

Extended Data Fig. 3 Comparison of deep learning models for predicting human cell cytotoxicity.

a,b, Precision-recall curves for predictions of HepG2 cytotoxicity, for an ensemble of 10 Chemprop models without RDKit features (a) and the best-performing random forest classifier model based on Morgan fingerprints (b), trained and tested using data from a screen of 39,312 molecules (Fig. 1 of the main text). The black dashed line represents the baseline fraction of active compounds in the training set (8.5%). Blue curves and the 95% confidence interval indicate the variation generated by bootstrapping. AUC, area under the curve. c,d, Precision-recall curves for predictions of HSkMC cytotoxicity, for an ensemble of 10 Chemprop models without RDKit features (c) and the best-performing random forest classifier model based on Morgan fingerprints (d), trained and tested using data from a screen of 39,312 molecules (Fig. 1 of the main text). The black dashed line represents the baseline fraction of active compounds in the training set (3.8%). Blue curves and the 95% confidence interval indicate the variation generated by bootstrapping. e,f, Precision-recall curves for predictions of IMR-90 cytotoxicity, for an ensemble of 10 Chemprop models without RDKit features (e) and the best-performing random forest classifier model based on Morgan fingerprints (f), trained and tested using data from a screen of 39,312 molecules (Fig. 1 of the main text). The black dashed line represents the baseline fraction of active compounds in the training set (8.8%). Blue curves and the 95% confidence interval indicate the variation generated by bootstrapping.

Extended Data Fig. 4 Visualizing chemical space across different prediction score thresholds.

a,b, t-Distributed stochastic neighbor embedding (t-SNE) plot of compounds with high and low antibiotic prediction scores, in addition to compounds in the training set, for different prediction score thresholds. The plot shows the chemical similarity or dissimilarity of various compounds, and active compounds in the training set (red dots) are seen to largely separate compounds with high prediction scores (green, black, and purple dots) from compounds with low prediction scores (brown dots).

Extended Data Fig. 5 Examples of rationale calculations using Monte Carlo tree search (MCTS).

a, Illustration of the MCTS forward pass using compound 1. The figure shows three possible search paths from the root (compound 1) by deleting peripheral bonds or rings (highlighted in red). Due to space limitations, only three steps from the root are shown. b, Illustration of a complete search path from the root (compound 1) to a leaf node (the rationale). Chemprop is used to predict the activity of each leaf node, and these predictions are used to make updates to the statistics of each intermediate node in the backward pass.

Extended Data Fig. 6 Maximal common substructure identification reveals known antibiotic classes, but maximal common substructures are less predictive than Chemprop rationales across all hits.

a,b, Rank-ordered numbers of hits (a) and non-hits (b) associated with maximal common substructures (MCSs) identified by a grouping method. Here, any hit associated with any of the MCSs shown shares a minimum of 12 atoms with the MCS. Dashed lines in MCSs indicate either single or double bonds. Each green or brown bar shows the prediction score of each MCS viewed as a molecule in its own right. Where bars are thin, the corresponding MCS prediction scores are approximately zero (including all brown bars in (b)). c,d, Similar to (a), but here, any hit associated with any of the MCSs shown shares a minimum of 10 (c) or 15 (d) atoms with the MCS. e, Illustration of the rationales (red) determined using a Monte Carlo tree search for example hits (black) associated with MCSs A1-A12. No hit associated with MCS A12 possessed a rationale. f, MCS prediction scores (blue bars) and the average prediction scores of all rationales of all hits associated with MCSs A1-A12 (red bars). Where blue bars are thin, the corresponding MCS prediction scores are approximately zero. No hit associated with MCS A12 possessed a rationale.

Extended Data Fig. 7 Closest active training set compounds to, and selectivities of, four validated hits associated with rationale groups G1-G5.

a, Closest active compounds (right), as measured by Tanimoto similarity, are from the training set of 39,312 compounds. Compounds are colored according to associated rationale groups (as indicated in parentheses), and the identifier and Tanimoto similarity score of each closest active compound are displayed. b, S. aureus MIC and human cell IC50 values of the four compounds in (a), shown on a log scale. Bars show the means of two biological replicates (points) and are colored by the bacterial strain, human cell type, or media condition tested. Asterisks indicate values larger than 128 µg/mL.
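As a minimal sketch of the nearest-neighbor lookup described in (a), the Tanimoto similarity of two binary fingerprints is the number of shared on-bits divided by the number of on-bits present in either fingerprint. The snippet below represents each fingerprint as a set of on-bit indices. The fingerprint type used for this figure is not stated here, and the compound identifiers, bit sets and function names are hypothetical; in practice the bit sets would come from a cheminformatics toolkit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def closest_active(query_bits, active_fingerprints):
    """Return (identifier, similarity) of the most similar active compound."""
    return max(
        ((cid, tanimoto(query_bits, bits)) for cid, bits in active_fingerprints.items()),
        key=lambda pair: pair[1],
    )

# Hypothetical fingerprints for two active training-set compounds.
actives = {"active_A": {1, 2, 3, 4}, "active_B": {7, 8}}
best_id, best_similarity = closest_active({1, 2, 3}, actives)
```

A low similarity to the closest active compound suggests that a hit is structurally distinct from the actives in the training set.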

Extended Data Fig. 8 Comparison of MICs of different compounds against methicillin-susceptible and methicillin-resistant S. aureus, and eradication of kanamycin persisters by treatment with compounds 1 and 2.

a, MICs of various antibiotics against S. aureus RN4220 (black) and S. aureus USA300 (blue) on a log scale. Bars show the mean of two biological replicates (individual points). b, Survival curves of B. subtilis 168 after combination treatment with kanamycin and compounds 1 and 2, respectively, as determined by plating and CFU counting. Initial CFU values are ~10^7. Each point is representative of the mean of two biological replicates. Cultures treated with kanamycin in addition to compounds 1 and 2 were eradicated after 24 h (CFU/mL = 0), and these values were truncated to a log survival value of −7 on this plot.


Extended Data Fig. 9 Toxicity, chemical properties, and in vivo efficacy of compounds 1 and 2.

a, Fractional hemolysis measurements of human red blood cells (RBCs) treated with compounds 1 and 2 at the indicated final concentrations. Vehicle (1% DMSO) was used as a negative control, and Triton X-100, a detergent, was used as a positive control. Black points indicate values from two biological replicates, and red bars indicate average values. b, Ferrous iron chelation measurements of compounds 1 and 2. Vehicle (1% DMSO) was used as a negative control, and ethylenediaminetetraacetic acid (EDTA), an iron chelator, was used as a positive control. Black points indicate values from two biological replicates, and gray bars indicate average values. c, Ames test mutagenesis measurements of the fractions of revertant S. typhimurium TA100 cultures treated with compounds 1 and 2 at the indicated final concentrations. Vehicle (1% DMSO) was used as a negative control, and 5 µg/mL sodium azide was used as a positive control. Black points indicate values from two biological replicates, and purple bars indicate average values. Higher fractions of revertant cultures indicate higher mutagenic potential (inset). d, Chemical stability of compound 1 in various buffers as a function of incubation time at 37 °C. Values are normalized to the mean measurement at time zero, and each point is representative of the mean of two biological replicates. Error bars indicate the full range of values arising from two biological replicates. e, Photographs of WoundSkin models 24 h after topical treatment with compound 1 (1%) or DMSO vehicle. Images are representative of six biological replicates in each treatment group. Scale bar, 2 mm. f, Illustration of the in vivo study of a neutropenic mouse wound infection model using MRSA CDC 563 shown in Fig. 5a of the main text. g, Illustration of the in vivo study of a neutropenic mouse thigh infection model using MRSA CDC 706 shown in Fig. 5b of the main text.


Extended Data Fig. 10 Exploration of a structural class through structure-activity relationships.

a, The rationale of compounds 1 and 2, overlaid with chemical modifications (R1-R8) that encompass all compounds used to test SAR (Supplementary Data 2). SAR, structure-activity relationships. b, Analogues of compounds 1 and 2 found to have varying degrees of activity against S. aureus. Corresponding MIC and IC50 values are representative of two biological replicates.

Supplementary information

Supplementary Information

This file contains Supplementary Notes 1-4, Supplementary References, and Supplementary Tables 1-9.

Reporting Summary

Supplementary Data 1

Training set of 39,312 compounds tested for antibiotic activity and cytotoxicity, in addition to 200 RDKit features used to augment the models and cytotoxicity testing results. Antibiotic activity was defined using a 20% relative mean growth cut-off in S. aureus RN4220. Cytotoxicity was defined using a 90% relative mean cell viability cut-off in HepG2 cells, HSkMCs, and IMR-90 cells. Data are from two biological replicates.
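The cut-offs above can be sketched as a simple labeling rule. The snippet assumes that a compound is labeled active when its mean relative growth across replicates falls below 0.2, and cytotoxic when its mean relative viability falls below 0.9. The direction and strictness of the inequalities, the function names and the example values are assumptions made for illustration and should be checked against the data file.

```python
from statistics import mean

GROWTH_CUTOFF = 0.2     # 20% relative mean growth cut-off (from the text)
VIABILITY_CUTOFF = 0.9  # 90% relative mean cell viability cut-off (from the text)

def is_active(relative_growth_replicates):
    """Assumed rule: active if mean relative growth is below the cut-off."""
    return mean(relative_growth_replicates) < GROWTH_CUTOFF

def is_cytotoxic(relative_viability_replicates):
    """Assumed rule: cytotoxic if mean relative viability is below the cut-off."""
    return mean(relative_viability_replicates) < VIABILITY_CUTOFF

# Illustrative values for two biological replicates of one compound.
growth = [0.10, 0.25]     # mean 0.175, below 0.2
viability = [1.00, 0.95]  # mean 0.975, not below 0.9
labels = {"active": is_active(growth), "cytotoxic": is_cytotoxic(viability)}
```

Averaging before thresholding means that a single replicate above the cut-off does not by itself change the label.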

Supplementary Data 2

Model predictions, rationales, and procured compounds from the ensemble Chemprop model. Compound SMILES strings and corresponding prediction scores are shown for all 3,646 hits, out of 12,076,365 compounds whose antibiotic activity and cytotoxicity against human cells were predicted. Rationale and scaffold SMARTS strings, vendor catalogue information for all 283 procured and tested compounds shown in Fig. 3e of the main text, and vendor catalogue information for all 17 procured and tested compounds as part of the structure–activity relationship analyses shown in Extended Data Fig. 10 are also provided, in addition to the MCS SMARTS strings for the analyses described in Supplementary Note 2 and Extended Data Fig. 6.

Supplementary Data 3

Mutations arising in cells exposed to compounds. For each compound, results are shown for at least two independently passaged or suppressor mutant populations. All mutations that passed mapping filters are listed here. Black boxes highlight mutations in similar regions across sequencing replicates either present in the same gene, or present in an adjacent gene or intergenic region.

Supplementary Data 4

Training and test data for models predicting proton motive force-altering activity. Proton motive force-altering activity was defined using a 30% relative mean fluorescence change in S. aureus RN4220 in the presence of DiSC3(5), a proton motive force-sensitive dye. 475 antibacterial compounds from Supplementary Data 1 were tested, and all inactive antibacterial compounds were assumed to not alter proton motive force. Data are from two biological replicates.

Peer Review File

Source data

About this article


Cite this article

Wong, F., Zheng, E.J., Valeri, J.A. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature (2023).




