Machine learning prediction of resistance to subinhibitory antimicrobial concentrations from escherichia coli genomes

Sam Benkwitz-Bedford, Martin Palm, Talip Yasir Demirtas, Ville Mustonen, Anne Farewell, Jonas Warringer, Leopold Parts, Danesh Moradigaravand

Research output: Contribution to journalArticlepeer-review

5 Scopus citations


Escherichia coli is an important cause of bacterial infections worldwide, with multidrug-resistant strains incurring substantial costs on human lives. Besides therapeutic concentrations of antimicrobials in health care settings, the presence of subinhibitory antimicrobial residues in the environment and in clinics selects for antimicrobial resistance (AMR), but the underlying genetic repertoire is less well understood. Here, we used machine learning to predict the population doubling time and cell growth yield of 1, 407 genetically diverse E. coli strains expanding under exposure to three subinhibitory concentrations of six classes of antimicrobials from single- nucleotide genetic variants, accessory gene variation, and the presence of known AMR genes. We predicted cell growth yields in the held-out test data with an average correlation (Spearman's r ) of 0.63 (0.36 to 0.81 across concentrations) and cell doubling times with an average correlation of 0.59 (0.32 to 0.92 across concentrations), with moderate increases in sample size unlikely to improve predictions further. This finding points to the remaining missing heritability of growth under antimicrobial exposure being explained by effects that are too rare or weak to be captured unless sample size is dramatically increased, or by effects other than those conferred by the presence of individual single-nucleotide polymorphisms (SNPs) and genes. Predictions based on whole-genome information were generally superior to those based only on known AMR genes and were accurate for AMR resistance at therapeutic concentrations. We pinpointed genes and SNPs determining the predicted growth and thereby recapitulated many known AMR determinants. Finally, we estimated the effect sizes of resistance genes across the entire collection of strains, disclosing the growth effects for known resistance genes in each individual strain. Our results underscore the potential of predictive modeling of growth patterns from genomic data under subinhibitory concentrations of antimicrobials, although the remaining missing heritability poses a challenge for achieving the accuracy and precision required for clinical use.
Original languageEnglish (US)
Issue number4
StatePublished - Jul 1 2021
Externally publishedYes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2023-02-15


Dive into the research topics of 'Machine learning prediction of resistance to subinhibitory antimicrobial concentrations from escherichia coli genomes'. Together they form a unique fingerprint.

Cite this