Abstract Background The spread of antibiotic resistance has become one of the most urgent threats to global health, which is estimated to cause 700,000 deaths each year globally. Its surrogates, antibiotic resistance genes (ARGs), are highly transmittable between food, water, animal, and human to mitigate the efficacy of antibiotics. Accurately identifying ARGs is thus an indispensable step to understanding the ecology, and transmission of ARGs between environmental and human-associated reservoirs. Unfortunately, the previous computational methods for identifying ARGs are mostly based on sequence alignment, which cannot identify novel ARGs, and their applications are limited by currently incomplete knowledge about ARGs. Results Here, we propose an end-to-end Hierarchical Multi-task Deep learning framework for ARG annotation (HMD-ARG). Taking raw sequence encoding as input, HMD-ARG can identify, without querying against existing sequence databases, multiple ARG properties simultaneously, including if the input protein sequence is an ARG, and if so, what antibiotic family it is resistant to, what resistant mechanism the ARG takes, and if the ARG is an intrinsic one or acquired one. In addition, if the predicted antibiotic family is beta-lactamase, HMD-ARG further predicts the subclass of beta-lactamase that the ARG is resistant to. Comprehensive experiments, including cross-fold validation, third-party dataset validation in human gut microbiota, wet-experimental functional validation, and structural investigation of predicted conserved sites, demonstrate not only the superior performance of our method over the state-of-art methods, but also the effectiveness and robustness of the proposed method. Conclusions We propose a hierarchical multi-task method, HMD-ARG, which is based on deep learning and can provide detailed annotations of ARGs from three important aspects: resistant antibiotic class, resistant mechanism, and gene mobility. We believe that HMD-ARG can serve as a powerful tool to identify antibiotic resistance genes and, therefore mitigate their global threat. Our method and the constructed database are available at http://www.cbrc.kaust.edu.sa/HMDARG/.
Bibliographical noteKAUST Repository Item: Exported on 2021-02-11
Acknowledged KAUST grant number(s): FCC/1/1976-04, FCC/1/1976-06, FCC/1/1976-17, FCC/1/1976-18, FCC/1/1976-23, FCC/1/1976-25, FCC/1/1976-26, URF/1/3450-01, URF/1/4098-01-01, REI/1/0018-01-01
Acknowledgements: Figure 1 was created by Heno Hwang, scientific illustrator at King Abdullah University of Science and Technology (KAUST). We thank the review greatly for the thorough and detailed comments, which have improved the quality of the manuscript significantly.