Abstract
The rapid accumulation of molecular data motivates the development of innovative approaches to computationally characterize the sequences, structures and functions of biological and chemical molecules in an efficient, accessible and accurate manner. Although several computational tools characterize protein or nucleic acid data, no one-stop computational toolkit comprehensively characterizes a wide range of biomolecules. We address this need by developing a holistic platform that generates features from sequence and structural data for a diverse collection of molecule types. Our freely available and easy-to-use iFeatureOmega platform generates, analyzes and visualizes 189 representations for biological sequences, structures and ligands. To the best of our knowledge, iFeatureOmega provides the largest scope among current solutions, both in the number of feature extraction and analysis approaches and in the coverage of different molecules. We release three versions of iFeatureOmega, a webserver, a command line interface and a graphical interface, to satisfy the needs of both experienced bioinformaticians and less computer-savvy biologists and biochemists. With the assistance of iFeatureOmega, users can encode their molecular data into representations that facilitate the construction of predictive models and analytical studies. We highlight the benefits of iFeatureOmega through three research applications, demonstrating how it can be used to accelerate and streamline research in bioinformatics, computational biology and cheminformatics.
Original language | English (US) |
---|---|
Journal | Nucleic Acids Research |
State | Published - May 7 2022 |
Bibliographical note
KAUST Repository Item: Exported on 2022-05-09
Acknowledged KAUST grant number(s): BAS/1/1624-01-01, URF/1/1976-04-01, URF/1/4352-01-01, URF/1/4379-01-0, URF/1/4663-01-01
Acknowledgements: National Natural Science Foundation of China [32170677]; National Health and Medical Research Council of Australia (NHMRC) [APP1127948 and APP1144652]; Young Scientists Fund of the National Natural Science Foundation of China [32101797]; Hainan Yazhou Bay Seed Laboratory of China [B21HJ0001]; Australian Research Council [LP110200333 and DP120104460]; National Institute of Allergy and Infectious Diseases of the National Institutes of Health [R01 AI111965]; a Major Inter-Disciplinary Research project awarded by Monash University, and the Collaborative Research Program of Institute for Chemical Research, Kyoto University; Fundamental Research Funds for the Central Universities [3132020170, 3132019323]; National Natural Science Foundation of Liaoning Province [20180550307]; C.L. was supported by an NHMRC CJ Martin Early Career Research Fellowship [1143366]; L.K. is supported in part by the Robert J. Mattauch Endowment funds. X.G. was supported by the King Abdullah University of Science and Technology (KAUST) Office of Research Administration (ORA) [BAS/1/1624-01-01, URF/1/1976-04-01, URF/1/4352-01-01, URF/1/4379-01-01 and URF/1/4663-01-01]. Funding for open access charge: Major Inter-Disciplinary Research (IDR) Project awarded by Monash University.
ASJC Scopus subject areas
- Genetics