Abstract
Background: mRNA polyadenylation is an essential step of pre-mRNA processing in eukaryotes. Accurate prediction of pre-mRNA 3'-end cleavage/polyadenylation sites is important for defining gene boundaries and understanding gene expression mechanisms.

Results: A total of 28761 mapped human poly(A) sites were classified into three classes according to whether they contain different known forms of the polyadenylation signal (PAS) or none of them (PAS-strong, PAS-weak and PAS-less, respectively), and a new computer program, POLYAR, was developed for the prediction of poly(A) sites of each class. In a search for PAS-strong poly(A) sites in human sequences, POLYAR had significantly higher prediction sensitivity (80.8% versus 65.7%) and specificity (66.4% versus 51.7%) than polya_svm, to date the most accurate computer program for prediction of poly(A) sites. However, when a similar search was conducted for PAS-weak and PAS-less poly(A) sites, both programs had very low prediction accuracy, which indicates that our knowledge of the factors involved in the determination of poly(A) sites is not sufficient to identify such polyadenylation regions.

Conclusions: We present a new classification of polyadenylation sites into three classes and a novel computer program, POLYAR, for the prediction of poly(A) sites/regions of each class. In tests, POLYAR predicts PAS-strong poly(A) sites with high accuracy; its efficiency in searching for PAS-weak and PAS-less poly(A) sites is not very high, but is comparable to that of other available programs. These findings suggest that additional characteristics of such poly(A) sites remain to be elucidated. POLYAR, including a stand-alone version for download, is available at http://cub.comsats.edu.pk/polyapredict.htm.
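The three-class scheme described above can be illustrated with a minimal sketch. It is not the POLYAR algorithm, and the abstract does not give the hexamer lists or the search window. The sets `STRONG_PAS` and `WEAK_PAS`, the function name `classify_poly_a_site` and the 40-nt upstream window are all assumptions chosen for illustration, based on commonly cited PAS variants.

```python
# Assumed hexamer sets (DNA alphabet); not taken from the abstract.
STRONG_PAS = {"AATAAA", "ATTAAA"}
WEAK_PAS = {
    "AGTAAA", "TATAAA", "CATAAA", "GATAAA", "AATATA",
    "AATACA", "AATAGA", "ACTAAA", "AAGAAA", "AATGAA",
}


def classify_poly_a_site(sequence, cleavage_pos, window=40):
    """Label a site by the strongest PAS hexamer found upstream of it.

    `cleavage_pos` is a 0-based index into `sequence`; the `window`
    nucleotides immediately before it are searched (assumed size).
    """
    start = max(0, cleavage_pos - window)
    # Accept RNA or DNA input by mapping U to T.
    upstream = sequence[start:cleavage_pos].upper().replace("U", "T")
    # Collect every overlapping hexamer in the upstream window.
    hexamers = {upstream[i:i + 6] for i in range(len(upstream) - 5)}
    if hexamers & STRONG_PAS:
        return "PAS-strong"
    if hexamers & WEAK_PAS:
        return "PAS-weak"
    return "PAS-less"
```

A site is assigned to the strongest class for which any hexamer is found, so a window containing both a strong and a weak variant is labeled PAS-strong. Whether the original classification resolves such cases the same way is not stated in the abstract.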
Original language | English (US) |
---|---|
Article number | 646 |
Journal | BMC Genomics |
Volume | 11 |
Issue number | 1 |
DOIs | |
State | Published - Nov 19 2010 |
Externally published | Yes |
Bibliographical note
Funding Information: This work was supported by the Higher Education Commission of Pakistan through a research grant entitled “Comparative Computer Analysis of Promoter Architecture and Expression Patterns of Plant Genes”, which is a collaborative research project jointly conducted by the Department of Biosciences (COMSATS Institute of Information Technology, Islamabad, Pakistan), the Department of Molecular-genetic bases of Production Processes (Institute of Botany, Azerbaijan National Academy of Sciences, Baku, Azerbaijan) and the Department of Computer Science (Royal Holloway, University of London, UK).
ASJC Scopus subject areas
- Biotechnology
- Genetics