TY - JOUR
T1 - Mice and men
T2 - Their promoter properties
AU - Bajic, Vladimir B.
AU - Tan, Sin Lam
AU - Christoffels, Alan
AU - Schönbach, Christian
AU - Lipovich, Leonard
AU - Yang, Liang
AU - Hofmann, Oliver
AU - Kruger, Adele
AU - Hide, Winston
AU - Kai, Chikatoshi
AU - Kawai, Jun
AU - Hume, David A.
AU - Carninci, Piero
AU - Hayashizaki, Yoshihide
PY - 2006/4
Y1 - 2006/4
N2 - Using the two largest collections of Mus musculus and Homo sapiens transcription start sites (TSSs) determined based on CAGE tags, ditags, full-length cDNAs, and other transcript data, we describe the compositional landscape surrounding TSSs with the aim of gaining better insight into the properties of mammalian promoters. We classified TSSs into four types based on compositional properties of regions immediately surrounding them. These properties highlighted distinctive features in the extended core promoters that helped us delineate boundaries of the transcription initiation domain space for both species. The TSS types were analyzed for associations with initiating dinucleotides, CpG islands, TATA boxes, and an extensive collection of statistically significant cis-elements in mouse and human. We found that different TSS types show preferences for different sets of initiating dinucleotides and cis-elements. Through Gene Ontology and eVOC categories and tissue expression libraries we linked TSS characteristics to expression. Moreover, we show a link of TSS characteristics to very specific genomic organization in an example of immune-response-related genes (GO:0006955). Our results shed light on the global properties of the two transcriptomes not revealed before and therefore provide the framework for better understanding of the transcriptional mechanisms in the two species, as well as a framework for development of new and more efficient promoter- and gene-finding tools.
UR - http://www.scopus.com/inward/record.url?scp=33646472425&partnerID=8YFLogxK
U2 - 10.1371/journal.pgen.0020054
DO - 10.1371/journal.pgen.0020054
M3 - Article
C2 - 16683032
AN - SCOPUS:33646472425
VL - 2
SP - 614
EP - 626
JO - PLoS Genetics
JF - PLoS Genetics
SN - 1553-7390
IS - 4
M1 - e54
ER -