TY - JOUR
T1 - Information for the Coordinates of Exons (ICE)
T2 - A human splice sites database
AU - Chong, Allen
AU - Zhang, Guanglan
AU - Bajic, Vladimir B.
PY - 2004/10
Y1 - 2004/10
AB - We present a comprehensive database, Information for the Coordinates of Exons (ICE), of genomic splice sites (SSs) for 10,803 human genes. ICE contains 91,846 pairs of donor-acceptor sites, supported by the alignment of "full-length" human mRNAs (including transcript variants) on human genomic sequences. ICE represents the largest collection of human SSs known to date and provides a significant resource for both molecular biologists and bioinformaticians. A user can visualize and extract genomic sequences around the SSs of the donor-acceptor pairs and can also visualize the primary structure of individual genes. In this article we list the 22 most frequently found canonical and noncanonical splice sites. The four most represented donor-acceptor pairs (GT-AG, GC-AG, AT-AC, and GT-GG) accounted for 99.16% of our data set. In addition, we calculated the SS matrix models for the three most common donor-acceptor pairs. The database focuses on providing SSs and their surrounding sequence information, associated SS and sequence characteristics, and their relation to overall transcript structure. It supports targeted searches and presents evidence for the gene structure.
UR - http://www.scopus.com/inward/record.url?scp=4744367570&partnerID=8YFLogxK
U2 - 10.1016/j.ygeno.2004.05.007
DO - 10.1016/j.ygeno.2004.05.007
M3 - Article
C2 - 15475254
AN - SCOPUS:4744367570
SN - 0888-7543
VL - 84
SP - 762
EP - 766
JO - Genomics
JF - Genomics
IS - 4
ER -