A new database (GCD) on genome composition for eukaryote and prokaryote genome sequences and their initial analyses

Kirill Kryukov, Kenta Sumiyama, Kazuho Ikeo, Takashi Gojobori, Naruya Saitou*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Eukaryote genomes contain many noncoding regions and are quite complex. To understand this complexity, we constructed the Genome Composition Database (GCD), which holds whole-genome composition statistics for 101 eukaryote genomes and more than 1,000 prokaryote genomes. Frequencies of all possible oligonucleotides of length 1 to 10 were counted for each genome, and these observed values were compared with expected values computed from the observed frequencies of oligonucleotides of length 1-4. Deviations from expected values were much larger for eukaryotes than for prokaryotes, except for fungal genomes. Mammalian genomes showed the largest deviation among animals. The results of the comparison are available online at http://esper.lab.nig.ac.jp/genome-composition-database/.
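The observed-versus-expected comparison described in the abstract can be illustrated with a small sketch. The abstract does not state the exact formula used for the expected values, so the estimator below is an assumption: a Markov-chain estimate in which the expected frequency of a word is built from the observed frequencies of its overlapping length-m subwords (m = 1 reduces to the product of base frequencies). The function names `kmer_frequencies` and `expected_frequency` are hypothetical and are not taken from the GCD software.

```python
from collections import Counter
from math import prod


def kmer_frequencies(seq, k):
    """Relative frequencies of all overlapping k-mers observed in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def expected_frequency(word, seq, m):
    """Expected frequency of `word` from observed m-mer frequencies.

    ASSUMPTION: uses a Markov-chain estimator of order m-1, which may
    differ from the formula actually used to build the database.
    """
    if len(word) <= m:
        raise ValueError("word must be longer than m")
    f_m = kmer_frequencies(seq, m)
    # Product over every overlapping m-mer contained in the word.
    numerator = prod(f_m.get(word[i:i + m], 0.0)
                     for i in range(len(word) - m + 1))
    if m == 1:
        return numerator  # independence model: product of base frequencies
    f_prev = kmer_frequencies(seq, m - 1)
    # Divide by the interior (m-1)-mers shared by adjacent m-mers.
    denominator = prod(f_prev.get(word[i:i + m - 1], 0.0)
                       for i in range(1, len(word) - m + 1))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    toy = "ACGT" * 25  # toy sequence, not real genome data
    observed = kmer_frequencies(toy, 3).get("ACG", 0.0)
    expected = expected_frequency("ACG", toy, 2)
    print(f"observed={observed:.4f} expected={expected:.4f}")
```

On a real genome, the ratio or difference between the observed and expected frequency of each word would be one way to quantify the deviation the abstract refers to; how the database itself summarizes deviations is not specified here.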

Original language: English (US)
Pages (from-to): 501-512
Number of pages: 12
Journal: Genome Biology and Evolution
Volume: 4
Issue number: 4
State: Published - Jan 1 2012
Externally published: Yes

Keywords

  • Alignment-free sequence comparison
  • GCD
  • Oligonucleotide frequency

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics
