Characterization of soybean genomic features by analysis of its expressed sequence tags

Ai Guo Tian, Jun Wang, Peng Cui, Yu Jun Han, Hao Xu, Li Juan Cong, Xian Gang Huang, Xiao Ling Wang, Yong Zhi Jiao, Bang Jun Wang, Yong Jun Wang, Jin Song Zhang*, Shou Yi Chen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

77 Scopus citations


We analyzed 314,254 soybean expressed sequence tags (ESTs), including 29,540 from our laboratory and 284,714 from GenBank. These ESTs were assembled into 56,147 unigenes. About 76.92% of the unigenes were homologous to genes from Arabidopsis thaliana (Arabidopsis). The putative products of these unigenes were annotated according to their homology with the categorized proteins of Arabidopsis. Genes corresponding to cell growth and/or maintenance, enzymes and cell communication belonged to the slow-evolving class, whereas genes related to transcription regulation, cell, binding and death appeared to be fast-evolving. Soybean unigenes with no match to genes within the Arabidopsis genome were identified as soybean-specific genes. These genes were mainly involved in nodule development and the synthesis of seed storage proteins. In addition, we identified 61 genes regulated by salicylic acid, 1,322 transcription factor genes and 326 disease resistance-like genes among the soybean unigenes. SSR analysis showed that the soybean genome was more complex than the Arabidopsis and Medicago truncatula genomes. GC content in soybean unigene sequences was similar to that in Arabidopsis and M. truncatula. Furthermore, the combined analysis of the EST database and the BAC-contig sequences indicated that the total gene number in the soybean genome is about 63,501.

Original language: English (US)
Pages (from-to): 903-913
Number of pages: 11
Journal: Theoretical and Applied Genetics
Issue number: 5
State: Published - Mar 2004
Externally published: Yes

ASJC Scopus subject areas

  • Biotechnology
  • Agronomy and Crop Science
  • Genetics

