Plant omics data center: An integrated web repository for interspecies gene expression networks with NLP-based curation

Hajime Ohyanagi, Tomoyuki Takano, Shin Terashima, Masaaki Kobayashi, Maasa Kanno, Kyoko Morimoto, Hiromi Kanegae, Yohei Sasaki, Misa Saito, Satomi Asano, Soichi Ozaki, Toru Kudo, Koji Yokoyama, Koichiro Aya, Keita Suwabe, Go Suzuki, Koh Aoki, Yasutaka Kubo, Masao Watanabe, Makoto Matsuoka, Kentaro Yano*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

47 Scopus citations


Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources.
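As a rough illustration of what linking networks through orthology can make possible, the sketch below looks for edges in one species' network whose two genes both have orthologs that are also connected in a second species' network. This is only an assumed, simplified reading of the linking step, not the PODC implementation: the function name `conserved_edges`, the data layout, and all gene identifiers are hypothetical.

```python
def conserved_edges(edges_a, edges_b, orthologs):
    """Return edges of network A whose endpoints map, via the ortholog
    table, to a pair of genes that are also connected in network B.

    edges_a, edges_b: iterables of (gene, gene) pairs (undirected edges).
    orthologs: dict mapping a gene in species A to a list of genes in B.
    """
    # Store B's edges as unordered pairs so edge direction does not matter.
    b_pairs = {frozenset(edge) for edge in edges_b}
    conserved = []
    for u, v in edges_a:
        # A gene may have zero, one, or several orthologs in the other species.
        for ou in orthologs.get(u, ()):
            for ov in orthologs.get(v, ()):
                if ou != ov and frozenset((ou, ov)) in b_pairs:
                    conserved.append(((u, v), (ou, ov)))
    return conserved


# Toy data: every identifier here is made up for illustration.
edges_a = [("a1", "a2"), ("a2", "a3"), ("a1", "a4")]
edges_b = [("b1", "b2"), ("b3", "b5")]
orthologs = {"a1": ["b1"], "a2": ["b2"], "a3": ["b3", "b4"]}

result = conserved_edges(edges_a, edges_b, orthologs)
print(result)
```

In this toy case only the edge between `a1` and `a2` is reported, because its orthologs `b1` and `b2` are connected in the second network. Genes without a listed ortholog, such as `a4`, are simply skipped.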

Original language: English (US)
Pages (from-to): e9
Journal: Plant and Cell Physiology
Issue number: 1
State: Published - Jan 1 2015
Externally published: Yes

Bibliographical note

Funding Information:
This work is supported by the Japan Society for the Promotion of Science (JSPS) [Grants-in-Aid for Scientific Research on Innovative Areas (No. 26113716 to K.Y., No. 23113006 to G.S., No. 23113005 to M.M., No. 23113001 to G.S. and M.M.), Scientific Research (A) (No. 23248005 to K.A., No. 25252001 to M.W.), Scientific Research (B) (No. 25292005 to K.S., No. 24380023 to Y.K.) and Scientific Research (C) (No. 25450515 to G.S.)]; the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT) [Supported Program for the Strategic Research Foundation at Private Universities (2014–2018)]; and Meiji University [Research Funding for Computational Software Supporting Program].

Publisher Copyright:
© 2014 The Author.

Keywords


  • Correspondence analysis
  • Database
  • Gene expression network
  • Manual curation
  • Natural language processing (NLP)
  • Omics

ASJC Scopus subject areas

  • Physiology
  • Plant Science
  • Cell Biology

