Generalizing and learning protein-DNA binding sequence representations by an evolutionary algorithm

Ka Chun Wong, Chengbin Peng, Manhon Wong, Kwongsak Leung

Research output: Contribution to journal › Article › peer-review

26 Scopus citations

Abstract

Protein-DNA bindings are essential biological activities, and understanding them forms the basis for further deciphering biological and genetic systems. In particular, the bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) play a central role in gene transcription. A recent study found a comprehensive set of TF-TFBS binding sequence pairs. However, those pairs are one-to-one mappings, which cannot fully reflect the many-to-many mappings within the bindings. An evolutionary algorithm is proposed to learn generalized representations (many-to-many mappings) from the TF-TFBS binding sequence pairs (one-to-one mappings). The generalized pairs are shown to be more meaningful than the original TF-TFBS binding sequence pairs, and some representative examples are analyzed in this study. In particular, the analysis shows that TF-TFBS binding sequence pairs should not be presumed to be one-to-one mappings; they can also exhibit many-to-many mappings. The proposed method helps extract such many-to-many information from the one-to-one TF-TFBS binding sequence pairs found in the previous study, providing further knowledge for understanding the bindings between TFs and TFBSs. © 2011 Springer-Verlag.
Original language: English (US)
Pages (from-to): 1631-1642
Number of pages: 12
Journal: Soft Computing
Volume: 15
Issue number: 8
State: Published - Feb 5 2011

Bibliographical note

KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: The authors are grateful to the anonymous reviewers for their valuable comments. They would like to thank Tak-Ming Chan for his help in surveying the related work. This research is partially supported by grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Nos. 414107 and 414708).

ASJC Scopus subject areas

  • Geometry and Topology
  • Theoretical Computer Science
  • Software
