Proteins can form complex three-dimensional structures that allow them to perform many different functions, including efficient catalysis, sophisticated signal processing, and dynamic cell restructuring. Hence, exploring the protein folding process is central to understanding the development, adaptation, and disease states of organisms. It is becoming increasingly clear that proteins that arise “de novo” through translation of originally non-coding DNA regions play a significant role in the diversification and adaptation of species. However, the extent to which these de novo proteins can already form complex structures, and hence perform sophisticated functions, is unknown. In this study, we bioinformatically predicted the structural features of 200 de novo sequences from rice, using AlphaFold and other computational tools. Based on the results of this bioinformatic analysis, we selected sequences for small-scale screening and expression. Two proteins, de novo 18 and de novo 47, could be expressed recombinantly and analyzed experimentally using biophysical methods. Circular dichroism and nuclear magnetic resonance measurements confirmed structural features in agreement with the computational predictions. In addition, we designed protocols for high-throughput analysis using robotics. Our results provide a stepping stone for a comprehensive analysis of the structural landscape of de novo proteins and raise the possibility that de novo proteins can produce more sophisticated folds through self-association. Thus, our work provides a hypothesis for the origin of complex protein folds that serve as a framework for complex protein functions.
|KAUST Research Repository