Molecular dynamics (MD) simulation is a well-established method for studying protein motion at the atomic scale. However, it is computationally intensive and generates massive amounts of data. One way of addressing the dual challenges of computation efficiency and data analysis is to construct simplified models of long-timescale protein motion from MD simulation data. In this direction, we propose to use Markov models with hidden states, in which the Markovian states represent potentially overlapping probabilistic distributions over protein conformations. We also propose a principled criterion for evaluating the quality of a model by its ability to predict long-timescale protein motions. Our method was tested on 2D synthetic energy landscapes and two extensively studied peptides, alanine dipeptide and the villin headpiece subdomain (HP-35 NleNle). One interesting finding is that although a widely accepted model of alanine dipeptide contains six states, a simpler model with only three states is equally good for predicting long-timescale motions. We also used the constructed Markov models to estimate important kinetic and dynamic quantities for protein folding, in particular, mean first-passage time. The results are consistent with available experimental measurements.
Bibliographical noteKAUST Repository Item: Exported on 2020-10-01
Acknowledgements: AcRF grant R-252-000-350-112 from the Ministry of Education, Singapore (to D.H., in parts). National Science Foundation grant DMS-0443939 and two grants from the Academic Excellence Alliance program between King Abdullah University of Science&Technology (KAUST) and Stanford University (to J-C.L., in parts)
This publication acknowledges KAUST support, but has no KAUST affiliated authors.