Time series associated with single-molecule experiments and/or simulations contain a wealth of multiscale information about complex biomolecular systems. We demonstrate how a collection of penalized splines (P-splines) can be useful in quantitatively summarizing such data. In this work, functions estimated using P-splines are associated with stochastic differential equations (SDEs). It is shown how quantities estimated in a single SDE summarize fast-scale phenomena, whereas variation between curves associated with different SDEs partially reflects noise induced by motion evolving on a slower time scale. P-splines assist in "semiparametrically" estimating nonlinear SDEs in situations where a time-dependent external force is applied to a single-molecule system. The P-splines introduced simultaneously use function and derivative scatterplot information to refine curve estimates. We refer to the approach as the PuDI (P-splines using Derivative Information) method. It is shown how generalized least squares ideas fit seamlessly into the PuDI method. Applications are presented that demonstrate how using uncertainty information or approximations, together with generalized least squares techniques, improves PuDI fits. Although the primary application here is in estimating nonlinear SDEs, the PuDI method is applicable to situations where both unbiased function and derivative estimates are available.
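The idea of fitting one curve to function and derivative scatterplots at the same time can be illustrated with a minimal sketch. It is not the authors' PuDI implementation: the cubic truncated-power basis, the ridge penalty on the knot coefficients, the smoothing parameter `lam`, the test function, and the names `basis` and `fit_joint` are all assumptions chosen for illustration. The noise is taken to be independent with known standard deviations, so the generalized least squares step reduces to weighting each scatterplot by its inverse standard deviation.

```python
import numpy as np

def basis(x, knots, deriv=False):
    """Cubic truncated-power basis, or its first derivative, evaluated at x."""
    x = np.asarray(x, dtype=float)
    t = np.clip(x[:, None] - knots[None, :], 0.0, None)  # (x - knot)_+
    if deriv:
        poly = np.column_stack([np.zeros_like(x), np.ones_like(x), 2 * x, 3 * x**2])
        return np.hstack([poly, 3 * t**2])
    poly = np.column_stack([np.ones_like(x), x, x**2, x**3])
    return np.hstack([poly, t**3])

def fit_joint(x, y, dy, knots, lam, sd_y, sd_dy):
    """Penalized fit using function values y and derivative values dy together.

    Rows are scaled by the inverse noise standard deviations (a diagonal
    special case of generalized least squares); the penalty shrinks only
    the knot coefficients, leaving the cubic polynomial part unpenalized.
    """
    B = basis(x, knots) / sd_y
    D = basis(x, knots, deriv=True) / sd_dy
    R = np.sqrt(lam) * np.diag(np.r_[np.zeros(4), np.ones(len(knots))])
    A = np.vstack([B, D, R])
    rhs = np.concatenate([y / sd_y, dy / sd_dy, np.zeros(A.shape[1])])
    coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return coef

# Synthetic example (hypothetical data): noisy samples of f and f'.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
f_true = np.sin(2 * np.pi * x)
df_true = 2 * np.pi * np.cos(2 * np.pi * x)
sd_y, sd_dy = 0.1, 0.5
y = f_true + sd_y * rng.normal(size=x.size)
dy = df_true + sd_dy * rng.normal(size=x.size)

knots = np.linspace(0.0, 1.0, 12)[1:-1]  # 10 interior knots
coef = fit_joint(x, y, dy, knots, lam=1e-4, sd_y=sd_y, sd_dy=sd_dy)

f_hat = basis(x, knots) @ coef
df_hat = basis(x, knots, deriv=True) @ coef
rmse_f = float(np.sqrt(np.mean((f_hat - f_true) ** 2)))
rmse_df = float(np.sqrt(np.mean((df_hat - df_true) ** 2)))
print(f"RMSE of function estimate:   {rmse_f:.4f}")
print(f"RMSE of derivative estimate: {rmse_df:.4f}")
```

Because both scatterplots share one coefficient vector, the derivative data constrain the shape of the fitted function and the function data constrain its derivative. With correlated noise, the row scaling would be replaced by a whitening transform built from the full covariance matrix.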
|Original language|English (US)|
|Number of pages|19|
|Journal|Multiscale Modeling & Simulation|
|State|Published - Jan 2010|
Bibliographical note
KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): KUS-CI-016-04
Acknowledgements:
Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005. Current address: Numerica Corporation, 4850 Hahns Peak Drive, Suite 200, Loveland, CO 80538 (Chris.Calderon@numerica.us). This author's work was funded by NIH grant T90 DK070121-04.
Department of Statistics, Texas A&M University, College Station, TX 77843. Current address: Department of Epidemiology & Biostatistics, School of Rural Public Health, Texas A&M Health Science Center, 1266 TAMU, College Station, TX 77843 (email@example.com). This author's work was supported by a postdoctoral training grant from the National Cancer Institute (CA90301).
Department of Statistics, Texas A&M University, College Station, TX 77843 (firstname.lastname@example.org). This author's work was supported by a grant from the National Cancer Institute (CA57030) and by award KUS-CI-016-04, given by King Abdullah University of Science and Technology (KAUST).
Department of Computational and Applied Mathematics, Rice University, Houston, TX 77005 (email@example.com). This author's work was partially supported by AFOSR grant FA9550-09-1-0225 and by NSF grant CCF-0634902.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.