Bayesian analysis of RNA sequencing data by estimating multiple shrinkage priors

Mark A. Van De Wiel*, Gwenaël G.R. Leday, Luba Pardo, Håvard Rue, Aad W. Van Der Vaart, Wessel N. Van Wieringen

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

83 Scopus citations

Abstract

Next-generation sequencing is quickly replacing microarrays as a technique to probe different molecular levels of the cell, such as DNA or RNA. The technology provides higher resolution, while reducing bias. RNA sequencing results in counts of RNA strands. This type of data imposes new statistical challenges. We present a novel, generic approach to model and analyze such data. Our approach aims at large flexibility of the likelihood (count) model and the regression model alike. Hence, a variety of count models is supported, such as the popular negative binomial (NB) model, which accounts for overdispersion. In addition, complex, non-balanced designs and random effects are accommodated. Like some other methods, our method provides shrinkage of dispersion-related parameters. However, we extend it by enabling joint shrinkage of parameters, including those for which inference is desired. We argue that this is essential for Bayesian multiplicity correction. Shrinkage is effectuated by empirically estimating priors. We discuss several parametric (mixture) and non-parametric priors and develop procedures to estimate (parameters of) those. Inference is provided by means of local and Bayesian false discovery rates. We illustrate our method on several simulations and two data sets, also to compare it with other methods. Model- and data-based simulations show substantial improvements in the sensitivity at the given specificity. The data motivate the use of the zero-inflated negative binomial (ZI-NB) as a powerful alternative to the NB, which results in higher detection rates for low-count data. Finally, compared with other methods, the results on small sample subsets are more reproducible when validated on their large sample complements, illustrating the importance of the type of shrinkage.
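
The count models and the error-rate summary named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mean/size parameterization of the NB, the zero-inflation weight `pi0`, and the function names are assumptions chosen for illustration, and the Bayesian false discovery rate is computed in the common way, as the average local false discovery rate among the selected features.

```python
import math


def nb_pmf(y, mu, size):
    """NB probability of count y with mean mu > 0 and size (inverse dispersion) > 0.

    Under this (assumed) parameterization the variance is mu + mu**2 / size,
    so a small size means strong overdispersion relative to the Poisson.
    """
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, size, pi0):
    """Zero-inflated NB: with probability pi0 the count is a structural zero,
    otherwise it is drawn from the NB above."""
    base = (1.0 - pi0) * nb_pmf(y, mu, size)
    return pi0 + base if y == 0 else base


def bayesian_fdr(local_fdrs, threshold):
    """Average local fdr among features whose local fdr is at most threshold.

    Returns 0.0 when nothing is selected (a convention assumed here).
    """
    selected = [p for p in local_fdrs if p <= threshold]
    return sum(selected) / len(selected) if selected else 0.0
```

For a low-count feature, `zinb_pmf(0, mu, size, pi0)` exceeds `nb_pmf(0, mu, size)` whenever `pi0 > 0`, which is the extra mass at zero that the ZI-NB adds to the NB.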

Original language: English (US)
Pages (from-to): 113-128
Number of pages: 16
Journal: Biostatistics
Volume: 14
Issue number: 1
State: Published - Jan 2013
Externally published: Yes

Keywords

  • Count data
  • Empirical Bayes
  • High-dimensional
  • Multiplicity
  • Sequencing data
  • Shrinkage

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
