Training set optimization under population structure in genomic selection

Julio Isidro, Jean Luc Jannink, Deniz Akdemir, Jesse Poland, Nicolas Heslot, Mark E. Sorrells

Research output: Contribution to journal › Article › peer-review

250 Scopus citations

Abstract

KEY MESSAGE: Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance.

The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five TRS sampling algorithms were evaluated for prediction accuracy in the presence of different levels of population structure: stratified sampling, mean of the coefficient of determination (CDmean), mean of prediction error variance (PEVmean), stratified CDmean (StratCDmean), and random sampling. In the presence of population structure, a sampling method that captures the most phenotypic variation in the TRS is desirable. The wheat dataset showed mild population structure, and the CDmean and stratified CDmean methods showed the highest accuracies for all traits except test weight and heading date. The rice dataset had strong population structure, and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS while maximizing the relationship between the TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion for optimizing the TRS seems to depend on the interaction of trait architecture and population structure.
Original language: English (US)
Pages (from-to): 145-158
Number of pages: 14
Journal: TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik
Volume: 128
Issue number: 1
State: Published - Jan 1 2015
Externally published: Yes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2022-09-13

ASJC Scopus subject areas

  • Genetics
  • Agronomy and Crop Science
  • Biotechnology
