Sparse kernel models provide optimization of training set design for genomic prediction in multiyear wheat breeding data

Marco Lopez-Cruz, Susanne Dreisigacker, Leonardo Crespo-Herrera, Alison R. Bentley, Ravi Singh, Jesse Poland, Sandesh Shrestha, Julio Huerta-Espino, Velu Govindan, Philomin Juliana, Suchismita Mondal, Paulino Pérez-Rodríguez*, Jose Crossa*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

The success of genomic selection (GS) in breeding schemes relies on its ability to provide accurate predictions of unobserved lines at early stages. Multigeneration data provides opportunities to increase the training data size and thus the likelihood of extracting useful information from ancestors to improve prediction accuracy. Genomic best linear unbiased predictions (GBLUPs) are obtained by borrowing information through kinship relationships between individuals. Multigeneration data usually becomes heterogeneous, with complex family relationship patterns that are increasingly entangled with each generation. Under these conditions, historical data may not be optimal for model training, as the accuracy could be compromised. The sparse selection index (SSI) is a method for training set (TRN) optimization in which training individuals provide predictions to some but not all predicted subjects. We added a trimming process to the original SSI (trimmed SSI) to remove less important training individuals for prediction. Using a large multigeneration (8 yr) wheat (Triticum aestivum L.) grain yield dataset (n = 68,836), we found increases in accuracy as more years are included in the TRN, with improvements of ∼0.05 in the GBLUP accuracy when using 5 yr of historical data relative to when using only 1 yr. The SSI method showed a small gain over the GBLUP accuracy but with an important reduction in the TRN size. These reduced TRNs were formed with a similar number of subjects from each training generation. Our results suggest that the SSI provides a more stable ranking of genotypes than the GBLUP as the TRN becomes larger.
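
The contrast between GBLUP and the SSI described in the abstract can be pictured with a small sketch. This is not the authors' implementation: it assumes the SSI can be written as an L1-penalized selection index solved by coordinate descent, that heritability (h2) is known, and that the genomic relationship matrix comes from simulated marker scores. The function names (gblup_weights, ssi_weights), the penalty value, and the data are all hypothetical, and the trimming step is not shown.

```python
import numpy as np

def gblup_weights(G_trn, g_tst, h2):
    """Dense weights: every training line contributes to the prediction."""
    lam0 = (1.0 - h2) / h2                      # ridge term implied by h2
    P = G_trn + lam0 * np.eye(len(G_trn))
    return np.linalg.solve(P, g_tst)

def ssi_weights(G_trn, g_tst, h2, lam, n_iter=1000, tol=1e-10):
    """Sparse weights (assumed formulation): minimize
    0.5*b'Pb - g'b + lam*||b||_1 by cyclic coordinate descent."""
    n = len(G_trn)
    P = G_trn + ((1.0 - h2) / h2) * np.eye(n)
    b = np.zeros(n)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(n):
            # Partial residual for coordinate j, excluding its own term
            r = g_tst[j] - P[j] @ b + P[j, j] * b[j]
            # Soft-thresholding sets weakly related training lines to zero
            new = np.sign(r) * max(abs(r) - lam, 0.0) / P[j, j]
            max_change = max(max_change, abs(new - b[j]))
            b[j] = new
        if max_change < tol:
            break
    return b

# Simulated example (hypothetical data): 30 training lines, 1 predicted line
rng = np.random.default_rng(0)
n_trn, n_markers = 30, 200
X = rng.standard_normal((n_trn + 1, n_markers))  # stand-in marker scores
G = X @ X.T / n_markers                          # genomic relationships
G_trn, g_tst = G[:n_trn, :n_trn], G[:n_trn, n_trn]
y_trn = rng.standard_normal(n_trn)               # stand-in adjusted phenotypes
h2 = 0.5

b_dense = gblup_weights(G_trn, g_tst, h2)
b_sparse = ssi_weights(G_trn, g_tst, h2, lam=0.5 * np.abs(g_tst).max())

print("training lines used by GBLUP:", np.count_nonzero(b_dense))
print("training lines used by sparse index:", np.count_nonzero(b_sparse))
print("predictions:", b_dense @ y_trn, b_sparse @ y_trn)
```

With the penalty set to zero the sparse weights coincide with the GBLUP weights, so under this formulation GBLUP is the special case in which every training individual supports every prediction. Larger penalties shrink the effective training set for each predicted line.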

Original language: English (US)
Article number: e20254
Journal: Plant Genome
Volume: 15
Issue number: 4
DOIs
State: Published - Dec 2022

Bibliographical note

Funding Information:
We thank all CIMMYT scientists, field workers, and lab assistants who collected the data used in this study. Open Access fees are received from the Bill and Melinda Gates Foundation. We acknowledge the financial support provided by the Bill and Melinda Gates Foundation [INV‐003439 BMGF/FCDO Accelerating Genetic Gains in Maize and Wheat for Improved Livelihoods (AGG)] as well as USAID projects [Amend. No. 9 MTO 069033, USAID‐CIMMYT Wheat/AGGMW, AGG‐Maize Supplementary Project, AGG (Stress Tolerant Maize for Africa)] that generated the CIMMYT data analyzed in this study. We are also thankful for the financial support provided by the Foundation for Research Levy on Agricultural Products (F.F.L.) and the Agricultural Agreement Research Fund (J.A.) in Norway through NFR grant 267806, the CIMMYT CRP‐WHEAT, and the USDA National Institute of Food and Agriculture grants 2020‐67013‐30904 and 2018‐67015‐27957 to DER and Hatch project 1010469.

Publisher Copyright:
© 2022 International Maize and Wheat Improvement Center (CIMMYT). The Plant Genome published by Wiley Periodicals LLC on behalf of Crop Science Society of America.

ASJC Scopus subject areas

  • Genetics
  • Agronomy and Crop Science
  • Plant Science
