Abstract
Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5' UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.
Original language | English (US) |
---|---|
Pages (from-to) | 870-880 |
Number of pages | 11 |
Journal | Genome Research |
Volume | 29 |
Issue number | 5 |
State | Published - Apr 16 2019 |
Bibliographical note
KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: Work in the Grigoriev laboratory is supported by awards from the National Science Foundation (DBI-1458202, ACI-1548562), National Institutes of Health (R15CA220059), and New Jersey Health Foundation (PC77-17). We thank XSEDE (MCB150014), PRAGMA, Universidad de los Andes, and the Department of Science and Technology, Advanced Science and Technology Institute of the Philippines for computing resources. We thank the Bill and Melinda Gates Foundation for support to the 3K RG via the GSR Phase 2 award OPPGD1393.