Rice (Oryza sativa), a major staple throughout the world and a model system for plant genomics and breeding, was the first crop to have its genome sequenced, almost two decades ago. However, the reference genomes of all higher organisms to date contain gaps and missing sequences. Here, we report, for the first time, the assembly and analyses of gap-free reference genome sequences of two elite O. sativa xian/indica rice varieties, 'Zhenshan 97 (ZS97)' and 'Minghui 63 (MH63)', which are used as a model system for studying heterosis and yield. Gap-free reference genomes provide the opportunity for a global view of the structure and function of centromeres. We show that all rice centromeric regions share conserved centromere-specific satellite motifs, but with different copy numbers and structures. In addition, the similarity of CentO repeats within the same chromosome is higher than between chromosomes, supporting a model of local expansion and homogenization. Both genomes have more than 395 non-TE genes located in centromeric regions, of which ∼41% are actively transcribed. Two large structural variants at the end of chromosome 11 affect the copy number of resistance genes between the two genomes. The availability of the two gap-free genomes lays a solid foundation for further understanding genome structure and function in plants and for breeding climate-resilient varieties.
KAUST Repository Item: Exported on 2021-06-28
Acknowledgements: We sincerely thank 1) Pacific Biosciences of California, Inc. for sequencing of MH63, 2) Wuhan Frasergen Bioinformatics Co., Ltd. for sequencing of ZS97, 3) the computing platform of the National Key Laboratory of Crop Genetic Improvement in HZAU for providing the computational resources, and 4) Dr. Jiming Jiang at MSU for his critical comments and constructive suggestions on our centromere analyses.
- Plant Science
- Molecular Biology