Aegilops columnaris Zhuk organellar-enriched DNA sequencing, assembly, and comparative genomics
dataset
posted on 2024-09-29, 05:47, authored by US Department of Agriculture
Proper interactions between the nucleus and cytoplasmic organelles (mitochondria and plastids) are essential to eukaryotic cellular function. To improve our understanding of the role of organellar genomes and nuclear-cytoplasmic interactions in plant development and stress response, our first aim is to survey organellar genome diversity in wheat and across the broader Triticum-Aegilops complex. This will be followed by work to assess genome dynamics across developmental stages as well as during abiotic and biotic stress response. The results of this work will be important for improving crop traits. To accomplish our goals, it was critical to first establish improved methods for the isolation, sequencing, and assembly of organellar genomes from limited starting material without whole-genome amplification. As a proof of concept, we optimized our methods using Triticum aestivum cv. Chinese Spring, for which previous sequencing data are available. The mitochondrial and chloroplast genomes have large repeats (up to 10 kb and 20 kb in length, respectively). Previous studies have performed whole-genome amplification and have manually stitched contigs to force a single master circle configuration of the organellar genomes, which may or may not reflect the true native state of the wheat organellar genomes. To resolve the long repeats and perform de novo assemblies without whole-genome amplification and manual stitching of contigs, we used low-input PacBio 20 kb library preparations to generate long sequencing reads. In total, we sequenced 20 organellar-enriched samples with PacBio, including 13 diverse wild species, T. durum, T. aestivum cv. Chinese Spring, and three wheat alloplasmic lines. In addition, we generated Illumina short-read sequences for many additional cultivars, wild species, and alloplasmic lines. This project includes data for one of these samples (Aegilops columnaris Zhuk). Raw sequencing reads are deposited here.
Assemblies and annotations will be included once available.
It is recommended to cite the accession numbers that are assigned to data submissions, e.g. the GenBank, WGS or SRA accession numbers. If individual BioProjects need to be referenced, state that "The data have been deposited with links to BioProject accession number PRJNA470701 in the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/)."
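As a convenience for locating the accession numbers to cite, the following minimal sketch builds an NCBI E-utilities `esearch` query URL for the BioProject accession given above. The function name `build_sra_search_url`, the choice of the `sra` database, and the `retmax` default are illustrative assumptions, not part of this record. Whether the query returns run records depends on how the submissions are linked to PRJNA470701, so the result should be checked against the BioProject page itself.

```python
from urllib.parse import urlencode

# Base endpoint of the NCBI E-utilities esearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_search_url(bioproject: str, retmax: int = 100) -> str:
    """Return an esearch URL that searches SRA for a BioProject accession.

    Hypothetical helper: it only constructs the URL and performs no
    network request, so the caller decides how and when to fetch it.
    """
    params = {
        "db": "sra",          # assumption: raw reads are indexed in SRA
        "term": bioproject,   # accession used as a plain search term
        "retmax": retmax,     # maximum number of record IDs to return
        "retmode": "json",    # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Accession taken from the citation note above.
    print(build_sra_search_url("PRJNA470701"))
```

The printed URL can be opened in a browser or fetched with any HTTP client; the returned record IDs can then be resolved to the SRA accession numbers that should be cited.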