Triticum aestivum cv. Chinese Spring organellar-enriched long-read DNA sequencing

dataset

posted on 2024-11-23, 21:50 authored by US Department of Agriculture

Proper interactions between the nucleus and cytoplasmic organelles (mitochondria and plastids) are essential to eukaryotic cellular function. To improve our understanding of the role of organellar genomes and nuclear-cytoplasmic interactions in plant development and stress response, our first aim is to survey organellar genome diversity in wheat and across the broader Triticum-Aegilops complex. This will be followed by work to assess genome dynamics across developmental stages and during abiotic and biotic stress responses. The results are expected to inform efforts to improve crop traits. To accomplish these goals, it was critical to first establish improved methods for the isolation, sequencing, and assembly of organellar genomes from limited starting material without whole genome amplification. As a proof of concept, we optimized our methods using Triticum aestivum cv. Chinese Spring, for which previous sequencing data are available. The mitochondrial and chloroplast genomes have large repeats (up to 10 kb and 20 kb in length, respectively). Previous studies performed whole genome amplification and manually stitched contigs to force a single master circle configuration of the organellar genomes, which may or may not reflect the true native state of the wheat organellar genomes. To resolve the long repeats and perform de novo assemblies without whole genome amplification or manual stitching of contigs, we used low-input PacBio 20 kb library preparations to generate long sequencing reads. We compared two methods of organellar DNA isolation and purification: a traditional differential centrifugation (DC) approach and a kit-based pulldown approach that we call the methyl fractionation (MF) approach. Raw sequencing reads from both techniques are provided here.

Funding

NSF: 1361554

History

Data contact name

BioProject Curation Staff

Data contact email

bioprojecthelp@ncbi.nlm.nih.gov

Publisher

National Center for Biotechnology Information

Temporal Extent Start Date

2017-07-21

Theme

Non-geospatial

ISO Topic Category

biota

National Agricultural Library Thesaurus terms

sequence analysis

Public Access Level

Public

Accession Number

PRJNA395365

Preferred dataset citation

It is recommended to cite the accession numbers that are assigned to data submissions, e.g. the GenBank, WGS or SRA accession numbers. If individual BioProjects need to be referenced, state that "The data have been deposited with links to BioProject accession number PRJNA395365 in the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/)."