Ag Data Commons

Aegilops columnaris Zhuk organellar-enriched DNA sequencing, assembly, and comparative genomics

dataset
posted on 2024-09-29, 05:47 authored by US Department of Agriculture
Proper interactions between the nucleus and cytoplasmic organelles (mitochondria and plastids) are essential to eukaryotic cellular function. To improve our understanding of the role of organellar genomes and nuclear-cytoplasmic interactions in plant development and stress response, our first aim is to survey organellar genome diversity in wheat and across the broader Triticum-Aegilops complex. This will be followed by work to assess genome dynamics across developmental stages as well as during abiotic and biotic stress response. The results of this work will be important for improving crop traits. To accomplish our goals, it was critical to first establish improved methods for the isolation, sequencing, and assembly of organellar genomes from limited starting material without whole-genome amplification. As a proof of concept, we optimized our methods using Triticum aestivum cv. Chinese Spring, for which previous sequencing data are available. The mitochondrial and chloroplast genomes have large repeats (up to 10 kb and 20 kb in length, respectively). Previous studies have performed whole-genome amplification and have manually stitched contigs to force a single master circle configuration of the organellar genomes, which may or may not reflect the true native state of the wheat organellar genomes. To resolve the long repeats and perform de novo assemblies without whole-genome amplification and manual stitching of contigs, we used low-input PacBio 20 kb library preparations to generate long sequencing reads. In total, we sequenced 20 organellar-enriched samples with PacBio, including 13 diverse wild species, T. durum, T. aestivum cv. Chinese Spring, and three wheat alloplasmic lines. In addition, we generated Illumina short-read sequences for many additional cultivars, wild species, and alloplasmic lines. This project includes data for one of these samples (Aegilops columnaris Zhuk). Raw sequencing reads are deposited here.
Assemblies and annotations will be included once available.

Funding

NSF: 1361554

History

Data contact name

BioProject Curation Staff

Publisher

National Center for Biotechnology Information

Temporal Extent Start Date

2018-05-09

Theme

  • Non-geospatial

ISO Topic Category

  • biota

National Agricultural Library Thesaurus terms

genetics

Pending citation

  • No

Public Access Level

  • Public

Accession Number

PRJNA470701

Preferred dataset citation

It is recommended to cite the accession numbers that are assigned to data submissions, e.g. the GenBank, WGS or SRA accession numbers. If individual BioProjects need to be referenced, state that "The data have been deposited with links to BioProject accession number PRJNA470701 in the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/)."
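
For readers who want to locate the individual SRA accessions recommended for citation, the following is a minimal sketch of how a query URL for the NCBI E-utilities ESearch endpoint could be built for this BioProject. The helper name `sra_search_url` and the `retmax` default are illustrative assumptions, not part of this record, and the `[BioProject]` field qualifier is assumed to apply to the SRA database. The snippet only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def sra_search_url(bioproject: str, retmax: int = 100) -> str:
    """Build an ESearch URL for SRA records linked to a BioProject.

    Hypothetical helper for illustration; `retmax` is an assumed default.
    """
    params = {
        "db": "sra",                          # search the Sequence Read Archive
        "term": f"{bioproject}[BioProject]",  # restrict the search to this BioProject
        "retmode": "json",                    # request a JSON response
        "retmax": retmax,                     # maximum number of IDs to return
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


url = sra_search_url("PRJNA470701")
print(url)
```

The printed URL can be opened in a browser or passed to any HTTP client. The returned identifiers are internal SRA UIDs, so a further lookup would be needed to obtain the run accessions to cite.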
