Ag Data Commons
Browse

Genome sequencing and assembly of Aphelinus atriplicis and Aphelinus certus

dataset
posted on 2024-11-23, 22:14 authored by USDA-ARS
Safe, effective biological-control introductions against invasive pests depend on narrowly host-specific natural enemies with the ability to adapt to a changing environment. As part of a project on the genetic architectures of these traits, we assembled and annotated the genomes of two aphid parasitoids, Aphelinus atriplicis and Aphelinus certus. We report here several assemblies made using Illumina and PacBio data, which we combined into a meta-assembly. We scaffolded the meta-assembly with markers from a genetic map of hybrids between A. atriplicis and A. certus. We used this genetic-linkage scaffolded (GLS) assembly of A. atriplicis to scaffold a de novo assembly of A. certus. The de novo assemblies differed in contiguity, and the meta-assembly was more contiguous than the best de novo assembly. Scaffolding with genetic-linkage data allowed chromosomal-level assembly of the A. atriplicis genome, and scaffolding a de novo assembly of A. certus with this GLS assembly, greatly increased the contiguity of the A. certus assembly. However, completeness of the A. atriplicis assembly, as measured by percent of the complete, single-copy BUSCO hymenopteran genes, varied little among de novo assemblies and was not increased by meta-assembly or genetic scaffolding. Furthermore, the greater contiguity of the meta-assembly and GLS assembly had little or no effect on the numbers of genes identified, the proportion with homologs or functional annotations. Increased contiguity of the A. certus assembly provided modest improvement in assembly completeness, as measured by the percent complete, single-copy BUSCO hymenopteran genes. The total genic sequence increased, and while the number of genes declined, gene length increased, which together suggests greater accuracy of gene models. More contiguous assemblies provide uses other than gene annotation, for example, identifying the genes associated with quantitative trait loci and understanding of chromosomal rearrangements associated with speciation.

Funding

USDA-NIFA: 2017-67013-26526

USDA-ARS: 8010-22000-032-00D

History

Data contact name

BioProject Curation Staff

Publisher

National Center for Biotechnology Information

Temporal Extent Start Date

2021-10-01

Theme

  • Non-geospatial

ISO Topic Category

  • biota

National Agricultural Library Thesaurus terms

genomics; sequence analysis; genome assembly

Pending citation

  • No

Public Access Level

  • Public

Accession Number

PRJNA767831

Preferred dataset citation

It is recommended to cite the accession numbers that are assigned to data submissions, e.g. the GenBank, WGS or SRA accession numbers. If individual BioProjects need to be referenced, state that "The data have been deposited with links to BioProject accession number PRJNA767831 in the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/)."

Usage metrics

    Categories

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC