Genome sequencing and assembly of Aphelinus atriplicis and Aphelinus certus
dataset
posted on 2024-11-23, 22:14authored byUSDA-ARS
Safe, effective biological-control introductions against invasive pests depend on narrowly host-specific natural enemies with the ability to adapt to a changing environment. As part of a project on the genetic architectures of these traits, we assembled and annotated the genomes of two aphid parasitoids, Aphelinus atriplicis and Aphelinus certus. We report here several assemblies made using Illumina and PacBio data, which we combined into a meta-assembly. We scaffolded the meta-assembly with markers from a genetic map of hybrids between A. atriplicis and A. certus. We used this genetic-linkage scaffolded (GLS) assembly of A. atriplicis to scaffold a de novo assembly of A. certus. The de novo assemblies differed in contiguity, and the meta-assembly was more contiguous than the best de novo assembly. Scaffolding with genetic-linkage data allowed chromosomal-level assembly of the A. atriplicis genome, and scaffolding a de novo assembly of A. certus with this GLS assembly, greatly increased the contiguity of the A. certus assembly. However, completeness of the A. atriplicis assembly, as measured by percent of the complete, single-copy BUSCO hymenopteran genes, varied little among de novo assemblies and was not increased by meta-assembly or genetic scaffolding. Furthermore, the greater contiguity of the meta-assembly and GLS assembly had little or no effect on the numbers of genes identified, the proportion with homologs or functional annotations. Increased contiguity of the A. certus assembly provided modest improvement in assembly completeness, as measured by the percent complete, single-copy BUSCO hymenopteran genes. The total genic sequence increased, and while the number of genes declined, gene length increased, which together suggests greater accuracy of gene models. More contiguous assemblies provide uses other than gene annotation, for example, identifying the genes associated with quantitative trait loci and understanding of chromosomal rearrangements associated with speciation.
It is recommended to cite the accession numbers that are assigned to data submissions, e.g. the GenBank, WGS or SRA accession numbers. If individual BioProjects need to be referenced, state that "The data have been deposited with links to BioProject accession number PRJNA767831 in the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/)."