Ag Data Commons

Oncopeltus fasciatus hybrid genome assembly 1.0

posted on 2024-02-13, 13:37 authored by Iris M. Vargas Jentzsch, Viera Kovacova, Kurt Stueber, Stefan Koelzer, Kristen A. Panfilio

The milkweed bug, Oncopeltus fasciatus, was sequenced as part of the i5k pilot project from Baylor College of Medicine (Illumina data). To augment those resources, we present here a hybrid genome assembly with low coverage PacBio data, assembled with PBJelly: the Oncopeltus fasciatus Hybrid Genome Assembly v1.0.

Oncopeltus fasciatus has been an established lab organism for over 60 years, and has been used for a wide range of studies from physiology to development and evolution. As a relatively conservative and generalized species, it affords a baseline against which other species can be compared.

For example, this species has the same piercing-and-sucking mouthparts as its less benign relatives, including the blood-sucking kissing bug, Rhodnius prolixus, and the brown marmorated stink bug, Halyomorpha halys, which are disease vector and agricultural pest species, respectively. Unlike the pest species, the benign, seed-feeding Oncopeltus can be functionally investigated in the lab by RNA interference (RNAi). Comparing the genomes, and conducting experimental lab work in Oncopeltus, will help to identify unique features of the pest species, and thus inform management strategies for them.

More generally, Oncopeltus is a key species for comparisons across the insects. It is one of the few experimentally tractable hemimetabolous species that can ground comparisons with the completely metamorphosing species of the Holometabola (e.g., flies, beetles, wasps). Topics investigated in this framework include reproductive biology and development of the legs, wings, body segments, extraembryonic membranes, and overall establishment of the body plan.

This dataset presents the Oncopeltus fasciatus hybrid genome assembly v1.0, the result of PBJelly-based gapfilling of the Illumina assembly (Illumina assembly v1.0, DOI: 10.15482/usda.adc/1173238). It uses PacBio data from 2,135,043 subreads (mean length 15,760 nt; range 35 to 46,753 nt), providing an approximate coverage of 8x. The gapfilled assembly was constructed with PBJelly v13.10.22, with the following blasr parameters: ‘-minMatch 8 -minPctIdentity 70 -bestn 1 -nCandidates 20 -maxScore -500 -nproc 4 -noSplitSubreads’. Compared to the Illumina-only assembly, gapfilling substantially reduced the proportion of undetermined nucleotides (assembly gaps), from 30% to 6%, with an attendant modest reduction in the number of scaffolds (17,095 scaffolds, N50 = 409 Kb), while the assembly size increased from 1,099 to 1,361 Mb (cf. genome size of 926 Mb based on flow cytometry measurements).
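The summary figures quoted above (scaffold count, N50, assembly size, proportion of undetermined nucleotides) can in principle be recomputed from the assembly FASTA file. The following is a minimal Python sketch, not the authors' pipeline. It assumes that gaps are encoded as runs of N characters and that N50 is taken over scaffold lengths including gaps; the function names read_fasta and assembly_stats are illustrative only.

```python
import io
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(handle: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from an iterable of FASTA text lines."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def assembly_stats(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Scaffold count, total size, gap (N) fraction and N50 of an assembly."""
    lengths = []
    gap_bases = 0
    for _, seq in records:
        lengths.append(len(seq))
        # Assumption: undetermined nucleotides are written as N or n.
        gap_bases += seq.upper().count("N")
    total = sum(lengths)
    # N50: the length L such that scaffolds of length >= L together
    # contain at least half of all bases in the assembly.
    running, n50 = 0, 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return {
        "scaffolds": len(lengths),
        "total_bases": total,
        "gap_fraction": gap_bases / total if total else 0.0,
        "n50": n50,
    }


# Tiny in-memory example; for the real data, pass an open file handle, e.g.
# open("Oncopeltus_fasciatus_PBJelly_Gapfilled_Illumina_Assemblyv1.fasta.fas")
demo = io.StringIO(">s1\nACGTNNNNAC\n>s2\nACGT\nNN\n>s3\nAC\n")
demo_stats = assembly_stats(read_fasta(demo))
print(demo_stats)
```

Because each scaffold is held in memory one at a time, the sketch should cope with an assembly of this size, although a dedicated tool will be faster.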

The PacBio subreads used in the assembly are also provided here, where the file names indicate which reaction chemistry was used: P4-C2 (23 SMRT cells) and P6-C4 (2 SMRT cells). The sequencing was done with a PacBio RS II machine. The single template library was generated from pooled genomic DNA from an adult female and mixed-stage eggs.

Resources in this dataset:

  • Resource Title: Oncopeltus fasciatus PBJelly Gapfilled Illumina Assembly v1.0.

    File Name: Oncopeltus_fasciatus_PBJelly_Gapfilled_Illumina_Assemblyv1.fasta.fas

    Resource Description: Hybrid genome assembly for the milkweed bug, Oncopeltus fasciatus, based on the Illumina assembly v1.0 gapfilled with low coverage PacBio data (approximate coverage of 8x), using PBJelly v13.10.22. Assembly scaffolds are provided in fasta format.

  • Resource Title: Oncopeltus fasciatus PacBio subreads with P4-C2 chemistry.

    File Name: Oncopeltus_fasciatus_PacBio_subreadsP4-C2.fasta.fas

    Resource Description: Oncopeltus fasciatus PacBio subreads with P4-C2 chemistry (23 SMRT cells), from pooled genomic DNA from an adult female and mixed-stage eggs. Subreads are provided in fasta format.

  • Resource Title: Oncopeltus fasciatus PacBio subreads with P6-C4 chemistry.

    File Name: Oncopeltus_fasciatus_PacBio_subreadsP6-C4.fasta.fas

    Resource Description: Oncopeltus fasciatus PacBio subreads with P6-C4 chemistry (2 SMRT cells), from pooled genomic DNA from an adult female and mixed-stage eggs. Subreads are provided in fasta format.
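For users who wish to check the subread files against the description (number of subreads, length range, mean length), a streaming length summary is sufficient. The Python sketch below is illustrative only: the function names fasta_lengths and summarize_lengths are hypothetical, and the sketch assumes standard FASTA with one ">" header line per subread.

```python
from typing import Dict, Iterable, Iterator, Optional


def fasta_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the length of each FASTA record without storing its sequence."""
    length: Optional[int] = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def summarize_lengths(lengths: Iterable[int]) -> Dict[str, object]:
    """Count, total bases, mean, minimum and maximum of record lengths."""
    count = 0
    total = 0
    shortest: Optional[int] = None
    longest: Optional[int] = None
    for n in lengths:
        count += 1
        total += n
        shortest = n if shortest is None else min(shortest, n)
        longest = n if longest is None else max(longest, n)
    return {
        "count": count,
        "total_bases": total,
        "mean": total / count if count else 0.0,
        "min": shortest,
        "max": longest,
    }


# Tiny in-memory example. For the real data, the two subread files could be
# chained, e.g. itertools.chain(open(p4c2_path), open(p6c4_path)), where
# p4c2_path and p6c4_path are placeholders for the local file paths.
demo_lines = [">r1", "ACGT", "AC", ">r2", "A", ">r3", "ACGTACGT"]
demo_summary = summarize_lengths(fasta_lengths(demo_lines))
print(demo_summary)
```

Dividing total_bases by a genome size estimate (926 Mb from flow cytometry, per the description) gives an approximate coverage value for comparison with the stated 8x.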


Deutsche Forschungsgemeinschaft: SFB 680 project A12


Data contact name

Panfilio, Kristen


Ag Data Commons

Intended use

This preliminary hybrid assembly was primarily used to assess repetitive content in the genome. Please see the final publication on this work for full details.

Use limitations

As indicated in the general description, this is a preliminary hybrid assembly. It is based on low coverage PacBio data, and although gaps were substantially reduced compared to the Illumina-only assembly, the hybrid assembly still has limited contiguity and is somewhat in excess of the measured genome size. These data should be interpreted accordingly.

Temporal Extent Start Date


Temporal Extent End Date



  • Not specified

Geographic Coverage


Geographic location - description

Panfilio Lab, University of Cologne, Zuelpicher Str. 47b, Cologne 50674, Germany

ISO Topic Category

  • biota

Ag Data Commons Group

  • Insects - i5K

National Agricultural Library Thesaurus terms

ecology; Oncopeltus fasciatus; genome assembly; physiology; evolution; mouthparts; wings; extraembryonic membranes; data collection; nucleotides; flow cytometry; DNA; adults; females; eggs; genomics; zinc finger motif; proteins; introns; models; invertebrates; insect pests; Hemiptera; Heteroptera; Lygaeidae; phytophagy; developmental biology; RNA interference; legs; genes

Primary article PubAg Handle

Pending citation

  • No

Public Access Level

  • Public

Preferred dataset citation

Vargas Jentzsch, Iris M.; Kovacova, Viera; Stueber, Kurt; Koelzer, Stefan; Panfilio, Kristen A. (2019). Oncopeltus fasciatus hybrid genome assembly 1.0. Ag Data Commons.