Ag Data Commons
O157H7_10read_simulated.fasta (72.07 kB)
O157H7_50read_simulated.fasta (417.91 kB)
O157H7_75read_simulated.fasta (535.74 kB)
O157H7_100read_simulated.fasta (749.98 kB)
O157H7_250read_simulated.fasta (2.01 MB)
O157H7_500read_simulated.fasta (3.65 MB)
O157H7_750read_simulated.fasta (5.73 MB)
O157H7_1000read_simulated.fasta (7.67 MB)
O157H7_2500read_simulated.fasta (18.69 MB)
O157H7_5000read_simulated.fasta (37.04 MB)
O157H7_7500read_simulated.fasta (56.27 MB)
O157H7_10000read_simulated.fasta (75.54 MB)
O157H7_50000read_simulated.fasta (375.54 MB)
O157H7_75000reads_simulated.fasta (565.61 MB)
O157H7_100000read_simulated.fasta (752.26 MB)
O157H7_250000read_simulated.fasta (1.83 GB)
O157H7_500000read_simulated.fasta (3.66 GB)
Lmono_10reads_simulated.fasta (71.89 kB)
Lmono_50reads_simulated.fasta (425.35 kB)
Lmono_75reads_simulated.fasta (578.35 kB)
57 files

Data from: Use of long-read sequencing simulators to assess real-world applications for food safety

dataset
posted on 2024-02-28, 21:21 authored by Katrina L. Counihan, Siddhartha Kanrar, Shannon Tilman, Andrew Gehring

Shiga toxin-producing Escherichia coli (STEC) and Listeria monocytogenes are responsible for severe foodborne illnesses in the United States. Current identification methods require at least four days to identify STEC and six days to identify L. monocytogenes. Adoption of long-read, whole genome sequencing for testing could significantly reduce the time needed for identification, but method development costs are high. Therefore, the goal of this project was to use NanoSim-H software to simulate Oxford Nanopore sequencing reads to assess the feasibility of sequencing-based foodborne pathogen detection and guide experimental design. Sequencing reads were simulated for STEC, L. monocytogenes, and a 1:1 combination of STEC and Bos taurus genomes using NanoSim-H. This dataset includes all of the simulated reads generated by the project in FASTA format. The reads can be analyzed directly or used to test bioinformatic pipelines.
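As a starting point for testing a pipeline against these files, the following is a minimal sketch of a FASTA summary in plain Python. It is not part of the authors' workflow; the function names `read_fasta` and `summarize` are illustrative, and the in-memory records are made up rather than taken from the dataset. It assumes standard FASTA layout, with a header line starting with `>` followed by one or more sequence lines.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them until the next header.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle: TextIO) -> dict:
    """Return read count, total bases, mean and maximum read length."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    total = sum(lengths)
    return {
        "reads": len(lengths),
        "total_bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "max_length": max(lengths, default=0),
    }


# Tiny in-memory example (hypothetical records, not from the dataset).
example = io.StringIO(">read_1\nACGT\nAC\n>read_2\nGGGTTT\n")
print(summarize(example))
```

To run it on a downloaded file, open the file in text mode and pass the handle, for example `with open("O157H7_10read_simulated.fasta") as fh: print(summarize(fh))`. Only read lengths are held in memory, so the larger files in the set should be processed without loading every sequence at once.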

Funding

USDA-ARS: 8072-42000-093-00D

USDA-ARS: 0500-00093-001-00-D

History

Data contact name

Counihan, Katrina

Data contact email

katrina.counihan@usda.gov

Publisher

Ag Data Commons

Temporal Extent Start Date

2021-11-01

Temporal Extent End Date

2022-06-30

Theme

  • Not specified

Geographic Coverage

{"type":"FeatureCollection","features":[{"geometry":{"type":"Point","coordinates":[-75.18665313721,40.077810523208]},"type":"Feature","properties":{}}]}

Geographic location - description

600 E Mermaid Ln, Wyndmoor, PA 19038

ISO Topic Category

  • health

National Agricultural Library Thesaurus terms

food safety; Shiga toxin-producing Escherichia coli; Listeria monocytogenes; United States; sequence analysis; computer software; nanopores; food pathogens; microbial detection; experimental design; cattle; genome; data collection; bioinformatics

OMB Bureau Code

  • 005:18 - Agricultural Research Service

OMB Program Code

  • 005:040 - National Research

ARS National Program Number

  • 108

Pending citation

  • No

Related material without URL

Counihan, K., S. Kanrar, S. Tilman, and A. Gehring. (2024) Evaluation of long-read sequencing simulators to assess real-world applications for food safety. Foods 13(1), 16. https://doi.org/10.3390/foods13010016

Public Access Level

  • Public

Preferred dataset citation

Counihan, Katrina L.; Kanrar, Siddhartha; Tilman, Shannon; Gehring, Andrew (2023). Data from: Use of long-read sequencing simulators to assess real-world applications for food safety. Ag Data Commons. https://doi.org/10.15482/USDA.ADC/1529447