Ag Data Commons

Data and Images from: Numerical Signature Dataset of Thoracic and Elytral Fragments from Curculionidae and Tenebrionidae Beetles for AI-Based Species Identification

dataset
posted on 2025-07-28, 17:17 authored by Alison Gerken, Ronnie Serfa Juan
<p dir="ltr">This dataset presents curated and annotated 256×256 pixel image fragments of thoracic and elytral regions from six economically significant species within the beetle families Curculionidae and Tenebrionidae. These anatomical fragments were extracted from high-resolution trap images to support numerical signature generation for species-level classification. The dataset enables researchers to develop and validate image-based machine learning pipelines for pest identification in stored product environments. All samples are standardized and include metadata indicating species, anatomical region, and imaging parameters. This work contributes to advancing computational entomology, post-harvest pest management, and real-time automated monitoring systems.</p>
<p dir="ltr">Insect pest detection in post-harvest storage environments is crucial for mitigating economic losses and ensuring food security. Traditional identification approaches are labor-intensive and rely on full-body morphological features. This dataset introduces a fragment-based classification framework using annotated thoracic and elytral sections of stored-product beetles. Species included are:</p>
<ul><li><i>Sitophilus zeamais</i></li><li><i>Sitophilus oryzae</i></li><li><i>Sitophilus granarius</i></li><li><i>Tribolium castaneum</i></li><li><i>Latheticus oryzae</i></li><li><i>Tribolium confusum</i></li></ul>
<p dir="ltr">Folder abbreviations for the insects are <i>Sitophilus zeamais</i> (mw); <i>Sitophilus oryzae</i> (rw); <i>Sitophilus granarius</i> (ww); <i>Tribolium castaneum</i> (tc); <i>Latheticus oryzae</i> (lo); and <i>Tribolium confusum</i> (tco).</p>
<p dir="ltr">Key numerical signature descriptors were computed for each fragment:</p>
<ul><li><b>Skewness</b> – Asymmetry of pixel intensity distribution</li><li><b>Kurtosis</b> – Sharpness of contour distributions</li><li><b>Entropy</b> – Texture complexity</li><li><b>Standard Deviation</b> – Pixel intensity variation</li></ul>
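As an illustration of the four descriptors listed above, the following Python sketch computes them for a single grayscale fragment. It is not the authors' implementation. The function name `fragment_signature`, the use of population moments, non-excess (Pearson) kurtosis, and Shannon entropy in bits over the intensity histogram are all assumptions, since the description does not give the exact formulas. Only the folder abbreviation mapping is taken directly from the description.

```python
import math
from collections import Counter

# Folder abbreviations as listed in the dataset description.
FOLDER_TO_SPECIES = {
    "mw": "Sitophilus zeamais",
    "rw": "Sitophilus oryzae",
    "ww": "Sitophilus granarius",
    "tc": "Tribolium castaneum",
    "lo": "Latheticus oryzae",
    "tco": "Tribolium confusum",
}


def fragment_signature(pixels):
    """Compute four descriptors for a flat sequence of grayscale intensities.

    Hypothetical helper: the formulas below are common textbook definitions,
    assumed here because the dataset description does not specify them.
    """
    values = list(pixels)
    n = len(values)
    if n == 0:
        raise ValueError("fragment contains no pixels")

    mean = sum(values) / n
    # Central moments (population form, dividing by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)

    if m2 > 0:
        skewness = m3 / std ** 3   # asymmetry of the intensity distribution
        kurtosis = m4 / m2 ** 2    # non-excess kurtosis (3.0 for a Gaussian)
    else:
        # A perfectly uniform fragment has undefined standardized moments.
        skewness = kurtosis = float("nan")

    # Shannon entropy (bits) of the empirical intensity histogram.
    counts = Counter(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    return {
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
        "std": std,
    }


if __name__ == "__main__":
    # A synthetic 256x256 gradient stands in for a real fragment (made-up data).
    demo = [(r + c) % 256 for r in range(256) for c in range(256)]
    print(FOLDER_TO_SPECIES["mw"], fragment_signature(demo))
```

Loading an image file into a flat list of intensities is left out because the file format and folder layout are not specified in this record.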

Funding

Department of Energy DE-SC0014664

USDA-ARS: 3020-43000-034-00-D

Data contact name

Gerken, Alison R.

Data contact email

alison.gerken@usda.gov

Publisher

Ag Data Commons

Intended use

Intended for detection and identification of insects and insect fragments that may infest stored products, with a focus on three Curculionidae species and three Tenebrionidae species. The provided files are a subset of the full dataset used for model training, testing, and validation.

Use limitations

This is only a representative subset of the images used for the full model.

Temporal Extent Start Date

2025-04-02

Frequency

  • notPlanned

Theme

  • Non-geospatial

ISO Topic Category

  • biota

National Agricultural Library Thesaurus terms

data collection; species identification; image analysis; researchers; artificial intelligence; pipelines; pest identification; metadata; entomology; pest management; automation; monitoring; postharvest storage; financial economics; food security; storage insects; Sitophilus zeamais; Sitophilus oryzae; Sitophilus granarius; Tribolium castaneum; Latheticus oryzae; Tribolium confusum; asymmetry; entropy; texture; standard deviation

OMB Bureau Code

  • 005:18 - Agricultural Research Service

OMB Program Code

  • 005:040 - National Research

ARS National Program Number

  • 304

ARIS Log Number

427340

Pending citation

  • Yes

Public Access Level

  • Public