posted on 2025-09-16, 14:43; authored by Vladimir Kulyukin, Aleksey Kulyukin, Reagan Hill, Matthew Lister, William G. Meikle, Milagra Weiss, Daniel Coster
<p dir="ltr">In 2014–2022, USDA-ARS Tucson, AZ, by itself and in collaboration with other precision apiculture (PA) research programs, including the PA program at Utah State University, and with several commercial operations, acquired a large reservoir of multi-sensor data, including thousands of frame photographs and sensor measurements, from field experiments with managed honey bee colonies. This reservoir is a loose collection of hive frame photos, CSV files, spreadsheets, and hive inspection text logs. Our project explores and exploits this reservoir and publishes curated subsets of it. This dataset is the first such subset, which we curated in 2024-25 under USDA-NIFA Award 205732 "DSFAS - Exploration and Exploitation of the 2014-2022 USDA-ARS Tucson, AZ Digital Data Reservoir of Field Experiments with Managed Honey Bee Colonies."</p><p dir="ltr">The zipped directory ANNOTATED_HIVE_FRAMES includes 13 image subdirectories with annotated images.</p><p dir="ltr">1) 2013_07_28_CHBRC -- 57 Files</p><p dir="ltr">2) 2014_07_30_12_CHBRC -- 111 Files</p><p dir="ltr">3) 2015_02_11_MAC_RR -- 660 Files</p><p dir="ltr">4) 2016_03_30_HOOPS -- 153 Files</p><p dir="ltr">5) 2017_02_01_SRER_BEAR_CAGE -- 87 Files</p><p dir="ltr">6) 2018_02_13_SRER_SC_complete_3_9_25 -- 195 Files</p><p dir="ltr">7) 2018_04_18_SRER_SC_Methoxy -- 366 Files</p><p dir="ltr">8) 2019_07_11_SRER_BC_Neonic -- 60 Files</p><p dir="ltr">9) 2020_02_27_RR_Hive_Directions -- 36 Files</p><p dir="ltr">10) 2021_06_08_CHBRC_VLAD -- 282 Files</p><p dir="ltr">11) 2021_09_27_RR_ColdStor -- 855 Files</p><p dir="ltr">12) 2021_02_11_CT_ColdStor -- 111 Files</p><p dir="ltr">13) 2014_12_15_50_CHBRC -- 30 Files</p><p dir="ltr">The name of each subfolder includes the year, month, and day on which the frame photos were taken, followed by the location of the apiary where the photos were taken. 
The abbreviations expand as follows:</p><p dir="ltr">CHBRC -- Carl Hayden Bee Research Center</p><p dir="ltr">MAC -- Maricopa Agriculture Center</p><p dir="ltr">RR -- Red Rock Agriculture Center</p><p dir="ltr">HOOPS -- one of the apiaries at CHBRC</p><p dir="ltr">SRER -- Santa Rita Experimental Range</p><p dir="ltr">SC -- Shipping Corrals</p><p dir="ltr">CT -- Cow Town</p><p dir="ltr">Each of the 13 subdirectories has three sub-subdirectories: PNG/, XML/, TXT/.</p><p dir="ltr">PNG/ -- hive frame photos in PNG format;</p><p dir="ltr">XML/ -- XML annotations of the images in PNG/, created with LabelImg;</p><p dir="ltr">TXT/ -- TXT annotations of the images in PNG/ for YOLO training.</p><p dir="ltr">Thus, in each of the 13 folders, each PNG image has two annotation files. For example:</p><p dir="ltr">2020_02_27_RR_Hive_Directions_IMG_2540_VK.PNG</p><p dir="ltr">2020_02_27_RR_Hive_Directions_IMG_2540_VK.xml</p><p dir="ltr">2020_02_27_RR_Hive_Directions_IMG_2540_VK.txt</p><p dir="ltr">Each PNG is annotated for the following categories:</p><p dir="ltr">(1) CappedHoneyCell</p><p dir="ltr">(2) CappedWorkerBroodCell</p><p dir="ltr">(3) EmptyCombCell</p><p dir="ltr">(4) PollenCell</p><p dir="ltr">(5) UncappedNectarCell</p><p dir="ltr">(6) UncappedWorkerLarvaCell</p><p dir="ltr">(7) BeeHiveFrame</p><p dir="ltr">The counts of annotated regions of interest (ROIs) per category are as follows:</p><p dir="ltr">CappedHoneyCell: 19,723</p><p dir="ltr">CappedWorkerBroodCell: 21,456</p><p dir="ltr">EmptyCombCell: 20,655</p><p dir="ltr">PollenCell: 13,406</p><p dir="ltr">UncappedNectarCell: 11,009</p><p dir="ltr">UncappedWorkerLarvaCell: 18,283</p><p dir="ltr">BeeHiveFrame: 1,001</p><p dir="ltr">Each such ROI can be extracted into a separate image and used to train machine learning models.</p><p dir="ltr">The subdirectory SRC/ contains two Python scripts that convert XML to TXT and TXT to XML: xml_to_txt_converter.py and txt_to_xml_converter.py.</p><p 
dir="ltr">USDA_ARZ_DATA_YOLO_19june2025.zip is a 3 GB zip version of these images prepared for YOLO training. It is available at https://usu.box.com/s/dh75xkinwfyl3sqgb9vugy1ahf6z9mrh.</p><p dir="ltr">SRC/ also contains the following Python scripts that we used for training YOLO networks:</p><p dir="ltr">(a) train_valid_split.py -- splits alldata.txt in USDA_ARZ_DATA_YOLO_19june2025.zip into train.txt and valid.txt for YOLO training.</p><p dir="ltr">(b) tune_y8n.py -- tunes YOLOv8-nano</p><p dir="ltr">(c) tune_y8s.py -- tunes YOLOv8-small</p><p dir="ltr">(d) tune_y11n.py -- tunes YOLOv11-nano</p><p dir="ltr">(e) tune_y11s.py -- tunes YOLOv11-small</p><p dir="ltr">The folder METADATA/ contains two files: METADATA.txt and PapersDataSets_DrMeikle.xlsx. These files provide the metadata on the USDA-ARS Tucson, AZ reservoir.</p>
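<p dir="ltr">As a rough illustration of how the XML/ and TXT/ annotations relate, the sketch below converts one LabelImg-style (Pascal VOC) annotation to YOLO lines and back. It is not the code in xml_to_txt_converter.py or txt_to_xml_converter.py. The function names, the sample annotation, and the class-id order (taken from the order in which the categories are listed in this description) are assumptions; the ids actually used in TXT/ may differ.</p>

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: class ids follow the order in which the categories are listed
# in the dataset description; the ids actually used in TXT/ may differ.
CLASSES = [
    "CappedHoneyCell", "CappedWorkerBroodCell", "EmptyCombCell", "PollenCell",
    "UncappedNectarCell", "UncappedWorkerLarvaCell", "BeeHiveFrame",
]

# Hypothetical, minimal annotation in the Pascal VOC layout that LabelImg writes.
SAMPLE_XML = """<annotation>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object>
    <name>PollenCell</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""


def voc_xml_to_yolo_lines(xml_text):
    """Convert one Pascal VOC annotation string into a list of YOLO lines.

    Each YOLO line is "class_id x_center y_center width height", with the
    four box values normalized by the image width and height.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.findtext("width"))
    img_h = float(size.findtext("height"))

    lines = []
    for obj in root.iter("object"):
        class_id = CLASSES.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        x_c = (xmin + xmax) / 2.0 / img_w   # normalized box center, x
        y_c = (ymin + ymax) / 2.0 / img_h   # normalized box center, y
        w = (xmax - xmin) / img_w           # normalized box width
        h = (ymax - ymin) / img_h           # normalized box height
        lines.append(f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines


def yolo_line_to_voc_box(line, img_w, img_h):
    """Convert one YOLO line back to (class_name, (xmin, ymin, xmax, ymax))."""
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:5])
    xmin = round((x_c - w / 2.0) * img_w)
    ymin = round((y_c - h / 2.0) * img_h)
    xmax = round((x_c + w / 2.0) * img_w)
    ymax = round((y_c + h / 2.0) * img_h)
    return CLASSES[class_id], (xmin, ymin, xmax, ymax)


yolo_lines = voc_xml_to_yolo_lines(SAMPLE_XML)
restored = yolo_line_to_voc_box(yolo_lines[0], 1000, 500)
```

<p dir="ltr">Because the YOLO values are normalized, converting a TXT line back to pixel coordinates requires the image width and height, which the TXT file itself does not store; in this sketch they are passed in explicitly.</p>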
Funding
DSFAS: Exploration and Exploitation of the 2014-2022 USDA-ARS Tucson, AZ Digital Data Reservoir of Field Experiments with Managed Honey Bee Colonies
Apiaries in Arizona:
CHBRC -- Carl Hayden Bee Research Center
MAC -- Maricopa Agriculture Center
RR -- Red Rock Agriculture Center
HOOPS -- one of the apiaries at CHBRC
SRER -- Santa Rita Experimental Range
SC -- Shipping Corrals
CT -- Cow Town
ISO Topic Category
environment
farming
National Agricultural Library Thesaurus terms
Agricultural Research Service; Arizona; field experimentation; honey bee colonies; beehives; monitoring; data collection; pollination; artificial intelligence; neural networks; statistics; time series analysis; apiculture; research programs; Utah; photographs; digital database; computer software
OMB Bureau Code
005:18 - Agricultural Research Service
005:20 - National Institute of Food and Agriculture