A harmonized dataset of ground-mounted solar energy in the US with enhanced metadata
dataset
posted on 2025-08-20, 02:51 authored by Jacob Stid, Anthony Kendall, Jeremy Rapp, James Bingaman, Annick Anctil, David Hyndman
<em>Ground-Mounted Solar Energy in the United States (GM-SEUS)</em><p>Abstract:</p><p>Solar energy generating systems are critical components of our expanding energy infrastructure, yet available datasets remain incomplete or not publicly available, particularly at the sub-array level. Combining the best open-access datasets in the US with image analysis on freely available remotely-sensed imagery, we present the Ground-Mounted Solar Energy in the United States (GM-SEUS) dataset, a harmonized, open-access geospatial and temporal repository of solar energy arrays and panel-rows. GM-SEUS v1.0 includes over <strong>15,000 commercial- and utility-scale ground-mounted solar photovoltaic and concentrating solar energy arrays</strong> (<strong>186 GW</strong>) covering <strong>2,950 km²</strong> and includes <strong>2.92 million unique solar panel-rows</strong> (<strong>466 km²</strong>). We use these newly compiled and delineated solar arrays and panel-rows to harmonize and independently estimate value-added attributes to existing datasets, including installation year, azimuth, mount technology, panel-row area and dimensions, inter-row spacing, ground cover ratio, tilt, and installed capacity. By estimating and harmonizing these attributes of the distributed US solar energy landscape, GM-SEUS supports diverse applications in renewable energy modeling, ecosystem service assessment, and infrastructure planning. </p><p>Technical info:</p><p>This is the data repository for creating and maintaining the Ground-Mounted Solar Energy in the United States (GM-SEUS) spatiotemporal dataset of solar arrays and panel-rows, which uses existing datasets, machine learning, and object-based image analysis to enhance existing sources. The contents of this repository are described briefly here; the attached data README provides more detailed descriptions. The source GitHub repository for generating this dataset can be found <a href="https://github.com/stidjaco/GMSEUS">here</a>. 
A paper has been submitted describing this dataset.</p>
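<p>As a quick sanity check on the headline figures in the abstract, the sketch below derives a few dataset-wide averages. It uses only the rounded totals quoted above; "over 15,000" arrays is treated as exactly 15,000, so the per-array mean is approximate. The variable names are illustrative and are not GM-SEUS field names.</p>

```python
# Rounded headline totals quoted in the GM-SEUS v1.0 abstract.
N_ARRAYS = 15_000          # "over 15,000" arrays; treated as exact here (assumption)
CAPACITY_GW = 186.0        # total installed capacity
ARRAY_AREA_KM2 = 2_950.0   # total array footprint
N_PANEL_ROWS = 2.92e6      # unique panel-rows
PANEL_AREA_KM2 = 466.0     # total panel-row area

# Dataset-wide averages derived from the totals above.
mean_capacity_mw = CAPACITY_GW * 1_000 / N_ARRAYS        # MW per array
capacity_density = CAPACITY_GW * 1_000 / ARRAY_AREA_KM2  # MW per km^2 of array area
panel_area_fraction = PANEL_AREA_KM2 / ARRAY_AREA_KM2    # panel-row area / array area
mean_row_area_m2 = PANEL_AREA_KM2 * 1e6 / N_PANEL_ROWS   # m^2 per panel-row

print(f"mean capacity per array: {mean_capacity_mw:.1f} MW")
print(f"capacity density:        {capacity_density:.1f} MW/km^2")
print(f"panel-row area fraction: {panel_area_fraction:.3f}")
print(f"mean panel-row area:     {mean_row_area_m2:.1f} m^2")
```

<p>The panel-row area fraction is a dataset-wide ratio only. It should not be read as the per-array ground cover ratio attribute, since panel-rows may not be delineated for every array.</p>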
<p>This is the initial release of GM-SEUS (version 1.0). All input datasets and solar panel-row delineation results are current through December 11th, 2024.</p>
<p><strong>Primary Repository Contents Include: </strong></p>
<p><em><strong>GMSEUS_Arrays_Final</strong></em>: Final array dataset containing over 15,000 array boundaries, compiled from existing datasets and enhanced by applying the buffer-dissolve-erode technique to GM-SEUS panel-rows, with all array-level attributes (CRS: ESRI:102003). Formats: GeoPackage, shapefile, and comma-separated values</p>
<p><em><strong>GMSEUS_Panels_Final</strong></em>: Final panel-row dataset containing 2.92 million boundaries, combining panel-rows from existing datasets with newly delineated GM-SEUS panel-rows, with all panel-row-level attributes (CRS: ESRI:102003). Formats: GeoPackage, shapefile, and comma-separated values</p>
<p><em><strong>GMSEUS_NAIP_Arrays</strong></em>: All array boundaries created by applying the buffer-dissolve-erode method to newly delineated (NAIP) GM-SEUS panel-rows (CRS: ESRI:102003). Formats: GeoPackage, shapefile, and comma-separated values</p>
<p><em><strong>GMSEUS_NAIP_Panels</strong></em>: All newly delineated panel-row boundaries (CRS: ESRI:102003). Formats: GeoPackage, shapefile, and comma-separated values</p>
<p><em><strong>GMSEUS_NAIP_PanelsNoQAQC</strong></em>: All newly delineated panel-rows from NAIP imagery, without any quality control (CRS: ESRI:102003). Formats: GeoPackage, shapefile, and comma-separated values</p>
<p><em><strong>NAIPtrainRF</strong></em>: Training dataset of 12,000 NAIP points (2,000 per class) with class values, spectral index values, the year of the NAIP imagery accessed, and point coordinates (CRS: WGS84). Format: comma-separated values</p>
<p><em><strong>NAIPclassifyRF</strong></em>: Random forest classifier trees and weights, as output by the Google Earth Engine classifier. Format: comma-separated values</p>
<p><em><strong>LabeledImages</strong></em>: Directory containing image and mask subdirectories with ~17,500 input and target images for deep learning pattern recognition applications. Format: GeoTIFF</p>
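<p>The buffer-dissolve-erode step named in the array entries above can be illustrated with a minimal sketch. This is a raster analogue (morphological closing on grid cells) written in plain Python, not the repository's own vector implementation: the function names, the toy panel-row layout, and the buffer radius of one cell are all assumptions made for illustration.</p>

```python
def dilate(cells, r):
    """Buffer: grow a set of (row, col) cells by r cells in every direction."""
    return {(y + dy, x + dx)
            for (y, x) in cells
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)}


def erode(cells, r):
    """Erode: keep cells whose whole (2r+1) x (2r+1) neighbourhood is in the set."""
    return {(y, x) for (y, x) in cells
            if all((y + dy, x + dx) in cells
                   for dy in range(-r, r + 1)
                   for dx in range(-r, r + 1))}


def buffer_dissolve_erode(panel_rows, r):
    """Merge nearby panel-rows into one footprint (illustrative only)."""
    buffered = [dilate(row, r) for row in panel_rows]  # buffer each panel-row
    dissolved = set().union(*buffered)                 # dissolve overlapping buffers
    return erode(dissolved, r)                         # shrink back by the same distance


# Three hypothetical panel-rows: 1 cell tall, 6 cells wide, separated by 1-cell gaps.
panel_rows = [{(y, x) for x in range(6)} for y in (0, 2, 4)]
array_cells = buffer_dissolve_erode(panel_rows, r=1)
panel_cells = set().union(*panel_rows)

print(len(panel_cells), "panel cells inside", len(array_cells), "array cells")
```

<p>In this toy layout the gaps between rows are filled, so the three rows become a single array footprint that still contains every original panel cell. A buffer radius that is too small would leave the rows as separate footprints.</p>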
<p><strong>Disclaimer: </strong></p>
<p>This dataset provides a broad characterization of solar array design practices. Any characterization of solar array design and management derived from remote sensing imagery should be considered with extreme scrutiny given the limitations of such approaches. While our work fills a critical data gap and compiles and enhances existing high-fidelity datasets, the design practices reported here are subject to uncertainty and should not be used to represent actual conditions at individual sites. No warranty is expressed or implied regarding accuracy, completeness, or fitness for a specific purpose. We publish this dataset as open access to support the broader science community, policy makers, and stakeholders in addressing questions about the existing renewable energy landscape. We do not consent to this data being used to target, identify, or make claims about individual arrays, properties, or entities; any such use is strictly prohibited.</p>
Funding
United States Department of Agriculture: 2018-67003-27406
remote sensing; class; value added; image analysis; infrastructure; issues and policy; stakeholders; metadata; Internet; data collection; energy; quality control; shapefile; ecosystem services; renewable energy sources; uncertainty; solar energy