<p>This dataset contains all the data and code needed to reproduce the analyses in the manuscript:</p>
<p>Penn, H. J., & Read, Q. D. (2023). Stem borer herbivory dependent on interactions of sugarcane variety, associated traits, and presence of prior borer damage. Pest Management Science. <a href="https://doi.org/10.1002/ps.7843">https://doi.org/10.1002/ps.7843</a></p>
<p>Included are two .Rmd notebooks containing all code required to reproduce the analyses in the manuscript, two .html files of rendered notebook output, three .csv data files that are loaded and analyzed, and a .zip file of intermediate R objects that are generated during the model fitting and variable selection process.</p>
<h3>Notebook files</h3>
<ul>
<li><code>01_boring_analysis.Rmd</code>: This RMarkdown notebook contains R code to read and process the raw data, create exploratory data visualizations and tables, fit a Bayesian generalized linear mixed model, extract output from the statistical model, and create graphs and tables summarizing the model output including marginal means for different varieties and contrasts between crop years. </li>
<li><code>02_trait_covariate_analysis.Rmd</code>: This RMarkdown notebook contains R code to read raw variety-level trait data, perform feature selection based on correlations between traits, fit another generalized linear mixed model using traits as predictors, and create graphs and tables from that model output including marginal means by categorical trait and marginal trends by continuous trait.</li>
</ul>
<h3>HTML files</h3>
<p>These HTML files contain the rendered output of the two RMarkdown notebooks. They were generated by Quentin Read on 2023-08-30 and 2023-08-15.</p>
<ul>
<li><code>01_boring_analysis.html</code></li>
<li><code>02_trait_covariate_analysis.html</code></li>
</ul>
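<p>As a minimal sketch, assuming the <strong>rmarkdown</strong> package and the packages each notebook loads are installed, the HTML output could be regenerated from R along the following lines. The helper name <code>render_if_present</code> is hypothetical, and the notebooks are assumed to sit in the current working directory. Rendering refits the statistical models unless the intermediate R objects are available, so it may take a long time.</p>
<pre><code>
# Hypothetical helper: render a notebook only if the file is present.
render_if_present &lt;- function(rmd_file) {
  if (!file.exists(rmd_file)) {
    message("Notebook not found: ", rmd_file)
    return(invisible(NULL))
  }
  # rmarkdown::render() knits the notebook using the output format
  # declared in its YAML header and returns the output file path.
  rmarkdown::render(rmd_file)
}

notebooks &lt;- c("01_boring_analysis.Rmd", "02_trait_covariate_analysis.Rmd")
rendered &lt;- lapply(notebooks, render_if_present)
</code></pre>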
<h3>CSV data files</h3>
<p>These files contain the raw data. To recreate the notebook output, place the CSV files at the file path <code>project/data/</code> relative to where the notebook is run. Columns are described below.</p>
<ul>
<li><code>BoredInternodes_26April2022_no format.csv</code>: primary data file with sugarcane borer (SCB) damage
<ul>
<li>Columns A-C are the year, date, and location. All location values are the same.</li>
<li>Column D identifies which experiment the data point was collected from.</li>
<li>Column E, <code>Stubble</code>, indicates the crop year (plant cane or first stubble)</li>
<li>Column F indicates the variety</li>
<li>Column G indicates the plot (integer ID)</li>
<li>Column H indicates the stalk within each plot (integer ID)</li>
<li>Column I, <code># Internodes</code>, indicates how many internodes were on the stalk</li>
<li>Columns J-AM are numbered 1-30 and indicate whether SCB damage was observed on that internode (0 if no, 1 if yes, blank cell if that internode was not present on the stalk)</li>
<li>Column AN indicates the experimental treatment for those rows that are part of a manipulative experiment</li>
<li>Column AO contains notes</li>
</ul></li>
<li><code>variety_lookup.csv</code>: summary information for the 16 varieties analyzed in this study
<ul>
<li>Column A is the variety name</li>
<li>Column B is the total number of stalks assessed for SCB damage for that variety across all years</li>
<li>Column C is the number of years that variety is present in the data</li>
<li>Column D, <code>Stubble</code>, indicates which crop years were sampled for that variety ("PC" if only plant cane, "PC, 1S" if there are data for both plant cane and first stubble crop years)</li>
<li>Column E, <code>SCB resistance</code>, is a categorical designation with four values: susceptible, moderately susceptible, moderately resistant, resistant</li>
<li>Column F is the literature reference for the SCB resistance value</li>
</ul></li>
<li><code>Select_variety_traits_12Dec2022.csv</code>: variety-level traits for the 16 varieties analyzed in this study
<ul>
<li>Column A is the variety name</li>
<li>Column B is the SCB resistance designation as an integer</li>
<li>Column C is the categorical SCB resistance designation (see above)</li>
<li>Columns D-I are continuous traits from year 1 (plant cane), including sugar (Mg/ha), biomass or aboveground cane production (Mg/ha), TRS or theoretically recoverable sugar (g/kg), stalk weight of individual stalks (kg), stalk population density (stalks/ha), and fiber content of stalk (percent).</li>
<li>Columns J-O are the same continuous traits from year 2 (first stubble)</li>
<li>Columns P-V are categorical traits (in some cases continuous traits binned into categories): maturity timing, amount of stalk wax, amount of leaf sheath wax, amount of leaf sheath hair, tightness of leaf sheath, whether leaf sheath becomes necrotic with age, and amount of collar hair.</li>
</ul></li>
</ul>
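<p>The wide layout of the primary data file (one row per stalk, one column per internode) can be illustrated with a small sketch in base R. The toy data frame below is hypothetical and uses only three internode columns instead of 30. The column names <code>Variety</code> and <code>Stalk</code> are assumptions for illustration; only <code># Internodes</code> and the numbered internode columns follow the description above. This shows one way to summarize damage per stalk and is not necessarily the processing used in the notebooks.</p>
<pre><code>
# Toy stand-in for the primary data file (hypothetical values).
# check.names = FALSE keeps names such as "# Internodes" and "1" unchanged.
toy &lt;- data.frame(
  Variety = c("A", "B"),
  Stalk = c(1, 1),
  `# Internodes` = c(3, 2),
  `1` = c(0, 1),
  `2` = c(1, 0),
  `3` = c(0, NA),  # blank cell in the CSV: internode not present on the stalk
  check.names = FALSE
)

# With the real file, the original column names could be kept the same way:
# raw &lt;- read.csv(
#   file.path("project", "data", "BoredInternodes_26April2022_no format.csv"),
#   check.names = FALSE
# )

# Internode columns are the ones whose names are purely numeric.
internode_cols &lt;- grep("^[0-9]+$", names(toy), value = TRUE)

# Per-stalk counts of damaged and observed internodes, and their ratio.
toy$n_bored &lt;- rowSums(toy[, internode_cols], na.rm = TRUE)
toy$n_observed &lt;- rowSums(!is.na(toy[, internode_cols]))
toy$prop_bored &lt;- toy$n_bored / toy$n_observed
</code></pre>
<p>In this toy example the number of non-missing internode cells matches <code># Internodes</code> for each stalk, which is a simple consistency check that could also be run on the real data.</p>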
<h3>ZIP file of intermediate R objects</h3>
<p>To recreate the notebook output without having to run computationally intensive steps, unzip the archive. The fitted model objects should be at the file path <code>project/</code> relative to where the notebook is run.</p>
<ul>
<li><code>intermediate_R_objects.zip</code>: This file contains intermediate R objects generated during the model fitting and variable selection process. Use these objects to reproduce the final output, including figures and tables, without refitting the computationally intensive statistical models.
<ul>
<li><code>binom_fit_intxns_updated_only5yrs.rds</code>: fitted <strong>brms</strong> model object for the main statistical model</li>
<li><code>binom_fit_reduced.rds</code>: fitted <strong>brms</strong> model object for the trait covariate analysis </li>
<li><code>marginal_trends.RData</code>: calculated values of the estimated marginal trends with respect to year and previous damage</li>
<li><code>marginal_trend_trs.rds</code>: calculated values of the estimated marginal trend with respect to TRS</li>
<li><code>marginal_trend_fib.rds</code>: calculated values of the estimated marginal trend with respect to fiber content</li>
</ul></li>
</ul>
<p>Resources in this dataset:</p>
<ul>
<li>Resource Title: Sugarcane borer damage data by internode, 1993-2021. File Name: <code>BoredInternodes_26April2022_no format.csv</code></li>
<li>Resource Title: Summary information for the 16 sugarcane varieties analyzed. File Name: <code>variety_lookup.csv</code></li>
<li>Resource Title: Variety-level traits for the 16 sugarcane varieties analyzed. File Name: <code>Select_variety_traits_12Dec2022.csv</code></li>
<li>Resource Title: RMarkdown notebook 2: trait covariate analysis. File Name: <code>02_trait_covariate_analysis.Rmd</code></li>
<li>Resource Title: Rendered HTML output of notebook 2. File Name: <code>02_trait_covariate_analysis.html</code></li>
<li>Resource Title: RMarkdown notebook 1: main analysis. File Name: <code>01_boring_analysis.Rmd</code></li>
<li>Resource Title: Rendered HTML output of notebook 1. File Name: <code>01_boring_analysis.html</code></li>
<li>Resource Title: Intermediate R objects. File Name: <code>intermediate_R_objects.zip</code></li>
</ul>
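<p>As a hedged sketch of how the objects in <code>intermediate_R_objects.zip</code> might be extracted and loaded, the base R code below assumes that the archive sits in the current working directory and stores its files at the top level, neither of which is confirmed here. The helper name <code>read_rds_if_present</code> is hypothetical. Reading the fitted model objects requires only base R, but working with them (for example, summarizing them) would need the <strong>brms</strong> package.</p>
<pre><code>
# Hypothetical helper: read an .rds file if it exists, otherwise return NULL.
read_rds_if_present &lt;- function(path) {
  if (file.exists(path)) readRDS(path) else NULL
}

zip_file &lt;- "intermediate_R_objects.zip"
if (file.exists(zip_file)) {
  # Extract into project/ so the notebooks can find the fitted objects.
  unzip(zip_file, exdir = "project")
}

main_fit &lt;- read_rds_if_present(
  file.path("project", "binom_fit_intxns_updated_only5yrs.rds")
)
trait_fit &lt;- read_rds_if_present(
  file.path("project", "binom_fit_reduced.rds")
)

# .RData files restore objects under their saved names;
# load() returns the names of the objects it restored.
trends_file &lt;- file.path("project", "marginal_trends.RData")
if (file.exists(trends_file)) {
  restored_names &lt;- load(trends_file)
}
</code></pre>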
<h3>Funding</h3>
<p>Agricultural Research Service, 6052-21000-017-000-D</p>
<p>The data and code provided here will reproduce all analyses presented in the manuscript, including processing the raw data into analysis-ready format, fitting statistical models, doing variable selection, extracting output from the models, and creating graphs and tables.</p>
<h3>Use limitations</h3>
<p>The R code is only intended to analyze the data provided and would need to be modified to work with other similar datasets.</p>
<p>Penn, Hannah J.; Read, Quentin D. (2023). Data and code from: Stem borer herbivory dependent on interactions of sugarcane variety, associated traits, and presence of prior borer damage. Ag Data Commons. <a href="https://doi.org/10.15482/USDA.ADC/1529826">https://doi.org/10.15482/USDA.ADC/1529826</a></p>