Posted on 2025-08-19, 02:28. Authored by Catherine Linnen, Danielle Herrig, Ryan Ridenbaugh, Kim Vertacnik, Kathryn Everson, Sheina Sim, Scott Geib, David Weisrock
<p>Rapidly evolving taxa are excellent models for understanding the mechanisms that give rise to biodiversity. However, developing an accurate historical framework for comparative analysis of such lineages remains a challenge due to ubiquitous incomplete lineage sorting and introgression. Here, we use a whole-genome alignment, multiple locus-sampling strategies, and summary-tree and SNP-based species-tree methods to infer a species tree for eastern North American <em>Neodiprion</em> species, a clade of pine-feeding sawflies (Order: Hymenoptera; Family: Diprionidae). We recovered a well-supported species tree that, except for three uncertain relationships, was robust to different strategies for analyzing whole-genome data. Nevertheless, underlying gene-tree discordance was high. To understand this genealogical variation, we used multiple linear regression to model site concordance factors estimated in 50-kb windows as a function of several genomic predictor variables. We found that site concordance factors tended to be higher in regions of the genome with more parsimony-informative sites, fewer singletons, less missing data, lower GC content, more genes, lower recombination rates, and lower D-statistics (less introgression). Together, these results suggest that incomplete lineage sorting, introgression, and genotyping error all shape the genomic landscape of gene-tree discordance in <em>Neodiprion</em>. More generally, our findings demonstrate how combining phylogenomic analysis with knowledge of local genomic features can reveal mechanisms that produce topological heterogeneity across genomes.</p><p>All files are in either FASTA or NEXUS format. FASTA is a standard text format for nucleotide sequences. FASTA genome files are provided for each <em>Neodiprion</em> species. 
Using freely available scripts (<a href="https://github.com/LinnenLab/Herrig_etal_NeodiprionPhylogeny">https://github.com/LinnenLab/Herrig_etal_NeodiprionPhylogeny</a>), these genome files can be used to produce window-based and gene-based datasets in NEXUS format. NEXUS is a standard format for character data used in phylogenetic analysis, and the resulting files can serve as input for many different phylogenetic programs.</p>
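<p>To illustrate what a window-based view of a FASTA file looks like, the following is a minimal Python sketch. It is not taken from the authors' scripts, and the linked repository remains the authoritative source for how the datasets were built. The function names <code>read_fasta</code> and <code>windows</code> are hypothetical, and the default window size of 50,000 bases is an assumption that simply mirrors the 50-kb windows mentioned in the abstract.</p>

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} dictionary.

    The record ID is the first whitespace-delimited token after '>'.
    (Hypothetical helper for illustration; not part of the authors' scripts.)
    """
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def windows(seq: str, size: int = 50_000) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, subsequence) for consecutive non-overlapping windows.

    Coordinates are 0-based and half-open; the last window may be shorter.
    The 50-kb default is an assumption borrowed from the abstract.
    """
    for start in range(0, len(seq), size):
        end = min(start + size, len(seq))
        yield start, end, seq[start:end]


if __name__ == "__main__":
    demo = [">scaffold_1 demo", "ACGTAC", "GT", ">scaffold_2", "GGCC"]
    for record_id, sequence in read_fasta(demo).items():
        for start, end, sub in windows(sequence, size=3):
            print(record_id, start, end, sub)
```

<p>For a real genome file, an open file handle can be passed in place of the list of lines, since <code>read_fasta</code> only iterates over its input. Building aligned, multi-species NEXUS datasets involves additional steps that this sketch does not attempt to reproduce.</p>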