Data from: A Community Resource for Exploring and Utilizing Genetic Diversity in the USDA Pea Single Plant Plus Collection
Included in this dataset are SNP and fasta data for the Pea Single Plant Plus Collection (PSPPC) and the PSPPC augmented with 25 P. fulvum accessions.
These 6 datasets can be roughly divided into two groups. Group 1 consists of three datasets labeled PSPPC which refer to SNP data pertaining to the USDA Pea Single Plant Plus Collection. Group 2 consists of three datasets labeled PSPPC + P. fulvum which refer to SNP data pertaining to the USDA PSPPC with 25 accessions of Pisum fulvum added. SNPs for each of these groups were called independently; therefore SNP names that are shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus.
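Because the two groups were called independently, one way to guard against accidental merging is to qualify every SNP name with its group before combining tables. The following is a minimal sketch only; the function name, the group labels, and the SNP names are hypothetical and are not taken from the data files.

```python
def qualify_snp_names(snp_names, group):
    """Prefix each SNP name with its dataset group so that names from
    independently called SNP sets cannot be mistaken for the same locus."""
    return [f"{group}:{name}" for name in snp_names]


# Hypothetical SNP names, for illustration only.
psppc = qualify_snp_names(["snp_1", "snp_2"], "PSPPC")
fulvum = qualify_snp_names(["snp_1", "snp_3"], "PSPPC+fulvum")

# The raw name "snp_1" occurs in both groups, but the qualified names never collide.
shared = set(psppc) & set(fulvum)
print(shared)
```

Keeping the group label attached to the name makes any cross-group join fail visibly instead of silently matching unrelated loci.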
For analysis, SNP data is available in two widely used formats: hapmap and vcf. These formats can be successfully loaded into TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel). Explanations of fields (columns) in the VCF files are contained within commented (##) rows at the top of the file.
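The field explanations in the commented (##) rows can also be read without TASSEL. The sketch below collects the leading `##` lines from any iterable of text lines; the function name and the example lines are made up for illustration and do not reproduce the contents of the actual VCF files.

```python
def vcf_meta_lines(lines):
    """Return the leading '##' meta-information lines of a VCF file."""
    meta = []
    for line in lines:
        if not line.startswith("##"):
            break  # the '#CHROM' column header or the first record ends the meta block
        meta.append(line.rstrip("\n"))
    return meta


# Tiny made-up VCF excerpt, for illustration only.
example = [
    "##fileformat=VCFv4.0\n",
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample_1\n",
    "0\t0\tsnp_1\tA\tG\t.\tPASS\t.\tGT\t0/1\n",
]
for entry in vcf_meta_lines(example):
    print(entry)
```

With a downloaded file, the same function can be given an open file handle, for example `with open("PSPPC.vcf.txt") as handle: vcf_meta_lines(handle)`.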
Descriptions of the first 11 columns in the hapmap file are as follows:
- rs#- Name of locus (i.e. SNP name)
- alleles- The nucleotides observed for each allele at the locus
- chrom- Irrelevant for these datasets, since markers are unordered.
- pos- Irrelevant for these datasets, since markers are unordered.
- strand- Irrelevant for these datasets, since markers are unordered
- assembly#- required field for hapmap format. NA for these datasets
- center- required field for hapmap format. NA for these datasets
- protLSID- required field for hapmap format. NA for these datasets
- assayLSID- required field for hapmap format. NA for these datasets
- panel- required field for hapmap format. NA for these datasets
- QCcode- required field for hapmap format. NA for these datasets
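The column layout above can be sketched in code as follows. This is a minimal illustration that assumes the file is tab-delimited, with one header row and the sample (accession) columns following the 11 fixed columns; the function name, the accession names, and the genotype values in the example are hypothetical.

```python
HAPMAP_FIXED_COLUMNS = [
    "rs#", "alleles", "chrom", "pos", "strand", "assembly#",
    "center", "protLSID", "assayLSID", "panel", "QCcode",
]


def parse_hapmap(lines):
    """Split tab-delimited hapmap text into sample names and per-SNP genotype calls."""
    rows = iter(lines)
    header = next(rows).rstrip("\n").split("\t")
    n_fixed = len(HAPMAP_FIXED_COLUMNS)
    samples = header[n_fixed:]  # everything after the 11 fixed columns
    genotypes = {}
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < n_fixed:
            continue  # skip blank or truncated lines
        snp_name, alleles = fields[0], fields[1]
        genotypes[snp_name] = {
            "alleles": alleles,
            "calls": dict(zip(samples, fields[n_fixed:])),
        }
    return samples, genotypes


# Made-up two-sample excerpt, for illustration only.
example = [
    "\t".join(HAPMAP_FIXED_COLUMNS + ["acc_1", "acc_2"]) + "\n",
    "snp_1\tA/G\t0\t0\t+\tNA\tNA\tNA\tNA\tNA\tNA\tA\tG\n",
]
samples, genotypes = parse_hapmap(example)
print(samples)
print(genotypes["snp_1"]["calls"])
```

The parser relies on column position rather than on the header labels, so it ignores the chrom, pos, and strand values, which carry no information for these unordered markers.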
The fasta sequences containing the SNPs are also available for such downstream applications as development of primers for platform-specific markers.
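As a starting point for primer design or similar work, the FASTA files can be read into a simple lookup table. The sketch below is an assumption-laden example: the record headers in the real files are not described here, so the header names and sequences shown are invented, and the function name is hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA text into a dict mapping each record header to its sequence."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequences may wrap over several lines
    return {header: "".join(parts) for header, parts in records.items()}


# Invented records, for illustration only.
example = [
    ">snp_1_allele_1\n",
    "ACGTAC\n",
    "GTTA\n",
    ">snp_1_allele_2\n",
    "ACGTACGTTG\n",
]
sequences = read_fasta(example)
print(sequences["snp_1_allele_1"])
```

If two records share a header, the later one replaces the earlier one in this sketch, so it is worth checking that headers are unique before relying on the result.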
For more information about this dataset, contact Clarice Coyne at Clarice.Coyne@usda.gov or coynec@wsu.edu.
Resources in this dataset:
Resource Title: PSPPC SNPs in hapmap format.
File Name: PSPPC.hmp.txt
Resource Description: 66591 unanchored SNPs for the PSPPC in hapmap format
Resource Software Recommended: TASSEL, URL: http://www.maizegenetics.net/tassel
Resource Title: PSPPC SNP FASTA Sequences.
File Name: PSPPC.fa.txt
Resource Description: FASTA sequences for each allele of the PSPPC SNP dataset
Resource Title: PSPPC + P. fulvum SNPs in hapmap format.
File Name: PSPPC+fulvums.hmp.txt
Resource Description: 67400 SNPs from the PSPPC augmented with 25 P. fulvum accessions in hapmap format. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Software Recommended: TASSEL, URL: http://www.maizegenetics.net/tassel
Resource Title: PSPPC + P. fulvum SNP FASTA Sequences.
File Name: PSPPC+fulvums.fa.txt
Resource Description: FASTA sequences for each allele of the PSPPC + P. fulvum SNP dataset. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Title: PSPPC + P. fulvum SNPs in vcf format.
File Name: PSPPC+fulvums.vcf.txt
Resource Description: 67400 SNPs from the PSPPC augmented with 25 P. fulvum accessions in vcf format. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Software Recommended: TASSEL, URL: http://www.maizegenetics.net/tassel
Resource Title: PSPPC SNPs in vcf format.
File Name: PSPPC.vcf.txt
Resource Description: 66591 SNPs from the PSPPC in vcf format
Resource Software Recommended: TASSEL, URL: http://www.maizegenetics.net/tassel
Resource Title: README.
File Name: Data Dictionary.docx
Resource Description: These data are for the Pea Single Plant Plus Collection (PSPPC) and the PSPPC augmented with 25 P. fulvum accessions.
The 6 datasets can be divided into two groups. Group 1 consists of 3 datasets labeled “PSPPC” which refer to SNP data pertaining to the USDA Pea Single Plant Plus Collection. Group 2 consists of 3 datasets labeled “PSPPC + P. fulvum” which refer to SNP data pertaining to the PSPPC with 25 accessions of Pisum fulvum added. SNPs for each of these groups were called independently; therefore any SNP name that is shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus.
For analysis, SNP data is available in two widely used formats: hapmap and vcf. These files were successfully loaded into the standalone version of TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel).
Explanations of fields (columns) in the VCF files are contained within commented (##) rows at the top of the file.
The first 11 columns required for the hapmap format are as follows:
- rs#- Name of locus (i.e. SNP name)
- alleles- Indicates the SNPs for each allele at the locus
- chrom- N/A, since markers are unordered.
- pos- N/A, since markers are unordered.
- strand- N/A, since markers are unordered
- assembly#- N/A
- center- N/A
- protLSID- N/A
- assayLSID- N/A
- panel- N/A
- QCcode- N/A
The fasta sequences containing the SNPs are also available here for such downstream applications as development of primers for platform-specific markers.
Funding
USDA-ARS: 5348-21000-017-00D
History
Data contact name
Coyne, Clarice
Data contact email
Clarice.Coyne@usda.gov
Publisher
Ag Data Commons
Intended use
These data facilitate trait mapping and genomics assisted breeding in pea.
Use limitations
SNPs for each of these groups were called independently; therefore SNP names that are shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus.
Temporal Extent Start Date
2013-01-01
Temporal Extent End Date
2014-12-31
Theme
- Not specified
Geographic Coverage
{"type":"FeatureCollection","features":[{"geometry":{"type":"Polygon","coordinates":[[[-166.640625,-59.987997631212],[-166.640625,83.254516804633],[194.765625,83.254516804633],[194.765625,-59.987997631212],[-166.640625,-59.987997631212]]]},"type":"Feature","properties":{}}]}
ISO Topic Category
- biota
National Agricultural Library Thesaurus terms
data collection; peas; single nucleotide polymorphism; Pisum fulvum
OMB Bureau Code
- 005:18 - Agricultural Research Service
OMB Program Code
- 005:040 - National Research
ARS National Program Number
- 301
Pending citation
- No
Public Access Level
- Public