Agricultural Research Word Vectors
This model was originally trained for use in a recommendation system for the Ag Data Commons that will automatically link viewers of one dataset to other directly relevant datasets and research papers that may interest them. It was also used to determine the similarities and differences between projects within ARS’ National Programs and to create a visualization layer that lets leaders explore and manage their programs easily.
This model was generated with the Word2Vec algorithm, starting from a set of word vectors trained on Google News articles and further training them on the titles and abstracts from PubAg and the titles and descriptions from Ag Data Commons. It was trained with a vector length of 300, using the Continuous Bag of Words (CBOW) version of the algorithm with negative sampling.
This word vector model can be used for any natural language processing (NLP) application involving text with a large amount of agricultural research vocabulary.
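As a rough illustration of how word vectors like these could support the dataset-to-dataset recommendations described above, the sketch below averages the vectors of each document's words and ranks documents by cosine similarity to a query. This is only one common approach and is not confirmed to be the method used for the Ag Data Commons. The function names, the toy 3-dimensional vectors, and the document identifiers are all invented for illustration; the real model uses 300-dimensional vectors.

```python
import math


def average_vector(tokens, vectors):
    """Return the mean vector of the tokens found in `vectors`, or None if none are found."""
    found = [vectors[t] for t in tokens if t in vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_related(query_tokens, documents, vectors):
    """Rank documents by similarity of their averaged vector to the averaged query vector."""
    query_vec = average_vector(query_tokens, vectors)
    if query_vec is None:
        return []
    scored = []
    for doc_id, tokens in documents.items():
        doc_vec = average_vector(tokens, vectors)
        if doc_vec is not None:
            scored.append((doc_id, cosine(query_vec, doc_vec)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors and documents, invented for illustration only.
toy_vectors = {
    "maize": [0.9, 0.1, 0.0],
    "corn": [0.8, 0.2, 0.1],
    "yield": [0.6, 0.4, 0.0],
    "soil": [0.1, 0.9, 0.2],
    "nitrogen": [0.2, 0.8, 0.3],
    "cattle": [0.0, 0.2, 0.9],
    "grazing": [0.1, 0.3, 0.8],
}
toy_documents = {
    "corn-yield-trials": ["corn", "yield"],
    "soil-nitrogen-survey": ["soil", "nitrogen"],
    "cattle-grazing-study": ["cattle", "grazing"],
}

# Most similar document first.
ranking = rank_related(["maize", "yield"], toy_documents, toy_vectors)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

With real vectors, the dictionary lookup would be replaced by whatever interface the loaded model exposes, and tokenization would need to match the preprocessing used in training, which is not described here.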
Resources in this dataset:
Resource Title: Agricultural Word Vectors.
File Name: AgWordVectors-300.zip
Resource Description: Word vectors trained on the full titles/abstracts in PubAg and titles/abstracts in Ag Data Commons. (Part A)
Resource Title: Agricultural Word Vectors Trainables.
File Name: AgWordVectors-300.model.trainables.syn1neg.zip
Resource Description: Word vectors trained on the full titles/abstracts in PubAg and titles/abstracts in Ag Data Commons. (Part B)
Resource Title: Agricultural Word Vector Model.
File Name: AgWordVectors-300.model.wv_.vectors.zip
Resource Description: Word vectors trained on the full titles/abstracts in PubAg and titles/abstracts in Ag Data Commons. (Part C)
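Because the three resources are labeled Parts A, B, and C of the same word vectors, they presumably need to be unpacked side by side before use. The sketch below extracts a list of archives into a single directory using only the Python standard library. The helper name `extract_parts` and the target directory are hypothetical, and the internal layout of the archives is not described here, so the assumption that all parts belong in one folder should be checked after extraction.

```python
import zipfile
from pathlib import Path


def extract_parts(archive_paths, target_dir):
    """Extract every archive into target_dir and return the sorted names of extracted files."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in archive_paths:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
            # Skip directory entries so only file names are reported.
            extracted.extend(name for name in zf.namelist() if not name.endswith("/"))
    return sorted(extracted)
```

A call might look like `extract_parts(["AgWordVectors-300.zip", "AgWordVectors-300.model.trainables.syn1neg.zip", "AgWordVectors-300.model.wv_.vectors.zip"], "AgWordVectors-300")`, assuming the three files have been downloaded to the working directory. The file names resemble those written by gensim when it saves a Word2Vec model with large arrays stored separately, but this record does not state which library or version produced them.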
Funding
USDA-ARS
History
Data contact name
Parr, Cynthia
Data contact email
cynthia.parr@usda.gov
Publisher
Ag Data Commons
Intended use
These word vectors can be used for NLP applications involving text with a large amount of agricultural vocabulary.
Theme
- Not specified
ISO Topic Category
- biota
- environment
- farming
National Agricultural Library Thesaurus terms
models; learning; artificial intelligence; computer simulation
OMB Bureau Code
- 005:18 - Agricultural Research Service
OMB Program Code
- 005:040 - National Research
Pending citation
- No
Public Access Level
- Public