The dataset viewer is disabled because the authors forbid processing this dataset automatically and require the users to download the dataset files manually.
Dataset Card for RecipeNLG
Dataset Summary
RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.
While the RecipeNLG dataset is based on the Recipe1M+ dataset, it greatly expands the number of recipes available. The new dataset provides over 1 million new, preprocessed and deduplicated recipes on top of the Recipe1M+ dataset.
Supported Tasks and Leaderboards
[More Information Needed]
Languages
The dataset is in English.
Dataset Structure
Data Instances
{'id': 0,
'title': 'No-Bake Nut Cookies',
'ingredients': ['1 c. firmly packed brown sugar',
'1/2 c. evaporated milk',
'1/2 tsp. vanilla',
'1/2 c. broken nuts (pecans)',
'2 Tbsp. butter or margarine',
'3 1/2 c. bite size shredded rice biscuits'],
'directions': ['In a heavy 2-quart saucepan, mix brown sugar, nuts, evaporated milk and butter or margarine.',
'Stir over medium heat until mixture bubbles all over top.',
'Boil and stir 5 minutes more. Take off heat.',
'Stir in vanilla and cereal; mix well.',
'Using 2 teaspoons, drop and shape into 30 clusters on wax paper.',
'Let stand until firm, about 30 minutes.'],
'link': 'www.cookbooks.com/Recipe-Details.aspx?id=44874',
'source': 0,
'ner': ['brown sugar',
'milk',
'vanilla',
'nuts',
'butter',
'bite size shredded rice biscuits']}
Data Fields
- id (int): ID.
- title (str): Title of the recipe.
- ingredients (list of str): Ingredients.
- directions (list of str): Instruction steps.
- link (str): URL link.
- source (ClassLabel): Origin of each recipe record, with possible values {"Gathered", "Recipes1M"}:
  - "Gathered" (0): Additional recipes gathered from multiple cooking web pages, using automated scripts in a web scraping process.
  - "Recipes1M" (1): Recipes from the "Recipe1M+" dataset.
- ner (list of str): NER food entities.
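As a minimal sketch of how the fields above fit together, the record layout can be written down as a typed dictionary, with the integer `source` label mapped back to its name. The `Recipe` type and the `source_name` helper are hypothetical names introduced here for illustration; only the field names, types, and the two label values come from this card.

```python
from typing import List, TypedDict

# Label names in index order, as listed in the field description above.
SOURCE_NAMES = ["Gathered", "Recipes1M"]


class Recipe(TypedDict):
    """Shape of one record, following the Data Fields list (illustrative)."""
    id: int
    title: str
    ingredients: List[str]
    directions: List[str]
    link: str
    source: int  # ClassLabel index: 0 = "Gathered", 1 = "Recipes1M"
    ner: List[str]


def source_name(record: Recipe) -> str:
    """Return the readable origin of a record (hypothetical helper)."""
    return SOURCE_NAMES[record["source"]]


# Shortened version of the instance shown under Data Instances.
example: Recipe = {
    "id": 0,
    "title": "No-Bake Nut Cookies",
    "ingredients": ["1 c. firmly packed brown sugar", "1/2 c. evaporated milk"],
    "directions": ["Stir over medium heat until mixture bubbles all over top."],
    "link": "www.cookbooks.com/Recipe-Details.aspx?id=44874",
    "source": 0,
    "ner": ["brown sugar", "milk"],
}

print(source_name(example))  # Gathered
```

Keeping `source` as an integer and decoding it only for display mirrors the way the example instance stores it.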
Data Splits
The dataset contains a single train split.
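Because the files have to be downloaded manually, one way to read them into records shaped like the example instance is sketched below. This card does not describe the layout of the downloaded file, so the layout used here is an assumption: a CSV with columns named `title`, `ingredients`, `directions`, `link`, `source`, and `NER`, with list-valued columns stored as JSON arrays and `source` stored as a label name. The `read_recipes` function is a hypothetical helper, and the in-memory sample stands in for the real file.

```python
import csv
import io
import json
from typing import Dict, Iterator, TextIO

# Label names in index order, taken from the Data Fields section.
SOURCE_NAMES = ["Gathered", "Recipes1M"]


def read_recipes(handle: TextIO) -> Iterator[Dict]:
    """Yield one record per CSV row, using the field names from this card.

    Assumed layout (not confirmed by this card): list-valued columns are
    JSON arrays, and the source column holds a label name.
    """
    for row_id, row in enumerate(csv.DictReader(handle)):
        yield {
            "id": row_id,
            "title": row["title"],
            "ingredients": json.loads(row["ingredients"]),
            "directions": json.loads(row["directions"]),
            "link": row["link"],
            "source": SOURCE_NAMES.index(row["source"]),
            "ner": json.loads(row["NER"]),
        }


# Tiny in-memory stand-in for the downloaded file, so the sketch runs as-is.
sample = io.StringIO(
    'title,ingredients,directions,link,source,NER\n'
    'No-Bake Nut Cookies,"[""1/2 tsp. vanilla""]","[""Take off heat.""]",'
    'www.cookbooks.com/Recipe-Details.aspx?id=44874,Gathered,"[""vanilla""]"\n'
)
records = list(read_recipes(sample))
print(len(records), records[0]["ner"])
```

With a real download, the same function could be given a handle from `open(path, newline="", encoding="utf-8")`, where `path` points at the extracted CSV. Since there is only a train split, every row belongs to it.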
Dataset Creation
Curation Rationale
[More Information Needed]
Source Data
[More Information Needed]
Initial Data Collection and Normalization
[More Information Needed]
Who are the source language producers?
[More Information Needed]
Annotations
[More Information Needed]
Annotation process
[More Information Needed]
Who are the annotators?
[More Information Needed]
Personal and Sensitive Information
[More Information Needed]
Considerations for Using the Data
Social Impact of Dataset
[More Information Needed]
Discussion of Biases
[More Information Needed]
Other Known Limitations
[More Information Needed]
Additional Information
Dataset Curators
[More Information Needed]
Licensing Information
I (the "Researcher") have requested permission to use the RecipeNLG dataset (the "Dataset") at Poznań University of Technology (PUT). In exchange for such permission, Researcher hereby agrees to the following terms and conditions:
- Researcher shall use the Dataset only for non-commercial research and educational purposes.
- PUT makes no representations or warranties regarding the Dataset, including but not limited to warranties of non-infringement or fitness for a particular purpose.
- Researcher accepts full responsibility for his or her use of the Dataset and shall defend and indemnify PUT, including its employees, Trustees, officers and agents, against any and all claims arising from Researcher's use of the Dataset including but not limited to Researcher's use of any copies of copyrighted images or text that he or she may create from the Dataset.
- Researcher may provide research associates and colleagues with access to the Dataset provided that they first agree to be bound by these terms and conditions.
- If Researcher is employed by a for-profit, commercial entity, Researcher's employer shall also be bound by these terms and conditions, and Researcher hereby represents that he or she is fully authorized to enter into this agreement on behalf of such employer.
Citation Information
@inproceedings{bien-etal-2020-recipenlg,
title = "{R}ecipe{NLG}: A Cooking Recipes Dataset for Semi-Structured Text Generation",
author = "Bie{\'n}, Micha{\l} and
Gilski, Micha{\l} and
Maciejewska, Martyna and
Taisner, Wojciech and
Wisniewski, Dawid and
Lawrynowicz, Agnieszka",
booktitle = "Proceedings of the 13th International Conference on Natural Language Generation",
month = dec,
year = "2020",
address = "Dublin, Ireland",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.inlg-1.4",
pages = "22--28",
}
Contributions
Thanks to @abhishekkrthakur for adding this dataset.