The following is the data validation report that the seller chose to provide:
Data description
DuoRC: (Q&A: Wikipedia and IMDB)
English dataset of questions and answers from Wikipedia and IMDb movie plots
By Huggingface Hub [source]
About this dataset
How to use the dataset
> The DuoRC dataset is an English-language dataset of questions and answers written by crowdsourced AMT workers about Wikipedia and IMDb movie plots. The workers were free to pick an answer from the plot or to synthesize their own. It contains two sub-datasets: SelfRC and ParaphraseRC. SelfRC is built solely on Wikipedia movie plots. In ParaphraseRC, the questions are written from Wikipedia movie plots and the answers are given based on the corresponding IMDb movie plots.
Research Ideas
> - This dataset can be used to train a model to answer questions about movie plots.
> - This dataset can be used to train a model to answer questions about Wikipedia articles.
> - This dataset can be used to find paraphrases of questions about movie plots.
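A model trained for the first idea is usually scored by comparing its prediction with the reference answers for a question. The sketch below shows one simple way to do that: an exact-match check against a list of answers. The normalization (lowercasing, dropping ASCII punctuation, collapsing whitespace) and the function names `normalize` and `exact_match` are assumptions made for illustration, not a metric defined by the dataset authors.

```python
import string


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match(prediction, answers):
    """Return True if the prediction equals any reference answer after normalization."""
    target = normalize(prediction)
    return any(normalize(answer) == target for answer in answers)


# Hypothetical prediction and reference answers, not taken from the dataset.
print(exact_match("The Detective.", ["a detective", "the detective"]))
```

Because workers could synthesize their own answers, a strict match like this may undercount correct predictions. A token-overlap score is a common complement.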
Acknowledgements
> If you use this dataset in your research, please credit the original authors.
License
> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Columns
File: SelfRC_train.csv
Column name | Description |
---|---|
plot | The plot of the movie. (String) |
title | The title of the movie. (String) |
question | The question about the plot. (String) |
answers | The answers to the question. (List of strings) |
no_answer | A binary value that indicates whether the question has an answer. (Integer) |
File: SelfRC_test.csv
Column name | Description |
---|---|
plot | The plot of the movie. (String) |
title | The title of the movie. (String) |
question | The question about the plot. (String) |
answers | The answers to the question. (List of strings) |
no_answer | A binary value that indicates whether the question has an answer. (Integer) |
File: ParaphraseRC_train.csv
Column name | Description |
---|---|
plot | The plot of the movie. (String) |
title | The title of the movie. (String) |
question | The question about the plot. (String) |
answers | The answers to the question. (List of strings) |
no_answer | A binary value that indicates whether the question has an answer. (Integer) |
File: SelfRC_validation.csv
Column name | Description |
---|---|
plot | The plot of the movie. (String) |
title | The title of the movie. (String) |
question | The question about the plot. (String) |
answers | The answers to the question. (List of strings) |
no_answer | A binary value that indicates whether the question has an answer. (Integer) |
File: ParaphraseRC_test.csv
Column name | Description |
---|---|
plot | The plot of the movie. (String) |
title | The title of the movie. (String) |
question | The question about the plot. (String) |
answers | The answers to the question. (List of strings) |
no_answer | A binary value that indicates whether the question has an answer. (Integer) |
File: ParaphraseRC_validation.csv
Column name | Description |
---|---|
plot | The plot of the movie. (String) |
title | The title of the movie. (String) |
question | The question about the plot. (String) |
answers | The answers to the question. (List of strings) |
no_answer | A binary value that indicates whether the question has an answer. (Integer) |
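All six files share the same five columns, so a single loader can serve every split. The sketch below uses only the Python standard library and runs on a small in-memory sample. The sample rows are hypothetical. It also makes two assumptions the listing does not confirm: that `answers` is stored as a Python-style list literal, and that `no_answer` equal to 1 means the question has no answer.

```python
import ast
import csv
import io

# Hypothetical two-row sample in the column layout described above.
# ASSUMPTION: `answers` is serialized as a Python-style list literal.
SAMPLE_CSV = '''plot,title,question,answers,no_answer
"A detective hunts a thief in Paris.","Example Film","Who hunts the thief?","['A detective', 'The detective']",0
"A detective hunts a thief in Paris.","Example Film","Who is the mayor?","[]",1
'''


def parse_row(row):
    """Convert the raw string fields of one CSV row into typed values."""
    return {
        "plot": row["plot"],
        "title": row["title"],
        "question": row["question"],
        # Turn the list literal into a real list of strings.
        "answers": ast.literal_eval(row["answers"]),
        # ASSUMPTION: 1 means "no answer", 0 means "has an answer".
        "no_answer": bool(int(row["no_answer"])),
    }


def load_rows(handle):
    """Read every row from an open CSV file handle."""
    return [parse_row(row) for row in csv.DictReader(handle)]


rows = load_rows(io.StringIO(SAMPLE_CSV))
answerable = [r for r in rows if not r["no_answer"]]
print(len(rows), len(answerable))
```

To read one of the real files, pass an open handle instead, for example `open("SelfRC_train.csv", newline="", encoding="utf-8")`. If the `answers` column turns out to use a different encoding, only `parse_row` needs to change.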
Acknowledgements
> If you use this dataset in your research, please credit the original authors and Huggingface Hub.
