🍤CJR🍥

Wikipedia Movie Plots with AI Plot Summaries

Arts and Entertainment, Movies and TV Shows, NLP, Multiclass Classification, Text Mining

31.16MB

Data ID: D17169491259242055

Published: 2024/05/29

About Dataset

Context

While inspecting the great Wikipedia Movie Plots dataset by JustinR (https://www.kaggle.com/jrobischon/wikipedia-movie-plots), I figured that having the plots summarized would be of great use, since current state-of-the-art NLP models limit the number of tokens they accept as input.
I wrote a Medium article based on this dataset.

Content

Everything is the same as in https://www.kaggle.com/jrobischon/wikipedia-movie-plots, except for one new column: a summary of each plot, at most 128 tokens long, generated with the DistilBART-CNN-12-6 model (https://huggingface.co/sshleifer/distilbart-cnn-12-6). Code here.
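For readers who want to reproduce a column like this, here is a minimal sketch of the idea, not the original code. It assumes the source plots live in a `Plot` column (as in the base dataset); the `PlotSummary` column name and the helper names `build_summarizer` and `add_summaries` are hypothetical. The Hugging Face `transformers` summarization pipeline is imported lazily, so the row-handling part can be tried without downloading the model.

```python
from typing import Callable, Dict, Iterable, List

MODEL_NAME = "sshleifer/distilbart-cnn-12-6"
MAX_SUMMARY_TOKENS = 128  # upper bound on summary length, per the description above


def build_summarizer(model_name: str = MODEL_NAME) -> Callable[[str], str]:
    """Return a function that summarizes one plot (hypothetical helper)."""
    # Lazy import: requires `transformers` and downloads the model on first use.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_name)

    def summarize(text: str) -> str:
        # truncation=True clips plots that exceed the model's input limit.
        result = summarizer(text, max_length=MAX_SUMMARY_TOKENS, truncation=True)
        return result[0]["summary_text"]

    return summarize


def add_summaries(
    rows: Iterable[Dict[str, str]],
    summarize: Callable[[str], str],
    plot_key: str = "Plot",               # column name assumed from the base dataset
    summary_key: str = "PlotSummary",     # hypothetical name for the new column
) -> List[Dict[str, str]]:
    """Return copies of the rows with one extra summary column."""
    return [{**row, summary_key: summarize(row[plot_key])} for row in rows]
```

With the model available, usage would look like `add_summaries(rows, build_summarizer())`, where `rows` is a list of dictionaries read from the CSV (for example with `csv.DictReader`). Passing the summarizer in as a plain function keeps the column logic independent of the model, so any other summarizer can be swapped in.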

Acknowledgements

Please go upvote the https://www.kaggle.com/jrobischon/wikipedia-movie-plots dataset, since this one is 100% based on it.
