🍤CJR🍥

Wikipedia Movie Plots with AI Plot Summaries

Arts and Entertainment, Movies and TV Shows, NLP, Multiclass Classification, Text Mining


Sold: 0
31.16MB

Data ID: D17169491259242055

Published: 2024/05/29

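The seller has not yet authorized the 典枢 platform to run data verification on this file. You can request a verification report from the seller.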

Data Description

About Dataset

Context

While exploring the great Wikipedia Movie Plots dataset by JustinR ( https://www.kaggle.com/jrobischon/wikipedia-movie-plots ), I figured that having the plots summarized would be very useful, since current state-of-the-art NLP models limit the number of tokens they accept as input.
I wrote a Medium article based on this dataset.

Content

Everything is the same as in https://www.kaggle.com/jrobischon/wikipedia-movie-plots, except that I added a new column containing a summary of each plot, at most 128 tokens long, generated with the DistilBART-CNN-12-6 model ( https://huggingface.co/sshleifer/distilbart-cnn-12-6 ). Code here.
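As a rough illustration of how such a summary column could be produced, here is a minimal sketch. It is not the author's code. The column names `Plot` and `PlotSummary`, the helper names `build_summarizer` and `add_summaries`, and the use of the Hugging Face `transformers` summarization pipeline are all assumptions made for this example.

```python
from typing import Callable, Dict, Iterable, List


def build_summarizer(
    model_name: str = "sshleifer/distilbart-cnn-12-6",
    max_tokens: int = 128,
) -> Callable[[str], str]:
    """Return a function that summarizes one plot (hypothetical helper).

    Assumes the `transformers` package is installed; it is imported lazily
    so the rest of this sketch works without it.
    """
    from transformers import pipeline

    pipe = pipeline("summarization", model=model_name)

    def summarize(text: str) -> str:
        # truncation=True cuts plots longer than the model's input limit;
        # max_length caps the generated summary at max_tokens tokens.
        result = pipe(text, max_length=max_tokens, truncation=True)
        return result[0]["summary_text"]

    return summarize


def add_summaries(
    rows: Iterable[Dict[str, str]],
    summarize: Callable[[str], str],
    plot_col: str = "Plot",              # assumed name of the plot column
    summary_col: str = "PlotSummary",    # assumed name of the new column
) -> List[Dict[str, str]]:
    """Copy each row and attach a summary of its plot under summary_col."""
    out = []
    for row in rows:
        new_row = dict(row)  # leave the input rows unmodified
        new_row[summary_col] = summarize(row[plot_col])
        out.append(new_row)
    return out
```

Calling `add_summaries(rows, build_summarizer())` on rows read with `csv.DictReader` would download the model on first use. Passing any other callable as `summarize` lets the row handling be tried without the model.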

Acknowledgements

Please go upvote the https://www.kaggle.com/jrobischon/wikipedia-movie-plots dataset, since this one is 100% based on it.
