麻酱

35K Movies with Embedded Plots to Vectors

search engines, movies and tv shows, earth and nature, nlp, recommender systems, text

16

Sold: 0
889.4MB

Data ID: D17171235440486471

Published: 2024/05/31

The following is the data verification report the seller chose to provide:

Data description

This project uses a dataset of movies with their plots. The original dataset is available at https://www.kaggle.com/datasets/gabrieltardochi/wikipedia-movie-plots-with-plot-summaries

The plots were scraped from Wikipedia by jrobischon and then summarized by gabrieltardochi using the DistilBART-CNN-12-6 model.

Each movie has two plot texts: the full plot and a shortened summary. I used Cohere AI to vectorize both. The processed dataset was published on Kaggle with two extra columns:

plot_vector_1024: the full plot embedded as a 1024-dimensional vector (1024 floating-point numbers)
plot_summary_vector_1024: the summarized plot embedded as a 1024-dimensional vector (1024 floating-point numbers)
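
As an illustration of how 1024-dimensional plot vectors like these might be used for search, here is a minimal sketch of a cosine-similarity lookup. It is not taken from the linked repository. The function name `cosine_top_k` is hypothetical, and the vectors are random stand-ins rather than real rows from the dataset. How the vector columns are stored in the published file is not stated here, so loading and parsing are left out.

```python
import numpy as np


def cosine_top_k(query, matrix, k=5):
    """Return (indices, scores) of the k rows of `matrix` closest to `query`
    by cosine similarity, best match first."""
    q = query / np.linalg.norm(query)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    scores = m @ q                      # cosine similarity of every row with the query
    top = np.argsort(-scores)[:k]       # indices sorted by descending similarity
    return top, scores[top]


# Synthetic stand-in for the plot_vector_1024 column (assumption: 100 movies).
rng = np.random.default_rng(0)
plot_vectors = rng.normal(size=(100, 1024))

# A query that is a slightly perturbed copy of movie 42's vector.
query = plot_vectors[42] + 0.01 * rng.normal(size=1024)

indices, scores = cosine_top_k(query, plot_vectors, k=3)
print(indices, scores)
```

In practice the query vector would have to come from the same embedding model as the stored vectors, otherwise the similarity scores are not meaningful.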

The details of the process are at https://github.com/linhhlp/Machine-Learning-Applications/Text-2-Vect-Vector-Search
