🌸叶

IMDB Movie Reviews (Binary Sentiment)

movies and tv shows, nlp, text

7

Sold: 0
49.62MB

Data ID: D17222375524713786

Published: 2024/07/29

The following is the data verification report the seller chose to provide:

Data description

IMDB Large Movie Review Dataset: Binary Sentiment Classification

The classic sentiment analysis dataset


Source

> Hugging Face Hub: link

About this dataset

> This is a large dataset for binary sentiment classification, containing substantially more data than previous benchmark datasets. It provides 25,000 highly polar movie reviews for training and 25,000 for testing. Additional unlabeled data is also available. The data fields are consistent across all splits of the dataset.

How to use the dataset

> To use this dataset, first download the IMDB Large Movie Review Dataset. Once you have downloaded it, you can either use it in its original form or split it into training and testing sets. To split the dataset, create a new file called unsupervised.csv and copy the text column from train.csv into it. You can then split unsupervised.csv into two files: train_unsupervised.csv and test_unsupervised.csv.
>
> Once you have either the original dataset or the training and testing sets, you can begin using them for binary sentiment classification. This requires a machine learning algorithm capable of binary classification, such as logistic regression or a support vector machine. After training your model on the training set, evaluate its performance on the test set by predicting the labels of the reviews in test.csv and comparing them with the labels provided there. Files derived from the text column alone carry no labels, so predictions on them cannot be scored.
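The train-then-evaluate workflow described above can be sketched as follows. This is a minimal illustration, not the method used by the dataset authors: it assumes train.csv and test.csv have the `text` and `label` columns listed under Columns, and it swaps in a tiny bag-of-words logistic regression trained with plain SGD so that the example needs only the Python standard library. The function names (`tokenize`, `read_split`, `train`, `accuracy`) and the hyperparameters are hypothetical choices made for this sketch; for real experiments an established machine learning library is likely a better fit.

```python
import csv
import math
import re

TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase a review and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def read_split(path):
    """Read (text, label) pairs from a CSV with `text` and `label` columns.

    Assumption: the file has a header row naming those two columns.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return [(row["text"], int(row["label"])) for row in csv.DictReader(f)]


def predict_proba(weights, bias, tokens):
    """Probability that a tokenized review is positive (label 1)."""
    score = bias + sum(weights.get(t, 0.0) for t in set(tokens))
    score = max(-30.0, min(30.0, score))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-score))


def train(examples, epochs=20, lr=0.5):
    """Fit a binary bag-of-words logistic regression with plain SGD."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            tokens = set(tokenize(text))
            error = label - predict_proba(weights, bias, tokens)
            bias += lr * error
            for t in tokens:
                weights[t] = weights.get(t, 0.0) + lr * error
    return weights, bias


def accuracy(weights, bias, examples):
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum(
        (predict_proba(weights, bias, tokenize(text)) >= 0.5) == bool(label)
        for text, label in examples
    )
    return correct / len(examples)
```

With the files in the working directory, `weights, bias = train(read_split("train.csv"))` followed by `accuracy(weights, bias, read_split("test.csv"))` would report test accuracy. Pure-Python SGD over 25,000 reviews is slow, so lowering `epochs` or training on a subset first may be sensible.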

Research Ideas

> - This dataset can be used to train a binary sentiment classification model.
> - This dataset can be used to train a model to classify movie reviews into positive and negative sentiment categories.
> - This dataset can be used to build a large movie review database for research purposes.

Acknowledgements

> The dataset was originally posted on Hugging Face Hub.

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
>
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: train.csv

| Column name | Description |
| --- | --- |
| text | The text of the review. (String) |
| label | The label for the review, 0 for negative and 1 for positive. (Integer) |

File: test.csv

| Column name | Description |
| --- | --- |
| text | The text of the review. (String) |
| label | The label for the review, 0 for negative and 1 for positive. (Integer) |

File: unsupervised.csv

| Column name | Description |
| --- | --- |
| text | The text of the review. (String) |
| label | The label for the review, 0 for negative and 1 for positive. (Integer) |
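Before training, it can help to confirm that a downloaded file actually matches the column descriptions above. The sketch below is one hedged way to do that with the standard library. `validate_rows` and `EXPECTED_COLUMNS` are hypothetical names introduced here, and the check assumes a header row containing `text` and `label`. The `allowed_labels` parameter is left configurable in case a split uses a different label convention than the documented 0 and 1.

```python
import csv
import io

EXPECTED_COLUMNS = {"text", "label"}  # taken from the column descriptions


def validate_rows(handle, allowed_labels=(0, 1)):
    """Check an open CSV file object against the documented text/label schema.

    Returns the number of data rows; raises ValueError on the first problem.
    """
    reader = csv.DictReader(handle)
    fieldnames = reader.fieldnames or []
    if not EXPECTED_COLUMNS <= set(fieldnames):
        raise ValueError(f"missing columns, found: {fieldnames}")
    count = 0
    for row in reader:
        where = f"line {reader.line_num}"  # physical line where the row ends
        if not (row["text"] or "").strip():
            raise ValueError(f"{where}: empty text")
        try:
            label = int(row["label"])
        except (TypeError, ValueError):
            raise ValueError(f"{where}: label is not an integer") from None
        if label not in allowed_labels:
            raise ValueError(f"{where}: unexpected label {label}")
        count += 1
    return count


# Usage on a small in-memory sample; a real file would be opened with
# open(path, newline="", encoding="utf-8") and passed in the same way.
sample = io.StringIO('text,label\n"Loved it, truly.",1\nTerrible.,0\n')
print(validate_rows(sample))
```

Comparing the returned count with the expected split size (25,000 for the training and test splits, per the description) gives a quick completeness check.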