Data description
Context
A training dataset for multiple purposes: regression, classification, neural networks. Some of the datasets available online for the IMDB website were missing features. The goal of this dataset was to fill in some of those missing features.
Content
Data were acquired using BeautifulSoup4 on the search page of IMDB. Movies were ranked by number of votes, from high to low. The first 189900 movies were scraped.
The database contains:
- Movie Name (str): the name of the movie/series
- Movie Date (date): the date when the movie came out
- Serie Name (str): the name of the series season (if any)
- Serie Date (date): the date when the series season came out (if any)
- Movie type (str): type of the movie (action, drama, sci-fi...)
- Number of votes (int): number of people who voted for the score
- Movie Revenue in millions of $ (int): box-office revenue of the movie, in millions of $
- Score (float): the mean score from 1 to 10 given to the movie by the viewers
- Metascore (int): the mean score from 1 to 100 given to the movie/series by the journalists
- Time Duration (int): the duration of the movie in minutes
- Director (list): list of director(s) who directed the movie/series/season
- Actors (list): list of main actor(s) who played in the movie/series/season
- Restriction (str): Age restriction and warning (all public, all public with warning, 12, 12 with warnings, 16...)
- Description (str): A short abstract of the movie
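The column types listed above can be sketched as a small loader. This is only an illustration using Python's standard library: the header names, the inline sample rows, the encoding of list columns as Python-style list literals, and the use of empty fields for missing values are all assumptions, since the description does not specify the file layout.

```python
import ast
import csv
import io

# Hypothetical two-row sample. The real file's header names and the way
# list and missing values are stored are assumptions, not taken from the dataset.
SAMPLE_CSV = '''Movie Name,Movie Date,Serie Name,Serie Date,Movie Type,Number of Votes,Movie Revenue (M$),Score,Metascore,Time Duration (min),Director,Actors,Restriction,Description
Example Film,2001,,,"Drama, Sci-Fi",1200,35,7.4,68,112,"['A. Director']","['First Actor', 'Second Actor']",12,A placeholder description.
Example Show,2010,Pilot,2010,Drama,300,,8.1,,45,"['B. Director', 'C. Director']","['Third Actor']",16,Another placeholder.
'''


def to_int(value):
    """Return an int, or None when the field is empty."""
    value = value.strip()
    return int(value) if value else None


def to_float(value):
    """Return a float, or None when the field is empty."""
    value = value.strip()
    return float(value) if value else None


def to_list(value):
    """Parse a list literal such as "['a', 'b']"; empty fields give []."""
    value = value.strip()
    return ast.literal_eval(value) if value else []


def load_rows(text):
    """Read CSV text and convert each column to the type given in the list above."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)  # string columns are kept as-is
        row["Number of Votes"] = to_int(raw["Number of Votes"])
        row["Movie Revenue (M$)"] = to_int(raw["Movie Revenue (M$)"])
        row["Score"] = to_float(raw["Score"])
        row["Metascore"] = to_int(raw["Metascore"])
        row["Time Duration (min)"] = to_int(raw["Time Duration (min)"])
        row["Director"] = to_list(raw["Director"])
        row["Actors"] = to_list(raw["Actors"])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
```

Missing numeric fields are mapped to None rather than 0 so that a missing Metascore or revenue is not mistaken for a real value in a later regression or classification step.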
Acknowledgements
Licence: CC BY-SA 4.0
Validation report
The following is the data validation report that the seller chose to provide:

IMDB New Dataset
48.52MB