晓彤

IMDb Dataset (lighter/fast version)

arts and entertainment, movies and tv shows, internet

Sold: 0
84.29MB

Data ID: D17222374116584620

Published: 2024/07/29

The following is the data validation report that the seller chose to provide:

Data description

Context

The dataset contains two tables:

  • df_movies.csv
  • df_names.csv

df_movies table

The table contains some of the important information about movie titles. It is a merged and trimmed version of the following original IMDb tables:

  • title_basics (tconst a.k.a. ID_title, primaryTitle, originalTitle, startYear, runtimeMinutes, genres, averageRating, numVotes)
  • title_principals (nconst a.k.a. ID_crew, category, job, characters)
  • title_crew (director, writer)

The table is sorted by ID_title. Titles missing from the ID sequence are those that are neither movie nor tv_movie.

  • Each title has one row per crew member
  • Also note that this row count is multiplied by the number of directors and writers (a side effect of merging the tables, so a groupby is needed to get a single entry per film)
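
The row multiplication described above can be undone with a groupby on ID_title. The following is a minimal sketch using pandas, not the seller's own method. It assumes the CSV columns are named as in the lists above (ID_title, primaryTitle, ID_crew, director, averageRating); the toy rows and the helper name one_row_per_title are made up for illustration and stand in for the real df_movies.csv.

```python
import pandas as pd

# Toy stand-in for df_movies.csv. The values are invented; the column
# names are assumed to match the ones listed in the description.
df_movies = pd.DataFrame({
    "ID_title": ["tt01", "tt01", "tt01", "tt01", "tt02"],
    "primaryTitle": ["Film A", "Film A", "Film A", "Film A", "Film B"],
    "ID_crew": ["nm1", "nm1", "nm2", "nm2", "nm3"],
    "director": ["nm8", "nm9", "nm8", "nm9", "nm7"],
    "averageRating": [7.1, 7.1, 7.1, 7.1, 6.4],
})


def one_row_per_title(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse the crew x director duplication to one row per title."""
    return df.groupby("ID_title", as_index=False).agg(
        # Title-level fields repeat on every row, so the first value suffices.
        primaryTitle=("primaryTitle", "first"),
        averageRating=("averageRating", "first"),
        # Count distinct people rather than rows to avoid the multiplication.
        n_crew=("ID_crew", "nunique"),
        n_directors=("director", "nunique"),
    )


titles = one_row_per_title(df_movies)
print(titles)
```

With the real file, df_movies would instead come from pd.read_csv("df_movies.csv"); the column names should be checked against the file header before relying on this sketch.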

df_names table

The table is a trimmed version of IMDb's "names_basics" table. It contains only the people who are involved in the movies.
