老下头

Unsplash Image Collection

internet, image, travel

12

Sold: 0
606.65MB

Data ID: D17175248339669069

Published: 2024/06/05

The following is the data validation report that the seller chose to provide:

Data description

The Unsplash Dataset

The Unsplash Dataset is built from the work of more than 250,000 contributing photographers worldwide and from data sourced from hundreds of millions of searches across a nearly unlimited number of uses and contexts. Because of the breadth of intent and semantics it contains, the dataset opens up new opportunities for research and learning.

The Unsplash Dataset is offered in two versions:

  • the Lite dataset: available for commercial and noncommercial usage, containing 25k nature-themed Unsplash photos, 25k keywords, and 1M searches
  • the Full dataset: available for noncommercial usage, containing 3M+ high-quality Unsplash photos, 5M keywords, and over 250M searches

As the Unsplash library continues to grow, we’ll release updates to the dataset with new fields and new images, with each subsequent release being semantically versioned.

We welcome any feedback regarding the content of the datasets or their format. With your input, we hope to close the gap between the data we provide and the data that you would like to leverage. You can open an issue to report a problem or to let us know what you would like to see in the next release of the datasets.

For more on the Unsplash Dataset, see our announcement and site.


The Unsplash Dataset is made available for research purposes. It cannot be used to redistribute the images contained within. To use the Unsplash library in a product, see the Unsplash API.

Unsplash Dataset Documentation

The Unsplash Dataset is composed of multiple CSV files:

1 - photos.csv

The photos.csv dataset has one row per photo. It contains properties of the photo, the name of the contributor, the image URL, and overall stats.

| Field | Description |
| --- | --- |
| photo_id | ID of the Unsplash photo |
| photo_url | Permalink URL to the photo page on unsplash.com |
| photo_image_url | URL of the image file. Note: this is a dynamic URL, so you can apply resizing and customization operations directly on the image |
| photo_submitted_at | Timestamp of when the photo was submitted to Unsplash |
| photo_featured | Whether the photo was promoted to the Editorial feed or not |
| photo_width | Width of the photo in pixels |
| photo_height | Height of the photo in pixels |
| photo_aspect_ratio | Aspect ratio of the photo |
| photo_description | Description of the photo written by the photographer |
| photographer_username | Username of the photographer on Unsplash |
| photographer_first_name | First name of the photographer |
| photographer_last_name | Last name of the photographer |
| exif_camera_make | Camera make (brand) extracted from the EXIF data |
| exif_camera_model | Camera model extracted from the EXIF data |
| exif_iso | ISO setting of the camera, extracted from the EXIF data |
| exif_aperture_value | Aperture setting of the camera, extracted from the EXIF data |
| exif_focal_length | Focal length setting of the camera, extracted from the EXIF data |
| exif_exposure_time | Exposure time setting of the camera, extracted from the EXIF data |
| photo_location_name | Location of the photo |
| photo_location_latitude | Latitude of the photo |
| photo_location_longitude | Longitude of the photo |
| photo_location_country | Country where the photo was made |
| photo_location_city | City where the photo was made |
| stats_views | Total # of times that a photo has been viewed on the Unsplash platform |
| stats_downloads | Total # of times that a photo has been downloaded via the Unsplash platform |
| ai_description | Textual description of the photo, generated by a 3rd party AI |
| ai_primary_landmark_name | Landmark present in the photo, generated by a 3rd party AI |
| ai_primary_landmark_latitude | Latitude of the landmark, generated by a 3rd party AI |
| ai_primary_landmark_longitude | Longitude of the landmark, generated by a 3rd party AI |
| ai_primary_landmark_confidence | Landmark confidence of the 3rd party AI |
| blur_hash | BlurHash hash of the photo |
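
Because photo_image_url is described as a dynamic URL, a resized variant can be requested by adding query parameters to it. The sketch below is only an illustration: the parameter names `w` (width) and `q` (quality) are an assumption not stated in this document, the helper `resized_url` is hypothetical, and the example URL is made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def resized_url(photo_image_url: str, width: int, quality: int = 80) -> str:
    """Return photo_image_url with width and quality query parameters set."""
    parts = urlsplit(photo_image_url)
    # Keep any parameters already present, then set or override w and q.
    # ASSUMPTION: the image host understands "w" and "q".
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params["w"] = str(width)
    params["q"] = str(quality)
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical URL, for illustration only.
example = "https://images.unsplash.com/photo-0000000000000-abcdef"
thumb = resized_url(example, width=640)
recropped = resized_url(example + "?w=100&fit=crop", width=640, quality=60)
```

Existing parameters on the URL are preserved, so the helper can be applied to a URL that already carries customization options.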

2 - keywords.csv

The keywords.csv dataset has one row per photo-keyword pair. It contains data about how a keyword is connected to a photo and about the photo's conversions in our search engine for that keyword.

| Field | Description |
| --- | --- |
| photo_id | ID of the Unsplash photo |
| keyword | Keyword or search term |
| ai_service_1_confidence | Confidence for the keyword from a 3rd party AI (0-100) |
| ai_service_2_confidence | Confidence for the keyword from another 3rd party AI (0-100) |
| suggested_by_user | Whether the keyword was added by a user (human) |
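
As a rough illustration of how these fields might be used together, the sketch below keeps a photo-keyword pair when a human added the keyword or when either AI service scored it at or above a threshold. It relies on assumptions that this document does not confirm: that a blank confidence means the service gave no score, and that suggested_by_user is serialized as `t`/`f` (or `true`/`1`). The function name and the sample rows are hypothetical.

```python
import csv
import io


def confident_keywords(rows, min_confidence=80.0):
    """Yield (photo_id, keyword) pairs added by a user or scored highly by an AI."""
    for row in rows:
        scores = []
        for field in ("ai_service_1_confidence", "ai_service_2_confidence"):
            value = row.get(field, "")
            if value:  # ASSUMPTION: blank means "no score from this service"
                scores.append(float(value))
        # ASSUMPTION: booleans are serialized as t/f, true/false, or 1/0.
        user_added = row.get("suggested_by_user", "").lower() in ("t", "true", "1")
        if user_added or (scores and max(scores) >= min_confidence):
            yield row["photo_id"], row["keyword"]


# Hypothetical sample rows standing in for keywords.csv.
sample = io.StringIO(
    "photo_id,keyword,ai_service_1_confidence,ai_service_2_confidence,suggested_by_user\n"
    "p1,forest,92.5,,f\n"
    "p1,tree,40.0,35.1,f\n"
    "p2,lake,,,t\n"
)
pairs = list(confident_keywords(csv.DictReader(sample)))
```

With a real file, the `io.StringIO` sample would be replaced by an opened file handle; check the delimiter and boolean encoding of your copy first.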

3 - collections.csv

Note: A collection on Unsplash is a user-created grouping of photos. Collections are similar to boards on Pinterest and can often group photos in complex and creative ways.

The collections.csv dataset has one row per photo-collection pair. Whenever a photo belongs to a collection created by a user, it will appear as one row. Each row describes when the photo was added to the collection and gives the title of the collection.

| Field | Description |
| --- | --- |
| photo_id | ID of the Unsplash photo |
| collection_id | ID of the Unsplash collection containing the photo |
| collection_title | Title of the collection containing the photo |
| photo_collected_at | Timestamp of when the photo was added to the collection |
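
Since each row is one photo-collection pair, rebuilding the contents of a collection means grouping rows by collection_id. The following is a minimal sketch of that grouping. It assumes that photo_collected_at is a timestamp string that sorts chronologically as text (for example `YYYY-MM-DD HH:MM:SS`), which this document does not specify; the function name and sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def photos_by_collection(rows):
    """Map collection_id to (title, [photo_id, ...]) ordered by time added."""
    grouped = defaultdict(list)
    titles = {}
    for row in rows:
        titles[row["collection_id"]] = row["collection_title"]
        grouped[row["collection_id"]].append(
            (row["photo_collected_at"], row["photo_id"])
        )
    # ASSUMPTION: timestamp strings sort chronologically when compared as text.
    return {
        cid: (titles[cid], [photo_id for _, photo_id in sorted(entries)])
        for cid, entries in grouped.items()
    }


# Hypothetical sample rows standing in for collections.csv.
sample = io.StringIO(
    "photo_id,collection_id,collection_title,photo_collected_at\n"
    "p2,c1,Misty mornings,2020-03-02 10:00:00\n"
    "p1,c1,Misty mornings,2020-01-15 08:30:00\n"
    "p1,c2,Desktop wallpapers,2019-11-20 17:45:00\n"
)
collections = photos_by_collection(csv.DictReader(sample))
```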

4 - conversions.csv

Note: a conversion is currently defined as a user selecting an image to download it.

The conversions.csv dataset has one row per search conversion. The dataset tells you which photo was downloaded for a search, the country of origin, and an anonymous identifier that distinguishes unique users. The data goes back up to 1 year before the release of each version of the dataset.

| Field | Description |
| --- | --- |
| converted_at | Timestamp of the conversion event |
| conversion_type | Type of conversion (download only for now) |
| keyword | Keyword that was searched and led to the conversion |
| photo_id | Photo ID of the photo that converted |
| anonymous_user_id | Anonymous user ID |
| conversion_country | Country code of the device geolocation |
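
One simple use of these fields is to count, per keyword, how many downloads occurred and how many distinct anonymous users were behind them. The sketch below shows one way to do that. The literal value `download` for conversion_type is an assumption based on the field description, and the function name and sample rows are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize_conversions(rows):
    """Return {keyword: (download_count, unique_user_count)}."""
    downloads = Counter()
    users = defaultdict(set)
    for row in rows:
        # ASSUMPTION: downloads are recorded with the literal value "download".
        if row["conversion_type"] != "download":
            continue
        keyword = row["keyword"]
        downloads[keyword] += 1
        users[keyword].add(row["anonymous_user_id"])
    return {kw: (downloads[kw], len(users[kw])) for kw in downloads}


# Hypothetical sample rows standing in for conversions.csv.
sample = io.StringIO(
    "converted_at,conversion_type,keyword,photo_id,anonymous_user_id,conversion_country\n"
    "2020-02-01 09:00:00,download,forest,p1,u1,US\n"
    "2020-02-01 09:05:00,download,forest,p2,u1,US\n"
    "2020-02-02 11:00:00,download,forest,p1,u2,DE\n"
    "2020-02-03 12:00:00,download,lake,p3,u3,FR\n"
)
summary = summarize_conversions(csv.DictReader(sample))
```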

5 - colors.csv

Note: The coverage and score data come from a 3rd party AI.

The colors.csv dataset has one row per major color present in the photo. The dataset tells you which colors are contained within a photo, their coverage as a percentage, and a score for how in focus each color is.

| Field | Description |
| --- | --- |
| photo_id | ID of the Unsplash photo |
| hex | Hexadecimal representation of the color |
| red | Red component of the color in the RGB system |
| green | Green component of the color in the RGB system |
| blue | Blue component of the color in the RGB system |
| keyword | Name of the closest color as a CSS color keyword |
| coverage | Pixel coverage of the color as a percentage |
| score | Score of the color in the photo (including the notion of focus) |
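
To illustrate how the color rows relate to a photo, the sketch below picks the color with the largest coverage for each photo and checks that the hex value agrees with the red, green, and blue columns. It assumes that hex is a six-digit RGB string, with or without a leading `#`, and that the RGB columns hold integers from 0 to 255; neither is spelled out in this document. The function names and sample rows are hypothetical.

```python
import csv
import io


def dominant_colors(rows):
    """Return {photo_id: (hex, coverage)} for the color covering the most pixels."""
    best = {}
    for row in rows:
        coverage = float(row["coverage"])
        current = best.get(row["photo_id"])
        if current is None or coverage > current[1]:
            best[row["photo_id"]] = (row["hex"], coverage)
    return best


def hex_matches_rgb(row):
    """Check that the hex column encodes the same color as red/green/blue."""
    # ASSUMPTION: hex is six hex digits, optionally prefixed with "#".
    expected = "{:02X}{:02X}{:02X}".format(
        int(row["red"]), int(row["green"]), int(row["blue"])
    )
    return row["hex"].lstrip("#").upper() == expected


# Hypothetical sample rows standing in for colors.csv.
sample_text = (
    "photo_id,hex,red,green,blue,keyword,coverage,score\n"
    "p1,FFFFFF,255,255,255,white,0.12,0.30\n"
    "p1,000000,0,0,0,black,0.55,0.10\n"
    "p2,FF0000,255,0,0,red,0.40,0.80\n"
)
rows = list(csv.DictReader(io.StringIO(sample_text)))
dominant = dominant_colors(rows)
all_consistent = all(hex_matches_rgb(row) for row in rows)
```

Only the ordering of coverage values matters here, so the sketch works whether coverage is stored on a 0 to 1 or a 0 to 100 scale.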

Combining datasets

You can merge the different datasets through the primary key ID fields (usually the photo_id field). With this, you'll be able to cross-reference properties from the photos dataset with data from the keywords or conversions dataset.
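
The merge described above can be sketched with the standard library as an inner join on photo_id: index the photos rows by ID, then attach the matching photo to each keyword row. The function name `join_on_photo_id` and the sample rows are hypothetical, and only a few columns are shown to keep the example short.

```python
import csv
import io


def join_on_photo_id(photos, keywords):
    """Inner-join keyword rows to photo rows through the shared photo_id field."""
    by_id = {row["photo_id"]: row for row in photos}
    for kw in keywords:
        photo = by_id.get(kw["photo_id"])
        if photo is not None:  # skip keywords whose photo is not in photos
            yield {**photo, **kw}


# Hypothetical sample rows standing in for photos.csv and keywords.csv.
photos = csv.DictReader(io.StringIO(
    "photo_id,photo_width,photo_height\n"
    "p1,4000,3000\n"
    "p2,6000,4000\n"
))
keywords = csv.DictReader(io.StringIO(
    "photo_id,keyword\n"
    "p1,forest\n"
    "p1,tree\n"
    "p3,lake\n"
))
joined = list(join_on_photo_id(photos, keywords))
```

The same pattern applies to conversions or colors: any file that carries photo_id can be joined against the photos index. For the full dataset, a dataframe library or a database is likely a better fit than in-memory dictionaries.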
