鱼泪

Daily Google News (monthly update)

global, nlp, data analytics, news, english

5

Sold: 0
30.91MB

Data ID: D17220850717134376

Published: 2024/07/27

The following is the data verification report that the seller chose to provide:

Data description

This dataset contains metadata of millions of news articles from Google News, including title, publisher, DateTime, link, and category.

The dataset is produced by an automated project that scrapes 8 major categories every day at 4am UTC. It is expected to be updated monthly: the data collected daily is merged into a single monthly csv file and published on Kaggle at the end of each month. One may expect the value of the dataset to grow continuously over time.
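The daily-to-monthly merge described above can be sketched as follows. This is only an illustration, not the author's actual pipeline: the function name `merge_daily_files`, the one-file-per-day layout, and the exact header names are all assumptions.

```python
import csv

# Assumed header names; the real files may spell or order them differently.
COLUMNS = ["Title", "Publisher", "DateTime", "Link", "Category"]


def merge_daily_files(daily_paths, monthly_path):
    """Concatenate daily CSV files into one monthly CSV.

    Returns the number of data rows written (header excluded).
    """
    rows_written = 0
    with open(monthly_path, "w", newline="", encoding="utf-8") as out:
        # Ignore any unexpected extra fields instead of raising an error.
        writer = csv.DictWriter(out, fieldnames=COLUMNS, extrasaction="ignore")
        writer.writeheader()
        # Sorting keeps days in order when file names start with the date.
        for path in sorted(daily_paths):
            with open(path, newline="", encoding="utf-8") as src:
                for row in csv.DictReader(src):
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Only the standard library is used here, so the sketch runs without extra dependencies. A real pipeline would probably also remove articles that appear on more than one day.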

If you find this dataset useful, feel free to drop a like. If you have any requests, suggestions, or inquiries, feel free to leave them in the comment section as well.

What does the dataset contain?

As mentioned, each monthly csv file contains 5 main columns:

1. Title: The title of the news article

2. Publisher: The publisher of the news article

3. DateTime: The date and time at which the news article was published on Google News

4. Link: A link that directs users to the corresponding article. Users may follow the links to dig deeper and scrape extended content

5. Category: One of the 8 major categories defined by Google News, namely Business, Entertainment, Headlines, Health, Science, Sports, Technology and WorldWide.
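As a minimal sketch of how the five columns above might be consumed, the snippet below counts articles per category in a monthly file. The header names (`Title`, `Publisher`, `DateTime`, `Link`, `Category`) are assumed to match the list exactly, and `count_by_category` is a hypothetical helper, not part of the dataset.

```python
import csv
from collections import Counter


def count_by_category(csv_file):
    """Count rows per Category in an open CSV file object.

    Assumes the file has a header row containing a "Category" column.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row["Category"] for row in reader)
```

It could be used as below, where the file name is a placeholder:

```python
with open("2024_07.csv", newline="", encoding="utf-8") as f:
    for category, n in count_by_category(f).most_common():
        print(category, n)
```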
