The following is the data validation report that the seller chose to provide:
Data Description
iPhone 14 📱 🐦 Tweets [11 July - Sept 9 2022 - 144k English] 📱 🐦
Updated on Sept 9th. Includes tweets sent after launch.
I wanted to do something useful and add a dataset here on Kaggle. While there are over 90 datasets about Elon, there is none yet for tweets about the upcoming iPhone 14. I'm interested in seeing what Apple is up to this year, so I thought it could be interesting to dive deep into what people have been saying this month before the release event, which Apple announced today and which will take place on September 7th.
The dataset has 144k tweets created between July 11th and Sept 9th. Tweets are in English. As the new iPhone was just announced, I plan on updating the dataset to include newer examples and maybe a few older ones to increase the number of samples in the dataset, at least until the first week of launch.
Column Descriptions
- date_time - Date and time the tweet was sent
- username - Username that sent the tweet
- user_location - Location entered in the account location info on Twitter
- user_description - Text added to "about" in account
- verified - Whether the user has the "verified by Twitter" blue tick
- followers_count - Number of Followers
- following_count - Number of accounts followed by the person who sent the tweet
- tweet_like_count - How many people liked the tweet
- tweet_retweet_count - How many people retweeted the tweet
- tweet_reply_count - How many people replied to that tweet
- source - Where the tweet was sent from. The link indicates whether it was sent from an iPhone, an Android device, or another client
- tweet_text - Text sent in the tweet
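As a rough sketch of how the columns above might be read with sensible types, the snippet below uses only the Python standard library. The helper names `coerce_row` and `read_tweets` are hypothetical, and the cell formats are assumptions that should be checked against the actual file: `verified` is assumed to be stored as the text `True` or `False`, and the count columns are assumed to be numeric strings that may be blank.

```python
import csv
import io

# Count columns named in the column list above.
COUNT_COLUMNS = (
    "followers_count",
    "following_count",
    "tweet_like_count",
    "tweet_retweet_count",
    "tweet_reply_count",
)


def coerce_row(row):
    """Return a copy of `row` with count columns as int and `verified` as bool."""
    typed = dict(row)
    for name in COUNT_COLUMNS:
        raw = (row.get(name) or "").strip()
        # Assumption: a blank cell means "unknown", so keep it as None, not 0.
        typed[name] = int(float(raw)) if raw else None
    # Assumption: `verified` is written as the text "True" or "False".
    typed["verified"] = (row.get("verified") or "").strip().lower() == "true"
    return typed


def read_tweets(handle):
    """Yield typed rows from an open CSV file-like object."""
    for row in csv.DictReader(handle):
        yield coerce_row(row)


# Tiny in-memory sample (made-up values) standing in for the real CSV file.
sample = (
    "username,verified,followers_count,following_count,"
    "tweet_like_count,tweet_retweet_count,tweet_reply_count\n"
    "alice,True,10,5,3,,0\n"
)
rows = list(read_tweets(io.StringIO(sample)))
```

For the real file, `read_tweets` would be passed a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded CSV was saved.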
Data and Utilization
Data was scraped from Twitter and uploaded as is. No further data cleaning was performed, but the tweet data is in very good shape. I'd suggest separating date and time and finding a way to convert the source from links to the device name or website, depending on what you intend to use the data for.
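The two clean-up steps mentioned above could be sketched roughly as follows. This is only an illustration under assumptions that are not confirmed by the description: `date_time` is assumed to be an ISO-8601-style string such as `2022-09-09 12:34:56+00:00`, and `source` is assumed to be an HTML anchor whose link text names the client (for example `Twitter for iPhone`). The function names `split_date_time` and `source_label` are hypothetical.

```python
import re
from datetime import datetime
from typing import Tuple

# Assumption: `source` looks like
# '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'.
_ANCHOR_TEXT = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)


def split_date_time(date_time: str) -> Tuple[str, str]:
    """Split an ISO-8601-style timestamp into (date, time) strings.

    Any UTC offset in the input is dropped from the returned time string.
    """
    parsed = datetime.fromisoformat(date_time)
    return parsed.date().isoformat(), parsed.time().isoformat()


def source_label(source: str) -> str:
    """Return the link text of an HTML anchor, or the input itself if no anchor is found."""
    match = _ANCHOR_TEXT.search(source)
    return match.group(1).strip() if match else source.strip()


# Example values are made up for illustration.
date_part, time_part = split_date_time("2022-09-09 12:34:56+00:00")
device = source_label(
    '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'
)
```

If the timestamps in the file turn out to use a different layout, `datetime.strptime` with an explicit format string would be the safer choice.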
Usage suggestions - The data can be used for sentiment analysis, geographical distribution, trend analysis, spam vs. ham identification, and more.
