Down Shift

Twitter and Reddit Sentimental analysis Dataset

internet, classification, feature engineering, online communities

12

Sold: 0
10.05MB

Data ID: D17171515962714102

Published: 2024/05/31

The following is the data verification report that the seller chose to provide:

Data Description

Context

This dataset was created as part of a university project on sentiment analysis across multi-source social media platforms using PySpark.

There are two datasets: one consists of tweets from Twitter with sentiment labels, and the other consists of comments from Reddit with sentiment labels.

  1. Twitter Dataset

  2. Reddit Dataset

All tweets and comments were extracted using the respective API clients, Tweepy and PRAW. They were made about Narendra Modi and other leaders, and about people's opinions on the next Prime Minister of India (in the context of the general elections held in India in 2019). All tweets and comments from Twitter and Reddit were cleaned using Python's re module and NLP techniques, and each was assigned a sentiment label ranging from -1 to 1.

  1. 0 indicates a neutral tweet/comment

  2. 1 indicates a positive sentiment

  3. -1 indicates a negative tweet/comment
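The description says the text was cleaned with Python's re module but does not give the exact rules. The sketch below shows one plausible version of that step; the specific patterns (dropping URLs, @mentions and punctuation, lowercasing) and the names `clean_text` and `LABEL_NAMES` are assumptions for illustration, not the authors' actual pipeline.

```python
import re

# Hypothetical cleaning rules; the dataset authors' real patterns are not documented.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")
WHITESPACE_RE = re.compile(r"\s+")

# Label meanings as stated in the description above.
LABEL_NAMES = {-1: "negative", 0: "neutral", 1: "positive"}


def clean_text(text: str) -> str:
    """Lowercase the text and strip URLs, @mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALNUM_RE.sub(" ", text)
    # Collapse the runs of spaces left behind by the substitutions.
    return WHITESPACE_RE.sub(" ", text).strip()


sample = "Check https://t.co/abc @user Modi WINS!!! #Election2019"
cleaned = clean_text(sample)
print(cleaned, "->", LABEL_NAMES[1])
```

Because the published files already contain cleaned text, a step like this would only be needed when adding newly collected tweets or comments to the data.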

Content

The Twitter.csv dataset has around 163K tweets along with sentiment labels. The Reddit.csv dataset has around 37K comments along with their sentiment labels. Each dataset has two columns: the first contains the cleaned tweets or comments, and the second indicates the sentiment label.
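As a rough illustration of working with the two-column layout, the sketch below reads rows by position (text first, label second), tags each row with its source, and counts labels across both files. The header names in the sample data, the presence of a header row, and the helper names `read_labeled_rows` and `label_counts` are assumptions; the listing does not state the actual column names. In-memory strings stand in for the real files so the example is self-contained.

```python
import csv
import io
from collections import Counter

LABEL_NAMES = {-1: "negative", 0: "neutral", 1: "positive"}


def read_labeled_rows(handle, source):
    """Read (text, label, source) tuples from a two-column CSV handle."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row (assumed to be present)
    rows = []
    for record in reader:
        # Skip rows with a missing text or label field.
        if len(record) < 2 or not record[0].strip() or not record[1].strip():
            continue
        try:
            # Accept labels written as "1" or "1.0".
            label = int(float(record[1]))
        except (ValueError, OverflowError):
            continue
        if label not in LABEL_NAMES:
            continue
        rows.append((record[0], label, source))
    return rows


def label_counts(rows):
    """Count rows per sentiment name."""
    return Counter(LABEL_NAMES[label] for _, label, _ in rows)


# Tiny made-up samples standing in for Twitter.csv and Reddit.csv.
twitter_csv = io.StringIO(
    "text,label\nmodi wins again,1\nnot happy with the result,-1.0\n,0\n"
)
reddit_csv = io.StringIO("text,label\nelection is next week,0\n")

rows = read_labeled_rows(twitter_csv, "twitter") + read_labeled_rows(reddit_csv, "reddit")
counts = label_counts(rows)
print(counts)
```

To run this against the real files, replace the `io.StringIO` objects with `open("Twitter.csv", newline="", encoding="utf-8")` and the equivalent for Reddit.csv, adjusting the paths to wherever the files are stored.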

Acknowledgements

This dataset was created with the help of my teammates, who worked hard to gather more data using the Tweepy and Reddit APIs. My project coordinator encouraged us to collect as much data as possible, and he was the main motivation behind implementing sentiment analysis on multi-source social media platforms rather than on a single platform such as Twitter.
