The following is the data validation report that the seller chose to provide:
Data description
Tweet Sentiment's Impact on Stock Returns
862,231 Labeled Instances
By [source]
About this dataset
> This dataset contains 862,231 labeled tweets and associated stock returns, providing a comprehensive look at the impact of social media on company-level stock market performance. For each tweet, researchers have extracted the date of the tweet and its associated stock symbol, along with metrics such as the last price and several returns (1-day, 2-day, 3-day, and 7-day). Volatility scores are also recorded for both 10-day and 30-day intervals. Finally, sentiment scores from both a Long Short-Term Memory (LSTM) model and TextBlob are included to quantify the overall tone of each message. With this dataset you can explore how tweets relate to a company's share price over both shorter and longer horizons by combining all of these data points.
How to use the dataset
> To use this dataset, start with descriptive statistics such as histograms, or use regression techniques to relate tweet content and sentiment to the corresponding stock returns, such as the 1-day and 7-day returns.
> The primary fields for analysis are the tweet text (TWEET), stock symbol (STOCK), date (DATE), and the stock's price at the time of the tweet (LAST_PRICE), along with two volatility measures for each stock, 10-day volatility (VOLATILITY_10D) and 30-day volatility (VOLATILITY_30D), which capture market fluctuation around the time of the tweet. Sentiment polarity from two models, LSTM (LSTM_POLARITY) and TextBlob (TEXTBLOB_POLARITY), indicates whether people are expressing positive or negative sentiment about each company at a given time, which could in turn influence stock prices over short horizons such as 1-day returns (1_DAY_RETURN) and 2-day returns (2_DAY_RETURN), or over a longer horizon such as 7-day returns (7_DAY_RETURN). Finally, the MENTION field captures explicit mentions of the company's name or acronym in each tweet, which indicates whether a company-specific context was present ("company relevancy").
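As a starting point for relating sentiment to returns, the sketch below computes a Pearson correlation between a polarity column and a return column. It uses correlation as a simpler stand-in for a full regression, and only the Python standard library. The rows in `SAMPLE_CSV` are synthetic and the ±1 values for LSTM_POLARITY are an assumption; the helper names `pearson` and `column_correlation` are hypothetical and not part of the dataset.

```python
import csv
import io
import math

# Synthetic rows for illustration only; these are NOT values from the dataset.
# The +1/-1 encoding of LSTM_POLARITY is an assumption.
SAMPLE_CSV = """TWEET,STOCK,LSTM_POLARITY,TEXTBLOB_POLARITY,1_DAY_RETURN,7_DAY_RETURN
example a,AAA,1,0.5,0.01,0.03
example b,AAA,-1,-0.2,-0.02,-0.01
example c,BBB,1,0.1,0.00,0.02
example d,BBB,-1,-0.6,-0.01,-0.04
"""


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def column_correlation(rows, x_col, y_col):
    """Correlate two columns, skipping rows with missing or non-numeric values."""
    xs, ys = [], []
    for row in rows:
        try:
            x, y = float(row[x_col]), float(row[y_col])
        except (KeyError, TypeError, ValueError):
            continue
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
r = column_correlation(rows, "LSTM_POLARITY", "1_DAY_RETURN")
print(f"LSTM_POLARITY vs 1_DAY_RETURN: r = {r:.3f}")
```

To run this against the real file, replace the `io.StringIO(SAMPLE_CSV)` handle with an opened CSV file. A correlation on its own says nothing about causation, so treat the result as a first look rather than evidence of influence.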
Research Ideas
> - Analyzing the degree to which tweets can influence stock prices. By analyzing relationships between variables such as tweet sentiment and stock returns, correlations can be identified that could be used to inform investment decisions.
> - Exploring natural language processing (NLP) models for predicting future market trends based on textual data such as tweets. Through testing and evaluating different text-based models using this dataset, better predictive models may emerge that can give investors advance warning of upcoming market shifts due to news or other events.
> - Investigating the impact of different types of tweets (positive/negative, factual/opinionated) on stock prices over specific time frames. By studying correlations between the sentiment or nature of a tweet and its effect on stocks, insights may be gained into what sort of news or events have a greater impact on markets in general.
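The third idea above can be sketched by bucketing tweets by sentiment sign and averaging each return horizon per bucket. This is a minimal sketch under stated assumptions: the zero threshold that splits positive from negative polarity is an assumption, the helper names are hypothetical, and the example rows are synthetic.

```python
from collections import defaultdict

# Return columns listed in the dataset's column tables.
RETURN_COLUMNS = ("1_DAY_RETURN", "2_DAY_RETURN", "3_DAY_RETURN", "7_DAY_RETURN")


def sentiment_bucket(polarity):
    """Map a polarity score to a coarse label (zero threshold is an assumption)."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def mean_returns_by_sentiment(rows, polarity_col="TEXTBLOB_POLARITY"):
    """Average each return horizon per sentiment bucket.

    Rows with a missing or non-numeric polarity are skipped entirely;
    missing or non-numeric return values are skipped per column.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        try:
            bucket = sentiment_bucket(float(row[polarity_col]))
        except (KeyError, TypeError, ValueError):
            continue
        for col in RETURN_COLUMNS:
            try:
                value = float(row[col])
            except (KeyError, TypeError, ValueError):
                continue
            sums[bucket][col] += value
            counts[bucket][col] += 1
    return {
        bucket: {col: sums[bucket][col] / counts[bucket][col] for col in sums[bucket]}
        for bucket in sums
    }


# Synthetic rows for illustration only; these are NOT values from the dataset.
example_rows = [
    {"TEXTBLOB_POLARITY": "0.5", "1_DAY_RETURN": "0.02", "7_DAY_RETURN": "0.04"},
    {"TEXTBLOB_POLARITY": "0.1", "1_DAY_RETURN": "0.00", "7_DAY_RETURN": ""},
    {"TEXTBLOB_POLARITY": "-0.3", "1_DAY_RETURN": "-0.01", "7_DAY_RETURN": "-0.02"},
]
summary = mean_returns_by_sentiment(example_rows)
print(summary)
```

Comparing the per-bucket averages across horizons gives a rough view of whether any gap between positive and negative tweets widens or fades over time.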
Acknowledgements
> If you use this dataset in your research, please credit the original authors.
License
> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
Columns
File: reduced_dataset-release.csv
Column name | Description |
---|---|
TWEET | Text of the tweet. (String) |
STOCK | Company's stock mentioned in the tweet. (String) |
DATE | Date the tweet was posted. (Date) |
LAST_PRICE | Company's last price at the time of tweeting. (Float) |
1_DAY_RETURN | Amount the stock returned or lost over the course of the next day after being tweeted about. (Float) |
2_DAY_RETURN | Amount the stock returned or lost over the course of the two days after being tweeted about. (Float) |
3_DAY_RETURN | Amount the stock returned or lost over the course of the three days after being tweeted about. (Float) |
7_DAY_RETURN | Amount the stock returned or lost over the course of the seven days after being tweeted about. (Float) |
PX_VOLUME | Volume traded at the time of tweeting. (Integer) |
VOLATILITY_10D | Volatility measure across 10 day window. (Float) |
VOLATILITY_30D | Volatility measure across 30 day window. (Float) |
LSTM_POLARITY | Labeled sentiment from LSTM. (Float) |
TEXTBLOB_POLARITY | Labeled sentiment from TextBlob. (Float) |
MENTION | Number of times the stock was mentioned in the tweet. (Integer) |
File: full_dataset-release.csv
Column name | Description |
---|---|
TWEET | Text of the tweet. (String) |
STOCK | Company's stock mentioned in the tweet. (String) |
DATE | Date the tweet was posted. (Date) |
LAST_PRICE | Company's last price at the time of tweeting. (Float) |
1_DAY_RETURN | Amount the stock returned or lost over the course of the next day after being tweeted about. (Float) |
2_DAY_RETURN | Amount the stock returned or lost over the course of the two days after being tweeted about. (Float) |
3_DAY_RETURN | Amount the stock returned or lost over the course of the three days after being tweeted about. (Float) |
7_DAY_RETURN | Amount the stock returned or lost over the course of the seven days after being tweeted about. (Float) |
PX_VOLUME | Volume traded at the time of tweeting. (Integer) |
VOLATILITY_10D | Volatility measure across 10 day window. (Float) |
VOLATILITY_30D | Volatility measure across 30 day window. (Float) |
LSTM_POLARITY | Labeled sentiment from LSTM. (Float) |
TEXTBLOB_POLARITY | Labeled sentiment from TextBlob. (Float) |
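
The column types listed in the tables above can be applied while reading either file. The sketch below is one possible loader using only the Python standard library. It assumes the CSV has a header row with the column names shown, leaves DATE as a string because the date format is not stated, and turns blank or unparseable numbers into `None`. The helper names `to_number`, `parse_row`, and `read_rows` are hypothetical, and the sample text is synthetic.

```python
import csv
import io

# Column types taken from the tables above. MENTION appears only in the
# reduced file, so every column is treated as optional.
FLOAT_COLUMNS = (
    "LAST_PRICE", "1_DAY_RETURN", "2_DAY_RETURN", "3_DAY_RETURN", "7_DAY_RETURN",
    "VOLATILITY_10D", "VOLATILITY_30D", "LSTM_POLARITY", "TEXTBLOB_POLARITY",
)
INT_COLUMNS = ("PX_VOLUME", "MENTION")


def to_number(text, as_int=False):
    """Convert text to float or int; return None if blank or not numeric."""
    try:
        value = float(text)
        return int(value) if as_int else value
    except (TypeError, ValueError, OverflowError):
        return None


def parse_row(raw):
    """Return a copy of a csv.DictReader row with numeric columns converted."""
    row = dict(raw)
    for col in FLOAT_COLUMNS:
        if col in row:
            row[col] = to_number(row[col])
    for col in INT_COLUMNS:
        if col in row:
            row[col] = to_number(row[col], as_int=True)
    return row  # TWEET, STOCK and DATE stay as strings


def read_rows(handle):
    """Yield parsed rows from an open text handle that has a header row."""
    for raw in csv.DictReader(handle):
        yield parse_row(raw)


# Synthetic sample for illustration only; NOT values from the dataset.
SAMPLE = "TWEET,STOCK,DATE,LAST_PRICE,PX_VOLUME,MENTION\nhello,AAA,2017-01-02,10.5,1000.0,\n"
parsed = list(read_rows(io.StringIO(SAMPLE)))
print(parsed[0])
```

For the real files, pass a handle opened with `open("reduced_dataset-release.csv", newline="", encoding="utf-8")` to `read_rows`; the UTF-8 encoding is an assumption. Because `read_rows` is a generator, the full file can be streamed row by row instead of being loaded into memory at once.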
