筱雨

COVID-19 All Vaccines Tweets

healthcare, public health, nlp, text mining, health conditions, covid19

4

Sold: 0
85.94MB

Data ID: D17219673961262033

Published: 2024/07/26

The following is the data verification report the seller chose to provide:

Data description

Context

I collect recent tweets about the COVID-19 vaccines used on a large scale around the world, as follows:

  • Pfizer/BioNTech;
  • Sinopharm;
  • Sinovac;
  • Moderna;
  • Oxford/AstraZeneca;
  • Covaxin;
  • Sputnik V.

Data collection

The data is collected using the tweepy Python package to access the Twitter API. For each vaccine, I use a relevant search term (the one most frequently used on Twitter to refer to the respective vaccine).
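As an illustration, a collection step of this kind might look like the sketch below, assuming tweepy 4.x and its `API.search_tweets` method. The search terms, the retweet filter, the `build_query` and `collect_tweets` helpers, and the stored fields are all assumptions made for the example; the description does not state the exact queries or settings used.

```python
# Hypothetical search terms: the actual terms used for this dataset are not stated.
SEARCH_TERMS = {
    "Pfizer/BioNTech": "PfizerBioNTech",
    "Sinopharm": "Sinopharm",
    "Sinovac": "Sinovac",
    "Moderna": "Moderna",
    "Oxford/AstraZeneca": "AstraZeneca",
    "Covaxin": "Covaxin",
    "Sputnik V": "SputnikV",
}


def build_query(term):
    """Build a search query that skips retweets (an assumed choice)."""
    return f"{term} -filter:retweets"


def collect_tweets(api, term, limit=100):
    """Fetch up to `limit` recent tweets matching `term` as plain dicts."""
    import tweepy  # imported here so the helpers above work without tweepy

    rows = []
    cursor = tweepy.Cursor(api.search_tweets, q=build_query(term))
    for status in cursor.items(limit):
        rows.append(
            {
                "id": status.id,
                "date": status.created_at,
                "user_name": status.user.name,
                "text": status.text,
            }
        )
    return rows


# Usage (requires valid credentials, shown here as placeholders):
# auth = tweepy.OAuth1UserHandler(consumer_key, consumer_secret,
#                                 access_token, access_token_secret)
# api = tweepy.API(auth, wait_on_rate_limit=True)
# tweets = collect_tweets(api, SEARCH_TERMS["Moderna"])
```

Keeping one search term per vaccine makes it easy to run the same function for every vaccine in a daily loop.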

Data collection frequency

The initial data was built from tweets about the Pfizer/BioNTech vaccine. I then added tweets about the Sinopharm and Sinovac (both Chinese-produced), Moderna, Oxford/AstraZeneca, Covaxin and Sputnik V vaccines. In the first days, collection ran twice a day until I had an approximate idea of the daily quota of new tweets; after that, collection (for all vaccines) stabilized at once a day, during morning hours (GMT).
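Merging each new batch into the running dataset could be sketched as follows, using only the standard library. The `id` field and the `merge_batches` name are assumptions for the example; the description does not say how tweets that appear in overlapping collections were handled.

```python
def merge_batches(existing, new_batch):
    """Return existing rows plus rows from new_batch with unseen tweet ids."""
    seen = {row["id"] for row in existing}
    merged = list(existing)
    for row in new_batch:
        if row["id"] not in seen:
            seen.add(row["id"])
            merged.append(row)
    return merged


# Two overlapping batches, as could happen when collecting twice a day.
morning = [{"id": 1, "text": "first"}, {"id": 2, "text": "second"}]
evening = [{"id": 2, "text": "second"}, {"id": 3, "text": "third"}]

merged = merge_batches(morning, evening)
print([row["id"] for row in merged])  # [1, 2, 3]
```

Deduplicating on the tweet id keeps the merged file stable no matter how often the collection runs.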

Inspiration

You can perform multiple operations on the vaccine tweets. Here are a few suggestions:

  • Study the subjects of recent tweets about the vaccines made by various producers;
  • Perform various NLP tasks on this data source (topic modelling, sentiment analysis);
  • Using the COVID-19 World Vaccination Progress (where we can see the progress of the vaccinations and the countries where the vaccines are administered), you can study the relationship between the vaccination progress and the discussions in social media (from the tweets) about the vaccines.
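
As a first pass at the subjects of the tweets, before any topic modelling or sentiment analysis, one could simply count the most frequent terms. The sketch below uses only the standard library; the `top_terms` helper, the small stopword list and the sample texts are invented for illustration and are not taken from the dataset.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "and", "for", "got", "our", "with", "this", "that"}


def top_terms(texts, n=3):
    """Return the n most common lowercase word tokens across texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Skip stopwords and very short tokens such as "of" or "is".
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(n)


# Invented sample texts, not real tweets from the dataset.
sample = [
    "Got my first dose of the vaccine today",
    "Second dose of the vaccine booked",
    "Vaccine supply is delayed",
]

print(top_terms(sample, 2))  # [('vaccine', 3), ('dose', 2)]
```

Running the same count separately per vaccine gives a quick, rough comparison of what is being discussed for each producer.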