雪碧瓜瓜

COVID-19 transmission periods per week per country

covid19

27

Sold: 0
15.65MB

Data ID: D17171614056999351

Published: 2024/05/31

The following is the data verification report that the seller chose to provide:

Data description

Context

This dataset was created as part of the COVID-19 global forecasting challenge. It contains SIR model parameters for locations worldwide. Its main value, however, is the estimated transmission period per week per location: the average time between successive infections caused by a single infected individual in a fully susceptible population.

The model is defined as the following system of ordinary differential equations (ODEs): [image: SIR ODE equations]
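For orientation, the standard SIR system that matches the parameter definitions in this description (beta as a daily contact rate, gamma as a daily recovery fraction) is sketched below. The normalization by the total population N = S + I + R and the notation are assumptions; the dataset's own equations may be written in a different but equivalent form.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta(t)\, S\, I}{N} \\
\frac{dI}{dt} &= \frac{\beta(t)\, S\, I}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
```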

To reflect changes in the transmission rate caused by containment measures (social distancing, etc.), the beta parameter is modelled separately as a spline, with one spline node estimated for every week. See paramsWeekly.csv, which holds the weekly beta values as well as the weekly R0 estimates (derived from the beta and gamma parameters).
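As a rough illustration of how weekly nodes can define a continuous beta(t), here is a minimal sketch. The description does not state the spline type, so linear interpolation (a degree-1 spline) is an assumption, as are the node placement at day 7*k and the function name `beta_at`.

```python
def beta_at(t_days, weekly_beta):
    """Return beta at time t_days, interpolated between weekly nodes.

    Assumptions (not taken from the dataset): node k sits at day 7*k,
    interpolation is linear, and beta is held constant before the first
    node and after the last one.
    """
    if t_days <= 0:
        return weekly_beta[0]
    week, frac = divmod(t_days / 7.0, 1.0)
    k = int(week)
    if k >= len(weekly_beta) - 1:
        return weekly_beta[-1]
    # Blend the two neighbouring weekly nodes.
    return weekly_beta[k] * (1.0 - frac) + weekly_beta[k + 1] * frac
```

With hypothetical nodes `[0.4, 0.2]`, the value halfway through the first week (`t_days=3.5`) is the midpoint of the two nodes.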

The models are fitted to Johns Hopkins University time series data using several runs of the Nelder-Mead simplex optimization method, each starting from a different initial point, with RMSE as the loss; the best run is kept.
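The multi-start procedure described above can be sketched roughly as follows. This is not the author's fitting code: it assumes a constant beta (no weekly spline), a one-day forward-Euler integration, a fit against infected counts only, and arbitrary ranges for the random starting points. All function names are hypothetical. It relies on NumPy and SciPy's `scipy.optimize.minimize` with `method="Nelder-Mead"`.

```python
import numpy as np
from scipy.optimize import minimize


def simulate_sir(beta, gamma, i0, n_pop, n_days):
    """Forward-Euler SIR with one-day steps; returns daily infected counts."""
    s, i = n_pop - i0, i0
    infected = []
    for _ in range(n_days):
        infected.append(i)
        new_infections = beta * s * i / n_pop
        new_removed = gamma * i
        s, i = s - new_infections, i + new_infections - new_removed
    return np.array(infected)


def rmse_loss(params, observed, n_pop):
    """RMSE between simulated and observed infected counts."""
    # Nelder-Mead is unconstrained; abs() keeps parameters non-negative.
    beta, gamma, i0 = np.abs(params)
    predicted = simulate_sir(beta, gamma, i0, n_pop, len(observed))
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


def fit_multistart(observed, n_pop, n_starts=5, seed=0):
    """Run Nelder-Mead from several random starts and keep the best run."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        # Start ranges for (beta, gamma, i0) are illustrative assumptions.
        x0 = rng.uniform([0.05, 0.02, 1.0], [1.0, 0.5, 10.0])
        res = minimize(rmse_loss, x0, args=(observed, n_pop),
                       method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return np.abs(best.x), best.fun
```

Calling `fit_multistart(observed, n_pop)` on a daily infected series returns the best `(beta, gamma, i0)` triple and its RMSE. Restarting from several points matters because Nelder-Mead is a local method and can stall in a poor region of the loss surface.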

Parameters fitted (estimated) per country/province:

  • the day when the infection emerged in the country
  • the initial infected count on the first day of the infection
  • beta (a separate value for every week) - the average number of contacts sufficient to spread the disease that each infected individual has per day
  • gamma - the fixed fraction of the infected group that recovers on any given day
  • R0 - derived as beta/gamma
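The derived quantities follow directly from these definitions. The sketch below assumes that the transmission period is the reciprocal of beta: in a fully susceptible population an infected individual causes about beta infections per day, so the average gap between them is 1/beta days. The description does not spell out this formula, so treat it as an assumption. The function names are hypothetical.

```python
def r0(beta, gamma):
    """Basic reproduction number, as defined in the list above."""
    return beta / gamma


def transmission_period_days(beta):
    """Average days between successive infections by one individual.

    Assumes the period is 1/beta (beta = infections per day in a fully
    susceptible population).
    """
    return 1.0 / beta


def infectious_period_days(gamma):
    """Average days an individual stays infected under the SIR model."""
    return 1.0 / gamma
```

For example, hypothetical values beta = 0.25 per day and gamma = 0.1 per day correspond to R0 = 2.5, a transmission period of 4 days and an infectious period of 10 days.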

How to read the figures:

  • points are real observed data provided by Johns Hopkins University

  • curves are model predictions

  • blue is the susceptible population - people who are not yet infected but can catch the infection

  • red is the infected population

  • green is the removed population (recovered or dead) - people who are no longer susceptible because they have already been through the infection

Content

The dataset contains 3 data portions:

  1. Fitted SIR model parameters for different locations worldwide:
       a. Params.csv - parameters (and derived values) constant over time
       b. ParamsWeekly.csv - parameters (and derived values) estimated separately for every week
  2. Figures directory - plots showing how well the fitted model matches the data points.
  3. Predictions directory - CSV files with a one-year-ahead prediction for each individual location.

Warning

Always visually check the model fit (Figures directory) for quality control before using the corresponding parameter values in your analysis, as the dataset was produced by an automatic fitting procedure without manual quality control.

Acknowledgements

Many thanks to Kaggle for organizing data sharing and challenges that make the world better.

Many thanks also to Johns Hopkins University for their hard work gathering COVID-19 statistics worldwide.

Inspiration

You can try to find correlations between model parameters (e.g. gamma, the patient recovery rate) and other properties of the modelled locations (e.g. weather, population density, level of medical care).
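One simple starting point for such an analysis is a Pearson correlation between a fitted parameter and a location property. The sketch below is plain Python. The function name and the placeholder numbers in the usage lines are made up for illustration and are not values from this dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Placeholder values only: one entry per location, in the same order.
gamma_placeholder = [0.08, 0.10, 0.12, 0.09]
density_placeholder = [120.0, 450.0, 30.0, 900.0]
corr = pearson(gamma_placeholder, density_placeholder)
```

A coefficient near zero on real data would not rule out a nonlinear relationship, and given the warning above about automatic fitting, poorly fitted locations are worth excluding first.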
