The following is the data validation report the seller chose to provide:
Data description
The Coswara Dataset (heavy cough) is a sub-dataset of the Coswara-Dataset created by IIScLeap. It consists of audio clips of heavy coughing from Covid-positive and Covid-negative patients in various locations around the world (five continents: Asia (90%), North America (5%), South America (0.14%), Europe (2.5%), and Australia), collected over two years (2020 to 2022).
The data is public to use under the Attribution-NonCommercial-NoDerivatives 4.0 International license. All rights, liabilities, and responsibilities related to use of the dataset for research purposes rest with IISc. Those who wish to use the data for research should review the usage guidelines provided by IISc in the GitHub repo.
This subset was created as a guide for extracting, preprocessing, and applying various audio features to classify Covid-19 patients.
For research, refer to the research paper: https://arxiv.org/pdf/2012.01926.pdf
Dataset contents
- coswara_data: contains the audio files, in folders named by sample ID
- csvs: information on all participants for a given date
- train: data with a few preprocessed features such as MFCC, delta-1 MFCC, chroma STFT, and many more (this data does not contain labels)
- train2: contains date, id, path, audio features, and status (positive or negative)
- train_original: the combined form of the CSVs for all days, with information about each sample
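As a first look at the labeled file, the class balance can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The description above lists the train2 fields only loosely, so the header name `status` and the label spellings are assumptions; adjust them to match the actual file.

```python
import csv
from collections import Counter

# Assumption: the label column in train2.csv is literally named "status".
STATUS_COLUMN = "status"


def read_rows(csv_file):
    """Read an open CSV file object into a list of row dictionaries."""
    return list(csv.DictReader(csv_file))


def count_status(rows, column=STATUS_COLUMN):
    """Count rows per label, ignoring case and surrounding whitespace."""
    return Counter(row[column].strip().lower() for row in rows)
```

To use it, open the file with `open("train2.csv", newline="", encoding="utf-8")`, pass the file object to `read_rows`, and pass the result to `count_status`. A strongly skewed count would suggest stratified splitting or class weighting before training a classifier.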
Task to be achieved on the dataset: implement Covid classification using audio, images (MFCC, spectrogram, mel-spectrogram), or preprocessed features (train2.csv and train_original.csv).
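For the image-based route, a spectrogram can be computed from a decoded waveform as sketched below. This is an illustrative NumPy-only version, not the dataset author's pipeline. The function name `log_spectrogram`, the frame length of 1024 samples, and the hop of 256 samples are all assumptions. MFCC and mel-spectrogram features would need an additional mel filterbank step, which audio libraries typically provide.

```python
import numpy as np


def log_spectrogram(signal, frame_length=1024, hop_length=256):
    """Return a log-magnitude spectrogram, shape (n_frames, frame_length // 2 + 1)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_length:
        # Zero-pad very short clips so that at least one frame exists.
        signal = np.pad(signal, (0, frame_length - signal.size))
    n_frames = 1 + (signal.size - frame_length) // hop_length
    window = np.hanning(frame_length)
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT per frame, compressed with log(1 + x).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)
```

The resulting 2-D array can be saved as an image or fed directly to a convolutional model. Each row is one time frame and each column one frequency bin.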
