Data Description
Quick Start 🚀: If you're not up for reading all of this, head straight to the file section. There, you'll find detailed explanations of the files and all the variables you need.
Dataset Description: The dataset is sourced from Druglib.com and Drugs.com. It contains patient reviews of specific drugs along with the related conditions. The reviews are split into three aspects: benefits, side effects, and overall comments. Ratings are also provided for overall satisfaction (on a 10-step scale), side effects (on a 5-step scale), and effectiveness (on a 5-step scale).
Characteristics: It's a multivariate dataset with text data.
Subject Area: Health and Medicine.
Associated Tasks: Classification, Regression, Clustering.
Feature Type: Integer.
Number of Instances: 4143.
Number of Features: 8.
Data Collection: The data was obtained by crawling online pharmaceutical review sites. The purpose was to facilitate sentiment analysis of drug experiences across various facets, transferability of models among different conditions, and transferability among different data sources.
Data Split: The data is divided into a training set (75%) and a test set (25%), stored in two tab-separated-values files (distributed with a .CSV extension).
Usage Restrictions: Users of this dataset must agree to certain terms, including using the data only for research purposes, refraining from commercial use, not distributing the data to others, and citing the original authors.
Missing Values: There are no missing values.
Introductory Paper: The dataset is associated with the paper "Aspect-Based Sentiment Analysis of Drug Reviews Applying Cross-Domain and Cross-Data Learning" by F. Gräßer, Surya Kallumadi, H. Malberg, and S. Zaunseder, published in the Proceedings of the 2018 International Conference on Digital Health.
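The file format and the no-missing-values claim above are easy to check before modelling. The following is a minimal sketch using only the Python standard library. The helper names `read_reviews` and `count_missing` are hypothetical, the delimiter guess (tab versus comma, based on the header line) is an assumption, and the sample rows are invented for illustration rather than taken from the dataset.

```python
import csv
import io


def read_reviews(text):
    """Parse delimited text into a list of dict rows, guessing tab vs. comma."""
    # Guess the delimiter from the header line only, because the review
    # text in later lines may itself contain commas.
    lines = text.splitlines()
    header = lines[0] if lines else ""
    delimiter = "\t" if header.count("\t") >= header.count(",") else ","
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def count_missing(rows):
    """Return {column: count of empty or absent values} for columns with gaps."""
    missing = {}
    for row in rows:
        for column, value in row.items():
            if column is None:
                # Extra fields beyond the header are not a named column.
                continue
            if value is None or not value.strip():
                missing[column] = missing.get(column, 0) + 1
    return missing


# Invented sample rows (not real dataset content); the second row has an
# empty condition field on purpose.
sample = (
    "reviewID\turlDrugName\trating\tcondition\n"
    "1\texample-drug\t7\texample condition\n"
    "2\tother-drug\t9\t\n"
)
rows = read_reviews(sample)
print(len(rows), count_missing(rows))
```

To run the same check on the real files, read each file's contents with `open(path, newline="", encoding="utf-8")` and pass the text to `read_reviews`; an empty dictionary from `count_missing` would be consistent with the statement that there are no missing values.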
Variables:
urlDrugName (categorical): Name of the drug being reviewed. This variable indicates the specific medication that patients are providing reviews for.
condition (categorical): Name of the medical condition for which the drug is prescribed or used. This variable specifies the health issue or ailment that the drug is intended to address.
benefitsReview (text): Patient reviews regarding the benefits or positive effects they experienced while using the drug. This text field likely contains descriptions of how the drug helped alleviate symptoms or improve the patient's condition.
sideEffectsReview (text): Patient reviews detailing the side effects or adverse reactions experienced while using the drug. This text field may include descriptions of any unwanted or negative effects associated with the medication.
commentsReview (text): Overall comments provided by the patient about their experience with the drug. This text field captures the patient's general feedback or opinions about the medication, including any additional thoughts or observations.
rating (numerical): Patient rating of the drug on a scale from 1 to 10 stars. This numerical variable quantifies the patient's overall satisfaction or perception of the drug's effectiveness, with higher ratings indicating greater satisfaction.
sideEffects (categorical): Categorical variable representing the side effects rating provided by the patient. This variable likely indicates the severity or impact of side effects experienced, categorized into five levels (e.g., mild, moderate, severe).
effectiveness (categorical): Categorical variable representing the effectiveness rating provided by the patient. This variable indicates the perceived effectiveness of the drug in treating the patient's condition, categorized into five levels (e.g., ineffective, moderately effective, highly effective).
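Because `sideEffects` and `effectiveness` are ordered five-level categories, regression or ordinal classification usually needs them as integers. The sketch below shows one way to do that in Python. The exact label strings and their ordering are assumptions (this description only gives examples such as mild, moderate, severe), so compare them with the distinct values in the files before use; `encode_ordinal` is a hypothetical helper name.

```python
# Assumed label orderings, lowest to highest. These strings are NOT confirmed
# by the description above; verify them against the actual column values.
SIDE_EFFECTS_LEVELS = [
    "No Side Effects",
    "Mild Side Effects",
    "Moderate Side Effects",
    "Severe Side Effects",
    "Extremely Severe Side Effects",
]
EFFECTIVENESS_LEVELS = [
    "Ineffective",
    "Marginally Effective",
    "Moderately Effective",
    "Considerably Effective",
    "Highly Effective",
]


def encode_ordinal(label, levels):
    """Map a label to its 1-based rank in levels, ignoring case and outer spaces."""
    normalized = label.strip().lower()
    for rank, level in enumerate(levels, start=1):
        if level.lower() == normalized:
            return rank
    # Fail loudly so unexpected labels are noticed instead of silently dropped.
    raise ValueError(f"unknown label: {label!r}")


print(encode_ordinal("Mild Side Effects", SIDE_EFFECTS_LEVELS))
print(encode_ordinal(" highly effective ", EFFECTIVENESS_LEVELS))
```

Raising on unknown labels is a deliberate choice here: if the assumed strings do not match the data, the mismatch surfaces on the first row instead of producing a skewed encoding.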
Variables Details:
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
reviewID | ID | Integer | Identifier for each review | | No |
urlDrugName | Feature | Categorical | Name of the drug being reviewed | | No |
rating | Feature | Integer | Patient rating of the drug (1 to 10) | | No |
effectiveness | Feature | Categorical | Perceived effectiveness of the drug | | No |
sideEffects | Feature | Categorical | Severity of side effects experienced | | No |
condition | Feature | Categorical | Medical condition being treated | | No |
benefitsReview | Feature | Text | Patient reviews on benefits of the drug | | No |
sideEffectsReview | Feature | Text | Patient reviews on side effects of the drug | | No |
commentsReview | Feature | Text | Overall patient comments on the drug | | No |
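The table above can be mirrored as a typed record so that each row is validated as it is read. This is a sketch in Python under stated assumptions: the class name `DrugReview`, the function `parse_review`, and the snake_case field names are hypothetical, the 1 to 10 range check follows the `rating` description, and the sample row is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrugReview:
    """One review row, with fields following the variable table."""
    review_id: int
    url_drug_name: str
    rating: int
    effectiveness: str
    side_effects: str
    condition: str
    benefits_review: str
    side_effects_review: str
    comments_review: str


def parse_review(row):
    """Build a DrugReview from a dict keyed by the column names in the table."""
    rating = int(row["rating"])
    if not 1 <= rating <= 10:
        raise ValueError(f"rating out of range 1-10: {rating}")
    return DrugReview(
        review_id=int(row["reviewID"]),
        url_drug_name=row["urlDrugName"],
        rating=rating,
        effectiveness=row["effectiveness"],
        side_effects=row["sideEffects"],
        condition=row["condition"],
        benefits_review=row["benefitsReview"],
        side_effects_review=row["sideEffectsReview"],
        comments_review=row["commentsReview"],
    )


# Invented example row (not real dataset content).
example_row = {
    "reviewID": "1",
    "urlDrugName": "example-drug",
    "rating": "7",
    "effectiveness": "Highly Effective",
    "sideEffects": "Mild Side Effects",
    "condition": "example condition",
    "benefitsReview": "Symptoms improved.",
    "sideEffectsReview": "Slight drowsiness.",
    "commentsReview": "Taken once daily.",
}
review = parse_review(example_row)
print(review.rating, review.url_drug_name)
```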
Cite:
@article{Grer2018AspectBasedSA, title={Aspect-Based Sentiment Analysis of Drug Reviews Applying Cross-Domain and Cross-Data Learning}, author={Felix Gr{\"a}{\ss}er and Surya Kallumadi and Hagen Malberg and Sebastian Zaunseder}, journal={Proceedings of the 2018 International Conference on Digital Health}, year={2018}, url={https://api.semanticscholar.org/CorpusID:5040048} }
