The following is the data verification report that the seller chose to provide:
Data description
-About this Data: Social media platforms have become the most prominent medium for spreading hate speech, primarily through hateful textual content. Detecting hate speech on social media in line with current trends requires an extensive dataset containing emoticons, emojis, hashtags, slang, and contractions. This dataset contains hate speech sentences in English, divided into two classes: one representing hateful content and the other representing non-hateful content.
Specifications table | |
---|---|
Subject | Natural Language Processing - NLP |
Specific subject area | A curated dataset comprising emojis, emoticons, and contractions bundled into two classes, hateful and non-hateful, to detect hate speech in text. |
Type of data | Text |
Data format | Annotated, Analysed, Filtered Data |
Data Article | A curated dataset for hate speech detection on social media text |
Data source location | https://data.mendeley.com/datasets/9sxpkmm8xn/1 |
-Value of this Data:
- This dataset is useful for training machine learning models to identify hate speech in social media text. It reflects current social media trends and modern ways of writing hateful text, including emojis, emoticons, and slang. It can help social media managers, administrators, and companies develop automatic systems that filter out hateful content by classifying a text as hateful or non-hateful speech.
- Deep Learning (DL) and Natural Language Processing (NLP) practitioners are the target beneficiaries, as this dataset can be used to detect hateful speech with DL and NLP techniques. Each sample consists of a text sentence and a label from one of two categories: "0" for non-hateful and "1" for hateful.
- Additionally, this dataset can be used as a benchmark dataset for hate speech detection.
- The dataset is neutralized so that anyone can use it: it contains no entities or names that could affect, or cause cyber harm to, the users who generated the content. Researchers can use the pre-processed dataset in their projects, as it follows the policy guidelines.
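
As an illustration of the sample format described above (a text sentence plus a "0" or "1" label), the following is a minimal sketch of loading such samples and checking the class balance. The listing does not state the file layout, so the CSV format and the column names `Content` and `Label` are assumptions, as are the helper names `load_samples` and `class_balance`. The demo rows are made up and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Label meanings as stated in the dataset description.
LABEL_NAMES = {"0": "non-hateful", "1": "hateful"}


def load_samples(handle, text_col="Content", label_col="Label"):
    """Read (text, label) pairs from a CSV file handle.

    The column names are assumptions, not confirmed by the listing.
    Rows whose label is not "0" or "1" are skipped.
    """
    samples = []
    for row in csv.DictReader(handle):
        label = (row.get(label_col) or "").strip()
        if label in LABEL_NAMES:
            samples.append((row.get(text_col) or "", int(label)))
    return samples


def class_balance(samples):
    """Count how many samples fall into each class."""
    counts = Counter(label for _, label in samples)
    return {LABEL_NAMES[str(k)]: counts.get(k, 0) for k in (0, 1)}


# Tiny made-up stand-in for the real file, used only for illustration.
demo_csv = io.StringIO(
    "Content,Label\n"
    "have a nice day :),0\n"
    "placeholder hateful sentence,1\n"
    "see you l8r 😀,0\n"
    "row with a bad label,maybe\n"
)
demo_samples = load_samples(demo_csv)
print(class_balance(demo_samples))
```

With the real file, the in-memory `io.StringIO` stand-in would be replaced by a handle from `open(path, newline="", encoding="utf-8")`, and the column names adjusted to match the actual header. Checking the class balance first is a common step before training, since a skewed split affects how accuracy should be read.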

Hate Speech Detection curated Dataset🤬
114.09MB