
Pre-processed Spanish-lang suicide tendency texts

mental health, data cleaning, nlp, feature engineering, text

￥4

33.08MB

Dataset ID: D17220519028594615

Published: 2024/07/27

This dataset is intended for analyzing suicidal tendencies in text. The original dataset is in English and consists of messages extracted from social networks such as Twitter and Reddit. The dataset was cleaned by removing special characters, double spaces, and stopwords, and was normalized with lemmatization.
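The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the publisher's actual pipeline: the function name `clean_text`, the tiny `STOPWORDS` set, and the toy `LEMMAS` lookup table are all assumptions made for this example. A real pipeline would use a full Spanish stopword list and a proper lemmatizer.

```python
import re

# Hypothetical, deliberately tiny Spanish stopword list (assumption for this sketch).
STOPWORDS = {"de", "la", "el", "que", "y", "en", "a", "no", "me", "mi",
             "un", "una", "los", "las"}

# Hypothetical toy lemma table standing in for a real lemmatizer (assumption).
LEMMAS = {"siento": "sentir", "cansada": "cansado", "días": "día"}


def clean_text(text: str) -> str:
    """Lowercase, strip special characters, collapse spaces,
    drop stopwords, and map tokens to lemmas where known."""
    text = text.lower()
    # Replace everything except Spanish letters and whitespace with a space.
    text = re.sub(r"[^a-záéíóúüñ\s]", " ", text)
    # Collapse double (or longer) runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove stopwords.
    tokens = [t for t in text.split() if t not in STOPWORDS]
    # Normalize with the toy lemma lookup; unknown tokens pass through unchanged.
    return " ".join(LEMMAS.get(t, t) for t in tokens)


print(clean_text("¡Me siento   muy cansada todos los días!!"))
```

The steps run in the same order as the description lists them, so stopword matching and lemma lookup both operate on lowercase text that is already free of punctuation.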

Content: The dataset is a collection of posts from the "SuicideWatch" and "depression" subreddits of the Reddit platform. The posts were collected using the Pushshift API. All posts made to "SuicideWatch" from Dec 16, 2008 (its creation) to Jan 2, 2021 were collected, while "depression" posts were collected from Jan 1, 2009 to Jan 2, 2021. All posts collected from SuicideWatch are labeled as suicide, while posts collected from the depression subreddit are labeled as depression. Non-suicide posts were collected from r/teenagers.
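The labeling rule above amounts to a mapping from source subreddit to class label. A minimal sketch, assuming the label strings "suicide", "depression", and "non-suicide" (the name `label_for` and the exact strings are illustrative and are not taken from the dataset files):

```python
from typing import Optional

# Source subreddit -> label, following the description above.
# The exact label strings are an assumption for this sketch.
SUBREDDIT_LABELS = {
    "suicidewatch": "suicide",
    "depression": "depression",
    "teenagers": "non-suicide",
}


def label_for(subreddit: str) -> Optional[str]:
    """Return the class label for a post based on its subreddit,
    or None if the subreddit is not one of the three sources."""
    name = subreddit.strip().lower()
    # Accept both "teenagers" and "r/teenagers".
    if name.startswith("r/"):
        name = name[2:]
    return SUBREDDIT_LABELS.get(name)
```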

Original version of the dataset: https://www.kaggle.com/datasets/nikhileswarkomati/suicide-watch
