悦 影

Spam Email Classification Dataset

Tags: computer science, classification, text, binary classification, email and messaging

File size: 133.43 MB

Data ID: D17220404589752050

Published: 2024/07/27

Data Description

Introduction

This is a CSV file containing 83,446 email records, each labelled as either spam or not spam (ham). It was formed by combining the 2007 TREC Public Spam Corpus and the Enron-Spam Dataset.

Columns

  1. label
    • '1' indicates that the email is classified as spam.
    • '0' denotes that the email is legitimate (ham).
  2. text
    • This column contains the actual content of the email messages.
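
As a rough illustration of the two-column layout described above, the following sketch reads (label, text) pairs with Python's standard csv module and counts each class. It assumes the file has a header row naming the columns label and text; the function name read_labelled_emails and the sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

def read_labelled_emails(handle):
    """Yield (label, text) pairs from a CSV with 'label' and 'text' columns."""
    # Email bodies can be long, so raise the csv module's default field size limit.
    csv.field_size_limit(10 * 1024 * 1024)
    for row in csv.DictReader(handle):
        label = row["label"].strip()
        if label not in ("0", "1"):
            raise ValueError(f"unexpected label: {label!r}")
        # 1 = spam, 0 = ham, as described in the Columns section.
        yield int(label), row["text"]

# Tiny in-memory stand-in for the real file (made-up rows, not from the dataset).
sample = io.StringIO(
    'label,text\n'
    '1,"win a free prize now"\n'
    '0,"meeting moved to 3pm, see agenda"\n'
    '1,"cheap meds online"\n'
)
rows = list(read_labelled_emails(sample))
counts = Counter(label for label, _ in rows)
print(counts[1], "spam,", counts[0], "ham")
```

To run this against the downloaded file, the StringIO object could be replaced with a handle opened with newline="" and a suitable encoding; the actual filename and encoding are not stated on this page.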

Sources

  1. 2007 TREC Public Spam Corpus
  2. Enron-Spam Dataset

Code for combining and processing the two datasets: https://github.com/PuruSinghvi/Spam-Email-Classifier/blob/main/Combining%20Datasets.ipynb

Spam Email Classifier

A spam email classifier has been trained and built using this dataset. It can be found here: https://github.com/PuruSinghvi/Spam-Email-Classifier
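
For readers who want a quick baseline before opening the repository, here is a minimal sketch of a multinomial naive Bayes classifier over (label, text) pairs, written with the standard library only. It is not necessarily the approach used in the linked repository; the helper names tokenize, train_naive_bayes, and predict are hypothetical, and the training rows are made up for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def train_naive_bayes(examples):
    """Count tokens per class; examples are (label, text), 1 = spam, 0 = ham."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for label, text in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return the more probable label (assumes both classes appear in training)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        # Log prior plus Laplace-smoothed log likelihood of each known token.
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up training rows in the dataset's (label, text) shape.
train = [
    (1, "win a free prize now"),
    (1, "free cash prize click now"),
    (0, "meeting agenda for monday"),
    (0, "please review the project report"),
]
model = train_naive_bayes(train)
print(predict(model, "claim your free prize"))
print(predict(model, "project meeting on monday"))
```

Tokens that never appeared in training are skipped rather than penalised, which keeps the sketch short; on the full dataset a held-out split would be needed to judge how well such a baseline performs.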
