醒醒

Emails for spam or ham classification (Trec 2006)

classification, binary classification, email and messaging


Sold: 0
146.5MB

Data ID: D17222216182874338

Published: 2024/07/29

Data description

This dataset contains emails labeled for spam or ham classification, taken from the "2006 TREC Public Spam Corpora". It consists of three files:

  1. email_origin.csv: Original raw email with label. Columns:
     • label: Int type, 1 for spam and 0 for ham
     • origin: String type, original raw email
  2. email_text.csv: Processed email body with label. Columns:
     • label: Int type, 1 for spam and 0 for ham
     • text: String type, processed email body
  3. trec06p.tgz: Original compressed archive as downloaded from the source.
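As a minimal sketch of reading the two CSV files, the snippet below uses Python's standard csv module. The helper name load_labeled_emails and the in-memory sample rows are illustrative assumptions, not part of the dataset; only the column names label, text, and origin come from the description above.

```python
import csv
import io
from collections import Counter

# Raw emails can be very long, so raise the csv field size limit
# above its default before reading.
csv.field_size_limit(10**9)


def load_labeled_emails(file_obj, text_column="text"):
    """Return (labels, texts) from a CSV with a `label` column and a text column.

    Hypothetical helper: pass text_column="origin" for email_origin.csv.
    """
    labels, texts = [], []
    for row in csv.DictReader(file_obj):
        labels.append(int(row["label"]))  # 1 = spam, 0 = ham
        texts.append(row[text_column])
    return labels, texts


# Tiny in-memory stand-in for email_text.csv (made-up rows, not from the dataset).
sample = io.StringIO(
    'label,text\n'
    '1,"win a free prize now"\n'
    '0,"meeting moved to 3pm"\n'
)
labels, texts = load_labeled_emails(sample)
counts = Counter(labels)  # number of spam (1) and ham (0) rows
```

For the real files, the same helper could be called on a file opened with open("email_text.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption and may need adjusting.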

How I process the emails (from email_origin to email_text):

Email Processing
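The linked notebook defines the actual processing. Purely as an illustration of what turning a raw email into a body string can look like, here is a sketch using Python's standard email package. It assumes that "processed email body" roughly means the decoded text/plain parts; the function name extract_plain_body and the sample message are hypothetical, and the real pipeline may clean the text quite differently.

```python
from email import message_from_string


def extract_plain_body(raw_email):
    """Return the concatenated text/plain parts of a raw email string.

    Illustrative only: not the dataset author's processing code.
    """
    msg = message_from_string(raw_email)
    parts = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue  # skip multipart containers, HTML parts, attachments
        payload = part.get_payload(decode=True)  # bytes, transfer-decoded
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset name declared in the header: fall back to UTF-8.
            parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts).strip()


# Made-up raw message, standing in for one value of the `origin` column.
raw = (
    "From: a@example.com\n"
    "Subject: Hello\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "\n"
    "See you at noon.\n"
)
body = extract_plain_body(raw)
```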

More datasets for spam or ham classification:

Emails for spam or ham classification (Trec 2007)

Emails for spam or ham classification (Trec 2005)

Emails for spam or ham classification (Enron 2006)

Emails for spam or ham classification SpamAssassin

Source: https://plg.uwaterloo.ca/~gvcormac/treccorpus06/about.html

Verification report

The seller has chosen to provide a data verification report for this dataset; it is available on request.