Data Description
This dataset is a comprehensive collection of email data, categorized as ‘spam’ or ‘ham’ (non-spam) and originally obtained from AUEB's official website. There are six versions of this dataset, all of them in the form of separate text files. I iterated over every version and cleaned it for easier use in training a model. See my model for the further data cleaning steps.
The dataset contains the following features:
- Label: This is the target variable, indicating whether an email is ‘spam’ or ‘ham’.
- Label_num: This is a numerical representation of the target variable, where ‘spam’ is represented as 1 and ‘ham’ is represented as 0.
- Text: This is the content of the email.
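As a quick sanity check on the three features above, the sketch below reads rows from a CSV source and flags any row where Label_num disagrees with Label. It is only an illustration under assumptions: the CSV layout, the exact column spellings (`Label`, `Label_num`, `Text`), the helper names, and the three sample rows are all made up for the example and are not taken from the actual file.

```python
import csv
import io

# Mapping stated in the feature list: 'spam' is 1, 'ham' is 0.
LABEL_TO_NUM = {"ham": 0, "spam": 1}


def load_rows(handle):
    """Read all rows from a CSV file object into a list of dicts."""
    return list(csv.DictReader(handle))


def find_label_mismatches(rows):
    """Return the indices of rows whose Label_num does not match Label."""
    bad = []
    for i, row in enumerate(rows):
        expected = LABEL_TO_NUM.get(row["Label"].strip().lower())
        # Unknown labels and wrong numeric codes are both reported.
        if expected is None or str(expected) != row["Label_num"].strip():
            bad.append(i)
    return bad


# Hypothetical sample standing in for the real file; the third row is
# deliberately inconsistent so the check has something to find.
SAMPLE_CSV = """Label,Label_num,Text
ham,0,"Meeting moved to 3pm, see agenda attached."
spam,1,"You have won a prize, click here now!"
spam,0,"Cheap meds, limited offer."
"""

sample_rows = load_rows(io.StringIO(SAMPLE_CSV))
mismatches = find_label_mismatches(sample_rows)
print(mismatches)
```

To run the same check on the downloaded data, pass an opened file to `load_rows` instead of the in-memory sample, adjusting the column names if the file spells them differently.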
Validation Report
The following is the data validation report that the seller has chosen to provide:

Spam Email Classification Dataset
31.58MB