🌸叶

NLP Data: Abbreviations and Emoticons

data cleaning, nlp, text pre-processing

7

Sold: 0
94.17MB

Data ID: D17222460220712989

Published: 2024/07/29

The following is the data verification report the seller chose to provide:

Data description

This dataset supports natural language processing (NLP) and text-mining tasks. It contains several files for text preprocessing and is aimed at researchers, data scientists, and developers working in NLP. The dataset comprises the following components:

  • Text_dataset.br (98.72 MB): A large text corpus for training and testing NLP models.
  • abbreviations.csv (84.47 kB): A list of common abbreviations and their expanded forms.
  • apostrophe.csv (3.89 kB): A collection of words and phrases with and without apostrophes.
  • emoticons.csv (7.27 kB): A list of emoticons commonly used in text, with their meanings.
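
As a rough illustration of how lookup files like these are typically applied during preprocessing, the sketch below expands abbreviations and replaces emoticons with their meanings. The column names (`abbreviation`, `expansion`, `emoticon`, `meaning`), the sample rows, and the helper names `load_mapping` and `normalize` are all assumptions made for this example; the listing does not document the actual CSV schema, so check the real headers before adapting it.

```python
import csv
import io
import re

# Hypothetical sample rows standing in for abbreviations.csv and emoticons.csv.
# The real files may use different column names and contents.
ABBREVIATIONS_CSV = "abbreviation,expansion\nbrb,be right back\nidk,I do not know\n"
EMOTICONS_CSV = "emoticon,meaning\n:),happy\n:(,sad\n"


def load_mapping(csv_text, key_col, value_col):
    """Read CSV text into a dict mapping key_col values to value_col values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_col]: row[value_col] for row in reader}


def normalize(text, abbreviations, emoticons):
    """Replace emoticons with their meanings, then expand abbreviations."""
    # Replace longer emoticons first so a short one cannot split a longer one.
    for emo in sorted(emoticons, key=len, reverse=True):
        text = text.replace(emo, " " + emoticons[emo] + " ")

    # Expand abbreviations on whole word tokens, ignoring case.
    def expand(match):
        word = match.group(0)
        return abbreviations.get(word.lower(), word)

    text = re.sub(r"[A-Za-z']+", expand, text)
    # Collapse the extra whitespace introduced by the replacements.
    return " ".join(text.split())


abbreviations = load_mapping(ABBREVIATIONS_CSV, "abbreviation", "expansion")
emoticons = load_mapping(EMOTICONS_CSV, "emoticon", "meaning")
print(normalize("idk :) brb", abbreviations, emoticons))
```

To run this against the actual files, the in-memory strings could be swapped for the contents of the downloaded CSVs, with the column names adjusted to match their headers.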

For more information, see the ContextLens repository.
