Lists of Words in 30 European Languages

languages, europe, social science, linguistics

2

Sold: 0
Size: 17.27MB

Data ID: D17222397838605952

Published: 2024/07/29

Data description

Context

Most of the NLP material on Kaggle deals with the analysis of English. With these collections of words from other spoken languages, you can tackle the same problems and encounter new, language-specific ones.

Content

This collection contains word lists in the following languages:

'Albanian', 'Belarusian', 'Bosnian', 'Bulgarian', 'Croatian', 'Czech', 'Danish', 'Dutch', 'English', 'Estonian', 'French', 'German', 'Greek', 'Hungarian', 'Icelandic', 'Italian', 'Latvian', 'Lithuanian', 'Norwegian (Bokmål and Nynorsk)', 'Polish', 'Portuguese', 'Romanian', 'Russian', 'Serbian', 'Slovak', 'Slovenian', 'Spanish', 'Swedish', 'Turkish', 'Ukrainian'.

A separate file, languages, indicates the encoding of each file. I had no problems reading the files in Python. In R, if base::read.csv fails for some encoding, readr::read_csv works.
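As a rough illustration of reading one of these lists in Python, here is a minimal sketch. It assumes each list is a plain text file with one word per line and that you have already looked up the file's encoding in the languages file. The helper name read_words and the file name in the usage note are hypothetical; they are not taken from the dataset.

```python
from pathlib import Path


def read_words(path, encoding):
    """Return the non-empty lines of a word list, decoded with `encoding`.

    Assumption: one word per line. The encoding is whatever the separate
    `languages` file lists for this word list.
    """
    text = Path(path).read_text(encoding=encoding)
    # Strip surrounding whitespace and skip blank lines.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

A call might look like read_words("polish.txt", "iso-8859-2"), where both the file name and the encoding are placeholders to be replaced with the values listed in the languages file. Passing the encoding explicitly avoids relying on the platform default, which is the usual cause of decoding errors with non-UTF-8 files.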

Acknowledgements

These collections are based on https://github.com/LibreOffice/dictionaries

Inspiration

Any form of contact with a language we are learning brings us closer to our goal. Working with a language we already know helps us understand it better. Have fun!
