Ancient Chinese Text (wenyanwen)

business, nlp, text mining, text

1.54GB

Data ID: D17219641983619827

Published: 2024/07/26


Data description

Context

Classical Chinese (文言文) and ancient poetry (古诗词) are probably the most reliable primary historical sources about China.

They tell stories of ancient kings, of legends of gods, of bravery in struggle, of uncelebrated love, of how the stars looked when dynasties toppled, and of how people whistled, farmed, entertained themselves, and solved math puzzles with algebra thousands of years ago.

They hold different philosophies: some revere order and courtesy, some excel in the deceptions of war, and others believe in balance and nature.

They gave birth to a language that has branched into thousands of living dialects and is still spoken and written by more than a billion people on this planet.

Content

The data comes from the March 2020 Wikisource data dump. It is in CSV format with 4 columns:

  • id: The ID from the data dump
  • url: The original Wikisource file
  • title: The title of the article or poem
  • text: The textual data in Chinese
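
The four-column layout above could be read along these lines. This is only a minimal sketch using Python's standard csv module: the function name read_records and the sample row are illustrative assumptions, not part of the dataset, and the real file name and encoding are not stated on this page.

```python
import csv
import io

# The four columns described in the listing above.
EXPECTED_COLUMNS = ["id", "url", "title", "text"]


def read_records(handle):
    """Yield one dict per CSV row, checking that the four columns are present.

    `handle` is any open text file object. This helper is hypothetical and
    is not shipped with the dataset.
    """
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        yield {c: row[c] for c in EXPECTED_COLUMNS}


# Tiny in-memory stand-in for the real file (a made-up row, not dataset content).
sample = io.StringIO(
    "id,url,title,text\n"
    '1,https://example.org/a,靜夜思,"床前明月光,疑是地上霜。"\n'
)
records = list(read_records(sample))
print(records[0]["title"], len(records[0]["text"]))
```

For the real file, one would presumably pass something like open(path, newline="", encoding="utf-8") instead of the in-memory sample, where UTF-8 is an assumption. Because the text column may hold very long documents, the parser's per-field limit may need to be raised with csv.field_size_limit before reading.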

Acknowledgements

This dataset was parsed from Wikisource's data dump. Thanks to all the contributors who edited these texts to be as faithful to the originals as possible.

Inspiration

  • What are the relationships between words and names?
  • Could a generative model be built for such material?
  • Is there a better way to search these texts for events, figures, and stories?
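
As a starting point for the search question, a plain substring scan is one possible baseline, since Classical Chinese is written without spaces between words. The sketch below is only an illustration under that assumption: the search function, its context parameter, and the sample records are hypothetical, and the records are assumed to be dicts with the title and text columns described above.

```python
def search(records, keyword, context=10):
    """Return (title, snippet) pairs for records mentioning `keyword`.

    A record matches if the keyword occurs in its text or its title.
    `context` is the number of characters kept on each side of a text hit.
    """
    hits = []
    for rec in records:
        text = rec["text"]
        pos = text.find(keyword)
        if pos == -1 and keyword not in rec["title"]:
            continue  # keyword appears in neither field
        if pos == -1:
            # Title-only match: show the opening of the text instead.
            snippet = text[: 2 * context]
        else:
            start = max(0, pos - context)
            snippet = text[start : pos + len(keyword) + context]
        hits.append((rec["title"], snippet))
    return hits


# Made-up records standing in for rows of the dataset.
sample_records = [
    {"title": "靜夜思", "text": "床前明月光,疑是地上霜。"},
    {"title": "春曉", "text": "春眠不覺曉,處處聞啼鳥。"},
]
results = search(sample_records, "明月")
print(results)
```

A scan like this is linear in the size of the corpus, so for repeated queries over the full dump an index (for example over character n-grams) would likely be a better fit.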