Complete Works of Rabindranath Tagore

Tags: languages, arts and entertainment, literature, nlp, text

42.79MB

Data ID: D17220793658509583

Published: 2024/07/27

The following is the data validation report the seller chose to provide:

Data description

Context

Rabindranath Thakur (born May 7, 1861, Calcutta [now Kolkata], India; died August 7, 1941, Calcutta) was a Bengali poet, short-story writer, song composer, playwright, essayist, and painter who introduced new prose and verse forms and the use of colloquial language into Bengali literature, thereby freeing it from traditional models based on classical Sanskrit. He was highly influential in introducing Indian culture to the West and vice versa, and he is generally regarded as the outstanding creative artist of early 20th-century India. In 1913 he became the first non-European to receive the Nobel Prize for Literature. [Source: https://www.britannica.com/biography/Rabindranath-Tagore]

Content

This dataset includes 8 CSV files under the /csv directory and 7 TXT files under the /txt directory, together containing the complete works of Rabindranath Tagore. Each CSV file holds all the literary works of one genre, except all_collection.csv, which combines the works from all genres. Each TXT file holds all the literary works of one genre aggregated into a single text. The content in both formats has been passed through a basic preprocessing step that removes empty spaces, in-page titles, and page numbers. The TXT files are suited to training sequential models, whereas the CSV files are more useful for literary analyses and comparative studies.

Files:

  1. "all_collection.csv": Combination of all other files. This contains 3438 individual items from all genres.
  2. "drama.csv": Contains all the works in the "Drama" genre.
  3. "essay.csv": Contains all the works in the "Essay" genre.
  4. "misc.csv": Contains all the miscellaneous works.
  5. "novel.csv": Contains all the works in the "Novel" genre.
  6. "poem.csv": Contains all the works in the "Poetry" genre.
  7. "song.csv": Contains all the works in the "Song" genre.
  8. "story.csv": Contains all the works in the "Story" genre.
  9. "drama.txt": Contains aggregated works in the "Drama" genre.
  10. "essay.txt": Contains aggregated works in the "Essay" genre.
  11. "misc.txt": Contains aggregated miscellaneous works.
  12. "novel.txt": Contains aggregated works in the "Novel" genre.
  13. "poem.txt": Contains aggregated works in the "Poetry" genre.
  14. "song.txt": Contains aggregated works in the "Song" genre.
  15. "story.txt": Contains aggregated works in the "Story" genre.
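
As a rough illustration of how the layout above might be used, here is a minimal Python sketch that counts the items in each per-genre CSV file and reads one aggregated TXT file. It rests on several assumptions that the description does not confirm: that the files are UTF-8 encoded, that each CSV file has a header row followed by one row per work, and that the csv and txt directories sit under a common root folder. The function names (count_items, genre_counts, read_genre_text) are hypothetical helpers made up for this example.

```python
import csv
from pathlib import Path

# File stems taken from the file list above (all_collection.csv is the combined file).
GENRES = ["drama", "essay", "misc", "novel", "poem", "song", "story"]

# Whole novels may sit in a single CSV field, so raise the default field size limit.
csv.field_size_limit(2**31 - 1)


def count_items(csv_path, has_header=True):
    """Count the data rows in one CSV file.

    Assumption: one row per literary work, with an optional header row.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the header row if the file is not empty
        return sum(1 for _ in reader)


def genre_counts(root):
    """Return {genre: item count} for every per-genre CSV found under root/csv."""
    counts = {}
    for genre in GENRES:
        path = Path(root) / "csv" / f"{genre}.csv"
        if path.exists():
            counts[genre] = count_items(path)
    return counts


def read_genre_text(root, genre):
    """Return the aggregated text of one genre from root/txt (assumed UTF-8)."""
    return (Path(root) / "txt" / f"{genre}.txt").read_text(encoding="utf-8")
```

Calling genre_counts on the folder that contains the csv and txt directories would give a per-genre breakdown. If the assumptions hold, the counts would be expected to add up to the 3438 items reported for all_collection.csv, which makes for a quick sanity check after download.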

Acknowledgements

I scraped the entire body of works of Rabindranath Tagore from the amazing source published by the "Department of Information Technology & Electronics, Government of West Bengal, India": https://rabindra-rachanabali.nltr.org. All credit goes to them for their meticulous effort in preparing this quality collection of works and publishing it online for others to benefit from.

Inspiration

I wanted to generate text in the style of Rabindranath Tagore, a great luminary of Bengali literature, and to conduct statistical analysis of the themes found in his works. While working on that project, I could not find a comprehensive dataset that provided granular access to individual items in every genre, e.g. poem, story, essay, novel, or drama, so I compiled this dataset to do just that. I hope it will inspire machine learning enthusiasts and literary experts alike to delve deep into the bottomless ocean of Bengali literature and explore the hidden gems unlocked by advances in machine learning techniques.

Banner image source: https://upload.wikimedia.org/wikipedia/commons/d/d1/Rabindranath_Tagore.jpg
