The following is the data validation report that the seller chose to provide:
Data description
About
This dataset contains 10k German speeches from the Bundestag that were web-scraped, translated into English with OPUS-MT, and summarized using BART. The summaries are in German.
I hope to release many more speech datasets with additional extracted NLP features, such as named entities (NEs), sentiment, and more.
If you're interested in more data like this, feel free to contact me or leave a post in the discussion tab!
Data
All speeches are from the ongoing 20th legislative period, so they were held between October 2021 and 2022. If you have further questions regarding the data, feel free to ask!
Source
These speeches were web-scraped from the official website of the German Bundestag and then processed. They passed through several NLP pipelines, two of which translate and summarize each speech.
Use-Cases
- Use as a starting point for a political analysis tool, much like my project, Bundestag-Mine.
- Use for any kind of text processing or model training, such as masked language modeling (MLM).
- Translate the German summaries to English and use them as targets to train better models.
- Analyse the political content of the speeches.
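
As a rough sketch of the summary-as-target idea above, the snippet below pairs each speech with its summary and skips rows where either is missing. The column names `speech` and `summary` are assumptions made for illustration, not the dataset's documented schema, and the sample rows are placeholders rather than real data, so check the actual file header before reusing this.

```python
import csv
import io

# Hypothetical column names; check the actual file header before use.
SPEECH_COL = "speech"
SUMMARY_COL = "summary"


def build_pairs(csv_file, speech_col=SPEECH_COL, summary_col=SUMMARY_COL):
    """Return (speech, summary) pairs, skipping rows where either field is empty."""
    pairs = []
    for row in csv.DictReader(csv_file):
        speech = (row.get(speech_col) or "").strip()
        summary = (row.get(summary_col) or "").strip()
        if speech and summary:
            pairs.append((speech, summary))
    return pairs


# Tiny in-memory stand-in for the real file (placeholder text, not real data).
sample = io.StringIO(
    "speech,summary\n"
    "First placeholder speech text.,First placeholder summary.\n"
    "Second placeholder speech text.,\n"
)
pairs = build_pairs(sample)
print(len(pairs))  # the second row has no summary, so only one pair remains
```

To run this on the real data, pass an opened file instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever you saved the download.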
