The following is the data validation report that the seller chose to provide:
Data description
Context
Augmentation of initial dataset:
Content
For use in Jigsaw Rate Severity of Toxic Comments.
Example usage: ☣️ Jigsaw - Super Simple Naive Bayes by dataista0 (Julián Peller, data scientist at Toptal, Buenos Aires, Argentina), with data from:
@inproceedings{hateoffensive,
  title = {Automated Hate Speech Detection and the Problem of Offensive Language},
  author = {Davidson, Thomas and Warmsley, Dana and Macy, Michael and Weber, Ingmar},
  booktitle = {Proceedings of the 11th International AAAI Conference on Web and Social Media},
  series = {ICWSM '17},
  year = {2017},
  location = {Montreal, Canada},
  pages = {512-515}
}
The second dataset has an MIT license.
Acknowledgements
Thanks to dataista0 (Julián Peller) for posting the original dataset, and to Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber for Automated Hate Speech Detection and the Problem of Offensive Language.
Inspiration
Is inter-annotator agreement relevant for offensiveness detection? Can it be predicted? And can it be applied to other datasets?
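As a starting point for these questions, one simple way to quantify agreement per comment is the share of annotators who chose the most common label. The sketch below is only an illustration under assumptions: the function name `agreement_ratio`, the label strings, and the example votes are all hypothetical and are not taken from this dataset or its column layout.

```python
from collections import Counter


def agreement_ratio(votes):
    """Return the fraction of annotators who chose the most common label."""
    if not votes:
        raise ValueError("votes must not be empty")
    counts = Counter(votes)
    # most_common(1) yields [(label, count)] for the majority label.
    return counts.most_common(1)[0][1] / len(votes)


# Hypothetical annotations (not real data): three annotators per comment.
example_votes = {
    "comment_a": ["offensive", "offensive", "offensive"],
    "comment_b": ["offensive", "neither", "offensive"],
    "comment_c": ["hate", "offensive", "neither"],
}

# Per-comment agreement, usable as a regression target or a sample weight.
agreement = {key: agreement_ratio(v) for key, v in example_votes.items()}
```

A score like this could then serve as a target to predict from the comment text, or as a weight when training a severity model. Chance-corrected measures such as Cohen's or Fleiss' kappa may be a better fit when label frequencies are very unbalanced.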
