DAIGT V2 Train Dataset

lawtext

￥2

97.19MB

Data ID: D17220387097959980

Published: 2024/07/27

Please use version 2 (there were some issues with v1 that I fixed)!

New release of the DAIGT train dataset! Improvements:

new models: Cohere Command, Google PaLM, GPT4 (from Radek!)
new prompts, including source texts from the original essays!
mapping of essay text to the original prompt from the Persuade corpus
filtering by the famous "RDizzl3_seven"
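
As a rough illustration of the filtering item, here is a minimal sketch that keeps only flagged rows. It assumes the data ships as a CSV with a boolean-like column named `RDizzl3_seven`. The listing does not state the file layout, so the column names and the in-memory sample below are hypothetical stand-ins.

```python
import csv
import io


def keep_flagged(rows, flag_column="RDizzl3_seven"):
    """Return only the rows whose flag column reads as true ('True' or '1')."""
    truthy = {"true", "1"}
    return [
        row for row in rows
        if str(row.get(flag_column, "")).strip().lower() in truthy
    ]


# Hypothetical sample standing in for the real file; column names are assumed.
sample = io.StringIO(
    "text,label,RDizzl3_seven\n"
    "essay one,0,True\n"
    "essay two,1,False\n"
    "essay three,1,True\n"
)
rows = list(csv.DictReader(sample))
kept = keep_flagged(rows)
print(len(kept), "of", len(rows), "rows kept")
```

To try this on the downloaded data, replace the `io.StringIO` sample with an opened file handle and adjust `flag_column` if the actual header differs.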

persuade_corpus                       25996
chat_gpt_moth                          2421
llama2_chat                            2421
mistral7binstruct_v2                   2421
mistral7binstruct_v1                   2421
original_moth                          2421
train_essays                           1378
llama_70b_v1                           1172
falcon_180b_v1                         1055
darragh_claude_v7                      1000
darragh_claude_v6                      1000
radek_500                               500
NousResearch/Llama-2-7b-chat-hf         400
mistralai/Mistral-7B-Instruct-v0.1      400
cohere-command                          350
palm-text-bison1                        349
radekgpt4                               200
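
The per-source counts above could be recomputed with a sketch like the following. It assumes a CSV with a `source` column holding the names shown in the counts. That column name and the tiny in-memory sample are assumptions for illustration, not details confirmed by the listing.

```python
import csv
import io
from collections import Counter


def count_by_source(rows, source_column="source"):
    """Tally how many rows come from each source, most frequent first."""
    counts = Counter(row[source_column] for row in rows)
    return counts.most_common()


# Hypothetical sample; the real file would be opened with open(path, newline="").
sample = io.StringIO(
    "text,source\n"
    "essay a,persuade_corpus\n"
    "essay b,persuade_corpus\n"
    "essay c,radekgpt4\n"
)
counts = count_by_source(csv.DictReader(sample))
for name, n in counts:
    # Left-align the source name and right-align the count, as in the listing.
    print(f"{name:<36}{n:>7}")
```

Sorting with `most_common()` mirrors the descending order used in the listing.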

Sources (please upvote the original datasets!):

Text generated with ChatGPT by MOTH (https://www.kaggle.com/datasets/alejopaullier/daigt-external-dataset)
Persuade corpus contributed by Nicholas Broad (https://www.kaggle.com/datasets/nbroad/persaude-corpus-2/)
Text generated with Llama-70b and Falcon180b by Nicholas Broad (https://www.kaggle.com/datasets/nbroad/daigt-data-llama-70b-and-falcon180b)
Text generated with ChatGPT and GPT4 by Radek (https://www.kaggle.com/datasets/radek1/llm-generated-essays)
2000 Claude essays generated by @darraghdog (https://www.kaggle.com/datasets/darraghdog/hello-claude-1000-essays-from-anthropic)
LLM-generated essay using PaLM from Google Gen-AI by @kingki19 (https://www.kaggle.com/datasets/kingki19/llm-generated-essay-using-palm-from-google-gen-ai)
Official train essays
Essays I generated with various LLMs

License: MIT for the data I generated. Check source datasets for the other sources mentioned above.
