LLM: 7 prompt training dataset

education, law

￥3

41.39MB

Data ID: D17222405484960848

Published: 2024/07/29

Version 4: Adds the data from "LLM-generated essay using PaLM from Google Gen-AI", kindly generated by Kingki19 / Muhammad Rizqi.
File: train_essays_RDizzl3_seven_v2.csv
Human texts: 14247
LLM texts: 3004
**See also:** a new dataset of 4900 additional LLM-generated texts: **[LLM: Mistral-7B Instruct texts](https://www.kaggle.com/datasets/carlmcbrideellis/llm-mistral-7b-instruct-texts)**
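
The human/LLM counts quoted for Version 4 can be checked against the file itself. The following is a minimal sketch, not the author's method. It assumes the CSV has a header row and a column named `label` that separates human from LLM texts; the column name and the values it holds are assumptions, since the description does not state them, so adjust `label_column` to match the actual header.

```python
import csv
from collections import Counter


def count_labels(csv_path, label_column="label"):
    """Count rows per distinct value of `label_column` in a CSV file.

    `label_column` is a hypothetical column name; change it to whatever
    the file's header actually uses.
    """
    # newline="" lets the csv module handle line breaks inside quoted essays.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Values are read as strings, so the keys are e.g. "0" and "1".
        return Counter(row[label_column] for row in reader)
```

Calling `count_labels("train_essays_RDizzl3_seven_v2.csv")` on a local copy of the file returns a `Counter` keyed by the raw label strings, which can then be compared with the 14247 human and 3004 LLM texts stated for Version 4.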
Version 3: "The RDizzl3 Seven"
File: train_essays_RDizzl3_seven_v1.csv
"Car-free cities"
"Does the electoral college work?"
"Exploring Venus"
"The Face on Mars"
"Facial action coding system"
"A Cowboy Who Rode the Waves"
"Driverless cars"

How this dataset was made: see the notebook "LLM: Make 7 prompt train dataset"

Version 2: (train_essays_7_prompts_v2.csv) This dataset is composed of 13,712 human texts and 1,638 LLM-generated texts originating from 7 of the PERSUADE 2.0 corpus prompts.

Namely:

"Car-free cities"
"Does the electoral college work?"
"Exploring Venus"
"The Face on Mars"
"Facial action coding system"
"Seeking multiple opinions"
"Phones and driving"

This dataset is a derivative of the datasets

as well as the original competition training dataset

Version 1: This dataset is composed of 13,712 human texts and 1,165 LLM-generated texts originating from 7 of the PERSUADE 2.0 corpus prompts.
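
All versions are heavily weighted toward human texts, which matters when training a classifier on them. As a rough illustration, the sketch below computes the LLM share of each version from the counts stated in the version notes above. No counts are given for Version 3, so it is omitted. The names `VERSION_COUNTS` and `llm_share` are introduced here for illustration only.

```python
# (human texts, LLM texts) as stated in the version notes above.
VERSION_COUNTS = {
    "Version 1": (13712, 1165),
    "Version 2": (13712, 1638),
    "Version 4": (14247, 3004),
}


def llm_share(human, llm):
    """Return the fraction of all texts that are LLM-generated."""
    return llm / (human + llm)


def summarize(counts=VERSION_COUNTS):
    """Return one formatted line per version with totals and LLM share."""
    lines = []
    for version, (human, llm) in counts.items():
        share = llm_share(human, llm)
        lines.append(f"{version}: {human + llm} texts, {share:.1%} LLM-generated")
    return lines
```

Printing the result of `summarize()` shows the LLM share rising from roughly 7.8% in Version 1 to roughly 10.7% in Version 2 and roughly 17.4% in Version 4.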

Verification report

The following is the data verification report that the seller chose to provide: