LLM: 7 prompt training dataset

education, law

3

Sold: 0
41.39MB

Data ID: D17222405484960848

Published: 2024/07/29

The following is the data verification report the seller chose to provide:

Data description

  • Version 4: Adds the data from "LLM-generated essay using PaLM from Google Gen-AI", kindly generated by Kingki19 / Muhammad Rizqi. File: train_essays_RDizzl3_seven_v2.csv. Human texts: 14,247. LLM texts: 3,004.

    **See also:** a new dataset of an additional 4,900 LLM-generated texts: **[LLM: Mistral-7B Instruct texts](https://www.kaggle.com/datasets/carlmcbrideellis/llm-mistral-7b-instruct-texts)**
  • Version 3: "The RDizzl3 Seven". File: train_essays_RDizzl3_seven_v1.csv. The seven prompts are:

  • "Car-free cities"
  • "Does the electoral college work?"
  • "Exploring Venus"
  • "The Face on Mars"
  • "Facial action coding system"
  • "A Cowboy Who Rode the Waves"
  • "Driverless cars"

How this dataset was made: see the notebook "LLM: Make 7 prompt train dataset"

  • Version 2: File: train_essays_7_prompts_v2.csv. This dataset is composed of 13,712 human texts and 1,638 LLM-generated texts originating from 7 of the PERSUADE 2.0 corpus prompts.

Namely:

  • "Car-free cities"
  • "Does the electoral college work?"
  • "Exploring Venus"
  • "The Face on Mars"
  • "Facial action coding system"
  • "Seeking multiple opinions"
  • "Phones and driving"

This dataset is a derivative of the datasets

as well as the original competition training dataset

  • Version 1: This dataset is composed of 13,712 human texts and 1,165 LLM-generated texts originating from 7 of the PERSUADE 2.0 corpus prompts.
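
The human/LLM counts quoted for each version can be checked with a few lines of Python. The following is only a minimal sketch: the column names `text` and `label`, and the encoding of human texts as `0` and LLM texts as `1`, are assumptions not stated in this description, and `count_labels` is a hypothetical helper. A tiny in-memory sample stands in for the real CSV so the snippet runs as-is.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="label"):
    """Count rows per value of `label_column` in an open CSV file.

    `label_column` defaults to "label", which is an assumed column name.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# In-memory stand-in for one of the CSV files (assumed layout: text,label).
sample = io.StringIO(
    "text,label\n"
    '"An essay written by a student.",0\n'
    '"Another student essay.",0\n'
    '"An essay generated by a model.",1\n'
)

counts = count_labels(sample)
print("human:", counts["0"], "LLM:", counts["1"])
```

To run this against a downloaded file, one would pass `open("train_essays_RDizzl3_seven_v2.csv", newline="", encoding="utf-8")` instead of the sample, after adjusting the column name if the file uses a different one.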