The following is the data validation report the seller chose to provide:
Data description
Dataset for the 'LLM Prompt Recovery' competition. Fields: 'original text', 'rewrite_prompts', 'rewritten_text', 'id'. I used the following data to create this dataset:
- llmpr-public-10k-unique
- gpt3-condenced-prompt-style
- around 2.5k samples generated with Gemma, based on @aatiffraz's notebook
I simply concatenated all the data and dropped duplicates by 'rewrite_prompt'. I also cleaned the data by deleting "Sure, here's" or "Sure, here is" from the beginning, using a simple regex like r"Sure, .*:" and then r"Here, .*:", and removed \n, _ and * from the start of each message.
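
The processing described above could look roughly like the following. This is a minimal sketch, not the author's actual code: the function names `clean_text` and `drop_duplicate_prompts` are hypothetical, the two patterns are the ones quoted in the description but anchoring them to the start of the string is an assumption, and the plain-Python deduplication (keeping the first occurrence) stands in for whatever tool was really used.

```python
import re

# Patterns as quoted in the description. Anchoring with "^" is an assumption,
# based on the statement that the preamble is removed "from beginning".
PREFIX_PATTERNS = [re.compile(r"^Sure, .*:"), re.compile(r"^Here, .*:")]


def clean_text(text: str) -> str:
    """Remove a leading 'Sure, ...:' / 'Here, ...:' preamble, then leading \\n, _ and *."""
    for pattern in PREFIX_PATTERNS:
        # "." does not match a newline, so only the first line can be affected.
        # ".*" is greedy, so the match runs to the LAST colon on that line.
        text = pattern.sub("", text, count=1)
    # Strip only the three characters named in the description.
    return text.lstrip("\n_*")


def drop_duplicate_prompts(rows, key="rewrite_prompt"):
    """Keep the first row for each distinct prompt (keep-first is an assumption)."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


if __name__ == "__main__":
    # Made-up rows for illustration only; they are not taken from the dataset.
    rows = [
        {"id": 1, "rewrite_prompt": "Make it formal", "rewritten_text": "Sure, here is the text:\n\n**Dear team**"},
        {"id": 2, "rewrite_prompt": "Make it formal", "rewritten_text": "Another rewrite"},
        {"id": 3, "rewrite_prompt": "Make it a poem", "rewritten_text": "Here, take this:\n_Roses"},
    ]
    for row in drop_duplicate_prompts(rows):
        print(row["id"], repr(clean_text(row["rewritten_text"])))
```

Because the quoted patterns are greedy, a first line that contains more than one colon loses everything up to the last one, so a non-greedy `.*?` may be worth considering when reproducing the cleaning.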

12.5k unique prompts llm rewrite comp dataset
32.04MB