燕舞春风🍎

Acquired Podcast Transcripts and RAG Evaluation

law

15

Sold: 0
13.34MB

Data ID: D17193986364554737

Published: 2024/06/26

The following is the data verification report the seller chose to provide:

Data description

This dataset contains 200 transcripts of the Acquired podcast, collected from the official website (https://www.acquired.fm/), with metadata specified in acquired_metadata.csv.

We also developed a QA dataset for RAG evaluation, acquired-qa-evaluation.csv, which contains the following columns:

  • question: The question posed for evaluation.
  • human_answer: The answer provided by a human.
  • ai_answer_without_the_transcript: The answer provided by an AI model without access to the transcript.
  • ai_answer_without_the_transcript_correctness: The factual accuracy of the AI answer without the transcript verified by a human.
  • ai_answer_with_the_transcript: The answer provided by an AI model with access to the transcript.
  • ai_answer_with_the_transcript_correctness: The factual accuracy of the AI answer with the transcript verified by a human.
  • quality_rating_for_answer_with_transcript: The quality of the AI answer rated by a human.
  • post_url: The URL of the podcast episode related to the question.
  • file_name: The name of the transcript file associated with the episode.
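
As a rough illustration of how these columns might be used, the sketch below tallies the labels in the two correctness columns so that the AI answers with and without the transcript can be compared. It assumes the file sits in the working directory and is UTF-8 encoded, and it makes no assumption about how correctness is encoded; it simply counts whatever labels appear. The helper names tally_correctness and load_rows are hypothetical and not part of the dataset.

```python
import csv
from collections import Counter

# Column names as listed in the dataset description.
CORRECTNESS_COLUMNS = (
    "ai_answer_without_the_transcript_correctness",
    "ai_answer_with_the_transcript_correctness",
)


def tally_correctness(rows):
    """Count each distinct label found in the two correctness columns."""
    tallies = {col: Counter() for col in CORRECTNESS_COLUMNS}
    for row in rows:
        for col in CORRECTNESS_COLUMNS:
            # Missing or empty cells are counted under the empty string.
            label = (row.get(col) or "").strip()
            tallies[col][label] += 1
    return tallies


def load_rows(path="acquired-qa-evaluation.csv"):
    """Read the QA file into a list of dicts (path and encoding are assumptions)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    for col, counts in tally_correctness(load_rows()).items():
        print(col, dict(counts))
```

Printing the raw label counts first is a cautious starting point: once the actual label values are known, they can be mapped to correct/incorrect to compute an accuracy for each setting.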

The project was created and designed by me with the help of the following people:

  • Rain Jiang: crawler development and data collection
  • Yihong Chen: data parsing, cleaning, and analysis

The following are students in my Introduction to Generative AI course (Spring 2024), who created the QA dataset:

  • Priya Amara
  • Saviour Adelwin Anyagri
  • Ezgi Basaranlar
  • Sara Baskaran
  • Nimet Batan Altiyaprak
  • Reed Bidgood
  • Daniel Coleman
  • James Dalton
  • Chaitanya Dhullipala
  • Yin Ding
  • Aksel Dirkzwager
  • Malek Elsayyid
  • John Fabricatore
  • J'Quoi George
  • Ed Gorman
  • Amanda Grosz
  • Donald Harris
  • Bryan Horsey
  • David Kam
  • Daria Klimkovskaia
  • Mathieu Lippens
  • Ruth McDuffie
  • Ashish Mishra
  • Achal Modi
  • Jayaprakash Moses
  • Naomi Nyarinda Okemwa
  • Silvia Atelo Okwach
  • Kardam Patel
  • Pramila Paudyal
  • Chris Pic
  • Rajesh Rao
  • Ronald Russian
  • Summer Shaheed
  • Rohan Swain
  • Shriya Tandon
  • Aniket Turaskar
  • Upendar Vanavasam
  • Andrea Young