
ClidSum Cross-Lingual Dialogue Summarization Benchmark Dataset

Cross-lingual dialogue · Text generation · Qianyan dataset · Pattern recognition

0.5

Sold: 1
140.86MB

Data ID: D17313835018761538

Published: 2024/11/12

Data description

ClidSum Cross-Lingual Dialogue Summarization Benchmark Dataset

Author: Jiaan Wang (王佳安)

Dataset introduction

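The ClidSum benchmark dataset has two parts, XSAMSum and XMediaSum, built by adding annotations to the SAMSum and MediaSum dialogue summarization datasets, respectively. ClidSum contains more than 56,000 English dialogue documents, each annotated with a corresponding Chinese summary and German summary.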

Data preview

[{"dialogue":"Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye","summary":"Hannah needs Betty's number but Amanda doesn't have it. She needs to contact Larry.","summary_de":"hannah braucht bettys nummer, aber amanda hat sie nicht. sie muss larry kontaktieren.","summary_zh":"汉娜需要贝蒂的电话号码,但阿曼达没有。她得联系拉里。"},{"dialogue":"Eric: MACHINE!
Rob: That's so gr8!
Eric: I know! And shows how Americans see Russian ;)
Rob: And it's really funny!
Eric: I know! I especially like the train part!
Rob: Hahaha! No one talks to the machine like that!
Eric: Is this his only stand-up?
Rob: Idk. I'll check.
Eric: Sure.
Rob: Turns out no! There are some of his stand-ups on youtube.
Eric: Gr8! I'll watch them now!
Rob: Me too!
Eric: MACHINE!
Rob: MACHINE!
Eric: TTYL?
Rob: Sure :)","summary":"Eric and Rob are going to watch a stand-up on youtube.","summary_de":"eric und rob werden sich ein stand-up auf youtube ansehen.","summary_zh":"埃里克和罗伯要在youtube上看一场单口相声。"}]
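The preview suggests that each record is a JSON object with four string fields: `dialogue`, `summary`, `summary_de`, and `summary_zh`. The following is a minimal Python sketch of reading records in that shape, assuming the downloaded files follow the same schema as the preview. The helper names (`load_records`, `split_turns`, `summaries_by_language`) and the shortened `SAMPLE` string are hypothetical and not part of the dataset. Because the preview shows raw line breaks inside the `dialogue` string, which strict JSON parsers reject, the sketch passes `strict=False` to `json.loads`.

```python
import json

# Hypothetical, shortened sample in the same shape as the preview above.
# The real files may be named and laid out differently.
SAMPLE = '''[{"dialogue":"Hannah: Hey, do you have Betty's number?
Amanda: Lemme check","summary":"Hannah needs Betty's number but Amanda doesn't have it. She needs to contact Larry.","summary_de":"hannah braucht bettys nummer, aber amanda hat sie nicht. sie muss larry kontaktieren.","summary_zh":"汉娜需要贝蒂的电话号码,但阿曼达没有。她得联系拉里。"}]'''


def load_records(text):
    """Parse a JSON array of records.

    strict=False lets the parser accept raw line breaks inside string
    values, as they appear in the preview's "dialogue" field.
    """
    return json.loads(text, strict=False)


def split_turns(dialogue):
    """Split a dialogue into (speaker, utterance) pairs.

    Assumes one turn per line in the form "Speaker: utterance";
    lines without ": " are skipped.
    """
    turns = []
    for line in dialogue.splitlines():
        speaker, sep, utterance = line.partition(": ")
        if sep:
            turns.append((speaker.strip(), utterance.strip()))
    return turns


def summaries_by_language(record):
    """Collect the three summaries of one record under language codes."""
    return {
        "en": record["summary"],
        "de": record["summary_de"],
        "zh": record["summary_zh"],
    }


records = load_records(SAMPLE)
turns = split_turns(records[0]["dialogue"])
summaries = summaries_by_language(records[0])
```

Pairing the English `dialogue` with `summaries["zh"]` or `summaries["de"]` gives one cross-lingual source/target example per record.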
 
