洋洋

NLI-TR (Turkish NLI Research)

languages, sampling, education, nlp, text mining

41.07MB

Data ID: D17222562749268631

Published: 2024/07/29

Data description


NLI-TR (Turkish NLI Research)

Unleash Your NLI Research in Turkish Language!

By Huggingface Hub


About this dataset

> NLI-TR is a set of two datasets, SNLI-TR and MNLI-TR, that lets the natural language processing and machine learning community conduct inference research in Turkish. Both contain curated natural language inference data that have been translated into Turkish. With NLI-TR, researchers can develop models that make inferences on Turkish text. They can also investigate how models trained on data from one language fare when applied to another, which gives insight into cross-lingual generalization.

How to use the dataset

> # How To Use The NLI-TR Dataset to Unlock Turkish NLI Research
>
> If you are looking for a dataset for natural language inference (NLI) research, NLI-TR is a good starting point. This guide gives an overview of how to use the data to study NLI tasks in Turkish.

> NLI-TR contains two large-scale datasets intended for natural language inference tasks: SNLI-TR and MNLI-TR. Both let researchers explore NLI in Turkish, with examples ranging from sentence paraphrasing and classification tasks to question answering scenarios.
>
> ## Using the Data:
>
> The data includes both training and validation sets, which makes it easy to get a project started. The snli_tr_train.csv file is used as input for training models, while snli_tr_validation.csv can be used for testing or validating model accuracy on unseen data. The multinli_tr_validation_{matched / mismatched}.csv files offer additional validation of how well trained models perform on more complex scenarios.
>
> Each record includes the columns premise, hypothesis, and label. The description also mentions a domain column, although the column listings for the individual files show only the first three.
>
> - premise: the given statement, that is, the context against which an inference is judged.
> - hypothesis: the statement whose relationship to the premise is to be determined.
> - label: whether the premise entails the hypothesis (ENTAILMENT), contradicts it (CONTRADICTION), or does neither (NEUTRAL).
> - domain: a label assigned by some authors where needed, mostly for sentence pairs drawn from different semantic domains such as weather, sports, or finance.
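The file layout described above can be explored with a short script. The following is a minimal sketch using only Python's standard `csv` module. It assumes each CSV is UTF-8 encoded and has a header row containing `premise`, `hypothesis`, and `label`, and that labels are stored as strings whose casing may vary (the description writes them in both upper and lower case). The helper names `read_nli_rows` and `label_counts` are hypothetical and not part of any published tooling.

```python
import csv
from collections import Counter

# Assumed column names, taken from the dataset description.
REQUIRED_COLUMNS = ("premise", "hypothesis", "label")


def read_nli_rows(path):
    """Read one NLI-TR CSV file into a list of dicts with normalised labels."""
    # Assumption: the files are UTF-8 encoded.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [c for c in REQUIRED_COLUMNS if c not in header]
        if missing:
            raise ValueError(f"{path}: missing columns {missing}")
        rows = []
        for row in reader:
            rows.append({
                "premise": row["premise"],
                "hypothesis": row["hypothesis"],
                # Lower-case so 'ENTAILMENT' and 'entailment' compare equal.
                "label": (row["label"] or "").strip().lower(),
            })
        return rows


def label_counts(rows):
    """Count how many sentence pairs carry each label."""
    return Counter(row["label"] for row in rows)
```

For example, `label_counts(read_nli_rows("snli_tr_train.csv"))` would report the class balance of the SNLI-TR training split, assuming the file is in the working directory.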

Research Ideas

> - Developing an NLI-based Turkish-language question answering system.
> - Training a sentiment analysis algorithm to identify sentiment in text written in Turkish.
> - Building a machine learning chatbot that uses NLI to understand conversational context and respond accordingly to users conversing in Turkish.

Acknowledgements

> If you use this dataset in your research, please credit the original authors.

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
>
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: snli_tr_train.csv

| Column name | Description |
| --- | --- |
| premise | The premise sentence in Turkish, translated from the English SNLI source. (String) |
| hypothesis | The hypothesis sentence in Turkish, translated from the English SNLI source. (String) |
| label | One of 'entailment', 'contradiction', or 'neutral', indicating whether the premise entails the hypothesis, contradicts it, or is neutral toward it. (String) |

File: multinli_tr_validation_matched.csv

| Column name | Description |
| --- | --- |
| premise | The premise sentence in Turkish, translated from the English MNLI source. (String) |
| hypothesis | The hypothesis sentence in Turkish, translated from the English MNLI source. (String) |
| label | One of 'entailment', 'contradiction', or 'neutral', indicating whether the premise entails the hypothesis, contradicts it, or is neutral toward it. (String) |

File: snli_tr_validation.csv

| Column name | Description |
| --- | --- |
| premise | The premise sentence in Turkish, translated from the English SNLI source. (String) |
| hypothesis | The hypothesis sentence in Turkish, translated from the English SNLI source. (String) |
| label | One of 'entailment', 'contradiction', or 'neutral', indicating whether the premise entails the hypothesis, contradicts it, or is neutral toward it. (String) |

File: multinli_tr_validation_mismatched.csv

| Column name | Description |
| --- | --- |
| premise | The premise sentence in Turkish, translated from the English MNLI source. (String) |
| hypothesis | The hypothesis sentence in Turkish, translated from the English MNLI source. (String) |
| label | One of 'entailment', 'contradiction', or 'neutral', indicating whether the premise entails the hypothesis, contradicts it, or is neutral toward it. (String) |

File: multinli_tr_train.csv

| Column name | Description |
| --- | --- |
| premise | The premise sentence in Turkish, translated from the English MNLI source. (String) |
| hypothesis | The hypothesis sentence in Turkish, translated from the English MNLI source. (String) |
| label | One of 'entailment', 'contradiction', or 'neutral', indicating whether the premise entails the hypothesis, contradicts it, or is neutral toward it. (String) |

File: snli_tr_test.csv

| Column name | Description |
| --- | --- |
| premise | The premise sentence in Turkish, translated from the English SNLI source. (String) |
| hypothesis | The hypothesis sentence in Turkish, translated from the English SNLI source. (String) |
| label | One of 'entailment', 'contradiction', or 'neutral', indicating whether the premise entails the hypothesis, contradicts it, or is neutral toward it. (String) |
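
Since all six files listed above share the same three columns, a quick check that a local copy matches this layout can catch a truncated or mislabeled download. The sketch below assumes the files sit together in one directory and are UTF-8 encoded. `EXPECTED_FILES` copies the file names from the listings above, while `check_dataset_dir` is a hypothetical helper name chosen for illustration.

```python
import csv
from pathlib import Path

# File names as given in the column listings.
EXPECTED_FILES = (
    "snli_tr_train.csv",
    "snli_tr_validation.csv",
    "snli_tr_test.csv",
    "multinli_tr_train.csv",
    "multinli_tr_validation_matched.csv",
    "multinli_tr_validation_mismatched.csv",
)
EXPECTED_COLUMNS = ("premise", "hypothesis", "label")


def check_dataset_dir(directory):
    """Return {file name: problem description} for files that look wrong.

    An empty dict means every expected file exists and its header row
    contains all expected columns (extra columns are tolerated).
    """
    problems = {}
    for name in EXPECTED_FILES:
        path = Path(directory) / name
        if not path.is_file():
            problems[name] = "file not found"
            continue
        # Assumption: UTF-8 encoding; only the header row is read.
        with open(path, newline="", encoding="utf-8") as handle:
            header = next(csv.reader(handle), [])
        missing = [c for c in EXPECTED_COLUMNS if c not in header]
        if missing:
            problems[name] = "missing columns: " + ", ".join(missing)
    return problems
```

Calling `check_dataset_dir(".")` in the folder holding the download returns an empty dict when everything matches the listings, and otherwise names each file together with what is missing.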

Acknowledgements

> If you use this dataset in your research, please credit Huggingface Hub.
