EstQA Question Answering dataset – META-SHARE

Last view: 2026-05-01

Last update: 2026-04-08

EstQA Question Answering dataset

https://huggingface.co/datasets/anukaver/EstQA

Dataset for extractive question answering in Estonian. It is based on Wikipedia articles pre-filtered via PageRank.
The training set contains 776 context-question-answer triplets. A question may have several possible answers, each stored in a separate triplet; the number of distinct questions is 512.
The test set contains 603 samples, each with one or more golden answers, for a total of 892 golden answers.
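
Because the training set stores each alternative answer in its own triplet while the test set keeps all golden answers in one sample, it can be convenient to collapse the training triplets into one record per question. The following is a minimal sketch of that grouping, not the dataset author's code. The field names `context`, `question`, and `answer` are assumptions made for illustration and may differ from the actual column names, and the toy rows are invented rather than taken from EstQA.

```python
from collections import defaultdict


def group_answers(triplets):
    """Collapse context-question-answer triplets into one record per
    (context, question) pair, collecting all distinct answers."""
    grouped = defaultdict(list)
    for row in triplets:
        # Field names are hypothetical; adjust to the real schema.
        key = (row["context"], row["question"])
        if row["answer"] not in grouped[key]:
            grouped[key].append(row["answer"])
    return [
        {"context": context, "question": question, "answers": answers}
        for (context, question), answers in grouped.items()
    ]


# Invented toy rows, for illustration only (not from EstQA).
toy_triplets = [
    {"context": "Tallinn is the capital of Estonia.",
     "question": "What is Tallinn?",
     "answer": "the capital of Estonia"},
    {"context": "Tallinn is the capital of Estonia.",
     "question": "What is Tallinn?",
     "answer": "capital of Estonia"},
    {"context": "Tartu lies on the Emajõgi river.",
     "question": "Which river does Tartu lie on?",
     "answer": "Emajõgi"},
]

records = group_answers(toy_triplets)
print(len(toy_triplets), "triplets,", len(records), "distinct questions")
```

Applied to the real training set, the same grouping would be expected to turn the 776 triplets into 512 records if questions are unique per context, which is an assumption the description does not confirm.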

If you use this dataset for research, please cite the following paper:

@mastersthesis{mastersthesis,
author = {Anu Käver},
title = {Extractive Question Answering for Estonian Language},
school = {Tallinn University of Technology (TalTech)},
year = 2021
}

Distribution

DOI

10.15155/9-00-0000-0000-0000-00222L

Availability

Available - Unrestricted Use

Licence

CC - BY

Contact Person

text

Monolingual text corpus

Languages

Estonian

Linguality

Linguality type: Monolingual

Size

1,115 Entries

Metadata

Created: 04/27/2021

Last Updated: 04/08/2026

Metadata Creator

Version

Version: 1
