Data and Tools
English TRAIN+DEV data v3.2 -- same as for SemEval-2016 Task 3 (subtasks A, B and C)
- Data for all English subtasks v3.2 is here
- It includes a TRAIN/DEV split with a reliable, double-checked DEV set
- Subtask A (6,398 questions + 40,288 comments) + unannotated (189,941 questions + 1,894,456 comments)
- Subtask B (317 original + 3,169 related questions)
- Subtask C (317 original questions + 3,169 related questions + 31,690 comments)
Arabic TRAIN+DEV data v1.3 -- same as for SemEval-2016 Task 3 (Subtask D)
- The Arabic TRAIN+DEV data v1.3 can be found here
- It includes a TRAIN/DEV split with a reliable, double-checked DEV set (1,281 original questions and 37,795 potentially related question-answer pairs) + unannotated data (163,383 question-answer pairs)
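The releases above are distributed as XML. The sketch below shows one way to pull out (related question, comment, relevance label) triples with the standard library; the tag and attribute names (`Thread`, `RelQuestion`, `RelComment`, `RELC_RELEVANCE2RELQ`, etc.) are assumptions modeled on the SemEval-2016 Task 3 release and should be verified against the README in the actual download.

```python
import xml.etree.ElementTree as ET

# Toy fragment; tag/attribute names are ASSUMED from the SemEval-2016
# Task 3 format -- check them against the real data files.
sample = """
<xml>
  <OrgQuestion ORGQ_ID="Q1">
    <OrgQSubject>Visa renewal</OrgQSubject>
    <OrgQBody>How do I renew my visa?</OrgQBody>
    <Thread THREAD_SEQUENCE="Q1_R1">
      <RelQuestion RELQ_ID="Q1_R1" RELQ_RELEVANCE2ORGQ="PerfectMatch">
        <RelQSubject>Visa renewal process</RelQSubject>
        <RelQBody>What is the process to renew a visa?</RelQBody>
      </RelQuestion>
      <RelComment RELC_ID="Q1_R1_C1" RELC_RELEVANCE2RELQ="Good">
        <RelCText>Go to the immigration office with your passport.</RelCText>
      </RelComment>
    </Thread>
  </OrgQuestion>
</xml>
"""

root = ET.fromstring(sample)
pairs = []
for thread in root.iter("Thread"):
    relq = thread.find("RelQuestion")
    for comment in thread.findall("RelComment"):
        pairs.append((relq.get("RELQ_ID"),
                      comment.get("RELC_ID"),
                      comment.get("RELC_RELEVANCE2RELQ")))
print(pairs)  # [('Q1_R1', 'Q1_R1_C1', 'Good')]
```

For the full files, `ET.iterparse` is a better fit than `fromstring`, since the unannotated collections run to millions of comments.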
Test data from SemEval-2016 Task 3 -- can also be used for training (subtasks A, B, C and D)
Multi-domain SAMPLE data for Task 3 -- data for the new Subtask E
- The StackExchange multi-domain SAMPLE data can be found here
- The sample data is taken from a StackExchange subforum that is not in the DEV, TRAIN or TEST sets.
Multi-domain TRAIN and DEV data for Task 3 -- data for the new Subtask E
- The StackExchange multi-domain TRAIN and DEV data can be found here
- UPDATE 9/9/2016: user data has been added. The current version is v1_2.
Test data for SemEval-2017 Task 3
- Now available here for subtasks A, B, C, and D (check the README file)
- For subtask E, it will be available on January 21, 2017
- For instructions on how to submit the test results of your systems, please read this page
- UPDATE 24/9/2016: test data for subtask E is now available here
Scorer v2.2 and random baselines (subtasks A, B, C, D and E) -- same as for SemEval-2016 Task 3
- Can be found here
- UPDATE 3/12/2016: A new scorer (v2.3) is available and can be used for all subtasks, including Subtask E.
- It can be found here
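The official scorer's primary ranking metric is Mean Average Precision (MAP). As a quick sanity check on system output, textbook MAP can be computed as below; note this is only a sketch, not the official implementation, which may apply a rank cutoff and also reports further metrics such as AvgRec and MRR.

```python
def average_precision(relevance):
    """AP for one query; `relevance` is the ranked list of 0/1 gold labels,
    ordered by the system's score (best first)."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(queries):
    """MAP: mean of per-query average precision."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Two toy queries: relevant items at ranks 1 and 3, and at rank 2.
print(mean_average_precision([[1, 0, 1], [0, 1]]))  # → 0.6666...
```

Use the official v2.3 scorer for any reported numbers; the sketch is only for debugging prediction files locally.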
Results
- The gold labels, submissions and scores for all teams can be found here
- The gold labels inside the test XML can be found here
- The task description paper is here.
@InProceedings{SemEval-2017:task3,
author = {Nakov, Preslav and Hoogeveen, Doris and M\`{a}rquez, Llu\'{i}s and Moschitti, Alessandro and Mubarak, Hamdy and Baldwin, Timothy and Verspoor, Karin},
title = {{SemEval}-2017 Task 3: Community Question Answering},
booktitle = {Proceedings of the 11th International Workshop on Semantic Evaluation},
series = {SemEval '17},
month = {August},
year = {2017},
address = {Vancouver, Canada},
publisher = {Association for Computational Linguistics},
}