Data and Tools

 

English TRAIN+DEV data v3.2 -- same as for SemEval-2016 Task 3 (subtasks A, B and C)

  • Data for all English subtasks v3.2 is here
  • It includes a TRAIN/DEV split with a reliable, double-checked DEV set
    • Subtask A (6,398 questions + 40,288 comments) + unannotated (189,941 questions + 1,894,456 comments)
    • Subtask B (317 original + 3,169 related questions)
    • Subtask C (317 original questions + 3,169 related questions + 31,690 comments)
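The annotated files are distributed as XML. As a rough, self-contained illustration, the sketch below parses a toy fragment with Python's `xml.etree.ElementTree`; the element and attribute names (`Thread`, `RelQuestion`, `RelComment`, `RELC_RELEVANCE2RELQ`) are assumed to follow the SemEval-2016 Task 3 layout and should be checked against the actual files and the README.

```python
import xml.etree.ElementTree as ET

# Toy fragment mimicking the (assumed) Subtask A layout: one related-question
# thread with its comments and their gold relevance labels.
SAMPLE = """
<xml>
  <Thread THREAD_SEQUENCE="Q1_R1">
    <RelQuestion RELQ_ID="Q1_R1">
      <RelQSubject>Visa renewal</RelQSubject>
      <RelQBody>How do I renew my visa?</RelQBody>
    </RelQuestion>
    <RelComment RELC_ID="Q1_R1_C1" RELC_RELEVANCE2RELQ="Good">
      <RelCText>Go to the immigration office.</RelCText>
    </RelComment>
    <RelComment RELC_ID="Q1_R1_C2" RELC_RELEVANCE2RELQ="Bad">
      <RelCText>Nice weather today.</RelCText>
    </RelComment>
  </Thread>
</xml>
"""

root = ET.fromstring(SAMPLE)
for thread in root.iter("Thread"):
    q = thread.find("RelQuestion")
    print(q.get("RELQ_ID"), q.findtext("RelQSubject"))
    for c in thread.findall("RelComment"):
        print(" ", c.get("RELC_ID"), c.get("RELC_RELEVANCE2RELQ"))
```

For the real data, replace `ET.fromstring(SAMPLE)` with `ET.parse(path).getroot()` and verify the tag names against the distributed files.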

 

Arabic TRAIN+DEV data v1.3 -- same as for SemEval-2016 Task 3 (Subtask D)

  • The Arabic TRAIN+DEV data v1.3 can be found here
  • It includes a TRAIN/DEV split with a reliable, double-checked DEV set (1,281 original questions and 37,795 potentially related question-answer pairs) + unannotated data (163,383 question-answer pairs)

 

Test data from SemEval-2016 Task 3 -- can be used for training too (subtasks A, B, C, and D)

  • Can be found here
  • Format checker for the test output is here
  • The GOLD labels and results are here

 

Multi-domain SAMPLE data for Task 3 -- data for the new Subtask E

  • The StackExchange multi-domain SAMPLE data can be found here
  • The sample data is taken from a StackExchange subforum that is not in the DEV, TRAIN, or TEST sets.

 

Multi-domain TRAIN and DEV data for Task 3 -- data for the new Subtask E

  • The StackExchange multi-domain TRAIN and DEV data can be found here
  • UPDATE 9/9/2016: user data has been added. The current version is v1_2.

 

Test data for SemEval-2017 Task 3

  • Now available here for subtasks A, B, C, and D (check the README file)
  • For subtask E, it will be available on January 21, 2017
  • For instructions on how to submit the test results of your systems, please read this page
  • UPDATE 24/1/2017: test data for subtask E is now available here

 

Scorer v2.2 and random baselines (subtasks A, B, C, D and E) -- same as for SemEval-2016 Task 3

  • UPDATE 3/12/2016: A new scorer (v2.3) is available, which can be used for all subtasks, including subtask E.
  • It can be found here
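The ranking subtasks are evaluated primarily with Mean Average Precision (MAP). As a self-contained sketch of how MAP is computed over per-query ranked lists with binary relevance (this is not the official scorer, which also reports metrics such as AvgRec and MRR and may apply a rank cutoff):

```python
from typing import Dict, List

def average_precision(ranked_labels: List[bool]) -> float:
    """AP for one query: mean of precision@k taken at each relevant rank k."""
    hits, score = 0, 0.0
    for k, rel in enumerate(ranked_labels, start=1):
        if rel:
            hits += 1
            score += hits / k
    return score / hits if hits else 0.0

def mean_average_precision(runs: Dict[str, List[bool]]) -> float:
    """MAP: unweighted mean of per-query AP values."""
    return sum(average_precision(r) for r in runs.values()) / len(runs)

# Toy check: q1 ranks relevant items at positions 1 and 3, q2 at position 2.
runs = {"q1": [True, False, True], "q2": [False, True]}
# AP(q1) = (1/1 + 2/3)/2 = 0.8333..., AP(q2) = 1/2, MAP = 0.6667
print(round(mean_average_precision(runs), 4))
```

Use the official scorer for any reported numbers; the sketch above is only meant to make the metric concrete.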

 

RESULTS

  • The gold labels, submissions and scores for all teams can be found here
  • The gold labels inside the test XML can be found here
  • The task description paper is here.

@InProceedings{SemEval-2017:task3,
  author    = {Nakov, Preslav and Hoogeveen, Doris and M\`{a}rquez, Llu\'{i}s and Moschitti, Alessandro and Mubarak, Hamdy and Baldwin, Timothy and Verspoor, Karin},
  title     = {{SemEval}-2017 Task 3: Community Question Answering},
  booktitle = {Proceedings of the 11th International Workshop on Semantic Evaluation},
  series    = {SemEval '17},
  month     = {August},
  year      = {2017},
  address   = {Vancouver, Canada},
  publisher = {Association for Computational Linguistics},
}

Contact Info

Organizers


  • Preslav Nakov, Qatar Computing Research Institute, HBKU
  • Lluís Màrquez, Qatar Computing Research Institute, HBKU
  • Alessandro Moschitti, Qatar Computing Research Institute, HBKU
  • Hamdy Mubarak, Qatar Computing Research Institute, HBKU
  • Timothy Baldwin, The University of Melbourne
  • Doris Hoogeveen, The University of Melbourne
  • Karin Verspoor, The University of Melbourne

email : semeval-cqa@googlegroups.com

Other Info

Announcements


  • 14 Feb. 2017: Submit your paper by February 27
  • 11 Feb. 2017: The results and all scores are released
  • 30 Jan. 2017: The closing date for test submissions is January 30th midnight UTC-12.
  • 24 Jan. 2017: Test set for subtask E is available now. (here)
  • 12 Jan. 2017: Test sets for subtasks A-D are available now. (data webpage)
  • 9 Jan. 2017: The release of the test data for subtasks A-D is delayed by a few days. Apologies for the inconvenience.
  • 5 Jan. 2017: Submission deadline is set to be January 30.
  • 5 Jan. 2017: New web page created with instructions on how to submit system results.
  • 8 Dec 2016: Separate competitions for the subtasks have been set up at CodaLab, where you can submit your results: Subtask A, Subtask B, Subtask C, Subtask D, and Subtask E. You can submit results for both the development set and the test set, receive scores, and choose what to publish on the leaderboard.
  • 8 Dec 2016: A new scorer is now available from the Data and Tools page, which can also be used for subtask E
  • Register to participate here