Data and Tools

Data:

  • English subtask:
    • (Aug. 15) Trial data with details of test datasets, as well as the training data (all data released in STS 2012, 2013 and 2014). It also includes the evaluation script.
    • (Jan. 22) Raw annotations and the Perl scripts that generate the final gold standard files.
    • (Jan. 22) Test data with gold standard annotations. It also includes the evaluation script and the task baseline.
  • Spanish subtask:
  • Interpretable STS subtask:
    • (Oct. 16) Annotation guidelines made available.
    • NEW (Nov. 10) Final train data (including NEW evaluation script) made available.
      The evaluation script has been updated as follows:
      1. Fixed a bug affecting alignments that had multiple types.
      2. Added a special case for the evaluation that includes types and score:
           - no type penalty between tags {SPE1, SPE2, REL, SIMI} when both scores are in the interval (0, 2]
           - no type penalty between EQUI and SIMI/SPE with score 4
    • (Jan. 22) Test data with gold standard annotations. It also includes the evaluation script, gold labels (.wa files) and the task baseline.
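    The special case for type penalties listed under the Nov. 10 update can be illustrated with a minimal Python sketch. This is not the official evaluation script: the function name type_penalty_applies and its arguments are hypothetical, "SPE" is assumed to mean SPE1 or SPE2, and "with score 4" is assumed to refer to the score of the SIMI/SPE alignment. Alignments with identical types are assumed to incur no penalty.

```python
# Tags that may be confused without penalty when both scores are low.
LOW_SCORE_TYPES = {"SPE1", "SPE2", "REL", "SIMI"}
# Tags that may be confused with EQUI without penalty (assumption: SPE = SPE1 or SPE2).
NEAR_EQUI_TYPES = {"SIMI", "SPE1", "SPE2"}


def type_penalty_applies(type_a, score_a, type_b, score_b):
    """Return True if a type mismatch between two alignments should be penalized.

    Hypothetical helper: sketches the two exceptions described in the
    Nov. 10 update, under the assumptions stated in the lead-in.
    """
    if type_a == type_b:
        return False  # same type, nothing to penalize

    # Exception 1: both tags in {SPE1, SPE2, REL, SIMI}, both scores in (0, 2].
    if (type_a in LOW_SCORE_TYPES and type_b in LOW_SCORE_TYPES
            and 0 < score_a <= 2 and 0 < score_b <= 2):
        return False

    # Exception 2: EQUI versus SIMI/SPE where the SIMI/SPE side has score 4
    # (assumed reading of "with score 4"). Checked in both directions.
    for equi, other, other_score in ((type_a, type_b, score_b),
                                     (type_b, type_a, score_a)):
        if equi == "EQUI" and other in NEAR_EQUI_TYPES and other_score == 4:
            return False

    return True
```

    For instance, under these assumptions a SPE1 alignment with score 2 compared against a SIMI alignment with score 1 would not be penalized, while EQUI against REL would be.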

Tools:

  • Please visit the STS wiki for links to publicly available STS tools

Contact Info

email list: sts-semeval@googlegroups.com

Other Info

Announcements

  • NEW Nov. 10: final train data for interpretable STS, with updated evaluation script
  • Oct. 16: updated description, train data and guidelines for interpretable STS
  • Aug. 15: subtasks with descriptions and trial data available
  • Please fill in the SemEval registration form
  • Please join the mailing list for updates