English subtask:
(Aug. 15) Trial data with details of test datasets, as well as the training data (all data released in STS 2012, 2013 and 2014). It also includes the evaluation script.
(Jan. 22) Raw annotations and the Perl scripts that generate the final gold standard files.
(Jan. 22) Test data with gold standard annotations. It also includes the evaluation script and the task baseline.
Spanish subtask:
Interpretable STS subtask:
(Oct. 16) Annotation guidelines made available.
NEW (Nov. 10) Final training data (including NEW evaluation script) made available.
Evaluation script has been updated as follows:
1. Bug affecting alignments which had multiple types fixed.
2. Special case for the evaluation including types and score:
- no type penalty between tags {SPE1, SPE2, REL, SIMI} when both scores are in the range (0, 2]
- no type penalty between EQUI and SIMI/SPE with score 4.
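The two special cases above can be sketched as a small check. This is an illustration only, not the released evaluation script: the function name `type_penalty_applies` and the constant names are hypothetical, scores are assumed to be numeric, "SPE" is assumed to mean SPE1 or SPE2, and "with score 4" is assumed to refer to the score of the SIMI/SPE alignment.

```python
# Hypothetical sketch of the special cases listed above; not the official script.

# Tags that are interchangeable when both alignments score in (0, 2].
LOW_SCORE_TAGS = {"SPE1", "SPE2", "REL", "SIMI"}
# Tags that may stand in for EQUI when they carry score 4
# (assumption: "SPE" covers both SPE1 and SPE2).
NEAR_EQUI_TAGS = {"SIMI", "SPE1", "SPE2"}


def type_penalty_applies(gold_type, gold_score, sys_type, sys_score):
    """Return True if a type mismatch between gold and system is penalized."""
    if gold_type == sys_type:
        return False  # identical types never incur a type penalty

    # Special case 1: both tags in {SPE1, SPE2, REL, SIMI}, both scores in (0, 2].
    if (gold_type in LOW_SCORE_TAGS and sys_type in LOW_SCORE_TAGS
            and 0 < gold_score <= 2 and 0 < sys_score <= 2):
        return False

    # Special case 2: EQUI on one side, SIMI/SPE with score 4 on the other.
    pairs = (
        (gold_type, sys_type, sys_score),
        (sys_type, gold_type, gold_score),
    )
    for equi_side, other_type, other_score in pairs:
        if (equi_side == "EQUI" and other_type in NEAR_EQUI_TAGS
                and other_score == 4):
            return False

    return True


if __name__ == "__main__":
    print(type_penalty_applies("SPE1", 2, "REL", 1))
    print(type_penalty_applies("EQUI", 5, "SIMI", 4))
    print(type_penalty_applies("EQUI", 5, "SIMI", 3))
```

Under these assumptions, the first two calls fall under the special cases and the third does not. The released evaluation script remains the authoritative definition of the scoring.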
(Jan. 22) Test data with gold standard annotations. It also includes the evaluation script, gold labels (.wa files) and the task baseline.
Please visit the STS wiki for links to publicly available STS tools.