Data and Tools

SemEval 2017 Evaluation Data

STS 2017 Evaluation Sets v1.1 (released Jan 16th, 2017, updated Jan 18th, 2017) 

STS 2017 Evaluation Gold Standard (released Feb 14th, 2017)

New for 2017:

STS 2017 Trial Data (Sept 21st, 2016)

STS 2017 Monolingual Arabic and Cross-lingual Arabic-English Data (Updated Nov 9th, 2016)

STS 2017 Cross-lingual English-Spanish Data (Nov 7th, 2016)

Training Data

For training data, participants are encouraged to make use of all existing English, Spanish and cross-lingual English-Spanish data sets from prior STS evaluations. This includes all previously released trial, training and evaluation data (see STS wiki).

Since this is the first year that Arabic is included in an STS evaluation, we will release training data for both monolingual Arabic and cross-lingual Arabic-English. Each training set will consist of approximately 14,000 pairs sourced from prior English STS evaluations. The monolingual Arabic and cross-lingual Arabic-English data will be available by Oct 24th.

As with the 2016 evaluation, participants are allowed and very much encouraged to train purely unsupervised models and model components on arbitrary data (e.g., unsupervised word embeddings).
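As an illustration of the kind of unsupervised component this permits, the sketch below scores sentence similarity as the cosine of averaged word vectors. The tiny embedding table is a hypothetical stand-in for vectors trained on arbitrary unlabeled text (e.g., with word2vec); it is not part of the task data.

```python
import math

# Hypothetical 3-dimensional word vectors, for illustration only.
# In practice these would come from unsupervised training on large corpora.
EMBEDDINGS = {
    "a": [0.1, 0.3, 0.2],
    "dog": [0.9, 0.1, 0.4],
    "puppy": [0.8, 0.2, 0.5],
    "runs": [0.2, 0.7, 0.1],
    "sleeps": [0.3, 0.6, 0.2],
}

def sentence_vector(tokens):
    """Average the word vectors of the in-vocabulary tokens."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def similarity(sent1, sent2):
    """Score a sentence pair; 0.0 if either sentence is fully out of vocabulary."""
    u = sentence_vector(sent1.lower().split())
    v = sentence_vector(sent2.lower().split())
    return cosine(u, v) if u and v else 0.0

print(round(similarity("a dog runs", "a puppy runs"), 3))
```

A real submission would replace the toy table with pretrained vectors and typically rescale the cosine to the task's similarity range.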

Evaluation Data

This year's shared task includes one evaluation set for each of the seven tracks described above. Each evaluation set consists of between 200 and 250 sentence pairs. Within each evaluation set, we will attempt to approximately balance the distribution of STS scores.
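System scores on these evaluation sets are conventionally compared against the gold standard by Pearson correlation, the standard STS metric. A minimal sketch, using made-up illustrative scores on the usual 0-5 similarity scale rather than actual task data:

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical gold annotations and system predictions (0-5 scale).
gold = [0.0, 1.5, 2.0, 3.5, 5.0]
system = [0.4, 1.2, 2.5, 3.0, 4.8]

print(round(pearson(gold, system), 4))
```

In practice one would use a library routine such as `scipy.stats.pearsonr`; the hand-rolled version here just makes the computation explicit.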

Contact Info

Organizers (alphabetical order)

Eneko Agirre, Daniel Cer, Mona Diab, Iñigo Lopez-Gazpio and Lucia Specia

Wiki: STS Wiki

Discussion Group: STS-semeval

Other Info