Data
Test Data with gold standard scores
Download the test data here in .zip or .tar.gz. This data includes the training and trial data, which you are welcome to use.
Training Data
Download the training data here in .zip or .tar.gz. This data includes the trial data, which you are welcome to use.
Trial Data
Download the trial data here. (updated on Nov. 7)
Lexical Levels
Lexical levels denote different sizes of text. This task involves four types of comparisons spanning five levels (paragraph, sentence, phrase, word, and sense):
- Paragraph to Sentence (paragraph2sentence)
- Sentence to Phrase (sentence2phrase)
- Phrase to Word (phrase2word)
- Word to Sense (word2sense)
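As an illustration, the comparison types can be held in a small lookup table keyed by the identifiers given in parentheses above. This is only a sketch: the names `COMPARISONS` and `levels` are hypothetical and are not part of the task distribution.

```python
from typing import Tuple

# Hypothetical lookup table: identifier -> (larger level, smaller level).
COMPARISONS = {
    "paragraph2sentence": ("paragraph", "sentence"),
    "sentence2phrase": ("sentence", "phrase"),
    "phrase2word": ("phrase", "word"),
    "word2sense": ("word", "sense"),
}


def levels(comparison: str) -> Tuple[str, str]:
    """Return the (larger, smaller) levels for a comparison identifier."""
    if comparison not in COMPARISONS:
        raise ValueError(f"unknown comparison type: {comparison!r}")
    return COMPARISONS[comparison]
```

In every pair the first level is the larger text and the second is the smaller one, which matches the direction used by the rating guidelines.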
Rating Scale
Task 3 uses a five-point rating scale designed to capture how similar the meaning of the smaller lexical item is to the meaning of the larger one. A natural example of this is summarizing a paragraph into a sentence: a good summary is a sentence whose meaning closely approximates the overall meaning of the paragraph. Similarly, a single word can summarize a multi-word expression, e.g., "a very large, magnificent house" is highly similar to the word "mansion".
The rating scale is summarized by the following guidelines:
- 4, Very Similar -- The two items have very similar meanings and the most important ideas, concepts, or actions in the larger text are represented in the smaller text. Some less important information may be missing, but the smaller text is a very good summary of the larger text.
- 3, Somewhat Similar -- The two items share many of the same important ideas, concepts, or actions, but include slightly different details. The smaller text may use similar but not identical concepts (e.g., car vs. vehicle), or may omit a few of the more important ideas present in the larger text.
- 2, Somewhat related but not similar -- The two items have dissimilar meanings, but share concepts, ideas, and actions that are related. The smaller text may use related but not necessarily similar concepts (window vs. house) but should still share some overlapping concepts, ideas, or actions with the larger text.
- 1, Slightly related -- The two items describe dissimilar concepts, ideas and actions, but may share some small details or domain in common and might be likely to be found together in a longer document on the same topic.
- 0, Unrelated -- The two items do not mean the same thing and are not on the same topic.
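The scale above can be written down as a small enumeration. The sketch below is illustrative only: the names `Rating` and `nearest_label` are hypothetical, and it assumes that a score may be fractional (for example, an average over several annotators), which the guidelines above do not state.

```python
from enum import IntEnum


class Rating(IntEnum):
    """The five points of the rating scale (names are illustrative)."""
    UNRELATED = 0
    SLIGHTLY_RELATED = 1
    SOMEWHAT_RELATED = 2  # "somewhat related but not similar"
    SOMEWHAT_SIMILAR = 3
    VERY_SIMILAR = 4


def nearest_label(score: float) -> Rating:
    """Map a score in [0, 4] to the closest guideline label.

    Assumption: scores may be fractional; ties (x.5) round up.
    """
    if not 0.0 <= score <= 4.0:
        raise ValueError(f"score out of range: {score}")
    return Rating(int(score + 0.5))
```

For example, `nearest_label(3.6)` returns `Rating.VERY_SIMILAR`, while a score outside the 0 to 4 range raises `ValueError`.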
Example Ratings
The examples below show one item for each rating value and comparison type.
Paragraph to Sentence
Paragraph: Teenagers take aerial shots of their neighbourhood using digital cameras sitting in old bottles which are launched via kites - a common toy for children living in the favelas. They then use GPS-enabled smartphones to take pictures of specific danger points - such as rubbish heaps, which can become a breeding ground for mosquitoes carrying dengue fever.
Rating | Sentence |
---|---|
4 | Students use their GPS-enabled cellphones to take birdview photographs of a land in order to find specific danger points such as rubbish heaps. |
3 | Teenagers are enthusiastic about taking aerial photograph in order to study their neighbourhood. |
2 | Aerial photography is a great way to identify terrestrial features that aren’t visible from the ground level, such as lake contours or river paths. |
1 | During the early days of digital SLRs, Canon was pretty much the undisputed leader in CMOS image sensor technology. |
0 | Syrian President Bashar al-Assad tells the US it will "pay the price" if it strikes against Syria. |
Sentence to Phrase
Sentence: Schumacher was undoubtedly one of the very greatest racing drivers there has ever been, a man who was routinely, on every lap, able to dance on a limit accessible to almost no-one else.
Rating | Phrase |
---|---|
4 | the unparalleled greatness of Schumacher’s driving abilities |
3 | driving abilities |
2 | formula one racing |
1 | north-south highway |
0 | orthodontic insurance |
Phrase to Word
Phrase: loss of air pressure in a tire
Rating | Word |
---|---|
4 | flat-tire |
3 | puncture |
2 | tire |
1 | parking |
0 | butterfly |
Word to Sense
Word: automobile#n
Rating | Sense |
---|---|
4 | car#n#1 (a motor vehicle with four wheels; usually propelled by an internal combustion engine) |
3 | vehicle#n#1 (a conveyance that transports people or objects) |
2 | bike#n#1 (a motor vehicle with two wheels and a strong frame) |
1 | highway#n#1 (a major road for any form of motor transport) |
0 | pen#n#1 (a writing implement with a point from which ink flows) |
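The word and sense items above use a `#`-separated notation (`automobile#n`, `car#n#1`). A minimal parsing sketch follows. It reads the fields as lemma, part of speech, and sense number, which is an interpretation inferred from these examples rather than a documented specification; `LexItem` and `parse_item` are hypothetical names.

```python
from typing import NamedTuple, Optional


class LexItem(NamedTuple):
    lemma: str
    pos: str
    sense: Optional[int]  # None for a plain word such as "automobile#n"


def parse_item(text: str) -> LexItem:
    """Parse 'lemma#pos' or 'lemma#pos#sense' (format assumed from the examples)."""
    parts = text.strip().split("#")
    if len(parts) == 2:
        return LexItem(parts[0], parts[1], None)
    if len(parts) == 3:
        return LexItem(parts[0], parts[1], int(parts[2]))
    raise ValueError(f"unexpected item format: {text!r}")
```

With this sketch, `parse_item("car#n#1")` yields `LexItem(lemma='car', pos='n', sense=1)`. The parenthesized gloss shown in the table is not handled here and would need to be stripped first if it appears in the data.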
Data Format
The input files consist of three tab-separated fields: larger_side,
Task Data
Task 3 provides trial, training, and test data.