Task Description: Cross-Level Semantic Similarity

Organizers

David Jurgens, Mohammad Taher Pilehvar, and Roberto Navigli


Recent News

  • Test data released: refer to the Data page!
  • Training data released.

 

Overview

Semantic similarity is an essential component of many applications in Natural Language Processing (NLP). This task provides an evaluation for semantic similarity across different sizes of text, which we refer to as lexical levels. Unlike prior SemEval tasks on textual similarity that have focused on comparing similar-sized texts, this task evaluates the case where larger text must be compared to smaller text. Specifically, this task encompasses four semantic similarity comparisons:

  1. paragraph to sentence,
  2. sentence to phrase,
  3. phrase to word, and
  4. word to sense.

This task unifies multiple objectives from different areas of NLP under a single task: Paraphrasing, Summarization, Compositionality, and Meaning in Context. The task's objective is twofold: (1) to provide an evaluation framework for assessing similarity methods at different lexical levels, and (2) to provide a dataset for comparing general methods with those specialized for a specific type of comparison. Accordingly, we encourage participants to submit ratings for all comparison types, but systems specialized to a single comparison type may also be submitted. Our hope is to produce NLP methods that perform accurately across any semantic or textual level and enable downstream NLP applications to be applied to more diverse text input.

 

Task Details

Task participants will be provided with pairs of each comparison type and asked to rate each pair according to the semantic similarity of the smaller item to the larger item. For example, given a sentence and a paragraph, a system would assess how similar the meaning of the sentence is to the meaning of the paragraph. Ideally, a high-similarity sentence would reflect the overall meaning of the paragraph. In general, similarity judgments are based on the following rating levels:

  • 4 -- The two items have very similar meanings and the most important ideas, concepts, or actions in the larger text are represented in the smaller text.
  • 3 -- The two items share many of the same important ideas, concepts, or actions, but those expressed in the smaller text are similar, not identical, to the most important ones in the larger text.
  • 2 -- The two items have dissimilar meanings, but the shared concepts, ideas, and actions in the smaller text are related (but not similar) to those of the larger text.
  • 1 -- The two items describe dissimilar concepts, ideas, and actions, but are likely to be found together in a longer document on the same topic.
  • 0 -- The two items do not mean the same thing and are not on the same topic.
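
To make the input and output of a system concrete, here is a minimal sketch of a naive baseline that scores a pair on the 0 to 4 scale above. It is not the organizers' method or a recommended approach: the `Pair` class, its field names, and the token-coverage heuristic are all assumptions made for illustration, as is the choice to return fractional scores.

```python
import re
from dataclasses import dataclass

# Bounds of the rating scale listed above.
MIN_RATING, MAX_RATING = 0.0, 4.0


@dataclass(frozen=True)
class Pair:
    """One item pair. Field names are hypothetical, not the task's data format."""
    larger: str   # e.g. a paragraph
    smaller: str  # e.g. a sentence


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_rating(pair: Pair) -> float:
    """Scale the share of the smaller item's tokens found in the larger item to 0-4.

    This is plain lexical overlap, so it cannot tell "related" from "similar";
    it only illustrates the shape of a system's output.
    """
    small = tokens(pair.smaller)
    if not small:
        return MIN_RATING
    coverage = len(small & tokens(pair.larger)) / len(small)
    return MIN_RATING + coverage * (MAX_RATING - MIN_RATING)


if __name__ == "__main__":
    example = Pair(
        larger="The cat sat on the mat and slept all afternoon.",
        smaller="The cat slept.",
    )
    print(overlap_rating(example))
```

A surface-overlap score like this will rate a paraphrase with no shared words as 0, which is the kind of case the rating levels above are designed to distinguish.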

For word-to-sense comparison, a sense is paired with a word and the perceived meaning of the word is modulated by virtue of the comparison with the paired sense’s definition.
 

See the Data page for more examples of ratings for each type of comparison and information on the datasets for this task.


See the Participate page for more details on how to participate in this task.

Contact Info

Organizers


email: semeval-2014-clss@googlegroups.com
group: groups.google.com/group/semeval-2014-clss
