Task 10: Detecting Minimal Semantic Units and their Meanings
The DiMSUM shared task is concerned with predicting, given an English sentence, a broad-coverage representation of lexical semantics. The representation consists of two closely connected facets: a segmentation into minimal semantic units, and a labeling of some of those units with semantic classes known as supersenses.
For example, given the POS-tagged sentence

I/PRP googled/VBD restaurants/NNS in/IN the/DT area/NN and/CC Fuji/NNP Sushi/NNP came/VBD up/RB and/CC reviews/NNS were/VBD great/JJ so/RB I/PRP made/VBD a/DT carry/VB out/RP order/NN
the goal is to predict the representation
I googled:communication restaurants:GROUP in the area:LOCATION and Fuji_Sushi:GROUP came_up:communication and reviews:COMMUNICATION were:stative great so I made_a carry_out:possession_order:communication

where labels follow a colon, lowercase labels are verb supersenses, UPPERCASE labels are noun supersenses, and _ joins tokens within a multiword expression. (carry_out:possession and made_order:communication are separate MWEs.)
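To make the two facets of the representation concrete, here is a minimal in-memory sketch of the example sentence's analysis. This is purely illustrative and not the official data format (see the task website for that): the `Token` record, the `mwe_parent` back-link scheme for joining MWE tokens, and the `mwes` helper are all assumptions introduced for this sketch.

```python
# Illustrative only -- NOT the official DiMSUM file format.
# Each token carries an optional supersense and, if it continues a
# multiword expression, the index of the previous token in that MWE.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    form: str
    pos: str
    mwe_parent: Optional[int] = None  # index of previous token in the same MWE
    supersense: Optional[str] = None  # lowercase = verb, UPPERCASE = noun

sent = [
    Token("I", "PRP"),
    Token("googled", "VBD", supersense="communication"),
    Token("restaurants", "NNS", supersense="GROUP"),
    Token("in", "IN"),
    Token("the", "DT"),
    Token("area", "NN", supersense="LOCATION"),
    Token("and", "CC"),
    Token("Fuji", "NNP", supersense="GROUP"),
    Token("Sushi", "NNP", mwe_parent=7),    # Fuji_Sushi
    Token("came", "VBD", supersense="communication"),
    Token("up", "RB", mwe_parent=9),        # came_up
    Token("and", "CC"),
    Token("reviews", "NNS", supersense="COMMUNICATION"),
    Token("were", "VBD", supersense="stative"),
    Token("great", "JJ"),
    Token("so", "RB"),
    Token("I", "PRP"),
    Token("made", "VBD", supersense="communication"),
    Token("a", "DT"),
    Token("carry", "VB", supersense="possession"),
    Token("out", "RP", mwe_parent=19),      # carry_out
    Token("order", "NN", mwe_parent=17),    # made_..._order (gappy MWE)
]

def mwes(tokens):
    """Group token indices into MWEs by following mwe_parent links."""
    groups = {}
    for i, _ in enumerate(tokens):
        root = i
        while tokens[root].mwe_parent is not None:
            root = tokens[root].mwe_parent
        groups.setdefault(root, []).append(i)
    return [g for g in groups.values() if len(g) > 1]

for g in mwes(sent):
    print("_".join(sent[i].form for i in g))
```

Note how the back-link scheme handles the gappy MWE naturally: `order` points back to `made` across the intervening `a carry_out`, so the two MWEs remain distinct.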
Systems are expected to produce both facets of the representation, though the manner in which they do this (e.g., pipeline vs. joint model) is up to you.
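As a toy illustration of the pipeline option, the sketch below first segments a token sequence by greedy lookup in a small MWE lexicon, then labels the resulting units from a supersense dictionary. Everything here (the lexicon, the dictionary, the greedy matcher) is a hypothetical baseline invented for illustration, not a competitive system and not part of the task definition.

```python
# Purely illustrative two-stage pipeline: segment, then label.
# The lexicon and dictionary below are toy data, not task resources.

MWE_LEXICON = {("came", "up"), ("carry", "out")}
SUPERSENSES = {"came_up": "communication", "carry_out": "possession",
               "area": "LOCATION"}

def segment(tokens):
    """Greedily join adjacent token pairs found in the MWE lexicon."""
    units, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in MWE_LEXICON:
            units.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            units.append(tokens[i])
            i += 1
    return units

def label(units):
    """Attach a supersense (or None) to each minimal semantic unit."""
    return [(u, SUPERSENSES.get(u)) for u in units]

print(label(segment(["came", "up", "in", "the", "area"])))
```

A joint model would instead predict segmentation and supersenses in a single step, avoiding error propagation from stage 1 to stage 2.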
Gold standard training data labeled with the combined representation will be provided in two domains: online reviews and tweets. Blind test data will be in these two domains as well as a third, surprise domain.
For further details, see the task website: http://dimsum16.github.io/