For this year's task, we are offering the following subtasks:


Subtask A: Pairwise Comparison

Given two tweets, a successful system will be able to predict which tweet is funnier, according to the tweets' gold labels. During evaluation, only pairs of tweets with differing labels will be evaluated; within a pair, the tweet with the higher label is said to be funnier. The sample script released with the Trial Data performs exactly this task.

For evaluation, we will release data formatted exactly like the Trial/Training data, but without labels. To be evaluated on this subtask, teams will produce predictions for every possible combination of tweet pairs from a given Evaluation file. The evaluation script will then select the appropriate pairs for evaluation. The evaluation metric is accuracy, micro-averaged across all Evaluation files.
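The pairwise evaluation described above can be sketched roughly as follows. This is not the official evaluation script; the function and data-structure names (`pairwise_accuracy`, a dict of gold labels, a dict of predicted pair winners) are illustrative assumptions:

```python
from itertools import combinations

def pairwise_accuracy(gold, predictions):
    """Sketch of Subtask A scoring for one Evaluation file.

    gold        -- dict mapping tweet id -> gold label (higher = funnier)
    predictions -- dict mapping an ordered pair (a, b) -> predicted funnier id
    """
    correct = total = 0
    # Systems predict every pair; the scorer keeps only pairs
    # whose gold labels differ.
    for a, b in combinations(sorted(gold), 2):
        if gold[a] == gold[b]:
            continue
        total += 1
        true_winner = a if gold[a] > gold[b] else b
        if predictions.get((a, b)) == true_winner:
            correct += 1
    return correct / total if total else 0.0
```

Micro-averaging across files would then pool the correct/total counts over all Evaluation files before dividing, rather than averaging per-file accuracies.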


CodaLab competition:



Subtask B: Semi-Ranking

Given an input file of tweets for a given hashtag, systems will produce a ranking of tweets from funniest to least funny. Since the tweet files do not provide explicit rankings, we will evaluate whether tweets have been placed in the appropriate bucket: winning tweet, top 10 but not winning, and not in the top 10. In a certain sense this can be thought of as labeling; however, there is a known cardinality for each bucket: 1 tweet, 9 tweets, and the rest of the tweets.

System evaluation will use a measure inspired by edit distance: for each tweet, how many moves must occur for it to be placed in the correct bucket. For example, if the winning tweet has been placed in the "top 10 but not winning" bucket, and a tweet from the "top 10 but not winning" bucket has been placed in the winning-tweet bucket, the total edit error will be 2: 1 for each tweet. The final evaluation measure is the edit error normalized by 22, the maximum possible edit error. This metric is averaged across all Evaluation files to produce the final score.
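The edit-error metric above can be sketched as follows. This is an illustrative reading, not the official scorer: buckets are numbered 0 (winning tweet), 1 (top 10 but not winning), and 2 (not in the top 10), and a tweet's edit error is taken to be the number of bucket boundaries it must cross, i.e. the absolute difference between its gold and predicted bucket indices:

```python
def normalized_edit_error(gold_buckets, pred_buckets):
    """Sketch of Subtask B scoring for one Evaluation file.

    gold_buckets -- dict mapping tweet id -> gold bucket index (0, 1, or 2)
    pred_buckets -- dict mapping tweet id -> predicted bucket index
    """
    # One move per bucket boundary a tweet must cross to reach
    # its gold bucket.
    error = sum(abs(gold_buckets[t] - pred_buckets[t]) for t in gold_buckets)
    # 22 is the maximum edit error: the winner misplaced to the bottom
    # bucket (2 moves), the 9 top-10 tweets misplaced to the bottom
    # (9 moves), and the 10 tweets that must move up to fill the
    # emptied buckets (2 + 9 moves), for 22 in total.
    return error / 22.0
```

Under this reading, the example in the text (winner and one top-10 tweet swapped) yields an error of 2, normalized to 2/22.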


CodaLab competition:

Contact Info

Discussion Group
Hashtag Wars SemEval

Other Info


  • 2/6/2017 [new]
    The results have been posted!
  • 1/9/2017
    Evaluation data has been released!
  • 12/6/2016
    CodaLab competitions are ready!
  • 10/19/2016
    Evaluation scripts for both subtasks have been released!
  • 9/5/2016
    Train data has been released!
  • 8/1/2016
    Trial data has been released!
  • For participation in any of this year's tasks, please register by completing this form