Data and Tools


I. New 2016 datasets for Subtasks A, B, C, D, E (any of these can be used for both training and tuning).

II. Scorers and format checkers for all subtasks (v 2.3):

III. Older data (can be used for both training and tuning).

1. For Subtask A

2. For Subtasks C, D, and E

IV. Older data that can be used as dev-test only, but NOT for training or tuning

For Subtask A

  • SemEval-2013 Task 2 development-test: (i) Twitter-2013 and (ii) SMS-2013 messages (CANNOT be used for training or tuning!)
  • SemEval-2014 Task 9 development-test: (i) Twitter-2013, (ii) SMS-2013 messages, (iii) Twitter-2014, (iv) Twitter-2014-sarcasm, and (v) Live Journal-2014 (CANNOT be used for training or tuning!)
  • SemEval-2015 Task 10 development-test: Twitter-2015 (CANNOT be used for training or tuning!)

1. Participants can choose to download the test data at any moment during the evaluation period (January 10-31, 2016). Regardless of when the data are downloaded, results must be submitted by January 31, 2016, 23:59 hours. The time zone is Midway, Midway Islands, United States: see
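For reference, the Midway time zone is UTC-11 with no daylight saving time, so the deadline can be converted to UTC with a short Python sketch (using the IANA zone name `Pacific/Midway`; this is an illustration, not part of the official submission instructions):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Submission deadline: January 31, 2016, 23:59 in the Midway time zone (UTC-11).
deadline = datetime(2016, 1, 31, 23, 59, tzinfo=ZoneInfo("Pacific/Midway"))

# Convert to UTC to compare against your local clock.
deadline_utc = deadline.astimezone(ZoneInfo("UTC"))
print(deadline_utc.isoformat())  # 2016-02-01T10:59:00+00:00
```

In other words, the deadline falls at 10:59 UTC on February 1, 2016, which gives participants in most time zones some extra hours past midnight local time.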

2. Submissions will be made through the SemEval START website:

3. Participants can make multiple submissions before the deadline (see above); each new submission replaces their earlier one on the START server. We therefore advise participants to submit their runs early and, if time allows, resubmit later (while START remains open).

4. Participants are free to participate in a single subtask or in any combination of subtasks.

5. We allow a single run per subtask.

6. The test datasets from previous years cannot be used for training or tuning; they can be used for development-time testing only. This includes the following: (i) Twitter-2013, (ii) SMS-2013 messages, (iii) Twitter-2014, (iv) Twitter-2014-sarcasm, (v) Live Journal-2014, and (vi) Twitter-2015

7. Participants are free to use any other data (except for what is listed in 6 above): we will not distinguish between closed runs (which use only the provided data) and open runs (which also use additional data). However, participants will need to describe the resources and tools used to train their systems in the Web form they received by email.


8. If you did not submit a short description of your system together with your submission, please do so ASAP (by email). Here is the template (to be filled in for each subtask you participated in):



9. You are strongly encouraged to submit a short system description paper by February 26, 2016:


10. You can write a paper regardless of whether you plan to attend SemEval-2016 in San Diego (registration for SemEval-2016 and participation are optional, but strongly encouraged). Your paper will be published even if you do not register for SemEval-2016 (provided that the reviewers find it to be of acceptable quality).


11. There is no need to describe the task in much detail, as you can point to the task description paper instead (BibTeX below). We will cite back your system description paper in the task description paper:


@inproceedings{...,
  author    = {Preslav Nakov and Alan Ritter and Sara Rosenthal and Veselin Stoyanov and Fabrizio Sebastiani},
  title     = {{SemEval}-2016 Task 4: Sentiment Analysis in {T}witter},
  booktitle = {Proceedings of the 10th International Workshop on Semantic Evaluation},
  series    = {SemEval '16},
  month     = {June},
  year      = {2016},
  address   = {San Diego, California},
  publisher = {Association for Computational Linguistics},
}

Contact Info

  • Preslav Nakov, Qatar Computing Research Institute, HBKU
  • Alan Ritter, The Ohio State University
  • Sara Rosenthal, Columbia University
  • Fabrizio Sebastiani, Qatar Computing Research Institute, HBKU
  • Veselin Stoyanov, Facebook


Other Info


  • Task description paper draft is now released!
  • EVALUATION results are now released!
  • The evaluation measures description was updated: see here