Data, Evaluation, and Results

  • Training and test data with stance gold labels and additional labels such as 'target of opinion' and 'sentiment'. These additional annotations were not part of the SemEval-2016 competition, but are made available for future research. Details about this dataset are available in this paper:

    Saif M. Mohammad, Parinaz Sobhani, and Svetlana Kiritchenko. 2017. Stance and Sentiment in Tweets. ACM Transactions on Internet Technology, Special Section on Argumentation in Social Media, 17(3).

Data Visualization:

  • An interactive visualization of the Stance Dataset is now available. It shows various statistics about the data.
  • Note that it also shows sentiment and target of opinion annotations (in addition to stance).
  • Clicking on various visualization elements filters the data. For example, clicking on 'Feminism' and 'Favor' filters all sub-visualizations to show information pertaining only to those tweets that express favor towards feminism. You can also use the checkboxes on the left to view only test or training data, or data on particular targets.

Instructions to Annotators

  • We used this questionnaire to obtain annotations.
  • Annotators were restricted to those living in the USA.
  • For the classification Tasks A and B, options 3 and 4 listed in the questionnaire were collapsed into a single class, NONE (neither favor nor against).
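The collapsing step above can be sketched as a simple lookup. Note that the assignment of options 1 and 2 to FAVOR and AGAINST is an assumption for illustration; the page only specifies that options 3 and 4 merge into NONE.

```python
# Map questionnaire option numbers to the three task labels.
# Options 3 and 4 (neither favor nor against) collapse into NONE;
# mapping options 1 and 2 to FAVOR and AGAINST is an assumption.
OPTION_TO_LABEL = {1: "FAVOR", 2: "AGAINST", 3: "NONE", 4: "NONE"}

def collapse(option: int) -> str:
    """Return the task label for a raw questionnaire option number."""
    return OPTION_TO_LABEL[option]
```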

Evaluation

  • We will use the macro-average of F-score(FAVOR) and F-score(AGAINST) as the bottom-line evaluation metric. 
  • Evaluation Script v2 (last updated: January 11, 2016) (the same script can be used for both Task A and Task B)
      You can use it to:
      -- check the format of your submission file
      -- determine performance when gold labels are available (note that you can also use the script to determine performance on a held out portion of the training data to gauge your system's progress)
  • A separate evaluation script to determine scores on the following subsets of the test set: (a) where the given target of interest is the same as the target of opinion in the tweet, and (b) where the given target of interest is *not* the target of opinion in the tweet. NOTE: This script was not part of the official competition, and is provided only as a means for further analysis of the results.
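As a concrete illustration, the bottom-line metric can be computed in a few lines; `stance_f1` below is a hypothetical helper written from the definition above, not the official evaluation script.

```python
def stance_f1(gold, pred):
    """Macro-average of F1(FAVOR) and F1(AGAINST).

    NONE is excluded from the average, but systems are still penalized
    for mislabeling NONE tweets as FAVOR or AGAINST, since those errors
    appear as false positives for the two scored classes.
    """
    def f1(label):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

    return (f1("FAVOR") + f1("AGAINST")) / 2

gold = ["FAVOR", "AGAINST", "NONE", "FAVOR"]
pred = ["FAVOR", "AGAINST", "FAVOR", "NONE"]
print(stance_f1(gold, pred))  # → 0.75
```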

Contact Info

  • Saif M Mohammad
    National Research Council Canada
  • Svetlana Kiritchenko
    National Research Council Canada
  • Parinaz Sobhani
    University of Ottawa
  • Xiaodan Zhu
    National Research Council Canada
  • Colin Cherry
    National Research Council Canada

Other Info

  • An interactive visualization of the stance data is now available through the 'Data' page.
  • Results have been announced.
  • Test data has been released.
  • The Stance task (both Task A and Task B) will have the following evaluation period: Jan 11 (Mon) to Jan 18 (Mon).
  • Trial data, training data, and domain corpus have been released.