Data and Tools
A Scorer for Task 11 in Python (November 24th)
A Python Scorer is now available to download for this task (here).
The scorer can be tested using these gold and test inputs, using the following command-line call:
python semevalscorer.py gold.csv test.csv
Please see below (under Java scorer) for general comments about the scorer and its operation. The Python and Java scorers are functionally equivalent.
A Scorer for Task 11 in Java (November 23rd)
A Java Scorer is now available to download for this task (here).
The scorer can be tested using these gold and test inputs, using the following command-line call:
java -jar semevalscorer.jar gold.csv test.csv
This scorer was created to evaluate your system output for SemEval 2015 Task 11. Prepare your system output in the same format as the training data provided online (integer version); see the gold and test files for examples. Then use the above command to run the scorer.
The scorer expects the system output as integers on an 11-point scale (ranging from -5 to +5), for the convenience of systems based on either a regression model or a classification model. If your system produces real-valued output, please make sure that the output value for each tweet in your submission is properly scaled and rounded.
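As an illustration of the scaling and rounding step, here is a minimal sketch in Python. The function name to_integer_score is hypothetical and is not part of the official scorer, and the handling of ties (values ending in .5) follows Python's built-in round(), which is an assumption rather than a documented convention of the task.

```python
def to_integer_score(value, low=-5, high=5):
    """Round a real-valued prediction to the nearest integer and
    clamp it to the range [low, high] (here -5 to +5).

    Hypothetical helper for illustration only; tie-breaking follows
    Python's built-in round(), which is an assumption.
    """
    rounded = int(round(value))
    # Clamp values that fall outside the 11-point scale.
    return max(low, min(high, rounded))


if __name__ == "__main__":
    for raw in (3.7, -0.4, 7.9, -6.2):
        print(raw, "->", to_integer_score(raw))
```

A regression system would apply such a function to each prediction before writing its submission file.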
The gold standard is derived from the averaged human judgments of trusted users. The script will transform the gold standard and the system output into a vector space representation, based on which the scorer will evaluate the output by calculating the cosine similarity of the two input vectors. Note that we will employ a linear penalty for submissions that do not cover all tweets. For example, if your submission provides sentiment judgments for only half of the tweets appearing in the test data set, then your final score will be halved, that is, 0.5*cosine(gold, test).
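The scoring idea described above can be sketched in Python as follows. This is not the official scorer: the function name penalized_cosine and the dictionary-based input are hypothetical, and computing the cosine only over the tweets that the submission covers is an assumption made for illustration.

```python
import math


def penalized_cosine(gold, submitted):
    """Cosine similarity between gold and submitted scores, scaled
    by the fraction of gold tweets that the submission covers.

    gold, submitted: dicts mapping tweet-id -> integer score.
    Assumption: the cosine is computed over covered tweets only.
    """
    shared = [tid for tid in gold if tid in submitted]
    if not shared:
        return 0.0
    dot = sum(gold[t] * submitted[t] for t in shared)
    norm_gold = math.sqrt(sum(gold[t] ** 2 for t in shared))
    norm_sub = math.sqrt(sum(submitted[t] ** 2 for t in shared))
    if norm_gold == 0 or norm_sub == 0:
        return 0.0
    # Linear penalty: the fraction of gold tweets that were answered.
    coverage = len(shared) / len(gold)
    return coverage * dot / (norm_gold * norm_sub)


if __name__ == "__main__":
    gold = {"1": 5, "2": -3, "3": 0, "4": 2}
    half = {"1": 5, "2": -3}  # covers only half of the gold tweets
    print(penalized_cosine(gold, gold))
    print(penalized_cosine(gold, half))
```

In this toy example the half-coverage submission agrees perfectly with the gold scores on the tweets it covers, so the penalty alone determines its score.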
Important Note regarding Training Data (August 30th):
A transcription error in the representation of tweet-ids was observed after the training data files were first uploaded. In addition, some duplicates were observed in the training data. If you downloaded the training data files before August 30th, please download them again to ensure that you have the correct data files with the correct tweet-ids and without duplicate entries.
Important Note regarding Training Data (October 15th):
Some prospective participants have noted the perishability of our training tweets (as all tweets are potentially perishable). Please see the bottom of this page for a discussion of tweet perishability, and our approach to ensuring that ALL training tweets are made available to participating systems.
Training Data
Training data for this task (8000 figurative tweets annotated with sentiment scores in the range -5...+5) is now available as a spreadsheet here (rounded integer scores) and here (real-valued scores).
Training data is now also available in .tsv format here (rounded integer scores) and here (real-valued scores).
An .RTF README document for this trial data is available here.
Please see the important note about tweet perishability (and what you can do about it) at the bottom of this page!
Trial Data
Trial data for this task (1000 figurative tweets annotated with sentiment scores in the range -5...+5) is now available as a spreadsheet here.
An .RTF README document for this trial data is available here.
Tweet Text
The actual text of each tweet is not included, due to copyright/privacy concerns that come as standard with the use of Twitter data. A script is available here for retrieving the text of each tweet given its tweet-id.
For Python 2.x
For Python 3.x
Note: Tweets are a perishable commodity and may be deleted, archived or otherwise made inaccessible by their creators. Participants are encouraged to use the script provided to download the text of tweets via their tweet-ids at their earliest convenience.
As of October 15, 2014, approximately 15% of our training tweets have already perished for one of the above reasons. For this reason we have created a mapping from the published tweet-ids in the training data (above) to a new set of imperishable copies. This mapping of tweet-ids (perishable to imperishable, with weighted sentiment score) is downloadable here: