Data and Tools

I. Codalab

The following development sets and instructions let you practice submitting your output to Codalab. We encourage you to start testing uploads immediately. Initially, you may want to upload the baseline provided in the zipped files.

4A English (codalab): 4a-english.zip
4A Arabic (codalab): 4a-arabic.zip
4B English (codalab): 4b-english.zip
4B Arabic (codalab): 4b-arabic.zip
4C English (codalab): 4c-english.zip
4C Arabic (codalab): 4c-arabic.zip
4D English (codalab): 4d-english.zip
4D Arabic (codalab): 4d-arabic.zip
4E English (codalab): 4e-english.zip
4E Arabic (codalab): 4e-arabic.zip

II. English Training Data

Download the English data from prior years, reorganized for this year.
Link to the data as posted for last year (SemEval-2016).

III. Arabic Training Data

IV. Download Scripts

Script to download tweets and user information for the above datasets

V. Test Input

Test input v3.0 for phase 1 (January 11-22): subtasks A, C, E (deadline: passed)
Test input for phase 2 (January 23-30): subtasks B and D (deadline: passed)

VI. Arabic+English Training Data

Training data for Arabic+English is here.

NOTES:

1. For English, we provide a default split of the data from previous years into training, development, and development-time testing datasets. Participants are free to use this data in any way they find useful when training and tuning their systems, e.g., use a different split, perform cross-validation, or train on all datasets.

2. Unlike in previous years, SemEval-2017 Task 4 had no progress testing for English, so all the provided data could be used for training and development.
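
As an illustration of the options mentioned in the notes above (a different split, cross-validation), here is a minimal sketch that pools the default splits and re-divides them into k folds. It is only one possible approach, not a procedure prescribed by the task. The function name k_fold_splits, the variable names, and the tiny (id, label) examples are all hypothetical placeholders; loading the actual downloaded files is left out, since their format is not described on this page.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def k_fold_splits(
    examples: Sequence[T], k: int = 5, seed: int = 0
) -> List[Tuple[List[T], List[T]]]:
    """Return k (train, held_out) pairs; each example is held out exactly once."""
    if k < 2 or k > len(examples):
        raise ValueError("k must be between 2 and the number of examples")
    shuffled = list(examples)
    # A seeded generator keeps the folds reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled examples into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    splits = []
    for i, held_out in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        splits.append((train, held_out))
    return splits


# Hypothetical stand-ins for the default training, development, and
# development-time testing sets (placeholder ids and labels only).
train_part = [("id1", "positive"), ("id2", "negative"), ("id3", "neutral")]
dev_part = [("id4", "positive"), ("id5", "neutral")]
devtest_part = [("id6", "negative")]

# Pool everything, then build a fresh 3-fold split.
pooled = train_part + dev_part + devtest_part
splits = k_fold_splits(pooled, k=3)
for train, held_out in splits:
    print(len(train), len(held_out))
```

Training on all datasets corresponds to simply using the pooled list directly, without any held-out fold.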

RESULTS:

All training data can be found here.
The test data can be found here.
The gold labels, submissions and scores for all teams can be found here.
The task paper can be found here.

@InProceedings{SemEval:2017:task4,
author    = {Sara Rosenthal and Noura Farra and Preslav Nakov},
title     = {{SemEval}-2017 Task 4: Sentiment Analysis in {T}witter},
booktitle = {Proceedings of the 11th International Workshop on Semantic Evaluation},
series    = {SemEval '17},
month     = {August},
year      = {2017},
address   = {Vancouver, Canada},
publisher = {Association for Computational Linguistics},
}
