Data and Tools

Trial Data

The trial data consist of a set of 30 documents collected from Wikinews about Apple Inc. A set of target entities (the input) and the corresponding ordered list of events (the output timeline) are provided with the documents.

The trial data have been annotated with the extents of event mentions.

No training corpus will be provided in addition to the development corpus.



We also provide, separately, the 3 files used to measure agreement on event mention annotation, and the two TimeLines built from these files for the agreement. The 3 files are also included in the full corpus, but the two TimeLines are not. Both the annotation and the TimeLines have been reviewed.


Evaluation data

The evaluation data will consist of 3 sets of documents annotated with event mentions and a set of target entities. Each set will contain around 30 documents from Wikinews, for a total of around 30,000 tokens.



Documents. The documents will be available in two formats: the CAT (Content Annotation Tool) labelled format (Bartalesi Lenzi et al., 2012) and a format which mimics the TimeML format.

CAT labelled format is an XML-based stand-off format where different annotation layers are stored in separate document sections and are related to each other and to source data through pointers. Trial data are annotated with event mentions and the document creation time, so each document contains 2 different sections: one with the tokens and one with the markables.

The XSD schema of the annotated documents in CAT labelled format is available here.

In the TimeML-like format, events are annotated using only the EVENT element (and not MAKEINSTANCE, as in TimeML). Elements have been added to mark out the sentences (s) and to associate each of them with a unique id. The text is tokenized.
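As a rough illustration of how such a document could be read, the sketch below collects the EVENT elements of each sentence with Python's standard xml.etree.ElementTree module. The sample document is hypothetical: the root element name (Document), the eid attribute on EVENT, and the sample sentences are assumptions made for this example and are not taken from the task's actual files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element and attribute names other than EVENT and s
# are assumptions made for illustration only.
SAMPLE = """<Document>
<s id="1">Apple <EVENT eid="e1">announced</EVENT> a new phone .</s>
<s id="2">The company <EVENT eid="e2">released</EVENT> it in June .</s>
</Document>"""


def events_by_sentence(xml_text):
    """Map each sentence id to the (event id, event text) pairs it contains."""
    root = ET.fromstring(xml_text)
    result = {}
    for sentence in root.iter("s"):
        result[sentence.get("id")] = [
            (event.get("eid"), (event.text or "").strip())
            for event in sentence.iter("EVENT")
        ]
    return result


if __name__ == "__main__":
    print(events_by_sentence(SAMPLE))
```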


TimeLine. One file per TimeLine must be created. The first line contains the target entity.
The name of each file must be the mention of the target entity in lower case, followed by the extension “.txt”. For a multi-word entity, the tokens are separated by underscores.
E.g.: steve_jobs.txt
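
The naming rule above could be implemented along these lines. This is a minimal sketch: the function name timeline_filename is hypothetical, and it assumes that the tokens of a mention are separated by whitespace.

```python
def timeline_filename(entity_mention: str) -> str:
    """Build a TimeLine file name from a target entity mention.

    Assumes tokens are whitespace-separated; the mention is lower-cased,
    tokens are joined with underscores, and ".txt" is appended.
    """
    tokens = entity_mention.lower().split()
    return "_".join(tokens) + ".txt"


if __name__ == "__main__":
    print(timeline_filename("Steve Jobs"))
```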


Set of target entities. For each set of documents, one file is provided containing the list of target entities, one per line.


Evaluation tool

The evaluation script relies heavily on the TempEval-3 evaluation script (UzZaman et al., 2013) used to evaluate relations.

For each timeline, we use the evaluation metric presented at TempEval-3 to evaluate relations and obtain the F1 score. The metric captures the temporal awareness of an annotation (UzZaman and Allen, 2011). Our evaluation script returns the micro-averaged F1 score.
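
To make the notion of micro-averaging concrete, here is a generic sketch, not the evaluation script itself. It assumes that, for each timeline, four counts are already available: the number of system relations judged correct, the total number of system relations, the number of reference relations recovered, and the total number of reference relations. How the actual script derives these counts is not shown here; the function name micro_f1 and the count layout are assumptions.

```python
from typing import Iterable, Tuple

# (correct system relations, total system relations,
#  recovered reference relations, total reference relations)
Counts = Tuple[int, int, int, int]


def micro_f1(per_timeline: Iterable[Counts]) -> float:
    """Pool the counts over all timelines, then compute a single F1."""
    sys_ok = sys_all = ref_ok = ref_all = 0
    for s_ok, s_all, r_ok, r_all in per_timeline:
        sys_ok += s_ok
        sys_all += s_all
        ref_ok += r_ok
        ref_all += r_all
    precision = sys_ok / sys_all if sys_all else 0.0
    recall = ref_ok / ref_all if ref_all else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Two hypothetical timelines with made-up counts.
    print(micro_f1([(3, 4, 2, 4), (1, 4, 2, 4)]))
```

Pooling the counts before computing F1 means that timelines with more relations weigh more in the final score than they would under a macro average of per-timeline F1 values.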


Before evaluating the temporal awareness, each timeline needs to be transformed into the corresponding graph representation. For that, we defined the following transformation steps:

  • ordering and time anchors
  1. Each time anchor is represented as a TIMEX3
  2. Each event is related to one TIMEX3 with the "SIMULTANEOUS" relation type
  3. If one event happens before another one, a "BEFORE" relation type is created between both events
  4. If one event happens at the same time as another one, a "SIMULTANEOUS" relation type is created between both events
  • ordering only
  1. If one event happens before another one, a "BEFORE" relation type is created between both events
  2. If one event happens at the same time as another one, a "SIMULTANEOUS" relation type is created between both events
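
The transformation steps above can be sketched as follows. This is an illustration under stated assumptions, not the code of the evaluation script: a timeline is represented here as a list of hypothetical (position, time anchor, event id) tuples, events sharing a position are treated as happening at the same time, and relations are generated for every pair of events rather than only for adjacent ones.

```python
from itertools import combinations
from typing import List, Optional, Set, Tuple

Entry = Tuple[int, Optional[str], str]   # (position, time anchor, event id)
Relation = Tuple[str, str, str]          # (source, relation type, target)


def timeline_to_relations(entries: List[Entry],
                          use_anchors: bool = True) -> Set[Relation]:
    """Turn a timeline into a set of temporal relations.

    use_anchors=True follows the "ordering and time anchors" steps;
    use_anchors=False follows the "ordering only" steps.
    """
    relations: Set[Relation] = set()
    if use_anchors:
        for _, anchor, event in entries:
            if anchor is not None:
                # The anchor becomes a TIMEX3 node and the event is
                # SIMULTANEOUS with it.
                relations.add((event, "SIMULTANEOUS", "TIMEX3:" + anchor))
    for (pos_a, _, ev_a), (pos_b, _, ev_b) in combinations(entries, 2):
        if pos_a < pos_b:
            relations.add((ev_a, "BEFORE", ev_b))
        elif pos_b < pos_a:
            relations.add((ev_b, "BEFORE", ev_a))
        else:
            # Same position: order the pair by id so the output is stable.
            first, second = sorted((ev_a, ev_b))
            relations.add((first, "SIMULTANEOUS", second))
    return relations


if __name__ == "__main__":
    # Hypothetical timeline with made-up anchors and event ids.
    timeline = [(1, "2007-01-09", "e1"),
                (2, "2010-01-27", "e2"),
                (2, "2010-01-27", "e3")]
    for relation in sorted(timeline_to_relations(timeline)):
        print(relation)
```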




Naushad UzZaman and James Allen (2011), "Temporal Evaluation." In Proceedings of The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (Short Paper), Portland, Oregon, USA.


Naushad UzZaman, Hector Llorens, Leon Derczynski, Marc Verhagen, James Allen and James Pustejovsky (2013), "SemEval-2013 Task 1: TEMPEVAL-3: Evaluating Time Expressions, Events, and Temporal Relations." In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages 1–9, Atlanta, Georgia, June 14-15, 2013.

Contact Info


  • Anne-Lyse Minard
  • Eneko Agirre
  • Itziar Aldabe
  • Marieke van Erp
  • Bernardo Magnini
  • German Rigau
  • Manuela Speranza
  • Rubén Urizar


Google group: semeval-task4-timeline

Other Info