Participation

Registration & Communication

We ask all interested parties, including prospective participants, to subscribe to our spam-protected mailing list, where we will post updates a little more frequently than on the general task web site.  Access to the task data requires prior registration to this list, where we will make available a licensing template and download instructions.

Schedule (Tentative for 2014)

  • Monday, November 4, 2013: trial data available;
  • Friday, December 13, 2013: training data available;
  • Monday, March 24, 2014: release of test data;
  • Sunday, March 30, 2014: submission of system results;
  • Thursday, May 15, 2014: paper submission;
  • Tuesday, June 10, 2014: reviewing results;
  • Monday, June 30, 2014 (tbc): camera-ready submission.

System Submissions

Submissions of system outputs must be made to the ‘official’ SemEval submission system, i.e. through ftp(1) upload to the server at ‘alt.qcri.org’.  This server requires a user name and password to connect; all registered participants have received this information directly from the SemEval 2014 organizers (Preslav Nakov and Torsten Zesch).  If you do not have this information, please contact the SemEval 2014 organizers immediately (and before the final submission deadline for this task, which is Sunday, March 30, 2014).

Our task has three target formats (DM, PAS, PCEDT) and two tracks (closed and open); participants are expected to submit results for all three target formats, but are free to submit to either of the two tracks, or to both of them (depending on whether or not any data or tools were used in addition to the training semantic dependency graphs provided for this task).  Furthermore, participants are allowed to submit up to two runs for each target format and track, for example reflecting different parameterizations of their system.  For details on the difference between the two tracks and the definition of different runs, please see the evaluation page.

Given the above parameters, each submission can contain between three and twelve (3 × 2 × 2) result files, all in the official tab-separated SDP file format as documented on the data page.  To avoid dealing with a large number of individual files, we ask that the complete submission be ‘packaged up’ in a single compressed archive (using ‘.zip’ or ‘.tgz’ archivers) before upload to the SemEval server; we are aware that this request deviates slightly from the instructions you have received from the SemEval organizers, but we have obtained their approval to implement a simpler, task-specific scheme, i.e. using a single archive submitted for each participating team.  We ask that you name result files uniformly, using the scheme ‘track.format.run.sdp’, e.g. ‘closed.dm.2.sdp’ for the second run using the DM target format in the closed track, or ‘open.pcedt.1.sdp’ for the first run with open-track PCEDT outputs.
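
As a sanity check before uploading, the naming scheme can be verified programmatically.  The following sketch (the helper function and the example file names are illustrative, not part of any official tooling) checks a list of result files against the ‘track.format.run.sdp’ pattern:

```python
import re

# Required result-file names: track.format.run.sdp, where track is
# 'closed' or 'open', format is 'dm', 'pas', or 'pcedt', and run is 1 or 2.
SDP_NAME = re.compile(r"^(closed|open)\.(dm|pas|pcedt)\.(1|2)\.sdp$")

def check_names(filenames):
    """Return the file names that do not follow the naming scheme."""
    return [name for name in filenames if not SDP_NAME.match(name)]

print(check_names(["closed.dm.2.sdp", "open.pcedt.1.sdp", "results.sdp"]))
# → ['results.sdp']
```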

The name of the compressed archive that is uploaded should contain the team identifier, as assigned to you in the confirmation email of your registration from the SemEval organizers.  If possible, please do not make multiple submissions for the same team, but in case you have to re-upload a set of system results (within the submission deadline), please add a suffix to the archive name: if your team identifier were GT, for example, and you had already uploaded a first set of results as ‘gt.zip’, then please use an archive name like ‘gt_1.zip’ for the re-submission.
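If a re-submission is needed, the suffix rule above can be applied mechanically.  This hypothetical helper (not part of any official tooling) picks the next archive name for a team, given the names already uploaded:

```python
def next_archive_name(team, existing):
    """Return 'team.zip' for a first upload, otherwise 'team_N.zip' with
    the smallest unused suffix N (for re-submissions within the deadline)."""
    team = team.lower()
    if f"{team}.zip" not in existing:
        return f"{team}.zip"
    n = 1
    while f"{team}_{n}.zip" in existing:
        n += 1
    return f"{team}_{n}.zip"

print(next_archive_name("GT", []))          # → gt.zip
print(next_archive_name("GT", ["gt.zip"]))  # → gt_1.zip
```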

In addition to the ‘.sdp’ result files, each submission archive must include a README file that provides the following information:

  1. Team identifier (provided by SemEval organizers);
  2. Team member name(s) and affiliation(s);
  3. Designated contact person and email address;
  4. Inventory of results files included in the archive;
  5. System characteristics, including (if applicable):
    • core approach;
    • important features;
    • critical tools used;
    • data pre- or post-processing;
    • additional data used.
  6. Bibliographic references (if applicable).

To make receiving and evaluating submissions as smooth as possible, please follow the above requirements and naming schemes carefully.  As always, please do not hesitate to contact the task organizers (at the email address indicated in the right column) in case you require additional information or clarification.

Contact Info

Organizers

  • Dan Flickinger
  • Jan Hajič
  • Marco Kuhlmann
  • Yusuke Miyao
  • Stephan Oepen
  • Yi Zhang
  • Daniel Zeman

sdp-organizers@emmtee.net

Other Info

Announcements

[22-apr-14] Complete results (system submissions and official scores) as well as the gold-standard test data are now available for public download.

[31-mar-14] We have received submissions from nine teams; a draft summary of evaluation results has been emailed to participating teams.

[25-mar-14] We have posted some additional, task-specific instructions for how to submit system results to the SemEval evaluation; please make sure to follow these requirements carefully.

[22-mar-14] The test data (and corresponding ‘companion’ syntactic analyses, for use in the open track) are now available to registered participants; please see the task mailing list for details.

[08-mar-14] We have released a minor update to the companion archive, adding a handful of missing dependencies and fixing a problem in the file format.

[05-feb-14] We have posted the description of a baseline approach and experimental results on the suggested development sub-set of our training data (Section 20) on the evaluation page; on the same page, we have further specified the mechanics of submitting results to the evaluation.

[17-jan-14] Version 1.0 of the ‘companion’ data for the open track is now available, providing syntactic analyses (in phrase structure and bi-lexical dependency form) as overlays to our training data.  Please see the file README.txt in the companion archive for details.

[13-jan-14] We are releasing an update to the training data today, making a number of minor improvements to the DM and PCEDT graphs; also, we are now providing an on-line interface to search and explore visually the target representations for this task.  For details, please see our task-specific mailing list.

[12-dec-13] Some 750,000 tokens of WSJ text, annotated in our three semantic dependency formats, will become available for download tomorrow.  To obtain the data, prospective participants need to enter into a no-cost evaluation license with the Linguistic Data Consortium (LDC).  For access to the license form, please subscribe to our spam-protected mailing list.  Next, we are preparing our syntactic ‘companion’ data (to serve as optional input in the open track), which we expect to release in early January.

[24-nov-13] Version 1.1 of the trial data is now available, adding missing lemma values and streamlining argument labels in the DM format, removing a handful of items that used to have empty graphs in PAS, and generally aligning all items at the level of individual tokens (leaving 189 sentences in our trial data).  This last move means that all three formats now uniformly use single-character Unicode glyphs for quote marks, dashes, ellipses, and apostrophes (rather than multi-character LaTeX-style approximations, as were used in the original ASCII release of the text).  Furthermore, we encourage all interested parties, including prospective participants, to subscribe to our spam-protected mailing list, where we will post updates a little more frequently than on the general task web site.

[07-nov-13] We have clarified the interpretation of the top column (and renamed it from the earlier root) and elaborated the discussion of graph properties in the various formats.  We will continue to extend and revise the documentation on our three types of dependency graphs, but only announce such incremental changes here when they affect the data format.

[04-nov-13] A 198-sentence subset of what will be the training data has been released as trial data, to exemplify the file format and type of annotations available.  Please do get in touch, in case you see anything surprising!

[28-oct-13] We are in the process of finalizing the task description, posting some example dependencies, and making available some trial data.  For the time being, please consider these pages very much a work in progress, i.e. contents and form will be subject to refinement over the next few days.