Data and Tools


  • AMR specification
  • Newswire/Discussion Forum training data (39,260 sentences): To access this data, download this form and mail or email it to LDC (contact information is in the form). LDC will then send you a link to download the data. NOTES:
    1. Do NOT email the form to me. Please read the form carefully and follow the instructions, which include directions on where to mail it.
    2. You must fill this form out and mail it in, even if you filled out a similar form last year. New year, new data.
  • Biomedical training data (6,452 sentences) is available here and has been split into training, dev, and test sets. This data is all freely available to download; no form is needed.
  • Newswire/Discussion Forum and Biomedical blind evaluation data will be provided at the time of the evaluation to those who have registered.

Tools: (Note: all tools are provided as-is, graciously, by their authors. Most are not maintained by the organizers of this task. We do not guarantee their suitability for your needs and will not provide technical support beyond the documentation found on this page, of which there is currently none.)

  • Unsupervised AMR-to-English aligner, courtesy of Nima Pourdamghani
  • Python library, courtesy of Nathan Schneider
  • English tokenizer, courtesy of Ulf Hermjakob: here.
  • BLEU calculator (used for informal evaluation of generation quality), courtesy of David Chiang: here.
  • Smatch, courtesy of Shu Cai, with help from many others, version of 16.11.14: here
  • CAMR is a popular AMR parser; both 2016 trophy recipients used CAMR-based systems: here
  • JAMR is another popular AMR parser and is also a generator: here
  • Most of the 2016 competitors have made their parsers available; consult the system papers for details: here

Contact Info


  • Jonathan May, University of Southern California Information Sciences Institute (USC/ISI)

email: Jon May

AMR website: at ISI

Other Info