Abstract Meaning Representation Parsing and Generation

The task is over, but if you want to, you can join our group here!

Overview:

Abstract Meaning Representation (AMR) is a compact, readable, whole-sentence semantic annotation. Annotation components include entity identification and typing, PropBank semantic roles, individual entities playing multiple roles, entity grounding via wikification, as well as treatments of modality, negation, etc.

Here is an example AMR for the sentence “The London emergency services said that altogether 11 people had been sent to hospital for treatment due to minor wounds.”

(s / say-01
      :ARG0 (s2 / service
            :mod (e / emergency)
            :location (c / city :wiki "London"
                  :name (n / name :op1 "London")))
      :ARG1 (s3 / send-01
            :ARG1 (p / person :quant 11)
            :ARG2 (h / hospital)
            :mod (a / altogether)
            :purpose (t / treat-03
                  :ARG1 p
                  :ARG2 (w / wound-01
                        :ARG1 p
                        :mod (m / minor)))))

Note the inclusion of PropBank semantic frames (‘say-01’, ‘send-01’, ‘treat-03’, ‘wound-01’), grounding via wikification (‘London’), and multiple roles played by an entity (e.g. ‘11 people’ are the ARG1 of send-01, the ARG1 of treat-03, and the ARG1 of wound-01).
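To make the structure described in the note above concrete, here is a minimal Python sketch that reads the example AMR into (source, role, target) triples and then lists the PropBank frames and the roles played by the variable p. The names parse_amr, TOKEN, and AMR are hypothetical and are not part of any official AMR tooling. The reader assumes plain double quotes around strings and handles only the notation seen in this example.

```python
import re
from collections import Counter

# The example AMR from above, with plain double quotes around strings.
AMR = '''
(s / say-01
      :ARG0 (s2 / service
            :mod (e / emergency)
            :location (c / city :wiki "London"
                  :name (n / name :op1 "London")))
      :ARG1 (s3 / send-01
            :ARG1 (p / person :quant 11)
            :ARG2 (h / hospital)
            :mod (a / altogether)
            :purpose (t / treat-03
                  :ARG1 p
                  :ARG2 (w / wound-01
                        :ARG1 p
                        :mod (m / minor)))))
'''

# Tokens: quoted strings, parentheses, the slash, or any other run of characters.
TOKEN = re.compile(r'"[^"]*"|[()/]|[^\s()/]+')


def parse_amr(text):
    """Return (top variable, list of (source, role, target) triples)."""
    tokens = TOKEN.findall(text)
    pos = 0
    triples = []

    def node():
        nonlocal pos
        # A node has the shape: ( variable / concept :role target ... )
        assert tokens[pos] == "(" and tokens[pos + 2] == "/"
        var, concept = tokens[pos + 1], tokens[pos + 3]
        pos += 4
        triples.append((var, ":instance", concept))
        while tokens[pos] != ")":
            role = tokens[pos]
            pos += 1
            if tokens[pos] == "(":
                target = node()        # nested node: recurse
            else:
                target = tokens[pos]   # constant or re-used variable
                pos += 1
            triples.append((var, role, target))
        pos += 1                       # consume the closing parenthesis
        return var

    top = node()
    return top, triples


top, triples = parse_amr(AMR)

# PropBank frames are concepts ending in a two-digit sense number.
frames = [c for _, r, c in triples if r == ":instance" and re.search(r"-\d\d$", c)]

# Count incoming edges per variable; more than one means multiple roles.
variables = {s for s, r, _ in triples if r == ":instance"}
incoming = Counter(t for _, r, t in triples if r != ":instance" and t in variables)
roles_of_p = [(s, r) for s, r, t in triples if t == "p" and r != ":instance"]

print(frames)
print(roles_of_p)
```

The three entries in roles_of_p correspond to the three ARG1 roles of '11 people' mentioned in the note.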

In 2016 SemEval held its first AMR parsing challenge and received strong submissions from 11 diverse teams. In 2017 we have extended the challenge as follows:

Subtask 1: Parsing Biomedical Data

As in 2016, participants will be provided with parallel English-AMR training data. They will have to parse new English data and return the obtained AMRs. The genre of the data is quite different from that in 2016. It focuses on scientific articles regarding cancer pathway discovery.

Here is an example parse of the sentence "Among tested agents, the B-Raf inhibitor dabrafenib was found to induce a strong V600E-dependent shift in cell viability."

(f / find-01
      :ARG1 (i2 / induce-01
            :ARG0 (s / small-molecule :name (n3 / name :op1 "dabrafenib")
                  :ARG0-of (i3 / inhibit-01
                        :ARG1 (e2 / enzyme :name (n2 / name :op1 "B-Raf")))
                  :ARG1-of (i4 / include-01
                        :ARG2 (a / agent
                              :ARG1-of (t2 / test-01))))
            :ARG2 (s2 / shift-01
                  :ARG1 (v / viability
                        :mod (c / cell))
                  :ARG0-of (d / depend-01
                        :ARG1 (m / mutate-01 :value "V600E"))
                  :mod (s3 / strong))))

Participants may use any resources at their disposal (but may not hand-annotate the blind data or hire other human beings to hand-annotate the blind data). The SemEval trophy goes to the system with the highest Smatch score.
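For readers unfamiliar with Smatch, the sketch below illustrates the idea behind it: both AMRs are viewed as sets of triples, and the score is the F1 of the triple overlap under the best one-to-one mapping of variables. This is a simplified illustration, not the official scorer. It searches all variable mappings by brute force, which is only feasible for toy graphs, whereas the actual Smatch tool uses hill-climbing with random restarts and differs in other details. The function name smatch_like_f1 and the toy graphs are hypothetical, and the code assumes that no constant or concept is spelled like a variable name.

```python
from itertools import permutations


def smatch_like_f1(gold, pred):
    """Brute-force triple-overlap F1 under the best variable mapping.

    gold and pred are lists of (source, role, target) triples whose
    sources are always variables.
    """
    gold_vars = sorted({s for s, _, _ in gold})
    pred_vars = sorted({s for s, _, _ in pred})
    # Pad the gold variables so every predicted variable can map somewhere.
    extra = max(0, len(pred_vars) - len(gold_vars))
    padded = gold_vars + [f"_unmatched{i}" for i in range(extra)]

    gold_set, pred_set = set(gold), set(pred)
    best = 0
    for image in permutations(padded, len(pred_vars)):
        mapping = dict(zip(pred_vars, image))
        # Rename predicted variables; constants and concepts stay as they are.
        renamed = {(mapping[s], r, mapping.get(t, t)) for s, r, t in pred_set}
        best = max(best, len(renamed & gold_set))

    if best == 0:
        return 0.0, 0.0, 0.0
    precision = best / len(pred_set)
    recall = best / len(gold_set)
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy gold graph: (s / send-01 :ARG1 (p / person) :ARG2 (h / hospital))
gold = [("s", ":instance", "send-01"), ("p", ":instance", "person"),
        ("h", ":instance", "hospital"), ("s", ":ARG1", "p"), ("s", ":ARG2", "h")]
# Toy system output with different variable names and one wrong role.
pred = [("x", ":instance", "send-01"), ("y", ":instance", "person"),
        ("z", ":instance", "hospital"), ("x", ":ARG1", "y"), ("x", ":ARG0", "z")]

precision, recall, f1 = smatch_like_f1(gold, pred)
print(precision, recall, f1)
```

In the toy pair, four of the five triples match once the variables are aligned, so the differing variable names cost nothing and only the wrong role label is penalized.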

More example Bio data with AMRs can be found here

Subtask 2: AMR-to-English Generation

In this completely new subtask, participants will be provided with AMRs and will have to generate valid English sentences. Scoring will make use of human evaluation. The domain of this subtask will be general news and discussion forums, much like the data used in the 2016 parsing task.

For the AMR from above:

(s / say-01
      :ARG0 (s2 / service
            :mod (e / emergency)
            :location (c / city :wiki "London"
                  :name (n / name :op1 "London")))
      :ARG1 (s3 / send-01
            :ARG1 (p / person :quant 11)
            :ARG2 (h / hospital)
            :mod (a / altogether)
            :purpose (t / treat-03
                  :ARG1 p
                  :ARG2 (w / wound-01
                        :ARG1 p
                        :mod (m / minor)))))

a correct answer would, of course, be "The London emergency services said that altogether 11 people had been sent to hospital for treatment due to minor wounds." However, another correct answer would be "London emergency services say that altogether eleven people were sent to the hospital for treating of their minor wounds." Sentences will be automatically scored by single-reference BLEU and possibly other automated metrics as well. However, they will also be scored by human preference judgments, using the methods (and interface) employed by WMT. Ultimately, the results judged best by human evaluators get the SemEval trophy.
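The following sketch shows roughly how single-reference BLEU behaves on the two answers above. It is a simplified illustration under stated assumptions, not the scoring script used for the task: the function names corpus_bleu and ngrams are hypothetical, tokenization is plain lowercasing and whitespace splitting, n-grams up to length 4 are used, and no smoothing is applied.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus-level BLEU with one reference per hypothesis."""
    matched = [0] * max_n
    total = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.lower().split(), ref.lower().split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped matches: an n-gram counts at most as often as in the reference.
            matched[n - 1] += sum((h_counts & r_counts).values())
            total[n - 1] += sum(h_counts.values())
    if min(matched) == 0:
        return 0.0  # without smoothing, one empty n-gram order zeroes the score
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


reference = ("The London emergency services said that altogether 11 people had "
             "been sent to hospital for treatment due to minor wounds.")
paraphrase = ("London emergency services say that altogether eleven people were "
              "sent to the hospital for treating of their minor wounds.")

exact_score = corpus_bleu([reference], [reference])
paraphrase_score = corpus_bleu([paraphrase], [reference])
print(exact_score, paraphrase_score)
```

Under these assumptions the paraphrase shares no 4-gram with the reference, so on this single sentence its unsmoothed score is zero even though it is an acceptable answer. This is one illustration of why the automatic metrics are complemented by human preference judgments.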

Example general-domain data with AMRs can be found here

Existing AMR-related research: Kevin Knight has been keeping a list here. It is hard to keep up, though, so please send email to jonmay@isi.edu if your work is missing and you want a citation.

SemEval-2017 Task 9

We peek inside your brain so you don't have to!