Task Description: Aspect Based Sentiment Analysis (ABSA)
Sentiment analysis is increasingly viewed as a vital task from both an academic and a commercial standpoint. The majority of current approaches, however, attempt to detect the overall polarity of a sentence, paragraph, or text span, regardless of the entities mentioned (e.g., laptops, restaurants) and their aspects (e.g., battery, screen; food, service). By contrast, this task is concerned with aspect-based sentiment analysis (ABSA), where the goal is to identify the aspects of given target entities and the sentiment expressed towards each aspect. Datasets of customer reviews with human-authored annotations identifying the mentioned aspects of the target entities and the sentiment polarity of each aspect will be provided.
In particular, the task consists of the following subtasks:
Subtask 1: Aspect term extraction
Given a set of sentences with pre-identified entities (e.g., restaurants), identify the aspect terms present in the sentence and return a list containing all the distinct aspect terms. An aspect term names a particular aspect of the target entity.
For example, "I liked the service and the staff, but not the food”, “The food was nothing much, but I loved the staff”. Multi-word aspect terms (e.g., “hard disk”) should be treated as single terms (e.g., in “The hard disk is very noisy” the only aspect term is “hard disk”).
Subtask 2: Aspect term polarity
For a given set of aspect terms within a sentence, determine whether the polarity of each aspect term is positive, negative, neutral or conflict (i.e., both positive and negative).
For example:
“I loved their fajitas” → {fajitas: positive}
“I hated their fajitas, but their salads were great” → {fajitas: negative, salads: positive}
“The fajitas are their first plate” → {fajitas: neutral}
“The fajitas were great to taste, but not to see” → {fajitas: conflict}
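As a toy sketch of this label set, the classifier below counts hypothetical positive and negative cue words in a small window around the aspect term; the cue lists are illustrative assumptions, and a heuristic this crude will of course miss cases like the “conflict” example above.

# Toy Subtask 2 sketch; the cue-word sets are illustrative assumptions.
POSITIVE_CUES = {"loved", "great", "fabulous", "delicious"}
NEGATIVE_CUES = {"hated", "noisy", "sub-par"}

def term_polarity(sentence: str, term: str, window: int = 3) -> str:
    """Map one aspect term to positive/negative/neutral/conflict."""
    tokens = sentence.lower().replace(",", "").split()
    if term.lower() not in tokens:
        return "neutral"
    i = tokens.index(term.lower())
    context = tokens[max(0, i - window): i + window + 1]
    pos = any(t in POSITIVE_CUES for t in context)
    neg = any(t in NEGATIVE_CUES for t in context)
    if pos and neg:
        return "conflict"
    return "positive" if pos else ("negative" if neg else "neutral")

print(term_polarity("I loved their fajitas", "fajitas"))              # positive
print(term_polarity("The fajitas are their first plate", "fajitas"))  # neutral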
Subtask 3: Aspect category detection
Given a predefined set of aspect categories (e.g., price, food), identify the aspect categories discussed in a given sentence. Aspect categories are typically coarser than the aspect terms of Subtask 1, and they do not necessarily occur as terms in the given sentence.
For example, given the set of aspect categories {food, service, price, ambience, anecdotes/miscellaneous}:
“The restaurant was too expensive” → {price}
“The restaurant was expensive, but the menu was great” → {price, food}
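A minimal keyword-matching sketch of this subtask follows (the keyword sets are illustrative assumptions; note that “expensive” triggers the price category even though the word “price” never appears, which is exactly what distinguishes category detection from term extraction):

# Toy Subtask 3 sketch; CATEGORY_KEYWORDS is an illustrative assumption.
CATEGORY_KEYWORDS = {
    "price": {"expensive", "cheap", "pricey"},
    "food": {"food", "menu", "delicious"},
    "service": {"service", "staff", "waiter"},
    "ambience": {"ambience", "decor", "atmosphere"},
}

def detect_categories(sentence: str) -> set[str]:
    """Return every aspect category whose keywords occur in the sentence."""
    tokens = set(sentence.lower().replace(",", "").split())
    hits = {cat for cat, kws in CATEGORY_KEYWORDS.items() if tokens & kws}
    # Fall back to the catch-all category when nothing matches.
    return hits or {"anecdotes/miscellaneous"}

print(detect_categories("The restaurant was too expensive"))
# -> {'price'}
print(detect_categories("The restaurant was expensive, but the menu was great"))
# -> {'price', 'food'} (set order may vary)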
Subtask 4: Aspect category polarity
Given a set of pre-identified aspect categories (e.g., {food, price}), determine the polarity (positive, negative, neutral or conflict) of each aspect category.
For example:
“The restaurant was too expensive” → {price: negative}
“The restaurant was expensive, but the menu was great” → {price: negative, food: positive}
Datasets:
Two domain-specific datasets, for laptops and restaurants, consisting of over 6K sentences with fine-grained aspect-level human annotations, have been provided for training.
Restaurant reviews:
This dataset consists of over 3K English sentences from the restaurant reviews of Ganu et al. (2009). The original dataset of Ganu et al. included annotations for coarse aspect categories (Subtask 3) and overall sentence polarities; we modified the dataset to include annotations for aspect terms occurring in the sentences (Subtask 1), aspect term polarities (Subtask 2), and aspect category-specific polarities (Subtask 4). We also corrected some errors (e.g., sentence splitting errors) of the original dataset. Experienced human annotators identified the aspect terms of the sentences and their polarities (Subtasks 1 and 2). Additional restaurant reviews, not in the original dataset of Ganu et al. (2009), are being annotated in the same manner, and they will be used as test data.
Laptop reviews:
This dataset consists of over 3K English sentences extracted from customer reviews of laptops. Experienced human annotators tagged the aspect terms of the sentences (Subtask 1) and their polarities (Subtask 2). This dataset will be used only for Subtasks 1 and 2. Part of this dataset will be reserved as test data.
Dataset format:
The sentences in the datasets are annotated using XML tags.
The following example illustrates the format of the annotated sentences of the restaurants dataset.
<sentence id="813">
<text>All the appetizers and salads were fabulous, the steak was mouth watering and the pasta was delicious!!!</text>
<aspectTerms>
<aspectTerm term="appetizers" polarity="positive" from="8" to="18"/>
<aspectTerm term="salads" polarity="positive" from="23" to="29"/>
<aspectTerm term="steak" polarity="positive" from="49" to="54"/>
<aspectTerm term="pasta" polarity="positive" from="82" to="87"/>
</aspectTerms>
<aspectCategories>
<aspectCategory category="food" polarity="positive"/>
</aspectCategories>
</sentence>
The possible values of the polarity field are: “positive”, “negative”, “conflict”, “neutral”. The possible values of the category field are: “food”, “service”, “price”, “ambience”, “anecdotes/miscellaneous”.
The following example illustrates the format of the annotated sentences of the laptops dataset. The format is the same as in the restaurants dataset, except that there are no annotations for aspect categories. Notice that we annotate only aspect terms naming particular aspects (e.g., “everything about it” does not name a particular aspect).
<sentence id="353">
<text>From the build quality to the performance, everything about it has been sub-par from what I would have expected from Apple.</text>
<aspectTerms>
<aspectTerm term="build quality" polarity="negative" from="9" to="22"/>
<aspectTerm term="performance" polarity="negative" from="30" to="41"/>
</aspectTerms>
</sentence>
In the sentences of both datasets, there is an <aspectTerm … /> element for each occurrence of an aspect term. For example, if the previous sentence contained two occurrences of the aspect term “performance”, there would be two <aspectTerm … /> elements, differing only in their from and to offsets if both occurrences had the same (e.g., negative) polarity. If a sentence has no aspect terms, there is no <aspectTerms> … </aspectTerms> element in its annotations, and similarly for the aspect categories in the restaurants dataset.
Please note that:
- Any quote within an aspect term (e.g., "sales" team) has been replaced with &quot; (the text and the offsets remain the same), e.g., <aspectTerm term="&quot;sales&quot; team" .../>.
- The sentences may contain spelling mistakes. The identified aspect terms should be returned exactly as they appear in the sentences, even if misspelled (e.g., if "warranty" appears as "warrenty", return "warrenty").
- For each aspect term of the training data we include two attributes ("from" and "to") that indicate its start and end offsets in the text (e.g., <aspectTerm term="staff" polarity="negative" from="8" to="13"/>); see the parsing sketch below.
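Under these conventions, the datasets can be loaded with Python's standard xml.etree.ElementTree, as in the sketch below; the filename restaurants-train.xml is a hypothetical placeholder, and the parser decodes &quot; entities back to plain quotes, so the from/to slice of the sentence text should reproduce each term exactly.

import xml.etree.ElementTree as ET

# "restaurants-train.xml" is a placeholder; substitute the actual file.
root = ET.parse("restaurants-train.xml").getroot()

for sentence in root.iter("sentence"):
    text = sentence.find("text").text
    # A sentence without aspect terms has no <aspectTerms> element,
    # in which case iter("aspectTerm") simply yields nothing.
    for term in sentence.iter("aspectTerm"):
        start, end = int(term.get("from")), int(term.get("to"))
        # ElementTree decodes &quot; to ", so the offsets line up with the text.
        assert text[start:end] == term.get("term")
        print(sentence.get("id"), term.get("term"), term.get("polarity"))
    # Aspect categories appear only in the restaurants dataset.
    for cat in sentence.iter("aspectCategory"):
        print(sentence.get("id"), cat.get("category"), cat.get("polarity"))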
Evaluation:
Details of the evaluation measures will become available in due time.
All participating teams have been provided with annotated training data (sentences from the two datasets) in the format discussed above, to train their systems. During the SemEval evaluation phase, unlabeled test data will be provided. As in previous SemEval sentiment analysis tasks, each team may submit two runs:
- Constrained - using only the provided training data and other resources, such as lexicons.
- Unconstrained - using additional data for training.
Teams will be asked to report what resources they used for each submitted run.
References:
S. Brody and N. Elhadad, “An unsupervised aspect-sentiment model for online reviews”. Proceedings of NAACL, pp. 804–812, Los Angeles, CA, 2010.
G. Ganu, N. Elhadad, and A. Marian, “Beyond the stars: Improving rating predictions using review text content”. Proceedings of the 12th International Workshop on the Web and Databases, Providence, Rhode Island, 2009.
M. Hu and B. Liu, “Mining and summarizing customer reviews”. Proceedings of the 10th KDD, pp. 168–177, Seattle, WA, 2004.
S.-M. Kim and E. Hovy, “Extracting opinions, opinion holders, and topics expressed in online news media text”. Proceedings of the Workshop on Sentiment and Subjectivity in Text, pp. 1–8, Sydney, Australia, 2006.
B. Liu, Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies. Morgan & Claypool, 2012.
S. Moghaddam and M. Ester, “Opinion digger: an unsupervised opinion miner from unstructured product reviews”. Proceedings of the 19th CIKM, pp. 1825–1828, Toronto, ON, 2010.
M. Tsytsarau and T. Palpanas, “Survey on mining subjective data on the web”. Data Mining and Knowledge Discovery, 24(3):478–514, 2012.
Z. Zhai, B. Liu, H. Xu, and P. Jia, “Clustering product features for opinion mining”. Proceedings of the 4th International Conference on Web Search and Data Mining (WSDM), pp. 347–354, Hong Kong, 2011.