GermEval 2017 Shared Task on Aspect-based Sentiment in Social Media Customer Feedback


In the connected, modern world, customer feedback is a valuable source of insights into the quality of products and services. This feedback lets customers benefit from the experiences of others and enables businesses to react to requests, complaints or recommendations. However, the more people use a product or service, the more feedback is generated, and the major challenge becomes analyzing huge amounts of feedback in an efficient but still meaningful way.

Thus, we propose a shared task on automatically analyzing customer reviews about “Deutsche Bahn”, the German public train operator with about two billion passengers each year.


“RT @XXX: Da hört jemand in der Bahn so laut ‘700 Main Street’ durch seine Kopfhörer, dass ich mithören kann. :( :( :(”

As the example shows, insights can be derived from reviews at different levels of granularity. The review contains a general evaluation of the trip (the customer disliked it). Furthermore, the review evaluates a specific aspect of the train journey (“laut” → the customer did not like the noise level).

Consequently, we frame the task as aspect-based sentiment analysis with four subtasks:


Subtask A) Relevance Classification

Determine whether a social media post contains feedback about the «Deutsche Bahn» or whether it is off-topic or contains no evaluation.


Ehrlich die männer in Der Bahn haben keine manieren?<tab>relevant<tab>negative<tab>Atmosphäre#Haupt:negative

In the given post, the task is to identify that the post is relevant.

Subtask B) Document-level Polarity

Identify whether the customer evaluates the «Deutsche Bahn» or the trip as positive, negative or neutral.


Re: Ingo Lenßen Guten morgen Ingo...bei mir kein regen aber bahn fehr wieder nicht....liebe grusse ....<tab>relevant<tab>negative <tab>Sonstige_Unregelmässigkeiten#Haupt:negativ 

In the given post, the task is to identify the post's polarity: negative.
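To make the line layout in the examples for Subtasks A and B concrete, here is a minimal Python sketch that splits one line into its fields. It assumes that `<tab>` in the examples stands for a real tab character and that the columns are exactly the four shown (text, relevance, document polarity, aspects); the actual data files may contain further columns. The names `Post` and `parse_line` are made up for illustration. Only the label `relevant` appears in the examples, so the sketch treats any other value as not relevant.

```python
from typing import NamedTuple


class Post(NamedTuple):
    text: str
    relevant: bool   # Subtask A label
    sentiment: str   # Subtask B label
    aspects: str     # raw aspect field, left unparsed here


def parse_line(line: str) -> Post:
    """Split one tab-separated line into the four fields shown in the examples."""
    # Some fields in the examples carry trailing spaces, so strip each one.
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    text, relevance, sentiment = fields[0], fields[1], fields[2]
    # Assumption: the aspect column may be missing, e.g. for off-topic posts.
    aspects = fields[3] if len(fields) > 3 else ""
    return Post(text, relevance == "relevant", sentiment, aspects)


line = (
    "Re: Ingo Lenßen Guten morgen Ingo...bei mir kein regen aber bahn fehr "
    "wieder nicht....liebe grusse ....\trelevant\tnegative \t"
    "Sonstige_Unregelmässigkeiten#Haupt:negativ "
)
post = parse_line(line)
print(post.relevant, post.sentiment)
```

Subtask A then corresponds to predicting `post.relevant` from `post.text`, and Subtask B to predicting `post.sentiment`.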

Subtask C) Aspect-level Polarity

Identify all aspects that are positively or negatively evaluated within the review. To increase comparability, the aspects are grouped into predefined categories (see data). Consequently, the aim of this subtask is to identify all categories contained in a post and their associated polarity.


"Alle so ""Yeah, Streik beendet"" Bahn so ""Okay, dafür werden dann natürlich die Tickets teurer"" Alle so ""Können wir wieder Streik haben?"""  <tab>relevant<tab>neutral <tab>Ticketkauf#Haupt:negativ Allgemein#Haupt:positive

In the given post, the task is to identify the aspects (and their polarity): Ticketkauf#Haupt:negativ Allgemein#Haupt:positive
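The aspect field in the example can be read as a space-separated list of `Category#Subcategory:polarity` items. The following Python sketch splits such a field into (category, polarity) pairs. The function name `parse_aspects` is hypothetical, and the reading of the format is inferred from the examples above rather than from a format specification.

```python
def parse_aspects(field: str) -> list[tuple[str, str]]:
    """Turn 'Cat#Sub:polarity Cat#Sub:polarity' into (category, polarity) pairs."""
    pairs = []
    for item in field.split():
        # Split on the last colon so the category part stays intact.
        category, sep, polarity = item.rpartition(":")
        if not sep:
            # Assumption: items without a colon are malformed and are skipped.
            continue
        pairs.append((category, polarity))
    return pairs


aspects = parse_aspects("Ticketkauf#Haupt:negativ Allgemein#Haupt:positive")
print(aspects)
```

The examples spell the negative label both as `negative` and as `negativ`. The sketch leaves labels untouched, so any normalization would have to be added by the caller.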

Subtask D) Opinion Target Extraction

Identify the linguistic expressions in the posts that are used to express the aspect-based sentiment (subtask C). Each opinion target expression is defined by its starting and ending offsets.


   <text>@m_wabersich IC 2151? Der fährt nicht. Ich habe Ihnen die Alternative bereits genannt. /je</text>
       <Opinion category="Sonstige_Unregelmässigkeiten#Haupt" from="26" to="37" polarity="negative" target="fährt nicht"/>

In the given post, the task is to identify the target expression “fährt nicht”.

The full dataset is available in both TSV and XML. Subtask D, however, can only be done using the XML format, as the spans of the opinion target expressions are not available in TSV. For more detail, see the data section.
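The offsets in the XML example can be checked with a few lines of Python. The sketch below reads `from` and `to` as character offsets into the post text, with `to` being exclusive, which is what reproduces the `target` attribute for this example. The enclosing `<Document>` element is an assumption added only so that the fragment parses on its own; the real files may nest these elements differently.

```python
import xml.etree.ElementTree as ET

# The <Document> wrapper is hypothetical; the two inner elements are from the example.
snippet = """<Document>
  <text>@m_wabersich IC 2151? Der fährt nicht. Ich habe Ihnen die Alternative bereits genannt. /je</text>
  <Opinion category="Sonstige_Unregelmässigkeiten#Haupt" from="26" to="37" polarity="negative" target="fährt nicht"/>
</Document>"""

doc = ET.fromstring(snippet)
text = doc.findtext("text")

spans = []
for opinion in doc.iter("Opinion"):
    start, end = int(opinion.get("from")), int(opinion.get("to"))
    # Slice the post text with the offsets and keep the annotated target alongside.
    spans.append((text[start:end], opinion.get("target"), opinion.get("polarity")))

print(spans)
```

For this post, the sliced span and the `target` attribute agree, so a system for Subtask D would be expected to output the offsets 26 and 37.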

About the author: Lidia Pivovarova

Saint Petersburg State University (SPbSU), senior lecturer; University of Helsinki, PhD student