The use of linguistic resources beyond the scope of language studies, e.g., for commercial
purposes, has become commonplace now that massive amounts of data and the
software tools to process them are available. An interesting perspective
on these data is provided by Sentiment Analysis, which attempts to identify
the polarity of a text, but can also pursue further, more challenging aims, such as the
automatic identification of the specific entities and aspects being discussed in the
evaluative speech act, along with the polarity associated with them. This approach,
known as aspect-based sentiment analysis, seeks to offer fine-grained information
from raw text, but its success depends largely on the existence of pre-annotated
domain-specific corpora, which in turn calls for the design and validation of
an annotation schema. This paper examines the methodological aspects involved in
the creation of such a schema and is motivated by the scarcity of information
on this topic in the literature. We describe the insights we obtained from the annotation
schema generation and validation process within our project, whose objectives
include the development of advanced sentiment analysis software for user reviews in
the tourism sector. We focus on the identification of the relevant entities and attributes
in the domain, which we extract from a corpus of user reviews, and go on to
describe the schema creation and validation process. We also describe the
corpus annotation process and its subsequent iterative refinement by means of several
inter-annotator agreement measurements, a step we believe is key to a successful
annotation schema.
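
As an illustration of what an inter-annotator agreement measurement involves, the following minimal sketch computes Cohen's kappa for two annotators who label the same review segments. The choice of Cohen's kappa is an assumption made for illustration, since the measures used in the project are not specified here; the function name `cohen_kappa` and the entity labels are likewise hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Illustrative only: the agreement measure actually used in the
    project is not specified, so kappa is an assumed stand-in.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")

    n = len(labels_a)

    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: derived from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical entity labels assigned to six review segments.
annotator_a = ["room", "food", "staff", "room", "food", "staff"]
annotator_b = ["room", "food", "staff", "food", "food", "room"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

In an iterative refinement loop of the kind described above, a low score on a given category would typically prompt a revision of that category's definition before the next annotation round.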