Published online by Cambridge University Press: 09 July 2015
Variability is inherent in human language, as different people make different choices when facing the same communicative act. In Natural Language Processing, variability is a challenge: it hinders some tasks, such as the evaluation of generated expressions, yet it is also a valuable resource for achieving naturalness and avoiding repetitiveness. In this work, we present a methodological approach to studying the influence of lexical variability. We apply this approach to TUNA, a corpus of referring expression lexicalizations, in order to study the use of different lexical choices. First, we reannotate the TUNA corpus with new information about lexicalization; we then analyze this reannotation to study how people lexicalize referring expressions. The results show that people tend to be consistent when generating referring expressions and that, at the same time, different people share certain preferences.