Stance detection is the task of identifying the beliefs expressed in a document. While researchers widely use sentiment analysis for this, recent research demonstrates that sentiment and stance are distinct. This paper advances text analysis methods by precisely defining stance detection and outlining three approaches: supervised classification, natural language inference, and in-context learning. I discuss how document context and trade-offs between resources and workload should inform the choice of method. For all three approaches I provide guidance on application and validation techniques, as well as coding tutorials for implementation. Finally, I demonstrate how newer classification approaches can replicate supervised classifiers.
The exponential growth of social media data in the era of Web 2.0 has necessitated advanced techniques for sentiment analysis. While sentiment analysis in monolingual datasets has received significant attention, sentiment analysis in code-mixed datasets still needs further study. Code-mixed data often contain a mixture of monolingual content (possibly in transliterated form), single-script but multilingual content, and multi-script multilingual content. This paper explores the issue from three important angles. What is the best strategy for handling the data for sentiment detection? Should the classifier be trained on the whole dataset or only on its pure code-mixed subset? How important is language identification (LID) for the task? If LID is to be done, how and when should it be used to yield the best performance? We explore these questions in the light of three datasets of Tamil–English, Kannada–English, and Malayalam–English YouTube social media comments. Our solution incorporates mBERT and an optional LID module. We report our results using a set of metrics: precision, recall, $F_1$ score, and accuracy. The solutions provide a considerable performance gain and some interesting insights for sentiment analysis of code-mixed data.
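The four metrics named in the abstract above can be illustrated with a minimal sketch. The function name `binary_metrics`, the label strings, and the toy labels are assumptions for illustration only; they are not taken from the paper, which may use multi-class or averaged variants of these metrics.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute precision, recall, F1 and accuracy for one target class.

    Hypothetical helper: treats `positive` as the class of interest and
    every other label as "not positive".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy gold labels and predictions (invented for this example).
gold = ["positive", "positive", "negative", "negative", "positive"]
pred = ["positive", "negative", "negative", "positive", "positive"]
scores = binary_metrics(gold, pred)
print(scores)
```

Here $F_1$ is the harmonic mean of precision and recall, so it is high only when both are high; accuracy, by contrast, can look good on imbalanced data even when the minority class is rarely detected.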
What has allowed inequalities in material resources to mount in advanced democracies? This chapter considers the role of media reporting on the economy in weakening accountability mechanisms that might otherwise have incentivized governments to pursue more equal outcomes. Building on prior work on the United States, we investigate how journalistic depictions of the economy relate to real distributional developments across OECD countries. Using sentiment analysis of economic news content, we demonstrate that the evaluative content of the economic news strongly and disproportionately tracks the fortunes of the very rich and that good (bad) economic news is more common in periods of rising (falling) income shares at the top. We then propose and test an explanation in which pro-rich biases in news tone arise from a journalistic focus on the performance of the economy in the aggregate, while aggregate growth is itself positively correlated with relative gains for the rich. The chapter’s findings suggest that the democratic politics of inequality may be shaped in important ways by the skewed nature of the informational environment within which citizens form economic evaluations.
It is well known that politicians speak differently when campaigning. The shadow of elections may affect candidates' change in tone during campaigns. However, to date, we lack a systematic study of the changes in communication patterns between campaign and non-campaign periods. In this study, we examine the sentiment expressed in 4.3 million tweets posted by members of national parliaments in the EU27 from 2018 to 2020. Our results show that (1) the opposition, even populists and Eurosceptics, send more positive messages during campaigns, (2) parties trailing in the polls communicate more negatively, and (3) the changes are similar in national and European elections. These findings show the need to look beyond campaign times to understand parties' appeals and highlight the promise of social media data to move beyond traditional analyses of manifestos and speeches.
The rapid onset of coronavirus disease 2019 (COVID-19) created a complex virtual collective consciousness. Misinformation and polarization were hallmarks of the pandemic in the United States, highlighting the importance of studying public opinion online. People express their thoughts and feelings more openly than ever before on social media; the co-occurrence of multiple data sources has become valuable for monitoring and understanding public sentimental preparedness and response to an event within our society.
Methods:
In this study, Twitter and Google Trends data were used as the co-occurrence data to understand the dynamics of sentiment and interest during the COVID-19 pandemic in the United States from January 2020 to September 2021. Developmental trajectory analysis of Twitter sentiment was conducted using corpus linguistic techniques and word cloud mapping to reveal 8 positive and negative sentiments and emotions. Machine learning algorithms were used for opinion mining, examining how Twitter sentiment was related to Google Trends interest and to historical COVID-19 public health data.
Results:
The sentiment analysis went beyond polarity to detect specific feelings and emotions during the pandemic.
Conclusions:
The emotion detection results, when associated with the historical COVID-19 data and Google Trends data, revealed how emotions behaved at each stage of the pandemic.
We examine the binary classification of sentiment views for verbal multiword expressions (MWEs). Sentiment views denote the perspective of the holder of some opinion. We distinguish between MWEs conveying the view of the speaker of the utterance (e.g., in “The company reinvented the wheel” the holder is the implicit speaker who criticizes the company for creating something already existing) and MWEs conveying the view of explicit entities participating in an opinion event (e.g., in “Peter threw in the towel” the holder is Peter having given up something). The task has so far been examined on unigram opinion words. Since many features found effective for unigrams are not usable for MWEs, we propose novel ones taking into account the internal structure of MWEs, a unigram sentiment-view lexicon and various information from Wiktionary. We also examine distributional methods and show that the corpus on which a representation is induced has a notable impact on the classification. We perform an extrinsic evaluation in the task of opinion holder extraction and show that the learnt knowledge also improves a state-of-the-art classifier trained on BERT. Sentiment-view classification is typically framed as a task in which only little labeled training data are available. As in the case of unigrams, we show that for MWEs a feature-based approach beats state-of-the-art generic methods.
Social media data are rapidly evolving and accessible, which presents opportunities for research. Data science techniques, such as sentiment or emotion analysis which analyse textual emotion, provide an opportunity to gather insight from social media. This paper describes a systematic scoping review of interdisciplinary evidence to explore how sentiment or emotion analysis methods alongside other data science methods have been used to examine nutrition, food and cooking social media content. A PRISMA search strategy was used to search nine electronic databases in November 2020 and January 2022. Of 7325 studies identified, thirty-six studies were selected from seventeen countries, and content was analysed thematically and summarised in an evidence table. Studies were published between 2014 and 2022 and used data from seven different social media platforms (Twitter, YouTube, Instagram, Reddit, Pinterest, Sina Weibo and mixed platforms). Five themes of research were identified: dietary patterns, cooking and recipes, diet and health, public health and nutrition and food in general. Papers developed a sentiment or emotion analysis tool or used available open-source tools. Accuracy to predict sentiment ranged from 33·33% (open-source engine) to 98·53% (engine developed for the study). The average proportion of sentiment was 38·8% positive, 46·6% neutral and 28·0% negative. Additional data science techniques used included topic modelling and network analysis. Future research requires optimising data extraction processes from social media platforms, the use of interdisciplinary teams to develop suitable and accurate methods for the subject and the use of complementary methods to gather deeper insights into these complex data.
Patients increasingly use physician rating websites to evaluate and choose potential healthcare providers. A sentiment analysis and machine learning approach can uniquely analyse written prose to quantitatively describe patients’ perspectives from interactions with their physicians.
Methods
Online written reviews and star scores were analysed from Healthgrades.com using a natural language processing sentiment analysis package. Demographics of otolaryngologists were compared and a multivariable regression for individual words was performed.
Results
This study analysed 18 546 online reviews of 1240 otolaryngologists across the USA. Younger otolaryngologists (aged less than 40 years) had higher sentiment and star scores compared with older otolaryngologists (p < 0.001). Male otolaryngologists had higher sentiment and star scores compared with female otolaryngologists (p < 0.001). ‘Confident’, ‘kind’, ‘recommend’ and ‘comfortable’ were words associated with positive reviews (p < 0.001).
Conclusion
Positive bedside manner was strongly reflected in better reviews, and younger age and male gender of the otolaryngologist were associated with better sentiment and star scores.
The present study aims to examine coronavirus disease 2019 (COVID-19) vaccination discussions on Twitter in Turkey and conduct sentiment analysis.
Methods:
The current study performed sentiment analysis of Twitter data using an artificial intelligence (AI) natural language processing (NLP) method. The tweets were retrieved retrospectively from March 10, 2020, when the first COVID-19 case was seen in Turkey, to April 18, 2022. A total of 10,308 tweets were accessed. The data were filtered before analysis because of excessive noise. First, the text was tokenized. Several steps were then applied to normalize the texts. Tweets about the COVID-19 vaccines were classified according to basic emotion categories using sentiment analysis. The resulting dataset was used for training and testing machine learning (ML) classifiers.
Results:
It was determined that 7.50% of the tweeters had positive, 0.59% negative, and 91.91% neutral opinions about the COVID-19 vaccination. When the accuracy values of the ML algorithms used in this study were examined, it was seen that the XGBoost (XGB) algorithm had higher scores.
Conclusions:
Three of 4 tweets consist of negative and neutral emotions. The responsibility of professional chambers and the public is essential in transforming these neutral and negative feelings into positive ones.
Deep neural networks as an end-to-end approach lack robustness from an application point of view, as it is very difficult to fix an obvious problem without retraining the model, for example, when a model consistently predicts positive when seeing the word “terrible.” Meanwhile, it is less often stressed that the commonly used attention mechanism is likely to “over-fit” by being overly sparse, so that some key positions in the input sequence could be overlooked by the network. To address these problems, we proposed a lexicon-enhanced attention LSTM model in 2019, named ATLX. In this paper, we describe extended experiments and analysis of the ATLX model. We also try to further improve the aspect-based sentiment analysis system by incorporating a vector-based sentiment domain adaptation method.
Sentiment analysis has gained widespread adoption in many fields, but not—until now—in literary studies. Scholars have lacked a robust methodology that adapts the tool to the skills and questions central to literary scholars. Also lacking has been quantitative data to help the scholar choose between the many models. Which model is best for which narrative, and why? By comparing over three dozen models, including the latest Deep Learning AI, the author details how to choose the correct model—or set of models—depending on the unique affective fingerprint of a narrative. The author also demonstrates how to combine a clustered close reading of textual cruxes in order to interpret a narrative. By analyzing a diverse and cross-cultural range of texts in a series of case studies, the Element highlights new insights into the many shapes of stories.
This chapter picks up a range of applications of corpus linguistics that have not been covered in Chapters 6 and 7. These are: applications of bilingual corpora in contrastive linguistics and translation; forensic linguistics, with a particular focus on authorship; the automatic extraction of information or opinion; the identification of ‘fake news’; and the various topics discussed under the head of sociolinguistics. In most cases the work described here uses the methods discussed in earlier chapters, both quantitative and qualitative. Quantitative techniques are particularly important in sociolinguistics, where the differential frequencies of items used by different social groups are important. In addition, applications such as information mining or sentiment analysis stray into computational linguistics and natural language processing, and illustrate both the common ground and the disparities between these approaches and corpus linguistics.
Sentiment analysis techniques have a long history in natural language processing and have become a standard tool in the analysis of political texts, promising a conceptually straightforward automated method of extracting meaning from textual data by scoring documents on a scale from positive to negative. However, while these kinds of sentiment scores can capture the overall tone of a document, the underlying concept of interest for political analysis is often actually the document’s stance with respect to a given target—how positively or negatively it frames a specific idea, individual, or group—as this reflects the author’s underlying political attitudes. In this paper, we question the validity of approximating author stance through sentiment scoring in the analysis of political texts, and advocate for greater attention to be paid to the conceptual distinction between a document’s sentiment and its stance. Using examples from open-ended survey responses and from political discussions on social media, we demonstrate that in many political text analysis applications, sentiment and stance do not necessarily align, and therefore sentiment analysis methods fail to reliably capture ground-truth document stance, amplifying noise in the data and leading to faulty conclusions.
This paper aims to address an important yet under-studied issue – how does violence from the side of the protestors affect overseas support for a democratic movement? The importance of this question is twofold. First, while violence and radicalization are not exactly unfamiliar territories for scholars of contentious politics, they do not receive as much attention when their effects spill beyond the domestic arena. Second, this study seeks to examine international solidarity with democratic movements at the civil society level, which differs substantially from the conventional elite-centric approach to the intersection between democratization and international relations. Against this backdrop, this paper considers the relationship between violent tactics employed by the protestors during the anti-extradition movement and the sentiment expressed by people elsewhere towards the protests. To this end, a total of 9,659,770 tweets were extracted using the Twitter Application Programming Interface during the period of 1 June 2019–31 January 2020. Leveraging computational methods such as topic modelling and sentiment analysis, the findings in this paper demonstrate that a majority of foreign Twitter users were supportive of the protestors while holding relatively negative sentiments towards the government as well as the police. In addition, this study reveals that, broadly speaking, violence might cost a democratic movement its international support, but could also garner more attention at times. Despite its restricted scope, this paper will hopefully shed some useful light on the dynamics underlying international solidarity for a democratic movement abroad as well as the complex mechanisms of interaction between people who protest at home and those who observe from overseas.
Large-scale coordinated efforts have been dedicated to understanding the global health and economic implications of the COVID-19 pandemic. Yet, the rapid spread of discrimination and xenophobia against specific populations has largely been neglected. Understanding public attitudes toward migration is essential to counter discrimination against immigrants and promote social cohesion. Traditional data sources to monitor public opinion are often limited, notably due to slow collection and release activities. New forms of data, particularly from social media, can help overcome these limitations. While some bias exists, social media data are produced at an unprecedented temporal frequency and geographical granularity, are collected globally, and are accessible in real time. Drawing on a data set of 30.39 million tweets and natural language processing, this article aims to measure shifts in public sentiment about migration during the early stages of the COVID-19 pandemic in Germany, Italy, Spain, the United Kingdom, and the United States. Results show an increase in migration-related tweets along with COVID-19 cases during national lockdowns in all five countries. Yet, we found no evidence of a significant increase in anti-immigration sentiment, as rises in the volume of negative messages are offset by comparable increases in positive messages. Additionally, we present evidence of growing social polarization concerning migration, showing high concentrations of strongly positive and strongly negative sentiments.
Most analyses dealing with the interaction of parties in parliament assume their interests to be fixed between elections. However, a rational perspective suggests that parties adapt their behaviour throughout the legislative term. I argue that this change is influenced by incentives and possibilities to shape legislation and the need to distinguish oneself from competitors. While for government parties it matters whether they have to share offices, for opposition parties the influence on policy-making is important. By examining the sentiment of all parliamentary speeches on bill proposals from six established democracies over more than twenty years, I analyse institutional and contextual effects. The results show that single-party governments tend to become more positive towards the end of the legislative cycle compared to coalition governments. On the other hand, opposition parties under minority governments, or with more institutionalised influence on government bills, show a more negative trend in comparison to their counterparts.
This Element provides a basic introduction to sentiment analysis, aimed at helping students and professionals in corpus linguistics to understand what sentiment analysis is, how it is conducted, and where it can be applied. It begins with a definition of sentiment analysis and a discussion of the domains where sentiment analysis is conducted and used the most. Then, it introduces two main methods that are commonly used in sentiment analysis known as supervised machine-learning and unsupervised learning (or lexicon-based) methods, followed by a step-by-step explanation of how to perform sentiment analysis with R. The Element then provides two detailed examples or cases of sentiment and emotion analysis, with one using an unsupervised method and the other using a supervised learning method.
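The lexicon-based (unsupervised) method mentioned in the abstract above can be sketched in a few lines. The Element itself works in R; the sketch below uses Python for illustration only, and the tiny lexicon `TOY_LEXICON`, its weights, and the function names are hypothetical, not drawn from the Element or from any published sentiment dictionary.

```python
import re

# Hypothetical mini-lexicon: word -> polarity weight (invented values).
TOY_LEXICON = {
    "good": 1, "great": 2, "kind": 1,
    "bad": -1, "terrible": -2, "rude": -1,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the polarity weights of all lexicon words found in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def polarity_label(score):
    """Map a numeric score to a coarse polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in [
    "The doctor was kind and the staff were great.",
    "Terrible wait and a rude reception.",
    "The clinic is downtown.",
]:
    score = lexicon_score(sentence)
    print(sentence, score, polarity_label(score))
```

A scorer this simple ignores negation, intensifiers, and context; practical lexicon-based tools usually add rules for those, while a supervised method instead learns the mapping from labeled examples.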
We give an in-depth account of compositional matrix-space models (CMSMs), a type of generic models for natural language, wherein compositionality is realized via matrix multiplication. We argue for the structural plausibility of this model and show that it is able to cover and combine various common compositional natural language processing approaches. Then, we consider efficient task-specific learning methods for training CMSMs and evaluate their performance in compositionality prediction and sentiment analysis.
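The core idea stated above, composition via matrix multiplication, can be sketched as follows. Each word is assigned a square matrix and a phrase is represented by the ordered product of its word matrices. The 2×2 matrices and helper names here are arbitrary assumptions chosen to make the arithmetic easy to follow; they are not learned parameters and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(a))]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def compose(words, word_matrices):
    """Represent a word sequence as the ordered product of its word matrices."""
    dim = len(next(iter(word_matrices.values())))
    result = identity(dim)
    for word in words:
        result = matmul(result, word_matrices[word])
    return result


# Arbitrary toy matrices (assumptions for illustration only).
toy_matrices = {
    "not": [[0, 1], [1, 0]],
    "good": [[2, 0], [0, 1]],
}

print(compose(["not", "good"], toy_matrices))
print(compose(["good", "not"], toy_matrices))
```

Because matrix multiplication is not commutative, the two word orders yield different matrices, which is what lets a model of this kind be sensitive to word order in a way that a bag-of-words sum is not.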
With the increasing demand for personalized products and rapid market response, many companies expect to explore online user-generated content (UGC) for intelligent customer hearing and product redesign strategy. UGC has the advantages of being less biased than traditional interviews, yielding timely responses, and being widely accessible in large volumes. From online resources, customers’ preferences toward various aspects of the product can be extracted by promising sentiment analysis methods. However, due to the complexity of language, state-of-the-art sentiment analysis methods are still not accurate enough for practical use in product redesign. To tackle this problem, we propose an integrated customer hearing and product redesign system, which combines the robust use of sentiment analysis for customer hearing with coordinated redesign mechanisms. Ontology and expert knowledge are incorporated to improve accuracy. Specifically, a fuzzy product ontology that contains domain knowledge is first learned in a semi-supervised way. Then, UGC is exploited with a novel ontology-based fine-grained sentiment analysis approach. Extracted customer preference statistics are transformed into multiple levels for the automatic establishment of opportunity landscapes and a house of quality table. In addition, customer preference statistics are interactively visualized, through which representative customer feedback is concurrently generated. Through a case study of a smartphone, the effectiveness of the proposed system is validated, and applicable redesign strategies for a case product are provided. With this system, information including customer preferences, user experiences, usage habits and conditions can be exploited together for reliable product redesign strategy elicitation.
The outbreak and rapid spread of coronavirus disease 2019 (COVID-19) not only caused an adverse impact on physical health, but also brought about mental health problems among the public.
Methods
To assess the causal impact of COVID-19 on psychological changes in China, we constructed a city-level panel data set based on the sentiment expressed in the contents of 13 million geotagged tweets on Sina Weibo, China's largest microblogging platform.
Results
Applying a difference-in-differences approach, we found a significant deterioration in mental health status after the occurrence of COVID-19. We also observed that this psychological effect faded out over time during our study period and was more pronounced among women, teenagers and older adults. The mental health impact was more likely to be observed in cities with low levels of initial mental health status, economic development, medical resources and social security.
Conclusions
Our findings may assist in understanding the mental health impact of COVID-19 and yield useful insights into how to design effective psychological interventions in this kind of sudden public health event.
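The difference-in-differences approach named in the Results above can be illustrated in its simplest two-group, two-period form. This is a minimal sketch, not the study's city-level panel specification; the function name `did_estimate` and all sentiment values below are hypothetical and invented for the example.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences estimate.

    Returns the change in the treated group's mean outcome minus the
    change in the control group's mean outcome.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Invented average sentiment scores per city (NOT data from the study).
treated_pre = [0.6, 0.5, 0.7]     # affected cities, before the outbreak
treated_post = [0.3, 0.4, 0.2]    # affected cities, after the outbreak
control_pre = [0.5, 0.6, 0.55]    # comparison cities, before
control_post = [0.5, 0.55, 0.45]  # comparison cities, after

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(round(effect, 3))
```

Subtracting the control group's change removes trends common to both groups, so the remaining difference can be read as the effect of the event only under the assumption that the two groups would otherwise have followed parallel trends.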