Due to their significant role in creative design ideation, databases of causal ontology-based models of biological and technical systems have been developed. However, creating structured database entries by modeling systems with a causal ontology demands considerable expert time and effort. Researchers have therefore worked toward methods that automatically generate causal-ontology representations of systems from documents by leveraging machine learning (ML) techniques. However, these methods are built on limited, hand-annotated data and involve manual touchpoints that are not documented. While opportunities exist to improve the accuracy of these ML models, it is more important to understand the complete process of generating structured representations using a causal ontology. This research proposes a new method and a set of rules for extracting information relevant to the constructs of the SAPPhIRE model of causality from natural language descriptions of technical systems, and reports the performance of this process. The process aims to interpret information in the context of the entire description. The method starts by identifying the system interactions involving material, energy and information and then builds the causal description of each system interaction using the SAPPhIRE ontology. The method was developed iteratively, with improvements verified through user trials in every cycle. User trials with specialist and novice users of SAPPhIRE modeling showed that the method and rules help users accurately and consistently extract the information relevant to the constructs of the SAPPhIRE model from a given natural language description.
We analyze the disclosures of sustainable investing by Dutch pension funds in their annual reports by introducing a novel textual analysis approach using state-of-the-art natural language processing techniques to measure the awareness and implementation of sustainable investing. We find that a pension fund's size increases both the awareness and implementation of sustainable investing. Moreover, we analyze the role of signing a sustainable investment initiative. Although signing this initiative increases the specificity of pension fund statements about sustainable investing, we do not find an effect on the implementation of sustainable investing.
Recent advances in large language models (LLMs), such as GPT-4, have spurred interest in their potential applications across various fields, including actuarial work. This paper introduces the use of LLMs in actuarial and insurance-related tasks, both as direct contributors to actuarial modelling and as workflow assistants. It provides an overview of LLM concepts and their potential applications in actuarial science and insurance, examining specific areas where LLMs can be beneficial, including a detailed assessment of the claims process. Additionally, a decision framework for determining the suitability of LLMs for specific tasks is presented. Case studies with accompanying code showcase the potential of LLMs to enhance actuarial work. Overall, the results suggest that LLMs can be valuable tools for actuarial tasks involving natural language processing or structuring unstructured data and as workflow and coding assistants. However, their use in actuarial work also presents challenges, particularly regarding professionalism and ethics, for which high-level guidance is provided.
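To make the claims-process discussion concrete, the sketch below shows one way an LLM could structure an unstructured claim description. It is an illustrative example, not the paper's accompanying code; the model name, prompt, and output schema are assumptions.

```python
# Illustrative sketch (not the paper's code): using an LLM to turn an
# unstructured claim description into structured fields. The model name,
# prompt, and output schema are assumptions for demonstration.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

claim_text = (
    "Policyholder reports rear-end collision on 2023-05-14; bumper damage, "
    "no injuries, other driver at fault, repair estimate 1,800 EUR."
)

prompt = (
    "Extract the following fields from the claim description as JSON: "
    "date, damage_type, injuries (true/false), estimated_cost_eur.\n\n"
    f"Claim: {claim_text}"
)

response = client.chat.completions.create(
    model="gpt-4",  # named as an example only; any capable chat model works
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # deterministic output is preferable for extraction
)

# In practice the reply should be validated: models can wrap JSON in prose.
fields = json.loads(response.choices[0].message.content)
print(fields)
```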
Attempts to apply artificial intelligence (AI) to psychiatric disorders have shown moderate success, highlighting the potential of incorporating information from clinical assessments to improve the models. This study focuses on using large language models (LLMs) to detect suicide risk from medical text in psychiatric care.
Aims
To extract information about suicidality status from the admission notes in electronic health records (EHRs) using privacy-sensitive, locally hosted LLMs, specifically evaluating the efficacy of Llama-2 models.
Method
We compared the performance of several variants of the open source LLM Llama-2 in extracting suicidality status from 100 psychiatric reports against a ground truth defined by human experts, assessing accuracy, sensitivity, specificity and F1 score across different prompting strategies.
Results
A German fine-tuned Llama-2 model showed the highest accuracy (87.5%), sensitivity (83.0%) and specificity (91.8%) in identifying suicidality, with significant improvements in sensitivity and specificity across various prompt designs.
Conclusions
The study demonstrates the capability of LLMs, particularly Llama-2, to accurately extract information on suicidality from psychiatric records while preserving data privacy. This suggests their application in surveillance systems for psychiatric emergencies and in improving the clinical management of suicidality through strengthened systematic quality control and research.
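As an illustration of the prompt-based extraction the study describes, the following minimal sketch queries a locally hosted Llama-2 chat model through Hugging Face transformers. The prompt wording and answer parsing are assumptions, not the study's actual protocol.

```python
# Minimal sketch of prompt-based extraction with a locally hosted Llama-2
# chat model. Prompt wording and label parsing are illustrative assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # gated model; requires HF access
    device_map="auto",
)

note = "Patient admitted after expressing intent to end their life ..."
prompt = (
    "[INST] Does the following admission note contain evidence of "
    "suicidality? Answer only 'yes' or 'no'.\n\n"
    f"{note} [/INST]"
)

output = generator(prompt, max_new_tokens=5, do_sample=False,
                   return_full_text=False)
answer = output[0]["generated_text"].strip().lower()
print("suicidality detected" if answer.startswith("yes") else "no evidence")
```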
Social determinants of health (SDoH), such as socioeconomic status and neighborhood conditions, strongly influence health outcomes. However, standardized SDoH data remain lacking in electronic health records (EHRs), a significant barrier to research and care quality.
Methods:
We conducted a PubMed search using “SDOH” and “EHR” Medical Subject Headings terms, analyzing included articles across five domains: 1) SDoH screening and assessment approaches, 2) SDoH data collection and documentation, 3) Use of natural language processing (NLP) for extracting SDoH, 4) SDoH data and health outcomes, and 5) SDoH-driven interventions.
Results:
Of 685 articles identified, 324 underwent full review. Key findings include implementation of tailored screening instruments, census and claims data linkage for contextual SDoH profiles, NLP systems extracting SDoH from notes, associations between SDoH and healthcare utilization and chronic disease control, and integrated care management programs. However, variability across data sources, tools, and outcomes underscores the need for standardization.
Discussion:
Despite progress in identifying patient social needs, further development of standards, predictive models, and coordinated interventions is critical for SDoH-EHR integration. Additional database searches could strengthen this scoping review. Ultimately, widespread capture, analysis, and translation of multidimensional SDoH data into clinical care is essential for promoting health equity.
The rise of populism concerns many political scientists and practitioners, yet the detection of its underlying language remains fragmentary. This paper aims to provide a reliable, valid, and scalable approach to measuring populist rhetoric. For that purpose, we created an annotated dataset based on parliamentary speeches of the German Bundestag (2013–2021). Following the ideational definition of populism, we label moralizing references to “the virtuous people” or “the corrupt elite” as core dimensions of populist language. In addition, to identify how the thin ideology of populism is “thickened,” we annotate how populist statements are attached to left-wing or right-wing host ideologies. We then train a transformer-based model (PopBERT) as a multilabel classifier to detect and quantify each dimension. A battery of validation checks reveals that the model has strong predictive accuracy, provides high qualitative face validity, matches party rankings of expert surveys, and detects out-of-sample text snippets correctly. PopBERT enables dynamic analyses of how German-speaking politicians and parties use populist language as a strategic device. Furthermore, the annotator-level data may also be applied in cross-domain applications or to develop related classifiers.
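For orientation, the sketch below shows multilabel inference over the four annotated dimensions. The base checkpoint is a stand-in: with the released PopBERT weights the same code applies, but here the classification head is untrained, so the outputs are illustrative only. The label order is also an assumption.

```python
# Sketch of multilabel inference for the four annotated dimensions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "bert-base-german-cased"  # substitute the released PopBERT model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=4,
    problem_type="multi_label_classification",
)

# Assumed label order for illustration; the release may define it differently.
labels = ["people-centrism", "anti-elitism", "left-wing host", "right-wing host"]

inputs = tokenizer("Die korrupte Elite betrügt das Volk.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Multilabel setup: an independent sigmoid probability per dimension.
probs = torch.sigmoid(logits).squeeze()
for label, prob in zip(labels, probs.tolist()):
    print(f"{label}: {prob:.2f}")
```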
Stance detection is the task of identifying the beliefs expressed in a document. While researchers widely use sentiment analysis for this purpose, recent research demonstrates that sentiment and stance are distinct. This paper advances text analysis methods by precisely defining stance detection and outlining three approaches: supervised classification, natural language inference, and in-context learning. I discuss how document context and trade-offs between resources and workload should inform the choice of method. For all three approaches I provide guidance on application and validation techniques, as well as coding tutorials for implementation. Finally, I demonstrate how the newer classification approaches can replicate supervised classifiers.
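The natural language inference approach, for example, can be run with an off-the-shelf NLI model. The sketch below uses the zero-shot classification pipeline from Hugging Face transformers; the document and label phrasings are invented for illustration.

```python
# Sketch of the NLI approach to stance detection: each candidate stance is
# phrased as a hypothesis and scored against the document.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

document = "Raising the minimum wage will destroy small businesses."
hypotheses = [
    "The author supports raising the minimum wage.",
    "The author opposes raising the minimum wage.",
    "The author is neutral about raising the minimum wage.",
]

result = classifier(document, candidate_labels=hypotheses)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{score:.2f}  {label}")
```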
Military Servicemembers and Veterans are at elevated risk for suicide, but rarely disclose suicidal thoughts to their leaders or clinicians. We developed an algorithm to identify posts containing suicide-related content on a military-specific social media platform.
Methods
Publicly shared social media posts (n = 8,449) from a military-specific social media platform were reviewed and labeled by our team for the presence or absence of suicidal thoughts and behaviors, then used to train several machine learning models to identify such posts.
Results
The best performing model was a deep learning (RoBERTa) model that incorporated post text and metadata and detected the presence of suicidal posts with relatively high sensitivity (0.85), specificity (0.96), precision (0.64), F1 score (0.73), and an area under the precision-recall curve of 0.84. Compared to non-suicidal posts, suicidal posts were more likely to contain explicit mentions of suicide, descriptions of risk factors (e.g. depression, PTSD) and help-seeking, and first-person singular pronouns.
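The reported metrics can be reproduced from model outputs with scikit-learn; the sketch below uses toy labels and scores in place of the study's data.

```python
# Sketch of computing the reported evaluation metrics with scikit-learn.
# The arrays below are toy stand-ins for the study's labels and model scores.
from sklearn.metrics import (average_precision_score, confusion_matrix,
                             f1_score, precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                   # 1 = suicidal post
y_score = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]  # model probabilities
y_pred = [int(s >= 0.5) for s in y_score]           # threshold at 0.5

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("sensitivity:", recall_score(y_true, y_pred))  # tp / (tp + fn)
print("specificity:", tn / (tn + fp))
print("precision:  ", precision_score(y_true, y_pred))
print("F1 score:   ", f1_score(y_true, y_pred))
print("AUPRC:      ", average_precision_score(y_true, y_score))
```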
Conclusions
Our results demonstrate the feasibility and potential promise of using social media posts to identify at-risk Servicemembers and Veterans. Future work will use this approach to deliver targeted interventions to social media users at risk for suicide.
We compared study characteristics of randomized controlled trials (RCTs) funded by industry (N=697) with those not funded by industry (N=835). RCTs published in high-impact journals are more likely to be blinded, to include a placebo, and to post trial results on ClinicalTrials.gov. Our findings emphasize the importance of evaluating the quality of an RCT based on its methodological rigor, not its funder type.
Several disciplines, such as economics, law, and political science, emphasize the importance of legislative quality, namely well-written legislation. Low-quality legislation cannot be easily implemented because the texts create interpretation problems. To measure the quality of legal texts, we use information from the syntactic and lexical features of their language and apply these measures to a dataset of European Union legislation that contains detailed information on its transposition and decision-making process. We find that syntactic complexity and vagueness are negatively related to member states’ compliance with legislation. The finding on vagueness is robust to controlling for member states’ preferences, administrative resources, length of texts, and discretion. However, the results for syntactic complexity are less robust.
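One common way to operationalize syntactic complexity, shown below as a hedged sketch, is the mean depth of sentence dependency parses computed with spaCy; the paper's exact syntactic and lexical measures may be defined differently.

```python
# Illustrative measure of syntactic complexity: mean dependency-parse depth.
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def tree_depth(token):
    """Depth of the dependency subtree rooted at `token`."""
    children = list(token.children)
    if not children:
        return 1
    return 1 + max(tree_depth(child) for child in children)

def mean_parse_depth(text):
    doc = nlp(text)
    depths = [tree_depth(sent.root) for sent in doc.sents]
    return sum(depths) / len(depths)

legal_text = (
    "Member States shall ensure that, where the conditions set out in "
    "Article 4 are met, the competent authorities take the measures "
    "necessary to achieve compliance."
)
print(f"mean parse depth: {mean_parse_depth(legal_text):.1f}")
```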
Multilingual question answering (MQA) provides effective access to multilingual data, delivering accurate and precise answers irrespective of language. Although a wide range of datasets is available for monolingual QA systems in natural language processing, benchmark datasets specifically designed for MQA are considerably limited. The absence of comprehensive benchmark datasets hinders the development and evaluation of MQA systems. To overcome this issue, the proposed work develops the EHMQuAD dataset, an MQA dataset for the low-resource languages Hindi and Marathi alongside English. The EHMQuAD dataset is built using a synthetic corpora generation approach, with an alignment step performed after translation to make the dataset more accurate. Further, the EHMMQA model is proposed as an abstract framework that uses a deep neural network accepting question–context pairs and returning an accurate answer. Shared question and shared context representations are designed separately to build this system. Experiments with the proposed model are conducted on the MMQA, Translated SQuAD, XQuAD, MLQA, and EHMQuAD datasets, with exact match (EM) and F1-score used as performance measures. The proposed model (EHMMQA) is compared with state-of-the-art MQA baseline models in all possible monolingual and multilingual settings. The results signify that EHMMQA is a considerable step toward MQA for the Hindi and Marathi languages, establishing a new state of the art for these languages.
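For orientation, the sketch below runs multilingual extractive QA with a commonly used off-the-shelf checkpoint; it stands in for, and is not, the proposed EHMMQA model.

```python
# Sketch of multilingual extractive QA over a Hindi question-context pair,
# using an off-the-shelf multilingual checkpoint (not the paper's EHMMQA).
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/xlm-roberta-base-squad2")

context = "ताजमहल आगरा में स्थित है और इसे शाहजहाँ ने बनवाया था।"
question = "ताजमहल कहाँ स्थित है?"  # "Where is the Taj Mahal located?"

answer = qa(question=question, context=context)
print(answer["answer"], f"(score={answer['score']:.2f})")
```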
This chapter surveys the history and main directions of natural language processing research in general, and for Slavic languages in particular. The field has grown enormously since its beginning. Especially since 2010, the amount of digital texts has been rapidly growing; furthermore, research has yielded an ever-greater number of highly usable applications. This is reflected in the increasing number and attendance of NLP conferences and workshops. Slavic countries are no exception; several have been organising international conferences for decades, and their proceedings are the best place to find publications on Slavic NLP research. The general trend of the evolution of NLP is difficult to predict. It is certain that deep learning, including various new types (e.g. contextual, multilingual) of word embeddings and similar ‘deep’ models will play an increasing role, while predictions also mention the increasing importance of the Universal Dependencies framework and treebanks and research into the theory, not only the practice, of deep learning, coupled with attempts at achieving better explainability of the resulting models.
Housing is an environmental social determinant of health that is linked to mortality and clinical outcomes. We developed a lexicon of housing-related concepts and rule-based natural language processing methods for identifying these housing-related concepts within clinical text. We piloted our methods on several test cohorts: a synthetic cohort generated by ChatGPT for initial infrastructure testing, a cohort with substance use disorders (SUD), and a cohort diagnosed with problems related to housing and economic circumstances (HEC). Our methods successfully identified housing concepts in our ChatGPT notes (recall = 1.0, precision = 1.0), our SUD population (recall = 0.9798, precision = 0.9898), and our HEC population (recall = N/A, precision = 0.9160).
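A minimal sketch of rule-based lexicon matching of this kind follows; the lexicon terms and patterns are invented examples, not the authors' published lexicon.

```python
# Minimal sketch of rule-based concept extraction with a housing lexicon.
# Terms and patterns are invented examples, not the published lexicon.
import re

HOUSING_LEXICON = {
    "homelessness": [r"\bhomeless(ness)?\b", r"\bunhoused\b"],
    "unstable_housing": [r"\bcouch[- ]?surf(ing)?\b", r"\beviction\b"],
    "shelter": [r"\bshelter\b", r"\btransitional housing\b"],
}

def extract_housing_concepts(note_text):
    """Return the set of lexicon concepts matched in a clinical note."""
    found = set()
    for concept, patterns in HOUSING_LEXICON.items():
        for pattern in patterns:
            if re.search(pattern, note_text, flags=re.IGNORECASE):
                found.add(concept)
                break
    return found

note = "Patient reports eviction last month and is currently couch surfing."
print(extract_housing_concepts(note))  # {'unstable_housing'}
```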
Not all scientific publications are equally useful to policy-makers tasked with mitigating the spread and impact of diseases, especially at the start of novel epidemics and pandemics. The urgent need for actionable, evidence-based information is paramount, but the nature of preprint and peer-reviewed articles published during these times is often at odds with such goals. For example, a lack of novel results and a focus on opinions rather than evidence were common in coronavirus disease (COVID-19) publications at the start of the pandemic in 2019. In this work, we seek to automatically judge the utility of these scientific articles, from a public health policy-making perspective, using only their titles.
Methods:
Deep learning natural language processing (NLP) models were trained on scientific COVID-19 publication titles from the CORD-19 dataset and evaluated against expert-curated COVID-19 evidence to measure their real-world feasibility at screening these scientific publications in an automated manner.
Results:
This work demonstrates that it is possible to judge the utility of COVID-19 scientific articles, from a public health policy-making perspective, based on their titles alone, using deep NLP models.
Conclusions:
NLP models can be successfully trained on scientific articles and used by public health experts to triage and filter the hundreds of new publications appearing daily on novel diseases such as COVID-19 at the start of pandemics.
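As a simpler stand-in for the deep NLP models described, the sketch below trains a TF-IDF plus logistic regression baseline on invented titles to show the title-screening setup.

```python
# Simpler stand-in for the deep NLP models described: a TF-IDF + logistic
# regression baseline for judging article utility from titles alone.
# Titles and labels below are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

titles = [
    "Efficacy of remdesivir in hospitalized COVID-19 patients: an RCT",
    "Reflections on pandemic life: an opinion piece",
    "Seroprevalence of SARS-CoV-2 antibodies in healthcare workers",
    "Why we should all be optimistic about the coming year",
]
useful = [1, 0, 1, 0]  # 1 = useful for policy-making (expert-curated)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(titles, useful)

print(model.predict(["Remdesivir outcomes in a multicentre cohort study"]))
```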
In this chapter, a case is made for the inclusion of computational approaches to linguistics within the theoretical fold. Computational models aimed at application are a special case of predictive models. The status quo in the philosophy of linguistics is that explanation is scientifically prior to prediction. This is a mistake. Once corrected, the theoretical place of prediction is restored and, with it, computational models of language. The chapter first describes the history behind the emergence of explanation over prediction views in the general philosophy of science. It’s then suggested that this post-positivist intellectual milieu influenced the rejection of computational linguistics in the philosophy of theoretical linguistics. A case study of the predictive power already embedded in contemporary linguistic theory is presented through some work on negative polarity items. The discussion moves to the competence–performance divide informed by the so-called Galilean style in linguistics that retains the explanatory over prediction ideal. In the final sections of the chapter, continuous methods, such as probabilistic linguistics, are used to showcase the explanatory and predictive possibilities of nondiscrete approaches, before a discussion of the contemporary field of deep learning in natural language processing (NLP), where these predictive possibilities are further amplified.
Innovation, typically spurred by reusing, recombining and synthesizing existing concepts, is expected to result in an exponential growth of the concept space over time. However, our statistical analysis of TechNet, which is a comprehensive technology semantic network encompassing over 4 million concepts derived from patent texts, reveals a linear rather than exponential expansion of the overall technological concept space. Moreover, there is a notable decline in the originality of newly created concepts. These trends can be attributed to the constraints of human cognitive abilities to innovate beyond an ever-growing space of prior art, among other factors. Integrating creative artificial intelligence into the innovation process holds the potential to overcome these limitations and alter the observed trends in the future.
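The linear-versus-exponential comparison can be illustrated by fitting both growth forms to a cumulative concept count; the sketch below uses synthetic data, not TechNet's.

```python
# Sketch of comparing linear vs exponential growth of a concept count.
# Exponential growth is linear in log space, so we fit a line to raw counts
# and to log counts and compare R^2. Data are synthetic placeholders.
import numpy as np

years = np.arange(1980, 2020)
concepts = 1e5 + 9.5e4 * (years - 1980)  # synthetic, roughly linear growth

def r_squared(x, y):
    coeffs = np.polyfit(x, y, 1)
    pred = np.polyval(coeffs, x)
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

print("linear fit R^2:     ", r_squared(years, concepts))
print("exponential fit R^2:", r_squared(years, np.log(concepts)))
```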
Incarceration is a significant social determinant of health, contributing to high morbidity, mortality, and racialized health inequities. However, incarceration status is largely invisible to health services research due to inadequate clinical electronic health record (EHR) capture. This study aims to develop, train, and validate natural language processing (NLP) techniques to more effectively identify incarceration status in the EHR.
Methods:
The study population consisted of adult patients (≥18 years old) who presented to the emergency department between June 2013 and August 2021. The EHR database was filtered for notes containing specific incarceration-related terms, and a random selection of 1,000 notes was annotated for incarceration and further stratified into prior, recent, and current incarceration statuses. For NLP model development, 80% of the notes were used to train the Longformer- and RoBERTa-based algorithms. The remaining 20% of the notes underwent analysis with GPT-4.
Results:
There were 849 unique patients across 989 visits in the 1,000 annotated notes. Manual annotation revealed that 559 of the 1,000 notes (55.9%) contained evidence of incarceration history. ICD-10 codes (sensitivity: 4.8%, specificity: 99.1%, F1-score: 0.09) demonstrated inferior performance to RoBERTa NLP (sensitivity: 78.6%, specificity: 73.3%, F1-score: 0.79), Longformer NLP (sensitivity: 94.6%, specificity: 87.5%, F1-score: 0.93), and GPT-4 (sensitivity: 100%, specificity: 61.1%, F1-score: 0.86).
Conclusions:
Our advanced NLP models demonstrate a high degree of accuracy in identifying incarceration status from clinical notes. Further research is needed to explore their scaled implementation in population health initiatives and assess their potential to mitigate health disparities through tailored system interventions.
This paper demonstrates workflows for incorporating text data into actuarial classification and regression tasks. The main focus is on methods employing transformer-based models. A dataset of car accident descriptions with an average length of 400 words, available in English and German, and a dataset of short property insurance claims descriptions are used to demonstrate these techniques. The case studies tackle challenges related to a multilingual setting and long input sequences. They also show ways to interpret model output and to assess and improve model performance, by fine-tuning the models to the domain of application or to a specific prediction task. Finally, the paper provides practical approaches to handling classification tasks in situations with little or no labelled data. The results achieved by using the language-understanding skills of off-the-shelf natural language processing (NLP) models with only minimal pre-processing and fine-tuning clearly demonstrate the power of transfer learning for practical applications.
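A compact sketch of the fine-tuning workflow described, using a multilingual transformer and placeholder data, is shown below; the dataset, labels, and hyperparameters are assumptions.

```python
# Compact sketch of fine-tuning a multilingual transformer on insurance
# claims descriptions. Dataset, labels, and hyperparameters are placeholders.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "xlm-roberta-base"  # handles English and German in one model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint,
                                                           num_labels=2)

data = Dataset.from_dict({
    "text": ["Rear-end collision at low speed, bumper damage.",
             "Auffahrunfall bei geringer Geschwindigkeit, "
             "Stossstange beschaedigt."],
    "label": [0, 0],  # e.g. 0 = vehicle damage, 1 = bodily injury
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length",
                     max_length=128)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="claims-model", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data,
)
trainer.train()
```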
This article sheds light on the significant yet nuanced roles of shame and guilt in influencing moral behaviour, a phenomenon that became particularly prominent during the COVID-19 pandemic with the community’s heightened desire to be seen as moral. These emotions are central to human interactions, and the question of how they are conveyed linguistically is a vast and important one. Our study contributes to this area by analysing the discourses around shame and guilt in English and Japanese online forums, focusing on the terms shame, guilt, haji (‘shame’) and zaiakukan (‘guilt’). We utilise a mix of corpus-based methods and natural language processing tools, including word embeddings, to examine the contexts of these emotion terms and identify semantically similar expressions. Our findings indicate both overlaps and distinct differences in the semantic landscapes of shame and guilt within and across the two languages, highlighting nuanced ways in which these emotions are expressed and distinguished. This investigation provides insights into the complex dynamics between emotion words and the internal states they denote, suggesting avenues for further research in this linguistically rich area.
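The word-embedding step can be illustrated with pretrained vectors; the sketch below uses GloVe via gensim's downloader, whereas the study derives embeddings from its own forum corpora.

```python
# Sketch of finding expressions semantically close to an emotion term with
# pretrained GloVe vectors (the study uses embeddings from its own corpora).
import gensim.downloader

vectors = gensim.downloader.load("glove-wiki-gigaword-100")

for term in ["shame", "guilt"]:
    print(term, "->", [w for w, _ in vectors.most_similar(term, topn=5)])
```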
This chapter provides an overview of studies that call on the syntactic features of connectives as a means to disambiguate their function and meaning. These syntactic features cover the morphosyntactic nature of discourse connectives as well as their syntagmatic distribution. On the basis of existing lexicons of discourse connectives, we first give an overview of the morphosyntactic distribution of discourse connectives in several European and non-European languages. We then address a number of studies that focus on the (semi-automatic) identification and annotation of discourse connectives in context. This is of particular interest in the field of natural language processing, but also in the field of contrastive linguistics, where it has been shown that syntactic categories, including those underlying the description of discourse connective uses, are not always cross-linguistically valid. The final section is devoted to the relationship between the syntagmatic position of discourse connectives and their meaning, which has given rise to numerous studies at the grammar-discourse interface highlighting the fuzzy boundary between discourse connectives and discourse markers.