Event extraction aims to find who did what to whom, when, and where from unstructured data. Over the past decade, event extraction has advanced in three waves. The first wave relied on supervised machine learning models trained on large amounts of manually annotated data with hand-crafted features. The second wave introduced deep neural networks with distributional semantic embedding features but still required large annotated data sets. This chapter provides an overview of a third wave built on a share-and-transfer framework, which enhances the portability of event extraction by transferring knowledge from a high-resource setting to a low-resource setting, reducing the need for annotated data. The first step, share, constructs a common structured semantic representation space into which complex event structures can be encoded. In the second step, transfer, we train event extractors over these representations in high-resource settings and apply the learned extractors to target data in the low-resource setting. We conclude with a summary of the current status of this new framework and point to remaining challenges and future research directions to address them.
Traditional event detection systems typically extract structured information on events by matching predefined event templates through slot filling. Automatically linking related event templates extracted from different documents over a longer period of time is of paramount importance for analysts: it facilitates situational monitoring and helps manage information overload and other long-term data aggregation tasks. This chapter reports on exploring the usability of various machine learning techniques and of textual and metadata features for training classifiers that automatically link related event templates from online news. In particular, we focus on linking security-related events, including natural and man-made disasters, social and political unrest, military actions, and crimes. With the best models trained on a moderate-size corpus (ca. 22,000 event pairs) using solely textual features, one can achieve an F1 score of 93.6%. This figure is further improved to 96.7% by the inclusion of event metadata features, mainly thanks to the strong discriminatory power of automatically extracted geographical information related to events.
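The linking task above can be framed as binary classification over event-template pairs. The sketch below is a hedged stand-in: the chapter's actual features and learned models are far richer, and the fixed weights here merely imitate a trained classifier. The geo-match feature echoes the abstract's observation that extracted geographical metadata is strongly discriminative.

```python
# Hypothetical sketch of event-template linking as pair classification.
# Feature names, weights, and threshold are illustrative only.

def jaccard(a: set[str], b: set[str]) -> float:
    """Token-overlap similarity between two event descriptions."""
    return len(a & b) / len(a | b) if a | b else 0.0

def pair_features(ev1: dict, ev2: dict) -> list[float]:
    """A textual feature plus a metadata (location) feature."""
    text_sim = jaccard(set(ev1["text"].lower().split()),
                       set(ev2["text"].lower().split()))
    geo_match = 1.0 if ev1["location"] == ev2["location"] else 0.0
    return [text_sim, geo_match]

def linked(ev1: dict, ev2: dict,
           weights=(1.0, 0.8), threshold=0.9) -> bool:
    """Linear scorer standing in for a trained classifier."""
    score = sum(w * f for w, f in zip(weights, pair_features(ev1, ev2)))
    return score >= threshold

e1 = {"text": "Earthquake hits coastal town", "location": "Chile"}
e2 = {"text": "Coastal town damaged by earthquake", "location": "Chile"}
e3 = {"text": "Parliament passes budget bill", "location": "Norway"}
```

Here `linked(e1, e2)` holds (shared tokens plus matching location push the score over the threshold) while `linked(e1, e3)` does not.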
Stories are a pervasive phenomenon of human life. They also represent a cognitive tool for understanding and making sense of the world and its happenings. In this contribution we describe a narratology-based framework for modeling stories as a combination of different data structures and for automatically extracting them from news articles. We introduce a distinction among three data structures (timelines, causelines, and storylines) that capture different narratological dimensions: respectively, chronological ordering, causal connections, and plot structure. We developed the Circumstantial Event Ontology (CEO) for modeling (implicit) circumstantial relations as well as explicit causal relations, and we created two benchmark corpora: ECB+/CEO, for causelines, and the Event Storyline Corpus (ESC), for storylines. To test our framework and the difficulty of automatically extracting causelines and storylines, we developed a series of reasonable baseline systems.
A crucial aspect of understanding and reconstructing narratives is identifying the underlying causal chains, which explain why certain things happened and make a coherent story. To build such causal chains, we need to identify causal links between events in the story, which may be expressed explicitly as well as understood implicitly using commonsense knowledge.
This chapter reviews research efforts on the automated extraction of such event causality from natural language text. It starts with a brief review of existing causal models in psychology and psycholinguistics as a building block for understanding causation. These models are useful tools for guiding the annotation process when building corpora annotated with causal pairs. I then outline existing annotated resources, which are used to build and evaluate automated causality extraction systems. Furthermore, circumstantial events surrounding the causal complex are rarely expressed in language, as they are part of commonsense knowledge. Therefore, discovering causal commonsense is also important for filling the gaps in causal chains, and I discuss existing work in this line of research.
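A common starting point for the explicit side of this task is connective-based pattern matching, sketched below under assumed patterns. This is only a baseline illustration, not any surveyed system: the connective list is tiny and invented for the example, and, as the chapter stresses, implicit commonsense-mediated links are exactly what such surface patterns cannot recover.

```python
import re

# Hedged baseline for *explicit* causality extraction: match a small,
# illustrative set of causal connectives and split the sentence into a
# (cause, effect) pair. Implicit causal links are out of reach here.

CAUSAL_PATTERNS = [
    # (compiled regex, cause group index, effect group index)
    (re.compile(r"(.+) because (.+)", re.I), 2, 1),
    (re.compile(r"(.+), so (.+)", re.I), 1, 2),
    (re.compile(r"(.+) led to (.+)", re.I), 1, 2),
]

def extract_causal_pair(sentence: str):
    """Return (cause, effect) if an explicit connective matches, else None."""
    for pattern, c, e in CAUSAL_PATTERNS:
        m = pattern.match(sentence.strip().rstrip("."))
        if m:
            return m.group(c).strip(), m.group(e).strip()
    return None
```

Note that the connective alone determines which side is the cause: "because" marks the cause on its right, "led to" on its left, which is why each pattern records its own group indices.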
This chapter presents techniques for examining the distributional properties of narrative schemas in a subset of the New York Times (NYT) Corpus. In one technique, the narrative argument salience through entities annotated (NASTEA) task, we use the event participants indicated by narrative schemas to replicate salient entity annotations from the NYT Corpus. In another technique, we measure narrative schema stability by generating schemas with various permutations of input documents. Both of these techniques show differences between homogeneous and heterogeneous document categories. Homogeneous categories tend to perform better on the NASTEA task using fewer schemas and exhibit more stability, whereas heterogeneous categories require more schemas applied on average to peak in performance at the NASTEA task and exhibit less stability. This suggests that narrative schemas succeed at detecting and modeling the repetitive nature of template-written text, whereas more sophisticated models are required to understand and interpret the complex novelty found in heterogeneous categories.
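The stability measurement described above can be caricatured in a few lines. The sketch below is hypothetical throughout: real narrative schema induction is far more than pooling verbs, and the chapter's permutations concern document order during induction. Still, the shape of the measurement survives: induce a schema under every ordering of the inputs, then score how similar the induced schemas are to one another.

```python
from itertools import permutations

# Toy stability measurement (all inputs and the "induction" step are
# invented for illustration). A "schema" here is just the verb set
# pooled from the first k documents in a given ordering.

def induce_schema(docs: list, k: int = 2) -> frozenset:
    """Stand-in for schema induction over the first k documents."""
    pooled = set()
    for doc in docs[:k]:
        pooled |= doc
    return frozenset(pooled)

def stability(docs: list, k: int = 2) -> float:
    """Average pairwise Jaccard overlap of schemas across all orderings."""
    schemas = [induce_schema(list(p), k) for p in permutations(docs)]
    pairs = [(a, b) for i, a in enumerate(schemas) for b in schemas[i + 1:]]
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# A homogeneous category repeats the same event vocabulary across
# documents; a heterogeneous one does not.
homogeneous = [{"arrest", "charge"}, {"arrest", "charge"},
               {"arrest", "convict"}]
heterogeneous = [{"arrest", "charge"}, {"merge", "acquire"},
                 {"elect", "resign"}]
```

On these toy inputs the homogeneous category scores higher than the heterogeneous one, mirroring the chapter's finding that repetitive, template-like categories yield more stable schemas.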
Event structures are central in Linguistics and Artificial Intelligence research: people can easily refer to changes in the world, identify their participants, distinguish relevant information, and form expectations of what can happen next. Part of this process is based on mechanisms similar to narratives, which are at the heart of information sharing. But it remains difficult to automatically detect events or automatically construct stories from such event representations. This book explores how to handle today's massive news streams and provides multidimensional, multimodal, and distributed approaches, such as automated deep learning, to capture events and the narrative structures involved in a 'story'. This overview of the current state of the art on event extraction, temporal and causal relations, and storyline extraction aims to establish a new multidisciplinary research community with a common terminology and research agenda. Graduate students and researchers in natural language processing, computational linguistics, and media studies will benefit from this book.