Chapter 26 - Embedding and machine learning
from Part III - Fundamentals
James Bagrow, University of Vermont, Yong‐Yeol Ahn, Indiana University, Bloomington
Book:

Working with Network Data

Published online:

06 June 2024

Print publication:

13 June 2024, pp 429-446
- Chapter
Summary

Machine learning, especially neural network methods, is increasingly important in network analysis. This chapter will discuss the theoretical aspects of network embedding methods and graph neural networks. As we have seen, much of the success of advanced machine learning is thanks to useful representations (embeddings) of data. Embedding and machine learning are closely aligned. Translating network elements to embedding vectors and sending those vectors as features to a predictive model often leads to a simpler, more performant model than trying to work directly with the network. Embeddings help with network learning tasks, from node classification to link prediction. We can even embed entire networks and then use models to summarize and compare networks. Machine learning benefits from embeddings, and embeddings in turn benefit from machine learning. Inspired by the incredible recent progress with natural language data, embeddings created by predictive models are becoming more useful and important. Often these embeddings are produced by neural networks of various flavors, and we explore current approaches for using neural networks on network data.
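
The pipeline described in this summary (node embeddings passed as features to a predictive model) can be sketched in a few lines. This is only an illustration and is not taken from the chapter: the 2-dimensional vectors, the node names, the labels, and the nearest-centroid classifier are all assumptions chosen to keep the example small.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fit_centroids(embeddings, labels):
    """Compute one centroid per class from the labeled nodes' embeddings."""
    by_label = {}
    for node, label in labels.items():
        by_label.setdefault(label, []).append(embeddings[node])
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(vector, centroids):
    """Assign the label whose centroid is closest to the given embedding."""
    return min(centroids, key=lambda label: squared_distance(vector, centroids[label]))


# Hypothetical node embeddings, invented for illustration. In practice they
# would come from an embedding method applied to the network.
embeddings = {
    "a": [1.0, 0.1],
    "b": [0.9, 0.2],
    "c": [0.1, 1.0],
    "d": [0.2, 0.9],
    "e": [0.8, 0.0],  # unlabeled node to classify
}
labels = {"a": "x", "b": "x", "c": "y", "d": "y"}  # hypothetical classes

centroids = fit_centroids(embeddings, labels)
predicted = predict(embeddings["e"], centroids)
print(predicted)
```

The classifier never looks at the network itself, only at the vectors, which is the sense in which the embedding stands in for the graph structure.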

8 - Distributional Hypothesis and Representation Learning
Mihai Surdeanu, University of Arizona, Marco Antonio Valenzuela-Escárcega, University of Arizona
Book:

Deep Learning for Natural Language Processing

Published online:

01 February 2024

Print publication:

08 February 2024, pp 117-131
- Chapter
Summary

All the algorithms we covered so far rely on handcrafted features that must be designed and implemented by the machine learning developer. This is problematic for two reasons. First, designing such features can be a complicated endeavor. Second, most words in any language tend to be very infrequent. In our context, this means that most words are very sparse, and our text classification algorithm trained on word-occurrence features may generalize poorly. For example, if the training data for a review classification dataset contains the word great but not the word fantastic, a learning algorithm trained on these data will not be able to properly handle reviews containing the latter word, even though there is a clear semantic similarity between the two. In this chapter, we will begin to address this limitation. In particular, we will discuss methods that learn numerical representations of words that capture some semantic knowledge. Under these representations, similar words such as great and fantastic will have similar forms, which will improve the generalization capability of our machine learning algorithms.
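
One common way to make "similar words have similar forms" concrete is cosine similarity between word vectors. The sketch below is an illustration only: the 3-dimensional vectors are invented for this example and are not learned representations, and the word "terrible" is added as an assumed contrast case.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical word vectors, invented for illustration. Real embeddings are
# learned from text and usually have tens to hundreds of dimensions.
toy_embeddings = {
    "great": [0.9, 0.8, 0.1],
    "fantastic": [0.85, 0.9, 0.05],
    "terrible": [-0.8, -0.7, 0.2],
}

sim_close = cosine_similarity(toy_embeddings["great"], toy_embeddings["fantastic"])
sim_far = cosine_similarity(toy_embeddings["great"], toy_embeddings["terrible"])

print(f"great vs fantastic: {sim_close:.3f}")
print(f"great vs terrible:  {sim_far:.3f}")
```

A classifier that consumes such vectors instead of word-occurrence features can treat an unseen word like fantastic much as it treats great, because their vectors lie close together.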

4 - Features, Reduced: Feature Selection, Dimensionality Reduction and Embeddings
from Part One - Fundamentals
Pablo Duboue
Book:

The Art of Feature Engineering

Published online:

29 May 2020

Print publication:

25 June 2020, pp 79-111
- Chapter
Summary

This chapter presents a staple of Feature Engineering: the automatic reduction of features, either by direct selection or by projection to a smaller feature space. Central to Feature Engineering are efforts to reduce the number of features, as uninformative features bloat the ML model with unnecessary parameters. In turn, too many parameters either produce suboptimal results, as they are easy to overfit, or require large amounts of training data. These efforts either explicitly drop certain features (feature selection) or map the feature vector, if it is sparse, into a denser, lower-dimensional space (dimensionality reduction). The chapter also covers some algorithms that perform feature selection as part of their inner computation (embedded feature selection or regularization). Feature selection takes the spotlight within Feature Engineering due to its intrinsic utility for Error Analysis. Some techniques such as feature ablation using wrapper methods are used as the starting step before a feature drill down. Moreover, as feature selection helps build understandable models, it intertwines with Error Analysis as the analysis profits from such understandable models.
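
The two routes named in this summary, dropping features versus projecting to a smaller space, can be contrasted in a short sketch. The choices here are assumptions made for illustration and are not drawn from the chapter: selection uses a simple variance threshold, and the projection matrix is hand-picked rather than learned (a method such as PCA would estimate it from the data).

```python
def column_variances(rows):
    """Population variance of each column in a list of equal-length rows."""
    n = len(rows)
    variances = []
    for col in zip(*rows):
        mean = sum(col) / n
        variances.append(sum((x - mean) ** 2 for x in col) / n)
    return variances


def select_features(rows, threshold):
    """Feature selection: keep only columns whose variance exceeds threshold."""
    variances = column_variances(rows)
    kept = [j for j, var in enumerate(variances) if var > threshold]
    return kept, [[row[j] for j in kept] for row in rows]


def project(rows, matrix):
    """Dimensionality reduction: multiply each row by a (n_in x n_out) matrix."""
    n_out = len(matrix[0])
    return [
        [sum(row[i] * matrix[i][k] for i in range(len(row))) for k in range(n_out)]
        for row in rows
    ]


# Toy data invented for illustration: the middle column is constant,
# so it carries no information.
rows = [
    [1.0, 0.0, 5.0],
    [2.0, 0.0, 3.0],
    [3.0, 0.0, 1.0],
]

kept, selected = select_features(rows, threshold=0.1)

# Hypothetical projection from 3 features down to 1 mixed feature.
projection_matrix = [[0.5], [0.0], [0.5]]
projected = project(rows, projection_matrix)

print(kept)       # indices of the surviving original features
print(selected)   # rows restricted to those features
print(projected)  # rows expressed in the 1-dimensional projected space
```

Selection keeps original, interpretable columns, while projection produces new features that mix the originals, which is one reason selection is the more natural fit for error analysis.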

ACTUARIAL APPLICATIONS OF WORD EMBEDDING MODELS
Gee Y Lee, Scott Manski, Tapabrata Maiti
Journal:

ASTIN Bulletin: The Journal of the IAA / Volume 50 / Issue 1 / January 2020

Published online by Cambridge University Press:

22 October 2019, pp. 1-24

Print publication:

January 2020
- Article
In insurance analytics, textual descriptions of claims are often discarded, because traditional empirical analyses require numeric descriptor variables. This paper demonstrates how textual data can be easily used in insurance analytics. Using the concept of word similarities, we illustrate how to extract variables from text and incorporate them into claims analyses using a standard generalized linear model or generalized additive regression model. This procedure is applied to the Wisconsin Local Government Property Insurance Fund (LGPIF) data, in order to demonstrate how insurance claims management and risk mitigation procedures can be improved. We illustrate two applications. First, we show how the claims classification problem can be solved using textual information. Second, we analyze the relationship between risk metrics and the probability of large losses. We obtain good results for both applications, where short textual descriptions of insurance claims are used for the extraction of features.
