Search

3 results

16 - Evaluating Models
from Part IV - Models in Question
Macartan Humphreys, Wissenschaftszentrum Berlin für Sozialforschung, Alan M. Jacobs, University of British Columbia, Vancouver
Book: Integrated Inferences
Published online: 13 October 2023
Print publication: 30 November 2023, pp 372-395
- Chapter
Summary

We describe strategies for figuring out whether a model is likely doing more harm than good and for comparing the performance of different models to one another.

8 - Learning and Generalization
William W. Hsieh, University of British Columbia, Vancouver
Book: Introduction to Environmental Data Science
Published online: 23 March 2023
Print publication: 23 March 2023, pp 245-282
- Chapter
Summary

A good model aims to learn the underlying signal without overfitting (i.e. fitting to the noise in the data). This chapter has four main parts: The first part covers objective functions and errors. The second part covers various regularization techniques (weight penalty/decay, early stopping, ensemble, dropout, etc.) to prevent overfitting. The third part covers the Bayesian approach to model selection and model averaging. The fourth part covers the recent development of interpretable machine learning.

MODEL SELECTION AND AVERAGING OF HEALTH COSTS IN EPISODE TREATMENT GROUPS
Shujuan Huang, Brian Hartman, Vytaras Brazauskas
Journal: ASTIN Bulletin: The Journal of the IAA / Volume 47 / Issue 1 / January 2017
Published online by Cambridge University Press: 21 December 2016, pp. 153-167
Print publication: January 2017
- Article
Episode Treatment Groups (ETGs) classify related services into medically relevant and distinct units describing an episode of care. Proper model selection for those ETG-based costs is essential to adequately price and manage health insurance risks. The optimal claim cost model (or model probabilities) can vary depending on the disease. We compare four potential models (lognormal, gamma, log-skew-t and Lomax) using four different model selection methods (AIC and BIC weights, Random Forest feature classification and Bayesian model averaging) on 320 ETGs. Using the data from a major health insurer, which consists of more than 33 million observations from 9 million claimants, we compare the various methods on both speed and precision, and also examine the wide range of selected models for the different ETGs. Several case studies are provided for illustration. It is found that Random Forest feature selection is computationally efficient and sufficiently accurate, hence being preferred in this large data set. When feasible (on smaller data sets), Bayesian model averaging is preferred because of the posterior model probabilities.
