The Dutch Draw: constructing a universal baseline for binary classification problems
Etienne van de Bijl, Jan Klein, Joris Pries, Sandjai Bhulai, Mark Hoogendoorn, Rob van der Mei
Journal:

Journal of Applied Probability / Volume 62 / Issue 2 / June 2025

Published online by Cambridge University Press:

19 September 2024, pp. 475-493

Print publication:

June 2025
- Article
Novel prediction methods should always be compared to a baseline to determine their performance. Without this frame of reference, the performance score of a model is basically meaningless. What does it mean when a model achieves an $F_1$ of 0.8 on a test set? A proper baseline is, therefore, required to evaluate the ‘goodness’ of a performance score. Comparing results with the latest state-of-the-art model is usually insightful. However, being state-of-the-art is dynamic, as newer models are continuously developed. Contrary to an advanced model, it is also possible to use a simple dummy classifier. However, the latter model could be beaten too easily, making the comparison less valuable. Furthermore, most existing baselines are stochastic and need to be computed repeatedly to get a reliable expected performance, which could be computationally expensive. We present a universal baseline method for all binary classification models, named the Dutch Draw (DD). This approach weighs simple classifiers and determines the best classifier to use as a baseline. Theoretically, we derive the DD baseline for many commonly used evaluation measures and show that in most situations it reduces to (almost) always predicting either zero or one. Summarizing, the DD baseline is general, as it is applicable to any binary classification problem; simple, as it can be quickly determined without training or parameter tuning; and informative, as insightful conclusions can be drawn from the results. The DD baseline serves two purposes. First, it is a robust and universal baseline that enables comparisons across research papers. Second, it provides a sanity check during the prediction model’s development process. When a model does not outperform the DD baseline, it is a major warning sign.
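
As a rough illustration of how such a baseline can be determined without training, the sketch below makes an assumption the abstract does not state: each simple classifier labels k of the M samples, chosen uniformly at random, as positive, and the baseline is the k with the highest expected score. The function names are hypothetical. For $F_1$, the identity 2TP + FP + FN = k + P (with P the number of actual positives) gives the expected value 2kP / (M(P + k)), which is largest at k = M, that is, always predicting one.

```python
def expected_f1(k: int, M: int, P: int) -> float:
    """Expected F1 when k of M samples, drawn uniformly at random,
    are labeled positive and P of the M samples are truly positive.

    Assumes P > 0. E[TP] = k * P / M and 2TP + FP + FN = k + P,
    so E[F1] = 2 * k * P / (M * (P + k)).
    """
    return 2 * k * P / (M * (P + k))


def best_random_baseline_f1(M: int, P: int) -> tuple[int, float]:
    """Return the k in 0..M that maximizes expected F1, with that value."""
    best_k = max(range(M + 1), key=lambda k: expected_f1(k, M, P))
    return best_k, expected_f1(best_k, M, P)


if __name__ == "__main__":
    k, score = best_random_baseline_f1(M=10, P=3)
    print(k, round(score, 4))
```

With M = 10 and P = 3, the sketch selects k = 10 with an expected $F_1$ of 6/13 (about 0.46), so a model scoring 0.8 on such a test set would clear this baseline by a wide margin.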

5 - Linear Least-Squares Regression and Binary Classification
Jeffrey A. Fessler, University of Michigan, Ann Arbor, Raj Rao Nadakuditi, University of Michigan, Ann Arbor
Book:

Linear Algebra for Data Science, Machine Learning, and Signal Processing

Published online:

01 November 2024

Print publication:

16 May 2024, pp 143-196
- Chapter
Summary

Many applications require solving a system of linear equations 𝑨𝒙 = 𝒚 for 𝒙 given 𝑨 and 𝒚. In practice, often there is no exact solution for 𝒙, so one seeks an approximate solution. This chapter focuses on least-squares formulations of this type of problem. It briefly reviews the 𝑨𝒙 = 𝒚 case and then motivates the more general 𝑨𝒙 ≈ 𝒚 cases. It then focuses on the over-determined case where 𝑨 is tall, emphasizing the insights offered by the SVD of 𝑨. It introduces the pseudoinverse, which is especially important for the under-determined case where 𝑨 is wide. It describes alternative approaches for the under-determined case such as Tikhonov regularization. It introduces frames, a generalization of unitary matrices. It uses the SVD analysis of this chapter to describe projection onto a subspace, completing the subspace-based classification ideas introduced in the previous chapter, and also introduces a least-squares approach to binary classifier design. It introduces recursive least-squares methods that are important for streaming data.
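
The least-squares approach to binary classifier design mentioned above can be sketched in its smallest form. This is an illustrative assumption, not the chapter's own code: a single feature plus an offset, labels coded as -1 and +1, and the closed-form solution of the normal equations for the over-determined (tall) case. The function names are hypothetical.

```python
def fit_line_least_squares(xs, ys):
    """Minimize sum((a*x + b - y)**2) over a and b in closed form.

    Assumes the xs are not all equal, so the problem has a unique solution.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def classify(x, a, b):
    """Predict +1 or -1 from the sign of the fitted linear function."""
    return 1 if a * x + b >= 0 else -1


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [-1, -1, 1, 1]  # class labels coded as -1 / +1
    a, b = fit_line_least_squares(xs, ys)
    print([classify(x, a, b) for x in xs])
```

For larger problems one would typically solve the same minimization with a matrix factorization such as the SVD rather than with scalar formulas.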

3 - Logistic Regression
Mihai Surdeanu, University of Arizona, Marco Antonio Valenzuela-Escárcega, University of Arizona
Book:

Deep Learning for Natural Language Processing

Published online:

01 February 2024

Print publication:

08 February 2024, pp 30-48
- Chapter
Summary

As mentioned in the previous chapter, the perceptron does not perform smooth updates during training, which may slow down learning, or cause it to miss good solutions entirely in real-world situations. In this chapter, we will discuss logistic regression, a machine learning algorithm that elegantly addresses this problem. We also extend the vanilla logistic regression, which was designed for binary classification, to handle multiclass classification. Through logistic regression, we introduce the concept of cost function (i.e., the function we aim to minimize during training), and gradient descent, the algorithm that implements this minimization procedure.
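
A minimal sketch of the idea, assuming a single input feature, the cross-entropy cost, and plain batch gradient descent; the function names, learning rate, and epoch count are illustrative choices rather than anything taken from the chapter.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit weight w and bias b by batch gradient descent.

    The gradient of the mean cross-entropy cost with respect to the score
    w*x + b is (prediction - label), so every update is smooth and scales
    with the size of the error.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


if __name__ == "__main__":
    w, b = train_logistic([-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1])
    print(sigmoid(w * 2.0 + b) > 0.5, sigmoid(w * -2.0 + b) < 0.5)
```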

2 - The Perceptron
Mihai Surdeanu, University of Arizona, Marco Antonio Valenzuela-Escárcega, University of Arizona
Book:

Deep Learning for Natural Language Processing

Published online:

01 February 2024

Print publication:

08 February 2024, pp 8-29
- Chapter
Summary

This chapter covers the perceptron, the simplest neural network architecture. In general, neural networks are machine learning architectures loosely inspired by the structure of biological brains. The perceptron is the simplest example of such architectures: it contains a single artificial neuron. The perceptron will form the building block for the more complicated architectures discussed later in the book. However, rather than starting directly with the discussion of this algorithm, we will start with something simpler: a children’s book and some fundamental observations about machine learning. From these, we will formalize our first machine learning algorithm, the perceptron.
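
For orientation, the classic perceptron learning rule can be sketched as follows. This is a generic textbook formulation with hypothetical names, not code from the chapter: a single neuron predicts 1 when its weighted sum is positive, and the weights change only when a prediction is wrong.

```python
def predict(w, b, x):
    """Single artificial neuron: output 1 if the weighted sum is positive."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0


def train_perceptron(samples, epochs=20):
    """Learn weights from (features, label) pairs with labels 0 or 1.

    On a mistake, the input is added to the weights (label 1) or
    subtracted from them (label 0); correct predictions change nothing.
    """
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            if predict(w, b, x) != y:
                sign = 1.0 if y == 1 else -1.0
                w = [wi + sign * xi for wi, xi in zip(w, x)]
                b += sign
    return w, b


if __name__ == "__main__":
    # Logical AND is linearly separable, so the rule converges on it.
    and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    w, b = train_perceptron(and_gate)
    print([predict(w, b, x) for x, _ in and_gate])
```

Each update is an all-or-nothing step whose size does not depend on how wrong the prediction was.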

The Danube and settlement prehistory – 80 years on
John Chapman
Journal:

European Journal of Archaeology / Volume 12 / Issue 1-3 / 2009

Published online by Cambridge University Press:

25 January 2017, pp. 145-156
- Article
Although commentators have discussed myriad themes presented in the rich and extensive oeuvre of Childe, one of the topics that has been, in my view, seriously neglected is the topic of settlement types. In this article, I seek to make good this omission, starting from a consideration of The Danube in Prehistory. The basis of Childe's ideas on settlement types in the Neolithic and Copper Age of eastern Europe was a binary classification into ‘tells’ and ‘flat sites’ that, in turn, reflected a division between permanent and shifting cultivation and greater and lesser cultural complexity. However, the introduction into this debate of questions of trade, surplus production, and Neolithic ‘self-sufficiency’, as well as metallurgy and ritual, meant that the initial binary classification left a series of contradictions that Childe struggled to transcend in the last decade of his life.

Variable selection through CART
Marie Sauve, Christine Tuleau-Malot
Journal:

ESAIM: Probability and Statistics / Volume 18 / 2014

Published online by Cambridge University Press:

22 October 2014, pp. 770-798

Print publication:

2014
- Article
This paper deals with variable selection in regression and binary classification frameworks. It proposes an automatic and exhaustive procedure which relies on the use of the CART algorithm and on model selection via penalization. This work, of theoretical nature, aims at determining adequate penalties, i.e. penalties which allow achievement of oracle type inequalities justifying the performance of the proposed procedure. Since the exhaustive procedure cannot be realized when the number of variables is too large, a more practical procedure is also proposed and still theoretically validated. A simulation study completes the theoretical results.
