Chapter 7 is dedicated to regularized regression methods, which – by penalizing models that are too complex – are capable of providing a reasonable tradeoff between bias and variance. Ridge regression implements L2 regularization, which results in more generalizable models but does not perform any feature selection. The L1 penalty used by the lasso, however, allows for simultaneous regularization and feature selection. The elastic net algorithm combines the two approaches by applying both L1 and L2 penalties, which allows for solutions combining the advantages of both ridge regression and the lasso. The chapter concludes by discussing a general class of Lq-regularized least squares optimization problems.
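For reference, the standard penalized least-squares objectives behind these methods (written here in common textbook notation, not the chapter's own) are:

```latex
\begin{align*}
\text{ridge (L2):}  \quad & \min_{\beta}\ \lVert y - X\beta\rVert_2^2 + \lambda \lVert\beta\rVert_2^2 \\
\text{lasso (L1):}  \quad & \min_{\beta}\ \lVert y - X\beta\rVert_2^2 + \lambda \lVert\beta\rVert_1 \\
\text{elastic net:} \quad & \min_{\beta}\ \lVert y - X\beta\rVert_2^2 + \lambda_1 \lVert\beta\rVert_1 + \lambda_2 \lVert\beta\rVert_2^2 \\
\text{general } L_q:\quad & \min_{\beta}\ \lVert y - X\beta\rVert_2^2 + \lambda \lVert\beta\rVert_q^q
\end{align*}
```

Larger penalty weights shrink the coefficients more aggressively (reducing variance at the cost of bias); the L1 term is what drives some coefficients exactly to zero and thereby performs feature selection.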
Chapter 13 discusses neural networks and deep learning; included is a presentation of deep convolutional networks, which seem to have great potential in the classification of medical images.
Multivariate biomarker discovery is increasingly important in the realm of biomedical research, and is poised to become a crucial facet of personalized medicine. This will prompt the demand for a myriad of novel biomarkers representing distinct 'omic' biosignatures, allowing treatments to be selected and tailored to the individual characteristics of a particular patient. This concise and self-contained book covers all aspects of predictive modeling for biomarker discovery based on high-dimensional data, as well as modern data science methods for the identification of parsimonious and robust multivariate biomarkers for medical diagnosis, prognosis, and personalized medicine. It provides a detailed description of state-of-the-art methods for parallel multivariate feature selection and supervised learning algorithms for regression and classification, as well as methods for proper validation of multivariate biomarkers and of the predictive models implementing them. This is an invaluable resource for scientists and students interested in bioinformatics, data science, and related areas.
The previous chapter introduced feed-forward neural networks and demonstrated that, theoretically, implementing the training procedure for an arbitrary feed-forward neural network is relatively simple. Unfortunately, neural networks trained this way will suffer from several problems, such as instability of the training process – that is, slow convergence due to parameters jumping around a good minimum – and overfitting. In this chapter, we will describe several practical solutions that mitigate these problems. In particular, we discuss minibatching, multiple optimization algorithms, other activation and cost functions, regularization, dropout, temporal averaging, and parameter initialization and normalization.
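As one concrete illustration of these techniques, the following minimal numpy sketch (not taken from the chapter; all names and sizes are illustrative) shows inverted dropout applied to a minibatch of hidden activations:

```python
import numpy as np

def inverted_dropout(activations, keep_prob=0.8, training=True, rng=np.random.default_rng(0)):
    """Inverted dropout: during training, keep each unit with probability
    `keep_prob` and rescale survivors by 1/keep_prob, so the expected
    activation is unchanged and no rescaling is needed at test time."""
    if not training:
        return activations
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

# Minibatch of 32 examples with 128 hidden units each; roughly 20% of units are dropped.
hidden = np.random.randn(32, 128)
hidden_dropped = inverted_dropout(hidden, keep_prob=0.8)
```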
Building on Chapter 6, this chapter expands the discussion of neural networks to include networks that have more than one hidden layer. Common structures such as the convolutional neural network (CNN) and the long short-term memory (LSTM) network are explained and used, along with Matlab's Deep Network Designer App as well as Matlab scripts, to implement and train such networks. Issues such as the vanishing or exploding gradient, normalization, and training strategies are discussed, and concepts that address overfitting and the vanishing or exploding gradient, including dropout and regularization, are introduced. Transfer learning is discussed and showcased using Matlab's DND App.
In this chapter we formulate the general regression problem relevant to function estimation. We begin with simple frequentist methods and quickly move to regression within the Bayesian paradigm. We then present two complementary mathematical formulations: one that relies on Gaussian process priors, appropriate for the regression of continuous quantities, and one that relies on Beta–Bernoulli process priors, appropriate for the regression of discrete quantities. In the context of the Gaussian process, we discuss more advanced topics including various admissible kernel functions, inducing point methods, sampling methods for nonconjugate Gaussian process prior-likelihood pairs, and elliptical slice samplers. For Beta–Bernoulli processes, we address questions of posterior convergence in addition to applications. Taken together, both Gaussian processes and Beta–Bernoulli processes constitute our first foray into Bayesian nonparametrics. With end-of-chapter projects, we explore more advanced modeling questions relevant to optics and microscopy.
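As a minimal illustration of Gaussian process regression with a squared-exponential kernel, here is a numpy sketch (not the chapter's code; the toy data, length scale, and noise level are assumptions made for the example):

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance between two sets of 1-D inputs."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

# Toy data: noisy observations of a smooth function.
rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(20)
x_test = np.linspace(0.0, 5.0, 100)

noise_var = 0.1**2
K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
K_s = rbf_kernel(x_test, x_train)
K_ss = rbf_kernel(x_test, x_test)

# Posterior mean and covariance of the latent function at the test inputs.
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
post_mean = K_s @ alpha
v = np.linalg.solve(L, K_s.T)
post_cov = K_ss - v.T @ v
```

The choice of kernel encodes the prior assumptions about smoothness; the inducing point and sampling methods mentioned above become relevant when the number of training points makes the Cholesky factorization above too expensive, or when the likelihood is not Gaussian.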
Edited by
Alik Ismail-Zadeh, Karlsruhe Institute of Technology, Germany; Fabio Castelli, Università degli Studi, Florence; Dylan Jones, University of Toronto; Sabrina Sanchez, Max Planck Institute for Solar System Research, Germany
Abstract: Data assimilation has always been a particularly active area of research in glaciology. While many properties at the surface of glaciers and ice sheets can be directly measured from remote sensing or in situ observations (surface velocity, surface elevation, thinning rates, etc.), many important characteristics, such as englacial and basal properties, as well as past climate conditions, remain difficult or impossible to observe. Data assimilation has been used for decades in glaciology in order to infer unknown properties and boundary conditions that have an important impact on numerical models and their projections. The basic idea is to use observed properties, in conjunction with ice flow models, to infer these poorly known ice properties or boundary conditions. There is, however, a great deal of variability among approaches. Constraining data can be a snapshot in time or can represent evolution over time. The complexity of the flow model can vary, from simple descriptions of lubrication flow or mass continuity to complex, continent-wide Stokes flow models encompassing multiple flow regimes. Methods can be deterministic, where only a best fit is sought, or probabilistic in nature. We present in this chapter some of the most common applications of data assimilation in glaciology, and some of the new directions that are currently being developed.
A good model aims to learn the underlying signal without overfitting (i.e. fitting to the noise in the data). This chapter has four main parts: The first part covers objective functions and errors. The second part covers various regularization techniques (weight penalty/decay, early stopping, ensemble, dropout, etc.) to prevent overfitting. The third part covers the Bayesian approach to model selection and model averaging. The fourth part covers the recent development of interpretable machine learning.
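To make one of these regularization techniques concrete, here is a minimal early-stopping loop in Python (a generic sketch, not the chapter's code; `train_step`, `val_loss`, and the toy usage are illustrative placeholders):

```python
import numpy as np

def train_with_early_stopping(params, train_step, val_loss, max_epochs=200, patience=10):
    """Stop training when the validation loss has not improved for `patience`
    consecutive epochs, and return the best parameters seen so far."""
    best_loss, best_params, stale = np.inf, params.copy(), 0
    for epoch in range(max_epochs):
        params = train_step(params)
        loss = val_loss(params)
        if loss < best_loss:
            best_loss, best_params, stale = loss, params.copy(), 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_params

# Toy usage: gradient steps on (w - 2)^2 with a noisy "validation" loss
# standing in for a held-out set.
rng = np.random.default_rng(0)
w_best = train_with_early_stopping(
    np.array([0.0]),
    train_step=lambda w: w - 0.1 * 2 * (w - 2.0),
    val_loss=lambda w: float((w - 2.0) ** 2 + 0.01 * rng.standard_normal()),
)
```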
Many applications in geosciences require solving inverse problems to estimate the state of a physical system. Data assimilation provides a strong framework to do so when the system is partially observed and its underlying dynamics are known to some extent. In the variational flavor, it can be seen as an optimal control problem where the initial conditions are the control parameters. Such problems are often ill-posed, and regularization using explicit prior knowledge may be needed to enforce a satisfactory solution. In this work, we propose to use a deep prior, a neural architecture that generates potential solutions and acts as implicit regularization. The architecture is trained in a fully unsupervised manner using the variational data assimilation cost, so that gradients are backpropagated through the dynamical model and then through the neural network. To demonstrate its use, we set up a twin experiment using a shallow-water toy model, in which we test various variational assimilation algorithms on an ocean-like circulation estimation.
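The mechanism can be sketched as follows (an illustrative PyTorch sketch under simplifying assumptions, not the authors' implementation: the toy dynamics, network shape, and observation schedule are all invented for the example). A small network maps a fixed latent tensor to a candidate initial state; the variational cost compares the resulting model trajectory with observations, and gradients flow through the dynamics back into the network weights.

```python
import torch

def dynamics_step(x):
    """Toy differentiable dynamics standing in for the shallow-water model."""
    return x + 0.01 * torch.roll(x, 1) - 0.01 * x**3

state_dim, n_steps = 64, 50
torch.manual_seed(0)

# Synthetic truth and noisy partial observations every 10 steps (twin experiment).
x_true = torch.randn(state_dim)
obs, obs_times = [], []
x = x_true.clone()
for t in range(n_steps):
    x = dynamics_step(x)
    if t % 10 == 0:
        obs.append(x + 0.05 * torch.randn(state_dim))
        obs_times.append(t)

# Deep prior: the initial condition is the output of a small network on a fixed input.
z = torch.randn(1, 16)
net = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.Tanh(), torch.nn.Linear(64, state_dim))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for it in range(200):
    opt.zero_grad()
    x0 = net(z).squeeze(0)                       # candidate initial condition
    x, cost = x0, 0.0
    for t in range(n_steps):
        x = dynamics_step(x)
        if t in obs_times:
            cost = cost + ((x - obs[obs_times.index(t)]) ** 2).sum()
    cost.backward()                              # gradients through the dynamics and the network
    opt.step()
```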
Researchers of time series cross-sectional data regularly face the change-point problem, which requires them to discern between significant parametric shifts that can be deemed structural changes and minor parametric shifts that must be considered noise. In this paper, we develop a general Bayesian method for change-point detection in high-dimensional data and present its application in the context of the fixed-effect model. Our proposed method, the hidden Markov Bayesian bridge model, jointly estimates high-dimensional regime-specific parameters and hidden regime transitions in a unified way. We apply our method to Alvarez, Garrett, and Lange's (1991, American Political Science Review 85, 539–556) study of the relationship between government partisanship and economic growth and to Allee and Scalera's (2012, International Organization 66, 243–276) study of membership effects in international organizations. In both applications, we find that the proposed method successfully identifies substantively meaningful temporal heterogeneity in the parameters of the regression models.
The jus temporis argued for in this chapter aims to explicate the value of human time, which is to be found in the finite, irreversible, and unstoppable character of human time. To make the value of human time explicit, "rootedness" and "integration" are conceptually distinguished: the latter signifies qualified time, the former the mere lapse of human time. Rootedness simply signifies the entanglement of presence on a territory with the lapse of finite and irreversible human time. This conception of rootedness is at the heart of jus temporis, and its implications are not limited to questions of citizenship acquisition. It is argued that the value of rootedness equally applies to waiting time in procedures, endless forms of temporariness, and unlawful residence. Concretely, it is argued that this jus temporis implies two elements. The first is a certain openness to the future, the possibility that a certain situation will not last forever. The second is that there should be end-terms at work in law: procedures may not last forever, temporariness may not continue eternally, and there should be a moment when long-term unlawful residence can become lawful.
In this paper, we are concerned with a backward problem for a nonlinear time-fractional wave equation in a bounded domain. By applying the properties of Mittag-Leffler functions and the method of eigenvalue expansion, we establish some results on the existence and uniqueness of mild solutions of the proposed problem based on a compactness technique. Due to the ill-posedness of the backward problem in the sense of Hadamard, a general filter regularization method is utilized to approximate the solution, and we further prove a convergence rate for the regularized solutions.
In shape-from-focus (SFF) methods, a single focus measure is used to compute the focus volume. However, a single focus measure operator does not appear capable of computing accurate focus values for images of diverse object shapes. Furthermore, most SFF methods try to improve the depth map without considering any additional structural or prior information. Consequently, the extracted shape of the object might lack important details. In this work, we address these problems and suggest a method in which depth hypotheses are combined into a more accurate 3D shape through 3D weighted least squares. First, depth hypotheses are obtained by applying a number of focus operators. Then, a structural prior, or guidance volume, is extracted from the focus measure volumes. Finally, a 3D weighted least squares optimization technique is applied to the depth hypothesis volume, where the weights are computed from the guidance volume. Thus, by inducing the structural prior, an improved resultant depth map is obtained. The proposed method was tested using various image sequences of synthetic and microscopic real objects. Experimental results and comparative analysis demonstrate the effectiveness of the proposed method.
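To illustrate the weighted least squares idea in a much simplified setting (this is not the paper's 3D formulation; the array names, sizes, and random data are invented for the sketch), the per-pixel fusion of several depth hypotheses under confidence weights reduces to a weighted mean:

```python
import numpy as np

# Simplified illustration: fuse several per-pixel depth hypotheses by weighted
# least squares, where each hypothesis gets a weight derived from its
# focus-measure response (used here as a confidence score).
rng = np.random.default_rng(0)
h, w, n_ops = 120, 160, 3                      # image size, number of focus operators

depth_hypotheses = rng.random((n_ops, h, w))   # depth maps from different focus measures
focus_response = rng.random((n_ops, h, w))     # guidance: higher response = more reliable

# With diagonal weights, argmin_d sum_k w_k (d - d_k)^2 = sum_k w_k d_k / sum_k w_k.
weights = focus_response / focus_response.sum(axis=0, keepdims=True)
fused_depth = (weights * depth_hypotheses).sum(axis=0)
```

The full 3D formulation described in the abstract operates on the whole depth hypothesis volume with weights taken from the guidance volume, rather than fusing each pixel independently as in this sketch.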
Wavelet theory is known to be a powerful tool for compressing and processing time series or images. It consists of projecting a signal onto an orthonormal basis of functions chosen to provide a sparse representation of the data. The first part of this article focuses on smoothing mortality curves by wavelet shrinkage. A chi-square test and a penalized likelihood approach are applied to determine the optimal degree of smoothing. The second part of this article is devoted to mortality forecasting. Wavelet coefficients exhibit clear trends for the Belgian population from 1965 to 2015; they are easy to forecast, resulting in predicted future mortality rates. The wavelet-based approach is then compared with some popular actuarial models of Lee–Carter type fitted to Belgian, UK, and US populations. The wavelet model outperforms all of them.
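A minimal sketch of wavelet shrinkage on a noisy curve (illustrative only, using the PyWavelets package; the wavelet, threshold rule, and toy data are assumptions, not the paper's choices):

```python
import numpy as np
import pywt

# Soft-threshold the wavelet coefficients of a noisy 1-D curve and reconstruct
# a smoothed version of it.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 256)
noisy_curve = np.exp(-3 * x) + 0.02 * rng.standard_normal(x.size)   # stand-in for a mortality curve

coeffs = pywt.wavedec(noisy_curve, 'db4', level=4)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745               # noise estimate from the finest detail level
threshold = sigma * np.sqrt(2 * np.log(noisy_curve.size))    # universal threshold
coeffs[1:] = [pywt.threshold(c, threshold, mode='soft') for c in coeffs[1:]]
smoothed_curve = pywt.waverec(coeffs, 'db4')
```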
In this chapter, we shall consider the design of neural nets, which are collections of perceptrons, or nodes, where the outputs of one rank (or layer) of nodes become the inputs to nodes at the next layer. The last layer of nodes produces the outputs of the entire neural net. The training of neural nets with many layers requires enormous numbers of training examples, but has proven to be an extremely powerful technique, referred to as deep learning, when it can be used. We also consider several specialized forms of neural nets that have proved useful for special kinds of data. These forms are characterized by requiring that certain sets of nodes in the network share the same weights. Since learning all the weights on all the inputs to all the nodes of the network is in general a hard and time-consuming task, these special forms of network greatly simplify the process of training the network to recognize the desired class or classes of inputs. We shall study convolutional neural networks (CNNs), which are specially designed to recognize classes of images. We shall also study recurrent neural networks (RNNs) and long short-term memory networks (LSTMs), which are designed to recognize classes of sequences, such as sentences (sequences of words).
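The layered structure can be made concrete with a tiny numpy sketch (illustrative only; the sizes, random weights, and ReLU activation are arbitrary choices, not from the chapter):

```python
import numpy as np

# The outputs of one layer of nodes become the inputs to the next layer;
# the last layer produces the outputs of the entire net.
rng = np.random.default_rng(0)
x = rng.standard_normal(10)                              # input vector

W1, b1 = rng.standard_normal((16, 10)), np.zeros(16)     # first layer: 10 -> 16
W2, b2 = rng.standard_normal((3, 16)), np.zeros(3)       # output layer: 16 -> 3

hidden = np.maximum(0.0, W1 @ x + b1)                    # ReLU nodes in the hidden layer
output = W2 @ hidden + b2                                # outputs of the entire net
```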
We show that several machine learning estimators, including the square-root least absolute shrinkage and selection operator (square-root lasso) and regularized logistic regression, can be represented as solutions to distributionally robust optimization problems. The associated uncertainty regions are based on suitably defined Wasserstein distances. Hence, our representations allow us to view regularization as the result of introducing an artificial adversary that perturbs the empirical distribution to account for out-of-sample effects in loss estimation. In addition, we introduce RWPI (robust Wasserstein profile inference), a novel inference methodology which extends the use of methods inspired by empirical likelihood to the setting of optimal transport costs (of which Wasserstein distances are a particular case). We use RWPI to show how to optimally select the size of uncertainty regions, and as a consequence we are able to choose regularization parameters for these machine learning estimators without the use of cross-validation. Numerical experiments are also given to validate our theoretical findings.
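Schematically, such a distributionally robust formulation takes the form (standard notation, not lifted from the paper):

```latex
\min_{\beta}\ \sup_{P \,:\, D_c(P,\, P_n) \le \delta}\ \mathbb{E}_{P}\!\left[\,\ell(X, Y; \beta)\,\right]
```

where \(P_n\) is the empirical distribution of the data, \(D_c\) is an optimal transport (Wasserstein) cost, \(\delta\) is the radius of the uncertainty region, and \(\ell\) is the loss. The inner supremum is the artificial adversary; for suitable choices of \(D_c\) and \(\ell\), the value of this problem coincides with a regularized empirical risk, and the radius \(\delta\) plays the role of the regularization parameter.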
To deal with the growing migrant crisis in North Africa, several states have considered granting amnesty to foreign displaced persons (both economic migrants and potential refugees) who have entered their territories clandestinely. Morocco has taken the lead in this policy approach, launching two successful amnesty campaigns in 2014 and 2017 that regularized the status of approximately 40,000 displaced persons in total. While policymakers in many North African states increasingly see this policy as a viable solution, it is less well understood how ordinary citizens view such regularization policies. Hence, this article asks: under what conditions do ordinary native citizens support regularizing clandestine migrants and refugees? Further, what factors correlate with higher or lower levels of public support for (or opposition to) regularization campaigns? Drawing on an original representative public opinion poll from Morocco's Casablanca-Settat region completed in 2017, this article finds that more than 59 percent of native citizens of Morocco support these regularization campaigns. In particular, Moroccans who were wealthier, female, or members of ethnic minorities (black Moroccans) endorsed regularization more strongly. By contrast, Moroccans opposed regularization when they had concerns that displaced persons hurt the economy, undermine cultural traditions, or reduce stability.
We study the numerical identification of an unknown portion of the boundary on which either the Dirichlet or the Neumann condition is provided from the knowledge of Cauchy data on the remaining, accessible and known part of the boundary of a two-dimensional domain, for problems governed by Helmholtz-type equations. This inverse geometric problem is solved using the plane waves method (PWM) in conjunction with the Tikhonov regularization method. The value for the regularization parameter is chosen according to Hansen's L-curve criterion. The stability, convergence, accuracy and efficiency of the proposed method are investigated by considering several examples.
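In standard notation (not specific to the plane waves discretization used in the paper), Tikhonov regularization replaces the ill-conditioned least-squares problem \(A x \approx b\) with

```latex
x_{\lambda} \;=\; \arg\min_{x}\; \lVert A x - b \rVert_2^2 \;+\; \lambda^2 \lVert x \rVert_2^2
```

and the L-curve criterion selects \(\lambda\) near the corner of the log-log plot of the solution norm \(\lVert x_{\lambda} \rVert_2\) against the residual norm \(\lVert A x_{\lambda} - b \rVert_2\), balancing fidelity to the Cauchy data against the stability of the reconstructed boundary.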