The chapter introduces fundamental principles of deep learning. We discuss supervised learning of feedforward neural networks by considering a binary classification problem. Gradient descent techniques and backpropagation learning algorithms are introduced as means of training neural networks. The impact of neuron activations and of convolutional and residual network architectures on learning performance is discussed. Finally, regularization techniques such as batch normalization and dropout are introduced for improving the accuracy of trained models. The chapter is essential for connecting advances in conventional deep learning algorithms to neuromorphic concepts.
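For readers who want to see these ideas in code, the following minimal sketch (not taken from the chapter) trains a one-hidden-layer feedforward network on a toy binary classification problem using gradient descent with hand-derived backpropagation; the data, architecture, and hyperparameters are purely illustrative.

```python
# Minimal sketch (not the chapter's own code): gradient descent with manual
# backpropagation for a one-hidden-layer feedforward network on a toy
# binary classification problem. Shapes and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs, labels 0/1.
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)]).reshape(-1, 1)

# Parameters of a 2-16-1 network.
W1 = rng.normal(0, 0.5, (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for epoch in range(500):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)          # hidden activations
    p = sigmoid(h @ W2 + b2)          # predicted probability of class 1

    # Binary cross-entropy loss, averaged over the batch.
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

    # Backward pass (chain rule applied layer by layer).
    dlogits = (p - y) / len(X)            # dL/d(pre-sigmoid output)
    dW2 = h.T @ dlogits; db2 = dlogits.sum(0)
    dh = dlogits @ W2.T * (1 - h**2)      # tanh' = 1 - tanh^2
    dW1 = X.T @ dh;      db1 = dh.sum(0)

    # Gradient descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

accuracy = np.mean((p > 0.5) == y)
print(f"final loss {loss:.3f}, training accuracy {accuracy:.2f}")
```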
Emphasizing how and why machine learning algorithms work, this introductory textbook bridges the gap between the theoretical foundations of machine learning and its practical algorithmic and code-level implementation. Over 85 thorough worked examples, in both Matlab and Python, demonstrate how algorithms are implemented and applied whilst illustrating the end result. Over 75 end-of-chapter problems empower students to develop their own code to implement these algorithms, equipping them with hands-on experience. Matlab coding examples demonstrate how a mathematical idea is converted from equations to code, and provide a jumping-off point for students, supported by in-depth coverage of essential mathematics including multivariable calculus, linear algebra, probability and statistics, numerical methods, and optimization. Accompanied online by instructor lecture slides, downloadable Python code and additional appendices, this is an excellent introduction to machine learning for senior undergraduate and graduate students in Engineering and Computer Science.
Pater's (2019) target article proposes that neural networks will provide theories of learning that generative grammar lacks. We argue that his enthusiasm is premature since the biases of neural networks are largely unknown, and he disregards decades of work on machine learning and learnability. Learning biases form a two-way street: all learners have biases, and those biases constrain the space of learnable grammars in mathematically measurable ways. Analytical methods from the related fields of computational learning theory and grammatical inference allow one to study language learning, neural networks, and linguistics at an appropriate level of abstraction. The only way to satisfy our hunger and to make progress on the science of language learning is to confront these core issues directly.
The target article (Pater 2019) proposes to use neural networks to model learning within existing grammatical frameworks. This is easier said than done. There is a fundamental gap to be bridged that does not receive attention in the article: how can we use neural networks to examine whether it is possible to learn some linguistic representation (a tree, for example) when, after learning is finished, we cannot even tell if this is the type of representation that has been learned (all we see is a sequence of numbers)? Drawing a correspondence between an abstract linguistic representational system and an opaque parameter vector that can (or perhaps cannot) be seen as an instance of such a representation is an implementational mapping problem. Rather than relying on existing frameworks that propose partial solutions to this problem, such as harmonic grammar, I suggest that fusional research of the kind proposed needs to directly address how to ‘find’ linguistic representations in neural network representations.
Joe Pater's (2019) target article calls for greater interaction between neural network research and linguistics. I expand on this call and show how such interaction can benefit both fields. Linguists can contribute to research on neural networks for language technologies by clearly delineating the linguistic capabilities that can be expected of such systems, and by constructing controlled experimental paradigms that can determine whether those desiderata have been met. In the other direction, neural networks can benefit the scientific study of language by providing infrastructure for modeling human sentence processing and for evaluating the necessity of particular innate constraints on language acquisition.
From my perspective, Pater's (2019) target article does a great service both to researchers who work in generative linguistics and to researchers who utilize neural networks—and especially to researchers who might find themselves wanting to do both by harnessing the insights of each tradition. The fusion of theories of linguistic representation and probabilistic learning techniques has certainly led to many interesting and valuable insights about the nature of both linguistic representation and the language acquisition process. However, I feel that the most exciting aspect of Pater's article is the increasing interpretability of neural network models, especially when combined with insights from the theoretical framework of generative linguistics. This allows for the possibility that neural networks could be used to actually generate new theories of representation. I describe how I think this theory-generation process might work with interpretable neural networks.
The birthdate of both generative linguistics and neural networks can be taken as 1957, the year of the publication of foundational work by both Noam Chomsky and Frank Rosenblatt. This article traces the development of these two approaches to cognitive science, from their largely autonomous early development in the first thirty years, through their collision in the 1980s around the past-tense debate (Rumelhart & McClelland 1986, Pinker & Prince 1988) and their integration in much subsequent work up to the present. Although this integration has produced a considerable body of results, the continued general gulf between these two lines of research is likely impeding progress in both: on learning in generative linguistics, and on the representation of language in neural modeling. The article concludes with a brief argument that generative linguistics is unlikely to fulfill its promise of accounting for language learning if it continues to maintain its distance from neural and statistical approaches to learning.
Breast cancer is the second leading cause of cancer-related deaths among women globally and the most prevalent cancer in women. Artificial intelligence (AI)-based frameworks have shown great promise in correctly classifying breast carcinomas, particularly those that may have been difficult to discern through routine microscopy. Additionally, mitotic number quantification utilizing AI technology is more accurate than manual counting. With advantages such as improved accuracy, efficiency, and consistency, as shown in this literature review, AI holds promise for significantly enhancing breast cancer diagnosis in clinical practice, despite the substantial obstacles that must be addressed. Ongoing research and innovation are essential for overcoming these challenges and effectively harnessing AI’s transformative potential in breast cancer detection and assessment.
In deep learning (DL), the instability phenomenon is widespread and well documented, and the most commonly used measure of stability is the Lipschitz constant. While a small Lipschitz constant is traditionally viewed as guaranteeing stability, it does not capture the instability phenomenon in DL for classification well. The reason is that a classification function – which is the target function to be approximated – is necessarily discontinuous, thus having an ‘infinite’ Lipschitz constant. As a result, the classical approach will deem every classification function unstable, yet basic classification functions à la ‘is there a cat in the image?’ will typically be locally very ‘flat’ – and thus locally stable – except at the decision boundary. The lack of an appropriate measure of stability hinders a rigorous theory for stability in DL, and consequently, there are no proper approximation theoretic results that can guarantee the existence of stable networks for classification functions. In this paper, we introduce a novel stability measure $\mathcal{S}(f)$, for any classification function $f$, appropriate for studying the stability of classification functions and their approximations. We further prove two approximation theorems: First, for any $\epsilon > 0$ and any classification function $f$ on a compact set, there is a neural network (NN) $\psi$ such that $\psi - f \neq 0$ only on a set of measure $< \epsilon$; moreover, $\mathcal{S}(\psi) \geq \mathcal{S}(f) - \epsilon$ (as accurate and stable as $f$ up to $\epsilon$). Second, for any classification function $f$ and $\epsilon > 0$, there exists a NN $\psi$ such that $\psi = f$ on the set of points that are at least $\epsilon$ away from the decision boundary.
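To illustrate the abstract's point about Lipschitz constants (this is a toy check, not the paper's measure $\mathcal{S}(f)$, which is not defined here), the following sketch shows that a thresholded classification function has arbitrarily large difference quotients across its decision boundary yet is completely flat away from it.

```python
# Illustrative numerical check (not the paper's stability measure): the
# indicator function f(x) = 1 if x >= 0 else 0 has arbitrarily large
# difference quotients across the decision boundary at x = 0, yet is
# perfectly flat (locally stable) at any point bounded away from it.
import numpy as np

def f(x):
    """Indicator classification function: 1 for x >= 0, otherwise 0."""
    return (np.asarray(x) >= 0.0).astype(float)

# Difference quotients straddling the boundary blow up as h -> 0
# (an 'infinite' Lipschitz constant in the limit).
for h in [1e-1, 1e-3, 1e-6]:
    quotient = abs(float(f(h)) - float(f(-h))) / (2 * h)
    print(f"h={h:g}: difference quotient = {quotient:.1e}")

# Away from the boundary, small perturbations do not change the output at all.
x0 = 0.5
perturbations = np.linspace(-0.1, 0.1, 5)
print("outputs near x0 = 0.5:", f(x0 + perturbations))
```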
In deep learning, interval neural networks are used to quantify the uncertainty of a pre-trained neural network. Suppose we are given a computational problem $P$ and a pre-trained neural network $\Phi_P$ that aims to solve $P$. An interval neural network is then a pair of neural networks $(\underline{\phi}, \overline{\phi})$ with the property that $\underline{\phi}(y) \leq \Phi_P(y) \leq \overline{\phi}(y)$ for all inputs $y$, where the inequalities hold componentwise. $(\underline{\phi}, \overline{\phi})$ are specifically trained to quantify the uncertainty of $\Phi_P$, in the sense that the size of the interval $[\underline{\phi}(y),\overline{\phi}(y)]$ quantifies the uncertainty of the prediction $\Phi_P(y)$. In this paper, we investigate the phenomenon whereby algorithms cannot compute interval neural networks in the setting of inverse problems. We show that in the typical setting of a linear inverse problem, the problem of constructing an optimal pair of interval neural networks is non-computable, even under the assumption that the pre-trained neural network $\Phi_P$ is an optimal solution. In other words, there exist classes of training sets $\Omega$ such that there is no algorithm, even randomised (with probability $p \geq 1/2$), that computes an optimal pair of interval neural networks for each training set $\mathcal{T} \in \Omega$. This phenomenon persists even when we are given a pre-trained neural network $\Phi_{\mathcal{T}}$ that is optimal for $\mathcal{T}$. This phenomenon is intimately linked to instability in deep learning.
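As a concrete illustration of the bracketing property (a sketch, not the paper's construction or training procedure), the snippet below builds a trivially valid interval pair around a placeholder network and uses the interval width as an uncertainty proxy; all weights and the margin are illustrative.

```python
# Illustrative sketch (not the paper's construction): given a pre-trained
# network Phi_P, build a trivially valid interval pair (phi_lo, phi_hi) by
# shifting the output, check the componentwise bracketing property, and
# use the interval width as an uncertainty proxy.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)

def phi_p(y):
    """Placeholder for the pre-trained network Phi_P (random weights)."""
    return np.maximum(y @ W1 + b1, 0.0) @ W2 + b2

delta = 0.1  # illustrative margin; a real interval NN would be trained
phi_lo = lambda y: phi_p(y) - delta
phi_hi = lambda y: phi_p(y) + delta

y = rng.normal(size=(5, 3))
# The componentwise bracketing property holds by construction here.
assert np.all(phi_lo(y) <= phi_p(y)) and np.all(phi_p(y) <= phi_hi(y))

uncertainty = phi_hi(y) - phi_lo(y)   # interval width, componentwise
print("componentwise uncertainty per input:\n", uncertainty)
```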
Bridge the gap between theoretical concepts and their practical applications with this rigorous introduction to the mathematics underpinning data science. It covers essential topics in linear algebra, calculus and optimization, and probability and statistics, demonstrating their relevance in the context of data analysis. Key application topics include clustering, regression, classification, dimensionality reduction, network analysis, and neural networks. What sets this text apart is its focus on hands-on learning. Each chapter combines mathematical insights with practical examples, using Python to implement algorithms and solve problems. Self-assessment quizzes, warm-up exercises and theoretical problems foster both mathematical understanding and computational skills. Designed for advanced undergraduate students and beginning graduate students, this textbook serves as both an invitation to data science for mathematics majors and as a deeper excursion into mathematics for data science students.
This chapter introduces the foundational mathematical concepts behind neural networks, backpropagation, and stochastic gradient descent (SGD). It begins by generalizing the Chain Rule and providing a brief overview of automatic differentiation, which is essential for efficiently computing derivatives in machine learning models. The chapter then explains backpropagation within the context of multilayer neural networks, specifically focusing on multilayer perceptrons (MLPs). It covers the implementation of SGD, highlighting its advantages when training on large datasets. Practical examples using the PyTorch library are provided, including the classification of images from the Fashion-MNIST dataset. The chapter provides a solid foundation in the mathematical tools and techniques that underpin modern AI.
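A minimal sketch of this workflow is shown below; it uses the standard torchvision Fashion-MNIST loader and is not the chapter's exact code, with the architecture and hyperparameters chosen purely for illustration.

```python
# Minimal sketch (not the chapter's exact code): an MLP trained with SGD and
# backpropagation on Fashion-MNIST using PyTorch. Architecture and
# hyperparameters are illustrative.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_data = datasets.FashionMNIST(root="./data", train=True, download=True,
                                   transform=transforms.ToTensor())
loader = DataLoader(train_data, batch_size=64, shuffle=True)

# A simple multilayer perceptron: 28x28 images -> 10 classes.
model = nn.Sequential(nn.Flatten(),
                      nn.Linear(28 * 28, 128), nn.ReLU(),
                      nn.Linear(128, 10))

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(2):                     # short run for illustration
    for images, labels in loader:
        logits = model(images)             # forward pass
        loss = loss_fn(logits, labels)
        optimizer.zero_grad()
        loss.backward()                    # backpropagation via autograd
        optimizer.step()                   # SGD parameter update
    print(f"epoch {epoch}: last batch loss {loss.item():.3f}")
```

Evaluation on held-out data follows the same pattern, loading the test split with `train=False` and comparing predicted and true labels.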
This chapter provides an introduction to AI, including an overview of essential technologies such as machine learning and deep learning, and a discussion of generative AI and its potential limitations. The chapter includes an exploration of AI's history, including its relationship to cybernetics, its role in codebreaking, periods of optimism and “AI winters,” and today's global development of generative AI. Chapter 1 also includes an analysis of AI's role in the international and national context, focusing on potential conflicts of goals and threats that can arise from the technology.
This Element provides a comprehensive guide to deep learning in quantitative trading, merging foundational theory with hands-on applications. It is organized into two parts. The first part introduces the fundamentals of financial time-series and supervised learning, exploring various network architectures, from feedforward to state-of-the-art. To ensure robustness and mitigate overfitting on complex real-world data, a complete workflow is presented, from initial data analysis to cross-validation techniques tailored to financial data. Building on this, the second part applies deep learning methods to a range of financial tasks. The authors demonstrate how deep learning models can enhance both time-series and cross-sectional momentum trading strategies, generate predictive signals, and be formulated as an end-to-end framework for portfolio optimization. Applications use data ranging from daily observations to high-frequency microstructure data, across a variety of asset classes. Throughout, the authors include illustrative code examples and provide a dedicated GitHub repository with detailed implementations.
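As a small illustration of the first part's supervised-learning framing (independent of the Element's own GitHub code), the sketch below turns a synthetic daily return series into lagged features and a next-day target, with a chronological train/test split and a linear baseline standing in for a deep network.

```python
# Illustrative sketch: framing a daily return series as a supervised learning
# problem, with lagged returns as features and the next day's return as the
# target. The synthetic data and lag length are placeholders.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, 1000)        # synthetic daily returns

lags = 5
X = np.column_stack([returns[i:len(returns) - lags + i] for i in range(lags)])
y = returns[lags:]                            # next-day return to predict

# Chronological split: never validate on data that precedes the training set.
split = int(0.8 * len(y))
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# Simple linear baseline fitted by least squares; a deep network would
# replace this step in a full workflow.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
pred = X_test @ coef
print("out-of-sample sign accuracy:", np.mean(np.sign(pred) == np.sign(y_test)))
```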
Psychiatric disorders lead to disability, premature mortality and economic burden, highlighting the urgent need for more effective treatments. The understanding of psychiatric disorders as conditions of large-scale brain networks has created new opportunities for developing targeted, personalised, and mechanism-based therapeutic interventions. Non-invasive brain stimulation (NIBS) techniques, such as transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS), can directly modulate dysfunctional neural networks, enabling treatments tailored to the individual’s unique functional network patterns.
As NIBS techniques depend on our understanding of the neural networks involved in psychiatric disorders, this review offers a neural network-informed perspective on their applications. We focus on key disorders, including depression, schizophrenia, and obsessive-compulsive disorder, and examine the role of NIBS in addressing cognitive impairment, a transdiagnostic feature that does not respond to conventional treatments. We discuss the advancements in identifying NIBS response biomarkers with the use of electrophysiology and neuroimaging, which can inform the development of optimised, mechanism-based, personalised NIBS treatment protocols.
We address key challenges, including the need for more precise, individualised targeting of dysfunctional networks through integration of neurophysiological, neuroimaging and genetic data and the use of emerging techniques, such as low-intensity focused ultrasound, which has the potential to improve spatial precision and target access. We finally explore future directions to improve treatment protocols and promote widespread clinical use of NIBS as a safe, effective and patient-centred treatment for psychiatric disorders.
This chapter discusses how to apply principles of statistics, optimization, and linear algebra in advanced techniques of data science and machine learning. The chapter shows how to use principal component analysis and singular value decomposition for analyzing complex datasets and discusses advanced estimation techniques such as logistic regression, Gaussian process models, and neural networks.
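As a brief illustration of one of these techniques (a sketch, not the chapter's own code), the following snippet performs principal component analysis of a synthetic data matrix via the singular value decomposition.

```python
# Brief sketch (not the chapter's own code): principal component analysis of a
# data matrix via the singular value decomposition. The synthetic data and the
# number of retained components are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))   # correlated features

Xc = X - X.mean(axis=0)                 # centre each column
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

explained_variance = s**2 / (len(X) - 1)
ratio = explained_variance / explained_variance.sum()
print("variance explained per component:", np.round(ratio, 3))

k = 2
scores = Xc @ Vt[:k].T                  # project onto the top-k principal axes
print("shape of reduced dataset:", scores.shape)
```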
This chapter explores key elements of AI as relevant to intellectual property law. Understanding how artificial intelligence works is crucial for applying legal regimes to it. Legal practitioners, especially IP lawyers, need a deep understanding of AI’s technical nuances. Intellectual property doctrines aim to achieve practical ends, and their application to AI is highly fact-dependent. Patent law, for example, requires technical expertise in addition to legal knowledge. This chapter tracks the development of AI from simple programming to highly sophisticated learning algorithms. It emphasizes how AI is rapidly evolving and that many of these systems are already being widely adopted in society. AI is transforming fields like education, law, healthcare, and finance. While AI offers numerous benefits, it also raises concerns about bias and transparency, among numerous other ethical implications.
Ethnicity and race are vital for understanding representation, yet individual-level data are often unavailable. Recent methodological advances have allowed researchers to impute racial and ethnic classifications based on publicly available information, but predictions vary in their accuracy and can introduce statistical biases in downstream analyses. We provide an overview of common estimation methods, including Bayesian approaches and machine learning techniques that use names or images as inputs. We propose and test a hybrid approach that combines surname-based Bayesian estimation with the use of publicly available images in a convolutional neural network. We find that the proposed approach not only reduces statistical bias in downstream analyses but also improves accuracy in a sample of over 16,000 local elected officials. We conclude with a discussion of caveats and describe settings where the hybrid approach is especially suitable.
Language models (LMs) can produce fluent, grammatical text. Nonetheless, some maintain that LMs do not really learn language, and that even if they did, this would not be informative for the study of human learning and processing. On the other side, there have been claims that the success of LMs obviates the need for studying linguistic theory and structure. We argue that both extremes are wrong. LMs can contribute to fundamental questions about linguistic structure, language processing, and learning. They force us to rethink arguments and ways of thinking that have been foundational in linguistics. While they do not replace linguistic structure and theory, they serve as model systems and working proofs of concept for gradient, usage-based approaches to language. We offer an optimistic take on the relationship between language models and linguistics.
In this chapter, we introduce the main concepts of neural networks (NNs). We then present the main building blocks of a neural network and discuss the most common training techniques.