We propose a physics-constrained convolutional neural network (PC-CNN) to solve two types of inverse problems in partial differential equations (PDEs), which are nonlinear and vary both in space and time. In the first inverse problem, we are given data that is offset by spatially varying systematic error (i.e., the bias, also known as the epistemic uncertainty). The task is to uncover the true state, which is the solution of the PDE, from the biased data. In the second inverse problem, we are given sparse information on the solution of a PDE. The task is to reconstruct the solution in space with high resolution. First, we present the PC-CNN, which constrains the PDE with a time-windowing scheme to handle sequential data. Second, we analyze the performance of the PC-CNN to uncover solutions from biased data. We analyze both linear and nonlinear convection-diffusion equations, and the Navier–Stokes equations, which govern the spatiotemporally chaotic dynamics of turbulent flows. We find that the PC-CNN correctly recovers the true solution for a variety of biases, which are parameterized as non-convex functions. Third, we analyze the performance of the PC-CNN for reconstructing solutions from sparse information for the turbulent flow. We reconstruct the spatiotemporal chaotic solution on a high-resolution grid from only 1% of the information contained in it. For both tasks, we further analyze the Navier–Stokes solutions. We find that the inferred solutions have a physical spectral energy content, whereas traditional methods, such as interpolation, do not. This work opens opportunities for solving inverse problems with partial differential equations.
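To make the physics constraint concrete, the following is a minimal sketch (in PyTorch) of a loss that combines a data-misfit term with the residual of a PDE evaluated on a window of network outputs, illustrated here for the 1-D linear convection-diffusion equation $u_t + c u_x = \nu u_{xx}$; the finite-difference discretization, coefficients, and weighting are illustrative assumptions, not the authors' PC-CNN.

```python
# Minimal sketch of a physics-constrained loss (assumptions: 1-D linear
# convection-diffusion, simple finite differences); not the authors' PC-CNN.
import torch

def pde_residual(u, dt, dx, c=1.0, nu=0.01):
    """u: (batch, window, nx) network outputs on a time window."""
    u_t = (u[:, 1:, :] - u[:, :-1, :]) / dt                    # forward difference in time
    u_x = (u[:, :-1, 2:] - u[:, :-1, :-2]) / (2 * dx)          # central difference in space
    u_xx = (u[:, :-1, 2:] - 2 * u[:, :-1, 1:-1] + u[:, :-1, :-2]) / dx**2
    return u_t[:, :, 1:-1] + c * u_x - nu * u_xx               # residual of u_t + c*u_x - nu*u_xx

def pc_loss(u_pred, data, dt, dx, lam=1.0):
    data_loss = torch.mean((u_pred - data) ** 2)               # fit the available (biased/sparse) data
    phys_loss = torch.mean(pde_residual(u_pred, dt, dx) ** 2)  # penalize violation of the PDE
    return data_loss + lam * phys_loss
```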
Optical microrobots are activated by a laser in a liquid medium using optical tweezers. To create visual control loops for robotic automation, this work describes a deep learning-based method for orientation estimation of optical microrobots, focusing on detecting 3-D rotational movements and localizing microrobots and trapping points (TPs). We integrated and fine-tuned You Only Look Once (YOLOv7) and Deep Simple Online Real-time Tracking (DeepSORT) algorithms, improving microrobot and TP detection accuracy by $\sim 3$% and $\sim 11$%, respectively, at the 0.95 Intersection over Union (IoU) threshold in our test set. Additionally, it increased mean average precision (mAP) by 3% at the 0.5:0.95 IoU threshold during training. Our results showed a 99% success rate in trapping events with no false-positive detection. We introduced a model that employs EfficientNet as a feature extractor combined with custom convolutional neural networks (CNNs) and feature fusion layers. To demonstrate its generalization ability, we evaluated the model on an independent in-house dataset comprising 4,757 image frames, where microrobots executed simultaneous rotations across all three axes. Our method provided mean rotation angle errors of $1.871^\circ$, $2.308^\circ$, and $2.808^\circ$ for the X (yaw), Y (roll), and Z (pitch) axes, respectively. Compared to pre-trained models, our model provided the lowest error in the Y and Z axes while offering competitive results for the X-axis. Finally, we demonstrated the explainability and transparency of the model’s decision-making process. Our work contributes to the field of microrobotics by providing an efficient 3-axis orientation estimation pipeline, with a clear focus on automation.
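As a rough illustration of such an estimation model, the sketch below (PyTorch) attaches a small regression head to an EfficientNet feature extractor that outputs the three rotation angles; the layer sizes, the input resolution, and the omission of the feature-fusion layers are assumptions, not the published architecture.

```python
# Minimal sketch: EfficientNet backbone + regression head for 3-axis angles.
# Layer sizes and input resolution are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0

class OrientationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = efficientnet_b0(weights=None).features  # feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(            # custom head (placeholder for fusion layers)
            nn.Flatten(),
            nn.Linear(1280, 256), nn.ReLU(),
            nn.Linear(256, 3),                # yaw (X), roll (Y), pitch (Z)
        )

    def forward(self, x):
        return self.head(self.pool(self.backbone(x)))

angles = OrientationNet()(torch.randn(1, 3, 224, 224))  # -> tensor of shape (1, 3)
```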
Generation of science-ready data from processed data products is one of the major challenges in next-generation radio continuum surveys with the Square Kilometre Array (SKA) and its precursors, due to the expected data volume and the need to achieve a high degree of automated processing. Source extraction, characterization, and classification are the major stages involved in this process. In this work we focus on the classification of compact radio sources in the Galactic plane using both radio and infrared images as inputs. To this end, we produced a curated dataset of $\sim$20 000 images of compact sources of different astronomical classes, obtained from past radio and infrared surveys, and novel radio data from pilot surveys carried out with the Australian SKA Pathfinder. Radio spectral index information was also obtained for a subset of the data. We then trained two different classifiers on the produced dataset. The first model uses gradient-boosted decision trees and is trained on a set of pre-computed features derived from the data, which include radio-infrared colour indices and the radio spectral index. The second model is trained directly on multi-channel images, employing convolutional neural networks. Using a completely supervised procedure, we obtained a high classification accuracy (F1-score > 90%) for separating Galactic objects from the extragalactic background. Individual class discrimination performances, ranging from 60% to 75%, increased by 10% when adding far-infrared and spectral index information, with extragalactic objects, planetary nebulae (PNe), and H II regions identified with higher accuracies. The implemented tools and trained models were publicly released and made available to the radio astronomy community for future application on new radio data.
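For concreteness, a minimal sketch of the first model type, assuming scikit-learn and synthetic placeholder features standing in for the radio-infrared colour indices and spectral index:

```python
# Minimal sketch of a gradient-boosted tree classifier on tabular features.
# Feature columns and data are synthetic placeholders, not the survey data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))                   # e.g. colour indices + spectral index
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)    # toy Galactic/extragalactic label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
print("F1:", f1_score(y_te, clf.predict(X_te)))
```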
The use of machine learning in robotics is a vast and growing area of research. In this chapter we consider a few key directions: the use of deep neural networks, the application of reinforcement learning (especially deep reinforcement learning), and the rapidly emerging potential of large language models.
Targeted spraying application technologies have the capacity to drastically reduce herbicide inputs, but to be successful, the performance of both machine vision–based weed detection and actuator efficiency needs to be optimized. This study assessed (1) the performance of spotted spurge recognition in ‘Latitude 36’ bermudagrass turf canopy using the You Only Look Once (YOLOv3) real-time multiobject detection algorithm and (2) the impact of various nozzle densities on model efficiency and projected herbicide reduction under simulated conditions. The YOLOv3 model was trained and validated with a data set of 1,191 images. The simulation design consisted of four grid matrix regimes (3 × 3, 6 × 6, 12 × 12, and 24 × 24), which would then correspond to 3, 6, 12, and 24 nonoverlapping nozzles, respectively, covering a 50-cm-wide band. Simulated efficiency testing was conducted using 50 images containing predictions (labels) generated with the trained YOLO model and by applying each of the grid matrices to individual images. The model achieved an F1 score of 0.62, a precision of 0.65, and a recall of 0.60. Increased nozzle density (from 3 to 12) improved actuator precision and predicted herbicide-use efficiency with a reduction in the false hits ratio from ∼30% to 5%. The area required to ensure herbicide deposition to all spotted spurge detected within images was reduced to 18%, resulting in ∼80% herbicide savings compared to broadcast application. Slightly greater precision was predicted with 24 nozzles but was not statistically different from the 12-nozzle scenario. Using this turf/weed model as a basis, optimal actuator efficacy and herbicide savings would occur by increasing nozzle density from 1 to 12 nozzles within the context of a single band.
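The grid-matrix logic can be sketched as follows, assuming normalized bounding boxes; the box coordinates are placeholders:

```python
# Minimal sketch of the grid-matrix simulation: divide the image into an
# n x n grid of nozzle cells and activate each cell that overlaps any
# predicted weed bounding box. Box coordinates are placeholders.
import math

def activated_cells(boxes, n):
    """boxes: list of (x1, y1, x2, y2) in [0, 1] image coordinates."""
    cells = set()
    for x1, y1, x2, y2 in boxes:
        for i in range(int(x1 * n), min(math.ceil(x2 * n), n)):
            for j in range(int(y1 * n), min(math.ceil(y2 * n), n)):
                cells.add((i, j))
    return cells

boxes = [(0.10, 0.20, 0.18, 0.30), (0.55, 0.60, 0.70, 0.72)]
for n in (3, 6, 12, 24):                          # the four simulated regimes
    frac = len(activated_cells(boxes, n)) / n**2
    print(f"{n}x{n}: spray {frac:.1%} of the band")
```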
Simulating abundances of stable water isotopologues, that is, molecules differing in their isotopic composition, within climate models allows for comparisons with proxy data and, thus, for testing hypotheses about past climate and validating climate models under varying climatic conditions. However, many models are run without explicitly simulating water isotopologues. We investigate the possibility of replacing the explicit physics-based simulation of oxygen isotopic composition in precipitation using machine learning methods. These methods estimate isotopic composition at each time step for given fields of surface temperature and precipitation amount. We implement convolutional neural networks (CNNs) based on the successful UNet architecture and test whether a spherical network architecture outperforms the naive approach of treating Earth’s latitude-longitude grid as a flat image. Conducting a case study on a last millennium run with the iHadCM3 climate model, we find that roughly 40% of the temporal variance in the isotopic composition is explained by the emulations on interannual and monthly timescales, with spatially varying emulation quality. The tested CNNs outperform simple baseline models such as random forest and pixel-wise linear regression substantially. A modified version of the standard UNet architecture for flat images yields results that are as good as the predictions by the spherical CNN. Variations in the implementation of isotopes between climate models likely contribute to an observed deterioration of emulation results when testing on data obtained from different climate models than the one used for training. Future work toward stable water-isotope emulation might focus on achieving robust climate–oxygen isotope relationships or exploring the set of possible predictor variables.
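One common way to adapt a flat-image convolution to the latitude-longitude grid is to make the longitude axis periodic while zero-padding latitude; whether this matches the paper's modified UNet is an assumption, but the sketch below (PyTorch) illustrates the idea.

```python
# Sketch of a longitude-periodic convolution for lat-lon fields. Whether
# this is the paper's exact UNet modification is an assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LonPeriodicConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=0)
        self.p = k // 2

    def forward(self, x):                                       # x: (batch, ch, lat, lon)
        x = F.pad(x, (self.p, self.p, 0, 0), mode="circular")   # wrap longitude
        x = F.pad(x, (0, 0, self.p, self.p))                    # zero-pad latitude
        return self.conv(x)

y = LonPeriodicConv2d(2, 16)(torch.randn(1, 2, 36, 72))  # temperature + precipitation fields
```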
Climate models are primary tools for investigating processes in the climate system, projecting future changes, and informing decision makers. The latest generation of models provides increasingly complex and realistic representations of the real climate system, while there is also growing awareness that not all models produce equally plausible or independent simulations. Therefore, many recent studies have investigated how models differ from observed climate and how model dependence affects model output similarity, typically drawing on climatological averages over several decades. Here, we show that temperature maps of individual days drawn from datasets never used in training can be robustly identified as “model” or “observation” using the CMIP6 model archive and four observational products. An important exception is a prototype storm-resolving simulation from ICON-Sapphire, which cannot be unambiguously assigned to either category. These results highlight that persistent differences between simulated and observed climate emerge at short timescales already, but very high-resolution modeling efforts may be able to overcome some of these shortcomings. Moreover, temporally out-of-sample test days can be assigned their dataset name with up to 83% accuracy. Misclassifications occur mostly between models developed at the same institution, suggesting that effects of shared code, previously documented only for climatological timescales, already emerge at the level of individual days. Our results thus demonstrate that the use of machine learning classifiers, once trained, can overcome the need for several decades of data to evaluate a given model. This opens up new avenues to test model performance and independence on much shorter timescales.
Monitoring river water levels is essential for the study of floods and mitigating their risks. River gauges are a well-established method for river water-level monitoring, but many flood-prone areas are ungauged and must be studied through gauges located several kilometers away. Taking advantage of river cameras to observe river water levels is an accessible and flexible solution, but it requires automation. However, current automated methods are only able to extract uncalibrated river water-level indexes from the images, meaning that these indexes are relative to the field of view of the camera, which limits their application. With this work, we propose a new approach to automatically estimate calibrated river water-level indexes from images of rivers. This approach is based on the creation of a new dataset of 32,715 images coming from 95 river cameras in the UK and Ireland, cross-referenced with gauge data (river water-level information), which allowed us to train convolutional neural networks. These networks are able to accurately produce two types of calibrated river water-level indexes from images: one for continuous river water-level monitoring, and the other for flood event detection. This work is an important step toward the automated use of cameras for flood monitoring.
Ocean wave climate has a significant impact on near-shore and off-shore human activities, and its characterization can help in the design of ocean structures such as wave energy converters and sea dikes. Therefore, engineers need long time series of ocean wave parameters. Numerical models are a valuable source of ocean wave data; however, they are computationally expensive. Consequently, statistical and data-driven approaches have gained increasing interest in recent decades. This work investigates the spatiotemporal relationship between North Atlantic wind and significant wave height ($ {H}_s $) at an off-shore location in the Bay of Biscay, using a two-stage deep learning model. The first step uses convolutional neural networks to extract the spatial features that contribute to $ {H}_s $. Then, long short-term memory is used to learn the long-term temporal dependencies between wind and waves.
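A minimal sketch of such a two-stage model in PyTorch, with all layer sizes and field dimensions as illustrative assumptions:

```python
# Sketch of a two-stage CNN + LSTM model: a CNN encodes each wind field
# into a feature vector, and an LSTM learns the temporal dependencies
# across the sequence to predict H_s. Layer sizes are assumptions.
import torch
import torch.nn as nn

class WindToHs(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(                # stage 1: spatial features
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(32, 64, batch_first=True)  # stage 2: temporal memory
        self.out = nn.Linear(64, 1)

    def forward(self, x):                        # x: (batch, time, 2, H, W) u/v wind
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
        h, _ = self.lstm(feats)
        return self.out(h[:, -1])                # H_s at the target location

hs = WindToHs()(torch.randn(4, 24, 2, 32, 32))   # 24-step wind sequence -> (4, 1)
```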
This chapter introduces machine learning in contemporary artificial intelligence. The first section looks at an expert system developed in the early days of AI research – ID3, which employs a decision-tree-based algorithm. The second section looks at advances in deep learning, which has transformed modern machine learning. We introduce a deep learning model inspired by the mammalian visual system, illustrating how it can extract hierarchical information from the raw data. The third section addresses two examples of neural networks – autoencoders and convolutional neural networks – which can feature in layers of deep learning networks. The last section looks at a distinct type of machine learning – reinforcement learning. We explain how deep reinforcement learning has made possible the two most spectacular milestones in artificial intelligence – AlphaGo and AlphaGo Zero.
The past 50 yr of advances in weed recognition technologies have poised site-specific weed control (SSWC) on the cusp of requisite performance for large-scale production systems. The technology offers improved management of diverse weed morphology over highly variable background environments. SSWC enables the use of nonselective weed control options, such as lasers and electrical weeding, as feasible in-crop selective alternatives to herbicides by targeting individual weeds. This review looks at the progress made over this half-century of research and its implications for future weed recognition and control efforts; summarizing advances in computer vision techniques and the most recent deep convolutional neural network (CNN) approaches to weed recognition. The first use of CNNs for plant identification in 2015 began an era of rapid improvement in algorithm performance on larger and more diverse datasets. These performance gains and subsequent research have shown that the variability of large-scale cropping systems is best managed by deep learning for in-crop weed recognition. The benefits of deep learning and improved accessibility to open-source software and hardware tools have been evident in the adoption of these tools by weed researchers and the increased popularity of CNN-based weed recognition research. The field of machine learning holds substantial promise for weed control, especially the implementation of truly integrated weed management strategies. Whereas previous approaches sought to reduce environmental variability or manage it with advanced algorithms, research in deep learning architectures suggests that large-scale, multi-modal approaches are the future for weed recognition.
Understanding the meteorological drivers of extreme impacts in social or environmental systems is important to better quantify current and project future climate risks. Impacts are typically an aggregated response to many different interacting drivers at various temporal scales, rendering such driver identification a challenging task. Machine learning–based approaches, such as deep neural networks, may be able to address this task but require large training datasets. Here, we explore the ability of Convolutional Neural Networks (CNNs) to predict years with extremely low gross primary production (GPP) from daily weather data in three different vegetation types. To circumvent data limitations in observations, we simulate 100,000 years of daily weather with a weather generator for three different geographical sites and subsequently simulate vegetation dynamics with a complex vegetation model. For each resulting vegetation distribution, we then train two different CNNs to classify daily weather data (temperature, precipitation, and radiation) into years with extremely low GPP and normal years. Overall, prediction accuracy is very good if the monthly or yearly GPP values are used as an intermediate training target (area under the precision-recall curve AUC $ \ge $ 0.9). The best prediction accuracy is found in tropical forests, with temperate grasslands and boreal forests leading to comparable results. Prediction accuracy is strongly reduced when binary classification is used directly. Furthermore, using daily GPP during training does not improve the predictive power. We conclude that CNNs are able to predict extreme impacts from complex meteorological drivers if sufficient data are available.
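The intermediate-target strategy reduces, at prediction time, to thresholding a regressed GPP value, as in the following sketch (the percentile and data are placeholders):

```python
# Sketch of the "intermediate target" strategy: first regress annual GPP
# from daily weather, then label a year as extreme if the predicted GPP
# falls below a low percentile. Threshold and data are placeholders.
import numpy as np

def classify_extreme(gpp_pred, gpp_train, q=10):
    """gpp_pred: model-predicted yearly GPP; threshold from training years."""
    thresh = np.percentile(gpp_train, q)         # e.g. lowest 10% = extreme years
    return gpp_pred < thresh

gpp_train = np.random.default_rng(0).normal(1.0, 0.3, size=1000)  # toy training GPP
flags = classify_extreme(np.array([1.2, 0.4, 0.9]), gpp_train)    # -> [False, True, False]
```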
The Lee–Carter model has become a benchmark in stochastic mortality modeling. However, its forecasting performance can be significantly improved upon by modern machine learning techniques. We propose a convolutional neural network (NN) architecture for mortality rate forecasting, empirically compare this model as well as other NN models to the Lee–Carter model and find that lower forecast errors are achievable for many countries in the Human Mortality Database. We provide details on the errors and forecasts of our model to make it more understandable and, thus, more trustworthy. As NNs by default only yield point estimates, previous works applying them to mortality modeling have not investigated prediction uncertainty. We address this gap in the literature by implementing a bootstrapping-based technique and demonstrate that it yields highly reliable prediction intervals for our NN model.
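As an illustration of this family of techniques, the sketch below implements a simple residual bootstrap that turns a point forecast into an interval; the paper's exact bootstrapping scheme may differ.

```python
# Sketch of a residual-bootstrap prediction interval for a point forecaster.
# The paper's exact scheme may differ; residuals here are toy data.
import numpy as np

def bootstrap_interval(point_forecast, residuals, n_boot=1000, alpha=0.05, seed=0):
    """Resample historical residuals to turn a point forecast into an interval."""
    rng = np.random.default_rng(seed)
    sims = point_forecast + rng.choice(residuals, size=n_boot, replace=True)
    return np.quantile(sims, [alpha / 2, 1 - alpha / 2])

resid = np.random.default_rng(1).normal(0, 0.05, size=200)        # toy in-sample residuals
lo, hi = bootstrap_interval(point_forecast=-4.2, residuals=resid)  # e.g. log mortality rate
```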
Determining the composition of a mixed material is an open problem that has attracted the interest of researchers in many fields. In our recent work, we proposed a novel approach to determine the composition of a mixed material using convolutional neural networks (CNNs). In machine learning, a model “learns” a specific task for which it is designed through data. Hence, obtaining a dataset of mixed materials is required to develop CNNs for the task of estimating the composition. However, the proposed method instead creates synthetic data of mixed materials using only images of the pure materials present in those mixtures. Thus, it eliminates the prohibitive cost and tedious process of collecting images of mixed materials. The motivation for this study is to provide mathematical details of the proposed approach in addition to extensive experiments and analyses. We examine the approach on two datasets to demonstrate the ease of extending the proposed approach to any mixtures. We perform experiments to demonstrate that the proposed approach can accurately determine the presence of the materials and estimate the precise composition of a mixed material. Moreover, we provide analyses to strengthen the validation and benefits of the proposed approach.
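A minimal sketch of the synthetic-mixture idea, under the assumption that a mixture can be approximated by a weighted blend of pure-material images; the paper's actual generation procedure may differ (e.g., spatial compositing rather than pixel blending).

```python
# Sketch of synthetic-mixture generation: blend pure-material images with
# random weights and use those weights as the composition label. This is
# an assumed approximation of the paper's procedure, not its definition.
import numpy as np

def synth_mixture(pure_images, seed=0):
    """pure_images: (n_materials, H, W, 3) uint8. Returns image + composition."""
    rng = np.random.default_rng(seed)
    w = rng.dirichlet(np.ones(len(pure_images)))          # random composition, sums to 1
    mix = np.tensordot(w, pure_images.astype(float), axes=1)
    return mix.astype(np.uint8), w                        # a training pair for the CNN

pure = np.random.default_rng(1).integers(0, 256, size=(3, 64, 64, 3), dtype=np.uint8)
image, composition = synth_mixture(pure)
```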
We propose a new neighbouring prediction model for mortality forecasting. For each mortality rate at age $x$ in year $t$, $m_{x,t}$, we construct an image of neighbourhood mortality data around $m_{x,t}$, that is, $\mathcal{E}_{m_{x,t}}(x_1, x_2, s)$, which includes mortality information for ages in $[x - x_1, x + x_2]$, lagging $k$ years ($1 \le k \le s$). Combined with a deep learning model, the convolutional neural network, this framework is able to capture the intricate nonlinear structure in the mortality data: the neighbourhood effect, which can go beyond the directions of period, age, and cohort as in classic mortality models. By performing an extensive empirical analysis on all 41 countries and regions in the Human Mortality Database, we find that the proposed models achieve superior forecasting performance. This framework can be further enhanced to capture the patterns and interactions between multiple populations.
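The construction of the neighbourhood image can be sketched as follows, with a random placeholder mortality matrix and illustrative window sizes:

```python
# Sketch of the neighbourhood "image": for a target rate m[x, t], collect
# ages x-x1..x+x2 over the s preceding years, giving an (x1+x2+1) x s
# array a CNN can consume. The mortality matrix is a random placeholder.
import numpy as np

def neighbourhood_image(m, x, t, x1=4, x2=4, s=5):
    """m: ages x years matrix of mortality rates; input image for m[x, t]."""
    return m[x - x1 : x + x2 + 1, t - s : t]     # lags k = 1..s, ages in [x-x1, x+x2]

m = np.random.default_rng(0).uniform(0.001, 0.1, size=(100, 60))  # ages 0-99, 60 years
img = neighbourhood_image(m, x=65, t=40)          # shape (9, 5)
```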
In this paper, we develop a methodology to automatically classify claims using the information contained in text reports (written when the claim is opened). From this automatic analysis, the aim is to predict whether a claim is expected to be particularly severe or not. The difficulty is the rarity of such extreme claims in the database, and hence the difficulty for classical prediction techniques, such as logistic regression, to accurately predict the outcome. Since the data are unbalanced (too few observations are associated with a positive label), we propose different rebalancing algorithms to deal with this issue. We discuss the use of different embedding methodologies to process the text data, and the role of the network architectures.
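The simplest member of this family of rebalancing algorithms, random oversampling of the minority class, can be sketched as follows (data are placeholders):

```python
# Sketch of random oversampling: duplicate rare "severe" claims until both
# labels are equally frequent. The paper compares several rebalancing
# algorithms; this is only the most basic one.
import numpy as np

def oversample_minority(X, y, seed=0):
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)            # severe claims (rare)
    majority = np.flatnonzero(y == 0)
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([majority, minority, extra])
    rng.shuffle(idx)
    return X[idx], y[idx]

X = np.random.default_rng(1).normal(size=(1000, 8))               # toy embeddings
y = (np.random.default_rng(2).random(1000) < 0.03).astype(int)    # ~3% severe claims
Xb, yb = oversample_minority(X, y)
```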
Deep learning cannot be missing from a modern pattern recognition textbook, and we introduce convolutional neural networks (CNNs) in this chapter. Although the mathematical derivation of a CNN, especially the back-propagation process and gradient computation, is complex, we use many useful tools to help readers understand what exactly is going on in a CNN. Hence, this chapter focuses on accessibility rather than completeness. In the exercise problems, we introduce more relevant topics and methods.
Graphlet counting is a widely explored problem in network analysis and has been successfully applied to a variety of applications in many domains, most notably bioinformatics, social science, and infrastructure network studies. Efficiently computing graphlet counts remains challenging due to the combinatorial explosion, where a naive enumeration algorithm needs $O(N^k)$ time for $k$-node graphlets in a network of size $N$. Recently, many works introduced carefully designed combinatorial and sampling methods with encouraging results. However, the existing methods ignore the fact that graphlet counts and the graph structural information are correlated. They always treat a graph as a new input and repeat the tedious counting procedure even if the graph is similar or exactly isomorphic to previously studied graphs. This provides an opportunity to speed up the graphlet count estimation procedure by exploiting this correlation via learning methods. In this paper, we raise a novel graphlet count learning (GCL) problem: given a set of historical graphs with known graphlet counts, learn to estimate/predict the graphlet counts of unseen graphs coming from the same (or a similar) underlying distribution. We develop a deep learning framework which contains two convolutional neural network models and a series of data preprocessing techniques to solve the GCL problem. Extensive experiments are conducted on three types of synthetic random graphs and three types of real-world graphs for all 3-, 4-, and 5-node graphlets to demonstrate the accuracy, efficiency, and generalizability of our framework. Compared with state-of-the-art exact and sampling methods, our framework shows great potential: it offers up to two orders of magnitude speedup on synthetic graphs and achieves on-par speed on real-world graphs with competitive accuracy.
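For reference, the naive enumeration baseline for $k = 3$, which distinguishes the two connected 3-node graphlets (paths and triangles), can be sketched as:

```python
# Sketch of the naive O(N^k) enumeration baseline for k = 3: enumerate all
# 3-node subsets and count the connected ones by isomorphism class.
from itertools import combinations

def count_3graphlets(adj):
    """adj: dict node -> set of neighbours. Returns (paths, triangles)."""
    paths = triangles = 0
    for a, b, c in combinations(adj, 3):
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if edges == 3:
            triangles += 1
        elif edges == 2:                         # connected, but not a triangle
            paths += 1
    return paths, triangles

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(count_3graphlets(adj))                     # -> (2, 1)
```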
The user's gaze can provide important information for human–machine interaction, but the analysis of manual gaze data is extremely time-consuming, inhibiting wide adoption in usability studies. Existing methods for automated areas of interest (AOI) analysis cannot be applied to tangible products with a screen-based user interface (UI), which have become ubiquitous in everyday life. The objective of this paper is to present and evaluate a method to automatically map the user's gaze to dynamic AOIs on tangible screen-based UIs based on computer vision and deep learning. This paper presents an algorithm for automated Dynamic AOI Mapping (aDAM), which allows the automated mapping of gaze data recorded with mobile eye tracking to the predefined AOIs on tangible screen-based UIs. The evaluation of the algorithm is performed using two medical devices, which represent two extreme examples of tangible screen-based UIs. The different elements of aDAM are examined for accuracy and robustness, as well as the time saved compared to manual mapping. The break-even point for an analyst's effort for aDAM compared to manual analysis is found to be 8.9 min of gaze data. The accuracy and robustness of both the automated gaze mapping and the screen matching indicate that aDAM can be applied to a wide range of products. aDAM allows, for the first time, automated AOI analysis of tangible screen-based UIs with AOIs that dynamically change over time. The algorithm requires some additional initial input for the setup and training, but thereafter the analysis duration and effort are determined only by computation time and require no additional manual work. The efficiency of the approach has the potential for a broader adoption of mobile eye tracking in usability testing for the development of new products and may contribute to a more data-driven usability engineering process in the future.
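The core mapping step that such a pipeline relies on can be sketched with a homography (OpenCV); the point coordinates and the AOI below are placeholders, and aDAM's full pipeline adds detection and deep-learning-based screen matching on top.

```python
# Sketch of the gaze-to-UI mapping step: once the screen is matched in the
# scene-camera frame, a homography carries the gaze point into UI pixel
# coordinates, where it can be tested against AOI rectangles.
import cv2
import numpy as np

scene_corners = np.float32([[100, 80], [540, 95], [530, 420], [90, 400]])  # screen in scene camera
screen_corners = np.float32([[0, 0], [800, 0], [800, 600], [0, 600]])      # UI pixel space
H, _ = cv2.findHomography(scene_corners, screen_corners)

gaze_scene = np.float32([[[320, 250]]])                   # gaze point, scene-camera frame
gaze_ui = cv2.perspectiveTransform(gaze_scene, H)[0, 0]   # -> UI coordinates

aoi = (200, 100, 600, 400)                                # x1, y1, x2, y2 of a dynamic AOI
hit = aoi[0] <= gaze_ui[0] <= aoi[2] and aoi[1] <= gaze_ui[1] <= aoi[3]
```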
The selection of the correct convergence angle is essential for achieving the highest resolution imaging in scanning transmission electron microscopy (STEM). The use of poor heuristics, such as Rayleigh's quarter-phase rule, to assess probe quality, together with uncertainties in the measurement of the aberration function, results in the incorrect selection of convergence angles and lower resolution. Here, we show that the Strehl ratio provides an accurate and efficient way to calculate criteria for evaluating the probe size for STEM. A convolutional neural network trained on the Strehl ratio is shown to outperform experienced microscopists at selecting a convergence angle from a single electron Ronchigram, using simulated datasets. Generating tens of thousands of simulated Ronchigram examples, the network is trained to select convergence angles yielding probes that are, on average, 85% nearer to the optimal size, at millisecond speeds (0.02% of human assessment time). Qualitative assessment on experimental Ronchigrams with intentionally introduced aberrations suggests that trends in the optimal convergence angle size are well modeled, but high accuracy requires a large number of training datasets. This near-immediate assessment of Ronchigrams using the Strehl ratio and machine learning highlights a viable path toward the rapid, automated alignment of aberration-corrected electron microscopes.
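A Strehl-ratio computation can be sketched as the peak intensity of the aberrated probe divided by that of the aberration-free probe; the defocus-only aberration function and the parameters below are illustrative assumptions.

```python
# Sketch of a Strehl-ratio computation: peak intensity of the aberrated
# probe over the peak of the ideal probe for the same aperture. A
# defocus-only aberration phase chi = pi * df * alpha^2 / lambda is assumed.
import numpy as np

def strehl_ratio(alpha_max, defocus, wavelength, n=256):
    """Probe-forming aperture of semi-angle alpha_max (rad), defocus in m."""
    a = np.linspace(-alpha_max, alpha_max, n)
    ax, ay = np.meshgrid(a, a)
    alpha2 = ax**2 + ay**2
    aperture = alpha2 <= alpha_max**2
    chi = np.pi * defocus * alpha2 / wavelength           # aberration phase (defocus only)
    psf = np.abs(np.fft.fft2(aperture * np.exp(1j * chi))) ** 2
    psf_ideal = np.abs(np.fft.fft2(aperture.astype(complex))) ** 2
    return psf.max() / psf_ideal.max()

print(strehl_ratio(alpha_max=25e-3, defocus=5e-9, wavelength=2.51e-12))  # 300 kV electrons
```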