Environmental enrichment programmes are widely used to improve the welfare of captive and laboratory animals, especially non-human primates. Monitoring enrichment use over time is crucial, as animals may habituate and reduce their interaction with it. In this study we aimed to monitor interaction with enrichment items in groups of rhesus macaques (Macaca mulatta), each consisting of an average of ten individuals, living in a breeding colony. To streamline the time-intensive task of assessing enrichment programmes, we automated the evaluation process using machine learning technologies. We built two computer vision-based pipelines to evaluate monkeys’ interactions with different enrichment items: a white drum containing raisins and a non-food-based puzzle. The first pipeline analyses usage of the white drum in nine groups, both when it contains food and when it is empty. The second pipeline counts the number of monkeys interacting with a puzzle across twelve groups. The data derived from the two pipelines reveal that the macaques consistently express interest in the food-based white drum enrichment, even several months after its introduction. The puzzle enrichment was monitored for one month, showing a gradual decline in interaction over time. These pipelines are valuable for assessing enrichment because they minimise the time spent on animal observation and data analysis; this study demonstrates that automated methods can consistently monitor macaque engagement with enrichment, systematically tracking habituation responses and long-term effectiveness. Such advancements have significant implications for enhancing animal welfare, enabling the discontinuation of ineffective enrichments and the adaptation of enrichment plans to meet the animals’ needs.
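The abstract does not detail the pipelines' internals, but the counting step it describes can be illustrated with a minimal sketch: a per-frame detector whose boxes are intersected with the enrichment item's region. Everything below is an assumption for illustration; `detect_monkeys` is a hypothetical stand-in for a trained detector, not the authors' code.

```python
# Minimal sketch of a frame-by-frame interaction counter, assuming a
# pretrained monkey detector (hypothetical `detect_monkeys`) and a fixed
# region of interest around the enrichment item.
from dataclasses import dataclass

@dataclass
class Box:
    x1: float; y1: float; x2: float; y2: float

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes."""
    ix1, iy1 = max(a.x1, b.x1), max(a.y1, b.y1)
    ix2, iy2 = min(a.x2, b.x2), min(a.y2, b.y2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda c: (c.x2 - c.x1) * (c.y2 - c.y1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def count_interactions(frames, enrichment_roi: Box, detect_monkeys, thr=0.05):
    """Count, per frame, monkeys whose boxes overlap the enrichment item."""
    counts = []
    for frame in frames:
        boxes = detect_monkeys(frame)  # hypothetical detector -> list[Box]
        counts.append(sum(iou(b, enrichment_roi) > thr for b in boxes))
    return counts
```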
Treating images as data has become increasingly popular in political science. While existing image classifiers reach high levels of accuracy, it is difficult to systematically assess the visual features on which they base their classification. This paper presents a two-level classification method that addresses this transparency problem. At the first stage, an image segmenter detects the objects present in the image and a feature vector is created from those objects. At the second stage, this feature vector is used as input for standard machine learning classifiers to discriminate between images. We apply this method to a new dataset of more than 140,000 images to detect which ones display political protest. This analysis demonstrates three advantages of this paper’s approach. First, identifying objects in images improves transparency by providing human-understandable labels for the objects shown in an image. Second, knowing these objects enables analysis of which objects distinguish protest images from non-protest ones. Third, comparing the importance of objects across countries reveals how protest behavior varies. These insights are not available from conventional computer vision classifiers and provide new opportunities for comparative research.
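As a rough illustration of the two-level idea, the sketch below builds an object-count feature vector from a segmenter's labels and fits a standard scikit-learn classifier whose feature importances map back to human-readable object names. The segmenter itself (`segment_objects`) is a hypothetical placeholder, not the paper's model.

```python
# Sketch of the two-level method: stage 1 turns detected object labels into
# a count vector; stage 2 feeds that vector to a standard classifier.
from collections import Counter
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction import DictVectorizer

def image_to_feature_dict(image, segment_objects):
    """Stage 1: object labels in the image -> {label: count}."""
    return dict(Counter(segment_objects(image)))  # e.g. {"person": 12, "flag": 2}

def train_protest_classifier(images, labels, segment_objects):
    """Stage 2: fit an interpretable classifier on object-count vectors."""
    vec = DictVectorizer(sparse=False)
    X = vec.fit_transform([image_to_feature_dict(im, segment_objects) for im in images])
    clf = RandomForestClassifier(n_estimators=300).fit(X, labels)
    # Feature importances now map back to human-readable object names:
    importances = sorted(zip(vec.feature_names_, clf.feature_importances_),
                         key=lambda t: -t[1])
    return vec, clf, importances
```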
Fast and efficient identification is critical for reducing the likelihood of weed establishment and for appropriately managing established weeds. Traditional identification tools require either knowledge of technical morphological terminology or time-consuming image matching by the user. In recent years, deep learning computer vision models have become mature enough to enable automatic identification. The major remaining bottlenecks are the availability of a sufficient number of high-quality, reliably identified training images and the user-friendly, mobile operationalization of the technology. Here, we present the first weed identification and reporting app and website for all of Australia. It includes an image classification model covering more than 400 species of weeds and some Australian native relatives, with a focus on emerging biosecurity threats and spreading weeds that can still be eradicated or contained. It links the user to additional information provided by state and territory governments, flags species that are locally reportable or notifiable, and allows the creation of observation records in a central database. State and local weed officers can create notification profiles to be alerted of relevant weed observations in their area. We discuss the background of the WeedScan project, the approach taken in design and software development, the photo library used for training the WeedScan image classifier, the model itself and its accuracy, and technical challenges and how these were overcome.
Visual odometry (VO) is a key technology for estimating camera motion from captured images. In this paper, we propose a novel RGB-D visual odometry method that constructs and matches features at the superpixel level and adapts better to different environments than state-of-the-art solutions. Superpixels are content-sensitive and perform well in information aggregation; they can thus characterize the complexity of the environment. First, we design the superpixel-based feature SegPatch and its corresponding 3D representation, MapPatch. By using neighboring information, SegPatch remains distinctive in environments with different texture densities. Because it incorporates depth measurements, MapPatch represents the scene structurally. We then define the distance between SegPatches to characterize regional similarity, and use a graph search method in scale space for searching and matching. As a result, the accuracy and efficiency of the matching process are improved. Additionally, we minimize the reprojection error between matched SegPatches and estimate camera poses from these correspondences. Our proposed VO is evaluated on the TUM dataset both quantitatively and qualitatively, and shows a good ability to adapt to the environment under different realistic conditions.
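The paper's SegPatch definition is not spelled out in the abstract; the sketch below only conveys the general flavor of a superpixel-level feature, using scikit-image's SLIC to segment the image and summarizing each segment with simple color and position statistics. It is a generic stand-in, not the proposed descriptor.

```python
# Flavor of a superpixel-level feature: SLIC segments the image and each
# segment is summarized by mean color plus centroid.
import numpy as np
from skimage.segmentation import slic

def superpixel_descriptors(rgb: np.ndarray, n_segments: int = 400):
    """Return one small descriptor (mean color + centroid) per superpixel."""
    labels = slic(rgb, n_segments=n_segments, compactness=10, start_label=0)
    descriptors = []
    for seg_id in np.unique(labels):
        mask = labels == seg_id
        ys, xs = np.nonzero(mask)
        mean_color = rgb[mask].mean(axis=0)           # (3,) average RGB
        centroid = np.array([xs.mean(), ys.mean()])   # (2,) pixel centroid
        descriptors.append(np.concatenate([mean_color, centroid]))
    return labels, np.stack(descriptors)
```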
With the rise of deep reinforcement learning (RL) methods, many complex robotic manipulation tasks are being solved. However, harnessing the full power of deep learning requires large datasets. Online RL does not lend itself readily to this paradigm because agent-environment interaction is costly and time-consuming. Many offline RL algorithms have therefore recently been proposed for learning robotic tasks. However, most such methods focus on single-task or multi-task learning, which requires retraining whenever a new task needs to be learned. Continually learning tasks without forgetting previous knowledge, combined with the power of offline deep RL, would allow us to scale the number of tasks by adding them one after another. This paper investigates the effectiveness of regularisation-based methods, such as synaptic intelligence, for sequentially learning image-based robotic manipulation tasks in an offline-RL setup. We evaluate the performance of this combined framework against two common challenges of sequential learning: catastrophic forgetting and forward knowledge transfer. We performed experiments with different task combinations to analyse the effect of task ordering, and also investigated the effect of the number of object configurations and the density of robot trajectories. We found that learning tasks sequentially helps retain knowledge from previous tasks, thereby reducing the time required to learn a new task. Regularisation-based approaches for continual learning, such as the synaptic intelligence method, help mitigate catastrophic forgetting but show only limited transfer of knowledge from previous tasks.
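For concreteness, the synaptic-intelligence regulariser mentioned above adds a quadratic penalty pulling parameters toward their values after earlier tasks, weighted by per-parameter importances. The PyTorch sketch below shows only that penalty term; the importance estimation (a path integral of gradient times parameter update) is simplified away, so treat it as a minimal illustration rather than the paper's setup.

```python
# Minimal PyTorch sketch of the synaptic-intelligence (SI) penalty: after a
# task, parameters are anchored at theta_star with per-parameter importances
# omega; the next task's loss adds c * omega * (theta - theta_star)^2.
import torch

def si_penalty(model, omega, theta_star, c=0.1):
    """Quadratic surrogate loss pulling parameters toward old-task optima."""
    loss = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        loss = loss + (omega[name] * (p - theta_star[name]) ** 2).sum()
    return c * loss

# Usage inside a training step (task_loss from the offline-RL objective):
#   loss = task_loss + si_penalty(model, omega, theta_star)
#   loss.backward(); optimizer.step()
```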
Artificial Intelligence (AI) is reshaping the world as we know it, impacting all aspects of modern society, largely owing to advances in computing power, data availability, and AI algorithms. The dairy sector is also on the move: from exponential growth in AI research to ready-to-use AI-based products, this evolution towards Dairy 4.0 represents a potential ‘game-changer’ for the sector in confronting challenges regarding sustainability, welfare, and profitability. This research reflection explores the possible impact of AI, discusses the main drivers in the field, and describes its origins, challenges, and opportunities. Further, we present a multidimensional vision considering factors not commonly considered in dairy research, such as geopolitical aspects and legal regulations, that can affect the application of AI in the dairy sector. This is just the beginning of the third wave of AI, and much of its future still lies ahead. For now, current advances in AI at the on-farm level seem limited, and based on the data reviewed we believe that AI can be a ‘game-changer’ only if it is integrated with other components of Dairy 4.0 (such as robotics) and is fully adopted by dairy farmers.
This chapter introduces the reader to the history and development of facial recognition technology (FRT) from the perspective of science and technology studies. Beginning with the traditionally accepted origins of FRT in 1964–1965, developed by Woody Bledsoe, Charles Bisson, and Helen Wolf Chan in the United States, Simon Taylor discusses how FRT builds on earlier applications in mug shot profiling, imaging, biometrics, and statistical categorisation. Grounded in the history of science and technology, the chapter demonstrates how critical aspects of FRT infrastructure were aided by scientific and cultural innovations from different times and locations: mug shots in nineteenth-century France; mathematical analysis of caste in nineteenth-century British India; and innovations by Chinese closed-circuit television companies and computer vision start-ups conducting bio-security experiments on farm animals. This helps us understand FRT development beyond the United States-centred narrative. The aim is to deconstruct the historical data, mathematical, and digital materials that act as ‘back-stage elements’ of FRT and are not easily located in its infrastructure, yet continue to shape its uses today. Taylor’s analysis lays a foundation for the kinds of frameworks, developed in the following chapters, that can better help regulate and govern FRT as a means of power over populations.
Varietal identification plays a pivotal role in viticulture for several purposes. Nowadays, such identification is accomplished using ampelography and molecular markers, techniques requiring specific expertise and equipment. Deep learning, on the other hand, appears to be a viable and cost-effective alternative, as several recent studies claim that computer vision models can identify different vine varieties with high accuracy. Such works, however, limit their scope to a handful of selected varieties and do not provide accurate figures for external data validation. In the current study, five well-known computer vision models were applied to leaf images to verify whether the results presented in the literature can be replicated over a larger data set consisting of 27 varieties and 26 382 images. The data set was built over two years of dedicated field sampling at three geographically distinct sites, and a validation data set was collected from the Internet. Cross-validation results on the purpose-built data set confirm the literature results. However, the same models, when validated against the independent data set, appear unable to generalize beyond the training data and to retain the performance measured during cross-validation. These results indicate that further work is needed to fill this gap and to develop a more reliable model for discriminating among grape varieties; to achieve this, image resolution appears to be a crucial factor in the development of such models.
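The generalization gap described above can be made concrete with a short evaluation sketch: compare k-fold cross-validation accuracy on the purpose-built data set against accuracy on the independent, out-of-distribution set. The feature matrices and the simple classifier below are placeholders, not the paper's five models.

```python
# Sketch of the evaluation gap: in-distribution cross-validation vs.
# accuracy on an external hold-out set. X, y and X_ext, y_ext stand for
# the purpose-built and Internet-collected sets, respectively.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def report_generalization_gap(X, y, X_ext, y_ext):
    clf = LogisticRegression(max_iter=1000)
    cv_acc = cross_val_score(clf, X, y, cv=5).mean()               # in-distribution
    ext_acc = accuracy_score(y_ext, clf.fit(X, y).predict(X_ext))  # external set
    return cv_acc, ext_acc  # a large drop signals poor generalization
```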
This article designs a robotic Chinese character writing system that can resist random human interference. First, an innovative stroke extraction method for Chinese characters was devised: a basic stroke extraction method based on cumulative direction vectors extracts the components that make up the strokes, and the components are then stitched together into strokes using the sequential base-stroke joining method. To enable the robot to imitate Chinese character handwriting skills, we utilised stroke information as the demonstration and modelled the skills using dynamic movement primitives (DMPs). To suppress random human interference, this article combines improved DMPs and admittance control to adjust robot trajectories based on real-time visual measurements. The experimental results show that the proposed method can accurately extract the strokes of most Chinese characters. The designed trajectory adjustment method offers better smoothness and robustness than directly rotating and translating curves, and the robot is able to adjust its posture and trajectory in real time to eliminate the negative impact of human interference.
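For readers unfamiliar with DMPs, the standard discrete formulation is summarized below. This is the textbook form, not necessarily the paper's improved variant; the notation is assumed for illustration.

```latex
% Standard discrete DMP transformation system: y is one coordinate of the
% pen trajectory, g the stroke goal, x the phase variable, \psi_i radial
% basis functions, and w_i weights learned from the stroke demonstration.
\tau \dot{z} = \alpha_z\bigl(\beta_z (g - y) - z\bigr) + f(x), \qquad
\tau \dot{y} = z, \qquad
\tau \dot{x} = -\alpha_x x,
\qquad
f(x) = \frac{\sum_i \psi_i(x)\, w_i}{\sum_i \psi_i(x)}\, x\,(g - y_0).
```

Because the demonstration enters only through the learned forcing term f(x), the attractor dynamics can be re-shaped online, which is what allows trajectory adjustment under interference.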
Bag manipulation by robots is complex and challenging due to the deformability of the bag. Building on a dynamic manipulation strategy, we propose a new framework, ShakingBot, for bagging tasks. ShakingBot uses a perception module to identify the key region of the plastic bag from arbitrary initial configurations. Based on the segmentation, it iteratively executes a novel set of actions, including Bag Adjustment, Dual-arm Shaking, and One-arm Holding, to open the bag; the dynamic Dual-arm Shaking action can effectively open the bag without needing to account for the crumpled configuration. The robot then inserts the items and lifts the bag for transport. We deploy our method on a dual-arm robot and achieve a success rate of 21/33 for inserting at least one item across various initial bag configurations. In this work, we demonstrate the performance of the dynamic shaking action compared to quasi-static manipulation in the bagging task. We also show that our method generalizes across variations in the bag’s size, pattern, and color. Supplementary material is available at https://github.com/zhangxiaozhier/ShakingBot.
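The iterative structure described above can be summarized as a simple control loop. Every callable below (perception, openness test, and the three action primitives) is a hypothetical placeholder, not ShakingBot's actual interface.

```python
# Schematic of the iterative bag-opening loop: perceive, test openness,
# and apply the three primitives until the bag is open, then fill it.
def open_and_fill_bag(capture, segment_key_region, bag_open_enough,
                      bag_adjustment, dual_arm_shaking, one_arm_holding,
                      insert_items, max_iters=10):
    """Iterate adjust/shake/hold until the bag is open, then fill it."""
    for _ in range(max_iters):
        region = segment_key_region(capture())  # perception on current image
        if bag_open_enough(region):
            break
        bag_adjustment(region)    # re-grasp / reposition on the bag's rim
        dual_arm_shaking()        # dynamic action that opens the bag
        one_arm_holding(region)   # keep the opening from collapsing
    insert_items()                # place items and lift for transport
```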
The demand in industry for flexible grasping of various objects by robotic hands is rapidly growing. To address this, we propose a novel variable stiffness gripper (VSG). The VSG design is based on a parallel-guided beam structure into which a slider is inserted from one end, allowing stiffness to be varied by changing the length of the parallel beams participating in the system. This design enables continuous adjustment between high compliance and high stiffness of the gripper fingers, providing robustness through the mechanical structure itself. A linear analytical model of the deflection and stiffness of the parallel beam, suitable for small and medium deflections, is derived, and the contribution of each beam parameter to the stiffness is analyzed and discussed. A prototype of the VSG is developed, achieving a highly competitive stiffness ratio of 70.9. Moreover, a vision-based force-sensing method utilizing ArUco markers is proposed as a replacement for traditional force sensors. With this method, the VSG is capable of closed-loop control during the grasping process, ensuring efficiency and safety under a well-defined grasping strategy framework. Experimental tests are conducted to demonstrate the importance and safety of stiffness variation; they also show the high performance of the VSG in adaptive grasping in asymmetric scenarios and its ability to flexibly grasp objects of varying hardness and fragility. These findings provide new insights for future developments in the field of variable stiffness grippers.
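The linear stiffness model mentioned above can be grounded in the textbook result for a parallel-guided two-beam flexure; the form below is an assumption about the paper's exact model, given only to fix ideas.

```latex
% Small-deflection tip stiffness of a parallel-guided two-beam flexure:
% E is Young's modulus, I the beam's second moment of area, and L the free
% beam length engaged by the slider.
k(L) = \frac{24\,E I}{L^{3}},
\qquad
\frac{k(L_{\min})}{k(L_{\max})} = \left(\frac{L_{\max}}{L_{\min}}\right)^{3}.
```

Because stiffness scales with the inverse cube of the engaged beam length, even modest slider travel can produce a large stiffness ratio of the order of the 70.9 reported above.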
Computer vision and machine learning are rapidly advancing fields of study. For better or worse, these tools have already permeated our everyday lives and are used for everything from auto-tagging social media images to curating what we view in our news feeds. In this chapter, we discuss historical and contemporary approaches used to study face recognition, detection, manipulation, and generation. We frame our discussion within the context of how this work has been applied to the study of older adults, but also acknowledge that more work is needed both within this domain and at its intersection with, e.g., race and gender. Throughout the chapter we review a number of resources that researchers can start using now in their research, with links collected at the end (Table 11.1). We also discuss ongoing concerns related to the ethics of artificial intelligence and to using this emerging technology responsibly.
Analyzing the appearances of political figures in large-scale news archives is increasingly important with the growing availability of such archives and developments in computer vision. We present a deep learning-based method combining face detection, tracking, and classification that is distinctive in requiring no re-training when targeting new individuals: users need only provide a few images of target individuals to reliably detect, track, and classify them. Extensive validation on prominent political figures in two news archives spanning 10 to 20 years, one containing three U.S. cable news channels and the other two major Japanese news programs, consistently shows the high performance and flexibility of the proposed method. The code is made readily available to the public.
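The no-retraining property described above is typically obtained by embedding matching: a few reference images per person are embedded once, and each detected face is assigned to the nearest reference above a similarity threshold. The sketch below assumes a hypothetical pretrained `embed` function (face image to unit vector) and is not the paper's implementation.

```python
# Sketch of retraining-free identification via embedding matching.
import numpy as np

def build_gallery(reference_images: dict, embed) -> dict:
    """{person: [images]} -> {person: mean unit embedding}."""
    gallery = {}
    for person, imgs in reference_images.items():
        vecs = np.stack([embed(im) for im in imgs])
        mean = vecs.mean(axis=0)
        gallery[person] = mean / np.linalg.norm(mean)
    return gallery

def identify(face_image, gallery: dict, embed, threshold=0.6):
    """Return the best-matching target person, or None if below threshold."""
    v = embed(face_image)
    v = v / np.linalg.norm(v)
    person, sim = max(((p, float(v @ g)) for p, g in gallery.items()),
                      key=lambda t: t[1])
    return person if sim >= threshold else None  # None: not a target figure
```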
In recent years, deep learning-based robotic grasping methods have surpassed analytical methods in grasping performance. Despite these results, most of these methods use only planar grasps due to the high computational cost of 6D grasps. However, planar grasps have spatial limitations that prevent their applicability in complex environments, such as grasping manufactured objects inside 3D printers. Furthermore, some robotic grasping techniques generate only one feasible grasp per object, whereas multiple candidate grasps per object are needed because not every generated grasp is kinematically feasible for the robot manipulator or free of collisions with nearby obstacles. Therefore, a new grasping pipeline is proposed that yields 6D grasps and selects a specific object in the environment while preventing collisions with nearby obstacles. The grasping trials are performed in an additive manufacturing unit, which has a considerable level of complexity due to the high chance of collision. The experimental results show that a considerable success rate can be achieved in grasping additively manufactured objects. A UR5 robot arm, an Intel RealSense D435 camera, and a Robotiq 2F-140 gripper are used to validate the proposed method in real experiments.
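The selection step the pipeline requires, picking one executable grasp out of many 6D candidates, can be sketched as a filter over kinematic feasibility and collision checks. `ik_solution` and `in_collision` below are hypothetical stand-ins for the motion-planning stack, not the paper's API.

```python
# Sketch of selecting one executable grasp from many 6D candidates.
def select_grasp(candidate_grasps, ik_solution, in_collision):
    """Return the best-scored grasp that is reachable and collision-free."""
    for grasp in sorted(candidate_grasps, key=lambda g: g.score, reverse=True):
        q = ik_solution(grasp.pose)        # joint configuration, or None
        if q is not None and not in_collision(q):
            return grasp, q
    return None, None  # no kinematically feasible, collision-free grasp
```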
Site-specific weed management (on the scale of a few meters or less) has the potential to greatly reduce pesticide use and its associated environmental and economic costs. A prerequisite for site-specific weed management is the availability of accurate maps of the weed population that can be generated quickly and cheaply. Improvements and cost reductions in unmanned aerial vehicles (UAVs) and camera technology mean these tools are now readily available for agricultural use. We used UAVs to collect aerial images, in both RGB and multispectral formats, of 12 cereal fields (wheat [Triticum aestivum L.] and barley [Hordeum vulgare L.]) across eastern England. These data were used to train machine learning models to generate prediction maps of locations of black-grass (Alopecurus myosuroides Huds.), a prolific weed in UK cereal fields. We tested machine learning and data set resampling methods to obtain the most accurate system for predicting the presence and absence of weeds in new out-of-sample fields. In such fields, the system predicts the absence of A. myosuroides with 69% accuracy and its presence above 5 g in weight with 77% accuracy. The system generates prediction maps that can be used by either agricultural machinery or autonomous robotic platforms for precision weed management. Accuracy could be improved by increasing the number of fields and samples in the data set and by lengthening the period over which data are collected, so as to cover the entire growing season.
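The map-building step can be sketched as a standard supervised workflow over per-cell spectral features. The sketch below is a minimal assumption-laden stand-in: it handles class imbalance with `class_weight` rather than the paper's resampling methods, and the feature construction from the RGB/multispectral bands is abstracted into `X_bands`.

```python
# Sketch of training a presence/absence model on UAV-derived spectral
# features and turning its output into a 2-D prediction map.
from sklearn.ensemble import RandomForestClassifier

def train_weed_model(X_bands, y_present):
    """X_bands: (n_samples, n_features) spectral features; y: 0/1 presence."""
    model = RandomForestClassifier(n_estimators=500, class_weight="balanced")
    return model.fit(X_bands, y_present)

def predict_weed_map(model, X_grid, grid_shape):
    """Predict presence probability per grid cell, reshaped to a 2-D map."""
    proba = model.predict_proba(X_grid)[:, 1]
    return proba.reshape(grid_shape)
```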
Artificial Intelligence plays a central role in supporting and improving smart manufacturing and Industry 4.0 by enabling the automation of different types of tasks otherwise performed manually by domain experts. In particular, assessing the compliance of a product with its schematic is a time-consuming and error-prone process. In this paper, we address this problem in a specific industrial scenario: we define a Neuro-Symbolic approach for automating the compliance verification of electrical control panels. Our approach is based on the combination of Deep Learning techniques with Answer Set Programming (ASP), and allows for identifying possible anomalies and errors in the final product even when only a very limited amount of training data is available. Experiments conducted on a real test case provided by an Italian company operating in electrical control panel production demonstrate the effectiveness of the proposed approach.
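To illustrate the symbolic half of such a pipeline, the toy sketch below turns (hypothetical) component detections from the neural stage into ASP facts and uses two rules to flag mismatches against the schematic. The predicates and rules are illustrative, not the paper's encoding; the clingo Python API is used for solving.

```python
# Toy Neuro-Symbolic compliance check: detections become ASP facts, and an
# ASP rule flags components that differ from the schematic.
import clingo

PROGRAM = """
detected(contactor, slot1). detected(relay, slot2).
expected(contactor, slot1). expected(breaker, slot2).
anomaly(C, S) :- detected(C, S), not expected(C, S).
anomaly(C, S) :- expected(C, S), not detected(C, S).
#show anomaly/2.
"""

ctl = clingo.Control()
ctl.add("base", [], PROGRAM)
ctl.ground([("base", [])])
ctl.solve(on_model=lambda m: print("Anomalies:", m.symbols(shown=True)))
```

In a full system, the `detected/2` facts would be generated per panel image by the deep learning stage, and the `expected/2` facts by parsing the schematic.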
Lava flow and lava dome growth are two main manifestations of effusive volcanic eruptions. Less-viscous lava tends to flow long distances, depending on slope topography, heat exchange with the surroundings, eruption rate, and the rheology of the erupted magma. When magma is highly viscous, its eruption on the surface results in the formation of a lava dome, and an occasional collapse of the dome may lead to a pyroclastic flow. In this chapter, we consider two models of lava dynamics: a lava flow model used to determine the internal thermal state of the flow from surface thermal observations, and a lava dome growth model used to determine magma viscosity from the observed morphological shape of the dome. Both models belong to a set of inverse problems. In the first model, the thermal conditions of the lava at the surface (at the interface between lava and air) are known from observations, but its internal thermal state is unknown; a variational (adjoint) assimilation method is used to propagate the temperature and heat flow inferred from surface measurements into the interior of the lava flow. In the second model, the lava dome viscosity is estimated by comparing the observed and simulated morphological shapes of the dome using computer vision techniques.
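A generic form of the variational problem behind the first model is sketched below; the notation is assumed for illustration and not taken from the chapter.

```latex
% Variational (adjoint) data assimilation for the lava flow model: find the
% internal temperature field T minimizing the surface misfit
J(T) = \frac{1}{2}\int_{\Gamma_{\mathrm{surf}}}
        \left(T - T^{\mathrm{obs}}\right)^{2}\,\mathrm{d}\Gamma
\;\longrightarrow\; \min,
% subject to the governing heat-transfer equations as constraints; the
% gradient of J is obtained by solving the corresponding adjoint equation
% backward, propagating surface misfits into the flow interior.
```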
The effects of anthropogenic aerosol (solid or liquid particles suspended in the air) are the biggest contributor to uncertainty in current climate perturbations. Heavy industry sites, such as coal power plants and steel manufacturers, are large sources of greenhouse gases and also emit large amounts of aerosol in a small area. This makes them ideal places to study aerosol interactions with radiation and clouds. However, existing data sets of heavy industry locations are either not public or suffer from reporting gaps. Here, we develop a supervised deep learning algorithm to detect unreported industry sites in high-resolution satellite data, using the existing data sets for training. For the pipeline to be viable at global scale, we employ a two-step approach: the first step scans 10 m resolution data for potential industry sites, and the second uses 1.2 m resolution images to confirm or reject detections. On held-out test data, the models perform well, with the lower-resolution one reaching up to 94% accuracy. Deployed to a large test region, the first-stage model yields many false positive detections; the second-stage, higher-resolution model shows promising results at filtering these out while keeping the true positives, improving the overall precision to 42%, so that human review becomes feasible. In the deployment area, we find five new heavy industry sites that were not in the training data. This demonstrates that the approach can be used to complement existing data sets of heavy industry sites.
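The two-step cascade described above has a simple structure: a cheap model scans coarse tiles, and a second model re-checks each hit on a high-resolution crop. The sketch below is a schematic with hypothetical model and imagery accessors, not the authors' pipeline.

```python
# Sketch of the two-step detection cascade: coarse scan, then fine filter.
def detect_industry_sites(tiles_10m, fetch_highres_crop,
                          coarse_model, fine_model,
                          t_coarse=0.5, t_fine=0.5):
    """Return bounds of sites confirmed by both stages."""
    confirmed = []
    for tile in tiles_10m:
        if coarse_model(tile.pixels) < t_coarse:
            continue                               # stage 1: cheap global scan
        crop = fetch_highres_crop(tile.bounds)     # 1.2 m imagery for this hit
        if fine_model(crop) >= t_fine:
            confirmed.append(tile.bounds)          # stage 2: filter false positives
    return confirmed
```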
For greater autonomy of visual control-based solutions, especially as applied to mobile robots, it is necessary to account for unevenness in the navigation surface, an intrinsic characteristic of many real applications. In general, depth information is essential both for navigating three-dimensional environments and for consistent parameter calibration of the visual models. This work proposes a new solution that includes depth information in the visual path-following (VPF) problem, allowing the perception horizon to vary at runtime while enforcing the coupling between optical and geometric quantities. A new nonlinear model predictive control (NMPC) framework, which adds a new input to an original solution of the constrained VPF-NMPC, keeps the computational complexity low. Experimental results in an outdoor environment with a medium-sized commercial robot demonstrate the correctness of the proposal.
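For readers unfamiliar with NMPC, a generic receding-horizon skeleton is sketched below, only to fix ideas; the paper's VPF-specific cost, dynamics, and constraints are not reproduced.

```python
# Generic receding-horizon (NMPC) skeleton: optimize a control sequence over
# the horizon against a prediction model, apply only the first input.
import numpy as np
from scipy.optimize import minimize

def nmpc_step(x0, dynamics, stage_cost, horizon=10, n_u=2):
    """Return the first control of the optimized sequence from state x0."""
    def total_cost(u_flat):
        u_seq = u_flat.reshape(horizon, n_u)
        x, cost = x0, 0.0
        for u in u_seq:
            cost += stage_cost(x, u)
            x = dynamics(x, u)      # one-step prediction model
        return cost
    res = minimize(total_cost, np.zeros(horizon * n_u), method="SLSQP")
    return res.x.reshape(horizon, n_u)[0]   # receding horizon: apply u_0 only
```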
The goal of few-shot semantic segmentation is to learn a segmentation model that can segment novel classes in queries when only a few annotated support examples are available. Due to large intra-class variations, building accurate semantic correlations remains a challenging job. Current methods typically use 4D kernels to learn the semantic correlation of feature maps. However, they still face the challenge of reducing the consumption of computation and memory while preserving the usefulness of the correlations they mine. In this paper, we propose the adaptively mining correlation network (AMCNet) to alleviate the aforementioned issues. The key points of AMCNet are the proposed adaptive separable 4D kernel and the learnable pyramid correlation module, which form the basic block of the correlation encoder and provide a learnable concatenation operation over pyramid correlation tensors, respectively. Experiments on the PASCAL VOC 2012 dataset show that our AMCNet surpasses the state-of-the-art method by $0.7\%$ and $2.2\%$ on 1-shot and 5-shot segmentation scenarios, respectively.
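The 4D correlation tensor such kernels operate on can be built as the pairwise cosine similarity between every query location and every support location. The PyTorch sketch below shows only this construction; AMCNet's separable kernels and pyramid module are not reproduced here.

```python
# Sketch of the 4-D correlation tensor used in few-shot segmentation:
# cosine similarity between all query/support feature-map locations.
import torch
import torch.nn.functional as F

def correlation_4d(query_feat: torch.Tensor, support_feat: torch.Tensor):
    """(C, Hq, Wq) x (C, Hs, Ws) -> (Hq, Wq, Hs, Ws) cosine correlation."""
    q = F.normalize(query_feat.flatten(1), dim=0)    # (C, Hq*Wq), unit columns
    s = F.normalize(support_feat.flatten(1), dim=0)  # (C, Hs*Ws), unit columns
    corr = torch.einsum("cq,cs->qs", q, s)           # (Hq*Wq, Hs*Ws)
    return corr.view(*query_feat.shape[1:], *support_feat.shape[1:])
```

A 4D kernel then convolves over both spatial index pairs of this tensor, which is why reducing its computation and memory cost, e.g. by separable factorization, matters.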