While driver telematics has gained attention for risk classification in auto insurance, the scarcity of observations with telematics features has been problematic. This scarcity may stem either from privacy concerns or from favorable selection relative to policyholders represented only by traditional features.
To handle this issue, we apply a data integration technique based on calibration weights for usage-based insurance with multiple sources of data. It is shown that the proposed framework can efficiently integrate traditional data and telematics data and can also deal with possible favorable selection issues related to telematics data availability. Our findings are supported by a simulation study and empirical analysis in a synthetic telematics dataset.
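The calibration-weighting idea behind such data integration can be illustrated with a minimal linear (GREG-style) calibration sketch: design weights are adjusted so that weighted totals of auxiliary variables match known population benchmarks. The function name, data, and totals below are illustrative, not taken from the paper:

```python
import numpy as np

def linear_calibration(d, X, totals):
    """Linear (GREG) calibration: adjust design weights d so that the
    weighted totals of the auxiliary variables X hit known benchmarks."""
    # Solve for lambda in: sum_i d_i * (1 + x_i' lambda) * x_i = totals
    A = (X * d[:, None]).T @ X                  # sum_i d_i x_i x_i'
    lam = np.linalg.solve(A, totals - X.T @ d)
    return d * (1.0 + X @ lam)                  # calibrated weights

# Toy example: two auxiliary variables with assumed known population totals
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2)) + 1.0
d = np.full(50, 2.0)                            # uniform design weights
totals = np.array([120.0, 95.0])
w = linear_calibration(d, X, totals)
assert np.allclose(X.T @ w, totals)             # calibration constraint holds
```

After calibration, estimates computed with `w` reproduce the benchmark totals exactly, which is the mechanism that lets a telematics subsample be re-weighted to represent the full traditional-data portfolio.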
Precision Medicine is an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle. Autoimmune diseases are those in which the body’s natural defense system loses discriminating power between its own cells and foreign cells, causing the body to mistakenly attack healthy tissues. These conditions are very heterogeneous in their presentation and therefore difficult to diagnose and treat. Achieving precision medicine in autoimmune diseases has been challenging due to the complex etiologies of these conditions, involving an interplay between genetic, epigenetic, and environmental factors. However, recent technological and computational advances in molecular profiling have helped identify patient subtypes and molecular pathways which can be used to improve diagnostics and therapeutics. This review discusses the current understanding of the disease mechanisms, heterogeneity, and pathogenic autoantigens in autoimmune diseases gained from genomic and transcriptomic studies and highlights how these findings can be applied to better understand disease heterogeneity in the context of disease diagnostics and therapeutics.
Today, there is a growing movement to use accumulated archaeological information to contribute to discussions of general issues facing human societies, including our own. In this regard, the archaeological record is most unique and helpful when viewed at broad comparative scales. Most relevant data for these sorts of analyses are collected through the cultural resource management (CRM) process. Still, by and large, interpretation remains limited to individual projects, and data integration across projects is nearly nonexistent. What would it take for CRM to achieve real data integration? In this article, we discuss these issues and suggest one potential solution. The most pressing need we identify is for data products that integrate the primary data emanating from CRM at broad spatial and temporal scales, which are suitable for research by archaeologists and other social scientists. We argue that the time is right for the discipline to invest in organizations that produce such products.
Most archaeological investigations in the United States and other countries must comply with preservation laws, especially if they are on government property or supported by government funding. Academic and cultural resource management (CRM) studies have explored various social, temporal, and environmental contexts and produce an ever-increasing volume of archaeological data. More and more data are born digital, and many legacy data are digitized. There is a building effort to synthesize and integrate data at a massive scale and create new data standards and management systems. Taxpayer dollars often fund archaeological studies that are intended, in spirit, to promote historic preservation and provide public benefits. However, the resulting data are difficult to access and interoperationalize, and they are rarely collected and managed with their long-term security, accessibility, and ethical reuse in mind. Momentum is building toward open data and open science as well as Indigenous data sovereignty and governance. The field of archaeology is reaching a critical point where consideration of diverse constituencies, concerns, and requirements is needed to plan data collection and management approaches moving forward. This theme issue focuses on challenges and opportunities in archaeological data collection and management in academic and CRM contexts.
Genetics has been an important tool for discovering new aspects of biology across life. In humans, there is growing momentum behind the application of this knowledge to drive innovation in clinical care, most notably through developments in precision medicine. Nowhere has the impact of genetics on clinical practice been more striking than in the field of rare disorders. For most of these conditions, individual disease susceptibility is influenced by DNA sequence variation in a single or a small number of genes. In contrast, most common disorders are multifactorial and are caused by a complex interplay of multiple genetic, environmental and stochastic factors. The longstanding division of human disease genetics into rare and common components has obscured the continuum of human traits and echoes aspects of the century-old debate between the Mendelian and biometric views of human genetics. In this article, we discuss the differences in data and concepts between rare and common disease genetics. Opportunities to unify these two areas are noted and the importance of adopting a holistic perspective that integrates diverse genetic and environmental factors is discussed.
This chapter discusses several key directions, such as data analytics in cyber-physical systems, multidomain mining, machine learning concepts such as deep learning and generative adversarial networks, and the challenges of model reuse. Last but not least, the chapter closes with thoughts on ethical thinking in the data analytics process.
Chapter one sets the context of linked data for research. It describes the ways in which linked data is being used to improve diagnosis, treatment and healthcare delivery and to understand the drivers of health. The advantages of using linked data for research are discussed. The chapter surveys the kinds of data currently being linked for research and different linkage methods and considers the potential and challenges for future international data linkage.
Health research around the world relies on access to data, and much of the most valuable, reliable, and comprehensive data collections are held by governments. These collections, which contain data on whole populations, are a powerful tool in the hands of researchers, especially when they are linked and analyzed, and can help to address “wicked problems” in health and emerging global threats such as COVID-19. At the same time, these data collections contain sensitive information that must only be used in ways that respect the values, interests, and rights of individuals and their communities. Sharing Linked Data for Health Research provides a template for allowing research access to government data collections in a regulatory environment designed to build social license while supporting the research enterprise.
Industrial Data Analytics needs access to huge amounts of data, which is scattered across different IT systems. As part of an integrated reference kit for Industrial Data Analytics, there is a need for a data backend system that provides access to data. This system needs to have solutions for the extraction of data, the management of data and an analysis pipeline for those data. This paper presents an approach for this data backend system.
The authors explain in this work a new approach to observing and controlling linear systems whose inputs and outputs are not fixed in advance. They cover a class of linear time-invariant state/signal system that is general enough to include most of the standard classes of linear time-invariant dynamical systems, but simple enough that it is easy to understand the fundamental principles. They begin by explaining the basic theory of finite-dimensional and bounded systems in a way suitable for graduate courses in systems theory and control. They then proceed to the more advanced infinite-dimensional setting, opening up new ways for researchers to study distributed parameter systems, including linear port-Hamiltonian systems and boundary triplets. They include the general non-passive part of the theory in continuous and discrete time, and provide a short introduction to the passive situation. Numerous examples from circuit theory are used to illustrate the theory.
Low-accruing clinical trials delay translation of research breakthroughs into the clinic, expose participants to risk without providing meaningful clinical insight, increase the cost of therapies, and waste limited resources. By tracking patient accrual, Clinical and Translational Science Awards hubs can identify at-risk studies and provide them the support needed to reach recruitment goals and maintain financial solvency. However, tracking accrual has proved challenging because relevant patient- and protocol-level data often reside in siloed systems. To address this fragmentation, in September 2020 the South Carolina Clinical and Translational Research Institute, with an academic home at the Medical University of South Carolina, implemented a clinical trial management system (CTMS), with its access to patient-level data, and incorporated it into its Research Integrated Network of Systems (RINS), which links study-level data across disparate systems relevant to clinical research. Within the first year of CTMS implementation, 324 protocols were funneled through CTMS/RINS, with more than 2600 participants enrolled. Integrated data from CTMS/RINS have enabled near-real-time assessment of patient accrual and accelerated reimbursement from industry sponsors. For institutions with bioinformatics or programming capacity, the CTMS/RINS integration provides a powerful model for tracking and improving clinical trial efficiency, compliance, and cost-effectiveness.
This article presents the background to and prospects for a new initiative in archaeological field survey and database integration. The Roman Hinterland Project combines data from the Tiber Valley Project, Roman Suburbium Project, and the Pontine Region Project into a single database, which the authors believe to be one of the most complete repositories of data for the hinterland of a major ancient metropolis, covering nearly 2000 years of history. The logic of combining these databases in the context of studying the Roman landscape is explained and illustrated with analyses that show their capacity to contribute to major debates in Roman economy, demography, and the longue durée of the human condition in a globalizing world.
Personalized medicine has established wearable sensors as a new source of biomedical data, which were expected to accrue annual data storage costs of approximately $7.2 trillion (>2000 exabytes) by 2020. To improve the usability of wearable devices in healthcare, it is necessary to determine the minimum amount of data needed for accurate health assessment.
Methods:
Here, we present a generalizable optimization framework for determining the minimum necessary sampling rate for wearable sensors and apply our method to determine optimal optical blood volume pulse sampling rate. We implement t-tests, Bland–Altman analysis, and regression-based visualizations to identify optimal sampling rates of wrist-worn optical sensors.
Results:
We determine the optimal sampling rate of wrist-worn optical sensors for heart rate and heart rate variability monitoring to be 21–64 Hz, depending on the metric.
Conclusions:
Determining the optimal sampling rate allows us to compress biomedical data and reduce storage needs and financial costs. We have used optical heart rate sensors as a case study for the connection between data volumes and resource requirements to develop methodology for determining the optimal sampling rate for clinical relevance that minimizes resource utilization. This methodology is extensible to other wearable sensors.
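The core optimization question, how far a signal can be downsampled before the derived metric degrades, can be sketched on synthetic data. This is a simplified stand-in for the paper's t-test and Bland-Altman analysis, using a dominant-frequency heart-rate estimate; all names and parameters are illustrative:

```python
import numpy as np

def hr_from_ppg(signal, fs):
    """Estimate heart rate (bpm) as the dominant frequency of a pulse signal."""
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs >= 0.7) & (freqs <= 3.5)       # plausible 42-210 bpm range
    return 60.0 * freqs[band][np.argmax(spectrum[band])]

# Synthetic 70 bpm pulse sampled at a 256 Hz reference rate
fs_ref = 256
t = np.arange(0, 60, 1.0 / fs_ref)
ppg = np.sin(2 * np.pi * (70.0 / 60.0) * t)

hr_ref = hr_from_ppg(ppg, fs_ref)
for factor in (4, 8, 12):                        # ~64, 32, 21 Hz
    hr = hr_from_ppg(ppg[::factor], fs_ref / factor)
    assert abs(hr - hr_ref) < 0.5                # estimate survives downsampling
```

On this clean synthetic pulse the heart-rate estimate is unchanged down to roughly 21 Hz, consistent with the direction of the reported 21–64 Hz range; a real analysis would repeat this across noisy recordings and subjects.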
The Human Brain Project (HBP), an EU Flagship Initiative, is currently building an infrastructure that will allow integration of large amounts of heterogeneous neuroscience data. The ultimate goal of the project is to develop a unified multi-level understanding of the brain and its diseases, and beyond this to emulate the computational capabilities of the brain. Reference atlases of the brain are one of the key components in this infrastructure. Based on a new generation of three-dimensional (3D) reference atlases, new solutions for analyzing and integrating brain data are being developed. HBP will build services for spatial query and analysis of brain data comparable to current online services for geospatial data. The services will provide interactive access to a wide range of data types that have information about anatomical location tied to them. The 3D volumetric nature of the brain, however, introduces a new level of complexity that requires a range of tools for making use of and interacting with the atlases. With such new tools, neuroscience research groups will be able to connect their data to atlas space, share their data through online data systems, and search and find other relevant data through the same systems. This new approach partly replaces earlier attempts to organize research data based only on a set of semantic terminologies describing the brain and its subdivisions.
Genebanks play an important role in the conservation of global plant biodiversity. The European Search Catalogue for Plant Genetic Resources (EURISCO) was created as a central entry point to provide information on these collections. However, a major challenge lies in the heterogeneity of scientific plant names. This makes the selection of suitable plant material, e.g. for research or breeding purposes, significantly more difficult. For this reason, the taxonomic backbone of EURISCO has been completely revised. Search terms entered by users are now automatically checked against taxonomic reference repositories, allowing a variety of synonyms to be identified. In addition, a fuzzy search has been implemented, which makes the search function tolerant of erroneous data (e.g. caused by typing errors). Besides improvements of the search interface, more support will be given to EURISCO's data providers. The new developments provide a tool that makes it easier to identify problem cases within the data, such as accepted/non-accepted taxonomic names, and will successively improve the quality of taxonomic information in EURISCO.
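The typo-tolerant search described above can be approximated with standard fuzzy string matching. The sketch below uses Python's `difflib` rather than whatever matcher EURISCO actually deploys, and the reference names are an illustrative toy list:

```python
import difflib

def fuzzy_match(query, names, cutoff=0.8):
    """Return up to three reference names closest to a (possibly misspelled) query."""
    return difflib.get_close_matches(query.strip().lower(),
                                     [n.lower() for n in names],
                                     n=3, cutoff=cutoff)

reference = ["Triticum aestivum", "Hordeum vulgare", "Zea mays", "Avena sativa"]
# A typing error ('aestivun') still resolves to the accepted name
matches = fuzzy_match("Triticum aestivun", reference)
assert matches == ["triticum aestivum"]
```

In a production taxonomic backbone, this string-similarity step would sit behind a synonym lookup against the taxonomic reference repositories, so that both misspellings and legitimate synonyms resolve to one accepted name.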
In this paper we use formal tools from category theory to develop a foundation for creating and managing models in systems where knowledge is distributed across multiple representations and formats. We define a class of models which incorporate three different representations (computations, logical semantics, and data) as well as model mappings (functors) to establish relationships between them. We prove that our models support model merge operations called colimits and use these to define a methodology for model integration.
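A concrete instance of a colimit-based merge is the pushout of sets: two models are glued along a shared vocabulary by identifying the images of each shared element. The sketch below implements this with a union-find; it is a minimal illustration of the general construction, and all model and element names are invented:

```python
def pushout(shared, f, g, model_a, model_b):
    """Pushout of two models along a shared vocabulary: glue the elements
    of model_a and model_b that are images of the same shared element."""
    # Union-find over the disjoint union of both models
    parent = {("A", x): ("A", x) for x in model_a}
    parent.update({("B", y): ("B", y) for y in model_b})

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]    # path compression
            v = parent[v]
        return v

    for s in shared:                         # identify f(s) with g(s)
        a, b = find(("A", f[s])), find(("B", g[s]))
        if a != b:
            parent[a] = b

    classes = {}
    for v in list(parent):
        classes.setdefault(find(v), set()).add(v)
    return list(classes.values())

# Two schemas sharing the concept 'person'
merged = pushout({"person"},
                 {"person": "employee"},     # person -> employee in A
                 {"person": "student"},      # person -> student in B
                 {"employee", "salary"},
                 {"student", "grade"})
assert {("A", "employee"), ("B", "student")} in merged   # glued concept
assert len(merged) == 3                      # merged concept, salary, grade
```

The equivalence classes returned are exactly the elements of the pushout object; the inclusion of each original model into the merge is the pair of mappings the paper calls functors into the colimit.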
The objective of this paper is to accurately determine mobile robots' position and orientation by integrating information received from odometry and an inertial sensor. The position and orientation provided by odometry are subject to different types of errors. To improve the odometry, an inertial measurement unit is exploited to give more reliable attitude information. However, the nonlinear dynamics of these systems and complexities such as multiple error sources make navigation difficult. Since the dynamic models of navigation systems are nonlinear in practice, a Cubature Kalman Filter (CKF) is proposed in this study to estimate and correct the errors of these systems. The information from odometry and a gyroscope is integrated using a CKF. Simulation results illustrate the superiority and higher reliability of the proposed approach in comparison with conventional nonlinear filtering algorithms such as the Extended Kalman Filter (EKF).
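The CKF's distinguishing step is propagating 2n equally weighted cubature points (at ±√n times the columns of a covariance square root) through the nonlinear dynamics, instead of linearizing as the EKF does. The time-update sketch below uses a toy unicycle odometry model; the state layout, noise values, and function names are illustrative, not the paper's:

```python
import numpy as np

def cubature_points(x, P):
    """2n cubature points of the spherical-radial rule for state x, cov P."""
    n = len(x)
    S = np.linalg.cholesky(P)
    pts = np.hstack([np.sqrt(n) * S, -np.sqrt(n) * S])   # n x 2n offsets
    return x[:, None] + pts

def ckf_predict(x, P, f, Q):
    """CKF time update: propagate cubature points through dynamics f."""
    pts = cubature_points(x, P)
    prop = np.apply_along_axis(f, 0, pts)                # f on each point
    x_pred = prop.mean(axis=1)                           # equal weights 1/2n
    diff = prop - x_pred[:, None]
    return x_pred, diff @ diff.T / pts.shape[1] + Q

# Toy unicycle odometry step: state [x, y, heading], constant speed v
def motion(s, v=1.0, dt=0.1):
    x, y, th = s
    return np.array([x + v * dt * np.cos(th), y + v * dt * np.sin(th), th])

x0 = np.array([0.0, 0.0, 0.1])
P0 = np.diag([0.01, 0.01, 0.005])
Q = 1e-4 * np.eye(3)
x1, P1 = ckf_predict(x0, P0, motion, Q)
assert np.all(np.linalg.eigvalsh(P1) > 0)                # covariance stays PD
```

The measurement update follows the same pattern (propagate the points through the observation model, then form cross-covariances for the Kalman gain), which is where the gyroscope attitude information would enter.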
The need for coordinated regional and global electronic databases to assist prevention, early detection, rapid response, and control of biological invasions is well accepted. The Pacific Basin Information Node (PBIN), a node of the National Biological Information Infrastructure, has been increasingly engaged in the invasive species enterprise since its establishment in 2001. Since this time, PBIN has sought to support frontline efforts at combating invasions, through working with stakeholders in conservation, agriculture, forestry, health, and commerce to support joint information needs. Although initial emphasis has been on Hawaii, cooperative work with other Pacific islands and countries of the Pacific Rim is already underway and planned.
The field of disease ecology – the study of the spread and impact of parasites and pathogens within their host populations and communities – has a long history of using mathematical models. Dating back over 100 years, researchers have used mathematics to describe the spread of disease-causing agents, understand the relationship between host density and transmission and plan control strategies. The use of mathematical modelling in disease ecology exploded in the late 1970s and early 1980s through the work of Anderson and May (Anderson and May, 1978, 1981, 1992; May and Anderson, 1978), who developed the fundamental frameworks for studying microparasite (e.g. viruses, bacteria and protozoa) and macroparasite (e.g. helminth) dynamics, emphasizing the importance of understanding features such as the parasite's basic reproduction number (R0) and critical community size that form the basis of disease ecology research to this day. Since the initial models of disease population dynamics, which primarily focused on human diseases, theoretical disease research has expanded hugely to encompass livestock and wildlife disease systems, and also to explore evolutionary questions such as the evolution of parasite virulence or drug resistance. More recently there have been efforts to broaden the field still further, to move beyond the standard ‘one-host-one-parasite’ paradigm of the original models, to incorporate many aspects of complexity of natural systems, including multiple potential host species and interactions among multiple parasite species.