Current World Health Organization (WHO) reports claim a decline in COVID-19 testing and reporting of new infections. To discuss the consequences of ignoring severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) infection, the endemic characteristics of the disease in 2023 are compared with those estimated earlier from 2022 data sets. The accumulated numbers of cases and deaths reported to the WHO by the 10 most infected countries, together with global figures, were used to calculate the average daily numbers of cases (DCC) and deaths (DDC) per capita and case fatality rates (CFR = DDC/DCC) for two periods in 2023. In some countries, the DDC values can be higher than the upper 2022 limit and exceed seasonal influenza mortality. The increase in CFR in 2023 shows that SARS-CoV-2 infection is still dangerous. The numbers of COVID-19 cases and deaths per capita in 2022 and 2023 do not demonstrate downward trends with the increase in the percentages of fully vaccinated and boosted people. The reasons may be both rapid mutations of the coronavirus, which reduced the effectiveness of vaccines and led to a large number of re-infections, and inappropriate management.
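As a rough illustration of the ratios defined above, the following sketch computes DCC, DDC, and CFR from accumulated case and death counts at the start and end of a reporting period; the function and all figures are hypothetical, not values from the WHO data sets.

```python
def covid_ratios(cases_start, cases_end, deaths_start, deaths_end,
                 days, population):
    """Return average daily cases (DCC) and deaths (DDC) per capita
    and the case fatality rate (CFR = DDC / DCC) for one period."""
    dcc = (cases_end - cases_start) / (days * population)
    ddc = (deaths_end - deaths_start) / (days * population)
    cfr = ddc / dcc if dcc > 0 else float("nan")
    return dcc, ddc, cfr

# Hypothetical accumulated figures for a country of 40 million people
# over a 90-day period in 2023.
dcc, ddc, cfr = covid_ratios(cases_start=5_400_000, cases_end=5_490_000,
                             deaths_start=110_000, deaths_end=110_900,
                             days=90, population=40_000_000)
print(f"DCC={dcc:.2e}, DDC={ddc:.2e}, CFR={cfr:.3f}")
```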
Because conducting experimental coinfections is intractable in most parasite systems, inferences about the presence and strength of interspecific interactions in parasite communities are often made from analyses of field data. It is unclear whether methods used to test for competition are able to detect competition in field-collected datasets. Data from a study of the intestinal helminth communities of creek chub (Semotilus atromaculatus) were used to explore the potential of commonly available methods to detect negative interactions among parasite species in species-poor, low-intensity communities. Model communities were built in the absence of competition and then modified by four modes of competition. Both parametric and null model approaches were utilized to analyze the modelled parasite communities to determine the conditions under which competitive interactions were discerned. Correlations had low Type I error rates but did not reliably detect competition, when present, at a statistically significant level. Results from logistic regressions were similar but showed improved statistical power. Results from null model approaches varied. Envelope analyses had near-ideal properties when parasite prevalence was high but had high Type I error rates in low-prevalence communities. Co-occurrence analyses demonstrated promising results with certain co-occurrence metrics and randomization algorithms, but also had many more cases of failure to detect competition when it was present and/or to reject competition when it was absent. No analytical approach was clearly superior, and the variability observed in the present investigation mirrors similar efforts, suggesting that clear guidelines for detecting competition in parasite communities with observational data will be elusive.
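As an illustration of the kind of null-model co-occurrence analysis mentioned above, the following sketch computes a C-score for a host-by-parasite presence/absence matrix and compares it with matrices randomized under a simple equiprobable null model; the matrix and the randomization scheme are illustrative assumptions, not the study's exact metrics or algorithms.

```python
import random
from itertools import combinations

def c_score(matrix):
    """Mean number of 'checkerboard units' over all pairs of parasite species
    (columns); higher values suggest segregation (possible competition)."""
    cols = list(zip(*matrix))
    scores = []
    for a, b in combinations(cols, 2):
        ra, rb = sum(a), sum(b)
        shared = sum(1 for x, y in zip(a, b) if x and y)
        scores.append((ra - shared) * (rb - shared))
    return sum(scores) / len(scores)

def null_distribution(matrix, n_iter=1000, seed=0):
    """Shuffle presences across the whole matrix, keeping the total fill fixed."""
    rng = random.Random(seed)
    flat = [cell for row in matrix for cell in row]
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = []
    for _ in range(n_iter):
        rng.shuffle(flat)
        shuffled = [flat[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]
        out.append(c_score(shuffled))
    return out

# Hypothetical hosts (rows) x parasite species (columns) presence/absence matrix.
obs = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 1, 0]]
null = null_distribution(obs)
p = sum(1 for s in null if s >= c_score(obs)) / len(null)
print(f"observed C-score={c_score(obs):.2f}, one-sided p={p:.3f}")
```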
This first chapter introduces our unique approach to teaching statistics. We note that while we review the statistical formulas for each method, we focus on the practical component of statistical analysis. We teach the readers how to apply and interpret the statistical methods and results. We then briefly describe the book’s content, which includes a concise explanation of the statistical techniques covered in each chapter. We end the chapter with suggestions on using the book to gain maximum benefit.
The recent progress of deep learning techniques has produced models capable of achieving high scores on traditional Natural Language Inference (NLI) datasets. To understand the generalization limits of these powerful models, an increasing number of adversarial evaluation schemes have appeared. These works use a similar evaluation method: they construct a new NLI test set based on sentences with known logic and semantic properties (the adversarial set), train a model on a benchmark NLI dataset, and evaluate it on the new set. Poor performance on the adversarial set is identified as a model limitation. The problem with this evaluation procedure is that it may only indicate a sampling problem. A machine learning model can perform poorly on a new test set because the text patterns presented in the adversarial set are not well represented in the training sample. To address this problem, we present a new evaluation method, the Invariance under Equivalence test (IE test). The IE test trains a model with sufficient adversarial examples and checks the model’s performance on two equivalent datasets. As a case study, we apply the IE test to state-of-the-art NLI models using synonym substitution as the form of adversarial examples. The experiment shows that, despite their high predictive power, these models usually produce different inference outputs for equivalent inputs, and, more importantly, this deficiency cannot be solved by adding adversarial observations to the training data.
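The following sketch shows the invariance check at the heart of the IE test as described above: a trained model's predictions on a test set are compared with its predictions on an equivalent set built by synonym substitution. The `model.predict` interface, the synonym table, and the data format are assumptions for illustration, not the paper's actual code.

```python
def synonym_substitute(premise, hypothesis, synonyms):
    """Replace listed words with synonyms to build an equivalent example."""
    sub = lambda text: " ".join(synonyms.get(w, w) for w in text.split())
    return sub(premise), sub(hypothesis)

def invariance_rate(model, pairs, synonyms):
    """Fraction of examples whose label is unchanged on the equivalent input."""
    unchanged = 0
    for premise, hypothesis in pairs:
        original = model.predict(premise, hypothesis)
        equivalent = model.predict(*synonym_substitute(premise, hypothesis, synonyms))
        unchanged += original == equivalent
    return unchanged / len(pairs)

class MajorityModel:
    """Stand-in classifier that always predicts 'entailment' (demo only)."""
    def predict(self, premise, hypothesis):
        return "entailment"

pairs = [("A man is sleeping", "A person is asleep")]
synonyms = {"man": "gentleman", "sleeping": "dozing", "asleep": "dozing"}
print(invariance_rate(MajorityModel(), pairs, synonyms))
```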
Cross-sectional studies are a type of observational study in which the researcher commonly assesses the exposure, outcome, and other variables (such as confounding variables) at the same time. They are also referred to as “prevalence studies.” These studies are useful in a range of disciplines across the social and behavioral sciences. The common statistical estimates from these studies are correlation values, prevalence estimates, prevalence odds ratios, and prevalence ratios. These studies can be completed relatively quickly, are relatively inexpensive to conduct, and may be used to generate new hypotheses. However, the major limitations of these studies are biases due to sampling, length-time bias, same-source bias, and the inability to establish a clear temporal association between exposure and outcome in many scenarios. The researcher should be careful when interpreting the measure of association from these studies, as it may not be appropriate to make causal inferences from these associations.
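As a brief illustration of two of the estimates named above, the following sketch computes a prevalence ratio and a prevalence odds ratio from a hypothetical 2×2 exposure-by-outcome table; the cell counts are invented for illustration.

```python
def prevalence_measures(a, b, c, d):
    """a = exposed with outcome, b = exposed without,
    c = unexposed with outcome, d = unexposed without."""
    prev_exposed = a / (a + b)
    prev_unexposed = c / (c + d)
    prevalence_ratio = prev_exposed / prev_unexposed
    prevalence_odds_ratio = (a / b) / (c / d)   # equivalently a*d / (b*c)
    return prevalence_ratio, prevalence_odds_ratio

pr, por = prevalence_measures(a=40, b=160, c=20, d=180)
print(f"prevalence ratio={pr:.2f}, prevalence odds ratio={por:.2f}")
```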
Economic evaluations have been increasingly conducted in different countries to aid national decision-making bodies in resource allocation problems, based on current and prospective evidence on cost and effect data for a set of competing health care interventions. In 2016, the Dutch National Health Care Institute issued new guidelines that aggregated and updated previous recommendations on key elements for conducting economic evaluation. However, the impact of the guidelines on standard practice, in terms of design, methodology, and reporting choices, is still uncertain. To assess this impact, we examine and compare key analysis components of economic evaluations conducted in the Netherlands before (2010–2015) and after (2016–2020) the introduction of the recent guidelines. We specifically focus on two aspects of the analysis that are crucial in determining the plausibility of the results: statistical methodology and missing data handling. Our review shows how, over the later period, many components of economic evaluations have changed in accordance with the new recommendations towards more transparent and advanced analytic approaches. However, potential limitations are identified: less advanced statistical software is still used, and the information provided to support the choice of missing data methods is rarely satisfactory, especially in sensitivity analysis.
Latent semantic analysis (LSA) and correspondence analysis (CA) are two techniques that use a singular value decomposition for dimensionality reduction. LSA has been extensively used to obtain low-dimensional representations that capture relationships among documents and terms. In this article, we present a theoretical analysis and comparison of the two techniques in the context of document-term matrices. We show that CA has some attractive properties as compared to LSA, for instance that effects of margins, that is, sums of row elements and column elements, arising from differing document lengths and term frequencies are effectively eliminated so that the CA solution is optimally suited to focus on relationships among documents and terms. A unifying framework is proposed that includes both CA and LSA as special cases. We empirically compare CA to various LSA-based methods on text categorization in English and authorship attribution on historical Dutch texts and find that CA performs significantly better. We also apply CA to a long-standing question regarding the authorship of the Dutch national anthem Wilhelmus and provide further support that it can be attributed to the author Datheen, among several contenders.
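To make the contrast concrete, the following sketch applies a truncated SVD directly to a small document-term count matrix (as in LSA) and to the standardized residuals that correspondence analysis decomposes, so the margin effects mentioned above are removed before the SVD. The 3×4 count matrix is invented and the code is an illustrative outline, not the authors' implementation.

```python
import numpy as np

N = np.array([[10, 2, 0, 3],
              [4, 8, 1, 0],
              [0, 1, 12, 5]], dtype=float)   # documents x terms (hypothetical)

# LSA: truncated SVD applied directly to the document-term matrix.
U, s, Vt = np.linalg.svd(N, full_matrices=False)
lsa_docs = U[:, :2] * s[:2]                  # 2-dimensional document coordinates

# CA: SVD of standardized residuals, which removes row/column margin effects
# (document length and term frequency).
P = N / N.sum()
r = P.sum(axis=1)                            # row (document) masses
c = P.sum(axis=0)                            # column (term) masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U2, s2, Vt2 = np.linalg.svd(S, full_matrices=False)
ca_docs = (U2[:, :2] * s2[:2]) / np.sqrt(r)[:, None]   # principal coordinates

print("LSA document coordinates:\n", np.round(lsa_docs, 2))
print("CA document coordinates:\n", np.round(ca_docs, 2))
```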
Chapter 3 subjects the theory’s macro-level hypotheses to statistical tests based on a comprehensive new dataset: the Performance of International Institutions Project (PIIP). It begins by describing the PIIP’s scope, contents, and sources. The empirical analysis is divided into four sections. The first examines the relationship between performance and policy autonomy. I find a positive association when policy autonomy is measured using a survey of international bureaucrats, a proxy for de facto policy autonomy, but no relationship when it is measured using formal rules, a proxy for de jure policy autonomy. The second section turns to the determinants of de facto policy autonomy, showing that the survey-based measure is positively predicted both by the quantity, depth, and breadth of operational alliances and by the exercise of governance tasks with high monitoring costs for states. In the third section, I employ a simultaneous equations strategy to isolate the effect of performance and de facto policy autonomy on one another. The fourth section summarizes a battery of robustness checks.
Quality of life (QoL) is an abstract construct that has been formally recognised and widely used in human medicine. In recent years, QoL has received increasing attention in animal and veterinary sciences, and the measurement of QoL has been a focus of research in both the human and animal fields. Lord Kelvin said “When you cannot measure it, when you cannot express it in numbers — you have scarcely in your thoughts, advanced to a stage of science, whatever the matter may be” (Lord Kelvin 1893). So are we able to measure animal QoL? The psychometric measurement principles for abstract constructs such as human intelligence have been well rehearsed and researched. Application of traditional and newer psychometric approaches is becoming more widespread as a result of increasing human and animal welfare expectations which have brought a greater emphasis on the individual. In recent decades the field of human medicine has developed valid measures of experienced pain and QoL of individuals, including those who are not capable of self-report. More recently, researchers who are interested in the measurement of animal pain and QoL have begun to use similar methodologies. In this paper, we will consider these methodologies and the opportunities and difficulties they present.
This chapter focuses on techniques for detecting anomalies, starting with basic statistical techniques and moving on to data analytics techniques.
In multiple-choice exams, students select one answer from among typically four choices and can explain why they made that particular choice. Students are good at understanding natural language questions and, based on their domain knowledge, can easily infer the question’s answer by “connecting the dots” across various pertinent facts. Considering automated reasoning for elementary science question answering, we address the novel task of generating explanations for answers from human-authored facts. For this, we examine the practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features. Explanations are created from a human-annotated set of nearly 5000 candidate facts in the WorldTree corpus. Our aim is to obtain better matches for valid facts of an explanation for the correct answer of a question over the available fact candidates. To this end, our features offer a comprehensive linguistic and semantic unification paradigm. The machine learning problem is the preference ordering of facts, for which we test pointwise regression versus pairwise learning-to-rank. Our contributions, originating from comprehensive evaluations against nine existing systems, are (1) a case study in which two preference ordering approaches are systematically compared, and where the pointwise approach is shown to outperform the pairwise approach, thus adding to the existing survey of observations on this topic; (2) since our system outperforms a highly effective TF-IDF-based IR technique by 3.5 and 4.9 points on the development and test sets, respectively, it demonstrates some of the further task improvement possibilities (e.g., in terms of an efficient learning algorithm, semantic features) on this task; (3) it is a practically competent approach that can outperform some variants of BERT-based reranking models; and (4) the human-engineered features make it an interpretable machine learning model for the task.
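The following sketch illustrates the two preference-ordering setups compared in the case study above: pointwise regression of a relevance score versus pairwise learning-to-rank on feature differences. The random feature vectors, the synthetic relevance scores, and the simple perceptron update are stand-ins for the paper's hand-crafted features and learners.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                       # candidate-fact feature vectors
w_true = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
y = X @ w_true + rng.normal(scale=0.1, size=200)    # synthetic relevance scores

# Pointwise: least-squares regression of the relevance score for each fact.
w_point, *_ = np.linalg.lstsq(X, y, rcond=None)

# Pairwise: learn from feature differences (x_i - x_j), labelled by which of
# the two facts is more relevant, using a simple perceptron update.
w_pair = np.zeros(5)
for _ in range(2000):
    i, j = rng.integers(0, 200, size=2)
    if y[i] == y[j]:
        continue
    sign = 1.0 if y[i] > y[j] else -1.0
    diff = X[i] - X[j]
    if sign * (w_pair @ diff) <= 0:                 # misranked pair: update
        w_pair += sign * diff

# Either learned scorer can then rank the candidate facts for a question.
order_point = np.argsort(-(X @ w_point))
order_pair = np.argsort(-(X @ w_pair))
print("top-5 overlap between the two rankings:",
      len(set(order_point[:5]) & set(order_pair[:5])))
```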
Social networks often play critical and complex roles in the recovery process. Recent advancements in methods for measuring and modeling social networks create opportunities to advance the science of recovery by testing theories and treatment models with a more explicit focus on the dynamic roles of social environments. This chapter provides a primer on several methods for measuring and modeling social network data, including those that use egocentric approaches (i.e., where social networks are modeled as characteristics of individuals) and sociocentric approaches (i.e., where social networks are modeled as complete entities of interconnected individuals). Aspects of measurement and analysis that commonly differ for social network data are highlighted. Examples of research on substance use disorders that apply these methods are also described to illustrate the types of insights that may be obtained using these approaches.
Scientific research leads to the generation of new knowledge and is either basic or applied. A question is formulated first, followed by the creation of a hypothesis that needs to be tested. Research in obstetrics and gynaecology can explain the physiology of the reproductive system, the physiology of pregnancy and various clinical abnormalities. Clinical research is carried out through clinical trials. Randomized controlled trials are the gold standard and are performed in four phases with the involvement of patients. Systematic reviews and meta-analyses form the basis of evidence-based medicine. The conduct of high-quality research should be governed by the rules and regulations of national research governance policy to ensure patient safety and adherence to ethical standards. Research misconduct is a serious failure that leads to incorrect results. Researchers must be aware of the moral rules governing research as well as the consequences for themselves and science.
The past decade has witnessed some dramatic methodological changes in the wider disciplines of psycholinguistics, psychology, and experimental linguistics. One such set of changes comprises the development of open and transparent research practices, which have increasingly been adopted in response to concerns that empirical results often fail to replicate and may not generalise across samples and experimental conditions (Gibson & Fedorenko, 2013; Maxwell, Lau, & Howard, 2015; McElreath & Smaldino, 2015; Yarkoni, 2020). Another important set of changes concerns the use of sophisticated statistical techniques, such as mixed-effects models (Baayen, Davidson, & Bates, 2008) and Bayesian analyses (Vasishth, Nicenboim, Beckman, Li & Kong, 2018), which can provide much more information about magnitudes of effects and sources of variation than the more traditional statistical approaches.
Paradoxically, doing corpus linguistics is both easier and harder than it has ever been before. On the one hand, it is easier because we have access to more existing corpora, more corpus analysis software tools, and more statistical methods than ever before. On the other hand, reliance on these existing corpora and corpus linguistic methods can potentially create layers of distance between the researcher and the language in a corpus, making it a challenge to do linguistics with a corpus. The goal of this Element is to explore ways for us to improve how we approach linguistic research questions with quantitative corpus data. We introduce and illustrate the major steps in the research process, including how to: select and evaluate corpora; establish linguistically motivated research questions, observational units, and variables; select linguistically interpretable variables; understand and evaluate existing corpus software tools; adopt minimally sufficient statistical methods; and qualitatively interpret quantitative findings.
In machine-learning applications, data selection is of crucial importance if good runtime performance is to be achieved. In a scenario where the test set is accessible when the model is being built, training instances can be selected so they are the most relevant for the test set. Feature Decay Algorithms (FDA) are a technique for data selection that has demonstrated excellent performance in a number of tasks. This method maximizes the diversity of the n-grams in the training set by devaluing those that have already been included. We focus on this method to undertake deeper research on how to select better training data instances. We give an overview of FDA and propose improvements in terms of speed and quality. Using German-to-English parallel data, we first create a novel approach that decreases the execution time of FDA when multiple computation units are available. In addition, we obtain improvements in translation quality by extending FDA using information from the parallel corpus that is generally ignored.
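The following sketch outlines the core FDA idea of decaying n-gram values as sentences are selected: training candidates are picked greedily for their overlap with the test-set n-grams, and each n-gram is worth less every time it has already been covered. The decay rule, the toy sentences, and the function names are illustrative assumptions rather than the exact formulation used in FDA.

```python
from collections import Counter

def ngrams(tokens, n_max=3):
    """Set of 1- to n_max-grams in a token list."""
    return {tuple(tokens[i:i + n]) for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)}

def fda_select(candidates, test_sentences, k, decay=0.5):
    """Greedily return indices of the k candidates with the highest decayed
    coverage of the test-set n-grams."""
    test_feats = set()
    for sent in test_sentences:
        test_feats |= ngrams(sent.split())
    counts = Counter()                       # how often each n-gram was covered
    selected, remaining = [], set(range(len(candidates)))
    for _ in range(min(k, len(candidates))):
        def score(i):
            return sum(decay ** counts[f]
                       for f in ngrams(candidates[i].split()) if f in test_feats)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.discard(best)
        for f in ngrams(candidates[best].split()):
            if f in test_feats:
                counts[f] += 1               # devalue n-grams already included
    return selected

test = ["the cat sat on the mat"]
pool = ["the cat sat", "the dog barked", "a cat sat on a mat", "the the the"]
print(fda_select(pool, test, k=2))
```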
Biostatistics with R provides a straightforward introduction on how to analyse data from the wide field of biological research, including nature protection and global change monitoring. The book is centred around traditional statistical approaches, focusing on those prevailing in research publications. The authors cover t-tests, ANOVA and regression models, but also the advanced methods of generalised linear models and classification and regression trees. Chapters usually start with several useful case examples, describing the structure of typical datasets and proposing research-related questions. All chapters are supplemented by example datasets, step-by-step R code demonstrating analytical procedures and interpretation of results. The authors also provide examples of how to appropriately describe statistical procedures and results of analyses in research papers. This accessible textbook will serve a broad audience, from students, researchers or professionals looking to improve their everyday statistical practice, to lecturers of introductory undergraduate courses. Additional resources are provided on www.cambridge.org/biostatistics.
Continuous point-of-care patient monitoring is now the standard in emergency room and critical care settings, and the technology to produce small, affordable, safe bedside vital sign monitors is ubiquitous. The statistical methods to validate these emerging monitoring technologies, however, are in their infancy. Validation statistics have centered on the Bland–Altman method and cardiac output measurement, but this method fails to evaluate the ability of a device to reliably detect serial changes (trend analysis). Newer statistical methods such as concordance and polar plots have been developed to assess trending. Small-sized studies assessing within-subject trending require other statistical approaches. Since clinical validation studies must be of a sufficient standard to be used in evidence-based reviews, researchers assessing the value of emerging clinical monitoring technologies must have an understanding of these new statistical methodologies. They must also take into consideration the precision of the reference method and issues pertaining to setting the criteria for accepting a new monitoring method, particularly when using percentage error and the traditional <30% benchmark.
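As a brief illustration of the validation statistics mentioned above, the following sketch computes the Bland–Altman bias, 95% limits of agreement, and a percentage error (limits of agreement relative to the mean reference reading) to compare against the traditional <30% benchmark; the paired readings are hypothetical.

```python
import statistics

reference = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5]   # e.g. reference cardiac output (L/min)
test_dev  = [4.5, 5.0, 4.1, 6.4, 4.7, 5.9]   # paired readings from the new monitor

diffs = [t - r for t, r in zip(test_dev, reference)]
bias = statistics.fmean(diffs)
sd = statistics.stdev(diffs)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)            # 95% limits of agreement
pct_error = 1.96 * sd / statistics.fmean(reference) * 100

print(f"bias={bias:.2f}, LoA=({loa[0]:.2f}, {loa[1]:.2f}), "
      f"percentage error={pct_error:.1f}%")
```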
We present a method for mining the web for text entered on mobile devices. Using searching, crawling, and parsing techniques, we locate text that can be reliably identified as originating from 300 mobile devices. This includes 341,000 sentences written on iPhones alone. Our data enables a richer understanding of how users type “in the wild” on their mobile devices. We compare text and error characteristics of different device types, such as touchscreen phones, phones with physical keyboards, and tablet computers. Using our mined data, we train language models and evaluate these models on mobile test data. A mixture model trained on our mined data, Twitter, blog, and forum data predicts mobile text better than baseline models. Using phone and smartwatch typing data from 135 users, we demonstrate our models improve the recognition accuracy and word predictions of a state-of-the-art touchscreen virtual keyboard decoder. Finally, we make our language models and mined dataset available to other researchers.
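The following sketch illustrates the general idea of a source mixture for language modelling: word probabilities estimated separately from several text sources are linearly interpolated with weights tuned on held-out data. The tiny corpora, the unigram models, and the weights are invented stand-ins, not the paper's mined data or models.

```python
from collections import Counter

def unigram_model(corpus):
    """Return a function mapping a word to its relative frequency in the corpus."""
    counts = Counter(w for sent in corpus for w in sent.split())
    total = sum(counts.values())
    return lambda w: counts[w] / total if total else 0.0

sources = {
    "mined_mobile": ["omw running late", "see u soon", "on my way"],
    "blog":         ["today I describe my weekend trip in detail"],
    "forum":        ["has anyone else seen this error before"],
}
weights = {"mined_mobile": 0.6, "blog": 0.2, "forum": 0.2}   # tuned on held-out text
models = {name: unigram_model(corpus) for name, corpus in sources.items()}

def mixture_prob(word):
    """Linearly interpolated probability across the source-specific models."""
    return sum(weights[name] * models[name](word) for name in models)

print(mixture_prob("running"), mixture_prob("error"))
```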
Close encounters or resonance overlaps can create chaotic motion in small bodies in the Solar System. Approaches that measure the separation rate of trajectories that start infinitesimally close, or changes in the frequency power spectrum of time series, among others, can detect chaotic motion. In this paper, we introduce the ACF index (ACFI), which is based on the auto-correlation function of time series. Auto-correlation coefficients measure the correlation of a time series with a lagged copy of itself. By counting the number of auto-correlation coefficients that remain larger than 5% after a certain amount of time has passed, we can assess how strongly the time series remains correlated with itself. This allows for the detection of chaotic time series, which are characterized by low ACFI values.
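The following sketch illustrates the ACFI idea described above: compute auto-correlation coefficients of a time series and count the fraction that stay above 5% beyond a given lag, so that quickly decorrelating series receive low values. The example series, lag range, and threshold handling are illustrative choices, not the paper's exact settings.

```python
import math
import random

def autocorr(series, lag):
    """Auto-correlation coefficient of the series with a lagged copy of itself."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var if var else 0.0

def acfi(series, min_lag, max_lag, threshold=0.05):
    """Fraction of lags in [min_lag, max_lag] whose auto-correlation exceeds threshold."""
    lags = range(min_lag, max_lag + 1)
    return sum(autocorr(series, lag) > threshold for lag in lags) / len(lags)

# A quasi-periodic signal keeps non-negligible auto-correlation at many lags,
# while a rapidly decorrelating (here, purely random) signal does not,
# which yields a lower ACFI.
random.seed(1)
regular = [math.sin(0.2 * t) for t in range(500)]
noisy = [random.gauss(0.0, 1.0) for _ in range(500)]
print(acfi(regular, 50, 150), acfi(noisy, 50, 150))
```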