This final chapter summarizes the material in this book in a series of concluding statements that capture the lessons learned. These lessons can be viewed as guidelines for research practice.
In practice we do not always have clear guidance from economic theory about specifying an econometric model. At one extreme, it may be said that we should “let the data speak.” When the data do “speak,” we should check that what they say makes sense. We must therefore be aware of a particularly important phenomenon in empirical econometrics: the spurious relationship. If you encounter a spurious relationship but do not recognize it as such, you may wrongly rely on it for hypothesis testing or forecasting. A spurious relationship appears when the model is not well specified. In this chapter, we see from a case study that people can draw strong but inappropriate conclusions if the econometric model is not well specified. We also see that if you a priori hypothesize a structural break at a particular moment in time, and analyze the data on that very assumption, it is easy to draw inaccurate conclusions. As with influential observations, the lesson is that one should first build an econometric model and, given that model, investigate whether there could have been a structural break.
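As a minimal illustration of the trap (our own sketch, not the chapter's case study), regressing two independent random walks on each other in R typically yields an apparently significant slope; differencing the series makes the illusion disappear:

```r
# Spurious-relationship demo: two independent random walks often show a
# "significant" regression slope even though they are unrelated.
set.seed(1)
n <- 200
x <- cumsum(rnorm(n))            # random walk 1
y <- cumsum(rnorm(n))            # random walk 2, independent of x
summary(lm(y ~ x))               # slope frequently looks highly significant
summary(lm(diff(y) ~ diff(x)))   # differencing removes the spurious link
```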
This chapter deals with missing data and a few approaches to managing them. There are several reasons why data can be missing. For example, people may discard older data, which can sometimes be sensible. It may also be that you want to analyze a phenomenon that occurs at an hourly level but only have data at the daily level; the hourly data are then missing. Or a survey may simply be too long, so respondents get tired and do not answer all questions. In this chapter we review various situations in which data are missing and how we can recognize them. Sometimes we know how to manage missing data; often there is no need to panic, and modifications of models and/or estimation methods can be used. We also encounter a case in which data are made missing on purpose, by selective sampling, to facilitate subsequent empirical analysis. Such an analysis explicitly accounts for the missingness, and the impact of the missing data can become minor.
This chapter introduces the continuous-time Fourier transform (CTFT) and its properties. Many examples are presented to illustrate the properties. The inverse CTFT is derived. As one example of its application, the impulse response of the ideal lowpass filter is obtained. The derivative properties of the CTFT are used to derive many Fourier transform pairs. One result is that the normalized Gaussian signal is its own Fourier transform, and constitutes an eigenfunction of the Fourier transform operator. Many such eigenfunctions are presented. The relation between the smoothness of a signal in the time domain and its decay rate in the frequency domain is studied. Smooth signals have rapidly decaying Fourier transforms. Spline signals are introduced, which have provable smoothness properties in the time domain. For causal signals it is proved that the real and imaginary parts of the CTFT are related to each other. This is called the Hilbert transform, Poisson’s transform, or the Kramers–Kronig transform. It is also shown that Mother Nature “computes” a Fourier transform when a plane wave is propagating across an aperture and impinging on a distant screen – a well-known result in optics, crystallography, and quantum physics.
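To make the eigenfunction claim concrete (the normalization convention here is ours; the book may use another):

```latex
% Gaussian self-transform under the convention
% \hat{f}(\nu) = \int_{-\infty}^{\infty} f(t)\, e^{-2\pi j \nu t}\, dt:
\[
  f(t) = e^{-\pi t^{2}}
  \;\Longrightarrow\;
  \hat{f}(\nu) = \int_{-\infty}^{\infty} e^{-\pi t^{2}} e^{-2\pi j \nu t}\,dt
               = e^{-\pi \nu^{2}} = f(\nu),
\]
% so the normalized Gaussian is a fixed point (eigenvalue 1) of the
% Fourier transform operator.
```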
Common time series models allow for correlation between observations that is typically largest for points close together in time. Adjustments can also be made for seasonal effects. Variation in a single spatial dimension may have characteristics akin to those of time series, and comparable models find application there. Autoregressive models, which make good intuitive sense and are simple to describe, are the starting point for discussion; the discussion then moves on to autoregressive moving average models, with possible differencing. The "forecast" package for R provides mechanisms for automatic selection of model parameters. Exponential smoothing state space (exponential time series or ETS) models are an important alternative that have often proved effective in forecasting applications. ARCH and GARCH heteroskedasticity models are further classes that have been developed to handle the special characteristics of financial time series.
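A brief sketch of that automatic selection (the built-in AirPassengers series is a stand-in dataset, not necessarily one used in the chapter):

```r
# Automatic model selection with the "forecast" package.
library(forecast)
fit_arima <- auto.arima(AirPassengers)  # picks ARIMA orders (p, d, q) automatically
fit_ets   <- ets(AirPassengers)         # picks an exponential smoothing specification
plot(forecast(fit_arima, h = 24))       # two years of forecasts with intervals
accuracy(fit_arima)                     # in-sample fit diagnostics
```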
Increased autocorrelation (AR) of system-specific measures has been suggested as a predictor for critical transitions in complex systems. Increased AR of mood scores has been reported to anticipate depressive episodes in major depressive disorder, while other studies found AR increases to be associated with depressive episodes themselves. Data on AR in patients with bipolar disorders (BD) are limited and inconclusive.
Methods
Patients with BD reported their current mood via daily e-diaries for 12 months. Current affective status (euthymic, prodromal, depressed, (hypo)manic) was assessed in 26 bi-weekly expert interviews. Exploratory analyses tested whether self-reported current mood and the AR of the same item could differentiate prodromal phases and affective episodes from euthymia.
Results
A total of 29 depressive and 20 (hypo)manic episodes were observed in 29 participants with BD. Self-reported current mood was significantly decreased during the two weeks prior to a depressive episode (early prodromal, late prodromal), but unchanged prior to (hypo)manic episodes. The AR was not a significant predictor of the early or late prodromal phase of depression, nor of the early prodromal phase of (hypo)mania. Decreased AR was found in the late prodromal phase of (hypo)mania. Increased AR was mainly found during depressive episodes.
Conclusions
AR changes might not be better at predicting depressive episodes than simple self-report measures on current mood in patients with BD. Increased AR was mostly found during depressive episodes. Potentially, changes in AR might anticipate (hypo)manic episodes.
Childhood trauma (CT) may increase vulnerability to psychopathology through affective dysregulation (greater variability, autocorrelation, and instability of emotional symptoms). However, associations between CT and dynamic affect fluctuations that take into account differences in mean affect levels across CT status have been understudied.
Methods
346 adults (age = 49.25 ± 12.55, 67.0% female) from the Netherlands Study of Depression and Anxiety participated in ecological momentary assessment. Positive and negative affect (PA, NA) were measured five times per day for two weeks by electronic diaries. Retrospectively reported CT included emotional neglect and emotional/physical/sexual abuse. Linear regressions determined associations between CT and affect fluctuations, controlling for age, sex, education, and mean affect levels.
Results
Compared to those without CT, individuals with CT reported significantly lower mean PA levels (Cohen's d = −0.620) and higher mean NA levels (d = 0.556) throughout the two weeks. CT was linked to significantly greater PA variability (d = 0.336), NA variability (d = 0.353), and NA autocorrelation (d = 0.308), with the strongest effects among individuals reporting higher CT scores. However, these effects were entirely explained by differences in mean affect levels between the CT groups. Results were consistent in adults with and without lifetime depressive/anxiety disorders and across CT types, with sexual abuse showing the smallest effects.
Conclusions
Individuals with CT show greater affective dysregulation during the two-week monitoring of emotional symptoms, likely due to their consistently lower PA and higher NA levels. It is essential to consider mean affect level when interpreting the impact of CT on affect dynamics.
The brain must make inferences about, and decisions concerning, a highly complex and unpredictable world, based on sparse evidence. An “ideal” normative approach to such challenges is often modeled in terms of Bayesian probabilistic inference. But for real-world problems of perception, motor control, categorization, language understanding, or commonsense reasoning, exact probabilistic calculations are computationally intractable. Instead, we suggest that the brain solves these hard probability problems approximately, by considering one, or a few, samples from the relevant distributions. Here we provide a gentle introduction to the various sampling algorithms that have been considered as the approximation used by the brain. We broadly summarize these algorithms according to their level of knowledge and their assumptions regarding the target distribution, noting their strengths and weaknesses, their previous applications to behavioural phenomena, as well as their psychological plausibility.
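For concreteness, here is a minimal sampling sketch of our own (the article surveys a whole family of such algorithms): a random-walk Metropolis sampler that approximates expectations under a target density known only up to a normalizing constant.

```r
# Random-walk Metropolis sketch: approximate a distribution by accepting
# or rejecting local proposals, then average over the retained samples.
target <- function(x) exp(-abs(x)^3)        # unnormalized target density (hypothetical)
metropolis <- function(n, step = 1) {
  x <- numeric(n)
  for (i in 2:n) {
    prop <- x[i - 1] + rnorm(1, sd = step)  # symmetric random-walk proposal
    a <- target(prop) / target(x[i - 1])    # acceptance ratio
    x[i] <- if (runif(1) < a) prop else x[i - 1]
  }
  x
}
samples <- metropolis(5000)
mean(samples^2)  # an expectation approximated from the samples
```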
Quantifying multiscale hydraulic heterogeneity in aquifers and its effects on solute transport is the task of this chapter. Using spatial statistics, we explain how to quantify the spatial variability of hydraulic properties or parameters in the aquifer using the stochastic, or random field, concept. In particular, we discuss spatial covariance, variogram, statistical homogeneity, heterogeneity, isotropy, and anisotropy concepts. Field examples complement the discussion. We then present a highly parameterized heterogeneous media (HPHM) approach for simulating flow and solute transport in aquifers with spatially varying hydraulic properties at the scale of our interest and observation. However, our limited ability to collect the needed information for this approach motivates alternatives such as Monte Carlo simulation, zonation, and equivalent homogeneous media (EHM) approaches with macrodispersion. This chapter details the EHM approach with the macrodispersion concept.
This chapter first reviews the linear first-order non-homogeneous ordinary differential equation. An introduction to statistics and stochastic processes follows. Afterward, it explains the stochastic fluid continuum concept, the associated control volume, and spatial- and ensemble-representative control volume concepts. It then uses the well-known definition of solute concentration as an example to elucidate the volume-average, spatial-average, ensemble-average, and ergodicity concepts. The purpose of this chapter is to provide the basic mathematics and statistics knowledge necessary to comprehend the themes of this book. In addition, the chapter's homework exercises demonstrate the power of the widely available Microsoft Excel for scientific investigations.
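For reference, the standard integrating-factor solution of that first-order equation (our notation; the chapter's symbols may differ):

```latex
% Linear first-order non-homogeneous ODE and its general solution via
% the integrating factor \mu(t):
\[
  y'(t) + p(t)\,y(t) = q(t),
  \qquad \mu(t) = e^{\int p(t)\,dt},
\]
\[
  y(t) = \frac{1}{\mu(t)}\left(\int \mu(t)\,q(t)\,dt + C\right).
\]
```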
Delving into the specifics of spatial and temporal analytics, this chapter explores topics such as the spatial neighborhoods and temporal evolution of large volumes of network traffic data.
This chapter discusses some basic concepts quantifying the characteristics of functions with random fluctuations. Deterministic functions are discussed first in order to introduce functions with randomness and ways to quantify them. The concepts of phase space, ensemble mean, ergodic process, moments, covariance functions, and correlation functions are discussed briefly.
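A small sketch of the ensemble view (our construction, not the chapter's): estimating the ensemble mean and the correlation-versus-lag of a stationary process from many simulated realizations.

```r
# Ensemble statistics from simulated realizations: each row is one
# realization of an AR(1) process with coefficient 0.8.
set.seed(2)
n_real <- 500; n_t <- 100
realizations <- t(replicate(n_real, as.numeric(arima.sim(list(ar = 0.8), n_t))))
ens_mean <- colMeans(realizations)             # ensemble mean at each time point
ens_cov  <- cov(realizations)                  # covariance C(t1, t2) across the ensemble
rho_hat  <- ens_cov[1, 1:20] / ens_cov[1, 1]   # correlation vs. lag from t = 1
plot(0:19, rho_hat, type = "b",
     xlab = "lag", ylab = "correlation")       # for stationary AR(1): about 0.8^lag
```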
This chapter gives a short summary of the mathematical instruments required to model sensor systems in the presence of both deterministic and random processes. The concepts are organized in a compact overview for quick consultation, emphasizing the commonalities between different contexts.
Annual-resolution sediment layers, known as varves, can provide continuous, high-resolution chronologies of sedimentary sequences. In addition, varve counting is not burdened with the high laboratory costs of geochronological analyses. Despite a more than 100-year history of use, many existing varve counting techniques are time-consuming and difficult to reproduce. We present countMYvarves, a varve counting toolbox which uses sliding-window autocorrelation to count the number of repeated patterns in core scans or outcrop photos. The toolbox is used to build an annually resolved record of sedimentation rates, which are depth-integrated to provide ages. We validate the model with repeated manual counts of a high-sedimentation-rate lake with biogenic varves (Herd Lake, USA) and a low-sedimentation-rate glacial lake (Lago Argentino, Argentina). In both cases, countMYvarves is consistent with manual counts and provides additional sedimentation rate data. The toolbox performs multiple simultaneous varve counts, enabling uncertainty to be quantified and propagated into the resulting age-depth model. It also includes modules to automatically exclude non-varved portions of sediment and interpolate over missing or disrupted sediment. CountMYvarves is open source, runs through a graphical user interface, and is available for download for Windows, macOS or Linux at https://doi.org/10.5281/zenodo.4031811.
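The core idea can be sketched in a few lines of R (our simplification, not the toolbox's actual implementation): within each window of a brightness profile, the lag of the strongest autocorrelation peak estimates the local varve thickness, and thickness integrates to a count.

```r
# Sliding-window autocorrelation sketch: a synthetic brightness profile
# with ~25-pixel layers stands in for a core scan.
set.seed(5)
brightness <- sin(2 * pi * (1:5000) / 25) + rnorm(5000, sd = 0.3)
win <- 200
starts <- seq(1, length(brightness) - win, by = win)
thickness <- sapply(starts, function(i) {
  a <- acf(brightness[i:(i + win - 1)], lag.max = 100, plot = FALSE)$acf[-1]
  which.max(a)                        # lag of the strongest repeat pattern
})
varve_count <- sum(win / thickness)   # layers per window, summed over the core
```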
Cross-cultural methodology has made significant progress. Yet research methodology is in part a response to the challenges posed by research agendas, given the intellectual and technological resources available at the time. Just as the contemporary achievements in cross-cultural research methodology are a response to the necessity of synchronic cross-cultural research in the globalizing world of the late 20th century, the changing cultural landscape of the early 21st century demands a diachronic research agenda – cross-temporal research of psychological phenomena in human populations. Meaning equivalence and autocorrelation are highlighted as major challenges to both synchronic and diachronic approaches in cross-cultural psychology. Researchers will need to meet these challenges in the future.
Barnard and Steinerberger [‘Three convolution inequalities on the real line with connections to additive combinatorics’, Preprint, 2019, arXiv:1903.08731] established the autocorrelation inequality
where the constant $0.411$ cannot be replaced by $0.37$. In addition to being interesting and important in their own right, inequalities such as these have applications in additive combinatorics. We show that for $f$ to be extremal for this inequality, we must have
Our central technique for deriving this result is local perturbation of $f$ to increase the value of the autocorrelation, while leaving $||f||_{L^{1}}$ unchanged. These perturbation methods can be extended to examine a more general notion of autocorrelation. Let $d,n\in \mathbb{Z}^{+}$, $f\in L^{1}$, $A$ be a $d\times n$ matrix with real entries and columns $a_{i}$ for $1\leq i\leq n$ and $C$ be a constant. For a broad class of matrices $A$, we prove necessary conditions for $f$ to extremise autocorrelation inequalities of the form
We investigated the efficiency of the autoregressive repeatability model (AR) for the genetic evaluation of longitudinal reproductive traits in Portuguese Holstein cattle and compared the results with those from the conventional repeatability model (REP). The data set comprised records taken during the first four calving orders, corresponding to totals of 416, 766, 872 and 766 thousand records for the interval from calving to first service, days open, calving interval and daughter pregnancy rate, respectively. Both models included fixed effects (month and age classes associated with each calving order) and random effects (herd-year-season, animal and permanent environmental). For the AR model, a first-order autoregressive (co)variance structure was fitted for the herd-year-season and permanent environmental effects. The AR model outperformed the REP model, with a lower Akaike Information Criterion, lower mean square error and Akaike weights close to unity. Rank correlations between estimated breeding values (EBV) from the AR and REP models ranged from 0.95 to 0.97 for all studied reproductive traits when all bulls were considered. When only the top 100 selected bulls were considered, the rank correlations ranged from 0.72 to 0.88. These results indicate that the re-ranking observed at the top level will provide more opportunities for selecting the best bulls. The EBV reliabilities provided by the AR model were larger for all traits, but the magnitudes of the annual genetic progress were similar between the two models. Overall, the proposed AR model is suitable for the genetic evaluation of longitudinal reproductive traits in dairy cattle, outperforming the REP model.
Review of correlation and simple linear regression. Introduction to lagged (cross-) correlation for identifying recurrent and periodic features in common between pairs of time-series, statistical evidence of possible causal relationships. Introduction to (lagged) autocorrelation for identifying recurrent and periodic features in time-series. Use of correlation and simple linear regression for statistical comparison of time-series to reference datasets, with focus on periodic (sinusoidal) reference datasets. Interpretation of statistical effect-size and significance (p-value).
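A compact illustration of these ideas in R (our example data, not the module's):

```r
# Lagged (cross-)correlation sketch: recover the delay between a noisy
# series and a sinusoidal reference (synthetic data for illustration).
set.seed(6)
t   <- 1:500
ref <- sin(2 * pi * t / 50)                                 # periodic reference series
series <- c(rep(0, 10), ref[1:490]) + rnorm(500, sd = 0.5)  # 10-step delayed copy + noise
ccf(series, ref, lag.max = 60)   # peak near lag 10 reveals the delay
acf(series, lag.max = 120)       # autocorrelation reveals the 50-sample period
summary(lm(series ~ ref))        # effect size and p-value for the linear fit
```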
Overview of key identifying features of noise as it typically occurs in geoscience time-series. Categorisation according to noise colour: white, red and blue noise. Consideration of autocorrelation and autoregression, power spectral density and power-law behaviour. Worked red-noise example to illustrate.
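In the same spirit as the chapter's worked example, a minimal red-noise sketch of our own:

```r
# Red noise as an AR(1) process with a positive coefficient: slowly
# decaying autocorrelation and power concentrated at low frequencies.
set.seed(3)
red <- arima.sim(list(ar = 0.9), n = 2048)
acf(red, lag.max = 50)        # autocorrelation decays roughly as 0.9^lag
spectrum(red, log = "yes")    # periodogram rises toward low frequencies
```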
Introduction: Understanding the spatial distribution of opioid abuse at the local level may facilitate community intervention strategies. The purpose of this analysis was to apply spatial analytical methods to determine clustering of opioid-related emergency medical services (EMS) responses in the City of Calgary. Methods: Using opioid-related EMS responses in the City of Calgary between January 1st and October 31st, 2017, we estimated the dissemination area (DA)-specific spatial randomness effects by incorporating the spatial autocorrelation using an intrinsic Gaussian conditional autoregressive model and generalized linear mixed models (GLMM). Global spatial autocorrelation was evaluated by Moran's I index. Both Getis-Ord Gi and the LISA function in GeoDa were used to estimate the local spatial autocorrelation. Two models were applied: 1) Poisson regression with DA-specific non-spatial random effects; 2) Poisson regression with DA-specific G-side spatial random effects. A pseudolikelihood approach was used for model comparison. Two types of cluster analysis were used to identify the spatial clustering. Results: There were 1488 opioid-related EMS responses available for analysis. Of these responses, 74% of the individuals were male. The median age was 33 years (IQR: 26–42 years), with 65% of individuals between 20 and 39 years and 27% between 40 and 64 years. In 62% of EMS responses, poisoning/overdose was the chief complaint. The global Moran's index implied the presence of global spatial autocorrelation. Comparing the two models suggested that the spatial model provided a better fit for the adjusted opioid-related EMS response rate. Calgary Center and East were identified as hot spots by both types of cluster analysis. Conclusion: Spatial modeling provides better predictability for assessing potential high-risk areas and identifying locations for community intervention strategies. The clusters identified in Calgary's Center and East may have implications for future response strategies.
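A hedged sketch of the global and local autocorrelation tests in R (a synthetic lattice stands in for the dissemination areas; none of this is the study's actual code):

```r
# Global Moran's I and LISA on a toy lattice via the spdep package.
library(spdep)
set.seed(4)
nb <- cell2nb(10, 10)               # 10 x 10 grid as stand-in areal units
lw <- nb2listw(nb, style = "W")     # row-standardised spatial weights
counts <- rpois(100, lambda = 5)    # hypothetical EMS response counts per unit
moran.test(counts, lw)              # global Moran's I with significance test
head(localmoran(counts, lw))        # LISA: local indicators of spatial association
```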