1 Introduction
The gambler’s fallacy and the hot hand belief have been classified as two exemplars of human misperceptions of random sequential events and have been widely studied in multiple disciplines such as psychology, sports, behavioral economics, and neuroeconomics (e.g., Camerer, Loewenstein, & Prelec, 2005; Gilovich, Griffin, & Kahneman, 2002; Gilovich, Vallone, & Tversky, 1985; Kahneman, 2002; Malkiel, 2003; Rabin, 2002; Rabin & Vayanos, 2010). Although often manifested in more intricate forms, these two phenomena can be demonstrated by independent and identically distributed Bernoulli trials. Suppose that a fair coin with equal probabilities of coming up a head (h) and a tail (t) is tossed repeatedly and the first three outcomes are three heads (h,h,h). In predicting the next outcome, a person holding the gambler’s fallacy would predict (h,h,h,t), a reversal of the streak. In contrast, a person holding the hot hand belief would predict (h,h,h,h), a continuation of the streak.
The fact that people exhibit two opposing expectations given the same past information (negative recency in the gambler’s fallacy and positive recency in the hot hand belief) has been the center of attention in research on perception of randomness, pattern detection, and judgment of uncertainty (for reviews, see Ayton & Fischer, 2004; Oskarsson, Van Boven, McClelland, & Hastie, 2009). Among existing theories, a prevailing account is the representativeness heuristic, which attributes both the gambler’s fallacy and the hot hand belief to a false belief in the “law of small numbers” (Gilovich et al., 1985; Tversky & Kahneman, 1971). By this account, people tend to believe that a local sample should resemble the underlying population, and chance is perceived as “a self-correcting process in which a deviation in one direction induces a deviation in the opposite direction to restore the equilibrium” (Tversky & Kahneman, 1974, p. 1125). Thus, in the gambler’s fallacy, a tail is due to reverse a streak of heads. In the hot hand belief, a streak of successes may indicate the existence of a hot hand by which the streak tends to be prolonged (see also Tversky & Gilovich, 1989).
However, the representativeness account has been criticized for its incompleteness and lack of testability (e.g., Ayton & Fischer, 2004; Falk & Konold, 1997; Gigerenzer, 1996; Kubovy & Gilden, 1991). Ayton and Fischer (2004) suggest that the gambler’s fallacy arises from the experience of negative recency in sequences of natural events such as roulette games, whereas the hot hand belief arises from the experience of positive recency in serial fluctuations in human performance. Similarly, it has been proposed that the hot hand belief can arise when people evaluate the performance of a mutual fund manager rather than the fluctuations of the portfolio (Rabin, 2002; Rabin & Vayanos, 2010), or the gambler’s luck rather than the outcomes of a roulette game (Croson & Sundali, 2005; Sundali & Croson, 2006). Moreover, Burns and Corpus (2004) show that subjects assume positive recency for forecasting scenarios they rated as “nonrandom” and negative recency for scenarios they rated as “random”. Burns (2004) further argues that the hot hand belief is a fast and frugal heuristic for detecting changes in the shooting accuracy of basketball players. This argument is consistent with the finding of “residual nonstationarity” in Sun (2004), which suggests that the fluctuations in players’ performance can be obscured by real-time adjustments based on the detection of a hot hand. For example, after making several shots in a row, a player might try a more difficult shot, or the opposing players may increase their defensive effort. (For a review of hot hand studies, see Bar-Eli, Avugos, & Raab, 2006.)
Compared to the representativeness account, the alternative interpretations distinguish the hot hand belief from the gambler’s fallacy by deviations from a random process. When the underlying process is truly random (or statistically impossible to tell apart from independent and stationary Bernoulli trials), both beliefs are considered biases or misperceptions of randomness. In particular, both beliefs appear to share a common intuition that streak patterns are “rare” and “remarkable”: a streak of heads is unlikely to occur if the coin is fair, and a basketball player is unlikely to make shots in streaks unless he or she has a hot hand. However, the independence assumption of Bernoulli trials implies that, for a fair coin, a streak will occur as often as any other pattern of the same length in its exact order (i.e., the equiprobability of “n-grams”, Falk & Konold, 1997, p. 306). What, then, is so special about streak patterns that people normally tend to avoid them and only expect them when they feel “hot”? In the present paper, we show that streak patterns do possess a set of properties that set them apart from other patterns, and these properties may provide an alternative explanation for the particular role of streak patterns in people’s perception and judgment of randomness.
We illustrate by comparing two patterns, (h,h,h,t) and (h,h,h,h). When a fair coin is tossed repeatedly, both patterns have the same probability of occurrence in any four successive trials. However, it takes on average 16 tosses to encounter the first occurrence of (h,h,h,t) but 30 tosses to encounter the first occurrence of (h,h,h,h). In other words, the first occurrence of the streak pattern (h,h,h,h) is “delayed”. The expected number of trials required for the first occurrence of a particular pattern is a statistical property known as “waiting time”, which can differ among patterns because of their different compositions (see Gardner, 1988; Graham, Knuth, & Patashnik, 1994). While the probability of occurrence (or frequency) describes how often a pattern occurs, the waiting time describes when a pattern will first occur, counted from the time at which monitoring begins. These are different statistical properties, and they bear different psychological relevance. For example, for a passenger who is waiting for a bus, when the first bus arrives is probably more relevant than how often the bus arrives. The goal of this paper is to demonstrate a plausible link between the statistics of pattern times and people’s perception of randomness.
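The contrast between the two waiting times can be checked by simulation. The following is a minimal Python sketch, not the authors' R scripts; the function names, the number of trials, and the seed are illustrative assumptions. It tosses a simulated coin until the target pattern first appears and averages the number of tosses over many runs.

```python
import random

def first_arrival_time(pattern, p_heads, rng):
    """Toss a coin until `pattern` first appears; return the number of tosses."""
    window = ""
    tosses = 0
    while True:
        tosses += 1
        window += "h" if rng.random() < p_heads else "t"
        window = window[-len(pattern):]  # keep only the most recent outcomes
        if window == pattern:
            return tosses

def estimate_waiting_time(pattern, p_heads=0.5, trials=20000, seed=1):
    """Monte Carlo estimate of the mean first arrival time of `pattern`."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    total = sum(first_arrival_time(pattern, p_heads, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    for pattern in ("hhht", "hhhh"):
        print(pattern, round(estimate_waiting_time(pattern), 2))
```

With a fair coin the two estimates should fall close to 16 and 30 tosses, respectively, up to sampling error.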
It is important to note that the concept of waiting time has recently received attention in the psychology literature (Hahn & Warren, 2009; Sun, Tweney, & Wang, 2010a, 2010b). Hahn and Warren (2009) show that, in a global sequence of moderate length, streak patterns such as (h,h,h,h) have higher “probabilities of nonoccurrence” than (h,h,h,t). Based on this result, they argue that, given people’s limited exposure to the environment (e.g., the number of coin tosses is limited), misperceptions of randomness such as the gambler’s fallacy might actually emerge as apt reflections of these environmental statistics. Sun, Tweney, and Wang (2010a) criticize Hahn and Warren’s interpretation by clarifying the relationship between the probability of nonoccurrence and waiting time. In particular, Sun et al. argue that the probability of nonoccurrence is a manifestation of waiting time, which is independent of the length of the global sequence, and that neither statistic would justify the prediction of a streak reversal made under the gambler’s fallacy (see also Sun et al., 2010b). Notwithstanding the debate, the argument for treating waiting time as a part of the environmental statistics appears quite plausible. Given that different statistics can arise from the same process of coin tossing (or basketball shooting), it is likely that people have actually experienced them and that they have different effects on people’s perception of randomness. In the following, we examine these statistics in detail and discuss their psychological implications.
2 Mean time, waiting time, and variance of interarrival times
Let us call the occurrence of a pattern an arrival of the pattern when a coin (fair or biased) is tossed repeatedly. We can define a counting process N(n), n ≥ 1, where N(n) denotes the number of arrivals of a pattern by time n (i.e., by the nth toss). The process has parameters µ and σ² as the mean and variance of the time between successive arrivals. (A more detailed treatment is provided in the Appendix. The results presented in this paper are verified by simulations conducted in the R statistics environment and the scripts are available in this issue of the journal.)
For a particular pattern, its interarrival time T is defined as the number of tosses between any two successive occurrences of the pattern, and the first arrival time T* is defined as the number of tosses until the first occurrence since the beginning of the counting process. The mean of interarrival times E[T], henceforth referred to as “mean time”, is determined by the individual probabilities of the elements in the pattern. Assume a fair coin with equal probabilities of heads and tails. Then
$$E[T_{h,h,h,t}] = E[T_{h,h,h,h}] = \frac{1}{(1/2)^4} = 16. \qquad (1)$$
That is, patterns (h,h,h,t) and (h,h,h,h) have the same mean time (16 tosses) between successive arrivals. This is equivalent to the statement that (h,h,h,t) and (h,h,h,h) have the same probability of occurrence. Regardless of the number of coin tosses, a gambler will encounter either pattern equally often. After n tosses (n ≥ 4), each of the n − 3 windows of four consecutive tosses matches a given pattern with probability 1/16, so the expected number of encounters is the same for either pattern:
$$E[N_{h,h,h,t}(n)] = E[N_{h,h,h,h}(n)] = \frac{n-3}{16}. \qquad (2)$$
However, the mean of the first arrival time E[T*], the waiting time, can differ across patterns because of the different amount of “self-overlap” within a particular pattern (see Figure 1). The amount of self-overlap (s) can be defined as the maximum length of a sub-pattern that has to occur twice (with or without overlap) to start and finish one occurrence of the original pattern. For example, among all patterns of length 4, pattern (h,h,h,h) has the largest amount of self-overlap (s = 3), and (h,h,h,t) is non-overlapping (s = 0). A direct consequence of self-overlap is that the pattern’s first occurrence will be delayed when s>0. Imagine that one is waiting for an occurrence of (h,h,h,h) and has already obtained three heads, a sub-pattern of length 3, (h,h,h). If the 4th toss is a tail, the waiting has to start from scratch and the waiting time spent on the sub-pattern (h,h,h) is wasted. In contrast, when one is waiting for pattern (h,h,h,t) and has already obtained three heads, if the 4th toss is a head, the waiting continues but there are still three heads to start with. It can be shown that, for a fair coin, among all patterns of length 4, (h,h,h,h) and (h,h,h,t) have the longest and shortest waiting times, respectively (also see Table 1):
$$E[T^*_{h,h,h,h}] = E[T_{h,h,h,h}] + E[T^*_{h,h,h}] = 16 + 14 = 30, \qquad (3)$$
$$E[T^*_{h,h,h,t}] = E[T_{h,h,h,t}] = 16. \qquad (4)$$
Moreover, it can be shown that the waiting time E[T*] is almost perfectly correlated with the variance of interarrival times Var(T) for patterns of the same length (see Table 1 and Appendix). Intuitively, both E[T*] and Var(T) are direct consequences of the self-overlapping property, and the amount of self-overlap in the pattern determines the minimum distance at which a consecutive occurrence can follow (i.e., the shortest interarrival time). While consecutive reoccurrences of (h,h,h,t) have to be completely separated from each other and are thus more evenly distributed, consecutive reoccurrences of (h,h,h,h) can overlap with each other and tend to be clustered (see Figure 1). As a consequence, among all possible patterns of length 4, these two patterns have the smallest and largest variance of interarrival times, respectively:
$$\mathrm{Var}(T_{h,h,h,t}) = 144, \qquad \mathrm{Var}(T_{h,h,h,h}) = 592.$$
3 Frequency versus delay
In essence, the contrast between mean time and waiting time lies in the contrast between “frequency” and “delay”. On the one hand, the mean time estimates the average distance between consecutive occurrences and equals the inverse of the probability of occurrence; it is therefore a measure of frequency. When the coin is fair, patterns of the same length have the same mean time and thus the same frequency of occurrence (see Equations 1 and 2). On the other hand, the waiting time estimates when a pattern will first occur after one starts counting, and this first occurrence is delayed relative to the pattern’s mean time: Equations (3) and (4) show that the amount of delay for an overlapping pattern (s>0) equals the waiting time for the repeating sub-pattern of length s; for a non-overlapping pattern (s = 0), no delay is incurred and its waiting time always equals its mean time (also see Table 1). [Footnote 1]
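The mean time and waiting time of any head/tail pattern can also be computed exactly. The sketch below is an illustration under stated assumptions rather than the authors' code: it uses the standard overlap decomposition, in which the waiting time is the sum of the mean times of every leading sub-pattern that also ends the pattern (the whole pattern included). This is equivalent to adding the waiting time of the repeating sub-pattern to the mean time, applied recursively. The function names are hypothetical.

```python
from fractions import Fraction

def pattern_probability(pattern, p_heads):
    """Probability that consecutive tosses reproduce `pattern` exactly."""
    prob = Fraction(1)
    for outcome in pattern:
        prob *= p_heads if outcome == "h" else 1 - p_heads
    return prob

def mean_time(pattern, p_heads=Fraction(1, 2)):
    """Mean interarrival time E[T]: inverse of the probability of occurrence."""
    return 1 / pattern_probability(pattern, p_heads)

def waiting_time(pattern, p_heads=Fraction(1, 2)):
    """Mean first arrival time E[T*] via the overlap decomposition."""
    total = Fraction(0)
    for k in range(1, len(pattern) + 1):
        # A prefix of length k contributes only if it is also a suffix.
        if pattern[:k] == pattern[-k:]:
            total += 1 / pattern_probability(pattern[:k], p_heads)
    return total

if __name__ == "__main__":
    for pattern in ("hhht", "htht", "hhhh"):
        delay = waiting_time(pattern) - mean_time(pattern)
        print(pattern, mean_time(pattern), waiting_time(pattern), delay)
```

For a fair coin, the delay of (h,h,h,h) computed this way equals the waiting time of (h,h,h), while the delay of (h,h,h,t) is zero.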
If one assumes that people’s perception of randomness is shaped by the environment (e.g., Ayton & Fischer, 2004; Lopes & Oden, 1987; Pinker, 1997), it is likely that people have actually experienced different statistics from the same process, although they might not be aware of the exact distinction. The contrast, whether between mean time and waiting time or between frequency and delay, might have important implications for people’s perception of sequential patterns, particularly in the gambler’s fallacy and the hot hand belief. In the following, we first examine the (ex-ante) perception or expectation of patterns as an integrated sequence, and then the prediction of a single outcome based on the perception of patterns.
First, because it has the largest amount of self-overlap, a streak pattern is the most delayed pattern in its first occurrence, compared with all other patterns of the same length. The amount of delay is considerable even for short streaks (see Table 1), and it grows exponentially with the length of the streak. For example, for a streak of 10 heads in tossing a fair coin, the mean time is 1024 tosses and the waiting time is 2046 tosses, 1022 tosses beyond the mean time (this delay is the waiting time for a streak of 9 heads). Given that the mean time remains the same for all patterns of the same length, it is possible that people’s sense of rareness about streak patterns has stemmed from their experience of the long waiting times.
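The figures for a streak of 10 heads follow from summing the mean times of all shorter streaks. As a worked sketch for a fair coin (the shorthand $h^r$ for a streak of r heads is introduced here for illustration only):

```latex
\begin{align*}
E[T_{h^{10}}]   &= 2^{10} = 1024,\\
E[T^*_{h^{10}}] &= \sum_{k=1}^{10} 2^{k} = 2^{11} - 2 = 2046,\\
E[T^*_{h^{10}}] - E[T_{h^{10}}] &= 2^{10} - 2 = 1022 = E[T^*_{h^{9}}].
\end{align*}
```

In general, a streak of length r has waiting time $2^{r+1} - 2$ and delay $2^{r} - 2$, so the delay roughly doubles with each additional head.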
Moreover, the waiting time statistic can manifest itself in many other forms. One example is the probability of occurrence at least once (the probability that a particular pattern occurs at least once when a coin is tossed N times), which is the complement of the probability of nonoccurrence (the probability that a particular pattern does not occur at all in N tosses). The latter probability has been discussed by Hahn and Warren (2009), and Sun et al. (2010a) provide an analytical solution for both probabilities. It can be shown that among all patterns of length 4, the streak pattern (h,h,h,h) has the lowest probability of occurrence at least once for any N>4, which is another consequence of the self-overlapping property of the pattern composition (Figure 2 shows the comparison between (h,h,h,h) and (h,h,h,t)). This fact might, from another perspective, explain why streak patterns are under-represented in people’s perception. That is, because of its clustering tendency, overlapped reoccurrences of a streak pattern may be counted only once or replaced by one count of a longer streak. Such speculation appears to be consistent with the finding in a recent study by Olivola and Oppenheimer (2008): when participants recalled a studied binary sequence, the lengths of streaks embedded in the original sequence were underestimated.
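The probability of occurrence at least once can be computed numerically without enumerating sequences. The following Python sketch is not the analytical solution of Sun et al.; it is an assumed alternative that tracks, toss by toss, how much of the pattern is currently matched (a standard string-matching automaton) and accumulates the probability that the pattern has not yet been completed. All function names are hypothetical.

```python
def failure_links(pattern):
    """Length of the longest proper prefix that is also a suffix, per prefix."""
    links = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = links[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        links[i] = k
    return links

def next_state(pattern, links, state, outcome):
    """Number of leading pattern elements matched after observing `outcome`."""
    while state > 0 and pattern[state] != outcome:
        state = links[state - 1]
    if pattern[state] == outcome:
        state += 1
    return state

def prob_at_least_once(pattern, n_tosses, p_heads=0.5):
    """P(pattern occurs at least once in n_tosses independent tosses)."""
    r = len(pattern)
    links = failure_links(pattern)
    # dist[m]: probability that the pattern has not occurred so far and the
    # most recent tosses match its first m elements (m < r).
    dist = [0.0] * r
    dist[0] = 1.0
    for _ in range(n_tosses):
        new_dist = [0.0] * r
        for state, mass in enumerate(dist):
            if mass == 0.0:
                continue
            for outcome, prob in (("h", p_heads), ("t", 1.0 - p_heads)):
                nxt = next_state(pattern, links, state, outcome)
                if nxt < r:  # pattern still incomplete; otherwise it is absorbed
                    new_dist[nxt] += mass * prob
        dist = new_dist
    return 1.0 - sum(dist)

if __name__ == "__main__":
    for n in (4, 5, 10, 20, 50):
        print(n, prob_at_least_once("hhhh", n), prob_at_least_once("hhht", n))
```

For N = 4 the two patterns tie at 1/16; for larger N the value for (h,h,h,h) should stay below that for (h,h,h,t).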
Nevertheless, although the long waiting time might provide a statistical basis for people’s perception of streak patterns as rare events, it does not justify the prediction that a single outcome will reverse (or avoid) a streak pattern, as in the gambler’s fallacy. By the independence assumption of Bernoulli trials, given that one has already obtained three heads in a row, the additional time to encounter (h,h,h,h) is E[T_{h,h,h,h}], and the additional time to encounter (h,h,h,t) is E[T_{h,h,h,t}], which are the same in the case of a fair coin (see Equations 3 and 4). That is, the statement that the first occurrence of the streak pattern (h,h,h,h) is “delayed” is an ex-ante expectation that holds when the pattern is treated as a whole and one starts tossing the coin from scratch. However, such a statement does not mean that the first occurrence of the “streak-reversal” pattern (h,h,h,t) is “expedited”, since its waiting time cannot be shorter than its mean time. In other words, although waiting time (or the probability of occurrence at least once) may depict (h,h,h,t) as the most “representative” pattern of the coin tossing process (because its waiting time equals its mean time, or because its occurrences are most evenly distributed), it does not predict that a streak of heads will soon be reversed by a tail, and thus it does not vindicate the error in the gambler’s fallacy.
What, then, about the prediction that a single outcome will continue a streak, as in the hot hand belief? The debate over the statistical validity of the hot hand belief has lasted more than twenty years (e.g., Bar-Eli et al., 2006), and it is not likely to be ended simply by introducing a new set of statistics. However, pattern time statistics do seem to support some of the existing theories. In particular, it has been suggested that the hot hand belief arises when people are evaluating human performance, and that people pay particular attention to streak patterns in order to detect a change in the performance, for example, fluctuations in the shooting accuracy of basketball players (e.g., Ayton & Fischer, 2004; Burns, 2004; Burns & Corpus, 2004; Sun, 2004). By such an account, the prediction that a streak will continue is actually valid on the basis of a higher probability of a single outcome (e.g., a higher shooting accuracy, or a higher probability of heads in the case of a biased coin). It can be shown that, by the measure of either mean time or waiting time, streak patterns are indeed a good indicator for detecting changes in the probability of single outcomes. Figure 3 shows mean time and waiting time as functions of the probability of heads (ph). It shows that with a small change in ph, both mean time and waiting time change more rapidly for pattern (h,h,h,h) than for pattern (h,h,h,t). For example, when ph increases from .5 to .6, E[T_{h,h,h,h}] drops from 16 tosses to 7.7 tosses (Δ = 8.3), and E[T*_{h,h,h,h}] drops more rapidly from 30 tosses to 16.8 tosses (Δ = 13.2). In contrast, E[T_{h,h,h,t}] and E[T*_{h,h,h,t}] only drop from 16 tosses to 11.6 tosses (Δ = 4.4) (note that for pattern (h,h,h,t), the mean time and waiting time are identical at all levels of ph).
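The values quoted for ph = .6 can be reproduced by hand. As a worked sketch, assuming that the waiting time of (h,h,h,h) is the sum of the mean times of the streaks of length 1 to 4 and that $p_t = 1 - p_h = .4$:

```latex
\begin{align*}
E[T_{h,h,h,h}]   &= \frac{1}{p_h^{4}} = \frac{1}{0.1296} \approx 7.7,\\
E[T^*_{h,h,h,h}] &= \frac{1}{p_h} + \frac{1}{p_h^{2}} + \frac{1}{p_h^{3}} + \frac{1}{p_h^{4}}
                  \approx 1.67 + 2.78 + 4.63 + 7.72 \approx 16.8,\\
E[T_{h,h,h,t}] = E[T^*_{h,h,h,t}] &= \frac{1}{p_h^{3}\,p_t} = \frac{1}{0.0864} \approx 11.6.
\end{align*}
```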
Furthermore, Figure 3 also shows that the mean time and waiting time paint different pictures of the occurrences of streak pattern (h,h,h,h) at various levels of ph. For example, at ph = .5, (h,h,h,h) and (h,h,h,t) are indistinguishable by the measure of mean time but quite distinguishable by the measure of waiting time, a fact we have mentioned before. It also shows that (h,h,h,h) will occur more frequently than (h,h,h,t) (shorter mean time) as soon as ph>.5, but its first occurrence remains delayed (longer waiting time) until ph>.7. One may then ask: if mean time and waiting time have different effects on people’s perception (that is, if people perceive the properties of frequency and delay differently), how are these effects integrated to produce a single response in the subjective preference over patterns?
As mentioned before, the waiting time E[T*] is in effect an indicator of the variance of the interarrival times Var(T), as both are direct consequences of the self-overlapping property of the pattern. From this perspective, the contrast between frequency (mean time) and delay (waiting time) actually reflects the contrast between the mean and variance of the same random variable, the interarrival time: E[T] and Var(T). To evaluate the combined effect of frequency and delay on people’s perception, a quantitative measure may be provided by the mean-variance paradigm in modern portfolio theory (Markowitz, 1952; Sharpe, 1994) or the coefficient of variation in risk perception (Weber, Shafir, & Blais, 2004). [Footnote 2] In this paradigm, to describe the desirability of an option in the tradeoff between return and risk, a Sharpe Ratio (Sharpe, 1994) is calculated as the ratio between the expected return and the standard deviation of the returns. Similarly, for pattern occurrences, we can calculate a “Frequency-Delay ratio”, 100/(µσ), to describe the tradeoff between the frequency of occurrence (1/µ) and the delay (σ), where µ = E[T] and σ = SD(T) are the mean and standard deviation of interarrival times, respectively, and the constant 100 expresses the ratio as a percentage. Analogous to the Sharpe Ratio, by which people would prefer an option with a higher return and a lower risk, the assumption underlying the Frequency-Delay ratio is that people would be more willing to make a prediction on a pattern with a higher frequency of occurrence and a smaller amount of delay. In other words, a person can possess both the gambler’s fallacy and the hot hand belief, and which belief arises would depend on how frequency and delay are weighted separately and how they are combined.
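A small numerical sketch of the Frequency-Delay ratio follows. It is not the authors' script. It assumes the standard renewal-theory expression for the variance of pattern interarrival times, σ² = p⁻² − (2r − 1)p⁻¹ + 2p⁻³ Σ E[I(r)I(r + j)] for j = 1, …, r − 1, where p is the probability of the pattern, r its length, and E[I(r)I(r + j)] the probability that the pattern is completed at toss r and again at toss r + j. The function names are hypothetical.

```python
import math

def interarrival_stats(pattern, p_heads):
    """Return (mean, variance) of the interarrival times of `pattern`."""
    def prob(seq):
        out = 1.0
        for outcome in seq:
            out *= p_heads if outcome == "h" else 1.0 - p_heads
        return out

    r = len(pattern)
    p = prob(pattern)
    overlap_sum = 0.0
    for j in range(1, r):
        # Arrivals at tosses r and r + j are both possible only if the last
        # r - j elements of the pattern equal its first r - j elements.
        if pattern[j:] == pattern[:r - j]:
            overlap_sum += p * prob(pattern[r - j:])
    mean = 1.0 / p
    variance = p ** -2 - (2 * r - 1) / p + 2.0 * overlap_sum / p ** 3
    return mean, variance

def frequency_delay_ratio(pattern, p_heads):
    """100 / (mu * sigma), with sigma the SD of interarrival times."""
    mean, variance = interarrival_stats(pattern, p_heads)
    return 100.0 / (mean * math.sqrt(variance))

if __name__ == "__main__":
    for p_heads in (0.50, 0.60, 0.61, 0.62, 0.70):
        print(p_heads,
              round(frequency_delay_ratio("hhht", p_heads), 3),
              round(frequency_delay_ratio("hhhh", p_heads), 3))
```

Under this assumed formula, the ratio favors (h,h,h,t) for a fair coin, and the ordering of the two patterns switches between ph = .61 and ph = .62.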
Figure 4 shows the Frequency-Delay ratio at various levels of ph, where pattern (h,h,h,t) has a higher ratio when ph<.61, and pattern (h,h,h,h) has a higher ratio when ph>.62. By this ratio, a person would be more willing to predict (h,h,h,t) when ph<.61, and more willing to predict (h,h,h,h) when ph>.62. That is, “a streak of heads is unlikely to occur if the coin is fair, and a basketball player is unlikely to make shots in streaks unless he or she (really) has a hot hand.”
4 Conclusion
We presented a set of statistics on the time of pattern occurrences in Bernoulli trials. In particular, we demonstrated that different pattern compositions give rise to different statistical properties. The mean time measures the frequency of pattern occurrences, and the waiting time measures the amount of delay in pattern occurrence times. While previous research on the perception of random patterns has focused on the mean time or the frequency of occurrences (e.g., Budescu, 1987; Falk & Konold, 1997; Gilovich et al., 1985; Lopes & Oden, 1987; Nickerson, 2002; Oskarsson et al., 2009), the waiting time and the property of delay have not been addressed until recently. It is likely that people are not able to precisely differentiate these statistics. However, given that these statistics can be observed from the same process, it is possible that they have all played roles in shaping people’s perception of random patterns. It has long been argued that people may have an accurate sense of randomness but fail to reveal it in their behavior (e.g., Pinker, 1997; for an overview of different opinions, see Rapoport & Budescu, 1992). The distinction between mean time and waiting time may provide a new perspective for studies of people’s perception of random events. For instance, the mean time does not differentiate between patterns of the same length in the case of tossing a fair coin; if one assumes people have acquired accurate experiences from the environment, it is quite possible that people’s notion of streak patterns as rare events is guided by the waiting time.
It should be noted that the statistics of pattern times can manifest themselves in many forms, and each manifestation may receive different interpretations, depending on the specific assumptions about human perception of sequential events and the specific task environment. For example, regarding the distinction between frequency and delay, people may be more sensitive to one property than to the other, or one property may be more important than the other in a given situation. For a passenger waiting for a bus, if he or she is concerned only with catching the first bus (or at least one bus), the waiting time (delay) is certainly more important. However, if the passenger is interested in estimating the number of bus arrivals in a certain period of time, the mean time should be the statistic of choice. As another example, if we model actual basketball shooting as tossing a coin with a fixed probability of heads (as the null hypothesis), the hot hand belief would be judged a fallacy. However, if we assume that the basketball player’s shooting accuracy is initially unknown and can fluctuate, paying attention to the occurrence of streaks may actually be an effective way to detect the change.
Even in simple cases like coin tossing, people’s perception of randomness surely cannot be reduced to a certain set of statistics. Many other perceptual and cognitive mechanisms can come into play, such as perception of proportion and symmetry (Rapoport & Budescu, 1997), subjective complexity (Falk & Konold, 1997), or working memory capacity (Kareev, 1992). Nevertheless, the statistics of pattern times appear to qualify as a useful toolset, since they provide objective measures in situations where sequential information is essential. For example, people respond differently depending on whether they view sequences all at once on paper or actually observe sequences unfolding over time (e.g., Olivola & Oppenheimer, 2008). They habitually look for sequential patterns, and their perception of patterns influences their responses in single experimental trials even when the trials in the sequence are completely independent (Barraclough, Conroy, & Lee, 2004; Huettel, 2006; Huettel, Mack, & McCarthy, 2002). Moreover, the dissociation of frequency and delay might have important implications for studies of people’s intertemporal choices. It has been suggested that people are sensitive to time discounting, while the behavioral and neural effects of delays are independent of probability (e.g., Luhmann, Chun, Yi, Lee, & Wang, 2008; McClure, Laibson, Loewenstein, & Cohen, 2004). In these respects, we conjecture that examination of the pattern time statistics, combined with empirical experiments, might be a fruitful approach in future investigations of pattern detection, perception of randomness, and judgment and decision-making under uncertainty.
Appendix. The mean of the first arrival time, and the mean and variance of interarrival times
Let X_1, X_2, … be independent variables with P{X_i = j} = p(j), j ≥ 0. In the case of coin tossing, j = 0, 1, and p(0) = p_h and p(1) = p_t represent the probabilities of a head and a tail, respectively. For a pattern (x_1,…,x_r) of length r, we say that an arrival (renewal) occurs at time n, n ≥ r, if (X_{n−r+1},…,X_n) = (x_1,…,x_r). Let N(n) denote the number of arrivals by time n. N(n), n ≥ 1, is a counting process in which the first arrival time has a different distribution than the other interarrival times. Then, N(n), n ≥ 1, is said to be a general or delayed renewal process with parameters µ and σ² as the mean and variance of the time between successive arrivals.
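Define indicator variables I(i), where I(i) = 1 if there is an arrival at time i and I(i) = 0 otherwise, i ≥ r. The I(i) are Bernoulli random variables with parameter p, where
$$p = P\{I(i) = 1\} = \prod_{k=1}^{r} p(x_k). \qquad (5)$$
Then, the mean interarrival time is given by
$$\mu = E[T] = \frac{1}{p}, \qquad (6)$$
and the variance of interarrival times is given by
$$\sigma^2 = \mathrm{Var}(T) = p^{-2} - (2r-1)\,p^{-1} + 2p^{-3}\sum_{j=1}^{r-1} E[I(r)I(r+j)]. \qquad (7)$$
An overlap index s is defined as the maximum number of elements at the end of the pattern that can be used as the beginning part of the next arrival,
$$s = \max\{\, j : 1 \le j \le r-1,\ (x_{r-j+1},\dots,x_r) = (x_1,\dots,x_j) \,\}. \qquad (8)$$
We further define s = 0 when no equality can be found in Equation (8). Then, s = 0 for (h,h,h,t), s = 2 for (h,t,h,t), and s = 3 for (h,h,h,h).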
Since the first arrival time can have a different distribution, let T denote the interarrival time of the process when reoccurrences are included, and T* denote the first arrival time. For pattern (h,h,h,t), r = 4 and s = 0, so that N(n), n ≥ 1, is an ordinary renewal process. Therefore, for a fair coin, p_h = p_t = 1/2, from Equation (6),
$$E[T^*_{h,h,h,t}] = E[T_{h,h,h,t}] = \frac{1}{p_h^{3}\,p_t} = 16.$$
Since two arrivals of (h,h,h,t) cannot occur within a distance less than r (r = 4) of each other, it follows that I(r)I(r + j) = 0 when 1 ≤ j ≤ r − 1, and
$$\sum_{j=1}^{r-1} E[I(r)I(r+j)] = 0.$$
Assuming a fair coin, from Equation (7), we obtain
$$\mathrm{Var}(T_{h,h,h,t}) = 16^{2} - 7 \cdot 16 = 144.$$
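For pattern (h,h,h,h), r = 4 and s = 3, and
$$T^*_{h,h,h,h} = T^*_{h,h,h} + T_{h,h,h,h},$$
where T*_{h,h,h,h} is the first arrival time for pattern (h,h,h,h), T*_{h,h,h} is the first arrival time for pattern (h,h,h), and T_{h,h,h,h} is the interarrival time for (h,h,h,h). Since the coin is tossed independently, we have
$$E[T^*_{h,h,h,h}] = E[T^*_{h,h,h}] + E[T_{h,h,h,h}]. \qquad (10)$$
For a fair coin, from Equation (6), we have
$$E[T_{h,h,h,h}] = \frac{1}{p_h^{4}} = 16.$$
Then, E[T*_{h,h,h,h}] can be obtained by recursively applying Equation (10) s times, starting from the shortest non-overlapping pattern (h). That is,
$$E[T^*_{h,h,h,h}] = E[T_{h}] + E[T_{h,h}] + E[T_{h,h,h}] + E[T_{h,h,h,h}] = 2 + 4 + 8 + 16 = 30.$$
For the variance of interarrival times, since no two arrivals of (h,h,h,h) can occur within a distance (r−s−1) of each other, it follows that I(r)I(r + j) = 0 if 1 ≤ j ≤ (r−s−1). Therefore, from Equation (7), we have
$$\mathrm{Var}(T_{h,h,h,h}) = p^{-2} - (2r-1)\,p^{-1} + 2p^{-3}\sum_{j=r-s}^{r-1} E[I(r)I(r+j)]. \qquad (11)$$
The overlapped arrivals can happen in the following ways,
$$E[I(r)I(r+1)] = p_h^{5}, \qquad E[I(r)I(r+2)] = p_h^{6}, \qquad E[I(r)I(r+3)] = p_h^{7}.$$
Thus, for a fair coin,
$$\mathrm{Var}(T_{h,h,h,h}) = 16^{2} - 7 \cdot 16 + 2 \cdot 16^{3}\left(\frac{1}{32} + \frac{1}{64} + \frac{1}{128}\right) = 144 + 448 = 592.$$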
From Equations (10) and (11), it can be seen that overlapped arrivals extend both the mean of the first arrival time and the variance of interarrival times. Thus, these two quantities are correlated, and both increase with the overlap index s. The procedure for calculating the variance of the first arrival time is omitted here.
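The mean and variance of interarrival times can also be checked empirically. The sketch below is an independent Python illustration, not the authors' R scripts; the function name, sequence length, and seed are assumptions. It records every arrival of a pattern in one long simulated sequence, counting overlapping reoccurrences as separate arrivals, and then summarizes the gaps between successive arrivals.

```python
import random
from statistics import mean, pvariance

def interarrival_times(pattern, n_tosses, p_heads=0.5, seed=7):
    """Gaps (in tosses) between successive arrivals of `pattern`."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    r = len(pattern)
    window = ""
    arrivals = []
    for n in range(1, n_tosses + 1):
        window = (window + ("h" if rng.random() < p_heads else "t"))[-r:]
        if window == pattern:  # overlapping reoccurrences count as arrivals
            arrivals.append(n)
    return [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]

if __name__ == "__main__":
    for pattern in ("hhht", "hhhh"):
        gaps = interarrival_times(pattern, 800_000)
        print(pattern, round(mean(gaps), 2), round(pvariance(gaps), 1))
```

For a fair coin, both patterns should show a mean gap near 16 tosses, while the variance should be markedly larger for (h,h,h,h), near 592 compared with about 144 for (h,h,h,t) if the variance formula assumed here is the one intended in this appendix.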