Introduction
Statistical analysis of field-collected data to determine whether community structure is influenced by interspecific competition is fraught with difficulties (Gotelli 2000). In parasite communities, the challenges are amplified for two reasons. First, with notable exceptions (e.g., Holmes 1961, 1962), the systems themselves are not conducive to experimental proof of underlying competitive interactions. Therefore, patterns observed in natural systems cannot be calibrated to known interactions demonstrated via experiment. Second, the underlying structure of many datasets leads to mutually conflicting potential interpretations. Many parasite communities are composed of species with low prevalence and abundance, leading to few hosts harbouring multiple parasite species and a large number of hosts that are lightly infected or uninfected. Thus, the data matrix (parasite × host) resulting from a field survey can be consistent with a community experiencing high levels of competition, or with a community entirely lacking interspecific interactions due to transmission limitation.
Heuristic methods have been devised for categorizing communities along a continuum of competitive interactivity. For example, the classic formulation of Holmes and Price (1986) and subsequent authors arrayed parasite communities from those likely to be isolationist (structured largely by the autecological properties of species) to those likely to exhibit interactivity (and therefore have some aspect of community structure influenced by interspecific interactions). Discerning whether datasets from natural communities bear the signal of competition is an empirical, analytical problem, not a logical one. For example, intestinal helminth communities of fishes have been characterized as isolationist because most communities consist of relatively few species (compared to the communities of parasites in birds and mammals) that typically infect their hosts at lower intensities and prevalence (Kennedy et al. 1986; Salgado-Maldonado et al. 2019). Because these systems are intractable experimentally, inferences about the processes that govern community structure will rely on analysis of field-collected data. Whether the methods available can detect the presence of competition in such datasets is not yet known, nor does there exist a comprehensive comparative guide to methods that might be worth employing in the first place.
The goal of this study was to build a preliminary framework for evaluating how different analytical approaches behave statistically when used to test the hypothesis that field-collected community data are structured, in part, by competitive interactions. A dataset collected over nearly two decades in the same host–parasite system was utilized to construct model parasite communities experiencing interspecific competition of several different modes and intensities. Four methods of analysis (two for the pattern of species co-occurrence and two for pairwise abundance relationships) were deployed, allowing this study to initially characterize how well these methods detect competition when it is present and reject the hypothesis of competition when it is absent. The results herein confirm some of the well-known difficulties found when trying to discern competitive interactions in a community dataset, as well as some initially surprising relationships between the properties of parasite communities and their likelihood of carrying a pattern of community interactivity that can be detected analytically.
Materials and methods
Samples of creek chub (Semotilus atromaculatus) from small streams of the Big and Little Nemaha Rivers in southeastern Nebraska, USA, were collected for various research projects from 2003 to 2017 (Barger 2006, 2007, 2019, 2020; Robinson and Barger 2007; Barger and Olsen 2013). Barger (2020) provides additional details on sampling sites and methods employed in collecting and processing creek chub and their parasites.
Seventy-nine samples of at least 20 creek chub totalling 2,260 fish (mean = 29 fish/sample; range: 20–92) were included in this set of analyses. Intestinal helminths representing five species were recovered: a trematode Allocreadium lobatum, a proteocephalid tapeworm Proteocephalus sp., a nematode Rhabdochona canadensis, an acanthocephalan Paulisentis missouriensis, and the Asian fish tapeworm Schyzocotyle acheilognathi. The last of these species was excluded from the present investigation because it was present in only 29 creek chub. For convenience in figures and tables, the remaining four species are referred to by abbreviations: ALOB = A. lobatum; PROT = Proteocephalus sp.; RCAN = R. canadensis; PMIS = P. missouriensis.
For each parasite species, those samples of the original 79 from which no parasite was recovered were omitted: A. lobatum (13 samples omitted; 1,897 total hosts remaining); Proteocephalus sp. (21 samples omitted; 1,705 total hosts remaining); R. canadensis (43 samples omitted; 1,081 total hosts remaining); and P. missouriensis (22 samples omitted; 1,635 total hosts remaining). These four datasets were then used as the population of potential infections from which model communities were assembled.
Model infrapopulations (one for each parasite species per individual model creek chub) were assembled by randomly drawing an abundance (0 to maximum observed) from the observed datasets described immediately above. Thus, each infracommunity was assembled by assuming infection by each parasite species was independent of all other species. Thirty component communities, each consisting of 30 infracommunities, were assembled in this way using the observed datasets, and this set of communities is referred to as the NOCOMP (i.e., communities with no effect of interspecific competition).
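The assembly of NOCOMP communities described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the function name, the dictionary layout, and the toy abundance pools are hypothetical, and each pool is assumed to list one abundance per retained host (zeros included).

```python
import random

def assemble_component_community(pools, n_hosts=30, rng=None):
    """Build one component community with no competition.

    pools: dict mapping a species code to the list of abundances observed
           across all hosts retained for that species (zeros included).
    Returns a list of n_hosts dicts, one infracommunity per model host.
    """
    rng = rng or random.Random()
    # Each species is drawn independently, so infections are uncorrelated.
    return [
        {sp: rng.choice(abundances) for sp, abundances in pools.items()}
        for _ in range(n_hosts)
    ]

# Hypothetical toy pools; the observed pools hold 1,081 to 1,897 hosts each.
toy_pools = {
    "ALOB": [0, 0, 0, 1, 2, 5],
    "PROT": [0, 0, 1, 1, 3],
    "RCAN": [0, 0, 0, 0, 4],
    "PMIS": [0, 0, 2, 7],
}
rng = random.Random(1)
# Thirty component communities of 30 infracommunities each.
nocomp = [assemble_component_community(toy_pools, 30, rng) for _ in range(30)]
```

Drawing from the full pool of observed abundances preserves each species' observed prevalence and intensity distribution on average, while guaranteeing independence among species.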
Four models of interspecific competition were used to modify the NOCOMP communities: PREEMPT (pre-emptive competition), RANDWIN (random winner), ABWIN (most abundant wins), and ACANWIN (acanthocephalan wins). In PREEMPT infracommunities, a parasite species was chosen at random and assumed to have infected that host prior to all other species of parasites, the latter of which had all their non-zero intensities reduced by 50%. In this model, competitive exclusion was possible only if an inferior competitor had an initial intensity of 1.0, in which case it was reduced to 0 (zero).
RANDWIN communities were similar, except that the randomly chosen winner eliminated all inferior competitors with equal or lower intensities and reduced all others’ intensities by 50%; competitive exclusion was comparatively common using this model. ABWIN communities designated the most abundant species in each infracommunity as the dominant competitor and reduced all other intensities by 50%; as with PREEMPT, ABWIN communities experienced competitive exclusion only when an inferior competitor had an intensity of 1.0. ACANWIN communities designated the acanthocephalan (P. missouriensis) as the dominant competitor in any infracommunity in which it occurred (competition was absent when P. missouriensis was absent), and it reduced all other intensities by 50% and resulted in exclusion only when an inferior competitor had an initial intensity of 1.0.
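The four competition rules can be expressed compactly in code. The sketch below is one possible reading of the descriptions above and rests on assumptions the text does not settle: the 50% reduction is rounded down (so an intensity of 1 becomes 0), the random winner is drawn only from species present in the host, and ties for most abundant species are broken at random. All function names are hypothetical.

```python
import random

def halve(n):
    # Assumption: a 50% reduction rounds down, so an intensity of 1 becomes 0.
    return n // 2

def present(infra):
    return sorted(sp for sp, n in infra.items() if n > 0)

def dominant_halves_others(infra, winner):
    """The winner keeps its intensity; every other species is halved."""
    return {sp: (n if sp == winner else halve(n)) for sp, n in infra.items()}

def preempt(infra, rng):
    spp = present(infra)
    if not spp:
        return dict(infra)
    # Assumption: the first colonist is drawn from the species present.
    return dominant_halves_others(infra, rng.choice(spp))

def randwin(infra, rng):
    spp = present(infra)
    if not spp:
        return dict(infra)
    winner = rng.choice(spp)
    out = {}
    for sp, n in infra.items():
        if sp == winner:
            out[sp] = n
        elif n <= infra[winner]:
            out[sp] = 0          # excluded: equal or lower intensity
        else:
            out[sp] = halve(n)   # persists, but at half its intensity
    return out

def abwin(infra, rng):
    spp = present(infra)
    if not spp:
        return dict(infra)
    top = max(infra[sp] for sp in spp)
    # Assumption: ties for most abundant are broken at random.
    winner = rng.choice([sp for sp in spp if infra[sp] == top])
    return dominant_halves_others(infra, winner)

def acanwin(infra, acan="PMIS"):
    # No competition unless the acanthocephalan is present.
    if infra.get(acan, 0) == 0:
        return dict(infra)
    return dominant_halves_others(infra, acan)
```

For a host with ALOB = 1, PROT = 6, RCAN = 0, and PMIS = 3, the ACANWIN rule under these assumptions leaves PMIS at 3, halves PROT to 3, and excludes ALOB.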
RANDWIN was the most extreme model of interspecific competition employed herein, where the only way an inferior competitor could persist in an infracommunity was if it was more abundant than the randomly chosen superior competitor. All models of competition employed a very heavy burden of competition on inferior competitors and no burden on superior competitors, and thus, competition was highly asymmetric. In addition, the per-capita reduction of intensities was high; inferior competitors suffered a minimum of 50% loss of intensity. This simulated a very strong effect of competition at the infracommunity level, which was thought necessary to effectively determine whether available methods of analysing field-collected datasets could detect competition in communities. In addition, since parasite communities of fish consist of relatively few co-occurring species at relatively low intensities, it seemed wise to employ strong competitive effects to maximize the possibility that these effects would be detectable in the first place.
Taken together, the five models above consisted of 150 component communities (30 x 5 models) including in aggregate 4,500 infracommunities (150 x 30 infracommunities), each with a maximum of four parasite species present. These five models are referred to as BASE models because they are based on the observed infection parameters in the field-collected creek chub. Since more species-rich communities with higher parasite prevalence are often seen as being highly interactive, the effects of increasing parasite prevalence and parasite species richness were investigated herein as well.
In PREV models, the above procedures were duplicated, except the underlying prevalence (and therefore mean abundance) of each of the four species was doubled. The existing intensities (non-zero abundances) for each species were simply copied, replacing the equivalent number of uninfected hosts. Each parasite’s observed prevalence was less than 0.5, so doubling prevalence was possible in each case. Doubling prevalence modelled a situation in which interspecific encounters were increased but without a joint increase in the intensity of each encounter (i.e., infracommunity abundances remained as for the BASE datasets).
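A brief sketch of the prevalence-doubling step follows. It assumes that each species' source data can be treated as a flat list of abundances, one per host; the function name is hypothetical.

```python
def double_prevalence(abundances):
    """Return a copy of one species' abundance pool with prevalence doubled.

    Every non-zero abundance (intensity) is copied once, and each copy
    replaces one uninfected host, so the number of hosts is unchanged.
    """
    positives = [n for n in abundances if n > 0]
    zeros = len(abundances) - len(positives)
    if len(positives) > zeros:
        raise ValueError("prevalence above 0.5 cannot be doubled")
    return positives * 2 + [0] * (zeros - len(positives))
```

Because the copied intensities are identical to the originals, mean intensity among infected hosts is unchanged, whereas prevalence and mean abundance both double.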
In SPRICH models, the number of parasite species present was doubled from four to eight. Each observed parasite species was duplicated using the data in BASE NOCOMP. Thus, the copy of A. lobatum has the same prevalence, mean abundance, etc. as the actual A. lobatum, etc. Because community assembly was random, each duplicated species exhibited its own infection parameters in model infra- and component communities, although their average infection parameters closely mirrored the species from which they were duplicated. In this aspect of the study, the total number of species that could be engaged in interspecific competition increased, but the underlying probabilities of infection and infection intensities remained unchanged.
Each duplicated species in the SPRICH models is identifiable by its four-letter code, which is the reverse of the four-letter code of its originator species (i.e., BOLA is the duplicate of ALOB (A. lobatum), TORP of PROT, NACR of RCAN, and SIMP of PMIS). In both PREV and SPRICH communities, the same five sets of communities were created as were for the BASE communities. In ACANWIN models in the SPRICH dataset, PMIS remained the acanthocephalan designated as the superior competitor; SIMP was an inferior competitor.
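The duplication and naming scheme for SPRICH can be illustrated as follows. This sketch assumes that each duplicate simply reuses its originator's pool of abundances and is then drawn independently during community assembly; the function name and toy pools are hypothetical.

```python
def duplicate_species(pools):
    """Add a copy of each species' pool under its reversed four-letter code."""
    doubled = dict(pools)
    for code, abundances in pools.items():
        doubled[code[::-1]] = list(abundances)  # e.g., ALOB -> BOLA
    return doubled

# Hypothetical toy pools for the four observed species.
sprich_pools = duplicate_species(
    {"ALOB": [0, 1], "PROT": [0, 2], "RCAN": [0, 0, 3], "PMIS": [0, 4]}
)
```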
Four sets of analyses were performed on each dataset: two straightforward methods used widely in these kinds of studies (correlation; logistic regression) and two null-model approaches that have not been widely applied to parasite communities (envelope analysis; co-occurrence). Logistic regression and null-model co-occurrence analyses concern the presence-absence aspect of community structure, whereas correlation and envelope approaches analyse abundance data. In all cases, these analyses are deployed to determine if a particular aspect of community structure is consistent with what would be expected in the case of negative interspecific interactions such as interspecific competition as modelled herein. Spearman’s correlations and logistic regression models were performed in Minitab (www.minitab.com), and EcoSim Professional (www.garyentsminger.com/ecosim/index.htm) was used for all null model analyses. In all cases, the expectation was that the method would not detect competition in NOCOMP communities but would detect competition in PREEMPT, RANDWIN, ABWIN, and ACANWIN communities. In addition, the expectation was that all methods would display a higher likelihood of detecting competition in the PREV and SPRICH datasets than in the BASE datasets.
Spearman’s rank correlation (rs) was calculated for each species pair among the 30 infracommunities in each component community under the hypothesis that interspecific competition would increase the number of negative and significantly negative correlation coefficients. Any correlations in which a species was absent prior to the modelled effects of competition were excluded from subsequent summaries. Any correlation that could not be performed because modelled competition resulted in the exclusion of a species from the component community was tallied as a significantly negative correlation because interspecific competition resulted in exclusion at both the infracommunity and component community levels. The mean rs among the analyses per dataset-model was calculated, along with the percentages of those correlations that were significantly negative and significantly positive (at an uncorrected α = 0.05).
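For readers wishing to reproduce the pairwise correlations outside Minitab, a plain-Python sketch is given here. It computes rs as the Pearson correlation of tie-averaged ranks and returns None when a species shows no variation (e.g., it is absent from every host), which corresponds to the correlations that could not be performed. Significance testing is omitted, and all names are hypothetical.

```python
from itertools import combinations
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rs, or None if either variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5

def pairwise_rs(community):
    """community: list of dicts (species code -> abundance), one per host."""
    species = sorted(community[0])
    return {
        (a, b): spearman([h[a] for h in community], [h[b] for h in community])
        for a, b in combinations(species, 2)
    }
```

Averaging tied ranks matters here because most hosts share an abundance of zero, so ties are the rule rather than the exception in these datasets.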
Envelope analyses were conducted as a non-parametric, null-model-based approach similar to correlation analyses. Correlations among parasite abundances are unlikely to be linear because of the high number of hosts harbouring few or no parasites of both species. Negative associations between parasite abundances could follow a pattern wherein the lower left quadrant of the abundance biplot is filled, but the upper right quadrant is empty, a situation in which there are no simultaneous heavy infections of both species. This is similar to the constraint envelopes or upper bounds familiar in macroecological studies of geographical range size and other species’ traits (Brown and Maurer 1987; Gotelli and Entsminger 2009). This pattern is not conducive to parametric regression techniques because the many points filling the lower left quadrant can obscure an overall pattern of negative association among parasite abundances.
EcoSim provides a method by which the pairwise abundance values of two parasite species among hosts are randomized and then tests whether the observed distribution of abundance values is different than a simulated set of values. Multiple methods are available for evaluating whether the lower-left quadrant of points is different than what would be expected by chance, of which Dispersion was chosen (other metrics performed erratically in initial trials). Dispersion is calculated by dividing the biplot space into four quadrants using the median values of each parasite’s abundance as the center point and then calculating the variance among the number of points falling into each of those four quadrants. If observed values are significantly concentrated in the lower left quadrant, then the variance of the observed dispersion index will be significantly higher than what is produced in the simulated plots (as the distribution of points among the four quadrants approaches uniformity, the variance approaches zero).
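The logic of the Dispersion index and its randomization test can be sketched as below. This is not EcoSim's implementation: the handling of points that fall exactly on a median (assigned here to the lower or left side), the use of the sample variance, and the randomization by shuffling one species' abundances among hosts are all assumptions, and the function names are hypothetical.

```python
import random
from statistics import mean, median, variance

def trim_double_zeros(x, y):
    """Drop hosts in which both species have zero abundance."""
    pairs = [(a, b) for a, b in zip(x, y) if a or b]
    return [a for a, _ in pairs], [b for _, b in pairs]

def dispersion(x, y):
    """Variance among the counts of points in the four median quadrants."""
    mx, my = median(x), median(y)
    counts = [0, 0, 0, 0]  # index 0 is the lower left quadrant
    for a, b in zip(x, y):
        counts[2 * (a > mx) + (b > my)] += 1
    return variance(counts)

def dispersion_test(x, y, n_sim=1000, rng=None):
    """Compare the observed index with indices from shuffled abundances."""
    rng = rng or random.Random()
    observed = dispersion(x, y)
    shuffled = list(y)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # breaks any association between x and y
        sims.append(dispersion(x, shuffled))
    p_upper = sum(s >= observed for s in sims) / n_sim
    return observed, mean(sims), p_upper
```

An index of zero means the points are spread evenly among the four quadrants; larger values indicate concentration in one or more quadrants.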
Initial envelope analyses were run both on unmodified datasets and then on datasets trimmed of any hosts in which the abundance of both parasite species under consideration was zero (i.e., eliminating uninfected hosts). Analyses of the full datasets resulted in no detection of competition, whereas analyses of trimmed datasets revealed the presence of competition in some cases. Not surprisingly, the very high number of uninfected hosts can obscure that competition is occurring in the hosts that are infected. Results from trimmed datasets are included herein. Similarly, two types of boundaries were explored (symmetrical and asymmetrical), but this distinction resulted in no differences in the outcomes of the analyses. Results from using asymmetrical boundaries are presented herein. See Brown and Maurer (1987) and Gotelli and Entsminger (2009) for more details on these analyses and their implementation.
Logistic regression models were built for each combination of target species, dataset, and model (N = 900 hosts per regression model). The presence-absence of the target species was the dependent variable, and the abundances of the other species were independent variables. Component community was included as a categorical covariate but was not significant in any case. Regression coefficients should be negative when interspecific competition is influencing community structure. The fraction of coefficients that were negative was calculated, as well as the mean coefficient for each variable.
EcoSim also implements 36 different null models to determine whether an observed community matrix (species x sites) displays evidence of being structured by interspecific competition. Rows in the matrix (species) and columns (sites) can be treated in different ways during randomization to produce a distribution of reference matrices. EcoSim has nine ways in which this can be implemented, of which four were used herein that were recommended based on analyses of the Type I and II error rates (Gotelli 2000): 1) Sim 2 (Rows Fixed; Columns Equiprobable [FE]) is a model of community assembly in which each species colonizes each site independently of the others, and all sites have the same probability of being colonized; 2) Sim 8 (Rows Proportional; Columns Proportional [PP]) models community assembly in which colonization is dependent both on the row totals (each species’ prevalence) and the column totals (each site’s species richness); 3) Sim 9 (Rows Fixed; Columns Fixed [FF]) maintains both row and column totals as in the observed matrix when modelling community assembly; 4) User-defined (Rows User-Defined; Columns Fixed [UF]), in which the row totals were constrained by each species’ overall abundance, but site totals (columns) remained as in the observed matrix.
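As an illustration of how a reference distribution is generated, the simplest of these algorithms, Sim 2 (FE), can be sketched as a shuffle of each species' occurrences among hosts. The code is a hedged sketch rather than EcoSim's implementation; the function names are hypothetical, and `metric` stands for any co-occurrence index supplied by the user.

```python
import random

def sim2_fixed_equiprobable(matrix, rng):
    """Shuffle each row independently: row totals fixed, columns equiprobable.

    matrix: list of rows (species), each a list of 0/1 across hosts.
    """
    return [rng.sample(row, len(row)) for row in matrix]

def null_distribution(matrix, metric, n_sim=5000, rng=None):
    """Apply `metric` to n_sim randomized matrices and collect the values."""
    rng = rng or random.Random()
    return [metric(sim2_fixed_equiprobable(matrix, rng)) for _ in range(n_sim)]
```

Fixed-fixed randomization (Sim 9) requires preserving column totals as well, which calls for a swap-based algorithm and is not shown here.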
The C-Score (Stone and Roberts 1990) and the V-ratio (Schluter 1984) were used as metrics of co-occurrence. The former calculates the mean number of checkerboard units between all species pairs (i.e., in this case, the number of times that one host has parasite A but not parasite B, and one host has parasite B but not parasite A). Hosts in which both parasite species of a pair are present, and those in which both are absent, do not contribute to the C-Score. C-Scores were calculated for each matrix and then compared to 5,000 randomly produced matrices. The C-Score should be larger than simulated scores in a competitively structured community. The V-ratio is the ratio of the variance of the column sums to the sum of the row variances (V = σ² of column sums / Σ of row σ²). The observed V-ratio should be smaller than the mean of the simulated matrices in a competitively structured community (V_obs < V_sim). The C-Score was used in conjunction with Sim 2 (FE), Sim 9 (FF), and the user-defined option (UF). The V-ratio was used with Sim 2 (FE), Sim 8 (PP), and the user-defined option (UF).
Standardized effect scores (SES) were calculated for each of the analytical results using the methods of Gotelli and Rohde (2002). For each, the difference between the observed value and the mean of the simulated C-Scores or V-ratios was divided by the standard deviation of the simulated data; thus, each SES is the number of standard deviations by which the observed value lies above or below the simulated mean (an expected SES of zero). For C-Scores, negative SES indicates more species aggregation in hosts than expected by chance because the observed C-Score is lower than the simulated C-Score, and positive SES indicates more species segregation among hosts because the observed C-Score is higher than the simulated C-Score. For the V-ratio, the interpretation of the sign of the SES is reversed.
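Both metrics are simple to compute from a presence-absence matrix. The sketch below uses the standard checkerboard-unit formula, (r_i − S)(r_j − S), where r_i and r_j are the numbers of hosts occupied by each species and S is the number of hosts they share, and it assumes population variances for the V-ratio. Function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance

def c_score(matrix):
    """Mean number of checkerboard units over all species pairs.

    matrix: list of rows (species), each a list of 0/1 across hosts.
    """
    units = []
    for a, b in combinations(matrix, 2):
        shared = sum(i and j for i, j in zip(a, b))
        units.append((sum(a) - shared) * (sum(b) - shared))
    return mean(units)

def v_ratio(matrix):
    """Variance of host richness divided by the sum of species variances."""
    col_sums = [sum(col) for col in zip(*matrix)]
    return pvariance(col_sums) / sum(pvariance(row) for row in matrix)
```

For two species that never share a host, such as rows [1, 1, 0, 0] and [0, 0, 1, 1], the C-Score is 4 and the V-ratio is 0; for two species that always occur together, the C-Score is 0 and the V-ratio is 2.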
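The SES calculation itself is a one-line standardization. The sketch below assumes the sample standard deviation of the simulated values is used; the function name is hypothetical.

```python
from statistics import mean, stdev

def ses(observed, simulated):
    """Standardized effect score: (observed - simulated mean) / simulated SD."""
    return (observed - mean(simulated)) / stdev(simulated)
```

For example, an observed C-Score of 12 against simulated C-Scores of 8, 10, and 12 (mean 10, standard deviation 2) gives an SES of 1, which would point weakly toward segregation.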
The goal of these investigations was to explore how each method performed in relationship to the different models of competition employed to create the underlying databases, and in relationship to each other. Like other statistical tests, a good test for competition will not detect competition when it is not there (low Type I error rate) and will reliably detect competition when it is present (high power; low Type II error rate). Thus, the pattern of the effect detected among community types within a single analytical method is as important as any strictly inferential statistical outcome comparing different methods or modes of competition. As such, interpretations are limited to the patterns observed and do not employ statistical comparisons among methods or modes of competition. It is not the goal of this study to recommend any one method to detect competition or to definitively exclude a method from future use. At this early stage of evaluation of these methods and approaches, the differences observed are stark (e.g., poor methods are obviously poor, and promising methods are obviously promising).
Results
Parameters of modelled communities (Table 1) were substantially reduced by competition in communities in all cases (Figure 1). Matrix fill (fraction of cells in a host–parasite matrix including a parasite) and mean infracommunity species richness were reduced between 12% and 30%, and total infracommunity abundance was reduced from 10% to nearly 40%, depending on the dataset and competition model employed. Doubling the prevalence and doubling the species richness generally amplified the degree of reduction displayed in communities altered by competition, and to roughly the same degree (Figure 1). The exception to this result is the ACANWIN models, which behaved the same in the BASE and SPRICH datasets because competition is only registered when the dominant acanthocephalan is present. Overall, PREEMPT and ACANWIN models showed similar reductions in all parameters compared to BASE models, whereas RANDWIN and ABWIN displayed similarly greater degrees of competitive effects. The exception to this observation was for total abundance in the ABWIN models, which is expected because the most abundant member of each infracommunity has its abundance preserved, thus leading to overall lesser effects of competition on the total infracommunity abundance of parasites. RANDWIN models always displayed the largest effects of competition on parasite community parameters.
a Three sources of data were used to construct model communities: BASE used the observed data for each of the 4 species; PREV used data in which the prevalence and mean abundance of each species was doubled from the observed data; SPRICH used data in which each of the 4 species was copied to double the species richness.
b ALOB = Allocreadium lobatum; PROT = Proteocephalus sp.; RCAN = Rhabdochona canadensis; PMIS = Paulisentis missouriensis; BOLA, TORP, NACR, SIMP are copied species, respectively.
Correlations among species pairs were generally nearly zero or slightly positive when competition was absent (NOCOMP) for BASE, PREV, and SPRICH datasets, and all four models of competition resulted in average correlation coefficients that were negative in most cases (Figure 2a). The absolute values of the correlation coefficients were small (0.05–0.15), but the pattern observed among models was as expected, with PREEMPT and ACANWIN being least likely to show negative correlations, and RANDWIN and ABWIN being most likely. Analysis of all three datasets produced similar patterns, with the exception of the ACANWIN model when prevalence was doubled (PREV), which showed slightly positive average correlation coefficients. Doubling the prevalence and doubling the species richness resulted in fewer negative correlations (Figure 2a and Table 2). The absolute number of significantly negative correlations was small for all three datasets (Table 2), never exceeding 9% of the total.
Nearly the same pattern was present for logistic regression analyses (Figure 2b). Mean coefficients were near zero in NOCOMP and negative in the four models of competition. Significant negative associations were found consistently when competition was strongest (RANDWIN) and prevalence and species richness were unaltered (BASE). In all other models, the percentage of potential associations that were significant was lower (< 50%) and decreased in PREV and SPRICH datasets in all cases except for PREV with ACANWIN. The coefficients, whether significant or not, were only weakly negative (Figure 2b) and did not explain much variation in the presence-absence of each target species (Table 3).
Results of envelope analyses did not differ when different boundary types (symmetrical vs. asymmetrical) were utilized (mean SES = 22.98 and 22.95, respectively). Results did differ dramatically between full datasets and datasets trimmed of hosts that were uninfected (double-zeroes) (mean SES = -0.34 and 46.28, respectively). When uninfected hosts were included, the analysis almost never detected competition and often detected more co-occurrence than expected by the null model instead (negative SES). In contrast, when uninfected hosts were excluded, the analysis reliably detected competition in all models of competition. Standardized effect sizes were very large, and the vast majority of individual tests indicated the presence of a pattern in the abundance biplot that was nonrandomly concentrated in the lower left quadrant (Figure 2c). In BASE and SPRICH datasets, envelope analyses also detected competition when it was not present (NOCOMP), usually as often as when competition was present (Figure 2c). In the PREV dataset, however, envelope analyses did not detect competition in NOCOMP models, but did so consistently in PREEMPT, RANDWIN, ABWIN, and to a lesser extent, ACANWIN models (Table 4). Power was strong in these analyses, but Type I error rates were near unity for BASE and SPRICH datasets. Power was good and Type I error rate was low in the PREV dataset.
Results of null model analyses of community composition varied among null models employed, metrics of competition, dataset, and model of competition (Figure 3). The overall pattern of mean scores for Sim 2 (FE) was consistent with a useful method for detecting competition for both the C-Score and the V-ratio, with near-zero SES in NOCOMP and higher absolute values of SES under all competition models and datasets. However, the overall detection rate of competition was low (Table 5); increasing prevalence and species richness reduced the likelihood of detecting competition when it was present.
Sim 9 (FF; C-Score) was almost universally poor for detecting competition when present and only showed any tendency to do so with the ACANWIN model. In contrast, Sim 8 (PP; V-ratio) was very good at detecting competition when present, but also very likely to do so when competition was absent. Use of each parasite species’ overall abundance as a user-defined limit on row totals (UF) produced widely divergent results among different combinations of metric, dataset, and competition models. When the C-Score was employed, this approach showed poor ability to detect competition when present for both the BASE and PREV datasets, then excellent ability to do so in SPRICH datasets, but with an equally likely ability to detect competition when it was absent. When the V-ratio was used, the UF null model detected competition indiscriminately in BASE and SPRICH datasets, including in NOCOMP models, but did not reliably detect competition in PREV models (note that the very high value for RANDWIN in PREV models is due to one extremely high outlier result). Regardless of metric used, the UF null model approach led to some detection of species aggregation in models where there is actually competition, most notably in the PREV datasets when ACANWIN is the model of competition. Rates of detection of competition in models including competition were generally low and only high in competition models when it was also high in NOCOMP models (high Type I error).
Discussion
The promise that a pattern of negative associations among parasite species in a sample of hosts is indicative of some underlying biological phenomenon of import was recognized as early as Cross (1934, and references therein). The innate difficulties of using those patterns to discern the process responsible for their generation have not been reduced much in the decades since (Janovy 2002). Cross’s (1934) classical presentation of an abundance biplot between acanthocephalans and tapeworms in a sample of ciscoes, where all fish were infected but none contained many worms of either species, could be a result of non-specific immunity, as he thought at the time. Just as logical, however, are several other explanations, including indirect competition for resources, direct interspecific antagonism, limited spatial overlap of transmission foci in the ecosystems studied, and transmission limitation, not to mention different varieties of these categorical explanations themselves (e.g., priority effects in competition versus direct confrontation between antagonists). Since so many processes can produce patterns of parasite abundance that at least mimic the effects of competition, discerning competition is perhaps too heavy a burden for any one analytical approach or technique.
This set of problems was recognized by early workers first grappling with field data of parasite communities in a modern context (Kuris 1990; Sousa 1990; Fernandez and Esch 1991; and citations within). Dobson (1985) argued that his review of the literature and modelling efforts strongly suggested interspecific competition is as important in regulating parasite populations and communities as it is in free-living organisms like plants. Further, he noted that, at the time, the differences in the perceived importance of competition were largely due to differences in opinions among workers in different subdisciplines (e.g., those working on different groups of parasites and/or groups of hosts). In his view, what was needed was a more dispassionate approach with an explicit mathematical framework.
Kuris (1990), discussing trematode larvae in snails, reasoned dispassionately that competition might be important if parasite prevalences are high, if there is a dominance hierarchy among parasite species, and if interference (direct competition) is more common than exploitation (indirect competition), all of which seem reasonable. Much of the concern at the time was concentrated on how parasite communities vary in the degree to which competition is important (isolationist vs interactive; assemblages vs communities), and the relative importance of competition when it is present to the myriad other processes contributing to parasite community structure (transmission limitation, spatial and temporal heterogeneity, host immunity, etc.). Probably no one has argued for very long (without being interrupted) that parasite communities are strictly compensatory systems, driven like multidimensional thermostats according to Lotka-Volterra-like rules, but there is unlikely to be any resolution in the near future regarding just how important such mechanics are to the composition of parasite communities compared to non-equilibrial processes.
A different question is addressed by the current investigation. Assume competition is present. Could we even see it? Are our methods up to the task of detecting competition when it is pervasive in a community of parasites, and under what conditions do the odds of detection improve? Well-designed experiments, whether controlled or natural, can detect competition mechanistically, e.g., via documenting species turnover, competitive exclusion, pre-emption, reduced organismal fitness, and/or microhabitat shifts (see Kuris Reference Kuris, Esch, Bush and Aho1990; Sousa Reference Sousa, Esch, Bush and Aho1990; Lafferty et al. Reference Lafferty, Sammond and Kuris1994; Esch et al. Reference Esch, Curtis and Barger2001; Poulin Reference Poulin2001; Janovy Reference Janovy2002; and references therein). Most parasite systems are not particularly amenable to experiments, however, and therefore, the analysis of field-collected data will continue to be the basis of attempts to discern whether the stamp of competition is present in a particular parasite community. Traditional statistical techniques continue to be used by workers seeking negative associations among the patterns of presence and abundance of two or more species in a sample of hosts (Krasnov et al. Reference Krasnov, Mouillot, Shenbrot, Khokhlova and Poulin2005; Poulin Reference Poulin2005; Johnson and Buller Reference Johnson and Buller2011; Fenton et al. Reference Fenton, Knowles, Petchey and Pedersen2014; Salgado-Maldonado et al. Reference Salgado-Maldonado, Caspeta-Majdujano, Mendoza-Franco, Rubio-Godoy, Garcia-Vasquez, Mercado-Silva, Guzman-Valdivieso and Matamoros2019). In addition, null models have been deployed for analysing such data (Cort et al. Reference Cort, McMullen and Brackett1937; Lafferty et al. Reference Lafferty, Sammond and Kuris1994; Lotz and Font Reference Lotz and Font1994; Janovy et al. Reference Janovy, Clopton, Clopton, Snyder, Efting and Krebs1995; Gotelli and Rohde Reference Gotelli and Rohde2002; Soldanova et al. Reference Soldanova, Kuris, Scholz and Lafferty2012; Laidemitt et al. Reference Laidemitt, Anderson, Wearing, Mutuku, Mkoji and Loker2019), and others have used multivariate techniques (Cabaret and Hoste Reference Cabaret and Hoste1998), longitudinal studies (Fenton et al. Reference Fenton, Knowles, Petchey and Pedersen2014), or other modelling techniques (Fenton et al. Reference Fenton, Viney and Lello2010; Dallas et al. Reference Dallas, Laine and Ovaskainen2019).
The present investigation contributes to this body of case studies and analyses by providing side-by-side comparisons of four different techniques for detecting competition. A long-term dataset was used to build model communities, which were then modified by different forms and intensities of interspecific competition. The source data come from hosts (fishes) that have been thought to harbour intestinal helminth communities mostly absent any competitive interactions (Kennedy et al. Reference Kennedy, Bush and Aho1986; Holmes and Price Reference Holmes, Price, Anderson and Kikkawa1986). And, in fact, previous studies of the underlying data found very little evidence of interspecific competition within or among host fishes (Barger Reference Barger2020, Reference Barger2021). Building model communities structured by competition of various forms from these data produced significant reductions in the parameters of a community matrix that a reasonable investigator would want to be able to detect. Matrix fill, species richness, and parasite abundance were all reduced by 10% to nearly 40% (Figure 1). In addition, doubling the prevalence and/or species richness of the source communities increased the negative effect of competition on these parameters (Figure 1).
An ideal analytical technique would detect competition when it is present and strong and fail to do so when it is absent or weak. In the context of the current investigation, such a method would not signify competition in the NOCOMP communities but would do so frequently in those communities structured by competitive interactions. Methods that detect competition even though it is absent have a high Type I error rate, and methods that fail to detect competition when it is present have a high Type II error rate (low power).
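The two error rates defined above can be tallied directly from a set of simulated communities whose true status is known. The following is a minimal sketch, not the procedure used in this study; the function name `error_rates` and the example outcomes are hypothetical and serve only to illustrate the bookkeeping.

```python
def error_rates(results):
    """Tally Type I error rate and power from simulation outcomes.

    results: iterable of (competition_present, detected) boolean pairs,
    one pair per analysed model community (hypothetical input format).
    """
    results = list(results)
    n_absent = sum(1 for present, _ in results if not present)
    n_present = sum(1 for present, _ in results if present)
    false_pos = sum(1 for present, det in results if not present and det)
    true_pos = sum(1 for present, det in results if present and det)

    # Type I: competition signalled although it was absent (e.g., NOCOMP).
    type_i = false_pos / n_absent if n_absent else float("nan")
    # Power: competition signalled when it was in fact present.
    power = true_pos / n_present if n_present else float("nan")
    return {"type_I": type_i, "power": power, "type_II": 1 - power}


# Hypothetical outcomes: 10 communities without competition (1 false alarm)
# and 10 communities with competition (7 detections).
outcomes = (
    [(False, True)] + [(False, False)] * 9
    + [(True, True)] * 7 + [(True, False)] * 3
)
rates = error_rates(outcomes)
print(rates)
```

Under this bookkeeping, an ideal method would return a Type I rate near zero and power near one.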
Spearman’s correlations and logits performed similarly in this regard. NOCOMP communities were reliably diagnosed as lacking competitive effects, whereas nearly all competition-influenced communities showed some sign of competitive effects. In both cases, however, the strength of that effect declined when doubling prevalence or species richness in the underlying source community (Figure 2). For correlation analyses, the fraction of correlations that were significantly negative remained very low, regardless of the model of competition employed or the source community parameters (Table 2). If fewer than 10% of correlations are likely to reveal competition when it is as strong as in the present study, then this method shows little promise for situations similar to the ones employed herein. Logistic regression performed better in detecting competition when it was present (Figure 2b), and particularly well in the communities with randomly chosen competitive winners at low prevalence and low species diversity. However, power declined when prevalence was doubled and declined by about half when species richness was doubled. The amount of variation that logits explained was predictably low (Table 3).
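To make the correlation criterion concrete, the fraction of species pairs with a significantly negative Spearman correlation can be computed as sketched below. This is an illustration under stated assumptions rather than the analysis performed here: significance is assessed with a one-sided permutation test (a substitute for whatever parametric test an investigator might prefer), and the function names and the abundance data are hypothetical.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation undefined; treated here as no association
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def negative_p_value(x, y, n_perm=999, seed=0):
    """One-sided permutation p-value for rho being this negative or more."""
    rng = random.Random(seed)
    observed = spearman(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if spearman(x, shuffled) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def fraction_significantly_negative(abundances, alpha=0.05, n_perm=999):
    """Share of species pairs with a significantly negative rho."""
    pairs = list(combinations(sorted(abundances), 2))
    hits = 0
    for a, b in pairs:
        x, y = abundances[a], abundances[b]
        if spearman(x, y) < 0 and negative_p_value(x, y, n_perm) < alpha:
            hits += 1
    return hits / len(pairs)


# Hypothetical worm counts per host (10 hosts) for three parasite species.
demo = {
    "sp_A": [0, 0, 5, 7, 0, 9, 0, 3, 0, 0],
    "sp_B": [4, 6, 0, 0, 8, 0, 2, 0, 5, 1],
    "sp_C": [0, 1, 0, 0, 0, 2, 0, 0, 1, 0],
}
print(fraction_significantly_negative(demo))
```

Because most hosts in low-prevalence communities carry zero individuals of any given species, ranks are heavily tied, which is one reason such tests can have limited power.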
The two null model approaches exhibited very high variability in their performance. Envelope analysis admirably detected competition in almost all cases in which it was present (Table 4) but also suffered from high Type I error rates in the BASE communities and when species richness was doubled. When prevalence was doubled, the method was among the most reliable: never detecting competition when absent and detecting competition between 67% and 100% of the time when competition was present. When species richness was doubled, the method was even better at detecting competition when present but also unfortunately poor in that it detected competition when absent (Table 4).
Type I error rates were encouragingly low in many analytical combinations used in co-occurrence analyses, and doubling the prevalence usually decreased the Type I error rate compared to BASE communities and communities in which species richness was doubled. Overall, the C-score performed better than the V-ratio for NOCOMP communities, with elevated Type I error rates only when using user-defined abundance values in communities with double the species richness (Table 5). Outside of that case, the C-score did poorly in detecting competition when it was present, hardly ever doing so in Sim 9 (FF) analyses, and only slightly more often in Sim 2 (FE) analyses (Table 5). The V-ratio results for Sim 2 (FE) were similar to those for the C-score but differed wildly for Sim 8 (PP) and when user-defined abundances were used. In the former case, V-ratios were good at detecting competition when present, but also had nearly equally high Type I error rates; both power and Type I error rates declined when prevalence was doubled (Table 5). The three source communities produced three different patterns for the V-ratio when user-defined abundances were utilized, all of them unsatisfactory from the perspective of minimizing Type I or Type II error rates.
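For readers unfamiliar with the two co-occurrence indices, the sketch below computes them as they are commonly defined in the co-occurrence literature: the C-score as the mean number of checkerboard units per species pair, and the V-ratio as the variance in host-level species richness divided by the sum of the per-species occurrence variances. It assumes a presence–absence matrix with species as rows and hosts as columns; this orientation, the function names, and the example matrices are assumptions for illustration and are not taken from the software used in this study.

```python
from itertools import combinations


def c_score(matrix):
    """Mean checkerboard units over all species pairs.

    matrix: list of rows (species), each a list of 0/1 values (hosts).
    For a pair, units = (R_a - S) * (R_b - S), where R is the number of
    hosts occupied by each species and S the number of hosts they share.
    """
    units = []
    for row_a, row_b in combinations(matrix, 2):
        shared = sum(a and b for a, b in zip(row_a, row_b))
        units.append((sum(row_a) - shared) * (sum(row_b) - shared))
    return sum(units) / len(units)


def v_ratio(matrix):
    """Variance of host richness / sum of species occurrence variances."""
    n_hosts = len(matrix[0])
    richness = [sum(col) for col in zip(*matrix)]
    mean_rich = sum(richness) / n_hosts
    var_rich = sum((r - mean_rich) ** 2 for r in richness) / n_hosts
    sum_var = 0.0
    for row in matrix:
        p = sum(row) / n_hosts       # prevalence of this species
        sum_var += p * (1 - p)       # variance of a 0/1 variable
    if sum_var == 0:
        raise ValueError("every species is present in all or no hosts")
    return var_rich / sum_var


# Two hypothetical extremes for two species in four hosts.
segregated = [[1, 1, 0, 0],
              [0, 0, 1, 1]]   # species never share a host
aggregated = [[1, 1, 0, 0],
              [1, 1, 0, 0]]   # species always share a host

print(c_score(segregated), v_ratio(segregated))
print(c_score(aggregated), v_ratio(aggregated))
```

A larger C-score and a V-ratio below 1 both point toward negative association; whether either value is unusual can only be judged against a null distribution such as those produced by the Sim 2, Sim 8, and Sim 9 algorithms.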
Caution is warranted when interpreting these patterns and results because they strictly apply only to the circumstances in the present investigation, which represent but a small fraction of the variety of parasite communities that might be analysed for the presence of competition and the modes of competition that might be operating. With those caveats in mind, several conclusions can be drawn. First, there is enormous variability among the outcomes found in this study. Thus, analysis of the same dataset with two or more methods is very likely to produce conflicting statistical results and therefore uncertainty about the presence or importance of competition in structuring parasite communities such as these.
Second, no single method performed similarly across the different underlying levels of prevalence and species richness in the source community. Indeed, in many cases, increasing the species richness or prevalence of parasites led to decreases in the power of statistical tests to detect competition (e.g., in Spearman’s pairwise correlations and logistic regression). Different underlying parameters of species prevalence and abundance seem to matter enormously, and not always in ways predicted by theory.
For example, high prevalence should lead to more interspecific encounters and therefore more competition, which it did in these models (Figure 1). Nevertheless, some methods displayed reduced ability to detect that competitive effect under those circumstances. At least in some cases, this result has a straightforward interpretation. When communities comprise very few species at relatively low abundances, the most common state of an infected host is to harbour just one parasite species. Doubling the prevalence makes this less likely, as does doubling the species richness, so that the overall effect is more coinfected hosts. The consequent increase in competitive interactions is apparently offset by the reduction in the number of infracommunities harbouring only one parasite species to begin with. Thus, biplots of species abundances contain less evidence of negative associations. This effect, if found to be more general, would be pernicious in the sense that the methods used to detect competition would perform worse as the overall effect of competition on the community increases. And it should be noted that null model approaches suffered less from this effect, displaying increased power with increasing prevalence and/or species richness in a few instances.
Third, there is no single approach that is demonstrably preferable to the others, although some are clearly unsatisfactory. Correlations had good Type I properties but poor power overall, an empirical result consistent with Poulin’s (Poulin Reference Poulin2005) concern regarding using correlation coefficients to discern competitive effects in communities and Brown’s (Brown et al. Reference Brown, Bedrick, Ernest, Cartro, Kelly, Taper and Lele2004) more general description of the statistical constraints on correlation coefficients. Correlation analyses have a good chance, when competition is strong, of suggesting that there is competition, but they are unlikely to return a conventionally significant result. The results of other studies are consistent with this finding (Salgado-Maldonado et al. Reference Salgado-Maldonado, Caspeta-Majdujano, Mendoza-Franco, Rubio-Godoy, Garcia-Vasquez, Mercado-Silva, Guzman-Valdivieso and Matamoros2019). Similarly, logistic regression passes the Type I error test of a good analytical technique, but only displays good power when prevalence and species richness are low, and when competition is modelled as a random winner, which is probably the least biologically realistic of the models chosen here.
Envelope analyses performed nearly perfectly when prevalence was doubled, but suffered from uniformly high Type I error rates when other source communities were used. The enormous SES values observed in these analyses (>20 standard errors of the mean) also suggest an all-or-nothing effect (i.e., either the method detects nothing but competition or it detects none of it). Of the many different models used herein for co-occurrence analyses, not one had both admirable Type I and Type II error rates. Some null model approaches may nevertheless be fruitful targets for further investigation, including Sim 2 (FE) using either the C-score or the V-ratio. Both exhibited the correct pattern of results but failed in the absolute effect size when competition was present. Perhaps this method might prove useful in analysis of parasite communities that are far richer and comprise far more abundant and prevalent species than those modelled here.
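The logic of a standardized effect size (SES) under a fixed-row, equiprobable-column null model of the Sim 2 (FE) type can be sketched as follows. This is a simplified illustration, not the software used in this study. It assumes species as rows and hosts as columns, shuffles each species' occurrences among hosts so that row totals are preserved, and uses a simple count of species pairs that never co-occur as a stand-in statistic. All function names and the example matrix are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def checkerboard_pairs(matrix):
    """Count species pairs that never occur together in the same host."""
    return sum(
        1 for a, b in combinations(matrix, 2)
        if not any(x and y for x, y in zip(a, b))
    )


def sim2_randomize(matrix, rng):
    """Fixed rows, equiprobable columns: permute each row independently."""
    return [rng.sample(row, len(row)) for row in matrix]


def standardized_effect_size(matrix, statistic, n_null=1000, seed=0):
    """SES = (observed - mean of null values) / SD of null values."""
    rng = random.Random(seed)
    observed = statistic(matrix)
    null_values = [
        statistic(sim2_randomize(matrix, rng)) for _ in range(n_null)
    ]
    sd = stdev(null_values)
    if sd == 0:
        return float("nan")  # null distribution has no spread
    return (observed - mean(null_values)) / sd


# Hypothetical matrix: three species, twelve hosts, fully segregated.
observed_matrix = [
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
]
ses = standardized_effect_size(observed_matrix, checkerboard_pairs)
print(ses)
```

An SES above roughly 2 is conventionally read as more segregation than expected by chance. Any other index, such as the C-score or V-ratio, could be passed as `statistic` in the same way.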
At this point, the results of the present investigation do not lead to any easy solutions for the detection of competition in samples of field-collected hosts. Most studies using null model approaches to detect competition do so without comparing the results to what would be found using alternative methods. Additional modelling studies are warranted but with much broader ranges of species richness and infection parameters of community members than presented here. Furthermore, the modes of competition employed here are not necessarily those most commonly evident in real parasite communities, and it may be that a different approach to modelling competition will produce better correspondence between the results of analytical techniques and the underlying mechanisms producing community patterns.
The search for appropriate methods for detecting competition remains in its infancy, and very few studies have approached this problem empirically and comparatively (Fenton et al. Reference Fenton, Knowles, Petchey and Pedersen2014). Given the exceptional diversity of naturally occurring host–parasite communities, it is not surprising that the literature on the subject is wildly inconsistent. Perhaps the presence and strength of interspecific competition are so system- and context-dependent as to make traditional study methods powerless. Even more concerning are examples like the present study where even exaggerated levels of competition lead to opaque results among analytical approaches, and in some cases results that are quite contradictory. Fenton et al. (Reference Fenton, Knowles, Petchey and Pedersen2014) found similar results in a two-parasite system of small mammals, wherein multiple analytical methods failed to detect the competition between nematodes and coccidia demonstrated via perturbation experiments, and classical correlation approaches suggested interactions in the opposite direction.
Finally, if these results turn out to be more generally applicable, then the literature may have grossly underestimated the effects of competition operating in natural parasite communities. Although long held to lack interspecific interactions, the typically low-diversity and low-intensity communities of parasites in fishes and other hosts do show evidence of negative interspecific interactions (Kennedy Reference Kennedy1992; Vidal-Martinez and Kennedy Reference Vidal-Martinez and Kennedy2000; Salgado-Maldonado et al. Reference Salgado-Maldonado, Caspeta-Majdujano, Mendoza-Franco, Rubio-Godoy, Garcia-Vasquez, Mercado-Silva, Guzman-Valdivieso and Matamoros2019). The particular communities forming the basis of the current investigation do not display such evidence (Barger Reference Barger2020, Reference Barger2021), but further work aimed at improving our ability to detect competition when present may warrant reconsideration of the role of competition in natural parasite communities and therefore reanalysis of existing datasets.
Acknowledgements
Many undergraduate student research colleagues at Peru State College, Peru, Nebraska, USA participated in this work by collecting and processing samples of creek chub for their own projects.
Financial support
This research received no specific grant from any funding agency, commercial or not-for-profit sectors.
Competing interest
The author declares none.
Ethical standard
The author asserts that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional guides on the care and use of laboratory animals.