Introduction
During military operations, personnel are required to maintain performance in both their role-specific physical tasks (e.g., load carriage) and corresponding cognitive tasks (e.g., decision making and communication) (Crawford et al., 2017; Scribner, 2016). Failure to maintain performance, in either domain, can result in reduced combat readiness and decreased operational performance (Crawford et al., 2017; Vrijkotte et al., 2016). Consequently, there is growing interest in the relationship between military-specific physical activity and cognitive performance within military operators (Armstrong et al., 2022; Bhattacharyya et al., 2017; Eddy et al., 2015; Giles et al., 2019; Kobus et al., 2010; Nibbeling et al., 2014; Son et al., 2019, 2022; Vine et al., 2021). Despite this interest, the methodologies and approaches used to investigate this relationship have differed considerably, particularly concerning the assessment of cognitive performance.
Based on the assessment tools used to date, and the visual and auditory requirements of soldiers, two assessment tools were developed: a Military-Specific Auditory N-Back Task (MSANT) and a Shoot-/Don’t-Shoot Task (SDST). The former used phonetically described pairs of letters and represented aspects of military radio communications, while the latter represented aspects of any military scenario where visual search and inhibition are required (e.g., assaulting an enemy position or operations in built-up areas). The current study, therefore, aimed to detail the methodology of the MSANT and SDST and to quantify the typical day-to-day variability of both assessment tools under seated and walking conditions. The investigation did not seek to examine the influence of physical fatigue or dual tasking on the performance of these assessment tools.
Methods
The full methods for this project are available in the Supplementary Material, with the raw data available at: https://osf.io/jekv8/. Briefly, the study comprised two elements. First, the day-to-day variability of the MSANT and SDST was assessed in a seated condition on three separate occasions (Part 1). This was chosen due to the large variability in potential application of these assessment tools in future projects. Second, within a sub-sample of the study population, the day-to-day variability of the MSANT and SDST was assessed during a 10-min walking activity, on three separate occasions (Part 2). While a matched study population for this part of the study would have been optimal, given the time required for this portion of the study (a result of the necessity to reach a physiological steady state before conducting the test, and the recovery period required between each walking bout to prevent the onset of fatigue), a sub-sample approach was instead chosen. Physiological steady state refers to the stabilization in the physiological responses to exercise (e.g., increases in heart rate). Without this stability, variability in cognitive performance could be induced as a consequence of adapting to the exercise stimulus, as opposed to just reflecting the typical variation in test performance.
All laboratory visits were separated by a minimum of 48 hr, and participants were required to arrive in a fed and hydrated state having avoided caffeine for a minimum of 3 hr. Study visits were completed at approximately the same time of day (±2 hr) to control for the potential effect of circadian rhythm on test performance. All participants were recruited from the university population (all were students or held academic positions), spoke fluent English, and had self-declared normal or corrected-to-normal vision.
Twenty-eight participants volunteered for Part 1 of the study (14 male, 14 female, age [mean ± SD] 27.3 ± 4.3 years) and 12 participants for Part 2 (6 male, 6 female, age 28.4 ± 3.5 years). Sample size for Part 1 was calculated using an a priori power calculation (G*Power; version 3.1.9.4) (Prajapati et al., 2010). For the seated portion of the investigation, 28 participants were required to detect a moderate effect size (f = 0.25), with a statistical power of 80% and an alpha level of 0.05, based upon a correlation coefficient of r = 0.5 (identified from initial pilot testing). A moderate effect size (Cohen, 1988) was selected based on the combination of effect sizes reported in previous investigations utilizing similar cognitive assessment tools (Eddy et al., 2015), and the anticipated smallest effect size of interest to military policymakers. The sub-sample size was designed to represent the typical size (and therefore likely variation) of study populations within this research area (e.g., Bhattacharyya et al., 2017; Crowell et al., 1999; Eddy et al., 2015). Ethical approval was provided by the Institution’s Research Ethics Committee, with written consent obtained from all participants.
Cognitive assessments
The MSANT involved identifying a pair of phonetically described letters two previous to an auditory tone (i.e., 2-back). During the seated condition, participants recorded their answers, while during walking trials, participants were required to relay their answers verbally, and these were recorded on their behalf. The SDST was designed to be a visual search and inhibition task similar to those tasks previously employed within the literature (Armstrong et al., 2022; Eddy et al., 2015; Kobus et al., 2010). The assessment involved responding appropriately to targets and non-targets. Participants were instructed to place equal importance on both response time and accuracy. For the SDST, there was a 2:1 ratio between targets and non-targets.
For Part 1, during the first visit, participants were familiarized (two full trial completions of each assessment) with the MSANT and SDST, in a randomized counterbalanced order. For the second and third visits, participants completed the MSANT and SDST in the same randomized counterbalanced order. For Part 2, a sub-sample of 12 participants completed 3 additional laboratory visits completing the SDST and MSANT while walking on a treadmill. Again, the MSANT and SDST were completed in a randomized order. All tests were completed with 10 min of seated rest between trials to negate the influence of physical fatigue. To enable a physiological steady state to occur, participants completed 5 min of walking before the commencement of the cognitive assessments. For all walking trials, participants walked on a motorized treadmill (6.5 km·h⁻¹, 1% gradient) at a load carriage speed representing a typical “enemy contact” speed (Armstrong et al., 2019).
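The 2-back logic of the MSANT can be illustrated with a short sketch. The scoring rules here are assumptions made for illustration only (the full task specification is in the Supplementary Material): a response counts as fully correct when both letters match the target pair, as partially correct when exactly one letter matches, and the combined score is the sum of the two. The function name `score_two_back`, the data layout, and the example stimuli are hypothetical.

```python
def score_two_back(stream, tone_positions, responses):
    """Score 2-back responses to a stream of phonetic letter pairs.

    stream         : list of (letter, letter) tuples in presentation order
    tone_positions : for each tone, how many pairs had been heard when it sounded
    responses      : the (letter, letter) pair reported after each tone

    Assumption: the target is the pair two back from the tone, counting the
    most recently heard pair as one back.
    """
    full = partial = 0
    for heard, response in zip(tone_positions, responses):
        if heard < 2:
            raise ValueError("a tone needs at least two preceding pairs")
        target = stream[heard - 2]
        # Count position-wise letter matches between target and response.
        matches = sum(1 for t, r in zip(target, response) if t == r)
        if matches == 2:
            full += 1
        elif matches == 1:
            partial += 1
    return {
        "full_correct": full,
        "partial_correct": partial,
        "combined_correct": full + partial,
    }


# Hypothetical stimuli, not taken from the study materials.
stream = [
    ("Alpha", "Bravo"),
    ("Charlie", "Delta"),
    ("Echo", "Foxtrot"),
    ("Golf", "Hotel"),
    ("India", "Juliett"),
]
tones = [2, 4, 5]
answers = [("Alpha", "Bravo"), ("Echo", "Delta"), ("Kilo", "Lima")]
scores = score_two_back(stream, tones, answers)
```

In this example the first answer matches both letters, the second matches one, and the third matches none.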
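One way a randomized counterbalanced test order could be generated is sketched below. The procedure actually used is not described in this section, so the function `assign_orders`, the seed, and the half-and-half split are illustrative assumptions only.

```python
import random


def assign_orders(participant_ids, seed=0):
    """Randomly assign each participant one of two test orders,
    keeping the two orders as balanced as the sample size allows."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for i, pid in enumerate(ids):
        # First half of the shuffled list starts with the MSANT, the rest with the SDST.
        orders[pid] = ("MSANT", "SDST") if i < half else ("SDST", "MSANT")
    return orders


# 28 participants, as in Part 1; each keeps the same order on every visit.
orders = assign_orders(range(1, 29))
msant_first = sum(1 for order in orders.values() if order[0] == "MSANT")
```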
Statistical analysis
Data were principally analyzed using JASP (JASP, 2020; v0.14.1). For normally distributed data, a one-way ANOVA was employed to identify whether a likely main effect of assessment time point was apparent. Effect sizes are presented as omega squared (ω²) (Levine & Hullett, 2002). For non-normally distributed data, a Friedman’s test was employed with effect sizes presented using Kendall’s W. Holm-Bonferroni adjusted pairwise comparisons, and pairwise comparisons using Conover’s test were made post-hoc as appropriate. For key assessment variables, equivalency between trials was calculated using the two one-sided test approach (Lakens et al., 2018). Based upon the a priori sample size calculation, d = 0.5 was employed as the smallest effect size of interest. To describe the typical variation in assessment parameters between trials, limits of agreement ± 95% confidence intervals, standard error of the mean, and smallest detectable change values were calculated (Hopkins, 2000; Ludbrook, 2010; van Kampen et al., 2013).
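The between-trial variation descriptors can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: limits of agreement are taken as bias ± 1.96 × SD of the paired differences, the SEM is computed as SD of the differences divided by √2 (the form used in smallest detectable change calculations), and the smallest detectable change as 1.96 × √2 × SEM. The standard error of each limit uses the common large-sample approximation SD × √(3/n). The function name and the example scores are hypothetical.

```python
import math
from statistics import mean, stdev


def day_to_day_variation(trial_a, trial_b):
    """Describe typical variation between two trials of the same test."""
    diffs = [b - a for a, b in zip(trial_a, trial_b)]
    n = len(diffs)
    bias = mean(diffs)          # mean between-trial difference
    sd_diff = stdev(diffs)      # sample SD of the paired differences

    # 95% limits of agreement around the bias.
    loa_lower = bias - 1.96 * sd_diff
    loa_upper = bias + 1.96 * sd_diff
    # Approximate standard error of each limit (assumption: large-sample form).
    loa_se = sd_diff * math.sqrt(3 / n)

    # SEM and smallest detectable change (assumed formulas, see lead-in).
    sem = sd_diff / math.sqrt(2)
    sdc = 1.96 * math.sqrt(2) * sem

    return {
        "bias": bias,
        "sd_diff": sd_diff,
        "loa": (loa_lower, loa_upper),
        "loa_se": loa_se,
        "sem": sem,
        "sdc": sdc,
    }


# Hypothetical scores for six participants on two trials.
trial_1 = [20, 22, 19, 24, 21, 23]
trial_2 = [21, 21, 20, 25, 20, 24]
summary = day_to_day_variation(trial_1, trial_2)
```

Under these formulas the smallest detectable change reduces to 1.96 × SD of the differences, so a change between two trials smaller than this value cannot be distinguished from typical day-to-day variation.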
Results
Descriptive statistics are presented in Table 1, with day-to-day variation descriptors reported in Table 2. One participant was removed from the analysis, due to being more than two SDs outside the remainder of the data set.
Note. Blank cells denote data that were not collected due to the seated condition acting as the familiarization for the walking condition.
Abbreviations: ASTO, accuracy-speed trade-off; CR, correct response; FAM, familiarization; MSANT, Military-Specific Auditory N-Back Task; RT, response time; S, seated; SDST, Shoot-/Don’t-Shoot Task; W, walking.
Abbreviations: ASTO, accuracy-speed trade-off; CI, confidence intervals; CR, correct response; MSANT, Military-Specific Auditory N-Back Task; RT, response time; S, seated; SDC, smallest detectable change; SDST, Shoot-/Don’t-Shoot Task; SEM, standard error of the mean; W, walking.
Seated performance
Military-Specific Auditory N-Back Task
There was no likely main effect for time for total correct response (χ²(4) = 4.531, p = .361, Kendall’s W = 0.492) or combined correct responses (χ²(4) = 3.856, p = .426, Kendall’s W = 0.488); however, a likely main effect for time was evident for partial correct responses (χ²(4) = 11.846, p = .019, Kendall’s W = 0.426). For the key variable of combined correct responses, the comparison between trial 1 versus trial 2 was both statistically equivalent (W(25) = 64, p = .002) and not statistically different (W(25) = 70, p = .938). Similarly, trial 2 versus trial 3 were both statistically equivalent (W(25) = 20, p = .06) and not statistically different. Likewise, trial 1 versus trial 3 were also both statistically equivalent (W(25) = 50, p = .032) and not statistically different.
Shoot-/Don’t-Shoot Task
There was no likely main effect for time on shoot correct (χ²(4) = 4.00, p = .406, Kendall’s W = 0.175), don’t-shoot correct (χ²(4) = 3.069, p = .546, Kendall’s W = 0.482), total correct (χ²(4) = 3.375, p = .497, Kendall’s W = 0.471), or average response time (F(2.981, 77.515) = 1.035, p = .382, ω² = 0.001). There was, however, a main effect for time in the accuracy-speed trade-off (ASTO) parameter (F(4, 104) = 7.037, p < .001, ω² = 0.089). Importantly, the sole noteworthy difference occurred between familiarization 1 and trial 3 (t(26) = 4.855, p < .001, d = 0.756), suggesting no discernible difference was likely between performances in the three experimental trials, following two familiarization trials. For the key variable of total correct responses, trial 1 versus trial 2, trial 1 versus trial 3, and trial 2 versus trial 3 were all statistically equivalent (1 vs. 2: W(26) = 93, p = .011; 1 vs. 3: W(26) = 61, p = .047; 2 vs. 3: W(26) = 41, p = .040) and not statistically different. For the other key variable of ASTO, all comparisons were likely neither statistically equivalent (1 vs. 2: t(26) = −1.701, p = .050; 2 vs. 3: t(26) = −0.127, p = .45; 1 vs. 3: t(26) = 0.287, p = .612) nor statistically different.
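The logic behind the t-based equivalence comparisons can be sketched as a paired two one-sided tests (TOST) procedure. This is an illustration under assumptions rather than the analysis itself, which was run in JASP: the equivalence bounds are taken as ±0.5 standard deviations of the paired differences, and the t distribution is integrated numerically so that only the Python standard library is needed (a statistics library would normally supply this). The function names `t_sf` and `tost_paired` and the example data are hypothetical.

```python
import math
from statistics import mean, stdev


def t_sf(t, df, steps=2000):
    """Upper-tail probability P(T > t) for Student's t (Simpson's rule)."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3  # integral of the density from 0 to |t|
    return 0.5 - area if t >= 0 else 0.5 + area


def tost_paired(x, y, d_bound=0.5):
    """Paired TOST with bounds of +/- d_bound SDs of the differences.

    Assumes the differences are not all identical (non-zero SD).
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    df = n - 1
    sd = stdev(diffs)
    se = sd / math.sqrt(n)
    delta = d_bound * sd  # equivalence bound in raw units
    m = mean(diffs)

    t_lower = (m + delta) / se  # tests H0: mean difference <= -delta
    t_upper = (m - delta) / se  # tests H0: mean difference >= +delta
    p_lower = t_sf(t_lower, df)
    p_upper = 1 - t_sf(t_upper, df)
    # Equivalence is claimed only if both one-sided tests reject.
    return {"t_lower": t_lower, "t_upper": t_upper, "p": max(p_lower, p_upper)}


# Hypothetical data: 26 participants whose scores differ by +/-1 between trials.
trial_a = [10 + (1 if i % 2 else -1) for i in range(26)]
trial_b = [10] * 26
result = tost_paired(trial_a, trial_b)
```

For reference, `t_sf(1.701, 26)` evaluates to roughly .05, in line with the one-sided p value reported for t(26) = −1.701.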
Walking performance
Military-Specific Auditory N-Back Task
As with seated MSANT performance, there was no likely effect of time on total correct responses (χ²(2) = 1.000, p = .607, Kendall’s W = 0.568), partial correct responses (χ²(2) = 1.280, p = .527, Kendall’s W = 0.541), and combined correct responses (χ²(2) = 1.000, p = .607, Kendall’s W = 0.582). For the key variable of combined correct responses, trials 1 versus 2 and trials 2 versus 3 were statistically equivalent (1 vs. 2: W(11) = 12, p = .017; 2 vs. 3: W(11) = 13, p = .020) and not statistically different. Conversely, trial 1 versus 3 was neither statistically equivalent nor statistically different.
Shoot-/Don’t-Shoot Task
Again, there were no likely effects of time on shoot correct responses (χ²(2) = 4.800, p = .091, Kendall’s W = 0.449), don’t-shoot correct responses (χ²(2) = 2.480, p = .289, Kendall’s W = 0.672), total correct responses (χ²(2) = 3.161, p = .206, Kendall’s W = 0.741), response times (F(2, 22) = 2.880, p = .077, ω² = 0.018), and ASTO (F(2, 22) = 2.713, p = .088, ω² = 0.042). For the key variable of total correct responses, all comparisons were neither statistically equivalent (1 vs. 2: W(11) = 6, p = .096; 1 vs. 3: W(11) = 0, p = .093; 2 vs. 3: W(11) = 14, p = .084) nor statistically different. Similarly, for the other key variable of ASTO, all comparisons were likely neither statistically equivalent (1 vs. 2: t(11) = 0.127, p = .549; 2 vs. 3: t(11) = −1.205, p = .127; 1 vs. 3: t(11) = 0.787, p = .776) nor statistically different.
Discussion
This study has described the methods of two military-specific cognitive assessment tools (MSANT and SDST) and quantified their typical day-to-day variability. These data provide typical magnitudes of variance for the key assessment parameters. While no likely performance differences were observed across the experimental measurement points, not all walking comparisons were statistically equivalent, suggesting additional data are required before this assertion is made, for the given equivalency bounds. It should however be noted that borderline statistically significant results may become non-significant where correction for multiple testing is utilized. The current investigation has also demonstrated the suitability of these assessment tools for use during military-specific physical activity within a laboratory setting.
Before this investigation, the day-to-day performance variation in any military-specific cognitive assessments had not been quantified. This is an issue for several reasons, including the translational ability of research findings to the “real world” (Close et al., 2019) and also for methodological decision making (e.g., sample size calculations). Moreover, with military operations rarely conducted in isolation, information on inter-test performance is highly relevant to research investigating sequential or repeated bout performance. The comparison between seated and walking performance was not a research question of interest in the current study, particularly given that deficits in cognitive performance are typically observed after ~30 min of military activity (e.g., Eddy et al., 2015; Giles et al., 2019). However, observationally, the typical variation in performance between trials appears similar between seated and walking conditions.
Familiarizing participants with assessment tools is critical for research, particularly when time limitations may inhibit access to study participants (e.g., military populations). Collectively, the current study’s data demonstrate that beyond two full seated trials, a continued improvement in performance was not likely apparent, suggesting this familiarization length is sufficient to minimize possible learning effects.
Several limitations exist with the current investigation, including the use of a civilian population, and the limited walking sub-sample size. As acknowledged previously, the smaller sub-sample size was chosen for largely practical reasons, although it does match many studies within this area, highlighting issues with underpowered investigations. Future research should attempt to pair reliable and applied cognitive tasks (such as those described herein) with operationally relevant and appropriate physical activity. This in turn will support enhanced applied research as well as enabling a greater focus to be placed on developing mitigation strategies where the greatest mission impact can be obtained.
Acknowledgment
The authors would like to thank the participants for their participation in the current project.
Supplementary Materials
To view supplementary material for this article, please visit http://doi.org/10.1017/exp.2022.11.
Authorship contributions
Research concept and study design: C.V., S.C., S.M., S.B., and O.R.; Data collection, data and statistical analyses, and writing of the initial manuscript: C.V.; Reviewing/editing final manuscript: C.V., S.C., S.M., S.B., and O.R.
Data availability statement
The raw data for this project can be viewed at: https://osf.io/jekv8/.
Funding statement
The authors declare no funding was received for this research.
Conflict of interest
The authors have no conflicts of interest to declare.
Comments
Comments to the Author: Dear Authors,
I understand that the current study will be focused on effect of cognitive trials according to two conditions (Part 1 or Part 2) on three separate occasions. There are, however, serious issues regarding extreme small samples, different sample-size for each part’s experiment, poor information on definition of three separate occasions, how to select statistical analyses to clarify the objective of the present study and so on. Especially, I would like to suggest kindly that selection of statistical methodology would be reconsidered to clarify effect of two cognitive trials according to two experimental conditions (Part 1 and Part 2). For instance, if the interactive effect between conditions and occasions for each cognitive trial was examined statistically according to the objective of this study, I would suggest application of the repeated two-way ANOVA to analyze the data collected in this study. Considering the above concerns carefully, I would like to suggest kindly that the manuscript might need to be rewritten after reanalyzing the data.