Linking language features to clinical symptoms and multimodal imaging in individuals at clinical high risk for psychosis

S. S. Haas; G. E. Doucet; S. Garg; S. N. Herrera; C. Sarac; Z. R. Bilgrami; R. B. Shaik; C. M. Corcoran

doi:10.1192/j.eurpsy.2020.73

Part of: Psychosis Spectrum Disorders Biological Basis of Mental Disorders and their Treatment

Published online by Cambridge University Press: 11 August 2020

S. S. Haas*: Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
G. E. Doucet: Boys Town National Research Hospital, Omaha, Nebraska, USA
S. Garg: Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
S. N. Herrera: Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
C. Sarac: Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
Z. R. Bilgrami: Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
R. B. Shaik: Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
C. M. Corcoran: Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
*Author for correspondence: S. S. Haas, E-mail: shalaila.haas@mssm.edu

Abstract

Background.

Abnormalities in the semantic and syntactic organization of speech have been reported in individuals at clinical high-risk (CHR) for psychosis. The current study seeks to examine whether such abnormalities are associated with changes in brain structure and functional connectivity in CHR individuals.

Methods.

Automated natural language processing analysis was applied to speech samples obtained from 46 CHR and 22 healthy individuals. Brain structural and resting-state functional imaging data were also acquired from all participants. Sparse canonical correlation analysis (sCCA) was used to ascertain patterns of covariation between linguistic features, clinical symptoms, and measures of brain morphometry and functional connectivity related to the language network.

Results.

In CHR individuals, we found a significant mode of covariation between linguistic and clinical features (r = 0.73; p = 0.003), with negative symptoms and bizarre thinking covarying mostly with measures of syntactic complexity. In the entire sample, separate sCCAs identified a single mode of covariation linking linguistic features with brain morphometry (r = 0.65; p = 0.05) and resting-state network connectivity (r = 0.63; p = 0.01). In both models, semantic and syntactic features covaried with brain structural and functional connectivity measures of the language network. However, the contribution of diagnosis to both models was negligible.

Conclusions.

Syntactic complexity appeared sensitive to prodromal symptoms in CHR individuals while the patterns of brain-language covariation seemed preserved. Further studies in larger samples are required to establish the reproducibility of these findings.

Keywords

Clinical high risk for psychosis; multimodal; natural language processing; neuroimaging; sparse canonical correlation analysis

Type: Research Article
Information: European Psychiatry, Volume 63, Issue 1, 2020, e72

DOI: https://doi.org/10.1192/j.eurpsy.2020.73
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © The Author(s), 2020. Published by Cambridge University Press on behalf of the European Psychiatric Association

Introduction

Schizophrenia is a major psychiatric disorder presenting with positive, negative, and cognitive symptoms [1]. Language disturbances are a cardinal feature of schizophrenia that manifest at all levels, from comprehension to production [2,3]. Language production disturbances have been reported in phonetics, morphology, syntax, semantics, and pragmatics; the most common abnormalities include idiosyncratic semantic associations, neologisms and word approximation, poverty of speech, and reduced grammatical complexity [2,4–7].

These language abnormalities implicate the corresponding brain networks. Current accounts of language processing favor the dual-stream model, which specifies a ventral stream that primarily supports comprehension and a dorsal stream that primarily supports articulation [8]. Typically, the ventral stream is largely bilateral while the dorsal stream is strongly left-lateralized [8]. Within the ventral stream, speech sounds are initially processed within the auditory regions of the superior temporal gyrus, while portions of the middle and inferior temporal lobe (including the fusiform gyrus) and the anterior temporal lobe correspond to the lexical interface, which links phonological to semantic information [9–11]. The dorsal stream includes Broca’s region in the inferior frontal gyrus, the insula, the parietotemporal sylvian region (considered a sensorimotor interface region), and motor and premotor cortical regions [9–11]. However, language does not rely only on language-specialized regions; it also draws on brain networks that support general cognitive functions [12,13], mainly the executive control network (ECN) [14], the salience network (SAL) [14], and the default-mode network (DMN) [15], which forms the functional basis of brain organization [16].

Multiple functional and structural neuroimaging studies in patients with schizophrenia [12,17–21] and in clinical and genetic high-risk groups [22–25] have established the presence of abnormalities in language-related brain regions and in the networks supporting language functions, as well as their association with language dysfunction. By contrast, the corresponding literature in clinical high-risk (CHR) individuals is just beginning to emerge. Indicators of semantic dysfunction have been associated with lower gray matter density [26] and aberrant functional activation of brain regions within the language network [25].

Linguistic profiling has benefited from computational methods that enable the automatic and precise labeling of speech features in patients with schizophrenia and CHR individuals [27–32]. Our group has demonstrated that semantic and syntactic abnormalities may be useful in predicting syndromal transition in CHR [27,28], with measures of semantic coherence being the most discriminant.

In the current study, we extend our previous work by relating automatically derived language features to brain structure and functional connectivity in CHR individuals. As language is supported by a wide range of regions and networks, we use a whole-brain, multivariate approach to our analysis. Specifically, we employ sparse canonical correlation analysis (sCCA) [33] to identify linked patterns of covariation between multiple linguistic features and brain morphology and functional connectivity. sCCA is an extension of traditional CCA that is more appropriate for smaller samples and less susceptible to overfitting; it has been extensively used to describe brain-cognition associations by us [34,35] and others [36–38]. Our initial hypotheses are that (a) amount of speech and measures of syntactic complexity will show significant covariation with symptoms; (b) both syntactic and semantic features will covary with brain structural and functional measures of the language network and its functional integration with cognitive control networks; and (c) brain-language covariation patterns will be altered by CHR status.

Methods

Sample

Individuals at CHR for psychosis and healthy individuals (HIs) were recruited at Columbia University and at the Icahn School of Medicine at Mount Sinai (ISMMS), both in New York, USA. Individuals were characterized as CHR based on the Structured Interview for Prodromal Syndromes/Scale of Prodromal Symptoms (SIPS/SOPS) [39] if they met criteria for the attenuated positive symptom syndrome, which requires at least one SIPS/SOPS positive item in the prodromal range (3–5), with symptoms beginning or worsening in the past year and occurring at an average frequency of once per week in the prior month.

HIs had no personal history of any psychiatric disorders and no family history of psychosis in their first-degree relatives. Additionally, all participants were screened to exclude concomitant medical and neurological disorders, lifetime history of significant head trauma, current substance use disorders, and contraindications to magnetic resonance imaging (MRI) scanning, and were required to be fluent in English. Further details on recruitment and eligibility screening are presented in the Supplementary Material, Section 1.1.1. The sample included 46 CHR and 22 HIs (Table 1), of whom 30 (CHR = 17; HI = 13) were recruited at Columbia University and 38 (CHR = 29; HI = 9) at the ISMMS (Supplementary Material, Section 1.1.2 and Table S1).

Table 1. Demographic and clinical characteristics of the whole sample.

Continuous variables are shown as mean (standard deviation).

Abbreviations: CHR, clinical high risk; GFS, Global Functioning Scale; SIPS/SOPS, Structured Interview for Prodromal Syndromes/Scale of Prodromal Symptoms.

^a Significant case–control differences at p < 0.05.

Clinical assessment

In addition to the SIPS/SOPS, all participants were assessed using the Structured Clinical Interview for DSM-5 Axis 1 Disorders [40], Global Functioning Scale [41], and the Edinburgh Handedness Inventory [42] (Supplementary Material, Section 1.1.3).

Language assessment

Naturalistic speech samples were obtained from all participants via open-ended 30–45 min narrative interviews following our previous work [27,28]. Interviews were transcribed by an independent Health Insurance Portability and Accountability Act (HIPAA) compliant company (https://sftp.transcribeme.com), and deidentified for analysis (Supplementary Material, Section 1.1.4). Interview transcripts were preprocessed as previously described [27,28] using the Natural Language Toolkit (NLTK; http://www.nltk.org/) [43]. The NLTK was used to extract the minimum, maximum, mean, and standard deviation of the number of words per sentence to assess the amount of speech. Latent Semantic Analysis was used to quantify the minimum, maximum, mean, and standard deviation of the semantic coherence between consecutive sentences to assess semantics. Additionally, part-of-speech tagging based on the Penn Treebank tag set was performed with NLTK [43,44] to extract the frequency of each tag in the speech samples. Details of the process and definitions of the linguistic variables are provided in the Supplementary Material, Section 1.1.5 and Table S3.
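As an illustration only, the sentence-level summaries described above can be sketched in Python. This is not the pipeline used in the study: the regular-expression sentence splitter stands in for the NLTK tokenizer, and raw word-count vectors stand in for Latent Semantic Analysis embeddings, so the coherence values here are plain bag-of-words cosine similarities. The function names and the use of the population standard deviation are assumptions made for the sketch.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def split_sentences(text):
    """Naive splitter on ., ! and ? (a stand-in for a trained sentence tokenizer)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Lowercase word tokens; punctuation is discarded."""
    return re.findall(r"[a-z']+", sentence.lower())


def summarize(values):
    """Minimum, maximum, mean, and (population) standard deviation."""
    return {"min": min(values), "max": max(values),
            "mean": mean(values), "sd": pstdev(values)}


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(c * c for c in a.values())) * \
        math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def speech_features(text):
    """Words-per-sentence and consecutive-sentence coherence summaries."""
    sentences = [t for t in map(tokenize, split_sentences(text)) if t]
    if len(sentences) < 2:
        raise ValueError("coherence needs at least two sentences")
    bags = [Counter(s) for s in sentences]
    coherence = [cosine(a, b) for a, b in zip(bags, bags[1:])]
    return {"words_per_sentence": summarize([len(s) for s in sentences]),
            "coherence": summarize(coherence)}
```

Replacing the count vectors with sentence embeddings from a fitted LSA model would bring the sketch closer to the measure described in the text; the summary step would stay the same.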

Neuroimaging

Structural and functional MRI data were acquired on a GE MR750 3T scanner at Columbia University and on a 3T Siemens Skyra scanner (Erlangen, Germany) at the ISMMS. High-resolution structural and resting-state functional MRI (rs-fMRI) data were acquired in all participants using comparable protocols at each site, as described in detail in the Supplementary Material, Section 1.2.1.

Following standard preprocessing and quality control, brain structural and rs-fMRI data were further analyzed to extract measures of brain morphometry and functional network connectivity (details in Supplementary Material, Sections 1.2.2 and 1.2.3). Segmentation and parcellation of the structural images were implemented in FreeSurfer 6.0 (http://surfer.nmr.mgh.harvard.edu/) to yield 68 cortical thickness measures and 20 subcortical volume measures (defined in Supplementary Table S5). To enhance reproducibility, resting-state networks were defined using the templates available through the Functional Imaging in Neuropsychiatric Disorders Lab at Stanford University, USA (https://findlab.stanford.edu/functional_ROIs.html) for the language network (LAN), DMN, ECN, SAL, sensorimotor network (SMN), and auditory network (AN) [13] (Supplementary Figure S1). In each participant, Fisher Z-transformed Pearson’s correlation coefficients were used to compute network cohesiveness (i.e., the average correlation of each voxel’s time series with that of every other voxel within each network). Network integration was computed as the correlation between the average time series of each pair of networks. This process yielded 11 network connectivity measures (Supplementary Table S4). Prior to further analyses, imaging datasets were harmonized using ComBat [45], a Bayesian batch adjustment approach that adjusts for the effect of site (https://github.com/Jfortin1/ComBatHarmonization).
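The two connectivity measures can be sketched as follows, assuming each network is given as a list of voxel time series of equal length. Averaging the Fisher Z values over all voxel pairs, applying the Z-transform to the integration measure, and clipping correlations just inside ±1 are assumptions of this sketch rather than details confirmed by the text; the function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length series (must not be constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r, eps=1e-12):
    """Fisher Z-transform; r is clipped because atanh is undefined at +/-1."""
    r = max(min(r, 1.0 - eps), -1.0 + eps)
    return math.atanh(r)


def cohesiveness(voxels):
    """Mean Z-transformed correlation over all voxel pairs within one network."""
    zs = [fisher_z(pearson(a, b)) for a, b in combinations(voxels, 2)]
    return sum(zs) / len(zs)


def mean_series(voxels):
    """Average time series of a network (mean across voxels at each time point)."""
    return [sum(vals) / len(vals) for vals in zip(*voxels)]


def integration(network_a, network_b):
    """Z-transformed correlation between the mean time series of two networks."""
    return fisher_z(pearson(mean_series(network_a), mean_series(network_b)))
```

For real voxel counts the pairwise loop is quadratic in the number of voxels, so an array library would be the practical choice; the plain-Python form is used here only to keep the definitions explicit.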

Statistical analyses

Conventional statistical analyses

Group differences in demographic, clinical, and linguistic features were analyzed using univariate and multivariate analyses; age, sex, and site were included as covariates when appropriate. Additional group-level analyses of brain structure and functional connectivity are described only in the Supplementary Material, Section 1.3. For all analyses, results were considered significant at p < 0.05 following Benjamini–Hochberg false-discovery-rate correction for multiple testing.
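For readers unfamiliar with the Benjamini–Hochberg procedure, a minimal Python sketch of the adjusted p-values is given below. It illustrates the standard step-up procedure and is not the code used in the study; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the
    # adjusted values: p_adj(rank) = min over ranks >= rank of p * m / rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A test is then declared significant when its adjusted value falls below 0.05. For example, for the raw p-values 0.01, 0.04, 0.03, and 0.20, only the first survives correction.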

Sparse canonical correlation analyses

We implemented sCCA [33] in MATLAB R2018b using an in-house script in accordance with our previously published work [34,35,46] to test the association between the linguistic, clinical, and neuroimaging data (details in Supplementary Material, Section 1.4). We considered four datasets: a nonimaging dataset comprising the clinical variables (Supplementary Table S2), a nonimaging dataset comprising the linguistic variables (Supplementary Table S3), a functional dataset comprising the functional network connectivity variables (Supplementary Table S4), and a structural dataset comprising the morphometric variables (Supplementary Table S5). All datasets were normalized by calculating Z-scores for each variable prior to entering the sCCA. We conducted separate sCCAs, using identical procedures, to identify patterns of covariation between the language and clinical datasets in CHR individuals only, and between the language dataset and each of the imaging datasets in all participants. Diagnostic group was included in the language dataset for each of the imaging sCCAs in order to examine group effects. Brain morphometry and functional connectivity were considered separately because they represent different aspects of brain organization (an analysis of the pooled imaging data is also presented in the Supplementary Material). For each sCCA, we selected the optimal sparsity parameters as those that maximized the sCCA correlation. We then computed the optimal sCCA model and determined its significance using permutations (n = 10,000). The p-value was defined as the number of permutations that resulted in a higher correlation than the original data divided by the total number of permutations. Thus, the p-value is explicitly corrected for multiple testing as it is compared against the null distribution of maximal correlation values across all estimated sCCAs. In each sCCA model, each canonical variate relates a weighted set of linguistic features to a weighted set of imaging measures. The weights of each feature in each variate provide an indication of their importance in the model.
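The logic of the analysis (Z-scoring, estimation of one sparse canonical mode, and a permutation p-value) can be sketched in Python as below. This is a simplified illustration under stated assumptions and not the in-house MATLAB script: sparsity is imposed by soft-thresholding each weight vector at a fixed fraction of its largest absolute entry (`frac_x`, `frac_y`, both hypothetical parameters that must lie in [0, 1)), whereas the penalized matrix decomposition of Witten et al. [33] uses an L1-norm bound; the sparsity parameters are held fixed rather than tuned; and only the first mode is estimated. Inputs are row-major tables (participants × features) with no constant columns.

```python
import math
import random


def zscore_columns(rows):
    """Z-score each column of a participants-by-features table (sample SD)."""
    columns = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        columns.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*columns)]


def project(rows, weights):
    """Canonical variate: one weighted sum of features per participant."""
    return [sum(a * w for a, w in zip(row, weights)) for row in rows]


def back_project(rows, scores):
    """X^T s: cross-product of every feature column with a variate."""
    return [sum(a * s for a, s in zip(col, scores)) for col in zip(*rows)]


def sparsify(vec, frac):
    """Soft-threshold at frac * max|vec|, then rescale to unit L2 norm."""
    lam = frac * max(abs(v) for v in vec)
    shrunk = [math.copysign(max(abs(v) - lam, 0.0), v) for v in vec]
    norm = math.sqrt(sum(v * v for v in shrunk))
    return [v / norm for v in shrunk]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def scca(X, Y, frac_x, frac_y, n_iter=50):
    """First sparse canonical mode via alternating thresholded updates."""
    v = [1.0 / math.sqrt(len(Y[0]))] * len(Y[0])  # uniform starting weights
    for _ in range(n_iter):
        u = sparsify(back_project(X, project(Y, v)), frac_x)
        v = sparsify(back_project(Y, project(X, u)), frac_y)
    return u, v, pearson(project(X, u), project(Y, v))


def permutation_p(X, Y, frac_x, frac_y, n_perm=1000, seed=0):
    """Share of row-shuffled fits whose correlation exceeds the observed one."""
    rng = random.Random(seed)
    r_obs = scca(X, Y, frac_x, frac_y)[2]
    exceed = 0
    for _ in range(n_perm):
        shuffled = Y[:]
        rng.shuffle(shuffled)  # breaks the participant-wise pairing of X and Y
        if scca(X, shuffled, frac_x, frac_y)[2] > r_obs:
            exceed += 1
    return r_obs, exceed / n_perm
```

To mirror the correction for parameter selection described above, each permutation would repeat the search over sparsity parameters and keep the maximal correlation, so that the null distribution reflects the same tuning as the observed model.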

Results

Linguistic features in the cohort

Four linguistic features (foreign word, list item marker, plural proper noun, possessive wh-pronoun) were not included in subsequent analyses because more than 50% of the sample achieved the same score. The descriptive statistics of the remaining 38 linguistic features in each group are presented in Table 2 (Supplementary Table S6). No group differences were identified at FDR-corrected p < 0.05 for the individual linguistic features.
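The exclusion rule for low-variability features can be expressed as a short Python sketch. The data layout (a dictionary mapping each feature name to its list of scores across participants) and the function name are assumptions made for illustration.

```python
from collections import Counter


def low_variability_features(table, threshold=0.5):
    """Names of features where a single score is shared by more than
    `threshold` of participants (strictly more, as in the text)."""
    flagged = []
    for name, scores in table.items():
        most_common_count = Counter(scores).most_common(1)[0][1]
        if most_common_count / len(scores) > threshold:
            flagged.append(name)
    return flagged
```

A feature whose most frequent score is held by exactly half of the sample is retained under this rule, because the criterion is "more than 50%".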

Table 2. Linguistic features of the sample.

Variables are shown as mean (standard deviation).

Abbreviation: CHR, clinical high risk; detailed definition of each variable is provided in Supplementary Table S3.

Linked dimensions of language and clinical symptoms

This sCCA model identified a single significant mode (canonical r = 0.73; p = 0.003) (Figure 1A). The weights of all the variables examined are provided in Supplementary Tables S7 and S8. The most heavily weighted linguistic features involved measures of syntactic complexity: use of coordinating conjunctions, adverbs, and verbs (Figure 1B). The most heavily weighted clinical measures were avolition, decreased experience of emotion, impaired tolerance to stress, bizarre thinking, and decreased ideational richness (Figure 1C).

Figure 1. Sparse canonical correlation analysis (sCCA) for language features and clinical symptoms in individuals at clinical high-risk (CHR) for psychosis. (A) sCCA of linguistic features and clinical symptoms in CHR individuals identified a single significant mode; (B) linguistic features with the highest absolute weights; and (C) clinical symptoms with the highest absolute weights. Additional information in Supplementary Tables S7 and S8.

Linked dimensions of language and functional connectivity

This sCCA model identified a single significant mode (canonical r = 0.63; p = 0.01) (Figure 2A); no further modes were significant (unadjusted p > 0.05). The weights of all the variables examined are provided in Supplementary Tables S9 and S10. We found negligible effects of diagnosis (weight = −0.03) and handedness (weight = 0). The most heavily weighted linguistic features involved measures of coherence (maximum semantic coherence) and measures of syntactic complexity involving the use of adjectives, determiners, verbs, and pronouns (Figure 2B). The most heavily weighted connectivity measures were cohesiveness of the LAN, ECN, SAL, AN, and DMN and integration between the LAN and AN (Figure 2C).

Figure 2. Sparse canonical correlation analysis (sCCA) for language features and resting-state network functional connectivity in the entire study sample. (A) sCCA of linguistic features and resting-state network functional connectivity in the entire sample identified a single significant mode. The weight of diagnosis was −0.03; (B) linguistic features with the highest absolute weights; and (C) connectivity measures with the highest absolute weights. Additional information in Supplementary Tables S9 and S10.

Linked dimensions of language and brain structure

This sCCA model identified a single significant mode (canonical r = 0.65; p = 0.05); no further modes were significant (unadjusted p > 0.05) (Figure 3A). The weights of all the variables examined are provided in Supplementary Tables S11 and S12. The contributions of diagnosis (weight = 0.07) and handedness (weight = 0.02) were negligible. The most heavily weighted linguistic features involved measures of volume of speech relating to sentence length, mean semantic coherence, and measures of syntactic complexity (interjection and subordinating conjunction) (Figure 3B). The weights for the cortical and subcortical features were generally low (range of absolute values: 0.01–0.25). The cortical regions with the highest weights were the left pars opercularis and pars triangularis; the bilateral superior temporal gyrus and rostral anterior cingulate; and, on the right, the medial orbitofrontal gyrus, frontal pole, temporal pole, inferior temporal and fusiform gyri, and inferior parietal lobule (Figure 3C). A large number of subcortical regions negatively covaried with linguistic features, including the bilateral thalamus, hippocampus, nucleus accumbens, pallidum, and ventral diencephalon, the right amygdala, and the left caudate nucleus.

Figure 3. Sparse canonical correlation analysis (sCCA) for language features and brain morphometry in the entire study sample. (A) sCCA of linguistic features and brain structure in the entire sample identified a single significant mode. The weight of diagnosis was negligible (w = 0.07); (B) linguistic features with the highest absolute weights; and (C) brain morphometry measures with the highest absolute weights. Additional information in Supplementary Tables S11 and S12.

Discussion

We found no effect of diagnosis on individual linguistic features obtained in CHR and HIs. Nevertheless, in the CHR individuals, measures of syntactic complexity covaried with negative symptoms reflecting avolition and poverty of thought and disorganized symptoms of bizarre thinking. Linguistic measures of the amount of speech, semantic coherence, and syntactic complexity covaried with the measures of brain structure and resting-state functional connectivity that emphasized regions and networks involved in speech and language processing. There was no diagnostic effect on these patterns of covariation, although the study may not have been sufficiently powered to detect such a difference.

Language features and clinical symptoms

In an independent sample of 34 CHR individuals, we have previously reported associations of the SIPS/SOPS total negative symptom severity with maximum phrase length, minimum semantic coherence, and use of determiners, implicating the amount, the semantic organization, and the syntactic organization of speech, respectively [27]. In our subsequent study involving 93 CHR individuals, no association between language features and clinical symptomatology was identified [28]. Here we found that predominantly negative symptoms covaried mainly with measures of syntactic complexity. One interpretation is that subtle disturbances in syntactic complexity may be more sensitive to clinical symptoms than other linguistic features. Alternatively, linguistic-clinical associations may depend on the specific characteristics of the CHR sample and should be further investigated across samples.

Language and functional connectivity

The pattern of covariation between language features and resting-state connectivity was comparable in CHR and HIs. As expected, the cohesiveness of the LAN emerged as one of the variables most strongly associated with language features. The contribution of DMN cohesiveness to language is supported by prior studies linking introspective functions with verbal resources [47]. Efficient language and speech processing depend on multiple other cognitive functions, notably attention/salience, working memory, and cognitive control [48]. Aligned with this notion, the sCCA results underscore the importance for language of the cohesiveness of the ECN, which supports goal-directed behaviors [49] and is modulated by general cognitive effort [50]. The ECN and SAL networks are considered dissociable, supporting sustained and adaptive cognitive control, respectively [51]. In the context of the current results, the role of the ECN involves the moment-to-moment monitoring of speech production while SAL connectivity may be relevant to the adaptive control of semantic and syntactic organization of speech as dictated by contextual demands. We note the positive weight of the integration between LAN and AN for language processing. Studies of patients with schizophrenia have repeatedly found that the functional integration of these two networks is reduced, but this feature may mainly arise in the context of hallucinations [52]. Generally, the absence of an effect of CHR status in this analysis suggests either that this effect is too small to be detected in the current sample or that the mapping of linguistic features to resting-state connectivity is not disturbed in CHR individuals.

Language features and brain structure

The sCCA links semantic coherence and syntactic complexity with variation mainly in prefrontal and temporal regions and subcortical volumes. It is currently thought that semantic processing in the brain follows a “spoke-and-hub” model [53,54]; modality-specific content, primarily involving attributes, is represented in the spokes while more abstract, amodal representations, referring mainly to semantic significance, are held in the hubs. Further, modality-specific representations are mainly left-lateralized for verbal content and right-lateralized for visual content while amodal representations are distributed in hubs in both hemispheres. Although there is debate about the number and locations of semantic hubs, plausible candidates have been proposed in different cortical regions, including the anterior-inferior-temporal lobe [54,55], the anterior-inferior-parietal cortex [56], and the inferior-frontal cortex [57–61]. Of note, the top-weighted regions map closely to temporal and prefrontal semantic hubs. The neural correlates of syntactic processing are also debated [62], but there is general agreement that the key regions involved are the left opercular and left triangular portions of the inferior frontal cortex [63]. Our results therefore conform to current expectations regarding brain structure-language mapping. No effect of group was detected in this analysis, which may relate to issues of power or may indicate that the mapping of linguistic features to brain structural measures is not disturbed in CHR individuals.

Limitations

The main limitation of the current study is the small sample size, particularly with regard to HIs. Additionally, site effects may have further affected the power of the study to detect diagnostic differences. We therefore consider our data preliminary pending replication in larger samples. The linguistic and neuroimaging features examined were chosen for their potential translational value because of the relative ease of collecting such data in clinical settings. Brain structural and resting-state data acquisition has the advantage of brevity and does not require active patient engagement. Similarly, linguistic features were selected according to our prior data and were based on free natural speech, which is easy to elicit in clinical settings. Future studies could expand the range of features to include speech graphs [64], prosody, pragmatics, metaphoricity [65], and discourse or conversations. Some CHR individuals were prescribed antipsychotics at the time of testing, although their cumulative exposure was minimal. Nevertheless, an effect of medication cannot be conclusively excluded.

Conclusion

Overall, we identified significant patterns of covariation between linguistic features and clinical symptoms in CHR individuals. The linguistic features were predominantly linked with negative symptoms and bizarre thinking, suggesting these symptoms co-occur with alterations in language processing in CHR. Future studies will be necessary to determine whether symptoms and language processing changes emerge simultaneously, or the onset of one precedes the other. No diagnostic effect was noted in the pattern of covariation between linguistic features and brain morphometry and resting-state network connectivity. These findings suggest relatively intact patterns of brain-language covariance. Further studies are needed to confirm the reproducibility of these findings.

Acknowledgments

This work was supported in part through the computational resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai. Dr. Corcoran received support from the National Institute of Mental Health (R01 MH107558-01 and R01 MH115332-01). Dr. Doucet received support from the National Institute on Aging (R03 AG064001) and the National Institute of General Medical Sciences (P20GM130447).

Conflict of Interest

The authors have no potential or actual conflict of interest.

Authorship Contributions

Dr. Corcoran was involved in study design, data analyses, and manuscript write-up. Dr. Haas conducted the data analyses and wrote the first draft of the manuscript. All authors were involved in participant recruitment and assessment, and contributed to data analyses and manuscript write-up.

Data Availability Statement

The data are available upon request from the National Data Archive and from Dr. Corcoran.

Supplementary Materials

To view supplementary material for this article, please visit http://dx.doi.org/10.1192/j.eurpsy.2020.73.

Table 1. Demographic and clinical characteristics of the whole sample.

Table 2. Linguistic features of the sample.
