One of the most popular instruments for assessing perceived social support is the Multidimensional Scale of Perceived Social Support (MSPSS). Although the original structure of the MSPSS was defined to include three specific factors (significant others, friends, and family), studies in the literature propose different factor solutions. In this study, we addressed the controversial factor structure of the MSPSS using a meta-analytic confirmatory factor analysis approach. For this purpose, we drew on studies in the literature that examined and reported the internal structure of the MSPSS; after excluding studies that did not meet the inclusion criteria, we used summary data from 59 samples across 54 studies (total N = 27,905). We tested five different models discussed in the literature and found that the fit indices of both the correlated three-factor model and the bifactor model were good. We therefore also examined both models’ factor loadings and omega coefficients. Since there was no sharp difference between the two models and the theoretical structure of the scale is represented by the correlated three factors, we concluded that the correlated three-factor model was more appropriate for the internal structure of the MSPSS. We then examined measurement invariance for this model across language and sample type (clinical and nonclinical) and found that metric invariance was achieved. Overall, the three-factor structure of the MSPSS was supported in this study.
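Omega coefficients of the kind examined alongside the factor loadings can be computed directly from a model's standardized loadings. A minimal Python sketch (the loadings below are illustrative, not the MSPSS estimates):

```python
import numpy as np

def mcdonalds_omega(loadings):
    """McDonald's omega for a congeneric factor model:
    omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    assuming standardized loadings, so each error variance is 1 - loading^2."""
    loadings = np.asarray(loadings, dtype=float)
    common = loadings.sum() ** 2
    error = np.sum(1.0 - loadings ** 2)
    return common / (common + error)

# Hypothetical standardized loadings for a four-item subscale
omega = mcdonalds_omega([0.7, 0.8, 0.75, 0.65])
print(round(omega, 3))  # -> 0.817
```

Higher, more uniform loadings push omega toward 1; it reduces to a reliability estimate for the subscale's unit-weighted sum-score.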
The two statistical approaches commonly used in the analysis of dyadic and group data, multilevel modeling and structural equation modeling, are reviewed. Next considered are three different models for dyadic data, focusing mostly on the very popular actor–partner interdependence model (APIM). We further consider power analyses for the APIM as well as the partition of nonindependence. We then present an overview of the analysis of over-time dyadic data, considering growth-curve models, the stability-and-influence model, and the over-time APIM. After that, we turn to group data and focus on considerations of the analysis of group data using multilevel modeling, including a discussion of the social relations model, which is a model of dyadic data from groups of persons. The final topic concerns measurement equivalence of constructs across members of different types in dyadic and group studies.
In this chapter we review advanced psychometric methods for examining the validity of self-report measures of attitudes, beliefs, personality style, and other social psychological and personality constructs that rely on introspection. The methods include confirmatory-factor analysis to examine whether measurements can be interpreted as meaningful continua, and measurement invariance analysis to examine whether items are answered the same way in different groups of people. We illustrate the methods using a measure of individual differences in openness to political pluralism, which includes four conceptual facets. To understand how the facets relate to the overall dimension of openness to political pluralism, we compare a second-order factor model and a bifactor model. We also check to see whether the psychometric patterns of item responses are the same for males and females. These psychometric methods can both document the quality of obtained measurements and inform theorists about nuances of their constructs.
Few studies have examined the psychometric properties of the Connor-Davidson Resilience Scale (CD-RISC) in large adolescent community samples, and those that exist report markedly inconsistent results. This study explores the psychometric properties of the CD-RISC among Spanish adolescents by means of exploratory factor analysis (EFA), Rasch analysis, and measurement invariance (MI) across sex, as well as internal consistency and criterion validity. The sample comprised 463 adolescents (231 girls), aged 12 to 18 years, who completed the CD-RISC and other measures of emotional status and quality of life. The EFA suggested that the CD-RISC structure was unidimensional. Consequently, shorter unidimensional CD-RISC models from the literature were explored. Among these, the Campbell-Sills and Stein CD–RISC–10 showed the soundest psychometric properties, providing adequate item fit and supporting MI and non-differential item functioning across sex. Item difficulty levels were biased toward low levels of resilience, and some items malfunctioned in the lower response categories. With regard to reliability, categorical omega was .82. Strong associations with health-related quality of life, major depressive disorder symptoms, and emotional symptoms were observed. A weak association was found between resilience and male sex. The Campbell-Sills and Stein CD–RISC–10 model emerges as the best for assessing resilience among Spanish adolescents, as already reported in adults. Thus, independently of developmental stage, the core of resilience may reside in aspects of hardiness and persistence.
With efforts increasing worldwide to understand and treat paranoia, there is a pressing need for cross-culturally valid assessments of paranoid beliefs. The recently developed Revised Green et al. Paranoid Thoughts Scale (R-GPTS) constitutes an easy-to-administer self-report assessment of mild ideas of reference and more severe persecutory thoughts. Moreover, it comes with clinical cut-offs for increased usability in research and clinical practice. With multiple translations of the R-GPTS already available and in use, a formal test of its measurement invariance is now needed.
Methods
Using data from a multinational cross-sectional online survey in the UK, USA, Australia, Germany, and Hong Kong (N = 2510), we performed confirmatory factor analyses on the R-GPTS and tested for measurement invariance across sites.
Results
We found sufficient fit for the two-factor structure (ideas of reference, persecutory thoughts) of the R-GPTS across cultures. Measurement invariance was found for the persecutory thoughts subscale, indicating that it does measure the same construct across the tested samples in the same way. For ideas of reference, we found no scalar invariance, which was traced back to (mostly higher) item intercepts in the Hong Kong sample.
Conclusion
We found sufficient invariance for the persecutory thoughts scale, which is of substantial practical importance, as it is used for the screening of clinical paranoia. A direct comparison of the ideas of reference sum-scores between cultures, however, may lead to an over-estimation of these milder forms of paranoia in some (non-western) cultures.
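Why unequal item intercepts preclude direct sum-score comparisons can be seen in a small simulation: two groups with identical latent means still differ in observed sum-scores once one group's intercepts are shifted. A hypothetical Python sketch (all numbers are illustrative, not R-GPTS estimates):

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_sum_scores(n, latent_mean, loadings, intercepts, noise_sd=1.0):
    """Observed items follow the linear factor model x_ij = intercept_j + loading_j * eta_i + e_ij;
    return each person's sum-score."""
    eta = rng.normal(latent_mean, 1.0, size=n)
    items = intercepts + np.outer(eta, loadings) + rng.normal(0.0, noise_sd, (n, len(loadings)))
    return items.sum(axis=1)

loadings = np.array([0.8, 0.7, 0.9, 0.6])
base = np.array([2.0, 2.0, 2.0, 2.0])

# Same latent mean (0.0) in both groups; group 2's intercepts sit 0.5 higher.
g1 = simulate_sum_scores(5000, 0.0, loadings, base)
g2 = simulate_sum_scores(5000, 0.0, loadings, base + 0.5)
print(round(g2.mean() - g1.mean(), 2))  # near 4 * 0.5 = 2.0, despite equal latent means
```

The observed gap is pure intercept bias, which is exactly the over-estimation risk the conclusion describes for cross-cultural sum-score comparisons without scalar invariance.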
Latent variable models are a powerful tool for measuring many of the phenomena in which developmental psychologists are often interested. If these phenomena are not measured equally well among all participants, this would result in biased inferences about how they unfold throughout development. In the absence of such biases, measurement invariance is achieved; if this bias is present, differential item functioning (DIF) would occur. This Element introduces the testing of measurement invariance/DIF through nonlinear factor analysis. After introducing models which are used to study these questions, the Element uses them to formulate different definitions of measurement invariance and DIF. It also focuses on different procedures for locating and quantifying these effects. The Element finally provides recommendations for researchers about how to navigate these options to make valid inferences about measurement in their own data.
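One classical procedure for locating DIF in dichotomous items is the Mantel-Haenszel test. This is a generic illustration with simulated data, not code from the Element (whose approach is nonlinear factor analysis):

```python
import numpy as np

def mh_odds_ratio(item_ref, score_ref, item_focal, score_focal):
    """Mantel-Haenszel common odds ratio for one dichotomous item.
    Examinees are stratified on a matching total score, and the per-stratum
    2x2 (group x correct) tables are pooled. Values far from 1 suggest DIF."""
    num = den = 0.0
    for s in np.union1d(score_ref, score_focal):
        r = item_ref[score_ref == s]
        f = item_focal[score_focal == s]
        if len(r) == 0 or len(f) == 0:
            continue  # stratum empty for one group: no information
        n = len(r) + len(f)
        a, b = r.sum(), len(r) - r.sum()   # reference group: correct, incorrect
        c, d = f.sum(), len(f) - f.sum()   # focal group: correct, incorrect
        num += a * d / n
        den += b * c / n
    return num / den

# Hypothetical data: one latent ability drives both a 10-item matching score
# and the studied item, with no group difference built in.
rng = np.random.default_rng(7)
theta = rng.normal(size=2000)
score = rng.binomial(10, 1.0 / (1.0 + np.exp(-theta)))
item = (rng.random(2000) < 1.0 / (1.0 + np.exp(-(theta - 0.2)))).astype(int)

ref_item, ref_score = item[:1000], score[:1000]
foc_item, foc_score = item[1000:], score[1000:]
print(round(mh_odds_ratio(ref_item, ref_score, foc_item, foc_score), 2))  # near 1: no DIF simulated
```

Because both halves come from the same population, the estimate hovers around 1; an item that is conditionally harder for the focal group would push the ratio above 1.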
The ubiquity of mobile devices allows researchers to assess people’s real-life behaviors objectively, unobtrusively, and with high temporal resolution. As a result, psychological mobile sensing research has grown rapidly. However, only very few cross-cultural mobile sensing studies have been conducted to date. In addition, existing multi-country studies often fail to acknowledge or examine possible cross-cultural differences. In this chapter, we illustrate biases that can occur when conducting cross-cultural mobile sensing studies. Such biases can relate to measurement, construct, sample, device type, user practices, and environmental factors. We also propose mitigation strategies to minimize these biases, such as the use of informants with expertise in local culture, the development of cross-culturally comparable instruments, the use of culture-specific recruiting strategies and incentives, and rigorous reporting standards regarding the generalizability of research findings. We hope to inspire rigorous comparative research to establish and refine mobile sensing methodologies for cross-cultural psychology.
Measurement invariance (MI) is essential to bolstering validity arguments behind psychometric instruments (Zumbo, 2007). Nonetheless, very few second language (L2) anxiety scales, including the most widely used L2 anxiety questionnaire—the Foreign Language Classroom Anxiety Scale (FLCAS; Horwitz et al., 1986)—have been tested for MI. The present paper seeks to address this deficiency in the literature (a) by demonstrating why this procedure is key to enhancing our understanding of the latent phenomenon in question, particularly in relation to different language learning contexts, (b) by outlining the main stages of MI testing with specific recommendations for L2 scale developers and users, (c) by providing commendable examples of the application of MI in applied linguistics research in order to illustrate the potential of this technique, and (d) by making a case for employing MI in future validation studies, thereby promoting methodologically sound research practices in the context of anxiety scales and elsewhere in applied linguistics.
This Element demonstrates how and why the alignment method can advance measurement fairness in developmental science. It explains its application to multi-category items in an accessible way, offering sample code and demonstrating an R package that facilitates interpretation of such items' multiple thresholds. It examines the implications for group mean differences when differences in the thresholds between categories are ignored because items are treated as continuous, using an example of intersectional groups defined by assigned sex and race/ethnicity. It demonstrates the interpretation of item-level partial non-invariance results and their implications for group-level differences, and encourages substantive theorizing regarding measurement fairness.
The Defining Issues Test (DIT) has been widely used in psychological experiments to assess one’s developmental level of moral reasoning in terms of postconventional reasoning. However, there have been concerns about whether the tool is biased across people with different genders and political and religious views. To address these concerns, in the present study I tested the validity of the brief version of the test, the behavioral DIT, in terms of measurement invariance and differential item functioning (DIF). I found no significant non-invariance at the test level and no item demonstrating practically significant DIF at the item level. These findings indicate that neither the test nor any of its items showed a significant bias toward any particular group. The collected validity evidence therefore supports the use of test scores across different groups, enabling researchers who intend to examine participants’ moral reasoning development across heterogeneous groups to draw conclusions based on those scores.
The General Decision-Making Styles (GDMS) scale measures five decision-making styles: rational, intuitive, dependent, avoidant, and spontaneous. The GDMS has been related to coping and to some personality factors, and sex differences have been described. In spite of its usefulness, there is no validated Spanish translation. The aim of this study is to translate the GDMS into Spanish and provide psychometric evidence, considering sex differences and the relationships between the GDMS, personality, and coping variables. Two samples were used: the first comprised 300 participants who completed the GDMS and the Rational-Experiential Inventory (REI); the second comprised 361 participants who completed the GDMS, the Ten Item Personality Trait Inventory, and the brief COPE scales. In the second sample, 137 participants completed the GDMS a second time, eight weeks after the first data collection. Confirmatory factor analyses showed a five-factor composition of the GDMS, with equivalence across sex supported by invariance analyses. Moreover, the GDMS showed acceptable internal consistency and temporal stability. Finally, rational and intuitive styles were related to healthier coping patterns and emotional stability, while dependent, avoidant, and spontaneous styles were associated with unhealthy coping patterns and emotional instability.
Since the literature investigating the stigmatising attitudes of psychiatrists is scarce, this is the first study to examine the phenomenon across Europe. The Opening Minds Stigma Scale for Health Care Providers (OMS-HC) is a widely used questionnaire for measuring stigma towards people with mental illness among healthcare providers, although it has not been validated in many European countries.
Objectives
A cross-sectional, observational, multi-centre study was conducted in 32 European countries to investigate the attitudes of specialists and trainees in general adult and child psychiatry towards patients. In order to compare stigma scores across cultures, we aimed to establish measurement invariance.
Methods
An internet-based, anonymous survey was distributed in the participating countries and completed by n = 4245 psychiatrists. The factor structure of the scale was investigated using separate confirmatory factor analyses for each country. The cross-cultural validation was based on multigroup confirmatory factor analyses.
Results
When country data were analysed separately, the three dimensions of the OMS-HC were confirmed, and the bifactor model showed the best fit. However, in some countries a few items performed weakly. Attitudes towards patients appeared favourable, since stigma scores were below half of the maximum possible score. The results allowed stigma scores to be compared between countries and subgroups.
Conclusions
This international cooperation has led to the cross-cultural validation of the OMS-HC on a large sample of practicing psychiatrists. The results will be useful in the evaluation of future anti-stigma interventions and will contribute to the knowledge of stigma.
Generic psychometric instruments are frequently used in psychiatric practice. When a respondent provides an affirmative reply to two contrasting items in such a questionnaire (e.g. “I am reserved” and “I am outgoing”), serious questions need to be asked about the respondent, the instrument, and the interaction between the two.
Objectives
The research aims to identify reasons which could explain the contradictory answers provided by respondents to a well-established, and seemingly psychometrically sound instrument.
Methods
World Values Survey data, collected in South Africa (N = 3 531), were analysed, focusing on the personality survey, in which contrasting responses to matched items were identified. Exploratory factor analyses were used to inspect the factorial structure of the instrument across groups, after which measurement invariance tests were performed.
Results
The theorised factorial structure of the personality survey did not mirror the structure found in the South African sample. This was demonstrated both in the inspection of the factor structures and in the tests of measurement invariance. However, in some groups, specifically those well-versed in English and with higher levels of education, the structures were replicable.
Conclusions
The assumption that well-established instruments are valid in settings different from the one in which they were originally developed should be questioned, and such instruments should not be used unless thoroughly tested. This presentation exposes the extent of measurement non-invariance when an instrument is used in a foreign setting and shows how this can be detected and addressed. Those working with foreign individuals or conducting cross-cultural research should be particularly aware of these threats to validity.
Given the possibility of cultural differences in the meaning and levels of gratitude among children, we evaluated the measurement invariance of the Gratitude Questionnaire–5 (GQ–5) and differences in latent means across adolescents from two distinct cultures, China and America. Data were obtained from 1,991 Chinese and 1,685 American adolescents. Confirmatory factor analysis and multigroup confirmatory factor analysis were performed to examine the factor structure and measurement equivalence across Chinese and American adolescents. Cronbach’s alpha and item-total correlations of the GQ–5 were also evaluated. Confirmatory factor analyses supported the expected one-factor structure, and a series of multigroup confirmatory factor analyses supported full configural invariance, full metric invariance, and partial scalar invariance between the two groups. These findings suggest that the GQ–5 is suitable for mean-level comparisons. The subsequent comparison of latent means revealed that the Chinese adolescents reported significantly lower gratitude than the American adolescents.
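Cronbach's alpha and corrected item-total correlations of the kind reported here can be computed from any item-response matrix. A generic numpy sketch (simulated responses stand in for the GQ–5 data):

```python
import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def corrected_item_total(items):
    """Correlation of each item with the sum of the remaining items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([np.corrcoef(items[:, j], total - items[:, j])[0, 1]
                     for j in range(items.shape[1])])

# Simulated 5-item scale: one common factor plus item-specific noise
rng = np.random.default_rng(0)
eta = rng.normal(size=(1000, 1))
data = 0.8 * eta + 0.6 * rng.normal(size=(1000, 5))
print(round(cronbach_alpha(data), 2), corrected_item_total(data).round(2))
```

With these simulated loadings, alpha lands near .90 and every corrected item-total correlation is comfortably positive, which is the pattern a well-functioning unidimensional scale should show.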
The main aim of this study was to assess the psychometric proprieties of the Child Feeding Questionnaire (CFQ) in Italian mothers.
Design:
Mothers completed the Italian version of the CFQ, and children’s anthropometric data were collected. Construct validity of the CFQ was assessed by comparing three models: (a) a model with seven correlated factors in which all items were analysed; (b) a seven-factor model with composite items for the Restriction factor; and (c) an eight-factor model with a separate Reward factor. Measurement invariance across BMI categories and gender was evaluated. Furthermore, discriminant validity was examined through group comparisons between BMI categories and between genders.
Setting:
Italy.
Participants:
A total of 1253 6-year-old Italian children (53·9 % male) attending elementary school (1st grade) and their mothers (mean age = 38·22 years; sd = 4·89) participated in this study.
Results:
The eight-factor model with a separate Reward factor provided the best fit to the data. Strict invariance of the CFQ across child BMI categories and gender was confirmed. The internal consistency of the CFQ was acceptable for most subscales; however, two subscales did not reach adequate values. As expected, the CFQ scales showed significant differences between BMI categories, while no gender-related differences were found.
Conclusions:
The study indicated the Italian version of the CFQ to be factorially valid for assessing parental feeding practices of 6-year-old children across BMI categories. Future research should address low internal consistency in some of the CFQ subscales.
This study investigated the latent factor structure of the NIH Toolbox Cognition Battery (NIHTB-CB) and its measurement invariance across clinical diagnosis and key demographic variables including sex, race/ethnicity, age, and education for a typical Alzheimer’s disease (AD) research sample.
Method:
The NIHTB-CB iPad English version, consisting of 7 tests, was administered to 411 participants aged 45–94 with a clinical diagnosis of cognitively unimpaired, dementia, mild cognitive impairment (MCI), or impaired not MCI. The factor structure for the whole sample was first examined with exploratory factor analysis (EFA) and further refined using confirmatory factor analysis (CFA). For each variable (diagnosis or demographic factor), participants were classified into two groups. The confirmed factor model was then tested for each group with CFA. If the factor structure was the same between the groups, measurement invariance was tested using a hierarchical series of nested two-group CFA models.
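Nested two-group CFA models of this kind are conventionally compared with a chi-square difference (likelihood-ratio) test: the constrained model's chi-square minus the freer model's, evaluated on the difference in degrees of freedom. A sketch with hypothetical fit values, using scipy for the p-value:

```python
from scipy.stats import chi2

def chisq_diff_test(chisq_constrained, df_constrained, chisq_free, df_free):
    """Likelihood-ratio test between nested models. A significant result means
    the added equality constraints worsen fit, i.e. invariance is rejected."""
    d_chisq = chisq_constrained - chisq_free
    d_df = df_constrained - df_free
    return d_chisq, d_df, chi2.sf(d_chisq, d_df)

# Hypothetical example: metric model (equal loadings) vs configural model
d, ddf, p = chisq_diff_test(chisq_constrained=152.4, df_constrained=96,
                            chisq_free=140.1, df_free=88)
print(f"delta-chisq({ddf}) = {d:.1f}, p = {p:.3f}")
```

Here p is well above .05, so the equal-loadings constraint would be retained and metric invariance would not be rejected for these made-up numbers.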
Results:
A two-factor model capturing fluid cognition (executive function, processing speed, and memory) versus crystallized cognition (language) fit well for the whole sample and for each group except those aged < 65. This model generally showed measurement invariance across sex, race/ethnicity, and education, and partial invariance across diagnosis. For individuals aged < 65, the language factor remained intact, while fluid cognition separated into two factors: (1) executive function/processing speed and (2) memory.
Conclusions:
The findings mostly supported the utility of the battery in AD research, yet revealed challenges in measuring memory for AD participants and longitudinal change in fluid cognition.
Social isolation is a state of nearly absolute lack of interaction between an individual and society. The Friendship Scale (Hawthorne, 2006) is a brief measure of social isolation with sound psychometric properties that needed to be translated into Urdu for validation with the Pakistani population. For the Urdu translation, the standard back-translation procedure was adopted, and cross-language validation of the translated version was undertaken on a purposive sample (N = 60) of older adults aged 60 years and above. The one-week test-retest reliability of the Urdu-English and English-Urdu versions was .95 and .97, respectively. In an independent purposive sample of older adults (N = 500; men = 263, women = 237) from the Lahore and Sargodha districts, CFA of the Friendship Scale revealed a single-factor solution with six indicators, which demonstrated configural, metric, and scalar invariance across genders and comparable latent mean scores for men and women. The Friendship Scale demonstrated a significant positive relationship with depression and a non-significant association with assimilation, providing evidence of convergent and discriminant validity, respectively. Furthermore, evidence of concurrent validity was established, as older adults whose spouses had died scored significantly higher on the Friendship Scale than their counterparts who were living with their spouses. These pieces of evidence suggest that the Urdu version of the Friendship Scale is a reliable and valid measure of social isolation for both genders.
The goals of this study were to (1) specify the factor structure of the Uniform Dataset 3.0 neuropsychological battery (UDS3NB) in cognitively unimpaired older adults, (2) establish measurement invariance for this model, and (3) create a normative calculator for factor scores.
Methods:
Data from 2520 cognitively intact older adults were submitted to confirmatory factor analyses and invariance testing across sex, age, and education. Additionally, a subsample of this dataset was used to examine invariance over time using 1-year follow-up data (n = 1061). With the establishment of metric invariance of the UDS3NB measures, factor scores could be extracted uniformly for the entire normative sample. Finally, a calculator was created for deriving demographically adjusted factor scores.
Results:
A higher-order model of cognition yielded the best fit to the data: χ²(47) = 385.18, p < .001, comparative fit index = .962, Tucker-Lewis index = .947, root mean square error of approximation = .054, and standardized root mean square residual = .036. This model included a higher-order general cognitive abilities factor, as well as lower-order processing speed/executive, visual, attention, language, and memory factors. Age, sex, and education were significantly associated with factor scores, evidencing a need for demographic correction when interpreting them. A user-friendly Excel calculator was created to accomplish this goal and is available in the online supplementary materials.
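The fit indices reported above are all simple functions of the model and baseline (null-model) chi-square statistics. A sketch of the standard formulas; the baseline chi-square and df below are hypothetical, since the abstract does not report them:

```python
import numpy as np

def fit_indices(chisq, df, chisq_base, df_base, n):
    """Comparative Fit Index, Tucker-Lewis Index, and RMSEA from chi-square values."""
    cfi = 1 - max(chisq - df, 0) / max(chisq - df, chisq_base - df_base, 0)
    tli = ((chisq_base / df_base) - (chisq / df)) / ((chisq_base / df_base) - 1)
    rmsea = np.sqrt(max(chisq - df, 0) / (df * (n - 1)))
    return cfi, tli, rmsea

# Model chi-square and df from the abstract; baseline values are made up for illustration.
cfi, tli, rmsea = fit_indices(chisq=385.18, df=47, chisq_base=9000.0, df_base=66, n=2520)
print(f"CFI = {cfi:.3f}, TLI = {tli:.3f}, RMSEA = {rmsea:.3f}")
```

CFI and TLI measure improvement over the baseline model of uncorrelated indicators, while RMSEA penalizes misfit per degree of freedom and per participant, which is why all three depend on quantities beyond the model chi-square alone.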
Conclusions:
The UDS3NB is best characterized by a higher-order factor structure. Factor scores demonstrate at least metric invariance across time and demographic groups. Methods for calculating these factor scores are provided.
Expert evaluations about countries form the backbone of comparative political research. It is reasonable to assume that such respondents, no matter the region they specialize in, have a comparable understanding of the phenomena tapped by expert surveys. This is necessary to obtain results that can be compared across countries, which is the fundamental goal of these measurement activities. We empirically test this assumption using measurement invariance techniques that have not previously been applied to expert surveys. Most often used to test the cross-cultural validity and translation effects of public opinion scales, measurement invariance tests evaluate the comparability of scale items across any groups. We apply them to the Perceptions of Electoral Integrity (PEI) dataset. Our findings suggest that cross-regional comparability fails for all eleven dimensions identified in the PEI. Results indicate which items remain comparable, at least across most regions, and point to the need for more rigorous procedures for developing expert survey questions.
This paper compares nationalism in the two ex-Czechoslovak countries, the Czech and Slovak republics. The aim is to analyze the measurement of nationalism in the 1995, 2003, and 2013 International Social Survey Program (ISSP) National Identity surveys. According to the nationalism measures from the ISSP survey, which are frequently used by authors analyzing nationalism, both countries experienced a significant rise in nationalism between 1995 and 2013. Moreover, invariance testing of the nationalism latent variable confirms that levels of nationalism can be compared between Czechia and Slovakia over time. However, the associations between nationalism, as measured in the study, and related concepts such as xenophobia, protectionism, and assertive foreign policy suggest that what was measured as nationalism in 1995 is very different from what was measured in 2013. This is explained by a change of context in both countries between 1995 and 2013: answering the same question had a strong nationalistic connotation in 1995, but not in 2013. Based on our findings, we advise against using the analyzed “nationalism” items as a measure of nationalism, even beyond the two countries analyzed here.