This chapter begins by discussing how multidimensional techniques and visualizations have been used to analyse linguistic data. Although techniques such as multidimensional scaling and unrooted phenograms (or NeighborNets) were designed primarily for exploratory purposes, the author argues that they are in fact regularly used to put linguistic assumptions or hypotheses to the test. In such approaches, cluster goodness (in terms of internal coherence and external distance from other clusters) is typically evaluated on the basis of a two-dimensional visualization. The author compares the affordances and limitations of visual inspection with a set of quantitative metrics that relate directly to visual displays but add a degree of precision not attained by the human eye. The empirical part of the paper applies both approaches to a study of concessive constructions in six varieties of English, based on spoken and written material from the International Corpus of English. The author suggests that the new metrics can be usefully applied to a variety of multidimensional techniques to endow them with a measure of objectivity.
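The summary above does not name the chapter's metrics. As a stand-in, the mean silhouette width is one standard measure that combines internal coherence with distance from other clusters, and it can be computed directly on the coordinates of a two-dimensional display. The sketch below is a minimal illustration under that assumption; the function name `mean_silhouette` and the toy coordinates are hypothetical and are not taken from the chapter.

```python
from math import dist


def mean_silhouette(points, labels):
    """Mean silhouette width for points (e.g. 2-D MDS coordinates).

    Assumption: silhouette width stands in for the chapter's own,
    unspecified metrics of coherence and separation.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # By convention, a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        widest = max(a, b)
        scores.append(0.0 if widest == 0 else (b - a) / widest)
    return sum(scores) / len(scores)


# Hypothetical 2-D coordinates: two tight groups far apart.
coords = [(0, 0), (0, 1), (10, 0), (10, 1)]
well_separated = mean_silhouette(coords, ["A", "A", "B", "B"])
badly_grouped = mean_silhouette(coords, ["A", "B", "A", "B"])
```

Values near 1 indicate compact, well-separated clusters, and values below 0 indicate points that lie closer to another cluster than to their own. This is the kind of distinction that visual inspection judges only approximately.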
Objective
Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) appears to be more flexible than the commonly applied k-means and Ward’s methods. In the present paper, these three clustering approaches are compared to find the most appropriate one for clustering dietary data.
Design
The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice.
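When the true cluster membership is known, as in the simulated data sets, a clustering result can be scored by its agreement with the truth. The abstract does not state which agreement measure was used. The sketch below assumes the adjusted Rand index, a common choice that is invariant to how the clusters are labelled; the function name and the toy labels are illustrative only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, predicted_labels):
    """Chance-corrected agreement between two partitions.

    Assumption: the adjusted Rand index is used here for illustration;
    the abstract does not name the study's performance measure.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have equal length")
    n = len(true_labels)
    if n < 2:
        raise ValueError("at least two observations are required")

    # Contingency counts and marginal cluster sizes.
    cells = Counter(zip(true_labels, predicted_labels))
    true_sizes = Counter(true_labels)
    pred_sizes = Counter(predicted_labels)

    # Numbers of observation pairs grouped together.
    together_both = sum(comb(c, 2) for c in cells.values())
    together_true = sum(comb(c, 2) for c in true_sizes.values())
    together_pred = sum(comb(c, 2) for c in pred_sizes.values())

    expected = together_true * together_pred / comb(n, 2)
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        # Degenerate case: both partitions are trivial and identical.
        return 1.0
    return (together_both - expected) / (maximum - expected)


truth = [0, 0, 0, 1, 1, 1]
relabelled = [1, 1, 1, 0, 0, 0]    # same partition, different names
one_mistake = [0, 0, 1, 1, 1, 1]   # one observation misassigned

ari_perfect = adjusted_rand_index(truth, relabelled)
ari_mistake = adjusted_rand_index(truth, one_mistake)
```

A score of 1 means the recovered clusters match the true membership exactly, while scores near 0 indicate agreement no better than chance. Scoring each method's output against the simulated truth in this way allows the methods to be ranked case by case.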
Results
The GMM outperformed the other methods in the simulation study in 72–100 % of cases, depending on the simulated cluster structure. Of the two computationally less complex methods, k-means performed better than Ward’s method in 64–100 % of cases. Applied to real data, all methods identified three similar dietary patterns, which may be roughly characterized as a ‘non-processed’ cluster with a high consumption of fruits, vegetables and wholemeal bread, a ‘balanced’ cluster with only slight preferences for single foods, and a ‘junk food’ cluster.
Conclusions
The simulation study suggests that clustering via GMM should be preferred because of its greater flexibility regarding cluster volume, shape and orientation. The k-means method seems to be a good alternative: it is easier to use and gives similar results when applied to real data.