This chapter reviews the quantitative corpus linguistic literature on development of cohesion in first- and second-language writing. It first provides a theoretical and methodological context for such work by discussing the two main frameworks within which cohesion has been researched. It then critically reviews an extensive body of literature to establish what substantive conclusions can be drawn and what might constitute productive foci for future research. Interest in cohesion as a correlate of development has been less intense than that seen for grammar, vocabulary, and formulaic language, and few consistent patterns have emerged. While there is some indication that a small number of measures are associated with development, evidence on these is too sparse for any confident conclusions to be drawn. Moreover, quantitative measures of cohesion appear to be highly contextually specific, depending crucially on the nature of the text and on the writer's estimation of their audience's topic knowledge.
This chapter reviews the quantitative corpus linguistic literature on formulaic language development in writing. It first provides a theoretical and methodological context by discussing the construct of formulaic language and the various ways in which it has been operationalised in studies of writing development. It then critically reviews the literature to establish what substantive conclusions can be drawn and what might constitute productive foci for future research. The review highlights a lack of interest in first-language studies. However, second-language studies have seen a rapid expansion of interest over the last decades, which has yielded a number of consistent patterns. In particular, writing quality is positively associated with the percentage of n-grams attested in a reference corpus, the mean strength of association between collocates (again as attested in a reference corpus), and the prevalence of sequences which analysts subjectively identify as formulaic. It is also negatively associated with use of formulas copied from source materials. Key areas in which further methodological development is needed include: understanding how analysts identify sequences as formulaic; increasing the size and rigour of studies looking at discourse functions of lexical bundles; understanding the impact of reference corpus on findings; developing corpora representative of learner input.
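As a rough illustration of the first measure mentioned above, the percentage of n-grams attested in a reference corpus, the following minimal sketch counts how many of a learner text's bigrams also occur in a reference text. The function names, the `min_freq` threshold, the whitespace tokenisation, and the toy sentences are all assumptions made for illustration; studies in this literature operationalise the measure in differing ways.

```python
from collections import Counter


def ngrams(tokens, n=2):
    """Return the list of contiguous n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def attested_percentage(learner_tokens, reference_tokens, n=2, min_freq=1):
    """Percentage of learner n-gram tokens occurring at least min_freq times
    in the reference corpus (hypothetical operationalisation)."""
    ref_counts = Counter(ngrams(reference_tokens, n))
    learner = ngrams(learner_tokens, n)
    if not learner:
        return 0.0
    hits = sum(1 for gram in learner if ref_counts[gram] >= min_freq)
    return 100.0 * hits / len(learner)


# Toy data, invented purely for illustration.
reference = "the results of the study show that the results are clear".split()
learner = "the results of the test show that results are good".split()
pct = attested_percentage(learner, reference)
print(round(pct, 1))
```

In practice the reference corpus would be far larger, and the choice of corpus and frequency threshold can affect the result, which is one of the methodological issues the abstract raises.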
This chapter establishes a theoretical and methodological foundation for the quantitative corpus linguistic study of writing development. First, it defines and discusses the central constructs of writing, writing proficiency, development, and quantitative corpus linguistics. Second, it sets out four assumptions on which, we argue, quantitative corpus approaches rest and discusses in detail both the strengths of these approaches and the methodological challenges they need to confront. Third, it gives a detailed discussion of specific methodological issues related to defining and measuring key variables of development and context and of establishing the status of particular measures of language use. Finally, the chapter reviews one particular quantitative corpus linguistic approach (multidimensional analysis) which raises important questions about quantitative corpus linguistic methodology as a whole.
This chapter brings together discussions and evidence from the preceding chapters to draw conclusions about first- and second-language writing development and about quantitative corpus linguistics as a methodology. It first summarises the key patterns of development in terms of grammar, vocabulary, formulaic language, and cohesion. It then discusses implications of these findings for the key constructs of time- and quality-related development and draws methodological conclusions with regard to how quantitative measures of development have been, and in the future could be, theorised and operationalised and the types of text samples on which studies have been, and could be, built. The chapter ends by setting out a number of key priorities for future research, grouped under the headings of theorisation, broadening attention to contexts, and integration with other methods.
This chapter reviews the quantitative corpus linguistic literature on syntactic development in first- and second-language writing. It first provides a theoretical and methodological context for such work by discussing the construct of syntactic proficiency. It then critically reviews an extensive body of literature to establish what substantive conclusions can be drawn and how future research could most productively develop. The strongest developmental patterns are found for generic measures of syntactic complexity, as operationalised through measures such as mean length of sentence/T-unit/clause and subordinate clause ratios. However, we argue that such measures are relatively uninformative with regard to a detailed understanding of development. Our review of more specific syntactic measures highlights a number of key features which have the potential to give useful insights into language development, while also underscoring the fragmentary nature of the measures studied to date. Methodologically, the review identifies a pervasive lack of conceptual clarity regarding what is measured and why. We find important unacknowledged differences in how key terms (e.g. clause, noun phrase) are defined and operationalised, which make it difficult to build a theoretically meaningful and cohesive developmental picture.
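To make the generic complexity measures named above more concrete, here is a minimal sketch of mean sentence length in words. It assumes a naive punctuation-based sentence splitter and whitespace tokenisation, both chosen only for illustration; T-unit and clause-based measures would require syntactic parsing, which is not attempted here.

```python
import re


def mean_sentence_length(text):
    """Mean number of whitespace-separated words per sentence.

    Sentences are split naively on runs of '.', '!' or '?', which is an
    illustrative assumption rather than a robust segmentation method.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(word_counts)


# Toy text, invented for illustration.
sample = "The cat sat. The dog barked loudly at the cat!"
msl = mean_sentence_length(sample)
print(msl)
```

A single average like this is easy to compute, which may partly explain its popularity, but it says nothing about which structures make the sentences longer.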
This chapter reviews the quantitative corpus linguistic literature on vocabulary development in first- and second-language writing. It first provides a theoretical and methodological context for such work by discussing the construct of vocabulary proficiency. It then critically reviews an extensive body of literature to establish what substantive conclusions can be drawn and what might constitute productive foci for future research. The strongest developmental patterns are found for measures of vocabulary diversity and use of academic vocabulary. There are also important, but complex, developmental patterns with regard to use of high- versus low-frequency words. However, the current range of measures attested in the literature has significant limitations: the measures are limited in scope, focusing almost exclusively on breadth, rather than depth, of vocabulary knowledge; the relationships between measures and knowledge constructs are often unclear; the relationships between measures themselves, which often overlap with each other in complex ways, are largely unexamined; and measures are often too coarse-grained, and may consequently disguise important developmental patterns by conflating distinct constructs.
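Vocabulary diversity, mentioned above, can be sketched with two simple indices: the type-token ratio and the root type-token ratio (types divided by the square root of tokens, often called Guiraud's index). The abstract does not say which diversity measures the chapter reviews, so the choice of these two, the function names, and the toy sentence are assumptions for illustration only.

```python
import math


def type_token_ratio(tokens):
    """Number of distinct word forms divided by the total number of tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def root_ttr(tokens):
    """Types divided by the square root of tokens, a simple length correction."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / math.sqrt(len(tokens))


# Toy text, invented for illustration: 10 tokens, 7 distinct types.
tokens = "the cat sat on the mat and the dog sat".split()
ttr = type_token_ratio(tokens)
rttr = root_ttr(tokens)
print(ttr, round(rttr, 3))
```

Both indices count surface forms only, so they speak to breadth rather than depth of vocabulary knowledge, which is the limitation the abstract points to.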
This chapter sets out the aims of the book and delimits its scope by defining the field of quantitative corpus linguistics (QCL), describing its key strengths as a way of understanding written-language development, and setting out some of the methodological problems which researchers need to untangle. It then outlines the systematic literature reviews on which the book is based and provides a broad overview of the trajectories this literature has taken over time.
Quantitative corpus research on written language development has expanded rapidly in recent years, assisted by the ever-increasing power and accessibility of software capable of reliably analysing huge collections of learner writing. For this work to reach its full potential, it is important that researchers have a strong understanding of its methodological foundations and of the existing empirical evidence base on which it can build. This book provides the most comprehensive discussion to date of research in this area. Covering both first and second language learning contexts, it sets out a coherent theoretical framework and systematically reviews studies published over the last seventy years in order to establish what such research has taught us about written language development, what it hasn't taught us, and what we should do next. Timely and original, this is an essential reference work for academic researchers and students of first and second language writing.