The volume ‘All families and genera’: Exploring the Corpus of English Life Sciences Texts, edited by Isabel Moskowich, Inés Lareo and Gonzalo Camiña, is a very interesting and useful publication in the field of historical and diachronic studies on the language of science. The book is a collection of fifteen essays and introduces a new corpus of texts, known as CELiST (Corpus of English Life Sciences Texts, 1700–1900), as a subsection of the Coruña Corpus of English Scientific Writing (CC) compiled by the MuSTE Research Group (Multidimensional Corpus-Based Studies in English). The general aim of the volume is to support and promote qualitative research in line with a corpus-based and corpus-driven quantitative approach (cf. Biber 2015: 195, 202). In other words, the collection of texts in the CELiST combines a large quantitative source of data (totalling 400,305 words) with the qualitative multilayered co(n)texts in which language features and language patterns are used by a variety of eighteenth- and nineteenth-century authors. This review will discuss the main contents of the chapters presented in the volume and their interdisciplinary usefulness and reliability.
The volume can be subdivided into two main sections, dealing with the CELiST in different manners. Chapters 1–5 introduce the making of the corpus, the selection criteria and the disciplines included (Moskowich, chapter 1), the editorial policy (Camiña, chapter 2), the eighteenth- and nineteenth-century samples (Lareo & Moskowich, chapters 3–4) and the texts and their representativeness in Late Modern English (Alfaya-Lamas, chapter 5). Chapters 6–15 are corpus-based and corpus-driven CELiST studies on specific language issues, the quantitative data of which provide evidence for further qualitative discussion and interpretation in the (socio)linguistic contexts in which the selected language features appear. The areas of interest and investigation cover lexical fixedness (Bator, chapter 6), engagement (Mele-Marrero, chapter 7), indicators of persuasion (Crespo, chapter 8), persuasion and suasive verbs (Barsaglini-Castro, chapter 9), authorial presence through conditionals and citations (Puente-Castelo, chapter 10), the expression of true facts (Esteve-Ramos, chapter 11), evaluative that structures (Alonso-Almeida & Álvarez-Gil, chapter 12), authority and deontic modals (Álvarez-Gil, chapter 13), coherence relations (Bello Viruega & Narváez García, chapter 14) and register-internal variation (Monaco, chapter 15).
In ‘The making of the Corpus of English Life Sciences Texts (CELiST), a bunch of disciplines’ (pp. 1–19), Isabel Moskowich defines the fields of science included in the corpus, accurately labelled as ‘Life Sciences’. These are represented by a series of disciplines, such as biology, botany, zoology, horticulture, veterinary medicine, etc., which were only and specifically defined later than the period under scrutiny. The selection is based on the 1988 UNESCO taxonomy as a starting point (pp. 1–3), but adapted to late modern notions of knowledge. The label ‘Life Sciences’ was selected as a cohesive epistemological category for the period. The corpus counts 400,305 words evenly distributed over the eighteenth and nineteenth centuries (about 200,000 each century). The writers included are not found in any other subcorpora of the CC and represent both male and female authors, with a different ratio (80% and 20%, respectively). Most of the texts and authors come from England (about 50%) and are representative of Late Modern English scientific writing. Scotland is well represented (20%), followed by Irish and North American Eastern Coast publications and authors. The genres in the CELiST include mostly treatises, as well as guides, textbooks, letters, catalogues, lectures, essays and articles, to guarantee representativeness and ‘a small-scale reflection of what was happening in the English-writing communities in the Late Modern English world’ (p. 17).
‘Editorial policy in the Corpus of English Life Sciences Texts: Criteria, conventions, encoding and editorial marks’ (pp. 21–38) by Gonzalo Camiña explains the criteria adopted for conversion of the original texts into a digital format. The CELiST includes forty samples, twenty for each century, and about 10,000 words per sample, all of them encoded in XML format. Original texts were converted into Optical Character Recognition (OCR) files and manually revised to minimise ‘gross errors’ (p. 22). To make access flexible to researchers, headers highlighting metadata about the file were added to the sample texts (p. 25); font size, typeface, chapter and section titles were normalised for spelling; everything not representing the author's voice was omitted (e.g. editorial notes, margin notes/endnotes, quotations, pp. 31–3). All the texts are tagged (TEI, Text Encoding Initiative) and editorial marks have been added, such as square brackets to signal quotations, figures and formulae, and to disambiguate homographs (p. 29); original page numbers and contents were maintained, as well as spelling variants and line breaks, but catch-words at the bottom were omitted to avoid repetition of words. A list of editorial marks (tags), followed by the author/year to which they are applied, is usefully displayed at the end of this chapter.
In the two parallel chapters 3 and 4, ‘A look beyond the texts: The samples in the eighteenth-century Corpus of English Life Sciences Texts’ (pp. 39–70) and ‘A look beyond the texts: The samples in the nineteenth-century Corpus of English Life Sciences Texts’ (pp. 71–93), Inés Lareo and Isabel Moskowich provide essential bibliographical information on the eighteenth- and nineteenth-century samples in the CELiST, as well as on the role of publishers and sellers over time (p. 39). Chapter 3 includes detailed descriptions of each sample with reference to titles, pages sampled, chapters and plates, and the paratext as a whole. Since the publication process was particularly complex in the eighteenth century, and the application of copyright was still fluid, a relevant appendix was added at the end of the chapter, indicating date, author, place of publication, printer/distributor, copyright, dedication, subscribers and genre. Chapter 4 provides a detailed description of those works which constitute the nineteenth-century CELiST, and also includes different copyright options and features drawn from prefatory material (p. 71). It also highlights some fundamental differences with the preceding century in relation to technological innovations in the production of books.
‘The Corpus of English Life Sciences Texts and representativeness: An information and documentation analysis of Late Modern English scientific texts’ (pp. 95–114) is the last of this methodological section on the collection of the CELiST texts. The author, Elena Alfaya-Lamas, discusses the qualitative and quantitative representativeness of the works included in the corpus, with a focus on their lexicon (p. 95) and its closure-saturation degree (types/tokens; see graphs and tables, pp. 104–13). CELiST contains forty 10,000-word samples (p. 108) evenly distributed over the two centuries under scrutiny; the domains involved are biology, zoology, entomology, botany and other related fields, reflecting the different categories at the time (p. 99). The selection is based on two texts for every ten-year period, texts written directly in English, extracts taken from different parts of the texts, female and male authors. On the basis of Alfaya-Lamas’ description, the corpus is certainly quantitatively and qualitatively representative and reliable.
Magdalena Bator introduces the group of essays dealing with the application of specific research questions to the CELiST or, in other words, with the retrieval and collection of language features for quantitative and qualitative investigation. In ‘Lexical fixedness within the field of Life Sciences in Late Modern English: Evidence from the Corpus of English Life Sciences Texts’ (pp. 115–31), the author examines binomials in terms of structural and semantic relations, understood as ‘a sequence of two or more words or phrases belonging to the same grammatical category having some semantic relationship and joined by some syntactic device such as “and” or “or”’ (Bhatia 1993: 108, cited on p. 116), and paying attention to ‘frequency, syntactic structure and fixedness’ (p. 117). Based on the CELiST, the selection was limited to 108 coordinated word pairs which were found at least three times in more than one text, to guarantee effectiveness and representativeness. From this interesting analysis, it emerges that most of the phrases are joined by and, that those joined by or are antonymous binomials and that nominal pairs are the most frequent pattern in the data. The period 1720–50 is the most productive time, and catalogues, lectures and textbooks abound in binomials, since they are educational genres addressed to a non-expert readership. Complementation in nominal phrases (two constituents share some common property, e.g. animals or plants, form and colour, p. 124) is prevalent, followed by other semantic categories of binomials, such as antonymy, synonymy and hyponymy (p. 125). The study demonstrates that the use of binomials is consistent with the need to be precise and coherent (p. 128), and also the need to make the texts understandable, though further research is required.
‘Engagement in the botanists of the Corpus of English Life Sciences Texts: Flourishing female scientific writing’ (pp. 133–46), by Margarita Mele-Marrero, not only examines expressions of engagement but also highlights a relevant sociolinguistic issue: female research and writing in the expanding domain of botany (p. 133). The study is focused on stance and engagement markers and, in particular, on textual acts (explicit indications of the author to the reader, e.g. move on to, etc.), cognitive acts (mental processing of content, e.g. compare, etc.) and physical acts in directives (referring to tangible actions, e.g. use, etc.) (p. 134). The general aim is to examine variation between male and female writing and how women render their discourse authoritative. To guarantee representativeness, nine works evenly distributed over time and gender were selected for analysis. All of them were searched quantitatively, and later analysed qualitatively to distinguish imperative forms. Results highlight that all writers avoid a strong authorial voice through the use of directives and prefer a soft approach (p. 143), but whereas men tend to use more imperatives with deontic modals, women tend to prefer may to mitigate their requests. This piece of research is very interesting and deserves further investigation.
Begoña Crespo also focuses on female writers in her analysis of ‘Linguistic indicators of persuasion in female authors in the Corpus of English Life Sciences Texts’ (pp. 147–67). The study examines introductory material (e.g. prefaces and dedications) in detail and singles out linguistic features which can denote persuasion, argumentation and interaction (first/second-person pronouns, predictive/necessity modals, conditional subordination, suasive verbs and to-infinitives). Only texts written by women were included in the analysis (20% of the CELiST, 80,187 words out of the total 400,305). Introductory material amounts to 5,127 words, analysed quantitatively and qualitatively: a number of graphs and tables appropriately support data exposition and argumentation. This study provides interesting results: as expected, prefaces are more interactive and persuasive than the body of the texts and tend to construct proximity and solidarity (e.g. we, p. 157) with to-infinitives, followed by first-person pronouns, suasive verbs and predictive modals (p. 162).
A more specific piece of research on persuasion is carried out by Anabella Barsaglini-Castro, whose interest concentrates on a list of nineteen suasive verbs in ‘Persuasion in English scientific writing: Exploring suasive verbs in the Corpus of English Life Sciences Texts and Posthumanism English Texts’ (pp. 169–88). This pilot study compares Late Modern English persuasion strategies, using nineteenth-century CELiST works and Posthumanism English Texts, a twentieth-/twenty-first-century corpus of English texts (PET, under construction, p. 172), and based on Biber's multidimensional analysis (1988, 1995), specifically on Dimension 4 ‘Overt Expression of Persuasion’ (Biber & Conrad 2009). In both corpora, the texts analysed include non-fiction samples, and graphs and tables duly support exposition and argumentation (suasive verb frequency, suasive verbs per field/subject-matter, century, sex and field/subject-matter, pp. 179–84). Interesting results report that there is a general increase in frequency of suasive verbs, with differences over the three centuries considered; that scientific writing seems to move towards a ‘more direct kind of communication’ (p. 185) with the audience and authorial presence is characterised by a higher use of persuasion strategies in contemporary authors; and that contemporary female authors tend to use more suasive verbs than men (p. 186).
Another case study is Luis Puente-Castelo's, discussing authorial presence in ‘“If you will take the trouble to inquire into it rather closely, I think you will find that it is not worth very much”: Authorial presence through conditionals and citation sequences in late modern English life sciences texts’ (pp. 189–208). Puente-Castelo considers two different linguistic strategies which may be used to express the authors’ attitude towards their works and their contents, namely conditional structures (‘non-committal’, expressing uncertainty; ‘metalinguistic’, expressing uncertainty about forms/words used; ‘relevance’, expressing the circumstances in which a statement may be considered relevant) and opinions in citation sequences (reporting the words of authors) (pp. 192–3). Conditional particles such as if, unless, providing, supposing were extracted from the full CELiST corpus, their use disambiguated in context, and later classified in relation to their function in discourse (p. 195). Citation sequences were extracted using the CCT (Coruña Corpus Tool) and results were verified in a longer context to identify the attitude of the writer. Among the conditional structures of interest (56 out of 777), relevance if-conditionals are the most frequent pattern, followed by non-committal conditionals (present simple-present simple combination) and metalinguistic conditionals (often combined with the verb may) (pp. 195–9). As regards citation sequences, the 275 cases display a high degree of diachronic variability and the most frequent attitude is neutrality, compared to agreement, isolated sequences and disagreement. These attitudes are not evenly distributed over time. The study is thought-provoking since it reveals the complexity in the construction and expression of the authors’ opinion beyond the certainty–uncertainty axis (p. 205).
‘“This ingenious hypothesis hath a great appearance of truth”: The expression of true facts in the Corpus of English Life Sciences Texts’ (pp. 209–26), by María José Esteve-Ramos, aims at investigating the expressions used by the scientists of the Enlightenment to make their perspectives ‘sound more truthful and reliable’ (p. 209). Adverbs of certainty are one of the many resources used by writers to display and support ‘unprecedented knowledge’ (p. 209) and for this reason Esteve-Ramos investigates the full CELiST (degree of truth adverbs, p. 211) and compares the results with those obtained from the Corpus of English History Texts (CHET). Parameters such as century, sex, age of authors and genre are essential for the study. It is observed that the use of certainty adverbs is exclusive or more frequent in male writers, with a marked increase from one century to the next (p. 217). Occurrences are higher in 30–49-year-old writers, in contexts in which they introduce and discuss new approaches (pp. 218–19), and are more frequent in articles, lectures, essays and treatises, all of which are genres related to the development and exposition of new experimental scientific knowledge. The analysis confirms the use of adverbs (e.g. indeed, certainly, clearly, plainly, evidently, p. 220) as a powerful tool to establish truth, or degrees of truth, and to connect with the audience (p. 225). It is definitely a field worth further investigation.
Language strategies and, in particular, morphosyntactic issues which convey positioning and evaluation are explored by Francisco Alonso-Almeida and Francisco J. Álvarez-Gil in ‘Evaluative that structures in the Corpus of English Life Sciences Texts’ (pp. 227–47). The authors focus on that-clauses, one of the main strategies to express authorial presence. The CELiST was searched using the CCT, and results were manually pruned to identify all the relevant examples of evaluative that-expressions. Results show a slight difference between male and female writers, and a general preference for epistemic devices (p. 232). Specific uses and preferences depend on more detailed analyses, which include the object/s of evaluated that-structures, such as evaluative entity (e.g. entity evaluated, previous studies, writer's claim, pp. 233–4), evaluative stance (e.g. authorial perspectivisation of knowledge, pp. 234–6), evaluative source (e.g. authorial accountability, pp. 237–40) and expression (noun/adjective non-verbal predicate vs discourse/research/cognitive acts and verbal predicate, pp. 240–3). The study is very interesting in the variety of examples and contextualisations provided, and the topic certainly deserves further attention.
Deontic modality and its relationship with authority and the expression of necessity for argumentative reasons in scientific writing, along with mitigation to avoid potential face-threatening acts (p. 249), is the topic addressed in ‘Authority and deontic modals in Late Modern English: Evidence from the Corpus of Life Sciences Texts’ (pp. 249–64) by Francisco J. Álvarez-Gil. The focus lies in modal verbs and their occurrence in the full CELiST: modals were quantitatively retrieved by using the CCT and qualitatively pruned to identify deontic meanings. About 44 per cent of all modal verbs provide ‘deontic nuances’ (p. 252) and their deontic meaning is essentially expressed by can, may, will, shall, must and should. Among deontic verbs, will is the item most often used (22%; overuse by men), followed by should (8%), must (6%; men < women), shall (5%; women < men) and can (3%; men < women) (pp. 254–5). Prescription is usually expressed by must (necessity in natural processes and authority/directions in procedures) and advisable recommendation by should (p. 256). Necessity/logical outcome is expressed by will, must and should (pp. 256–7). Necessary steps and position of information prefer will and shall (e.g. I shall first examine; pp. 257–8). Appeal to the reader or colleague is expressed by must (e.g. it must, we must; pp. 258–9). Observed phenomena in descriptive passages and expectations based on actual knowledge are signalled by will (e.g. will see, it will be seen; pp. 259–60). The last category exemplified is deontic possibility, introduced by can and expressing impossibility to perform an action (e.g. I cannot commend, it cannot be, etc.; p. 260). In the discussion, the author addresses key issues such as the frequency and relevance of necessity modals which reinforce ‘the idea of futurity’ and logical sequencing (p. 261), the mitigation effects of should and the expression of expertise and authority by highlighting factuality and softening face-threatening situations (pp. 261–2).
Iria Bello Viruega and Elisa Narváez García investigate discourse connections between information units in ‘A study of coherence relations in the English scientific register: Conjunctions in the Corpus of English Life Sciences Texts’ (pp. 265–87), with a focus on explicit relations conveyed in additive, adversative, causal and temporal conjunctions (pp. 266, 270). Extra-linguistic factors considered are gender, age and place of education: young(er) writers (25–44 years) in the CELiST cover about 60 per cent of words; the leading place of education is England (52%, about 210,000 words), followed by Scotland, the US and Ireland. Articles, essays, letters and treatises (about 240,000 words, 60%) have been categorised as specialised genres; catalogues, guides, lectures and textbooks (about 160,000, 40%) as non-specialised genres. Explicit coherence relations tend to steadily increase over the two centuries, with a preference for adversative conjunctions, followed by causal, additive and temporal terms (pp. 270–1). Most of the explicit conjunctions identified are used by male writers, whereas the distribution of the four types of conjunctions is balanced (p. 273), and it also displays a similar frequency in relation to the author's age (p. 275). The use of conjunctions according to place of education is more frequent in the US, followed by England, Ireland and Scotland. Treatises gather most of the conjunctions analysed (pp. 276–7), but explicit coherence marks are most often found in non-specialised texts to facilitate comprehension. The study is very accurate and precise, and sheds light on the complexity of scientific argumentation as well as on cognitive processes for the reader to understand and clearly interpret the text (p. 279). This is a compelling field of interest which deserves further attention for the two centuries under scrutiny.
The last chapter of the collection, a contribution by Leida Maria Monaco, is concerned with ‘Spotting register-internal variation in eighteenth- and nineteenth-century life sciences: Descriptiveness and argumentation in the Corpus of English Life Sciences Texts’ (pp. 289–307). The aim is to carry out a multidimensional analysis (Monaco 2017; see Biber 1988) of the CELiST from the perspective of genre and author's sex, in order to measure argumentative versus descriptive focus (Dimension 2, p. 293) and, in general, patterns of subregisters (Biber & Gray 2013) (p. 289). Argumentation is usually marked by conjuncts, predictive modals, causative adverbs, other adverbial subordinators and conditional subordinators (positive features/above zero), whereas description is highlighted by attributive adjectives, predicative adjectives, hedges, downtoners and amplifiers (negative features/below zero). Life Sciences are markedly descriptive over the two centuries analysed, in contrast with Astronomy (CETA, Corpus of English Texts on Astronomy), which is highly argumentative, and Philosophy (CEPhiT, Corpus of English Philosophy Texts), whose positive and negative features tend to balance argumentation (eighteenth century) and description (nineteenth century) (pp. 295–7). As regards the CELiST genres (p. 298), the essay is argumentative (only eighteenth-century samples), whereas description mainly characterises letters (eighteenth century), treatises (nineteenth century) and textbooks (eighteenth and nineteenth centuries). Male writers tend to become more descriptive over time, and female writers more argumentative, but the CELiST samples overall remain fundamentally descriptive (p. 300). Like the preceding studies in the volume, the investigation has proved to be extremely interesting, revealing general versus more specific issues which need further investigation.
To conclude, the collection is a very dense and compelling read, and highly reliable. It unfolds as a useful source of information for both expert and young scholars who are interested in analysing language features and (sub)register patterns of English in the expanding field of Life Sciences. The studies based on the CELiST, which includes sample texts covering two centuries and which are not found in any other subcorpora of the CC, are multifarious. Although a subdivision into thematic sections of contents and research perspectives would have been extremely helpful to guide the readership, the work is consistent and interesting, and provides an effective and coherent overview of Late Modern English language experimentation and usage. The contributions range from linguistic analysis of descriptive issues (e.g. processes, circumstances, procedures) to more argumentative language aspects (e.g. logical patterns/expressions) and can certainly be regarded as an essential methodological starting point for future research.