1. Introduction
Electronic lexicography has enabled dictionaries to break free of some of the shortcomings inherent in the paper medium (de Schryver, 2003). Although this represents an important departure from previous lexicographic practice, electronic dictionaries often mimic their print counterparts in many ways. Crucially, users still need to stop writing to consult a dictionary, a process that can be disruptive and cognitively taxing (Frankenberg-Garcia, 2020). However, it is possible to integrate dictionaries into other applications to reduce the cognitive cost incurred when engaging in dictionary lookups, as envisioned in the ColloCaid writing assistant (Frankenberg-Garcia, Lew, Roberts, Rees & Sharma, 2019). Designed to help writers enhance their academic vocabulary, ColloCaid offers collocation suggestions directly from within a text editor, so that writers can make use of them as they continue to write. This paper investigates the usefulness and effectiveness of ColloCaid in helping student writers revise their original collocational choices. While previous studies have examined users’ perceptions of the tool (Frankenberg-Garcia, Rees, Lew, Roberts, Sharma & Butcher, 2019; Rees, 2021), this study is the first one to explore the extent of ColloCaid’s coverage of authentic pieces of writing and assess its impact on revision.
2. Background
Collocations (i.e. conventional combinations of words) have long been identified as a major hurdle for language learners. Palmer (1933) saw them as a source of greater difficulty than grammar or individual words, noting that learners may “not be able to piece these collocations together from their component parts” (p. 14). Subsequent empirical research has provided abundant evidence of how collocations impact learners. Studies by Howarth (1996), Nesselhauf (2005), and Laufer and Waldman (2011) indicate that approximately 30% of the collocations produced by participants were erroneous. More concerningly, Nesselhauf (2005), Laufer and Waldman (2011), and Kreyer (2021) found that the percentage of deviant collocations remained stable, regardless of years of English study or proficiency level. Such findings have been interpreted as evidence of a “collocation lag” (Men, 2018: 2), which occurs when “collocational knowledge does not develop alongside learners’ general level of English proficiency”.
Less conspicuous than miscollocations are the related phenomena of collocation overuse and underuse (Granger, 1998). Learners can overuse certain collocations, leaning on them as a crutch because they are familiar. At the same time, while clinging to these “lexical teddy bears” (Hasselgren, 1994: 237), learners can shy away from other options because they are unknown or seem like a risk not worth taking. Collocation underuse and overuse may be particularly impactful in more advanced writing. Pinto, Rees and Frankenberg-Garcia (2021), for example, saw little evidence of miscollocation in a corpus of English research publications by Brazilian scholars, but when compared with a reference corpus, the Brazilian corpus exhibited fewer but stronger collocations. The authors of the study reasoned that if writers have a more restricted recall of collocations, it is natural that they will repeat the fewer collocations they do remember more often than writers who can access a wider array of collocations, thus suggesting that collocation overuse and underuse are interconnected.
Although research into collocational competence has focused on second language (L2) learners, anyone can experience retrieval difficulties or have gaps in their collocational knowledge (Dąbrowska, 2019; Hoffmann & Lehmann, 2000). This is especially true in the case of academic or otherwise specialized subjects (Benson, 1989; Michta & Mroczyńska, 2022). There is also some evidence to suggest that a person’s previous exposure to specialized lexis can impact their collocation proficiency more than their first language (L1) or L2 status. For example, Frankenberg-Garcia (2018) found that L2 English PhD students and academics performed significantly better in a gapped sentence general academic English collocation task than L1 English undergraduates.
English collocation assistance has been available for some time from general English learners’ dictionaries, purpose-made collocation dictionaries, and other tools such as corpora. Additionally, some lexical resources focus specifically on academic English, like the Academic Collocation List developed by Ackermann and Chen (2013), which is appended to the Longman Collocations Dictionary and Thesaurus (Mayor, 2013), and the Oxford Learner’s Dictionary of Academic English (Lea, 2014), whose entries include academic English collocations.
However, the mere availability of a dictionary does not guarantee that the learner will remember to use it. A well-documented problem is dictionary underuse. For Lew (2004), consultation rates in dictionary use studies are “disappointingly low” (p. 48). Research into the effectiveness of dictionaries in providing information on collocations has produced mixed results. Somewhat pessimistic were the findings of Alonso Ramos’s (2008) study, where one group of participants performed worse with a dictionary than without one. Studies by Lew and Radłowska (2010), Laufer (2011), and Chen (2017) have produced more encouraging data, but have also shown that while consulting a dictionary may generally be helpful, users often struggle to find, interpret and apply the information they need.
Dictionary consultation is inherently disruptive, for users normally reach for a dictionary while engaged in other activities (Béjoint, 2010: 250). These may be complex in themselves, and dictionary consultation is likely to add to their complexity. The problem may be particularly acute in the case of cognitively demanding writing tasks (Hayes, 2012). Put differently, while dictionaries offer valuable help, there is a price to be paid for their use in terms of time and cognitive resources. Some writers may be aware of this trade-off and prioritize the flow of writing over language lookups. This is supported by East’s (2008) study, where participants claimed that dictionaries had interfered with their train of thought, slowed them down, and negatively affected their ability to think in the target language. The same issues apply to writers consulting corpora (Yoon, 2016).
A recent solution that can ease the cognitive strain of dictionary consultation comes from invisible lexicography, where “lexical data is used without users realizing they make use of a ‘dictionary’” (eLex, 2023). By its nature, invisible lexicography entails the integration of lexical data into some other tool so that linguistic information is provided unobtrusively.
ColloCaid (Frankenberg-Garcia, Lew et al., 2019) exemplifies this recent trend. It was developed to help writers with academic collocations while minimizing disruptions to writing or revision. With 32,655 collocation suggestions curated from academic English corpora (Frankenberg-Garcia, Rees & Lew, 2021), ColloCaid’s lexicographically vetted database is substantially larger than the academic collocation lists developed by Ackermann and Chen (2013) and Lei and Liu (2018). Moreover, the collocation suggestions offered have been integrated into a text editor to “bring dictionaries to writers” (Frankenberg-Garcia, 2020: 29). When activated, ColloCaid (Figure 1) underlines all words in the editor screen for which collocation suggestions are offered. Users can ignore the prompts and carry on writing or revising or, if they would like to know more about the collocations around a word, they can click on it to see collocates grouped by the syntactic pattern of the collocation (e.g. verb-noun, noun-verb, verb-preposition, etc.). If prompted further, the tool displays examples illustrating how collocations are used in context. Users may insert a collocation suggestion into their texts with just a click or adapt its form as required.

Figure 1. ColloCaid displaying collocation suggestions for the noun impact.
The ColloCaid prototype has been described as having “great potential” (Chen, Lai, Lee & Yang, 2023: 643) and as “a convincing example of how corpus data can be successfully employed to give rise to a new generation of technology-enhanced resources and learning tools” (Szudarski, 2023: 50). Preliminary evidence of users’ satisfaction with the tool already exists. In early-stage tool development (Frankenberg-Garcia, Rees et al., 2019), ColloCaid was rated between good and excellent on the System Usability Scale (Brooke, 2013). Rees (2021) used the NASA Task Load Index (Hart & Staveland, 1988) to compare the workload involved in looking up collocations for gapped sentences in ColloCaid against using a regular text editor plus any other lexical tool selected by the participants. ColloCaid performed better across all the workload dimensions, and users completed the task in less time with it.
Despite these promising results, the above studies rely solely on self-reported data. No study to date has looked into ColloCaid’s coverage (i.e. the words for which ColloCaid provides collocation suggestions) when applied to authentic pieces of academic writing, or into the effect of the actual revisions made with its assistance, or into users’ evaluation of the collocation suggestions given. With these considerations in mind, the present study aimed to address the following research questions:
• RQ1: What is the coverage of collocation suggestions provided by ColloCaid when applied to authentic English academic writing?
• RQ2: What collocations do writers change?
• RQ3: What motivates writers’ decisions to revise (or not) their use of collocations when presented with collocation suggestions from ColloCaid?
• RQ4: Do self-revisions with ColloCaid improve texts?
• RQ5: What are writers’ perceptions of ColloCaid after using it with their own authentic texts?
3. Methodology
This section describes the participants in the study, the data collection, and how the data were analysed.
3.1 Participants
The participants were 27 L2 English university students of English, with a mean age of 22.6 years. Most of them (25) spoke Polish as their L1, while two spoke Ukrainian. At the time this study was conducted, all were attending their final semester, writing a BA (18) or an MA (9) dissertation on linguistics, literary studies or cultural studies. Their English proficiency level was approximately C1, as indicated by scores in an English exam taken at the time of the experiment.
3.2 Procedure
Three types of data were used in the study. These were collected through a text revision task, self-reports on the actual revisions the participants made with ColloCaid (and on revisions they chose not to make), and semi-structured interviews with a sample of the participants.
3.2.1 Revision task
ColloCaid can be used when writing from scratch, or it can be activated to revise previously written drafts. Either way, writers do not have to leave their text editing screen to access collocation suggestions. In this study, to better control for variables such as text length, task time, and topic authenticity and relevance, the data were elicited using a revision task based on a real academic writing assignment rather than a researcher-led essay topic or a gapped-sentence exercise. The task took place in a computer lab. Participants were requested to bring the latest version of the dissertation drafts they were in the process of writing for their coursework (see Section 3.1). After a brief introduction, students watched a short video from the ColloCaid website demonstrating how the tool worked (see Footnote 1).
Participants then registered for a free ColloCaid account and were instructed to select a 550–650-word extract from their dissertations that had not yet been seen by their supervisors or received any kind of human feedback. Any quotations present in the excerpts were excluded from the word count. Participants saved their extracts in a separate file, logged into ColloCaid, and pasted their selected text into the ColloCaid window. They were then asked to revise their writing with the help of ColloCaid (i.e. by adding, replacing, deleting or ignoring collocation suggestions) and bold any word that prompted a revision. Changes other than those involving collocations were not allowed. Participants had 20 minutes to complete the task. Once they finished, they saved the revised extracts underneath their original versions to facilitate comparing the two.
3.2.2 Self-reports
The revision task was immediately followed by self-reports on the use of ColloCaid. Participants completed a questionnaire (Figure 2) for each word whose collocates they decided to revise. Additionally, participants were asked to fill in the same questionnaire for at least five words whose collocates they left unchanged. This minimum requirement of five, introduced after a pilot study, aimed to ensure that sufficient data about the participants’ reasons for not revising could be collected without overburdening them. The questionnaire, created in Google Forms, used conditional logic, so that a participant’s response to a question determined the next question asked.

Figure 2. Questionnaire used for revision self-reporting.
3.2.3 Interviews
One-to-one semi-structured interviews were conducted with a sample of 14 participants, deliberately selected to represent different levels of tool uptake, ranging from users who revised many collocations with the assistance of ColloCaid to users who followed up hardly any collocation suggestions. The interviews were held within a week of the revision task and lasted 10 minutes on average. Participants were asked to comment on three aspects: ease of use, perceived usefulness, and potential for future use or recommendation. Depending on the participants’ responses, follow-up questions were asked to obtain a fuller picture of their perceptions of working with ColloCaid. The interviews were audio-recorded unless a participant withheld consent, in which case thorough notes were taken.
3.3 Data analysis
The original pre-revision dissertation excerpts that the participants selected for the study were compiled into a corpus on Sketch Engine (Kilgarriff et al., 2014) to assess ColloCaid’s coverage against authentic academic writing (RQ1).
The self-report questionnaire data were used to address RQ2 (which collocations the participants revised) and RQ3 (their motivations behind the revisions). To ensure the completeness of our data, 61 instances of unreported revisions were manually added to the spreadsheet. In these cases, the reasons for the revisions could not be determined (see Figure 2). Less critical problems such as misspelled words were also corrected.
To investigate the impact of individual revisions on the dissertation excerpts (RQ4), the revision taxonomy developed by Frankenberg-Garcia (1990) was used. It distinguishes the following effects of revisions:
• positive (improves readability)
• negative (hinders readability)
• indeterminate (insufficient context to assess effect)
• unnecessary (an effective form is replaced with an equally effective one)
• ineffective (an infelicitous form is replaced with an equally infelicitous one)
• consequential (the revision is a consequence of adjacent changes in the text).
Two L1 English coders were asked to classify the revisions using this taxonomy. Both were experienced proofreaders and teachers of academic English. The two coders reached “almost perfect” agreement as per Landis and Koch (1977: 165), with Cohen’s κ = 0.82. A third rater was called in to arbitrate the small percentage (11.6%) of discrepant judgements without seeing the previous ratings. In every such case, the third rater’s judgement matched that of one of the two original coders, and it is on these agreed ratings that the results presented in Section 4.4 are based.
To address RQ5 (participants’ perceptions), the audio recordings collected during the interviews were transcribed verbatim. These transcripts along with the notes taken during the interviews were saved into a text file. Following Strauss and Corbin (1990), the first author identified, examined and categorised relevant units of analysis (open coding), then related categories to one another (axial coding), and finally refined them (selective coding). The coding involved several readings of the transcripts, with notes being taken to track changes and ensure consistency.
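To make the agreement statistic concrete, the following minimal sketch computes Cohen’s κ for two raters from first principles. The function name `cohens_kappa` and the toy ratings are assumptions introduced for illustration only; they are not the study’s data, and this is not the script the authors used.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Return Cohen's kappa for two equally long sequences of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal share, summed over labels.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical ratings of eight revisions (NOT the study's data).
    rater_1 = ["positive", "positive", "negative", "unnecessary",
               "indeterminate", "positive", "negative", "positive"]
    rater_2 = ["positive", "positive", "negative", "positive",
               "indeterminate", "positive", "indeterminate", "positive"]
    print(round(cohens_kappa(rater_1, rater_2), 2))
```

Because κ discounts the agreement expected by chance, it is lower than the raw proportion of matching labels whenever chance agreement is above zero.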
4. Results
This section examines ColloCaid’s coverage (RQ1), investigates the type of revisions made by the participants (RQ2), analyses the motivations behind those revisions (RQ3), and assesses their impact on the drafts (RQ4). Finally, it also reports on the participants’ experience of working with ColloCaid (RQ5).
4.1 Coverage
The corpus of original dissertation excerpts collected in the study consisted of 16,289 running words. On average, a ColloCaid suggestion was available for every 51.5 words or every 1.8 sentences. Lemmatising the corpus yielded 2,706 unique lemmas, of which 368 (13.6%) were featured in ColloCaid. Most of the remaining 2,338 lemmas with no collocation suggestions were articles, prepositions, conjunctions, proper nouns, and non-English, non-academic, and subject-specific words that had been excluded from ColloCaid by design, either because they were not collocationally productive (it is unlikely that writers will initiate a collocation query from a closed-class word like a preposition or article), or because they were not present in general academic English vocabulary lists (see Footnote 2).
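The lemma-level coverage figure comes down to simple set arithmetic, which the minimal sketch below illustrates. It is not the procedure run in Sketch Engine: the function names and the toy lemma lists are hypothetical, and only the totals 368 and 2,706 are taken from the text.

```python
def lemma_coverage(text_lemmas, covered_lemmas):
    """Return (covered, unique, share) for the unique lemmas found in a text.

    text_lemmas: iterable of lemmas as they occur in the corpus (repeats allowed).
    covered_lemmas: iterable of lemmas for which the tool offers suggestions.
    """
    unique = set(text_lemmas)
    covered = unique & set(covered_lemmas)
    share = len(covered) / len(unique) if unique else 0.0
    return len(covered), len(unique), share


def as_percentage(part, whole):
    """Express part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    # Toy example with hypothetical lemmas (NOT the study's corpus).
    corpus = ["impact", "the", "impact", "vowel", "research"]
    database = {"impact", "research", "evidence"}
    print(lemma_coverage(corpus, database))

    # Totals reported in the text: 368 covered lemmas out of 2,706 unique lemmas.
    print(as_percentage(368, 2706))
```

Counting unique lemmas rather than running words means that a frequent lemma such as a closed-class word weighs no more than a lemma that occurs once.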
What is relevant here is noun, verb, and adjective lemma coverage, since these are the basic part-of-speech categories from which collocation queries arise (e.g. what verbs can be used with a given noun, what adverbs can be used with a specific adjective, and so on). Table 1 summarizes ColloCaid’s coverage of the nouns, verbs, and adjectives present in the student drafts, excluding proper nouns and non-English words. As is natural in word distributions, around half of the lemmas in the corpus occurred only once (compare Columns 2 and 4 in Table 1). It can be seen that the coverage of lemmas with a frequency of two and above (Column 5) was much better than the coverage rate that included lemmas occurring just once in the corpus (Column 3).
Table 1. ColloCaid’s coverage: Collocation suggestions for authentic academic English texts

Closer inspection of the noun, verb, and adjective lemmas for which there were no collocation suggestions revealed nouns such as vowel, utterance, and collocation – all very clear examples of linguistic terms excluded from the ColloCaid database because they were subject-specific. A few general English words, such as child, artist, and people, were not contemplated because they were not academic. However, the student dissertation corpus also drew attention to nouns such as user, order, and manner, which would be valid additions to future versions of the tool.
The verbs in the corpus that were not covered by ColloCaid consisted of delexicalized verbs, such as be, have, do, make, and take, general English vocabulary, such as kiss, love, and talk, and subject-specific verbs, such as pronounce, collocate, and devoice. Again, all three groups had been intentionally excluded from ColloCaid. However, verbs like convey, perceive, and claim would be worthy additions to ColloCaid.
Adjectives had the lowest coverage in ColloCaid. Many of those left out were non-gradable adjectives such as main, such, English, and female, which are normally used on their own, rather than as a component of an adjective-adverb collocation. Additionally, many general English adjectives such as good, great, and young, and a few subject-specific adjectives like lexical, literary, and linguistic, were not contemplated in ColloCaid. All these categories were outside ColloCaid’s scope. However, modifiable adjectives used in general academic texts such as able, large, possible, and accurate were not covered but would be good candidates for collocation suggestions.
4.2 Collocation revision
Participants made a total of 199 revisions, which corresponds to an average of 7.37 revisions per participant. Figure 3 depicts the distribution of these revisions among participants. As shown, the number of revisions per participant varied widely, from 1 to 31. The reasons for these discrepancies will become clearer from the interviews in Section 4.5.

Figure 3. Number of revisions per participant.
As previously explained in Section 3.2.1, participants could revise collocations by adding a collocate, replacing an existing one with a new one, or deleting one. Most revisions were additions (83%). Replacements occurred less frequently and accounted for the remaining 17%. No collocates were deleted.
Figure 4 displays the revisions sorted according to collocation type. As shown, there was a clear preference for verb-adverb revisions (33%). Adjective-preposition and noun-verb combinations were revised less frequently (2%, in both cases).

Figure 4. Breakdown of revisions according to collocation type.
4.3 Reasons for revisions
The participants stated that the majority of their changes (66.8%) were meant to improve their writing rather than correct errors. Only 2.5% of the revisions were intended to address errors. The remaining 30.7% were those that the participants failed to record (see Section 3.3), so the motivations behind them could not be captured.
Participants also reported on 168 instances where they decided not to revise collocations. The reasons most often given were that no additional collocates were needed (48.2%) or that the original collocate was already appropriate (37.5%). To a smaller extent, they also attributed their behaviour to the inability to find a suitable collocate (11.3%). In the remaining 3% of these instances, participants claimed that the collocates suggested by ColloCaid did not work in their texts.
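The percentages in this section can be checked against the reported totals of 199 revisions and 168 non-revisions. In the sketch below, only those two totals and the 61 unreported revisions come from the text; every other count is an assumption, back-calculated as the whole number that reproduces the published percentage, and the function name `percentage_shares` is hypothetical.

```python
def percentage_shares(counts):
    """Map each label to its share of the total, as a percentage to one decimal."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("At least one count must be non-zero.")
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# ASSUMPTION: 133 and 5 are back-calculated from 66.8% and 2.5% of 199 revisions;
# 61 (unreported revisions) is the only count stated in the text.
REVISION_MOTIVES = {
    "improve writing": 133,
    "correct error": 5,
    "not reported": 61,
}

# ASSUMPTION: all four counts are back-calculated from the percentages of the
# 168 reported non-revisions; none of them is given in the text.
NON_REVISION_REASONS = {
    "no additional collocate needed": 81,
    "original collocate appropriate": 63,
    "no suitable collocate found": 19,
    "suggestions did not fit the text": 5,
}

if __name__ == "__main__":
    for name, counts in (("revisions", REVISION_MOTIVES),
                         ("non-revisions", NON_REVISION_REASONS)):
        print(name, sum(counts.values()), percentage_shares(counts))
```

Under these assumed counts, the four reasons for not revising add up to 168, which is consistent with reading the final 3% as a share of non-revisions rather than of revisions.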
4.4 Effect of revisions
Table 2 presents the ratings from each coder on the effect of the revisions made with ColloCaid along with the final ratings determined after resolving any discrepancies between them. Figure 5, in turn, highlights individual differences behind those figures.
Table 2. Overall effect of revisions


Figure 5. Effect of revisions per participant.
It can be seen that 54.8% of the revisions were positive, and for all but two (of 27) participants, the number of positive revisions (in green) was greater than the negative ones (in red). The vast majority of positive changes involved adding a collocate, as illustrated by the following revisions: contribution → unique contribution (#7), concept → key concept (#14), explore → thoroughly explore (#6), incorporate → effectively incorporate (#19), and available → readily available (#27), which seem to enhance the perception of fluency. In some instances, a collocate in the draft was replaced by a more precise suggestion. Examples include revising sad reality to harsh reality (#9) as well as important features to distinctive features (#26). Although both sad and important are valid collocates, they seem more typical of general than academic English. Replacing them with harsh and distinctive makes the text more recognizable as academic prose. In rare cases, the participants corrected mistakes in their original drafts. For example, Participant 15 changed *their need of communication to their need for communication. Similarly, Participant 27 changed *apply references to make references.
Negative revisions constituted 15.6% of all revisions (i.e. fewer than a third of the positive changes), and when looking at individual differences, only Participant 20 made more negative than positive changes, while Participant 21 made the same number of each. Some of the negative revisions can be explained by a participant’s misinterpreting the information provided by ColloCaid. For example, Participant 7 replaced to demand the resources with *to demand for the resources. Although ColloCaid distinguishes between demand (verb) and demand (noun), the participant seems to have misread this information and chosen a preposition that goes with the noun. Similarly, Participant 21 revised the reasons varied to *the mentioned reasons varied. The resulting error could be attributable to interference from Polish, which often places the participle before the noun. However, ColloCaid presents the correct order in reasons mentioned (not *mentioned reasons). A few negative revisions had little to do with failing to capitalize on the information provided by ColloCaid. For example, in All the factors mentioned above may be obstacles to mastering undoubtedly correct consonantal pronunciation (#5), the addition of undoubtedly before correct adds an evaluation that clashes with the meaning of the sentence. In another instance, Participant 14 failed to remove often when inserting commonly in *are often commonly referred to as translation.
A further 8% of the revisions were classified as unnecessary. These were cases where changes did not make the text either better or worse. Examples include revising based on to based upon (#2), findings to research findings (#24), and discipline to scientific discipline (#23).
Classifying a revision sometimes required additional context or subject-specific knowledge unavailable to the raters. Such cases were classified as indeterminate and made up 21.6% of all revisions. For example, Participant 21 added the word sole, changing the purpose to the sole purpose. Linguistically, sole fits well in this context. However, the extra meaning resulting from the insertion could not be judged from a linguistic perspective alone. Similarly, the change from The description […] present in to The brief description […] present in (#15) required additional context and a certain amount of subjectivity for it to be rated.
It is worth noting that none of the revisions were classed as ineffective. Lastly, given that the participants were explicitly told not to revise anything else but collocations, little room was left for changes classified as consequential, and indeed none were found.
4.5 Perceptions of revising with ColloCaid
As described in Section 3.2.3, a sample of participants who made many revisions and participants who made very few revisions with ColloCaid were interviewed. Both minimal and extensive users gave overwhelmingly positive feedback about the tool. Without exception, all interviewees said the tool’s user-friendliness was an obvious strength: “It was my first time using a tool like this, but doing so was straightforward” (#23). When asked to elaborate, Participant 27 said it was intuitive because it was similar to other programs: “Using it did not feel like learning something from scratch.”
Although the participants’ opinion regarding ColloCaid’s usefulness was based on the revision task used to elicit data for the experiment, some participants commented on how ColloCaid might help them in their future writing. Of the 14 participants interviewed, 10 (71%) declared they might use it again. The remaining four did not anticipate needing the tool in the future. They had nearly finished their dissertations and did not intend to engage in academic writing after graduation. One participant (#14) was satisfied with their existing workflow and did not want to change it. Additionally, 12 participants (86%) would recommend ColloCaid to their peers.
Generally, the participants acknowledged ColloCaid’s contribution to improving the quality of their writing: “It was very helpful because my work sounds more formal and fluent, even somewhat more native-like” (#20). ColloCaid’s ability to act as a memory aid was frequently appreciated: “It showed me some collocations that I did not remember at the time. But seeing them in ColloCaid, I could incorporate them into my work” (#11); “It’s not that I don’t know the right collocation; it’s just that it doesn’t come to me when I need it. That’s when I may turn to ColloCaid” (#13). Participants 2 and 20 said the examples strengthened their trust in the tool and helped them discover more about how collocations are used.
Other uses mentioned by participants included replacing correct collocations with ones that sounded better: “Even when what I had was already fine, the suggestions helped improve my text” (#13). Some felt reassured in their original choices when they were able to find them in ColloCaid. Participant 27 used ColloCaid “to look for confirmation rather than new information”, as they had already worked extensively on polishing their original draft using a range of language resources.
Participants also drew comparisons with other tools, especially dictionaries and search engines: “ColloCaid gathers all possible collocations in one place. Searching manually in search engines or dictionaries is neither quick nor easy. Here with one click everything I need is at my fingertips” (#20); “It stops the user from going on an internet hunt for the right collocation” (#22); “Using a dictionary takes a while. So just by using it, they [writers] would be giving away time for writing” (#14); “I don’t have to go to other websites to type what I need and spend time looking for collocations, copying them, pasting them, etc. So, in this regard, it offers a very big advantage” (#27). One participant made an indirect comparison of ColloCaid with AI tools: “There are many automated systems and artificial pseudo-intelligence [tools] intended to replace humans, and I don’t like such programs. If someone reads articles or books (…), they will know these collocations. If someone doesn’t use them, it means they aren’t very advanced yet, which isn’t necessarily a bad thing; it just reflects their level. It’s more natural if a text comes from a person and not from a machine” (#14). Participant 26 went further, remarking that ColloCaid gave the user a sense of agency: “It [ColloCaid] doesn’t automatically correct your (…) text. You still have the opportunity to review those suggestions and make a decision.”
As some of the above quotes suggest, the participants were aware of the trade-off between the cost of consulting a resource and its benefit. This is further highlighted by Participant 26, who appreciated how ColloCaid could help writers carry on writing without interruption, thus offsetting the burden of looking for collocations: “While writing, you don’t want to stop to look for a collocation, because you will usually get distracted. (…). Here, you don’t lose this flow of writing. You just write” (#26).
Participants also reported not using ColloCaid indiscriminately: “I didn’t feel the need to force in a collocation, but sometimes I know the text isn’t cohesive or interesting, and to improve it a little, I used it [ColloCaid]” (#20).
Although most of the interview feedback was positive, there were also suggestions for improvement. Several participants mentioned the lack of integration with the text processors they were already familiar with, especially with Microsoft Word and Google Docs. They felt that this change would make using the tool more convenient and allow them to keep everything in one place. Participant 2 expressed uncertainty as to how they should interpret the absence of collocations from ColloCaid: “I wasn’t sure if the word I wrote was wrong because it wasn’t on the list. I wrote ‘of much importance’, but that wasn’t on the collocation list.” Participant 18 suggested that ColloCaid should highlight any collocations detected that were present in its database to indicate they were likely to be correct, so that users did not have to interact further with the tool to check.
5. Discussion
The present study investigated the effectiveness of ColloCaid in assisting academic writers with collocations. By triangulating data from original drafts, revised texts, self-reports, external coder assessment and semi-structured interviews, it was possible to take a closer look at ColloCaid’s perceived and actual effectiveness.
The study showed that ColloCaid’s coverage of university student writing was very good, especially for collocationally productive words (nouns, verbs, and adjectives) typical of general academic English, leaving aside the subject-specific and general English words that were deliberately excluded from ColloCaid’s scope.
However, as seen in Section 4.1, the analysis of coverage brought to light a few general academic English lemmas that could be usefully added to the database. Words such as order (noun), able (adjective), or claim (verb) generate typical academic collocations. As pointed out in Frankenberg-Garcia et al. (Reference Frankenberg-Garcia, Rees and Lew2021), there is a surprising amount of variation in general academic English word lists because they “are heavily dependent on the corpora and extraction methods used to compile them” (p. 226). Testing the coverage of such lists against authentic pieces of writing as in Durrant (Reference Durrant2016) and in the present study can help us come to a better understanding of how they can be improved. It is nevertheless encouraging to see that ColloCaid can offer academic writers assistance for a large number of academic words that they indeed use. Moreover, while the purpose of the study was not to compare ColloCaid’s coverage with that of other academic English collocation lists, we assume that those by Ackermann and Chen (Reference Ackermann and Chen2013) and Lei and Liu (Reference Lei and Liu2018) would offer far less coverage, simply because they are considerably smaller. This matters because repeatedly failing to find answers in a dictionary or similar resource may lead users to give up on it.
Participants showed a substantial level of engagement with ColloCaid, as evidenced by the collocation changes prompted by the tool. The fact that these were existing drafts rather than texts written from scratch is important, as the level of revision for initial texts versus more mature versions developed over time and with the aid of reference tools can vary quite substantially. In this study, the interview data showed that differences in the levels of uptake of ColloCaid could be partly explained by the extent to which the drafts had already undergone some revision. In future measures of uptake, it would be worth controlling for this variable. However, individual differences in writing processes may also play an important part in the uptake of the collocation suggestions offered: some writers will take more time to perfect word combinations, while others will focus more on writing flow.
Revisions fell into two categories: additions and replacements. Additions typically involved adding nuance to an existing text, while replacements demonstrated participants’ ability to critically reflect on their initial choices. Interestingly, the most common collocation types revised by participants involved syntactically optional adverbial and adjectival collocates. This may be partly because the data elicited were based on pre-existing texts, where core meanings conveyed by noun-verb and verb-noun collocations had already been construed during the initial drafting stage. A hypothesis that remains to be tested is whether there might be a greater uptake of noun-verb and verb-noun collocation suggestions when using the tool for writing from scratch. The syntactically optional collocates facilitated by ColloCaid during revision nevertheless enabled the participants to improve their texts. Considering that collocations featuring adverbs are a notable source of difficulty for writers (Granger, Reference Granger and Cowie1998; Hasselgård, Reference Hasselgård2015), our findings demonstrate the potential of a tool like ColloCaid to enhance fluency and academic style.
When undertaking revisions, the participants’ primary motivation was to improve the quality of their writing rather than simply correct mistakes. Indeed, unlike the grammar checkers integrated into editors such as MS Word, or standalone tools such as Grammarly, ColloCaid seeks to improve idiomaticity and fluency rather than identify miscollocations. The fact that some drafts had already been spell- and grammar-checked, and that the participants were advanced-level students whose texts did not exhibit many errors, further explains why ColloCaid was used for text refinement more than for text correction.
It is noteworthy that almost a third of the revisions went unreported when the participants filled in the self-reports. Given the amount of underreporting observed, we cannot overemphasize the need for checking mechanisms when collecting data through structured written protocols like the one used in this study. In the present case, the seemingly simple task of bolding revisions may have been overlooked simply because actual revision was also in progress. The struggle to maintain focus while juggling a range of tasks and processes is precisely why efforts to minimize writers’ cognitive load are worthwhile, which is indeed one of the driving principles of invisible lexicography (eLex, 2023).
It is also significant that the participants chose not to make any collocation changes for some of the words highlighted by ColloCaid. This suggests that the participants were selective in their engagement with the tool, and frequently had the discernment not to overuse it. Note that although there might have been cases where participants did not make revisions because ColloCaid did not provide the information needed, the number of such cases reported was negligible. Still, it would be worth investigating whether less proficient users of English engage less critically with the tool, which is precisely what Zomer (Reference Zomer2023) noted in experiments with an AI-powered writing assistant.
In terms of effectiveness, the study shows that ColloCaid generally helped writers improve their texts. Positive revisions by far outnumbered the negative ones, indicating a net improvement in fluency and readability. In a few cases, the revisions undertaken involved correcting mistakes, suggesting that users were able to notice problems in their texts, despite ColloCaid not being designed to be an error-correction tool. Further evidence of ColloCaid’s positive impact can be seen in many positive replacements of more generic collocates with stronger ones, helping the participants achieve a higher level of idiomaticity and produce a more nuanced text. The coder judgements in this study align with the findings of Naismith and Juffs (Reference Naismith and Juffs2025) regarding the effect of collocation on writing ratings. Our data also provide some evidence that ColloCaid helped the participants let go of “lexical teddy bears” (Hasselgren, Reference Hasselgren1994) and replace them with more contextually appropriate alternatives. As noted during the interviews, ColloCaid served as a memory aid. It helped the participants use collocates they recognized but failed to recall, turning passive vocabulary into active vocabulary.
The relatively high percentage of indeterminate revisions observed in the study reflects the difficulty in evaluating a writer’s use of collocations in experimental setups without access to the writer’s intentions. Determining whether a revised collocation improves the text depends not only on whether a given combination of words sounds fluent but also on the communicative presumption underpinning it.
Although a few negative revisions were identified, this was only to be expected. There is an abundance of evidence that users of lexical resources do not always use them well or to their full potential (e.g. Laufer, Reference Laufer2011), and there is no reason why ColloCaid should be any different, especially if writers are using it for the first time.
Even though only a few revisions were deemed unnecessary, it must be acknowledged that their very presence suggests that a tool like ColloCaid may occasionally challenge a writer’s confidence in their initial solutions. Overall, the fact that the positive changes by far exceeded the negative ones (109 vs. 31) and that the collocation prompts that were deliberately ignored outnumbered the ones that resulted in unnecessary changes (168 vs. 16) suggests that the tool does more good than harm. Moreover, as with any dictionary or writing assistance tool, it would not be surprising if effectiveness improved with practice.
The individual performances reported in Figure 5 suggest that most participants’ results were similar to those of the cohort as a whole, with the exception of two outliers whose overall revision was not positive. Moreover, the semi-structured interviews with different profiles of participants served to gain a better understanding of the reasons for individual variability.
The interviews resulted in overwhelmingly positive feedback. The fact that the text editor integration of ColloCaid was similar to software they already used was one of the features the participants particularly appreciated. Thus, rather than expecting users to familiarize themselves with a radically different tool, it seems beneficial to design it in a way that ensures that users can build on their existing skills. Participants viewed ColloCaid as an effective memory aid, fit for the purpose for which it was developed. They also recognized its ability to mitigate distractions and lower cognitive load, thus saving time. This recognition nicely dovetails with the earlier findings by Rees (Reference Rees2021) regarding the lowered cognitive load and time-saving aspects of working with a writing assistant compared with tools that need to be consulted separately. The present study extended Rees’s findings in that it demonstrated that participants not only notice the cognitive advantages conferred by ColloCaid but also attach importance to them and actively refer to them. This indicates that they are aware of the trade-off inherent in using separate dictionaries and dictionary-like tools and may opt for a tool that offers integrated solutions.
Participants also identified several areas for improvement. As in the beta testing phase of ColloCaid, participants suggested the tool should be integrated with the text editors they used. Another suggestion was for ColloCaid to automatically signal whether a collocation used in a text was included in the database, so that users did not have to interact with the tool just to confirm the word combination used was indeed a collocation. On the whole, participants valued the sense of agency provided by a writing assistant offering lexical suggestions. They also saw working with ColloCaid as an opportunity to learn and expressed a sense of trust towards the tool’s lexicographically curated recommendations, especially since they could access example sentences displaying the target collocations.
One key takeaway for future developers of similar tools is that despite the hype surrounding AI, some writers might prefer tools edited by experts precisely because such tools reflect human expertise and impart a sense of agency. Although our study did not set out to compare ColloCaid with AI-based solutions, our interview data indicate that some writers are concerned about a potential pitfall: users may overrely on AI tools rather than learn from them – a phenomenon observed by Darvishi, Khosravi, Sadiq, Gašević and Siemens (Reference Darvishi, Khosravi, Sadiq, Gašević and Siemens2024).
Another important implication is that trust should not be overlooked as a factor that affects participants’ perceptions of and consequent engagement with new tools. While users’ trust has already been investigated in the context of automated feedback (Ranalli, Reference Ranalli2021), its role in lexicography remains underexplored. Dictionaries have long been seen as reliable sources of information on language. However, given the growing number of new AI tools, some of which compete with traditional dictionaries for users’ attention, trust is bound to play a key role in future user preferences.
6. Conclusion
This study provides a reminder of the challenging nature of collocations. Rather than assuming that soon-to-graduate students’ language journey is complete, it is important to raise their awareness of collocations, sensitize them to their knowledge gaps, and equip them with the knowledge of reliable tools that they can turn to for help.
On another level, the study has provided evidence of ColloCaid’s potential and actual effectiveness in assisting academic writers with collocations. Specifically, it showed that ColloCaid offers a suitable degree of lexical coverage for student academic writing and that students can use the collocation suggestions offered for text improvement. Importantly, the students appreciated ColloCaid’s impact not only on their texts but also on the revision process.
By extension, the study has shown that efforts to integrate corpus-based language resources within the users’ working environment to minimize distractions are worthwhile. This approach, critical for invisible lexicography, has only recently begun to be translated into new language tools. As predicted by Szudarski (Reference Szudarski, Jablonkai and Csomay2023), “this kind of cutting-edge research is likely to grow and make increased use of new technological developments” (p. 50).
In the future, it would be worth exploring how ColloCaid performs across a broader spectrum of users and contexts, beyond L2 English students revising dissertation drafts. This could include testing the tool with novice L1 English academic writers, L2 writers at other levels of academic experience (PhD students, lecturers, and professors), and writers from different disciplines (beyond linguistics, literature, and culture studies). Importantly, this study needs to be followed up with writers using ColloCaid to write from scratch rather than just revising pre-existing drafts.
Data availability statement
Data are available from the corresponding author upon reasonable request.
Authorship contribution statement
Tomasz Michta: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing; Ana Frankenberg-Garcia: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing.
Funding disclosure statement
The ColloCaid project was funded by the UK Arts and Humanities Research Council (AH/P003508/1).
Competing interests statement
The authors declare no competing interests.
Ethical statement
The data collected in this study were anonymized. Permission to conduct the investigation was obtained from the Dean of the Faculty of Philology at the University of Bialystok, Poland. All participants were volunteers who gave consent to take part in the study. They were all informed that they could withdraw from the study at any point.
GenAI use disclosure statement
The authors declare no use of generative AI.
About the authors
Tomasz Michta is an assistant professor at the University of Bialystok, Poland. His research interests include lexicography and the use of AI in language education.
Ana Frankenberg-Garcia is a visiting professor at the University of Surrey, UK, and an independent consultant. Her research focuses on the intersection of corpus linguistics with lexicography, academic writing and translation.