Until recently, the analysis of spoken corpora has relied almost exclusively on text-based concordancers. This article discusses research being carried out on Padua University's Multimedia English Corpus (Padova MEC) using the multimodal concordancer MCA (Multimodal Corpus Authoring System, Baldry, 2005). This highly innovative concordancer retrieves parts of video and audio from a tagged corpus, giving access to examples of language in context and thereby providing non-verbal information about the environment, the participants and their moods, details that can be gleaned from a combination of word, sound, image and movement. This is useful to language learners at all levels because, if “communication is to be successful, a relevant context has to be constructed by the discourse participants” (Braun, 2005: 52). In other words, transcripts alone are not sufficient if learners are to acquire anything like participant knowledge and fully comprehend spoken language. The article demonstrates how MCA is used to retrieve the language functions expressed in the multimedia corpus of spoken English. Online learning materials based on these multimodal concordances take into consideration not only language but also the way in which it co-patterns with other semiotic resources, thereby highlighting the importance of learner awareness of the multimodal nature of communication.