Previous research on audiovisual input attests to significant effects of on-screen text and proficiency on learning gains. However, little research has examined whether these factors affect viewers’ feeling of learning, a variable that can influence overall second language (L2) learning outcomes (Ellis, 2008). Moreover, few studies have explored whether viewing experience prompts viewers to switch from one viewing mode (subtitles, captions, no on-screen text) to another, and what factors shape those choices. This study explores learners’ perspectives on learning from audiovisual input and their preferred viewing mode before and after participating in a prolonged viewing intervention. A total of 136 participants of varying L2 English proficiency levels (from A1 to C2) completed pre-viewing and post-viewing questionnaires. The results show that vocabulary and expressions were the areas in which participants perceived the most learning. The elementary-proficiency group were more likely to be positive about learning from the intervention than higher-proficiency students. Concerning the preferred viewing mode outside the classroom, the participants favoured no on-screen text or first language (L1) subtitles over L2 captions. At the end of the intervention, the elementary-level participants found that viewing without any L1 support was too challenging for leisure viewing, while the intermediate- and advanced-level students gained confidence in watching without any textual support.