Cultural evolution depends on both innovation (the creation of new cultural variants by accident or design) and high-fidelity transmission (which preserves our accumulated knowledge and allows the storage of normative conventions). What is required is an overarching theory encompassing both dimensions, specifying the psychological motivations and mechanisms involved. The bifocal stance theory (BST) of cultural evolution proposes that the co-existence of innovative change and stable tradition results from our ability to adopt different motivational stances flexibly during social learning and transmission. We argue that the ways in which instrumental and ritual stances are adopted in cultural transmission influence the nature and degree of copying fidelity and thus also patterns of cultural spread and stability at a population level over time. BST creates a unifying framework for interpreting the findings of otherwise seemingly disparate areas of enquiry, including social learning, cumulative culture, overimitation, and ritual performance. We discuss the implications of BST for competing by-product accounts which assume that faithful copying is merely a side-effect of instrumental learning and action parsing. We also set out a novel “cultural action framework” bringing to light aspects of social learning that have been relatively neglected by behavioural ecologists and evolutionary psychologists and establishing a roadmap for future research on this topic. The BST framework sheds new light on the cognitive underpinnings of cumulative cultural change, selection, and spread within an encompassing evolutionary framework.