Atmospheric models used for weather and climate prediction are traditionally formulated in a deterministic manner. In other words, given a particular state of the resolved-scale variables, the most likely forcing from the subgrid-scale processes is estimated and used to predict the evolution of the large-scale flow. However, the lack of scale separation in the atmosphere means that this approach is a large source of error in forecasts. Over recent years, an alternative paradigm has developed: the use of stochastic techniques to characterize uncertainty in small-scale processes. These techniques are now widely used across weather, subseasonal, seasonal, and climate timescales. In parallel, significant progress has been made in replacing parametrization schemes with machine learning (ML), which has the potential to both speed up and improve our numerical models. However, the focus to date has largely been on deterministic approaches. In this position paper, we bring together these two key developments and discuss the potential of data-driven approaches to stochastic parametrization. We highlight early studies in this area and draw attention to the novel challenges that remain.