We consider experimentally an initially quiescent and linearly stratified fluid with buoyancy frequency $N_{Q}$ in a cylinder subject to surface-stress forcing from a disc of radius $R$ spinning at a constant angular velocity ${\rm\Omega}$. We observe the growth of the disc-adjacent turbulent mixed layer bounded by a sharp primary interface with a constant characteristic thickness $l_{I}$. To a good approximation the depth of the forced mixed layer scales as $h_{F}/R\sim (N_{Q}/{\rm\Omega})^{-2/3}({\rm\Omega}t)^{2/9}$. Generalising the previous arguments and observations of Shravat et al. (J. Fluid Mech., vol. 691, 2012, pp. 498–517), we show that such a deepening rate is consistent with three central assumptions that allow us to develop a phenomenological energy balance model for the entrainment dynamics. First, the total kinetic energy of the deepening mixed layer $\mathscr{E}_{KF}\propto h_{F}u_{F}^{2}$, where $u_{F}$ is a characteristic velocity scale of the turbulent motions within the forced layer, is essentially independent of time and the buoyancy frequency $N_{Q}$. Second, the scaled entrainment parameter $E={\dot{h}}_{F}/u_{F}$ depends only on the local interfacial Richardson number $Ri_{I}=(N_{Q}^{2}h_{F}l_{I})/(2u_{F}^{2})$. Third, the potential energy increase (due to entrainment, mixing and homogenisation throughout the deepening mixed layer) is driven by the local energy input at the interface, and hence is proportional to the third power of the characteristic velocity $u_{F}$. We establish that internal consistency between these assumptions implies that the rate of increase of the potential energy (and hence the local mass flux across the primary interface) decreases with $Ri_{I}$. This observation suggests, as originally argued by Phillips (Deep-Sea Res., vol. 19, 1972, pp. 79–81), that the mixing in the vicinity of the primary interface leads to the spontaneous appearance of secondary partially mixed layers, and we observe experimentally such secondary layers below the primary interface.
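
The consistency between the observed deepening law and the first two assumptions can be illustrated with a short scaling sketch. The constants $c$ and $K$ below are symbols introduced here for illustration only: $c$ is an assumed dimensionless prefactor in the deepening law, and $K$ is the assumed constant value of $h_{F}u_{F}^{2}$ implied by the first assumption. The steps are a reconstruction from the stated scalings, not necessarily the derivation used in the paper.

$$
\begin{aligned}
h_{F} &= cR\left(\frac{N_{Q}}{\Omega}\right)^{-2/3}(\Omega t)^{2/9}
\quad\Longrightarrow\quad
\dot{h}_{F}=\frac{2}{9}\,\frac{h_{F}}{t}
=\frac{2}{9}\,(cR)^{9/2}\,\Omega^{4}\,N_{Q}^{-3}\,h_{F}^{-7/2},\\[4pt]
h_{F}u_{F}^{2} &= K
\quad\Longrightarrow\quad
u_{F}=K^{1/2}h_{F}^{-1/2},
\qquad
Ri_{I}=\frac{N_{Q}^{2}h_{F}l_{I}}{2u_{F}^{2}}=\frac{l_{I}}{2K}\,(N_{Q}h_{F})^{2},\\[4pt]
E &= \frac{\dot{h}_{F}}{u_{F}}
=\frac{2}{9}\,(cR)^{9/2}\,\Omega^{4}\,K^{-1/2}\,(N_{Q}h_{F})^{-3}
=\frac{2}{9}\,(cR)^{9/2}\,\Omega^{4}\,K^{-1/2}\left(\frac{l_{I}}{2K}\right)^{3/2}Ri_{I}^{-3/2}.
\end{aligned}
$$

With $l_{I}$, $K$, $R$ and $\Omega$ all fixed, $N_{Q}$ and $t$ enter $E$ only through $Ri_{I}$, so under these assumptions $E\propto Ri_{I}^{-3/2}$, in line with the second assumption. If, as a further assumption of this sketch, the buoyancy flux across the primary interface is taken to scale as $(N_{Q}^{2}h_{F}/2)\,\dot{h}_{F}$ and is normalised by $u_{F}^{3}/l_{I}$, the scaled flux is $Ri_{I}E\propto Ri_{I}^{-1/2}$, which decreases with $Ri_{I}$.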