Accurate mortality forecasting is crucial for actuarial pricing, reserving, and capital planning, yet the traditional Lee-Carter model struggles with non-linear age and cohort patterns, coherent multi-population forecasting, and the quantification of prediction uncertainty. Recent advances in deep learning provide a range of tools that can address these limitations, but actuarial surveys have not kept pace. This paper provides the first concise overview of deep learning in mortality forecasting. We cover six deep network architectures: Recurrent Neural Networks, Convolutional Neural Networks, Transformers, Autoencoders, Locally Connected Networks, and Multi-Task Feed-Forward Networks. We discuss how these architectures tackle cohort effects, population coherence, interpretability, and uncertainty in mortality forecasting. Evidence from the literature shows that carefully calibrated deep learning models can consistently outperform Lee-Carter baselines; however, no single architecture resolves every challenge, and open issues remain around data scarcity, interpretability, uncertainty quantification, and keeping pace with advances in deep learning. This review is also intended to provide actuaries with a practical roadmap for adopting deep learning models in mortality forecasting.