Deep Learning in Quantitative Trading

Zihao Zhang; Stefan Zohren

doi:10.1017/9781009707091

Series: Elements in Quantitative Finance

Published online by Cambridge University Press: 03 October 2025

Zihao Zhang, Affiliation: University of Oxford
Stefan Zohren, Affiliation: University of Oxford

Summary

This Element provides a comprehensive guide to deep learning in quantitative trading, merging foundational theory with hands-on applications. It is organized into two parts. The first part introduces the fundamentals of financial time-series and supervised learning and explores a range of network architectures, from feedforward networks to state-of-the-art models. To ensure robustness and mitigate overfitting on complex real-world data, it presents a complete workflow, from initial data analysis to cross-validation techniques tailored to financial data. Building on this, the second part applies deep learning methods to a range of financial tasks. The authors demonstrate how deep learning models can enhance both time-series and cross-sectional momentum trading strategies, generate predictive signals, and be formulated as an end-to-end framework for portfolio optimization. The applications use data ranging from daily observations to high-frequency microstructure data across a variety of asset classes. Throughout, the authors include illustrative code examples and provide a dedicated GitHub repository with detailed implementations.

Keywords

deep learning; machine learning; reinforcement learning; time-series; neural networks; quantitative trading; portfolio optimization; market microstructure; momentum trading; volatility scaling

Information

Type: Element
Series: Elements in Quantitative Finance

DOI: https://doi.org/10.1017/9781009707091

Online ISBN: 9781009707091

Publisher: Cambridge University Press

Print publication: 30 October 2025
