A multi-level procedure for enhancing accuracy of machine learning algorithms

Published online by Cambridge University Press:  14 July 2020

KJETIL O. LYE
Affiliation:
SINTEF Digital, Oslo, Norway, email: kjetil.olsen.lye@sintef.no
SIDDHARTHA MISHRA
Affiliation:
Seminar for Applied Mathematics (SAM), D-Math, ETH Zürich, Rämistrasse 101, Zürich-8092, Switzerland, email: smishra@sam.math.ethz.ch
ROBERTO MOLINARO
Affiliation:
Seminar for Applied Mathematics (SAM), D-Math, ETH Zürich, Rämistrasse 101, Zürich-8092, Switzerland, email: roberto.molinaro@sam.math.ethz.ch

Abstract

We propose a multi-level method to increase the accuracy of machine learning algorithms for approximating observables in scientific computing, particularly those that arise in systems modelled by differential equations. The algorithm relies on judiciously combining a large number of computationally cheap training samples on coarse resolutions with a few expensive training samples on fine grid resolutions. We provide theoretical arguments for lowering the generalisation error, based on reducing the variance of the underlying maps, and present numerical evidence indicating significant gains over the underlying single-level machine learning algorithms. We also apply the multi-level algorithm to forward uncertainty quantification and observe a considerable speedup over competing algorithms.
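
One way to read the combination described in the abstract is as a telescoping sum: a surrogate is trained for the observable on the coarsest resolution using many samples, further surrogates are trained for the differences between successive resolutions using progressively fewer samples, and the predictions are added. The sketch below illustrates this reading on a toy problem and is not the authors' implementation. The function `observable`, its resolution-dependent error term, the sample counts and the polynomial degree are all hypothetical, and a least-squares polynomial fit stands in for the machine learning model.

```python
import numpy as np


def observable(y, level):
    """Hypothetical observable evaluated at grid resolution `level`.

    Toy stand-in for a numerical solver: an exact map plus an assumed
    discretisation error that halves with every refinement.
    """
    h = 2.0 ** (-level)  # assumed mesh width at this level
    return np.sin(2 * np.pi * y) + 0.2 * h * np.cos(4 * y)


def fit(y, values, degree=7):
    """Least-squares polynomial surrogate (placeholder for a neural network)."""
    return np.polynomial.Polynomial.fit(y, values, degree, domain=[0.0, 1.0])


def train_multilevel(samples_per_level, rng):
    """Fit one surrogate for the coarsest level and one for each detail."""
    surrogates = []
    for level, n in enumerate(samples_per_level):
        y = rng.uniform(0.0, 1.0, size=n)
        if level == 0:
            # Many cheap samples of the coarsest observable.
            target = observable(y, 0)
        else:
            # Fewer, more expensive samples of the difference between levels.
            target = observable(y, level) - observable(y, level - 1)
        surrogates.append(fit(y, target))
    return surrogates


def predict(surrogates, y):
    """Telescoping sum: coarse surrogate plus all detail surrogates."""
    return sum(s(y) for s in surrogates)


rng = np.random.default_rng(0)
samples_per_level = [512, 128, 32]  # illustrative: fewer samples on finer levels
surrogates = train_multilevel(samples_per_level, rng)

y_test = np.linspace(0.0, 1.0, 201)
finest = len(samples_per_level) - 1
max_error = np.max(np.abs(predict(surrogates, y_test) - observable(y_test, finest)))
print(f"max error against the finest level: {max_error:.2e}")
```

In this toy setting the differences between levels are much smaller in magnitude than the observable itself, which is why few fine-resolution samples can suffice for the detail surrogates. How the sample counts should actually be chosen per level is a question for the paper, not something this sketch settles.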

Type
Papers
Copyright
© The Author(s), 2020. Published by Cambridge University Press
