In this chapter, we establish the mathematical foundation for hard computing optimization algorithms. We review the classical optimization approaches and extend the discussion to iterative methods, which hold a special role in machine learning. In particular, we review the gradient descent method, Newton’s method, the conjugate gradient method, and the quasi-Newton method. Alongside the discussion of these optimization methods, implementations in Matlab script are provided, together with considerations for their use in neural network training algorithms. Finally, the Levenberg-Marquardt method is introduced, discussed, and implemented in Matlab script to compare its behaviour with the other four iterative algorithms introduced in this chapter.
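As a rough illustration of the iterative methods this chapter covers, here is a minimal gradient descent sketch (in Python rather than the chapter's Matlab script; the quadratic objective, step size, and iteration count are illustrative choices, not taken from the text):

```python
# Minimal gradient descent sketch on f(x, y) = x^2 + 10*y^2,
# whose unique minimizer is (0, 0).

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 + 10*y^2.
    return 2.0 * x, 20.0 * y

def gradient_descent(x, y, step=0.05, iters=200):
    # Repeatedly step against the gradient with a fixed step size.
    for _ in range(iters):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

x_min, y_min = gradient_descent(5.0, 5.0)
print(x_min, y_min)  # both coordinates approach the minimizer (0, 0)
```

The slow convergence along the shallow `x` direction of this ill-conditioned quadratic is exactly what motivates the Newton-type and conjugate gradient methods discussed in the chapter.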
Several numerical methods used in the study of tensor network renormalization are introduced, including the power, Lanczos, conjugate gradient, and Arnoldi methods, as well as quantum Monte Carlo simulation.
Many machine learning methods require non-linear optimization, performed by the backward propagation of model errors, a process complicated by the presence of multiple minima and saddle points. Numerous gradient descent algorithms are available for optimization, including stochastic gradient descent, conjugate gradient, quasi-Newton, and non-linear least-squares methods such as Levenberg-Marquardt. In contrast to deterministic optimization, stochastic optimization methods repeatedly introduce randomness during the search process to avoid getting trapped in a local minimum. Evolutionary algorithms, which borrow concepts from evolution to solve optimization problems, include the genetic algorithm and differential evolution.
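The differential evolution idea mentioned above can be sketched as follows (a minimal DE/rand/1/bin loop; the sphere objective, population size, and control parameters F and CR are illustrative choices, not taken from the abstract):

```python
import random

# Minimal differential evolution (DE/rand/1/bin) sketch on the sphere
# function, whose global minimum is 0 at the origin.

def sphere(x):
    return sum(v * v for v in x)

def differential_evolution(dim=3, pop_size=20, F=0.8, CR=0.9, gens=200):
    rng = random.Random(0)  # seeded for reproducibility
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        for i in range(pop_size):
            # Mutation: combine three distinct members other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    trial.append(pop[a][j] + F * (pop[b][j] - pop[c][j]))
                else:
                    trial.append(pop[i][j])
            # Greedy selection: keep the trial vector if it is no worse.
            if sphere(trial) <= sphere(pop[i]):
                pop[i] = trial
    return min(pop, key=sphere)

best = differential_evolution()
print(sphere(best))  # close to the global minimum 0
```

Because selection only ever replaces a member with an equal-or-better trial, the best objective value in the population is monotonically non-increasing, which is the sense in which randomness helps escape poor regions without losing progress.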
The mathematical background and formulation of the numerical minimization process are described in terms of gradient-based methods, whose ingredients include the gradient, the Hessian, directional derivatives, optimality conditions for minimization, the Hessian eigensystem, the condition number of the Hessian, and conjugate vectors. Various minimization algorithms, such as the steepest descent method, Newton’s method, the conjugate gradient method, and the quasi-Newton method, are introduced along with practical examples.
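To illustrate the role the Hessian plays in these algorithms, here is a minimal Newton's-method sketch (the quadratic objective is an illustrative choice, not an example from the text): for a quadratic, a single Newton step x − H⁻¹∇f lands exactly on the minimizer, regardless of the Hessian's condition number.

```python
# One Newton step on the quadratic f(x, y) = x^2 + 10*y^2, whose
# Hessian is the constant diagonal matrix diag(2, 20).

def newton_step(x, y):
    gx, gy = 2.0 * x, 20.0 * y          # gradient of f
    hxx, hyy = 2.0, 20.0                # diagonal Hessian entries
    return x - gx / hxx, y - gy / hyy   # x - H^{-1} grad f

x1, y1 = newton_step(5.0, -3.0)
print(x1, y1)  # (0.0, 0.0): the exact minimizer in one step
```

Steepest descent on the same function needs many iterations, with a rate governed by the Hessian's condition number (here 10), which is why that quantity appears among the ingredients listed above.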
Some optimal choices for a parameter of the Dai–Liao conjugate gradient method are proposed by conducting matrix analyses of the method. More precisely, first the $\ell _{1}$ and $\ell _{\infty }$ norm condition numbers of the search direction matrix are minimized, yielding two adaptive choices for the Dai–Liao parameter. Then we show that a recent formula for computing this parameter which guarantees the descent property can be considered as a minimizer of the spectral condition number as well as the well-known measure function for a symmetrized version of the search direction matrix. Brief convergence analyses are also carried out. Finally, some numerical experiments on a set of test problems from a constrained and unconstrained testing environment are conducted using a well-known performance profile.
We propose a new derivative-free conjugate gradient method for large-scale nonlinear systems of equations. The method combines the Rivaie–Mustafa–Ismail–Leong conjugate gradient method for unconstrained optimisation problems and a new nonmonotone line-search method. The global convergence of the proposed method is established under some mild assumptions. Numerical results using 104 test problems from the CUTEst test problem library show that the proposed method is promising.
This paper proposes improvements to the modified Fletcher–Reeves conjugate gradient method (FR-CGM) for computing $Z$-eigenpairs of symmetric tensors. The FR-CGM does not need to compute the exact gradient and Jacobian. The global convergence of this method is established. We also test the modified Polak–Ribière–Polyak conjugate gradient method (PRP-CGM) and the shifted power method (SS-HOPM). Numerical experiments with FR-CGM, PRP-CGM and SS-HOPM show the efficiency of the proposed method for finding $Z$-eigenpairs of symmetric tensors.
We consider a Robin inverse problem associated with the Laplace equation, which is severely ill-posed and nonlinear. We formulate the problem as a boundary integral equation, and introduce a functional of the Robin coefficient as a regularisation term. A conjugate gradient method is proposed for solving the consequent regularised nonlinear least squares problem. Numerical examples are presented to illustrate the effectiveness of the proposed method.
An inverse problem of reconstructing the initial condition for a time fractional diffusion equation is investigated. On the basis of the optimal control framework, the uniqueness and first order necessary optimality condition of the minimizer for the objective functional are established, and a time-space spectral method is proposed to numerically solve the resulting minimization problem. The contribution of the paper is threefold: 1) an a priori error estimate for the spectral approximation is derived; 2) a conjugate gradient optimization algorithm is designed to efficiently solve the inverse problem; 3) some numerical experiments are carried out to show that the proposed method is capable of finding the optimal initial condition, and that the convergence rate of the method is exponential if the optimal initial condition is smooth.
We address in this article the computation of the convex solutions of the Dirichlet problem for the real elliptic Monge–Ampère equation for general convex domains in two dimensions. The method we discuss combines a least-squares formulation with a relaxation method. This approach leads to a sequence of Poisson–Dirichlet problems and another sequence of low-dimensional algebraic eigenvalue problems of a new type. Mixed finite element approximations with a smoothing procedure are used for the computer implementation of our least-squares/relaxation methodology. Domains with curved boundaries are easily accommodated. Numerical experiments show the convergence of the computed solutions to their continuous counterparts when such solutions exist. On the other hand, when classical solutions do not exist, our methodology produces solutions in a least-squares sense.
In earlier work we have studied a method for discretization in time of a parabolic problem, which consists of representing the exact solution as an integral in the complex plane and then applying a quadrature formula to this integral. In application to a spatially semidiscrete finite-element version of the parabolic problem, at each quadrature point one then needs to solve a linear algebraic system having a positive-definite matrix with a complex shift. We study iterative methods for such systems, considering the basic and preconditioned versions of first the Richardson algorithm and then a conjugate gradient method.
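For reference, the basic (unpreconditioned) conjugate gradient iteration for a real symmetric positive-definite system can be sketched as follows; the small 3×3 system is an illustrative choice, whereas the systems in the paper carry a complex shift and are preconditioned:

```python
# Standard conjugate gradient iteration for A x = b with A symmetric
# positive definite; the 3x3 system below is illustrative only.

def matvec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(A, b, tol=1e-12, max_iter=100):
    x = [0.0] * len(b)
    r = b[:]                 # residual b - A x for the zero initial guess
    p = r[:]                 # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)                          # exact line search
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]  # A-conjugate update
        rs = rs_new
    return x

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = conjugate_gradient(A, b)
print(x)  # satisfies A x ≈ b
```

In exact arithmetic the iteration terminates in at most n steps for an n×n system, and preconditioning, as studied in the paper, reduces the effective condition number that governs the practical convergence rate.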
Orbital-free density functional theory (OFDFT) is a quantum mechanical method in which the energy of a material depends only on the electron density and ionic positions. We examine some popular algorithms for optimizing the electron density distribution in OFDFT, explaining their suitability, benchmarking their performance, and suggesting some improvements. We start by describing the constrained optimization problem that encompasses electron density optimization. Next, we discuss the line search (including Wolfe conditions) and the nonlinear conjugate gradient and truncated Newton algorithms, as implemented in our open source OFDFT code. We finally focus on preconditioners derived from OFDFT energy functionals. Newly derived preconditioners are successful for simulation cells of all sizes without regions of low electron density, and for small simulation cells with such regions.
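The backtracking idea underlying such a line search can be sketched as follows (a minimal Armijo sufficient-decrease loop; the constants and the quadratic test function are illustrative choices, and the curvature half of the Wolfe conditions is omitted for brevity):

```python
# Backtracking line search enforcing the Armijo sufficient-decrease
# condition f(x + t d) <= f(x) + c * t * grad(x).d along direction d.

def backtracking(f, grad, x, direction, t=1.0, c=1e-4, shrink=0.5):
    fx = f(x)
    slope = sum(g * d for g, d in zip(grad(x), direction))  # directional derivative
    while f([xi + t * di for xi, di in zip(x, direction)]) > fx + c * t * slope:
        t *= shrink  # halve the step until sufficient decrease holds
    return t

f = lambda x: x[0] ** 2 + x[1] ** 2
grad = lambda x: [2 * x[0], 2 * x[1]]
x0 = [3.0, 4.0]
d = [-g for g in grad(x0)]          # steepest-descent direction
t = backtracking(f, grad, x0, d)
```

A full Wolfe line search, as used with nonlinear conjugate gradient methods, additionally rejects steps whose directional derivative has not flattened enough, which keeps the conjugacy-based direction updates well defined.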
In studying biomechanical deformation in articular cartilage, the presence of cells (chondrocytes) necessitates the consideration of inhomogeneous elasticity problems in which cells are idealized as soft inclusions within a stiff extracellular matrix. An analytical solution of a soft inclusion problem is derived and used to evaluate iterative numerical solutions of the associated linear algebraic system, based on discretization via the finite element method and use of an iterative conjugate gradient method with algebraic multigrid preconditioning (AMG-PCG). The accuracy and efficiency of the AMG-PCG algorithm are compared to those of two other conjugate gradient algorithms, with diagonal preconditioning (DS-PCG) or a modified incomplete LU decomposition (Euclid-PCG), based on comparison to the analytical solution. While all three algorithms are shown to be accurate, the AMG-PCG algorithm is demonstrated to provide significant savings in CPU time as the number of nodal unknowns is increased. In contrast to the other two algorithms, the AMG-PCG algorithm also exhibits little sensitivity of CPU time and number of iterations to variations in material properties that are known to significantly affect model variables. Results demonstrate the benefits of algebraic multigrid preconditioners for the iterative solution of assembled linear systems based on finite element modeling of soft elastic inclusion problems, and may be particularly advantageous for large scale problems with many nodal unknowns.
We present a new numerical optimal design for a redundant parallel manipulator, the eclipse, which has a geometrically symmetric workspace shape. We simultaneously consider the structural mass and design efficiency as objective functions to maximize the mass reduction and minimize the loss of design efficiency. The task-oriented workspace (TOW) and its partial workspace (PW) are considered in efficiently obtaining an optimal design, by excluding useless orientations of the end-effector and by including just one cross-sectional area of the TOW. The proposed numerical procedure is composed of coarse and fine search steps. In the coarse search step, we find the feasible parameter regions (FPR), in which the parameter sets satisfy only the marginal constraints. In the fine search step, we consider the multiobjective function in the FPR to find the optimal set of parameters. In this step, the fine search continues until it reaches the set of parameters that minimizes the proposed objective functions, with the PW continuously updated in every iteration. By applying the proposed approach to an eclipse-rapid prototyping machine, the structural mass of the machine can be reduced by 8.79% while the design efficiency is increased by 6.2%. This can be physically interpreted as a mass reduction of 49 kg (the initial structural mass was 554.7 kg) and a loss of 496 mm³/mm in the workspace volume per unit length. The proposed optimal design procedure could be applied to other serial or parallel mechanism platforms that have geometrically symmetric workspace shapes.
Modeling genetic regulatory networks is an important problem in genomic research. Boolean Networks (BNs) and their extensions, Probabilistic Boolean Networks (PBNs), have been proposed for modeling genetic regulatory interactions. In a PBN, the steady-state distribution gives very important information about the long-run behavior of the whole network. However, one is also interested in system synthesis, which requires the construction of networks. The construction of PBNs from a given transition-probability matrix and a given set of BNs is an inverse problem of huge size; it is also ill-posed and challenging, as there may be many networks, or no network, having the given properties. We propose a maximum entropy approach for this problem. Newton's method in conjunction with the Conjugate Gradient (CG) method is then applied to solve the inverse problem. We investigate the convergence rate of the proposed method. Numerical examples are also given to demonstrate the effectiveness of our proposed method.