This chapter reviews vectors and matrices and their basic properties, such as shape, orthogonality, determinant, eigenvalues, and trace. It also reviews operations such as multiplication and transpose. These operations are used throughout the book and are pervasive in the literature. In short, arranging data into vectors and matrices allows one to apply powerful data analysis techniques over a wide spectrum of applications. Throughout, this chapter (and the book) illustrates how the ideas are implemented in practice in Julia.
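The properties listed in this abstract can be checked by hand on a small matrix. The sketch below is illustrative only: it uses plain Python rather than the book's Julia, the matrix values are arbitrary, and the helper names (`shape`, `transpose`, `matmul`, `trace`, `det2`, `eigenvalues2`, `dot`) are hypothetical, not taken from the book.

```python
import math

# A small symmetric matrix stored as a list of rows (illustrative values).
A = [[2.0, 1.0],
     [1.0, 2.0]]

def shape(M):
    # (number of rows, number of columns)
    return (len(M), len(M[0]))

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    # Row-by-column products; assumes the inner dimensions agree.
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def det2(M):
    # Determinant of a 2x2 matrix only.
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def eigenvalues2(M):
    # Roots of lambda^2 - trace*lambda + det = 0 for a 2x2 matrix.
    # Assumes the eigenvalues are real (true for symmetric matrices).
    t, d = trace(M), det2(M)
    root = math.sqrt(t * t - 4.0 * d)
    return ((t - root) / 2.0, (t + root) / 2.0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

lam1, lam2 = eigenvalues2(A)
# Eigenvectors of this particular A for lam1 and lam2, found by inspection.
v1, v2 = [1.0, -1.0], [1.0, 1.0]
```

For this matrix the eigenvalues sum to the trace and multiply to the determinant, and the two eigenvectors have zero dot product, i.e. they are orthogonal.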
In this chapter we cover clustering and regression, looking at two traditional machine learning methods: k-means and linear regression. We first briefly discuss how to implement these methods in a non-distributed manner, and then carefully analyze their bottlenecks when manipulating big data. This enables us to design global solutions based on the DataFrame API of Spark. The key focus is on the principles for designing solutions effectively. Nevertheless, some of the challenges in this chapter involve investigating tools from Spark to speed up the processing even further. k-means serves as an example of an iterative algorithm and of how to exploit caching in Spark, and we analyze its implementation with both the RDD and DataFrame APIs. For linear regression, we first implement the closed-form solution, which relies on numerous matrix multiplications and outer products to simplify the processing of big data. Then, we look at gradient descent. These examples give us the opportunity to expand on the principles of designing a global solution, and also allow us to show that knowing the underlying platform well (Spark, in this case) is essential to maximizing performance.
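As a point of reference for the non-distributed starting point mentioned in this abstract, here is a minimal sketch of the standard k-means loop (assignment step, then update step) in plain Python. It is not the chapter's Spark code; the function names, the sample points, and the initial centroids are assumptions made for illustration.

```python
def closest(point, centroids):
    # Index of the nearest centroid by squared Euclidean distance.
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )

def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: group every point with its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            groups[closest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its group.
        # An empty group keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

# Two well-separated groups of 2D points (made-up data).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centers = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

Every iteration scans the full set of points again, which is why an iterative algorithm like this one is a natural candidate for caching the input data on a platform such as Spark.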
Chapter 2 gives a more formal account of the ideas introduced in Chapter 1. We discuss the requirements for writing CUDA kernel code and explain the syntax in detail. We encourage the reader to start thinking in parallel by introducing some key coding ideas including methods for summing a large number of values in parallel for so-called reduction operations. This chapter also introduces GPU shared memory, illustrated with a tiled matrix multiplication example. We demonstrate how the __restrict keyword applied to kernel pointer arguments can speed up your code. In some sense this is our most conventional chapter for a book on CUDA, and the reduction operation is revisited in a number of later chapters to help introduce new CUDA features. However, many of our other examples go well beyond what you can find elsewhere.
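The reduction idea mentioned in this abstract can be pictured as a tree of pairwise additions. The following is a sequential Python simulation of that access pattern, offered only as a sketch: it is not CUDA code and not the book's kernel, and the function name `tree_reduce_sum` is hypothetical. On a GPU, each iteration of the inner loop would be carried out by a separate thread.

```python
def tree_reduce_sum(values):
    data = list(values)
    # Pad with zeros up to the next power of two so every stride pairs up.
    n = 1
    while n < len(data):
        n *= 2
    data += [0.0] * (n - len(data))

    stride = n // 2
    while stride > 0:
        # All additions at one stride are independent of each other,
        # so a parallel implementation could perform them at the same time.
        for i in range(stride):
            data[i] += data[i + stride]
        stride //= 2
    return data[0]

total = tree_reduce_sum(range(1, 101))
```

The number of passes grows with the logarithm of the input length (7 passes for the 100 values above, after padding to 128), rather than with the length itself.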
The objective of this chapter is to prepare for the use of MATLAB to solve systems of linear equations. The important concepts include the inverse matrix, the solution of a system of linear equations, the least squares method, and MATLAB’s “left-division,” which is essentially an implementation of the least squares method. MATLAB functions are very useful and efficient for this job. In real problems, the equations are often highly overdetermined, meaning that there are many more equations (or measurements) than unknowns, so the least squares method is required to find a solution in a statistical sense. The contents of this chapter lay a foundation for several later chapters because many theories and methods, e.g. Fourier analysis and harmonic analysis, are related in one way or another to the solution of a system of linear equations.
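To make the overdetermined case concrete, the sketch below fits a straight line y = a + b·x to more measurements than unknowns by solving the 2×2 normal equations directly. It is a plain Python stand-in rather than MATLAB, the function name `fit_line` is hypothetical, and the data values are made up for illustration.

```python
def fit_line(xs, ys):
    # Least squares fit of y = a + b*x.
    # The normal equations for the two unknowns (a, b) are
    #   n*a      + sum(x)*b   = sum(y)
    #   sum(x)*a + sum(x^2)*b = sum(x*y)
    # and are solved here with Cramer's rule.
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # zero only if all x values are identical
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

# Four measurements, two unknowns: no line passes through every point.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
a, b = fit_line(xs, ys)

# Residuals of the fitted line; for a least squares fit with an
# intercept term they sum to zero up to rounding error.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
```

With these values the fit gives an intercept of about 1.09 and a slope of about 1.94. No exact solution exists, so the result is the line that minimizes the sum of squared residuals.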