Publisher: Cambridge University Press
Expected online publication date: June 2025
Print publication year: 2025
Online ISBN: 9781009180108

Book description

This self-contained guide introduces two pillars of data science, probability theory and statistics, side by side to illuminate the connections between statistical techniques and the probabilistic concepts they are based on. The topics covered include random variables, nonparametric and parametric models, correlation, estimation of population parameters, hypothesis testing, principal component analysis, and both linear and nonlinear methods for regression and classification. Examples throughout the book draw from real-world datasets to demonstrate concepts in practice and confront readers with fundamental challenges in data science, such as overfitting, the curse of dimensionality, and causal inference. Python code reproducing these examples is available on the book's website, along with videos, slides, and solutions to exercises. This accessible book is ideal for undergraduate and graduate students, data science practitioners, and others interested in the theoretical concepts underlying data science methods.

Reviews

‘Fernandez-Granda's Probability and Statistics for Data Science is a comprehensive yet approachable treatment of the fundamentals required of all aspiring Data Scientists - whether they be in academia, industry or elsewhere. The language is clear and precise, and it is one of the best-organized treatments of this material I have ever seen. With lucid examples and helpful exercises, it deserves to be the leading text for these topics among undergraduate and graduate students in this technical, fast-moving discipline. Instructors take note!’

Arthur Spirling - Princeton University

‘If you're mathematically inclined and want to master the foundations of data science in one go, this book is for you. It covers a broad range of essential modern topics - including nonparametric methods, causal inference, latent variable models, Bayesian approaches, and a thorough introduction to machine learning - all illustrated with an abundance of figures and real-world data examples. Highly recommended.’

David Rosenberg - Office of the CTO, Bloomberg
