Deep learning models are powerful, but often large, slow, and expensive to run. This book is a practical guide to accelerating and compressing neural networks using proven techniques such as quantization, pruning, distillation, and fast architectures. It explains how and why these methods work, so that readers understand them thoroughly. Written for engineers, researchers, and advanced students, the book combines clear theoretical insights with hands-on PyTorch implementations and numerical results. Readers will learn how to reduce inference time and memory usage, lower deployment costs, and select the right acceleration strategy for their task. Whether you're working with large language models, vision systems, or edge devices, this book gives you the tools and intuition needed to build faster, leaner AI systems without sacrificing performance. It is perfect for anyone who wants to go beyond intuition and take a principled approach to optimizing AI systems.
This section outlines the accessibility features of this content, including support for screen readers, full keyboard navigation, and high-contrast display options. This may not be relevant for you.
Accessibility compliance for the PDF of this book is currently unknown and may be updated in the future.