Book contents
- Frontmatter
- Dedication
- Contents
- List of insights
- Preface
- Notation
- Part I Preliminaries
- Part II Fundamentals of Biological Sequence Analysis
- Part III Genome-Scale Index Structures
- Part IV Genome-Scale Algorithms
- 10 Alignment-based genome analysis
- 11 Alignment-free genome analysis and comparison
- 12 Compression of genome collections
- 13 Fragment assembly
- Part V Applications
- References
- Index
12 - Compression of genome collections
from Part IV - Genome-Scale Algorithms
Published online by Cambridge University Press: 28 September 2023
Summary
A pragmatic problem arising in the analysis of biological sequences is that collections of genomes, and especially collections of read sets consisting of material from many species, occupy too much space. This chapter explores techniques to efficiently compress such collections. Several algorithms related to Lempel–Ziv factorization are covered, as well as the prefix-free parsing technique to run-length encode the Burrows–Wheeler transform of a collection of genomes.
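As a rough illustration of why Lempel–Ziv factorization suits collections of similar genomes, the sketch below greedily splits a string into factors, each either a new character or a copy of the longest match starting earlier in the text. It is a naive quadratic-time toy, not the chapter's algorithms, which are not reproduced here; the function names `lz_factorize` and `lz_decode` and the factor representation are assumptions made for this example.

```python
def lz_factorize(text):
    """Greedy Lempel-Ziv factorization (naive quadratic-time sketch).

    Each factor is either a single previously unseen character, or a
    (source, length) pair that copies the longest match starting at an
    earlier position (the copy may overlap the current position).
    """
    factors = []
    n = len(text)
    i = 0
    while i < n:
        best_len, best_src = 0, -1
        # Try every earlier starting position j < i (hypothetical brute force).
        for j in range(i):
            k = 0
            while i + k < n and text[j + k] == text[i + k]:
                k += 1
            if k > best_len:
                best_len, best_src = k, j
        if best_len == 0:
            factors.append(text[i])          # literal character
            i += 1
        else:
            factors.append((best_src, best_len))  # copy from earlier text
            i += best_len
    return factors


def lz_decode(factors):
    """Rebuild the text from the factors produced by lz_factorize."""
    out = []
    for f in factors:
        if isinstance(f, str):
            out.append(f)
        else:
            src, length = f
            # Copy one character at a time so overlapping copies work.
            for k in range(length):
                out.append(out[src + k])
    return "".join(out)


if __name__ == "__main__":
    genome = "ACGTACGTACGA"
    factors = lz_factorize(genome)
    print(factors)
    print(lz_decode(factors) == genome)
```

On repetitive input the number of factors grows much more slowly than the text length, which is the property that makes this family of methods attractive for genome collections.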
- Type: Chapter
- Information: Genome-Scale Algorithm Design: Bioinformatics in the Era of High-Throughput Sequencing, pp. 284-307
- Publisher: Cambridge University Press
- Print publication year: 2023