Foreword
Preface
1.Big DataTechnologyPrimer
A Tour of the Landscape
Core Components
Computational Frameworks
Analytical SQL Engines
Storage Engines
Ingestion
Orchestration
Summary
Part Ⅰ.Infrastructure
2.Clusters
Reasons for Multiple Clusters
Multiple Clusters for Resiliency
Multiple Clusters for Software Development
Multiple Clusters for Workload Isolation
Multiple Clusters for Legal Separation
Multiple Clusters and Independent Storage and Compute
Multitenancy
Requirements for Multitenancy
Sizing Clusters
Sizing by Storage
Sizing by Ingest Rate
Sizing by Woddoad
Cluster Growth
The Drivers of Cluster Growth
Implementing Cluster Growth
Data Replication
Replication for Software Development
Replication and Workload Isolation
Summary
3.Computeand Storage
Computer Architecture for Hadoop
Commodity Servers
Server CPUs and RAM
Nonuniform Memory Access
CPU Specifications
RAM
Commoditized Storage Meets the Enterprise
Modularity of Compute and Storage
Everything Is Java
Replication or Erasure Coding?
Alternatives
Hadoop and the Linux Storage Stack
User Space
Important System CalIs
The Linux Page Cache
Short-Circuit and Zero-Copy Reads
……