preface
acknowledgments
about this book
about the authors
about the cover illustration
1 Introducing the data platform
1.1 The trends behind the change from data warehouses to data platforms
1.2 Data warehouses struggle with data variety, volume, and velocity
Variety
Volume
Velocity
All the V’s at once
1.3 Data lakes to the rescue?
1.4 Along came the cloud
1.5 Cloud, data lakes, and data warehouses: The emergence of cloud data platforms
1.6 Building blocks of a cloud data platform
Ingestion layer
Storage layer
Processing layer
Serving layer
1.7 How the cloud data platform deals with the three V’s
Variety
Volume
Velocity
Two more V’s
1.8 Common use cases
2 Why a data platform and not just a data warehouse
2.1 Cloud data platforms and cloud data warehouses: The practical aspects
A closer look at the data sources
An example cloud data warehouse–only architecture
An example cloud data platform architecture
2.2 Ingesting data
Ingesting data directly into Azure Synapse
Ingesting data into an Azure data platform
Managing changes in upstream data sources
2.3 Processing data
Processing data in the warehouse
Processing data in the data platform
2.4 Accessing data
2.5 Cloud cost considerations
2.6 Exercise answers
3 Getting bigger and leveraging the Big 3: Amazon, Microsoft Azure, and Google
3.1 Cloud data platform layered architecture
Data ingestion layer
Fast and slow storage
Processing layer
Technical metadata layer
The serving layer and data consumers
Orchestration and ETL overlay layers
3.2 The importance of layers in a data platform architecture
3.3 Mapping cloud data platform layers to specific tools
AWS
Google Cloud
Azure
3.4 Open source and commercial alternatives
Batch data ingestion
Streaming data ingestion and real-time analytics
Orchestration layer
3.5 Exercise answers
4 Getting data into the platform
4.1 Databases, files, APIs, and streams
Relational databases
Files
SaaS data via API
Streams
4.2 Ingesting data from relational databases
Ingesting data from RDBMSs using a SQL interface
Full-table ingestion