1. Preface
1. Who this book is for
2. What this book covers
3. To get the most out of this book
4. Get in touch
1. Introduction to ML Engineering
1. Technical requirements
2. Defining a taxonomy of data disciplines
1. Data scientist
2. ML engineer
3. ML operations engineer
4. Data engineer
3. Working as an effective team
4. ML engineering in the real world
5. What does an ML solution look like?
1. Why Python?
6. High-level ML system design
1. Example 1: Batch anomaly detection service
2. Example 2: Forecasting API
3. Example 3: Classification pipeline
7. Summary
2. The Machine Learning Development Process
1. Technical requirements
2. Setting up our tools
1. Setting up an AWS account
3. Concept to solution in four steps
1. Comparing this to CRISP-DM
2. Discover
1. Using user stories
3. Play
4. Develop
1. Selecting a software development methodology
2. Package management (conda and pip)
3. Poetry
4. Code version control
5. Git strategies
6. Model version control
5. Deploy
1. Knowing your deployment options
2. Understanding DevOps and MLOps
3. Building our first CI/CD example with GitHub Actions
4. Continuous model performance testing
5. Continuous model training
4. Summary
3. From Model to Model Factory
1. Technical requirements
2. Defining the model factory
3. Learning about learning
1. Defining the target
2. Cutting your losses
3. Preparing the data
4. Engineering features for machine learning
1. Engineering categorical features
2. Engineering numerical features
5. Designing your training system
1. Training system design options
2. Train-run
3. Train-persist
6. Retraining required
1. Detecting data drift
2. Detecting concept drift
3. Setting the limits
4. Diagnosing the drift
5. Remediating the drift
6. Other tools for monitoring
7. Automating training
8. Hierarchies of automation
9. Optimizing hyperparameters
1. Hyperopt
2. Optuna
10. AutoML
1. auto-sklearn
2. AutoKeras
7. Persisting your models
8. Building the model factory with pipelines
1. Scikit-learn pipelines
2. Spark ML pipelines
9. Summary
4. Packaging Up
1. Technical requirements
2. Writing good Python
1. Recapping the basics
2. Tips and tricks
3. Adhering to standards
4. Writing good PySpark
3. Choosing a style
1. Object-oriented programming
2. Functional programming
4. Packaging your code
1. Why package?
2. Selecting use cases for packaging
3. Designing your package
5. Building your package
1. Managing your environment with Makefiles
2. Getting all poetic with Poetry
6. Testing, logging, securing, and error handling
1. Testing
2. Securing your solutions
3. Analyzing your own code for security issues
4. Analyzing dependencies for security issues
5. Logging
6. Error handling
7. Not reinventing the wheel
8. Summary
5. Deployment Patterns and Tools
1. Technical requirements
2. Architecting systems
1. Building with principles
3. Exploring some standard ML patterns
1. Swimming in data lakes
2. Microservices
3. Event-based designs
4. Batching
4. Containerizing
5. Hosting your own microservice on AWS
1. Pushing to ECR
2. Hosting on ECS
6. Building general pipelines with Airflow
1. Airflow
1. Airflow on AWS
2. Revisiting CI/CD for Airflow
7. Building advanced ML pipelines
1. Finding your ZenML
2. Going with the Kubeflow
8. Selecting your deployment strategy
9. Summary
6. Scaling Up
1. Technical requirements
2. Scaling with Spark
1. Spark tips and tricks
2. Spark on the cloud
1. AWS EMR example
3. Spinning up serverless infrastructure
4. Containerizing at scale with Kubernetes
5. Scaling with Ray