
Effective Machine Learning Teams

Pages: 351
Uploaded on: 02-08-2024
Written in: 2021/2022

"Gain the valuable skills and techniques you need to accelerate the delivery of machine learning solutions. With this practical guide, data scientists, ML engineers, and their leaders will learn how to bridge the gap between data science and Lean product delivery in a practical and simple way. David Tan, Ada Leung, and Dave Colls show you how to apply time-tested software engineering skills and Lean product delivery practices to reduce toil and waste, shorten feedback loops, and improve your team's flow when building ML systems and products. Based on the authors' experience across multiple real-world data and ML projects, the proven techniques in this book will help your team avoid common traps in the ML world, so you can iterate and scale more quickly and reliably. You'll learn how to overcome friction and experience flow when delivering ML solutions."

Content preview

Chapter 1. Challenges and Better Paths in Delivering ML Solutions
The most dangerous kind of waste is the waste we do not recognize.

Shigeo Shingo, leading expert on the Toyota Production System

Not everything that is faced can be changed, but nothing can be changed until it is faced.

James Baldwin, writer and playwright

Many individuals and organizations start their machine learning (ML) journey with high hopes,
but the lived experiences of many ML practitioners tell us that the journey of delivering ML
solutions is riddled with traps, detours, and sometimes even insurmountable barriers. When we
peel back the hype and the glamorous claims of data science being the sexiest job of the 21st
century, we often see ML practitioners bogged down by burdensome manual work; firefighting in
production; team silos; and unwieldy, brittle, and complex solutions.

This hinders, or even prevents, teams from delivering value to customers and also frustrates an
organization’s investments and ambitions in AI. As hype cycles go, many travel past the peak of
inflated expectations and crash-land into the trough of disillusionment. We might see some high-
performing ML teams move on to the plateau of productivity and wonder if we’ll ever get there.

Regardless of your background—be it academia, data science, ML engineering, product
management, software engineering, or something else—if you are building products or systems
that involve ML, you will inevitably face the challenges that we describe in this chapter. This
chapter is our attempt to distill our experience—and the experience of others—in building and
delivering ML-enabled products. We hope that these principles and practices will help you avoid
unnecessary pitfalls and find a more reliable path for your journey.

We kick off this chapter by acknowledging the dual reality of promise and disappointment in ML
in the real world. We then examine both high-level and day-to-day challenges that often cause ML
projects to fail. We then outline a better path based on the principles and practices of Lean delivery,
product thinking, and agile engineering. Finally, we briefly discuss why these practices are relevant to all ML teams, and especially to teams delivering Generative AI products and large language model (LLM) applications. Consider this chapter a miniature representation of the remainder of this book.


ML: Promises and Disappointments
In this section, we look at evidence of continued growth of investments and interest in ML before
taking a deep dive into the engineering, product, and delivery bottlenecks that impede the returns
on these investments.

Continued Optimism in ML
Putting aside the hype and our individual coordinates on the hype cycle for a moment, ML
continues to be a fast-advancing field that provides many techniques for solving real-world
problems. Stanford’s “AI Index Report 2022” found that in 2021, global private investment in AI
totaled around $94 billion, more than double the total private investment in 2019,
before the COVID-19 pandemic. McKinsey’s “State of AI in 2021” survey indicated that AI
adoption was continuing its steady rise: 56% of all respondents reported AI adoption in at least
one function, up from 50% in 2020.

The Stanford report also found companies are continuing to invest in applying a diverse set of ML
techniques—e.g., natural language understanding, computer vision, reinforcement learning—
across a wide array of sectors, such as healthcare, retail, manufacturing, and financial services.
From a jobs and skills perspective, Stanford’s analysis of millions of job postings since 2010
showed that the demand for ML capabilities has been growing steadily year-on-year in the past
decade, even through and after the COVID-19 pandemic.

While these trends are reassuring from an opportunities perspective, they are also highly
concerning if we journey ahead without confronting and learning from the challenges that have
ensnared us—both the producers and consumers of ML systems—in the past. Let’s take a look at
these pitfalls in detail.

Why ML Projects Fail
Despite the plethora of chart-topping Kaggle notebooks, it’s common for ML projects to fail in the
real world. Failure can come in various forms, including:

- Inability to ship an ML-enabled product to production
- Shipping products that customers don't use
- Deploying defective products that customers don't trust
- Inability to evolve and improve models in production quickly enough

Just to be clear—we’re not trying to avoid failures. As we all know, failure is as valuable as it is
inevitable. There’s lots that we can learn from failure. The problem arises as the cost of failure
increases—missed deadlines, unmet business outcomes, and sometimes even collateral damage: harm to humans and the loss of jobs and livelihoods of employees who aren't even directly involved in the ML initiative.

What we want is to fail in a low-cost and safe way, and often, so that we improve our odds of
success for everyone who has a stake in the undertaking. We also want to learn from failures—by
documenting and socializing our experiments and lessons learned, for example—so that we don’t
fail in the same way again and again. In this section, we’ll look at some common challenges—
spanning product, delivery, and engineering—that reduce our chances of succeeding, and in the
next section, we’ll explore ways to reduce the costs and likelihood of failure and deliver valuable
outcomes more effectively.

Let's start at a high level and then zoom in to look at day-to-day barriers to the flow of value.

High-level view: Barriers to success

Taking a high-level view—i.e., at the level of an ML project or a program of work—we’ve seen
ML projects fail to achieve their desired outcomes due to the following challenges:

Failing to solve the right problem or deliver value for users

In this failure mode, even if we have all the right engineering practices and “build the thing
right,” we fail to move the needle on the intended business outcomes because the team
didn’t “build the right thing.” This often happens when the team lacks product management
capabilities or lacks alignment with product and business. Without mature product thinking
capabilities in a team, it’s common for ML teams to overlook human-centered design
techniques—e.g., user testing, user journey mapping—to identify the pains, needs, and
desires of users.1

Challenges in productionizing models

Many ML projects do not see the light of day in production. A 2021 Gartner poll of roughly
200 business and IT professionals found that only 53% of AI projects make it from pilot
into production, and among those that succeed, it takes an average of nine months to do
so.2 The challenges of productionizing ML models aren't limited to compute issues such as model deployment; they can also relate to data (e.g., not having inference data available at suitable quality, latency, and distribution in production).
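Data-related production gaps like these can be caught with explicit checks at the serving boundary. The sketch below is illustrative—the feature names, bounds, and payload shape are assumptions, not taken from the book—but it shows the kind of guardrail that catches inference data of unsuitable quality before it silently degrades predictions:

```python
# Illustrative inference-time data check: validate that an incoming
# feature payload matches the schema and value ranges the model was
# trained on, before it ever reaches the model. Feature names and
# bounds here are hypothetical.

EXPECTED_SCHEMA = {
    "age": (0, 120),                 # (min, max) observed in training data
    "monthly_spend": (0.0, 50_000.0),
}

def validate_inference_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload is usable."""
    problems = []
    for feature, (lo, hi) in EXPECTED_SCHEMA.items():
        if feature not in payload:
            problems.append(f"missing feature: {feature}")
        elif payload[feature] is None:
            problems.append(f"null value for: {feature}")
        elif not (lo <= payload[feature] <= hi):
            problems.append(f"out-of-range value for {feature}: {payload[feature]}")
    return problems

# Reject or quarantine bad payloads instead of silently scoring them.
problems = validate_inference_payload({"age": 200, "monthly_spend": 99.0})
```

In practice teams often use a dedicated validation library for this, but the design point is the same: make data assumptions explicit and enforceable at the production boundary.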

Challenges after productionizing models

Once in production, it’s common to see ML practitioners bogged down by toil and tedium
that inhibits iterative experimentation and model improvements. In its “2021 Enterprise
Trends in Machine Learning” report, Algorithmia reported that 64% of companies
take more than one month to deploy a new model, an increase from 58% as reported in
Algorithmia’s 2020 report. The report also notes that 38% of organizations spend more than 50% of their data scientists’ time on deployment—and that this only gets worse with scale.

Long or missing feedback loops

During model development, feedback loops are often long and tedious, and this diverts
valuable time from important ML product development work. The primary way of knowing
if everything works might be to manually run a training notebook or script, wait for it to complete—sometimes for hours—and then manually wade through logs or printed statements to eyeball some model metrics and determine whether the model is still as good as before. This doesn’t scale well, and more often than not, we are hindered by unexpected errors and quality degradations during development and even in production.
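One way to shorten this feedback loop is to replace eyeballing with an automated check that compares fresh model metrics against a recorded baseline and fails fast on regressions. The metric names, baseline values, and tolerance below are illustrative assumptions, not from the book:

```python
# Illustrative automated model-quality gate: after a training run,
# compare the new model's metrics against a recorded baseline and
# report any metric that regressed beyond a small tolerance.

BASELINE = {"accuracy": 0.91, "auc": 0.88}  # hypothetical recorded baseline
TOLERANCE = 0.02                            # allow small run-to-run fluctuation

def check_no_regression(new_metrics: dict,
                        baseline: dict = BASELINE,
                        tolerance: float = TOLERANCE) -> list:
    """Return the names of metrics that regressed beyond the tolerance."""
    return [
        name for name, old_value in baseline.items()
        if new_metrics.get(name, 0.0) < old_value - tolerance
    ]

# A non-empty result can fail a CI build in minutes, instead of someone
# noticing a degraded model in production weeks later.
regressed = check_no_regression({"accuracy": 0.92, "auc": 0.84})
```

Wired into a pipeline, a check like this turns "wait hours, then read logs" into an immediate pass/fail signal—the kind of short feedback loop the rest of this book builds toward.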
