Lecture 1
Article 1: The surprising power of online experiments (Harvard Business Review)
Article 2: The discipline of business experimentation (Harvard Business Review)
Article 3: The temperature premium: warm temperatures increase product valuation (Zwebner et al)
Lecture 2
Article 4: Psychological targeting as an effective approach to digital mass persuasion (Matz et al)
Article 5: Spotlights, floodlights, and the magic number zero: simple effects tests in moderated regression (Spiller et al)
Lecture 3
Article 6: Delegating decisions: recruiting others to make choices we might regret (Steffel et al)
Lecture 4
Article 7: A comparison of approaches to advertising measurement: evidence from big field experiments at Facebook (Gordon et al)
Article 8: Measuring consumer sensitivity to audio advertising: a field experiment on Pandora internet radio (Huang et al)
Article 9: Experimental methods: between-subject and within-subject design (Charness et al)
Article 10: Field studies of psychologically targeted ads face threats to internal validity (Eckles et al)
Article 11: Reply to Eckles et al: Facebook’s optimization algorithms are highly unlikely to explain the effect of psychological targeting (Matz et al)
Lecture 5
Article 12: Critical condition: people don’t dislike a corporate experiment more than they dislike its worst condition (Mislavsky et al)
Lecture 6
Article 13: Estimating the reproducibility of psychological science (Open Science Collaboration)
Article 14: False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant (Simmons et al)
LECTURE 1
ARTICLE 1: THE SURPRISING POWER OF ONLINE EXPERIMENTS (HARVARD BUSINESS REVIEW)
In brief
The need: when building websites/applications, too many companies make decisions using subjective
opinions rather than hard data.
The solution: conduct online controlled experiments to evaluate ideas. Reason: large investments can
fail to deliver, and some tiny changes can be surprisingly detrimental while others have big payoffs.
Implementation: understand how to properly design and execute A/B tests and other controlled
experiments, ensure their integrity, interpret their results, and avoid pitfalls.
Introduction: controlled experiments can transform decision making into a scientific, evidence-driven process –
rather than an intuitive reaction. Too many organizations don’t know how to run rigorous scientific tests, or
conduct way too few of them.
Appreciate the value of A/B tests: an A/B test consists of two experiences: ‘A’, the control (current system), and ‘B’, the
treatment (a modification that attempts to improve something). Users are randomly assigned to the experiences,
and key metrics are computed and compared. Large customer samples give the opportunity to evaluate many ideas
quickly, with great precision, and at a negligible cost per incremental experiment. That allows organizations to
iterate rapidly, fail fast, and pivot.
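Illustration (not from the article): once users have been randomly assigned, comparing a key metric between A and B can be done with a two-proportion z-test. The sketch below assumes the metric is a conversion rate; the function name `ab_test` and all counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test: control (A) versus treatment (B).

    Returns (difference in conversion rate, z statistic, p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value


# Hypothetical counts, for illustration only.
lift, z, p = ab_test(conv_a=1000, n_a=50000, conv_b=1100, n_b=50000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p:.4f}")
```

A small p-value suggests the difference between A and B is unlikely to be chance alone; it says nothing about whether the difference is large enough to matter for the business.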
What managers need to understand: 1) tiny changes can have a big impact. People assume the greater an
investment, the larger the impact. But online: success is more about getting small changes right. Large
investments, however, may have little or no payoff. 2) Experiments can guide investment decisions. Online tests
help figure out how much investment in a potential improvement is optimal.
Build a large-scale capability: ‘companies need to kiss a lot of frogs (that is, perform a massive number of
experiments) to find a prince’. It is key to experiment with everything to make sure that changes neither are
degrading nor have unexpected effects. This requires an infrastructure: instrumentation, data pipelines, and data
scientists. Several third-party tools/services make it easy to try experiments, but if you want to scale things up,
you must tightly integrate the capability into your processes. That will drive down the cost of each experiment
and increase its reliability.
A company’s experimentation personnel can be organized in three ways:
Centralized model: a team of data scientists serves the entire company. Suits small companies.
+ : you can focus on long-term projects. - : groups may have different priorities (which can lead to conflicts), and data
scientists may feel like outsiders.
Decentralized model: data scientists are distributed throughout the different business units. Suits companies
with multiple businesses.
+ : experts in each business domain. - : data scientists lack a clear career path and may not receive the feedback they need to develop.
Center-of-excellence model: some data scientists sit in a centralized function and others within the
different business units. Suits companies where online experimentation is a corporate priority.
+ : lowers time and resources needed, and can spread best testing practices. - : lack of clarity
Address the definition of success: define a suitable evaluation metric for experiments that aligns with strategic
goals: the overall evaluation criterion (OEC). This is difficult (it means determining which short-term metrics are the best
predictors of long-term outcomes) and requires close cooperation between senior executives and data analysts.
Also break down the components of an OEC and track them.
Be aware of low-quality data: getting numbers you can trust is hard. A/A tests: test something against itself to
ensure that about 95% of the time the system correctly identifies no significant difference. Twyman’s law: any
figure that looks interesting/different is usually wrong, so surprising results should be replicated.
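Illustration (not from the article): the A/A idea can be checked with a small simulation. Both arms get the same experience, so at a 5% significance level roughly 5% of A/A tests should come out "significant" by chance and roughly 95% should show no difference. The conversion rate, sample sizes and function names below are assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or only conversions): nothing to compare
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def aa_false_positive_rate(n_tests=500, users_per_arm=1000, rate=0.1,
                           alpha=0.05, seed=0):
    """Share of simulated A/A tests that are wrongly flagged as significant."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_tests):
        # Both arms draw from the same true conversion rate.
        conv_a = sum(rng.random() < rate for _ in range(users_per_arm))
        conv_b = sum(rng.random() < rate for _ in range(users_per_arm))
        if p_value(conv_a, users_per_arm, conv_b, users_per_arm) < alpha:
            false_positives += 1
    return false_positives / n_tests


fpr = aa_false_positive_rate()
print(f"share of A/A tests flagged as significant: {fpr:.3f}")
```

If a real experimentation system flags far more than about 5% of its A/A tests, that points to a problem in assignment, logging or the statistics rather than to real effects.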
If you want results to be trustworthy, you must ensure that high-quality data is used. Check for: outliers,
collection errors, heterogeneous treatment effects (when some segments experience much larger/smaller effects
than others do), and carryover effects (in which people’s experience in an experiment alters their future behavior;
solution: shuffle users between experiments). Also validate that the percentages of users in the control/treatment
groups in the actual experiment match the experimental design.
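Illustration (not from the article): the check that the observed control/treatment split matches the design can be sketched as a chi-square goodness-of-fit test with one degree of freedom. The function name, the 50/50 design and the user counts are hypothetical.

```python
from math import erfc, sqrt


def sample_ratio_p_value(n_control, n_treatment, expected_control_share=0.5):
    """P-value for 'the observed split matches the designed split'."""
    total = n_control + n_treatment
    exp_control = total * expected_control_share
    exp_treatment = total - exp_control
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


# Hypothetical user counts for a 50/50 design.
p_ok = sample_ratio_p_value(50000, 50000)
p_bad = sample_ratio_p_value(50000, 51000)
print(f"balanced split: p={p_ok:.3f}, skewed split: p={p_bad:.4f}")
```

A very small p-value means the split deviates from the design more than chance would explain, so the experiment's results should not be trusted until the cause is found.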
Avoid assumptions about causality: Some executives believe that all they need to do is establish correlation, and
that causality can then be inferred. This is wrong. Observational studies cannot establish causality, and including too many
variables in tests also makes it hard to learn about causality. An experiment should be simple enough that cause-
and-effect relationships can be easily understood.
Should you try to understand causal mechanism? Yes. But if it comes to the ‘behavior of users’ you don’t always
have to know the ‘why’ or the ‘how’ to benefit from knowledge of the ‘what’.
Conclusion: the online world is often viewed as turbulent and full of peril, but controlled experiments can help
us navigate it. If you want to gain a competitive advantage, your firm should build an experimental capability and
master the science of conducting online tests.
ARTICLE 2: THE DISCIPLINE OF BUSINESS EXPERIMENTATION (HARVARD BUSINESS REVIEW)
In brief:
The problem: in the absence of sufficient data to inform decisions about proposed innovations,
managers often rely on their experience, intuition, or conventional wisdom – none of which is
necessarily relevant.
The solution: a rigorous scientific test, in which companies separate an independent variable (IV, the
presumed cause) from a dependent variable (DV, the observed effect) while holding all other potential
causes constant, and then manipulate the IV to study changes in the DV. This reveals cause and effect.
The guidance: to make the most of their experiments, companies must ask: does the experiment have
a clear purpose? Have stakeholders made a commitment to abide by the results? Is the experiment
doable? How can we ensure reliable results? Have we gotten the most value out of the experiment?
Checklist for running a business experiment:
Purpose
- Does the experiment focus on a specific management action under consideration?
- What do people hope to learn from the experiment?
Buy-in
- What specific changes would be made on the basis of the results?
- How will the organization ensure that the results aren’t ignored?
- How does the experiment fit into the organization’s overall learning agenda and strategic priorities?
Feasibility
- Does the experiment have a testable prediction?
- What is the required sample size? Note: the sample size will depend on the expected effect (e.g. a 5% increase in sales).
- Can the organization feasibly conduct the experiment at the test locations for the required duration?
Reliability
- What measures will be used to account for systemic bias, whether it’s conscious or unconscious?
- Do the characteristics of the control group match those of the test group?
- Can the experiment be conducted in either ‘blind’ or ‘double-blind’ fashion?
- Have any remaining biases been estimated through statistical analysis or other techniques?
- Would others conducting the same test obtain similar results?
Value
- Has the organization considered a targeted rollout, that is, one that takes into account a proposed initiative’s effect on different customers, markets, and segments, to concentrate investments in areas where the potential payback is highest?
- Has the organization implemented only the components of an initiative with the highest ROI?
- Does the organization have a better understanding of what variables are causing what effects?
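Illustration (not from the article): the checklist notes that the required sample size depends on the expected effect. A minimal sketch using the standard normal-approximation formula for comparing two proportions is given below. It assumes the outcome is a conversion rate rather than sales in euros, a two-sided 5% significance level and 80% power; the function name and the 10% baseline are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect a relative lift."""
    p_treat = p_control * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    effect = p_treat - p_control
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical baseline conversion of 10%.
n_small_effect = sample_size_per_group(0.10, 0.05)  # expected 5% relative lift
n_large_effect = sample_size_per_group(0.10, 0.10)  # expected 10% relative lift
print(n_small_effect, n_large_effect)
```

Halving the expected effect roughly quadruples the required sample, which is why small expected effects can make an experiment infeasible at a limited number of test locations.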
Purpose: Companies should conduct experiments if they are the only practical way to answer specific questions
about proposed management actions. To decide this, managers must figure out what they want to learn. A good
hypothesis names a specific IV and DV, so that the test can clearly support or reject it. In many situations it pays to go beyond the direct
effects of an initiative and investigate its ancillary effects. Example Kohl’s: open stores an hour later to decrease operating
costs. The only way to test this was to conduct a rigorous experiment. Result: a delayed opening would not result in any meaningful sales decline.
Stakeholders: Before conducting any test, stakeholders must agree how they'll proceed once the results are in
(promise to weigh all the findings and be willing to walk away from a project if it’s not supported by the data). Example Kohl’s:
many executives were enthusiastic about adding a new product category, but the test showed a drop in sales and the program was scrapped.
This shows that experiments are needed to perform objective assessments of initiatives.
A process should be instituted to ensure that test results aren’t ignored. Example Publix Super Markets: all large retail
projects must undergo formal experiments to receive a green light (a filtering process). When constructing/implementing such a
process, it is important that the experiments are part of a learning agenda that supports the firm’s
organizational priorities. Example Petco: each test request must address how the experiment would contribute to the overall strategy
(become more innovative). The number of tests was reduced.