Summary

Clear summary including images

Pages: 21
Uploaded on: 15-12-2021
Written in: 2021/2022

The summary was kept up to date weekly during the study period, so all of the relevant theory is included.

Preview of the content

Strategy Analytics

Chapter 1. Introduction: Data-Analytic Thinking

Information is now widely available on external events such as market trends, industry news, and
competitors’ movements. This broad availability of data has led to increasing interest in methods for
extracting useful information and knowledge from data, which is the realm of data science.

Probably the widest applications of data-mining techniques are in marketing for tasks such as
targeted marketing, online advertising, and recommendations for cross-selling.

Data mining is used for general customer relationship management to analyze customer behavior in
order to manage attrition and maximize expected customer value.

At a high level, data science is a set of fundamental principles that guide the extraction of knowledge
from data. Data mining is the extraction of knowledge from data, via technologies that incorporate
these principles.

Example: Hurricane

Rather than predicting the obvious, it would be more valuable to discover patterns due to the
hurricane that were not obvious. To do this, analysts might examine the huge volume of Wal-Mart data
from prior, similar situations (such as Hurricane Charley) to identify unusual local demand for
products. From such patterns, the company might be able to anticipate unusual demand for products
and rush stock to the stores ahead of the hurricane’s landfall.

Research on data-driven decision-making (DDD) shows that, statistically, the more data-driven a firm
is, the more productive it is, even after controlling for a wide range of possible confounding
factors. The differences are not small: one standard deviation higher on the DDD scale is associated
with a 4%–6% increase in productivity. DDD is also correlated with higher return on assets, return on
equity, asset utilization, and market value, and the relationship seems to be causal.

The decisions of interest in this book mainly fall into two types: (1) decisions for which
“discoveries” need to be made within data, and (2) decisions that repeat, especially at massive
scale, so that decision-making can benefit from even small increases in accuracy based on data
analysis.

A predictive model abstracts away most of the complexity of the world, focusing on a particular set
of indicators that correlate in some way with a quantity of interest (who will churn, who will
purchase, who is pregnant, etc.).

Big data essentially means datasets that are too large for traditional data processing systems, and
therefore require new processing technologies.

Occasionally, big data technologies are actually used for implementing data mining techniques.
However, much more often the well-known big data technologies are used for data processing in
support of the data mining techniques and other data science activities.

Big Data 1.0: Firms are busy building the capabilities to process large data, largely in support of
their current operations, for example to improve efficiency.

In Web 1.0, businesses busied themselves with getting the basic internet technologies in place, so
that they could establish a web presence, build electronic commerce capability, and improve the
efficiency of their operations.

In Web 2.0, new systems and companies began taking advantage of the interactive nature of the Web.

Big Data 2.0: Once firms have become capable of processing massive data in a flexible fashion, they
should begin asking: “What can I now do that I couldn’t do before, or do better than I could do
before?” This is likely to be the golden era of data science.

The prior sections suggest one of the fundamental principles of data science: data, and the capability
to extract useful knowledge from data, should be regarded as key strategic assets.

Thinking of data and data science capability as assets should lead us to the realization that they
are complementary.

Sociodemographic data provide a substantial ability to model the sort of consumers that are more
likely to purchase one product or another (as in the Capital One case).

Fundamental concept: Extracting useful knowledge from data to solve business problems can be
treated systematically by following a process with reasonably well-defined stages. The Cross Industry
Standard Process for Data Mining, abbreviated CRISP-DM (CRISP-DM Project, 2000), is one
codification of this process. Keeping such a process in mind provides a framework to structure our
thinking about data analytics problems.

Fundamental concept: From a large mass of data, information technology can be used to find
informative descriptive attributes of entities of interest.

Fundamental concept: If you look too hard at a set of data, you will find something, but it might not
generalize beyond the data you’re looking at. This is referred to as overfitting a dataset.
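
The overfitting idea can be made concrete with a small sketch. It assumes made-up data in which the
label is pure noise, and a deliberately naive "model" that memorizes every training example; the
names make_examples, lookup, and predict are hypothetical and not from the book. The model looks
perfect on the data it memorized but does no better than chance on data it has not seen.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def make_examples(n):
    """Hypothetical data: two random features and a label that is pure noise."""
    return [((random.random(), random.random()), random.choice([0, 1]))
            for _ in range(n)]

train = make_examples(200)
holdout = make_examples(1000)

# "Looking too hard": memorize every training example exactly.
lookup = {features: label for features, label in train}

# Fallback for examples never seen in training: the most common training label.
ones = sum(label for _, label in train)
majority = 1 if ones * 2 >= len(train) else 0

def predict(features):
    return lookup.get(features, majority)

def accuracy(examples):
    hits = sum(predict(features) == label for features, label in examples)
    return hits / len(examples)

train_accuracy = accuracy(train)      # perfect, because everything was memorized
holdout_accuracy = accuracy(holdout)  # near chance, because nothing real was learned
print(f"training accuracy: {train_accuracy:.2f}")
print(f"holdout accuracy:  {holdout_accuracy:.2f}")
```

The gap between the two numbers is the signal to watch for: a pattern that only holds on the data
it was found in has not generalized.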

Fundamental concept: Formulating data mining solutions and evaluating the results involves thinking
carefully about the context in which they will be used.

Chapter 2. Business Problems and Data Science Solutions

Data mining is a process with fairly well-understood stages.

A critical skill in data science is the ability to decompose a data-analytics problem into pieces such
that each piece matches a known task for which tools are available.

Tasks:

1. Classification and class probability estimation attempt to predict, for each individual in a
population, which of a (small) set of classes this individual belongs to. Usually, the classes are
mutually exclusive. An example classification question would be: “Among all the customers
of MegaTelCo, which are likely to respond to a given offer?” In this example the two classes
could be called will respond and will not respond. (Whether something will happen).

2. Regression (“value estimation”) attempts to estimate or predict, for each individual, the
numerical value of some variable for that individual. An example regression question would
be: “How much will a given customer use the service?” The property (variable) to be
predicted here is service usage. (How much something will happen).
3. Similarity matching attempts to identify similar individuals based on data known about them.
Similarity matching can be used directly to find similar entities.
4. Clustering attempts to group individuals in a population together by their similarity, but not
driven by any specific purpose. An example clustering question would be: “Do our customers
form natural groups or segments?”
5. Co-occurrence grouping (also known as frequent itemset mining, association rule discovery,
and market-basket analysis) attempts to find associations between entities based on
transactions involving them.
6. Profiling (also known as behavior description) attempts to characterize the typical behavior
of an individual, group, or population. An example profiling question would be: “What is the
typical cell phone usage of this customer segment?”
7. Link prediction attempts to predict connections between data items, usually by suggesting
that a link should exist, and possibly also estimating the strength of the link. Link prediction is
common in social networking systems: “Since you and Karen share 10 friends, maybe you’d
like to be Karen’s friend?”
8. Data reduction attempts to take a large set of data and replace it with a smaller set of data
that contains much of the important information in the larger set. The smaller dataset may
be easier to deal with or to process.
9. Causal modeling attempts to help us understand what events or actions actually influence
others. For example, consider that we use predictive modeling to target advertisements to
consumers, and we observe that indeed the targeted consumers purchase at a higher rate
subsequent to having been targeted. Was this because the advertisements influenced the
consumers to purchase?
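
As an illustration of one task from the list above, co-occurrence grouping (task 5), here is a
minimal sketch that counts how often pairs of items appear in the same transaction. The
transactions, the pair_support helper, and the 0.3 support threshold are all assumptions made up for
this example; real market-basket analysis typically uses dedicated algorithms and larger itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each is the set of items bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "chips"},
    {"bread", "butter"},
    {"beer", "chips", "milk"},
]

def pair_support(baskets):
    """Return, for each item pair, the fraction of baskets containing both items."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

support = pair_support(transactions)

# Keep only pairs that co-occur in at least 30% of baskets (arbitrary cutoff).
frequent_pairs = sorted(pair for pair, s in support.items() if s >= 0.3)
print(frequent_pairs)
```

Pairs that clear the threshold are candidates for associations such as "customers who buy bread
also tend to buy milk".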

Supervised vs. unsupervised methods

When there is no target, the data mining problem is referred to as unsupervised.

The learner would be given no information about the purpose of the learning but would be left to
form its own conclusions about what the examples have in common.

A supervised technique is given a specific purpose for the grouping—predicting the target. Clustering,
an unsupervised task, produces groupings based on similarities, but there is no guarantee that these
similarities are meaningful or will be useful for any particular purpose.

The value for the target variable for an individual is often called the individual’s label, emphasizing
that often (not always) one must incur expense to actively label the data.

Classification, regression, and causal modeling generally are solved with supervised methods.
Similarity matching, link prediction, and data reduction could be either. Clustering, co-occurrence
grouping, and profiling generally are unsupervised.
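
The difference between the two settings can be sketched on one small dataset. Everything here is an
assumption for illustration: the usage figures, the churn labels, the nearest-class-mean classifier,
and the tiny two_means clustering loop are stand-ins, not methods prescribed by the book. The
supervised part uses the target (the label); the unsupervised part never sees it.

```python
# Hypothetical customers: monthly usage in minutes.
usage = [30, 45, 50, 60, 300, 320, 350, 400]
# Target variable (label), available only in the supervised setting:
# 1 = churned, 0 = stayed. These labels are made up for illustration.
churned = [1, 1, 0, 1, 0, 0, 1, 0]

def mean(values):
    return sum(values) / len(values)

# Supervised: use the labels to learn one mean usage per class,
# then predict the class whose mean is closest to a new customer's usage.
class_means = {
    label: mean([u for u, c in zip(usage, churned) if c == label])
    for label in (0, 1)
}

def predict_churn(u):
    return min(class_means, key=lambda label: abs(u - class_means[label]))

# Unsupervised: no labels; group customers purely by similarity of usage.
def two_means(values, iterations=10):
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups

clusters = two_means(usage)
print(predict_churn(40), predict_churn(380))
print(clusters)
```

Note that the two clusters split customers into low and high usage, which does not line up exactly
with the churn labels: the grouping is driven by similarity alone, not by the target.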

Important distinction pertaining to mining data: the difference between (1) mining the data to find
patterns and build models, and (2) using the results of data mining.
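
A minimal sketch of this distinction, under stated assumptions: the historical data, the simple
tenure-cutoff rule, and the function names mine_threshold and use_model are all hypothetical. The
first function stands for the mining phase (finding a pattern in historical data); the second
stands for the use phase (applying the mined result to a new case).

```python
# Hypothetical historical data: (months_as_customer, churned) pairs.
history = [(2, True), (3, True), (5, True), (8, False),
           (12, False), (18, False), (24, False)]

def mine_threshold(examples):
    """Mining phase: find the tenure cutoff that best separates churners.

    The learned rule is "predict churn if tenure < cutoff".
    """
    best_cutoff, best_correct = None, -1
    for cutoff in sorted({tenure for tenure, _ in examples}):
        correct = sum((tenure < cutoff) == churned
                      for tenure, churned in examples)
        if correct > best_correct:
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff

def use_model(cutoff, tenure):
    """Use phase: apply the already-mined rule to a new customer."""
    return tenure < cutoff

cutoff = mine_threshold(history)            # done once, on historical data
new_customer_churns = use_model(cutoff, 4)  # done repeatedly, on new data
print(cutoff, new_customer_churns)
```

Keeping the two phases in separate functions mirrors the point in the text: the model is built
once from past data and then applied many times to cases that were not part of the mining.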

Document information

Whole book summarized? No
Which chapters of the book are summarized? 1-14
Uploaded on: 15 December 2021
Number of pages: 21
Written in: 2021/2022
Type: Summary
Seller: CFMdejong, Avans Hogeschool