Summary Data Engineering (MSc Marketing Analytics & Data Science)

Pages
21
Uploaded on
28-01-2023
Written in
2022/2023

Summary of Data Engineering (EBM213A05). The professor recommends using this summary. It covers all mandatory material for the exam.

Summary Data Engineering
Week 1

Different steps to structure the business challenge
1. define managerial/research dilemma
2. define managerial question
3. define research question
4. refine research questions (opportunity tree is used here)
5. define research proposal

What is a management/research dilemma?
- Usually a symptom of an underlying problem
- Usually not hard to identify
- Research dilemma may be either a problem or an opportunity. At this stage you may even have identified
symptoms rather than problems or opportunities.

What is a management question?
- Management dilemma, restated in question form
- Defined in terms of the underlying problem
- Preferably clearly linked to an important KPI (key performance indicator)
- Still managerial in nature; does not specify the research that needs to be done; questions are still broad
(further discussion is needed)

What is the difference between a top-down and a bottom-up approach?
- Top-down approach: with a data warehouse, incoming data is cleaned and organized into a single
consistent schema before being put into the warehouse.
- Analysis is done directly on the curated warehouse data.
- Pro: Consistency, consensus, and shared best practices
- Con: Lack of domain knowledge and responsiveness

- Bottom-up approach: with a data lake, incoming data goes into the lake in its raw form.
- We select and organize data for each need.
- Pro: Autonomy, agility, innovation, domain expertise
- Con: Lack of: management, consensus, analytical models, governance
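
The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under assumptions: the incoming records, their field names, and the `clean` rule are hypothetical and not taken from the course material. The warehouse path forces every record into one schema before storing it; the lake path stores the raw record and applies a structure only when a specific question is asked.

```python
import json

# Hypothetical incoming records with inconsistent shapes (illustrative only).
incoming = [
    {"customer": " Anna ", "amount": "12.50", "channel": "web"},
    {"customer": "Ben", "amount": "7", "note": "gift"},
]


def clean(record):
    """Top-down: force every record into one consistent schema on the way in."""
    return {
        "customer": record["customer"].strip(),
        "amount": float(record["amount"]),
    }


# Data warehouse: only cleaned, schema-conforming rows are stored.
warehouse = [clean(r) for r in incoming]

# Data lake: every record is stored untouched, in its raw form.
lake = [json.dumps(r) for r in incoming]


def amounts_by_channel(raw_lines):
    """Bottom-up: select and organize raw data for one specific need."""
    totals = {}
    for line in raw_lines:
        record = json.loads(line)
        channel = record.get("channel", "unknown")
        totals[channel] = totals.get(channel, 0.0) + float(record["amount"])
    return totals


# Analysis on the warehouse uses the curated rows directly.
warehouse_total = sum(row["amount"] for row in warehouse)

# Analysis on the lake has to interpret the raw data first.
lake_totals = amounts_by_channel(lake)
```

Note that the warehouse dropped the `channel` and `note` fields because they were not part of its schema, while the lake still has them available for a later question.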

What are research questions? And how do they differ from management questions?
- To find the research question, you have to think about possible management actions to solve the dilemma.
- RQ: asks what research should be conducted; information oriented
- MQ: asks what the decision maker needs to do; action oriented

What are the three domains that data science consists of? (guest lecture)

1. Domain Expertise (business): This domain involves having a deep
understanding of the subject matter that the data relates to, in
order to effectively analyze and interpret it. This can also be
considered as Business Intelligence, which is the ability to use data
and analysis to drive business decisions.

2. Technical data engineering: This domain involves using
programming and database management skills to extract, clean,
and organize large sets of data. This includes Data Management,
Data Integration, Data Governance, and Data Quality.

3. Math & Statistical knowledge: This domain involves using statistical
techniques and algorithms to analyze and understand patterns in
data. This includes statistical modeling, probability theory,
optimization, and hypothesis testing.

Explain the opportunity tree and its five components
- From research question to analysis questions
- Sub-questions & factors: who, what, where, when, why
o Use the five W’s for the sub-questions & factors
- Opportunity tree:
1. Business challenge (MQ)
2. Sub-business challenge (sub-MQ’s)
3. Sub-questions (RQ; who, what, where, when, why, which)
4. Factors
5. Hypotheses
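
The five levels of the opportunity tree can be represented as a simple nested structure. The sketch below is a hedged illustration: the `Node` class, the `walk` helper, and the churn example are hypothetical, and only the five level names come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    level: str  # one of the five components listed above
    text: str
    children: list = field(default_factory=list)


# The five levels, from the root of the tree down to the leaves.
LEVELS = [
    "Business challenge",
    "Sub-business challenge",
    "Sub-question",
    "Factor",
    "Hypothesis",
]

# Hypothetical example content; only the level names come from the summary.
tree = Node("Business challenge", "How can we reduce customer churn?", [
    Node("Sub-business challenge", "How can we retain new subscribers?", [
        Node("Sub-question", "Who cancels within the first three months?", [
            Node("Factor", "Onboarding experience", [
                Node("Hypothesis", "Customers who skip onboarding churn more often."),
            ]),
        ]),
    ]),
])


def walk(node, depth=0):
    """Yield (depth, level, text) for every node, top to bottom."""
    yield depth, node.level, node.text
    for child in node.children:
        yield from walk(child, depth + 1)


# Print the tree with one indentation step per level.
for depth, level, text in walk(tree):
    print("  " * depth + f"{level}: {text}")
```

In a real tree each node would usually have several children, so one business challenge fans out into many testable hypotheses.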

What are the four steps discussed when defining a problem? (HBR article)
Step 1: Establish the need for a solution
- What is the basic need?
- What is the desired outcome?
- Who stands to benefit and why?
Step 2: Justify the need
- Is the effort aligned with our strategy?
- What are the desired benefits for the company, and how will we measure them?
- How will we ensure that a solution is implemented?
Step 3: Contextualize the problem
- What approaches have we tried?
- What have others tried?
- What are the internal and external constraints on implementing a solution?
Step 4: Write the problem statement
- Is the problem actually many problems?
- What requirements must a solution meet?
- Which problem solvers should we engage?
- What information and language should the problem statement include?

Week 2

Name the four components of the ‘Data Science Value Creation Model’ and explain how value for a firm is
created by using data science

1. Value objectives (V2F, V2C)
2. Data assets
3. Analytics
4. Value creation
- Capabilities

This model starts with value objectives that have to be set before developing a data science strategy. The core
data science strategy elements are data assets and analytics, which then should lead to value creation. The data
science strategy should be enabled by data science capabilities.

Why can’t you just use Excel/a spreadsheet for all that data?
- Excel has a limited number of rows
- Efficiency issues due to duplicates
- Storage space is limited

What are the two perspectives on value creation?
1. Value to the customer (V2C)
2. Value to the firm (V2F)

What is a database?
A database is a collection of information that is organized so that it can easily be accessed, managed, and updated.

Explain how to balance both value creation perspectives on one dimension each (four cases)




What are fields and records?
A database consists of multiple tables: ‘spreadsheets’ with columns (fields) and rows (records)
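
A minimal sketch of these terms using Python's built-in `sqlite3` module. The `customers` table and its contents are hypothetical; the point is only to show that the columns of a table are its fields, the rows are its records, and that the data can be accessed and updated.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# One table with three fields (columns): id, name, city.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Two records (rows), one value per field.
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Anna", "Groningen"), (2, "Ben", "Utrecht")],
)

# Update a single field of a single record.
conn.execute("UPDATE customers SET city = ? WHERE id = ?", ("Zwolle", 2))

# Access the data: field names come from the cursor, records from the result set.
cursor = conn.execute("SELECT id, name, city FROM customers ORDER BY id")
fields = [column[0] for column in cursor.description]
records = cursor.fetchall()

conn.close()
```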

What are the three characteristics of big data?
Big data itself has also changed the data landscape. Big data has specific characteristics known as the 3Vs of big data,
posing specific challenges for researchers and managers.
1. Increasing data Volume
2. Increasing data Velocity
3. Increasing data Variety

SDaan99 Rijksuniversiteit Groningen