MIS 441 Exam 1 – Study Guide, Practice Questions & Exam Review
*INTRO* - ✔✔----------------------------------
What you can learn from MIS 441? - ✔✔-Unique project experience on real-life large-scale data sets (RapidMiner).
-Cutting-edge business intelligence techniques for decision making.
-Insightful business values discovered from real data and businesses.
Learning Objectives - ✔✔-Knowledge of Business Intelligence, including tools and techniques
for gaining meaningful information from corporate and external data.
-Techniques include data cleaning and preparation, data mining and understanding of results.
-Use of RapidMiner to analyze both structured and unstructured data.
-Understanding of, and an appreciation for the complexities of, data mining unstructured data
such as text data including documents, web pages, emails, etc.
-Social network, mobile, and location-based analytics, and the importance of this analysis
-Recommender systems and the advantages for companies using them
New City, New Life - Findings and Interpretation - ✔✔*Less traffic, higher price*
-Not surprising since residential areas have lots of STOP signs and shorter/narrower road
segments.
-Plus heavy traffic makes living less enjoyable
--> thus lowers home values
*DATA MINING* - ✔✔----------------------------------
What is Data Mining? - ✔✔1. Data mining (knowledge discovery from data)
-Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful)
patterns or knowledge from huge amounts of data
2. Alternative names
-Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis,
data archeology, data dredging, information harvesting, *business intelligence*, etc.
3. Watch out: Is everything "data mining"? No, the following are not:
-(Deductive) query processing.
-Expert systems or small ML/statistical programs
What is Data Mining? (Real Example from NBA) - ✔✔-Play-by-play information recorded by teams
--> Who is on the court
--> Who shoots
--> Results
-Coaches want to know what works best
--> Plays that work well against a given team
--> Good/bad player matchups
-Advanced Scout (from IBM Research) is a data mining tool to answer these questions
Data Mining: Classification Schemes - ✔✔-General functionality
--> Descriptive data mining
--> Predictive data mining
-Different views, different classifications
--> Kinds of data to be mined
--> Kinds of knowledge to be discovered
--> Kinds of techniques utilized
--> Kinds of applications adapted
Why Data Mining? - ✔✔*Data analysis and decision support*
1. Market analysis and management
-Target marketing, customer relationship management (CRM), market basket analysis,
cross-selling, market segmentation
2. Risk analysis and management
-Forecasting, customer retention, improved underwriting, quality control, competitive analysis
3. Fraud detection and detection of unusual patterns (outliers)
Market Analysis and Management - ✔✔1. Where does the data come from?
-Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus
(public) lifestyle studies
2. Target marketing
-Find clusters of "model" customers who share the same characteristics: interest, income level,
spending habits, etc.
-Determine customer purchasing patterns over time
3. Cross-market analysis
-Associations/correlations between product sales, and prediction based on such associations
4. Customer profiling
-What types of customers buy what products (clustering or classification)
5. Customer requirement analysis
-Identifying the best products for different customers
-Predict what factors will attract new customers
6. Provision of summary information
-Multidimensional summary reports
-Statistical summary information (data central tendency and variation)
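As a small illustration of the statistical summary information in item 6, the sketch below computes two measures of central tendency (mean, median) and one measure of variation (standard deviation) with Python's standard library. The purchase amounts are made-up values for illustration, not course data.

```python
import statistics

# Hypothetical purchase amounts (in dollars) for one customer segment.
purchases = [12.0, 15.0, 15.0, 18.0, 40.0]

mean = statistics.mean(purchases)      # central tendency
median = statistics.median(purchases)  # central tendency, less sensitive to the 40.0 value
stdev = statistics.stdev(purchases)    # variation (sample standard deviation)

print(f"mean={mean:.2f} median={median:.2f} stdev={stdev:.2f}")
```

The gap between the mean and the median hints that one large purchase is pulling the average up, which is the kind of detail a summary report should surface.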
Market Basket Analysis (MBA) - ✔✔-algorithm that examines a long list of transactions in order
to determine which items are *most frequently purchased together*
-takes its name from the idea of a person in a supermarket throwing all of their items into a
shopping cart (a "market basket")
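The idea of finding items that are most frequently purchased together can be sketched by counting how often each pair of items shares a basket. This is a minimal illustration, not the algorithm RapidMiner uses; the transactions and the names `transactions` and `pair_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is one shopper's basket.
transactions = [
    {"pasta", "wine", "bread"},
    {"pasta", "wine"},
    {"pasta", "bread"},
    {"wine", "cheese"},
    {"pasta", "sauce"},
]

pair_counts = Counter()
for basket in transactions:
    # Sort the items so each pair is always counted under the same ordering.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Show the pairs that co-occur most often.
for pair, count in pair_counts.most_common(3):
    print(pair, count)
```

Counting every pair this way is fine for a handful of baskets; on large transaction lists, practical tools prune infrequent items first to keep the number of candidate pairs manageable.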
Market Basket Analysis Questions - ✔✔1. Who makes purchases?
2. What do customers buy together?
3. In what order do customers purchase items?
Benefits of Market Basket Analysis - ✔✔-A good indication of consumer behavior
-Increase in sales
-Improves customer satisfaction
-Tracks which types of products interest the consumer and finds related alternatives to
introduce to the consumer.
Association Rules for MBA - ✔✔1. Support
2. Confidence
-rules are *unidirectional*
-The left-hand side of a rule IMPLIES the right-hand side
-ex. Pasta IMPLIES Wine, but Wine IMPLIES Pasta may not hold
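To see why rules are unidirectional, the sketch below computes confidence in both directions on a few hypothetical baskets. It assumes the standard definition, which this guide does not spell out: confidence(LHS IMPLIES RHS) = support(LHS and RHS together) / support(LHS). The function names and data are illustrative only.

```python
# Hypothetical transactions; wine is bought more often than pasta.
transactions = [
    {"pasta", "wine"},
    {"pasta", "wine"},
    {"wine", "cheese"},
    {"wine"},
    {"pasta", "bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(lhs, rhs, baskets):
    """Assumed standard definition: support(lhs and rhs) / support(lhs)."""
    return support(lhs | rhs, baskets) / support(lhs, baskets)

pasta_to_wine = confidence({"pasta"}, {"wine"}, transactions)
wine_to_pasta = confidence({"wine"}, {"pasta"}, transactions)

print(f"Pasta IMPLIES Wine: {pasta_to_wine:.2f}")
print(f"Wine IMPLIES Pasta: {wine_to_pasta:.2f}")
```

Both rules share the same numerator (baskets with pasta and wine), but the denominators differ, so the two directions generally have different confidence.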
Support - ✔✔an indication of how frequently the item-set appears in the database
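Written as a formula, support is the fraction of transactions that contain the item-set. This is the standard definition; the symbols T (the set of all transactions) and X (the item-set) are notation introduced here, not taken from the course slides.

```latex
\[
\operatorname{supp}(X) = \frac{\lvert \{\, t \in T : X \subseteq t \,\} \rvert}{\lvert T \rvert}
\]
```

For example, with hypothetical counts, if {Pasta, Wine} appears in 2 of 5 transactions, its support is 2/5 = 0.4.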