20MA2E04 STATISTICS FOR BIG DATA ANALYTICS 3003
COURSE OUTCOMES
On successful completion of the course, the students will be able to
CO1: Understand the foundations of big data and Essentials of Statistics.
CO2: Use advanced statistical methods to solve problems in big data analytics.
CO3: Analyse the data using Testing of Hypothesis.
CO4: Analyse the variance using ANOVA.
DATA SCIENCE AND BIG DATA ANALYTICS 10
Classification of Digital Data, Structured and Unstructured Data – Introduction to Big Data:
Characteristics – Evolution – Applications of Data – Data Warehouse and Hadoop Environment. Big
Data Analytics: Classification of Analytics – Challenges – Data Science Terminologies used in Big Data
Environments – Soft State Eventual Consistency – Top Analytics Tools.
STATISTICAL ANALYTICS 17
Introduction to Machine Learning, Data Mining, Text Mining. Naïve Bayesian classifier, decision tree
classifier – categorization using K-means clustering and association rules – predictive modelling using
Linear, Multiple and Logistic Regression.
HYPOTHESIS TESTING 18
Sampling Techniques – Estimation – Parametric Test – Z Test – Single Mean and Difference of Means –
t Test – F Test – Chi-Square Test – Non-Parametric Test – Kruskal-Wallis Test – Mann-Whitney U Test.
Design of Experiments, ANOVA – One-Way and Two-Way classification.
Total Periods: 45
TEXT BOOKS
1. Seema Acharya, Subhashini Chellappan, “Big Data and Analytics”, Wiley Publications, Fourth
Edition, 2020.
2. Sheldon M. Ross, “Probability and Statistics for Engineers and Scientists”, Elsevier India Private
Ltd, 6th Edition, 2020.
REFERENCE BOOKS
1. Neil A. Weiss, “Introductory Statistics”, Pearson Education, 10th Edition, 2017.
2. Ronald E. Walpole, Raymond H. Myers and Sharon L. Myers, “Probability & Statistics for
Engineers and Scientists”, Prentice Hall of India Private Ltd, 9th Edition, 2017.
3. Morris H. DeGroot, Mark J. Schervish, “Probability and Statistics”, Addison-Wesley, 4th Edition, 2015.
4. Michael Berthold, David J. Hand, “Intelligent Data Analysis”, Springer, 2007.
WEB REFERENCES
1. https://www.coursera.org/lecture/big-data-language-1/1-1-introduction-to-big-data-and-language-6tLnz
2. https://www.coursera.org/learn/big-data-introduction
3. https://nptel.ac.in/courses/106/104/106104189/
4. https://nptel.ac.in/courses/106/106/106106142/