
Big data fundamentals, explained in depth and in an accessible way

Pages
64
Uploaded on
17-04-2024
Written in
2023/2024

This document gives a broad overview of big data fundamentals and of the tools required to work with big data in practice. It is intended to help new learners.


Content preview

Data Science: Big Data
Data volumes have grown from kilobytes and terabytes to exabytes and zettabytes.
Big data is data that has high volume, variety, velocity, veracity, and value.
These are also known as the five V’s of big data.
Industries that use big data:
Banking, for transactional analysis
Healthcare, to help doctors work through patient diagnoses
Energy, general technology, consumer technology, and manufacturing
Big data alone is projected to create 11.5 million jobs by 2026.

Course Outline
1.Hadoop Architecture, Distributed Storage, and YARN
2.Introduction to Big Data and Hadoop
3.Data Ingestion into Big Data and ETL
4.Distributed Processing: MapReduce Framework and Pig
5.Apache Hive
6.NoSQL Databases: HBase
7.Basics of Functional Programming and Scala
8.Apache Spark: Next-Generation Big Data Framework
9.Processing RDDs
10.Spark SQL: Processing DataFrames
11.Stream Processing Frameworks and Spark Streaming
12.Spark GraphX
Project Highlights
Work with HDFS and MapReduce, ingest data using Flume, set up Kafka, and use Hive and HBase for your data storage.
In these projects you will use Hadoop features to predict patterns and share actionable insights for a car insurance company, and use Hive features for data engineering and analysis of New York Stock Exchange data.
You’ll also perform sentiment analysis on employee review data gathered from Google, Netflix, and Facebook, and carry out product and customer segmentation to increase the sales of Amazon.
Understand the concepts of Big Data
Explain Hadoop and how it addresses Big Data Challenges
Describe the concepts of Hadoop Ecosystem
*Introduction to Big Data
The traditional decision-making process is based on what we think, in other words on our perception of the task at hand.
1.Experience and intuition
Decisions are made based on past experiences and personal instincts.
2.Rule of thumb
Decisions are made based on preconceived guidelines rather than facts.
*Challenges of Traditional Decision Making
Takes a long time to arrive at a decision, thereby losing the competitive advantage.
Requires human intervention at various stages.
Lacks systematic linkage among strategy, planning, execution, and reporting.
Provides limited scope for data analytics; that is, it provides only a bird’s-eye view.
Obstructs the company’s ability to make fully informed decisions.
*The Solution: Big Data Analytics
Decision making is based on what you know, which in turn is based on data analytics.
It provides a comprehensive view of the overall picture, which is the result of analyzing data from various sources.
It provides streamlined decision making from top to bottom.
Big data analytics helps in analyzing unstructured data.
It helps in faster decision making, thus improving the competitive advantage.
*Case Study: Google’s Self-Driving Car
A Google car is equipped with numerous sensors that collect data on its surroundings in real time. According to one estimate, each car produces around one gigabyte of sensor data per second.
That would add up to about two petabytes of data per year per car, assuming the average car owner drives 600 hours in a year.
Of course, not all of this data is important or will have to be stored.
Currently, the servers needed to process this data are in the car’s trunk and require their own large cooling system.
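The two-petabyte figure above can be checked with a short calculation. The sketch below is only a back-of-the-envelope check: the function and constant names are made up for illustration, and it assumes decimal units (1 PB = 1,000,000 GB), which the notes do not specify.

```python
# Back-of-the-envelope check of the yearly sensor data volume.
GB_PER_SECOND = 1        # estimate quoted in the notes
HOURS_PER_YEAR = 600     # assumed yearly driving time, from the notes
SECONDS_PER_HOUR = 3600
GB_PER_PB = 1_000_000    # assumption: decimal units (1 PB = 10^6 GB)


def yearly_sensor_data_pb(gb_per_second=GB_PER_SECOND,
                          hours_per_year=HOURS_PER_YEAR):
    """Return the yearly sensor data volume in petabytes."""
    total_gb = gb_per_second * hours_per_year * SECONDS_PER_HOUR
    return total_gb / GB_PER_PB


print(yearly_sensor_data_pb())  # 2.16
```

Under these assumptions the result is 2.16 PB, which is consistent with the rounded figure of two petabytes per year per car.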
The data produced by the car can be classified in three ways.
1.Technical Data
Learning about avoiding obstacles such as cones or cyclists. This is the data that comes from the car’s sensors and is then analysed by the car’s machine learning algorithms.
2.Community Data
Crowdsourced data about traffic and driving conditions from Waze-like platforms.
3.Personal Data
The rider’s personal preferences regarding driving locations, indoor temperature, in-car entertainment, etc., which serve to improve the user experience.
Google’s autonomous cars collect all three kinds of data, but most of the processing power is used in consolidating technical and community data. Most of the data comes in real time, and the cars need to make split-second decisions based on it.
*Big Data Analytics Pipeline
1.Data ingestion layer
Data coming from various sources starts its journey here. Data is prioritized and categorized in this layer, which makes it flow smoothly through the further layers.
2.Data collector layer
This layer focuses on the transportation of data from the ingestion layer to the rest of the data pipeline. It is in this layer that components are decoupled so that analytic capabilities may begin.
3.Data processing layer
This layer specializes in the data pipeline’s processing system. Here the data is routed to different destinations and the data flow is classified; it is the first point where analytics may take place.
4.Data storage layer
Finding a storage solution becomes very important when the size of your data becomes very large. This layer focuses on where to store such a large dataset efficiently.
5.Query layer
This is where active analytical processing takes place. Here the primary focus is to gather the value from the data so that it is more helpful for the next layer.
6.Data visualization layer
The visualization, or presentation, tier is the most prestigious tier, where the data pipeline users may feel the value of the data.
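The way the layers described above hand data to one another can be sketched as a chain of small functions. This is only a toy illustration, not how any particular framework implements a pipeline: every function name, the sample records, the `value > 100` routing rule, and the in-memory dictionary standing in for a storage system are assumptions made for the example.

```python
from collections import defaultdict


def ingest(sources):
    """Ingestion: gather records from all sources, highest priority first."""
    records = []
    for source_name, items in sources.items():
        for item in items:
            records.append({"source": source_name, **item})
    return sorted(records, key=lambda r: r["priority"], reverse=True)


def collect(records):
    """Collector: pass records on one at a time, decoupling the layers."""
    yield from records


def process(records):
    """Processing: classify each record and route it to a destination."""
    routed = defaultdict(list)
    for record in records:
        # Hypothetical routing rule: large values go to "alerts".
        destination = "alerts" if record["value"] > 100 else "archive"
        routed[destination].append(record)
    return routed


def store(routed):
    """Storage: a plain dict stands in for a real distributed store."""
    return dict(routed)


def query(storage, destination):
    """Query: aggregate the stored values for one destination."""
    values = [r["value"] for r in storage.get(destination, [])]
    return {"count": len(values), "total": sum(values)}


def visualize(summary):
    """Visualization: render the summary as a line of text."""
    return f"{summary['count']} records, total value {summary['total']}"


sources = {
    "sensor_a": [{"value": 150, "priority": 2}, {"value": 20, "priority": 1}],
    "sensor_b": [{"value": 300, "priority": 3}],
}
storage = store(process(collect(ingest(sources))))
report = visualize(query(storage, "alerts"))
print(report)  # 2 records, total value 450
```

Keeping each layer as a separate function mirrors the decoupling mentioned for the collector layer: any one stage could be swapped for a real tool without changing the others.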

What is big data?
Big Data refers to extremely large data sets that may be
analysed computationally to reveal patterns, trends, and
associations, especially relating to human behaviour and
interactions.

*Big Data at a Glance
