Exam (elaborations)

CMPE172 Final Exam questions and answers with complete solutions verified graded a++ latest update

Pages: 45
Grade: A+
Uploaded on: 07-07-2025
Written in: 2024/2025

7/7/25, 5:44 PM

CMPE172 Final Exam questions and answers with complete solutions verified graded a++ latest update

Terms in this set (137)




Big Data
The major problems faced by _____ fall under three V's:
-Volume: Facebook generates 500 TB of data every day. Twitter generates 8 TB of data daily.
-Velocity: Need for a framework that is capable of high-speed data computations.
-Variety: Data from various sources comes in varied formats.

Big Data architecture
A _____ requires three components:
-A scalable and available storage mechanism, such as a distributed filesystem or database
-A distributed compute engine, for processing and querying the data at scale
-Tools to manage the resources and services used to implement these systems

Big data systems
_____ come in two general forms:
-NoSQL databases that integrate these components into a database system
-Environments like Hadoop

Source: https://quizlet.com/504905432/cmpe172-final-flash-cards/

persistence
In Big Data, data is ingested into the _____ tier: HDFS (Hadoop Distributed File System), AWS S3, SQL and NoSQL databases.
-Flume is used for log aggregation and Sqoop for interoperating with databases.

Hadoop
_____ is a technology for storing massive datasets on a cluster of cheap machines in a distributed manner. It is a registered trademark of the Apache Software Foundation. It splits files into large blocks and distributes them across the nodes of the cluster, then transfers code to the nodes to process the data in parallel. Datasets are processed faster and more efficiently.
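The split-then-process-in-parallel idea in this card can be sketched in a few lines of Python. This is only an illustration on one machine, not how Hadoop itself is implemented: the names `split_into_blocks`, `count_newlines` and `process_in_parallel` are hypothetical, the tiny `BLOCK_SIZE` is an assumption chosen for readability (real HDFS blocks are far larger), and a thread pool stands in for the cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: a tiny block size so the example is easy to follow.
BLOCK_SIZE = 16  # bytes


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut the input into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def count_newlines(block: bytes) -> int:
    """The 'code' sent to each worker: count line endings in one block."""
    return block.count(b"\n")


def process_in_parallel(data: bytes, block_size: int = BLOCK_SIZE, workers: int = 4) -> int:
    """Process every block independently, then combine the partial results."""
    blocks = split_into_blocks(data, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_newlines, blocks))
    return sum(partial_counts)


if __name__ == "__main__":
    sample = b"a\nb\nc\n" * 10
    print(process_in_parallel(sample))
```

Each block is handled without reference to any other block, which is what lets the work spread across many machines in a real cluster.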

Hadoop
_____ consists of three core components:
◦ Hadoop Distributed File System (HDFS) - It is the storage layer of Hadoop.
◦ Map-Reduce - It is the data processing layer of Hadoop.
◦ YARN - It is the resource management layer of Hadoop.

Hadoop
_____ utilizes a simple programming model to perform the required operations across clusters.
All modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be dealt with by the framework.
It runs the application using the MapReduce algorithm.

MapReduce
_____ algorithm:
◦ Data is processed in parallel on different CPU nodes.
◦ Capable of running on clusters of computers.
◦ Can perform a complete statistical analysis on a huge amount of data.

Map Reduce
-It is the data processing layer of Hadoop.
-Data is processed in two phases:
Map Phase - This phase applies business logic to the data. The input data gets converted into key-value pairs.
Reduce Phase - The Reduce phase takes as input the output of the Map phase. It applies aggregation based on the key of the key-value pairs.
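The two phases described above can be illustrated with the classic word-count example. This is a minimal single-process sketch in plain Python, not the Hadoop MapReduce API: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, and the explicit `shuffle` step is an assumption added to show how map output gets grouped by key before reduction.

```python
from collections import defaultdict


def map_phase(line: str) -> list:
    """Map: turn one input line into (key, value) pairs, here (word, 1)."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: list) -> dict:
    """Group all values that share a key, so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list) -> tuple:
    """Reduce: aggregate the values for one key, here by summing them."""
    return key, sum(values)


def word_count(lines: list) -> dict:
    """Run map over every line, group by key, then reduce each group."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big systems"]))
```

In a real cluster the map calls run on different nodes and the grouped keys are partitioned across several reducers; the data flow is the same.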

HDFS
Hadoop Distributed File System (_____) - It is the storage layer of Hadoop.
-The master is a high-end machine where metadata is stored.
-Slaves are inexpensive computers.
-Big Data files are divided into a number of blocks. Hadoop stores these blocks in a distributed fashion on the cluster of slave nodes.
-HDFS has two daemons running:
NameNode: Responsible for maintaining, monitoring and managing DataNodes. Records the metadata of the files, such as the location of blocks, file size, permissions, hierarchy, etc.
DataNode: Runs on the slave machine. Stores the actual business data. Serves read and write requests from the user.
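The division of labour between the two daemons can be sketched as a toy in-memory model. This is a deliberately simplified illustration, not the HDFS client API: the classes `NameNode` and `DataNode` and their methods `put`, `get`, `write` and `read` are hypothetical, block placement is plain round-robin, replication is omitted, and the NameNode relays the data itself only to keep the example short.

```python
class DataNode:
    """Stores the actual block contents."""

    def __init__(self, name: str):
        self.name = name
        self.blocks = {}  # block_id -> bytes

    def write(self, block_id: str, data: bytes) -> None:
        self.blocks[block_id] = data

    def read(self, block_id: str) -> bytes:
        return self.blocks[block_id]


class NameNode:
    """Keeps only metadata: which blocks make up a file and where they live."""

    def __init__(self, datanodes: list, block_size: int = 8):
        self.datanodes = datanodes
        self.block_size = block_size
        self.metadata = {}  # filename -> list of (block_id, datanode index)

    def put(self, filename: str, data: bytes) -> None:
        placements = []
        for n, start in enumerate(range(0, len(data), self.block_size)):
            block_id = f"{filename}#{n}"
            node_index = n % len(self.datanodes)  # assumption: round-robin placement
            block = data[start:start + self.block_size]
            self.datanodes[node_index].write(block_id, block)
            placements.append((block_id, node_index))
        self.metadata[filename] = placements

    def get(self, filename: str) -> bytes:
        # Look up block locations in the metadata, then read each block in order.
        return b"".join(
            self.datanodes[node_index].read(block_id)
            for block_id, node_index in self.metadata[filename]
        )


if __name__ == "__main__":
    nodes = [DataNode("dn0"), DataNode("dn1")]
    namenode = NameNode(nodes, block_size=4)
    namenode.put("f.txt", b"hello world")
    print(namenode.metadata["f.txt"])
    print(namenode.get("f.txt"))
```

The point of the sketch is that the file contents live only in the `DataNode` objects, while the `NameNode` holds just the lookup table needed to reassemble them.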

____ - Yet Another Resource Negotiator:
Seller: NurseAdvocate, Chamberlain College of Nursing