Big Data Storage Solution | 2026 Update
with complete solutions
Q1. Which of the following best defines “big data”?
A) Data that is larger than 1 terabyte
B) Data characterized by high volume, velocity, variety,
veracity, and value
C) Data stored in a single Excel file
D) Data that cannot be processed by any computer
Answer: B
Rationale: The 5 V’s are a widely used definition: Volume (scale),
Velocity (speed of ingestion), Variety (structured, semi-structured,
unstructured), Veracity (uncertainty/quality), and Value (business
benefit). No fixed size threshold, such as 1 terabyte, defines big data.
Q2. A company collects streaming sensor data from 10,000 IoT
devices at 100 Hz. This is a challenge of:
A) Volume only
B) Velocity
C) Variety
D) Veracity
Answer: B
Rationale: A high data generation rate (100 records per second
per device, about 1,000,000 records per second across all 10,000
devices) imposes velocity requirements for ingestion and storage.
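The arithmetic behind this rationale can be checked with a short calculation. This is a rough sketch only: the device count and sample rate come from the question, while RECORD_BYTES is a hypothetical record size assumed here purely for illustration.

```python
# Back-of-the-envelope ingestion rate for the Q2 scenario.
DEVICES = 10_000        # from the question
SAMPLE_RATE_HZ = 100    # readings per second per device, from the question
RECORD_BYTES = 100      # ASSUMPTION: hypothetical average record size

records_per_second = DEVICES * SAMPLE_RATE_HZ
bytes_per_second = records_per_second * RECORD_BYTES
records_per_day = records_per_second * 86_400  # seconds in a day

print(f"{records_per_second:,} records/s")
print(f"{bytes_per_second / 1e6:.0f} MB/s at {RECORD_BYTES} bytes per record")
print(f"{records_per_day:,} records/day")
```

Under the assumed record size, the sustained write rate is about 100 MB/s, so the storage layer has to absorb writes continuously rather than in occasional batches.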
Q3. (Scenario) An e-commerce platform stores customer
clickstream logs (JSON), transaction records (relational), and
product images (JPEG). This illustrates which big data
characteristic?
A) Volume
B) Velocity
C) Variety
D) Veracity
Answer: C
Rationale: Multiple data formats (structured transaction records,
semi-structured JSON logs, unstructured JPEG images) require a
storage solution that handles variety.
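The classification in this rationale can be written out as a small lookup. This is an illustrative sketch: the dictionary names and the format labels are hypothetical and chosen here only to mirror the scenario.

```python
# Classify the three Q3 data sources by how much structure they carry.
STRUCTURE_BY_FORMAT = {
    "relational": "structured",   # fixed schema: transaction records
    "json": "semi-structured",    # self-describing, flexible schema: clickstream logs
    "jpeg": "unstructured",       # opaque binary content: product images
}

# Data sources named in the question, mapped to their formats.
sources = {
    "clickstream logs": "json",
    "transaction records": "relational",
    "product images": "jpeg",
}

categories = {name: STRUCTURE_BY_FORMAT[fmt] for name, fmt in sources.items()}
for name, category in categories.items():
    print(f"{name}: {category}")

# More than one structure category on the same platform is what "variety" means.
has_variety = len(set(categories.values())) > 1
```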
Q4. The “CAP theorem” states that a distributed data system can
provide at most two of:
A) Cost, Availability, Performance
B) Consistency, Availability, Partition tolerance
C) Compression, Access speed, Privacy
D) Capacity, Atomicity, Persistence
Answer: B
Rationale: The CAP theorem guides design trade-offs in distributed
systems; e.g., many NoSQL databases favor AP (availability +
partition tolerance) over strong consistency.
Q5. In a distributed storage system, “partition tolerance” means:
A) The system can be split into multiple parts arbitrarily
B) The system continues to operate despite network partitions
(message loss or delay between nodes)
C) Data is divided into fixed-size blocks
D) The system cannot tolerate any failures
Answer: B
Rationale: Partition tolerance is a requirement for large-scale
distributed systems because networks are unreliable: messages
between nodes can be lost or delayed at any time.
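The trade-off that a network partition forces can be illustrated with a toy model. This is a minimal sketch, not how any real database works: ToyCluster, Unavailable, and the two-replica layout are hypothetical names and simplifications introduced here. During a partition, the CP-style cluster rejects the write to stay consistent, while the AP-style cluster accepts it and lets the replicas diverge.

```python
class Unavailable(Exception):
    """Raised when a CP-style cluster refuses a request during a partition."""


class ToyCluster:
    """Two replicas, 'a' and 'b', with a switchable network partition.

    mode="CP": reject writes while partitioned (stay consistent).
    mode="AP": accept writes on the reachable replica (stay available).
    """

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.partitioned = False
        self.replicas = {"a": {}, "b": {}}

    def write(self, node, key, value):
        if not self.partitioned:
            # Healthy network: update every replica before acknowledging.
            for store in self.replicas.values():
                store[key] = value
        elif self.mode == "CP":
            raise Unavailable("cannot reach the other replica")
        else:
            # AP: acknowledge after updating only the local replica.
            self.replicas[node][key] = value

    def read(self, node, key):
        return self.replicas[node].get(key)


cp = ToyCluster("CP")
cp.write("a", "stock", 5)
cp.partitioned = True
try:
    cp.write("a", "stock", 4)
    cp_write_ok = True
except Unavailable:
    cp_write_ok = False

ap = ToyCluster("AP")
ap.write("a", "stock", 5)
ap.partitioned = True
ap.write("a", "stock", 4)

print(cp_write_ok, cp.read("a", "stock"), cp.read("b", "stock"))  # False 5 5
print(ap.read("a", "stock"), ap.read("b", "stock"))               # 4 5
```

Both clusters keep operating through the partition in some form. The difference is which guarantee is given up: the CP cluster sacrifices availability for writes, and the AP cluster sacrifices agreement between replicas until the partition heals.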
Q6. Which consistency model guarantees that once a write
completes, all subsequent reads (from any node) see that write?
A) Eventual consistency