BIG DATA
IN THIS STUDY UNIT
SHIFT TO BIG DATA
BENEFITS AND PROBLEMS OF BIG DATA
DATA ANALYSIS
1 Introduction
We are indeed in one of the fastest-changing technology phases in human history. The
world’s most valuable resource is no longer oil, but data. Big Data is a term for a
collection of data which is so large that it becomes difficult to store and process using
traditional databases and data-processing applications (ACC_CIMAKP_E3, 2021:482).
It describes data sets so large and varied they are beyond the capability of traditional
data processing (ACC_CIMAKP_E1, 2021:138). Companies are experiencing a rapid
growth in the volume of data. This data is sourced from different areas of the business
and includes, for example, transactional data and trillions of bytes of information about
customers, vendors, employees, and operational and production processes.
Big Data often also includes more than simply financial information and can involve other
organisational data (both internal and external), which is often unstructured. Data that
is fed into Big Data systems can include social network traffic, web server logs, traffic
flow information, satellite imagery, streamed audio content, banking transactions, web
page histories and content, government documentation, GPS tracking, telemetry from
vehicles, and financial market data.
2 The shift to Big Data
Traditionally, businesses gathered structured information on relevant issues from a
variety of sources and placed it into a database or data warehouse. As the world is
increasingly moving towards digitisation (and especially through the growth of the
internet), almost all information relating to the organisation and its environment can be
stored electronically. The amount of unstructured data generated by electronic
interactions has increased significantly, through e-mails, online shopping, text
messages, social media sites as well as various electronic devices (such as
smartphones), which gather and transmit data. In fact, it is estimated that around 90%
of the information in the world today has been created in the last few years
(ACC_CIMAKP_E3, 2021:482).
The amount of data which businesses must store and interrogate has increased at an
exponential rate, requiring new tools and techniques to make the most of it.
Leveraging this resource for visualisation, structure and the support of optimal
decision-making has become a commercialised privilege for many companies. Visualisation
tools like Power BI and SAP Lumira have become well used among companies with
substantial data lakes (repositories of data stored in its natural or raw format)
that want to sort and visualise that data in an easy and understandable way. The growth
in data also extends to the amount of personal data available to and used by
organisations. In this regard, the
privacy, sensitivity and security of data are significant considerations in modern
business. In the context of the organisation, privacy refers to all information that is
considered confidential and in need of protection from public disclosure. The Protection
of Personal Information Act (POPIA) provides for a general prohibition on the processing
of special personal information. Special personal information includes information
relating to the health, political persuasion, race or ethnic origin, or criminal behaviour of
the data subject. POPIA will be dealt with in study unit 15.
2.1 THE FEATURES OF BIG DATA
According to Gartner (2021), Big Data can be described using the “3Vs”: