Abstract
Big data is the term used for processing large data sets to reveal hidden
patterns in the data. Big data has played a major role in the progress of
businesses as well as scientific research, but its increased use conflicts with
data privacy and security. The goal of this report is to study big data and
discuss the privacy and security risks attached to it. The report also provides
an overview of privacy preservation mechanisms in big data, the challenges in
current methods, and how they can be made more secure. We will also focus on
the limitations of big data and its feasibility. Finally, the report looks into
the increasing demand for big data and how important it will be in the future.
Big Data
Over the last decade, the volume and variety of data produced by and about
individuals have increased manyfold. The main drivers of this increase are the
internet, social media, and the web. This growth has led to big data analytics,
which processes the available data to generate meaningful insight. Big data
refers to data sets so large and complex that traditional data processing
algorithms are insufficient. It consists of a large amount of both structured
and unstructured data that inundates a business day-to-day (Jain et al., 2016).
Big data involves diverse information and data sets, including images, text,
audio, and video.
The characteristics of big data are commonly described by the 5 V’s: Velocity,
Volume, Variety, Veracity, and Value. Together these have a great impact on the
outcome of data analysis. Let's look at each of the 5 V’s and the part it plays
in big data:
1. Velocity
This refers to the high speed at which data is created, collected, stored,
processed, and analyzed.
2. Variety
Variety is another important aspect of big data. Because we collect a huge
amount of data of different types, our data includes structured, unstructured,
and raw data, which can be difficult to place in databases. For good results,
we must therefore know the data category.
3. Veracity
Veracity is crucial because it concerns how clean the data is. When gathering
big data sets, you might collect dirty data, and so the results and accuracy of
the analysis depend on data veracity.
4. Volume
This refers to the large amount of data collected; huge data volume is a
defining part of big data. It is also crucial because most companies fail to
process huge data sets.
5. Value
There is great potential value in the use of big data, but only if it yields a
return on investment, that is, generated value that benefits the company. If
the data has no value, storing it in company infrastructure is costly and
time-consuming (Ishwarappa and Anuradha, 2015).
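
To make the Veracity point more concrete, the following is a minimal Python sketch of how the share of clean records in a data set might be estimated. The function name veracity_report, the sample records, and the two "dirty data" rules (a missing required field, or an exact duplicate) are assumptions chosen for illustration; they are not taken from the cited sources, and real pipelines typically apply many more checks.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def veracity_report(records: List[Record], required: List[str]) -> Dict[str, float]:
    """Count incomplete and duplicated records (hypothetical 'dirty' rules)."""
    seen = set()
    incomplete = 0
    duplicates = 0
    for rec in records:
        # A record is incomplete if any required field is missing or empty.
        if any(rec.get(field) in (None, "") for field in required):
            incomplete += 1
            continue
        # A complete record is a duplicate if the same required values
        # were already seen (values are assumed to be hashable scalars).
        key = tuple(rec[field] for field in required)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    total = len(records)
    clean = total - incomplete - duplicates
    return {
        "total": total,
        "incomplete": incomplete,
        "duplicates": duplicates,
        "clean_ratio": clean / total if total else 0.0,
    }


# Hypothetical sample data: one incomplete record and one duplicate.
sample = [
    {"user": "a", "age": 31},
    {"user": "b", "age": None},
    {"user": "a", "age": 31},
    {"user": "c", "age": 45},
]
report = veracity_report(sample, ["user", "age"])
print(report)
```

A low clean_ratio would suggest that the results of any analysis on this data should be treated with caution until the data is cleaned.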