Chapter 2 Slides (A+ 100% Correct
Solutions)
knowledge - ANS-Understanding, awareness, or familiarity acquired through education
or experience; anything that has been learned, perceived, discovered, inferred, or
understood; the ability to use information. In a knowledge management system,
knowledge is information in action.
data quality - ANS-The holistic quality of data, including their accuracy, precision,
completeness, and relevance.
analytics ready - ANS-A state of preparedness for analytics projects, especially as it
relates to data acquisition and preparedness.
Data source reliability - ANS-refers to the originality and appropriateness of the storage
medium where the data is obtained, answering the question of "Do we have the right
confidence and belief in this data source?"
Data content accuracy - ANS-means that data are correct and are a good match for the
analytics problem, answering the question of "Do we have the right data for the job?"
The data should represent what was intended or defined by the original source of the
data.
Data accessibility - ANS-means that the data are easily and readily
obtainable, answering the question of "Can we easily get to the data when we need to?"
Data security and data privacy - ANS-means that the data is secured to only allow those
people who have the authority and the need to access it and to prevent anyone else
from reaching it.
Data richness - ANS-means that all the required data elements are included in the data
set. In essence, richness (or comprehensiveness) means that the available variables
portray a rich enough dimensionality of the underlying subject matter for an accurate
and worthy analytics study. It also means that the information content is complete (or
near complete) to build a predictive and/or prescriptive analytics model.
Data consistency - ANS-means that the data are accurately collected and
combined/merged.
Data currency/data timeliness - ANS-means that the data should be up-to-date (or as
recent/new as it needs to be) for a given analytics model. It also means that the data is
recorded at or near the time of the event or observation so that the time-delay-related
misrepresentation (incorrectly remembering and encoding) of the data is prevented.
Data granularity - ANS-requires that the variables and data values be defined at the
lowest (or as low as required) level of detail for the intended use of the data.
Data validity - ANS-is the term used to describe a match/mismatch between the actual
and expected data values of a given variable. As part of data definition, the acceptable
values or value ranges for each data element must be defined.
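A validity check can be sketched in a few lines of Python. This is only an illustration: the variable names, the ranges in DATA_DEFINITION, and the sample records are hypothetical and do not come from the chapter.

```python
# Hypothetical data definition: acceptable values or value ranges per variable.
DATA_DEFINITION = {
    "age": {"min": 0, "max": 120},
    "marital_status": {"allowed": {1, 2, 3}},
}


def find_invalid(records, definition):
    """Return (row_index, variable, value) for each value outside its definition."""
    problems = []
    for i, record in enumerate(records):
        for variable, rule in definition.items():
            value = record.get(variable)
            if "allowed" in rule:
                # Categorical variable: the value must be one of the defined codes.
                ok = value in rule["allowed"]
            else:
                # Numeric variable: the value must exist and fall inside the range.
                ok = value is not None and rule["min"] <= value <= rule["max"]
            if not ok:
                problems.append((i, variable, value))
    return problems


# Hypothetical sample records; rows 1 and 2 each contain one invalid value.
records = [
    {"age": 34, "marital_status": 2},
    {"age": -5, "marital_status": 1},
    {"age": 51, "marital_status": 7},
]
print(find_invalid(records, DATA_DEFINITION))
# → [(1, 'age', -5), (2, 'marital_status', 7)]
```

Each reported tuple is a mismatch between an actual value and the expected values for that variable.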
Data relevancy - ANS-means that the variables in the data set are all relevant to the
study being conducted.
Data - ANS-Raw facts that are meaningless by themselves (e.g., names, numbers).
Structured data - ANS-Data that is formatted (often into tables with rows and columns)
for computers to easily understand and process.
unstructured data - ANS-Data that do not have a predetermined format and are stored
in the form of textual documents.
Semi-structured data - ANS-XML, HTML, Log files, etc.
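To illustrate why XML counts as semi-structured, the sketch below uses Python's standard xml.etree.ElementTree to flatten a small document into rows. The XML content and field names are made up for illustration. The tags give the data some structure, but records need not carry the same fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML snippet; the second customer has no <city> element.
xml_text = """
<customers>
  <customer id="1"><name>Ada</name><city>London</city></customer>
  <customer id="2"><name>Lin</name></customer>
</customers>
"""

root = ET.fromstring(xml_text)

# Flatten each <customer> into a table-like row with fixed columns.
rows = [
    {
        "id": c.get("id"),
        "name": c.findtext("name"),
        "city": c.findtext("city"),  # None when the tag is absent
    }
    for c in root.findall("customer")
]

for row in rows:
    print(row)
```

The missing city appears as None once the data is forced into fixed columns, which is the kind of gap a structured table would make explicit.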
Data taxonomy - ANS-A structured representation of the sub-groups/subtypes of data.
categorical data - ANS-Represents the labels of multiple classes used to divide a
variable into specific groups. Examples include race, sex, age group, and education
level.
Nominal data - ANS-contain simple codes assigned to objects as labels; the codes
themselves are not measurements. For example, the variable marital status can be
generally categorized as (1) single, (2) married, and (3) divorced.
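The marital status coding can be sketched in Python to show that nominal codes act only as labels. The observed sample below is hypothetical; the code-to-label mapping follows the example in the definition.

```python
from collections import Counter

# Codes from the marital status example; the numbers are labels only.
MARITAL_STATUS = {1: "single", 2: "married", 3: "divorced"}

# Hypothetical sample of coded observations.
observed = [1, 2, 2, 3, 2, 1]

# Counting and finding the most frequent label are meaningful for nominal data.
counts = Counter(MARITAL_STATUS[code] for code in observed)
mode_label, mode_count = counts.most_common(1)[0]

print(dict(counts))            # → {'single': 2, 'married': 3, 'divorced': 1}
print(mode_label, mode_count)  # → married 3
```

Averaging the codes (about 1.83 for this sample) would produce a number with no interpretation, because the codes do not measure anything.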
Ordinal data - ANS-contain codes assigned to objects or events as labels that also