SUMMARY
Dataset
all the data collected for a particular analysis.
Element
the entity on which data is collected.
Variable
a characteristic of interest of an element.
Observation
the set of measurements recorded for an individual element.
Categorical
use labels or names to identify categories, measured on a nominal or ordinal scale.
Quantitative
use numeric values that indicate how much or how many.
Cross-sectional
data collected at the same, or approximately the same, point in time.
Time Series
data collected over several time periods.
Panel
a combination of cross-sectional and time series data.
Descriptive Statistics
summarize and describe data or variables using tables, graphs, and numerical measures.
Population
the set of all elements of interest in a statistical analysis.
Sample
a subset of the population.
Statistical Inference
uses data from a sample to make estimates and test hypotheses
about the characteristics of a population.
Frequency Distribution
a tabular summary of data showing the number (i.e. frequency)
of observations in each of several non-overlapping categories.
Relative Frequency
frequency of a class / n, where n is the total number of observations.
Percent Frequency
relative frequency * 100.
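As a minimal sketch of the frequency, relative frequency, and percent frequency definitions, the Python snippet below tallies a small made-up list of categorical observations. The colour labels and the function name frequency_table are illustrative assumptions, not part of the original notes.

```python
from collections import Counter

# Hypothetical categorical observations (illustrative only).
colours = ["red", "blue", "red", "green", "red",
           "blue", "red", "green", "blue", "red"]

def frequency_table(observations):
    """Return {category: (frequency, relative frequency, percent frequency)}."""
    n = len(observations)           # total number of observations
    counts = Counter(observations)  # frequency of each category
    return {
        category: (freq, freq / n, freq / n * 100)
        for category, freq in counts.items()
    }

table = frequency_table(colours)
for category, (freq, rel, pct) in table.items():
    print(f"{category:<8}{freq:>4}{rel:>8.2f}{pct:>8.1f}%")
```

The relative frequencies sum to 1 and the percent frequencies sum to 100, which is a quick check that every observation was counted exactly once.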
Bar Chart
a visual display of frequency, relative frequency, and percent
frequency distributions, drawn with one bar per category.
Pie Chart
a visual display of frequency, relative frequency, and percent
frequency distributions, drawn with one slice of a circle per category.
Number of Classes
Typically between 5 and 20. Smaller datasets need fewer classes;
larger datasets need more.
Width of the Class
Generally, it should be the same for each class. Approximate
class width = (largest data value - smallest data value)/number
of classes.
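The approximate class width formula can be sketched in Python as follows. The data values are made up for illustration, the function name approximate_class_width is hypothetical, and rounding the result up to a whole number is an assumed convention for getting a convenient width rather than something the notes prescribe.

```python
import math

# Hypothetical quantitative observations (illustrative only).
values = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
          22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

def approximate_class_width(data, number_of_classes):
    """(largest data value - smallest data value) / number of classes."""
    return (max(data) - min(data)) / number_of_classes

raw_width = approximate_class_width(values, 5)  # (33 - 12) / 5
class_width = math.ceil(raw_width)              # assumed convention: round up
print(raw_width, class_width)
```

With five classes the raw width is 4.2, so a width of 5 gives classes that are easy to read and still cover the full range of the data.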
Class Limits
chosen so that each observation belongs to exactly one class.
Relative Frequency Distributions
a tabular summary showing the relative frequency of each class (frequency of the class / n).
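To illustrate how equal-width, non-overlapping classes lead to a relative frequency distribution, here is a small Python sketch. It assumes integer-valued data, a starting lower limit of 10, and a class width of 5. All of these, along with the data and the function name, are illustrative choices rather than values from the notes.

```python
# Hypothetical quantitative observations (illustrative only).
values = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
          22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

def relative_frequency_distribution(data, lower, width, number_of_classes):
    """Count observations per class, then divide each count by n."""
    n = len(data)
    counts = [0] * number_of_classes
    for x in data:
        index = (x - lower) // width  # each value maps to exactly one class
        if not 0 <= index < number_of_classes:
            raise ValueError(f"{x} falls outside the class limits")
        counts[index] += 1
    # Inclusive limits for integer data, so adjacent classes never overlap.
    return [
        (lower + i * width, lower + (i + 1) * width - 1, c, c / n)
        for i, c in enumerate(counts)
    ]

distribution = relative_frequency_distribution(
    values, lower=10, width=5, number_of_classes=5
)
for low, high, freq, rel in distribution:
    print(f"{low}-{high}: frequency={freq}, relative frequency={rel:.2f}")
```

Raising an error for out-of-range values is a deliberate choice here: it makes a poorly chosen set of class limits visible instead of silently dropping observations.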
Analytics
the scientific process of transforming data into insight for decision making.
Descriptive Analytics
techniques that describe what has happened in the past.
Predictive Analytics
uses statistical models built from past data to predict the future
[forecasting] or assess the impact of one variable on another
[inference].
Prescriptive Analytics