Frequency distributions
Bar graphs
Histograms
Associations between variables
Grouped bar graph
Mosaic plot
Boxplot
Scatterplot
Dot plot
Histogram
Shows the frequency distribution of a numerical variable; bar area represents frequency
• With equal-width bins, height = frequency
• Must choose number/width of bins
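The effect of the bin choice can be sketched in a few lines of Python. This is a minimal illustration, assuming equal-width bins; the helper `bin_counts` and the `lengths` data are hypothetical and invented for this example.

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything lands in the first bin.
        return [len(values)] + [0] * (n_bins - 1)
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical measurements, invented for illustration.
lengths = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

two_bins = bin_counts(lengths, 2)    # [8, 2]
four_bins = bin_counts(lengths, 4)   # [3, 5, 1, 1]
```

The same ten values look like one lump with two bins and like a peak with a long right tail with four bins, which is why the number/width of bins has to be chosen deliberately.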
Scatterplot
identify relationships, correlations (positive, negative, or none), and patterns
between two numerical, continuous variables, particularly with large datasets
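One common way to put a number on the direction of a relationship seen in a scatterplot is Pearson's correlation coefficient r. The notes do not name a specific measure, so treating "correlation" as Pearson's r is an assumption here, and the data below are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: +1 = perfect positive, -1 = perfect negative, 0 = none."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical paired measurements, invented for illustration.
x = [100, 200, 300, 400, 500]
y_up = [40, 42, 44, 46, 48]      # rises with x
y_down = [48, 46, 44, 42, 40]    # falls with x

r_up = pearson_r(x, y_up)        # close to +1 (positive correlation)
r_down = pearson_r(x, y_down)    # close to -1 (negative correlation)
```

Note that r only captures straight-line relationships; a curved pattern can be obvious in the scatterplot and still give r near 0.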
Outliers
extreme values that don't appear to belong with the rest of the data (they can
strongly skew a distribution)
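The notes do not give a rule for deciding what counts as "extreme". One widely used convention, assumed here and not taken from the notes, flags values more than 1.5 × IQR below the first quartile or above the third quartile. A minimal sketch with hypothetical data:

```python
import statistics

def iqr_outliers(values):
    """Return values outside the 1.5 * IQR fences (a convention, not a law)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical measurements, invented for illustration.
data = [10, 12, 12, 13, 14, 15, 15, 16, 40]

outliers = iqr_outliers(data)   # [40]
```

A flagged value is only a candidate: it may be a recording error or a genuine extreme observation, so it should be checked before being dropped.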
Numerical variables
take numerical values, and it is sensible to add, subtract, or take averages with
those values
Categorical variables
the responses themselves are categories (the possible values are called the
variable's levels)
ordinal
ordered, ranked categories (e.g., bronze, silver, and gold in the Olympics)
nominal
label, classify, or name distinct groups that have no inherent, quantitative order
or ranking (e.g., different breeds of dogs or different brands of soda)
discrete
only take on distinct, countable values, typically whole numbers, with clear
gaps between them (e.g., 1, 100, 20, 57)
continuous
variable that can take on an infinite number of values within a given range,
including decimals and fractions (e.g., wind speed, body temperature, beak
length)
Univariate
One set of data - Describe the data (e.g., snout-vent length (SVL) of salamanders)
Bivariate
Two sets of data - Describe the data + Correlate the two types of data (e.g., SVL vs.
elevation)
Multivariate
> 2 sets of data - Describe the data + Many options! (e.g., SVL vs. elevation and
species and diet and ...)
histogram vs barplots
histograms show the distribution of continuous numerical data using touching
bars (representing ranges or "bins"), while bar plots compare discrete,
categorical data using separated bars
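The difference shows up in how the bar heights are computed. In this rough sketch (hypothetical data; the 10 kg bin width is an arbitrary assumption), categorical values are counted as they are, while continuous values must first be grouped into ranges:

```python
from collections import Counter

# Categorical data: one bar per category, no binning needed.
breeds = ["beagle", "poodle", "beagle", "husky", "beagle", "poodle"]
bar_heights = Counter(breeds)
# beagle: 3, poodle: 2, husky: 1

# Continuous data: values are first grouped into ranges ("bins").
weights_kg = [8.5, 9.1, 10.4, 22.0, 24.3, 27.9]
bin_width = 10  # arbitrary choice for this sketch
# Label each bin by its lower edge, e.g. 20 means the range 20 to < 30.
hist_heights = Counter(int(w // bin_width) * bin_width for w in weights_kg)
# 0: 2, 10: 1, 20: 3
```

Because the bins 0, 10, and 20 sit next to each other on a number line, histogram bars touch; breed names have no such ordering, so bar plot bars are drawn apart.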
central tendencies
mean, median, mode
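As a quick illustration, all three measures can be computed with Python's standard `statistics` module. The data are hypothetical and chosen so that one large value pulls the mean away from the median and mode:

```python
import statistics

# Hypothetical measurements, invented for illustration.
values = [2, 3, 3, 5, 12]

mean_value = statistics.mean(values)      # 5: sum (25) divided by count (5)
median_value = statistics.median(values)  # 3: middle value once sorted
mode_value = statistics.mode(values)      # 3: most frequent value
```

Here the single value 12 raises the mean to 5, while the median and mode both stay at 3.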
mean