VISUALIZATION
EXAM
What do pie charts use and what are the key aspects? - ANSWERS-angle, area, and arc length; the last two (area and arc length) are key
What are line charts best for? - ANSWERS-comparing over time
What are bar charts used for? - ANSWERS-comparing between groups
Heckbert, Nice numbers - ANSWERS-chooses "nice" tick values (1, 2, or 5 times a power of ten) for a target number of tick marks, but for small numbers of ticks the labeled range can be much larger than the data range; you can drop the extra ticks, but then the axis becomes uneven
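A minimal sketch of the nice-numbers idea, assuming the commonly described "loose labeling" variant: pick a nice step (1, 2, or 5 times a power of ten), then extend the axis outward to multiples of that step. The function names `nice_number` and `loose_labels` and the default of 5 ticks are illustrative choices, not part of the original notes, and the code assumes `data_max > data_min`.

```python
import math

def nice_number(x, round_result):
    """Return a nice number near x: 1, 2, 5, or 10 times a power of ten."""
    exponent = math.floor(math.log10(x))
    fraction = x / 10 ** exponent  # lies in [1, 10)
    if round_result:
        # Round to the nearest nice fraction.
        if fraction < 1.5:
            nice = 1
        elif fraction < 3:
            nice = 2
        elif fraction < 7:
            nice = 5
        else:
            nice = 10
    else:
        # Take the smallest nice fraction that is >= fraction.
        if fraction <= 1:
            nice = 1
        elif fraction <= 2:
            nice = 2
        elif fraction <= 5:
            nice = 5
        else:
            nice = 10
    return nice * 10 ** exponent

def loose_labels(data_min, data_max, n_ticks=5):
    """Tick positions with a nice step that cover [data_min, data_max]."""
    span = nice_number(data_max - data_min, round_result=False)
    step = nice_number(span / (n_ticks - 1), round_result=True)
    low = math.floor(data_min / step) * step   # extend down to a step multiple
    high = math.ceil(data_max / step) * step   # extend up to a step multiple
    count = int(round((high - low) / step)) + 1
    return [low + i * step for i in range(count)]

print(loose_labels(105, 543, 5))  # [100, 200, 300, 400, 500, 600]
```

In this example the data span 105 to 543 but the labels run from 100 to 600, which illustrates how the labeled range can exceed the data range.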
Wilkinson's algorithm - ANSWERS-legibility = (format + font size + orientation + overlap) / 4
Sturges' formula - ANSWERS-k = ceil(log2(N) + 1) for k number of bins in
histogram
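As a quick illustration, the rule can be evaluated directly. This sketch assumes the logarithm is base 2, as the rule is usually stated; the function name `sturges_bins` is illustrative.

```python
import math

def sturges_bins(n_samples):
    """Number of histogram bins k = ceil(log2(N) + 1)."""
    return math.ceil(math.log2(n_samples) + 1)

print(sturges_bins(100))   # 8
print(sturges_bins(1000))  # 11
```

Because k grows only logarithmically with N, the rule suggests relatively few bins even for large samples.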
Scott's choice - ANSWERS-h = 3.5 * stddev / N^(1/3), where h is the bin width; the number of bins in the histogram is then k = ceil((max - min) / h)
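A small sketch of how a bin width from Scott's rule might be turned into a bin count. It assumes the sample standard deviation (`statistics.stdev`) and the conversion k = ceil((max - min) / h); the names `scott_bin_width` and `bins_from_width` are hypothetical helpers introduced for illustration.

```python
import math
import statistics

def scott_bin_width(data):
    """Bin width h = 3.5 * stddev / N^(1/3)."""
    return 3.5 * statistics.stdev(data) / len(data) ** (1 / 3)

def bins_from_width(data, width):
    """Number of bins needed to cover the data range with the given width."""
    return math.ceil((max(data) - min(data)) / width)

data = list(range(1, 28))  # 27 evenly spaced sample values
h = scott_bin_width(data)
print(round(h, 2), bins_from_width(data, h))
```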
, CSE 578 DATA LATEST
VISUALIZATION
EXAM
Freedman-Diaconis - ANSWERS-h = 2 * IQR(x) * N^(-1/3), where h is the bin width; the number of bins in the histogram is then k = ceil((max - min) / h)
Common choice for k bins in histogram - ANSWERS-k = sqrt(N)
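The Freedman-Diaconis width and the square-root choice can be sketched in a few lines. This is an illustration under assumptions: the quartiles come from `statistics.quantiles` with its default method (other IQR definitions give slightly different widths), the square-root count is rounded up, and the function names are illustrative.

```python
import math
import statistics

def freedman_diaconis_width(data):
    """Bin width h = 2 * IQR(x) * N^(-1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # three quartile cut points
    return 2 * (q3 - q1) * len(data) ** (-1 / 3)

def sqrt_bins(n_samples):
    """Square-root choice: k = sqrt(N), rounded up to a whole bin."""
    return math.ceil(math.sqrt(n_samples))

data = list(range(1, 28))  # 27 evenly spaced sample values
print(round(freedman_diaconis_width(data), 2))
print(sqrt_bins(len(data)))
```

Because it relies on the IQR rather than the standard deviation, the Freedman-Diaconis width is less sensitive to outliers.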
Quantiles - ANSWERS-points taken at a regular interval from the
cumulative distribution function of a random variable
Steps to building a boxplot (finding the quantiles) - ANSWERS-1. Sort the data
2. Find the number of data points that fall at or below the quantile: ceil((quantile number * number of samples) / number of quantiles)
3. Take that many samples from the start of the sorted data; the quantile is equal to the max of that group
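The steps above can be sketched as a short function. The name `quantile` and its parameters are illustrative; the sketch assumes 1 <= q <= n_quantiles and a non-empty data list.

```python
import math

def quantile(data, q, n_quantiles):
    """q-th of n_quantiles quantiles, using the sort-and-count steps."""
    ordered = sorted(data)                                   # step 1
    count = math.ceil(q * len(ordered) / n_quantiles)        # step 2
    return ordered[count - 1]                                # step 3: max of first `count` samples

data = [7, 1, 3, 9, 5, 11, 13, 15]
print([quantile(data, q, 4) for q in (1, 2, 3)])  # [3, 7, 11]
```

The three quartiles returned here (together with the minimum and maximum) are the values a boxplot draws.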
Q-Q plots - ANSWERS-compare two probability distributions by plotting
their quantiles against each other
More powerful than comparing histograms; the sample sizes don't need to be equal
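A minimal sketch of how the points of a Q-Q plot might be computed for two samples of different sizes: take the same quantile levels from each sample and pair them up. The function name `qq_points` and the default of 10 quantiles are assumptions for illustration; `statistics.quantiles` returns the n - 1 cut points for each sample.

```python
import statistics

def qq_points(sample_a, sample_b, n_quantiles=10):
    """Pair matching quantiles of two samples; sizes may differ."""
    qa = statistics.quantiles(sample_a, n=n_quantiles)
    qb = statistics.quantiles(sample_b, n=n_quantiles)
    return list(zip(qa, qb))

a = list(range(1, 101))              # 100 samples
b = [2 * v for v in range(1, 38)]    # 37 samples on a stretched scale
for x, y in qq_points(a, b):
    print(x, y)
```

If the two samples come from the same distribution, the points fall near the line y = x; a straight line with a different slope or offset suggests the same shape with a different scale or location.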
Two main ways to present multi-variate datasets - ANSWERS-1.
Directly (tables)
2. Symbolically (graphs)
Scagnostics - ANSWERS-graph-theoretic measures for detecting a variety of
structural anomalies in a geometric graph representation of a scatterplot
They identify notable relationships between two variables
9 scagnostic measures - ANSWERS-1. Outlying
2. Sparse
3. Striated
4. Skinny
5. Monotonic
6. Skewed
7. Clumpy
8. Convex
9. Stringy
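Most of these measures are computed from graphs built on the scatterplot points, which is too long to sketch here. As one small illustration, the Monotonic measure is usually described as the squared Spearman rank correlation; the sketch below assumes that definition and assumes there are no tied values. The names `ranks` and `monotonic_measure` are illustrative.

```python
def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def monotonic_measure(xs, ys):
    """Squared Spearman correlation: 1.0 for a perfectly monotonic relation."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
    return rho ** 2

xs = [1, 2, 3, 4, 5]
print(monotonic_measure(xs, [1, 4, 9, 16, 25]))  # 1.0 (increasing)
print(monotonic_measure(xs, [25, 16, 9, 4, 1]))  # 1.0 (decreasing)
```

Squaring the correlation means increasing and decreasing relationships score the same; only the strength of the monotonic trend matters.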