Class notes

Uploaded on

10-09-2024

Written in

2021/2022

INTRODUCTION AND DESCRIPTIVE STATISTICS

Bstats Notes

Business Statistics (University of Technology Sydney)

Studocu is not sponsored or endorsed by any college or university
Downloaded by Shebnoor Ahmed ()

LECTURE 1: INTRODUCTION AND DESCRIPTIVE STATISTICS I

TYPES OF DATA

QUALITATIVE/CATEGORICAL

• Mutually exclusive labels (one label cannot mean two things)
• Usually not numbers; where numbers are used, they have no mathematical meaning
- Nominal: ordering/ranking makes no sense; numerical labels are arbitrary
- Ordinal: ordering/ranking has meaning and can be interpreted; numerical labels respect the ordering

QUANTITATIVE/NUMERICAL

• Numbers used to record certain events; the numbers have mathematical meaning
- Interval: the difference between two quantities is meaningful, but their ratio is not; zero has no natural meaning
- Ratio: both the difference and the ratio of two quantities are meaningful; zero is meaningful
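The interval/ratio distinction can be checked with a small calculation. The sketch below uses made-up temperatures (an interval scale) and weights (a ratio scale); the variable names and values are illustrative assumptions, not part of the lecture.

```python
# Temperatures in degrees Celsius: an interval scale (0 °C does not mean "no temperature").
c_low, c_high = 10.0, 20.0

def to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Differences stay meaningful: a 10 °C gap is always an 18 °F gap.
diff_c = c_high - c_low
diff_f = to_fahrenheit(c_high) - to_fahrenheit(c_low)

# Ratios do not: "twice as hot" depends on the unit chosen.
ratio_c = c_high / c_low                                  # 2.0
ratio_f = to_fahrenheit(c_high) / to_fahrenheit(c_low)    # 68 / 50 = 1.36

# Weights in kg: a ratio scale (0 kg means no weight), so ratios survive a unit change.
kg_low, kg_high = 10.0, 20.0
LB_PER_KG = 2.20462  # approximate conversion factor
ratio_kg = kg_high / kg_low
ratio_lb = (kg_high * LB_PER_KG) / (kg_low * LB_PER_KG)

print(diff_c, diff_f, ratio_c, ratio_f, ratio_kg, ratio_lb)
```

The Celsius ratio of 2 becomes 1.36 in Fahrenheit, while the weight ratio is 2 in both units, which is what "zero is meaningful" buys on a ratio scale.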

WORKING WITH CATEGORICAL DATA

• Intuitive to tabulate and visualise; the technique is the frequency distribution
• Frequency count: total number of occurrences of each category
• Relative frequency: fraction/proportion of the total number of data items belonging to that category
• Percent frequency: relative frequency x 100 (%)
• In Excel, the COUNTIF function produces the frequency counts
• To visualise: bar chart (categories on x-axis, frequency/relative frequency/percent frequency on y-axis) or pie chart
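As a rough sketch of a frequency distribution outside Excel, the counts, relative frequencies and percent frequencies can be tabulated in Python. The survey responses and the helper name `frequency_table` are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical survey responses (nominal data), made up for illustration.
responses = ["bus", "car", "train", "car", "bus", "car", "walk", "car"]

counts = Counter(responses)      # frequency count per category
total = sum(counts.values())     # total number of data items

def frequency_table(counts, total):
    """Return {category: (count, relative frequency, percent frequency)}."""
    return {
        category: (count, count / total, 100 * count / total)
        for category, count in counts.items()
    }

table = frequency_table(counts, total)
for category, (count, rel, pct) in sorted(table.items()):
    print(f"{category:<6} {count:>3} {rel:>6.3f} {pct:>6.1f}%")
```

`Counter` plays the role COUNTIF plays in Excel: one count per category. The relative frequencies always sum to 1 and the percent frequencies to 100.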

INTERMEZZO: THE LANGUAGE

• Random variable (r.v.): a variable whose value is determined by the outcome of a random process
- Usually denoted by capital letters
- Realisations/observations of an r.v. are denoted by lowercase letters
- N and n both denote a number of observations: N is the population size, n is the sample size (number of data points collected in a sample)
• Population: collection of people, objects or items of interest; the complete pool of values of a certain random variable
• Sample: subset of a population; a random collection of a certain size drawn from the population
• Probability distribution: the general shape of the probabilities of the values that a random variable may assume
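A minimal sketch of the population/sample vocabulary, assuming a made-up population of invoice amounts; the data, the seed and the sizes N = 1000 and n = 50 are illustrative choices only.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: N invoice amounts (made-up values).
population = [round(random.uniform(10, 500), 2) for _ in range(1000)]
N = len(population)   # population size

# A simple random sample of size n, drawn without replacement.
n = 50
sample = random.sample(population, n)

print(f"N = {N}, n = {len(sample)}")
```

Every element of `sample` is a realisation taken from `population`; drawing again with a different seed would give a different sample of the same size.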

DESCRIPTIVE STATISTIC: CENTRAL TENDENCY

• A measure of central tendency yields info about the centre of a set of numbers (the distribution of an r.v.); it does not focus on the span of the dataset or how far values are from the middle numbers
• Gives an idea of the typical, middle, or average value that an r.v. can take
• Sometimes called measures of location

THREE MEASURES OF CENTRAL TENDENCY

Mode: the most frequently occurring value in a set of data
- If two values tie for most frequent, both modes are listed and the data is said to be bimodal
- Datasets with two or more modes are referred to as multimodal
- The concept of mode is often used in determining sizes (e.g. the most commonly sold shoe size)
- Appropriate descriptive summary measure for categorical data

Median: the middle value in an ordered array of numbers
- A way to locate the median is by finding the $\frac{n+1}{2}$th term in the ordered array; when n is even this position falls between two terms, and the median is their average
- Large and small values do not inordinately influence the median, hence it is the best measure of location to use in the analysis of variables in which extreme but acceptable values can occur at just one end of the data
- Not all info from the dataset is used
- Data must be quantitative or be able to be ranked

Mean: the average of a set of numbers
- Sample mean is represented by $\bar{X}$
- Population mean is represented by $\mu$
- Data should be quantitative as it needs to be summed
- Affected by all values: an advantage because it reflects all the data, but a disadvantage because extreme values pull the mean towards the extremes
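The three measures can be computed with Python's standard `statistics` module. This is a sketch on made-up shoe-size data; `median_by_position` is a hypothetical helper written here to mirror the (n + 1)/2 rule, not a library function.

```python
import math
import statistics

# Hypothetical shoe sizes sold in one day (made-up data).
sizes = [7, 8, 8, 9, 9, 9, 10, 11]

mode = statistics.mode(sizes)       # most frequently occurring value
median = statistics.median(sizes)   # middle value of the ordered array
mean = statistics.mean(sizes)       # sum of the values divided by n

def median_by_position(values):
    """Locate the median at 1-based position (n + 1) / 2 of the ordered array."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    return (lower + upper) / 2      # equals the term itself when n is odd

# A tie for the most frequent value gives two modes (bimodal data).
tied = [1, 1, 2, 2, 3]
modes_of_tied = statistics.multimode(tied)

print(mode, median, mean, median_by_position(sizes), modes_of_tied)
```

With n = 8 the position is 4.5, so the helper averages the 4th and 5th ordered terms, matching what `statistics.median` returns.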

• Either the population mean or the sample mean can be considered. If the r.v. is denoted by $X$:
- Population mean is denoted by $\mu$ or $E(X)$, computed by $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
- Sample mean is denoted by $\bar{X}$, computed by $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$

• Outlier: an observation of the r.v. of interest whose value is far outside the range of the other realisations. It often biases impressions about the distribution of the r.v. in the dataset, so we may want to correct for the bias or simply remove the data point
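To see how an outlier affects the measures of location, the sketch below adds one extreme value to a small made-up salary dataset (figures in $000, purely illustrative) and compares the mean and median before and after.

```python
import statistics

# Hypothetical salaries in $000 (made-up data).
salaries = [48, 50, 52, 55, 57]
with_outlier = salaries + [400]   # one extreme but possible value

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)

mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before:.1f} -> {mean_after:.1f}")
print(f"median: {median_before:.1f} -> {median_after:.1f}")
```

The mean roughly doubles while the median moves only from 52 to 53.5, which is why the median is preferred when extreme values occur at one end of the data.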


DESCRIPTIVE STATISTIC: VARIABILITY

• Measures of variability yield info about how likely a realisation of the r.v. is to fall far from the centre of its distribution; they describe the spread/dispersion of a dataset
• Give an idea of fluctuation and volatility across realisations of the r.v.
• The more variability in a dataset, the less typical the central values are of the whole set
• Using measures of variability together with measures of central tendency gives a more complete numerical description of the data (a measure of variability is needed to complement the mean when describing data)

FIVE MEASURES OF VARIABILITY

Range: maximum - minimum
- Crude measure of variability
- Advantage: ease of calculation; disadvantage: affected by extreme values (thus its application as a measure of variability is limited)

Inter-quartile range: distance between the first and third quartiles, $IQR = Q_3 - Q_1$
- Essentially the range of the middle 50% of the data
- Useful when there is interest in values towards the middle rather than values in the extremes

Variance and standard deviation: one is obtained from the other, so they are presented together
- Both measure how spread out an r.v. is: the larger the value, the more spread out
- Both involve considering how far each data value is from the mean and describing this dispersion on average
- Subtracting the mean from each data value yields the deviation from the mean, $(x - \mu)$; negative deviations represent values below the mean, positive deviations represent values above the mean

VARIANCE
- Average squared distance between the data points and their mean
- The sum of squared deviations from the mean of a set of values is called the sum of squares of x: $SS_x$

STANDARD DEVIATION
- Square root of the variance; has the same unit as the original data
- Estimate of the average distance that individual values are away from the mean

Coefficient of variation: standard deviation ÷ mean
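A minimal sketch of the five measures on a made-up dataset. Two assumptions to flag: the variance below divides $SS_x$ by the number of values (the "average squared distance" reading, i.e. a population variance), whereas a sample variance would divide by n - 1; and quartile conventions differ between tools, so `statistics.quantiles` with its default method may not match Excel exactly.

```python
import statistics

# Hypothetical observations (made-up data).
data = [4, 8, 6, 5, 3, 7, 9, 6]

mean = statistics.mean(data)

# Range: maximum minus minimum.
data_range = max(data) - min(data)

# Inter-quartile range: Q3 - Q1 (quartile method is the module default).
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Deviations from the mean and the sum of squares SS_x.
deviations = [x - mean for x in data]
ss_x = sum(d ** 2 for d in deviations)

# Variance as the average squared deviation (population form, assumed here).
variance = ss_x / len(data)
sample_variance = ss_x / (len(data) - 1)   # sample form, shown for comparison

# Standard deviation: square root of the variance, same unit as the data.
std_dev = variance ** 0.5

# Coefficient of variation: standard deviation divided by the mean.
cv = std_dev / mean

print(data_range, iqr, ss_x, variance, round(std_dev, 4), round(cv, 4))
```

The deviations always sum to zero, which is why they are squared before averaging; `statistics.pvariance` and `statistics.variance` give the same two variances directly.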

Written for

Institution: University of Technology Sydney (UTS )
Course: 26134 Business Statistics (26134)

Document information

Uploaded on: September 10, 2024
Number of pages: 36
Written in: 2021/2022
Type: Class notes
Professor(s): Unknown
Contains: Business statistics

Subjects

business
statistics
theory
sampling distribution
sampling error
normal distribution
probability distribution function

Seller: ahmednoor53
