BUSINESS STATISTICS
Definition:
Statistics is the science of collecting, organizing, presenting, analyzing and interpreting data to assist in
making more effective decisions.
Types of Statistics
(a) Descriptive statistics: tabular, graphical and numerical methods for organizing and
summarizing information clearly and effectively, relating to either a population or a sample.
(b) Inferential statistics: methods of drawing conclusions about a statistical population from
sample data, and of measuring the reliability of those conclusions.
A population is a collection of all possible individuals, objects or measurements
of interest.
A sample is a part or subset of the population of interest.
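The distinction between a population and a sample, and between describing and inferring, can be sketched in Python. This is only an illustration: the population of shop sales, the sample size of 50 and all variable names are assumptions made for the example and do not come from these notes.

```python
import random
import statistics

# Hypothetical population: monthly sales (in units) for 1,000 shops.
random.seed(42)  # fixed seed so the sketch is repeatable
population = [random.randint(50, 150) for _ in range(1000)]

# A sample is a subset of the population; here 50 shops chosen at random.
sample = random.sample(population, k=50)

# Descriptive statistics summarise the data in hand.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# Inferential statistics use the sample to draw a conclusion about the
# population; the standard error gives a rough idea of its reliability.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"Population mean (normally unknown): {population_mean:.2f}")
print(f"Sample mean (our estimate):         {sample_mean:.2f}")
print(f"Standard error of the estimate:     {standard_error:.2f}")
```

In practice the population mean is usually not available, which is why the sample mean and a measure of its reliability are reported instead.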
Variables:
A variable is a measurable characteristic that assumes different values among the subjects.
Types of variables
(a) Independent variable: a variable that a researcher manipulates in order to determine
its effect or influence on another variable. It is used to predict the amount of variation that
occurs in other variables.
(b) Dependent variable: the variable that is measured, predicted or monitored and is
expected to be affected by manipulation of an independent variable. It reflects the total
influence arising from the effects of the independent variable, i.e. it varies as a function
of the independent variable, e.g. the influence of hours studied on performance in a statistics
test, or the influence of distance from the supply center on the cost of building materials.
The above variables can be either qualitative or quantitative:
i. Qualitative variables: non-numeric variables, i.e. attributes, e.g. gender,
religion, colour, state of birth etc.
ii. Quantitative variables: numeric variables. They can be either discrete or
continuous.
Discrete variables: variables that can only assume certain values,
typically whole numbers. They are counted, e.g. number of students in a class.
Continuous variables: variables that can assume any value within
a specific range. They are measured, e.g. height, temperature, weight,
radius etc.
Levels of measurement
There are four levels of measurement: nominal, ordinal, interval and ratio.
(a) Nominal level: observations are classified under a common characteristic, e.g. sex, race,
marital status, employment status, language, religion etc. This level helps in sampling.
(b) Ordinal level: items or subjects are not only grouped into categories but also ranked
in some order, e.g. greater than, less than, superior, happier than, poorer, above etc. This level
helps in developing a Likert scale.
(c) Interval level: numerals are assigned to each measure and ranked. The intervals between
numerals are equal. The numerals represent meaningful quantities, but the zero point is
arbitrary, i.e. it does not indicate absence of the characteristic, e.g. test scores, temperature in degrees Celsius.
(d) Ratio level: has all the characteristics of the other levels and in addition the zero point is
meaningful. Mathematical operations can be applied to yield meaningful values e.g. height,
weight, distance, age, area etc.
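The practical difference between the interval and ratio levels can be shown with a short Python sketch. The temperatures and weights below are assumed values chosen for illustration; the Kelvin conversion is added here only because Kelvin has a true zero, and is not part of the notes above.

```python
# Interval level: differences are meaningful, ratios are not.
celsius_a, celsius_b = 10.0, 20.0
difference = celsius_b - celsius_a   # 10 degrees warmer: a meaningful statement
naive_ratio = celsius_b / celsius_a  # 2.0, yet 20 °C is not "twice as hot" as 10 °C

# The same temperatures on the Kelvin scale, which has a meaningful zero.
kelvin_a = celsius_a + 273.15
kelvin_b = celsius_b + 273.15
true_ratio = kelvin_b / kelvin_a     # roughly 1.035, far from 2

# Ratio level: zero means "none", so ratios can be interpreted directly.
weight_a, weight_b = 40.0, 80.0      # kilograms
weight_ratio = weight_b / weight_a   # 2.0: the second item is twice as heavy

print(f"Difference in Celsius: {difference}")
print(f"Naive Celsius ratio:   {naive_ratio}")
print(f"Ratio in Kelvin:       {true_ratio:.3f}")
print(f"Weight ratio:          {weight_ratio}")
```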
Characteristics of statistical data
They are aggregates of facts, e.g. total sales of a firm for one year.
They are affected to a marked extent by a multiplicity of causes, e.g. the volume of wheat
production depends on rainfall, soil fertility, seeds etc.
They are numerically expressed, e.g. the population of Kenya increased by 4 million during the
year 2004.
They are estimated according to a reasonable standard of accuracy, e.g. 90% accuracy.
They are collected in a systematic manner.
They are collected for a predetermined purpose.
They should be placed in relation to each other.
Uses and users of statistics
1. Government
Monitoring economic and social trends
Forecasting
Policy making
2. Individuals
Leisure activities
Community work
Personal finances
Gambling
3. Academia
Testing hypotheses
Developing new theories
Consultancy services
4. Businesses
Planning and control
Quality control especially for the manufacturers
Forecasting i.e. planning production schedules, advertising expenditures etc.
Auditing
By Gladys Kimutai Page 2 of 74
Determining production costs, e.g. by using regression and correlation, one can determine
the relationship between two variables such as costs and methods of production, or advertising
and sales.
Providing relevant information for decision-making.
1.0 Limitations of statistics
Deals with aggregate facts and not individual items.
Deals mainly with quantitative characteristics and not qualitative characteristics like
honesty, efficiency etc.
The results are only true on average and under certain conditions.
Statistics can be misused, e.g. through wrong interpretation. It requires experience and skill to draw
sensible conclusions from the data.
Statistics may not provide the best solution under all circumstances.
2.0 Data collection
Data can be collected from primary and/or secondary sources.
Secondary data consists of information that already exists somewhere, having been collected for
another purpose, e.g. in government publications, periodicals, journals, books etc.
3.0 Advantages of secondary data
- Low in cost
- Readily available
4.0 Disadvantages of secondary data
- The data needed might not exist
- The existing data might be outdated, inaccurate, incomplete and unreliable.
Primary data consists of original information gathered for the specific purpose through
observation, interviews and questionnaires.
5.0 Advantages of primary data
- It is relevant
- It is accurate
6.0 Disadvantages of primary data
- It is costly
- It is time consuming
Presentation of data
Presentation of data refers to the classification and tabulation of data. Classification of data refers
to the act of arranging the data in groups or classes according to some resemblance of the data in
each group or class. Tabulation of data is the arrangement of statistical data in columns and rows.
Frequency distribution
A frequency distribution is a grouping of data into mutually exclusive categories showing the
number of observations in each category.
Steps
Decide on the number of classes
Determine the class interval or width
Set the individual class limits
Tally the values into the classes
Count the number of items in each class
A class interval is the difference between the lower limit of the class and the lower limit of the
next class.
A class midpoint / class mark is the middle point between the lower and the upper class limit.
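The steps above can be sketched in Python as follows. The marks, the choice of five classes and the rule used for the class width (range of the data divided by the number of classes, rounded up) are assumptions made for illustration; other conventions for choosing the width are equally valid.

```python
import math

# Hypothetical marks for 20 students (illustrative data only).
marks = [12, 25, 31, 38, 42, 45, 47, 51, 53, 55,
         58, 60, 62, 64, 67, 71, 74, 78, 83, 95]

# Step 1: decide on the number of classes (assumed here).
num_classes = 5

# Step 2: determine the class interval (width), rounded up so that
# every value falls into some class.
low, high = min(marks), max(marks)
width = math.ceil((high - low + 1) / num_classes)

# Steps 3 to 5: set the class limits, then tally and count each class.
classes = []
for i in range(num_classes):
    lower = low + i * width
    upper = lower + width - 1          # marks are whole numbers
    midpoint = (lower + upper) / 2     # class mark
    frequency = sum(lower <= m <= upper for m in marks)
    classes.append((lower, upper, midpoint, frequency))

print("Class     Midpoint  Frequency")
for lower, upper, midpoint, frequency in classes:
    print(f"{lower:>3}-{upper:<3}   {midpoint:>8.1f}  {frequency:>9}")
```

Note that the class interval equals the difference between successive lower limits, in line with the definition above.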
Graphical representation of a frequency distribution
1. Histogram: a graph in which the classes are marked on the horizontal axis and the class
frequencies on the vertical axis. The class frequencies are represented by the heights of the
bars, and the bars are drawn adjacent to each other.
2. Frequency polygons: the class midpoints are plotted against the class frequencies and connected with line segments.
3. Cumulative frequency polygons
Less than cumulative frequency polygons
More than cumulative frequency polygons
4. Line charts: show the change in a variable over time.
5. Bar charts: make use of rectangles to present the given data. They can be vertical, horizontal
or component.
6. Pie charts: different segments of a circle represent the percentage contribution of various
components to the total.
7. Graphs
8. Pictograms: pictures are used to represent data.
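As a rough sketch of how the values behind a histogram and the two cumulative frequency polygons are obtained, the Python below works from a small frequency distribution. The class labels and frequencies are hypothetical, and the text histogram is only a stand-in for a properly drawn chart.

```python
from itertools import accumulate

# Hypothetical frequency distribution (illustrative values only).
class_labels = ["10-19", "20-29", "30-39", "40-49", "50-59"]
frequencies = [3, 5, 9, 6, 2]

# "Less than" cumulative frequency: observations up to and including
# each class (usually plotted against the upper class limits).
less_than = list(accumulate(frequencies))

# "More than" cumulative frequency: observations in each class and above
# (usually plotted against the lower class limits).
total = sum(frequencies)
more_than = [total - c + f for c, f in zip(less_than, frequencies)]

# A crude text histogram: bar length represents the class frequency.
for label, f, lt, mt in zip(class_labels, frequencies, less_than, more_than):
    print(f"{label}  {'*' * f:<10} f={f:<2} less-than={lt:<3} more-than={mt}")
```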
Example
(a) The data below indicate the marks attained by students in a statistics test. Construct a
frequency distribution table with 10 classes.
12 24 40 50 56 72