College of Economic and Management Sciences
STA4820 — ASSIGNMENT 01
Semester 1 Assignment 01 — 2026
Module Code: STA4820
Module Name: Applied Statistics
Assignment No.: 01
Due Date: 2026
Semester: Semester 1, 2026
Submitted in partial fulfilment of the requirements for STA4820
at the University of South Africa.
UNISA | STA4820 Assignment 01 — 2026
Question 1: Statistical Concepts (Catalysts Experiment)
Question: A certain reaction was run several times using catalysts. The catalysts were supposed
to control the yield of an undesirable side product. 44 runs were selected, and 20 were
from catalyst A to control the yield.
a) What is the population of interest? (2)
b) What is the sample? (2)
c) What is the parameter? (2)
d) What is the statistic? (2)
e) Does the value 44 refer to the parameter or to the statistic? (2)
f) Is the value 20 a parameter or a statistic? (2)
1.a) Population of Interest
The population of interest is the entire collection of units about which information is
sought (Casella and Berger, 2002). In this experiment, the population of interest is all runs
of the reaction using the catalysts, across all possible experimental conditions, where the
catalysts are intended to control the yield of the undesirable side product. This includes every
possible run that could ever be performed under the same experimental setup, not merely
those that were observed.
Key Distinction
The population is conceptually infinite here because the reaction could be run an unlimited
number of times. The 44 selected runs are simply a window into this larger,
theoretically boundless collection.
1.b) The Sample
A sample is a subset of the population that is actually observed and measured (Devore,
2016). The sample here consists of the 44 runs that were selected from the population of
all possible catalyst runs. These 44 observations represent the data actually collected and used
to draw conclusions about the broader population.
1.c) The Parameter
A parameter is a numerical summary that describes a characteristic of the entire population
(Walpole, Myers and Myers, 2012). In this context, the parameter is the true mean yield of
the undesirable side product (or the true proportion of runs in which that yield is controlled)
across the entire population of runs. Because the population is never fully observed, the true
parameter value remains unknown and must be estimated from the sample.
Implementation Insight
Parameters are fixed, though unknown, quantities. The goal of statistical inference is
precisely to estimate these population parameters from observed sample statistics.
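The contrast between a fixed parameter and a varying statistic can be illustrated with a small simulation. This is a minimal sketch only: the value TRUE_P, the seed and the helper sample_proportion are hypothetical choices made for illustration and do not come from the assignment data.

```python
import random


def sample_proportion(true_p, n, rng):
    """Simulate n runs and return the fraction counted as 'controlled'."""
    successes = sum(1 for _ in range(n) if rng.random() < true_p)
    return successes / n


# Hypothetical population parameter, assumed purely for illustration.
TRUE_P = 0.45
N_RUNS = 44  # sample size stated in Question 1

rng = random.Random(2026)  # fixed seed so the sketch is reproducible

# Draw five independent samples of 44 runs from the same population.
estimates = [sample_proportion(TRUE_P, N_RUNS, rng) for _ in range(5)]

for i, p_hat in enumerate(estimates, start=1):
    print(f"sample {i}: p_hat = {p_hat:.3f} (parameter stays at {TRUE_P})")
```

The parameter is the same in every repetition, while each sample of 44 runs gives its own estimate.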
1.d) The Statistic
A statistic is a numerical summary computed from the sample data (Devore, 2016). The
statistic here is any summary measure calculated from the 44 selected runs, for example the
sample proportion of runs controlled by a specific catalyst, or the sample mean
yield computed from the 44 observations. Unlike the parameter, a statistic is observable and
calculable.
1.e) Does 44 Refer to the Parameter or the Statistic?
The value 44 refers to the statistic. It is the sample size, meaning it is the count of runs
that were actually selected and observed, not the total count of all runs in the population.
Since 44 is derived from and describes the sample, it is a statistic. Specifically, it is the number
of observations in the sample from which all other statistics will be computed.
1.f) Is 20 a Parameter or a Statistic?
The value 20 is a statistic. It represents the number of runs within the sample of 44 that
came from catalyst A. Because 20 is a count derived from the sample, and not from the entire
population of all possible catalyst A runs, it qualifies as a sample statistic. It could be used to
estimate the population parameter, that is, the true proportion of all reactions controlled by
catalyst A.
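As a worked illustration, the two counts from the question can be combined into a sample proportion. This is a minimal sketch, assuming the statistic of interest is the fraction of sampled runs that came from catalyst A; the variable names are illustrative.

```python
from fractions import Fraction

n_sample = 44      # total runs in the sample (Question 1)
n_catalyst_a = 20  # runs in the sample that came from catalyst A

# Sample proportion of catalyst A runs, kept exact as a fraction.
p_hat = Fraction(n_catalyst_a, n_sample)

print(f"p_hat = {p_hat} = {float(p_hat):.4f}")  # p_hat = 5/11 = 0.4545
```

The value 5/11 is computed entirely from the sample, so it is a statistic; the corresponding population proportion remains unknown.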
Key Distinction
Parameter vs Statistic summary for Question 1:

Concept     Definition                  This Question
Population  All possible catalyst runs  All runs ever conducted
Sample      The 44 observed runs        44 selected runs
Parameter   Summary of population       True yield across all runs (unknown)
Statistic   Summary of sample           44 (sample size); 20 (catalyst A count)
Question 2: Graphical Representation of Catalyst Data
Question: A certain reaction was run several times using each of two catalysts, A and B. The
catalysts were supposed to control the yield of an undesirable side product. Results, in units
of percentage yield, for 24 runs of catalyst A and 20 runs of catalyst B are as follows:
Rating         Results (% yield)
1. Catalyst A  20
2. Catalyst B  35
3. Catalyst C  70
4. Catalyst D  40
5. Catalyst E  35
Draw a graph that describes the data. What does the graph tell you? (6)
2.1 Bar Chart of Catalyst Percentage Yields
The data present five catalyst ratings with their corresponding percentage yields. A bar
chart is the most appropriate graphical technique for comparing discrete categorical groups
(catalysts) on a numerical outcome (percentage yield) (Devore, 2016). Each
bar represents one catalyst, and the height of the bar corresponds to the recorded percentage
yield.
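The bar chart described above can be sketched in plain text as follows. This is a minimal illustration using only the Python standard library, with horizontal bars standing in for the vertical bars of a drawn chart; the helper text_bar_chart and the scale of two percentage points per character are assumptions made for the sketch, and a plotting library could be used instead for a presentation-quality figure.

```python
# Percentage yields as listed in the Question 2 table.
yields = {
    "Catalyst A": 20,
    "Catalyst B": 35,
    "Catalyst C": 70,
    "Catalyst D": 40,
    "Catalyst E": 35,
}


def text_bar_chart(data, scale=2):
    """Return one line per category; each '#' stands for `scale` percentage points."""
    lines = []
    for name, value in data.items():
        bar = "#" * (value // scale)  # bar length is rounded down to whole characters
        lines.append(f"{name:<10} | {bar} {value}%")
    return lines


for line in text_bar_chart(yields):
    print(line)
```

Each row is one catalyst, and the bar length is proportional to its recorded percentage yield, which makes the gap between Catalyst C and the others visible at a glance.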
2.2 Interpretation of the Graph
The bar chart reveals the following:
• Catalyst C produces the highest percentage yield at 70%, indicating it is the least effective
at suppressing the undesirable side product (since a higher yield of the side product is
undesirable).
• Catalyst A produces the lowest percentage yield at 20%, suggesting it is the most effective
catalyst for controlling the undesirable side product.
• Catalysts B and E are tied at 35%, indicating comparable performance between the
two.
• Catalyst D, at 40%, has the second-highest yield, just above B and E and well below
Catalyst C.
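The comparisons in the list above can be checked directly from the tabulated values. This is a minimal sketch, assuming the five yields from the Question 2 table; the variable names are illustrative.

```python
from collections import defaultdict

# Percentage yields as listed in the Question 2 table.
yields = {
    "Catalyst A": 20,
    "Catalyst B": 35,
    "Catalyst C": 70,
    "Catalyst D": 40,
    "Catalyst E": 35,
}

# Rank catalysts from lowest to highest side-product yield.
ranked = sorted(yields.items(), key=lambda item: item[1])
lowest, highest = ranked[0], ranked[-1]

# Group catalysts by yield to find ties.
by_value = defaultdict(list)
for name, value in yields.items():
    by_value[value].append(name)
ties = {value: names for value, names in by_value.items() if len(names) > 1}

print("lowest: ", lowest)   # ('Catalyst A', 20)
print("highest:", highest)  # ('Catalyst C', 70)
print("ties:   ", ties)     # {35: ['Catalyst B', 'Catalyst E']}
```

Because a lower yield of the side product is preferable, the first entry in the ranking corresponds to the most effective catalyst.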