UNIVERSITY OF CENTRAL FLORIDA
Course Code: NUR3165
Instructor:
Date: 2026
FINAL EXAM – NURSING RESEARCH
A team of nurse researchers at UCF Health is investigating whether a
structured hourly rounding protocol reduces fall rates in a medical-
surgical unit compared to standard nursing care. They randomly assign
120 patients to either the intervention group (hourly rounding) or the
control group (standard care) over a 12-week period. Fall incidents are
recorded for both groups. Which research design is being used, and
what is the primary strength of this design in establishing causality?
A) Descriptive correlational design; strength lies in identifying associations
between variables without manipulation
B) Randomized controlled trial (RCT); random assignment controls for
confounding variables by distributing known and unknown confounders equally
between groups, allowing causal inference
C) Quasi-experimental pre-test/post-test design; strength lies in measuring
change within the same group over time
D) Cross-sectional survey design; strength lies in capturing data from large
populations at a single point in time
Correct Answer: B
Rationale: A randomized controlled trial (RCT) is the gold standard
experimental design for establishing cause-and-effect relationships. The
defining feature is random assignment of participants to intervention and
control groups, which distributes both known and unknown confounding
variables approximately equally between groups. This means any observed
difference in outcome (fall rates) can be more confidently attributed to the
intervention (hourly rounding) rather than pre-existing group differences. In
this study, the independent variable is the hourly rounding protocol, the
dependent variable is fall rate, and random allocation controls for
confounders such as age, fall history, and medications. UCF nursing
students must understand that a single RCT provides the highest level of
evidence in the evidence-based practice hierarchy for intervention research,
surpassed only by systematic reviews and meta-analyses of multiple RCTs.
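Random assignment as described in this rationale can be illustrated with a minimal sketch. The scenario does not state how the 120 patients were allocated, so the equal 60/60 split, the arm names, and the fixed seed below are assumptions made only for illustration.

```python
import random


def randomize(patient_ids, seed=None):
    """Shuffle patient IDs and split them into two arms of equal size.

    Assumption: simple randomization with an even split; the exam scenario
    does not specify the allocation method.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "hourly_rounding": shuffled[:half],  # intervention group
        "standard_care": shuffled[half:],    # control group
    }


# Hypothetical patient IDs 1..120; the seed only makes the example repeatable.
arms = randomize(range(1, 121), seed=2026)
print(len(arms["hourly_rounding"]), len(arms["standard_care"]))
```

Because each patient's arm is decided by chance alone, characteristics such as age or fall history cannot systematically steer patients into one group.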
A nurse researcher is conducting a phenomenological study exploring
the lived experience of nurses who provided frontline care during the
COVID-19 pandemic. She recruits 12 ICU nurses from three hospitals
and conducts in-depth semi-structured interviews, each lasting
approximately 90 minutes. Data collection continues until no new
themes emerge from subsequent interviews. Which qualitative concept
describes the point at which the researcher stops recruiting
participants, and which philosophical tradition most directly underpins
phenomenological inquiry?
A) Statistical power; the positivist philosophical tradition emphasizing
objective measurement
B) Data saturation (theoretical saturation); the interpretivist or
constructivist philosophical tradition rooted in Husserlian and Heideggerian
phenomenology
C) Representational sampling adequacy; the pragmatist philosophical tradition
emphasizing utility of findings
D) Content validity threshold; the post-positivist philosophical tradition
emphasizing probabilistic truth
Correct Answer: B
Rationale: Data saturation (also called theoretical saturation) is the
qualitative methodological criterion used to determine sample adequacy,
defined as the point at which no new themes, categories, or patterns emerge
from additional data collection. It replaces the concept of statistical power
used in quantitative research. Phenomenology is grounded in the
interpretivist-constructivist philosophical tradition and is specifically
concerned with understanding the essence of human lived experience as it is
consciously experienced by individuals. Edmund Husserl founded
transcendental phenomenology, emphasizing "bracketing" (epoché) of the
researcher's preconceptions to describe the pure phenomenon. Martin
Heidegger's hermeneutic phenomenology emphasizes the role of the
researcher's being-in-the-world and interpretation of meaning. Nurses
conducting phenomenological research must articulate which philosophical
tradition they are working within because it determines their approach to data
collection, analysis, and researcher positioning. A sample of 12 is
appropriate for in-depth phenomenological work, which values depth over
breadth.
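The stopping rule behind data saturation can be sketched in code. The theme labels are hypothetical, and requiring two consecutive interviews with no new themes (`quiet_run=2`) is an assumption; the scenario says only that collection continues until no new themes emerge.

```python
def saturation_point(codes_per_interview, quiet_run=2):
    """Return the 1-based interview number at which saturation is declared.

    Saturation is declared once `quiet_run` consecutive interviews add no
    themes that were not already seen. Returns None if that never happens.
    """
    seen = set()
    quiet = 0
    for index, codes in enumerate(codes_per_interview, start=1):
        new_codes = set(codes) - seen
        seen |= new_codes
        quiet = 0 if new_codes else quiet + 1
        if quiet >= quiet_run:
            return index
    return None


# Hypothetical themes coded from each successive interview.
interviews = [
    {"moral distress", "fear of infecting family"},
    {"moral distress", "team solidarity"},
    {"isolation", "team solidarity"},
    {"fear of infecting family", "isolation"},
    {"moral distress", "team solidarity"},
]
print(saturation_point(interviews))
```

In practice the judgment is interpretive rather than mechanical, but the sketch shows why sample size in qualitative work is set by the data rather than fixed in advance.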
A nursing researcher wants to measure patient satisfaction with pain
management following abdominal surgery. She develops a new 20-item
Likert-scale instrument. Before using it in her study, she tests it on 30
post-surgical patients at two separate time points, one week apart, and
obtains an intraclass correlation coefficient (ICC) of 0.91. She also asks
three expert nurses in pain management to evaluate whether all items
adequately represent the construct of pain management satisfaction,
obtaining an overall content validity index (CVI) of 0.72. How should the
researcher interpret these two findings, and what action is warranted
regarding the content validity index?
A) ICC of 0.91 indicates poor reliability; CVI of 0.72 is excellent and exceeds
the recommended threshold of 0.70
B) ICC of 0.91 indicates excellent test-retest reliability; CVI of 0.72 falls
below the recommended minimum threshold of 0.80, indicating that some items do
not adequately represent the construct and should be revised or replaced
before using the instrument
C) ICC of 0.91 indicates acceptable inter-rater reliability; CVI of 0.72 is
within acceptable range for instruments used with diverse populations
D) Both values are within acceptable ranges and the instrument is ready for
use in the study without modification
Correct Answer: B
Rationale: The intraclass correlation coefficient (ICC) is used here to assess
test-retest reliability, specifically the consistency of scores obtained from
the same participants at two time points. An ICC of 0.91 is excellent under
the widely cited guideline of 0.75 to 0.90 for good reliability and greater
than 0.90 for excellent reliability (Koo and Li, 2016). However, the content
validity index (CVI) measures the degree to which instrument items represent
the intended content domain, as rated by a panel of subject matter experts.
The recommended minimum acceptable CVI for a multi-item scale is 0.80 (Lynn,
1986; Polit and Beck). When the overall CVI is computed as the average of the
item-level values, 0.80 means at least 80% of expert ratings judge the items
relevant. A CVI of 0.72 falls below this threshold: roughly 28% of the expert
ratings judged items not relevant to pain management satisfaction. The
researcher must have the expert panel identify the problematic items and
revise or replace them before proceeding with data collection, regardless of
the instrument's strong reliability.
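A short sketch shows how a content validity index could be computed from expert ratings. It assumes a 4-point relevance scale on which ratings of 3 or 4 count as relevant, and an overall CVI taken as the average of the item-level values; the scenario states neither. The ratings are hypothetical and cover five items rather than the full 20-item instrument.

```python
def item_cvi(ratings, relevant_min=3):
    """Proportion of experts who rated one item as relevant (3 or 4)."""
    return sum(r >= relevant_min for r in ratings) / len(ratings)


def scale_cvi_ave(rating_matrix):
    """Average of the item-level CVIs across all items."""
    values = [item_cvi(item) for item in rating_matrix]
    return sum(values) / len(values)


# Hypothetical data: one row per item, one column per expert (three experts).
ratings = [
    [4, 4, 3],
    [4, 3, 3],
    [2, 4, 3],
    [1, 2, 4],
    [4, 4, 4],
]

i_cvis = [item_cvi(item) for item in ratings]
# Items whose item-level CVI falls below the 0.80 threshold cited above.
flagged = [number for number, value in enumerate(i_cvis, start=1) if value < 0.80]

print([round(value, 2) for value in i_cvis])
print(round(scale_cvi_ave(ratings), 2), flagged)
```

Listing the flagged items is the practical point: a low overall index says that revision is needed, while the item-level values say where.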
A systematic review published in a nursing journal concludes that
structured nurse-led discharge education significantly reduces 30-day
hospital readmission rates among heart failure patients (pooled risk
ratio 0.68, 95% CI 0.54 to 0.85, p = 0.001, I² = 62%). A nurse manager at
UCF Health is considering implementing this protocol. How should the
nurse manager interpret the I² statistic of 62%, and what does it mean
for applying these findings to practice?
A) I² of 62% indicates that 62% of the included studies found statistically
significant results, strengthening the recommendation for implementation
B) I² of 62% indicates substantial heterogeneity among the included studies,
meaning that approximately 62% of the variance in effect sizes is attributable
to true differences between studies rather than chance, requiring careful
examination of the sources of heterogeneity before broad implementation
C) I² of 62% represents the percentage of studies with low risk of bias,
indicating that 38% of studies had significant methodological flaws that
should be disregarded
D) I² of 62% indicates moderate agreement between independent reviewers on
data extraction, which is acceptable for meta-analysis
Correct Answer: B
Rationale: The I² statistic (I-squared) is a measure of statistical heterogeneity
in meta-analysis, quantifying what proportion of variability in effect estimates
across included studies is due to true between-study differences rather than
sampling error (chance). Rough interpretation bands per the Cochrane Handbook
(Higgins et al.), which overlap by design: 0 to 40% = low heterogeneity (might
not be important); 30 to 60% = moderate heterogeneity; 50 to 90% = substantial
heterogeneity; 75 to 100% = considerable heterogeneity. An I² of 62% indicates
substantial heterogeneity,
meaning the studies included in this meta-analysis varied meaningfully in
their populations, interventions, comparators, outcomes, or methods. This
does not invalidate the meta-analysis but requires the nurse manager to
examine subgroup analyses and moderator variables (patient demographics,
education format, follow-up duration) to understand which specific
implementation contexts produced the strongest effects. Simply adopting the
pooled estimate without contextualizing it to the local population introduces
implementation risk. UCF nursing students must understand that statistical
significance and pooled effect size alone are insufficient for practice
translation when substantial heterogeneity exists.
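The I² statistic is derived from Cochran's Q and its degrees of freedom as I² = max(0, (Q - df) / Q) x 100, where df is the number of studies minus one. The sketch below applies that formula and checks the result against the interpretation bands. The review in the scenario reports neither Q nor the number of studies, so Q = 21.05 and k = 9 are hypothetical values chosen only because they give roughly 62%.

```python
def i_squared(q, k):
    """I-squared (%) from Cochran's Q and the number of studies k."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Overlapping interpretation bands (lower bound, upper bound, label).
BANDS = [
    (0, 40, "low (might not be important)"),
    (30, 60, "moderate"),
    (50, 90, "substantial"),
    (75, 100, "considerable"),
]


def heterogeneity_bands(i2):
    """Return every band label whose range contains the I-squared value."""
    return [label for low, high, label in BANDS if low <= i2 <= high]


# Hypothetical inputs: Q = 21.05 across k = 9 studies.
i2 = i_squared(21.05, 9)
print(round(i2), heterogeneity_bands(i2))
```

Because the bands overlap, some values fall into two categories at once, which is why the thresholds are treated as rough guides rather than strict cut-offs.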
A researcher is studying the relationship between nurses' years of
experience and their confidence in performing evidence-based practice
(EBP) behaviors. She collects data from 200 nurses and calculates a
Pearson correlation coefficient of r = 0.34, p = 0.001. A colleague argues
this finding is clinically meaningless because the r value is small. The
researcher counters that statistical significance and clinical (practical)
significance are distinct. Which concept captures the magnitude of
practical importance independent of sample size and p-value, and how
should r = 0.34 be interpreted?
A) Internal validity; r = 0.34 indicates a strong relationship that explains
most of the variance in EBP confidence
B) Effect size; r = 0.34 represents a medium effect size by Cohen's
conventions (small = 0.10, medium = 0.30, large = 0.50), explaining
approximately 11.6% of variance in EBP confidence (r² = 0.116), which has
modest but meaningful practical significance
C) Statistical power; r = 0.34 indicates that the study was adequately powered
to detect a large effect at alpha 0.05
D) Construct validity; r = 0.34 confirms the instrument validly measures the
intended constructs
Correct Answer: B
Rationale: Effect size quantifies the magnitude of a relationship or difference
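The arithmetic in option B can be checked with a short sketch. The cut-offs are Cohen's benchmarks as quoted in the option; treating each benchmark as the lower bound of its category is an assumption about how to apply them.

```python
def variance_explained(r):
    """Coefficient of determination: the share of variance explained by r."""
    return r ** 2


def cohen_label(r):
    """Classify a correlation using benchmarks of 0.10, 0.30 and 0.50.

    Assumption: each benchmark is treated as the lower bound of its category.
    """
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below small"


r = 0.34
print(round(variance_explained(r) * 100, 1), cohen_label(r))
```

With r = 0.34 the squared value is 0.1156, or about 11.6% of variance, and the correlation sits just above the medium benchmark of 0.30.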