ISYE 7406 Homework 4
1. Introduction
Local smoothing methods constitute a central class of nonparametric regression techniques
designed to estimate complex functional relationships without imposing a predetermined
parametric structure. These approaches are particularly valuable when the underlying regression
function exhibits nonlinear patterns that are difficult to capture with global models. This study
examines the statistical behavior and computational characteristics of three widely applied local
smoothing techniques: Loess (local polynomial regression), Nadaraya–Watson (NW) kernel
smoothing, and spline smoothing.
The objective is to evaluate and compare their empirical bias, empirical variance, and empirical
mean squared error (MSE) when estimating the well-known Mexican hat function:
f(x) = (1 − x²) exp(−0.5x²),  x ∈ [−2π, 2π]
under the additive Gaussian noise model:
Yᵢ = f(xᵢ) + εᵢ,  εᵢ ~ N(0, 0.2²)
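As a minimal sketch of this data-generating model (in Python; the seed is an arbitrary choice for reproducibility, not a value from the report), one simulated dataset on the equidistant design can be produced as:

```python
import numpy as np

def mexican_hat(x):
    """True regression function f(x) = (1 - x^2) * exp(-0.5 * x^2)."""
    return (1 - x**2) * np.exp(-0.5 * x**2)

rng = np.random.default_rng(7406)                   # arbitrary seed
n = 101
x = np.linspace(-2 * np.pi, 2 * np.pi, n)           # equidistant design points
y = mexican_hat(x) + rng.normal(0.0, 0.2, size=n)   # additive N(0, 0.2^2) noise
```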
The Mexican hat function presents several estimation challenges. Its pronounced nonlinearity,
substantial curvature near the origin, and rapidly decaying tails require a method capable of
adapting to varying local structure. These characteristics make it a suitable benchmark for
evaluating the performance of local smoothing procedures.
To evaluate estimator performance, 1,000 Monte Carlo simulations were conducted, each based on n = 101 observations. Two deterministic design settings were examined: equidistant and non-equidistant. For each setting, empirical bias, variance, and mean squared error (MSE)
were calculated at every design point and visualized to facilitate comparison across methods. The
remainder of the report presents the exploratory data analysis, describes the methodology,
summarizes the comparative results, and concludes with a discussion of practical implications.
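The pointwise Monte Carlo summaries described above can be sketched as follows, using a hand-rolled Nadaraya–Watson smoother with a Gaussian kernel as a stand-in for the three methods; the bandwidth h = 0.5 and the seed are illustrative choices, not values taken from the report:

```python
import numpy as np

def mexican_hat(x):
    return (1 - x**2) * np.exp(-0.5 * x**2)

def nw_weights(x, x0, h):
    """Gaussian-kernel Nadaraya-Watson weight matrix: rows index
    evaluation points x0, columns index observations x."""
    k = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return k / k.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)              # arbitrary seed
n, reps, h = 101, 1000, 0.5                 # h = 0.5 is an illustrative bandwidth
x = np.linspace(-2 * np.pi, 2 * np.pi, n)
f = mexican_hat(x)
w = nw_weights(x, x, h)                     # fixed design, so weights computed once

fits = np.empty((reps, n))
for r in range(reps):                       # one smoother fit per replication
    y = f + rng.normal(0.0, 0.2, size=n)
    fits[r] = w @ y

bias = fits.mean(axis=0) - f                # empirical bias at each design point
var = fits.var(axis=0)                      # empirical variance (ddof = 0)
mse = ((fits - f) ** 2).mean(axis=0)        # empirical MSE
```

With the population (ddof = 0) variance, `mse` equals `bias**2 + var` exactly at every design point, which is the decomposition examined throughout the report.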
2. Exploratory Data Analysis
The exploratory analysis examines three components: the structural properties of the Mexican
hat function, the role of additive noise, and the influence of design point distribution on local
smoothing performance. Because the study emphasizes empirical bias, variance, and MSE across
1,000 Monte Carlo replications, the analysis focuses on systematic estimation behavior rather
than single-sample visualization.
2.1 Structural Features of the Mexican Hat Function
The Mexican hat function presents several characteristics that make nonparametric estimation
nontrivial:
1. Nonlinearity and Curvature Variation
Curvature changes substantially across the domain, with a sharp central peak around x=0 and
flatter tails. This heterogeneity requires smoothing methods to adapt locally while
maintaining overall stability.
2. Symmetry
The function is symmetric about zero. Under a symmetric design and sufficient sample size,
fitted curves should approximately preserve this structure. Noticeable asymmetry therefore
signals finite-sample variability or smoothing bias.
3. Rapid Decay in Tails
As |x| increases, the exponential component dominates and the function approaches zero.
In these regions, the signal-to-noise ratio declines, making estimation more sensitive to
random fluctuations and potentially increasing variance.
4. Sign Changes and Oscillation
The function transitions from positive values near the center to negative values in
intermediate regions before returning toward zero. This oscillatory pattern makes the fit
sensitive to oversmoothing, which may dampen peaks and troughs.
Together, these features create a setting where curvature varies sharply across regions, making
bias–variance trade-offs especially visible.
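The four features above can be checked directly from the formula; a brief numeric illustration (Python) is:

```python
import numpy as np

def mexican_hat(x):
    return (1 - x**2) * np.exp(-0.5 * x**2)

# peak of 1 at the origin, zero crossings at |x| = 1, a negative trough,
# and rapid decay toward zero in the tails
print(mexican_hat(0.0))                       # central peak: 1.0
print(mexican_hat(1.0))                       # sign change: 0.0
print(mexican_hat(2.0))                       # negative region: about -0.406
print(mexican_hat(2 * np.pi))                 # tail: essentially 0
print(mexican_hat(1.5) == mexican_hat(-1.5))  # symmetry about zero: True
```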
2.2 Noise Structure and Signal-to-Noise Considerations
The data-generating mechanism follows the additive noise model. With a noise standard
deviation of 0.2, the variability introduced by the error term is not negligible relative to the scale
of the regression function, whose maximum value is approximately 1. This moderate signal-to-
noise ratio produces visibly noisy realizations in each simulated dataset while still preserving the
underlying structure of the function.
Across Monte Carlo replications, several patterns emerge. Individual datasets may deviate
substantially from the true curve, particularly in regions where the signal is weak. Effective
smoothing procedures must therefore mitigate random fluctuations without distorting key
structural features of the function. Excessive flexibility tends to capture noise, resulting in
increased variance, whereas overly aggressive smoothing suppresses genuine features and leads
to increased bias. This controlled noise level provides a clear framework for evaluating empirical
bias–variance trade-offs.
2.3 Equidistant Design: Structural Insights
The equidistant design points are equally spaced over the interval [−2π, 2π]. This configuration
provides uniform coverage of the domain, ensures equal observation density across all regions,
and eliminates artificial clustering effects. As a result, performance differences primarily reflect
the intrinsic properties of the smoothing methods rather than design irregularities.
2.3.1 Mean Fitted Curves
The empirical mean curves, obtained from 1000 Monte Carlo replications, indicate that all three
smoothing methods recover the overall shape of the Mexican hat function. The largest
discrepancies occur near x=0, where curvature is most pronounced, and near the boundaries at
−2π and 2π, reflecting typical edge effects in nonparametric estimation. With uniform spacing,
these differences stem from smoothing behavior rather than data density.
2.3.2 Empirical Bias
The empirical results confirm that both bias and MSE are largest near x=0, increasing substantially
around the central peak while remaining comparatively small in the flatter tail regions. Each
method displays a distinct bias pattern, reflecting differences in how local neighborhoods are
defined or how smoothness penalties are imposed. The central region therefore represents the
primary estimation difficulty: when curvature changes rapidly, smaller neighborhoods may fail to
capture the underlying structure, yet enlarging the smoothing window reduces variance at the
cost of oversmoothing the peak and increasing bias. This behavior highlights the fundamental
trade-off emphasized in the assignment. Notably, even after 1000 Monte Carlo replications,
systematic bias persists in high-curvature areas, suggesting that the effect is structural rather
than a consequence of random variation.
2.3.3 Empirical Variance
The empirical variance remains relatively stable in the interior of the domain but increases near
the boundaries, reflecting typical edge effects in local smoothing. Differences across methods are
apparent and correspond to their relative smoothing intensity, with more flexible approaches
generally exhibiting higher variability. These findings align with the standard decomposition:
MSE = Bias² + Variance
Regions with smaller bias often display comparatively larger variance, and conversely, areas with
reduced variance tend to incur greater bias. This pattern underscores the fundamental trade-off
inherent in nonparametric smoothing.
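The decomposition above is an exact algebraic identity when the variance uses the population (ddof = 0) convention. A small self-contained check with hypothetical numbers (the true value, bias, and noise level below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)                   # arbitrary seed
truth = 0.5                                      # hypothetical true value
# hypothetical estimator draws with a deliberate bias of +0.1
est = truth + 0.1 + rng.normal(0.0, 0.3, size=10_000)

bias = est.mean() - truth
var = est.var()                                  # ddof = 0
mse = ((est - truth) ** 2).mean()
# mse equals bias**2 + var exactly, not just approximately
```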
2.3.4 Empirical MSE
The empirical MSE combines both squared bias and variance, and is highest in the vicinity of x=0,
where curvature is substantial. In the flatter tail regions, MSE values are generally lower. The
magnitude and distribution of MSE are also sensitive to the choice of tuning parameters, further
underscoring the bias–variance trade-off. Because the design is equidistant, this setting isolates
the intrinsic difficulty of estimating the regression function itself, without confounding effects
arising from uneven sampling density.
2.4 Non-Equidistant Design: Additional Challenges
Under the non-equidistant design, points are unevenly distributed over the interval [−2π, 2π]. In contrast to the equidistant configuration, this setting produces clustering in some subregions and relatively sparse spacing in others, resulting in unequal local observation densities.
Consequently, the estimation problem becomes more complex: performance is shaped not only
by the structural features of the regression function but also by the distribution of the design
points themselves.
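The exact uneven spacing used in the study is not restated in this section; purely as an illustration, one deterministic design that clusters points near the origin and spreads them out in the tails is a signed-square map of an equally spaced grid:

```python
import numpy as np

n = 101
u = np.linspace(-1.0, 1.0, n)            # equally spaced base grid
# illustrative deterministic uneven design (not necessarily the report's):
# the signed-square map compresses spacing near 0 and stretches it near +/-2*pi
x_uneven = 2 * np.pi * np.sign(u) * u**2

gaps = np.diff(x_uneven)                 # local spacing varies across the domain
```

Under such a design, the local observation density entering each smoothing window varies with x, which is precisely the complication discussed above.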
2.4.1 Mean Fitted Curves
Compared with the equidistant case, the mean fitted curves display greater fluctuation in
sparsely sampled regions, while densely sampled areas yield smoother and more stable average
fits. Differences among smoothing methods become more pronounced under this design,
reflecting their varying sensitivity to local data density. Estimator performance is therefore
governed by the interaction between function curvature and the spatial distribution of the design points.