Reactive and Cyclic Policy Formulations
Let p(Δ) = 1 / (1 + e^(–a(Δ–b))), with a > 0 and b > 0
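As a quick illustration of the logistic form above, the following Python sketch evaluates p(Δ). The function name p and the default values a = 1 and b = 3 are illustrative assumptions, not values taken from these notes.

```python
import math

def p(delta, a=1.0, b=3.0):
    """Logistic probability p(delta) = 1 / (1 + exp(-a * (delta - b))).

    a > 0 controls the steepness and b > 0 is the midpoint, where p(b) = 0.5.
    The defaults are placeholders chosen only for illustration.
    """
    return 1.0 / (1.0 + math.exp(-a * (delta - b)))

# p is increasing in delta and crosses 0.5 at delta = b.
values = [p(d) for d in (1, 2, 3, 4, 5)]
```

A larger b shifts the curve to the right, so p(1) and p(2) shrink; this is the sense in which "b is large" is used later in these notes.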
Period 1: The state is ((0,1), (Ω2,3), (Ω3,2)). We assume Ω2 > Ω3. The expected downtime is
E[D1] = Ω2 + Ω3. The action is to visit cluster 2 under either the cyclic or the reactive policy.
Period 2: The state is ((x,2), (0,1), (Ω3 + y1,3)), with x ~ Bin(W, p(1)) and
y1 ~ Bin(W – Ω3, p(2)).
The visitation decision is based on comparing x (from cluster 1) with Ω3 + y1 (from cluster 3).
Period 3: We have two cases.
Case A (Branch A, Ω3 + y1 < x): The decision is to visit cluster 1. The period-3 state is then
((0,1), (x,2), (Ω3 + y1 + y2,4)), with x ~ Bin(W, p(1)) and y2 ~ Bin(W – (Ω3 + y1), p(3)).
Thus, the conditional expected downtime is:
E[D3^A | x > Ω3 + y1] = Σ_{x,y1,y2: Ω3+y1 < x}
(W p(1) + (Ω3 + y1) + (W – (Ω3 + y1)) p(3)) ⋅ P(x,y1,y2) / P(x > Ω3 + y1)
The probability of choosing Branch A is π3 = πA = Pr(x > Ω3 + y1), and the probability of Branch B is πB = 1 – πA.
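The Branch A probability can be computed exactly by enumerating y1 and summing the upper tail of x. The sketch below is one way to do this, assuming x and y1 are independent and that W and Ω3 are non-negative integers with Ω3 ≤ W. The names logistic, binom_pmf, and branch_a_probability, and all numeric inputs, are hypothetical.

```python
from math import comb, exp

def logistic(delta, a, b):
    """p(delta) = 1 / (1 + exp(-a * (delta - b)))."""
    return 1.0 / (1.0 + exp(-a * (delta - b)))

def binom_pmf(k, n, q):
    """Binomial probability Bin[k; n, q]; zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * q**k * (1.0 - q) ** (n - k)

def branch_a_probability(W, omega3, a, b):
    """pi_A = Pr(x > omega3 + y1), x ~ Bin(W, p(1)), y1 ~ Bin(W - omega3, p(2)).

    Independence of x and y1 is an assumption of this sketch.
    """
    p1, p2 = logistic(1, a, b), logistic(2, a, b)
    total = 0.0
    for y1 in range(W - omega3 + 1):
        # Upper tail Pr(x > omega3 + y1) for this value of y1.
        tail = sum(binom_pmf(x, W, p1) for x in range(omega3 + y1 + 1, W + 1))
        total += binom_pmf(y1, W - omega3, p2) * tail
    return total

pi_a = branch_a_probability(W=6, omega3=1, a=1.0, b=2.0)
pi_b = 1.0 - pi_a
```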
Period 4: Under Branch A, there are again two options. Option A1 (Ω3 + y1 + y2 ≤ x): visit
cluster 2. The conditional expected downtime is:
E[D4^A1 | x > Ω3 + y1, x ≥ Ω3 + y1 + y2] = Σ_{x,y1,y2: x > Ω3+y1, x ≥ Ω3+y1+y2}
(W p(1) + (Ω3 + y1 + y2) + (W – (Ω3 + y1 + y2)) p(4)) ⋅ P(x,y1,y2)
/ P(x > Ω3 + y1 ∧ x ≥ Ω3 + y1 + y2)
The branch probabilities are π3 = Pr(x > Ω3 + y1) and π4 = Pr(x > Ω3 + y1 + y2 | x > Ω3 + y1), with
π4 = Σ_{x,y1,y2: x > Ω3+y1+y2} P(x,y1,y2) / P(x > Ω3 + y1)
The joint probabilities factor as:
P(x,y1,y2) = Bin[x;W,p(1)] ⋅ Bin[y1;W–Ω3,p(2)] ⋅ Bin[y2;W–(Ω3+y1),p(3)]
P(x,y1,y2,y3) = Bin[x;W,p(1)] ⋅ Bin[y1;W–Ω3,p(2)] ⋅ Bin[y2;W–(Ω3+y1),p(3)] ⋅ Bin[y3;W–(Ω3+y1+y2),p(4)]
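The nested sums can be evaluated by brute-force enumeration for small W. The following sketch builds the three-variable joint probability from the product form above and uses it to compute the Option A1 conditional expected downtime for period 4. All function names and numeric inputs are hypothetical, and the enumeration is only practical for small integer W.

```python
from math import comb, exp

def logistic(delta, a, b):
    """p(delta) = 1 / (1 + exp(-a * (delta - b)))."""
    return 1.0 / (1.0 + exp(-a * (delta - b)))

def binom_pmf(k, n, q):
    """Binomial probability Bin[k; n, q]; zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * q**k * (1.0 - q) ** (n - k)

def joint_pmf(x, y1, y2, W, omega3, a, b):
    """P(x, y1, y2) as the product of the three binomial terms."""
    return (binom_pmf(x, W, logistic(1, a, b))
            * binom_pmf(y1, W - omega3, logistic(2, a, b))
            * binom_pmf(y2, W - (omega3 + y1), logistic(3, a, b)))

def support(W, omega3):
    """All (x, y1, y2) triples with non-zero joint probability."""
    for x in range(W + 1):
        for y1 in range(W - omega3 + 1):
            for y2 in range(W - omega3 - y1 + 1):
                yield x, y1, y2

def expected_downtime_a1(W, omega3, a, b):
    """E[D4^A1 | x > omega3 + y1, x >= omega3 + y1 + y2] by enumeration."""
    p1, p4 = logistic(1, a, b), logistic(4, a, b)
    numerator, event_prob = 0.0, 0.0
    for x, y1, y2 in support(W, omega3):
        s = omega3 + y1 + y2
        if x > omega3 + y1 and x >= s:
            w = joint_pmf(x, y1, y2, W, omega3, a, b)
            numerator += (W * p1 + s + (W - s) * p4) * w
            event_prob += w
    if event_prob == 0.0:
        raise ValueError("conditioning event has zero probability")
    return numerator / event_prob

d4_a1 = expected_downtime_a1(W=6, omega3=1, a=1.0, b=2.0)
```

The same loop structure extends to P(x,y1,y2,y3) by adding one more nested range, at the cost of one more factor of W in running time.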
Period 5, Option A1 (Ω3 + y1 + y2 + y3 < x): visit cluster 1. The state is
((0,1), (x,2), (Ω3 + y1 + y2 + y3 + y4,5)), with y4 ~ Bin(W – Ω3 – y1 – y2 – y3, p(5)).
Conditional expected downtime:
E[D5^A1 | x > Ω3 + y1 + y2 + y3] = Σ_{x,y1,y2,y3,y4: x > Ω3+y1+y2+y3}
(W p(1) + (Ω3 + y1 + y2 + y3) + (W – (Ω3 + y1 + y2 + y3)) p(5)) ⋅ P(x,y1,y2,y3,y4)
/ P(x > Ω3 + y1 + y2 + y3)
Cyclic: Under a cyclic maintenance visitation policy, each cluster is visited every three
periods. In period 1, the downtime is Ω2 + Ω3. In period 2, the expected downtime is
W ⋅ p(1) + Ω3 + (W – Ω3) ⋅ p(2).
From period 3 onward, the expected per-period total downtime across all three clusters is
constant. Clusters at ages 1, 2, and 3 contribute 0, W ⋅ p(1), and W ⋅ p(1) + W ⋅ (1 – p(1)) ⋅ p(2),
respectively, so D^c = D3^c = 2W ⋅ p(1) + W ⋅ (1 – p(1)) ⋅ p(2).
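The cyclic benchmark is simple enough to code directly. The sketch below returns the expected downtime in period t from the three expressions above; the function names and the sample inputs (W = 10, Ω2 = 4, Ω3 = 2, a = 1, b = 3) are illustrative assumptions.

```python
from math import exp

def logistic(delta, a, b):
    """p(delta) = 1 / (1 + exp(-a * (delta - b)))."""
    return 1.0 / (1.0 + exp(-a * (delta - b)))

def cyclic_downtime(t, W, omega2, omega3, a, b):
    """Expected total downtime in period t (t >= 1) under the cyclic policy."""
    p1, p2 = logistic(1, a, b), logistic(2, a, b)
    if t == 1:
        return omega2 + omega3
    if t == 2:
        return W * p1 + omega3 + (W - omega3) * p2
    # From period 3 onward the value no longer depends on t.
    return 2 * W * p1 + W * (1 - p1) * p2

path = [cyclic_downtime(t, 10, 4, 2, 1.0, 3.0) for t in range(1, 6)]
```

Because the steady-state value depends on b only through p(1) and p(2), increasing b lowers D^c monotonically.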
Final Difference and Objective
The final difference at t = 5 is E[D5^r] – E[D5^c] = π3 (Ψ3 + π4 Ψ4 + π4 π5 Ψ5),
where
π3 = Pr(x > Ω3 + y1); π4 = Pr(x > Ω3 + y1 + y2 | x > Ω3 + y1);
π5 = Pr(x > Ω3 + y1 + y2 + y3 | x > Ω3 + y1 + y2)
Ψ3 = E[D3^A | x > Ω3 + y1] – D3^c; Ψ4 = E[D4^A1 | x ≥ Ω3 + y1 + y2] – D3^c;
Ψ5 = E[D5^A1 | x ≥ Ω3 + y1 + y2 + y3] – D3^c
Ideally, I want to show that E[D5^r] – E[D5^c] < 0 when b is large and that
E[D5^r] – E[D5^c] > 0 when b is small. At a minimum, I need to establish some analytical
properties of E[D5^r] – E[D5^c], for example its derivative with respect to x, b, or a.
Alternatively, the same analysis could be carried out for t = 4.
Analytical Approach to Comparing Policies
1. Simplify the Nested Sums
The conditional expectations involve sums over multiple binomial random variables. For
tractability, we approximate the binomial terms using the normal approximation:
Step 1: Normal Approximation of Binomial Variables
If X ~ Bin(n, p), then for large n:
X ≈ N(μ, σ²) where μ = np and σ² = np(1 – p).
We apply this to key random variables in our model:
- x ~ Bin(W, p(1)) ⇒ x ≈ N(μ₁, σ₁²) with μ₁ = W ⋅ p(1), σ₁² = W ⋅ p(1) ⋅ (1 – p(1))
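To get a feel for how well the normal approximation holds, the sketch below compares the exact binomial CDF with the normal CDF at matching mean and variance. The continuity correction (evaluating the normal CDF at k + 0.5) is an added assumption not stated in these notes, and n = 200, q = 0.3 are placeholder values standing in for W and p(1).

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, q):
    """Exact Pr(X <= k) for X ~ Bin(n, q)."""
    return sum(comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(k + 1))

def normal_cdf(z, mu, sigma):
    """Pr(Z <= z) for Z ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + erf((z - mu) / (sigma * sqrt(2.0))))

def max_cdf_gap(n, q):
    """Largest absolute CDF difference between Bin(n, q) and its normal fit."""
    mu = n * q
    sigma = sqrt(n * q * (1.0 - q))
    return max(
        abs(binom_cdf(k, n, q) - normal_cdf(k + 0.5, mu, sigma))
        for k in range(n + 1)
    )

gap_large = max_cdf_gap(200, 0.3)  # large n: approximation should be tight
gap_small = max_cdf_gap(5, 0.3)    # small n: approximation is visibly worse
```

The gap widens when n is small or q is close to 0 or 1, which matters here because p(1) becomes small when b is large.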