## Example Problems on SMARTs

ST790 Homework 4 on SMARTs

Analytical Problems

# 1

## a

There are 4 embedded treatment regimes in this SMART.

• ($e_1$) Give Trt 0, if response continue Trt 0 or if nonresponse continue Trt 0
• ($e_2$) Give Trt 0, if response continue Trt 0 or if nonresponse switch to Trt 1
• ($e_3$) Give Trt 1, if response continue Trt 1 or if nonresponse continue Trt 1
• ($e_4$) Give Trt 1, if response continue Trt 1 or if nonresponse switch to Trt 0

## b

For this SMART, our randomization probabilities are all $\frac{ 1 }{ 2 }$. This gives

$\omega_{1}(H_{1i}, A_{1i}) = \frac{ 1 }{ 2 }.$

At decision 2, if the patient responded, then they will stay on the same treatment with probability 1. Otherwise they will again be randomized to remain on the treatment or switch with probability $\frac{ 1 }{ 2 }$. This gives

$\omega_{2}(H_{2i}, A_{2i}) = R_{2,i}+ (1-R_{2,i}) \frac{ 1 }{ 2 } = \frac{ R_{2,i} +1}{ 2 }.$

For simplicity take $W_i$ such that

$W_i = \frac{ 1 }{ \omega_{1}(H_{1i}, A_{1i}) \omega_{2}(H_{2i}, A_{2i})} = \frac{ 4 }{ R_{2,i} + 1}.$

Also for simplicity, take $\mathcal C_{d,i}$ such that

$\mathcal{C}_{d,i} = I \Big[ A_{1i} = d_1(H_{1i}), A_{2i} = d_2(H_{2i}) \Big] = I\Big[ A_{1i} = d_1(H_{1i}) \Big] I \Big[ A_{2i} = d_2(H_{2i}) \Big].$

Finally we can plug this all into equation 8.3.

\begin{align} \widehat{\mathcal V}(d) & = \left[ \sum_{i=1}^{n} \frac{ I\Big[ A_{1i} = d_1(H_{1i}) \Big] I \Big[ A_{2i} = d_2(H_{2i}) \Big] }{ \omega_{1}(H_{1i}, A_{1i}) \omega_{2}(H_{2i}, A_{2i}) } \right]^{-1} \sum_{i=1}^{n} \frac{ I\Big[ A_{1i} = d_1(H_{1i}) \Big] I \Big[ A_{2i} = d_2(H_{2i}) \Big] Y_i }{ \omega_{1}(H_{1i}, A_{1i}) \omega_{2}(H_{2i}, A_{2i})} \\ & = \Big[ \sum_{i=1}^{n} \mathcal C_{d,i} W_i \Big]^{-1} \sum_{i=1}^{n} \mathcal C_{d,i} W_i Y_i \end{align}

## c

From slide 627, we get the following sampling variance.

\begin{align} \widehat \sigma^2 & = \frac{ 1 }{ n } \sum_{i=1}^{n} \Big[ \frac{ I\Big[ A_{1i} = d_1(H_{1i}) \Big] I \Big[ A_{2i} = d_2(H_{2i}) \Big] (Y_i - \widehat{\mathcal V}(d)}{ \omega_{1}(H_{1i}, A_{1i}) \omega_{2}(H_{2i}, A_{2i})} \Big]^2 \\ & = \frac{ 1 }{ n } \sum_{i=1}^{n} 4 \mathcal C_{d,i} W_i^2 (Y_i - \widehat{\mathcal V}(d))^2 \end{align}

## d 1

Define

$\beta = \begin{bmatrix} \alpha_0 \\ \alpha_1 \end{bmatrix}, \ X_i = \begin{bmatrix} 1 & \mathcal C_{d,i} \end{bmatrix}.$

Then we can write our model as

$Y_i = X \beta + \epsilon_i , \epsilon_i \sim (0, \sigma^2 W_i^{-1}).$

Notice that this is an Aitken model. Then we know

$\widehat \beta = (X^T W X)^{-1} X^T W Y.$

Where

\begin{align} X^T W X &= \begin{bmatrix} 1 & \dots & 1 \\ \mathcal C_{d, 1} & \dots & \mathcal C_{d, n} \end{bmatrix} \begin{bmatrix} W_1 \\ & \ddots \\ & & W_n \end{bmatrix} \begin{bmatrix} 1 & \mathcal C_{d,1} \\ \vdots & \vdots \\ 1 & \mathcal C_{d,n} \end{bmatrix} \\ & = \begin{bmatrix} \sum_{i=1}^{n} W_i & \sum_{i=1}^{n} W_i \mathcal C_{d,i} \\ \sum_{i=1}^{n} W_i \mathcal C_{d,i} & \sum_{i=1}^{n} W_i \mathcal C_{d,i}^2 \end{bmatrix} \\ \\ (X^T W X)^{-1} & = \frac{ 1 }{ \Big( \sum_{i=1}^{n} W_i \Big)\Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i}^2 \Big) - \Big( \sum_{i=1}^{n} W_i \mathcal C_{d,1} \Big)^2 } \begin{bmatrix} \sum_{i=1}^{n} W_i \mathcal C_{d,i}^2 & - \sum_{i=1}^{n} W_i \mathcal C_{d,i} \\ - \sum_{i=1}^{n} W_i \mathcal C_{d,i} & \sum_{i=1}^{n} W_i \end{bmatrix} \\ & = \frac{ 1 }{ \Big( \sum_{i=1}^{n} W_i \Big)\Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i}\Big) - \Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i} \Big)^2 } \begin{bmatrix} \sum_{i=1}^{n} W_i \mathcal C_{d,i} & - \sum_{i=i}^{n} W_i \mathcal C_{d,i} \\ - \sum_{i=1}^{n} W_i \mathcal C_{d,i} & \sum_{i=i}^{n} W_i \end{bmatrix} \end{align}

Thus,

$\widehat \beta = (X^T W X)^{-1} X^T W Y = \frac{ 1 }{ \Big( \sum_{i=1}^{n} W_i \Big)\Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i}\Big) - \Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i} \Big)^2 } \begin{bmatrix} \sum_{i=1}^{n} W_i \mathcal C_{d,i} \cdot \sum_{i=1}^{n} W_i Y_i - \sum_{i=1}^{n} W_i \mathcal C_{d,i} \sum_{i=1}^{n} W_i \mathcal C_{d,i} Y_i \\ - \sum_{i=1}^{n} W_i \mathcal C_{d,i} \cdot \sum_{i=1}^{n} W_i Y_i + \sum_{i=1}^{n} W_i\cdot \sum_{i=1}^{n} W_i \mathcal C_{d,i} Y_i \end{bmatrix}.$

Finally,

\begin{align} \widehat \alpha_0 + \widehat \alpha_1 & = \frac{ 1 }{ \Big( \sum_{i=1}^{n} W_i \Big)\Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i}\Big) - \Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i} \Big)^2 } \\ & \times \Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i} \cdot \sum_{i=1}^{n} W_i Y_i - \sum_{i=1}^{n} W_i \mathcal C_{d,i} \sum_{i=1}^{n} W_i \mathcal C_{d,i} Y_i \Big) \\ & \times \Big( - \sum_{i=1}^{n} W_i \mathcal C_{d,i} \cdot \sum_{i=1}^{n} W_i Y_i + \sum_{i=1}^{n} W_i\cdot \sum_{i=1}^{n} W_i \mathcal C_{d,i} Y_i \Big) \\ & = \frac{ 1 }{ \sum_{i=1}^{n} W_i \mathcal C_{d,i} (\sum_{i=1}^{n} W_i - \sum_{i=1}^{n} W_i \mathcal C_{d,i}) } \\ & \times \sum_{i=1}^{n} W_i \mathcal C_{d,i} Y_i \cdot (\sum_{i=1}^{n} W_i - \sum_{i=1}^{n} W_i \mathcal C_{d,i}) \\ & = \frac{ \sum_{i=1}^{n} W_i \mathcal C_{d,i} Y_i }{ \sum_{i=1}^{n} W_i \mathcal C_{d,i} } \\ & = \Big[ \sum_{i=1}^{n} W_i \mathcal C_{d,i} \Big]^{-1}\sum_{i=1}^{n} W_i \mathcal C_{d,i} Y_i \\ & = \widehat{\mathcal V}(d) \end{align}

## d 2

We can take the variance of our estimator above.

\begin{align} \text{Var}( \widehat \beta ) & = \text{Var}( (X^T W X)^{-1} X^T W Y ) \\ & = \Big( (X^T W X)^{-1} X^T W \Big) \text{Var}( Y ) \Big( (X^T W X)^{-1} X^T W \Big)^T \\ & = (X^T W X)^{-1} X^T W (\widehat \sigma^2 W^{-1}) W X (X^T W X)^{-1} \\ & = \widehat \sigma^2 (X^T W X)^{-1} X^T W X (X^T W X)^{-1} \\ & = \widehat \sigma^2 (X^T W X)^{-1}. \end{align}

Where $\widehat \sigma^2 = \frac{ 1 }{ n-1 } \left( Y^T (W - X(X^T W X) X^Y) Y \right)$.

Now we can find the variance of $\widehat \alpha_0 + \widehat \alpha_1$.

\begin{align} a^T & = \begin{bmatrix} 1 & 1 \end{bmatrix} \\ \\ \text{Var}( \widehat \alpha_0 + \widehat \alpha_1 ) & = \text{Var}( a^T \widehat \beta ) \\ & = a^T \text{Var}( B ) a \\ & = [\widehat \sigma^2 (X^T W X)^{-1}]_{11} + [\widehat \sigma^2 (X^T W X)^{-1}]_{22} + 2 [\widehat \sigma^2 (X^T W X)^{-1}]_{12} \\ & = \frac{ \widehat \sigma^2 }{ \Big( \sum_{i=1}^{n} W_i \Big)\Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i}\Big) - \Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i} \Big)^2 } \Big( \sum_{i=1}^{n} [W_i \mathcal C_{d_i}] + \sum_{i=1}^{n} [W_i] +2 \sum_{i=1}^{n} [-W_i \mathcal C_{d,i}] \Big) \\ & = \widehat \sigma^2 \frac{ \sum_{i=1}^{n} [W_i] -\sum_{i=1}^{n} [W_i \mathcal C_{d,i}] }{ \Big( \sum_{i=1}^{n} W_i \Big)\Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i}\Big) - \Big( \sum_{i=1}^{n} W_i \mathcal C_{d,i} \Big)^2 } \\ & = \frac{ \widehat \sigma^2 }{ \sum_{i=1}^{n} W_i \mathcal C_{d,i} } \end{align}

## e

We can find the asymptotic distribution of $\beta$ using M-Estimation. Then we can use Delta Theorem to find the asymptotic variance of $\alpha_0 +\alpha_1$. Let’s set up our estimating equations.

$X^T W^{-1}(Y- X \widehat \beta) = 0 \Rightarrow \psi = X^T W^{-1}(Y- X \widehat \beta)$

Now we can pieces of our sandwich matrix.

\begin{align} A(\beta) & = E\left[ - \frac{ \partial \psi }{\partial \beta}\right] \\ & = E\left[- X^T W^{-1} X\right] \\ B(\beta) & = E[ \psi \psi^T] \\ & = E[X^T W^{-1}(Y- X \widehat \beta)(Y- X \widehat \beta)^T W^{-1} X] \\ & = E \begin{bmatrix} W^2(Y - \alpha_0 - \alpha_1 \mathcal C_d)^2 & W^2 C_d(Y - \alpha_0 - \alpha_1 \mathcal C_d)^2 \\ W^2 C_d(Y - \alpha_0 - \alpha_1 \mathcal C_d)^2 & W^2 C_d(Y - \alpha_0 - \alpha_1 \mathcal C_d)^2 \\ \end{bmatrix} \end{align}

Now we can find a consistent estimator.

\begin{align} \Sigma & = A(\beta_0)^{-1} B(\beta_0) A(\beta_0)^{-T} \\ \\ \widehat \Sigma & = n \begin{bmatrix} \sum_{i=1}^{n} W_i & \sum_{i=1}^{n} W_i \mathcal C_{d,i} \\ \sum_{i=1}^{n} W_i \mathcal C_{d,i} & \sum_{i=1}^{n} W_i \mathcal C_{d,i} \end{bmatrix}^{-1} \frac{ 1}{ n } \begin{bmatrix} \sum_{i=1}^{n} W_i^2(Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2 & \sum_{i=1}^{n} W_i^2 \mathcal C_{d,i} (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2\\ \sum_{i=1}^{n} W_i^2 \mathcal C_{d,i} (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2 & \sum_{i=1}^{n} W_i^2 \mathcal C_{d,i} (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2 \end{bmatrix} n \begin{bmatrix} \sum_{i=1}^{n} W_i & \sum_{i=1}^{n} W_i \mathcal C_{d,i} \\ \sum_{i=1}^{n} W_i \mathcal C_{d,i} & \sum_{i=1}^{n} W_i \mathcal C_{d,i} \end{bmatrix}^{-1} \end{align}

Notice that many of the elements are repeated. So let’s simplify by saying

\begin{align} a_1 & = \sum_{i=1}^{n} W_i \\ a_2 & = \sum_{i=1}^{n} W_i \mathcal C_{d,i} \\ b_1& = \sum_{i=1}^{n} W_i^2 \mathcal (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2 \\ b_2& = \sum_{i=1}^{n} W_i^2 \mathcal C_{d,i} (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2. \end{align}

(Thank you Samsul for this tip!)

\begin{align} \widehat \Sigma & = n \begin{bmatrix} a_1 & a_2 \\ a_2 & a_2 \end{bmatrix}^{-1} \begin{bmatrix} b_1 & b_2 \\ b_2 & b_2 \end{bmatrix} \begin{bmatrix} a_1 & a_2 \\ a_2 & a_2 \end{bmatrix}^{-1} \\ &= \frac{ n }{ (a_1 - a_2)^2 } \begin{bmatrix} b_1 - b_2 & b_2 - b_1 \\ b_2 - b_1 & \frac{ 1 }{ a_2^2 } a_2^2 b_1 + a_1^2 b_2 - 2a_1 a_2 b_2) \end{bmatrix} \end{align}

However, this estimates the variance of $\beta$. Now we can estimate the variance of $\alpha_0 + \alpha_1$.

\begin{align} \text{Var}( \widehat \alpha_0 + \widehat \alpha_1 ) & = \begin{bmatrix} 1 & 1 \end{bmatrix} \widehat \Sigma \begin{bmatrix} 1 \\ 1 \end{bmatrix} \\ & = \frac{ n }{ (a_1 - a_2)^2 } \begin{bmatrix} 0 & \frac{ b_2 }{ a_2^2 } \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \\ & = \frac{ n b_2 }{ a_2^2 } \\ & = \frac{ \frac{ 1 }{ n } \sum_{i=1}^{n} W_i^2 \mathcal C_{d,i} (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2 }{ \left( \frac{ 1 }{ n }\sum_{i=1}^{n} W_i \mathcal C_{d,i} \right)^2 }. \end{align}

Now we can look at these pieces asymptotically.

\begin{align} \frac{ 1 }{ n }\sum_{i=1}^{n} W_i \mathcal C_{d,i} & \stackrel{ \text{p}}{\rightarrow} E[W C_{d}] \\ & = W \frac{ 1 }{ W } \\ & = 1 \\ \\ \frac{ 1 }{ n } \sum_{i=1}^{n} W_i^2 \mathcal C_{d,i} (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2 & \stackrel{ \text{p}}{\rightarrow} E\left[ W_i^2 \mathcal C_{d,i} (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2 \right] \\ & = E\left[ W_i^2 \mathcal C_{d,i} (Y_i - \widehat \alpha - \widehat \alpha \mathcal C_{d,i})^2 \right] \\ & = W^2 E\left[ \mathcal C_d (Y - X \beta)^2 \right] \\ & = W^2 E\left[ \mathcal C_d (Y - E(Y| \mathcal C_d))^2 \right] \\ & = W^2 E\left[ E \left\{ \mathcal C_d (Y - E(Y| \mathcal C_d))^2 \right\} | \mathcal C_d \right] \\ & = W^2 E\left[ E \left\{ \mathcal C_d (Y - E(Y| \mathcal C_d))^2 \right\} | \mathcal C_d = 1 \right] P(\mathcal C_d = 1) \\ & = W^2 E\left[ E \left\{ 1 \cdot (Y - E(Y| \mathcal C_d = 1))^2 \right\} | \mathcal C_d = 1 \right] P(\mathcal C_d = 1) \\ & = W^2 E\left[ (Y^*(d) - \mathcal V(d))^2 \right] \frac{ 1 }{ W } \\ & = E \left[\frac{ (Y^*(d) - \mathcal V(d))^2 }{ \omega_1(H_1, d_1(H_1))\omega_2(H_2, d_2(H_2)) } \right] \\ & = \sigma^2_d \end{align}

Now, by the continuous mapping theorem,

$\text{Var}( \widehat \alpha_0 + \alpha_1 ) \stackrel{ \text{p}}{\rightarrow} \frac{ \sigma^2_d }{ 1^2 } = \sigma^2_d.$