Example Problems on LRT and UMP

ST702 Homework 8 on LRT and UMP

8.17

Suppose that $X_1, \dots , X_n \stackrel{ \text{iid}}{\sim} beta(\mu, 1)$ and $Y_1, \dots , Y_m \stackrel{ \text{iid}}{\sim} beta(\theta, 1)$. Also assume that the $X$s are independent of the $Y$s.

a

Find the LRT of $H_0: \theta = \mu \text{ vs. } H_1: \theta \neq \mu$.

We can start by finding the likelihood.

\[\begin{align} L(\mu , \theta | x ,y ) & = \prod_{i=1}^{n} \frac{ \Gamma(\mu + 1) }{ \Gamma(\mu) \Gamma(1) } x_i^{\mu-1}(1-x_i)^{1-1} \prod_{j=1}^{m} \frac{ \Gamma(\theta + 1) }{ \Gamma(\theta) \Gamma(1) } y_j^{\theta-1}(1-y_j)^{1-1} \\ & = \prod_{i=1}^{n} \mu x_i^{\mu-1} \prod_{j=1}^{m} \theta y_j^{\theta-1} \\ & = \mu^n \Big( \prod_{i=1}^{n} x_i \Big)^{\mu-1} \theta^m \Big( \prod_{j=1}^m y_j \Big)^{\theta - 1} \\ \\ \ell(\mu , \theta | x ,y ) & = n \log(\mu) + (\mu-1) \sum \log(x_i) + m \log(\theta) + (\theta-1) \sum \log(y_j) \end{align}\]

Then we can maximize with respect to both parameters by setting the partial derivatives to zero.

\[\begin{align} \frac{ \partial \ell }{\partial \mu } & = \frac{ n }{ \mu } + \sum \log(x_i) \\ \widehat \mu & = \frac{ n }{ - \sum \log(x_i) } \\ \frac{ \partial \ell }{\partial \theta } & = \frac{ m }{ \theta } + \sum \log(y_j) \\ \widehat \theta & = \frac{ m }{ - \sum \log(y_j) } \end{align}\]
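As a quick sanity check on these closed forms (not part of the written solution; the parameter value and sample size below are arbitrary), we can compare $\widehat \mu$ against a numerical maximizer on simulated data:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
mu, n = 2.5, 200
x = rng.beta(mu, 1, size=n)  # X_i ~ beta(mu, 1)

mu_hat = n / -np.log(x).sum()  # closed-form MLE from above

# negative log-likelihood: -(n log(mu) + (mu - 1) sum(log x_i))
nll = lambda m: -(n * np.log(m) + (m - 1) * np.log(x).sum())
res = minimize_scalar(nll, bounds=(0.01, 10), method="bounded")

print(mu_hat, res.x)  # the two estimates should agree closely
```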

Under $H_0$,

\[\begin{align} L(\mu | x ,y ) & = \mu^{n+m} \Big( \prod_{i=1}^{n} x_i \prod_{j=1}^m y_j \Big)^{\mu-1} \\ \ell(\mu | x ,y ) & = (n + m) \log(\mu) + (\mu - 1) \Big( \sum \log(x_i) + \sum \log(y_j) \Big) \\ \frac{ \partial \ell }{\partial \mu } & = \frac{ n+m }{ \mu } + \sum \log(x_i) + \sum \log(y_j) \\ \widehat \mu_0 & = \frac{ n+m }{ -\sum \log(x_i) - \sum \log(y_j) } \end{align}\]

Then,

\[\begin{align} \lambda(x,y) & = \frac{ \Big( \frac{ n+m }{ -\sum \log(x_i) - \sum \log(y_j) } \Big)^{n+m} \Big( \prod_{i=1}^{n} x_i \prod_{j=1}^m y_j \Big)^{\Big( \frac{ n+m }{ -\sum \log(x_i) - \sum \log(y_j) } \Big)-1} }{ \Big( \frac{ n }{ - \sum \log(x_i) } \Big)^n \Big( \prod_{i=1}^{n} x_i \Big)^{\Big( \frac{ n }{ - \sum \log(x_i) } \Big)-1} \Big( \frac{ m }{ - \sum \log(y_j) } \Big)^m \Big( \prod_{j=1}^m y_j \Big)^{\Big( \frac{ m }{ - \sum \log(y_j) } \Big) - 1} } \\ & = \frac{ \Big( \frac{ n+m }{ -\sum \log(x_i) - \sum \log(y_j) } \Big)^{n+m} }{ \Big( \frac{ n }{ - \sum \log(x_i) } \Big)^n \Big( \frac{ m }{ - \sum \log(y_j) } \Big)^m } \\ \end{align}\]

The product terms cancel because $\widehat \mu_0 \big( \sum \log(x_i) + \sum \log(y_j) \big) - \widehat \mu \sum \log(x_i) - \widehat \theta \sum \log(y_j) = -(n+m) + n + m = 0$.

b

Show that the test in part (a) can be based on the statistic

\[T = \frac{ \sum \log(X_i) }{ \sum \log (X_i) + \sum \log (Y_i) }\] \[\begin{align} \lambda(x,y) & = \frac{ \Big( \frac{ n+m }{ -\sum \log(x_i) - \sum \log(y_j) } \Big)^{n+m} }{ \Big( \frac{ n }{ - \sum \log(x_i) } \Big)^n \Big( \frac{ m }{ - \sum \log(y_j) } \Big)^m } \\ \\ & = \frac{ (n+m)^{n+m} }{ n^n m^m } \frac{ \big(- \sum \log(x_i)\big)^n \big(-\sum \log(y_j)\big)^m }{ \big(-\sum \log(x_i) - \sum \log(y_j)\big)^{n+m} } \\ & = \frac{ (n+m)^{n+m} }{ n^n m^m } T^n (1-T)^m \end{align}\]
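We can verify this algebra numerically (a sketch with arbitrary data and parameters, not part of the solution):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 5
x = rng.beta(2.0, 1, size=n)
y = rng.beta(3.0, 1, size=m)

sx, sy = np.log(x).sum(), np.log(y).sum()
mu0 = (n + m) / (-sx - sy)           # restricted MLE under H0
mu_hat, th_hat = n / -sx, m / -sy    # unrestricted MLEs

def loglik(mu, th):
    return n * np.log(mu) + (mu - 1) * sx + m * np.log(th) + (th - 1) * sy

lam = np.exp(loglik(mu0, mu0) - loglik(mu_hat, th_hat))

T = sx / (sx + sy)
lam_T = (n + m) ** (n + m) / (n ** n * m ** m) * T ** n * (1 - T) ** m
print(lam, lam_T)  # the two should match
```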

Since $\lambda$ is small exactly when $T$ is near 0 or 1, reject when $T \leq c_1$ or when $T \geq c_2$ (a two-sided test).

c

Find the distribution of $T$ when $H_0$ is true, and then show how to get a test of size $\alpha = 0.10$.

If we take $A_i = -\log(X_i)$, then $A_i \sim Exponential(\frac{ 1 }{ \mu })$ (mean $\frac{ 1 }{ \mu }$). Similarly, $B_j = -\log(Y_j) \sim Exponential(\frac{ 1 }{ \theta })$. Then,

\[\begin{align} T & = \frac{ \sum A_i }{ \sum A_i + \sum B_j } \\ & = \frac{ Gamma(n, \frac{ 1 }{ \mu }) }{ Gamma(n, \frac{ 1 }{ \mu }) + Gamma(m, \frac{ 1 }{ \theta }) } \\ & \stackrel{H_0}{=} \frac{ Gamma(n, \frac{ 1 }{ \mu }) }{ Gamma(n, \frac{ 1 }{ \mu }) + Gamma(m, \frac{ 1 }{ \mu }) } \\ & \sim Beta(n, m) \end{align}\]
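A quick Monte Carlo check of this null distribution (arbitrary $\mu$, $n$, $m$; not part of the solution):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mu, n, m, reps = 1.7, 6, 9, 20_000
x = rng.beta(mu, 1, size=(reps, n))
y = rng.beta(mu, 1, size=(reps, m))  # theta = mu under H0

sx = np.log(x).sum(axis=1)
sy = np.log(y).sum(axis=1)
T = sx / (sx + sy)

# compare the simulated T values to the Beta(n, m) cdf
print(stats.kstest(T, stats.beta(n, m).cdf))  # large p-value expected
```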

We could find an $\alpha = 0.10$ test by finding $c_1$ and $c_2$ such that

\[P(T \leq c_1) + P(T \geq c_2) = \alpha\]

(so that the test has size $\alpha$) and, so that both cutoffs correspond to the same value of $\lambda$,

\[(1-c_1)^m c_1^n = (1-c_2)^m c_2^n.\]
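For concrete $n$ and $m$ (arbitrary choices below), the two conditions can be solved numerically, e.g. with `scipy.optimize.fsolve`:

```python
import numpy as np
from scipy import stats
from scipy.optimize import fsolve

n, m, alpha = 6, 9, 0.10
null = stats.beta(n, m)  # distribution of T under H0

def conditions(c):
    c1, c2 = c
    return [null.cdf(c1) + null.sf(c2) - alpha,          # total size alpha
            c1**n * (1 - c1)**m - c2**n * (1 - c2)**m]   # equal lambda at cutoffs

c1, c2 = fsolve(conditions, x0=[null.ppf(0.05), null.ppf(0.95)])
print(c1, c2)
```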

8.18

Let $X_1, \dots , X_n \stackrel{ \text{iid}}{\sim} n(\theta, \sigma^2)$, $\sigma^2$ known. An LRT of $H_0: \theta = \theta_0 \text{ vs. } H_1: \theta \neq \theta_0$ is a test that rejects $H_0$ if $| \overline{ X } - \theta_0 | / (\sigma / \sqrt{ n }) >c$.

a

Find an expression, in terms of standard normal probabilities, for the power function of this test.

Notice that $\frac{ \overline{ X } - \theta}{ \sigma / \sqrt{ n } } = Z \sim N(0,1)$, so we will try to get things in terms of $Z$.

\[\begin{align} \beta(\theta) & = P_\theta \Big( \frac{ | \overline{ X } - \theta_0 | }{ \sigma / \sqrt{ n } } >c \Big) \\ & = 1 - P_\theta \Big( \frac{ | \overline{ X } - \theta_0 | }{ \sigma / \sqrt{ n } } \leq c \Big) \\ & = 1 - P_\theta \Big( -c \leq \frac{ \overline{ X } - \theta_0 }{ \sigma / \sqrt{ n } } \leq c \Big) \\ & = 1 - P_\theta \Big( \theta_0 - \frac{c \sigma }{ \sqrt{ n } } \leq \overline{ X } \leq \theta_0 + \frac{c \sigma }{ \sqrt{ n } } \Big) \\ & = 1 - P_\theta \Big( \frac{ \theta_0 - \theta }{ \sigma / \sqrt{ n } } - c \leq \frac{ \overline{ X } - \theta}{ \sigma / \sqrt{ n } } \leq \frac{ \theta_0 - \theta }{ \sigma / \sqrt{ n } } + c \Big) \\ & = 1 - P_\theta \Big( \frac{ \theta_0 - \theta }{ \sigma / \sqrt{ n } } - c \leq Z \leq \frac{ \theta_0 - \theta }{ \sigma / \sqrt{ n } } + c \Big) \\ & = 1 - \Phi\Big(\frac{ \theta_0 - \theta }{ \sigma / \sqrt{ n } } + c\Big) + \Phi\Big(\frac{ \theta_0 - \theta }{ \sigma / \sqrt{ n } } - c\Big) \end{align}\]
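A minimal sketch of this power function (the values of $\theta_0$, $\sigma$, $n$, and $c$ below are arbitrary defaults):

```python
import numpy as np
from scipy.stats import norm

def power(theta, theta0=0.0, sigma=1.0, n=25, c=1.96):
    """Power of the test that rejects when |Xbar - theta0| / (sigma/sqrt(n)) > c."""
    shift = (theta0 - theta) / (sigma / np.sqrt(n))
    return 1 - norm.cdf(shift + c) + norm.cdf(shift - c)

print(power(0.0))  # at theta = theta0 this is the size: 2(1 - Phi(1.96)) = 0.05
print(power(0.5))  # power increases as theta moves away from theta0
```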

b

The experimenter desires a Type I Error probability of 0.05 and a maximum Type II Error probability of 0.25 at $\theta = \theta_0 + \sigma$. Find the values of $n$ and $c$ that will achieve this.

We want the size to be 0.05. This means,

\[\beta(\theta_0) = 1 - \Phi(c) + \Phi(-c) = 0.05.\]

Using a $Z$ table we know that we want $c = 1.96$. Since $\beta$ gives $1 - \text{Type II error}$,

\[\begin{align} \beta(\theta_0 + \sigma) & = 1 - \Phi(c - \sqrt{ n }) + \Phi(-c - \sqrt{ n }) \geq 0.75 \\ \Phi(c - \sqrt{ n }) - \Phi(-c - \sqrt{ n }) & \leq 0.25 \\ \Phi(c - \sqrt{ n }) & \leq 0.25 \end{align}\]

We take the second term, $\Phi(-c - \sqrt{ n })$, to be 0 since $-c - \sqrt{ n }$ is far in the lower tail. Solving at equality, we want

\[c - \sqrt{ n } = 1.96 - \sqrt{ n } = x \quad \text{ such that } \quad \Phi(x) = 0.25.\]

Using a $Z$ table gives us $x = -0.675$.

\[\begin{align} 1.96 - \sqrt{ n } & = -0.675 \\ n & = (1.96 + 0.675)^2 = 6.94 \\ & \approx 7 \end{align}\]
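Checking that $c = 1.96$ and $n = 7$ meet both requirements (a sanity check, not part of the solution):

```python
import numpy as np
from scipy.stats import norm

c, n = 1.96, 7
size = 2 * (1 - norm.cdf(c))                                   # Type I error at theta_0
type2 = norm.cdf(c - np.sqrt(n)) - norm.cdf(-c - np.sqrt(n))   # Type II error at theta_0 + sigma
print(size, type2)  # approx 0.05 and approx 0.246 < 0.25
```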

8.19

The random variable $X$ has pdf $f(x) = e^{-x}, x > 0$. One observation is obtained on the random variable $Y = X^\theta$, and a test of $H_0: \theta = 1 \text{ vs. } H_1: \theta = 2$ needs to be constructed. Find the UMP level $\alpha = 0.10$ test and compute the Type II Error probability.

Under $H_0$, $Y = X$. So $f_Y = f_X$.

Under $H_1$,

\[\begin{align} F_Y(y) & = P(Y \leq y) \\ & = P(X^2 \leq y)\\ & = P(-\sqrt{ y } \leq X \leq \sqrt{ y }) \\ & = P(X \leq \sqrt{ y }) & y > 0 \\ & = F_X(\sqrt{ y }) \\ \\ f_Y(y) & = \frac{ 1 }{ 2\sqrt{ y } } f_X(\sqrt{ y }) \\ & = \frac{ 1 }{ 2 \sqrt{ y } } e^{-\sqrt{ y }} \end{align}\]
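A Monte Carlo check of this transformation (not part of the solution): under $\theta = 2$, $P(Y \leq y) = 1 - e^{-\sqrt{ y }}$, which we can test against simulated draws of $Y = X^2$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
y = rng.exponential(size=100_000) ** 2  # draws of Y = X^2 with X ~ Exp(1)

# compare simulated Y values against the derived cdf
print(stats.kstest(y, lambda t: 1 - np.exp(-np.sqrt(t))))  # large p-value expected
```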

Then our UMP test will reject if

\[k < \frac{ f_Y(y | \theta = 2) }{ f_Y(y | \theta = 1) } = \frac{ \frac{ 1 }{ 2 \sqrt{ y } } e^{-\sqrt{ y }} }{ e^{-y} } = \frac{ e^{-\sqrt{ y } +y } }{ 2\sqrt{ y } }\]

Taking the derivative of this ratio and setting it to 0 gives $y=1$. The derivative is negative for $y < 1$ (e.g., at $y = 0.5$) and positive for $y > 1$ (e.g., at $y = 3$), so the ratio is decreasing before 1 and increasing after 1. The ratio is therefore large when $y$ is near 0 or when $y$ is large, giving a rejection region of $y \leq c_1$ or $y \geq c_2$. We will reject with $\alpha = 0.10$ if

\[\begin{align} 0.10 & = P(Y \leq c_1 | \theta = 1) + P(Y \geq c_2 | \theta = 1) \\ & = (1 - e^{-c_1}) + e^{-c_2} \\ \\ \frac{ f_Y(c_1 | \theta = 2) }{ f_Y(c_1 | \theta = 1) } & = \frac{ f_Y(c_2 | \theta = 2) }{ f_Y(c_2 | \theta = 1) }. \end{align}\]

Solving these gives $c_1 = 0.077$ and $c_2 = 3.638$. Thus, the Type II Error probability is,

\[P(c_1 < Y < c_2 | \theta = 2) = \int_{0.077}^{3.638} \frac{ 1 }{ 2 } y^{1/2 - 1} e^{-y^{1/2}} \, dy = 0.609\]
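These numbers can be reproduced by solving the two conditions with `scipy.optimize.fsolve` and integrating numerically (a sketch; the starting points are arbitrary):

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.integrate import quad

ratio = lambda y: np.exp(y - np.sqrt(y)) / (2 * np.sqrt(y))  # f(y|theta=2) / f(y|theta=1)

def conditions(c):
    c1, c2 = c
    return [(1 - np.exp(-c1)) + np.exp(-c2) - 0.10,  # size 0.10 under theta = 1
            ratio(c1) - ratio(c2)]                   # equal ratio at both cutoffs

c1, c2 = fsolve(conditions, x0=[0.1, 3.5])

f2 = lambda y: np.exp(-np.sqrt(y)) / (2 * np.sqrt(y))  # density of Y under theta = 2
type2, _ = quad(f2, c1, c2)
print(c1, c2, type2)  # approx 0.077, 3.638, 0.609
```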

8.20

Let $X$ be a random variable whose pmf under $H_0$ and $H_1$ is given by

\[\begin{array}{c c c c c c c c} x & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline f(x | H_0) & .01 & .01 & .01 & .01 & .01 & .01 & .94 \\ f(x | H_1) & .06 & .05 & .04 & .03 & .02 & .01 & .79 \end{array}\]

Use the Neyman-Pearson Lemma to find the most powerful test for $H_0$ versus $H_1$ with size $\alpha = 0.04$. Compute the probability of Type II Error for this test.

First we can take the ratio of the two pmfs to find the most powerful test (by the Neyman-Pearson Lemma).

\[\begin{array}{c c c c c c c c} x & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline f(x | H_0) & .01 & .01 & .01 & .01 & .01 & .01 & .94 \\ f(x | H_1) & .06 & .05 & .04 & .03 & .02 & .01 & .79 \\ \hline \frac{ f(x | H_1) }{ f(x | H_0) } & 6 & 5 & 4 & 3 & 2 & 1 & .84 \end{array}\]

We want to reject when the ratio is large, which corresponds to smaller values of $x$.

We want to find $c$ such that $P(X \leq c | H_0) = 0.04$. We can find $c$ by adding up values of the pmf under $H_0$. Then,

\[F(4 | H_0) = f(1|H_0) + f(2 | H_0) + f(3 | H_0) + f(4 | H_0) = 0.01 + 0.01 + 0.01 + 0.01 = 0.04.\]

So take $c = 4$. The probability of Type II error is then $P(X = 5, 6, 7 | H_1) = 0.02 + 0.01 + 0.79 = 0.82$.
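The whole procedure is easy to check numerically (a sketch, not part of the solution): sort the sample points by likelihood ratio, accumulate null probability up to $\alpha = 0.04$, and sum the $H_1$ probabilities of the acceptance region:

```python
import numpy as np

f0 = np.array([.01, .01, .01, .01, .01, .01, .94])  # pmf under H0, x = 1..7
f1 = np.array([.06, .05, .04, .03, .02, .01, .79])  # pmf under H1

order = np.argsort(f1 / f0)[::-1]      # sample points, most extreme ratio first
cum = np.cumsum(f0[order])
reject = order[cum <= 0.04 + 1e-9]     # tolerance guards against float rounding

print(np.sort(reject) + 1)             # rejection region: x in {1, 2, 3, 4}
print(1 - f1[reject].sum())            # Type II error: 0.82
```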