Almost sure hypothesis testing
From Wikipedia, the free encyclopedia
In statistics, almost sure hypothesis testing, or a.s. hypothesis testing, uses almost sure convergence to determine the validity of a statistical hypothesis with probability one. That is, whenever the null hypothesis is true, an a.s. hypothesis test fails to reject the null hypothesis with probability one (w.p. 1) for all sufficiently large samples. Similarly, whenever the alternative hypothesis is true, an a.s. hypothesis test rejects the null hypothesis w.p. 1 for all sufficiently large samples. Along similar lines, an a.s. confidence interval eventually contains the parameter of interest w.p. 1. Dembo and Peres (1994) proved the existence of almost sure hypothesis tests.
For simplicity, assume we have a sequence of independent and identically distributed normal random variables, $x_i \sim N(\mu, 1)$, with mean $\mu$ and unit variance. Suppose that nature or simulation has chosen the true mean to be $\mu_0$; then the probability distribution function of the mean, $\mu$, is given by
$$\Pr(\mu \le t) = [\mu_0 \le t],$$
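where an Iverson bracket has been used. A naïve approach to estimating this distribution function would be to replace the true mean on the right-hand side with an estimate such as the sample mean, $\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^{n} x_i$, but, writing $\Phi$ for the standard normal cumulative distribution function,
$$E\big[[\hat{\mu}_n \le t]\big] = \Pr(\hat{\mu}_n \le t) = \Phi\big(\sqrt{n}\,(t - \mu_0)\big) \to [\mu_0 < t] + 0.5\,[\mu_0 = t] \quad \text{as } n \to \infty,$$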
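which means the approximation to the true distribution function will be off by 0.5 at the true mean. However, $[\hat{\mu}_n \le t]$, with $\hat{\mu}_n$ the sample mean, is nothing more than the indicator of a one-sided 50% confidence interval; more generally, if $Z_\alpha$ is the critical value used in a one-sided $1-\alpha$ confidence interval, then
$$E\big[[\hat{\mu}_n - Z_\alpha/\sqrt{n} \le t]\big] = \Phi\big(Z_\alpha + \sqrt{n}\,(t - \mu_0)\big) \to [\mu_0 < t] + (1-\alpha)\,[\mu_0 = t] \quad \text{as } n \to \infty.$$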
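If we set $\alpha = 0.05$, then the error of the approximation is reduced from 0.5 to 0.05, which is a factor of 10. Of course, if we let $\alpha = \alpha_n \to 0$ as $n \to \infty$ (slowly enough that $Z_{\alpha_n}/\sqrt{n} \to 0$), then
$$E\big[[\hat{\mu}_n - Z_{\alpha_n}/\sqrt{n} \le t]\big] \to [\mu_0 \le t].$$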
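The size of the error at the true mean can be checked numerically. The following is a minimal sketch, assuming the standard normal-theory expression $\Phi\big(Z_\alpha + \sqrt{n}\,(t - \mu_0)\big)$ for the expected value of the indicator of the one-sided interval with lower bound $\hat{\mu}_n - Z_\alpha/\sqrt{n}$. The function name `expected_indicator` and the values of `mu0` and `n` are illustrative choices, not taken from the cited sources.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def expected_indicator(t, mu0, n, alpha):
    """Expected value of [mu_hat - Z_alpha / sqrt(n) <= t] for i.i.d. N(mu0, 1) data.

    Assumes the closed form Phi(Z_alpha + sqrt(n) * (t - mu0)).
    """
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)  # one-sided critical value
    return STD_NORMAL.cdf(z_alpha + sqrt(n) * (t - mu0))


mu0, n = 0.0, 10_000  # illustrative values (assumption)
for alpha in (0.5, 0.05, n ** -2.0):
    at_truth = expected_indicator(mu0, mu0, n, alpha)
    # The true distribution function equals 1 at t = mu0, so the error is 1 - E[...].
    print(f"alpha={alpha:g}: error at t=mu0 is {1.0 - at_truth:.3g}")
```

At $t = \mu_0$ the expression reduces to $1 - \alpha$, so the error is $\alpha$ itself, whatever the sample size. That is why a fixed $\alpha$ never removes the discrepancy and a shrinking $\alpha_n$ is needed.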
However, this only shows that the expectation is close to the limiting value. Naaman (2016) showed that setting the significance level at $\alpha_n = n^{-p}$ with $p > 1$ results in a finite number of type I and type II errors w.p. 1 under fairly mild regularity conditions. This means that for each $t$, there exists an $N(t)$ such that for all $n > N(t)$,
$$[\hat{\mu}_n - Z_{\alpha_n}/\sqrt{n} \le t] = [\mu_0 \le t],$$
where the equality holds w.p. 1. So the indicator function of a one-sided a.s. confidence interval is a good approximation to the true distribution function.
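The eventual agreement between the interval indicator and the true distribution function can be illustrated by simulation. The sketch below assumes unit-variance normal data, a significance level of the form $\alpha_n = n^{-p}$ with $p = 2$, and the lower confidence bound $\hat{\mu}_n - Z_{\alpha_n}/\sqrt{n}$. The function `count_errors`, its defaults, and the seed are hypothetical choices made for illustration. A finite simulation can only suggest, not prove, an almost sure statement.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def count_errors(t, mu0=0.0, p=2.0, n_max=5_000, seed=0):
    """Count the sample sizes n at which the indicator differs from [mu0 <= t].

    Returns (number of disagreements, largest n with a disagreement or 0).
    """
    rng = random.Random(seed)
    truth = mu0 <= t  # value of the true distribution function at t
    total = 0.0
    errors = 0
    last_error_n = 0
    for n in range(1, n_max + 1):
        total += rng.gauss(mu0, 1.0)  # draw x_n and update the running sum
        if n < 2:
            continue  # alpha_1 = 1 has no finite critical value
        alpha_n = n ** -p  # assumed significance-level schedule
        z = STD_NORMAL.inv_cdf(1.0 - alpha_n)
        lower_bound = total / n - z / sqrt(n)
        if (lower_bound <= t) != truth:
            errors += 1
            last_error_n = n
    return errors, last_error_n


for t in (-0.1, 0.0, 0.1):
    errors, last_n = count_errors(t)
    print(f"t={t:+.1f}: {errors} disagreements, last at n={last_n}")
```

For $t$ just below $\mu_0$, disagreements are expected at small $n$, where the interval is still wide, and should stop once the lower bound rises above $t$. At $t = \mu_0$ each disagreement has probability $\alpha_n$, and those probabilities sum to a finite value when $p > 1$.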