Cramér–von Mises criterion

Statistical test From Wikipedia, the free encyclopedia

In statistics, the Cramér–von Mises criterion is a criterion used for judging the goodness of fit of a cumulative distribution function (CDF) $F^*$ compared to a given empirical distribution function $F_n$, or for comparing two empirical distributions. It is also used as a part of other algorithms, such as minimum distance estimation. It is defined as

$$\omega^2 = \int_{-\infty}^{\infty} \left[F_n(x) - F^*(x)\right]^2 \, \mathrm{d}F^*(x).$$

The shaded area indicates the region that contributes to $\omega^2$.

In one-sample applications, $F^*$ is the theoretical distribution and $F_n$ is the empirically observed distribution. Alternatively, the two distributions can both be empirically estimated ones; this is called the two-sample case.

The criterion is named after Harald Cramér and Richard Edler von Mises, who first proposed it in 1928–1930.[1][2] The generalization to two samples is due to Anderson.[3]

The Cramér–von Mises test is an alternative to the Kolmogorov–Smirnov test (1933).[4]

Cramér–von Mises test (one sample)

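Let $x_1, x_2, \ldots, x_n$ be the observed values, in increasing order. Then the test statistic is[3]:1153[5]

$$T = n\omega^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[\frac{2i-1}{2n} - F(x_i)\right]^2.$$

If this value is larger than the tabulated value, then the hypothesis that the data came from the distribution $F$ can be rejected.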
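As a minimal sketch of the one-sample statistic, the function below evaluates $T = \frac{1}{12n} + \sum_i \left[\frac{2i-1}{2n} - F(x_i)\right]^2$ on sorted data. The names `cvm_one_sample` and `std_normal_cdf` are illustrative and not taken from any library, and the comparison against tabulated critical values is left out.

```python
import math

def cvm_one_sample(data, cdf):
    """Return T = 1/(12n) + sum_i ((2i-1)/(2n) - F(x_i))^2.

    `cdf` is the hypothesized CDF F; the data are sorted internally
    so that x_1 <= ... <= x_n as the formula requires.
    """
    xs = sorted(data)
    n = len(xs)
    total = 1.0 / (12 * n)
    for i, x in enumerate(xs, start=1):
        total += ((2 * i - 1) / (2 * n) - cdf(x)) ** 2
    return total

def std_normal_cdf(x):
    """Standard normal CDF, used here only as an example hypothesis."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Hypothetical sample, tested against a standard normal distribution.
sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
T = cvm_one_sample(sample, std_normal_cdf)
```

The statistic reaches its smallest possible value, $1/(12n)$, when every $F(x_i)$ equals $(2i-1)/(2n)$.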

Watson test

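A modified version of the Cramér–von Mises test is the Watson test,[6] which uses the statistic $U^2$, where[5]

$$U^2 = T - n\left(\bar{F} - \tfrac{1}{2}\right)^2,$$

where

$$\bar{F} = \frac{1}{n} \sum_{i=1}^{n} F(x_i).$$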
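The Watson statistic can be sketched by subtracting the correction term $n(\bar{F} - 1/2)^2$ from the one-sample $T$. The function name `watson_u2` is illustrative, and the example assumes the data have already been mapped to $[0, 1)$ so that the identity function serves as the CDF.

```python
def watson_u2(data, cdf):
    """Return U^2 = T - n * (F_bar - 1/2)^2 for hypothesized CDF `cdf`."""
    xs = sorted(data)
    n = len(xs)
    f = [cdf(x) for x in xs]
    # One-sample Cramér–von Mises statistic T.
    t = 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - fi) ** 2 for i, fi in enumerate(f, start=1)
    )
    f_bar = sum(f) / n
    return t - n * (f_bar - 0.5) ** 2

# Hypothetical values on the unit circle, and the same values rotated by 0.5.
angles = [0.1, 0.3, 0.4, 0.8]
rotated = [0.6, 0.8, 0.9, 0.3]
u2_original = watson_u2(angles, lambda u: u)
u2_rotated = watson_u2(rotated, lambda u: u)
```

In this example the two values agree, which illustrates why the statistic is used for circular data: it does not depend on where the origin of the circle is placed.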

Cramér–von Mises test (two samples)

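Let $x_1, x_2, \ldots, x_N$ and $y_1, y_2, \ldots, y_M$ be the observed values in the first and second sample respectively, in increasing order. Within the combined sample of size $N+M$, let $r_1, r_2, \ldots, r_N$ be the ranks of the $x$s, and let $s_1, s_2, \ldots, s_M$ be the ranks of the $y$s. Anderson[3]:1149 shows that

$$T = \frac{NM}{N+M}\,\omega^2 = \frac{U}{NM(N+M)} - \frac{4MN-1}{6(M+N)},$$

where $U$ is defined as

$$U = N \sum_{i=1}^{N} (r_i - i)^2 + M \sum_{j=1}^{M} (s_j - j)^2.$$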

If the value of $T$ is larger than the tabulated values,[3]:1154–1159 the hypothesis that the two samples come from the same distribution can be rejected. (Some books give critical values for $U$, which is more convenient, as it avoids the need to compute $T$ via the expression above. The conclusion will be the same.)
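The rank-based computation can be sketched as follows, assuming all values in the combined sample are distinct (no ties). The function names are illustrative. A second function evaluates $T$ directly from the two empirical distribution functions, summing the squared difference over the combined sample, which gives a way to cross-check the rank formula.

```python
def cvm_two_sample(xs, ys):
    """Return (T, U) from ranks in the combined sample. Assumes no ties."""
    n, m = len(xs), len(ys)
    # Label each value by its sample, then sort the combined sample.
    combined = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    r = [rank for rank, (_, label) in enumerate(combined, start=1) if label == 0]
    s = [rank for rank, (_, label) in enumerate(combined, start=1) if label == 1]
    u = n * sum((ri - i) ** 2 for i, ri in enumerate(r, start=1))
    u += m * sum((sj - j) ** 2 for j, sj in enumerate(s, start=1))
    t = u / (n * m * (n + m)) - (4 * m * n - 1) / (6 * (m + n))
    return t, u

def cvm_two_sample_direct(xs, ys):
    """Cross-check: T = N*M/(N+M)^2 * sum over combined sample of (F_N - G_M)^2."""
    n, m = len(xs), len(ys)
    total = 0.0
    for z in list(xs) + list(ys):
        f = sum(x <= z for x in xs) / n  # empirical CDF of the first sample
        g = sum(y <= z for y in ys) / m  # empirical CDF of the second sample
        total += (f - g) ** 2
    return n * m / (n + m) ** 2 * total

# Hypothetical samples with no repeated values.
t_rank, u_rank = cvm_two_sample([1.0, 2.0], [3.0])
t_direct = cvm_two_sample_direct([1.0, 2.0], [3.0])
```

For these samples $U = 4$ and both routes give $T = 5/18$.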

The above assumes there are no duplicates in the $x$, $y$, and $r$ sequences. So $x_i$ is unique, and its rank is $i$ in the sorted list $x_1, \ldots, x_N$. If there are duplicates, and $x_i$ through $x_j$ are a run of identical values in the sorted list, then one common approach is the midrank[7] method: assign each duplicate a "rank" of $(i+j)/2$. In the above equations, in the expressions $(r_i - i)^2$ and $(s_j - j)^2$, duplicates can modify all four variables $r_i$, $i$, $s_j$, and $j$.

Cramér distance

For two distributions on the real line with cumulative distribution functions $F$ and $G$ and finite first moment, the Cramér distance is

$$d_C(F, G) = \sqrt{\int_{-\infty}^{\infty} \left(F(x) - G(x)\right)^2 \, \mathrm{d}x},$$

a metric on the space of such distributions.[8] Note that some sources define the Cramér distance as $\int_{-\infty}^{\infty} (F(x) - G(x))^2 \, \mathrm{d}x$, but this fails the triangle inequality and so cannot be properly defined as a distance. The Cramér distance is the one-dimensional case of the energy distance $D$ via the relationship $D^2(F, G) = 2\, d_C^2(F, G)$,[9] and when $G$ represents a single observation $y$ with cumulative distribution $\mathbf{1}\{x \ge y\}$, $d_C^2(F, G)$ is equivalent to the continuous ranked probability score, a strictly proper scoring rule.[10]
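For two empirical distributions the integral is a finite sum, because both CDFs are step functions. The sketch below computes the Cramér distance between two samples and, as a check of the energy-distance relationship, compares twice its square with $2\,\mathbb{E}|X-Y| - \mathbb{E}|X-X'| - \mathbb{E}|Y-Y'|$ evaluated on the same samples. The function names are illustrative and not from any library.

```python
import math
from bisect import bisect_right

def cramer_distance(xs, ys):
    """sqrt of the integral of (F - G)^2 for the empirical CDFs of xs and ys."""
    xs, ys = sorted(xs), sorted(ys)
    grid = sorted(set(xs) | set(ys))
    total = 0.0
    # Both CDFs are constant on each interval [left, right) of the merged grid.
    for left, right in zip(grid, grid[1:]):
        f = bisect_right(xs, left) / len(xs)
        g = bisect_right(ys, left) / len(ys)
        total += (f - g) ** 2 * (right - left)
    return math.sqrt(total)

def energy_distance_squared(xs, ys):
    """2 E|X-Y| - E|X-X'| - E|Y-Y'| with expectations over the empirical samples."""
    def mean_abs(a, b):
        return sum(abs(u - v) for u in a for v in b) / (len(a) * len(b))
    return 2 * mean_abs(xs, ys) - mean_abs(xs, xs) - mean_abs(ys, ys)

# Hypothetical samples.
xs = [0.0, 1.0, 3.0]
ys = [0.5, 2.0]
d = cramer_distance(xs, ys)
d2_energy = energy_distance_squared(xs, ys)
```

Passing a single-element list as `ys` and squaring the result gives the continuous ranked probability score of the empirical forecast `xs` for that observation.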

The shaded area of this PIT reliability diagram indicates the values that are squared, integrated, and square rooted to form the Cramér distance

Under the probability integral transform (PIT), the plot of the empirical distribution of the transformed values and the uniform distribution on $[0, 1]$ creates a PIT reliability diagram. The Cramér distance between these two distributions equals $\omega$, the square root of the criterion $\omega^2$, and serves as a numerical score of the calibration error of $F$. This may also be referred to as the Root Mean Square Calibration Error (RMSCE).
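Because the reference distribution is uniform, this score can be sketched directly from the one-sample formula: dividing the statistic $T = n\omega^2$ by $n$ gives $\omega^2 = \frac{1}{12n^2} + \frac{1}{n}\sum_i \left[\frac{2i-1}{2n} - u_i\right]^2$ for sorted PIT values $u_i$. The function name `pit_calibration_error` and the symbol $u_i$ are illustrative, and the input is assumed to lie in $[0, 1]$.

```python
import math

def pit_calibration_error(pit_values):
    """Cramér distance between the empirical PIT distribution and Uniform[0, 1].

    Assumes every PIT value lies in [0, 1].
    """
    u = sorted(pit_values)
    n = len(u)
    omega_sq = 1.0 / (12 * n * n) + sum(
        ((2 * i - 1) / (2 * n) - ui) ** 2 for i, ui in enumerate(u, start=1)
    ) / n
    return math.sqrt(omega_sq)

# Hypothetical PIT values: evenly spread (well calibrated) vs. piled up near 1.
even = [(2 * i - 1) / 200 for i in range(1, 101)]
skewed = [0.90, 0.92, 0.95, 0.97, 0.99]
score_even = pit_calibration_error(even)
score_skewed = pit_calibration_error(skewed)
```

Evenly spread values score close to zero, while values concentrated at one end of the interval score much higher.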

For a deterministic (point) forecast at $\hat{y}$, the PIT degenerates to a Bernoulli random variable on $\{0, 1\}$ with success probability $p$, so in the population limit the Cramér distance between the PIT CDF and the uniform distribution evaluates in closed form to

$$\sqrt{\tfrac{1}{3} - p(1 - p)}.$$

This quantity is minimized at $p = 1/2$ (the unbiased case) with value $1/(2\sqrt{3}) \approx 0.289$, establishing a calibration-error floor that no point forecast can fall below regardless of how accurate its central value is. In contrast, a well-calibrated probabilistic forecast can approach 0. Similarly, this quantity is maximized at the bias extremes $p \in \{0, 1\}$ with value $1/\sqrt{3} \approx 0.577$.
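The closed form can be checked numerically. Writing $p$ for the Bernoulli success probability (notation assumed here), the PIT CDF equals $1 - p$ on $[0, 1)$, so the squared distance is $\int_0^1 (1 - p - u)^2 \, \mathrm{d}u = \tfrac{1}{3} - p(1 - p)$. The sketch below compares that expression with a midpoint-rule approximation of the integral; both function names are illustrative.

```python
import math

def point_forecast_cramer_distance(p):
    """Closed form sqrt(1/3 - p(1 - p)) for Bernoulli success probability p."""
    return math.sqrt(1.0 / 3 - p * (1 - p))

def point_forecast_cramer_distance_numeric(p, steps=100_000):
    """Midpoint-rule approximation of sqrt(integral_0^1 ((1 - p) - u)^2 du)."""
    h = 1.0 / steps
    total = sum(((1 - p) - (k + 0.5) * h) ** 2 for k in range(steps)) * h
    return math.sqrt(total)

floor_value = point_forecast_cramer_distance(0.5)    # 1 / (2 * sqrt(3))
ceiling_value = point_forecast_cramer_distance(0.0)  # 1 / sqrt(3)
```

Evaluating at $p = 1/2$ and at $p = 0$ gives the minimum and maximum of the distance for a point forecast.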

References
