Grouped Dirichlet distribution

From Wikipedia, the free encyclopedia

In statistics, the grouped Dirichlet distribution (GDD) is a multivariate generalization of the Dirichlet distribution. It was first described by Ng et al. in 2008.[1] The grouped Dirichlet distribution arises in the analysis of categorical data where some observations are only known to fall into one of a set of 'crisp' categories, without the exact category being recorded. For example, one may have a data set consisting of cases and controls under two different conditions. With complete data, the cross-classification of disease status by condition forms a 2 × 2 (case/control by treatment/no treatment) table with cell probabilities

            Treatment    No Treatment
Controls    θ1           θ2
Cases       θ3           θ4

If, however, the data include, say, non-respondents who are known to be controls or cases but whose treatment status is missing, then the cross-classification forms a 2 × 3 table. In each row, the probability in the last column is the sum of the probabilities in the first two columns (θ12 = θ1 + θ2 and θ34 = θ3 + θ4), e.g.

            Treatment    No Treatment    Missing
Controls    θ1           θ2              θ12
Cases       θ3           θ4              θ34

The GDD allows the full estimation of the cell probabilities under such aggregation conditions.[1]
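
As a rough illustration of estimation under this kind of aggregation, the sketch below computes maximum likelihood estimates of θ1, ..., θ4 from partially classified counts. It assumes that the missing column carries no information about treatment status, so the likelihood is proportional to θ1^n1 θ2^n2 θ3^n3 θ4^n4 (θ1 + θ2)^n12 (θ3 + θ4)^n34, where n12 and n34 are the counts in the Missing column. The function name grouped_mle and the counts are hypothetical, and this is a plain maximum likelihood calculation, not the GDD-based procedure of the cited paper.

```python
from fractions import Fraction


def grouped_mle(complete, missing):
    """Maximum likelihood cell probabilities for a partially classified table.

    complete: dict mapping row name -> (count under treatment, count under no treatment)
    missing:  dict mapping row name -> count with unknown treatment status
    Assumes missingness is uninformative and every row has at least one
    fully classified observation (hypothetical helper, for illustration only).
    """
    total = sum(sum(counts) for counts in complete.values()) + sum(missing.values())
    estimates = {}
    for row, (n_treat, n_no_treat) in complete.items():
        classified = n_treat + n_no_treat
        # Row probability (e.g. θ1 + θ2) uses every observation in the row,
        # including those whose treatment status is missing.
        row_prob = Fraction(classified + missing[row], total)
        # The split within the row uses only the fully classified observations.
        estimates[row] = (
            row_prob * Fraction(n_treat, classified),
            row_prob * Fraction(n_no_treat, classified),
        )
    return estimates


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = grouped_mle(
    complete={"controls": (30, 10), "cases": (20, 20)},
    missing={"controls": 10, "cases": 10},
)
for row, (p_treat, p_no_treat) in example.items():
    print(row, p_treat, p_no_treat)
```

With these counts each row receives probability 1/2, which is then split in the proportions seen among the fully classified observations, giving θ1 = 3/8, θ2 = 1/8, θ3 = 1/4 and θ4 = 1/4.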

References
