Inside–outside algorithm
Parameter estimation method for probabilistic context-free grammars
From Wikipedia, the free encyclopedia
In computer science, the inside–outside algorithm is a way of re-estimating production probabilities in a probabilistic context-free grammar. It was introduced by James K. Baker in 1979 as a generalization of the forward–backward algorithm for parameter estimation on hidden Markov models to stochastic context-free grammars. It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).
Inside and outside probabilities

The inside probability $\beta_j(p,q)$ is the total probability of generating the words $w_p \cdots w_q$, given that the root nonterminal is $N_j$ and the grammar is $G$:

$$\beta_j(p,q) = P(N_j \overset{*}{\Rightarrow} w_p \cdots w_q \mid G)$$

The outside probability $\alpha_j(p,q)$ is the total probability of beginning with the start symbol $N_1$ and generating the nonterminal $N_j$ together with all the words outside $w_p \cdots w_q$, given the grammar $G$:

$$\alpha_j(p,q) = P(N_1 \overset{*}{\Rightarrow} w_1 \cdots w_{p-1} \, N_j \, w_{q+1} \cdots w_n \mid G)$$
Computing inside probabilities
Base case:

$$\beta_j(p,p) = P(N_j \to w_p)$$
General case:

Suppose there is a rule $N_j \to N_r N_s$ in the grammar. Then the probability of generating $w_p \cdots w_q$ starting with a subtree rooted at $N_j$ and using that rule is:

$$\sum_{k=p}^{q-1} P(N_j \to N_r N_s) \, \beta_r(p,k) \, \beta_s(k+1,q)$$

The inside probability $\beta_j(p,q)$ is just the sum over all such possible rules:

$$\beta_j(p,q) = \sum_{N_r,\,N_s} \sum_{k=p}^{q-1} P(N_j \to N_r N_s) \, \beta_r(p,k) \, \beta_s(k+1,q)$$
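The inside recursion fills a chart bottom-up over increasingly wide spans, much like the CYK algorithm. The following is a minimal Python sketch; the toy grammar in Chomsky normal form, its symbol names, and its probabilities are illustrative assumptions, not taken from the article:

```python
from collections import defaultdict

# Illustrative toy PCFG in Chomsky normal form (an assumption for this sketch).
# Binary rules: (parent, left child, right child) -> probability
binary_rules = {("S", "A", "B"): 1.0, ("A", "A", "A"): 0.5}
# Lexical rules: (parent, terminal) -> probability
lexical_rules = {("A", "a"): 0.5, ("B", "b"): 1.0}

def inside_probabilities(words):
    """beta[(j, p, q)] = probability that nonterminal j derives words[p..q]."""
    n = len(words)
    beta = defaultdict(float)
    # Base case: beta_j(p, p) = P(N_j -> w_p)
    for p, w in enumerate(words):
        for (j, terminal), prob in lexical_rules.items():
            if terminal == w:
                beta[(j, p, p)] = prob
    # General case: sum over rules N_j -> N_r N_s and split points k
    for span in range(2, n + 1):
        for p in range(n - span + 1):
            q = p + span - 1
            for (j, r, s), prob in binary_rules.items():
                for k in range(p, q):
                    beta[(j, p, q)] += prob * beta[(r, p, k)] * beta[(s, k + 1, q)]
    return beta
```

For the string "a a b", the single parse S → (A → A A) B has probability 1.0 · 0.5 · 0.5 · 0.5 · 1.0 = 0.125, which the chart entry for S over the whole span reproduces.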
Computing outside probabilities
Base case:

$$\alpha_j(1,n) = \begin{cases} 1 & \text{if } j = 1 \\ 0 & \text{otherwise} \end{cases}$$

Here the start symbol is $N_1$.
General case:
Suppose there is a rule $N_k \to N_j N_l$ in the grammar that generates $N_j$. Then the left contribution of that rule to the outside probability $\alpha_j(p,q)$ is:

$$\sum_{q'=q+1}^{n} \alpha_k(p,q') \, P(N_k \to N_j N_l) \, \beta_l(q+1,q')$$
Now suppose there is a rule $N_k \to N_l N_j$ in the grammar. Then the right contribution of that rule to the outside probability $\alpha_j(p,q)$ is:

$$\sum_{p'=1}^{p-1} \alpha_k(p',q) \, P(N_k \to N_l N_j) \, \beta_l(p',p-1)$$
The outside probability $\alpha_j(p,q)$ is the sum of the left and right contributions over all such rules:

$$\alpha_j(p,q) = \sum_{N_k,\,N_l} \left[ \sum_{q'=q+1}^{n} \alpha_k(p,q') \, P(N_k \to N_j N_l) \, \beta_l(q+1,q') + \sum_{p'=1}^{p-1} \alpha_k(p',q) \, P(N_k \to N_l N_j) \, \beta_l(p',p-1) \right]$$
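The outside recursion can be sketched the same way, filling the chart top-down from the full-sentence span using the already-computed inside table. The toy grammar and the inside routine are repeated here so the sketch is self-contained; all names and probabilities remain illustrative assumptions:

```python
from collections import defaultdict

# Same illustrative toy CNF grammar as in the inside sketch (an assumption).
binary_rules = {("S", "A", "B"): 1.0, ("A", "A", "A"): 0.5}
lexical_rules = {("A", "a"): 0.5, ("B", "b"): 1.0}

def inside_probabilities(words):
    """beta[(j, p, q)] = probability that nonterminal j derives words[p..q]."""
    n = len(words)
    beta = defaultdict(float)
    for p, w in enumerate(words):
        for (j, terminal), prob in lexical_rules.items():
            if terminal == w:
                beta[(j, p, p)] = prob
    for span in range(2, n + 1):
        for p in range(n - span + 1):
            q = p + span - 1
            for (j, r, s), prob in binary_rules.items():
                for k in range(p, q):
                    beta[(j, p, q)] += prob * beta[(r, p, k)] * beta[(s, k + 1, q)]
    return beta

def outside_probabilities(words, beta, start="S"):
    """alpha[(j, p, q)] = probability of generating everything outside
    words[p..q] with nonterminal j covering that span."""
    n = len(words)
    alpha = defaultdict(float)
    # Base case: only the start symbol can span the whole sentence.
    alpha[(start, 0, n - 1)] = 1.0
    # Fill narrower spans from wider (already computed) ones.
    for span in range(n - 1, 0, -1):
        for p in range(n - span + 1):
            q = p + span - 1
            for (k, left, right), prob in binary_rules.items():
                # Left contribution: j is the left child, sibling to the right.
                for q2 in range(q + 1, n):
                    alpha[(left, p, q)] += alpha[(k, p, q2)] * prob * beta[(right, q + 1, q2)]
                # Right contribution: j is the right child, sibling to the left.
                for p2 in range(p):
                    alpha[(right, p, q)] += alpha[(k, p2, q)] * prob * beta[(left, p2, p - 1)]
    return alpha
```

A standard sanity check on the two tables: for every word position p, the sum of α_j(p,p) · β_j(p,p) over all nonterminals j equals the total probability of the sentence.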