Inside–outside algorithm
Parameter estimation method for probabilistic context-free grammars
From Wikipedia, the free encyclopedia
In computer science, the inside–outside algorithm is a way of re-estimating production probabilities in a probabilistic context-free grammar. It was introduced by James K. Baker in 1979 as a generalization of the forward–backward algorithm for parameter estimation on hidden Markov models to stochastic context-free grammars. It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).
Inside and outside probabilities

The inside probability $\beta_j(p,q)$ is the total probability of generating the words $w_p \cdots w_q$, given that the root nonterminal is $N_j$ and the grammar is $G$:

$$\beta_j(p,q) = P(N_j \overset{*}{\Rightarrow} w_p \cdots w_q \mid G)$$

The outside probability $\alpha_j(p,q)$ is the total probability of beginning with the start symbol $N_1$ and generating the nonterminal $N_j$ together with all the words outside $w_p \cdots w_q$, given the grammar $G$:

$$\alpha_j(p,q) = P(N_1 \overset{*}{\Rightarrow} w_1 \cdots w_{p-1} \, N_j \, w_{q+1} \cdots w_n \mid G)$$
Computing inside probabilities
Base case:

$$\beta_j(p,p) = P(N_j \to w_p)$$
General case:

Suppose there is a rule $N_j \to N_r N_s$ in the grammar. Then the probability of generating $w_p \cdots w_q$ starting with a subtree rooted at $N_j$ and using that rule is:

$$\sum_{k=p}^{q-1} P(N_j \to N_r N_s) \, \beta_r(p,k) \, \beta_s(k+1,q)$$

The inside probability $\beta_j(p,q)$ is just the sum over all such possible rules:

$$\beta_j(p,q) = \sum_{N_r,\,N_s} \sum_{k=p}^{q-1} P(N_j \to N_r N_s) \, \beta_r(p,k) \, \beta_s(k+1,q)$$
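The inside recursion fills a chart bottom-up over increasingly wide spans, much like the CYK algorithm. The following is a minimal Python sketch; the toy grammar in Chomsky normal form, its symbol names, and its probabilities are illustrative assumptions, not taken from the article:

```python
from collections import defaultdict

# Illustrative toy PCFG in Chomsky normal form (an assumption for this sketch).
# Binary rules: (parent, left child, right child) -> probability
binary_rules = {("S", "A", "B"): 1.0, ("A", "A", "A"): 0.5}
# Lexical rules: (parent, terminal) -> probability
lexical_rules = {("A", "a"): 0.5, ("B", "b"): 1.0}

def inside_probabilities(words):
    """beta[(j, p, q)] = probability that nonterminal j derives words[p..q]."""
    n = len(words)
    beta = defaultdict(float)
    # Base case: beta_j(p, p) = P(N_j -> w_p)
    for p, w in enumerate(words):
        for (j, terminal), prob in lexical_rules.items():
            if terminal == w:
                beta[(j, p, p)] = prob
    # General case: sum over rules N_j -> N_r N_s and split points k
    for span in range(2, n + 1):
        for p in range(n - span + 1):
            q = p + span - 1
            for (j, r, s), prob in binary_rules.items():
                for k in range(p, q):
                    beta[(j, p, q)] += prob * beta[(r, p, k)] * beta[(s, k + 1, q)]
    return beta
```

For the string "a a b", the single parse S → (A → A A) B has probability 1.0 · 0.5 · 0.5 · 0.5 · 1.0 = 0.125, which the chart entry for S over the whole span reproduces.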
Computing outside probabilities
Base case:

$$\alpha_j(1,n) = \begin{cases} 1 & \text{if } j = 1 \\ 0 & \text{otherwise} \end{cases}$$

Here the start symbol is $N_1$.
General case:
Suppose there is a rule $N_k \to N_j N_l$ in the grammar that generates $N_j$. Then the left contribution of that rule to the outside probability $\alpha_j(p,q)$ is:

$$\sum_{q'=q+1}^{n} \alpha_k(p,q') \, P(N_k \to N_j N_l) \, \beta_l(q+1,q')$$
Now suppose there is a rule $N_k \to N_l N_j$ in the grammar. Then the right contribution of that rule to the outside probability $\alpha_j(p,q)$ is:

$$\sum_{p'=1}^{p-1} \alpha_k(p',q) \, P(N_k \to N_l N_j) \, \beta_l(p',p-1)$$
The outside probability $\alpha_j(p,q)$ is the sum of the left and right contributions over all such rules:

$$\alpha_j(p,q) = \sum_{N_k,\,N_l} \left[ \sum_{q'=q+1}^{n} \alpha_k(p,q') \, P(N_k \to N_j N_l) \, \beta_l(q+1,q') + \sum_{p'=1}^{p-1} \alpha_k(p',q) \, P(N_k \to N_l N_j) \, \beta_l(p',p-1) \right]$$
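The outside recursion can be sketched the same way, filling the chart top-down from the full-sentence span using the already-computed inside table. The toy grammar and the inside routine are repeated here so the sketch is self-contained; all names and probabilities remain illustrative assumptions:

```python
from collections import defaultdict

# Same illustrative toy CNF grammar as in the inside sketch (an assumption).
binary_rules = {("S", "A", "B"): 1.0, ("A", "A", "A"): 0.5}
lexical_rules = {("A", "a"): 0.5, ("B", "b"): 1.0}

def inside_probabilities(words):
    """beta[(j, p, q)] = probability that nonterminal j derives words[p..q]."""
    n = len(words)
    beta = defaultdict(float)
    for p, w in enumerate(words):
        for (j, terminal), prob in lexical_rules.items():
            if terminal == w:
                beta[(j, p, p)] = prob
    for span in range(2, n + 1):
        for p in range(n - span + 1):
            q = p + span - 1
            for (j, r, s), prob in binary_rules.items():
                for k in range(p, q):
                    beta[(j, p, q)] += prob * beta[(r, p, k)] * beta[(s, k + 1, q)]
    return beta

def outside_probabilities(words, beta, start="S"):
    """alpha[(j, p, q)] = probability of generating everything outside
    words[p..q] with nonterminal j covering that span."""
    n = len(words)
    alpha = defaultdict(float)
    # Base case: only the start symbol can span the whole sentence.
    alpha[(start, 0, n - 1)] = 1.0
    # Fill narrower spans from wider (already computed) ones.
    for span in range(n - 1, 0, -1):
        for p in range(n - span + 1):
            q = p + span - 1
            for (k, left, right), prob in binary_rules.items():
                # Left contribution: j is the left child, sibling to the right.
                for q2 in range(q + 1, n):
                    alpha[(left, p, q)] += alpha[(k, p, q2)] * prob * beta[(right, q + 1, q2)]
                # Right contribution: j is the right child, sibling to the left.
                for p2 in range(p):
                    alpha[(right, p, q)] += alpha[(k, p2, q)] * prob * beta[(left, p2, p - 1)]
    return alpha
```

A standard sanity check on the two tables: for every word position p, the sum of α_j(p,p) · β_j(p,p) over all nonterminals j equals the total probability of the sentence.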