An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities

Citation

Stolcke, A. (1994). An efficient probabilistic context-free parsing algorithm that computes prefix probabilities. arXiv preprint cmp-lg/9411029.

Abstract

We describe an extension of Earley’s parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive prefixes being generated by the grammar; b) probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar; c) most likely (Viterbi) parse of the string; d) posterior expected number of applications of each grammar production, as required for reestimating rule probabilities. Probabilities (a) and (b) are computed incrementally in a single left-to-right pass over the input. Our algorithm compares favorably to standard bottom-up parsing methods for SCFGs in that it works efficiently on sparse grammars by making use of Earley’s top-down control structure. It can process any context-free rule format without conversion to some normal form, and combine computations for (a) through(d) in a single algorithm. Finally, the algorithm has simple extensions for processing partially bracked inputs, and for finding partial parses and their likelihoods on ungrammatical inputs. 


Read more from SRI

This site is registered on wpml.org as a development site. Switch to a production site key to remove this banner.