CIS Colloquium, Nov 04, 2008, 03:00PM - 04:00PM, Tech Center 111
Dependency Parsing by Belief Propagation
Jason Eisner, Johns Hopkins University
A dependency parse uses directed edges to connect syntactically related words within a sentence. These simple sentence diagrams have proved useful for information extraction, machine translation, semantic clustering, and other NLP tasks.
Finding the most probable dependency parse is a combinatorial optimization problem. Usually it is handled by dynamic programming or graph algorithms. Unfortunately, exact approaches become slow or NP-hard if the probability distribution over dependency parses is defined to consider interactions among the edges, or interactions between edges and other graph annotations.
We attack this problem by formulating dependency parsing as a graphical model with the novel ingredient of global constraints. We show how to apply loopy belief propagation (BP), a simple and effective tool for approximate learning and inference. As a parsing algorithm, BP is both asymptotically and empirically efficient. Even with model features that would make exact parsing slow or NP-hard, BP needs only O(n^3) time with a small constant factor. Furthermore, such features significantly improve parse accuracy over exact first-order methods. Incorporating additional features would increase the runtime additively rather than multiplicatively.
This is joint work with David A. Smith.
Jason Eisner earned his Ph.D. in Computer Science at the University of Pennsylvania, under Mitch Marcus. He joined the University of Rochester as an assistant professor, then moved to Johns Hopkins University (JHU) soon afterward. At JHU, he is a Associate Professor of Computer Science, with a joint appointment in Cognitive Science. He is also a core member of the Center for Language and Speech Processing, and an affiliate of the national Center of Excellence in Human Language Technology. In 2005, he received JHU's Robert B. Pond, Sr. Excellence in Teaching Award. Dr. Eisner has authored papers and software tools in several areas of computational linguistics, including parsing, statistical learning including grammar induction, machine translation, weighted finite-state methods, and computational phonology. He serves as president of the ACL's special interest group on computational morphology and phonology (SIGMORPHON).