Parsing sentences with the OTHER natural language tool: LinkGrammar

Type: Talk
Audience level: Novice
Category: Science
March 11th 2:10 p.m. – 2:55 p.m.

Description

Many of you are probably familiar with NLTK, the wonderful Natural Language Toolkit for Python. You may not be familiar with Linkgrammar, a sentence parsing system created at Carnegie Mellon University. Linkgrammar is quite robust and, for sentence parsing, works "out of the box" in a way that NLTK does not.

Abstract

NLTK is a fantastic library with broad capabilities. But often I find that I want something that will just do what I want without my having to figure out all of the details. Sentence parsing is one example. A quick Google search for parsing sentences with NLTK returns a number of articles describing how to write your own grammar, define a parser based on that grammar, and then parse sentences with it. This is great for toy problems and education, but if you actually need to parse sentences "from the wild," writing your own grammar is a huge undertaking.

Enter Linkgrammar. Linkgrammar was developed at Carnegie Mellon University and is now maintained by the developers of AbiWord as the basis for their grammar checking capabilities. It works nicely out of the box and is tolerant of the irregularities found in authentic text.

Here's a rough draft of the structure of the talk:

Intro

  • Introduction to the talk, with overview of structure
  • Discuss NLTK and acknowledge all that it offers
  • Show examples of code found on the Internet to create a simple parser with NLTK
  • Show a code snippet that parses a sentence with pylinkgrammar

A quick demonstration of the Linkgrammar Python bindings

  • The existing bindings were a simple SWIG wrapper around the C library. I have expanded these bindings to offer a more Pythonic interface.
  • Show examples: Simple sentences first
  • Links between words have different types, and a set of rules defines which words, and which types of words, can be linked to each other
  • Show example: slightly more complex versions of initial sentence to indicate differences in link types
  • Show examples of using different parsing options
  • Show examples of generating constituent trees
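
To make the idea of typed links concrete before the demonstration, here is a minimal sketch in plain Python. It does not use pylinkgrammar or the Linkgrammar C library; the `Link` tuple and the helper functions are hypothetical names invented for this illustration. The labels D (determiner to noun), S (subject to verb) and O (verb to object) are real Linkgrammar link types, but the linkage is written by hand rather than produced by the parser. The two checks reflect properties a valid linkage is expected to have: links do not cross, and every word is connected.

```python
from collections import namedtuple

# Hypothetical helper type for this sketch: a link joins two word positions
# (left < right) and carries a type label such as "D", "S" or "O".
Link = namedtuple("Link", ["left", "right", "label"])

WORDS = ["the", "cat", "chased", "a", "mouse"]

# Hand-written linkage for illustration; a real parser would compute this.
LINKS = [
    Link(0, 1, "D"),  # the -> cat      (determiner to noun)
    Link(1, 2, "S"),  # cat -> chased   (subject to verb)
    Link(2, 4, "O"),  # chased -> mouse (verb to object)
    Link(3, 4, "D"),  # a -> mouse      (determiner to noun)
]


def links_cross(a, b):
    """Return True if two links would cross when drawn as arcs over the sentence."""
    return (a.left < b.left < a.right < b.right
            or b.left < a.left < b.right < a.right)


def is_planar(links):
    """Return True if no pair of links crosses."""
    return not any(
        links_cross(a, b)
        for i, a in enumerate(links)
        for b in links[i + 1:]
    )


def is_connected(n_words, links):
    """Return True if every word can be reached from word 0 by following links."""
    seen = {0}
    frontier = [0]
    while frontier:
        word = frontier.pop()
        for link in links:
            for src, dst in ((link.left, link.right), (link.right, link.left)):
                if src == word and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
    return len(seen) == n_words


def describe(words, links):
    """Render each link as a short 'word --TYPE-- word' string."""
    return ["{} --{}-- {}".format(words[l.left], l.label, words[l.right])
            for l in links]


if __name__ == "__main__":
    for line in describe(WORDS, LINKS):
        print(line)
    print("planar:", is_planar(LINKS))
    print("connected:", is_connected(len(WORDS), LINKS))
```

A linkage is deliberately a flat list of labelled word pairs rather than a tree, which is the main structural difference from the constituent trees discussed later in the talk.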

Linkgrammar in greater depth (supported by Python code examples as appropriate)

  • Discuss history/background of the Linkgrammar system: developed at CMU, now maintained by the AbiWord developers and used in the RelEx project and others
  • What is a link grammar, and how does it differ from other approaches?
  • Most parsing systems create constituent phrase trees; show examples
  • Describe the rules Linkgrammar uses to make links between words
  • Show examples of defining a grammar
  • Post-processing these linkages allows a standard constituent phrase tree to be created.
  • Show more examples of constituent phrase trees
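
The linking rules mentioned above can be sketched as follows. This is a heavily simplified toy in plain Python, not the Linkgrammar dictionary format or API: `TOY_DICT` and `connectors_satisfied` are hypothetical names, and the real system also supports alternatives, optional connectors and ordering constraints that are ignored here. The one idea carried over is that each word lists connectors it needs, where a connector ending in `+` looks for a partner to the right and one ending in `-` looks for a partner of the same type to the left.

```python
from collections import Counter

# Toy dictionary (an assumption for this sketch): every connector listed
# for a word must be used exactly once in a valid linkage.
TOY_DICT = {
    "the": ["D+"],
    "a": ["D+"],
    "cat": ["D-", "S+"],
    "mouse": ["D-", "O-"],
    "chased": ["S-", "O+"],
}

SENTENCE = ["the", "cat", "chased", "a", "mouse"]

# Links are (left_index, right_index, type) triples, written by hand.
GOOD_LINKS = [(0, 1, "D"), (1, 2, "S"), (2, 4, "O"), (3, 4, "D")]


def connectors_satisfied(words, links):
    """Return True if the links use exactly the connectors each word requires."""
    needed = [Counter(TOY_DICT[word]) for word in words]
    used = [Counter() for _ in words]
    for left, right, label in links:
        # The left word contributes a right-pointing connector,
        # the right word a left-pointing connector of the same type.
        used[left][label + "+"] += 1
        used[right][label + "-"] += 1
    return all(n == u for n, u in zip(needed, used))


if __name__ == "__main__":
    print(connectors_satisfied(SENTENCE, GOOD_LINKS))
    # Dropping the last link leaves "a" and "mouse" with unmet connectors.
    print(connectors_satisfied(SENTENCE, GOOD_LINKS[:-1]))
```

Because the rules live with individual words rather than in phrase-level productions, extending coverage in this style means adding or refining dictionary entries.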

The Future

  • What needs to be done?
  • How can you help?