
Linguistics 101 for Pythonistas: Why noun phrase chunking with the NLTK is awesome & useful.

Luke Petschauer

Audience level:
Novice
Category:
Python Libraries

Description

A look at how the Natural Language Toolkit (NLTK) can be used to identify meaningful information in a text. Learn what noun phrases are, why phrase chunking is useful for text analysis, and why grammar is more fun than you may think.

Abstract

This poster draws on my graduate studies in applied linguistics and my experience working with Python and the [Natural Language Toolkit (NLTK)](http://nltk.org), a Python library that provides a number of tools for working with language. The poster will demonstrate how the NLTK can be used to extract meaningful chunks of language from a text, specifically _noun phrases_, using part-of-speech (POS) tagging and regular expressions. Relevant concepts in linguistics are introduced and illustrated using the NLTK. A fundamental idea for the presentation is _grammar as science_: audience members are inductively introduced to descriptive grammar and shown how to use language samples to extrapolate grammar rules and patterns. These skills are useful for anyone who works with text, whether the task is search, summarization, or information extraction. The poster contents will be made available online as an IPython notebook. A draft version of [the notebook](http://nbviewer.ipython.org/github/lukewrites/NP_chunking_with_nltk/blob/master/NP_chunking_with_the_NLTK.ipynb) and [its source](https://github.com/lukewrites/NP_chunking_with_nltk) are online now.
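To give a flavor of the idea, here is a minimal sketch of noun phrase chunking as pattern matching over POS tags. It is not taken from the poster or the notebook: it uses only Python's standard `re` module as a stand-in, the sentence is tagged by hand with Penn Treebank tags, and the names `NP_PATTERN` and `chunk_noun_phrases` are hypothetical. With the NLTK, a comparable rule would typically be given to `nltk.RegexpParser` as a grammar string such as `"NP: {<DT>?<JJ>*<NN.*>+}"`.

```python
import re

# Hand-tagged sentence (Penn Treebank tags), an assumption for illustration;
# in practice a POS tagger would produce these (word, tag) pairs.
tagged = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"),
          ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]

# Descriptive rule for a simple noun phrase:
# an optional determiner, any number of adjectives, then one or more nouns.
NP_PATTERN = re.compile(r"(?:<DT>)?(?:<JJ[RS]?>)*(?:<NN[A-Z]*>)+")


def chunk_noun_phrases(tagged_tokens):
    """Return the noun phrases whose tag sequence matches NP_PATTERN."""
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN>".
    tag_string = "".join(f"<{tag}>" for _, tag in tagged_tokens)

    # Record where each tag starts in tag_string so that a regex match
    # can be mapped back to token positions.
    starts = []
    offset = 0
    for _, tag in tagged_tokens:
        starts.append(offset)
        offset += len(tag) + 2  # +2 for the angle brackets

    phrases = []
    for match in NP_PATTERN.finditer(tag_string):
        first = starts.index(match.start())
        last = first + match.group().count("<")  # one "<" per matched tag
        phrases.append(" ".join(word for word, _ in tagged_tokens[first:last]))
    return phrases


noun_phrases = chunk_noun_phrases(tagged)
print(noun_phrases)  # ['the little yellow dog', 'the cat']
```

The rule is written the way a descriptive grammarian would state it: observe which tag sequences form noun phrases in real sentences, then generalize them into a pattern and test it against more data.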