This talk describes a project that uses the Natural Language Toolkit (NLTK) to build a language model from a gossip blog. The tone is light-hearted, but the talk introduces some core concepts in Python's most popular NLP library, as well as some basics of computational linguistics and programming in Python.
This talk describes how to use Python to programmatically access content from a site and generate a language model from that content. We apply the Natural Language Toolkit library to content from the gossip blog MediaTakeOut. Along the way, we introduce data-driven approaches to natural language processing in Python. This talk is intended for beginners with an interest in learning about natural language processing. It is based on a post from my personal blog. For more details, please see http://robertelwell.info/blog/mto-on-blast-a-language-model-for-a-gossip-blog/
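
To give a flavor of what "generating a language model" from text can mean, here is a minimal sketch of a bigram model in plain Python. It is not the code from the talk or the blog post: the sample sentences, the function names, and the `<s>`/`</s>` boundary markers are all assumptions made for illustration, and the standard library stands in for NLTK here (NLTK's `nltk.bigrams` and `nltk.ConditionalFreqDist` cover the same counting step). Fetching the posts from the site is left out.

```python
import random
import re
from collections import Counter, defaultdict

# Hypothetical stand-in text; a real run would use posts fetched from the blog.
SAMPLE_POSTS = [
    "the star was seen at the club",
    "the star was not happy at the party",
]


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def build_bigram_model(posts):
    """Count, for each word, how often each other word follows it."""
    model = defaultdict(Counter)
    for post in posts:
        # Boundary markers let the model learn how posts start and end.
        tokens = ["<s>"] + tokenize(post) + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model


def probability(model, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev is unseen."""
    followers = model.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0


def generate(model, rng, max_words=20):
    """Sample words one at a time, weighted by the bigram counts."""
    word, output = "<s>", []
    for _ in range(max_words):
        followers = model.get(word)
        if not followers:
            break
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)


model = build_bigram_model(SAMPLE_POSTS)
print(probability(model, "the", "star"))
print(generate(model, random.Random(0)))
```

With only two sample sentences the generated text mostly recombines the input; a larger collection of posts gives the model more ways to continue each word.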