PyCon 2011 Atlanta

March 9th–17th

TRAPPIST : A toolkit for comparative analysis and visualization of genomic regions

Experienced / Poster
TRAPPIST is a BioPython-based package that performs detailed comparisons of related genomic regions and produces publication-quality vector PDF figures for clear visualization of analysis results.

Abstract

Recent developments in genome sequencing technology have led to an outpouring of raw sequence data. This is both a great opportunity and a daunting challenge, as biologists struggle to make sense of this enormous amount of information. Standard tools are available to process genomic data (for tools in Python, see the BioPython Project at http://biopython.org/), but there is still an unmet need for applications and pipelines that address specific biological questions.

My work focuses on the genomic evolution and functional modularity of large bacterial plasmids. Plasmids are extrachromosomal DNA molecules that usually confer properties, such as antibiotic resistance and virulence, that are non-essential but enhance survival of the host cell. This work has led me to develop a set of tools (TRAPPIST) that are tailored for the study of plasmids but can be used for any type of genomic region.

TRAPPIST is a Python package that performs detailed comparisons of related genomic regions and produces publication-quality vector PDF figures for clear visualization of analysis results. It is built as a series of functional modules that can be run separately or in pipelines. Several pre-set pipeline scripts are provided for automated start-to-finish processing of tasks that combine analysis and visualization, such as producing figures of multiple alignments with detailed similarity heat maps, identifying conserved genetic backbones or reconstructing the differential phylogeny of mosaic genome regions. Sequence handling, comparisons and alignments are done using BioPython and external calls to standard tools such as ClustalW, MUSCLE and/or BLAST using their respective BioPython wrappers. Figure generation is done using the ReportLab Toolkit.
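To make the similarity heat map and conserved backbone ideas more concrete, here is a minimal standard-library sketch. It is not TRAPPIST's code: the function names, the window size, the identity threshold and the toy sequences are all assumptions made for illustration. In a real pipeline the aligned sequences would come from BioPython and the resulting values would be drawn with the ReportLab Toolkit.

```python
from itertools import combinations


def window_identity(seq_a, seq_b, start, size):
    """Fraction of identical, non-gap columns in one alignment window."""
    pairs = list(zip(seq_a[start:start + size], seq_b[start:start + size]))
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return matches / len(pairs)


def similarity_rows(alignment, size):
    """Per-window identity for every pair of aligned sequences.

    `alignment` maps a sequence name to its gapped sequence; all
    sequences must have the same length (hypothetical input format).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must have equal length")
    starts = range(0, lengths.pop(), size)
    return {
        (a, b): [window_identity(alignment[a], alignment[b], s, size) for s in starts]
        for a, b in combinations(sorted(alignment), 2)
    }


def backbone_windows(rows, threshold):
    """Indices of windows whose identity meets the threshold in every pair."""
    if not rows:
        return []
    n_windows = len(next(iter(rows.values())))
    return [
        i for i in range(n_windows)
        if all(row[i] >= threshold for row in rows.values())
    ]


# Toy alignment, invented for illustration only.
toy_alignment = {
    "plasmid_A": "ATGCATGCAT--GGCC",
    "plasmid_B": "ATGCATGCTTAAGGCC",
    "plasmid_C": "ATGCATGGATAAGGTC",
}
rows = similarity_rows(toy_alignment, size=4)
backbone = backbone_windows(rows, threshold=1.0)
```

Each list in `rows` corresponds to one row of a heat map, with one value per alignment window, and `backbone` holds the windows that are conserved across all sequences. Counting gap columns as mismatches is a design choice made for this sketch; other scoring schemes are equally plausible.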

This poster is aimed at bioinformaticians who are interested in genomic analysis and visualization, as well as Python developers who are interested in developing packages for scientific research. I'm looking for discussion, potential users and development collaboration opportunities.

Poster content / discussion points

- context: genomic research
- overview of TRAPPIST features:
  - versatility of alignment functions
  - improved visualization of alignment data
  - identification of genetic backbones
  - handling differential phylogeny
- assessment of current performance
- future developments