Python API

Warning

Documentation and testing of the Python API of MiModD are currently a work in progress.

Feel free to use any part of the package for your own coding purposes, but know that you will do so at your own risk.

If you want to contribute to MiModD or report bugs, contact us via the MiModD user group at https://groups.google.com/forum/#!forum/mimodd.

Description of the MiModD Python modules in alphabetical order

bioobj_base.py

Currently holds only a few class definitions that are used by a few other modules.

config.py

This is MiModD’s configuration file, which defines global settings used by several tools. Running mimodd config from the command line accesses the contents of this file. Tools that depend on any of the settings simply import them.

convert.py

The module underlying the convert tool. It defines two functions, fq2sam and sam2bam, that perform the supported format conversions. The dispatch function is called by the package command line interface and calls the appropriate conversion function based on the provided arguments. fastqReader is a robust, generator-based fastq parser that fq2sam uses for input parsing.
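
To illustrate what a generator-based fastq parser looks like in general, here is a minimal sketch. It is not the code of fastqReader; the names FastqRecord and read_fastq are hypothetical, and the sketch assumes the simple four-line record layout (header, sequence, separator, quality) without line-wrapped sequences.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    # Hypothetical container; not a MiModD class.
    name: str
    sequence: str
    quality: str


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record per four-line fastq block (hypothetical helper)."""
    it = (line.rstrip("\r\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header line, got {header!r}")
        try:
            # The remaining three lines of the record come from the same iterator.
            sequence = next(it)
            separator = next(it)
            quality = next(it)
        except StopIteration:
            raise ValueError(f"truncated record: {header!r}") from None
        if not separator.startswith("+"):
            raise ValueError(f"expected '+' separator in record {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)
```

Because the function is a generator, records are produced one at a time, so an open file object can be passed in directly without loading the whole file into memory.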

covtools.py

Provides an interface for handling the MiModD-specific .cov file format.

decox.py

Provides a few low-level helper objects for working with decorators.

deletion_calling.py

The (still experimental) implementation of the delcall subcommand. The main function of the module is delcall, which uses sample_insert_sizes to estimate the distribution of insert sizes for the reads of a sample, find_lowcov_regions to identify regions of low coverage, and del_stats to assess whether there is statistical evidence of a deletion for any low-coverage region.
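
The low-coverage step can be pictured with the following sketch. It is not the implementation of find_lowcov_regions; the function name low_coverage_runs, its parameters and the use of a plain per-base coverage sequence as input are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple


def low_coverage_runs(
    coverage: Iterable[int], max_cov: int = 0, min_size: int = 1
) -> Iterator[Tuple[int, int]]:
    """Yield 0-based, half-open (start, end) runs with coverage <= max_cov.

    Hypothetical helper; runs shorter than min_size are skipped.
    """
    start = None
    pos = -1
    for pos, depth in enumerate(coverage):
        if depth <= max_cov:
            if start is None:
                start = pos  # a new low-coverage run begins here
        elif start is not None:
            if pos - start >= min_size:
                yield (start, pos)
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and pos + 1 - start >= min_size:
        yield (start, pos + 1)
```

Each reported interval would then be a candidate region for which statistical evidence of a deletion still has to be assessed separately.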

fileinfo.py

The implementation of the info command line tool. print_sampleinfo is the function called when the info tool is run. It calls the get_samplenames function and reports on its return value. get_samplenames does the real job of file format autodetection, but delegates file parsing to the modules pysamtools (for SAM/BAM format), pyvcf (for vcf format) and covtools (for .cov files).

fisher.py

A Python implementation of Fisher’s exact test (right-sided version only). Used by deletion_calling.
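
For readers unfamiliar with the test, a right-sided Fisher’s exact test on a 2x2 table can be sketched as below. This is an independent illustration based on the hypergeometric distribution, not the code in fisher.py; the function name fisher_exact_right and its argument layout are assumptions.

```python
from math import comb


def fisher_exact_right(a: int, b: int, c: int, d: int) -> float:
    """Right-sided p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper: returns the probability, under fixed margins,
    of observing a value of the top-left cell that is >= a.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    total = comb(n, col1)  # number of ways to fill the first column
    upper = min(row1, col1)  # largest possible top-left cell value
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return tail / total
```

For the table [[3, 1], [1, 3]] the tail sums the cases k = 3 and k = 4, giving (16 + 1) / 70.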

mimodd_base.py

Some low-level object definitions for use by the other modules.

pysamtools.py

A collection of wrappers that provide a functional programming interface for csamtools. Currently, MiModD uses a pysam core for iterator-based access to reads in SAM/BAM format, but shell-based command line calls to samtools are routed through this module. Most functions defined here have the same name as the corresponding samtools subcommand, but they typically provide extra functionality (e.g., support for additional file formats or error handling).

pyvcf.py

Provides an object-oriented interface for manipulating data in the vcf format. Used throughout the package to read and write vcf files.

samheader.py

Provides a Header class as an object representation of a sam header and a single function, sam_header, that generates and returns a Header instance. When run from the command line, the module parses the command line arguments, calls sam_header and either prints the returned header object or writes it to the specified output file.
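
As background on the data such an object has to represent, the sketch below parses sam header lines into a plain dictionary. It does not reproduce the Header class or the sam_header function; parse_sam_header is a hypothetical name, and the sketch assumes well-formed header lines with tab-separated TAG:VALUE fields (comment lines of type CO are kept as free text).

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def parse_sam_header(lines: Iterable[str]) -> Dict[str, List]:
    """Group sam header lines by record type (hypothetical helper)."""
    header = defaultdict(list)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.startswith("@"):
            break  # header lines precede the alignment records
        record_type, _, rest = line[1:].partition("\t")
        if record_type == "CO":
            header["CO"].append(rest)  # comments are free text
        else:
            # Split each field at the first colon only; values may contain colons.
            fields = dict(
                field.split(":", 1) for field in rest.split("\t") if field
            )
            header[record_type].append(fields)
    return dict(header)
```

A dedicated class can add validation and formatting on top of such a mapping, which is the kind of role an object representation of the header plays.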

snap.py

The package’s wrapper around the SNAP aligner. This module is used by the snap, snap_batch and snap_index command line tools.

tmpfiles.py

The functions defined here are used throughout the package to obtain unique names for temporary files and hardlinks during temporary file management.
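
The general idea of unique temporary names and hardlinks can be sketched with the standard library alone. The functions unique_tmp_name and tmp_hardlink below are hypothetical and are not the functions defined in tmpfiles.py; the sketch also assumes that the source file and the target directory are on the same filesystem, which hardlinks require.

```python
import os
import tempfile
import uuid


def unique_tmp_name(directory: str, prefix: str = "tmp", suffix: str = "") -> str:
    """Return a path in `directory` that does not exist yet (hypothetical)."""
    while True:
        candidate = os.path.join(directory, f"{prefix}{uuid.uuid4().hex}{suffix}")
        if not os.path.exists(candidate):
            return candidate


def tmp_hardlink(source: str, directory: str, suffix: str = "") -> str:
    """Hard-link `source` under a fresh unique name and return the link path."""
    while True:
        link_name = unique_tmp_name(directory, suffix=suffix)
        try:
            os.link(source, link_name)
        except FileExistsError:
            continue  # another process took the name first; try again
        return link_name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "input.bam")
        with open(original, "wb") as handle:
            handle.write(b"placeholder")
        link = tmp_hardlink(original, workdir, suffix=".bam")
        print(os.path.samefile(original, link))
```

A hardlink gives a file a second name without copying its data, so a tool can hand an input file to an external program under a name it controls and later remove that name without touching the original.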

tool_conf_update.py

Defines a single function add_to_galaxy, which is called by the enable_galaxy subcommand and adds the MiModD section to the Tools bar of Galaxy.

variant_annotation.py

The implementation of the annotate command line tool. The main function is annotate (called when the command line tool is run). snpeff is a wrapper for executing command line calls to SnpEff through Python function calls. get_installed_snpeff_genomes is called by the snpeff_genomes command line tool.

variant_calling.py

The single function varcall is called when the command line tool of the same name is run. It wraps command line samtools mpileup | bcftools view pipes (one per chromosome in the reference genome) into pre- and post-processing code for temporary file and read-group management, optional generation of a genome-wide coverage file and error handling.
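
Wrapping a two-command pipe in Python with error handling can be sketched as follows. This is not the code of varcall: run_pipe is a hypothetical helper, and because the actual samtools and bcftools arguments are not given here, the usage example substitutes two small Python child processes for the real commands.

```python
import subprocess
import sys
from typing import Sequence


def run_pipe(producer: Sequence[str], consumer: Sequence[str]) -> bytes:
    """Run `producer | consumer` and return the consumer's stdout (hypothetical)."""
    p1 = subprocess.Popen(producer, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(consumer, stdin=p1.stdout, stdout=subprocess.PIPE)
    # Close our copy of the pipe so the producer notices if the consumer exits early.
    p1.stdout.close()
    output, _ = p2.communicate()
    p1.wait()
    # Report a failure in either stage instead of silently returning partial output.
    for cmd, proc in ((producer, p1), (consumer, p2)):
        if proc.returncode != 0:
            raise subprocess.CalledProcessError(proc.returncode, list(cmd))
    return output


if __name__ == "__main__":
    # Stand-in commands; a real pipe would name the external tools instead.
    first = [sys.executable, "-c", "print('chrI'); print('chrII')"]
    second = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())",
    ]
    print(run_pipe(first, second).split())
```

Running one such pipe per chromosome then amounts to calling the helper in a loop, with temporary file management before and after each call.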

vcf_filter.py

This module provides the implementation of the vcf_filter tool. From the command line arguments passed to the tool, it constructs an appropriate call to the filter method of a VCFReader object as defined in the pyvcf module, then writes the VCFEntry objects returned by the filter method to the specified output destination.