Configuring MiModD for your system

The simplest way to configure a standard installation of MiModD is from the command line using the config tool.

Alternatively, you can configure the package through an environment variable. Typically, this is the preferred way to configure a Galaxy Tool Shed install of MiModD.

Note

If you have installed MiModD according to the standard installation scheme, you MUST configure the package through the command line once before it will let you perform any analysis.

Otherwise, the command line and the environment variable approaches can be used interchangeably and at any time to modify settings.

Configuration from the command line

The config tool of MiModD has this general usage pattern:

python3 -m MiModD.config [--parameter VALUE] ...

Note

Depending on your installation of MiModD, changes to the configuration file may require superuser rights, in which case you will have to prepend sudo to the invocation of the tool.

Configuration wizard

For a fresh standard installation of MiModD, invoking the config tool without options, e.g. through:

python3 -m MiModD.config

will start a simple configuration wizard.

For each parameter the wizard will present a short summary of its meaning (for more information see the section Configuration parameters below). It will then ask you to type a value for the parameter or to use the default value suggested in brackets, which you can accept by hitting <ENTER>.

Finally, the complete new package settings will be reported as confirmation (see the next section for an example report).

Inspect settings

If you run the config tool without options and MiModD has already been configured, it will simply report the current settings, for example:

% python3 -m MiModD.config

Settings for package MiModD in: /home/wgs/.local/lib/python3.5/site-packages/MiModD

-----------------------
PARAMETER : VALUE
.......................
TMPFILES_PATH :
MULTITHREADING_LEVEL : 3
MAX_MEMORY : 6
SNPEFF_PATH :
USE_GALAXY_INDEX_FILES : True
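
If you want to read these settings from a script, the report can be parsed line by line. The following is a minimal sketch that assumes the layout shown in the example above (a dotted separator line followed by PARAMETER : VALUE rows); parse_settings_report is a hypothetical helper, not part of MiModD, and the report format is not guaranteed to stay stable between versions.

```python
def parse_settings_report(report):
    """Parse the text printed by `python3 -m MiModD.config` into a dict.

    Assumes the layout of the example report: rows of the form
    "PARAMETER : VALUE" following a separator line made of dots.
    An unset parameter is returned as an empty string.
    """
    settings = {}
    in_table = False
    for line in report.splitlines():
        stripped = line.strip()
        if not in_table:
            # The table body starts after the dotted separator line.
            if stripped and set(stripped) == {"."}:
                in_table = True
            continue
        if not stripped:
            continue
        # Split on the first colon only, so values may contain colons.
        name, sep, value = stripped.partition(":")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


EXAMPLE_REPORT = """\
Settings for package MiModD in: /home/wgs/.local/lib/python3.5/site-packages/MiModD

-----------------------
PARAMETER : VALUE
.......................
TMPFILES_PATH :
MULTITHREADING_LEVEL : 3
MAX_MEMORY : 6
SNPEFF_PATH :
USE_GALAXY_INDEX_FILES : True
"""

if __name__ == "__main__":
    for name, value in parse_settings_report(EXAMPLE_REPORT).items():
        print(name, "=", repr(value))
```

All values are returned as strings; converting MULTITHREADING_LEVEL or MAX_MEMORY to integers is left to the caller.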

Change settings

Running the config tool is also the standard way to change settings at any time. Its invocation pattern for this purpose is [1]:

python3 -m MiModD.config [--tmpfiles [PATH]] [--snpeff [PATH]] [-t THREADS] [-m MEMORY] [--use-galaxy-index-files|--no-use-galaxy-index-files]

So to set a new TMPFILES_PATH and to set MAX_MEMORY to 8 GB, you could use:

python3 -m MiModD.config --tmpfiles /var/tmp/mimodd_tmp -m 8

Unset path parameters

The PATH argument to the --tmpfiles and --snpeff options above is optional. If any of these options is used without a PATH, this unsets the corresponding parameter. [1]
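
When the config tool is called from a setup script, it can help to assemble the argument list programmatically. The sketch below uses only the options listed in the invocation pattern above; build_config_command and its keyword names are hypothetical and chosen for illustration. An empty string stands for "option without PATH", i.e. unsetting the parameter.

```python
import shlex


def build_config_command(tmpfiles=None, snpeff=None, threads=None,
                         memory=None, use_galaxy_index_files=None):
    """Return the argument list for a `python3 -m MiModD.config` call.

    For tmpfiles and snpeff: pass a path to set it, "" to unset it
    (the option is then emitted without a PATH), or None to leave the
    current setting untouched.
    """
    argv = ["python3", "-m", "MiModD.config"]
    for option, path in (("--tmpfiles", tmpfiles), ("--snpeff", snpeff)):
        if path is None:
            continue
        argv.append(option)
        if path:
            argv.append(path)
    if threads is not None:
        argv += ["-t", str(threads)]
    if memory is not None:
        argv += ["-m", str(memory)]
    if use_galaxy_index_files is not None:
        argv.append("--use-galaxy-index-files" if use_galaxy_index_files
                    else "--no-use-galaxy-index-files")
    return argv


if __name__ == "__main__":
    # Same change as the example above: new temporary directory, 8 GB memory.
    print(shlex.join(build_config_command(tmpfiles="/var/tmp/mimodd_tmp",
                                          memory=8)))
    # Unset SNPEFF_PATH by giving the option without a PATH.
    print(shlex.join(build_config_command(snpeff="")))
```

The returned list can be passed to subprocess.run; if the configuration file requires superuser rights, prepend "sudo" to it as described in the note above.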

[1] Proceed to Configuration parameters to learn more about the parameters that can be configured through the different options and about the effects of unset path parameters.

Configuration through an environment variable

As an alternative to command line configuration, you can provide new settings for the Configuration parameters through the MiModD-specific environment variable $MIMODD_CONFIG_UPDATE.

The value of this variable has to be set to a :-separated list of parameter=value entries, for example:

export MIMODD_CONFIG_UPDATE=MAX_MEMORY=8:TMPFILES_PATH=/tmp

The presence of the environment variable will be detected by MiModD at its next run from the same terminal, and will be used to update the package settings accordingly.
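
If the variable is set from a script rather than typed by hand, its value can be assembled from a mapping. This is a minimal sketch; build_config_update is a hypothetical helper, and rejecting values that contain ":" is an assumption based on the separator described above, since this section does not document an escaping rule.

```python
def build_config_update(settings):
    """Join parameter/value pairs into a MIMODD_CONFIG_UPDATE value.

    settings maps parameter names (e.g. "MAX_MEMORY") to values.
    Entries are joined with ":" as parameter=value pairs.
    """
    entries = []
    for name, value in settings.items():
        text = str(value)
        # Assumption: ":" separates entries, so it cannot appear in a value.
        if ":" in text or "=" in name or ":" in name:
            raise ValueError("cannot encode entry: %s=%s" % (name, text))
        entries.append("%s=%s" % (name, text))
    return ":".join(entries)


if __name__ == "__main__":
    print(build_config_update({"MAX_MEMORY": 8, "TMPFILES_PATH": "/tmp"}))
```

Assigning the result to os.environ["MIMODD_CONFIG_UPDATE"] makes it visible to any MiModD process started afterwards from the same Python process.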

Unset paths through the environment variable

For parameters that take filesystem paths as their values, i.e., TMPFILES_PATH and SNPEFF_PATH, the path can be unset by omitting it and specifying only parameter= in $MIMODD_CONFIG_UPDATE. For example, after:

export MIMODD_CONFIG_UPDATE=SNPEFF_PATH=:TMPFILES_PATH=

the next invocation of MiModD will unset both paths (the effects of doing so are described in the Configuration parameters section below).
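
To make the format concrete, the following sketch splits such a value into its entries and represents an omitted path as None. It only illustrates the syntax described in this section; parse_config_update is a hypothetical function and not MiModD's own parser.

```python
def parse_config_update(value):
    """Split a MIMODD_CONFIG_UPDATE value into a {parameter: value} dict.

    An entry of the form "PARAMETER=" (nothing after the equals sign)
    is reported as None, meaning the parameter is to be unset.
    """
    updates = {}
    for entry in value.split(":"):
        if not entry:
            continue
        name, sep, setting = entry.partition("=")
        if not sep:
            raise ValueError("not a parameter=value entry: %r" % entry)
        updates[name] = setting or None
    return updates


if __name__ == "__main__":
    print(parse_config_update("SNPEFF_PATH=:TMPFILES_PATH="))
    print(parse_config_update("MAX_MEMORY=8:TMPFILES_PATH=/tmp"))
```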

Passing the environment variable through Galaxy

You can also pass the variable to Galaxy by prepending it to the invocation of the Galaxy start script. This is the recommended approach to configure a Galaxy Tool Shed installation of MiModD, e.g.:

MIMODD_CONFIG_UPDATE=MULTITHREADING_LEVEL=3:MAX_MEMORY=4 sh run.sh

With this, the corresponding MiModD settings will get updated when the first MiModD tool gets executed from within the Galaxy instance.

Note

Changes made through $MIMODD_CONFIG_UPDATE are persistent, so MiModD needs to find the variable only once.

To store the changes, MiModD needs to be executed with write privileges for the MiModD package directory. This should not be an issue with Tool Shed installations of MiModD, but can make this configuration scheme problematic in combination with standard installations in system directories.

Configuration parameters

The parameters are listed below as pairs of

PARAMETER : config options

where PARAMETER can be used in the $MIMODD_CONFIG_UPDATE variable and the config options are used to set the same parameter using the python3 -m MiModD.config command.

TMPFILES_PATH : --tmpfiles, --tmpfiles-path

the directory in which MiModD will store temporary files

In a typical analysis pipeline, MiModD may produce several GB of data in this directory and remove them automatically once the data is no longer needed. Under exceptional circumstances, however, MiModD might fail to delete data files, so this directory is the first place to check if you notice reduced disk space. Also, every user of MiModD requires write permission in this directory. When this parameter is unset, tools will generate temporary files in the folder they are run from, i.e., in the current working directory found at tool run time.

Configuring MiModD for use with Galaxy

An unset TMPFILES_PATH will cause all temporary data to go into the temporary job working directory managed by Galaxy. Most often, this is a welcome solution when using MiModD exclusively through Galaxy and, consequently, it will get autoconfigured for you during a MiModD Tool Shed install.

MULTITHREADING_LEVEL : -t, --threads

the maximum number of cores that a single MiModD command is supposed to use at any time [default: 4]

Many MiModD commands take advantage of multiprocessing to speed up analyses. These commands try to respect this setting where possible, although some may use slightly more than their allocated CPU share.

Note for Galaxy Admins

When running MiModD-based jobs from within Galaxy, a configured GALAXY_SLOTS value will override this parameter setting. So if you are controlling multi-threading through a job_conf.xml, you can continue doing so without worrying about this parameter.
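
The precedence described in this note can be pictured as follows. This is only an illustration of the rule, assuming that GALAXY_SLOTS is read from the job's environment; effective_threads is a hypothetical function and does not reproduce MiModD's actual code.

```python
import os


def effective_threads(configured_level, environ=None):
    """Return the thread limit that applies to a job.

    configured_level is the MULTITHREADING_LEVEL setting; a non-empty
    GALAXY_SLOTS value in the environment takes precedence over it.
    """
    environ = os.environ if environ is None else environ
    slots = environ.get("GALAXY_SLOTS")
    if slots:
        return int(slots)
    return configured_level


if __name__ == "__main__":
    print(effective_threads(4, {"GALAXY_SLOTS": "2"}))  # Galaxy value wins
    print(effective_threads(4, {}))                     # configured value
```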

MAX_MEMORY : -m, --memory

the maximum memory in GB that any single MiModD command should use [default: 2]

WGS data files are often very large and some MiModD tools can operate on them more efficiently when allowed to hold a larger portion of such files in memory at once. Most of the time, though, MiModD will consume less than 1 GB. The setting is adhered to relatively strictly by most tools, with the exception of the SNAP aligner-based tools mimodd snap and mimodd snap-batch, the corresponding Galaxy MiModD Read Alignment tool, and the mimodd index tool while it generates a SNAP reference genome index. Due to the nature of their underlying alignment algorithm, these tools require a fixed amount of memory that depends on the size of the reference genome and may be significantly more than the configured setting.

Tip

MULTITHREADING_LEVEL and MAX_MEMORY will have a big effect on the performance of MiModD, but also on the responsiveness of your system during execution of MiModD commands. As a rule of thumb, if you do not have special requirements, we recommend setting both parameters to between 50 and 75% of the available resources on your system. For example, if you have 8 threads and 16 GB of RAM on your system, you might set MULTITHREADING_LEVEL to 4-6 and MAX_MEMORY to 8-12.
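
The rule of thumb is simple arithmetic and can be sketched as below. recommended_range is a hypothetical helper; rounding down to whole threads or GB and never going below 1 are assumptions made for this example.

```python
import os


def recommended_range(total, low=0.50, high=0.75):
    """Return (lower, upper) bounds covering 50-75% of a resource.

    total is the number of threads or the amount of RAM in GB.
    Results are rounded down to whole units, with a minimum of 1.
    """
    lower = max(1, int(total * low))
    upper = max(lower, int(total * high))
    return lower, upper


if __name__ == "__main__":
    # The example from the tip: 8 threads and 16 GB of RAM.
    print("MULTITHREADING_LEVEL:", recommended_range(8))
    print("MAX_MEMORY:", recommended_range(16))
    # The same calculation for the thread count of the current machine.
    print("this machine:", recommended_range(os.cpu_count() or 1))
```

For the system in the tip, this gives 4 to 6 for MULTITHREADING_LEVEL and 8 to 12 for MAX_MEMORY.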

See also

MiModD Hardware Requirements for more notes on performance.

SNPEFF_PATH : --snpeff, --snpeff-path

the path to the optional SnpEff variant annotation tool

This parameter should be set to the directory that you installed SnpEff into (e.g., ~/snpEff if you followed the recommended installation steps). If you do not have SnpEff installed, leave the parameter unset (this is also the default setting) to have all SnpEff-dependent functionality of MiModD deactivated.

USE_GALAXY_INDEX_FILES : --use-galaxy-index-files, --no-use-galaxy-index-files

control whether MiModD should try to use index files generated by Galaxy

Galaxy does its own indexing for certain file types used and generated by MiModD. With this option turned on (the default), MiModD tools that require an index will try to use the Galaxy-generated index instead of building a new one. Typically, this will be the preferred behaviour because it avoids redundant calculations and speeds up the corresponding analyses. You may, however, have to turn this option off if you encounter version incompatibilities between the indices built by Galaxy and those required by MiModD. In the $MIMODD_CONFIG_UPDATE environment variable, use the entries:

USE_GALAXY_INDEX_FILES=YES
USE_GALAXY_INDEX_FILES=NO

to turn the behaviour on and off, respectively.