Origins of the project

This project, like probably many others, was born out of frustration. People working with computer simulations sooner or later need automation, as simulation demands grow and projects share very similar workflows. A first solution is typically a simple shell script containing the sequence of commands. However, some operations, especially file edits, are not easy to express with command line utilities. Decisions and conditionals are another challenge, for which most shells (namely BASH, CSH and their variants) have limited and/or cumbersome facilities.

These more advanced operations are readily handled by a more complete programming language, for instance Python. Inserting Python snippets into the shell scripts (e.g. with “here documents” using the “<<” operator in BASH) mitigates the problem, but the resulting scripts tend to become large and unmaintainable. The greatest advantage of BASH over Python is the straightforward way it runs programs.
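As an illustration of the kind of file edit that is awkward in a shell but short in Python, the following minimal sketch sets one option in the text of an .mdp parameter file. The function name `set_mdp_option` is hypothetical and not part of this package; the sketch assumes the usual `name = value` layout with `;` starting a comment, and it drops any trailing comment on the line it rewrites.

```python
def _normalize(name: str) -> str:
    """Compare option names case-insensitively, treating '-' and '_' alike."""
    return name.strip().lower().replace("-", "_")


def set_mdp_option(mdp_text: str, key: str, value: str) -> str:
    """Return mdp_text with `key` set to `value`; append the option if absent.

    Hypothetical helper for illustration only.
    """
    lines = []
    found = False
    for line in mdp_text.splitlines():
        # Ignore everything after ';' when looking for the option name.
        body = line.split(";", 1)[0]
        if "=" in body:
            name = body.split("=", 1)[0]
            if _normalize(name) == _normalize(key):
                # Rewrite the whole line (a trailing comment is not preserved).
                line = f"{key} = {value}"
                found = True
        lines.append(line)
    if not found:
        # The conditional a shell one-liner would struggle with:
        # add the option only when no existing line defines it.
        lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"
```

The function works on strings rather than files, so it can be checked without touching the disk; reading and writing the actual file is left to the caller.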

The authors of this project found that the scripts developed for their workflows tended to contain more and more Python blocks, and that much of the remaining code merely transferred information between Python and BASH. When the Python:BASH ratio reached around 1:1, we decided to turn the tables: choose Python as the main language and replace the remaining BASH constructs in a Pythonic, object-oriented way. This also brought the advantage that many common operations (such as calling grompp or solvate) can be abstracted away. Of course, abstraction comes with the danger of turning parts of the workflow into black boxes and hiding important decisions from the end user. The end user of this package is therefore encouraged to always check the files produced, especially coordinate sets, topologies and molecular dynamics parameter files (.mdp).

We also intended to make this package as portable as possible. We are aware of the existence of gmxapi, the official Python bindings of GROMACS supplied with the newest versions. Because it directly uses the shared libraries of GROMACS, it is very fast. However, it is not easily portable between versions. In gmxbatch we instead rely on the command line interface, invoking gmx subprograms where needed. This results in a one-size-fits-(almost)-all package, where the exact program version is selected by the user. We also implemented a higher-level, object-oriented interface, letting the user focus on the physics/chemistry instead of the exact parametrization of GROMACS commands.
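The command-line approach described above can be sketched roughly as follows. This is not the actual gmxbatch implementation: `build_gmx_command` and `run_gmx` are hypothetical names chosen for illustration, and the sketch only assumes that the user supplies the name or path of a gmx executable of their chosen version.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def build_gmx_command(gmx_executable: str, subprogram: str,
                      args: Sequence[str] = ()) -> List[str]:
    """Assemble the argument list for one gmx subprogram call."""
    return [gmx_executable, subprogram, *args]


def run_gmx(gmx_executable: str, subprogram: str,
            args: Sequence[str] = (),
            stdin_text: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run a gmx subprogram with the user-selected executable.

    Hypothetical helper: raises FileNotFoundError if the executable cannot
    be found and subprocess.CalledProcessError if the program exits with
    a nonzero status.
    """
    if shutil.which(gmx_executable) is None:
        raise FileNotFoundError(f"gmx executable not found: {gmx_executable}")
    return subprocess.run(
        build_gmx_command(gmx_executable, subprogram, args),
        input=stdin_text,      # answers for interactive prompts, if any
        text=True,
        capture_output=True,   # keep stdout/stderr for later inspection
        check=True,
    )
```

With such a helper, a preprocessing step could be written as `run_gmx("gmx", "grompp", ["-f", "md.mdp", "-c", "conf.gro", "-p", "topol.top", "-o", "md.tpr"])`, where the file names are placeholders. Switching GROMACS versions then only requires passing a different executable, which is the portability argument made above.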