Simulation Standards

Best practices for experiment design, inputs/outputs, and using Quest

Experiment design

When setting up a simulation experiment, be deliberate about the number of parameters to change and scenarios to run. A single large simulation rarely provides an immediate answer and is rarely set up correctly on the first try. Plan for multiple iterations that build on one another, so you gain confidence and understanding before building a larger simulation experiment.

  • Prepare a simulation plan (list of scenarios to run): which ones are the main simulation experiments you need to run in order to answer the research question? Which ones are out of scope?
    • You may find it helpful to create an Excel sheet with 1) the parameter names and relevant values to explore and 2) the combinations of parameter values that define each simulation experiment to run. An example is provided [HERE], and a minimal sketch of such a scenario grid is shown after this list. More advanced use also includes notes about the number of scenarios and the computational resources and time required.
  • Pilot before scaling up: Start simple and use a template scenario that has been validated, then add the intervention or feature of interest for one or a few settings before running all.
  • Technical feasibility before accurate predictions: It is OK to use placeholder parameters for test simulations while developing your code and scripts; however, do keep track of these and update them to the correct values as soon as initial testing is done.
    • In the pilot and test simulations, make sure to carefully investigate the input JSON and output files; see Reviewing input & output files below.
  • Give your simulation experiments meaningful names that can be versioned and tracked across iterations. For instance, test runs might include ‘test’ in their names, with a v0, v1, a date, or similar included in the experiment name or in the folder where your simulations are stored.
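
As referenced above, here is a minimal sketch of a scenario grid in Python. The parameter names (itn_coverage, smc_rounds) and values are hypothetical, chosen only to illustrate the structure of a simulation plan:

```python
from itertools import product

# 1) Parameter names and the relevant values to explore
parameters = {
    "itn_coverage": [0.4, 0.6, 0.8],  # hypothetical intervention coverage levels
    "smc_rounds": [0, 4],             # hypothetical: without/with the intervention
    "seed": list(range(5)),           # stochastic replicates per scenario
}

# 2) Each combination of parameter values defines one scenario to run
scenarios = [dict(zip(parameters, values)) for values in product(*parameters.values())]
print(f"{len(scenarios)} scenarios planned")

# A versioned, meaningful name makes the experiment easy to track across iterations
experiment_name = "itn_smc_sweep_test_v0"
```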

Some general best practices for scientific computing are described in PLOS Biology, in addition to what we specifically recommend for members of this team.

Reviewing input & output files

When designing new experiments, review the input and output files to make sure your simulations are doing what you think they are. It can be tricky to get everything set up correctly the first time, even for experienced EMOD users, so this review process will help you verify your setup before scaling up. Questions to check when investigating new simulation runs include:

  • Were the campaigns actually deployed, at the correct coverage and time?
  • How often were the campaigns deployed?
  • How does the simulated population change over time?
  • When running with burnin, was the burnin actually “picked up” successfully?
  • Small simulations allow for individual-level event reports: at what ages, and how often, did individuals receive an intervention or change properties? (See the sketch after this list.)
  • Look at the same metric (e.g., prevalence) not only aggregated over time, but also at finer resolution such as monthly.
  • Are the age bins correctly set up, extracted in the analyzer, and aggregated?
  • How do your plots compare to other known relationships?
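
The sketch below illustrates a few of these checks on a single simulation’s output folder, assuming the ReportEventRecorder.csv and InsetChart.json reports were requested. The column and channel names shown are typical EMOD defaults, but verify them against your own files:

```python
import json
import pandas as pd

# Were the campaigns actually deployed, when, and how often?
events = pd.read_csv("output/ReportEventRecorder.csv")
print(events.groupby("Event_Name")["Time"].agg(["count", "min", "max"]))

# At what ages did individuals receive an intervention?
# (EMOD event reports typically record Age in days.)
print((events["Age"] / 365).describe())

# How does the simulated population change over time?
with open("output/InsetChart.json") as f:
    channels = json.load(f)["Channels"]
print(channels["Statistical Population"]["Data"][:12])
```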

Using Quest

Quest is Northwestern’s high-performance computing (HPC) cluster, on which we run our EMOD simulations. Quest is a Linux-based HPC that uses the Slurm workload manager to schedule jobs among its users. Everyone will need to apply for access to the team’s Quest allocation, b1139, here. Once granted access, you will have 80 GB of space in your home directory and access to the team allocation, which has much more space. We recommend cloning GitHub repositories to your home directory but saving all outputs to an appropriate folder on the team allocation.
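
For instance, when configuring idmtools to submit simulations on Quest, the output location can be pointed at the team allocation. The sketch below assumes the idmtools Slurm platform is installed; the job_directory path is a placeholder, and the exact arguments depend on your idmtools version and project setup:

```python
from idmtools.core.platform_factory import Platform

# Sketch: send simulation outputs to the team allocation, not your home
# directory. The job_directory below is a placeholder path -- use your
# own project folder on b1139.
platform = Platform(
    "SLURM_LOCAL",
    job_directory="/projects/b1139/<your_project>/simulation_output",
    account="b1139",
    partition="b1139",   # assumed partition name; confirm for your allocation
    time="04:00:00",
)
```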

Resource Sharing

Because everyone on the team, as well as program participants, uses b1139, we need to be conscious of resource sharing. Please follow the best practices below so everyone can have a good experience using the cluster.

  • Be aware of how long you expect your jobs to run. If they will take a long time, it is considerate to run fewer simulations at once or to wait for times of low usage (such as evenings or weekends) to start them. You can enable email notifications for your submitted jobs using the #SBATCH --mail-type=ALL and #SBATCH --mail-user=<your_email@northwestern.edu> directives. Once submitted, you can also check job status in the terminal via squeue -u <username> or squeue -A b1139.
  • When running example exercises or testing out new projects, run simulations on the ‘b1139testnode’ partition, where you are less likely to be blocked by larger or longer jobs.
  • If you must run a “big job” with many simulations, discuss any urgent needs for the cluster first, as others may have time-sensitive projects. Submit fewer than 100 jobs at a time on b1139 to avoid “clogging” the cluster. It is easy to limit the number of jobs that run at one time with idmtools (see the sketch after this list): you can submit all your jobs at once, but only the specified maximum (“max_running_jobs”) will run at a time. As simulations finish, the next ones start automatically.
  • Debug your simulations with small pilots (see experiment design) to make sure your simulations do what you expect before scaling up.
  • To manage disk space, you can check the space used by typing homedu or checkproject b1139 in the terminal. Simulations should typically be stored in the respective project folders on b1139, both so they are accessible to other team members for troubleshooting and because storage limits on home directories are low. Be sure to remove old and/or failed simulations once they are no longer needed, as they can occupy a great deal of storage space.
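
As referenced above, here is a sketch of throttling concurrent jobs with the idmtools Slurm platform. The keyword is max_running_jobs in recent idmtools-platform-slurm releases, but verify it against your installed version:

```python
from idmtools.core.platform_factory import Platform

# Sketch: submit the whole experiment but cap how many jobs run at once.
platform = Platform(
    "SLURM_LOCAL",
    job_directory="/projects/b1139/<your_project>/simulation_output",  # placeholder path
    max_running_jobs=10,  # at most 10 simulations run at a time; the rest wait in the queue
)
```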