Performance and Profiling

Indexing

Indexing properties that your simulation queries repeatedly can lead to dramatic speedups: improvements of two or more orders of magnitude are not uncommon. It is also very simple to do.

You can index a single property, or you can index multiple properties jointly. Just include the following method call(s) during the initialization of context, replacing the example property names with your own:

// For single property indexes
// Somewhere during the initialization of `context`:
context.index_property::<Person, Age>();

// For multi-indexes
// Where properties are defined:
define_multi_property!((Name, Age, Weight), Person);
// Somewhere during the initialization of `context`:
context.index_property::<Person, (Name, Age, Weight)>();

The cost of creating indexes is increased memory use, which can be significant for large populations. So it is best to only create indexes / multi-indexes that actually improve model performance, especially if cloud computing costs / VM sizes are an issue.
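To make the trade-off concrete, here is a minimal, self-contained sketch of the general idea behind a property index. It does not use Ixa and is not how Ixa implements indexes; the Person struct and the function names are hypothetical, chosen only for illustration. The unindexed query scans the whole population on every call, while the indexed query pays once to build a map (the extra memory) and then answers each query with a lookup.

```rust
use std::collections::HashMap;

/// A toy person record (hypothetical); `age` plays the role of an indexed property.
struct Person {
    id: usize,
    age: u8,
}

/// Unindexed query: visits every person, so each call costs time
/// proportional to the population size.
fn ids_with_age_scan(people: &[Person], age: u8) -> Vec<usize> {
    people.iter().filter(|p| p.age == age).map(|p| p.id).collect()
}

/// Builds an index from age to the ids of people with that age.
/// This is the memory cost of an index: one stored id per person.
fn build_age_index(people: &[Person]) -> HashMap<u8, Vec<usize>> {
    let mut index: HashMap<u8, Vec<usize>> = HashMap::new();
    for p in people {
        index.entry(p.age).or_default().push(p.id);
    }
    index
}

/// Indexed query: one hash lookup plus a copy of the matching ids,
/// regardless of how many people do not match.
fn ids_with_age_indexed(index: &HashMap<u8, Vec<usize>>, age: u8) -> Vec<usize> {
    index.get(&age).cloned().unwrap_or_default()
}
```

Both queries return the same ids; the difference is that the scan repeats its work on every call, which is why indexing pays off mainly for properties that are queried many times.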

See the chapter on Indexing for full details.

Optimizing Performance with Build Profiles

Build profiles allow you to configure compiler settings for different kinds of builds. By default, Cargo uses the dev profile, which is usually what you want for normal development of your model but which does not perform optimization. When you are ready to run a real experiment with your project, you will want to use the release build profile, which does more aggressive code optimization and disables runtime checks for numeric overflow and debug assertions. In some cases, this can improve performance dramatically.

The Cargo documentation for build profiles describes many different settings you can tweak. You are not limited to Cargo's built-in profiles either. In fact, you might wish to create your own profile for generating flame graphs, for example, as we do in the section on flame graphs below. These settings go under [profile.release] or a custom profile like [profile.profiling] in your Cargo.toml file. For maximum execution speed, the key trio is:

[profile.release]
opt-level = 3     # Controls the level of optimization. 3 = highest runtime speed. "s"/"z" = size-optimized.
lto = true        # Link Time Optimization. Improves runtime performance by optimizing across crate boundaries.
codegen-units = 1 # Number of codegen units per crate. Fewer = more optimization opportunities, slower compiles.

The Cargo documentation for build profiles describes a few more settings that can affect runtime performance, but these are the most important.
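Because the release profile disables overflow checks and debug assertions by default, code can behave differently between the two profiles: an integer overflow with plain + panics in a dev build but wraps silently in a release build. The sketch below, with hypothetical function names, shows one way to detect the build kind and to handle overflow explicitly so that the result does not depend on the profile.

```rust
/// Reports whether debug assertions are compiled in. By default this is
/// true under the dev profile and false under the release profile.
fn build_kind() -> &'static str {
    if cfg!(debug_assertions) {
        "debug assertions on"
    } else {
        "debug assertions off"
    }
}

/// Adds two counters with explicit overflow handling. `checked_add`
/// returns `None` on overflow in every build profile, instead of
/// panicking (dev) or wrapping (release) as plain `+` would.
fn add_infections(current: u32, new_cases: u32) -> Option<u32> {
    current.checked_add(new_cases)
}
```

If your model relies on debug assertions or overflow panics to catch bugs, it is worth running it at least once under the dev profile before trusting results from a release build.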

Ixa Profiling Module

For Ixa's built-in profiling (named counts, spans, and JSON output), see the Profiling Module topic.

Visualizing Execution with Flame Graphs

Samply and Flame Graph are easy-to-use profiling tools that generate a "flame graph," a visualization of stack traces that lets you see how much execution time is spent in different parts of your program. We demonstrate how to use Samply, which has better macOS support.

Install the samply tool with Cargo:

cargo install samply

For best results, build your project in release mode with debug info enabled. The easiest way to do this is to make a build profile, which we name "profiling" below, by adding the following section to your Cargo.toml file:

[profile.profiling]
inherits = "release"
debug = true

Now when we build the project we can specify this build profile to Cargo by name:

cargo build --profile profiling

This creates your binary in target/profiling/my_project, where my_project stands in for the name of your project. Now run the project with samply:

samply record ./target/profiling/my_project

We can pass command line arguments as usual if we need to:

samply record ./target/profiling/my_project arg1 arg2

When execution completes, samply will open the results in a browser. The graph looks something like this:

[Figure: Flame Graph]

The graph shows the "stack trace," that is, nested function calls, with a "deeper" function call stacked on top of the function that called it, but does not otherwise preserve chronological order of execution. Rather, the width of a function's bar is proportional to the time spent within that function over the course of the entire program execution. Since everything is ultimately called from your main function, you can see main at the bottom of the pile stretching the full width of the graph. This way of representing program execution allows you to identify "hot spots" where your program is spending most of its time.
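The relationship between stack samples and bar widths can be sketched in a few lines. This is a simplified illustration, not how Samply works internally: it assumes each sample is a list of function names from main outward, and it aggregates per function name rather than per call path as a real flame graph does. The function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Counts, for each function, the number of samples in which it appears
/// anywhere on the stack (its "inclusive" sample count).
fn inclusive_counts(samples: &[Vec<&str>]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for stack in samples {
        // Deduplicate within one stack so recursive calls are counted once.
        let mut seen: Vec<&str> = Vec::new();
        for &frame in stack {
            if !seen.contains(&frame) {
                seen.push(frame);
                *counts.entry(frame.to_string()).or_insert(0) += 1;
            }
        }
    }
    counts
}

/// Width of a function's bar as a fraction of the full graph width.
/// Functions that never appear in a sample get a width of zero.
fn width_fraction(counts: &BTreeMap<String, usize>, name: &str, total_samples: usize) -> f64 {
    *counts.get(name).unwrap_or(&0) as f64 / total_samples as f64
}
```

With four samples, three of which pass through a function called run, run gets a bar three quarters of the full width, while main, present in every sample, spans the whole graph.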

Using Logging to Profile Execution

For simple profiling during development, it is easy to use logging to measure how long certain operations take. This is especially useful when you want to understand the cost of specific parts of your application, such as loading a large file.

Cultivate good logging habits

It's good to cultivate the habit of adding trace! and debug! logging messages to your code. You can always selectively enable or disable messages for different parts of your program with per-module log level filters. (See the logging module documentation for details.)

Suppose we want to know how long it takes to load data for a large population before we start executing our simulation. We can do this with the following pattern:

use std::fs::File;
use std::io::{self, BufReader};
use std::time::Instant;
use ixa::{trace, Context};

fn load_population_data(path: &str, context: &mut Context) -> io::Result<()> {
    // Record the start time before we begin loading the data.
    let start = Instant::now();

    // `?` propagates I/O errors, which is why this function returns `io::Result`.
    let file = File::open(path)?;
    let mut reader = BufReader::new(file);
    // ... code to load the data into `context` goes here ...

    // Compute the time that has elapsed since `start`.
    let duration = start.elapsed();
    trace!("Loaded population data from {} in {:?}", path, duration);
    Ok(())
}

This pattern is especially useful to pair with a progress bar as in the next section.
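If you time several operations, the start/elapsed pattern above can be wrapped in a small helper. The following is a minimal sketch using only the standard library; the name timed is hypothetical and is not part of Ixa.

```rust
use std::time::{Duration, Instant};

/// Runs `operation` and returns its result together with the
/// wall-clock time it took to run.
fn timed<T>(operation: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = operation();
    (result, start.elapsed())
}
```

A caller would wrap the expensive step in a closure, receive the result and the duration as a pair, and pass the duration to trace! or debug! exactly as in the example above. Keeping the logging call outside the helper leaves the choice of log level to each call site.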

Progress Bar

The progress bar module provides functions to set up and update a progress bar.

A progress bar has a label, a maximum progress value, and its current progress, which starts at zero. The maximum and current progress values are constrained to be of type usize. However, convenience methods are provided for the common case of a progress bar for the timeline; these take f64 time values and round them to the nearest integers for you.
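As an illustration of the rounding described above, the following sketch converts an f64 time into a usize progress value. This is an assumption about what such a conversion might look like, not Ixa's actual implementation; the function name is hypothetical, and it assumes max_time is non-negative.

```rust
/// Converts a simulation time to an integer progress value by clamping
/// it to the range [0, max_time] and rounding to the nearest integer.
/// Assumes `max_time >= 0.0`; `clamp` panics if the bounds are reversed.
fn time_to_progress(current_time: f64, max_time: f64) -> usize {
    current_time.clamp(0.0, max_time).round() as usize
}
```

Clamping before the cast keeps times past the end of the simulation from exceeding the maximum progress value, and keeps negative times at zero.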

Only one progress bar can be active at a time. If you try to set a second progress bar, the new progress bar will replace the first. This is useful if you want to track the progress of a simulation in multiple phases. Keep in mind, however, that if you define a timeline progress bar, the Context will try to update it in its event loop with the current time, which might not be what you want if you have replaced the progress bar with a new one.

Timeline Progress Bar

/// Initialize the progress bar with the maximum time until the simulation ends.
pub fn init_timeline_progress_bar(max_time: f64);
/// Updates the progress bar with the current time. Finalizes the progress bar when
/// `current_time >= max_time`.
pub fn update_timeline_progress(mut current_time: f64);

Custom Progress Bar

If the timeline is not a good indication of progress for your simulation, you can set up a custom progress bar.

/// Initializes a custom progress bar with the given label and max value.
pub fn init_custom_progress_bar(label: &str, max_value: usize);

/// Updates the current value of the custom progress bar.
pub fn update_custom_progress(current_value: usize);

/// Increments the custom progress bar by 1. Use this if you don't want to keep track of the
/// current value.
pub fn increment_custom_progress();

Custom Example: People Infected

Suppose you want a progress bar that tracks how much of the population has been infected (or infected and then recovered). You first initialize a custom progress bar before executing the simulation.

use crate::progress_bar::{init_custom_progress_bar};

init_custom_progress_bar("People Infected", POPULATION_SIZE);

To update the progress bar, we need to listen to the infection status property change event.

use crate::progress_bar::{increment_custom_progress};

// You might already have this event defined for other purposes.
pub type InfectionStatusEvent = PropertyChangeEvent<Person, InfectionStatus>;

// This will handle the status change event, updating the progress bar
// if there is a new infection.
fn handle_infection_status_change(context: &mut Context, event: InfectionStatusEvent) {
  // We only increment the progress bar when a new infection occurs.
  if (InfectionStatusValue::Susceptible, InfectionStatusValue::Infected)
      == (event.previous, event.current)
  {
    increment_custom_progress();
  }
}

// Be sure to subscribe to the event when you initialize the context.
pub fn init(context: &mut Context) -> Result<(), IxaError> {
    // ... other initialization code ...
    context.subscribe_to_event::<InfectionStatusEvent>(handle_infection_status_change);
    // ...
    Ok(())
}

Additional Resources

For an in-depth look at performance in Rust programming, including many advanced tools and techniques, check out The Rust Performance Book.
