My last blog post was Simple Monte Carlo Simulations in Rust.

It was single-threaded, and I quickly moved onto parallelizing it. Since the calculations were independent, until the end, it could be described as an 'embarassingly parallel' problem.

Threads

First stop, threads. Rust's standard library has a std::thread module that makes it easy to use threads. Along with std::sync::mpsc (multi-producer-single-consumer) for message passing, it's easy to parallelize the problem and then return results.

use rand::Rng;
use std::sync::mpsc;
use std::thread;

struct Message {
    points_inside_circle: u64,
    total_points: u64,
}

fn main() {
    let num_tasks = 10;

    let (tx, rx) = mpsc::channel::<Message>();

    for _ in 0..num_tasks {
        let tx = tx.clone();

        thread::spawn(move || {
            let mut rng = rand::thread_rng();
            let total_points = 10_000_000;
            let mut points_inside_circle = 0;

            for _ in 0..total_points {
                let x: f64 = rng.gen();
                let y: f64 = rng.gen();

                if x * x + y * y <= 1.0 {
                    points_inside_circle += 1;
                }
            }

            let message = Message {
                points_inside_circle,
                total_points,
            };
            tx.send(message).expect("failed to send result")
        });
    }

    let mut total_points = 0;
    let mut points_inside_circle = 0;

    for _ in 0..num_tasks {
        let result = rx.recv().expect("failed to receive message");
        points_inside_circle += result.points_inside_circle;
        total_points += result.total_points;
    }

    let pi_estimate = 4.0 * (points_inside_circle as f64) / (total_points as f64);
    println!(
        "Estimated Pi = {}, from {} total points",
        pi_estimate, total_points
    );
}

Rayon

On closer glance, the above code is doing map-reduce. It is also using lots of mutable state.

Rust has support for doing functional programming, which means it has map and reduce functions! Let's use them instead.

I also chose to use Rayon, a data-parallelism library for Rust - since it had been on my list of things to play around with for a while.

Rayon introduces into_par_iter to replace std-lib into_iter, plus a map_init function, which is perfect for this problem. It allows us to initialize each underlying thread managed by the library with a value, and then map over the values in parallel.

In our case, since we're using rand::thread_rng(), we can use map_init to initialize each thread with a random number generator. This randon number generator uses thread-local state, so we don't want to share it across multiple threads.

use rand::Rng;
use rayon::prelude::*;

fn main() {
    let total_points = 1_000_000_000;
    let points_inside_circle = (0..total_points)
        .into_par_iter()
        .map_init(
            || rand::thread_rng(),
            |rng, _| {
                let x: f64 = rng.gen();
                let y: f64 = rng.gen();
                if x * x + y * y <= 1.0 {
                    1
                } else {
                    0
                }
            },
        )
        .reduce(|| 0, |a, b| a + b);

    let pi_estimate = 4.0 * (points_inside_circle as f64) / (total_points as f64);
    println!(
        "Estimated Pi = {}, from {} total points",
        pi_estimate, total_points
    );
}

This code is now much cleaner, using functional programming and Rayon.

I look forward to doing more simulations in Rust in the future!

Colin Webb

Parallel Monte Carlo Simulation in Rust

Threads

Rayon