Performance

Overview

It’s hard to do a full benchmark. We need to choose a scenario in which the scheduler will be busy creating, scheduling, running and destroying the processes.

We continue to talk about the topic of the sum example in the tutorial. But now the computing rule is changed to a divide-and-conquer one: sum(low, high) = sum(low, mid) + sum(mid + 1, high) in which mid is (low + high) / 2.

In a basic version, it may look like:

int sum(int low, int high) {
  if (low == high) {
    return low;
  }
  int mid = low + ((high - low) >> 1);
  return sum(low, mid) + sum(mid + 1, high);
}

In our benchmark, we compute the sub-parts sum in two processes each. And it becomes:

proc void sum(int low, int high, int *result) {
  if (low == high) {
    *result = low;
    return;
  }
  int mid = low + ((high - low) >> 1), left, right;
  sync(sum(low, mid, &left); sum(mid + 1, high, &right));
  *result = left + right;
}

Go to sum.go for the equivalent golang function.

Environment

  • OS: Linux ubuntu-bionic 4.15.0-88-generic x86_64
  • CPU Cores: 4
  • Memory: 2048M
  • Go: 1.14
  • GCC: 8.3.0

Bechmark

Now compute sum(0, 5000000) for 10 rounds and print the time spent per round in seconds.

$ make CC=gcc-8 benchmark_sum
The result is 12500002500000, ran 10 rounds, 0.415647 seconds per round.

$ make benchmark_sum_go
The result is 12500002500000, ran 10 rounds, 4.510704 seconds per round.

We will see that libcsp is about 11 times faster than go in this scenario.