Using Crank To Benchmark Libraries

Catcher Wong

Background

When we write a library for others to use, we often benchmark it to track its performance.

In .NET, we often use BenchmarkDotNet for this task. With a small amount of code, we can run benchmarks locally and obtain results.

Benchmarks are most useful when we modify the code, because we want to know whether our changes make it run faster and consume fewer resources.

Today, I will introduce an awesome tool, Crank, for benchmarking our libraries.

What is Crank?

Crank is the benchmarking infrastructure used by the .NET team to run benchmarks, including (but not limited to) scenarios from the TechEmpower Web Framework Benchmarks.

It was first introduced by @sebastienros at .NET Conf 2021.

https://learn.microsoft.com/en-us/events/dotnetconf-2021/benchmarking-aspnet-applications-with-net-crank

Crank has a client-server (C/S) architecture, mainly composed of a controller and one or more agents. The controller is the client, responsible for sending instructions; the agent is the server, responsible for running the benchmarks that the client sends.

The following is its architecture diagram.

[Figure: Crank architecture diagram]

As you can see, the controller and the agent interact through HTTP requests, and the agent can execute several different types of jobs.

Crank To Benchmark Libraries

This article focuses on the .NET project job shown in the diagram. Let's first look at a simple example from the Crank repository.

Getting Started

First, we need to install two Crank tools: the controller and the agent.

dotnet tool update Microsoft.Crank.Controller --version "0.2.0-*" --global

dotnet tool update Microsoft.Crank.Agent --version "0.2.0-*" --global

Then run the micro sample, which compares MD5 and SHA256.

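using System;
using System.Security.Cryptography;
using BenchmarkDotNet.Attributes;

public class Md5VsSha256
{
    [Params(100, 500)]
    public int N { get; set; }

    private byte[] data;
    private readonly SHA256 sha256 = SHA256.Create();
    private readonly MD5 md5 = MD5.Create();

    // BenchmarkDotNet assigns N after the constructor has run,
    // so the input buffer is allocated in a [GlobalSetup] method.
    [GlobalSetup]
    public void Setup()
    {
        data = new byte[N];
        new Random(42).NextBytes(data);
    }

    [Benchmark]
    public byte[] Sha256() => sha256.ComputeHash(data);

    [Benchmark]
    public byte[] Md5() => md5.ComputeHash(data);
}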

Note that the Main method needs to run the benchmarks through BenchmarkSwitcher, because Crank launches the benchmark from the command line and appends some arguments, which arrive in the args parameter.

public static void Main(string[] args)
{
    BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);
}

Here is the configuration file that the controller uses; it tells the agent how to run the benchmarks.

jobs:
  benchmarks:
    source:
      localFolder: .
      project: micro.csproj
    variables:
      filterArg: "*"
      jobArg: short
    arguments: --job {{jobArg}} --filter {{filterArg}} --memory
    options:
      # Using BenchmarkDotNet
      benchmarkDotNet: true

scenarios:
  Md5VsSha256:
    application:
      job: benchmarks

profiles:
  local:
    jobs: 
      application:
        # Endpoints of agent
        endpoints: 
          - http://localhost:5010

Next, start the agent with the following command.

crank-agent

Once it has started, you will see Agent ready, waiting for jobs...

[11:42:30 INF] Created temp directory 'C:\Users\catcherwong\AppData\Local\Temp\2\benchmarks-agent\benchmarks-server-8952\2mmqc00i.3b1'
[11:42:30 INF] Agent ready, waiting for jobs...

The default port is 5010; a different URL and port can be specified through -u|--url.
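As a minimal sketch of the option above, the helper below starts the agent on a chosen port. The function name start_agent and the CRANK_AGENT override are hypothetical, and the http://*:PORT value format is an assumption modeled on the agent's default; the endpoints entry of the profile would then need the same port (for example http://localhost:5011).

```shell
# Hypothetical helper: start the Crank agent on a chosen port.
# Set CRANK_AGENT="echo crank-agent" to preview the command without running it.
CRANK_AGENT="${CRANK_AGENT:-crank-agent}"

start_agent() {
  local port="${1:-5010}"   # falls back to the default port 5010
  # Assumption: --url accepts a value of the form http://*:PORT
  $CRANK_AGENT --url "http://*:${port}"
}
```

Calling start_agent 5011 would then launch the agent on port 5011.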

The next step is to send instructions to the agent through the controller.

crank --config C:\code\crank\samples\micro\micro.benchmarks.yml --scenario  Md5VsSha256 --profile local

The above command specifies the configuration file, the scenario, and the profile. Because a configuration file can contain multiple scenarios and profiles, each execution must name the one to use.

If multiple scenarios need to be executed, the controller must be invoked once per scenario.
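Running one command per scenario can be scripted. The sketch below is only an illustration: the function name run_scenarios, the CRANK override, and the second scenario name in the usage note are hypothetical, and it uses the controller's --json option so that each scenario gets its own result file.

```shell
# Placeholders; adjust them to your own configuration file and profile.
CONFIG="${CONFIG:-micro.benchmarks.yml}"
PROFILE="${PROFILE:-local}"
# Set CRANK="echo crank" to print the commands instead of running them.
CRANK="${CRANK:-crank}"

# Invoke the controller once per scenario name passed as an argument.
run_scenarios() {
  local scenario
  for scenario in "$@"; do
    # Each scenario writes its own result file, e.g. Md5VsSha256.json
    $CRANK --config "$CONFIG" --scenario "$scenario" --profile "$PROFILE" \
      --json "$scenario.json" || return 1
  done
}
```

For example, run_scenarios Md5VsSha256 AnotherScenario would run the two scenarios one after the other and stop at the first failure.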

After executing the command, we can see the log output in the agent:

[Screenshot: agent log after receiving the job]

After the agent receives a job request, it automatically installs the corresponding SDK and runtime versions, and then publishes the specified project.

[Screenshot: agent log while installing the SDK and publishing the project]

Then the agent runs the benchmark.

[Screenshot: agent log while running the benchmark]

After the run completes, the results are output, and finally the files created for this benchmark run are cleaned up.

Finally, the results can be seen on the controller side.

[Screenshot: benchmark results on the controller side]

The controller can also save the results to a JSON file for later comparison or charting.

To do so, add the --json option to the controller command: --json filename.json.

crank --config C:\code\crank\samples\micro\micro.benchmarks.yml --scenario  Md5VsSha256 --profile local --json base.json

Run the benchmark multiple times, in particular before and after a code change, and save each result in a different JSON file.

crank --config C:\code\crank\samples\micro\micro.benchmarks.yml --scenario  Md5VsSha256 --profile local --json head.json

Finally, by comparing these two results, we can see more clearly whether the code changes have brought about improvements.

crank compare base.json head.json

[Screenshot: output of crank compare]
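The before-and-after workflow can be wrapped in a small script. This is a sketch under assumptions: the function names bench_ref and compare_refs and the CRANK and GIT overrides are hypothetical, and it assumes the job's source is the local folder (as in the configuration above), so that checking out a git ref changes the code being measured.

```shell
# Placeholders; adjust them to your own setup.
CONFIG="${CONFIG:-micro.benchmarks.yml}"
SCENARIO="${SCENARIO:-Md5VsSha256}"
PROFILE="${PROFILE:-local}"
# Set CRANK="echo crank" and GIT="echo git" to preview the commands.
CRANK="${CRANK:-crank}"
GIT="${GIT:-git}"

# Check out one git ref, benchmark it, and store the result in a JSON file.
bench_ref() {
  local ref="$1"
  local out="$2"
  $GIT checkout "$ref" || return 1
  $CRANK --config "$CONFIG" --scenario "$SCENARIO" --profile "$PROFILE" --json "$out"
}

# Benchmark the base ref and the head ref, then compare the two result files.
compare_refs() {
  bench_ref "$1" base.json || return 1
  bench_ref "$2" head.json || return 1
  $CRANK compare base.json head.json
}
```

For example, compare_refs main my-feature-branch (a hypothetical branch name) would produce base.json and head.json and then print the comparison.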

Everything above was executed locally. How do we configure Crank to run the benchmarks on different machines?

We need to add the address of the agent running on each machine to the profiles node in the configuration file.

Here is a simple example:

profiles:
  local:
    jobs: 
      application:
        endpoints: 
          - http://localhost:5010
  remote-win:
    jobs: 
      application:
        endpoints: 
          - http://192.168.1.100:9090
  remote-lin:
    jobs: 
      application:
        endpoints: 
          - http://192.168.1.102:9090

At this point, if you specify --profile remote-win, the benchmark will be executed on the server 192.168.1.100, and if you specify --profile remote-lin, it will be executed on the server 192.168.1.102.

This makes it easy to perform benchmark testing on different machines.

Another helpful feature of Crank is its ability to benchmark the code before and after a pull request, which is very useful for open-source projects that need benchmarking.

In the next section, we will look at how to use Crank with GitHub Actions to run benchmarks on GitHub.

Pull Request

Crank provides a tool named Pull Request Bot that helps us benchmark pull requests quickly.

We need to install it first.

dotnet tool update Microsoft.Crank.PullRequestBot --version "0.2.0-*" --global

After installation, the crank-pr command becomes available; we will use it below.

[Screenshot: crank-pr configuration options]

crank-pr provides the configuration options shown in the figure above.

Here is a sample invocation:

crank-pr \
  --benchmarks lib-dosomething \
  --components lib \
  --config ./benchmark/pr-benchmark.yml \
  --profiles local \
  --pull-request 1 \
  --repository "https://github.com/catcherwong/library_with_crank" \
  --access-token "${{ secrets.GITHUB_TOKEN }}" \
  --publish-results true

What does the above command do?

It targets pull request #1 of catcherwong/library_with_crank and runs two benchmarks: one against the code of the main branch and one against the code as it would be after the pull request is merged. What gets tested is determined jointly by benchmarks, components, and profiles, and the comparison of the two runs is posted as a comment on the pull request.

NOTE: catcherwong/library_with_crank is a repository that I created in advance. It contains some simple code, the Crank-related configuration, and a GitHub workflow.

Let's take a look at pr-benchmark.yml:

components:
    lib: 
        script: |
            echo lib
        arguments:
            # crank arguments
            "--application.selfContained false"

# default arguments that are always used on crank commands
defaults: ""

profiles:
    local:
      description: Local
      arguments: --profile local
    remote-win:
      description: windows
      arguments: --profile remote-win 
    remote-lin:
      description: linux
      arguments: --profile remote-lin 

benchmarks:
    lib-dosomething:
      description: DoSomething
      arguments: --config ./benchmark/library.benchmark.yml --scenario dosomething

    lib-getsomething:
      description: GetSomething
      arguments: --config ./benchmark/library.benchmark.yml --scenario getsomething

    lib-another:
      description: Another
      arguments: --config ./benchmark/library.benchmark.yml --scenario another

In effect, this file splits the arguments of the crank command across separate configuration options.

The benchmarks option specifies the crank configuration file and scenario, while the profiles option specifies the crank profile.

Let's take a look at the result in practice.

[Screenshot: benchmark comparison posted as a comment on the pull request]

The full execution log is available at the following link:

https://github.com/catcherwong/library_with_crank/actions/runs/4598397510/jobs/8122376959

If you have your own cloud servers, you can run the benchmarks on them instead of on the resources that GitHub Actions provides.

The advantage of your own servers is that they are relatively stable, and you can pick servers with different configurations to match the scenario. That said, using the resources of GitHub Actions works fine too.

The following screenshot shows a run where the job was submitted to an external cloud server.

[Screenshot: benchmark job executed on an external cloud server]

Summary

Crank is a great tool. It can be combined with BenchmarkDotNet to benchmark class libraries, and with load testing tools such as wrk/wrk2/Bombardier/h2load to benchmark API/gRPC frameworks and applications.

This article covers only a small part of Crank, and there is still much more to explore.

I hope this will help you!
