Ability to easily download and run benchmarks locally

Hi everyone :wave:

My name is Osama and I’m currently working on rubybench.org to bring it up-to-date and make some improvements.

One feature I’d like to implement is to make it easy to download and run benchmarks locally. This would come in handy, for example, when someone wants to debug or fix a performance regression: they can quickly grab the script and run it.

I can see 2 ways to make this work:

The first way is to create a gem that can be invoked from the command line to install the dependencies required to run the benchmarks. Once the dependencies are installed, you give the gem a URL (which you could copy from rubybench.org) or a path to a local copy of the script you want to benchmark, and it benchmarks the script for you. The script is benchmarked against a local clone of the repository you’re fixing (e.g. rails, ruby etc.). The command would look something like this:

rubybench-runner rails $SCRIPT_URL --path ~/rails

(suggestions for a better name are welcome)

Once the benchmarks are complete the gem would print the results to the terminal in a nice format and maybe it’ll have an option to save the results to a file.
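To make the proposed interface a bit more concrete, here is a minimal sketch of how the gem might parse that command line using Ruby's standard OptionParser. Everything in it is an assumption for illustration: the method name `parse_runner_args`, the `--output` flag (standing in for the "save the results to a file" idea), and the exact argument order are hypothetical and not part of any existing gem.

```ruby
require "optparse"

# Hypothetical sketch of argument parsing for the proposed command:
#   rubybench-runner rails $SCRIPT_URL --path ~/rails
# The --output flag is an assumed name for the "save results to a file" option.
def parse_runner_args(argv)
  options = { path: nil, output: nil }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: rubybench-runner REPO SCRIPT_URL_OR_PATH [options]"

    opts.on("--path PATH", "Local clone of the repository to benchmark") do |value|
      options[:path] = File.expand_path(value)
    end

    opts.on("--output FILE", "Save results to FILE (assumed option)") do |value|
      options[:output] = value
    end
  end

  # parse (without !) leaves argv untouched and returns the positional arguments.
  repo, script = parser.parse(argv)
  raise ArgumentError, parser.banner if repo.nil? || script.nil?

  options.merge(repo: repo, script: script)
end
```

Called with `%w[rails https://example.com/bm.rb --path /tmp/rails]`, this returns a hash holding the repo name, the script location, and the expanded path to the local clone.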

The other way is to create a runner script in the ruby-bench/ruby-bench-suite repository (the benchmark suite for RubyBench.org) that takes care of installing dependencies and benchmarking scripts, similar to how the gem would work. The user would then need to clone the repo (which contains all the scripts we have for rubybench.org), launch the runner script inside the repo, and pass it the name of the benchmark script they wish to run.

The downside is that the ruby-bench/ruby-bench-suite repo contains many files for ruby, rails, PG and bundler benchmarks. If you’re working on rails you probably don’t care about the ruby files and vice versa. Also it contains files that are only used by rubybench.org server to publish results on the website.

I’m in favor of the gem approach because I think it’s cleaner, easier to use and more user-friendly than calling a script inside a cloned repo.

Some benchmarks depend on having a postgres (or mysql etc.) server running. The problem here is that installing dependencies such as postgres differs between operating systems, and even between Linux distros. Is it safe to assume the user already has such fundamental dependencies installed?

I think it makes sense to have a page on rubybench.org that explains how to run the benchmarks locally and lists what dependencies are required.

Does this sound good to you? Any concerns?

cc @sam / @system

The gem approach sounds amazing. As far as, say, postgres, would it make sense to do it with a Docker container, the same way Ruby-Bench runs on the web site? That would not only take care of installation and avoid creating databases on local Postgres, it would also guarantee the same version was used.

I think it is reasonably safe to go with the gem approach and trust that dependencies are good. I guess on boot it can test the pre-reqs and error out if something is wildly off. That said, I would not really lock us to a particular version of postgres; instead, just say it must be newer than, say, 9.6 or something like that. Then if there is a mismatch, simply explain that in the runner:

The bench on Ruby bench ran using PG version 10.5, you are running 9.6

Something like that.
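A rough sketch of such a pre-req check, using `Gem::Version` from RubyGems to compare versions. The helper name `check_pg_version`, the return shape, and the 9.6 minimum are illustrative assumptions taken from the suggestion above, not an existing API.

```ruby
require "rubygems"

# Assumed minimum, taken from the "newer than say 9.6" suggestion.
MINIMUM_PG_VERSION = Gem::Version.new("9.6")

# Hypothetical helper: compares the locally installed PostgreSQL version with
# the one rubybench.org used. Returns [status, message], where status is
# :error (too old), :warning (differs from the reference) or :ok.
def check_pg_version(local, reference)
  local_version = Gem::Version.new(local)
  reference_version = Gem::Version.new(reference)

  if local_version < MINIMUM_PG_VERSION
    [:error, "PostgreSQL #{MINIMUM_PG_VERSION} or newer is required, you are running #{local}"]
  elsif local_version != reference_version
    [:warning, "The bench on Ruby bench ran using PG version #{reference}, you are running #{local}"]
  else
    [:ok, nil]
  end
end
```

With this, `check_pg_version("9.6", "10.5")` yields a `:warning` carrying the message shown above, while a version below the minimum yields an `:error` the runner could abort on.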

Thanks Noah. While using docker would certainly solve this particular problem, it has other downsides that can be a barrier to entry, such as having to install docker itself and download/build docker images, which can take a while. I think the vast majority of people who will use this feature will have the necessary dependencies (e.g. postgres) already installed on their systems.

Fair enough. But then you’ll need to duplicate any configuration that’s currently done via Docker and keep it up to date. If you’re fine with that, then sounds good :slight_smile:

Noah you may want to have a play with this, just head to any bench on the site and follow the breadcrumbs :slight_smile:

@osama some feedback on:

rubybench_runner run rails/bm_activerecord_scope_all_over_select.rb


Cannot find rails at /home/sam/rails.Perhaps try:

Missing a newline.


I want to hack on this script so it would be nice to have:

run <repo_name>/<script_name> OR localscript so I can run locally, that way I can measure memory and so on.


What is this thing doing to my DB? What database is it going to create? It should be very specific, with verbose output about all the DB fiddling it is doing (especially any createdb statements it runs).

EG:

rubybench_runner run rails/bm_activerecord_scope_all_over_select.rb --rails=`pwd` --db postgres
Using the 'postgres' gem...
Checking dependencies...
Warning: rubybench.org is currently running version 9.6 of PostgreSQL, you're running version .
Installing gems...
Checking database... (using rubybench_local) <-- something like that.
Downloading script to /tmp/rubybench/script.name <-- something like that. 
Running benchmarks...
Rails version 6.1.0.alpha
Results (1 runs):

- run 1:
    iterations_per_second: 91.71
    iterations_per_second_standard_deviation: 3.27
    total_allocated_objects_per_iteration: 25081

Also, perhaps we can do:

rubybench_runner run rails/bm_activerecord_scope_all_over_select.rb --create-standalone-script > run_bench.rb

That will help us remove 100% of the magic here if we wish.

Making it easier to run benchmarks sounds great :slight_smile:
Here are my thoughts. I’m unsure if a gem or the repository + some scripts/harness is better. The gem is probably a bit harder to tweak if needed.

The downside is that the ruby-bench/ruby-bench-suite repo contains many files for ruby, rails, PG and bundler benchmarks.

I think this doesn’t matter much, the repository is only 10MB.

What would matter for me and I think for other Ruby implementers and probably more people is:

  • Easy to run the benchmark: automated setup of dependencies or checks for them is nice.
  • Give me a full command line for running the benchmark (the setup can be done via some scripts, that’s fine), so that it can easily be tweaked (e.g., by passing extra options to the Ruby interpreter). Specifically, I don’t want to execute any other Ruby process to run it, so e.g., I can profile/debug/instrument/etc just the process that matters easily.
  • STDOUT/STDERR should be the defaults, or clearly documented where they are redirected. That’s where my debugging/profiling output ends up often.
  • Ability to run without Docker, it’s just so much more convenient if e.g., I recompile my Ruby interpreter, change files here and there, etc.
  • Ability to run a fixed workload (e.g., 10000 iterations of the benchmark), this makes it easier to compare profiles.

And some orthogonal concern:

  • Each benchmark run should be verified to have produced the correct result. If I do an incorrect optimization, it should tell me it broke the benchmark, not that it got faster.
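
The fixed-workload and verification points can be combined in a small in-process harness. The following is only a sketch under stated assumptions: `run_fixed_workload` and its keyword arguments are hypothetical names, and the workload is assumed to return a value that can be compared against a known-good result with `==`.

```ruby
require "benchmark"

# Hypothetical harness: runs the given block a fixed number of times in the
# current process (no child Ruby process) and checks that it still returns
# the expected result, so a broken optimization is reported as a failure
# rather than as a speed-up.
def run_fixed_workload(iterations:, expected:, &workload)
  # Verify once before timing so an obviously broken benchmark fails fast.
  first = workload.call
  unless first == expected
    raise "benchmark broken: expected #{expected.inspect}, got #{first.inspect}"
  end

  result = nil
  elapsed = Benchmark.realtime do
    iterations.times { result = workload.call }
  end

  # Verify the last timed result as well, outside the timed region.
  unless result == expected
    raise "benchmark broken: expected #{expected.inspect}, got #{result.inspect}"
  end

  { iterations: iterations, seconds: elapsed }
end

# Example: a fixed workload of 10000 iterations with a known-good result.
stats = run_fixed_workload(iterations: 10_000, expected: 4950) { (0...100).sum }
```

Keeping the checks outside the timed block means verification does not distort the measurement, and because everything runs in one process it can be profiled or instrumented directly.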

Any chance you can try out what we got? It is all working at least for me (minus some refinements)

Head to: https://rubybench.org/rails/rails/commits?result_type=active_record/postgres_discourse&display_count=2000 and follow the instructions? @noahgibbs any chance you can try that as well?