GSOC 2017 - Long Running Ruby and Rails Benchmarks

Good day everyone!

I am a student interested in improving RubyBench during GSoC 2017. Before I submit an application, I would like to get a clear understanding of the current state of the RubyBench project.

I read a portion of the discussion with @shahsaurabh0605 regarding the same project last year. It would be helpful if any of you could summarize what was achieved last summer, because the top-level goals are the same as last year’s.

I tried to find the project in the GSoC 2016 archive, but with no success.

Cheers :slight_smile:

Hello bmarkons,

So sorry for the late answer here! I’m not sure whether you’ll have enough time to send a proposal, but I’m answering your question anyway.

Last year, it looks like shahsaurabh0605 implemented Sequel benchmarks and also began working on detecting regressions here: Detecting benchmark regressions by shahsaurabh0605 · Pull Request #174 · ruby-bench/ruby-bench-web · GitHub.

If you have any ideas to improve Ruby Bench, feel free to put them in your proposal. At the very least, it looks like the pull request referenced above needs some work. Also, it looks like there is a lack of support for other Ruby implementations.

Maybe @sam and @system could chime in; even though you are not mentoring this year, it would be awesome to have your points of view here. :blush:

Have a nice day, and sorry again for the late answer! Thanks!

To me this is still the #1, #2, and #3 issue I want sorted: our benchmarks are not honest enough, and there is no way to tell how much performance is being left on the table.

Hey, Marko! Sure, I’m fine with having discussions here rather than in email :slight_smile:

Awesome! :tada:

This way anyone can drop some thoughts on the subjects we will discuss. I read a portion of last summer’s GSoC topic and it helped me get a feel for the project flow, so that’s where the idea of having the conversation here comes from.

Looking forward to asking my first questions :sunglasses:

Hi there!

Congratulations Marko !

Try to get as comfortable as possible with the Ruby Bench code base before the coding period starts, and obviously, feel free to ask us questions! :slight_smile:

Also posting to mention that if you want, we can open a Basecamp project to have a Campfire, a place for centralizing documents, to-dos, etc. That’s up to you, Noah and Marko; I don’t mind on my side, and I think Jon doesn’t either.

Thanks Robin :slight_smile:

:+1: for the Basecamp project.

Okay, you should’ve received an invitation to the Basecamp project!

I changed the README a bit in the RubyBench guideline repo, since it was quite poor :worried:

Please take a look: Make README more descriptive by bmarkons · Pull Request #23 · ruby-bench/ruby-bench · GitHub

By the way, I managed to run the Ruby benchmarks locally in a Docker container today :tada:

What are prepared statements? I see results with prepared statements and without them.

Prepared statements are SQL statements that have placeholders in them to inject data. For example, this is a prepared statement:

SELECT * FROM articles WHERE id = $1;

Here, $1 is a placeholder for a given value (it will be replaced with 10, for instance). You may also see the ? syntax for placeholders. Prepared statements are mainly used in Ruby on Rails to cache SQL queries, as you often run the same queries with different values. They are also a good practice to avoid SQL injections, as each value will be properly escaped (if you are dealing with user input, for example).
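To make the placeholder idea concrete, here is a toy Ruby sketch of parameter binding. Note this is purely illustrative: with real prepared statements the SQL template and the values are sent to the database separately and the server does the binding, and `bind_params`/`quote` are made-up names, not part of any driver’s API:

```ruby
# Toy illustration only: shows what "fill in $1 with a safely quoted value"
# means. Real drivers never splice values into the SQL string like this.
def quote(value)
  case value
  when Integer then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"  # double single quotes to escape them
  end
end

def bind_params(sql, params)
  # Replace each $N placeholder with the quoted N-th parameter (1-based)
  sql.gsub(/\$(\d+)/) { quote(params[Regexp.last_match(1).to_i - 1]) }
end

puts bind_params("SELECT * FROM articles WHERE id = $1", [10])
# => SELECT * FROM articles WHERE id = 10
puts bind_params("SELECT * FROM users WHERE name = $1", ["O'Brien"])
# => SELECT * FROM users WHERE name = 'O''Brien'
```

The second call shows why escaping matters: the stray quote in the input cannot terminate the SQL string, which is what blocks injection.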

If you have a Rails application and look at the server logs, you’ll see prepared SQL statements.

Hope it’s clear enough; feel free to tell me if I’m not crystal clear! :smiley:

Thanks @robin850 :slight_smile:

So performance-wise, prepared statements are expected to run faster because of the caching behind them, right?

Yes, exactly, because Active Record won’t have to rebuild the query; it just needs to inject the values. :slight_smile:
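As a toy sketch of that caching idea (the class and method names here are made up for illustration, not Active Record’s actual API, which is considerably more involved): the expensive query-building step runs once per query shape, and later calls just reuse the cached SQL and supply new values.

```ruby
# Hypothetical sketch of a statement cache: the SQL template is built once
# per query shape and reused; only the bound values change between calls.
class ToyStatementCache
  def initialize
    @cache = {}
    @builds = 0
  end

  attr_reader :builds

  def sql_for(shape)
    @cache[shape] ||= begin
      @builds += 1  # the expensive query construction happens here, once
      yield
    end
  end
end

cache = ToyStatementCache.new
3.times do
  cache.sql_for(:find_article_by_id) { "SELECT * FROM articles WHERE id = $1" }
end
puts cache.builds  # the template was built only once across three calls
# => 1
```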

I see we currently have benchmarks for Rails, Ruby, Bundler, Discourse, and Sequel in our suite. Is this suite supposed to support only certain gems, or as many as possible?

In case we are trying to support as many as possible, I was wondering whether duplicating benchmarks into our suite (since benchmarks for Ruby and Discourse already exist in their official repos) is a good approach.

The idea is that benchmarks stay in the official repos instead of being copied into ruby-bench-suite. An example workflow: if you are a gem developer and you want your benchmarks to be executed, you would just submit a pull request with some configuration, and after RubyBench approves and merges it, the benchmarks you wrote as a gem developer would be executed like any others.

Do you think it is doable? Please tell me if I’m missing something. :grimacing:
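Purely as a hypothetical illustration of that workflow (none of these keys are an actual ruby-bench-suite format), the per-gem configuration submitted in such a pull request might look something like:

```yaml
# Hypothetical example only; not a real RubyBench configuration format.
benchmarks:
  - name: my_gem
    repo: https://github.com/example/my_gem   # gem's own repo, where the benchmarks live
    script: benchmark/driver.rb               # entry point inside that repo
    rubies:
      - 2.4.1
```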

Maybe @sam and @noahgibbs can answer you better than I would on this subject, but I guess the goal is to support only major projects. The real problem isn’t the benchmarks but running them somewhere; if Ruby Bench tries to support more and more projects, there may be scaling issues because the resources are a bit limited.

Sort of. In the real world™, any large install of Postgres will use pgbouncer in transaction pooling mode, which means that prepared statements simply are not an option. PG starts performing really inconsistently with tons of connections, and Rails loves creating connections.

Yes, I would like to support only major projects here, and for the ORM tests, Sequel, AR, and raw are a good enough trio. No need to add more for now.

@bmarkons I am curious: did you write any of the benchmarks in raw PG and Sequel yet (even one is a good start)? How does performance compare locally?

Let’s try to stay laser focused on getting a great answer to the question above.

I’m new to Ruby Bench too. But the way most benchmarking works, you don’t want as much as possible; you want to focus on the things you consider important. Having a huge amount of stuff that doesn’t matter tends to clutter up your understanding.

There’s a great talk by Matt Gaudet on how to benchmark Ruby 3 that may help: [EN] Ruby3x3: How are we going to measure 3x? / Matthew Gaudet

What he’s getting at there is that your benchmarks should be broad enough to measure all the stuff you care about, but specific enough not to be too confusing. If we measured hundreds of different gems but they were all a kind of mix of Matt’s eight ideas, we wouldn’t be gaining anything new.

5 posts were split to a new topic: Rewriting the scope_all benchmark in Sequel and Raw

Let’s try to stay laser focused on getting a great answer to the question above.

@sam Maybe I phrased it wrong; I was trying to understand the RubyBench project’s direction rather than act on it right now :slight_smile: Now I know that the plan is to support only major projects in the near future :ok_hand:

there may be scaling issues because the resources are a bit limited

@robin850 yeah, it seems like scaling would be an issue :cold_sweat:

@noahgibbs thanks for that great talk. I guess Matt was talking about measuring a certain number of different gems for the purpose of measuring Ruby 3 performance. I had in mind RubyBench as a platform where gem developers get feedback on the performance of the gems they are developing.

Thank you guys for the explanation :slight_smile:

I don’t know of any current plans to use Ruby Bench for that. But we’d definitely like it as a resource for the Ruby and Rails core teams to see regressions quickly and track them down accurately. Presumably also one or two other major projects like Discourse :wink: