Make the comparison chart simpler


#1

I find the current comparison chart a bit confusing. I don’t think we need to display all the results from the benchmark we are comparing against. Maybe only the last commit result from that benchmark is enough on the graph. It could simply be plotted as a single line, and it would be much clearer what is being compared.

You can get a feel for the differences between benchmarks, but it is not that clear which commit results are being compared.

I propose something like this:

This is much simpler in every way.


#2

I think there are a few things we need to do first, before we even consider this:

At a minimum, display the relevant comparison at the top.

  • “With prepared statements” vs “without” is just a wash and is only causing confusion. My first question here is: do we have any bench where the difference is significant? If not, just suppress one of them for now, or add a checkbox at the bottom of the graph to select prepared vs not.

  • This looks confusing:

When we are missing a commit, just render a straight dotted line after the last result.
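
One way to prepare the data for that dotted line is to carry the last known value forward and flag the padded points. This is only a sketch: `pad_missing`, the point hash keys, and the sample numbers are hypothetical and not taken from the RubyBench code.

```ruby
# Carry the last known result forward for commits that have no result yet.
# commit_shas: commits on the x axis, oldest first (hypothetical input shape).
# results:     hash of sha => ips for one benchmark (hypothetical input shape).
def pad_missing(commit_shas, results)
  last_value = nil
  commit_shas.map do |sha|
    if results.key?(sha)
      last_value = results[sha]
      { sha: sha, value: last_value, dotted: false }
    else
      # No result for this commit: repeat the previous value and mark it
      # so the chart can draw this segment as a dotted line.
      { sha: sha, value: last_value, dotted: true }
    end
  end
end

points = pad_missing(%w[a b c d], { "a" => 150.0, "b" => 152.0 })
points.each { |point| p point }
```

Commits before the first real result keep a `nil` value, so the chart can simply skip them.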

I really want to keep a “proper” comparison looking at trends in both benchmarks over time, but we do need to clean up some usability here.


#3

Also, maybe a simple comparison UI to see all the comparison tests available?

For example, the same comparison, say one of the heavy ActiveRecord ones, could run on various commits of Ruby. Then you can tell how much Ruby can help. What do you think, @noahgibbs?


#4

Certainly being able to compare across Ruby versions could be nice. Especially with some of the JIT stuff we’re looking at for near-future Ruby versions, there could be all kinds of effects on ActiveRecord… Even a lot of the GC work might be quite relevant.

The usability stuff you mention (restricting comparisons, etc.) sounds good in concept. I’m not sure if there’s a simple way to group the tests for comparison, though.


#5

Grouping comparable benchmarks is something that definitely needs to be done. I want to make it possible through the admin UI, since I think that’s the most convenient way to do it. You would then have comparison candidates from the same group in a dropdown.

I haven’t found any benchmark whose results differ significantly with or without prepared statements. So I agree that displaying both just adds noise. I will take care of that.

Oh yeah, that strange issue occurs when a benchmark doesn’t have enough results to display. I will make it display only as many of the latest results as both benchmarks have.
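
A minimal sketch of that truncation, assuming each benchmark’s results are available as an oldest-first array of ips values. The `align_latest` name and the sample numbers are made up for illustration.

```ruby
# Keep only as many of the latest results as both series have,
# so the two lines on the chart cover the same number of points.
def align_latest(series_a, series_b)
  count = [series_a.length, series_b.length].min
  [series_a.last(count), series_b.last(count)]
end

# Hypothetical data: five ActiveRecord results, three Sequel results.
ar     = [150.0, 151.2, 149.8, 152.4, 153.1]
sequel = [196.5, 198.0, 197.2]

ar_aligned, sequel_aligned = align_latest(ar, sequel)
p ar_aligned
p sequel_aligned
```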

I really want to keep a “proper” comparison looking at trends in both benchmarks over time

I’ve assumed that when we want to compare benchmark A to benchmark B, we are only interested in looking at benchmark A over time.


#6

Nahh I want to look at multiple trends.

At the moment I can select up to 2000 commits. What I would like is to be able to select a time frame there as well (1 year, 2 years, 3 years, 5 years); then we can pick, say, 2000 commits in the N-year period and graph them.
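
Roughly, that selection could look like the sketch below: filter commits to the time window, then take evenly spaced ones if there are more than the limit. The `Commit` struct, `sample_commits`, and the even-spacing strategy are assumptions for illustration; the site may pick commits differently.

```ruby
# Hypothetical commit record; the real model will have more fields.
Commit = Struct.new(:sha, :committed_at)

# Pick at most `limit` commits (limit >= 1) from the last `years` years,
# spread evenly across the window so the trend is still visible.
def sample_commits(commits, years:, limit:, now: Time.now)
  cutoff = now - years * 365 * 24 * 60 * 60
  in_window = commits.select { |c| c.committed_at >= cutoff }
                     .sort_by(&:committed_at)
  return in_window if in_window.length <= limit
  return [in_window.last] if limit == 1

  # Evenly spaced indices from the oldest to the newest commit in the window.
  step = (in_window.length - 1).to_f / (limit - 1)
  Array.new(limit) { |i| in_window[(i * step).round] }
end

# Hypothetical history: one commit every 100 days, newest first.
now = Time.utc(2017, 7, 1)
history = (0...10).map { |i| Commit.new("sha#{i}", now - i * 100 * 86_400) }

one_year = sample_commits(history, years: 1, limit: 3, now: now)
puts one_year.map(&:sha).inspect
```

Passing `now:` explicitly keeps the example reproducible; in the app it would default to the current time.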

At that point we can compare trends. And show for example when AR worked on perf, compared to when Sequel did.

Additionally, we will easily be able to tell when time is invested in “catching up” on performance debt.


#7

Any news here? I really want something I can tweet out at people :blush:


#8

Have a look @sam :slight_smile:

https://rubybench.org/rails/rails/commits?result_type=activerecord/postgres_scope_all&compare_with=sequel/postgres_scope_all

Right now I am working on grouping benchmarks so that only appropriate benchmarks appear in the select box for comparison.


#9

Nice!

Are these benches correct? Is AR really that close to Sequel for this benchmark?

https://rubybench.org/rails/rails/commits?result_type=active_record/postgres_discourse&display_count=500&compare_with=sequel/postgres_discourse

Seems a bit odd to me because we were seeing huge discrepancies in other benches.

cc @jeremyevans


#10

I didn’t look in detail, but it may just be that the query is more complex than in other benches, so most of the time is spent in the database and not in Ruby. Sequel does allocate less than a third of the objects that AR does, and is 10% faster, but if the database is the bottleneck, then the performance difference between Sequel and AR is likely to be small.

Thanks,
Jeremy


#11

I’ve come across an odd issue while trying to reproduce these results locally.

When I ran both benchmarks directly on my machine (not in Docker), Sequel came out about 2x faster than ActiveRecord:

  • sequel/postgres_discourse - 312 ips
  • activerecord/postgres_discourse - 162 ips

Running the same benchmarks the standard way (inside Docker) gave me a smaller difference:

  • sequel/postgres - 198 ips
  • activerecord/postgres - 153 ips

I am super confused by this. Maybe we are missing something in the setup. I have verified the benchmarks’ correctness (same SQL query and same string generated) by running them directly on my local machine.


#12

Very interesting,

How is the DB set up? You have to make sure you run a volume for the DB on the host, otherwise you pay a huge price on filesystem access. Also be sure to disable fsync and other stuff … see:

for some of our speed hacks.


#13

In our current setup the DB runs in a separate linked container.

So I need to make these DB containers use volumes on the host?


#14

I think by default it runs a volume on the host.

Maybe do a “raw” benchmark as well; then you can compare all 3, which will give you a great picture of what is going on.


#15

Do we want the same setup for the pg gem (and mysql2) as for AR and Sequel, tracking over time?

https://github.com/ruby-bench/ruby-bench-suite/pull/76#issuecomment-314076173


#16

I guess this is the correct thing to do, but keep in mind I have seen very little perf movement in these gems; they are pretty much as fast as they are going to be.

You’ve got to get something in ASAP, though, to fully debug this.


#17

Sounds reasonable.

I will work towards having a raw benchmark running in the Docker setup so we can debug this. Later we can set up an automatic run on repo push if we want.


#18

I made a setup change: I’ve decided to try docker-compose.

https://github.com/ruby-bench/ruby-bench-docker/pull/40

Results I got with this setup:

  • Raw : 230 ips
  • Sequel : 190 ips
  • ActiveRecord : 150 ips

It seems the overhead with AR is 2x bigger than with Sequel (80 ips lost relative to raw, versus 40).
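
For reference, the overhead can be measured two ways from these three numbers: as ips lost relative to raw, or as extra time per iteration. A quick sketch using only the figures above (the `overhead` helper is hypothetical):

```ruby
RAW_IPS = 230.0
RESULTS = { "Sequel" => 190.0, "ActiveRecord" => 150.0 }.freeze

# Overhead of a library relative to the raw benchmark, expressed both as
# iterations per second lost and as extra milliseconds per iteration.
def overhead(raw_ips, ips)
  {
    ips_lost: raw_ips - ips,
    ms_added: (1000.0 / ips) - (1000.0 / raw_ips)
  }
end

by_lib = RESULTS.transform_values { |ips| overhead(RAW_IPS, ips) }

by_lib.each do |name, o|
  puts format("%-12s %5.1f ips lost, %.3f ms added per iteration",
              name, o[:ips_lost], o[:ms_added])
end

ips_ratio  = by_lib["ActiveRecord"][:ips_lost] / by_lib["Sequel"][:ips_lost]
time_ratio = by_lib["ActiveRecord"][:ms_added] / by_lib["Sequel"][:ms_added]
puts format("AR vs Sequel overhead: %.2fx by ips, %.2fx by time",
            ips_ratio, time_ratio)
```

By ips lost the ratio is exactly 2x; by time added per iteration it is closer to 2.5x, because ips is not linear in time.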