Make comparison chart more simple

bmarkons · July 12, 2017, 11:17am

I find the current comparison chart a bit confusing. I don’t think we need to display all the results from the benchmark we are comparing against. Maybe only the last commit result from that benchmark is enough on the graph. It could simply be plotted as a single line on the graph, and it would be much clearer what is being compared.

You can get a feel for the differences between benchmarks, but it is not that clear which commit results are being compared.

I propose something like this:

This is much simpler in every way.

sam · July 12, 2017, 9:56pm

I think there are a few things we need to do first, before we even consider this:

Restrict “comparison targets”: why am I allowed to compare s to s?


At a minimum display the relevant comparison at the top.

“With prepared statements” vs “without” is just a wash and is only causing confusion. My first question here is: do we have any bench where the difference is significant? If not, just suppress one of them for now, or add a checkbox at the bottom of the graph to select prepared vs not.
This looks confusing:

When we are missing a commit, just render a straight dotted line after the last result.
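The dotted-line idea above could be prepared on the data side before the chart is drawn. This is only a minimal sketch: `Point`, `pad_with_last_result` and the sample values are hypothetical and not taken from the RubyBench code. It also carries the previous value across gaps in the middle of a series, which goes slightly beyond what the post asks for.

```ruby
# Hypothetical helper (not RubyBench code): pad a benchmark series so it spans
# every commit on the x-axis. Points without a real result repeat the previous
# value and are flagged so the chart layer can draw that segment dotted.
Point = Struct.new(:commit, :value, :extrapolated)

def pad_with_last_result(commits, results)
  last_value = nil
  commits.map do |commit|
    if results.key?(commit)
      last_value = results[commit]
      Point.new(commit, last_value, false)
    else
      # No result for this commit: carry the previous value forward
      # (nil if nothing has been measured yet).
      Point.new(commit, last_value, true)
    end
  end
end

commits = %w[a1 b2 c3 d4]
results = { "a1" => 150.0, "b2" => 153.0 } # made-up ips values
pad_with_last_result(commits, results).each do |pt|
  puts "#{pt.commit}: #{pt.value}#{' (dotted)' if pt.extrapolated}"
end
```

The `extrapolated` flag keeps measured and padded points apart, so the chart never presents a carried-forward value as a real result.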

I really want to keep a “proper” comparison looking at trends in both benchmarks over time, but we do need to clean up some usability here.

sam · July 13, 2017, 2:10am

Also, maybe a simple comparison UI to see all the comparison tests available?

For example, the same comparison, one of the heavy Active Record ones, could run on various commits of Ruby. Then you can tell how much Ruby can help. What do you think, @noahgibbs?

noahgibbs · July 13, 2017, 4:08am

Certainly being able to compare across Ruby versions could be nice. Especially with some of the JIT stuff we’re looking at for near-future Ruby versions, there could be all kinds of effects on ActiveRecord… Even a lot of the GC work might be quite relevant.

The usability stuff you mention (restricting comparisons, etc.) sounds good in concept. I’m not sure if there’s a simple way to group the tests for comparison, though.

bmarkons · July 13, 2017, 8:05am

Grouping comparable benchmarks is something that definitely needs to be done. I want to make it possible through the admin UI, since I think that is the most convenient way to do it. You would then have the comparison candidates from the same group in the dropdown.
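As a rough illustration of the grouping idea, the dropdown could be filled from a group table like the one below. The `GROUPS` hash and `comparison_candidates` are hypothetical names for this sketch; in the proposal the groups would be managed through the admin UI rather than hard-coded. The benchmark names are the ones that appear in the URLs and results posted in this thread.

```ruby
# Hypothetical group table; in the proposal this would be edited in the admin UI.
GROUPS = {
  "postgres_scope_all" => [
    "activerecord/postgres_scope_all",
    "sequel/postgres_scope_all"
  ],
  "postgres_discourse" => [
    "activerecord/postgres_discourse",
    "sequel/postgres_discourse"
  ]
}.freeze

# Returns the benchmarks that may be compared with `benchmark`:
# the other members of its group, never the benchmark itself.
def comparison_candidates(benchmark, groups = GROUPS)
  group = groups.values.find { |members| members.include?(benchmark) }
  return [] unless group

  group - [benchmark]
end

p comparison_candidates("activerecord/postgres_scope_all")
```

Excluding the selected benchmark from its own candidate list also rules out comparing a benchmark with itself.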

I haven’t found any benchmark whose results differ significantly with or without prepared statements. So I agree that displaying both produces a bit of noise. I will handle that.

Oh yeah, that strange issue occurs when a benchmark doesn’t have enough results to display. I will make it display the last N results, where N is the smaller of the two benchmarks’ result counts.
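A minimal sketch of that fix, assuming each benchmark's results are available as an array ordered from oldest to newest. `align_last_results` is a hypothetical name, and `display_count` mirrors the URL parameter of the same name used on the site.

```ruby
# Trim both series to the same number of most recent results so the two
# lines on the chart always cover the same number of points.
def align_last_results(series_a, series_b, display_count)
  n = [series_a.length, series_b.length, display_count].min
  [series_a.last(n), series_b.last(n)]
end

activerecord = [150, 151, 153, 152, 154] # made-up ips values
sequel       = [190, 198]

left, right = align_last_results(activerecord, sequel, 500)
puts left.inspect
puts right.inspect
```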

I really want to keep a “proper” comparison looking at trends in both benchmarks over time

I’ve assumed that when we want to compare benchmark A to benchmark B, we are only interested in looking at benchmark A over time.

sam · July 13, 2017, 2:02pm

Nahh I want to look at multiple trends.

At the moment I can select up to 2000 commits. What I would like is to be able to select a time frame there as well (1 year, 2 years, 3 years, 5 years); then we can pick, say, 2000 commits in the N-year period and graph them.

At that point we can compare trends and show, for example, when AR worked on perf compared to when Sequel did.

Additionally, we will easily be able to tell when time is invested in “catching up” on performance debt.
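The time-frame selection described in this post could be sketched as a filter followed by even down-sampling. `Commit` and `sample_commits` are hypothetical names, and the even-step sampling is just one possible way to pick the commits; nothing here is taken from the RubyBench code.

```ruby
require "date"

Commit = Struct.new(:sha, :date)

# Keep commits from the last `years` years (relative to `today`), then thin
# them out evenly so that at most `max_points` remain, oldest first.
def sample_commits(commits, years:, max_points:, today: Date.today)
  cutoff = today << (12 * years) # Date#<< moves back by that many months
  recent = commits.select { |c| c.date >= cutoff }.sort_by(&:date)
  return recent if recent.length <= max_points

  step = recent.length.to_f / max_points
  Array.new(max_points) { |i| recent[(i * step).floor] }
end

# Made-up history: one commit every 100 days going back from the post date.
today = Date.new(2017, 7, 13)
history = (0...10).map { |i| Commit.new("sha#{i}", today - i * 100) }

picked = sample_commits(history, years: 2, max_points: 4, today: today)
puts picked.map(&:sha).inspect
```

Sampling after filtering keeps the number of plotted points bounded whatever time frame is chosen, which is what makes a 5-year trend line practical to draw.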

sam · July 20, 2017, 10:42pm

Any news here? I really want something I can tweet out at people.

bmarkons · July 21, 2017, 8:34am

Have a look @sam

https://rubybench.org/rails/rails/commits?result_type=activerecord/postgres_scope_all&compare_with=sequel/postgres_scope_all

Right now I am working on grouping benchmarks so that only appropriate benchmarks appear in the select box for comparison.

sam · July 21, 2017, 9:29pm

Nice!

Are these benches correct? Is AR really that close to Sequel for this benchmark?

https://rubybench.org/rails/rails/commits?result_type=active_record/postgres_discourse&display_count=500&compare_with=sequel/postgres_discourse

Seems a bit odd to me, because we were seeing huge discrepancies in other benches.

cc @jeremyevans

jeremyevans · July 21, 2017, 11:43pm

I didn’t look in detail, but it may be just that the query is more complex than other benches and most of the time is spent in the database and not in Ruby. Sequel does allocate less than a third of the objects that AR does, and is 10% faster, but if the database is the bottleneck, then the performance difference between Sequel and AR is likely to be small.

Thanks,
Jeremy

bmarkons · July 24, 2017, 11:21am

I’ve come across an odd issue while I was trying to reproduce these results locally.

When I ran both benchmarks directly on my machine (not in Docker), Sequel came out about 2x faster than ActiveRecord:

sequel/postgres_discourse - 312 ips
activerecord/postgres_discourse - 162 ips

Running the same benchmarks the standard way (inside Docker) gave me a smaller difference:

sequel/postgres - 198 ips
activerecord/postgres - 153 ips

I am super confused by this. Maybe we are missing something in the setup. I have verified benchmark correctness (same SQL query and same string generated) by running it directly on my local machine.
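To make the discrepancy easier to see, the four figures from this post can be turned into ratios. This is plain arithmetic on the numbers quoted above; the `ratio` helper and the constant names are only for illustration.

```ruby
# ips figures quoted in this post.
LOCAL_IPS  = { sequel: 312.0, activerecord: 162.0 }.freeze
DOCKER_IPS = { sequel: 198.0, activerecord: 153.0 }.freeze

def ratio(numerator, denominator)
  (numerator.to_f / denominator).round(2)
end

puts "Sequel vs AR, local:  #{ratio(LOCAL_IPS[:sequel], LOCAL_IPS[:activerecord])}x"
puts "Sequel vs AR, Docker: #{ratio(DOCKER_IPS[:sequel], DOCKER_IPS[:activerecord])}x"

LOCAL_IPS.each_key do |orm|
  puts "#{orm}: Docker throughput is #{ratio(DOCKER_IPS[orm], LOCAL_IPS[orm])} of local"
end
```

By these numbers Sequel drops to roughly 63% of its local throughput inside Docker, while ActiveRecord keeps roughly 94%. That would be consistent with some fixed per-query cost in the Docker setup, though the numbers alone do not show where it comes from.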

sam · July 24, 2017, 3:11pm

Very interesting.

How is the DB set up? You have to make sure you run a volume for the DB on the host, otherwise you pay a huge price on filesystem access. Also be sure to disable fsync and other stuff … see:

https://github.com/discourse/discourse/blob/master/lib/tasks/docker.rake#L42-L48


if ENV["SINGLE_PLUGIN"]
  @good &&= run_or_fail("bundle exec rubocop --parallel plugins/#{ENV["SINGLE_PLUGIN"]}")
  @good &&= run_or_fail("eslint --ext .es6 plugins/#{ENV['SINGLE_PLUGIN']}")
else
  @good &&= run_or_fail("bundle exec rubocop --parallel") unless ENV["SKIP_CORE"]
  @good &&= run_or_fail("eslint app/assets/javascripts test/javascripts") unless ENV["SKIP_CORE"]
  @good &&= run_or_fail("eslint --ext .es6 app/assets/javascripts test/javascripts plugins") unless ENV["SKIP_PLUGINS"]

for some of our speed hacks.
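For a throwaway benchmark database, the kind of speed hack mentioned here could look like the sketch below, which renders a `postgresql.conf` fragment. `fsync`, `synchronous_commit` and `full_page_writes` are real PostgreSQL settings that trade durability for speed, but choosing these three is an assumption of this sketch and is not taken from the linked rake task. `BENCH_PG_SETTINGS` and `postgresql_conf_fragment` are hypothetical names.

```ruby
# Assumed settings for a disposable benchmark database only. They are real
# PostgreSQL parameters, but this selection is an illustration, not a copy
# of what the linked rake task does.
BENCH_PG_SETTINGS = {
  "fsync" => "off",
  "synchronous_commit" => "off",
  "full_page_writes" => "off"
}.freeze

# Render the settings as lines that could be appended to postgresql.conf.
def postgresql_conf_fragment(settings)
  settings.map { |key, value| "#{key} = #{value}" }.join("\n") + "\n"
end

puts postgresql_conf_fragment(BENCH_PG_SETTINGS)
```

Settings like these are only reasonable for data that can be recreated at any time, since a crash can corrupt the database.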

bmarkons · July 25, 2017, 10:02am

In our current setup the DB runs in a separate linked container.

https://github.com/ruby-bench/ruby-bench-docker/blob/master/scripts/rails/master.sh#L25-L34


-e "MYSQL2_PREPARED_STATEMENTS=1" \
-e "INCLUDE_PATTERNS=$PATTERNS" \
rails_master \
/bin/bash -l -c "./runner"


docker-compose down -v

So I need to make these DB containers use volumes on the host?

sam · July 25, 2017, 1:32pm

I think by default it runs a volume on the host.

Maybe do a “raw” benchmark as well; then you can compare all 3, which will give you a great picture of what is going on.

bmarkons · July 25, 2017, 1:52pm

Do we want the same setup for the pg gem (and mysql2) as for AR and Sequel, i.e. tracking over time?

https://github.com/ruby-bench/ruby-bench-suite/pull/76#issuecomment-314076173

sam · July 25, 2017, 2:06pm

I guess this is the correct thing to do, but keep in mind I have seen very very little perf movement in these gems, they are pretty much as fast as they are going to be.

You got to get something in asap though to fully debug this.

bmarkons · July 25, 2017, 2:39pm

Sounds reasonable.

I will work towards having the raw benchmark running in the Docker setup to be able to debug this. Later we can set up an automatic run on repo push if we want.

bmarkons · August 9, 2017, 4:35pm

I made some changes to the setup and decided to try docker-compose.

Results I got with this setup:

Raw : 230 ips
Sequel : 190 ips
ActiveRecord : 150 ips

It seems the overhead with AR is 2x bigger than with Sequel: AR is 80 ips below raw, while Sequel is 40 ips below.
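The 2x figure comes from the drop in ips. As a cross-check, the same three results can also be converted to time per iteration, which is another way to express overhead. This is only arithmetic on the numbers above; the helper names are made up for the sketch.

```ruby
# Results quoted above, in iterations per second.
RESULTS_IPS = { raw: 230.0, sequel: 190.0, activerecord: 150.0 }.freeze

def ms_per_iteration(ips)
  1000.0 / ips
end

# Extra milliseconds per iteration compared with the raw benchmark.
def overhead_ms(ips, raw_ips)
  ms_per_iteration(ips) - ms_per_iteration(raw_ips)
end

raw = RESULTS_IPS[:raw]
ips_drop_ratio = (raw - RESULTS_IPS[:activerecord]) / (raw - RESULTS_IPS[:sequel])
time_ratio = overhead_ms(RESULTS_IPS[:activerecord], raw) /
             overhead_ms(RESULTS_IPS[:sequel], raw)

puts "ips drop ratio (AR / Sequel): #{ips_drop_ratio.round(2)}"
puts "per-iteration overhead ratio (AR / Sequel): #{time_ratio.round(2)}"
```

Measured as a drop in ips the ratio is 2.0. Measured as added time per iteration it comes out at about 2.5, because equal steps in ips are not equal steps in time.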
