[GSoC15] Idea: possibility of switching to MongoDB


#1

Another idea I raised in my GSoC proposal is the possibility of switching to a NoSQL database such as MongoDB.

@sam told me that he doesn’t like the idea, but also said that I could explain it here.

In my experience working with large amounts of data, when database operations need to filter and return a large number of records (maybe more than 10,000), there is a significant difference between MySQL and MongoDB (MongoDB being maybe around 50 times faster?).

In my professional life I have been working in an R&D department, so I don’t know exactly how this architecture behaves in production, but I know that you can load balance across different instances, as with any SQL engine. In my own case, the change from MySQL to MongoDB was made because MySQL was a bottleneck, but it depends on the operations and the architecture.

@sam told me that you are actually using PostgreSQL, so maybe its behaviour is different.

Despite this, my idea was to try exporting the data from your SQL database to a MongoDB database to see whether there is a benefit, but only once the rest of the improvements are made.

If the problem with this idea is how to manage it from RoR, there is an excellent gem called Mongoid (see the Mongoid Manual 6.2) that is a good fit, because it acts like ActiveRecord in many ways.

If you don’t like this idea, discard it and, if you like my proposal, we can continue with the rest of the ideas.


#2

My issue with this suggestion is that it is really at odds with the project.

The idea around rubybench is all about measuring. This proposes making a change to rubybench without measuring and proving that pg is actually some sort of bottleneck, which it is not, and odds are tiny it ever would be.

The amount of data stored in the pg database is minuscule. Postgres is able to eat this for breakfast without breaking a sweat. I am very doubtful that it is in any way a bottleneck that needs optimising.


#3

My idea was, once the rest of the improvements are done, to test and benchmark the service using both DB engines in order to see whether the change is worthwhile, testing before applying (like everything I proposed).
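
As a rough illustration of what "testing before applying" could look like, here is a minimal sketch using Ruby's standard `Benchmark` module. Everything in it is an assumption made for illustration: `RECORDS`, `BACKENDS`, and `median_times` are hypothetical names, and the two lambdas are in-memory stand-ins, not real PostgreSQL or MongoDB queries. In a real comparison each lambda would run the same filter-and-return query against one of the engines.

```ruby
require "benchmark"

# Hypothetical in-memory data set standing in for the stored benchmark results.
RECORDS = Array.new(10_000) { |i| { id: i, score: i % 100 } }

# Hypothetical stand-ins for the two storage backends. Both must return
# the same rows so that only the timing differs.
BACKENDS = {
  "backend_a" => -> { RECORDS.select { |r| r[:score] > 50 } },
  "backend_b" => -> { RECORDS.reject { |r| r[:score] <= 50 } }
}

# Runs each backend `runs` times and returns the median wall-clock
# time in seconds, keyed by backend name.
def median_times(backends, runs: 5)
  backends.transform_values do |query|
    samples = Array.new(runs) { Benchmark.realtime { query.call } }
    samples.sort[runs / 2]
  end
end

results = median_times(BACKENDS)
results.each { |name, secs| puts format("%-10s %.6f s", name, secs) }
```

Taking the median of several runs rather than a single timing makes the comparison less sensitive to one-off noise. The decision to switch would then rest on the measured numbers for the real workload.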

But if the amount of data stored is minuscule and you are not convinced, we can ignore this part and focus on other improvements. There is no problem with that.


#4