The importance of supporting native data types
I'm going to start a new project which can have in very short time hard requirements on database performance: there are two entities, let's say items and users that will be related (a common one-to-many relationship). The question is that this relation is very probable to grow a lot and the item will be related with thousands of users.
In order to avoid having the one-to-many table with millions of rows I have decided to try with some NoSQL solutions, with the good luck that the first that I have tried still keeps me very impressed: Redis is very awesome! It's very fast, extremely easy to use and very powerful. And the set, ordered set or list seem to fit perfectly in my problem.
So I decided to perform some tests:
total_actions = 100000
# 10000 actions
1.upto(total_actions) do |i|
b = Benchmark.measure do
subscribers_per_action = 10000
1.upto(subscribers_per_action) do |user_id|
$redis.incr "action-#{i}-count"
$redis.sadd "action-#{i}", user_id
end
end
puts "#{i} / #{total_actions} / #{b.real} / #{$redis.dbsize}"
end
The results are impressive: 10,000 incr and 10,000 adds to a set performed in 2.4 secs approximately. And not only in an empty database. I'm testing it with databases that have a lot of entries and still is performing such well.
So I decided to try the same in a Tokyo Tyrant server. The problem is that Tokyo Cabinet doesn't support datasets, it is a classical key-value storage. So if you want to add elements in a key you have to store them in a YAML, or in a string with a well-known separator, or that way.
The same test with Redis took the double of time with Tokyo Tyrant. I know that 5.7 secs for 20,000 operations is very very fast, but I'm still impressed of the performance of Redis working with data types much more complex that a string. My choice is quite clear.


