The Developer's Library for D

Container benchmarks

Posted: 06/30/08 11:45:52 Modified: 06/30/08 11:49:52

I've been benchmarking the old and new "map" containers and the built-in associative arrays, for myself and anyone interested. The test covers only insertion and lookup. My code.

Here are some results (expect some variance). Note that the number of elements inserted ("init") and looked up ("find") varies between runs.

No particular selection of allocator, etc:

Starting warmup... (16384 elements)
Starting tests... (init 1048576 elements and find 1048576 elements)
Initializations finished
Did all find runs get the same results: yes

Results:        Init time       Find time (sec)
Old TreeMap     5.68            1.60
New SortedMap   5.16            1.28
AssocArray      7.15            0.30
Old HashMap     9.47            0.38
New HashMap     7.11            0.28
New StackMap    11.19           0.42


Starting warmup... (16384 elements)
Starting tests... (init 131072 elements and find 1048576 elements)
Did all find runs get the same results: yes

Results:        Init time       Find time (sec)
Old TreeMap     0.52            1.10
New SortedMap   0.14            0.79
AssocArray      0.17            0.26
Old HashMap     0.25            0.33
New HashMap     0.25            0.23
New StackMap    0.05            0.35

Using the chunk allocator seems to make a big difference to the new HashMap's insertions:

Starting warmup... (16384 elements)
Starting tests... (init 131072 elements and find 1048576 elements)
Did all find runs get the same results: yes

Results:        Init time       Find time (sec)
Old TreeMap     0.83            1.06
New SortedMap   0.12            0.77
AssocArray      0.20            0.26
Old HashMap     0.20            0.32
New HashMap     0.05            0.23
New StackMap    0.04            0.35

Calling allocator.config (1000, 1000) for the new hash and stack maps, and setting the hash map's initial number of buckets, makes insertion really fast -- compare with the built-in AA!
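A sketch of what that setup might look like (a hypothetical fragment: I'm assuming Tango's HashMap exposes the allocator.config call used in this thread and a buckets setter, so check the signatures against your Tango version):

```d
import tango.util.container.HashMap;

void main ()
{
        auto map = new HashMap!(int, int);

        // assumed meaning: pre-size the node allocator so insertion
        // doesn't hit the heap once per element
        map.allocator.config (1000, 1000);

        // assumed: pre-size the bucket table for the expected count
        map.buckets (131072);

        for (int i; i < 131072; ++i)
                map.add (i, i);
}
```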

Starting warmup... (16384 elements)
Starting tests... (init 131072 elements and find 1048576 elements)
Did all find runs get the same results: yes

Results:        Init time       Find time (sec)
Old TreeMap     0.64            1.07
New SortedMap   0.12            0.74
AssocArray      0.21            0.25
Old HashMap     0.20            0.31
New HashMap     0.02            0.22
New StackMap    0.04            0.34


Starting warmup... (16384 elements)
Starting tests... (init 1048576 elements and find 1048576 elements)
Did all find runs get the same results: yes

Results:        Init time       Find time (sec)
Old TreeMap     5.53            1.62
New SortedMap   1.74            1.40
AssocArray      3.88            0.31
Old HashMap     6.13            0.38
New HashMap     0.26            0.30
New StackMap    0.42            0.42

Including the old LinkMap (using far fewer elements):

Starting warmup... (8192 elements)
Starting tests... (init 16384 elements and find 16384 elements)
Initializations finished
Did all find runs get the same results: yes

Results:        Init time       Find time (sec)
Old LinkMap     3.36            3.31
Old TreeMap     0.02            0.01
New SortedMap   0.00            0.00
AssocArray      0.00            0.00
Old HashMap     0.00            0.00
New HashMap     0.00            0.00
New StackMap    0.00            0.00

Posted: 06/30/08 17:27:50 -- Modified: 07/01/08 16:08:55 by kris -- Modified 3 Times

I checked in an update yesterday which appears to speed up lookups by about 20% (when using simple data types and smarter memory allocation). It would be interesting to see memory usage for these too, since that can be a critical aspect in some cases.

BTW, you'll have noticed that some of the numbers are largely gated by the memory-allocation strategy. Because of this, you'll often find that more stable metrics come from a second run after clearing the container, since memory allocation is then largely removed from the picture (yes, you kind of do this already, but perhaps not in a way that makes a notable difference). The built-in container performance tests explicitly show both runs, as they can vary considerably.
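The two-pass measurement described above could be sketched like this (hypothetical: I'm assuming Tango's tango.time.StopWatch API, and that HashMap.clear recycles the nodes so the second pass mostly avoids allocation, as suggested in this thread):

```d
import tango.time.StopWatch;
import tango.util.container.HashMap;

void main ()
{
        auto map = new HashMap!(int, int);
        StopWatch elapsed;

        // first pass: timing includes memory allocation
        elapsed.start;
        for (int i; i < 1_000_000; ++i)
                map.add (i, i);
        auto first = elapsed.stop;

        // second pass after clear: nodes are reused, so the
        // measurement largely excludes allocation cost
        map.clear;
        elapsed.start;
        for (int i; i < 1_000_000; ++i)
                map.add (i, i);
        auto second = elapsed.stop;
}
```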

Also, you'll find that CPU energy-saving strategies can skew performance badly -- make sure such things are disabled when doing these kinds of tests. Finally, there's another memory-allocation strategy in the works, which will likely become the default.

(Your numbers look very low, btw .... I test with an old 2GHz Athlon which, with HashMap, does 25 million adds/s and 30 million lookups/s. Not sure how that squares with your numbers?)

Posted: 06/30/08 22:09:09 -- Modified: 07/01/08 07:33:28 by kris

you piqued my interest with this :)

I ran my own tests, which are perhaps a bit simpler but show a somewhat different picture. There are 1 million adds, lookups and iterations in this one, with the numbers shown as operations per second:

1000000 adds: 2830212.32/s
1000000 lookups: 30636821.92/s
1000000 adds (after clear): 9405623.07/s
1000000 element dup: 1427123.34/s
1000000 element iteration: 49780202.21/s

1000000 AA adds: 418239.50/s
1000000 AA lookups: 17329574.89/s
1000000 AA iteration: 54531321.41/s

This is with the default (heap) allocator configured for HashMap, so that it's on the same footing as the AA. You can see that HashMap exhibits nearly 7 times the insertion performance of the AA, and almost twice the lookup performance. It also shows that the AA is faster at iteration, so that's something for us to look into.

Here's one with the GC disabled, with both using heap allocators:

1000000 adds: 6894015.12/s
1000000 lookups: 28872188.03/s
1000000 adds (after clear): 9484047.29/s
1000000 element dup: 4090310.22/s
1000000 element iteration: 49483604.74/s

1000000 AA adds: 1827921.22/s
1000000 AA lookups: 17580139.77/s
1000000 AA iteration: 53767104.77/s

AA is definitely faster with the GC disabled, yet HashMap is still nearly four times faster.

Here's HashMap with the Chunk allocator:

1000000 adds: 25622168.14/s
1000000 lookups: 30938426.43/s
1000000 adds (after clear): 26144286.60/s
1000000 element dup: 9750260.81/s
1000000 element iteration: 50383483.94/s

1000000 AA adds: 1395536.29/s
1000000 AA lookups: 17805669.69/s
1000000 AA iteration: 54486498.42/s

You can see this allocator is almost 10 times faster than the HashMap heap allocator. It's also almost 20 times faster than the AA in this test.

I think what this shows is that benchmarks are sometimes hard to quantify :D

Posted: 07/01/08 13:12:57

Indeed, I didn't know much about how the allocation etc. worked, and it shows. Thanks for the info! I was more interested in how well the containers perform from an end programmer's perspective, without trying too hard to optimise them, but of course there are too many external influences on the results (and how often is anyone going to want to store a million entries in a hash map anyway?). FYI, my CPU is a 2GHz Athlon 64 X2, so power saving could well account for some of the variance in my results.

Posted: 07/01/08 16:06:36

Couldn't agree more, and thanks for doing the comparison

Posted: 07/07/08 20:34:30 -- Modified: 07/07/08 20:56:10 by zzzzrrr

I make use of D's built-in AAs in performance-critical areas of Blaze. It would be nice to know which is the better choice, from both a speed and a memory-management perspective.

From the performance tests above, HashMap with the Chunk allocator clearly looks like the best choice, although I don't quite understand how to configure the allocator for optimal Blaze performance.

Thoughts?

Posted: 07/23/08 18:49:55 -- Modified: 07/23/08 18:54:06 by kris

Yeah, the chunk allocator is very handy. But see the disclaimer in the documentation about GC management: briefly, you should not use the chunk allocator to store things that the GC needs to know about. It's awesome for structs and so on, but not so useful for heap-allocated strings, etc.

Schvieguy has provided a faster GC allocator, which we'll probably wind up using as the default. Either way, the tests above indicate HashMap is at least four times faster at insertion than the AA, regardless of which allocator is used. It's also at least twice as fast for lookups (often the more heavily used operation) and, other tests indicate, more efficient in terms of memory usage.

Posted: 07/24/08 08:56:52

Sounds great!

It would be useful to know, perhaps in the documentation, how best to configure the allocator:

myMap.allocator.config (1000, 1000);
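Pulling the thread together, here is a hedged sketch of a chunk-allocated HashMap. The template parameter names and the meaning of the two config arguments are my assumptions from the posts above, not verified against the documentation:

```d
// Container (hash/reap/Chunk policies) may need its own import,
// depending on your Tango version
import tango.util.container.Container;
import tango.util.container.HashMap;

void main ()
{
        // select the Chunk allocator via the (assumed) template parameters
        auto myMap = new HashMap!(int, int, Container.hash,
                                  Container.reap, Container.Chunk);

        // the two arguments presumably size the chunks / pre-allocation
        myMap.allocator.config (1000, 1000);

        // per kris's caveat above: don't store GC-managed data (e.g.
        // heap-allocated strings) in a chunk-allocated container
        myMap.add (1, 42);
}
```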