Download Reference Manual
The Developer's Library for D
About Wiki Forums Source Search Contact

Nested Hashmap

Moderators: larsivi kris

Posted: 09/09/08 15:29:01 Modified: 09/09/08 15:29:21

I try to include Hashmap as a value into another Hashmap, which works fine so far. This is the declaration:

alias hm.HashMap!(uint, float, Container.hash, Container.reap, Container.Chunk) myListMap;
alias hm.HashMap!(uint, myListMap, Container.hash, Container.reap, Container.Chunk) myListMap2;

myListMap tmp_map = new myListMap();
myListMap2 list2 = new myListMap2();

I can add myListMap to myListMap2. The problem starts, when I want to append values to myListMap. My idea was to retrieve the myListMap Hashmap from myListMap2 with a list2_key1 and assign it to a temporary Hashmap of type myListMap. Then append the new key, value pair and reassign the tmp_map to the myListMap2 again.

tmp_map = list2[list2_key1];
tmp_map[tmp_map_key2] = value;
list2[list2_key1] = tmp_map;

All this is inside a class function and all Hashmaps are globally available in the class. tmp_map was made available globally, because I dont wanted to instantiate tmp_map with each call of the function. Therefore I made it globally available and added:

tmp_map.clear

at the beginning of the function. But unfortunately this brings rather weird results.

What I would like is to have is a data structure like this, where I can add new list_key/value pairs and and new tmp_map_key/value pairs.

--------------------------------------------------------------
 list_key1 => tmp_map_key1    | tmp_map_key2    | tmp_map_key3  ..
              tmp_map_value1  | tmp_map_value2  | tmp_map_value3  ..
--------------------------------------------------------------
 list_key2 => tmp_map_key1    | tmp_map_key2    | tmp_map_key3  ..
              tmp_map_value1  | tmp_map_value2  | tmp_map_value3  ..
--------------------------------------------------------------
 list_key3 => tmp_map_key1    | tmp_map_key2    | tmp_map_key3  ..
              tmp_map_value1  | tmp_map_value2  | tmp_map_value3  ..
--------------------------------------------------------------

How do I accomplish that? What exactly does the clear function of the Hashmap? Can anyone give me a advise how to solve that?

best regards Lars

Some thoughts about switching from PHP to D: http://www.lars-kirchhoff.de/go/journal/section/from-php-to-d/

Author Message

Posted: 09/10/08 09:40:25

Are you forgetting that HashMaps? are classes, and thus reference types?

In this code, the third line serves no purpose since you're only setting the reference (i.e. pointer) list2[list2_key1] to its original value.

tmp_map = list2[list2_key1];
tmp_map[tmp_map_key2] = value;
list2[list2_key1] = tmp_map;	// already refers to this object
// equivalent:
list2[list2_key1][tmp_map_key2] = value;

Then when you call tmp_map.clear next time the function runs, tmp_map is still a reference to list2[list2_key1] so you're also clearing that! And you needn't worry about reinstantiating tmp_map every time the function runs, because it's only a pointer. You are only instatiating an object when you call new HashMap!(...).

Hope this helps.

Posted: 09/10/08 14:16:23 -- Modified: 09/10/08 14:17:32 by
lars_kirchhoff -- Modified 2 Times

Thanks cyborg16,

thats what I thought, that it is a reference only. I already tried your suggestion:

list2[list2_key1][tmp_map_key2] = value;

But unfortunately this produces that error:

tango.core.Exception.NoSuchElementException: missing or invalid key

What I can do is to add if no list2_key1 is in list2:

tmp_map[tmp_map_key2] = value;
list2[list2_key1] = tmp_map;

and later access that by:

list2[list2_key1][tmp_map_key2] = value;

But then all have the reference to the same object, which means all have the same values.

Here is the complete function:

private void addNodes(uint node1, uint node2, float strength)
{
    if (node1 in list2) {
        list2[node1][node2] = strength;
    } 
    else {
        tmp_map[node2] = strength;
        list2[node1] = tmp_map;
    }
}

Is there any other way to achieve this? I'm still learning how to handle data structures in D.

best regards Lars

Some thoughts about switching from PHP to D: http://www.lars-kirchhoff.de/go/journal/section/from-php-to-d/

Posted: 09/10/08 14:28:01

I've tried to simplify the data structure to

alias hm.HashMap!(uint, uint[], Container.hash, Container.reap, Container.Chunk) myListMap2;

--------------------------------------------------------------
 list_key1 => key_1 | key_2 | key_3 ...
--------------------------------------------------------------
 list_key2 => key_1 | key_2 | key_3 ...
--------------------------------------------------------------
 list_key3 => key_1 | key_2 | key_3 ...
--------------------------------------------------------------

and changed the function to:

private void addNodes(uint node1, uint node2, float strength)
{
    uint[] tmp_arr;

    if (node1 in list2) {
        tmp_arr = list2[node1];
    }

    tmp_arr ~= node2;
    list2[node1] = tmp_arr;
}

This works fine, but produces a segmentation fault with more then 1500 items in list2. It runs fine for less items and produces correct results? Do I miss something? Do I run out of memory or in different memory spaces?

thanks Lars

Some thoughts about switching from PHP to D: http://www.lars-kirchhoff.de/go/journal/section/from-php-to-d/

Posted: 09/10/08 15:18:28

Not sure about the segfault unfortunately. Probably a bug elsewhere. Kris?

If you don't need to use a hash-map it would be more efficient to use an array of course, but I can answer a few questions about your previous method:

lars_kirchhoff wrote:

Thanks cyborg16,

thats what I thought, that it is a reference only. I already tried your suggestion:

list2[list2_key1][tmp_map_key2] = value;

But unfortunately this produces that error:

tango.core.Exception.NoSuchElementException: missing or invalid key

list2[...] is itself a hashmap. So you need to create it:

list2[list2_key1] = new HashMap!(uint,myListMap);
list2[list2_key1][tmp_map_key2] = value;
// or equivalently, and to save looking up in list2 again:
auto tmp = new HashMap!(uint,myListMap);
tmp[tmp_map_key2] = value;
list2[list2_key1] = tmp;

So your function could be:

private void addNodes(uint node1, uint node2, float strength)
{
    Hashmap!(uint,float)* p = node1 in list2;
    if (p)
        (*p)[node2] = strength;
    else {
        auto tmp = new HashMap!(uint,myListMap);
	tmp[tmp_map_key2] = strength;
	list2[list2_key1] = tmp;
    }
}

Posted: 09/10/08 15:43:23

Thanks again Cyborg16,

The suggest code works like a charm, but has the same problem as with the array. I got segfault with higher number of items.

/lars

Some thoughts about switching from PHP to D: http://www.lars-kirchhoff.de/go/journal/section/from-php-to-d/

Posted: 09/10/08 15:47:26

Noticed that you're explicitly using a Chunk-Allocator. Perhaps you could explain your use of that in this context? To start with, I'd suggest leaving the HashMap? configuration in their default setting by using only the first two template arguments instead.

Posted: 09/10/08 16:24:24 -- Modified: 09/10/08 20:48:48 by
lars_kirchhoff -- Modified 2 Times

Thanks Chris,

I've removed it and it works. I used it because I had experienced better performance with the chunk configuration, because I use the Hashmap to store millions of key/value pairs, which take up to 1-2GB in memory.

Could it be that the chunk size of the myListMap2 Hashmap configuration was to small for bigger myListMap items?

best Lars

Some thoughts about switching from PHP to D: http://www.lars-kirchhoff.de/go/journal/section/from-php-to-d/

Posted: 09/10/08 18:03:58

The Chunk allocator is only for non-GC allocated types. Which means that if you use it on a reference type, the garbage collector might come along and collect the data without the hash map knowing about it, hence the segfault.

There are plans to add a compromise chunk allocator that is not quite as fast, but supports GC allocated types, see #1180

Because D classes are always reference types, the default value is always null. So providing a meaningful default in your case is not supported by the current HashMap.

For example, you have the following code:

list2[list2_key1][tmp_map_key2] = value;

Analyzing this, it translates to:

list2.opIndex(list2_key1).opIndexAssign(value, tmp_map_key2);

The problem is, the opIndex only works if a key already exists.

The correct method is to do this:

if(!list2.containsKey(list2_key1))
   list2[list2_key1] = new myListMap;
list2[list2_key1][tmp_map_key2] = value;

Yeah, it sucks, but the way classes are always references, it's not easy to implement.

One thing that could be done is to add another parameter to the HashMap (or any map class for that matter) to allow for an allocator for element values, so when an unknown key is indexed, the allocator creates a default value for that key. This would make them similar to C++'s maps. Kris?

Posted: 09/11/08 10:40:52

I have another question? Is there a way to reduce the size that is of allocated by a hashmap. I just notice that it doesnt make a difference if I change from uint to ushort for the values of a hashmap. The size of the hashmap stays the same. Is that correct?

/lars

Some thoughts about switching from PHP to D: http://www.lars-kirchhoff.de/go/journal/section/from-php-to-d/

Posted: 09/12/08 01:38:28

lars_kirchhoff wrote:

I have another question? Is there a way to reduce the size that is of allocated by a hashmap. I just notice that it doesnt make a difference if I change from uint to ushort for the values of a hashmap. The size of the hashmap stays the same. Is that correct?

Yes, this is normal due to the alignment requirements of 32-bit processors. A hash map uses a link list, which consists of a pointer, the key, and the value. Since pointers must be 4-byte aligned, the struct that contains a link list element is padded to make sure it is a multiple of 8 bytes.

There really isn't a way around it, sorry.

-Steve

Posted: 09/14/08 11:49:02

Thanks Steve, So I need to look for some other solution..

/lars

Some thoughts about switching from PHP to D: http://www.lars-kirchhoff.de/go/journal/section/from-php-to-d/

Posted: 09/14/08 14:28:41

No, I mean the processor is the problem, and there isn't any way to work around it in code in a generic way. You could write a very specific hashmap for shorts that stores the values differently, but I don't think it's worth it. If you use builtin AA I bet you get the same issue.

-Steve