Download Reference Manual
The Developer's Library for D
About Wiki Forums Source Search Contact

I want this type of collection

Moderators: kris

Posted: 07/09/08 19:11:22

Hi,

It would be nice to have something like linked list in Tango (I know we already have a lot of these:). I called my list a "round map". Round because items are stored in array and they are reused. So it is circular and circles are round lol. Map because items can be accessed by their index. So I will try to explain. The difference is that all items are stored in a linear array, so they could be accessed very fast by their index. The list maintains the index of the head item and tail item. The tail item is actually the the next free slot. And since the stuff is stored in an array, it needs to grow. For example:

The list is empty. Add several items and the list becomes something like this: 1--2--3--4. Each item knows the index of the next and previous one. And when you call mylist.add(newitem), the add fuction returns the index of the newly added item. Now lets say we delete the second item. The list becomes something like this 1-----3--4--2 and the tail is now the second item so the next call to add will reuse this spot in the array and return the index.

This kind of list is extremely useful when working with C libraries. Take Win32 for example. To associate a HWND with a D class you need associative array. And these are slow. Instead of using AA you use this kind of list. So you add one item and you get its index. Set the user data for the HWND to this integer value and later access the class by index instead of doing AA lookups.

I have a basic implementation of this class, but I am not good at this stuff so it is probably buggy. It would be nice if it is implemented better in Tango consistent style:

import tango.io.Stdout;

//todo: won't work if T.init is nan
class RoundMap(T)
{
	this(uint step=32,uint initial=0,T bad=T.init)
	{
		stack.length=initial;
		this.bad=bad;
		this.step=(step==0?1:step);
	}

	uint add(T v)
	{
		if(size==stack.length)
		{
			auto start=stack.length;
			stack.length=stack.length+step;
			for(uint c=start;c<stack.length;++c)
			{
				auto s=&stack[c];
				s.next=c+1;
				s.prev=c-1;
				s.value=bad;
			}
			if(start+1<stack.length) stack[start+1].prev=tail;
			stack[tail].next=start+1;
		}

		auto i=tail;
		auto s=&stack[tail];
		s.value=v;
		tail=s.next;
		++size;
		return i;
	}

	T del(uint i)
	{
		auto s=&stack[i];
		auto r=s.value;
		if(r==bad) return r;
		s.value=bad;
		if(s.prev!=uint.max) stack[s.prev].next=s.next;
		if(s.next!=uint.max) stack[s.next].prev=s.prev;
		if(i==head) head=s.next;
		if(tail<stack.length) //insert the free cell after the tail
		{
			auto t=&stack[tail];
			s.next=t.next;
			s.prev=tail;
			if(s.next<stack.length) stack[s.next].prev=i;
			t.next=i;
		}
		else //insert the free cell after the last item and make it tail
		{
			Container* t;
			uint c=tail;
			if(tail>=stack.length) for(c=head;c!=tail;c=stack[c].next) t=&stack[c];
			s.prev=c;
			s.next=tail;
			tail=t.next=i;
		}
		--size;
		return r;
	}

	uint index(bool delegate(ref T v) cmp)
	{
		for(uint c=head;c!=tail;c=stack[c].next) if(cmp(stack[c].value)) return c;
		return uint.max;
	}

	/*void print()
	{
		tango.io.Stdout.Stdout("used (",head,tail,"): ");
		for(uint c=head;c!=tail && c<stack.length;c=stack[c].next) Stdout(c," ");//,stack[c].prev,"<>",stack[c].next,"||| ");
		tango.io.Stdout.Stdout("\nfree (",tail,size,stack.length,"): ");
		for(uint c=tail;c!=uint.max && c<stack.length;c=stack[c].next) Stdout(c," ");//,stack[c].prev,"<>",stack[c].next,"||| ");
		tango.io.Stdout.Stdout("\n").newline.flush;
		
	}*/

	T get(uint index){return stack[index].value;}
	T* getRef(uint index){return &stack[index].value;}

	uint length(){return size;}
	uint limit(){return stack.length;}

	T bad;

protected:

	Container[] stack;
	uint head,tail;
	uint size,step;

	struct Container
	{
		//static Container* opCall(T v,uint p){auto ret=new Container;ret.value=v;ret.prev=p;return ret;}

		T value;
		uint prev=uint.max,next=uint.max;
	}
}
Author Message

Posted: 07/12/08 18:27:47 -- Modified: 07/12/08 20:00:07 by
korDen -- Modified 3 Times

Yes, this is a very useful pattern, I'll show you my implementation of it.

Actually, this kind of containers is called a pool. Sometimes you need to store something somewhere so that you can return to it later. And you need some key to get your data back.

A simple analogy would be a deposit box in a bank or some store. Security doesn't let you in until after you leave you bag in a cell/box/safe (what's the correct word, btw?). You are given a key so that you can come back and unlock it. After you take your stuff back, the key is returned so that someone else could use the box.

Now you get it, I hope. Back to the implementation.

First, you need some IdGenerator?:

class IdGenerator {
    int getId();
    void freeId(int id);
}

The purpose of this class it to generate ids and free those ids that aren't used anymore. Generally it should generate ids in ascending order and reuse freed ids so that the id counter doesn't grow very fast. These ids will be used as a free slot index (key).

Next, you need an ordinary array or some other type of collection that is accessible by id. So, a simple pool looks like this:

struct SimplePool(T)
{
    public size_t push(T value)
    {
        size_t id = _idGenerator.getId();
        _array.length = id + 1; // enforce the capacity
        _array[id] = value;
        return id;
    }

    public void remove(size_t index)
    {
        _idGenerator.freeId(index);
    }

    // accessors
    public T opIndex(size_t index)
    {
        return _array[index];
    }

    public void opIndexAssign(size_t index, T value)
    {
        _array[index] = value;
    }

    private T[] _array;
    private IdGenerator _idGenerator;
}

and here is a simple IdGenerator?:

struct IdGenerator
{
    public size_t getId()
    {
        size_t id;
        size_t length = _array.length;
        if (length > 0) {
            --length;
            id = _array[length];
            _array.length = length;
            return id;
        }

        return _nextFreeId++;
    }

    public void freeId(size_t id)
    {
        _array ~= id;
    }

    private size_t[] _array;
    private size_t _nextFreeId = 0;
}

Simple, isn't it?

Posted: 07/12/08 19:52:26

You may also want to be able to aquire an address (pointer to your data) and have multithreading support. This is also implementable, but you no longer may use an array to store your data, because array may get resized at any point and this will invalidate all your pointers. And this is not thread-safe, obvioulsy.

So, now we need a new data type that is able to grow and that doesn't make full reallocation, i.e. that grows in chunks and old data chunks remain valid.

I implemented it as well, the name is SafeGrowingArray?. It is completely thread-safe, lock-free and *very* fast. Source code is available here: http://www.everfall.com/paste/id.php?tmjp3qzeo622

Thread-safe IdGenerator?, based on SafeGrowingArray?: http://www.everfall.com/paste/id.php?99lnpi25xeue Thread-safe SimplePool?: http://www.everfall.com/paste/id.php?msamdlo3iqzu

Performance comparison of ArraySeq?, SafeGrowingArray? and built-in array is attached to SafeGrowingArray? source code. Note that is many cases it performs *better* than ArraySeq? and completely thread-safe!

I would be happy to see these classes in Tango! BTW, I've written a few more lock-free containers and may release them, too, if anyone is interested.

Posted: 07/15/08 10:04:47

So pool then. Nice, that's two votes for the pool in Tango :). korDen, your pool implementation is very simple and clean, but I still believe it would be better to be implemented as a linked list, otherwise it takes more memory to manage separate array for the free slots and requires more memory allocations. Not to imply that my implementation is any good, but just thinking. Of course benchmarking is the way to go not just plain talk.

Posted: 07/15/08 22:17:28

I agree that it would better to implement IdGenerator? with a linked-list, since the only operations it uses are pushBack/popBack. I provided an implementation with an array for the sake of simplicity.

OTOH, *the best* way to implement the IdGenerator? is actually a Set or some other sorted collection, because it is prefered to store free ids in an ascending order for two reasons: - smallest available ids used, - iteration over pool.

But an underlying pool storage should be an array (or my SafeGrowingArray?) for an O(1) indexation, anyway.