Download Reference Manual
The Developer's Library for D
About Wiki Forums Source Search Contact

Strange behaviour in my chameneos test implementation

Moderators: kris

Posted: 04/29/07 20:08:56 Modified: 04/29/07 21:12:02

Hi!

I don't know where to post this problem. And may be this is my error but I don't see it, may be someone could help me.

So, I'm trying to reimplement chameneos benchmark test with Tango. My implementation is presented below. Sometimes it works correctly, but sometimes I get strange results:

bash-3.2$ time ./my_chameneous 40000
creaturesMeet_: 40000
creaturesMeet_: 40000
creaturesMeet_: 40000
creaturesMeet_: 0
creaturesMeet_: 0
80000

real    0m0.281s
user    0m0.046s
sys     0m0.015s
bash-3.2$ time ./my_chameneous 40000
creaturesMeet_: 2579283
creaturesMeet_: 2093332
creaturesMeet_: 2178625
creaturesMeet_: 2290491
9141731

real    0m11.546s
user    0m0.031s
sys     0m0.016s

In first case must be only four messages 'creaturesMeet_' but there are five. In second case total count of meetings must be 80000, not 9 millions.

May be there are some effects of optiomization and I should use some volatile statements or barriers?

I use DMD v.1.013 and DMD v.1.014, Tango 0.97-RC1, WinXP-SP2. Program was compiled with rebuild 0.18 and command lines:

rebuild -oqo my_chameneous.d
rebuild -oqo -O -inline -release my_chameneous.d

with the same effect.

The problem didn't exists with small N and stable at N == 40000 (on my Intel Pentium-M 1.5GHz).

Here is my program:

import tango.core.Thread;

import tango.util.locks.Condition;
import tango.util.locks.Mutex;

import tango.text.convert.Integer;

import tango.io.Stdout;

typedef int Color;
const Color BLUE = 0;
const Color RED = 1;
const Color YELLOW = 2;
const Color FADED = 3;

class MeetingPlace
  {
    int remaining_ = 10;
    Mutex lock_;
    Condition condition_;
    Color * swappedColor_ = null;

    this()
      {
        lock_ = new Mutex;
        condition_ = new Condition( lock_ );
      }
  }

class Creature : Thread
  {
    this(
        Color my,
        MeetingPlace meetingPlace )
      {
        super( &run );

        my_ = my;
        meetingPlace_ = meetingPlace;
      }

    void
    run()
      {
        while( true )
          {
            Color other = my_;
            scope localLock = new ScopedLock( meetingPlace_.lock_ );
            if( meetingPlace_.remaining_ )
              {
                if( !meetingPlace_.swappedColor_ )
                  {
                    meetingPlace_.swappedColor_ = &other;
                    meetingPlace_.condition_.wait();
                    meetingPlace_.swappedColor_ = null;
                    --meetingPlace_.remaining_;
                  }
                else
                  {
                    other = *meetingPlace_.swappedColor_;
                    *meetingPlace_.swappedColor_ = my_;
                    meetingPlace_.condition_.notify();
                  }

                localLock.release;
                if( FADED != other )
                  {
                    my_ = complement( other );
                    ++creaturesMeet_;
                  }
              }
            else
              break;
          }
Stdout( "creaturesMeet_: " )( creaturesMeet_ ).newline;
      }

    int
    total()
      {
        return creaturesMeet_;
      }

  private :
    Color my_;
    MeetingPlace meetingPlace_;

    int creaturesMeet_ = 0;

    Color
    complement( Color other )
      {
        switch( my_ )
        {
            case BLUE:
                return other == RED ? YELLOW : RED;
            case RED:
                return other == BLUE ? YELLOW : BLUE;
            case YELLOW:
                return other == BLUE ? RED : BLUE;
            default:
                break;
        }
        return my_;
      }
  }

void
main( char[][] args )
  {
    auto meetingPlace = new MeetingPlace;
    if( 2 == args.length )
      meetingPlace.remaining_ = parse( args[ 1 ], 10 );

    auto colors = [ BLUE, RED, YELLOW, BLUE ];

    Creature[] group;
    foreach( color; colors )
      {
        Creature creature = new Creature( color, meetingPlace );
        group ~= creature;
        creature.start;
      }

    int total = 0;
    foreach( c; group )
      {
        c.join;
        total += c.total;
      }
    Stdout( total ).newline;
  }

// vim:ts=2:sts=2:sw=2:expandtab:fenc=utf-8:

Thanks!

Regards, Yauheni Akhotnikau

Author Message

Posted: 04/30/07 01:21:01

looking at the code, it seems you've made some changes to the implementation of meeting place? In particular, the original code has a shared lock, while your version is creating a new lock on each iteration?

Posted: 04/30/07 04:33:14

Avoid volatile RMW operations like the ++creaturesMeet. Or if you really want to use them, do so using tango.core.Atomic. The operation would be something like "atomicIncrement(creaturesMeet)." I'd need to spend some more time with the code to figure out what's going on, but for what it's worth, here is a minimal port of the Phobos impl to Tango:

/* The Great Computer Language Shootout
   http://shootout.alioth.debian.org/

   Reference implementation in C# contributed by Isaac Gouy

   converted to D by Dave Fladebo
   compile: dmd -O -inline -release chameneos.d
*/

import tango.stdc.stdio, tango.stdc.stdlib, tango.core.Thread, tango.util.locks.Semaphore;

void main(char[][] args)
{
    int meetings, n = args.length > 1 ? atoi((args[1] ~ '\0').ptr) : 1;
    MeetingPlace m = new MeetingPlace(n);

    Creature[] creatures = new Creature[colours.length];
    foreach(int i, inout Creature c; creatures)
    {
        c = new Creature(m,colours[i]);
        c.start();
    }
    foreach(Creature c; creatures)
    {
        c.join();
        meetings += c.creaturesMet;
    }
    printf("%d\n", meetings);
}

enum Colour { blue, red, yellow, faded }
Colour[] colours = [ Colour.blue, Colour.red, Colour.yellow, Colour.blue ];

class MeetingPlace
{
    private static Semaphore mustWait;
    private Colour first, second;
    private bool firstCall = true;
    private int n;

    this(int maxMeetings)
    {
        n = maxMeetings;
        mustWait = new Semaphore(1); //sem_init(&mustWait,0,1);
    }

    private Colour OtherCreaturesColour(Colour me)
    {
        Colour other = Colour.faded;

        mustWait.acquire(); //sem_wait(&mustWait);
        if(firstCall)
        {
            if(n)
            {
                first = me;
                firstCall = false;
                mustWait.release(); //sem_post(&mustWait);
                other = second;
                n--;
            }
            mustWait.release(); //sem_post(&mustWait);
        } else {
            firstCall = true;
            second = me;
            other = first;
        }

        return other;
    }
}

class Creature : Thread
{
    private MeetingPlace m;
    private int creaturesMet = 0;
    private Colour me;

    this(MeetingPlace m, Colour c)
    {
        super(&run);
        this.m = m; this.me = c;
    }

    void run()
    {
        while(me != Colour.faded) MeetOtherCreature();
        //return 0;
    }

    private void MeetOtherCreature()
    {
        Colour other = m.OtherCreaturesColour(me);
        if(other == Colour.faded)
        {
            me = other;
        } else {
            me = Complement(other);
            creaturesMet++;
        }
    }

    private Colour Complement(Colour other)
    {
        switch(me)
        {
            case Colour.blue:
                return other == Colour.red ? Colour.yellow : Colour.red;
            case Colour.red:
                return other == Colour.blue ? Colour.yellow : Colour.blue;
            case Colour.yellow:
                return other == Colour.blue ? Colour.red : Colour.blue;
            default:
                break;
        }
        return me;
    }
}

Posted: 04/30/07 06:07:47

kris wrote:

looking at the code, it seems you've made some changes to the implementation of meeting place?

Yes, simple porting the original phobos version is very simple task. So I'm trying to create another implementation where each creature implements meeting place's logic.

kris wrote:

In particular, the original code has a shared lock, while your version is creating a new lock on each iteration?

I'm trying to use mutex to lock meeting place data at each iteration and condition variable to wake up the creature who arrived at the meeting place first.

Regards, Yauheni Akhotnikau

Posted: 04/30/07 06:16:51

sean wrote:

Avoid volatile RMW operations like the ++creaturesMeet. Or if you really want to use them, do so using tango.core.Atomic. The operation would be something like "atomicIncrement(creaturesMeet)."

But why should I do that? creaturesMeet is not a shared variable. It is the instance variable in each Creature object.

May be I should use some protection for meetingPlace.remaining? But all access to it protected by mutex.

Regards, Yauheni Akhotnikau

Posted: 04/30/07 07:00:35

I have some doubts about correctness of this implementation:

sean wrote:
class MeetingPlace
{
    private static Semaphore mustWait;
    private Colour first, second;
    private bool firstCall = true;
    private int n;

    this(int maxMeetings)
    {
        n = maxMeetings;
        mustWait = new Semaphore(1); //sem_init(&mustWait,0,1);
    }

    private Colour OtherCreaturesColour(Colour me)
    {
        Colour other = Colour.faded;

        mustWait.acquire(); //sem_wait(&mustWait);
        if(firstCall)
        {
            if(n)
            {
                first = me;
                firstCall = false;
/*1*/           mustWait.release(); //sem_post(&mustWait);
                other = second;
                n--;
            }
            mustWait.release(); //sem_post(&mustWait);
        } else {
            firstCall = true;
/*2*/       second = me;
            other = first;
        }

        return other;
    }
}

The first creature at meeting point releases the semaphore at the point 1. But it must be sure that the second creature already passed point 2. And I don't see any reasons for that assurance.

For example, if change the implementation of MeetingPlace to:

class MeetingPlace
{
    private static Semaphore mustWait;
    private Colour first, second;
    private bool firstCall = true;
    private int n;
    private bool second_was_here = false;

    this(int maxMeetings)
    {
        n = maxMeetings;
        mustWait = new Semaphore(1); //sem_init(&mustWait;,0,1);
    }

    private Colour OtherCreaturesColour(Colour me)
    {
        Colour other = Colour.faded;

        mustWait.acquire(); //sem_wait(&mustWait;);
        if(firstCall)
        {
            if(n)
            {
                first = me;
                firstCall = false;
                second_was_here = false;
                mustWait.release(); //sem_post(&mustWait;);
/*3*/           assert( second_was_here, "second must was be here" );
                other = second;
                n--;
            }
            mustWait.release(); //sem_post(&mustWait;);
        } else {
            firstCall = true;
            second = me;
            second_was_here = true;
            other = first;
        }

        return other;
    }
}

than we have assertion at point 3. It is because the first creature doesn't wait the second one, but simple reads garbage from this.second instance variable.

PS. I found a mistake in Scala implementation, so may be D implementation is not 100% correct too.

Regards, Yauheni Akhotnikau

Posted: 04/30/07 10:09:54

Sorry for wasting your time. It is a problem in my algorithm.

eao197 wrote:
bash-3.2$ time ./my_chameneous 40000
creaturesMeet_: 2579283
creaturesMeet_: 2093332
creaturesMeet_: 2178625
creaturesMeet_: 2290491
9141731

real    0m11.546s
user    0m0.031s
sys     0m0.016s

In second case total count of meetings must be 80000, not 9 millions.

I've understood why total count of meeting is so big. It is because on Windows platfom Condition implemented via CriticalSection? and Event. So, when the first creature waits on:

               if( !meetingPlace_.swappedColor_ )
                  {
                    meetingPlace_.swappedColor_ = &other;
/*here*/            meetingPlace_.condition_.wait();
                    meetingPlace_.swappedColor_ = null;
                    --meetingPlace_.remaining_;
                  }

the second one notifies Condition. But on Windows this notification doesn't wake up the first creature's thread imediatelly. That thread tries to reacquire lock again. But here could be threads of other creatures which concurent with the first thread. So another thread could acquire lock and come into:

                else
                  {
                    other = *meetingPlace_.swappedColor_;
                    *meetingPlace_.swappedColor_ = my_;
                    meetingPlace_.condition_.notify();
                  }

(because meetingPlace_.swappedColor_ is not null). And this may lead to big total count of meetings.

I don't know yet how to resolve this situation but I'm working on it.

PS. The same problem exists in C++ and ACE version, which I have written today morning.

Regards, Yauheni Akhotnikau

Posted: 04/30/07 11:21:10

eao197 wrote:

I don't know yet how to resolve this situation but I'm working on it.

PS. The same problem exists in C++ and ACE version, which I have written today morning.

Sorry for disturbing and thank for your patience.

Here is correct implementation (and here is a C++ version). Some benchmarks on my machine:

# D
bash-3.2$ time ./chameneos.exe 1000000
2000000

real    0m5.109s
user    0m0.015s
sys     0m0.031s

# C++
bash-3.2$ time ./test.chameneos.v1.exe 1000000
2000000

real    0m4.281s
user    0m0.031s
sys     0m0.015s

Regards, Yauheni Akhotnikau

Posted: 05/01/07 18:19:07

would be interesting to see how to speed this up considerably ...

Posted: 05/02/07 05:58:43

kris wrote:

would be interesting to see how to speed this up considerably ...

I think this may be difference in DMD and VC++ optimization.

If you interested here are yet two versions of that test:
version with semaphore (semaphore is used to limit count of creatures at the meeting place)
version with two condition variables.

Version with condition variables is 2x faster than version with semaphore on WinXP.

These two implementation, I hope, could be good examples of great D and Tango features:

  • D converts code blocks to delegates;
  • D supports local functions;
  • Thread constructor, which accept delegate;
  • usage of Semaphore, Mutex, Condition, Barrier and ThreadGroup.

Regards, Yauheni Akhotnikau