GC, Fibers, oh my

Posted: 03/22/07 04:23:13

I am using fibers and coroutines for async network servers. They leak badly. As we know, if the dtor is not run on a Fiber then its stack will not be freed. I have read the documentation on digitalmars.com about instance allocation and destructors. Apparently, class instances created with new are allocated on the GC heap and the dtor may never run. I use new, so I have never seen the dtor run on a Fiber unless I use a delete expression. digitalmars.com gives a list of ways to create instances on the stack but none on the heap. How is it done? I tried hacking the Fiber class to automatically free its stack by adding a freeStack() after termination/switchOut, but that didn't work. What do I do?
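For reference, the delete-expression workaround mentioned above might look roughly like this. It is a minimal sketch in D1/Tango style, not a recommended pattern; `runOnce` is a hypothetical helper name, and it assumes the job always runs to completion in a single call().

```d
import tango.core.Thread;

// Hypothetical helper (not part of Tango): run a one-shot job on a fiber
// and destroy the fiber deterministically instead of waiting for the GC.
void runOnce(void delegate() job) {
    auto f = new Fiber(job);
    f.call();

    // Only destroy a fiber that has finished. A fiber that yielded is
    // still in the HOLD state and needs its stack to resume later.
    if (f.state == Fiber.State.TERM)
        delete f;   // runs the dtor now, which is where the stack is freed
}
```

This only helps when the code that creates the fiber also knows when it is finished; a fiber that is handed around still depends on the GC.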

Also, it seems to me that Fiber.getThis() is unnecessarily private. To implement a future I need the current Fiber, so I can switch to it when the results are ready.

AtDhVaAnNkCsE

Posted: 03/23/07 00:51:23 -- Modified: 03/23/07 00:53:32 by sean

If all references to a fiber are lost, the GC can collect it and will call its dtor when doing so.

I've just made Fiber.getThis() public. Please be aware that it will return null if no Fiber is currently active.
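A future built on the now-public Fiber.getThis() could be sketched as follows. The `Future` class and its `get`/`set` methods are hypothetical names for illustration, not Tango API; the sketch assumes a single waiting fiber and an int result, and it handles the null return explicitly.

```d
import tango.core.Thread;

// Hypothetical single-waiter future (illustration only, not part of Tango).
class Future {
    private Fiber waiter;   // fiber suspended in get(), if any
    private int   value;
    private bool  ready;

    // Called from inside a fiber: suspends it until set() delivers a value.
    int get() {
        if (!ready) {
            waiter = Fiber.getThis();
            if (waiter is null)   // getThis() returns null outside any fiber
                throw new Exception("Future.get() called outside a Fiber");
            Fiber.yield();        // control goes back to whoever call()ed us
        }
        return value;
    }

    // Called by the producer (e.g. the event loop) when the result is ready.
    void set(int v) {
        value = v;
        ready = true;
        if (waiter !is null) {
            auto w = waiter;
            waiter = null;
            w.call();             // switch back into the waiting fiber
        }
    }
}
```

Here set() resumes the waiter directly; an event loop might prefer to queue the fiber and resume it later instead.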

Oh, the reason the fiber stack is retained until destruction is so fibers can be re-used efficiently. I think the code also currently assumes the stack will exist for as long as the object is alive, and it may break if the stack is destroyed prematurely.

Posted: 03/23/07 22:13:44

Thank you! Wouldn't it be better if getThis() raised a FiberException instead of returning null? I would subclass Fiber to get this, but getThis() is static (no pun intended).

Posted: 03/23/07 23:26:39

If I changed it to throw then I couldn't use the function internally :-) But I may change it to do so anyway. The idea crossed my mind when changing its visibility but I wanted to think about it a bit more first.

Posted: 03/25/07 16:32:01

Another question, if reuse is external to the Fiber class, how is reuse of Fibers supposed to work? It seems that at every place you call() a fiber you need to check the state and put it back into the unused fiber set. If the Fiber did this itself then you could localize the reuse to a single place in the code and not worry about it (and reduce the possibility of introducing bugs.) Right now I'm not worrying about it. For each event I just create a new fiber and let it get collected when it terminates and nobody is holding a reference to it.
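One way to localize the reuse without changing Fiber itself is a small wrapper that owns the state check. The following is only a sketch: `Worker` and `FiberPool` are hypothetical names, and it assumes that the Tango revision in use provides Fiber.reset() to rewind a terminated fiber.

```d
import tango.core.Thread;

// Hypothetical wrapper: one reusable fiber plus the job it currently runs.
class Worker {
    private Fiber fiber;
    private void delegate() job;

    this() { fiber = new Fiber(&run); }

    private void run() { job(); }
}

// Hypothetical pool: the only place that checks fiber state and recycles.
class FiberPool {
    private Worker[] idle;

    // Start a job on a recycled worker if possible, else on a new one.
    // Returns null if the job already finished during its first slice.
    Worker start(void delegate() job) {
        Worker w;
        if (idle.length > 0) {
            w = idle[$ - 1];
            idle.length = idle.length - 1;
        } else {
            w = new Worker;
        }
        w.job = job;
        return resume(w) ? w : null;
    }

    // Resume a worker. Returns true while its job is still suspended;
    // once the job terminates, the fiber is reset and shelved for reuse.
    bool resume(Worker w) {
        w.fiber.call();
        if (w.fiber.state != Fiber.State.TERM)
            return true;
        w.job = null;
        w.fiber.reset();   // assumed available: rewinds the fiber to its start
        idle ~= w;
        return false;
    }
}
```

Callers then go through pool.start() and pool.resume() everywhere instead of calling call() directly, so the recycling logic lives in one place.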

I'm really glad that Tango has the Fiber class. Thanks again!

Posted: 03/27/07 17:29:17 -- Modified: 03/27/07 18:04:08 by drox

OK, there are definitely memory issues with Fibers.

1. Fibers leak memory. The destructors are never called and the stacks are never freed. Simply create and call fibers in a tight loop. The old fibers are never collected and the process consumes all memory in very short order. If you instantiate any other object in a tight loop like this, the destructor will get called and the memory footprint will stay very small and constant. Something is keeping a reference to the fiber and it isn't me!!

import tango.core.Thread;

void main() {
  // this will eat all of your memory, though it shouldn't
  while(1) {
    auto n = delegate void() {auto foo = 1.0;};
    auto x = new Fiber(n);
    x.call();
    assert(x.state == Fiber.State.TERM);
  }
}

2. There is an issue with opCmp and Fibers (I guess.) This example will reliably cause a crash in "dgliteral2MFZv" (after eating a lot of memory...) If you just call fun() without using a Fiber there is no problem. In my real program the crash comes inside of opCmp in "_D9invariant12_d_invariantFC6ObjectZv" while calling a member function on one of the operands. I'm using dmd 1.010 (though I was having the same problem with 1.090) and an svn checkout of Tango on linux/ubuntu.

import tango.core.Thread;

class boo {
  double x;
  this(double z) {
    x = z;
  }
  int opCmp(boo o){
    if (x < o.x)
      return -1;
    else if (x > o.x)
      return 1;
    return 0;
  }
}

void main() {
  while(1) {
    auto fun = delegate void() {
      auto x = new boo(1.0);
      auto y = new boo(2.0);

      auto n = x > y;
    };
    auto x = new Fiber(fun);
    x.call();
  }
}

I hope this helps...

Posted: 03/27/07 20:30:35

I just fixed the memory leak problem in commit [1973]. Please let me know if the problem persists.

As for the other one... I'll have to look into it. I don't have any idea what's going on there.

Posted: 03/27/07 21:01:02

wow, thanks for looking into this so quickly! I updated to v1973 but it didn't seem to make a difference. Something still has a reference to the Fiber? I was pretty thorough about making sure my libs were up to date...

Posted: 03/27/07 22:39:47 -- Modified: 03/27/07 22:40:17 by sean

*sigh* I see what's going on. The previous fix did actually correct one problem, but there's another as well. In essence, Fibers register an object describing their stack range in a global list. This object is a struct within the Fiber object (ie. it's not dynamically allocated, for efficiency), and it lives in this list until the Fiber's dtor is called. But since it lives within Fiber, there will always be a valid reference to all Fiber objects and therefore they will never be collected by the GC.
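The pattern can be reproduced without any fiber machinery. The sketch below uses hypothetical stand-ins (`Context`, `allContexts`, `LeakyFiber`) rather than Tango's actual code; it only shows why a pointer into an object, held by a global list, keeps the whole object alive.

```d
// Hypothetical stand-ins for the structures described above; not Tango's code.
struct Context {
    void*    stackBottom;
    void*    stackTop;
    Context* next;
}

Context* allContexts;   // global list head: the GC treats it as a root

class LeakyFiber {
    Context ctx;        // embedded in the object, not separately allocated

    this() {
        // &ctx points into this object's own memory block, so linking it
        // into the global list keeps the whole object reachable.
        ctx.next = allContexts;
        allContexts = &ctx;
    }

    ~this() {
        // The only place the entry is unlinked. The GC runs a dtor only for
        // unreachable objects, so without an explicit delete this never runs.
        Context** p = &allContexts;
        while (*p !is null && *p !is &ctx)
            p = &(*p).next;
        if (*p !is null)
            *p = ctx.next;
    }
}
```

Creating LeakyFiber instances in a loop and dropping every reference to them would grow the list without bound, matching the behaviour reported above.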

I think I can fix this in one of two ways:

* Dynamically allocate the Context struct to break the reference.

* Do some magic with Fiber states so the Context struct is only added to the list on the first Fiber.call(), and is removed when Fiber.call() results in the TERM state.

I'll probably do the latter since it's consistent with how Threads work, but I want to play with the code a bit so expect a commit either tonight or tomorrow.

Posted: 03/27/07 22:43:48

Oh, I should mention that this will increase the cost of starting and restarting Fibers a bit since adding and removing from the list is a synchronized operation. This cost had previously been incurred on thread construction and destruction. Perhaps I'll use the first option after all, since it would not have this problem. Anyway, I know what's going on, I just need to decide on the best way to handle it :-)
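The second option and its cost might look roughly like this. It is a sketch under stated assumptions, not the change that was committed: `registerContext`, `unregisterContext`, `TrackedFiber`, and `listLock` are hypothetical names, and a delegate returning true on completion stands in for the real context switch and the TERM state.

```d
// Hypothetical sketch of option two; all names here are illustrative.
struct Context { Context* next; }

Context* activeContexts;   // only fibers that are started and not yet finished
Object   listLock;         // one shared lock for both list operations

static this() { listLock = new Object; }

void registerContext(Context* c) {
    synchronized (listLock) {   // this lock is the extra cost per (re)start
        c.next = activeContexts;
        activeContexts = c;
    }
}

void unregisterContext(Context* c) {
    synchronized (listLock) {
        Context** p = &activeContexts;
        while (*p !is null && *p !is c)
            p = &(*p).next;
        if (*p !is null)
            *p = c.next;
    }
}

class TrackedFiber {
    private Context ctx;
    private bool listed;
    private bool delegate() step;   // stand-in: returns true once finished

    this(bool delegate() step) { this.step = step; }

    void call() {
        if (!listed) {              // first call since construction or TERM
            registerContext(&ctx);
            listed = true;
        }
        bool finished = step();     // stand-in for switching into the fiber
        if (finished) {             // equivalent of reaching the TERM state
            unregisterContext(&ctx);
            listed = false;
        }
    }
}
```

A fiber that has never been called, or that has terminated, is absent from the list and can be collected; the price is two synchronized list operations per run instead of per object lifetime.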

Posted: 03/28/07 18:47:26

Okay, the Fiber issue should be fixed in commit [1975].

Posted: 03/29/07 16:43:14

Thanks a lot! That seemed to do the trick. I'm still having stack smashing problems but I'm not sure what is causing them. I found out that there are patches for GDB to read D name mangling. My program crashes with a trashed stack when it creates a large number of outgoing sockets. I'm still looking into it.

Posted: 03/30/07 00:58:19

Thanks. If you figure out the opCmp issue or anything else, please let me know. It sounds like you're running GDC?

Posted: 04/02/07 16:15:29

I've been using dmd this whole time but I'm giving gdc a shot. Haven't managed to successfully link yet, everything is a missing symbol.

Posted: 04/06/07 15:39:47

Try again from trunk. Gregor fixed a few GDC link errors the other day.