add detach() to destructors in Socket and DeviceConduit

Posted: 10/22/07 18:18:38

I noticed that there are no calls to close an underlying file handle when an object is collected.

The issue I see is that if you forget to call close before allowing the object to be collected, you will leak file descriptors.

I have already encountered this :) I create a SocketConduit, then try to connect it. If it doesn't connect, I wasn't calling close on the socket conduit because I forgot that close still needs to be called even if the connection wasn't made.

Was there a reason why this was not done?

If there is a concern that underlying handles will be incorrectly closed, then you could have a flag indicating whether to close on delete.

-Steve

Posted: 10/22/07 18:48:43 -- Modified: 10/22/07 18:50:44 by
kris

As a rule, D destructors have minimal value: GC references made from the dtor host may have already been collected, and thus a dtor is invoked within a non-deterministic state (except where an explicit delete is executed).

Because of this, libraries cannot hope to have a consistent cleanup mechanism that relies upon dtor invocation. In the case of Tango, we decided that the only sane recourse was to insist that the client code is responsible for cleanup. I agree this is perhaps not ideal, but the alternatives (with D) are arguably worse.

Posted: 10/22/07 19:42:43

OK, but we are not talking about GC references, we are talking about a file descriptor. I'm pretty sure that will not have been collected :)

It's not a huge deal, but it seems to go against D's philosophy that you usually don't have to do cleanup. Technically, you can still run into trouble if you happily create sockets forever without allowing the GC to run. I was just thinking of this as a best effort to avoid leaking resources.

BTW, is there a way to do cleanup in a sane manner? Can't you simply call destructors on all objects to be destroyed before reclaiming all the memory? Maybe there is something I'm not understanding...

-Steve

Posted: 10/22/07 20:18:40 -- Modified: 10/22/07 20:37:51 by
kris

Perhaps you missed my point about consistency? We don't want to make a special-case for file-handles when other dtors cannot be implemented correctly :)

Please do talk to Walter about the GC issue.

Posted: 10/22/07 21:30:58

kris wrote:

Perhaps you missed my point about consistency?

Nope, I didn't miss it. I understand your protest (well, sort of, I still don't get why things are unfixable due to the language spec, but I'll grant that you have good reason to believe they are).

Perhaps it's your policy that needs nudging. Could you change the policy to only use destructors where resources have been allocated OUTSIDE D? For instance, sockets are created by the socket system call. You can't expect automatic cleanup by the C library, but you also are guaranteed that the D GC is not going to destroy the resource out from under you. Likewise, if you call malloc(), you'd better not expect D to clean up for you, so it makes sense to use a destructor to clean up the memory.

I don't see how this conflicts with D-only object policies. If you insist on a global policy as a protest, that's your call, but I think it would be more beneficial to use the destructors where they can be consistently used. And I think they can be consistently used to destroy resources allocated by C or system calls. If this statement is wrong, I'd appreciate an example of why.

-Steve

Posted: 10/22/07 21:43:12

It's 'wrong' because it expects the user to comprehend a distinction between supportable dtors and non-supportable. How would the user know that a Socket would not require cleanup, but a SocketConduit or something else might require explicit cleanup?

They'd have to look at the code, and the implementation may change over time. There's no clear and consistent means to portray the distinction to a user, where they are supposed to be dealing with an abstraction layer. Heck, it might even be different based upon which OS is used :)

dtors are a loss in D. There's been a shipload of discussion on the NG about this over the past three or four years, and yet nothing has changed. Go figure :)

Posted: 10/22/07 22:51:28

kris wrote:

It's 'wrong' because it expects the user to comprehend a distinction between supportable dtors and non-supportable. How would the user know that a Socket would not require cleanup, but a SocketConduit or something else might require explicit cleanup? They'd have to look at the code, and the implementation may change over time. There's no clear and consistent means to portray the distinction to a user, where they are supposed to be dealing with an abstraction layer. Heck, it might even be different based upon which OS is used :)

Huh? The current library ALREADY expects the user to know that he must specifically clean up a socket! How does a user know? Well the only reason I know is because I have written sockets in C and I know how they have to be cleaned up. In D, it's usually assumed that you DON'T have to do cleanup.

Your argument is actually against the current implementation, as now the user is expected to know what objects they should clean up and what objects they shouldn't, and this is based solely on the implementation in the OS which by your argument they should not have to care about. How come I have to clean up a Socket object, but I don't have to clean up a Layout object?

kris wrote:

dtors are a loss in D. There's been a shipload of discussion on the NG about this over the past three or four years, and yet nothing has changed. Go figure :)

I'll have to search the NG to find out what you are talking about.

-Steve

Posted: 10/22/07 23:35:14

schveiguy wrote:
kris wrote:

It's 'wrong' because it expects the user to comprehend a distinction between supportable dtors and non-supportable. How would the user know that a Socket would not require cleanup, but a SocketConduit or something else might require explicit cleanup? They'd have to look at the code, and the implementation may change over time. There's no clear and consistent means to portray the distinction to a user, where they are supposed to be dealing with an abstraction layer. Heck, it might even be different based upon which OS is used :)

Huh? The current library ALREADY expects the user to know that he must specifically clean up a socket! How does a user know? Well the only reason I know is because I have written sockets in C and I know how they have to be cleaned up. In D, it's usually assumed that you DON'T have to do cleanup.

Your argument is actually against the current implementation, as now the user is expected to know what objects they should clean up and what objects they shouldn't, and this is based solely on the implementation in the OS which by your argument they should not have to care about. How come I have to clean up a Socket object, but I don't have to clean up a Layout object?

I suppose the latter is true enough yet, unfortunately, this is a limitation of D today. That is, consistent cleanup is simply not supported through D in a manner we can fully utilize.

In lieu of that, we had to draw the line somewhere: we don't clean up any resources. Call it a cop-out, but that's how other major libraries operate also (witness the Eclipse libraries for example, or all of Win32 for that matter. Does C# do it differently?). Basically, our options are limited here. D provides little more than Java does in this regard (the issues with finalizers have all been inherited by D). And let's not even discuss the various implications of 'timely' cleanup WRT lazy collection.

Worst case, anything with a close() method needs to be cleaned up by the user. Other libraries use dispose() for similar purposes. Where D does provide some support is through scope/auto classes (regarding RAII). Unfortunately, that notion (as implemented in D) has shown limited practical application.
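As a rough illustration of the scope-based RAII mentioned above, here is a minimal sketch. It assumes present-day D syntax rather than the compiler of the time, and `Guard` and `released` are hypothetical names; the counter merely stands in for a real OS handle.

```d
// Hypothetical example: Guard stands in for any resource holder.
class Guard
{
    static int released;          // counts destructor runs, for demonstration only

    ~this() { ++released; }       // a real class would release its OS handle here
}

void useGuard()
{
    scope g = new Guard;          // scope instance: destroyed when this block exits
    // ... work with g ...
}

void main()
{
    useGuard();
    // Cleanup ran when useGuard() returned, without waiting for a collection.
    assert(Guard.released == 1);
}
```

The cleanup is deterministic only because the instance never escapes the block, which is also why the approach does not help for objects with longer or shared lifetimes.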

I wish D was a whole lot better in this regard, and have discussed it fervently in the past *shrug*

Posted: 10/23/07 01:59:03

I looked at the C# documentation. It says that you need to call close explicitly, so the precedent favors the current behavior. That said, precedent doesn't always equal right. Still, I can accept that automatic cleanup is not the right thing to do more readily than I can accept that destructors are useless.

I see cases where it would be nice not to worry about closing things like files or sockets, but the truth is that you cannot count on the GC destroying those resources in a timely fashion, meaning you could still crash the program, most likely in a non-deterministic way.

One thing that C# does have for automatically factory-created resources such as remoted classes is the notion of lifetimes. This works by assigning a sponsor for a resource. As long as the sponsor keeps renewing the lifetime of the object, the resource is kept. The lifetime object describes the sponsor and the time before deletion. Every so often, a system thread asks the sponsor if it should keep the resource around, destroying it if it's no longer needed. This would result in more timely deletion of things like sockets.

So here are a couple of further points to discuss:

  1. C# provides an IDisposable interface, which signifies that any implementing class has non-GC resources that must be disposed of manually. It would be nice to have this interface for both Socket and DeviceConduit, so it is clearer that the objects have non-GC resources.
  2. Given that all non-GC resources could implement a specific interface, it would be cool to have a way to auto-destroy system resources in a timely fashion with something like lifetimes and an auto-destroy thread.

In any case, at the very least, some documentation that specifically states that you will leak resources if you do NOT call detach would be a good thing to have. This should probably be in DeviceConduit and in Socket. What I'm thinking is something like:

Note: if you do not call detach() the resources will NOT automatically be released.

This at least sends a very clear message that detach is not an optional method.

-Steve

Posted: 10/23/07 03:54:46 -- Modified: 10/23/07 03:59:29 by
kris

Another potential option would be for the compiler to provide some assistance/warning where it notices that something marked in a specific manner has gone out of scope without some appropriate action being taken.

The runtime 'equivalent' is to emit a warning (or exception) when a dtor is invoked and a relevant explicit close/dispose has not already been issued. The latter doesn't catch all cases, but at least indicates some leakage rather than being silent about them all (the Tango GC was modified to assist in this manner).
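To make the runtime check concrete, here is a minimal sketch of a destructor that reports a missing close. It is not the mechanism added to the Tango GC (the thread does not show that); `TrackedConduit` is a hypothetical class, and the code assumes present-day D. The destructor reads only the object's own flag and uses C stdio, so it neither follows GC references nor allocates.

```d
import core.stdc.stdio : fprintf, stderr;

// Hypothetical resource holder that remembers whether close() was called.
class TrackedConduit
{
    private bool closed_;

    void close()
    {
        // A real implementation would release the OS handle here.
        closed_ = true;
    }

    bool closed() const { return closed_; }

    ~this()
    {
        // Report the leak instead of staying silent about it.
        if (!closed_)
            fprintf(stderr, "warning: TrackedConduit finalized without close()\n");
    }
}

void main()
{
    auto ok = new TrackedConduit;
    ok.close();
    destroy(ok);        // closed first, so the destructor stays silent

    auto leaked = new TrackedConduit;
    destroy(leaked);    // never closed, so the destructor writes the warning
}
```

As noted above, this only catches objects whose destructor actually runs, so it signals some leaks rather than all of them.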

We'd considered using a signature interface in the past (for resource holders), but didn't come up with a truly compelling way to take advantage of that. The lifetime notion from C# is an interesting one to contemplate for a while too, though it initially seems more suited to addressing distributed garbage-collection issues?

Posted: 10/23/07 13:10:52

One more possible idea I came up with last night.

The problem I see with automatic destruction of resources is that you are not always sure that a resource isn't used elsewhere. In the case of Linux, the resource is usually an integer, not a pointer, so closing the resource in one object may leave another object useless. So in that sense, it is best left up to the coder as he knows precisely when his resource is no longer needed.

But if you made the resource a D allocated class, then the garbage collector will only destroy it when specifically asked to (via delete) or when no objects are using it. What I'm thinking is a small wrapper class that basically holds an OS handle. Once the OS handle is set, it cannot be changed. For example, a socket would be something like:

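// Sketch: Handle is assumed to be the OS handle type (an int file descriptor
// on Linux) and close() the matching OS call.
class SocketResource
{
   private Handle handle_;                 // set once in the constructor, never reassigned
   Handle handle() { return handle_; }     // read-only access for passing to OS calls
   this(Handle h) { handle_ = h; }
   // Runs on explicit delete or on collection; touches only this object's own field.
   ~this() { if (handle_ != -1) close(handle_); }
}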

Now, objects that use the resource would keep a SocketResource instead of a handle. Then the garbage collector would only auto-destroy the handle if nothing else is using it. You would only be allowed to use the handle by passing it to any OS system call except close (not enforceable, but reasonable to ask of a user). You would also not be allowed to store the Handle anywhere outside the SocketResource. If you specifically want to close the handle, you delete the SocketResource. That has consequences if multiple objects have references to the SocketResource, but it is no different than the current situation, except now the resource will hopefully be automatically closed if all references go away.
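A self-contained sketch of how two objects might share such a wrapper follows. It assumes present-day D on a POSIX system; `HandleResource` and `Reader` are hypothetical names, and a pipe stands in for a socket so the example needs no network.

```d
import core.sys.posix.unistd : close, pipe;

alias Handle = int;   // assumption: a POSIX file descriptor

// Owns exactly one OS handle; the handle cannot be changed after construction.
final class HandleResource
{
    private Handle handle_;

    this(Handle h) { handle_ = h; }

    Handle handle() const { return handle_; }

    // Runs on explicit destroy() or when the GC finalizes the object.
    ~this()
    {
        if (handle_ != -1)
        {
            close(handle_);
            handle_ = -1;
        }
    }
}

// Users keep a reference to the wrapper, never a copy of the raw handle.
final class Reader
{
    private HandleResource res_;

    this(HandleResource r) { res_ = r; }

    Handle fd() const { return res_.handle; }
}

void main()
{
    int[2] fds;
    if (pipe(fds) != 0)
        return;

    auto res = new HandleResource(fds[0]);
    auto a = new Reader(res);
    auto b = new Reader(res);
    assert(a.fd == b.fd);   // both readers see the same descriptor

    destroy(res);           // explicit close; a and b must not be used afterwards
    close(fds[1]);
}
```

The explicit `destroy` shows the consequence described above: every object still holding the wrapper is left with an unusable handle, exactly as with a raw descriptor today.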

Just thinking out loud.

-Steve

Posted: 10/24/07 14:03:23

One more point. I have seen classes in tango that implement the destructor to close OS resources...

-Steve

Posted: 10/31/07 17:18:15

I'll push this one more time: Here is the list of objects that ALREADY have this behavior in Tango:

I now believe that it is possible with the current GC implementation to have the notion that system-generated resources are cleaned up by destructors. The only restriction is that you MUST have a direct link to the resource; you cannot link through a GC-allocated container. For example, if you need an array of file descriptors, you must allocate the array not using the GC, or allocate an array of wrapper classes that will automatically destroy the resource when destroyed.
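One way to read the "direct link" restriction is sketched below, assuming present-day D on a POSIX system. `PipePair` is a hypothetical class. Its descriptors live in a fixed-size array stored inside the object itself, so the destructor never reaches into a separately GC-allocated block that might already have been collected.

```d
import core.sys.posix.unistd : close, pipe;

final class PipePair
{
    // A fixed-size array is part of the object's own memory, unlike a
    // dynamic array (int[]), which would point at another GC block.
    private int[2] fds_ = [-1, -1];

    this()
    {
        if (pipe(fds_) != 0)
            throw new Exception("pipe() failed");
    }

    int readEnd() const { return fds_[0]; }
    int writeEnd() const { return fds_[1]; }

    ~this()
    {
        // Only this object's own fields are touched here.
        foreach (ref fd; fds_)
        {
            if (fd != -1)
            {
                close(fd);
                fd = -1;
            }
        }
    }
}

void main()
{
    auto p = new PipePair;
    immutable r = p.readEnd;
    destroy(p);
    // The descriptor was already closed by the destructor, so a second
    // close fails (nothing else opened a descriptor in between).
    assert(close(r) == -1);
}
```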

-Steve