The Developer's Library for D

Change Time to a struct/eliminate Interval

Moderators: kris

Posted: 09/05/07 21:41:33

One of the strange parts of Tango I've found is core/Type.d, which defines Interval as a floating-point value representing seconds, and Time as an enumeration based on long, representing 100ns increments since the year 0.

First, I don't like Interval being a floating point value. Floating point arithmetic is subject to error, and if you are precisely timing things, you wouldn't want floating point error creeping in. There are cases where you can generate infinite loop errors if you are doing floating point comparisons. I propose changing Interval to a long, defined as the number of ticks in an interval.

Second, Time being a simple enumeration provides too much flexibility. For example, you can convert a Time value t to seconds by doing:

t /= Time.TicksPerSecond;

However, t now does not represent a Time type as specified by the Type.d description, but rather another type that is based on seconds. If a careless coder saw that t is defined as a Time type, and didn't see the conversion to seconds, he wouldn't know that t is no longer a Time type. As another example, if I want t to be based on ticks since 1970, I do:

t -= Time.TicksTo1970;

But again, t no longer represents a Time, since Time is defined as the 100ns ticks since the year 0, not 1970.

These semantic problems could mostly be solved by changing Time to a structure that represents a point in time, and creating a new structure which represents a length of time, maybe called TimeSpan. The internal representation could be somewhat abstract, requiring the user to think more about the meaning of the types. For example, the operator to subtract two times would return a TimeSpan instead of a Time. A TimeSpan could define properties that convert it to various time bases, such as long toSeconds() or long toMilliseconds().
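A minimal sketch of that split, to make the idea concrete. It is written for a present-day D compiler with Phobos's std.stdio rather than the Tango of this thread. Only the names Time, TimeSpan, toSeconds() and toMilliseconds() come from the proposal; the constructor, fromSeconds(), the constants and the tick layout are assumptions made for illustration.

```d
import std.stdio;

/// A length of time, stored as a count of 100ns ticks (assumed layout).
struct TimeSpan
{
    private long ticks;

    enum long TicksPerMillisecond = 10_000;
    enum long TicksPerSecond = 1_000 * TicksPerMillisecond;

    /// Hypothetical factory, used here so callers never touch raw ticks.
    static TimeSpan fromSeconds(long seconds)
    {
        return TimeSpan(seconds * TicksPerSecond);
    }

    long toSeconds() const { return ticks / TicksPerSecond; }
    long toMilliseconds() const { return ticks / TicksPerMillisecond; }
}

/// A point in time, stored as 100ns ticks since the year 0 (assumed layout).
struct Time
{
    private long ticks;

    /// Hypothetical constructor taking raw ticks.
    this(long ticksSinceYear0) { ticks = ticksSinceYear0; }

    /// Time - Time yields a length of time, not another point in time.
    TimeSpan opBinary(string op)(Time rhs) const if (op == "-")
    {
        return TimeSpan(ticks - rhs.ticks);
    }

    /// Time + TimeSpan and Time - TimeSpan yield a shifted point in time.
    Time opBinary(string op)(TimeSpan span) const if (op == "+" || op == "-")
    {
        return Time(mixin("ticks " ~ op ~ " span.ticks"));
    }
}

void main()
{
    auto start = Time(100_000_000);  // 10 s after the epoch
    auto end = Time(255_000_000);    // 25.5 s after the epoch

    TimeSpan elapsed = end - start;
    writeln(elapsed.toSeconds());       // prints 15
    writeln(elapsed.toMilliseconds());  // prints 15500

    Time later = end + TimeSpan.fromSeconds(2);
    writeln((later - start).toMilliseconds());  // prints 17500

    // The operations that were silently accepted on the enum are rejected.
    static assert(!__traits(compiles, start + end));
    static assert(!__traits(compiles, start / 2));
}
```

With this shape, dividing a Time by a ticks-per-second constant or adding two points in time is a compile error, so a variable declared as Time cannot quietly come to hold seconds or a 1970-based offset.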

I think this should be done sooner rather than later, as there are already instances of the above-mentioned erroneous semantics in the Tango tree, and undoubtedly in other users' code. I wouldn't mind updating the Tango tree, but I wouldn't want to do it if this isn't something anyone wants :)

In addition, I'd get rid of Interval, and use the TimeSpan struct in its place.

Using a struct for both Time and TimeSpan would make the storage size of the value types no bigger than the current implementation.

What do people think?


Posted: 09/05/07 22:36:54 -- Modified: 09/06/07 01:05:39 by
kris -- Modified 6 Times

Good points;

Interval is a floating-point value because that is immune to the vagaries of "unit paranoia". In other words, there's no issue about the interval not measuring a sufficiently accurate period, because it's not based on a specific interval such as 1ms or whatever. Instead, it is accurate to the extent of the type used, which is currently float. This could be changed to double without any change to existing code, and one should note that floating-point values become more accurate the closer they become to zero. This is exactly what Interval needs, and it took us a long time to get there ;)

Note that Interval is intended for short durations only, such as timing loops, waiting for some network activity, waiting for a semaphore to become available, or putting a thread to sleep for a while. It is a real-time Interval. Good to keep that in mind?

TimeSpan already exists in Tango as DateTime, and we'd recommend using it. Time, as a type, cannot be used effectively to measure small amounts of real-time, since it is limited to 100ns periods. This may not sound like much of an issue, but give the industry another ten years ...

Regarding Time as an enum: I understand one has to explicitly cast() arithmetic applied to a Time instance? This ought to make it pretty obvious that some manipulation is occurring? Again, we'd recommend using DateTime for pedestrian arithmetic instead ... you give it a Time instance, and manipulate from there. If we do add conversions between Time and say, seconds since the 1970 epoch, those will be couched in terms of integers rather than Time; in order to make the distinction clear.

Anyway, that's how we arrived at where we are with tango.util.time :)

Posted: 09/06/07 13:58:30 -- Modified: 09/06/07 14:10:42 by
schveiguy -- Modified 2 Times

kris wrote:

Interval is a floating-point value because that is immune to the vagaries of "unit paranoia". In other words, there's no issue about the interval not measuring a sufficiently accurate period, because it's not based on a specific interval such as 1ms or whatever. Instead, it is accurate to the extent of the type used, which is currently float. This could be changed to double without any change to existing code, and one should note that floating-point values become more accurate the closer they become to zero. This is exactly what Interval needs, and it took us a long time to get there ;)

I understand this rationale, but these kinds of things can be solved with a struct as well. Need more resolution? Add more bits to the struct. Another way you could solve this is with an exponent field in the struct. The exponent could represent * 10^e. So if you want finer resolution, put a negative number there. One byte should be sufficient for this. Float cannot represent values such as .1 accurately, which was my point. I don't see any "unit paranoia" in other languages, such as Java or C#. I'll also point out that Interval is only used in places where it is first converted to integers/structs to pass to lower-level functions to sleep, which means trying to sleep for a short amount of time is going to be less useful, as you must spend time converting before actually doing the sleep.
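A rough sketch of the exponent-field idea, under assumptions: the struct name ScaledInterval, its two fields and toMicroseconds() are all hypothetical, and the code targets a present-day D compiler with Phobos rather than Tango. The value in seconds is count * 10^exponent, so a quantity like .1 is held exactly as (1, -1) and converted with integer arithmetic only.

```d
import std.stdio;

/// Hypothetical decimal-scaled interval: seconds = count * 10^exponent.
struct ScaledInterval
{
    long count;
    byte exponent;  // one byte, as suggested above; negative means finer units

    /// Convert to whole microseconds using integer arithmetic only.
    long toMicroseconds() const
    {
        int shift = exponent + 6;  // a microsecond is 10^-6 seconds
        long result = count;
        for (; shift > 0; --shift)
            result *= 10;          // coarser than microseconds: scale up
        for (; shift < 0; ++shift)
            result /= 10;          // finer than microseconds: truncate
        return result;
    }
}

void main()
{
    writeln(ScaledInterval(1, -1).toMicroseconds());   // .1 s     -> prints 100000
    writeln(ScaledInterval(5, -3).toMicroseconds());   // 5 ms     -> prints 5000
    writeln(ScaledInterval(7, -3).toMicroseconds());   // 7 ms     -> prints 7000
    writeln(ScaledInterval(15, -7).toMicroseconds());  // 1.5 us   -> prints 1
}
```

The sketch ignores overflow for large counts and exponents, which a real type would have to handle.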

kris wrote:

TimeSpan already exists in Tango as DateTime, and we'd recommend using it.

Hm... I don't really like how one struct represents both points in time and lengths of time. How do I know what mode it is in? It is possible to represent them as separate structures, and I think it makes more sense that way.

kris wrote:

Time, as a type, cannot be used effectively to measure small amounts of real-time, since it is limited to 100ns periods. This may not sound like much of an issue, but give the industry another ten years ...

Ten years is a long time :) We will probably be on to E by then... In all seriousness, 100ns is a pretty small amount of time. For normal applications that don't require real-time accuracy, which is what I see D being most useful for, it is usually milliseconds we are more concerned about. Any real time application is going to know the accuracy that the OS provides, and want to use custom real-time functions which would not be abstracted.

kris wrote:

Regarding Time as an enum: I understand one has to explicitly cast() arithmetic applied to a Time instance? This ought to make it pretty obvious that some manipulation is occurring? Again, we'd recommend using DateTime for pedestrian arithmetic instead ... you give it a Time instance, and manipulate from there. If we do add conversions between Time and say, seconds since the 1970 epoch, those will be couched in terms of integers rather than Time; in order to make the distinction clear.

I think the examples I gave are actually used in Tango, and they do not require casts, so I disagree on that point. Using two structures would ensure that one is using the appropriate type when representing points in time vs. lengths of time.

I was not aware that DateTime was intended to be used for time arithmetic. And it appears from reading the code that it has most of the features I wanted, just not split into two structures. So now the problem is more complicated, and closer to being solved :) To a new developer, how is he to know which time types to use? There are three representations. I say drop tango/core/Type.d, and use DateTime/TimeSpan for everything. C# does something similar and works very well.

Posted: 09/12/07 17:31:03

Does the lack of response here mean:

"Oh yeah, I guess you're right. Sounds good, so go ahead and make those changes"

or

"Damn schveiguy, he is so annoying with his DateTime proposal that I won't even respond and maybe he'll go away"

or

"I'm thinking about it, get back to you in a few weeks"

or

"What? oh yeah, I forgot about that thing."

or

"Hold on, I'm implementing it now, and it will be ready soon"

:)

Posted: 09/12/07 17:36:26

none of the above :)

I'm waiting for comments from the other folks who work on Tango

Posted: 10/02/07 20:51:46 -- Modified: 10/03/07 18:16:05 by
schveiguy

Hm..

It's been a while. And well, I'm bored :) So I implemented this. Sort of. It compiles, but I didn't test any of it.

So how do I go about submitting this for review?

There are a LOT of library files that use Time and Interval, so there would be a lot of changes. However, most of them are pretty minor and straightforward.

I also am unsure as to how the dependencies should work. For example, core/Thread.d and core/Condition.d depended on Time to sleep, but now they depend on util/time/DateTime.d. I don't know if this is ideal, or if this breaks some rules (i.e. files in core cannot depend on other branches of the lib). In this case, I could move TimeSpan into core instead of having it live in util/time/DateTime.d.

***update*** I re-implemented with TimeSpan living in core/Type.d and DateTime remaining in util/time/DateTime.d

In any case, let me know what to do.

-Steve

Posted: 10/05/07 14:31:53

schveiguy wrote:

I understand this rationale, but these kinds of things can be solved with a struct as well. Need more resolution? Add more bits to the struct. Another way you could solve this is with an exponent field in the struct. The exponent could represent * 10^e. So if you want finer resolution, put a negative number there. One byte should be sufficient for this. Float cannot represent values such as .1 accurately, which was my point.

So alias Interval to double? I'm not sure I see the value in simulating floating point in a struct when they are built into the language.

schveiguy wrote:

I don't see any "unit paranoia" in other languages, such as Java, or C#. I'll also point out that Interval is only used in places where it is first converted to integer/structs to pass to lower level functions to sleep, which means trying to sleep for a short amount of time is going to be less useful as you must spend time converting before actually doing the sleep.

True enough. However, conversion would occur with a struct as well, because the resolutions of various operations are different. What I like about using a floating point number for Interval is that it automatically preserves as much relevant resolution as possible.

schveiguy wrote:

Ten years is a long time :) We will probably be on to E by then... In all seriousness, 100ns is a pretty small amount of time. For normal applications that don't require real-time accuracy, which is what I see D being most useful for, it is usually milliseconds we are more concerned about. Any real time application is going to know the accuracy that the OS provides, and want to use custom real-time functions which would not be abstracted.

Perhaps. But it's also possible that the desire for finer-grained resolution will increase over time as well. Ideally, the type used should be able to represent such fine intervals, or at least be invisibly changed to a type that does.

schveiguy wrote:

I was not aware that DateTime was intended to be used for time arithmetic. And it appears from reading the code that it has most of the features I wanted, just not split into two structures. So now the problem is more complicated, and closer to being solved :) To a new developer, how is he to know which time types to use? There are three representations. I say drop tango/core/Type.d, and use DateTime/TimeSpan for everything. C# does something similar and works very well.

Valid point. It may be that there are too many different time types.

Posted: 10/05/07 14:32:40

For something like this, I'd say just submit the new type itself and some examples of how it would be used.

Posted: 10/05/07 16:41:11 -- Modified: 10/09/07 13:40:48 by
schveiguy

sean wrote:

True enough. However, conversion would occur with a struct as well, because the resolutions of various operations are different.

Also true, but first, conversion from a float to an int is more costly than a conversion from a long to an int. Second, when 64-bit OSes get around to it, they may have sleep functions that take a single 64-bit integer for a timeout.

sean wrote:

What I like about using a floating point number for Interval is that it automatically preserves as much relevant resolution as possible.

Float does not preserve accuracy. Here is some actual evidence.

import tango.io.selector.SelectSelector;
import tango.io.selector.AbstractSelector;
import tango.sys.Common;
import tango.io.Stdout;

int main(char[][] args)
{
  SelectSelector s = new SelectSelector;

  timeval tv;
  void printtimeval()
  {
    Stdout.format("tv.tv_sec = {}, tv.tv_usec = {}", tv.tv_sec, tv.tv_usec).newline;
  }
  s.toTimeval(&tv, .000001); // 1 microsecond
  printtimeval;
  s.toTimeval(&tv, .005); // 5 milliseconds
  printtimeval;
  return 0;
}

This outputs:

tv.tv_sec = 0, tv.tv_usec = 0
tv.tv_sec = 0, tv.tv_usec = 4999

So clearly, Interval as a floating point value does not cut it. This will happen even if Interval is a double. The issue is floating point error. For example, double can accurately represent .005, but not .007.

Using a double, and counting from 0 microseconds to int.max microseconds, I found that 1719978142 values were not accurately represented.

Here is the code that proves that:

import tango.io.Stdout;
alias double Interval;

int main(char args[][])
{
  //
  // convert from microseconds to interval in seconds
  //
  Interval toInterval(int x)
  {
    return(cast(Interval)(.000001 * x));
  }

  //
  // convert from interval back to microseconds
  //
  int toUs(Interval x)
  {
    return cast(int)(x * 1000000);
  }

  int bad = 0;

  for(int i = 0; i < int.max; i+= 1)
  {
    //
    // i is in microseconds
    //
    int x = toUs(toInterval(i));
    if(x != i)
      bad++;
  }
  Stdout.format("found {} bad instances out of {}", bad, int.max).newline;
  return 0;
}

sean wrote:

Perhaps. But it's also possible that the desire for finer-grained resolution will increase over time as well. Ideally, the type used should be able to represent such fine intervals, or at least be invisibly changed to a type that does.

I believe my code above speaks for the ability of Interval as a float to handle this, but I wanted to address your concerns about using a struct in this regard: as long as the structure's internals are private, there is no problem with updating for the future. Updating the struct does not break any code. The only place code would be affected is a place that does not use the constants (i.e. ticks/second, etc.) to calculate values. My structure also has a lot of constructors that take ordinary time values to create instances, e.g.:

TimeSpan fromSeconds(long seconds);
TimeSpan fromMilliseconds(long milliseconds);
...

which would be easily portable if we decided to change the internals. As long as code uses these constructs (which I think actually makes the code more readable), then it is safe.
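As a sketch of how such factory functions could hide the representation, assuming a present-day D compiler with Phobos rather than Tango: fromSeconds() and fromMilliseconds() are the names given above, while fromMicroseconds(), toTimeval(), the tick constants and the TimevalLike stand-in for the C timeval struct are hypothetical. The conversion uses integer division only, so the two timeouts from the earlier test come out exact.

```d
import std.stdio;

/// Stand-in for the C timeval struct, declared locally so the sketch
/// needs no platform-specific imports.
struct TimevalLike
{
    long tv_sec;
    long tv_usec;
}

struct TimeSpan
{
    // The unit is an implementation detail (100ns ticks assumed here).
    // Changing it later means changing only these private members.
    private long ticks;
    private enum long TicksPerMicrosecond = 10;
    private enum long TicksPerMillisecond = 1_000 * TicksPerMicrosecond;
    private enum long TicksPerSecond = 1_000 * TicksPerMillisecond;

    static TimeSpan fromSeconds(long seconds)
    {
        return TimeSpan(seconds * TicksPerSecond);
    }

    static TimeSpan fromMilliseconds(long milliseconds)
    {
        return TimeSpan(milliseconds * TicksPerMillisecond);
    }

    /// Hypothetical addition, not listed in the post above.
    static TimeSpan fromMicroseconds(long microseconds)
    {
        return TimeSpan(microseconds * TicksPerMicrosecond);
    }

    /// Split into whole seconds and leftover microseconds, integers only.
    TimevalLike toTimeval() const
    {
        long totalUs = ticks / TicksPerMicrosecond;
        return TimevalLike(totalUs / 1_000_000, totalUs % 1_000_000);
    }
}

void main()
{
    auto a = TimeSpan.fromMicroseconds(1).toTimeval();
    auto b = TimeSpan.fromMilliseconds(5).toTimeval();
    writefln("tv.tv_sec = %s, tv.tv_usec = %s", a.tv_sec, a.tv_usec);  // 0, 1
    writefln("tv.tv_sec = %s, tv.tv_usec = %s", b.tv_sec, b.tv_usec);  // 0, 5000
}
```

Callers that only use the factory functions and conversions never see the tick size, so the internal unit could be refined without touching their code.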

sean wrote:

Valid point. It may be that there are too many different time types.

What I found most interesting with DateTime, which I was told to use for "pedestrian" arithmetic, is that very little of Tango uses DateTime for that purpose. The Time type is mostly used for arithmetic. To an outside developer it looks like DateTime is not really useful for these types of things, which is why I overlooked it.

sean wrote:

For something like this, I'd say just submit the new type itself and some examples of how it would be used.

How do I do that? Is there a way to attach files to posts?

-Steve

Posted: 10/06/07 22:51:05

Not sure. But they can be attached to tickets.

Posted: 10/09/07 15:35:33

Created Ticket #671

Posted: 02/17/08 07:05:49

(this resulted in a revised tango.time package)