Download Reference Manual
The Developer's Library for D
About Wiki Forums Source Search Contact

Debug library makes app really sluggish

Moderators: kris

Posted: 05/02/09 21:08:33

I'm having some trouble when linking with the Tango debug lib (tango-base-dmd-d.lib on Windows). My DWT app runs incredibly slow. GUI redrawing is very slow, loading an XML file with the Tango SAXParser seems to take hundreds of times longer than normal, etc.

Originally I thought this was related to issue #1574. But I just tried with r4570, and the assert exception is gone, but the sluggishness is still there. So I guess it was something else all along.

I've profiled just the XML loading part of my app separately, and it seems that UnicodeData?'s static ctor takes about a second to finish. That's the only slowdown I could find, but there's another second that's not accounted for. But when checking the full app, I can't find any reference to UnicodeData? in the executable, nor in the profiler output. But maybe this issue can point to a general problem with Tango's debug build.

Just profiling my app isn't easy, since it's multithreaded.

Anyone else having the same issues?

Author Message

Posted: 05/02/09 22:33:25

with debug builds, I've noticed DMD does not elide any code ... it there's an import to a module parsed anywhere, all that code winds up in the executable. This is not the case with non-debug.

Don't know how you'd wind up with UnicodeData though, and it does sound wild that debug is causing such a slowdown. Perhaps the GC is doing some crazy scanning?

Posted: 06/12/09 15:43:44

I recently upgraded from 0.99.7 to 0.99.8, had the same problem, and came here to post a thread about it, then saw this one. It is a regression from 0.99.7. The startup time of the program I am working on changed from <1 second to ~10 seconds. I don't want to, but I'll go back to 0.99.7 if this can't be fixed soon.

I've made a short sample that shows the problem to a lesser extent. I'm using the DMD + Tango bundle. The slowdown only occurs when the program is compiled with -g. I think I'm going to need debug symbols...

import tango.time.StopWatch;

char[] toRomanNumerals(int num) {
	auto combos = [new int[0], [0], [0,0], [0,0,0], [0,1],
	[1], [1,0], [1,0,0], [1,0,0,0], [0,2]];
	auto letters = ['I', 'V', 'X', 'L', 'C', 'D', 'M'];
	char[] str = "";
	return str;

void main() {
	StopWatch watch;
	for(int i = 0; i < 4000; ++i)
		assert(toRomanNumerals(i) != "...");
	Stdout.format("Done, took {} seconds", watch.stop).newline;
 >d:\dev\dmd\bin\dmd slow_test.d
Done, took 0.00 seconds

 >d:\dev\dmd\bin\dmd -g slow_test.d
Done, took 4.03 seconds

That is slow!

Can anyone reproduce this or know what is causing it?

Thanks for any help!

Posted: 06/12/09 20:48:43

Do you have -debuglib set to tango-base-dmd-d.lib (not tango-base-dmd.lib) in sc.ini?

Posted: 06/13/09 05:04:38

torhu wrote:

Do you have -debuglib set to tango-base-dmd-d.lib (not tango-base-dmd.lib) in sc.ini?

Oh, that is the difference between 0.99.7 and 0.99.8. I've always just unpacked the zip file and not looked at sc.ini. But now that I compare them:

Tango 0.99.8
DFLAGS="-I%@P%\..\import" -version=Tango -defaultlib=tango-base-dmd.lib -debuglib=tango-base-dmd-d.lib -L+tango-user-dmd.lib

Tango 0.99.7
DFLAGS="-I%@P%\..\import" -version=Tango -defaultlib=tango-base-dmd.lib -debuglib=tango-base-dmd.lib -L+tango-user-dmd.lib

Thanks, now I know how to get back to the old behavior without switching back to 0.99.7. I suppose it isn't really a regression, but I still wish the debug lib wasn't so slow.

Posted: 06/13/09 11:31:34

It is possible that it is a regression (or a previously unfound bug) in the compiler since it really shouldn't be something in Tango causing such a slow down. However, it is well known that the difference between -release and non-release code can be huge (I've seen -release be 3 times faster in certain benchmarks) and this difference will account for at least some of the slow down in a debug library.

Posted: 06/13/09 16:46:48 -- Modified: 06/13/09 16:49:38 by

Larsivi, it could just be that noone ever used the debug build of tango-base before the 0.99.8 release.

Jordan, would be interesting to see what you get if you run your example with profiling enabled.

Posted: 06/13/09 19:20:33

the example (above) is doing a lot of heap activity, so it could be that the GC debug-version has all kinds of internal checks and such?

Posted: 06/14/09 19:23:30

I know the debug version will always be slower, but I think this is too much. I mentioned that the exe takes about 10 seconds to open. When I saw this, I compiled with -profile and found out that toRomanNumerals(int) took 9.4 seconds of the ~13 second startup time (with profiling). Another function, parseRomanNumerals(char[]) is not slower--and doesn't use the heap. The full toRomanNumerals() has string concatenation too. Profiling the full functions:

                  non-debug  debug  (microseconds)
toRomanNumerals()         9   2300
parseRomanNumerals()     13     13

I just wrote toRomanNumerals() for fun and haven't used it. It affected startup time because of this unit test:

for(int i = 0; i < 4000; ++i)
    assert(toRomanNumerals(i).parseRomanNumerals() == i);

torhu, to see what is taking the time in Tango, I suppose I'd have to compile Tango with -profile? I'm not sure how to compile it, but I probably should figure it out. Here is just the example: traces.txt

Posted: 06/14/09 20:15:09

Kris can probably tell you how to build tango-base for profiling, I've never tried that. Sounds plausible that it could be the GC that causes the slowdown, though.

If you allocate the arrays only once, you should see a huge speedup:

char[] toRomanNumerals(int num) {
	static int[][] combos = [new int[0], [0], [0,0], [0,0,0], [0,1],
	[1], [1,0], [1,0,0], [1,0,0,0], [0,2]];
	static char[] letters = ['I', 'V', 'X', 'L', 'C', 'D', 'M'];
	char[] str = "";
	return str;

Posted: 06/14/09 21:36:45

Yes, that is a good change to make. (However, new int[0] is not a constant, so I had to change it to [0][0..0]) I now have

static combos = [[0][0..0], [0], [0,0], [0,0,0], [0,1],
[1], [1,0], [1,0,0], [1,0,0,0], [0,2]];
static letters = "IVXLCDM";

The example is instant (basically an empty function now) and the full function is three times as fast (800 vs 2300). I think Kris is right about the GC debug version. I might use the debug lib more as an incentive to reduce heap allocations. :)