The Developer's Library for D

Ticket #1734 (closed defect: invalid)

Opened 10 years ago

Last modified 10 years ago

Invalid read of size 4 in tango.io.Console static ctor

Reported by: llucax Assigned to: kris
Priority: normal Milestone: 0.99.9
Component: IO Version: 0.99.8 Sean
Keywords: Cc: llucax@gmail.com

Description

This is a strange, strange bug, so I'll give you a little background. I've written a naive GC that translates each gc_malloc(size) call into a libc malloc(size) call (when no free cell is available), using simple linked lists for the live and free cells.
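For readers who want a concrete picture, a collector of this kind could be sketched in C roughly as follows. This is not the reporter's code: the names cell_t, naive_malloc and naive_free are hypothetical, and the first-fit reuse policy is an assumption. The point it illustrates is that every new cell is requested from malloc() with exactly the size asked for, so Valgrind sees the true block boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cell header: one per allocation, kept on a singly linked list. */
typedef struct cell {
    struct cell *next;
    size_t size;            /* usable bytes that follow the header */
} cell_t;

static cell_t *live_list = NULL;   /* cells currently handed out */
static cell_t *free_list = NULL;   /* cells available for reuse */

/* Reuse the first free cell that is large enough; otherwise call malloc()
 * for exactly `size` bytes (plus the header), with no rounding. */
void *naive_malloc(size_t size)
{
    cell_t **link = &free_list;
    cell_t *c;

    for (c = free_list; c != NULL; link = &c->next, c = c->next) {
        if (c->size >= size) {
            *link = c->next;       /* unlink the cell from the free list */
            break;
        }
    }
    if (c == NULL) {
        c = malloc(sizeof(cell_t) + size);
        if (c == NULL)
            return NULL;
        c->size = size;
    }
    c->next = live_list;           /* push onto the live list */
    live_list = c;
    return c + 1;                  /* user memory starts after the header */
}

/* Move a cell from the live list back to the free list. */
void naive_free(void *ptr)
{
    if (ptr == NULL)
        return;
    cell_t *target = (cell_t *)ptr - 1;
    for (cell_t **link = &live_list; *link != NULL; link = &(*link)->next) {
        if (*link == target) {
            *link = target->next;
            target->next = free_list;
            free_list = target;
            return;
        }
    }
    assert(!"naive_free: pointer is not a live cell");
}
```

With an allocator like this, naive_malloc(45) hands back a block whose 46th byte already belongs to someone else, which is why a stray 4-byte access at offset 44 gets reported.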

When trying to run Dil using my collector, I had a segmentation fault. Using Valgrind to look for suspicious stuff, I found this:

==28804== Invalid read of size 4
==28804==    at 0x80A493A: _D5tango2io7Console12_staticCtor1FZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80A4A4F: _D5tango2io7Console9__modctorFZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB73A: _D6object12_moduleCtor2FC10ModuleInfoAC10ModuleInfoiZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB731: _D6object12_moduleCtor2FC10ModuleInfoAC10ModuleInfoiZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB731: _D6object12_moduleCtor2FC10ModuleInfoAC10ModuleInfoiZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB77E: _D6object12_moduleCtor2FC10ModuleInfoAC10ModuleInfoiZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB5E9: _moduleCtor (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BF3B2: _D2rt6dmain24mainUiPPaZi6runAllMFZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BF2F4: _D2rt6dmain24mainUiPPaZi7tryExecMFDFZvZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BF2A3: main (in /home/luca/tesis/dgcbench/naive/dil)
==28804==  Address 0x426fb54 is 44 bytes inside a block of size 45 alloc'd
==28804==    at 0x402501E: malloc (vg_replace_malloc.c:207)
==28804==    by 0x80C68B4: _D2gc2gc2GC6mallocMFkkZPv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80C288B: gc_malloc (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BDD80: _d_newclass (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80A4923: _D5tango2io7Console12_staticCtor1FZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80A4A4F: _D5tango2io7Console9__modctorFZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB73A: _D6object12_moduleCtor2FC10ModuleInfoAC10ModuleInfoiZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB731: _D6object12_moduleCtor2FC10ModuleInfoAC10ModuleInfoiZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB731: _D6object12_moduleCtor2FC10ModuleInfoAC10ModuleInfoiZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB77E: _D6object12_moduleCtor2FC10ModuleInfoAC10ModuleInfoiZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BB5E9: _moduleCtor (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BF3B2: _D2rt6dmain24mainUiPPaZi6runAllMFZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BF2F4: _D2rt6dmain24mainUiPPaZi7tryExecMFDFZvZv (in /home/luca/tesis/dgcbench/naive/dil)
==28804==    by 0x80BF2A3: main (in /home/luca/tesis/dgcbench/naive/dil)

This is repeated for each new Console.Input and Console.Output created in the tango.io.Console static ctor. If I simply add the 3 missing bytes (by malloc()ing 3 bytes more than what's requested in the gc_malloc() call), the problem goes away and Dil doesn't segfault either.

I think this might be a bug somewhere in tango, or even in DMD. It could have gone unnoticed because the current GC (in both Tango and Phobos) allocates blocks whose sizes are powers of 2: when 45 bytes are requested, 64 are actually reserved. So when the code that triggers the bug reads bytes 46, 47 and 48, there is no problem in the current implementation, because that memory belongs to the block anyway.
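The size arithmetic in the last two paragraphs can be checked with a small C sketch. The helper names pow2_bin and word_padded are hypothetical, and the 16-byte minimum bin is an assumption about the stock GC rather than something stated in this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a request up to the next power of two, the way a size-class
 * allocator would. The 16-byte minimum bin is an assumption. */
size_t pow2_bin(size_t size)
{
    assert(size <= SIZE_MAX / 2);  /* keep the doubling below from overflowing */
    size_t bin = 16;
    while (bin < size)
        bin *= 2;
    return bin;
}

/* Round a request up to a multiple of 4 bytes, which is what padding
 * a 45-byte request with 3 extra bytes amounts to. */
size_t word_padded(size_t size)
{
    return (size + 3) & ~(size_t)3;
}
```

Here pow2_bin(45) is 64 and word_padded(45) is 48, so in both cases a 4-byte access starting at offset 44 stays inside the block, while an exact 45-byte block leaves 3 of those 4 bytes outside it.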

I'm using Tango 0.99.8, DMD 1.042 and Dil 79d66a04f070b4a8a6850f67b8ff7fd2b347f482. DMD 1.042 debug info seems to be completely broken, so I couldn't use GDB with much success; I only managed to see the asm (which I barely understand). Valgrind detects the error at addresses 0x080a493a, 0x080a496a and 0x080a4995 (all the "pushl" instructions):

Dump of assembler code for function _D5tango2io7Console12_staticCtor1FZv:
0x080a4910 <_D5tango2io7Console12_staticCtor1FZv+0>:	push   %ebp
0x080a4911 <_D5tango2io7Console12_staticCtor1FZv+1>:	mov    %esp,%ebp
0x080a4913 <_D5tango2io7Console12_staticCtor1FZv+3>:	push   %eax
0x080a4914 <_D5tango2io7Console12_staticCtor1FZv+4>:	mov    $0x80ff6bc,%eax
0x080a4919 <_D5tango2io7Console12_staticCtor1FZv+9>:	push   %ebx
0x080a491a <_D5tango2io7Console12_staticCtor1FZv+10>:	push   %esi
0x080a491b <_D5tango2io7Console12_staticCtor1FZv+11>:	push   %edi
0x080a491c <_D5tango2io7Console12_staticCtor1FZv+12>:	push   $0x0
0x080a491e <_D5tango2io7Console12_staticCtor1FZv+14>:	push   %eax
0x080a491f <_D5tango2io7Console12_staticCtor1FZv+15>:	call   0x80bdd44 <_d_newclass>
0x080a4924 <_D5tango2io7Console12_staticCtor1FZv+20>:	add    $0x4,%esp
0x080a4927 <_D5tango2io7Console12_staticCtor1FZv+23>:	call   0x80a4884 <_D5tango2io7Console7Console7Conduit5_ctorMFT5tango2io5model8IConduit11ISelectable6HandleZC5tango2io7Console7Console7Conduit>
0x080a492c <_D5tango2io7Console12_staticCtor1FZv+28>:	mov    $0x80ff61c,%ecx
0x080a4931 <_D5tango2io7Console12_staticCtor1FZv+33>:	mov    %eax,%ebx
0x080a4933 <_D5tango2io7Console12_staticCtor1FZv+35>:	push   %ecx
0x080a4934 <_D5tango2io7Console12_staticCtor1FZv+36>:	call   0x80bdd44 <_d_newclass>
0x080a4939 <_D5tango2io7Console12_staticCtor1FZv+41>:	push   %ebx
0x080a493a <_D5tango2io7Console12_staticCtor1FZv+42>:	pushl  0x18(%ebx)
0x080a493d <_D5tango2io7Console12_staticCtor1FZv+45>:	call   0x80a4544 <_D5tango2io7Console7Console5Input5_ctorMFC5tango2io7Console7Console7ConduitbZC5tango2io7Console7Console5Input>
0x080a4942 <_D5tango2io7Console12_staticCtor1FZv+50>:	mov    $0x80ff6bc,%edx
0x080a4947 <_D5tango2io7Console12_staticCtor1FZv+55>:	push   $0x1
0x080a4949 <_D5tango2io7Console12_staticCtor1FZv+57>:	mov    %eax,0x81069e0
0x080a494e <_D5tango2io7Console12_staticCtor1FZv+62>:	push   %edx
0x080a494f <_D5tango2io7Console12_staticCtor1FZv+63>:	call   0x80bdd44 <_d_newclass>
0x080a4954 <_D5tango2io7Console12_staticCtor1FZv+68>:	add    $0x4,%esp
0x080a4957 <_D5tango2io7Console12_staticCtor1FZv+71>:	call   0x80a4884 <_D5tango2io7Console7Console7Conduit5_ctorMFT5tango2io5model8IConduit11ISelectable6HandleZC5tango2io7Console7Console7Conduit>
0x080a495c <_D5tango2io7Console12_staticCtor1FZv+76>:	mov    $0x80ff66c,%esi
0x080a4961 <_D5tango2io7Console12_staticCtor1FZv+81>:	mov    %eax,%ebx
0x080a4963 <_D5tango2io7Console12_staticCtor1FZv+83>:	push   %esi
0x080a4964 <_D5tango2io7Console12_staticCtor1FZv+84>:	call   0x80bdd44 <_d_newclass>
0x080a4969 <_D5tango2io7Console12_staticCtor1FZv+89>:	push   %ebx
0x080a496a <_D5tango2io7Console12_staticCtor1FZv+90>:	pushl  0x18(%ebx)
0x080a496d <_D5tango2io7Console12_staticCtor1FZv+93>:	call   0x80a46f4 <_D5tango2io7Console7Console6Output5_ctorMFC5tango2io7Console7Console7ConduitbZC5tango2io7Console7Console6Output>
0x080a4972 <_D5tango2io7Console12_staticCtor1FZv+98>:	mov    $0x80ff6bc,%edi
0x080a4977 <_D5tango2io7Console12_staticCtor1FZv+103>:	push   $0x2
0x080a4979 <_D5tango2io7Console12_staticCtor1FZv+105>:	mov    %eax,0x81069e4
0x080a497e <_D5tango2io7Console12_staticCtor1FZv+110>:	push   %edi
0x080a497f <_D5tango2io7Console12_staticCtor1FZv+111>:	call   0x80bdd44 <_d_newclass>
0x080a4984 <_D5tango2io7Console12_staticCtor1FZv+116>:	add    $0x4,%esp
0x080a4987 <_D5tango2io7Console12_staticCtor1FZv+119>:	call   0x80a4884 <_D5tango2io7Console7Console7Conduit5_ctorMFT5tango2io5model8IConduit11ISelectable6HandleZC5tango2io7Console7Console7Conduit>
0x080a498c <_D5tango2io7Console12_staticCtor1FZv+124>:	mov    %eax,%ebx
0x080a498e <_D5tango2io7Console12_staticCtor1FZv+126>:	push   %esi
0x080a498f <_D5tango2io7Console12_staticCtor1FZv+127>:	call   0x80bdd44 <_d_newclass>
0x080a4994 <_D5tango2io7Console12_staticCtor1FZv+132>:	push   %ebx
0x080a4995 <_D5tango2io7Console12_staticCtor1FZv+133>:	pushl  0x18(%ebx)
0x080a4998 <_D5tango2io7Console12_staticCtor1FZv+136>:	call   0x80a46f4 <_D5tango2io7Console7Console6Output5_ctorMFC5tango2io7Console7Console7ConduitbZC5tango2io7Console7Console6Output>
0x080a499d <_D5tango2io7Console12_staticCtor1FZv+141>:	mov    %eax,0x81069e8
0x080a49a2 <_D5tango2io7Console12_staticCtor1FZv+146>:	add    $0xc,%esp
0x080a49a5 <_D5tango2io7Console12_staticCtor1FZv+149>:	pop    %edi
0x080a49a6 <_D5tango2io7Console12_staticCtor1FZv+150>:	pop    %esi
0x080a49a7 <_D5tango2io7Console12_staticCtor1FZv+151>:	pop    %ebx
0x080a49a8 <_D5tango2io7Console12_staticCtor1FZv+152>:	mov    %ebp,%esp
0x080a49aa <_D5tango2io7Console12_staticCtor1FZv+154>:	pop    %ebp
0x080a49ab <_D5tango2io7Console12_staticCtor1FZv+155>:	ret    
End of assembler dump.

I'm sorry I can't provide better information, but this is the newest combination of versions that worked for me to build Dil. I hope it helps.

Change History

08/28/09 02:06:26 changed by llucax

I can reproduce the valgrind error with this trivial testcase:

import tango.io.Console;
void main() {}

(using the naive collector, of course)

(follow-up: ↓ 3 ) 08/28/09 14:58:07 changed by larsivi

I don't remember the details, but DMD 1.042 rings a bell ... I don't think it was a particularly good version.

Further I don't really see how Tango can be at fault, and neither do I agree that it is a Tango bug that a non-Tango GC doesn't work with it :)

(in reply to: ↑ 2 ) 08/28/09 16:55:46 changed by llucax

Replying to larsivi:

> I don't remember the details, but DMD 1.042 rings a bell ... I don't think it was a particularly good version.

I'll try 1.041; that's the officially supported version for Tango 0.99.8, right?

> Further I don't really see how Tango can be at fault, and neither do I agree that it is a Tango bug that a non-Tango GC doesn't work with it :)

I don't know why you said that. I don't think this bug report is *that* lame. I didn't say "Hey! My GC doesn't work, fix Tango!". There is more data than that, and it suggests there could be some kind of obscure bug in Tango. I didn't even say that this *is* a Tango bug, I said "I think this might be a bug somewhere in tango, or even in DMD ...". Maybe it's a bug in my GC (yes, I'm not that arrogant, I'm still considering that a very feasible possibility =). Why do I think there is a very good chance it's a Tango bug? Because the only place where Valgrind detects an error is in that static ctor. Dil uses Tango *a lot* AFAIK, and the only error is there. Adding a little extra space in the malloc fixes that one problem.

I'll try to test it with Tango's stub GC; that should tell us if my GC is the one to blame...

08/28/09 18:12:44 changed by llucax

I've tried with the Tango stub GC and the invalid read is still there. I still have to try DMD 1.041...

08/29/09 04:45:35 changed by llucax

Well, I dug into the code a little more and it seems like a compiler bug. The Console.Conduit class is the problematic one; it's 13 bytes long (8 for monitor+vtable, 4 for the Device.handle and 1 for its own redirected boolean). It looks like DMD is reading the boolean as if it were 4 bytes long. Changing the boolean to an int fixes the problem too.
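To make the suspected over-read concrete, here is a C model of the layout described above. It is only a sketch: the struct and field names are hypothetical stand-ins for the D class, and uint32_t is used for the pointer-sized fields to mimic 32-bit x86. The overrun() helper just computes how many bytes an access reaches past the end of a block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit model of the instance layout given in the comment:
 * 8 bytes for vtable + monitor, 4 for the handle, 1 for the boolean. */
typedef struct {
    uint32_t vtbl;        /* pointer to the vtable (4 bytes on x86) */
    uint32_t monitor;     /* monitor pointer */
    uint32_t handle;      /* Device.handle */
    uint8_t  redirected;  /* the 1-byte boolean */
} conduit_model_t;

/* Size without C's tail padding: offset of the last field plus its size. */
enum {
    INSTANCE_SIZE = offsetof(conduit_model_t, redirected) + sizeof(uint8_t)
};

/* Number of bytes by which an access of `width` bytes at `offset`
 * runs past the end of a block of `block_size` bytes (0 if it fits). */
size_t overrun(size_t block_size, size_t offset, size_t width)
{
    size_t end = offset + width;
    return end > block_size ? end - block_size : 0;
}
```

In this model INSTANCE_SIZE is 13, a 1-byte read of the boolean fits, and a 4-byte read at the same offset overruns by 3 bytes. The same arithmetic matches the Valgrind report: a 4-byte read at offset 44 of a 45-byte block overruns by 3, and by 0 once the block is 48 or 64 bytes.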

I tried to make a small testcase but I can't reproduce the error... I'll try to reproduce it in a small test before closing the bug as invalid if it's ok with you...

08/29/09 18:12:47 changed by llucax

With DMD 1.041 I can't reproduce the valgrind error, but Dil still segfaults, so they might be different problems...

09/03/09 19:59:04 changed by llucax

  • status changed from new to closed.
  • resolution set to invalid.

Well, seems like Dil is manually freeing memory prematurely (it does some manual memory management) or my gc_free() function is buggy. Anyway, not a Tango bug apparently, so I'm closing the bug as invalid.

Sorry for the noise.