First impressions

 
Forum Index -> Mago Debugger
sagitario



Joined: 03 Mar 2007
Posts: 292

Posted: Tue Aug 24, 2010 1:47 pm    Post subject: First impressions

Hi,

This is really an awesome project! A lot of the quirks that you run into with cv2pdb can be solved with a dedicated debugger. And much more... :)

On the newsgroup I referred to some issues with the current version. I'll list them here:

- function arguments are shown incorrectly in the locals window while the instruction pointer is still on the function declaration
- function names on the stack are not demangled (hint: I've updated the demangle function in cv2pdb recently)
- class references are shown as pointers (I know, dmd outputs this, but I'd remove one indirection)
- single-stepping out of a function switches to disassembly (only with step-into, not with step-over)
- local function's variables are not shown in the locals window. There is no debug info for variables of the outer scope, but vars declared in the function itself should be shown.
- long variable names (>255 chars) cause an empty locals list. Unfortunately there are 3 different ways dmd emits these symbols (compressed, cut-off with SHA value string and special length encoding)

Some features I would miss a lot when switching to MaGo, but that are not on your feature list:
- display associative arrays
- support for the "Auto" window

I hope I can also help with some of these issues.

Rainer
aldon



Joined: 08 Aug 2010
Posts: 5
Location: Washington, USA

Posted: Thu Aug 26, 2010 2:15 am

Quote:
On the newsgroup I referred to some issues with the current version. I'll list them here:
- function arguments are shown incorrectly in the locals window while the instruction pointer is still on the function declaration

I suspect that without special case code, it might be a limitation of the debug format. But, it's something I should check out.

Quote:
- function names on the stack are not demangled (hint: I've updated the demangle function in cv2pdb recently)

I want to ask for a change in DMD in order to fix this. I don't see a reason why function and variable names should be stored mangled in the debug info. Looking at PDB, I see that only "public" symbols are mangled, not local or global symbols. Storing them unmangled also lets me keep using the symbol hash table that the linker computes, instead of reading all symbols and then making a table.

Quote:
- class references are shown as pointers (I know, dmd outputs this, but I'd remove one indirection)

I agree. This is something to fix.

Quote:
- single-stepping out of a function switches to disassembly (only with step-into, not with step-over)

I remember seeing some weirdness like that caused by a problem with the debug info for some functions with respect to where the function ends. I'll track this down again.

Quote:
- local function's variables are not shown in the locals window. There is no debug info for variables of the outer scope, but vars declared in the function itself should be shown.

From what you've been able to see, are they ever shown? Does the next item about variable name lengths imply that sometimes variables are listed in "Locals"? The fix for WideCharToMultiByte and breakpoint binding on XP didn't also fix this? I wasn't sure if they were related.

Quote:
- long variable names (>255 chars) cause an empty locals list. Unfortunately there are 3 different ways dmd emits these symbols (compressed, cut-off with SHA value string and special length encoding)

Hm, I didn't know that. I thought they were all prefixed UTF-8 strings. Where can I find a description of those formats?

Quote:
Some features I would miss a lot when switching to MaGo, but that are not on your feature list:
- display associative arrays

I'll have to look into how to do this. Does the MS C++ debug engine handle this with cv2pdb converted files?

Quote:
- support for the "Auto" window

I found out recently that this is implemented as a cooperation of the language service and debugger. The way I understand it, the language service would parse the source file at the current line looking for variable references, then it would hand them off to the debugger to be evaluated.
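To make the language-service half of that concrete, here is a minimal sketch of how candidate expressions might be collected around the current line. It is only an illustration under stated assumptions: `proximity_expressions`, `D_KEYWORDS` and the regex are hypothetical names, the keyword set is a tiny subset, and a real language service would use its parser rather than a regular expression.

```python
import re

# Hypothetical, deliberately incomplete keyword list (assumption for the sketch).
D_KEYWORDS = {"int", "return", "if", "else", "while", "for", "auto", "void"}

# An identifier, optionally followed by dotted member accesses (e.g. "p.y").
IDENT = re.compile(r"\b[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*")

def proximity_expressions(source_lines, current_line, context=1):
    """Collect variable-like names on current_line (0-based) and the
    `context` lines before it, in order of first appearance."""
    first = max(0, current_line - context)
    found = []
    for line in source_lines[first:current_line + 1]:
        code = line.split("//", 1)[0]  # naive: drop line comments
        for match in IDENT.finditer(code):
            name = match.group(0)
            # Skip keywords, duplicates, and names used as function calls.
            if name in D_KEYWORDS or name in found:
                continue
            if code[match.end():].lstrip().startswith("("):
                continue
            found.append(name)
    return found
```

The resulting list would then be handed to the debugger, which evaluates each entry like any other watch expression.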

I'll add a few more things that are not quite right.

- Enum expressions like "Enum.A" are not evaluated. The problem is that the names of the enum members are stored in the debug info fully qualified, as in "mod.Enum.A", instead of only the name ("A"). I need to file a bug report for DMD about this.

- Sometimes, stepping through a function will jump to unexpected places, or breakpoints will bind to unexpected lines. You'll likely see this in functions that have nested functions. It has to do with how the line mapping info is stored. The CV 4.10 spec says that the address-to-line table has to be sorted by address, but I'm doing a binary search by line number. Now, I could change it to a linear search, but I'll need to test the performance. If it's not satisfactory, then I might need to ask for a change in DMD to split the line mapping tables so that addresses AND lines increase in any given table.
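A small sketch of the difference, assuming the table is a list of (address, line) pairs sorted by address only. The function names and the sample data are made up for illustration and are not taken from Mago or from real DMD output.

```python
import bisect

def address_to_line(table, address):
    """Binary search is safe here because the table is sorted by address."""
    addresses = [addr for addr, _ in table]
    i = bisect.bisect_right(addresses, address) - 1
    return table[i][1] if i >= 0 else None

def line_to_address(table, line):
    """Line numbers are NOT sorted, so scan every entry.
    Returns (line, address) for the closest line >= `line`,
    preferring the lowest address, or None if there is none."""
    best = None
    for addr, ln in table:
        if ln < line:
            continue
        if best is None or (ln, addr) < best:
            best = (ln, addr)
    return best

# Invented table: the nested function's code (lines 3 and 5) sits at
# higher addresses than the outer function's lines 1, 7 and 8.
table = [(0x1000, 1), (0x1008, 7), (0x1010, 8), (0x1020, 3), (0x1028, 5)]
```

With this data, a binary search by line number can land in the wrong half of the table, while the linear scan binds line 5 to 0x1028 and line 4 to the next line that has code.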

Your help is appreciated with any issue. You've given awesome feedback.
By the way, I checked in the fixes for the UTF-8 add-in and WideCharToMultiByte.
sagitario



Joined: 03 Mar 2007
Posts: 292

Posted: Fri Aug 27, 2010 2:30 am

aldon wrote:
Quote:

- function arguments are shown incorrectly in the locals window while the instruction pointer is still on the function declaration

I suspect that without special case code, it might be a limitation of the debug format. But, it's something I should check out.

If debugging with cv2pdb, it works correctly. I guess the debug engine detects that the instruction pointer is still before the enter statement and adjusts the stack frame accordingly to display variables.
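A rough sketch of that kind of adjustment, assuming 32-bit x86, a prologue consisting of a single frame-setup instruction (`enter`), nothing before it that changes ESP, and a variable that lives on the stack at a known EBP-relative offset. All names are hypothetical, and parameters passed in registers would need separate handling.

```python
SAVED_EBP_SIZE = 4  # bytes pushed for the caller's EBP on 32-bit x86

def stack_variable_address(ebp_offset, eip, frame_setup_addr, esp, ebp):
    """Return the address of a variable described as [EBP + ebp_offset]."""
    if eip <= frame_setup_addr:
        # The frame is not set up yet, so EBP still belongs to the caller.
        # Once the frame-setup instruction has run, EBP will equal the
        # current ESP minus the saved EBP, so use that value instead.
        virtual_ebp = esp - SAVED_EBP_SIZE
        return virtual_ebp + ebp_offset
    return ebp + ebp_offset
```

For a stack argument at [EBP+8], this gives ESP+4 on function entry, which is the slot just above the return address.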

Quote:
Quote:
- function names on the stack are not demangled (hint: I've updated the demangle function in cv2pdb recently)

I want to ask for a change in DMD in order to fix this. I don't see a reason why function and variable names should be stored mangled in the debug info.

I wouldn't expect this to change. Some issues involved:
- DMD does not have a demangler itself (but it should to keep it in step with language changes)
- identifier length might explode, but the debug format and the linker often can only handle strings of length < 256 (see below)

Quote:
Looking at PDB, I see that only "public" symbols are mangled, not local or global symbols.

I think global symbols are always mangled, but not local variables and class/enum type names in the type info.

Quote:
Quote:
- single-stepping out of a function switches to disassembly (only with step-into, not with step-over)

I remember seeing some weirdness like that caused by a problem with the debug info for some functions with respect to where the function ends. I'll track this down again.


I think cv2pdb fixes an issue where the last function in a module does not have correct source line info for its last code block (the info ended too early). With MaGo it also happened on functions in the middle of the source file, but maybe this is related.

Quote:
Quote:
- local function's variables are not shown in the locals window. There is no debug info for variables of the outer scope, but vars declared in the function itself should be shown.

From what you've been able to see, are they ever shown? Does the next item about variable name lengths imply that sometimes variables are listed in "Locals"? The fix for WideCharToMultiByte and breakpoint binding on XP didn't also fix this? I wasn't sure if they were related.


The issues look a bit similar, because you see the list of variables with two names and values displayed. But the fix does not change it. Here's an example:
Code:

int global_func(int x)
{
   int local_func(int y)
   {
      return x * y;
   }
   int a = local_func(3);
   return a;
}

Stepping to the return statement in local_func, you see two empty entries in the local window, one should be "y", the other "this", the context pointer for the outer scope.

Quote:
Quote:
- long variable names (>255 chars) cause an empty locals list. Unfortunately there are 3 different ways dmd emits these symbols (compressed, cut-off with SHA value string and special length encoding)

Hm, I didn't know that. I thought they were all prefixed UTF-8 strings. Where can I find a description of those formats?

No utf8, no description. You can find a decoder here: http://www.dsource.org/projects/cv2pdb/browser/trunk/src/symutil.cpp
starting from pstrcpy_v().

Quote:
Quote:
Some features I would miss a lot when switching to MaGo, but that are not on your feature list:
- display associative arrays

I'll have to look into how to do this. Does the MS C++ debug engine handle this with cv2pdb converted files?

cv2pdb generates type information for the structs used by the associative array implementation and has some macro magic in autoexp.dat to display it conveniently. Unfortunately this can be quite unstable with uninitialized variables, causing the debugger to freeze. Also, it is rather slow for large arrays.

Quote:
Quote:
- support for the "Auto" window

I found out recently that this is implemented as a cooperation of the language service and debugger. The way I understand it, the language service would parse the source file at the current line looking for variable references, then it would hand them off to the debugger to be evaluated.


Visual D supports the IVsLanguageDebugInfo interface, including GetProximityExpressions. I guess the debugger needs to call this somehow...

Quote:

- Enum expressions like "Enum.A" are not evaluated. The problem is that the names of the enum members are stored in the debug info fully qualified, as in "mod.Enum.A", instead of only the name ("A"). I need to file a bug report for DMD about this.

The debugger might choose to chop off the qualification. But the variable type currently isn't correct anyway. See also http://d.puremagic.com/issues/show_bug.cgi?id=4372
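Chopping off the qualification could look roughly like this. It is a sketch only: `member_short_name` and `find_enum_member` are hypothetical helpers, and the member table is assumed to map the stored (fully qualified) name to the member's value.

```python
def member_short_name(stored_name):
    """'mod.Enum.A' -> 'A'; an unqualified name is returned unchanged."""
    return stored_name.rsplit(".", 1)[-1]

def find_enum_member(members, wanted):
    """Look up `wanted` (e.g. 'A') among members keyed by stored name.
    Returns the member's value, or None if there is no such member."""
    for stored_name, value in members.items():
        if member_short_name(stored_name) == wanted:
            return value
    return None

# Invented example of what the debug info for an enum might contain.
members = {"mod.Enum.A": 0, "mod.Enum.B": 1}
```

Evaluating "Enum.A" would then resolve the type `Enum` first and search only that type's members by short name.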

Quote:

- Sometimes, stepping through a function will jump to unexpected places, or breakpoints will bind to unexpected lines. You'll likely see this in functions that have nested functions. It has to do with how the line mapping info is stored. The CV 4.10 spec says that the address-to-line table has to be sorted by address, but I'm doing a binary search by line number. Now, I could change it to a linear search, but I'll need to test the performance. If it's not satisfactory, then I might need to ask for a change in DMD to split the line mapping tables so that addresses AND lines increase in any given table.


I did not experience these issues with cv2pdb, but I've seen stepping through templates sometimes is off by 1 or 2 lines.
I think searching linearly for the line number would not be too bad, since you don't have to do it very often (once per stack frame, searching the file table segments?).

Quote:

Your help is appreciated with any issue. You've given awesome feedback.
By the way, I checked in the fixes for the UTF-8 add-in and WideCharToMultiByte.


Thanks,
Rainer
aldon



Joined: 08 Aug 2010
Posts: 5
Location: Washington, USA

Posted: Thu Sep 02, 2010 11:54 pm

Quote:
I wouldn't expect this to change. Some issues involved:
- DMD does not have a demangler itself (but it should to keep it in step with language changes)
- identifier length might explode, but the debug format and the linker often can only handle strings of length < 256 (see below)

I don't understand the limitations you cited:
- The compiler has the simple names of all entities. Why wouldn't it be able to write the fully qualified but unmangled names?
- If they use compression schemes as you list below, then why wouldn't a change to support this work?

Quote:
Quote:
Looking at PDB, I see that only "public" symbols are mangled, not local or global symbols.
I think global symbols are always mangled, but not local variables and class/enum type names in the type info.

What I mean is that maybe DMD always mangles global and static symbols, but I don't see a reason why it should.
The PDB format seems to have globals and statics unmangled for C++ while publics are mangled.
When I open up MagoNatEE.pdb and look up a function like MagoEE::MakeTypeEnv, I see a record that starts with record type 0x1110 (S_GPROC_V3 according to mscvpdb.h), yet the name at the end of the record is exactly MagoEE::MakeTypeEnv.

Storing globals and statics mangled also affects evaluation of things like global variables and static member variables. If the user enters the fully qualified name, we won't be able to find it.

It's true that the debugger can read all symbols, unmangle them, and store them in a table in memory. But, the linker already offers the ability to hash global and static symbols, so why not use it? I think this should be considered, especially since symbol info can be many MB.

Quote:
Stepping to the return statement in local_func, you see two empty entries in the local window, one should be "y", the other "this", the context pointer for the outer scope.

I figured out the problem with listing nested function variables. Evaluating the this pointer failed, and this made the other variables skip being evaluated. Actually, they were being evaluated as if they were "this".
I fixed this problem so now the other variables are evaluated, just not the "this". The problem with "this" is that it's a void*. When the EE comes across a "this" expression it expects that it's of a type that can reference members. So it fails.
Maybe it should be typed as a kind of struct so we can show the parent function's variables?
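The shape of that fix, as a minimal sketch: each local is evaluated on its own, so one failure only affects its own row. `EvalError`, `list_locals` and the `evaluate` callback are hypothetical stand-ins, not the actual Mago EE interfaces.

```python
class EvalError(Exception):
    """Raised by the (hypothetical) evaluator when a symbol can't be evaluated."""

def list_locals(names, evaluate):
    """Evaluate every local independently and return (name, text) rows."""
    rows = []
    for name in names:
        try:
            rows.append((name, str(evaluate(name))))
        except EvalError as err:
            # Record the failure for this entry and keep going.
            rows.append((name, "<error: %s>" % err))
    return rows

def fake_evaluate(name):
    # Stand-in evaluator: "this" is a void* and cannot be evaluated.
    if name == "this":
        raise EvalError("void* has no members")
    return {"y": 3}[name]
```

With this structure, the locals window for `local_func` would show a value for "y" and an error marker for "this", instead of two empty entries.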

Quote:
Visual D supports the IVsLanguageDebugInfo interface, including GetProximityExpressions. I guess the debugger needs to call this somehow...

I think the Visual Studio debugger package is supposed to call that function and pass the expressions to the active debug engine. I don't think the debug engine can go out and call the function itself. I don't know what might be keeping the VS debugger package from calling it.

I'll take a look at the other issues. Thanks for the pointers.
sagitario



Joined: 03 Mar 2007
Posts: 292

Posted: Fri Sep 03, 2010 2:33 am

aldon wrote:
Quote:
I wouldn't expect this to change. Some issues involved:
- DMD does not have a demangler itself (but it should to keep it in step with language changes)
- identifier length might explode, but the debug format and the linker often can only handle strings of length < 256 (see below)

I don't understand the limitations you cited:
- The compiler has the simple names of all entities. Why wouldn't it be able to write the fully qualified but unmangled names?
- If they use compression schemes as you list below, then why wouldn't a change to support this work?

Thinking about it again, you are right, it can be done. Actually, it is what cv2pdb does: demangle the symbols and put them into the pdb file.

Some thoughts:
- Is it necessary to have the same names for debug symbol and the link name? I guess, not.
- Are representations needed with and without type information? A variable is unique, but there can be multiple functions with the same name, but different arguments. Do these need different string representations? If yes, and the full demangled name with type info is placed into the debug info, the debugger might have a hard time extracting the function name without type info (I don't know if it needs to).
- With type info, demangled names will be longer for basic types and qualifiers that are encoded as single characters in the mangled name.
- Without type info, symbols will even get shorter!
- With type info, function name length can easily exceed 255 characters (otherwise you just need long package, module and function names). dmd tries hard to avoid long strings almost everywhere (sometimes by compression; if that's still too long, it cuts off the end and appends the SHA value of the full identifier), but it has a special encoding that allows them for local variables.
- A major problem with modifying this might be the linker, if it needs changes after all. It is not open source and is currently written in assembler, so Walter doesn't want to make any changes until he has ported it to C. That port will probably take another few years.


Quote:
Quote:
Stepping to the return statement in local_func, you see two empty entries in the local window, one should be "y", the other "this", the context pointer for the outer scope.

I figured out the problem with listing nested function variables. Evaluating the this pointer failed, and this made the other variables skip being evaluated. Actually, they were being evaluated as if they were "this".
I fixed this problem so now the other variables are evaluated, just not the "this". The problem with "this" is that it's a void*. When the EE comes across a "this" expression it expects that it's of a type that can reference members. So it fails.
Maybe it should be typed as a kind of struct so we can show the parent function's variables?

Which makes me wonder whether you received my personal mail with a comment to the same effect and some patches for longer strings. If dmd can be changed with respect to demangled symbol names, the demangling stuff in the patch can be safely ignored.

Quote:
Quote:
Visual D supports the IVsLanguageDebugInfo interface, including GetProximityExpressions. I guess the debugger needs to call this somehow...

I think the Visual Studio debugger package is supposed to call that function and pass the expressions to the active debug engine. I don't think the debug engine can go out and call the function itself. I don't know what might be keeping the VS debugger package from calling it.

GetProximityExpressions gets called when using the VS debug engine, but not with Mago. I can take a look at the callstack to see what might be initiating it.
sagitario



Joined: 03 Mar 2007
Posts: 292

Posted: Fri Sep 03, 2010 1:59 pm

aldon wrote:
Quote:
Stepping to the return statement in local_func, you see two empty entries in the local window, one should be "y", the other "this", the context pointer for the outer scope.

I figured out the problem with listing nested function variables. Evaluating the this pointer failed, and this made the other variables skip being evaluated. Actually, they were being evaluated as if they were "this".


I think "this" does not need to be treated as a special identifier, it simply sits there in the debug info to get displayed. I have removed it from the array of keywords and it works nicely.
What's special about "this" is that it might need to be added to an expression implicitly, but this already seems to work.
aldon



Joined: 08 Aug 2010
Posts: 5
Location: Washington, USA

Posted: Mon Sep 20, 2010 10:15 pm

At this point I think there's only one thing that hasn't been handled: supporting associative arrays. I believe the rest that's listed in this topic is either fixed or is a bug in the compiler that I've reported. Thanks for your help and for packaging up Mago Debugger with your project, Visual D.
All times are GMT - 6 Hours