
xml.sax status?

 
Lutger



Joined: 25 May 2006
Posts: 91

Posted: Sun Oct 08, 2006 11:59 am    Post subject: xml.sax status?

Hi, I'm sorry if I have overlooked the answer somewhere, but what is the status of mango.xml.sax? It's not in the release download, but so far it seems to work fine. Just wondering if it's okay to use from svn or if there are known issues / missing stuff. I'd like to use it instead of tinyxml (it's easier and faster).

btw, from what I've used so far, mango is so much easier than I thought. It's working great for me, thank you. :D
teqdruid



Joined: 11 May 2004
Posts: 390
Location: UMD

Posted: Mon Oct 09, 2006 9:06 am    Post subject: Re: xml.sax status?

Lutger wrote:
Hi, I'm sorry if I have overlooked the answer somewhere, but what is the status of mango.xml.sax? It's not in the release download, but so far it seems to work fine. Just wondering if it's okay to use from svn or if there are known issues / missing stuff. I'd like to use it instead of tinyxml (it's easier and faster).

btw, from what I've used so far, mango is so much easier than I thought. It's working great for me, thank you. :D


It's probably ready for use. It's not in the release version for a few reasons:
- It relies on mango.containers, which isn't quite ready for release
- There hasn't been a release since I got it working OK
- I haven't tested it quite as much as I'd like to (although I've done a fair amount)
- It still needs more optimization
- The parser is still a bit rough around the edges
- It hasn't been benchmarked yet


I'd appreciate any and all feedback on the SAX interface and the parser. If you find any bugs, please file them in the Mango project trac. I'd also like to know how you find it in terms of speed. I designed it to have very low heap usage, so it should run pretty quickly (compile with -inline -O), but I haven't yet figured out how to avoid the vtable lookups it's doing; eliminating those should increase the speed a lot.

Glad to see someone's interested.

~John
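
For readers wondering what the vtable remark refers to: one common way to avoid virtual calls in D is to pass the handler type as a template parameter instead of going through an interface. The sketch below only illustrates that general idea. It is not Mango's SAX API, and every name in it (`ElementHandler`, `CountingHandler`, `CountingStruct`, `parseDynamic`, `parseStatic`) is hypothetical. It is written against current D rather than the 2006 compiler.

```d
import std.stdio : writeln;

/// Dynamic dispatch: every callback goes through the interface's vtable.
interface ElementHandler
{
    void startElement(const(char)[] name);
}

final class CountingHandler : ElementHandler
{
    size_t count;
    void startElement(const(char)[] name) { ++count; }
}

/// Stand-in for a parser loop that reports element names to an interface.
void parseDynamic(string[] names, ElementHandler handler)
{
    foreach (name; names)
        handler.startElement(name);   // virtual call, resolved at run time
}

/// Static dispatch: the handler type is a template parameter, so the call
/// target is known at compile time and becomes a candidate for inlining.
struct CountingStruct
{
    size_t count;
    void startElement(const(char)[] name) { ++count; }
}

void parseStatic(Handler)(string[] names, ref Handler handler)
{
    foreach (name; names)
        handler.startElement(name);   // direct call, no vtable involved
}

void main()
{
    string[] names = ["root", "item", "item"];

    auto dynamicHandler = new CountingHandler;
    parseDynamic(names, dynamicHandler);

    CountingStruct staticHandler;
    parseStatic(names, staticHandler);

    writeln(dynamicHandler.count, " ", staticHandler.count);
}
```

The trade-off is that the templated version is instantiated once per handler type, which can increase code size, and the handler can no longer be swapped at run time.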
Lutger




Posted: Mon Oct 09, 2006 11:26 am

Awesome. I'm glad you've made this thing. I only hacked up some loading code to parse xml files, but I'll give feedback when I have some.

In terms of performance, I have not done any really valid tests, but my hacked-up loader is at least 7 to 8 times as fast as the tinyxml code I had. I suspect the gap will be larger for bigger files, where tinyxml really stalls. More importantly, it took me only an hour or so to understand the API and write significantly cleaner code than the tinyxml version, which took me 2 hours or so even though I was already familiar with tinyxml. I like this sax thing: fast and simple.

Converting mango's String to char[], the following note in mango.text.string did affect performance by about 10?:

Quote:
Convert to the AbstractString types. The optional argument
dst will be resized as required to house the conversion.
To minimize heap allocation, use the following pattern:

String string;

wchar[] buffer;
wchar[] result = string.toUtf16 (buffer);

if (result.length > buffer.length)
    buffer = result;
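
The quoted pattern can be tried out without Mango. The sketch below is a self-contained approximation in current D: `toUtf16` here is a hypothetical free function standing in for the `String.toUtf16` method from the note, and its growth policy is an assumption made for illustration. The point it shows is that the caller keeps whichever block is larger, so the heap is touched only when a conversion outgrows the buffer.

```d
import std.stdio : writeln;

/// Hypothetical stand-in for Mango's String.toUtf16: converts src into dst,
/// growing dst only when it is too small, and returns the filled slice.
wchar[] toUtf16(const(char)[] src, wchar[] dst)
{
    size_t n = 0;
    foreach (wchar unit; src)   // foreach transcodes UTF-8 into UTF-16 code units
    {
        if (n == dst.length)
            dst.length = dst.length ? dst.length * 2 : 16;   // may allocate
        dst[n++] = unit;
    }
    return dst[0 .. n];
}

void main()
{
    wchar[] buffer;   // reused across all conversions

    foreach (s; ["alpha", "be", "gamma-delta"])
    {
        wchar[] result = toUtf16(s, buffer);

        // Same check as in the quoted note: if the conversion had to grow,
        // adopt the larger block so the next call can reuse it.
        if (result.length > buffer.length)
            buffer = result;

        writeln(result, " (buffer length ", buffer.length, ")");
    }
}
```

Note that `result` aliases `buffer` whenever no growth was needed, so it is overwritten by the next conversion.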
teqdruid




Posted: Mon Oct 09, 2006 11:52 am

Lutger wrote:
In terms of performance, I have not done any really valid tests, but my hacked-up loader is at least 7 to 8 times as fast as the tinyxml code I had. I suspect the gap will be larger for bigger files, where tinyxml really stalls. More importantly, it took me only an hour or so to understand the API and write significantly cleaner code than the tinyxml version, which took me 2 hours or so even though I was already familiar with tinyxml. I like this sax thing: fast and simple.


I'm not familiar with TinyXML. Is it DOM style? It looks like C++; were you writing in C++, or using some D bindings?

Quote:
Converting mango's String to char[], the following note in mango.text.string did affect performance by about 10?:

Quote:
Convert to the AbstractString types. The optional argument
dst will be resized as required to house the conversion.
To minimize heap allocation, use the following pattern:

String string;

wchar[] buffer;
wchar[] result = string.toUtf16 (buffer);

if (result.length > buffer.length)
    buffer = result;


You mean this helped speed up the code that gets the data out of the String class? Are you doing a UTF conversion? You shouldn't have to do any UTF conversion yourself; you can use the SAX template directly if you want to use anything other than char. If you're just copying the string, you might also look at the copy method.

In order to reduce heap allocations, the teqXML parser uses one buffer and moves the data around in that buffer. As such, when strings are delivered to the client, the memory references are only good during that function call, after which the memory might get shifted (the parser owns the object). I had also considered never moving memory and allocating more memory when more space was needed (and abandoning references no longer in use for the GC to handle); with this technique I could give ownership of the strings to the client. I decided that it would be better to minimize heap allocations and make the client code do any necessary heap allocation; I think it is more flexible this way. After using it, do you agree with this decision?

~John
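
The ownership rule described above is easy to trip over, so here is a toy illustration in current D. It is not the teqXML parser or its API: `ToyParser`, its `parse` method, and the space-separated input are all hypothetical, chosen only to reproduce the "one buffer, slices valid only during the callback" behaviour.

```d
import std.stdio : writeln;

/// Toy stand-in for a SAX-style parser: it copies each space-separated token
/// into ONE internal buffer and hands the callback a slice of that buffer.
struct ToyParser
{
    private char[] buffer;   // owned by the parser, overwritten for every token

    void parse(const(char)[] input,
               scope void delegate(const(char)[] token) onToken)
    {
        // Grow at most once per parse so the buffer never moves mid-parse.
        if (buffer.length < input.length)
            buffer.length = input.length;

        size_t start = 0;
        foreach (i; 0 .. input.length + 1)
        {
            if (i == input.length || input[i] == ' ')
            {
                if (i > start)
                {
                    const len = i - start;
                    buffer[0 .. len] = input[start .. i];   // overwrite in place
                    onToken(buffer[0 .. len]);   // valid only during this call
                }
                start = i + 1;
            }
        }
    }
}

void main()
{
    const(char)[][] borrowed;   // mistake: keeps slices into the parser's buffer
    string[] copied;            // correct: the client makes its own heap copy

    ToyParser parser;
    parser.parse("root item name", (const(char)[] token) {
        borrowed ~= token;
        copied ~= token.idup;
    });

    writeln("copied:   ", copied);
    writeln("borrowed: ", borrowed);   // every entry now shows the last token
}
```

The client pays for an allocation only for the strings it actually wants to keep, which is the flexibility argument made in the post.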
Lutger




Posted: Mon Oct 09, 2006 5:32 pm

teqdruid wrote:
Lutger wrote:
<snip>


I'm not familiar with TinyXML. Is it DOM style? It looks like C++; were you writing in C++, or using some D bindings?


DOM. It's a port from C++, found under TinyXPath here at dsource.

Quote:
Quote:
Converting mango's String to char[], the following note in mango.text.string did affect performance by about 10?:
<snip>


You mean this helped speed up the code that gets the data out of the String class? Are you doing a UTF conversion? You shouldn't have to do any UTF conversion yourself; you can use the SAX template directly if you want to use anything other than char. If you're just copying the string, you might also look at the copy method.

Hmm, yes, I missed that one; a copy is what I need. It doesn't have a slice, does it?

Quote:
In order to reduce heap allocations, the teqXML parser uses one buffer and moves the data around in that buffer. As such, when strings are delivered to the client, the memory references are only good during that function call, after which the memory might get shifted (the parser owns the object). I had also considered never moving memory and allocating more memory when more space was needed (and abandoning references no longer in use for the GC to handle); with this technique I could give ownership of the strings to the client. I decided that it would be better to minimize heap allocations and make the client code do any necessary heap allocation; I think it is more flexible this way. After using it, do you agree with this decision?

~John


I agree. I had one initial bug due to mistakenly relying on ownership, but quickly discovered the error. As long as it is documented, this is the right way imo. Just because there is no const and we have garbage collection doesn't mean D libraries should prevent users from shooting themselves in the foot at all costs.
All times are GMT - 6 Hours