Author |
Message |
stonecobra
Joined: 25 May 2004 Posts: 48 Location: Rough and Ready, CA
|
Posted: Mon Aug 09, 2004 2:16 pm Post subject: mango.xml.sax has landed |
|
|
All,
I have committed a first pass at mango.xml.sax. It is a direct port of SAX 2 release 3 from saxproject.sourceforge.net.
I still have some cleanup to do (mainly in NamespaceSupport and ParserAdapter), but it does compile.
I am now working on porting Aelfred over to use the SAX API. If anyone else has a favorite XML parser, SAX is no longer stopping you.
Since Aelfred is one of the faster parsers in Java land, I expect to have a super fast parser on my hands once I convert the implementation to use D's array slicing.
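To make the slicing point concrete, here is a minimal sketch in present-day D syntax. The helper `tagName` is hypothetical and is not taken from mango.xml.sax or Aelfred; it only shows that a slice is a view into the input buffer, so pulling a name out of a tag need not allocate or copy.

```d
import std.stdio;

// Hypothetical helper (an assumption, not part of mango.xml.sax):
// returns the element name of a start tag such as "<item id='1'>"
// as a slice of the input, without copying any characters.
const(char)[] tagName(const(char)[] tag)
{
    if (tag.length == 0 || tag[0] != '<')
        return null;                    // not a start tag

    size_t start = 1;                   // skip the leading '<'
    size_t end = start;
    while (end < tag.length && tag[end] != ' '
           && tag[end] != '>' && tag[end] != '/')
        ++end;

    return tag[start .. end];           // a view into `tag`, no allocation
}

void main()
{
    string doc = "<item id='1'>text</item>";
    auto name = tagName(doc);
    writeln(name);                      // prints "item"

    // The slice points into the original buffer rather than at a copy.
    assert(name.ptr == doc.ptr + 1);
}
```

The catch with this approach is lifetime: the slice is only valid while the underlying buffer is kept around and unmodified, so a parser that reuses its input buffer would have to copy anything it hands out for longer-term use.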
There is still much to do if anyone wants to help out: search for 'TODO' in the code, update the Linux makefile (or the SCons stuff), or help write some documentation (all of the comments were lost in the conversion).
Scott Sanders |
|
|
|
stonecobra
Joined: 25 May 2004 Posts: 48 Location: Rough and Ready, CA
|
Posted: Tue Aug 10, 2004 10:48 am Post subject: |
|
|
Implementation question: Should the Aelfred port to D just use dchar and dchar[] instead of char and char[]?
What is the performance impact?
I am assuming it should be easier to code, but would it be?
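For a rough feel of the trade-off, the sketch below (present-day D, using `std.utf.toUTF32` from the standard library) compares the same text held as `char[]` data and as `dchar[]` data. It only illustrates storage size and indexing; it is not a benchmark and says nothing about actual parser throughput.

```d
import std.stdio;
import std.utf : toUTF32;

void main()
{
    string utf8 = "na\u00EFve";          // U+00EF takes two UTF-8 code units
    dstring utf32 = toUTF32(utf8);       // one dchar per code point

    writeln(utf8.length);                // 6 code units
    writeln(utf32.length);               // 5 code points

    // Storage: one byte per char, four bytes per dchar.
    writeln(utf8.length * char.sizeof);
    writeln(utf32.length * dchar.sizeof);

    // Indexing a dchar[] yields a whole character ...
    assert(utf32[2] == '\u00EF');
    // ... while indexing a char[] yields a single code unit, here the
    // lead byte of the two-byte sequence for U+00EF.
    assert(utf8[2] == 0xC3);
}
```

So the dchar version is simpler to index, at the cost of four bytes per character and a conversion pass for UTF-8 input.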
Scott |
|
|
|
kris
Joined: 27 Mar 2004 Posts: 1494 Location: South Pacific
|
Posted: Tue Aug 10, 2004 11:12 am Post subject: |
|
|
This is not an answer, but you could just alias the usage until the Unicode lib shows up (yes, yes; I do have something positive to say about alias).
Part of the problem will be performance expectations: ideally, one would want to slice the incoming content rather than copy/expand it as you go. An alternative is to expand content before it even reaches SAX ... a filter on the conduit could perhaps do that. True performance freaks would point out there should be two paths through the parser: one for UTF-8 and another for native "wide" documents <g> |
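A minimal sketch of the alias idea, in present-day D syntax (the 2004-era compiler would have spelled it `alias dchar XmlChar;`). The names `XmlChar`, `XmlString`, `isNameChar` and `nameLength` are made up for illustration and do not come from the mango sources; the name check is a simplified ASCII-only stand-in for the real XML name rules.

```d
import std.stdio;

// Hypothetical aliases: switching the parser's character width means
// changing this one line, e.g. to `alias XmlChar = char;`.
alias XmlChar = dchar;
alias XmlString = const(XmlChar)[];

// Simplified, ASCII-only approximation of an XML name character.
bool isNameChar(XmlChar c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9')
        || c == '_' || c == '-' || c == '.' || c == ':';
}

// Length of the leading run of name characters in `s`.
size_t nameLength(XmlString s)
{
    size_t n = 0;
    while (n < s.length && isNameChar(s[n]))
        ++n;
    return n;
}

void main()
{
    XmlString input = "item id='1'";     // the literal adapts to the alias
    writeln(nameLength(input));          // prints 4
}
```

Code written against the aliases compiles unchanged for either width, which keeps the char versus dchar decision open while the rest of the port proceeds.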
|
|
|
stonecobra
Joined: 25 May 2004 Posts: 48 Location: Rough and Ready, CA
|
Posted: Tue Aug 10, 2004 12:15 pm Post subject: |
|
|
kris wrote: | True performance freaks would point out there should be two paths through the parser: one for UTF8 and another for native "wide" documents <g> |
Which should come first though? Wide support, or char[]? |
|
|
|
kris
Joined: 27 Mar 2004 Posts: 1494 Location: South Pacific
|
Posted: Tue Aug 10, 2004 12:54 pm Post subject: |
|
|
I'd say the dchar approach, only because it's universally applicable. Maybe that's the right way to go on this one ... just eat the dchar conversion overhead regardless. A faster dchar-only input could be added later without any fuss.
Given that XML is supposed to be language agnostic, you probably need to expose a dchar-oriented API. Whilst being more efficient for ASCII-only documents, exposing an additional char-based API might become a millstone rather than an optional performance-enhancement?
If you go the dchar route, you might simplify the task by ignoring UTF-8 encodings for now (until the Unicode lib is up to speed). That is, just expand ASCII into 32 bits instead. As long as the slots are "designed in", there's a path to follow. |
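The "expand ASCII into 32 bits" step could look something like the sketch below, written in present-day D. The function `widenAscii` is a hypothetical name, not anything in mango; it rejects non-ASCII bytes outright, on the assumption that real UTF-8 decoding would later replace that branch.

```d
import std.stdio;

// Hypothetical helper (an assumption, not mango code): widens
// ASCII-only input to one dchar per byte.
dchar[] widenAscii(const(char)[] input)
{
    auto output = new dchar[input.length];
    foreach (i, c; input)
    {
        // Bytes >= 0x80 belong to multi-byte UTF-8 sequences, which
        // this sketch deliberately does not decode.
        if (c >= 0x80)
            throw new Exception("non-ASCII input is not handled yet");
        output[i] = c;                   // char widens implicitly to dchar
    }
    return output;
}

void main()
{
    auto wide = widenAscii("<a/>");
    writeln(wide.length);                // prints 4
    assert(wide == "<a/>"d);
}
```

Because callers only ever see `dchar[]`, swapping the body for a proper decoder later would not change the interface, which is the "slot" being designed in.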
|
|
|
teqdruid
Joined: 11 May 2004 Posts: 390 Location: UMD
|
Posted: Sun Sep 12, 2004 4:25 pm Post subject: How long? |
|
|
How far off is a usable XML parser in Mango? I ask because I'm at a point where I have to abandon Andy's XML library: it relies on std.stream, which conflicts with Mango and causes my server to freeze at random.
So at this point I HAVE to change XML libraries, and since I'd like to put my XML-RPC stuff in Mango, it would be nice to have it use only Mango libraries.
I haven't been able to make much sense of the XML code currently in SVN (rev. 735), so I can't determine how to use the parser there, or whether it even works. If it is working, I'd appreciate some sample code. The stuff I'm doing isn't terribly complex, so I'd imagine that even the most basic XML parser could handle it. |
|
|
|
|