
Error: ArrayBoundsError teqXML(880)

 
manni



Joined: 16 Jan 2006
Posts: 25

Posted: Wed Mar 01, 2006 7:03 am    Post subject: Error: ArrayBoundsError teqXML(880)

Hello,

I am trying to parse this small XML file:
Code:

<?xml version="1.0"  encoding="iso-8859-1" ?>
<PSI>
<FORMAT>
  <SI_BEL>
    <SI>
      <VER_ZUS>
        <GES ABSCHNITT="5">Gesamt</GES>
        <GES ABSCHNITT="4">GESAMT(ohne MwSt.)</GES>
        <GES ABSCHNITT="3">Verbindungen (ohne MwSt.)</GES>
        <GRP ID="225">
          <SPA ID="1" TYP="DATUM">Datum</SPA>
          <SPA ID="2" TYP="UHRZEIT">Uhrzeit</SPA>
          <SPA ID="6" TYP="ANZAHL">Anzahl</SPA>
          <SPA ID="7" TYP="BETRAG">Betrag</SPA>
          <SPA ID="15" TYP="URSPRUNG">Ursprung</SPA>
        </GRP>
        <GRP ID="330">
          <SPA ID="1" TYP="DATUM">Datum</SPA>
          <SPA ID="2" TYP="UHRZEIT">Uhrzeit</SPA>
          <SPA ID="15" TYP="URSPRUNG">Ursprung</SPA>
          <SPA ID="6" TYP="DAUER">Dauer</SPA>
          <SPA ID="7" TYP="BETRAG">Betrag</SPA>
          <SPA ID="16" TYP="DATENVOL">Datenvolumen</SPA>
        </GRP>
        <GRP ID="223">
          <SPA ID="16" TYP="DATENVOL">Datenvolumen</SPA>
          <SPA ID="7" TYP="BETRAG">Betrag</SPA>
        </GRP>
      </VER_ZUS>
    </SI>
  </SI_BEL>
</FORMAT>
</PSI>


with this program:
Code:

module mango.test.sax;

private import mango.xml.sax.DefaultSAXHandler,
  mango.xml.sax.model.ISAXParser,
  mango.xml.sax.model.ISAXHandler,
  mango.xml.sax.parser.teqXML;

private import mango.io.Stdout,
  mango.io.FileConduit,
  mango.io.Buffer;

private import  mango.text.model.UniString,
                mango.text.String;
private alias StringT!(char) Utf8String;

private import mango.convert.Type;

void main()
{
        readerTest1();
}


/**
   Just outputs the data to the console.
 */
private class MyOutputHandler: DefaultSAXHandler!(char) {
        private int tabs = 0;

        this() {
        }
}

void readerTest1() {
  ISAXReader!() reader = new TeqXMLReader!()(512);
  //FileConduit file = new FileConduit("SR0050531145120A.xml", FileStyle.ReadExisting);
  FileConduit file = new FileConduit("short.xml", FileStyle.ReadExisting);
  MyOutputHandler handler = new MyOutputHandler();
  reader.parse(file, handler);
}


I get the error message:
Error: ArrayBoundsError teqXML(880)
Does anyone have an idea what is happening?

My system is Debian testing (Linux).

manni
brad
Site Admin


Joined: 22 Feb 2004
Posts: 490
Location: Atlanta, GA USA

Posted: Wed Mar 01, 2006 8:30 am

It's pointing here:

http://trac.dsource.org/projects/mango/browser/trunk/mango/xml/sax/parser/teqXML.d?rev=790#L880

but I'd have to defer to teqdruid or kris for the solution.

BA
manni



Joined: 16 Jan 2006
Posts: 25

Posted: Thu Mar 02, 2006 3:21 am

Hello,

I compiled the program with:
build -O -release -cleanup sax1.d

and now it runs fine.
My 600 MB XML file is parsed in 1 minute.
Nice, nice!
The next step is to build a CSV file from the XML file.

manni
teqdruid



Joined: 11 May 2004
Posts: 390
Location: UMD

Posted: Thu Mar 02, 2006 11:17 am

manni wrote:
Hello,

I compiled the program with:
build -O -release -cleanup sax1.d

and now it runs fine.
My 600 MB XML file is parsed in 1 minute.
Nice, nice!
The next step is to build a CSV file from the XML file.

manni


Sorry. Didn't see this until now. I'm glad you got it working. There are still a few bugs I'm trying to iron out, but it's nearing completion.

I haven't run any time trials yet. Is the performance pretty good? I guess 10 MB/second sounds OK. Do you happen to know how any other parsers stack up?

BTW, there will be further performance enhancements in the future; I just haven't gotten to all of them yet.

~John Demme
manni



Joined: 16 Jan 2006
Posts: 25

Posted: Fri Mar 03, 2006 1:43 am

Hello,

I tested it with Perl:
use XML::Parser::PerlSAX;

real 0m30.035s
user 0m24.175s
sys 0m1.869s

In D with new TeqXMLReader!()(512):
real 0m46.713s
user 0m42.195s
sys 0m1.829s

In D with new TeqXMLReader!()(1024):
real 0m48.861s
user 0m43.040s
sys 0m1.603s

In D with new TeqXMLReader!()(2048):
real 0m47.874s
user 0m43.224s
sys 0m1.390s

manni
teqdruid



Joined: 11 May 2004
Posts: 390
Location: UMD

Posted: Fri Mar 03, 2006 2:32 pm

That's not quite the speed I was hoping for... Actually, it's performing better on my system. Try compiling with the -release and -inline flags. I was getting similar results until I used them, but with them the parser seems to be much, much faster. That's actually not too surprising considering that the parser makes a lot of calls to small methods, so it would benefit a lot from inlining. There's also a rather large amount of debug code in there, such as array bounds checking and asserts, which is removed with the -release option.
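
A side note on the point above: since -release strips array bounds checking, an ArrayBoundsError that disappears in a release build may only mean the check is gone, not that the index is valid. Below is a minimal sketch of an explicit range test that stays in place regardless of build flags. It assumes a current D compiler with Phobos (not Mango) and is not the actual code in teqXML.d; the function name elementOr is hypothetical.

```d
import std.stdio : writeln;

/// Returns data[index], or `fallback` when index is out of range.
/// This is an ordinary comparison, so it is not removed along with
/// the compiler-generated bounds checks.
int elementOr(const int[] data, size_t index, int fallback)
{
    return index < data.length ? data[index] : fallback;
}

void main()
{
    int[] data = [10, 20, 30];

    writeln(elementOr(data, 1, -1)); // index in range: prints 20
    writeln(elementOr(data, 3, -1)); // one past the end: prints -1
}
```

Whether a guard like this belongs in a hot parsing loop is a separate trade-off; it is mainly useful for locating the bad index while debugging.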

Could you also email me (me@teqdruid.com) the perl code that you're using to test? Also, what is the large XML file from? I wrote a quick app to generate a large XML file, but the file isn't exactly representative of typical XML files.

Thanks,
John

manni wrote:
Hello,

I tested it with Perl:
use XML::Parser::PerlSAX;

real 0m30.035s
user 0m24.175s
sys 0m1.869s

In D with new TeqXMLReader!()(512):
real 0m46.713s
user 0m42.195s
sys 0m1.829s

In D with new TeqXMLReader!()(1024):
real 0m48.861s
user 0m43.040s
sys 0m1.603s

In D with new TeqXMLReader!()(2048):
real 0m47.874s
user 0m43.224s
sys 0m1.390s

manni
manni



Joined: 16 Jan 2006
Posts: 25

Posted: Tue Mar 07, 2006 1:05 am

Here is the Perl program:

Code:

#!/usr/bin/env perl

use XML::Parser::PerlSAX;
my $file ='bigfile.xml';

my $handler = CamelHandler->new();
my $parser = XML::Parser::PerlSAX->new(Handler => $handler);
my $text;

$parser->parse(Source => { SystemId => $file});

package CamelHandler;

use strict;

sub new {
        my $type = shift;
        return bless {}, $type;
}


I think the Perl module XML::Parser::PerlSAX is written in C; it uses Expat, which may be the reason why Perl is so fast.

The file is from a telephone company. I believe they wrote the XML file straight from the database.

Manfred
teqdruid



Joined: 11 May 2004
Posts: 390
Location: UMD

Posted: Wed Mar 15, 2006 9:28 am

With your Perl code, I get the following results:
Quote:
teqdruid@teqdruid ~/workspace/mango/mango/test $ time ./perlXmlRead.pl

real 0m17.514s
user 0m13.749s
sys 0m0.780s
teqdruid@teqdruid ~/workspace/mango/mango/test $ time ./timedXmlRead big.xml
Total time: 17897

real 0m18.355s
user 0m15.409s
sys 0m1.272s


The "Total time:" line is the elapsed time in milliseconds as measured by my test program itself; this way I'm not measuring the time the program takes to start or close. Similar code in the Perl app would make for the best comparison.
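
A minimal sketch of that kind of in-program measurement follows. It assumes a current D compiler with Phobos rather than Mango, and it is not the actual timedXmlRead source. The names timeMillis and parseStandIn are hypothetical, and the loop merely stands in for the reader.parse(...) call.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

/// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
long timeMillis(void delegate() work)
{
    auto watch = StopWatch(AutoStart.yes);
    work();
    watch.stop();
    return watch.peek.total!"msecs";
}

void main()
{
    // Hypothetical stand-in for the parse call being measured.
    void parseStandIn()
    {
        ulong sum = 0;
        foreach (i; 0 .. 1_000_000)
            sum += i;
    }

    // Only the work itself is timed, not process start-up or shutdown.
    writefln("Total time: %s", timeMillis(&parseStandIn));
}
```

Timing only the call of interest keeps start-up costs, such as loading the interpreter on the Perl side, out of the comparison.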

So teqXML is really close. What's interesting is that the sys time is so much larger. I wonder if this is just a matter of tuning Mango's IO stuff? I don't know anything about it, however. Or is this a measure of memory moves? Is the time the parser spends in malloc() or memmove() code counted here? If so, then I should try to cut down on memory operations I guess.... I'll just have to throw the profiler at it soon.

~John
All times are GMT - 6 Hours