Http Server Freezes
teqdruid



Joined: 11 May 2004
Posts: 390
Location: UMD

Posted: Sun Sep 05, 2004 8:26 pm    Post subject: Http Server Freezes

OK... this is really starting to annoy me.

After a certain number of POST calls, my entire program seems to lock up: not just the server threads, but all the threads.

As I've posted before, this is how I'm starting the server:
Code:
      //The local port to bind to
      InternetAddress bindTo = new InternetAddress(8181);
      
      //The ServletProvider
      ServletProvider sp = new ServletProvider();
      
      //This is the nnServer in servlet form
      NNServlet nnServlet = new NNServlet(hLogger);
      nnServlet.install(sp);
      
      //The HTTP Server
      HttpServer httpServer = new HttpServer(sp, bindTo, 1, hLogger);
      
      //Go!
      httpServer.start();
      
      //Wait for input, then close it all up
      stdin.readLine();


So after I hit enter, the program quits and the server dies. When the program locks up now, hitting enter doesn't do anything. The only way for me to kill it is to run a 'killall -s KILL nnEngine' which kills the processes- hard.

And this is proving a really hard problem to trace as well. It seems that each time I put in different logger trace messages, or comment out different sections, it freezes at a different point.

If it helps, I ran it in strace, and here's the output around the freezing point. Looks interesting, but I'm not totally sure what it means:
Code:
TRACE HTTP Server - XML-RPC Server: Setting length
TRACE HTTP Server - XML-RPC Server: Getting writer
TRACE HTTP Server - XML-RPC Server: Writing response
0xbffff1ac, 1)                  = ? ERESTARTSYS (To be restarted)
--- SIGUSR1 (User defined signal 1) @ 0 (0) ---
write(86, "\1\0\0\0\4\0\0\0008\177\10\10\30P\3@ \361\377\277\350\360"..., 148) = 148
rt_sigsuspend(~[USR2]


I'm really starting to pull my hair out on this one, since I don't even know where to start looking. This has been a problem for some time, but not a major one, since I was usually restarting the server often to make changes... it's starting to work now, and if it weren't for this I could probably do some useful stuff with it.

Anyone got any suggestions before I end up bald?

Thanks
John
kris



Joined: 27 Mar 2004
Posts: 1494
Location: South Pacific

Posted: Sun Sep 05, 2004 9:53 pm

I'm afraid I don't have much to suggest; here's something though:

a) there's an error ERESTARTSYS just before the write() operation, which looks a bit suspect. Can you show more of the strace output please?

b) what does a successful request/response cycle look like compared to the failed one?

c) You indicate it locks-up in various places. Does it always manage to respond to the client, and then fails to accept more requests? Can you be more specific please?

d) can you disable the SOAP processing, and always return a "canned" response from the server? This will help isolate the culprit.

e) Can you run it in a source-level debugger?

f) Do you have a windows machine you can try it on? I ask because JJR has seen some issues related to sockets on linux.

g) as an experiment, increase the number of server threads to see if the server operates for a longer period (this will identify if it's the process or a worker thread that croaks). To do that, change the '1' value in the HttpServer constructor to 20, or something.

h) remove the stdin.readLine() and replace with sleep(uint.max); Perhaps stdin is causing a conflict? Who knows ...

i) uncomment line 128 in mango.io.SocketConduit, and see if that helps at all ...

j) add some printf() debugging to mango.utils.serverThread, to indicate whether each request is processed completely. Note that the server is a thread-based model; each thread waits upon a common server-socket. After each request is processed, the handling thread waits for another request on this server-socket. They are all expected to be suspended until another request arrives.
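The model described in (j) can be sketched outside Mango. What follows is a minimal Python illustration, not Mango's implementation: `start_worker_pool`, `handled`, and the fixed reply are names and values made up for this sketch. Every worker blocks in `accept()` on the same listening socket, and a per-worker counter plays the role of the suggested printf() debugging, so a worker that stops completing requests shows up as a counter that stops moving.

```python
import socket
import threading

# Fixed reply used by every worker (an assumption of this sketch).
RESPONSE = (b"HTTP/1.1 200 OK\r\n"
            b"Connection: close\r\n"
            b"Content-Length: 2\r\n\r\nok")

def start_worker_pool(num_threads, host="127.0.0.1", port=0):
    """Start workers that all wait in accept() on one shared listening socket.

    Returns (listener, handled); handled maps worker index -> completed requests.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen(16)
    handled = {i: 0 for i in range(num_threads)}

    def worker(index):
        while True:
            try:
                # The thread is suspended here until a request arrives.
                conn, _ = listener.accept()
            except OSError:
                return  # the listening socket is no longer usable
            with conn:
                conn.recv(4096)          # read (and ignore) the request
                conn.sendall(RESPONSE)
                handled[index] += 1      # this request was processed completely
                print("worker", index, "completed request", handled[index])

    for i in range(num_threads):
        threading.Thread(target=worker, args=(i,), daemon=True).start()
    return listener, handled
```

Calling `start_worker_pool(1)` versus `start_worker_pool(20)` mirrors experiment (g): if one worker dies, the others should keep answering, whereas a process-wide freeze stops every counter at once.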

Let us know how it goes!
kris

Posted: Mon Sep 06, 2004 2:36 pm

John;

Another thing you might try is to compile mango.example.Servlets (via the make file), and access that via a browser. The supported URLs are noted in the source header (note that you will have to install the Mango documentation if you want to try the file-serving aspects of that example).

If that example breaks, then there's likely something fishy going on either with threads, or with the underlying socket package. It may be that there's some special incantation to make either work on the platform you're using.

BTW; which linux are you using? And, do you know what kind of thread support there is? I'm afraid I'm not familiar with linux, but someone else might spot a discrepancy ...

- Kris
teqdruid

Posted: Mon Sep 06, 2004 4:28 pm

Quote:
a) there's an error ERESTARTSYS just before the write() operation, which looks a bit suspect. Can you show more of the strace output please?

That's the only strace output during the operations. The only other strace output is at the top, when the server is starting, and there's a lot of it. I'll email you a complete log.

Quote:
b) what does a successful request/response cycle look like compared to the failed one?

Here's a successful one:
Code:
POST /RPC2 HTTP/1.1
Content-Length: 157
Content-Type: text/xml
Cache-Control: no-cache
Pragma: no-cache
User-Agent: Java/1.4.2_05
Host: localhost:8181
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive

<?xml version="1.0" encoding="ISO-8859-1"?><methodCall><methodName>nnCore.getNode</methodName><params><param><value>1@1</value></param></params></methodCall>HTTP/1.1 200 OK
Content-Type:text/xml
Connection:close

<?xml version="1.0"?>
<methodResponse>
    <params>
        <param>
            <value>
                <struct>
                    <member>
                        <name>ID</name>
                        <value>
                            <string>1@1</string>
                        </value>
                    </member>
                </struct>
            </value>
        </param>
    </params>
</methodResponse>

Versus the last one, which is unsuccessful:
Code:
POST /RPC2 HTTP/1.1
Content-Length: 157
Content-Type: text/xml
Cache-Control: no-cache
Pragma: no-cache
User-Agent: Java/1.4.2_05
Host: localhost:8181
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive

<?xml version="1.0" encoding="ISO-8859-1"?><methodCall><methodName>nnCore.getNode</methodName><params><param><value>1@1</value></param></params></methodCall>HTTP/1.1 200 OK
Content-Type:text/xml
Connection:close



Which probably answers your next question:
Quote:
c) You indicate it locks-up in various places. Does it always manage to respond to the client, and then fails to accept more requests? Can you be more specific please?

It never finishes responding on the last one: it always locks up at some point while trying to respond. Here's the output from the Java client:
Code:
0
1
2
3
java.io.IOException: Connection refused
   at org.apache.xmlrpc.XmlRpcClient$Worker.execute(XmlRpcClient.java:444)
   at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:163)
   at com.neuralnexus.xmlrpc.XmlRpcConnector.getNode(XmlRpcConnector.java:310)
   at test.main(test.java:29)
com.neuralnexus.exceptions.NNException: Connection refused
   at com.neuralnexus.xmlrpc.XmlRpcConnector.getNode(XmlRpcConnector.java:316)
   at test.main(test.java:29)

I have it doing a loop, making requests. it outputs the iteration number just before making the request.
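A request loop of this shape is easy to reproduce with any XML-RPC client. As a rough sketch, here it is with Python's standard `xmlrpc.client` rather than the Apache Java client used above. `hammer` is a hypothetical helper; the method name `nnCore.getNode` and the argument `1@1` are taken from the captured request, and the URL in the usage note assumes the port 8181 and `/RPC2` path shown in that capture.

```python
import xmlrpc.client

def hammer(url, count, node_id="1@1"):
    """Call nnCore.getNode `count` times in a row.

    Returns the index of the first request that fails, or None if all succeed.
    """
    proxy = xmlrpc.client.ServerProxy(url)
    for i in range(count):
        print(i)  # iteration number, printed just before the request
        try:
            proxy.nnCore.getNode(node_id)
        except Exception as exc:  # refused connection, truncated reply, fault, ...
            print("request %d failed: %r" % (i, exc))
            return i
    return None
```

For example, `hammer("http://localhost:8181/RPC2", 100)` returns the zero-based index of the first failing request, which makes it simple to compare runs with different server thread counts.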

Quote:
d) can you disable the SOAP processing, and always return a "canned" response from the server? This will help isolate the culprit.

I'll try that after posting this.

Quote:
e) Can you run it in a source-level debugger?

Unfortunately, no. I haven't been able to get a debugger working. It's massively annoying.

Quote:
f) Do you have a windows machine you can try it on? I ask because JJR has seen some issues related to sockets on linux.

Not until I have time to buy a crossover cable (long story... I'll have it working tomorrow)

Quote:
g) as an experiment, increase the number of server threads to see if the server operates for a longer period (this will identify if it's the process or a worker thread that croaks). To do that, change the '1' value in the HttpServer constructor to 20, or something.

Now this is interesting. With the value 1, I'm able to get 19 or 20 requests in, and it dies on the 20th or 21st (it tends to vary). With either 10 or 20 there, it dies on the 3rd request. With the value 2, it dies on the 14th request. With the value 3, it dies on the 10th request.

Quote:
h) remove the stdin.readLine() and replace with sleep(uint.max); Perhaps stdin is causing a conflict? Who knows ...

No dice... btw, I'm not sure if it's supposed to be, but the sleep function wasn't already defined, and I had to add a definition for it.

Quote:
i) uncomment line 128 in mango.io.SocketConduit, and see if that helps at all ...

Nothing

Quote:
j) add some printf() debugging to mango.utils.serverThread, to indicate whether each request is processed completely. Note that the server is a thread-based model; each thread waits upon a common server-socket. After each request is processed, the handling thread waits for another request on this server-socket. They are all expected to be suspended until another request arrives.

I'll try this in a minute.

Quote:
BTW; which linux are you using? And, do you know what kind of thread support there is? I'm afraid I'm not familiar with linux, but someone else might spot a discrepancy ...

I'm using Gentoo linux with the following:
GLibC 2.3.2
Kernel 2.6.7-gentoo-r11 (that's kernel 2.6.7 with gentoo patches)
Dunno what other packages are of note.
I know I've got support for pthreads, but I'm not sure what else, and I don't know what you or dmd is using.

Quote:
Another thing you might try is to compile mango.example.Servlets (via the make file), and access that via a browser. The supported URLs are noted in the source header (note that you will have to install the Mango documentation if you want to try the file-serving aspects of that example).

Again, I'll try that in a minute.

Just realized that I forgot one of the basics; I'm running DMD 0.98.

Another thing (which is also mildly annoying) is that the socket doesn't seem to close immediately when the program dies, so if I try to run it again without waiting about 60 seconds, I get the following:
Code:
Unable to bind socket: Address already in use
socket cancel status now set
closing resource via destructor
closing socket handle ...
socket handle closed


This happens even when the program terminates "normally", and I only use the quotes because frequently after I hit enter, after the code in the main thread runs, I get a segfault. Not always, however.
teqdruid

Posted: Mon Sep 06, 2004 4:58 pm    Post subject: Servlet test segfaults

Quote:
Another thing you might try is to compile mango.example.Servlets (via the make file), and access that via a browser. The supported URLs are noted in the source header (note that you will have to install the Mango documentation if you want to try the file-serving aspects of that example).


Code:
teqdruid@teqdruid mango $ gdb ./sevlettest
GNU gdb 6.0
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
 
(gdb) run
Starting program: /home/teqdruid/workspace/mango/sevlettest
 
Program received signal SIGSEGV, Segmentation fault.
0x0804df6c in _D5mango2io11FileConduit11FileConduit5_openFC5mango2io9FileStyle9FileStyleZv ()
kris

Posted: Mon Sep 06, 2004 5:18 pm    Post subject: Re: Servlet test segfaults

demmegod wrote:
Quote:
Another thing you might try is to compile mango.example.Servlets (via the make file), and access that via a browser. The supported URLs are noted in the source header (note that you will have to install the Mango documentation if you want to try the file-serving aspects of that example).


Code:
teqdruid@teqdruid mango $ gdb ./sevlettest
GNU gdb 6.0
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
 
(gdb) run
Starting program: /home/teqdruid/workspace/mango/sevlettest
 
Program received signal SIGSEGV, Segmentation fault.
0x0804df6c in _D5mango2io11FileConduit11FileConduit5_openFC5mango2io9FileStyle9FileStyleZv ()


Whoops! That's no good, is it! :)

Let's have a look at the code ... oh yeah; there's an unimplemented linux method being used by the Ping servlet (I think JJR is gonna' fix that). If you download the latest SVN version of mango.example.Servlets and comment out line 569 (eliminate the Ping servlet), that should take care of the error.
teqdruid

Posted: Mon Sep 06, 2004 5:20 pm

Quote:
d) can you disable the SOAP processing, and always return a "canned" response from the server? This will help isolate the culprit.

When I do that:
Code:
      //The local port to bind to
      InternetAddress bindTo = new InternetAddress(8181);
      
      //The ServletProvider
      ServletProvider sp = new ServletProvider();
      
      //This is the nnServer in servlet form
      //NNServlet nnServlet = new NNServlet(hLogger);
      //nnServlet.install(sp);
      
      class testServlet : MethodServlet
      {
         public override void doPost (IServletRequest request, IServletResponse response)
         {
            response.setContentType("text/xml");
            response.getWriter().put("<?xml version=\"1.0\"?>
<methodResponse>
    <params>
        <param>
            <value>
                <struct>
                    <member>
                        <name>ID</name>
                        <value>
                            <string>1@1</string>
                        </value>
                    </member>
                </struct>
            </value>
        </param>
    </params>
</methodResponse>");
         }
         
            public void install(ServletProvider provider)
            {
               IRegisteredServlet irs = provider.addServlet (this, "XmlRpc");
               provider.addMapping ("/RPC2", irs);
               //decodeTest();
            }
      }
      
      testServlet ts = new testServlet();
      ts.install(sp);
      
      //The HTTP Server
      HttpServer httpServer = new HttpServer(sp, bindTo, 1, hLogger);
      
      //Go!
      httpServer.start();
      
      //Wait for input, then close it all up
      stdin.readLine();

It seems to work properly.
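The same isolation step can be tried against a server that has nothing to do with Mango or D. The following is a minimal sketch using Python's standard `http.server`; `CannedHandler` and `make_canned_server` are names invented here, and the reply body is a compacted copy of the canned XML above. If the Java client loop runs indefinitely against this stand-in, the client side is unlikely to be the culprit.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Same payload as the canned D servlet above, with whitespace compacted.
CANNED = (b'<?xml version="1.0"?>\n'
          b"<methodResponse><params><param><value><struct>"
          b"<member><name>ID</name><value><string>1@1</string></value></member>"
          b"</struct></value></param></params></methodResponse>")

class CannedHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Drain the request body, but never parse it: no XML-RPC code runs.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(CANNED)))
        self.end_headers()
        self.wfile.write(CANNED)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet during long request loops

def make_canned_server(port=8181):
    """Build a single-threaded HTTP server that answers every POST identically."""
    return HTTPServer(("127.0.0.1", port), CannedHandler)
```

Running `make_canned_server().serve_forever()` listens on port 8181, the port used throughout this thread.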

However, when I use this:
Code:
      class testServlet : XmlRpcServlet
      {   
         public this(Logger logger)
         {
            super(logger);
         }
         
         protected MethodResponse getNodeHandler(MethodCall call)
         {
            //logger.trace("NNServlet: getNodeHandler: in");
            MethodResponse resp = new MethodResponse();
            Struct ret = new Struct();
            ret.setValue("ID", new Value("1@1"));
            resp.appendParam(ret);      
            return resp;
         }
         
         public void install(ServletProvider provider)
         {
            IRegisteredServlet irs = provider.addServlet (this, "XmlRpc");
            addCallHandler("nnCore.getNode", &getNodeHandler);
            provider.addMapping ("/RPC2", irs);
            //decodeTest();
         }
      }


The same thing happens, implying that the problem is somewhere in the XmlRpc stuff. (BTW, just to clear things up, I'm using XML-RPC, not SOAP.)

I just don't know what I'm doing in there that could possibly cause this sort of issue. It seems to be either in my XmlRpc stuff, or Andy's XML parser, or some interaction of those two with Mango.

Would it help if I sent all the XmlRpc code to you? The only reason I haven't yet (for submission to Mango) is that it relies on Andy's XML stuff, and I've been told that Mango's XML stuff isn't ready to go yet.
kris

Posted: Mon Sep 06, 2004 5:36 pm

demmegod wrote:
Another thing (which is also mildly annoying) is that the socket doesn't seem to close immediately when the program dies, so if I try to run it again without waiting about 60 seconds, I get the following:
Code:
Unable to bind socket: Address already in use
socket cancel status now set
closing resource via destructor
closing socket handle ...
socket handle closed


This happens even when the program terminates "normally" and I only use the quotes because frequently after I hit enter, after the code in the main thread runs, I get a segfault. Not always, however

That's life with sockets. They actually linger around for a while in the OS. In this case, it's the listener socket on port 8181; when you restart the application, that port is still in use by the prior server-socket because the latter hasn't yet been cleaned up.

You can force the server-socket to grab the port by setting the "address reuse" flag on it. To do this, you should subclass HttpServer, override the createSocket() method, and make your version do the following:

Code:
override ServerSocket createSocket (InternetAddress bind, int backlog)
{
    return new ServerSocket (bind, backlog, true);
}

Now your server-socket will reuse (the last argument) the port without question, even if it's still "in use" by another socket.

(perhaps there should be an easier way to do this)

- Kris
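For comparison, the "address reuse" flag at the plain socket level is the `SO_REUSEADDR` option. The sketch below shows it in Python; it rests on the assumption, not confirmed in this thread, that the third `ServerSocket` argument above maps to that option. `make_listener` is a hypothetical helper. One caveat worth hedging: on Linux this option lets a new listener bind while old connections on the port are still lingering, but it does not let two live listeners share the port.

```python
import socket

def make_listener(port, reuse=True, host="127.0.0.1", backlog=16):
    """Create a listening TCP socket, optionally with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse:
        # The option has to be set before bind() to have any effect.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock
```

With `reuse=False`, restarting a server shortly after it closed its connections can fail with "Address already in use", which matches the roughly 60 second wait reported above.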
kris

Posted: Mon Sep 06, 2004 5:53 pm

demmegod wrote:
I just don't know what I'm doing in there that could possibly cause this sort of issue. It seems to be either in my XmlRpc stuff, or Andy's XML parser, or some interaction of those two with Mango.

That narrows it down a bit!

The strace.6230 file you sent me is interesting in that it shows a kill(6228, SIGUSR1) being triggered just before everything falls apart. The process ID in this instance is 6230, so 6228 is perhaps a thread being killed? The question is, does this kill() show up within a predetermined time? It looks awfully suspicious to me (but I know nothing about linux, or strace for that matter).

You might check this by actually creating the XML-RPC and XML classes, but not executing them (and using the canned reply instead). It's almost as though there's a bizarre timeout scenario going on ...

BTW, does anyone recognize this little dance? Ignoring the first two lines:
Code:
send(87, "HTTP/1.1 200 OK\nContent-Type:tex"..., 56, 0) = 56
gettimeofday({1094508238, 755788}, NULL) = 0
kill(6228, SIGUSR1) = 0
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]--- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
) = -1 EINTR (Interrupted system call)
sigreturn()                             = ? (mask now [RTMIN])
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]+++ killed by SIGKILL +++

BTW: the strace.log you sent shows a large number of invalid file-handle errors on calls to close(). I don't know if that has anything to do with it?

- Kris
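When comparing a good run against a frozen one, it can help to reduce each strace log to the signals sent, the signals delivered, and the failed system calls. The following is a small Python sketch for that; `summarize_strace` and the three patterns are written for the line shapes quoted in this thread and are not a general strace parser.

```python
import re
from collections import Counter

# Patterns for the strace line shapes that appear in this thread.
DELIVERED = re.compile(r"--- (SIG\w+) \(")        # "--- SIGUSR1 (User defined signal 1) ..."
SENT = re.compile(r"\bkill\((\d+), (SIG\w+)\)")   # "kill(6228, SIGUSR1) = 0"
FAILED = re.compile(r"= -1 (E[A-Z]+)")            # ") = -1 EINTR (Interrupted system call)"

def summarize_strace(lines):
    """Count signals delivered, kill() calls, and errno values in strace output."""
    delivered, sent, errors = Counter(), Counter(), Counter()
    for line in lines:
        for name in DELIVERED.findall(line):
            delivered[name] += 1
        for pid, name in SENT.findall(line):
            sent[(int(pid), name)] += 1
        for errno_name in FAILED.findall(line):
            errors[errno_name] += 1
    return {"delivered": delivered, "sent": sent, "errors": errors}
```

Feeding it a log with `summarize_strace(open(path))` (where `path` is whichever log file is being inspected) would, for instance, show how many close() calls failed with EBADF and how often SIGUSR1 was sent to the main process before the hang.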
kris

Posted: Mon Sep 06, 2004 6:00 pm

demmegod wrote:

Quote:
h) remove the stdin.readLine() and replace with sleep(uint.max); Perhaps stdin is causing a conflict? Who knows ...

No dice... btw, I'm not sure if it's supposed to be, but the sleep function wasn't already defined, and I had to add a definition for it.

Right. I was gonna' suggest that you use System.sleep(), but wanted to exclude that as a possible problem. If you are already including mango.base.System, then you should be good to go with that function.
teqdruid

Posted: Mon Sep 06, 2004 6:46 pm

Quote:
Let's have a look at the code ... oh yeah; there's an unimplemented linux method being used by the Ping servlet (I think JJR is gonna' fix that). If you download the latest SVN version of mango.example.Servlets and comment out line 569 (eliminate the Ping servlet), that should take care of the error.

Didn't help. Still segfaults... same point according to gdb.

Quote:
The strace.6230 file you sent me is interesting in that it shows a kill(6228, SIGUSR1) being triggered just before everything falls apart. The process ID in this instance is 6230, so 6228 is perhaps a thread being killed? The question is, does this kill() show up within a predetermined time? It looks awfully suspicious to me (but I know nothing about linux, or strace for that matter).

6228 is the main process. This probably explains why this process is hanging. SIGUSR1 is a user-defined signal... I don't know why it's being sent, what's sending it, or why it's causing this behavior... I'm going to post about this on the NG.

Quote:
You might check this by actually creating the XML-RPC and XML classes, but not executing them (and using the canned reply instead). It's almost as though there's a bizarre timeout scenario going on ...

OK, so when I do everything except have the XmlRpc class send the reply (although it does go through and generate the string to use, it just doesn't send it) and I send the "canned" response, it dies on the 41st request instead. I'm not exactly sure what this suggests.
kris

Posted: Mon Sep 06, 2004 6:54 pm

demmegod wrote:
Quote:
Let's have a look at the code ... oh yeah; there's an unimplemented linux method being used by the Ping servlet (I think JJR is gonna' fix that). If you download the latest SVN version of mango.example.Servlets and comment out line 569 (eliminate the Ping servlet), that should take care of the error.

Didn't help. Still segfaults... same point according to gdb.

Hmmm ... we'll have to get that running. Thanks for trying!

demmegod wrote:
Quote:
You might check this by actually creating the XML-RPC and XML classes, but not executing them (and using the canned reply instead). It's almost as though there's a bizarre timeout scenario going on ...

OK, so when I do everything except have the XmlRpc class send the reply (although it does go through and generate the string to use, it just doesn't send it) and I send the "canned" response, it dies on the 41st request instead. I'm not exactly sure what this suggests.

If this were a C/C++ program, I'd be almost certain there was a dangling pointer or buffer overrun involved ... If you disable the XmlRpc altogether, it works? Or is it when you disable the XML parser that it works?
teqdruid

Posted: Mon Sep 06, 2004 6:58 pm

Quote:
If you disable the XmlRpc altogether, it works? Or is it when you disable the XML parser that it works?

When I disable XmlRpc altogether it works. It's not easy to disable the XML parser without disabling the XmlRpc stuff.
kris

Posted: Mon Sep 06, 2004 7:15 pm

demmegod wrote:
Quote:
If you disable the XmlRpc altogether, it works? Or is it when you disable the XML parser that it works?

When I disable XmlRpc altogether it works. It's not easy to disable the XML parser without disabling the XmlRpc stuff.

I can imagine!

Don't suppose gdb tells you anything useful?
teqdruid

Posted: Mon Sep 06, 2004 7:34 pm

Quote:
Don't suppose gdb tells you anything useful?

I don't really know how to use gdb, so no.
All times are GMT - 6 Hours
Page 1 of 2