[Bug 60948] New: Large TCP timeout delays hcheck disabling a node

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948

            Bug ID: 60948
           Summary: Large TCP timeout delays hcheck disabling a node
           Product: Apache httpd-2
           Version: 2.4.25
          Hardware: Sun
                OS: Solaris
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: mod_proxy_hcheck
          Assignee: [hidden email]
          Reporter: [hidden email]
  Target Milestone: ---

Created attachment 34892
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34892&action=edit
added new hcconnectiontimeout parameter

Using the latest patched mod_proxy_hcheck (with the patch from bug 60071), I
encountered a problematic situation.
If a node goes down due to a complete failure and is no longer reachable via
TCP/IP, the long Solaris TCP connect timeout means mod_proxy_hcheck DISABLES
the node very late.
mod_proxy_hcheck does not provide a connection-timeout parameter to shorten
this.
On top of that, the thread pool defined via ProxyHCTPsize quickly fills up,
with all available threads waiting for the timeout. A workaround is to increase
ProxyHCTPsize to e.g. 500, but the underlying problem remains: once the node
goes down, it is not DISABLED until the first timeout has been reached. Solaris
has a timeout of about 120s, so the failed node still receives requests during
this time. These requests run into the "connectiontimeout", which is still not
a good situation because it slows down many requests.
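
As a rough illustration of why the thread pool fills up, here is a minimal
Python sketch of the arithmetic. It assumes (this is not confirmed by the
module source) that a new check is started for a node every hcinterval seconds
even while earlier checks are still blocked in connect. The function name
stuck_checks is made up for this example.

```python
import math


def stuck_checks(connect_timeout_s: float, hcinterval_s: float,
                 dead_nodes: int = 1) -> int:
    """Estimate how many health-check threads are blocked at the same time.

    Assumption: one new check per node is started every hcinterval_s seconds,
    and each check blocks for connect_timeout_s before it fails.
    """
    per_node = math.ceil(connect_timeout_s / hcinterval_s)
    return per_node * dead_nodes


# ~120s OS connect timeout, hcinterval=2, one dead node
print(stuck_checks(120, 2))      # 60
# same node with a 1s connect timeout for the health check
print(stuck_checks(1, 2))        # 1
```

Under that assumption, every unreachable node ties up roughly 60 threads with
the 120s timeout, so the number of blocked threads grows with each additional
dead node, which matches the need for a very large ProxyHCTPsize.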

I have patched mod_proxy_hcheck.c (mostly by copy/paste) and added a new
parameter called "hcconnectiontimeout". With this new parameter my tests now
look good.
An example configuration looks like this:

   SSLProxyEngine On
   SSLProxyVerify none
   SSLProxyCheckPeerCN off
   SSLProxyCheckPeerName off
   SSLProxyCheckPeerExpire off

   ProxyHCTPsize 400
   ProxyHCExpr get {hc('body') =~ /OK/}
   ProxyHCTemplate server hcmethod=GET hcexpr=get hcfails=1 hcinterval=2 hcpasses=1 hcuri=/tester
   <Proxy balancer://group>
      BalancerMember https://192.168.0.2:8080 connectiontimeout=1 hcconnectiontimeout=1 hctemplate=server
      BalancerMember https://192.168.0.3:8080 connectiontimeout=1 hcconnectiontimeout=1 hctemplate=server
   </Proxy>
<VirtualHost *:80>

   ProxyPass "/" "balancer://group/" failontimeout=On timeout=2
   ProxyPassReverse "/" "balancer://group/"

</VirtualHost>
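
To see the effect of a bounded connect timeout outside of httpd, a small
Python probe can be used. This is only a sketch of the general idea (fail the
connect after a fixed time instead of waiting for the OS default); it is not
how the patch is implemented, and the function name tcp_probe is made up for
this example.

```python
import socket


def tcp_probe(host: str, port: int, connect_timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection succeeds within connect_timeout_s."""
    try:
        # create_connection() applies the timeout to the connect attempt,
        # so an unreachable host fails after connect_timeout_s seconds
        # instead of after the operating system's default timeout.
        with socket.create_connection((host, port), timeout=connect_timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False
```

For example, tcp_probe("192.168.0.2", 8080, 1.0) should return False after
about one second if that node has dropped off the network, whereas a connect
without an explicit timeout would block until the OS gives up.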

I hope this helps someone.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

[Bug 60948] Large TCP timeout delays hcheck disabling a node

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948

Michael Renz <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #34892|0                           |1
        is obsolete|                            |

--- Comment #1 from Michael Renz <[hidden email]> ---
Created attachment 34893
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34893&action=edit
I forgot to allow it in ProxyHCTemplate and the parameter is now optional

I forgot to allow the parameter in ProxyHCTemplate; that is fixed in this
attachment, and the parameter is now optional.

[Bug 60948] Large TCP timeout delays hcheck disabling a node

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948

--- Comment #2 from Thomas Meyer <[hidden email]> ---
Hi, are there any updates on this?

An independent timeout for the health check HTTP request would be really
helpful!

The patch looks okay. Is there anything I can do to get this merged?

[Bug 60948] Large TCP timeout delays hcheck disabling a node

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948

Christophe JAILLET <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |PatchAvailable
