[Bug 63273] New: Proxy error doesnot enable the worker even retry property is enabled.

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=63273

            Bug ID: 63273
           Summary: Proxy error doesnot enable the worker even retry
                    property is enabled.
           Product: Apache httpd-2
           Version: 2.4.18
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: major
          Priority: P2
         Component: mod_proxy
          Assignee: [hidden email]
          Reporter: [hidden email]
  Target Milestone: ---

We have set up a reverse proxy on frontend servers, which connect to backend
servers through a load balancer.

Apache reverse proxy configuration:
ProxyPass http://<load-balancer-DNS>/context/ retry=0 timeout=30

When we run a load test, the worker for the backend connection fails and never
reconnects to that backend, even though `retry=0` is specified.

Apache error log:

[Wed Mar 20 12:23:26.292232 2019] [proxy:trace2] [pid 36176:tid
140617447184128] proxy_util.c(2765): HTTP: fam 2 socket created to connect to
<LOAD-BALANACER-DNS>
[Wed Mar 20 12:23:56.322423 2019] [proxy:error] [pid 36176:tid 140617447184128]
(70007)The timeout specified has expired: AH00957: HTTP: attempt to connect to
10.19.134.64:80 (<LOAD-BALANACER-DNS>) failed
[Wed Mar 20 12:23:56.322496 2019] [proxy:error] [pid 36176:tid 140617447184128]
AH00959: ap_proxy_connect_backend disabling worker for (<LOAD-BALANACER-DNS>)
for 0s
[Wed Mar 20 12:23:56.322509 2019] [proxy:debug] [pid 36176:tid 140617447184128]
proxy_util.c(2175): AH00943: HTTP: has released connection for
(<LOAD-BALANACER-DNS>)
[Wed Mar 20 12:25:57.590705 2019] [proxy:trace2] [pid 36178:tid
140617304508160] proxy_util.c(1966): [client 14.142.125.100:42564] http: found
worker http://<LOAD-BALANACER-DNS>/<API-CONTEXT>/ for
http://<LOAD-BALANACER-DNS>/<API-ENDPOINT>, referer:
https://<APPLICATION-ENDPOINT>/
[Wed Mar 20 12:25:57.590740 2019] [proxy:debug] [pid 36178:tid 140617304508160]
mod_proxy.c(1160): [client 14.142.125.100:42564] AH01143: Running scheme http
handler (attempt 0), referer: https://<APPLICATION-ENDPOINT>/
[Wed Mar 20 12:25:57.590748 2019] [proxy:debug] [pid 36178:tid 140617304508160]
proxy_util.c(1904): AH00932: HTTP: worker for (<LOAD-BALANACER-DNS>) has been
marked for retry
[Wed Mar 20 12:25:57.590767 2019] [proxy:debug] [pid 36178:tid 140617304508160]
proxy_util.c(2160): AH00942: HTTP: has acquired connection for
(<LOAD-BALANACER-DNS>)
[Wed Mar 20 12:25:57.590772 2019] [proxy:debug] [pid 36178:tid 140617304508160]
proxy_util.c(2213): [client 14.142.125.100:42564] AH00944: connecting
http://<LOAD-BALANACER-DNS>/<API-ENDPOINT> to <LOAD-BALANACER-DNS>:80, referer:
https://<APPLICATION-ENDPOINT>/
[Wed Mar 20 12:25:57.590779 2019] [proxy:debug] [pid 36178:tid 140617304508160]
proxy_util.c(2422): [client 14.142.125.100:42564] AH00947: connected
/<API-ENDPOINT> to <LOAD-BALANACER-DNS>:80, referer:
https://<APPLICATION-ENDPOINT>/
[Wed Mar 20 12:25:57.590798 2019] [proxy:trace2] [pid 36178:tid
140617304508160] proxy_util.c(2765): HTTP: fam 2 socket created to connect to
<LOAD-BALANACER-DNS>
[Wed Mar 20 12:26:00.588182 2019] [proxy:debug] [pid 36178:tid 140617304508160]
proxy_util.c(2790): (113)No route to host: AH00957: HTTP: attempt to connect to
10.19.136.229:80 (<LOAD-BALANACER-DNS>) failed

--------------------------------------------------------------------------
After restarting the Apache server, the workers start reconnecting to the backend.

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

--- Comment #1 from Eric Covener <[hidden email]> ---
Does DNS change after the restart?

--- Comment #2 from Bhushan Jade <[hidden email]> ---
(In reply to Eric Covener from comment #1)
> Does DNS change after the restart?
No. We are only restarting the Apache service. The backend load balancer is an
AWS ELB, which responds when we hit it directly using the LOAD-BALANACER-DNS URL.
Our setup looks like this:
[Front end Server(Apache)]<---elb--->[Backend Server]

--- Comment #3 from Bhushan Jade <[hidden email]> ---
Once the worker connection is lost, it is not re-established even though
retry=0 is set. It only recovers when the Apache service is restarted.

--- Comment #4 from Dave Rager <[hidden email]> ---
Hello, it looks like I may have encountered this recently in our Production
environment. We have two front end servers that connect to backend servers
through an AWS load balancer similar to what is described here. Both began to
fail about the same time with this error.

I would like to try to recreate this in our Test environment but I'm unsure
what triggered it. Does anyone have any pointers on how to reproduce it?

Dave Rager <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]

--- Comment #5 from Dave Rager <[hidden email]> ---
I believe the issue is related to how AWS ELBs scale and how Apache workers are
configured by default.

From Apache docs:
"When connection reuse is enabled, each backend domain is resolved only once
per child process, and cached for all further connections until the child is
recycled."

When resolving an AWS ELB hostname "by default, Elastic Load Balancing will
return multiple IP addresses when clients perform a DNS resolution, with the
records being randomly ordered on each DNS resolution request."

Under load, AWS ELBs will scale to handle the traffic. (Not the same as scaling
application servers behind the ELB).

"The Elastic Load Balancing service will update the Domain Name System (DNS)
record of the load balancer when it scales so that the new resources have their
respective IP addresses registered in DNS."

I believe what is happening is, under load, previously unused workers are
started, resolve the ELB host and receive a "new" IP for the ELB. Once the load
subsides, the ELBs scale down and one or more IP addresses now cached by
workers are no longer valid. Because the worker has cached that IP, it tries to
use it the next time it receives a request which then fails with the described
error.

Regardless of the value of 'retry', that cached IP address will never be
refreshed and the worker will always fail until it is recycled.

Using the parameter 'disablereuse=on' (or 'enablereuse=off') will force the
worker to resolve the hostname to get a new IP.

Also note that it is better to leave 'retry' at its default value of 60, in
case a worker resolves an IP address that hasn't yet been removed from the DNS
record when scaling down:

"The DNS record that is created includes a Time-to-Live (TTL) setting of 60
seconds, with the expectation that clients will re-lookup the DNS at least
every 60 seconds."
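
As a rough sketch of the workaround described above, the ProxyPass line from the
original report might be changed along these lines. The local path "/context/"
is an assumption (the report does not show it), and the hostname is still the
reporter's placeholder, so treat this as an illustration rather than a tested
configuration:

```apache
# Assumed local path "/context/"; the original report omits the first argument.
# <load-balancer-DNS> is the reporter's placeholder for the ELB hostname.
# disablereuse=On closes the backend connection after each request, so the
# hostname is resolved again instead of reusing a cached address.
# retry is left unset so that it keeps its default of 60 seconds.
ProxyPass "/context/" "http://<load-balancer-DNS>/context/" disablereuse=On timeout=30
```

The trade-off is that every proxied request opens a new connection to the load
balancer, which gives up the benefit of persistent backend connections.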
