Re: svn commit: r1802875 - in /httpd/httpd/trunk: CHANGES server/mpm/event/event.c

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: svn commit: r1802875 - in /httpd/httpd/trunk: CHANGES server/mpm/event/event.c

Yann Ylavic
On Tue, Jul 25, 2017 at 1:19 AM,  <[hidden email]> wrote:
>
> New abort_socket_nonblocking() allows to reset connections when nonblocking is
> required and we can't do much about the connection anymore, nor want the system
> to linger on its own after close().
>
[]

>
> +static void abort_socket_nonblocking(apr_socket_t *csd)
> +{
> +    apr_status_t rv;
> +    apr_socket_timeout_set(csd, 0);
> +#if defined(SOL_SOCKET) && defined(SO_LINGER)
> +    /* This socket is over now, and we don't want to block nor linger
> +     * anymore, so reset it. A normal close could still linger in the
> +     * system, while RST is fast, nonblocking, and what the peer will
> +     * get if it sends us further data anyway.
> +     */
> +    {
> +        apr_os_sock_t osd = -1;
> +        struct linger opt;
> +        opt.l_onoff = 1;
> +        opt.l_linger = 0; /* zero timeout is RST */
> +        apr_os_sock_get(&osd, csd);
> +        setsockopt(osd, SOL_SOCKET, SO_LINGER, (void *)&opt, sizeof opt);
> +    }
> +#endif
> +    rv = apr_socket_close(csd);
> +    if (rv != APR_SUCCESS) {
> +        ap_log_error(APLOG_MARK, APLOG_ERR, rv, ap_server_conf, APLOGNO(00468)
> +                     "error closing socket");
> +        AP_DEBUG_ASSERT(0);
> +    }
> +}

This change may be disputable...

I wanted to:
1. avoid blocking in any case (we can't where this is called),
2. avoid issueing a nonblocking "normal" close() which lets the system handle
   the burden of possibly leaking descriptors (until its own linger is done).

Objections?
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: svn commit: r1802875 - in /httpd/httpd/trunk: CHANGES server/mpm/event/event.c

Ruediger Pluem


On 07/25/2017 01:37 AM, Yann Ylavic wrote:

> On Tue, Jul 25, 2017 at 1:19 AM,  <[hidden email]> wrote:
>>
>> New abort_socket_nonblocking() allows to reset connections when nonblocking is
>> required and we can't do much about the connection anymore, nor want the system
>> to linger on its own after close().
>>
> []
>>
>> +static void abort_socket_nonblocking(apr_socket_t *csd)
>> +{
>> +    apr_status_t rv;
>> +    apr_socket_timeout_set(csd, 0);
>> +#if defined(SOL_SOCKET) && defined(SO_LINGER)
>> +    /* This socket is over now, and we don't want to block nor linger
>> +     * anymore, so reset it. A normal close could still linger in the
>> +     * system, while RST is fast, nonblocking, and what the peer will
>> +     * get if it sends us further data anyway.
>> +     */
>> +    {
>> +        apr_os_sock_t osd = -1;
>> +        struct linger opt;
>> +        opt.l_onoff = 1;
>> +        opt.l_linger = 0; /* zero timeout is RST */
>> +        apr_os_sock_get(&osd, csd);
>> +        setsockopt(osd, SOL_SOCKET, SO_LINGER, (void *)&opt, sizeof opt);
>> +    }
>> +#endif
>> +    rv = apr_socket_close(csd);
>> +    if (rv != APR_SUCCESS) {
>> +        ap_log_error(APLOG_MARK, APLOG_ERR, rv, ap_server_conf, APLOGNO(00468)
>> +                     "error closing socket");
>> +        AP_DEBUG_ASSERT(0);
>> +    }
>> +}
>
> This change may be disputable...
>
> I wanted to:
> 1. avoid blocking in any case (we can't where this is called),
> 2. avoid issueing a nonblocking "normal" close() which lets the system handle
>    the burden of possibly leaking descriptors (until its own linger is done).

What is the real use case for 2.?


Regards

RĂ¼diger
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: svn commit: r1802875 - in /httpd/httpd/trunk: CHANGES server/mpm/event/event.c

Yann Ylavic
On Tue, Jul 25, 2017 at 9:54 AM, Ruediger Pluem <[hidden email]> wrote:

>
> On 07/25/2017 01:37 AM, Yann Ylavic wrote:
>>
>> This change may be disputable...
>>
>> I wanted to:
>> 1. avoid blocking in any case (we can't where this is called),
>> 2. avoid issueing a nonblocking "normal" close() which lets the system handle
>>    the burden of possibly leaking descriptors (until its own linger is done).
>
> What is the real use case for 2.?

There are three cases where abort_socket_nonblocking() can be called:

a. close_worker_sockets(), as of ungraceful shutdown/restart, to kill sockets
   used by all the workers),

b. stop_lingering_close() to stop reading definitely from the socket when the
   linger timeout in place expires,

c. some push2worker() failure, we can't handle a new connection (or in write
   completion state) because of an internal error (a graceful is scheduled now
   since this commit).

For a. I think RST is the best we can do since we are forcibly
terminating active connections (possibly many), which we know nothing
about (flush/writ-ability), and for which normal close() could block
for an indefinite time (either in mpm_event if the socket is kept
blocking, or in the system if it's set nonblocking).

For b. the socket is already shutdown(), we lingered with the
appropriate timeout, it's time to RST further data (a normal close()
would be the same I think).

For c. we haven't read anything from a new connection or write
completion is now ready, in both cases the socket is writable (a new
connection is unlikely acceptable with a zero window) so we could
close() normally afterall.
Still we abort the connection outside the underlying protocol (HTTP or
else), knowingly with pending data in or out.
I think a normal close() could be misinterpreted in the write case (as
a usual end of connection, think non-HTTP/upgraded data in async),
while it would be equivalent for the newly accepted connection case
(if we close with data already received and ack-ed, the system will
RST by itself).

So possibly these are all legitimate RST cases (we do like TIME_WAITs,
but it seems not reachable with any of the above points probably I
think), but I may be missing something...


Regards,
Yann.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: svn commit: r1802875 - in /httpd/httpd/trunk: CHANGES server/mpm/event/event.c

Ruediger Pluem


On 07/25/2017 03:32 PM, Yann Ylavic wrote:

> On Tue, Jul 25, 2017 at 9:54 AM, Ruediger Pluem <[hidden email]> wrote:
>>
>> On 07/25/2017 01:37 AM, Yann Ylavic wrote:
>>>
>>> This change may be disputable...
>>>
>>> I wanted to:
>>> 1. avoid blocking in any case (we can't where this is called),
>>> 2. avoid issueing a nonblocking "normal" close() which lets the system handle
>>>    the burden of possibly leaking descriptors (until its own linger is done).
>>
>> What is the real use case for 2.?
>
> There are three cases where abort_socket_nonblocking() can be called:
>
> a. close_worker_sockets(), as of ungraceful shutdown/restart, to kill sockets
>    used by all the workers),
>
> b. stop_lingering_close() to stop reading definitely from the socket when the
>    linger timeout in place expires,
>
> c. some push2worker() failure, we can't handle a new connection (or in write
>    completion state) because of an internal error (a graceful is scheduled now
>    since this commit).
>
> For a. I think RST is the best we can do since we are forcibly
> terminating active connections (possibly many), which we know nothing
> about (flush/writ-ability), and for which normal close() could block
> for an indefinite time (either in mpm_event if the socket is kept
> blocking, or in the system if it's set nonblocking).
>
> For b. the socket is already shutdown(), we lingered with the
> appropriate timeout, it's time to RST further data (a normal close()
> would be the same I think).
>
> For c. we haven't read anything from a new connection or write
> completion is now ready, in both cases the socket is writable (a new
> connection is unlikely acceptable with a zero window) so we could
> close() normally afterall.
> Still we abort the connection outside the underlying protocol (HTTP or
> else), knowingly with pending data in or out.
> I think a normal close() could be misinterpreted in the write case (as
> a usual end of connection, think non-HTTP/upgraded data in async),
> while it would be equivalent for the newly accepted connection case
> (if we close with data already received and ack-ed, the system will
> RST by itself).
>
> So possibly these are all legitimate RST cases (we do like TIME_WAITs,
> but it seems not reachable with any of the above points probably I
> think), but I may be missing something...
>

I am still not convinced why we couldn't leave that job of
closing down the sockets (non blocking from application perspective) to the OS.
The best argument might be the case where we are in write completion
and close the connection regularly and this could be perceived on client
side as having received the full response. OTOH when do we sent replies
these days that are neither transfer encoded chunked or have a content-length
header (to talk about the HTTP layer)? So this case does not seem to be strong either.
So I am personally -0.5 on that part of the commit, but I am happy to
hear what others think.

Regards

RĂ¼diger

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: svn commit: r1802875 - in /httpd/httpd/trunk: CHANGES server/mpm/event/event.c

Yann Ylavic
On Tue, Jul 25, 2017 at 7:49 PM, Ruediger Pluem <[hidden email]> wrote:

>
>
> On 07/25/2017 03:32 PM, Yann Ylavic wrote:
>> On Tue, Jul 25, 2017 at 9:54 AM, Ruediger Pluem <[hidden email]> wrote:
>>>
>>> On 07/25/2017 01:37 AM, Yann Ylavic wrote:
>>>>
>>>> This change may be disputable...
>>>>
>>>> I wanted to:
>>>> 1. avoid blocking in any case (we can't where this is called),
>>>> 2. avoid issueing a nonblocking "normal" close() which lets the system handle
>>>>    the burden of possibly leaking descriptors (until its own linger is done).
>>>
>>> What is the real use case for 2.?
>>
>> There are three cases where abort_socket_nonblocking() can be called:
>>
>> a. close_worker_sockets(), as of ungraceful shutdown/restart, to kill sockets
>>    used by all the workers),
>>
>> b. stop_lingering_close() to stop reading definitely from the socket when the
>>    linger timeout in place expires,
>>
>> c. some push2worker() failure, we can't handle a new connection (or in write
>>    completion state) because of an internal error (a graceful is scheduled now
>>    since this commit).
>>
>> For a. I think RST is the best we can do since we are forcibly
>> terminating active connections (possibly many), which we know nothing
>> about (flush/writ-ability), and for which normal close() could block
>> for an indefinite time (either in mpm_event if the socket is kept
>> blocking, or in the system if it's set nonblocking).
>>
>> For b. the socket is already shutdown(), we lingered with the
>> appropriate timeout, it's time to RST further data (a normal close()
>> would be the same I think).
>>
>> For c. we haven't read anything from a new connection or write
>> completion is now ready, in both cases the socket is writable (a new
>> connection is unlikely acceptable with a zero window) so we could
>> close() normally afterall.
>> Still we abort the connection outside the underlying protocol (HTTP or
>> else), knowingly with pending data in or out.
>> I think a normal close() could be misinterpreted in the write case (as
>> a usual end of connection, think non-HTTP/upgraded data in async),
>> while it would be equivalent for the newly accepted connection case
>> (if we close with data already received and ack-ed, the system will
>> RST by itself).
>>
>> So possibly these are all legitimate RST cases (we do like TIME_WAITs,
>> but it seems not reachable with any of the above points probably I
>> think), but I may be missing something...
>>
>
> I am still not convinced why we couldn't leave that job of
> closing down the sockets (non blocking from application perspective) to the OS.

It could be considered a leak, the application (httpd) doesn't take
care of its resources and increases the number of orphaned sockets
which the system has to handle anyway (tcp memory increases and can
reach some limits, system has to be fine tuned to support more
orphans, ...).
Ungraceful restarts can be costly in that regard, new generation may
have issues creating sockets or performing if system (TCP) resources
become short.

> The best argument might be the case where we are in write completion
> and close the connection regularly and this could be perceived on client
> side as having received the full response. OTOH when do we sent replies
> these days that are neither transfer encoded chunked or have a content-length
> header (to talk about the HTTP layer)? So this case does not seem to be strong either.

trunk can already go async in mod_proxy_wstunnel for example, this is
not HTTP anymore there after the Upgrade.

> So I am personally -0.5 on that part of the commit, but I am happy to
> hear what others think.

No strong opinion to push that change either, I tend to think that
it's the right thing to do.
But 2.4.x is currently blocking in the a. case above, so it likely
happens that the child processes terminate before all active
connections are effectively closed, hence orphan sockets already exist
today I think, which is not a raised issue AFAICT.
b. and c. are more sporadic, so the system can probably handle their load.

Still the pathological case of any of [abc] above (multilpe successive
ungraceful shutdowns, many connections expiring in lingering close,
...) is something httpd should care about IMHO.


Regards,
Yann.
Loading...