Re: [Bug 64533] Http crashes observed during fuzzing testing

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: [Bug 64533] Http crashes observed during fuzzing testing

Yann Ylavic
On Wed, Sep 30, 2020 at 1:40 PM <[hidden email]> wrote:

>
> https://bz.apache.org/bugzilla/show_bug.cgi?id=64533
>
> --- Comment #29 from Ruediger Pluem <[hidden email]> ---
> There seem to be further ones:
>
> #0  0x00007fcf591f2664 in __do_global_dtors_aux () from
> /lib64/libnss_files.so.2
> [Current thread is 1 (Thread 0x7fcf59dc0900 (LWP 3563021))]
> (gdb) bt
> #0  0x00007fcf591f2664 in __do_global_dtors_aux () from
> /lib64/libnss_files.so.2
> #1  0x00007fcf5a30d2eb in _dl_fini () at dl-fini.c:138
> #2  0x00007fcf59f0ee87 in __run_exit_handlers (status=status@entry=0,
> listp=0x7fcf5a092578 <__exit_funcs>,
>     run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
> at exit.c:108
> #3  0x00007fcf59f0f040 in __GI_exit (status=status@entry=0) at exit.c:139
> #4  0x00007fcf5921d716 in clean_child_exit (code=code@entry=0) at event.c:738
> #5  0x00007fcf5921d73d in just_die (sig=<optimized out>) at event.c:743
> #6  <signal handler called>

I think this trace may be caused by OpenSSL shutdown in child
processes, registered with atexit or alike (either cleanup code is
called after the DOS is unloaded, or it points to data on pchild).
We discussed this already in [1], I don't think we should let atexit
code run for children, and just call _exit() in *nix MPMs.

But if some threads are still "using" pchild in between
apr_pool_destroy(pchild) and _exit() we might crash at child shutdown
still (like in Joe's trace I think), so possibly we don't want to
destroy pchild either for ungraceful shutdowns (graceful ones are OK
because we wait for workers).

[1] https://lists.apache.org/thread.html/16ae4b2ff5a52b1af320b081bde4dfb02e0c28dd1253572beb84dd72%40%3Cdev.httpd.apache.org%3E

>
> and
>
> #0  0x00007fcf59e0b5f0 in __do_global_dtors_aux () from /lib64/liblzma.so.5
> #1  0x00007fcf5a30d2eb in _dl_fini () from /lib64/ld-linux-x86-64.so.2
> #2  0x00007fcf59f0ee87 in __run_exit_handlers () from /lib64/libc.so.6
> #3  0x00007fcf59f0f040 in exit () from /lib64/libc.so.6
> #4  0x00007fcf5921d716 in clean_child_exit () from
> /usr/lib64/httpd/modules/mod_mpm_event.so
> #5  0x00007fcf5921d73d in just_die () from
> /usr/lib64/httpd/modules/mod_mpm_event.so
> #6  <signal handler called>

Same case here possibly.

>
> Do we ever use liblzma in vanialla httpd?

I don't think so,


Regards;
Yann.
Reply | Threaded
Open this post in threaded view
|

Re: [Bug 64533] Http crashes observed during fuzzing testing

Ruediger Pluem
Looking at the latest backtraces, I see the following:

#7  0x00007fcf5921d716 in clean_child_exit (code=code@entry=0) at event.c:738
#8  0x00007fcf5921d73d in just_die (sig=<optimized out>) at event.c:743
#9  <signal handler called>
#10 pthread_sigmask (how=how@entry=1, newmask=<optimized out>, newmask@entry=0x7fff02b26230, oldmask=oldmask@entry=0x0) at
../sysdeps/unix/sysv/linux/pthread_sigmask.c:48
#11 0x00007fcf5921ccd5 in unblock_signal (sig=sig@entry=15) at event.c:1264
#12 0x00007fcf5921e5d4 in child_main (child_num_arg=child_num_arg@entry=7, child_bucket=child_bucket@entry=0) at event.c:2586
#13 0x00007fcf5921e914 in make_child (s=0x563ebcececf0, slot=slot@entry=7, bucket=bucket@entry=0) at event.c:2691
#14 0x00007fcf5921f290 in perform_idle_server_maintenance (num_buckets=<optimized out>, child_bucket=<optimized out>) at event.c:2886
#15 server_main_loop (num_buckets=1, remaining_children_to_start=0) at event.c:3015
#16 event_run (_pconf=<optimized out>, plog=<optimized out>, s=<optimized out>) at event.c:3092
#17 0x0000563ebc837ce0 in ap_run_mpm (pconf=0x563ebcea5a48, plog=0x563ebced2c68, s=0x563ebcececf0) at mpm_common.c:94
#18 0x0000563ebc821eb3 in main (argc=<optimized out>, argv=<optimized out>) at main.c:819


Would the following patch makes sense (against trunk)?

Index: server/mpm/event/event.c
===================================================================
--- server/mpm/event/event.c (revision 1882251)
+++ server/mpm/event/event.c (working copy)
@@ -2897,8 +2897,8 @@
          * the other threads in the process needs to take us down
          * (e.g., for MaxConnectionsPerChild) it will send us SIGTERM
          */
+        apr_signal(SIGTERM, dummy_signal_handler);
         unblock_signal(SIGTERM);
-        apr_signal(SIGTERM, dummy_signal_handler);
         /* Watch for any messages from the parent over the POD */
         while (1) {
             rv = ap_mpm_podx_check(my_bucket->pod);


It looks like a queued SIG_TERM is delivered to the current SIG_TERM handler immediately after we unblocked it before we could
change the handler.

Regards

RĂ¼diger

On 9/30/20 3:59 PM, Yann Ylavic wrote:

> On Wed, Sep 30, 2020 at 1:40 PM <[hidden email]> wrote:
>>
>> https://bz.apache.org/bugzilla/show_bug.cgi?id=64533
>>
>> --- Comment #29 from Ruediger Pluem <[hidden email]> ---
>> There seem to be further ones:
>>
>> #0  0x00007fcf591f2664 in __do_global_dtors_aux () from
>> /lib64/libnss_files.so.2
>> [Current thread is 1 (Thread 0x7fcf59dc0900 (LWP 3563021))]
>> (gdb) bt
>> #0  0x00007fcf591f2664 in __do_global_dtors_aux () from
>> /lib64/libnss_files.so.2
>> #1  0x00007fcf5a30d2eb in _dl_fini () at dl-fini.c:138
>> #2  0x00007fcf59f0ee87 in __run_exit_handlers (status=status@entry=0,
>> listp=0x7fcf5a092578 <__exit_funcs>,
>>     run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
>> at exit.c:108
>> #3  0x00007fcf59f0f040 in __GI_exit (status=status@entry=0) at exit.c:139
>> #4  0x00007fcf5921d716 in clean_child_exit (code=code@entry=0) at event.c:738
>> #5  0x00007fcf5921d73d in just_die (sig=<optimized out>) at event.c:743
>> #6  <signal handler called>
>
> I think this trace may be caused by OpenSSL shutdown in child
> processes, registered with atexit or alike (either cleanup code is
> called after the DOS is unloaded, or it points to data on pchild).
> We discussed this already in [1], I don't think we should let atexit
> code run for children, and just call _exit() in *nix MPMs.
>
> But if some threads are still "using" pchild in between
> apr_pool_destroy(pchild) and _exit() we might crash at child shutdown
> still (like in Joe's trace I think), so possibly we don't want to
> destroy pchild either for ungraceful shutdowns (graceful ones are OK
> because we wait for workers).
>
> [1] https://lists.apache.org/thread.html/16ae4b2ff5a52b1af320b081bde4dfb02e0c28dd1253572beb84dd72%40%3Cdev.httpd.apache.org%3E
>
>>
>> and
>>
>> #0  0x00007fcf59e0b5f0 in __do_global_dtors_aux () from /lib64/liblzma.so.5
>> #1  0x00007fcf5a30d2eb in _dl_fini () from /lib64/ld-linux-x86-64.so.2
>> #2  0x00007fcf59f0ee87 in __run_exit_handlers () from /lib64/libc.so.6
>> #3  0x00007fcf59f0f040 in exit () from /lib64/libc.so.6
>> #4  0x00007fcf5921d716 in clean_child_exit () from
>> /usr/lib64/httpd/modules/mod_mpm_event.so
>> #5  0x00007fcf5921d73d in just_die () from
>> /usr/lib64/httpd/modules/mod_mpm_event.so
>> #6  <signal handler called>
>
> Same case here possibly.
>
>>
>> Do we ever use liblzma in vanialla httpd?
>
> I don't think so,
>
>
> Regards;
> Yann.
>
Reply | Threaded
Open this post in threaded view
|

Re: [Bug 64533] Http crashes observed during fuzzing testing

Joe Orton
On Fri, Oct 09, 2020 at 11:57:38AM +0200, Ruediger Pluem wrote:

> Would the following patch makes sense (against trunk)?
>
> Index: server/mpm/event/event.c
> ===================================================================
> --- server/mpm/event/event.c (revision 1882251)
> +++ server/mpm/event/event.c (working copy)
> @@ -2897,8 +2897,8 @@
>           * the other threads in the process needs to take us down
>           * (e.g., for MaxConnectionsPerChild) it will send us SIGTERM
>           */
> +        apr_signal(SIGTERM, dummy_signal_handler);
>          unblock_signal(SIGTERM);
> -        apr_signal(SIGTERM, dummy_signal_handler);
>          /* Watch for any messages from the parent over the POD */
>          while (1) {
>              rv = ap_mpm_podx_check(my_bucket->pod);
>
>
> It looks like a queued SIG_TERM is delivered to the current SIG_TERM handler immediately after we unblocked it before we could
> change the handler.

Oh, very nice catch.  Yes that looks exactly right, +1.

Regards, Joe