[Bug 59237] New: http2 breaks external auth after a few successful requests

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 59237] New: http2 breaks external auth after a few successful requests

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=59237

            Bug ID: 59237
           Summary: http2 breaks external auth after a few successful
                    requests
           Product: Apache httpd-2
           Version: 2.4.18
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: critical
          Priority: P2
         Component: mod_http2
          Assignee: [hidden email]
          Reporter: [hidden email]

I have found that when using HTTP2 in Apache 2.4.18 mod_authnz_external will
break after a few (5-20) requests* even though it worked fine for those first
ones. Suddenly some request is unauth. 401 and re-auth does not work anymore
(using correct password!). Tested using Opera and Firefox. It does not happen
when using HTTP1.1 (disable in Apache or use IE) or when using Apache builtin
auth (e.g. Digest). It appears to happen when a lot of requests are made by the
browser (maybe additional h2 streams are opened or some other pipelining
causing race conditions?).

    Protocols h2 http/1.1 # h2 breaks ext. auth - http1.1 OR builtin auth
digest works
    DefineExternalAuth unixuser pipe /usr/sbin/pwauth
    AuthType Basic
    AuthName blah
    AuthBasicProvider external
    AuthExternal unixuser
    Require valid-user

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 59237] http2 breaks external auth after a few successful requests

Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=59237

--- Comment #1 from [hidden email] ---
*
[Fri Mar 25 23:18:16.121542 2016] [auth_basic:error] [pid 8506:tid
139833380353792] [client :14641] AH01617: user k: authentication failure for "/
/": Password Mismatch
[Fri Mar 25 23:18:16.121498 2016] [authnz_external:error] [pid 8506:tid
139833489458944] [client :14641] External authenticator died on signal 15,
referer:
[Fri Mar 25 23:18:16.121511 2016] [authnz_external:error] [pid 8506:tid
139833380353792] [client :14641] AuthExtern unixuser [/usr/sbin/pwauth]: Failed
(-2) for user k, referer:

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 59237] http2 breaks external auth after a few successful requests

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=59237

Stefan Eissing <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #2 from Stefan Eissing <[hidden email]> ---
Hi, thanks for your report.

I had a quick look at mod_auth_external code and, to my understanding, it
starts in your configuration the 'pwauth' script for every request, be it a
http/1.1 or h2 request. As far as I know, there is no principal difference how
this is invoked.

From your log I can see that pwauth receives a SIGTERM. This is the timeout
handler that tries to stop pwauth when it takes too long. So, it seems that the
script, for some reason, is hanging, maybe trying to access PAM?

If you run pwauth as a root user, and give its input username/password, do you
see timely behavior?

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 59237] http2 breaks external auth after a few successful requests

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=59237

--- Comment #3 from Stefan Eissing <[hidden email]> ---
If no additional information is available, I will soon close this report.

--
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 59237] http2 breaks external auth after a few successful requests

Bugzilla from bugzilla@apache.org
In reply to this post by Bugzilla from bugzilla@apache.org
https://bz.apache.org/bugzilla/show_bug.cgi?id=59237

John Morrissey <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email]
             Status|NEEDINFO                    |NEW

--- Comment #4 from John Morrissey <[hidden email]> ---
I recently experienced the same symptoms as Karim (Apache 2.4.25,
mod_authnz_external 3.3.2) when enabling HTTP2: basic auth fails randomly.

This doesn't look like a timeout on the pwauth subprocess. I can run it in a
continuous loop for as long as I want. Each execution consistently takes ~0.03s
of wallclock.

[jwm@boost:pts/9 ~> while :; do echo -e 'some-user\nsome-password' | time sudo
pwauth; if [ $? -ne 0 ]; then echo $?; fi; done
0.02user 0.00system 0:00.03elapsed 87%CPU (0avgtext+0avgdata 3712maxresident)k
0inputs+0outputs (0major+574minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 93%CPU (0avgtext+0avgdata 3576maxresident)k
0inputs+0outputs (0major+574minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 80%CPU (0avgtext+0avgdata 3720maxresident)k
0inputs+0outputs (0major+575minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 90%CPU (0avgtext+0avgdata 3724maxresident)k
0inputs+0outputs (0major+577minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 93%CPU (0avgtext+0avgdata 3708maxresident)k
0inputs+0outputs (0major+576minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 90%CPU (0avgtext+0avgdata 3728maxresident)k
0inputs+0outputs (0major+578minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 90%CPU (0avgtext+0avgdata 3664maxresident)k
0inputs+0outputs (0major+571minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 90%CPU (0avgtext+0avgdata 3668maxresident)k
0inputs+0outputs (0major+575minor)pagefaults 0swaps
[...]

OTOH, it looks like the Apache worker is killing the pwauth child before it's
had a chance to do anything at all:

[pid 13021] <... clone resumed> child_stack=NULL,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f17728969d0) = 14222
[pid 13020] <... wait4 resumed> 0x7ffff3893bac, WNOHANG|WSTOPPED, NULL) = 0
strace: Process 14222 attached
[pid 13020] kill(14222, SIGTERM <unfinished ...>

The backtrace (breakpointed on apr_proc_kill()) shows:

#0  apr_proc_kill (proc=0x7fffdc6faac0, signum=signum@entry=15)
    at ./threadproc/unix/signals.c:36
#1  0x00007ffff7725e0b in free_proc_chain (procs=0x7fffdbe2f068)
    at ./memory/unix/apr_pools.c:2464
#2  0x00007ffff7726b90 in apr_pool_destroy (pool=0x7fffdbe30028)
    at ./memory/unix/apr_pools.c:817
#3  0x00007ffff3e35f0c in h2_blist_cleanup (bl=bl@entry=0x7fffec21d2a8)
    at h2_bucket_beam.c:400
#4  0x00007ffff3e35ff7 in beam_send_cleanup (data=0x7fffec21d278) at
h2_bucket_beam.c:455
#5  0x00007ffff7726b36 in run_cleanups (cref=0x7fffdbe36098)
    at ./memory/unix/apr_pools.c:2352
#6  apr_pool_destroy (pool=0x7fffdbe36028) at ./memory/unix/apr_pools.c:804
#7  0x00007ffff3e41246 in task_destroy (m=0x7fffec2252d0, task=0x7fffdbe320a0,
    called_from_master=<optimized out>) at h2_mplx.c:394
#8  0x00007ffff3e41460 in stream_done (m=m@entry=0x7fffec2252d0,
    stream=stream@entry=0x7fffec21d0a0, rst_error=0) at h2_mplx.c:460
#9  0x00007ffff3e4279d in h2_mplx_stream_done (m=0x7fffec2252d0,
    stream=stream@entry=0x7fffec21d0a0) at h2_mplx.c:654
#10 0x00007ffff3e4a057 in h2_session_stream_done (session=0x7fffec2250a0,
    stream=0x7fffec21d0a0) at h2_session.c:85
#11 0x00005555555a5ce6 in remove_empty_buckets (bb=bb@entry=0x7ffff7e088c0)
    at core_filters.c:720
#12 0x00005555555a6112 in send_brigade_nonblocking (s=s@entry=0x7ffff7e080a0,
    bb=bb@entry=0x7ffff7e088c0,
bytes_written=bytes_written@entry=0x7fffec22f318,
    c=c@entry=0x7ffff7e08290) at core_filters.c:710
#13 0x00005555555a6e41 in send_brigade_blocking (c=0x7ffff7e08290,
    bytes_written=0x7fffec22f318, bb=0x7ffff7e088c0, s=0x7ffff7e080a0)
    at core_filters.c:733
#14 ap_core_output_filter (f=0x7fffec22f148, new_bb=0x7ffff7e088c0) at
core_filters.c:542
#15 0x00007fffed751c23 in ssl_io_filter_output (f=0x7ffff7e08878,
bb=0x7fffec22f3c8)
    at ssl_engine_io.c:1716
#16 0x00007fffed74edce in ssl_io_filter_coalesce (f=0x7ffff7e08850,
bb=0x7fffec22f3c8)
    at ssl_engine_io.c:1663
#17 0x00007ffff3e3a58c in pass_output (io=io@entry=0x7fffec2250e8,
flush=flush@entry=1,
    session_eoc=session_eoc@entry=0x0) at h2_conn_io.c:296
#18 0x00007ffff3e3aa5c in h2_conn_io_flush (io=io@entry=0x7fffec2250e8)
    at h2_conn_io.c:327
#19 0x00007ffff3e4a4aa in h2_session_ev_no_io (msg=0x0, arg=0,
session=0x7fffec2250a0)
    at h2_session.c:1840
#20 dispatch_event (session=session@entry=0x7fffec2250a0,
    ev=ev@entry=H2_SESSION_EV_NO_IO, arg=arg@entry=0, msg=msg@entry=0x0)
    at h2_session.c:1993
#21 0x00007ffff3e4e125 in h2_session_process (session=0x7fffec2250a0,
    async=async@entry=0) at h2_session.c:2239
#22 0x00007ffff3e39aca in h2_conn_run (ctx=ctx@entry=0x7fffec22f2d0,
    c=c@entry=0x7ffff7e08290) at h2_conn.c:212
#23 0x00007ffff3e3f9fb in h2_h2_process_conn (c=0x7ffff7e08290) at h2_h2.c:658
#24 0x00005555555b54f0 in ap_run_process_connection (c=c@entry=0x7ffff7e08290)
    at connection.c:42
#25 0x00005555555b5a40 in ap_process_connection (c=c@entry=0x7ffff7e08290,
    csd=<optimized out>) at connection.c:226
#26 0x00007ffff2f236bf in child_main (child_num_arg=child_num_arg@entry=0,
    child_bucket=child_bucket@entry=0) at prefork.c:723
#27 0x00007ffff2f238f2 in make_child (s=0x7ffff7fbf4a0, slot=slot@entry=0,
    bucket=bucket@entry=0) at prefork.c:767
#28 0x00007ffff2f24e37 in prefork_run (_pconf=<optimized out>,
plog=0x7ffff7fbc028,
    s=0x7ffff7fbf4a0) at prefork.c:979
#29 0x000055555558edde in ap_run_mpm (pconf=0x7ffff7fed028,
plog=0x7ffff7fbc028,
    s=0x7ffff7fbf4a0) at mpm_common.c:94
#30 0x0000555555587a1d in main (argc=<optimized out>, argv=<optimized out>) at
main.c:783

(gdb) print *proc
$2 = {pid = 8950, in = 0x7fffdbe24d60, out = 0x0, err = 0x0}

PID 8950 is indeed pwauth:

root          8950  8950  0.0  0.0      0     0 pts/1 Z    -            17:31
[pwauth] <defunct>

I don't know enough about the Apache 2.4 process model to say much more. It
looks like each prefork worker now has two threads. In this case, one of the
worker's threads clone()s to prepare the child process, but it's the worker's
*other* thread that kills the child process immediately.

How can I help debug this further?

--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Loading...