request_rec.unparsed_uri missing scheme and host. parsed_uri missing most fields

Paul Callahan
Hello,
I'm having trouble getting the full URI of a request from request_rec.
The comment string for request_rec.unparsed_uri makes it sound like it
should have the entire URL, e.g. http://hostname/path?etc.

But it only has the path and the query parameters.

The parsed_uri struct is populated with port, path and query parameters.
Everything else (scheme, hostname, username, password, etc.) is null.

I set a breakpoint in "apr_uri_parse()" and verified the incoming *uri
field only has the path and query parameters.

Is this expected? How can I get the full URI?

Thanks

Re: request_rec.unparsed_uri missing scheme and host. parsed_uri missing most fields

Sorin Manolache
On 14/05/2019 20.35, Paul Callahan wrote:

> Hello,
> I'm having trouble getting the full URI of a request from request_rec.
> The comment string for request_rec.unparsed_uri makes it sound like it
> should have the entire URL, e.g. http://hostname/path?etc.
>
> But it only has the path and the query parameters.
>
> The parsed_uri struct is populated with port, path and query parameters.
> Everything else (scheme, hostname, username, password, etc.) is null.
>
> I set a breakpoint in "apr_uri_parse()" and verified the incoming *uri
> field only has the path and query parameters.
>
> Is this expected? How can I get the full URI?

Hello,

Yes, it is expected.

When the client (meaning a program, not a human) makes a request, it
sends the following first line over the network connection:

GET /path?arg1=val1&arg2=val2 HTTP/1.1

(I assume here that it uses version 1.1 of the HTTP protocol.)

In HTTP/1.1 a "Host" header must be present (it is not required in
HTTP/1.0, but there is little HTTP/1.0 traffic nowadays).

So you might get

GET /path?arg1=val1&arg2=val2 HTTP/1.1
Host: www.example.com

A browser will decompose the address
http://www.example.com/path?arg1=val1&arg2=val2 that you type in its
address bar and generate the two text lines shown above.

But the server will not receive the string
http://www.example.com/path?arg1=val1&arg2=val2
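
To make the decomposition concrete, here is a minimal sketch in plain C of roughly what a client does with the typed address. The helper name split_url is hypothetical, and the sketch assumes a simple URL with no userinfo ("user:pass@") part; a real client does considerably more.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an Apache or APR function): split an absolute
 * URL into the Host header value and the request target that a client
 * would put on the request line. Returns 0 on success, -1 on error. */
int split_url(const char *url, char *host, size_t host_cap,
              char *target, size_t target_cap)
{
    assert(url != NULL && host != NULL && target != NULL);

    /* Skip the scheme; it is never sent on the wire. */
    const char *sep = strstr(url, "://");
    if (sep == NULL)
        return -1;
    const char *authority = sep + 3;

    /* The authority ends at the first '/', '?' or '#', or at end of string. */
    size_t host_len = strcspn(authority, "/?#");
    if (host_len == 0 || host_len >= host_cap)
        return -1;
    memcpy(host, authority, host_len);
    host[host_len] = '\0';

    /* Everything after the authority, minus any fragment, is the target. */
    const char *rest = authority + host_len;
    size_t rest_len = strcspn(rest, "#");

    int n;
    if (rest_len == 0)
        n = snprintf(target, target_cap, "/");          /* empty path */
    else if (rest[0] == '?')
        n = snprintf(target, target_cap, "/%.*s", (int)rest_len, rest);
    else
        n = snprintf(target, target_cap, "%.*s", (int)rest_len, rest);

    return (n < 0 || (size_t)n >= target_cap) ? -1 : 0;
}
```

For the address above, host receives the value of the Host header and target receives the part that ends up in the request line, which is the same part you see in unparsed_uri.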

Moreover, http:// or https:// are not sent by the client. It is the
server (Apache) that determines (reconstructs) the scheme (i.e. http://
or https://) from the port and the transport protocol (SSL/TLS or plain
text) used by the request.
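
The reconstruction can be sketched in plain C as follows. This is only an illustration under stated assumptions: build_full_url is a hypothetical helper, is_tls stands for "the connection uses SSL/TLS", and host is the hostname from the Host header without a port.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not an Apache or APR function): rebuild the full
 * URL from what the server knows about the connection plus what the
 * client sent. Returns 0 on success, -1 if the buffer is too small. */
int build_full_url(int is_tls, const char *host, unsigned port,
                   const char *target, char *out, size_t cap)
{
    assert(host != NULL && target != NULL && out != NULL);

    /* The scheme follows from the transport, not from the request. */
    const char *scheme = is_tls ? "https" : "http";
    unsigned default_port = is_tls ? 443u : 80u;

    int n;
    if (port == default_port)
        n = snprintf(out, cap, "%s://%s%s", scheme, host, target);
    else
        n = snprintf(out, cap, "%s://%s:%u%s", scheme, host, port, target);

    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

Inside a module you should not need to write this yourself: ap_construct_url(r->pool, r->unparsed_uri, r), declared in http_core.h, builds a full URL from the request, using the server's configured name and port settings. If you only need the pieces, ap_http_scheme(r), r->hostname and ap_get_server_port(r) are available.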

The HTTP RFC (https://tools.ietf.org/html/rfc7230) has more details.
Section 5.3, which describes the forms a request target can take, might
be of particular interest to you.

HTH,
Sorin