Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Ruediger Pluem


On 6/28/20 1:41 AM, [hidden email] wrote:

> Author: minfrin
> Date: Sat Jun 27 23:41:00 2020
> New Revision: 1879285
>
> URL: http://svn.apache.org/viewvc?rev=1879285&view=rev
> Log:
> "[mod_dav_fs etag handling] should really honor the FileETag setting".
> - It now does.
> - Add "Digest" to FileETag directive, allowing a strong ETag to be
>   generated using a file digest.
> - Add ap_make_etag_ex() and ap_set_etag_fd() to allow full control over
>   ETag generation.
> - Add concept of "binary notes" to request_rec, allowing packed bit flags
>   to be added to a request.
> - First binary note - AP_REQUEST_STRONG_ETAG - allows modules to force
>   the ETag to a strong ETag to comply with RFC requirements, such as those
>   mandated by various WebDAV extensions.
>
> Modified:
>     httpd/httpd/trunk/CHANGES
>     httpd/httpd/trunk/docs/manual/mod/core.xml
>     httpd/httpd/trunk/include/ap_mmn.h
>     httpd/httpd/trunk/include/http_core.h
>     httpd/httpd/trunk/include/http_protocol.h
>     httpd/httpd/trunk/include/httpd.h
>     httpd/httpd/trunk/modules/dav/fs/repos.c
>     httpd/httpd/trunk/modules/test/mod_dialup.c
>     httpd/httpd/trunk/server/core.c
>     httpd/httpd/trunk/server/util_etag.c
>

> Modified: httpd/httpd/trunk/server/util_etag.c
> URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/server/util_etag.c?rev=1879285&r1=1879284&r2=1879285&view=diff
> ==============================================================================
> --- httpd/httpd/trunk/server/util_etag.c (original)
> +++ httpd/httpd/trunk/server/util_etag.c Sat Jun 27 23:41:00 2020

> @@ -73,13 +99,96 @@ AP_DECLARE(char *) ap_make_etag(request_
>      cfg = (core_dir_config *)ap_get_core_module_config(r->per_dir_config);
>      etag_bits = (cfg->etag_bits & (~ cfg->etag_remove)) | cfg->etag_add;
>  
> +    if (er->force_weak) {
> +        weak = ETAG_WEAK;
> +        weak_len = sizeof(ETAG_WEAK);
> +    }
> +
> +    if (r->vlist_validator) {
> +
> +        /* If we have a variant list validator (vlv) due to the
> +         * response being negotiated, then we create a structured
> +         * entity tag which merges the variant etag with the variant
> +         * list validator (vlv).  This merging makes revalidation
> +         * somewhat safer, ensures that caches which can deal with
> +         * Vary will (eventually) be updated if the set of variants is
> +         * changed, and is also a protocol requirement for transparent
> +         * content negotiation.
> +         */
> +
> +        /* if the variant list validator is weak, we make the whole
> +         * structured etag weak.  If we would not, then clients could
> +         * have problems merging range responses if we have different
> +         * variants with the same non-globally-unique strong etag.
> +         */
> +
> +        vlv = r->vlist_validator;
> +        if (vlv[0] == 'W') {
> +            vlv += 3;
> +            weak = ETAG_WEAK;
> +            weak_len = sizeof(ETAG_WEAK);
> +        }
> +        else {
> +            vlv++;
> +        }
> +        vlv_len = strlen(vlv);
> +
> +    }
> +    else {
> +        vlv = NULL;
> +        vlv_len = 0;
> +    }
> +
> +    /*
> +     * Did a module flag the need for a strong etag, or did the
> +     * configuration tell us to generate a digest?
> +     */
> +    if (er->finfo->filetype == APR_REG &&
> +            (AP_REQUEST_IS_STRONG_ETAG(r) || (etag_bits & ETAG_DIGEST))) {
> +
> +        apr_sha1_ctx_t context;
> +        unsigned char buf[4096]; /* keep this a multiple of 64 */
> +        unsigned char digest[APR_SHA1_DIGESTSIZE];
> +        apr_file_t *fd = NULL;
> +
> +        apr_size_t nbytes;
> +        apr_off_t offset = 0L;
> +
> +        if (er->fd) {
> +            fd = er->fd;
> +        }
> +        else if (er->pathname) {
> +             apr_file_open(&fd, er->pathname, APR_READ | APR_BINARY, 0, r->pool);
> +        }
> +        if (!fd) {
> +            return "";
> +        }
> +
> +        etag = apr_palloc(r->pool, weak_len + sizeof("\"\"") +
> +                4*(APR_SHA1_DIGESTSIZE/3) + vlv_len + 4);

Using apr_base64_encode_len in the formula above would make it easier to understand and read IMHO.

> +
> +        apr_sha1_init(&context);
> +        nbytes = sizeof(buf);
> +        while (apr_file_read(fd, buf, &nbytes) == APR_SUCCESS) {

I assume that the code has been taken from ap_md5digest, but
if we have MMAP available and if we have not disabled MMAP we use MMAP to read the contents of the file
if sendfile is not possible (e.g. SSL connections).
Wouldn't using MMAP more performant here especially in case of larger files?

> +            apr_sha1_update_binary(&context, buf, nbytes);
> +            nbytes = sizeof(buf);
> +        }
> +        apr_file_seek(fd, APR_SET, &offset);

Why do we always reset the file pointer to 0? Why don't we get the actual file pointer of fd before we do the reading
and restore it afterwards?

> +        apr_sha1_final(digest, &context> +
> +        etag_start(etag, weak, &next);
> +        next += apr_base64_encode_binary(next, digest, APR_SHA1_DIGESTSIZE) - 1;
> +        etag_end(next, vlv, vlv_len);
> +
> +        return etag;
> +    }
> +
>      /*
>       * If it's a file (or we wouldn't be here) and no ETags
>       * should be set for files, return an empty string and
>       * note it for the header-sender to ignore.
>       */
>      if (etag_bits & ETAG_NONE) {
> -        apr_table_setn(r->notes, "no-etag", "omit");
>          return "";
>      }
>  


Regards

Rüdiger
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Yann Ylavic
On Sun, Jun 28, 2020 at 1:41 AM <[hidden email]> wrote:
>
> Author: minfrin
> Date: Sat Jun 27 23:41:00 2020
> New Revision: 1879285
>
> URL: http://svn.apache.org/viewvc?rev=1879285&view=rev
[]
> --- httpd/httpd/trunk/include/httpd.h (original)
> +++ httpd/httpd/trunk/include/httpd.h Sat Jun 27 23:41:00 2020
[]

> +/**
> + * @defgroup bnotes Binary notes recognized by the server
> + * @ingroup APACHE_CORE_DAEMON
> + * @{
> + *
> + * @brief Binary notes recognized by the server.
> + */
> +
> +/**
> + * The type used for request binary notes.
> + */
> +typedef apr_uint64_t ap_request_bnotes_t;
> +
> +/**
> + * These constants represent bitmasks for notes associated with this
> + * request. There are space for 64 bits in the apr_uint64_t.
> + *
> + */
> +#define AP_REQUEST_STRONG_ETAG 1 >> 0
> +
> +/**
> + * This is a convenience macro to ease with getting specific request
> + * binary notes.
> + */
> +#define AP_REQUEST_GET_BNOTE(r, mask) \
> +    ((mask) & ((r)->bnotes))
> +
> +/**
> + * This is a convenience macro to ease with setting specific request
> + * binary notes.
> + */
> +#define AP_REQUEST_SET_BNOTE(r, mask, val) \
> +    (r)->bnotes = (((r)->bnotes & ~(mask)) | (val))
> +
> +/**
> + * Returns true if the strong etag flag is set for this request.
> + */
> +#define AP_REQUEST_IS_STRONG_ETAG(r) \
> +        AP_REQUEST_GET_BNOTE((r), AP_REQUEST_STRONG_ETAG)
> +/** @} */
> +
>
>  /**
>   * @defgroup module_magic Module Magic mime types
> @@ -1097,6 +1138,11 @@ struct request_rec {
>       *  TODO: compact elsewhere
>       */
>      unsigned int flushed:1;
> +    /** Request flags associated with this request. Use
> +     * AP_REQUEST_GET_FLAGS() and AP_REQUEST_SET_FLAGS() to access
> +     * the elements of this field.
> +     */
> +    ap_request_bnotes_t bnotes;
>  };

Can't we use a single bitfield (like "flushed" above) for the single
AP_REQUEST_STRONG_ETAG flag?


Regards;
Yann.
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Graham Leggett
On 29 Jun 2020, at 13:08, Yann Ylavic <[hidden email]> wrote:

>> /**
>>  * @defgroup module_magic Module Magic mime types
>> @@ -1097,6 +1138,11 @@ struct request_rec {
>>      *  TODO: compact elsewhere
>>      */
>>     unsigned int flushed:1;
>> +    /** Request flags associated with this request. Use
>> +     * AP_REQUEST_GET_FLAGS() and AP_REQUEST_SET_FLAGS() to access
>> +     * the elements of this field.
>> +     */
>> +    ap_request_bnotes_t bnotes;
>> };
>
> Can't we use a single bitfield (like "flushed" above) for the single
> AP_REQUEST_STRONG_ETAG flag?

Yes we can and should (but in separate commits).

I have my eye on the r->proxyreq flag, we can pack this into the binary notes too, values don’t need to be one bit wide.

Regards,
Graham


Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Yann Ylavic
On Mon, Jun 29, 2020 at 1:59 PM Graham Leggett <[hidden email]> wrote:

>
> On 29 Jun 2020, at 13:08, Yann Ylavic <[hidden email]> wrote:
>
> >> /**
> >>  * @defgroup module_magic Module Magic mime types
> >> @@ -1097,6 +1138,11 @@ struct request_rec {
> >>      *  TODO: compact elsewhere
> >>      */
> >>     unsigned int flushed:1;
> >> +    /** Request flags associated with this request. Use
> >> +     * AP_REQUEST_GET_FLAGS() and AP_REQUEST_SET_FLAGS() to access
> >> +     * the elements of this field.
> >> +     */
> >> +    ap_request_bnotes_t bnotes;
> >> };
> >
> > Can't we use a single bitfield (like "flushed" above) for the single
> > AP_REQUEST_STRONG_ETAG flag?
>
> Yes we can and should (but in separate commits).
>
> I have my eye on the r->proxyreq flag, we can pack this into the binary notes too, values don’t need to be one bit wide.

Actually I was thinking the other way around, have the new "unsigned
int strong_etag:1" bitfield rather than changing the existing ones...
Why adding complexity with bit(s) macros while bitfields exist?


Regards;
Yann.
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Graham Leggett
In reply to this post by Ruediger Pluem
On 29 Jun 2020, at 11:41, Ruediger Pluem <[hidden email]> wrote:

+        etag = apr_palloc(r->pool, weak_len + sizeof("\"\"") +
+                4*(APR_SHA1_DIGESTSIZE/3) + vlv_len + 4);

Using apr_base64_encode_len in the formula above would make it easier to understand and read IMHO.

It would also mean it could no longer be optimised to a constant. What side do we want to fall on?

+
+        apr_sha1_init(&context);
+        nbytes = sizeof(buf);
+        while (apr_file_read(fd, buf, &nbytes) == APR_SUCCESS) {

I assume that the code has been taken from ap_md5digest, but
if we have MMAP available and if we have not disabled MMAP we use MMAP to read the contents of the file
if sendfile is not possible (e.g. SSL connections).
Wouldn't using MMAP more performant here especially in case of larger files?

It makes sense, but I would want to change something like this separately.

+            apr_sha1_update_binary(&context, buf, nbytes);
+            nbytes = sizeof(buf);
+        }
+        apr_file_seek(fd, APR_SET, &offset);

Why do we always reset the file pointer to 0? Why don't we get the actual file pointer of fd before we do the reading
and restore it afterwards?

I am guessing that the why is lost in the mists of time. As a guess, it avoids an additional call.

Using the actual file pointer is cleaner, as there is a guarantee that the function does not have any side effects.


Regards,
Graham

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Graham Leggett
In reply to this post by Yann Ylavic
On 29 Jun 2020, at 14:49, Yann Ylavic <[hidden email]> wrote:

Yes we can and should (but in separate commits).

I have my eye on the r->proxyreq flag, we can pack this into the binary notes too, values don’t need to be one bit wide.

Actually I was thinking the other way around, have the new "unsigned
int strong_etag:1" bitfield rather than changing the existing ones...
Why adding complexity with bit(s) macros while bitfields exist?

The problem with bitfields in the public APIs is that they’re not binary compatible across compilers. While it is very rare that a module will be built with a different compiler than httpd was, it’s still theoretically possible, and we should probably avoid it. Bitfields aren’t a problem for in-module or in-core code.

Regards,
Graham

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Ruediger Pluem
In reply to this post by Graham Leggett


On 6/29/20 4:08 PM, Graham Leggett wrote:
> On 29 Jun 2020, at 11:41, Ruediger Pluem <[hidden email] <mailto:[hidden email]>> wrote:
>
>>> +        etag = apr_palloc(r->pool, weak_len + sizeof("\"\"") +
>>> +                4*(APR_SHA1_DIGESTSIZE/3) + vlv_len + 4);
>>
>> Using apr_base64_encode_len in the formula above would make it easier to understand and read IMHO.
>
> It would also mean it could no longer be optimised to a constant. What side do we want to fall on?

Fair enough. The above is faster. Nevertheless I would prefer to group that better. Probably a separate define
SHA1_DIGEST_BASE64_LEN that only defines the base64 length of the base64 encode SHA1 digest. This would mean
more calc work for the compiler and not during runtime, but I guess this is acceptable.

>
>>> +
>>> +        apr_sha1_init(&context);
>>> +        nbytes = sizeof(buf);
>>> +        while (apr_file_read(fd, buf, &nbytes) == APR_SUCCESS) {
>>
>> I assume that the code has been taken from ap_md5digest, but
>> if we have MMAP available and if we have not disabled MMAP we use MMAP to read the contents of the file
>> if sendfile is not possible (e.g. SSL connections).
>> Wouldn't using MMAP more performant here especially in case of larger files?
>
> It makes sense, but I would want to change something like this separately.

Makes sense.
Do you see a possibility to merge this code and the one of ap_md5digest to a more generic procedure that
allows to choose the digest algorithm while using 'MMAPED' reads?
BTW: Is sha1 mandatory for strong etags? If not wouldn't MD5 be enough and if MD5 is seen as too insecure
why isn't sha1?

>>> +            apr_sha1_update_binary(&context, buf, nbytes);
>>> +            nbytes = sizeof(buf);
>>> +        }
>>> +        apr_file_seek(fd, APR_SET, &offset);
>>
>> Why do we always reset the file pointer to 0? Why don't we get the actual file pointer of fd before we do the reading
>> and restore it afterwards?
>
> I am guessing that the why is lost in the mists of time. As a guess, it avoids an additional call.
>
> Using the actual file pointer is cleaner, as there is a guarantee that the function does not have any side effects.
>
> http://svn.apache.org/viewvc?rev=1879332&view=rev

Thanks. Another option that just came to mind would it be also fine in case that we get an fd passed to do
an apr_file_dup() on that descriptor such that we also do not touch the actual buffering (if enabled) of the used fd?
Furthermore don't we need to do a seek to 0 before we start reading? Who tells us that the filepointer is at 0 for a passed fd?


Regards

Rüdiger
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Jim Jagielski
In reply to this post by Graham Leggett


On Jun 29, 2020, at 10:27 AM, Graham Leggett <[hidden email]> wrote:

On 29 Jun 2020, at 14:49, Yann Ylavic <[hidden email]> wrote:

Yes we can and should (but in separate commits).

I have my eye on the r->proxyreq flag, we can pack this into the binary notes too, values don’t need to be one bit wide.

Actually I was thinking the other way around, have the new "unsigned
int strong_etag:1" bitfield rather than changing the existing ones...
Why adding complexity with bit(s) macros while bitfields exist?

The problem with bitfields in the public APIs is that they’re not binary compatible across compilers. While it is very rare that a module will be built with a different compiler than httpd was, it’s still theoretically possible, and we should probably avoid it. Bitfields aren’t a problem for in-module or in-core code.

++1

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Graham Leggett
In reply to this post by Ruediger Pluem
On 29 Jun 2020, at 16:37, Ruediger Pluem <[hidden email]> wrote:

> Makes sense.
> Do you see a possibility to merge this code and the one of ap_md5digest to a more generic procedure that
> allows to choose the digest algorithm while using 'MMAPED' reads?
> BTW: Is sha1 mandatory for strong etags? If not wouldn't MD5 be enough and if MD5 is seen as too insecure
> why isn't sha1?

I chose sha1 as it was a) widely available in APR and b) better than md5, but that was it.

I am wondering if for 2.4 if we use md5 instead, and then set Content-MD5 at the same time in the same code instead of calculating the md5 twice.

Then - as per the removal of Content-MD5 from https://tools.ietf.org/html/rfc7231#appendix-B - we separately switch it to sha1 and remove Content-MD5 in trunk.

Is that sane?

Regards,
Graham


Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Ruediger Pluem


On 7/3/20 10:54 AM, Graham Leggett wrote:

> On 29 Jun 2020, at 16:37, Ruediger Pluem <[hidden email]> wrote:
>
>> Makes sense.
>> Do you see a possibility to merge this code and the one of ap_md5digest to a more generic procedure that
>> allows to choose the digest algorithm while using 'MMAPED' reads?
>> BTW: Is sha1 mandatory for strong etags? If not wouldn't MD5 be enough and if MD5 is seen as too insecure
>> why isn't sha1?
>
> I chose sha1 as it was a) widely available in APR and b) better than md5, but that was it.
>
> I am wondering if for 2.4 if we use md5 instead, and then set Content-MD5 at the same time in the same code instead of calculating the md5 twice.
>
> Then - as per the removal of Content-MD5 from https://tools.ietf.org/html/rfc7231#appendix-B - we separately switch it to sha1 and remove Content-MD5 in trunk.

Thanks for the pointer. Is Content-MD5 really used? And given that it has been removed in the RFC my approach would be as follows:

1. Continue with your new additions as is. Do not try to merge any of this code with Content-MD5 related content.
2. Backport them.
3. Leave Content-MD5 untouched in 2.4.x.
4. Remove Content-MD5 in trunk.

Regards

Rüdiger

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1879285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/core.xml include/ap_mmn.h include/http_core.h include/http_protocol.h include/httpd.h modules/dav/fs/repos.c modules/test/mod_dialup.c server/core.c server/util_etag.c

Graham Leggett
On 03 Jul 2020, at 11:19, Ruediger Pluem <[hidden email]> wrote:

Thanks for the pointer. Is Content-MD5 really used? And given that it has been removed in the RFC my approach would be as follows:

1. Continue with your new additions as is. Do not try to merge any of this code with Content-MD5 related content.
2. Backport them.
3. Leave Content-MD5 untouched in 2.4.x.
4. Remove Content-MD5 in trunk.

Thought about it some more and was going to suggest the above instead, +1.

Regards,
Graham