[PATCH] Handle If-None-Match HTTP header in plain view

Damián Nohales damiannohales at gmail.com
Tue Aug 12 23:53:01 CEST 2014


2014-08-12 16:15 GMT-03:00 John Keeping <john at keeping.me.uk>:
> On Tue, Aug 12, 2014 at 12:38:35PM -0300, Damián Nohales wrote:
>> Ah... I got it.
>>
>> We could implement cache disabling per command to disable it for the
>> plain command. I don't know if there is another way since Etag is
>> calculated at command execution.
>
> Another way to do it would be to only handle "If-None-Match" if the
> content is already cached and then parse the Etag header from the cached
> content.

Uhm... I disagree, because this is related to what the client has in
its cache, not server cache. If client requests an Etag and we
calculate the same Etag in the server for the latest content in the
URL, we can assume that the client cache is up-to-date for this URL
independently on what is in the server cache slot for that URL.

>
> I think it comes back to what Lars wrote in the commit that added the
> current level of Etag support (commit 488a214):
>
>         Todo: add support for HEAD requests...
>

I don't know what Lars means with that. In the current version
(without this patch), CGit sends Etag's for GET and HEAD request. The
problem is that the support for Etag's is incomplete if we don't
handle If-None-Match or If-Match.

> If we have sufficient infrastructure to handle HEAD requests then it
> should be trivial to add proper Etag handling on top, but I don't think
> it's trivial to add that.

With this patch, the Etag verification works on both GET and HEAD
requests, CGit only disable cache and force the exit after
cgit_print_http_headers when the request is HEAD, that doesn't
interfere with the Etag calculation.

What interfere with this is the server's cache. To do it right with
interoperability with server cache, we need to save to the cache the
"200 OK" response independently on Etag matches so we can retrieve
Etag by parsing the cache instead of calculating it in future request.
Then we need an extra step after getting the content from the cache
that decides between actually sends the content or sends a "304 Not
Modified".

I think that is overcomplicated and prevent us to send real time
updates for plain objects (anyways, I don't know if that is a goal)

>
>> 2014-08-12 6:00 GMT-03:00 John Keeping <john at keeping.me.uk>:
>> > On Mon, Aug 11, 2014 at 07:45:53PM -0300, Damián Nohales wrote:
>> >> (Sorry, I forgot to include the list address in my response.)
>> >>
>> >> This does not interact with CGit's cache at all, this interacts with
>> >> client side cache.
>> >>
>> >> If the client sends an Etag it means "Hey, I have the content cached
>> >> for this URL in the client and its Etag is abc123", so, for a given
>> >> URL, if server calculates the Etag and results to the same as the Etag
>> >> sent by client, it means "Ok, client has the same Etag than me, so
>> >> client's cached content is up-to-date".
>> >>
>> >> The Etag is calculated from the requested object hash. I'm not a Git
>> >> expert, but I know that the hash will be modified when the file is
>> >> modified by a commit.
>> >>
>> >> So, if we request a file, then, one year later, we request the same
>> >> file and we still have it cached with its Etag, CGit will respond 304
>> >> Not Modified if the file was not modified by newer commits (CGit
>> >> doesn't check its internal cache, it checks the client's cache).
>> >>
>> >> Actually, we should avoid send "Expires" and "Last-Modified" when
>> >> "Etag" is available, so the caching information is consistent (and
>> >> also, using Etag we have a more precise and smart caching).
>> >
>> > Yes, but I think the code you've changed comes after CGit has duplicated
>> > stdout into the cache file.  So if a request comes in with an
>> > If-None-Match header, then CGit will cache the 304 response and send it
>> > in response to all future request for the same page (until the cached
>> > copy times out).
>> >
>> >> 2014-08-11 18:32 GMT-03:00 John Keeping <john at keeping.me.uk>:
>> >> > On Mon, Aug 11, 2014 at 05:53:23PM -0300, Damián Nohales wrote:
>> >> >> We are sending Etag to clients but this header is basically unusefulness
>> >> >> if the server doesn't tell the client if the content has been changed or
>> >> >> not for a given Path/Etag pair.
>> >> >>
>> >> >> Signed-off-by: Damián Nohales <damiannohales at gmail.com>
>> >> >> ---
>> >> >
>> >> > How does this interact with CGit's cache?  What happens if the original
>> >> > page expires from the cache and then a request comes in with a matching
>> >> > Etag?
>> >> >
>> >> >>  cgit.c      |  1 +
>> >> >>  cgit.h      |  1 +
>> >> >>  ui-plain.c  | 41 +++++++++++++++++++++--------------------
>> >> >>  ui-shared.c | 20 ++++++++++++++++++++
>> >> >>  ui-shared.h |  1 +
>> >> >>  5 files changed, 44 insertions(+), 20 deletions(-)
>> >> >>
>> >> >> diff --git a/cgit.c b/cgit.c
>> >> >> index 8c4517d..7af7712 100644
>> >> >> --- a/cgit.c
>> >> >> +++ b/cgit.c
>> >> >> @@ -385,6 +385,7 @@ static void prepare_context(void)
>> >> >>       ctx.env.server_port = getenv("SERVER_PORT");
>> >> >>       ctx.env.http_cookie = getenv("HTTP_COOKIE");
>> >> >>       ctx.env.http_referer = getenv("HTTP_REFERER");
>> >> >> +     ctx.env.if_none_match = getenv("HTTP_IF_NONE_MATCH");
>> >> >>       ctx.env.content_length = getenv("CONTENT_LENGTH") ? strtoul(getenv("CONTENT_LENGTH"), NULL, 10) : 0;
>> >> >>       ctx.env.authenticated = 0;
>> >> >>       ctx.page.mimetype = "text/html";
>> >> >> diff --git a/cgit.h b/cgit.h
>> >> >> index 0badc64..eddd4c7 100644
>> >> >> --- a/cgit.h
>> >> >> +++ b/cgit.h
>> >> >> @@ -282,6 +282,7 @@ struct cgit_environment {
>> >> >>       const char *server_port;
>> >> >>       const char *http_cookie;
>> >> >>       const char *http_referer;
>> >> >> +     const char *if_none_match;
>> >> >>       unsigned int content_length;
>> >> >>       int authenticated;
>> >> >>  };
>> >> >> diff --git a/ui-plain.c b/ui-plain.c
>> >> >> index 30fff89..a08dc5b 100644
>> >> >> --- a/ui-plain.c
>> >> >> +++ b/ui-plain.c
>> >> >> @@ -103,8 +103,8 @@ static int print_object(const unsigned char *sha1, const char *path)
>> >> >>       ctx.page.filename = path;
>> >> >>       ctx.page.size = size;
>> >> >>       ctx.page.etag = sha1_to_hex(sha1);
>> >> >> -     cgit_print_http_headers();
>> >> >> -     html_raw(buf, size);
>> >> >> +     if (!cgit_print_http_headers_matching_etag())
>> >> >> +             html_raw(buf, size);
>> >> >>       /* If we allocated this, then casting away const is safe. */
>> >> >>       if (freemime)
>> >> >>               free((char*) ctx.page.mimetype);
>> >> >> @@ -128,24 +128,25 @@ static void print_dir(const unsigned char *sha1, const char *base,
>> >> >>       fullpath = buildpath(base, baselen, path);
>> >> >>       slash = (fullpath[0] == '/' ? "" : "/");
>> >> >>       ctx.page.etag = sha1_to_hex(sha1);
>> >> >> -     cgit_print_http_headers();
>> >> >> -     htmlf("<html><head><title>%s", slash);
>> >> >> -     html_txt(fullpath);
>> >> >> -     htmlf("</title></head>\n<body>\n<h2>%s", slash);
>> >> >> -     html_txt(fullpath);
>> >> >> -     html("</h2>\n<ul>\n");
>> >> >> -     len = strlen(fullpath);
>> >> >> -     if (len > 1) {
>> >> >> -             fullpath[len - 1] = 0;
>> >> >> -             slash = strrchr(fullpath, '/');
>> >> >> -             if (slash)
>> >> >> -                     *(slash + 1) = 0;
>> >> >> -             else
>> >> >> -                     fullpath = NULL;
>> >> >> -             html("<li>");
>> >> >> -             cgit_plain_link("../", NULL, NULL, ctx.qry.head, ctx.qry.sha1,
>> >> >> -                             fullpath);
>> >> >> -             html("</li>\n");
>> >> >> +     if (!cgit_print_http_headers_matching_etag()) {
>> >> >> +             htmlf("<html><head><title>%s", slash);
>> >> >> +             html_txt(fullpath);
>> >> >> +             htmlf("</title></head>\n<body>\n<h2>%s", slash);
>> >> >> +             html_txt(fullpath);
>> >> >> +             html("</h2>\n<ul>\n");
>> >> >> +             len = strlen(fullpath);
>> >> >> +             if (len > 1) {
>> >> >> +                     fullpath[len - 1] = 0;
>> >> >> +                     slash = strrchr(fullpath, '/');
>> >> >> +                     if (slash)
>> >> >> +                             *(slash + 1) = 0;
>> >> >> +                     else
>> >> >> +                             fullpath = NULL;
>> >> >> +                     html("<li>");
>> >> >> +                     cgit_plain_link("../", NULL, NULL, ctx.qry.head, ctx.qry.sha1,
>> >> >> +                                     fullpath);
>> >> >> +                     html("</li>\n");
>> >> >> +             }
>> >> >>       }
>> >> >>       free(fullpath);
>> >> >>  }
>> >> >> diff --git a/ui-shared.c b/ui-shared.c
>> >> >> index 9dde0a3..84c7efd 100644
>> >> >> --- a/ui-shared.c
>> >> >> +++ b/ui-shared.c
>> >> >> @@ -661,6 +661,26 @@ void cgit_print_http_headers(void)
>> >> >>               exit(0);
>> >> >>  }
>> >> >>
>> >> >> +int cgit_print_http_headers_matching_etag(void)
>> >> >> +{
>> >> >> +     int match = 0;
>> >> >> +     char *etag;
>> >> >> +     if (ctx.page.etag && ctx.env.if_none_match) {
>> >> >> +             etag = fmtalloc("\"%s\"", ctx.page.etag);
>> >> >> +             if (!strcmp(etag, ctx.env.if_none_match)) {
>> >> >> +                     ctx.page.status = 304;
>> >> >> +                     ctx.page.statusmsg = "Not Modified";
>> >> >> +                     ctx.page.mimetype = NULL;
>> >> >> +                     ctx.page.size = 0;
>> >> >> +                     ctx.page.filename = NULL;
>> >> >> +                     match = 1;
>> >> >> +             }
>> >> >> +             free(etag);
>> >> >> +     }
>> >> >> +     cgit_print_http_headers();
>> >> >> +     return match;
>> >> >> +}
>> >> >> +
>> >> >>  void cgit_print_docstart(void)
>> >> >>  {
>> >> >>       if (ctx.cfg.embedded) {
>> >> >> diff --git a/ui-shared.h b/ui-shared.h
>> >> >> index 3e7a91b..e279f42 100644
>> >> >> --- a/ui-shared.h
>> >> >> +++ b/ui-shared.h
>> >> >> @@ -60,6 +60,7 @@ extern void cgit_vprint_error(const char *fmt, va_list ap);
>> >> >>  extern void cgit_print_date(time_t secs, const char *format, int local_time);
>> >> >>  extern void cgit_print_age(time_t t, time_t max_relative, const char *format);
>> >> >>  extern void cgit_print_http_headers(void);
>> >> >> +extern int cgit_print_http_headers_matching_etag(void);
>> >> >>  extern void cgit_print_docstart(void);
>> >> >>  extern void cgit_print_docend();
>> >> >>  extern void cgit_print_pageheader(void);
>> >> >> --
>> >> >> 2.0.4
>> _______________________________________________
>> CGit mailing list
>> CGit at lists.zx2c4.com
>> http://lists.zx2c4.com/mailman/listinfo/cgit


More information about the CGit mailing list