cache-size implementation downsides
John Keeping
john at keeping.me.uk
Wed Jun 20 09:44:10 CEST 2018
On Wed, Jun 20, 2018 at 08:01:11AM +0200, Christian Hesse wrote:
> John Keeping <john at keeping.me.uk> on Sat, 2018/06/16 16:46:
> > -- >8 --
> > Subject: [PATCH] cache: close race window when unlocking slots
>
> You should add a "From:" line for easy git-am. ;)
"git am --scissors" will pick it up from the message headers, won't it?
> > We use POSIX advisory record locks to control access to cache slots, but
> > these have an unhelpful behaviour in that they are released when any
> > file descriptor referencing the file is closed by this process.
> >
> > Mostly this is okay, since we know we won't be opening the lock file
> > anywhere else, but there is one place that it does matter: when we
> > restore stdout we dup2() over a file descriptor referring to the file,
> > thus closing that descriptor.
> >
> > Since we restore stdout before unlocking the slot, this creates a window
> > during which the slot content can be overwritten. The fix is reasonably
> > straightforward: simply restore stdout after unlocking the slot, but the
> > diff is a bit bigger because this requires us to move the temporary
> > stdout FD into struct cache_slot.
> >
> > Signed-off-by: John Keeping <john at keeping.me.uk>
>
> Reviewed-by: Christian Hesse <mail at eworm.de>
Thanks for the review!
> > ---
> > cache.c | 37 ++++++++++++++-----------------------
> > 1 file changed, 14 insertions(+), 23 deletions(-)
> >
> > diff --git a/cache.c b/cache.c
> > index 0901e6e..2c70be7 100644
> > --- a/cache.c
> > +++ b/cache.c
> > @@ -29,6 +29,7 @@ struct cache_slot {
> > cache_fill_fn fn;
> > int cache_fd;
> > int lock_fd;
> > + int stdout_fd;
> > const char *cache_name;
> > const char *lock_name;
> > int match;
>
> Not relevant for this commit, but... Is there a reason that struct cache_slot
> lives in cache.c, not cache.h?
I guess because it's only needed in this file, so we can encapsulate it
here and avoid leaking the implementation details to other parts of the
code.
But this predates my involvement with CGit by a few years.
> > @@ -197,6 +198,13 @@ static int unlock_slot(struct cache_slot *slot, int
> > replace_old_slot) else
> > err = unlink(slot->lock_name);
> >
> > + /* Restore stdout and close the temporary FD. */
> > + if (slot->stdout_fd >= 0) {
> > + dup2(slot->stdout_fd, STDOUT_FILENO);
> > + close(slot->stdout_fd);
> > + slot->stdout_fd = -1;
> > + }
> > +
> > if (err)
> > return errno;
> >
> > @@ -208,42 +216,24 @@ static int unlock_slot(struct cache_slot *slot, int
> > replace_old_slot) */
> > static int fill_slot(struct cache_slot *slot)
> > {
> > - int tmp;
> > -
> > /* Preserve stdout */
> > - tmp = dup(STDOUT_FILENO);
> > - if (tmp == -1)
> > + slot->stdout_fd = dup(STDOUT_FILENO);
> > + if (slot->stdout_fd == -1)
> > return errno;
> >
> > /* Redirect stdout to lockfile */
> > - if (dup2(slot->lock_fd, STDOUT_FILENO) == -1) {
> > - close(tmp);
> > + if (dup2(slot->lock_fd, STDOUT_FILENO) == -1)
> > return errno;
> > - }
> >
> > /* Generate cache content */
> > slot->fn();
> >
> > /* Make sure any buffered data is flushed to the file */
> > - if (fflush(stdout)) {
> > - close(tmp);
> > + if (fflush(stdout))
> > return errno;
> > - }
> >
> > /* update stat info */
> > - if (fstat(slot->lock_fd, &slot->cache_st)) {
> > - close(tmp);
> > - return errno;
> > - }
> > -
> > - /* Restore stdout */
> > - if (dup2(tmp, STDOUT_FILENO) == -1) {
> > - close(tmp);
> > - return errno;
> > - }
> > -
> > - /* Close the temporary filedescriptor */
> > - if (close(tmp))
> > + if (fstat(slot->lock_fd, &slot->cache_st))
> > return errno;
> >
> > return 0;
> > @@ -393,6 +383,7 @@ int cache_process(int size, const char *path, const
> > char *key, int ttl, strbuf_addstr(&lockname, ".lock");
> > slot.fn = fn;
> > slot.ttl = ttl;
> > + slot.stdout_fd = -1;
> > slot.cache_name = filename.buf;
> > slot.lock_name = lockname.buf;
> > slot.key = key;
More information about the CGit
mailing list