Supporting Namespaces in cgit

John Keeping john at keeping.me.uk
Tue May 10 15:21:36 CEST 2016


On Mon, May 09, 2016 at 10:54:44PM +0100, Daniel Silverstone wrote:
> On Mon, May 09, 2016 at 22:31:37 +0100, John Keeping wrote:
> > Implementation-wise, it looks like using a namespace should just be a
> > matter of setting GIT_NAMESPACE in the environment near the top of
> > cgit.c::prepare_repo_cmd().
> 
> This is certainly the basic starting point.
> 
> > Discovering namespaces is more interesting, since we can't know what
> > exactly is a namespace.  For example, if we have:
> > 
> > 	refs/namespaces/foo/bar/baz
> > 
> > is the namespace "foo" or "foo/bar"?  Maybe checking for "heads" and
> > "tags" subdirectories is enough, but I'm not familiar enough with
> > namespaces to know if those will definitely exist, and obviously users
> > can create or delete any directories anywhere in the hierarchy.
> 
> I'd not attempt to discover namespaces.  I think if you're given a namespace to
> use in the repo stanza you use it, otherwise current behaviour prevails.
> 
> > Also, any attempt to discover namespaces during automated repository
> > discovery (i.e. cgitrc's "scan-tree") is likely to be quite expensive
> > with reading packed-refs and the whole loose refs tree.  However, it
> > sounds like Gitano probably generates an explicit repository list, in
> > which case a "repo.namespace" config key should be usable.
> 
> Yes, that's the intended behaviour.  I wouldn't expect cgit to be able to
> invent namespace understanding out of nothing.
> 
> > If we can indeed ignore any attempt to discover namespaces and just use
> > "repo.namespace", is it enough to add that config value to
> > "struct cgit_repo" and then pass it to setenv() in prepare_repo_cmd()?
> 
> This is a necessary start, but it is not sufficient.  Elsewhere in the codebase
> changes will need to be made to use namespace aware ref iteration among other
> things.  In addition, if we wish to support agefile per-namespace then we need
> a repo.agefile option which can override the global option.  There may be more
> but right now I don't have them to mind because I've not fully scoured the
> codebase.

Ah, right.  I thought git.git's infrastructure might take care of
namespaces automatically, but only git-upload-pack and git-receive-pack
actually make use of namespaces, so we'll have to do it ourselves.
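
For reference, the starting point discussed above (a "repo.namespace"
value stored on the repo and exported with setenv() before any ref
access) might look roughly like the sketch below.  The struct is a
hypothetical cut-down stand-in for "struct cgit_repo", and both the
"git_namespace" field and the apply_repo_namespace() helper are
assumptions for illustration, not existing cgit code.

```c
#define _POSIX_C_SOURCE 200112L	/* for setenv()/unsetenv() */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for cgit's struct cgit_repo. */
struct repo_sketch {
	const char *path;
	/* Assumed new field filled from "repo.namespace"; NULL if unset. */
	const char *git_namespace;
};

/*
 * Export the configured namespace through GIT_NAMESPACE, or clear the
 * variable when the repo has none so a value left over from a previous
 * repo cannot leak through.  Returns 0 on success, -1 on failure.
 */
static int apply_repo_namespace(const struct repo_sketch *repo)
{
	assert(repo != NULL);
	if (!repo->git_namespace || !*repo->git_namespace)
		return unsetenv("GIT_NAMESPACE");
	return setenv("GIT_NAMESPACE", repo->git_namespace, 1);
}
```

Clearing the variable in the unset case is a design choice of the
sketch, so that repos without "repo.namespace" keep the current
behaviour.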

Apart from enumeration, which should be fairly mechanical with
strip_namespace(), we'll need to prefix user-provided values with the
namespace.  I think the three relevant parameters (in
cgit.c::querystring_cb()) are "h", "id" and "id2"; currently we allow
each of those to contain either a named ref or a raw SHA-1, although we
generate only named refs for "h" and only SHA-1s for "id" and "id2".
And in fact ui-blob.c enforces that "id" contains a valid SHA-1.
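
To illustrate the enumeration half, here is a minimal sketch of the
filtering step.  strip_ns_prefix() is a local stand-in written for this
mail that takes the expanded prefix as an argument (I'm assuming
something of the form "refs/namespaces/foo/"); in cgit itself the
intent would be to use git's own strip_namespace() instead.

```c
#include <assert.h>
#include <string.h>

/*
 * Local stand-in for the prefix-stripping step: if "ref" starts with
 * "ns_prefix", return a pointer to the remainder of the name;
 * otherwise return NULL so the caller can skip refs that live outside
 * the namespace.
 */
static const char *strip_ns_prefix(const char *ref, const char *ns_prefix)
{
	size_t len;

	assert(ref != NULL && ns_prefix != NULL);
	len = strlen(ns_prefix);
	if (strncmp(ref, ns_prefix, len) != 0)
		return NULL;
	return ref + len;
}
```

A ref iteration callback would then display only the refs for which
this returns non-NULL, using the stripped name.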

So a simple implementation would just prefix "h" with
get_git_namespace() and call it done, but that risks information leakage
via "id", which is treated equivalently in most places.  (As
gitnamespaces(7) points out, anyone with write access to the repository
can already read whatever they want, and in fact CGit imposes no access
checks if you give it a SHA-1, but at least that's slightly more obscure
than a ref name.)

One approach to that would be to switch all the sites using "id" or
"id2" to get_sha1_hex() but I'm sure we have people generating URLs
using those parameters and relying on at least "id2" taking a ref rather
than a raw SHA-1.  I suspect it is simpler to replace calls to
get_sha1() with cgit_get_sha1() and apply the namespace prefix there if
the value is not a raw SHA-1.
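
A rough sketch of the name-handling half of such a wrapper follows.
cgit_get_sha1() does not exist yet, and namespaced_rev() and
looks_like_raw_sha1() are hypothetical helpers; the sketch stops short
of calling get_sha1() and takes the namespace prefix as a parameter
rather than from get_git_namespace().  It treats only a full
40-character hex string as a raw SHA-1, so abbreviated SHA-1s and
short names like "master" are left as open questions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if "value" is exactly 40 hexadecimal digits, else 0. */
static int looks_like_raw_sha1(const char *value)
{
	size_t i;

	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)value[i]))
			return 0;	/* also stops at an early NUL */
	return value[40] == '\0';
}

/*
 * Write the name to resolve into "out": a raw SHA-1 is copied as-is,
 * anything else gets "ns_prefix" prepended.  Returns 0 on success or
 * -1 if the result does not fit in "outlen" bytes.
 */
static int namespaced_rev(const char *ns_prefix, const char *value,
			  char *out, size_t outlen)
{
	int n;

	assert(ns_prefix != NULL && value != NULL && out != NULL);
	if (looks_like_raw_sha1(value))
		n = snprintf(out, outlen, "%s", value);
	else
		n = snprintf(out, outlen, "%s%s", ns_prefix, value);
	return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

The wrapper would then pass the resulting string to get_sha1() in
place of the user-provided value.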

> If you think it's worth our while implementing a proof-of-concept patch series
> then we'll give it a go.  I'm quite excited about being able to do this because
> it'll open up so many interesting options for me when Gitano can have ACLs which are
> namespace aware :-)

