Troubles using special characters in links with lighttpd

John Keeping john at keeping.me.uk
Fri May 15 14:39:11 CEST 2015


On Fri, May 15, 2015 at 01:24:29PM +0100, John Keeping wrote:
> On Fri, May 15, 2015 at 02:08:23PM +0200, David Demelier wrote:
> > I'm trying to set up cgit with lighttpd. This is my current
> > configuration:
> > 
> > $HTTP["host"] == "git.malikania.fr" {
> >         alias.url = (
> >                 "/static/" => "/usr/local/www/cgit/",
> >                 "/cgit.cgi" => "/usr/local/www/cgit/cgit.cgi",
> >         )
> >         url.rewrite-once = (
> >                 "^/static/.*$" => "$0",
> >                 "^/([^?/]+/[^?]*)?(?:\?(.*))?$" => "/cgit.cgi?url=$1&$2",
> >         )
> >         server.document-root = "/usr/local/www/cgit"
> >         cgi.assign = ( ".cgi" => "/usr/local/www/cgit/cgit.cgi" )
> > }
> > 
> > 
> > I have found several documents about this; the configuration above is
> > mainly based on [1].
> > 
> > Almost everything works, except directories whose names contain special
> > characters that should be escaped. For example, I have a directory named
> > "C++", and it does not work with the rewrite rule:
> > 
> > http://git.malikania.fr/code/tree/
> > 
> > If you click on C++, you get an empty directory; however, the link with
> > the special characters percent-encoded works fine:
> > 
> > http://git.malikania.fr/code/tree/C%2b%2b
> > 
> > I have no idea how to fix that. Is it a lighttpd problem or a missing
> > cgit setting?
> 
> I think CGit should be encoding the '+' in the path here, but according
> to [1] that isn't required in the path element of a URL, so something
> else must be wrong.

I looked a bit closer and noticed that you have "C" and "C++"
directories; the page for "C++" looks like it shows the path "root/C" at
the top, but the content is different from the proper "C" page.  "View
Source" shows what's happening:

	<a href="/code/tree/C%20%20">C  </a>

So I think what's happening is that you rewrite the URL from:

	/code/tree/C++

to:

	/cgit.cgi?url=code/tree/C++

but now the path is in the query part, not the path part, and the escaping
rules are different, so "+" is decoded as " ".
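
As a rough illustration of the difference, here is a Python sketch that uses the standard library's urllib.parse as a stand-in. CGit's own query parsing is C code and is not shown here, and the helper names are made up for this example. The same string decodes differently depending on which part of the URL it sits in:

```python
from urllib.parse import parse_qs, unquote


def decode_path(path):
    # Path decoding only undoes %XX escapes; "+" stays a plus sign.
    return unquote(path)


def decode_query_value(query, key):
    # Form-style query decoding also turns "+" into a space.
    return parse_qs(query)[key][0]


if __name__ == "__main__":
    print(repr(decode_path("/code/tree/C++")))
    print(repr(decode_query_value("url=code/tree/C++", "url")))
    print(repr(decode_query_value("url=code/tree/C%2b%2b", "url")))
```

The second call yields "code/tree/C" followed by two spaces, which matches the "C%20%20" seen in the generated link, while the percent-encoded form survives the trip through the query string.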

I'm not sure if lighttpd will allow you to call a CGI script with a
path.  In Apache you can do this:

	RewriteRule ^/var/www/cgit(.*) /cgi-bin/cgit.cgi$1 [L,PT]

The patch below is still a good idea, but both CGit and lighttpd are
behaving correctly. The problem is the rewrite rule, which moves
something from the path part of a URL to the query part without taking
account of the different escaping rules.
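
To see why escaping "+" in generated links sidesteps the problem, here is a small Python sketch. It is a stand-in, not CGit's html_url_path(); the function names are hypothetical and the set of escaped characters does not mirror CGit's escape table exactly. It also assumes the rewrite copies the path into the query without decoding it, which is what the behaviour reported above suggests:

```python
from urllib.parse import parse_qs, quote


def escape_link_path(path):
    # Percent-encode everything except unreserved characters and "/",
    # so "+" becomes "%2B" in the generated link.
    return quote(path, safe="/")


def rewrite_and_decode(link_path):
    # Mimic the rewrite rule: the path is pasted verbatim into the query,
    # then decoded with query-string rules.
    query = "url=" + link_path.lstrip("/")
    return parse_qs(query)["url"][0]


if __name__ == "__main__":
    raw = "/code/tree/C++"
    print(repr(rewrite_and_decode(raw)))
    print(repr(rewrite_and_decode(escape_link_path(raw))))
```

With the unescaped link the directory name comes out with spaces in place of the plus signs; with the escaped link it round-trips as "code/tree/C++".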

> The patch below should cause CGit to escape the '+', but I haven't had
> time to analyze the full implications, so I'm not sure if it will break
> anything else (a cursory look suggests it will be OK but I want to spend
> a bit more time examining all the callers before sending a proper
> patch).
> 
> [1] http://blog.lunatech.com/2009/02/03/what-every-web-developer-must-know-about-url-encoding
> 
> -- >8 --
> diff --git a/html.c b/html.c
> index 155cde5..c61db1c 100644
> --- a/html.c
> +++ b/html.c
> @@ -222,7 +222,7 @@ void html_url_path(const char *txt)
>  	while (t && *t) {
>  		unsigned char c = *t;
>  		const char *e = url_escape_table[c];
> -		if (e && c != '+' && c != '&') {
> +		if (e && c != '&') {
>  			html_raw(txt, t - txt);
>  			html(e);
>  			txt = t + 1;