Performance issue

John Keeping john at keeping.me.uk
Fri Jul 8 15:23:43 CEST 2016


On Fri, Jul 08, 2016 at 02:45:49PM +0200, Miroslav Suchý wrote:
> I use cgit as the web UI for the dist-git of Copr [1].  There are 136000
> git repositories (and growing).  My problem is that no matter how
> aggressive the caching configured in /etc/cgitrc is, it takes an enormous
> amount of time to generate the initial /var/cache/cgit/rc-* file that
> holds the "repo.*" configurations. And by enormous I mean 30 minutes.
> 
> I came up with one solution: set the TTL to 2 hours and regenerate the
> cgitrc from cron every hour. This way the cgitrc will never be
> generated by a user's httpd request.
> 
> I can generate that cgitrc manually in a cron job by running:
> 
> QUERY_STRING="url=frostyx/new7/rubygem-active_null.git/commit/&id=b3ceddf17119bc4c9b249fe1b63659039e282c99" \
> CGIT_CONFIG="/etc/cgitrc" /var/www/cgi-bin/cgit >/tmp/x.html
> 
> The problem is that even with --nocache it does not refresh the existing
> /var/cache/cgit/rc-* file. The only way to refresh the cgitrc file is
> to wait until it becomes older than the TTL or to delete it. But until it
> is regenerated, users who access my server will take it down by
> filling all Apache slots with running cgit processes (each of which
> traverses all git repositories).
> 
> I am thinking about implementing a new option, e.g. --update-scan-path,
> which will force cgit to scan 'scan-path', write the included cgitrc
> file to a temporary file, and at the end remove the original
> /var/cache/cgit/rc-* and rename the newly created cgitrc to that rc-*
> file. So it will be a nearly atomic operation.

Can't you already do this by removing scan-path from your config and
instead adding something like:

	include /path/to/my/repo-list

and you can generate the repo-list file with:

	cgit --scan-path=/path/to/repositories >repo-list

It's not quite the same because the rest of the configuration won't have
been loaded, but I think we'd rather improve this mechanism than add
manual cache management.
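
A minimal sketch of how the repo-list approach could be run from cron so
that cgit never sees a half-written file. It writes the scan output to a
temporary file next to the target and renames it into place, which is
atomic when both are on the same filesystem. The function name
regenerate_repo_list is hypothetical and the paths in the usage comment
are the placeholders used above. The sketch assumes cgit exits with
status 0 after a successful --scan-path run, and it treats empty output
as a failure.

```shell
#!/bin/sh
# Sketch only: rebuild a cgit repo-list include file via write-then-rename.
# regenerate_repo_list is a hypothetical helper, not part of cgit.
#
# Usage: regenerate_repo_list CGIT_BINARY SCAN_PATH OUTPUT_FILE
regenerate_repo_list() {
    cgit_bin=$1
    scan_path=$2
    output=$3

    # Create the temp file beside the target so the final mv is a
    # same-filesystem rename rather than a copy.
    tmp=$(mktemp "${output}.XXXXXX") || return 1

    # Assumption: a successful scan exits 0 and prints at least one line.
    if "$cgit_bin" --scan-path="$scan_path" >"$tmp" && [ -s "$tmp" ]; then
        # mktemp creates the file with mode 0600; make it readable by
        # the web server user before publishing it.
        chmod 644 "$tmp"
        mv -f "$tmp" "$output"
    else
        # Keep the previous repo-list untouched on failure.
        rm -f "$tmp"
        return 1
    fi
}

# Example cron invocation (placeholder paths):
# regenerate_repo_list /var/www/cgi-bin/cgit /path/to/repositories /path/to/my/repo-list
```

If the scan fails or produces nothing, the previous repo-list stays in
place, so requests keep being served from the old list while the slow
scan is retried on the next cron run.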

