Performance issue
John Keeping
john at keeping.me.uk
Fri Jul 8 15:23:43 CEST 2016
On Fri, Jul 08, 2016 at 02:45:49PM +0200, Miroslav Suchý wrote:
> I use cgit as WebUI for dist-git of Copr [1]. There are 136000 git
> repositories (and growing). My problem is that no matter how
> aggressive the caching configured in /etc/cgitrc is, it takes an
> enormous amount of time to generate the initial /var/cache/cgit/rc-*
> file that holds the "repo.*" configurations. And by enormous I mean
> 30 minutes.
>
> I came up with one solution: set the TTL to 2 hours and regenerate the
> cgitrc from cron every hour. This way the cgitrc will never be
> generated in response to a user's httpd request.
>
> I can generate that cgitrc in cron job manually by running:
>
> QUERY_STRING="url=frostyx/new7/rubygem-active_null.git/commit/&id=b3ceddf17119bc4c9b249fe1b63659039e282c99" \
> CGIT_CONFIG="/etc/cgitrc" /var/www/cgi-bin/cgit >/tmp/x.html
>
> The problem is that even with --nocache it does not refresh the
> existing /var/cache/cgit/rc-* file. The only way to refresh the cgitrc
> file is to wait until it becomes older than the TTL or to delete it.
> But until it is regenerated, users who access my server will take it
> down by filling all Apache slots with running cgit processes (each of
> which will traverse all git repositories).
>
> I am thinking about implementing a new option, e.g. --update-scan-path,
> which would force cgit to scan 'scan-path', write the include cgitrc
> to a temporary file, and at the end remove the original
> /var/cache/cgit/rc-* and rename the newly created cgitrc to that rc-*
> file. So it would be a nearly atomic operation.
Can't you already do this by removing scan-path from your config and
instead adding something like:
	include=/path/to/my/repo-list
and you can generate the repo-list file with:
	cgit --scan-path=/path/to/repositories >repo-list
It's not quite the same because the rest of the configuration won't have
been loaded, but I think we'd rather improve this mechanism than add
manual cache management.
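
A minimal sketch of how a cron job could regenerate that file without
cgit ever reading a half-written list. The helper name update_repo_list,
the temporary file name, and the 0644 mode are illustrative assumptions,
not part of cgit; only the "cgit --scan-path=..." invocation comes from
the suggestion above.

```shell
#!/bin/sh
# Hypothetical helper: run a command, capture its stdout in a temporary
# file, and rename it over the target only if the command succeeded.
# Usage: update_repo_list OUTPUT_FILE COMMAND [ARGS...]
update_repo_list() {
    out=$1
    shift
    # Create the temporary file next to the target so that mv is a
    # rename within one filesystem rather than a copy.
    tmp=$(mktemp "$(dirname "$out")/.repo-list.XXXXXX") || return 1
    if "$@" >"$tmp"; then
        # mktemp creates the file with mode 0600; assume the web server
        # user needs read access to the include file.
        chmod 644 "$tmp"
        mv -f "$tmp" "$out"
    else
        # Keep the previous list if the scan failed.
        rm -f "$tmp"
        return 1
    fi
}

# Example cron invocation, using the placeholder paths from above:
# update_repo_list /path/to/my/repo-list cgit --scan-path=/path/to/repositories
```

Since the new list replaces the old one with a single rename, a request
that arrives during the scan still reads the previous repo-list, and a
failed scan leaves it untouched.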