Problem with cgit cache
john at keeping.me.uk
Mon Jan 19 20:50:12 CET 2015
On Mon, Jan 19, 2015 at 02:17:00PM -0500, Eclipse Webmaster (Denis Roy) wrote:
> We use cgit for about 800 Git repos. Lately we've noticed that the links
> in the cache become polluted. We've noticed hits like this in the logs,
> which come from Search Bots, which seem to match the garbage in the
> cache links:
> GET /c/set%7Cset%26set/org....
> GET /c/%0aset%7cset%26set%0a/org....
> (we serve cgit from /c/)
> If I clear the cache entries, all is well until these bots come along
> and pollute it again. If I set cache-size=0 everything works well,
> albeit much slower.
> Is this a known bug in cgit? For now I've added some Apache
> RewriteRules so that these hits don't reach cgit, but it would be nice
> if cgit could deal with these.
> You can read more on our bug tracker, here:
Although you seem to have ruled it out, I think storing the cache on NFS
is likely to be problematic.
A quick search found some documentation on problems with
sendfile(2) and NFS. You could try editing cgit.mk to comment out the
HAVE_LINUX_SENDFILE define, but I would recommend avoiding NFS for the
cache if possible.
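
As a rough illustration of what turning off sendfile changes, here is a minimal sketch in Python rather than cgit's C. It assumes that, without the define, the cached file is copied to the output with an ordinary read/write loop; copy_fd and its parameters are hypothetical names made up for this example, not anything from cgit.

```python
import os


def copy_fd(in_fd, out_fd, use_sendfile=True, chunk=65536):
    """Copy the whole file behind in_fd to out_fd; return bytes copied.

    Hypothetical helper: use_sendfile=False mimics a build without
    sendfile support by using a plain read/write loop instead.
    """
    size = os.fstat(in_fd).st_size
    offset = 0
    if use_sendfile and hasattr(os, "sendfile"):
        try:
            while offset < size:
                sent = os.sendfile(out_fd, in_fd, offset, size - offset)
                if sent == 0:
                    break
                offset += sent
            return offset
        except OSError:
            # sendfile is not usable for this pair of descriptors;
            # carry on with the plain loop from where we stopped.
            pass
    os.lseek(in_fd, offset, os.SEEK_SET)
    while True:
        buf = os.read(in_fd, chunk)
        if not buf:
            break
        view = memoryview(buf)
        while view:
            written = os.write(out_fd, view)  # may be a short write
            view = view[written:]
        offset += len(buf)
    return offset
```

Both paths should produce the same bytes on a local filesystem. The point of trying the plain loop is only to take sendfile out of the picture when the cache lives on NFS.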
I have tried a quick test and wasn't able to reproduce your error, but I
will try to find some time to investigate further and see if there is a
problem with certain requests.
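
For reference, the request paths quoted above can be percent-decoded to see what the bots are actually sending. This is a minimal sketch using Python's standard library. The sample paths stop where the original log lines are elided, and looks_suspicious, with its choice of characters, is only an assumption about what one might filter on, not something cgit does.

```python
from urllib.parse import unquote

# Prefixes of the two requests from the logs; the rest of each path is
# elided in the report and deliberately left out here.
SAMPLES = [
    "/c/set%7Cset%26set/",
    "/c/%0aset%7cset%26set%0a/",
]


def decode_path(path):
    """Return the percent-decoded form of a request path."""
    return unquote(path)


def looks_suspicious(path, bad_chars="|&\n\r"):
    """Hypothetical check: flag paths that decode to shell-like characters."""
    decoded = unquote(path)
    return any(ch in decoded for ch in bad_chars)


if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(decode_path(sample)), looks_suspicious(sample))
```

Both samples decode to "set|set&set" inside the path, the second with a newline on either side, so a front-end rule matching %7C, %26 or %0a in the path would catch them before they reach cgit.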