RFC: don't cache objects larger than X

Mon Oct 17 19:56:17 CEST 2016

On Wed, 12 Oct 2016 at 13:22:34, Jason A. Donenfeld wrote:
> I face this same problem, in fact. Unless somebody beats me to it, I'd
> be interested in giving this a stab.
> 
> One issue is that cache entries are currently "streamed" into the
> cache files, as they're produced. It's not trivially possible to know
> how big it's going to be beforehand. This means that the best we could
> do would be to just immediately unlink it after creation and printing.
> Would this be acceptable?

It is not easy to compute the exact size of the generated page, but we
can detect huge objects before streaming -- the size of the
object is already returned by read_sha1_file().

I wonder whether the max-blob-size setting already does what you want,
though? It affects more than just the cached version, but it seems better
to avoid generating such huge pages in the first place. If you
really want to offer such files to your users, the max_blob_size check
in print_object() might be a good place to add the "print but do not
cache large files" functionality.
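
To illustrate the idea, here is a rough, untested sketch of such a size
check. All names in it (classify_blob, enum blob_action) are made up for
this mail and are not existing cgit code; it also assumes the limit is
given in KB and that 0 means "no limit". The point is only that the
decision can be made from the object size alone, before anything is
streamed:

```c
#include <stddef.h>

/* Hypothetical result of the size check; not an existing cgit type. */
enum blob_action {
	BLOB_PRINT_AND_CACHE,	/* within the limit: behave as usual */
	BLOB_PRINT_NO_CACHE,	/* over the limit: print, but skip the cache */
};

/*
 * Decide how to handle an object whose size is known up front
 * (e.g. the size reported when the object is read).
 * Assumptions: max_kb is a limit in KB, and 0 disables the limit.
 */
static enum blob_action classify_blob(unsigned long size_bytes,
				      unsigned long max_kb)
{
	if (max_kb && size_bytes / 1024 > max_kb)
		return BLOB_PRINT_NO_CACHE;
	return BLOB_PRINT_AND_CACHE;
}
```

The caller would still have to tell the cache layer not to keep the
entry (or to unlink it afterwards, as you suggested); how best to do
that is left open here.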

Regards,
Lukas