On 16/11/2022 16:22, René Ferdinand Rivera Morell wrote:
At which point we concluded that approach was not viable, since we were using more bandwidth than all the other JFrog downloads combined (this may be an exaggeration). We then changed the various CI methods to selectively git clone Boost (there's a great tool that fetches just the projects you need), and we haven't had download problems since. It seems likely that a full clean git clone (even at --depth 1) consumes significantly more bandwidth than downloading a tarball would. But GitHub has no specific bandwidth limits other than "if we notice you, we might do something about it", so you may get away with things you shouldn't.
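To make that concrete, a selective shallow checkout has roughly the shape below. This is a hedged Python sketch, not what our CI literally runs; the branch and the submodule list are placeholders that a real job would compute from the library's dependency graph:

    #!/usr/bin/env python3
    # Sketch: a selective, shallow checkout of the Boost superproject.
    # The submodule list is purely illustrative; a real CI job would
    # derive it from the library under test and its dependencies.
    import subprocess

    def run(*cmd, cwd=None):
        # Print and execute a git command, failing fast on error.
        print("+", " ".join(cmd))
        subprocess.run(cmd, cwd=cwd, check=True)

    SUPERPROJECT = "https://github.com/boostorg/boost.git"
    NEEDED = ["tools/build", "libs/config", "libs/core"]  # hypothetical subset

    run("git", "clone", "--depth", "1", "--branch", "develop",
        SUPERPROJECT, "boost-root")
    for path in NEEDED:
        # Only the listed submodules are fetched, each at --depth 1.
        run("git", "submodule", "update", "--init", "--depth", "1", path,
            cwd="boost-root")

Because only the listed submodules are fetched, and each one shallowly, the transfer stays close to the size of the sources the job actually needs.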
Either method could be improved with caching -- the tarball could be cached for all individual library CI jobs from the same commit, and one set of git clones could be reused similarly. Git wins a bit more when only building a subset of libraries, or when the CI caches the clone and performs an incremental checkout rather than a full re-clone (although that only provides a benefit when *not* using --depth 1, which is otherwise the better choice). (These *could* be supported with tarballs too, but only with more difficulty and complex delta-file management, which isn't really worth the effort when an alternative exists.) Whichever method is used, it should try to "play nice" and reuse as much as it can to avoid abusing cloud resources. (And this also improves speed, so it is good as a selfish goal too.) Of course, this sort of caching and reuse is very hard to achieve with a generic cloud CI rather than a custom host.
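On a host that does keep state between runs, the reuse pattern looks roughly like the following. Again a hedged sketch: the cache path and environment variable are invented for illustration, and it deliberately skips --depth 1 so that later fetches can be incremental:

    #!/usr/bin/env python3
    # Sketch: reuse a persistent clone across CI runs instead of re-cloning.
    # Assumes the CI host keeps CACHE_DIR between jobs, which generic cloud
    # CI often does not guarantee.
    import os
    import subprocess

    CACHE_DIR = "/ci-cache/boost-root"           # hypothetical persistent path
    SUPERPROJECT = "https://github.com/boostorg/boost.git"
    COMMIT = os.environ.get("BOOST_COMMIT", "develop")  # hypothetical variable

    def run(*cmd, cwd=None):
        subprocess.run(cmd, cwd=cwd, check=True)

    if not os.path.isdir(os.path.join(CACHE_DIR, ".git")):
        # First run on this host: pay for a full clone once (no --depth 1,
        # so later fetches can be incremental).
        run("git", "clone", SUPERPROJECT, CACHE_DIR)
    else:
        # Subsequent runs: only the new objects since the cached state
        # are transferred.
        run("git", "fetch", "origin", cwd=CACHE_DIR)

    run("git", "checkout", "--force", COMMIT, cwd=CACHE_DIR)
    # The needed submodules would then be updated selectively, as in the
    # previous sketch, reusing whatever is already in the cache.

The trade-off is the one described above: the first clone is more expensive than a shallow one, but every later run transfers only the new objects.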