Please do not overwrite the cache if it failed to download

josch

10 Dec, 2018 07:32 AM

Hi,

sometimes it happens that the cache fails to be downloaded:

https://ci.appveyor.com/project/josch/3dtk/builds/20877890

With an error like this:

Error downloading cache item: Unable to read data from the transport connection: The connection was closed.

So the cache will start from zero and the build will try to fill it again. Unfortunately, this does not work in cases where it takes multiple runs for the cache to be filled because it takes longer than an hour (the build timeout) to build all the vcpkg dependencies. So in this case, the cache will start from zero, get halfway built and then the halfway built cache is uploaded and *overwrites* the existing properly filled cache.

For these cases I'd like to request one of the following:

 - an option to abort the build if the cache couldn't be downloaded (I think that's a sane default), or
 - an option to at least not overwrite the existing cache with a new one, or
 - a way for my build script to find out whether the cache failed to download

Could such an option be added?

Thanks!

  1. 1 Posted by josch on 10 Dec, 2018 07:37 AM

    Not only download fails. Sometimes extraction fails:

    https://ci.appveyor.com/project/josch/3dtk/builds/20889529

    With an error like this:

    Error uncompressing cache item: 7z.exe process has exited with code -1073741510. Check C:\Users\appveyor\AppData\Local\Temp\1\build-cache-logs\88e45141cae35ced15beaaf29b99d251bd277254.zip.001.log for details.

    But the log file doesn't contain anything interesting:

    7-Zip 18.05 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-04-30

    Scanning the drive for archives:
    1 file, 30368484 bytes (29 MiB)

    Extracting archive: C:\Users\appveyor\AppData\Local\Temp\1\pzjcvx01.ygj\88e45141cae35ced15beaaf29b99d251bd277254.zip.001

  2. Support Staff 2 Posted by Owen McDonnell on 10 Dec, 2018 10:49 PM

    In this case, why do you have APPVEYOR_SAVE_CACHE_ON_ERROR set to true?
    Also, I wonder if you're running into the account-wide cache limit, which is 1GB on a free account. This build alone looks like it has a cache bigger than 500MB.

  3. 3 Posted by josch on 10 Dec, 2018 11:16 PM

    Filling the cache takes more than the time limit of 1 hour, so I need the cache to be saved even after the build has timed out. I need to restart the build several times until the cache has slowly filled up.

    I don't think I'm running into the 1GB limit, because our recent builds are working just fine:

    https://ci.appveyor.com/project/josch/3dtk/builds/20908907

    Why is the setting of `APPVEYOR_SAVE_CACHE_ON_ERROR` relevant for this problem?
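The multi-run cache fill described in this post can be sketched as a loop that installs dependencies one at a time and stops once a time budget is used up, so that each run leaves a larger, still usable cache behind. This is only an illustration, not how the 3dtk build script works: `fill_cache`, `is_cached`, `install`, and `budget_seconds` are hypothetical names, and the budget would have to be chosen well below the real build timeout.

```python
import time


def fill_cache(packages, is_cached, install, budget_seconds, clock=time.monotonic):
    """Install uncached packages until the time budget is exhausted.

    Returns the list of packages installed during this run. Packages that
    did not fit into the budget are left for the next run.
    """
    start = clock()
    installed = []
    for pkg in packages:
        if is_cached(pkg):
            # Already present from an earlier run; nothing to do.
            continue
        if clock() - start >= budget_seconds:
            # Stop before starting another long install.
            break
        install(pkg)
        installed.append(pkg)
    return installed
```

Note that the budget is only checked before each install, so a single very slow package can still run past it.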

  4. Support Staff 4 Posted by Owen McDonnell on 11 Dec, 2018 06:55 AM

    I mentioned APPVEYOR_SAVE_CACHE_ON_ERROR for cases when the extraction fails and you don't want to overwrite cache. Not sure that's what's going on here though.

    The cache restore errors may have been some stochastic network error. Looks like most recent builds are not experiencing this error?

    As far as sane defaults go, I think most users prefer to allow a build to continue in the case of cache failures. This issue suggests so.

  5. 5 Posted by josch on 11 Dec, 2018 08:21 AM

    If APPVEYOR_SAVE_CACHE_ON_ERROR had any influence, then AppVeyor would exit with an error after the cache failed to download or extract, but that's not what it's doing. It's happily continuing the build without any error. So there is no way for my build script to know whether the cache failed to download or whether we are just starting off with a fresh cache.

    I imagine that other users have different expectations. That's why I suggested points two and three in my initial message:

    • an option to at least not overwrite the existing cache with a new one, or
    • a way for my build script to find out whether the cache failed to download

    That way, any user could decide for themselves how they want to react to "stochastic network errors".

    For us, "stochastic network errors" are a real pain, because it takes several rebuilds to fully populate the cache again.
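One workaround a build script could try, absent such an option, is a sentinel file inside the cached directory combined with a flag that says a full cache is expected. The sketch below is an assumption, not an AppVeyor feature: the marker name `.cache-complete` and the `expect_full_cache` flag (for example fed from a project-level environment variable) are hypothetical. The script would write the marker once the cache is fully populated, and on later builds fail early if the marker is missing.

```python
import os

MARKER_NAME = ".cache-complete"  # hypothetical sentinel file name


def mark_complete(cache_dir):
    """Write the sentinel after the cache has been fully populated."""
    os.makedirs(cache_dir, exist_ok=True)
    with open(os.path.join(cache_dir, MARKER_NAME), "w") as fh:
        fh.write("ok\n")


def cache_is_complete(cache_dir):
    """True if the sentinel from a previous full fill is present."""
    return os.path.isfile(os.path.join(cache_dir, MARKER_NAME))


def restore_looks_failed(cache_dir, expect_full_cache):
    """True when a full cache was expected but the sentinel is missing."""
    return bool(expect_full_cache) and not cache_is_complete(cache_dir)
```

A build step could call `restore_looks_failed(...)` first and exit with a non-zero status when it returns true. Whether a failed build then leaves the stored cache untouched depends on APPVEYOR_SAVE_CACHE_ON_ERROR not being set to true, as discussed in this thread.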

  6. Support Staff 6 Posted by Owen McDonnell on 11 Dec, 2018 05:45 PM

    The first build you linked to failed because there was an error restoring the cache. So if you do not want to save the now-incomplete cache after a cache restore error, don't set APPVEYOR_SAVE_CACHE_ON_ERROR to true. You can always set it to true when you need to build up a new cache.

    But most importantly, is this a frequent problem for you? Looking through your build history I could only find that one single occurrence.

  7. 7 Posted by josch on 11 Dec, 2018 06:41 PM

    No, the first build I linked to failed because "vcpkg install failed" (see line 492).

    If APPVEYOR_SAVE_CACHE_ON_ERROR indeed means "don't save the cache when there is an error retrieving it" then the documentation of that variable is wrong. The docs say:

    save build cache on build failure. By default build cache is being saved only during successful build Finalize steps

    Yes, it's a frequent problem. Just today I got this:

    https://ci.appveyor.com/project/josch/3dtk/builds/20921897

    It was lucky that I was attentive because if I hadn't paused the build, then the build cache would've wrongly been updated.

    Right now the problem is that a "cache restore error" is not treated as an error. If it were treated as an error, then the build would stop immediately. But it doesn't, as you can see from the build logs. And you also already said that it's undesirable in the default case to treat cache restore failures as errors and abort the build as a result.

  8. Support Staff 8 Posted by Owen McDonnell on 11 Dec, 2018 10:25 PM

    I understand the build failure's proximate cause was the vcpkg install, but I assumed that was due to the cache restore error.

    In any case, two things to add.

    First, we increased your build time to 90 minutes, so perhaps you can fill your cache in one build now (though keep in mind that builds run in 3 different datacenters and the cache is only saved to the most recent one, so it may take some time to populate all of them after cache changes).

    Secondly, caching is meant as a best effort feature to speed up builds and act as a fail safe if nuget/npm/maven repos have issues. Your build should not rely on it 100% and package managers should play nicely with it (i.e. download what is needed if the cache is not restored). This is why we don't want cache failures to stop the build.

  9. 9 Posted by josch on 11 Dec, 2018 11:04 PM

    Okay, if you think that allowing the user to choose whether stochastic network errors during build cache downloads make the whole build fail or not is not something in scope of appveyor, then we are nevertheless grateful for the bumped build time (it will help us a lot for filling up the cache) and you can consider this issue closed as wontfix. Thanks!
