Spuriously failing test due to seemingly stray open handles

acrichton

29 Dec, 2016 02:17 AM

Hello! We've been using AppVeyor quite a bit on the rust-lang/rust and rust-lang/cargo GitHub repositories and it's been working fantastically!

Over time we've seen a number of spuriously failing tests in both test suites. More information can be found in the respective rust and cargo issues, but the gist is that builds fail spuriously because they're unable to remove a file. Our best guess as to the cause is that some other process still has a handle open to the file in question, preventing its removal. We don't know for sure that this is the problem; it's just our best guess so far.

So with that information, my question is whether AppVeyor has any sort of virus scanning utility running in the background by default. We're drawing blanks trying to track down processes which might have handles open to these files, but we know that historically Windows has had problems with this behavior and virus scanners (although we think it should be fixed nowadays).

Failing that, if you happen to have seen anyone else with a similar problem on AppVeyor, any help would be much appreciated! Unfortunately, we've also been unable to reproduce any of these issues locally.

  1. 1 Posted by Ilya Finkelshte... on 29 Dec, 2016 09:55 PM

    Hi Alex,

    Thank you for the kind words!

    We don't run antivirus software, and in general we try to keep our VMs as free of unnecessary processes as possible. Yes, we have a lot of things pre-installed on the build VM, but it is up to the customer to decide which processes or services (besides the bare minimum) to run.

    Would it be possible for you to catch those errors and run handle.exe against the problem file? It might look like this:

    handle.exe -a -u "C:\projects\rust\build\x86_64-pc-windows-msvc\test\incremental\cache_file_headers.stage2-x86_64-pc-windows-msvc.exe" -nobanner
    
    Or, if it is too expensive to change the test code to catch this error, maybe you can just run this command periodically between the tests?

    --ilya.

  2. 2 Posted by acrichton on 29 Dec, 2016 11:42 PM

    Ok, thanks for the clarification! I'll test out using handle.exe on AppVeyor and see if it works. Thanks for the tip!

  3. 3 Posted by diggsey on 25 Mar, 2017 02:31 PM

    Hi Ilya,
    I'm having the exact same issue too. I did as you suggested, and added periodic calls to `handle.exe` immediately before tests that I know to fail frequently. However, over the course of ~20 builds, it has yet to show any open handles to any of the files being deleted.

    Adding calls to `handle.exe` does reduce the failure frequency (presumably just due to timing), but normally I'm seeing failure rates of up to about 1 in 4, which is very problematic when we're building for 4 targets on every PR.

    These spurious failures are not reproducible locally, despite a large number of builds and attempts to reproduce the conditions on appveyor.

    You can see my attempts to run `handle.exe` on appveyor in this build, and all other builds of the same PR:
    https://ci.appveyor.com/project/brson/rustup-rs/build/1.0.891

    Do you have any idea what could be causing this, or further suggestions for debugging?

    Thanks,
    Diggory

  4. 4 Posted by Ilya Finkelshte... on 28 Mar, 2017 02:26 AM

    Hi Diggory,

    Another trick that has worked lately in file-locking situations is to use the Visual Studio 2017 image. It might work better because it is based on Windows Server 2016, where Windows Defender is explicitly disabled.

    The problem with this image is that it does not have all the same software installed as our main image. I tried running your build against my fork and needed to install MSYS (at least the 64-bit one) to make it work. Here are the changes I needed to make. However, this increased each build job's time from about 8 to about 16 minutes. If you can afford this time increase, I would recommend trying it now. If not, please watch this issue and try Visual Studio 2017 once it is closed. Or maybe you can find a way to build your project on the Visual Studio 2017 image without this heavy installation. I did it quite blindly/instinctively, and there may actually be an easier way to make it run on this new image.

    Ilya.

  5. 5 Posted by diggsey on 30 Mar, 2017 11:10 AM

    Hi Ilya,
    Thanks for your response. Fixing our build to work with the new image was going above and beyond!

    I've had time to run your changes several times, and while it doesn't *completely* fix the problem, it does reduce the failure rate significantly - I only got 1 failure in 40 builds. In combination with other changes we are making to retry failed operations where possible, it should bring the failure rate down to an acceptable level.

    Best Regards,
    Diggory

  6. 6 Posted by Ilya Finkelshte... on 30 Mar, 2017 03:41 PM

    Great, thanks a lot for the update!

  7. Ilya Finkelshteyn closed this discussion on 25 Aug, 2018 02:15 AM.
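
For anyone landing here later: two suggestions from this thread, retrying failed operations and running `handle.exe` against the problem file when removal fails, could be combined roughly as below. This is only a minimal sketch, not the code that rust, cargo, or rustup actually use. The function names `retry`, `dump_open_handles`, and `remove_file_with_retry`, the attempt count, and the delay are all made up for illustration, and it assumes Sysinternals `handle.exe` is on the PATH of the build machine.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;
use std::thread;
use std::time::Duration;

/// Runs `op` up to `attempts` times, sleeping `delay` between failed tries.
/// Returns the first success, or the last error if every attempt fails.
/// (Hypothetical helper; a real one would probably only retry some error kinds.)
fn retry<T, F>(attempts: u32, delay: Duration, mut op: F) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    let mut last_err = io::Error::new(io::ErrorKind::Other, "retry called with zero attempts");
    for attempt in 1..=attempts {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => {
                last_err = err;
                // No point sleeping after the final attempt.
                if attempt < attempts {
                    thread::sleep(delay);
                }
            }
        }
    }
    Err(last_err)
}

/// Best-effort diagnostic: asks handle.exe which processes hold `path` open,
/// using the same flags as the command quoted earlier in the thread.
/// Returns None if handle.exe could not be started (assumed to be on PATH).
fn dump_open_handles(path: &Path) -> Option<String> {
    let output = Command::new("handle.exe")
        .args(&["-a", "-u"])
        .arg(path)
        .arg("-nobanner")
        .output()
        .ok()?;
    Some(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Removes a file, retrying a few times; if it still fails, logs the error
/// and whatever handle.exe reports. The 5 attempts / 100 ms are arbitrary.
fn remove_file_with_retry(path: &Path) -> io::Result<()> {
    let result = retry(5, Duration::from_millis(100), || fs::remove_file(path));
    if let Err(err) = &result {
        eprintln!("failed to remove {}: {}", path.display(), err);
        if let Some(report) = dump_open_handles(path) {
            eprintln!("{}", report);
        }
    }
    result
}
```

A test helper would then call `remove_file_with_retry(&path)` wherever it currently calls `fs::remove_file(&path)`. As noted in the thread, the extra time spent in `handle.exe` can itself change the timing, so an empty report does not prove that no handle was open at the moment of the failure.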
