BYOC: 100% CPU usage in 'bash-shell.sh' and random 'exited with code 1'

Jan Vesely

14 Oct, 2019 04:50 PM

I've tried to set up Linux Docker BYOC in 3 different projects:
1.) Uses gcc and clang; works nicely [0].
2.) Uses python/pytest, Linux only [1]. Tests execute OK, but the build hangs before executing the 'on_finish' script:

on_finish:
- sh: curl -X POST -F "file=@tests_out.xml" https://ci.appveyor.com/api/testresults/junit/$APPVEYOR_JOB_ID
Inspecting the machine shows one thread stuck in bash-shell.sh.
3.) Same project as 2., but the configuration mixes sh/cmd/pwsh to execute on both Windows and Linux. The build hangs while executing:
test_script:
  - pwsh: pytest --junit-xml=tests_out.xml -n auto --strict-markers $Env:EXTRA_ARGS
As in 2., I see 100% CPU usage in bash-shell.sh.

All 3 instances use the same Docker image. It was originally built using the BYOC setup, but I had to update it (apt update/apt upgrade), otherwise some clang dependencies wouldn't install. I've also added several packages that I forgot to include in the BYOC setup steps.
I'm not sure if/how to update the AppVeyor build environment, or whether it's necessary.

1.) and 3.) also used to randomly fail with "exited with code 1" during pip package downloads [3]. I'm not sure how to investigate those.

[0] https://ci.appveyor.com/project/jvesely/libclc

[1] https://ci.appveyor.com/project/jvesely/psyneulink/builds/28084070/...

[2] https://ci.appveyor.com/project/jvesely/psyneulink-wuxsn/builds/280...

[3] https://ci.appveyor.com/project/jvesely/psyneulink/builds/28103699/...

  1. Support Staff 1 Posted by Feodor Fitsner on 14 Oct, 2019 08:08 PM

    Hi Jan,

    Thank you for such a thorough test of BYOC!

    Do you still have the command that creates a custom Docker image that works with your project? (Or you could upload the image to a Docker registry if it's not huge :)

  2. 2 Posted by Jan Vesely on 14 Oct, 2019 09:09 PM

    Hi,

    It should be available as "jvesely/ci:appveyor" on Docker Hub.
    Setting up BYOC was my first exposure to Docker infrastructure, so I'm sure there are ways to do things better.
    Thanks.

  3. 3 Posted by Jan Vesely on 15 Oct, 2019 02:44 AM

    I was able to get around the test hang in 3. by using sh instead of pwsh.
    I assume finishing a job (before 'on_finish') also involves some PowerShell commands,
    so it looks like there's something wrong with PowerShell on that image.

    EDIT: To be more specific, the problem seems to be switching from sh to pwsh.

  4. 4 Posted by Jan Vesely on 17 Oct, 2019 02:05 AM

    Hi,

    Switching builds to use pwsh exclusively works around the hangs.
    Exporting PIP_PROGRESS_BAR=off fixed the occasional "exited with code 1" error.
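
    As a rough sketch, the pip workaround above could be applied to every job from appveyor.yml rather than exported by hand. This assumes the standard top-level `environment` section; the value is quoted because a bare `off` is read as a boolean by YAML 1.1 parsers, and pip expects the literal string.

    ```yaml
    # appveyor.yml (fragment; placement in the `environment` section is an assumption)
    environment:
      # Equivalent to `export PIP_PROGRESS_BAR=off`: pip reads PIP_<OPTION>
      # variables as defaults for its command-line options (here --progress-bar).
      PIP_PROGRESS_BAR: "off"
    ```

    The same effect should be obtainable per command with `pip install --progress-bar off ...`, if setting it globally is not desirable.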

  5. Support Staff 5 Posted by Feodor Fitsner on 17 Oct, 2019 03:32 AM

    Interesting findings! We'll look into what might be wrong with the bash loop. I guess "exited with code 1" is also related to the bash issue...

    You know what, could you try one more thing please? What if you switch back to using sh everywhere, but additionally set the APPVEYOR_CONSOLE_DISABLE_PTY: true environment variable on that build (either in YAML or in the UI)? This way the build runs without the PTY emulation proxy, so log "coloring" will be off. Just wondering if that makes the build stable.

  6. 6 Posted by Jan Vesely on 17 Oct, 2019 06:35 PM

    Hi,
    Using APPVEYOR_CONSOLE_DISABLE_PTY: true fixes the hangs when using a mix of sh and pwsh commands.
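
    For reference, a minimal sketch of what such a configuration might look like in appveyor.yml. The variable name and value come from this thread, and the two commands are the ones quoted in the original post; putting them together in one file is an illustrative assumption, not the actual project configuration.

    ```yaml
    # appveyor.yml (fragment; illustrative combination of snippets from this thread)
    environment:
      # Run build commands without the PTY emulation proxy.
      # Side effect mentioned above: log "coloring" is turned off.
      APPVEYOR_CONSOLE_DISABLE_PTY: true

    test_script:
      # PowerShell Core step, as in project 3.
      - pwsh: pytest --junit-xml=tests_out.xml -n auto --strict-markers $Env:EXTRA_ARGS

    on_finish:
      # Plain shell step, as in project 2.; this sh/pwsh mix is what used to hang.
      - sh: curl -X POST -F "file=@tests_out.xml" https://ci.appveyor.com/api/testresults/junit/$APPVEYOR_JOB_ID
    ```

    The variable can alternatively be set in the project's environment settings in the UI, as suggested above.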

    thanks!

  7. Support Staff 7 Posted by Feodor Fitsner on 18 Oct, 2019 05:25 PM

    That's great, thanks for the update! It means we should revisit our implementation of PTY.

  8. 8 Posted by Jan Vesely on 03 Nov, 2019 03:57 AM

    I've run into one more similar problem. A pure sh setup (similar to 2.) with artifacts works without setting APPVEYOR_CONSOLE_DISABLE_PTY: true, but artifact upload is extremely slow (~1 min to upload 4 MB), to the point that it occasionally fails to upload at all [0].
    The other symptoms are similar: 100% CPU usage in bash-shell.sh.

    [0] https://ci.appveyor.com/project/jvesely/llvm-project-libclc/builds/...

  9. Support Staff 9 Posted by Feodor Fitsner on 03 Nov, 2019 09:29 PM

    Where is the server you are uploading from located?

  10. 10 Posted by Jan Vesely on 04 Nov, 2019 01:03 AM

    The machine is located on the Rutgers University network (central NJ).

    Edit: The upload speed looks like a local Rutgers issue. It's still interesting that the build doesn't need the PTY workaround in the presence of artifacts.
