Hosted runners: Job killed after 60 mins despite longer timeout

Jan Vesely

15 Feb, 2020 05:43 AM

Hi,

Docker containers on self-hosted runners are killed after 60 minutes even if the config specifies a longer timeout. See for example [0]: the last line of output is at 01:00:57, while the job timed out at 1:20.

Jan

[0] https://ci.appveyor.com/project/jvesely/psyneulink-coveralls/builds... edit: correct link:
https://ci.appveyor.com/project/jvesely/llvm-project-llvm/builds/30...

  1. Support Staff 1 Posted by Feodor Fitsner on 16 Feb, 2020 03:02 AM

    Hi Jan,

    The build you linked to failed after 10 seconds; it was not cancelled by a timeout. Is that the correct build URL?

  2. 2 Posted by Jan Vesely on 16 Feb, 2020 03:05 AM

    Hi,

    Sorry, you're right. I've updated the link. The first link was meant for https://help.appveyor.com/discussions/problems/26293-quote-branch-n...

    Jan

  3. Support Staff 3 Posted by Feodor Fitsner on 16 Feb, 2020 03:41 AM

    Right, running on a self-hosted agent doesn't automatically increase the timeout.
    However, you can configure a project-specific build timeout on the "General" tab of the project settings.

  4. 4 Posted by Jan Vesely on 16 Feb, 2020 03:44 AM

    I did, that's the problem. The project timeout is set to 80 minutes and it reports as 'timed out' in 1h 20 mins, which is correct.

    The problem is that the job did nothing in the last 20 mins.
    The container stopped after 60 mins. I checked 'docker ps', as well as the last line of output (which is at 1:00:57).

    Is there another timeout setting on the runner side that needs to be bumped above 60 mins?

  5. Support Staff 5 Posted by Feodor Fitsner on 16 Feb, 2020 03:47 AM

    I see. While the build is running in the container, could you open http://localhost:5020 on the Docker machine and check what time the build is supposed to finish?

  6. 6 Posted by Jan Vesely on 16 Feb, 2020 03:59 AM

    I only have remote access to the machine. Is there a resource I can GET using nc to retrieve the info?
    I found these lines in the log:

    Feb 15 14:37:44 baltix appveyor-host-agent[2214]: info: Appveyor.HostAgent.Docker.WorkerCloud[0]
    Feb 15 14:37:44 baltix appveyor-host-agent[2214]:       [worker-557-004] Received job message: JobId=2vqf4opd9jhvc980, JobName=jvesely/llvm-project-llvm/36, JobTimeout=60, ImageName=Ubuntu
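
The log excerpt above shows the agent receiving JobTimeout=60 even though the project is configured for 80 minutes. A minimal sketch for spotting this kind of mismatch in a saved agent log follows. The function names are hypothetical and not part of AppVeyor, the regular expression is based only on the single line quoted above, and it assumes JobTimeout is expressed in minutes.

```python
import re

# Hypothetical helper, not an AppVeyor tool. The pattern mirrors the one
# "Received job message" line quoted in this thread; other agent versions
# may log a different format.
JOB_MESSAGE = re.compile(
    r"Received job message: JobId=(?P<job_id>[^,]+), "
    r"JobName=(?P<job_name>[^,]+), "
    r"JobTimeout=(?P<timeout>\d+)"
)


def parse_job_timeout(line):
    """Return (job_id, job_name, timeout) or None if the line is not a job message."""
    match = JOB_MESSAGE.search(line)
    if match is None:
        return None
    return match["job_id"], match["job_name"], int(match["timeout"])


def timeout_mismatch(line, configured_minutes):
    """True if the line is a job message whose timeout differs from the configured one.

    Assumes JobTimeout is in minutes, like the project setting.
    """
    parsed = parse_job_timeout(line)
    return parsed is not None and parsed[2] != configured_minutes


# The log line from this thread, used as sample input.
SAMPLE = (
    "Feb 15 14:37:44 baltix appveyor-host-agent[2214]: [worker-557-004] "
    "Received job message: JobId=2vqf4opd9jhvc980, "
    "JobName=jvesely/llvm-project-llvm/36, JobTimeout=60, ImageName=Ubuntu"
)

if __name__ == "__main__":
    print(parse_job_timeout(SAMPLE))
    print(timeout_mismatch(SAMPLE, 80))
```

Run against the line above with a configured timeout of 80, the check reports a mismatch, which matches the behaviour described in this thread.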
    
  7. Support Staff 7 Posted by Feodor Fitsner on 16 Feb, 2020 03:32 PM

    Looks like there is a bug. We are going to fix it and deploy an update. I've created an issue: https://github.com/appveyor/ci/issues/3317

  8. Support Staff 8 Posted by Feodor Fitsner on 09 Mar, 2020 11:42 PM

    Hi Jan,

    The timeout for BYOC clouds has been fixed. Let me know how it works for you.

  9. 9 Posted by Jan Vesely on 10 Mar, 2020 08:31 PM

    Hi,

    I can see that the logs now report timeouts of >=60 mins and that jobs complete after >60 mins. I think it's fixed.
    Thank you!

  10. Support Staff 10 Posted by Feodor Fitsner on 10 Mar, 2020 08:34 PM

    Cool, thanks for the update!

  11. Jan Vesely closed this discussion on 10 Mar, 2020 08:36 PM.
