Troubleshooting connection issues to Azure

Boris Callens

19 Mar, 2019 03:25 PM

For the past few days we have been seeing an alarming increase in failed builds due to connection-related errors, and we could use some assistance troubleshooting them. Currently the failing builds are keeping us from deploying to production in a timely manner.

The issues occur across multiple projects.
They are intermittent and not reproducible in our local environments.
Because our integration tests connect to Azure and we also deploy to Azure, we checked our Azure logs, but we can't find any relevant connection issue notifications or log entries.
The exceptions are of a varied nature,
ranging from connection errors with NuGet and our own existing Azure services to the Azure zip deploy failing and timeouts to the Azure Search service.
The only thing they have in common is that they all seem to be related to connection issues.

We saw there was a ticket yesterday that was subsequently resolved, but we didn't see our issues disappear.
Is there something else we can check from our side?

Below are a few examples of builds failing because of connection issues within the last three hours (there are many more):
  - https://ci.appveyor.com/project/ichoosr/horizons-backoffice/builds/23188646
  - https://ci.appveyor.com/project/ichoosr/horizons-api/builds/23189337
  - https://ci.appveyor.com/project/ichoosr/horizons-api/builds/23189599
  - https://ci.appveyor.com/project/ichoosr/horizons-api/builds/23190286
  - https://ci.appveyor.com/project/ichoosr/horizons-api/builds/23190440
  - https://ci.appveyor.com/project/ichoosr/horizons-app/builds/23189168
  - https://ci.appveyor.com/project/ichoosr/horizons-app/builds/23188559
  - https://ci.appveyor.com/project/ichoosr/horizons-app/builds/23188974

Slightly related: in an effort to gather some relevant log files I tried using the AppVeyor API, but I didn't understand how to get to the logs of failed builds. I asked a Stack Overflow question here: https://stackoverflow.com/questions/55243440/get-jobids-of-failed-builds
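
For anyone trying the same thing, here is a minimal Python sketch of how the job IDs of failed builds might be collected. It assumes, and this should be checked against the AppVeyor REST API documentation, that the project history endpoint returns a `builds` list whose entries carry `status` and `version`, that the build details endpoint returns `build.jobs` entries with `jobId` and `status`, and that a job log can be downloaded from `/api/buildjobs/{jobId}/log`. The helper names and the token handling are hypothetical.

```python
import json
import urllib.request

# Assumed API root; verify against the AppVeyor REST API documentation.
API_ROOT = "https://ci.appveyor.com/api"


def failed_build_versions(history):
    """Return the versions of failed builds from a parsed history response."""
    builds = history.get("builds", [])
    return [b["version"] for b in builds if b.get("status") == "failed"]


def failed_job_ids(build_details):
    """Return the IDs of failed jobs from a parsed build details response."""
    jobs = build_details.get("build", {}).get("jobs", [])
    return [j["jobId"] for j in jobs if j.get("status") == "failed"]


def log_url(job_id):
    """Build the (assumed) download URL for a job's console log."""
    return f"{API_ROOT}/buildjobs/{job_id}/log"


def get_json(url, token):
    """Fetch a URL with a bearer token and parse the JSON body."""
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Hypothetical usage (not executed here; the endpoint paths are assumptions):
#   history = get_json(f"{API_ROOT}/projects/ichoosr/horizons-api/history?recordsNumber=20", token)
#   for version in failed_build_versions(history):
#       details = get_json(f"{API_ROOT}/projects/ichoosr/horizons-api/build/{version}", token)
#       for job_id in failed_job_ids(details):
#           print(log_url(job_id))
```

The parsing helpers are kept separate from the HTTP call so they can be tried against saved JSON responses before any real API calls are made.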

  1. Support Staff 1 Posted by Ilya Finkelshte... on 19 Mar, 2019 11:48 PM

    Hi Boris,

    We noticed similar behavior reported by our monitoring as well. We are still not sure we understand the root cause, but we believe it is related to Hyper-V networking issues on some hosts. We are rolling out a fix that we hope will help. Infrastructure-level changes are not very fast to deploy, so it will be fully deployed in 24-48 hours. Please send us links to the failed builds if you hit it again.

    Ilya.

  2. 2 Posted by jeroen.heijmans on 20 Mar, 2019 11:09 AM

    Dear Ilya,

    I'm a colleague of Boris and am running into similar issues. I realize the 24-48 hours have not yet passed; it has only been about 12 hours so far. At the moment we're still experiencing the issues, e.g. this build just now:

    Failed with an error while connecting to an Azure resource:

    System.Net.Http.HttpRequestException : No such host is known

    But we'll be patient a little longer and keep monitoring builds over the coming hours until your fix has been fully deployed.

    Kind regards,
    Jeroen
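
    To help tell a transient DNS failure such as "No such host is known" apart from a persistent one, a small probe could be run as a build step before the integration tests. The Python sketch below is only an illustration under stated assumptions: the function name, retry count, delay, and port are hypothetical choices, and it does not fix the underlying host networking issue.

```python
import socket
import time


def resolve_with_retry(host, attempts=3, delay=1.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve a host name, retrying on DNS errors with a growing delay.

    Returns the sorted list of distinct IP addresses on success and
    re-raises the last socket.gaierror if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(host, 443)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            last_error = exc
            print(f"attempt {attempt}/{attempts}: could not resolve {host}: {exc}")
            if attempt < attempts:
                sleep(delay * attempt)  # simple linear backoff
    raise last_error


# Hypothetical usage with a placeholder host name:
#   print(resolve_with_retry("myservice.azurewebsites.net"))
```

    If resolution only succeeds on a later attempt, that points to an intermittent resolver or network problem on the build host rather than a wrong host name.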

  3. Support Staff 3 Posted by Ilya Finkelshte... on 20 Mar, 2019 09:03 PM

    Yes, this specific build ran on a host machine that did not yet have the patch applied at the time of the build.
