DNS Problem?

ukio

23 Dec, 2014 12:12 PM

I'm facing the same DNS / can not resolve host issue with several builds in a row now, for example here:
https://ci.appveyor.com/project/ukio/vagrant-appveyor-testing/build...

The build stalls here:

if (!(Test-Path -path C:\Vagrant-1.7.1)) {
    Start-FileDownload "https://dl.bintray.com/mitchellh/vagrant/vagrant_1.7.1.msi"
    Start-Process -FilePath "msiexec.exe" -ArgumentList "/a vagrant_1.7.1.msi /qb TARGETDIR=C:\Vagrant-1.7.1" -Wait
}

Exception calling "DownloadFile" with "3" argument(s): "The remote name could not be resolved: 'dl.bintray.com'"
At C:\Program Files\AppVeyor\BuildAgent\Modules\build-worker-api\build-worker-api.psm1:238 char:5
+     [Appveyor.BuildAgent.Api.RestBuildServices]::DownloadFile($Url, $FileName, $ ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : WebException

The file it should download definitely exists at that location:
https://dl.bintray.com/mitchellh/vagrant/vagrant_1.7.1.msi

Any ideas?

  1. Support Staff 1 Posted by Feodor Fitsner on 23 Dec, 2014 05:00 PM

    Hm, I've been able to download the file from that location:
    https://ci.appveyor.com/project/FeodorFitsner/simple-console/build/...

    AppVeyor workers use Google DNS (8.8.8.8/8.8.4.4).

    Try flushing DNS before making that call:

    ipconfig /flushdns
    

    https://ci.appveyor.com/project/FeodorFitsner/simple-console/build/...

  2. 2 Posted by ukio on 24 Dec, 2014 12:14 PM

    Hi Feodor,

    Weird, I tried flushing the DNS cache via ipconfig /flushdns, but it did not help:
    https://ci.appveyor.com/project/ukio/vagrant-appveyor-testing/build...

    That also raises another question: what kind of state is preserved between builds, and do you get the same Hyper-V worker for each build or always a random one?

    My assumption was that I would get a random build worker that is reset to a pristine state each time a build is started. The only shared and preserved state would be the cached directories, if any.

    Merry Xmas!
    Torben

  3. Support Staff 3 Posted by Feodor Fitsner on 24 Dec, 2014 06:49 PM

    Yes, you are correct. Each time it's a random worker from a random Hyper-V host which was reset to a "clean" state. I could imagine there might be some workers which were snapshotted with the "wrong" DNS, but then you would get that DNS error only randomly.

    Try downloading that file in the very beginning of your script?

  4. 4 Posted by ukio on 27 Dec, 2014 12:44 PM

    Weird. I put it at the beginning (init) and it suddenly worked:
    https://ci.appveyor.com/project/ukio/vagrant-appveyor-testing/build...

  5. 5 Posted by ukio on 27 Dec, 2014 01:05 PM

    ...and it stays weird. After a successful run I moved the downloading part back to where it was. It won't fail there anymore because the downloaded and extracted files are now cached.

    However, now it fails a bit further along in the build process, again in a place where a download is involved and again with something that looks like a DNS issue:

    gem --version
    2.0.14
    gem install bundler --quiet --no-ri --no-rdoc
    ERROR:  Could not find a valid gem 'bundler' (>= 0), here is why:
              Unable to download data from https://rubygems.org/ - no such name (https://rubygems.org/latest_specs.4.8.gz)
    Command exited with code 2
    

    It failed exactly here 3 times in a row now, so I guess it's reproducible:
    https://ci.appveyor.com/project/ukio/vagrant-appveyor-testing/build...

  6. 6 Posted by ukio on 27 Dec, 2014 01:16 PM

    OK, it might have been only a hiccup though. On the 4th build the error is gone:
    https://ci.appveyor.com/project/ukio/vagrant-appveyor-testing/build...

  7. 7 Posted by ukio on 27 Dec, 2014 01:35 PM

  8. 8 Posted by ukio on 27 Dec, 2014 09:05 PM

    Hi Feodor,

    I suspect all this has something to do with the installation of VirtualBox. I'm pretty sure it adds and resets some network adapters, which might explain the DNS resolution issues.

    For now I am waiting until the network is back up again before continuing:

    While ((Test-Connection heise.de -count 1 -quiet) -ne "True") {
        echo "waiting for network..."
        Start-Sleep 1
    }
    

    There seem to be at least 2 seconds after installing VirtualBox where the network is not available, e.g. see here:
    https://ci.appveyor.com/project/ukio/vagrant-appveyor-testing/build...

    Once the cache for this project is cleaned and the build is green again, I will consider this theory confirmed and close the discussion.

    What a nasty bug in my appveyor.yml...

    Cheers,
    Torben

  9. Support Staff 9 Posted by Feodor Fitsner on 27 Dec, 2014 09:09 PM

    Oh, I think you are right! Might be VirtualBox setting up virtual NICs.

    Anyway, using VirtualBox inside AV build workers is a great case and if that works it adds another edge case to this one. :)

    Let me know about results.

  10. 10 Posted by ukio on 27 Dec, 2014 09:21 PM

    Oh, might it be that outbound ICMP is blocked from within the Azure build workers?

    Compare this on Hyper-V (works after 2 seconds):
    https://ci.appveyor.com/project/ukio/vagrant-appveyor-testing/build...
    with the same build on Azure (ping still does not work after 13 minutes...):
    https://ci.appveyor.com/project/tknerr/vagrant-appveyor-testing/bui...

    Cheers,
    Torben

  11. 11 Posted by ukio on 27 Dec, 2014 09:50 PM

    Trying with Test-NetConnection on port 80 instead of using ICMP.

    Looks good on the Hyper-V workers:
    https://ci.appveyor.com/project/ukio/vagrant-appveyor-testing/build...

    Waiting for the Azure worker to start...
    https://ci.appveyor.com/project/tknerr/vagrant-appveyor-testing/bui...

  12. 12 Posted by ukio on 27 Dec, 2014 09:59 PM

    So, the Azure worker just started, and I'm glad this check now works for both Hyper-V and Azure workers:

    While ((Test-NetConnection heise.de -Port 80 -InformationLevel Quiet) -ne "True") {
        echo "waiting for network..."
        Start-Sleep 1
    }
    

    @Feodor: concerning the results of using VirtualBox / Vagrant inside Appveyor I will comment in the other thread here: http://help.appveyor.com/discussions/problems/1247-vagrant-not-work...

    The cause of the "DNS issues" has been found (i.e. VirtualBox setting up virtual NICs), and so has a solution (wait until outbound HTTP works), so I will close this thread.
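
    Editor's note: the wait loop in this thread has no upper bound, so a worker whose network never comes back would hang until the build itself times out. The following is a minimal sketch of a bounded variant, not something posted in the thread. The function name Wait-ForNetwork and its parameters (-Probe, -TimeoutSeconds, -DelaySeconds) are hypothetical; the default probe reuses the Test-NetConnection call from the post above.

    ```powershell
    # Hypothetical helper (not part of AppVeyor or PowerShell itself):
    # polls a probe until it returns $true, or throws once the timeout elapses.
    function Wait-ForNetwork {
        param(
            # Assumption: same host and port as the check used in this thread.
            [scriptblock]$Probe = { Test-NetConnection heise.de -Port 80 -InformationLevel Quiet },
            [int]$TimeoutSeconds = 120,
            [int]$DelaySeconds = 1
        )

        $deadline = (Get-Date).AddSeconds($TimeoutSeconds)

        # Run the probe; leave the loop as soon as it reports success.
        while (-not (& $Probe)) {
            if ((Get-Date) -ge $deadline) {
                throw "Network still unavailable after $TimeoutSeconds seconds."
            }
            Write-Host "waiting for network..."
            Start-Sleep -Seconds $DelaySeconds
        }
    }
    ```

    Calling Wait-ForNetwork right after the VirtualBox installation step would then fail the build with a clear message instead of stalling. The probe is passed as a script block so that a different check can be swapped in if outbound HTTP to that host is not a suitable signal.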

  13. ukio closed this discussion on 27 Dec, 2014 09:59 PM.
