Random fetch error using git on a github repository

olivler.grisel

25 Jul, 2014 03:10 PM

For instance see:

https://ci.appveyor.com/project/ogrisel/scikit-learn/build/1.1.122/job/yuk9okk64k5el9dy

    Build started
    git clone -q https://github.com/scikit-learn/scikit-learn.git C:\projects\scikit-learn
    fatal: unable to access 'https://github.com/scikit-learn/scikit-learn.git/': Failed connect to github.com:443; No error
    Command exited with code 128

Note that the other 3 jobs from the same build were successful. It would be great to make the fetch more robust by retrying up to 3 times on failure or by increasing the timeout.

  1. Support Staff 1 Posted by Feodor Fitsner on 25 Jul, 2014 03:18 PM

    I've seen that numerous times. Might be a problem on GitHub's side?

  2. 2 Posted by olivler.grisel on 27 Jul, 2014 05:52 PM

    Yes, but as it's random and quite rare, I think it could be mitigated on AppVeyor's side by implementing a retry mechanism in a for loop, e.g. in pseudo-code:

    from time import sleep

    n_retries = 5
    success = False
    for attempt in range(n_retries):
        try:
            do_git_fetch(url)  # placeholder for the actual fetch
            success = True
            break
        except IOError:
            # transient failure: wait before the next attempt
            sleep(5)

  3. Support Staff 3 Posted by Feodor Fitsner on 28 Jul, 2014 04:42 AM

    Indeed, we could check exit code and clone again if it's not 0. Thanks!
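The exit-code check described above could look roughly like the following Python sketch. It is only an illustration, not AppVeyor's actual implementation: the function name `run_with_retries` and its parameters are hypothetical, and the `run` and `sleep` arguments exist only so the logic can be exercised without a network.

```python
import subprocess
import time


def run_with_retries(cmd, n_retries=3, delay=5.0,
                     run=subprocess.call, sleep=time.sleep):
    """Run cmd until it exits with code 0, at most n_retries times.

    Returns the exit code of the last attempt (hypothetical helper).
    """
    exit_code = None
    for attempt in range(1, n_retries + 1):
        exit_code = run(cmd)
        if exit_code == 0:
            # Command succeeded, no further attempts needed.
            break
        if attempt < n_retries:
            # Non-zero exit code: wait before trying again.
            sleep(delay)
    return exit_code
```

A caller would pass the clone command as a list, for example `run_with_retries(["git", "clone", "-q", url, target_dir])`, and treat a non-zero return value as a failed build step. A partially created target directory may need to be removed between attempts; that cleanup is left out here.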

  4. 4 Posted by olivler.grisel on 31 Jul, 2014 06:01 PM

    A similar problem often happens with artifact uploads to the Azure blob store, for instance:

    https://ci.appveyor.com/project/sklearn-ci/scikit-learn/build/1.0.1/job/ae17hb0poxf0ji84

    Packaging artifacts...Done
    Uploading artifact dist\scikit-learn-0.16-git.win32-py2.7.exe (2.8 MB)...An exception occurred during a WebClient request.

    I think a similar strategy could help mitigate the issue.

  5. 5 Posted by dane on 01 Aug, 2014 06:24 PM

    I've been seeing this consistently causing failed builds on appveyor, but strangely not on travis (which runs builds at the same time). So I wonder what could be different to cause appveyor to fail when travis does not?

    Here is one example build:
    - travis (clone worked): https://travis-ci.org/mapbox/mapbox-studio/jobs/31445776
    - appveyor (clone failed): https://ci.appveyor.com/project/Mapbox/mapbox-studio/build/1.0.72

    One obvious difference is that travis checks out like:

    `git clone --depth=50 git://github.com/mapbox/mapbox-studio.git mapbox/mapbox-studio`

    While appveyor clones like:

    `git clone -q https://github.com/mapbox/mapbox-studio.git C:\projects\mapbox-studio`

    Feodor: what do you think about doing shallow clones by default and pulling using the `git://` URL (as travis does) instead of https?

  6. Support Staff 6 Posted by Feodor Fitsner on 02 Aug, 2014 05:57 AM

    I'll try using SSH with public repos. Have you tried setting "Fetch with API" (shallow_clone in appveyor.yml) http://www.appveyor.com/docs/how-to/repository-shallow-clone#downlo...?

  7. 7 Posted by dane on 06 Aug, 2014 05:49 PM

    Thanks for the doc link for shallow clones - I had not seen that and will be trying it out. Thanks!

  8. Ilya Finkelshteyn closed this discussion on 25 Aug, 2018 01:46 AM.
