From git, perform shallow clone

Chris Dembia

20 May, 2014 11:21 PM

My repository is large (200 MB), so it takes a while to download. However, we only need the last commit to run the tests. Travis uses a --depth=50 flag on the git clone command. Could AppVeyor do this as well?

  1. Support Staff 1 Posted by Feodor Fitsner on 21 May, 2014 03:20 AM

    Hi Chris,

    This is how it was before, but after a few clients reported issues with depth=50 it was reverted to a full clone. A depth of 50 was not enough for some cases, like large rebases. Rather than playing with the depth (which is clearly a bet), we are thinking about two possible solutions to this problem:

    1) Using the GitHub API, calculate the required depth (distance) from the required commit to the latest one right before calling clone. It's definitely better than a fixed number, and in most cases the depth will be 1. However, there is still a chance of another push between querying for the depth and cloning the repo.

    2) Use the GitHub API to download the required commit (ref) as a single zip. We did some proofs of concept and it works really well. This method is my favorite :) though if you rely on the .git folder in your builds it won't work for you.
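
A minimal sketch of the two options above, assuming GitHub's REST endpoints for comparing two commits (whose response carries an "ahead_by" count) and for downloading a ref as a zipball. The function names and the owner/repo/SHA values are hypothetical, and this is not AppVeyor's actual implementation. No request is made here; the code only builds the URLs and derives a depth from an already-fetched compare response.

```python
def compare_url(owner: str, repo: str, target_sha: str, branch: str) -> str:
    """URL of the compare endpoint: how far is `branch` ahead of `target_sha`?"""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/compare/{target_sha}...{branch}"
    )


def clone_depth_from_compare(compare_response: dict) -> int:
    """Depth that is enough for a shallow clone of the branch tip to still
    contain the target commit.

    `ahead_by` is the number of commits on the branch that are not reachable
    from the target. The target itself needs one more slot, so the result is
    1 when the target is the tip. With merges this is an upper bound.
    """
    return int(compare_response["ahead_by"]) + 1


def zipball_url(owner: str, repo: str, ref: str) -> str:
    """URL that serves the tree at `ref` as a single zip (no .git folder)."""
    return f"https://api.github.com/repos/{owner}/{repo}/zipball/{ref}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(compare_url("someuser", "somerepo", "abc123", "master"))
    print(clone_depth_from_compare({"ahead_by": 0}))
    print(zipball_url("someuser", "somerepo", "abc123"))
```

The race described in option 1 is visible here: the depth is computed from a response fetched before the clone starts, so a push in between can make it too small. The zipball URL names an exact ref, so it does not have that problem.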

  2. Support Staff 2 Posted by Feodor Fitsner on 30 May, 2014 09:34 PM

    Hey Chris,

    We've just deployed a new feature called "shallow clone" which uses the GitHub API to grab a specific commit's zipball. It's experimental, but you can give it a try by putting this in your appveyor.yml:

    shallow_clone: true
    

    Let me know if it makes your build lighter.

  3. 3 Posted by Chris Dembia on 31 May, 2014 06:16 PM

    Hey Feodor:

    Thanks for implementing that feature. However, building my project is taking over 30 minutes, so I am unable to use the service (unless, of course, I decide to pay). It is a fantastic service, though! I hope I'll be using it in the future. It's evident you're rapidly addressing your users' concerns.

  4. Support Staff 4 Posted by Feodor Fitsner on 31 May, 2014 07:19 PM

    Ah, OK :) I'm wondering what's taking so long in your build?

  5. 5 Posted by Chris Dembia on 31 May, 2014 10:13 PM

    It's just a big C++ project.

  6. Support Staff 6 Posted by Feodor Fitsner on 31 May, 2014 10:16 PM

    I see. How long does the build take on your current CI server, and what's its configuration?

  7. 7 Posted by David Pfeffer on 02 Jun, 2014 02:34 PM

    I need shallow clone too, but I'm using the web UI instead of .yml config. Is there a way I can use this feature? Without it, my clones take 10+ minutes and I cannot use the AppVeyor service. :(

  8. Support Staff 8 Posted by Feodor Fitsner on 02 Jun, 2014 04:21 PM

    Hi David,

    Have you tried running shallow_clone: true through appveyor.yml? I'm wondering how long it would take to download your project. It's an experimental feature and we'd like to collect some feedback before making it available in the UI.

  9. 9 Posted by David Pfeffer on 02 Jun, 2014 04:41 PM

    I haven't. In order to do that, I'd have to entirely switch to .yml
    configuration, right? I can't do that because we have different configs
    depending on branch, and I'm going to need to set up multiple build configs
    in the UI.

  10. Support Staff 10 Posted by Feodor Fitsner on 02 Jun, 2014 06:35 PM

    Yes, you should switch to appveyor.yml.

    But you can have a different appveyor.yml for every branch, no? That's the beauty of this approach: the build config is stored along with your sources and it's versioned! When I create a new branch I inherit appveyor.yml from master and then just update it to make it work with the new branch. When AppVeyor starts a new build it downloads the branch-specific YAML config.

  11. 11 Posted by David Pfeffer on 02 Jun, 2014 06:47 PM

    We have a Production repository and a development repo with a master branch for releasable code, a develop branch for code that should end up on a testing server, and various feature branches that aren't deployable.

    As features are tested, they're merged to develop to end up on testing. When completed, they're pull requested into master, which deploys to our UAT instance. When master is ready, it gets pushed to the protected production repository. Each of those workflow operations would either overwrite the dissimilar yml file or fail to merge it, and thus require manual conflict resolution. Then, what happens if I accidentally merge incorrectly and push UAT system settings out to production?

    That sort of fragility is what I avoid with settings stored in the UI instead. We're currently on TeamCity, where all of our settings are stored in the UI, but hoping to move over to AppVeyor. However, I can't store CI settings in the main repository for these reasons.

  12. Support Staff 12 Posted by Feodor Fitsner on 02 Jun, 2014 06:59 PM

    Ah, I see. Thanks for describing your scenario. Indeed, merging appveyor.yml changes into master might make a mess...

    OK, we will add a "Shallow clone" checkbox to the UI. Hopefully we will push it in today's update.

  13. Support Staff 13 Posted by Feodor Fitsner on 02 Jun, 2014 07:02 PM

    If downloading through zip still takes too long, we'll add a configurable depth parameter for git to see if it helps.

  14. 14 Posted by Chris Dembia on 02 Jun, 2014 07:12 PM

    We do not currently have a CI server for Windows. On Linux (travis-ci), the
    build takes no more than 10 minutes.

  15. Support Staff 15 Posted by Feodor Fitsner on 02 Jun, 2014 07:21 PM

    So you have a dual-platform project and are trying to set up automatic builds on Windows too? I guess it's a private project on Travis, right?

  16. 16 Posted by Chris Dembia on 02 Jun, 2014 07:28 PM

    Correct. Well, Windows, Mac, Linux. It's public:
    https://travis-ci.org/simbody/simbody.

  17. Support Staff 17 Posted by Feodor Fitsner on 02 Jun, 2014 07:39 PM

    Well, I see there would be other things besides cloning the repo, like make tools, the compiler (we have VC++ right now), etc. Do you have any plans for making it build on Windows? Maybe we could deploy a separate build worker image to play with such projects...

  18. 18 Posted by Chris Dembia on 02 Jun, 2014 07:45 PM

    Oh that is not an issue for me. I think I can do everything I need to do,
    except that it takes longer than 30 mins. See my appveyor script:
    https://github.com/chrisdembia/simbody/blob/patch-3/appveyor.yml

  19. Support Staff 19 Posted by Feodor Fitsner on 02 Jun, 2014 07:48 PM

    I see. Currently builds run on "Small" Azure instances with one CPU core. I wonder how long it would take to run on a "Medium" instance with 2 cores...

  20. 20 Posted by David Pfeffer on 03 Jun, 2014 02:25 AM

    Under which settings screen should I look for that checkbox?

  21. Support Staff 21 Posted by Feodor Fitsner on 03 Jun, 2014 04:00 AM

    What checkbox?

  22. Support Staff 22 Posted by Feodor Fitsner on 03 Jun, 2014 04:52 AM

    David,

    Just wanted to let you know that the AppVeyor update with shallow clone/depth in the UI has been deployed. You can see these settings on the "General" tab of the project settings.

    Let me know how it goes.

  23. 23 Posted by David Pfeffer on 03 Jun, 2014 11:48 AM

    Definitely a HUGE improvement. However, it took about a minute and a half
    to download the 70 MB commit snapshot. When I download it locally, GitHub
    downloads at 6.5 MB/s (the download takes just over 10 seconds). If you're
    using Azure small machines, you should have 100 Mbit/s available, so you
    should be able to get the 6.5 MB/s. Any idea what's wrong?
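
The arithmetic behind these numbers can be checked with a short sketch. The 90-second figure is an approximation of "about a minute and a half", and the helper names are made up for illustration.

```python
def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert a link speed in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8.0


def transfer_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Time to move `size_mb` megabytes at a steady rate."""
    return size_mb / rate_mb_per_s


def effective_rate(size_mb: float, seconds: float) -> float:
    """Average rate implied by an observed duration."""
    return size_mb / seconds


if __name__ == "__main__":
    snapshot_mb = 70.0

    # Local download at 6.5 MB/s: a little under 11 seconds.
    print(f"local: {transfer_seconds(snapshot_mb, 6.5):.1f} s")

    # A 100 Mbit/s link tops out at 12.5 MB/s, so 6.5 MB/s fits within it.
    print(f"link ceiling: {mbit_to_mbyte_per_s(100):.1f} MB/s")

    # Roughly 90 s on the build worker implies well under 1 MB/s overall,
    # which suggests that something other than bandwidth dominates.
    print(f"observed: {effective_rate(snapshot_mb, 90.0):.2f} MB/s")
```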

  24. Support Staff 24 Posted by Feodor Fitsner on 03 Jun, 2014 04:47 PM

    That's interesting. Right, I think downloading the zip is not the issue. My guesses would be a) packaging the commit on the GitHub side or b) unzipping the archive on the AppVeyor side. The bottleneck may be either CPU or I/O.

    How many files are there in the repo?

  25. 25 Posted by David Pfeffer on 03 Jun, 2014 05:21 PM

    You're completely right -- it's almost definitely the unzip, which I hadn't
    tried locally, because of the huge volume of files. Would a git shallow
    clone be faster than unzipping?

  26. Support Staff 26 Posted by Feodor Fitsner on 03 Jun, 2014 05:24 PM

    Yes, another option is trying out "Clone depth", which adds the --depth parameter to the git clone command.
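
As a rough illustration of what a "Clone depth" setting amounts to, here is a sketch that assembles a git clone command line with an optional --depth. The helper name, the destination folder, and the repository URL are assumptions for the example; how AppVeyor actually invokes git is not shown in this thread.

```python
from typing import List, Optional


def git_clone_args(
    repo_url: str,
    dest: str,
    depth: Optional[int] = None,
    branch: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `git clone`, optionally shallow."""
    args = ["git", "clone"]
    if depth is not None:
        if depth < 1:
            raise ValueError("depth must be a positive integer")
        # --depth N truncates history to the most recent N commits.
        args += ["--depth", str(depth)]
    if branch:
        # --branch selects which branch (or tag) to check out.
        args += ["--branch", branch]
    args += [repo_url, dest]
    return args


if __name__ == "__main__":
    # Example values only; the URL and folder are placeholders.
    print(git_clone_args(
        "https://github.com/simbody/simbody.git",
        "simbody",
        depth=50,
        branch="master",
    ))
```

The returned list can be handed to subprocess.run(args, check=True) to perform the clone.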

  27. Support Staff 27 Posted by Feodor Fitsner on 03 Jun, 2014 05:28 PM

    But I'm wondering if it's an I/O problem or a CPU one. I've noticed that VMs are sometimes not very good at disk ops.

    Just out of curiosity, what if you set the clone folder (General tab) to some location on the D: drive, let's say d:\projects\test? The D: drive is "temp" storage on Azure VMs; it's transient and it is the local hypervisor hard drive, not NAS. Would it be faster or slower? :)

  28. 28 Posted by David Pfeffer on 03 Jun, 2014 07:16 PM

    Azure VMs get a hard limit of 500 IOPS *per disk* on all VM sizes from XS
    to XL. Since you're using small VMs, you could get more IO by creating a
    RAID0 of two data disks. I do this on my Azure Extra Large VMs for database
    servers where I RAID0 8 disks. There's no data loss potential for RAID0
    because the disks are stored in locally redundant storage.

    However, for checkouts, you are right that the temp disk is probably a
    better bet. I'll give it a shot and report back.
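
The RAID0 suggestion comes down to simple multiplication. The sketch below takes the 500 IOPS per-disk limit quoted in this post as given and assumes ideal striping with evenly spread requests, so the result is a ceiling rather than a measurement. The function name is hypothetical.

```python
def striped_iops_ceiling(per_disk_iops: int, disks: int) -> int:
    """Idealized aggregate IOPS for a RAID0 stripe across `disks` disks."""
    if disks < 1:
        raise ValueError("need at least one disk")
    return per_disk_iops * disks


if __name__ == "__main__":
    per_disk = 500  # per-disk limit as stated in the post above
    for n in (1, 2, 8):
        print(f"{n} disk(s): up to {striped_iops_ceiling(per_disk, n)} IOPS")
```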

  29. 29 Posted by David Pfeffer on 03 Jun, 2014 07:24 PM

    No slower, but no faster.

    Clone depth = 50 took 3x longer than the ZIP.

  30. Support Staff 30 Posted by Feodor Fitsner on 03 Jun, 2014 07:34 PM

    OK, got it. I like the idea with RAID. Will play with it some day.
