Caching NumPy build within a wheel on AppVeyor

Randy

01 Mar, 2015 08:35 AM

Sorry if this has been asked before (I couldn't find a solution online), but is there a way to build a Python module, put it into a wheel, and cache it between AppVeyor builds? It takes about 30 minutes to build my project for each [Python version]/[Windows architecture] combination (e.g. Python2.7/Win32, Python2.7/Win64, Python3.2/Win32, ...). Because I have a total of eight combinations, it takes a couple of hours to produce wheels for my project. I would say 90% of this time is spent building numpy after the build process executes 'pip install numpy'. If I could install from a pre-built wheel, I'm sure I could get the entire build process down to a handful of minutes.

  1. Support Staff 1 Posted by Feodor Fitsner on 02 Mar, 2015 01:34 AM

  2. 2 Posted by Pieter on 04 Mar, 2015 12:03 PM

    I have exactly the same problem. Binary wheels for numpy do not seem to be available, so caching them once they are built on AppVeyor seems like a good solution.
     
    Building the wheel can be done with:

    pip wheel --wheel-dir=c:\tmp\ numpy

    And then install it with:

    pip install --no-index --find-links=c:\tmp numpy

    I do not know how to check in the AppVeyor scripts whether the wheel already exists.
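
A minimal Python sketch of one way to make that check, assuming all cached wheels sit in a single folder such as c:\tmp. The helper names `cached_wheels` and `needs_build` are hypothetical, and the filename pattern assumes the usual wheel naming, where the file starts with the project name (with "-" written as "_") followed by the version.

```python
from pathlib import Path


def cached_wheels(wheel_dir, package):
    """Return the wheel files for `package` found in `wheel_dir` (hypothetical helper)."""
    directory = Path(wheel_dir)
    if not directory.is_dir():
        return []  # the folder does not exist, so nothing was restored
    # Assumption: wheel filenames begin with the project name, "-" written as "_".
    prefix = package.replace("-", "_")
    return sorted(directory.glob(prefix + "-*.whl"))


def needs_build(wheel_dir, package):
    """True when no matching wheel is present, so `pip wheel` has to run."""
    return not cached_wheels(wheel_dir, package)
```

A build step could call `needs_build(r"c:\tmp", "numpy")` and run the `pip wheel` command above only when it returns True.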

  3. Support Staff 3 Posted by Feodor Fitsner on 04 Mar, 2015 05:51 PM

    You can use build cache to preserve contents of C:\tmp folder between builds (if this is the directory where you put compiled numpy).

    cache:
    - c:\tmp
    

    It also makes sense to define a cache item dependency. Changing the dependency file(s) will invalidate the cache and refresh it at the end of the build.

    Let's assume your build script is in build.cmd, so we can use it as cache dependency:

    cache:
    - c:\tmp -> build.cmd
    

    Now, whenever you update build.cmd the contents of c:\tmp won't be restored at the start of the build and it will be added to the cache again at the end of the build.

    To decide whether to build numpy, check whether the c:\tmp folder contains any files. If it's empty, build numpy into it; if not, it was restored from the cache.

    Hope that helps.
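
The dependency rule described above can be pictured with a small Python model. This is only a conceptual sketch, not AppVeyor's implementation: the thread does not say how a change to the dependency file is detected, so hashing the file contents is an assumption, and `fingerprint` and `should_restore` are hypothetical names.

```python
import hashlib
from pathlib import Path


def fingerprint(dependency_files):
    """Hash the contents of the dependency files (assumed change-detection method)."""
    digest = hashlib.sha256()
    for path in sorted(str(p) for p in dependency_files):
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()


def should_restore(saved_fingerprint, dependency_files):
    """Restore the cached folder only while the dependency files are unchanged."""
    return saved_fingerprint == fingerprint(dependency_files)
```

In this model, editing build.cmd changes the fingerprint, so the folder is not restored and the build repopulates it from scratch.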

  4. 4 Posted by fomcl on 04 Mar, 2015 09:28 PM

    I've been experimenting a bit. I think it is possible with just a small modification of requirements.txt and appveyor.yml.
    # [1] simulate cached wheel
    antonia@antonia-HP-2133 ~/Desktop $ wget https://pypi.python.org/packages/py2.py3/r/requests/requests-2.5.3-py2.py3-none-any.whl#md5=233249f4627ac5481c948e494d2a090e
    # [2] try to use the locally cached .whl, but download from PyPI if that fails (i.e. the first time). Note that "--no-index" is NOT specified, so PyPI is a fallback
    antonia@antonia-HP-2133 ~/Desktop $ pip install --find-links=file:///home/antonia/Downloads requests --upgrade
    Processing /home/antonia/Downloads/requests-2.5.3-py2.py3-none-any.whl
    Installing collected packages: requests
      Found existing installation: requests 2.5.1
        Uninstalling requests-2.5.1:
    ...
    # [3] demonstrate that a package that is not in the local cache (here xlrd) is really fetched from PyPI
    antonia@antonia-HP-2133 ~/Desktop $ pip install --find-links=file:///home/antonia/Downloads xlrd --upgrade
    Collecting xlrd from https://pypi.python.org/packages/source/x/xlrd/xlrd-0.9.3.tar.gz#md5=6f3325132f246594988171bc72e1a385
      Downloading xlrd-0.9.3.tar.gz (178kB)
        100% |################################| 180kB 161kB/s
    Installing collected packages: xlrd
    ...

    According to this page: https://pip.pypa.io/en/latest/reference/pip_install.html#requirements-file-format a requirements.txt may contain a line like this:
    numpy==1.9.1 --find-links file:///home/antonia/Downloads

    CAVEAT: I have not yet tried this. Please let me know if this works. I will try this tomorrow if I have time.

    Albert-Jan
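
pip's requirements file format documents `--find-links` as an option written on its own line, so one variant of the idea above is to emit that line ahead of the pinned requirement. A minimal Python sketch that generates such a file; it has not been tried on AppVeyor, `requirements_text` is a hypothetical helper name, and the numpy pin and folder URL are simply the ones from the post.

```python
def requirements_text(wheel_dir, requirements):
    """Build requirements-file text that points pip at a local wheel folder."""
    # --find-links goes on its own line, before the requirement lines.
    lines = ["--find-links " + wheel_dir]
    lines.extend(requirements)
    return "\n".join(lines) + "\n"


# Print the generated file content for the example from the post.
print(requirements_text("file:///home/antonia/Downloads", ["numpy==1.9.1"]), end="")
```

Because `--no-index` is not written to the file, PyPI would presumably remain the fallback, as in step [2] above.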

  5. 5 Posted by Randy Direen on 05 Mar, 2015 07:45 AM

    Ok, thanks for the answers. I have a solution based on Pieter and Feodor's ideas. The build cache was made for this, so I just stuffed the wheel files into the cache and then I recall numpy from there. Looks like everything works, so the following is a description of what I did.

    I wrote a powershell script called install_numpy.ps1 that checks to see if a wheel has been created for numpy; if it hasn't built one already, it makes that happen:

    #contents of install_numpy.ps1
    
    function InstallNumpy(){
     
        if (-not(Test-Path "c:\tmp\*.whl")) {
            Write-Host "numpy has not been compiled yet. Starting Long process..."
            Write-Host "pip wheel --wheel-dir=c:\tmp\ numpy"
            iex "cmd /E:ON /V:ON /C .\\appveyor\\run_with_env.cmd pip wheel --wheel-dir=c:\\tmp numpy"
        } else {
            Write-Host "numpy has already been compiled."
            Get-ChildItem "C:\tmp"
        }
    }
    
    InstallNumpy
    

    This is the first time I've written a PowerShell script (kinda feel like I need a shower), so if you see a better way of doing this, let me know. Also, note that the run_with_env.cmd script I use is the one I found here [1].

    I added a couple of things to the appveyor.yml file I got here [2]. First, I added the build cache entry, which depends on install_numpy.ps1 (whenever I change install_numpy.ps1 and push to GitHub, numpy will be recompiled):

    cache:
      - C:\tmp -> \appveyor\install_numpy.ps1
    

    Second, I changed the install section to look like this:

    install:
      - ECHO "Filesystem root:"
      - ps: "ls \"C:/\""
    
      - ECHO "Installed SDKs:"
      - ps: "ls \"C:/Program Files/Microsoft SDKs/Windows\""
      - "powershell ./appveyor/install.ps1"
    
      - "SET PATH=%PYTHON%;%PYTHON%\\Scripts;%PATH%"
    
      # [NOTE] I took numpy out of my appveyor-reqs.txt file because I don't want 
      # numpy to be built each time I run appveyor. Make sure that wheel is one 
      # of your requirements so that you can build a numpy wheel.
      - "%CMD_IN_ENV% pip install -r appveyor-reqs.txt"
      
      # Now that wheel has been installed, check to see if a numpy wheel has been
      # made yet. If it hasn't, compile it and put it in C:\tmp.
      - "powershell ./appveyor/install_numpy.ps1"
      
      # This is where I install numpy from the pre-built wheel I compiled either
      # earlier in this session, or in a previous session.
      - "%CMD_IN_ENV% pip install --no-index --find-links=c:\\tmp numpy"
    

    The original install.ps1 file can be found here [3].

    The first time I push the project to AppVeyor, numpy is built and then installed. The second time, numpy is installed from the cached wheel. I also tested what happens if I modify install_numpy.ps1: as expected, numpy is rebuilt and then installed. Everything works!

    That shaved off 15 to 20 minutes of build time I was spending on each of my six builds. In total I'm saving between 1.5 and 2 hours every time I build my project on appveyor. Worth it...

  6. Support Staff 6 Posted by Feodor Fitsner on 05 Mar, 2015 06:27 PM

    Great solution, thank you for sharing it with the community!

  7. 7 Posted by Kai on 29 Feb, 2016 01:54 PM

    This thread has been very useful. I think I have an improvement to suggest:

    --find-links can be used with pip wheel, so it is not necessary to check whether a wheel exists. It can be run like this:

    pip wheel pyfftw --wheel-dir build/ --download-cache download/ --find-links build/
    pip install --no-index --find-links build/ pyfftw
    

    This will trigger a build if the wheel does not exist, but use it if it does.
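
The same two steps can be driven from a Python script. A minimal sketch, assuming pip is invoked as `python -m pip`; `build_then_install` and its `run` parameter are hypothetical names, the install step names the package explicitly, and the `--download-cache` option is left out for brevity.

```python
import subprocess
import sys


def wheel_command(package, wheel_dir):
    """Step 1: `pip wheel` with --find-links pointing at its own output folder."""
    return [sys.executable, "-m", "pip", "wheel", package,
            "--wheel-dir", wheel_dir, "--find-links", wheel_dir]


def install_command(package, wheel_dir):
    """Step 2: install strictly from wheel_dir, never from the package index."""
    return [sys.executable, "-m", "pip", "install", "--no-index",
            "--find-links", wheel_dir, package]


def build_then_install(package, wheel_dir, run=subprocess.check_call):
    """Run both steps in order; `run` is injectable so the commands can be inspected."""
    for command in (wheel_command(package, wheel_dir),
                    install_command(package, wheel_dir)):
        run(command)
```

Calling `build_then_install("pyfftw", "build/")` would run both commands and raise if either pip invocation fails.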

  8. Ilya Finkelshteyn closed this discussion on 25 Aug, 2018 02:04 AM.
