Re-using Python virtual environments on a build server
Problem
Currently, every time we run a build through Jenkins+Ansible, we re-create a virtual environment and re-install all the dependencies listed in the requirements.txt file. This is very slow and does not scale well. How can we improve and speed up the process? Can we re-use virtual environments?
Our latest idea, which we have not yet implemented, is to build virtual environments outside of the Jenkins workspace, name them after the project and branch, and keep an MD5 sum of the requirements for each one. Before building an environment the next time, we would calculate the MD5 sum of the current requirements in the branch and look up existing virtual environments by that sum. If an environment with that sum exists, we just re-use it. We are not sure if this is the best way to solve the problem.
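To make the idea concrete, here is a minimal sketch of the lookup described above. It is not something we have running. The cache directory layout, the `requirements.md5` marker file, and the function names are all hypothetical, and the `bin/python` path assumes a POSIX build server.

```python
import hashlib
import subprocess
import venv
from pathlib import Path

# Hypothetical marker file, written only after a successful install.
MARKER = "requirements.md5"


def requirements_digest(requirements: Path) -> str:
    """Return the MD5 hex digest of the requirements file's raw bytes."""
    return hashlib.md5(requirements.read_bytes()).hexdigest()


def venv_dir(cache_root: Path, project: str, branch: str, digest: str) -> Path:
    """Build the environment path from project, branch, and digest."""
    safe_branch = branch.replace("/", "_")  # branch names may contain slashes
    return cache_root / f"{project}-{safe_branch}-{digest}"


def ensure_venv(cache_root: Path, project: str, branch: str, requirements: Path):
    """Return (path, reused); create and populate the environment on a miss."""
    digest = requirements_digest(requirements)
    target = venv_dir(cache_root, project, branch, digest)
    marker = target / MARKER

    # Re-use only if a previous run finished and recorded the same digest.
    if marker.is_file() and marker.read_text().strip() == digest:
        return target, True

    venv.create(target, with_pip=True, clear=True)
    python = target / "bin" / "python"  # assumption: POSIX layout
    subprocess.run(
        [str(python), "-m", "pip", "install", "-r", str(requirements)],
        check=True,
    )
    marker.write_text(digest)  # record success last
    return target, False
```

A build step would call `ensure_venv(Path("/var/cache/venvs"), "myproject", "main", Path("requirements.txt"))` (the cache path is made up) and then run its commands with the returned environment's interpreter. Because the digest covers only the requirements file, unpinned dependencies could still drift between environments that share the same sum.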
Solution
It's generally best practice to make your builds idempotent. Leaving artifacts behind only opens up opportunities for dependency management issues, and resolving those in a resilient way is exactly why you're using a build server. I suggest you look at why your Python builds are running so slowly. If the cause is an overabundance of packages in each project, the dependency list may be something you can trim. If the cause is server performance, that is easily fixed with more resources. If you find you are blocked by network throughput, consider using an artifact repository to cache packages locally. Building persistent virtualenvs, by contrast, seems like a recipe for problems that could be unpredictable when moving to another server, and extremely frustrating to troubleshoot.
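As a rough illustration of the caching suggestion, the sketch below builds the pip commands for two variants: installing through an internal package mirror, or installing from a local directory of pre-downloaded packages. The helper names and the mirror URL are hypothetical. The pip options used (`download -d`, `--index-url`, `--no-index`, `--find-links`) are standard pip flags. The virtual environment itself is still created fresh on every build. Only the downloads are cached.

```python
import sys
from pathlib import Path
from typing import List, Optional


def download_command(requirements: Path, wheelhouse: Path) -> List[str]:
    """Command that fills a local directory with the required packages."""
    return [
        sys.executable, "-m", "pip", "download",
        "-r", str(requirements),
        "-d", str(wheelhouse),
    ]


def install_command(
    python: Path,
    requirements: Path,
    *,
    index_url: Optional[str] = None,
    wheelhouse: Optional[Path] = None,
) -> List[str]:
    """Command that installs into a fresh environment from a local source."""
    cmd = [str(python), "-m", "pip", "install", "-r", str(requirements)]
    if wheelhouse is not None:
        # Offline install: never contact an index, use the local directory.
        cmd += ["--no-index", "--find-links", str(wheelhouse)]
    elif index_url is not None:
        # Install through a caching mirror instead of the public index.
        cmd += ["--index-url", index_url]
    return cmd


if __name__ == "__main__":
    # Hypothetical mirror URL; substitute whatever your repository exposes.
    print(install_command(
        Path("venv/bin/python"),
        Path("requirements.txt"),
        index_url="https://pypi.example.internal/simple",
    ))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from a build step. Whether a mirror or a wheelhouse fits better depends on where the time is actually going, which is why measuring first matters.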
Context
StackExchange DevOps Q#1992, answer score: 5