Re-using Python virtual environments on a build server

Submitted by: @import:stackexchange-devops

Problem

Currently, every time we run a build through Jenkins+Ansible, we are re-creating a virtual environment and re-installing all the dependencies listed inside the requirements.txt file.

This is very slow and does not scale well. How can we improve and speed up the process? Can we re-use virtual environments?

Our latest idea, which we have not yet implemented, is to build virtual environments outside of the Jenkins workspace, name each one after its project and branch, and keep an MD5 sum of the requirements alongside every virtual environment. Before building an environment the next time, we would calculate the MD5 sum of the current requirements in the branch and look up existing virtual environments by that sum. If an environment with a matching sum exists, we would simply re-use it. We are not sure whether this is the best way to solve the problem.
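The lookup described in the question could be sketched roughly as follows. This is only an illustration of the idea, not a tested build step: the function names, the directory naming scheme, and the `.complete` marker file are all assumptions made for the example, and it relies on the standard `venv` module being available to the interpreter that runs it.

```python
import hashlib
import subprocess
import sys
from pathlib import Path


def requirements_digest(requirements: Path) -> str:
    """Return the MD5 hex digest of the requirements file's raw bytes."""
    return hashlib.md5(requirements.read_bytes()).hexdigest()


def venv_path(base_dir: Path, project: str, branch: str, digest: str) -> Path:
    """Derive a directory name from project, branch, and requirements digest.

    The naming scheme is hypothetical; slashes in branch names are replaced
    so the result stays a single directory under base_dir.
    """
    safe_branch = branch.replace("/", "_")
    return base_dir / f"{project}-{safe_branch}-{digest}"


def ensure_venv(base_dir: Path, project: str, branch: str, requirements: Path) -> Path:
    """Re-use the environment matching the current requirements, or build it."""
    target = venv_path(base_dir, project, branch, requirements_digest(requirements))
    marker = target / ".complete"  # hypothetical marker written after a successful install
    if not marker.exists():
        # --clear wipes any half-built environment left by an earlier failed run.
        subprocess.run([sys.executable, "-m", "venv", "--clear", str(target)], check=True)
        bin_dir = "Scripts" if sys.platform == "win32" else "bin"
        pip = target / bin_dir / "pip"
        subprocess.run([str(pip), "install", "-r", str(requirements)], check=True)
        marker.touch()
    return target
```

A build step might call `ensure_venv(Path("/var/lib/venvs"), "myproject", "main", Path("requirements.txt"))`, where the base directory and project name are placeholders. Note that the digest only reflects the text of the requirements file, so unpinned dependencies can still resolve to different versions on different servers.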

Solution

It is generally best practice to make your builds idempotent. Leaving artifacts behind only opens the door to dependency management issues, and resolving those in a resilient way is exactly why you are using a build server. I suggest you first find out why your Python builds run so slowly. If the cause is an overabundance of packages in each project, the dependency list may be something you can trim. If the cause is server performance, adding resources is a straightforward fix. If you are blocked by network throughput, consider using an artifact repository to cache packages locally. Persistent virtualenvs, however, seem like a recipe for problems that are unpredictable when moving to another server and extremely frustrating to troubleshoot.
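One way to start answering the "why is it slow" question is to time each build step separately. The sketch below is a minimal example under stated assumptions: `timed_run` and `pip_install_command` are hypothetical helpers, not part of Jenkins, Ansible, or pip, and any mirror URL passed in would be your own artifact repository. The `--index-url` option itself is a standard pip flag.

```python
import subprocess
import sys
import time


def timed_run(label, command):
    """Run one build step and report how long it took, in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f}s")
    return elapsed


def pip_install_command(pip, requirements, index_url=None):
    """Build a pip install command, optionally pointed at a local package mirror."""
    command = [pip, "install", "-r", requirements]
    if index_url:
        # index_url is assumed to be an internal artifact repository that proxies PyPI.
        command += ["--index-url", index_url]
    return command
```

For example, timing `[sys.executable, "-m", "venv", "build-venv"]` and then the result of `pip_install_command("build-venv/bin/pip", "requirements.txt")` as two separate `timed_run` calls shows whether environment creation or package installation dominates. Repeating the install with an `index_url` for a local mirror gives a rough indication of how much of that time is network bound.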

Context

StackExchange DevOps Q#1992, answer score: 5
