Best practice to use virtualenv in a docker container

https://hynek.me/articles/virtualenv-lives/

virtualenv Lives!

Setting up Python to the point of being able to install packages from PyPI can be annoying and time-intensive. Even worse are OS-provided installations that start throwing cryptic error messages. Desktops are especially prone to that, but it’s also possible to break a server’s whole toolchain by installing some shiny package you heard about on reddit.

Your desktop system is unlikely to be a throwaway virtual machine or container, which makes it a highly mutable system with difficult rollbacks and a lot of pain when stuff breaks. So until we all run NixOS on our desktops:

Don’t pip-install anything into its global site-packages beyond virtualenv.
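The “beyond virtualenv” part is the one exception: virtualenv itself may go into the global site-packages, e.g. (a sketch; depending on your setup you may need sudo, or your OS may already package it):

$ pip install virtualenv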

Does that sound extreme to you? Only if you haven’t found the right tools to make it effortless.

virtualenv in 2014‽

virtualenv has been around for a while and is a more or less accepted standard for installing Python software. Sadly, there are many missionaries running around nowadays, boldly proclaiming the end of virtualenv: mostly because of containers in general, and usually because of Docker in particular.

I find that unfortunate and shortsighted. Frankly, they fail to see the whole picture: virtualenv’s job isn’t just to separate your projects from each other. Its job is also to separate you from the operating system’s Python installation and the installed packages you probably have no idea about.

Let’s use the widely celebrated virtualenv-killer Docker to see why that’s a good idea. For that, we look at the pre-installed packages you get after installing only python-pip into a trusty container [1]:

argparse (1.2.1)
chardet (2.0.1)
colorama (0.2.5)
html5lib (0.999)
pip (1.5.4)
requests (2.2.1)
setuptools (3.3)
six (1.5.2)
urllib3 (1.7.1)
wsgiref (0.1.2)
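If you want to verify this yourself, the listing above can be reproduced with something like the following one-liner (a sketch assuming a local Docker installation and the official ubuntu:trusty image):

$ docker run --rm ubuntu:trusty \
      sh -c "apt-get update -q && apt-get install -qy python-pip && pip list"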

Surprised [2]? What happens if you install a newer requests, html5lib, or colorama over it? I’ll tell you what: stuff starts breaking.

Such things can happen at any time and make your system fragile. Full-featured Ubuntu servers carry even more baggage, of course. Whenever you install a system tool written in Python, you can expect some kind of breakage. Whenever Debian packagers decide they don’t like something about how pip works and patch around it, you’re involuntarily part of the “will it explode?” lottery.

OS X is no different; it comes with several dozen Python packages.

And on the desktop – no matter the platform! – the situation is even worse. I dare say the average site-packages is a mess, and most users have no idea why a certain package is installed. From there it’s a very short step to breaking the whole installation, as many tutorial mentors will confirm.


The operating system Python’s site-packages [3] belongs to the operating system.

I’ve been saying for a while now that I’d prefer OS vendors to create a virtualenv for their own stuff somewhere else and let users have the system site-packages. But that’s not happening. And no one can guarantee that some system tool you don’t even know about won’t ever install a version of a library that’s incompatible with your project’s requirements. Is it worth taking that chance?

Only put software into site-packages that is explicitly written for that version of the OS. In other words: system tools that only use Python packages provided by the OS. Keep everything else in virtualenv.
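In practice that can be as simple as this (a sketch; “myproject” is just a placeholder name):

$ virtualenv ~/.venvs/myproject          # create an isolated environment
$ . ~/.venvs/myproject/bin/activate      # activate it in this shell
(myproject) $ pip install requests       # lands in the virtualenv only
(myproject) $ deactivate                 # back to the system Python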

Stop discussing virtualenv vs. system isolation as if they were mutually exclusive. You should use both at once, neither replaces the other:

  1. Do isolate your application server’s OS from its host using Docker/lxc/jails/zones/kvm/VMware/… with one container/VM per application.
  2. But inside of them, do also isolate your Python environment from unexpected surprises in the system site-packages using virtualenv (see the Dockerfile sketch below).
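Put together, a Dockerfile following that advice might look like this sketch (assuming the ubuntu:trusty base image and a hypothetical application whose entry point is app.py; names and paths are placeholders):

FROM ubuntu:trusty

# python-virtualenv is the trusty package that provides virtualenv.
RUN apt-get update && apt-get install -qy python-virtualenv

# The virtualenv shields the application from the system site-packages.
RUN virtualenv /venv

COPY requirements.txt /app/requirements.txt
RUN /venv/bin/pip install -r /app/requirements.txt
COPY . /app

# Calling the virtualenv’s interpreter directly makes activation unnecessary.
CMD ["/venv/bin/python", "/app/app.py"]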

Less Typing, More Happiness

On your desktop you’ll want a bit more convenience than pure virtualenvs. So I urge you to take five minutes to install and understand virtualenvwrapper, virtualenvwrapper-win, or virtualfish – depending on the shell and operating system you use. They take all the hassle out of managing per-project virtualenvs.
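The day-to-day workflow then looks something like this (a virtualenvwrapper sketch; “myproject” is a placeholder, and the tool must be installed and sourced as its documentation describes):

$ mkvirtualenv myproject       # create *and* activate a new virtualenv
(myproject) $ pip install requests
(myproject) $ deactivate
$ workon myproject             # re-activate it later from any directory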

Servers are a bit different because you’re unlikely to have more than one application (i.e. virtualenv) per user or even per server. You can have a look at my current approach to packaging and installing virtualenvs of server applications if you’re interested.

What About Python CLI Tools?

There’s one question that arises from this: what about all the amazing Python-based tools we love? How do we install tox, mynt, httpie, Pygments, and so forth? Arguably, they made the biggest mess in my system installations in the past.

Should you create virtualenvs for them all and link the executable scripts into some directory within your PATH?

‘Yes’ and ‘no’. ‘Yes’, that’s the correct approach (and it has been taken before). ‘No’, you shouldn’t do it yourself. There’s a helpful (alas, POSIX-only) tool for just that: pipsi.

After installation (do not try to install it using pip; please read the installation instructions), you can install Python CLI tools by calling, e.g.

$ pipsi install Pygments

pipsi will then create a new virtualenv in ~/.local/venvs and install the package into it. Finally, it links the scripts into ~/.local/bin, which you can add to your PATH.
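That last step is a one-liner in your shell’s startup file (a sketch for POSIX shells; the exact file depends on your shell):

export PATH="$HOME/.local/bin:$PATH"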

Footnotes

  1. Built from an official trusty image as of 2014-09-14.
  2. The reason for the presence of so many high-profile packages is that Debian decided to recursively unwrap all the dependencies that are bundled within pip (including requests). They’re begging for version conflicts.
  3. It doesn’t matter whether it’s one physical directory or many merged ones. Python doesn’t allow for multiple versions of one package in any case.
