biscotty's Workshop

Nix for Data Scientists

Nix your venvs and skip pip

Brian Carey
May 29, 2025

Imagine having a fully provisioned data science environment, with all the libraries you want and even an IDE with useful extensions, without installing any packages on your system: you simply place a single file in a directory and run a single command. What if, by sharing two files, two people could have identical setups, regardless of operating system? Imagine you could add, remove, and upgrade libraries without saying a little prayer that you won’t get the dreaded message that a compatible set of dependencies could not be found. The Nix package manager provides all this and more, and it can itself be installed on Linux, macOS, and (through WSL) Windows.

My computer is very clean. I don’t have any IDEs installed, nor any data science libraries. Well, that’s not really true. Actually, I have many versions of each of these installed on my system. But they are only accessible in the directories to which they pertain, even though they are not installed in those directories, as they would be with virtual environments. And even though Python itself is installed system-wide on my computer, each development directory has its own version of Python.

Traditional Package Management

Package management for data scientists working in Python has been notoriously difficult. Pip, Conda, Miniconda, Mamba, uv, Poetry, virtualenvwrapper, Docker … the list of available package managers and solutions goes on. They all work, mostly. But they all break, occasionally. None is truly reproducible, and they are all prone to eventual library conflicts, when one package upgrades its own dependencies and the upgraded dependencies aren’t compatible with another package. Returning to older projects after some time can be hazardous.
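
The conflict described above can be made concrete with a toy model. This is only a sketch: the package names (plotlib, statlib, numlib) are hypothetical, and each requirement is simplified to a half-open version range, whereas real resolvers handle much richer specifiers.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]
# A requirement is a half-open range: lower bound inclusive, upper bound exclusive.
Range = Tuple[Version, Version]


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the versions acceptable to both ranges, or None if there are none."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low < high else None


# Hypothetical packages that both depend on a shared library called numlib.
plotlib_needs: Range = ((1, 0), (2, 0))      # numlib >=1.0,<2.0
statlib_old_needs: Range = ((1, 4), (3, 0))  # numlib >=1.4,<3.0
statlib_new_needs: Range = ((2, 0), (3, 0))  # after an upgrade: numlib >=2.0,<3.0

before_upgrade = intersect(plotlib_needs, statlib_old_needs)
after_upgrade = intersect(plotlib_needs, statlib_new_needs)

print(before_upgrade)  # ((1, 4), (2, 0)): numlib from 1.4 up to, but not including, 2.0
print(after_upgrade)   # None: no single numlib version satisfies both packages
```

Nothing in the project itself changed between the two calls. One upstream package tightened its requirement, and the set of acceptable versions became empty.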

To be fair, package management is a complex problem when dozens of packages with shared dependencies need to be installed and maintained. The problem is worse when considering portability to other computers and other operating systems. I should say, it is a complex problem under the traditional Filesystem Hierarchy Standard (FHS) paradigm. I have written about the problems caused by the FHS, and the gymnastics required to work around its limitations. The basic problem is that the FHS does not easily accommodate multiple versions of the same package, and the traditional way of installing packages relies heavily on shared libraries.

Virtual environments go some way toward resolving the package management problems. Instead of being installed system-wide, packages are installed in a subdirectory of the project folder. When the virtual environment is active (and you need to remember to activate it manually), all relevant environment variables are set to point to these local packages. This way there are no version conflicts between different projects on the same machine.
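
To see what activation actually changes, you can ask the interpreter where it thinks it lives. The sketch below uses only the standard library. It relies on the documented behaviour of venv-style environments, where sys.prefix points at the environment while sys.base_prefix points at the Python installation it was created from, and on the VIRTUAL_ENV variable that the standard activation scripts export. The function names are my own, chosen for illustration.

```python
import os
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix is the environment directory, while
    # sys.base_prefix is still the Python installation it was created from.
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> dict:
    """Collect the values that decide where packages are looked up."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # Set by the activation script; None if nothing was activated in this shell.
        "VIRTUAL_ENV": os.environ.get("VIRTUAL_ENV"),
        "in_virtualenv": in_virtualenv(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Run it once with the environment activated and once without. If you forgot to activate, the prefix and executable point back at the system-wide Python, and so does every import.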

But many problems still remain. Firstly, it doesn’t solve the problem of version conflicts among packages within the same project. Secondly, even with an accompanying requirements.txt, there is no guarantee that someone else will get the exact same libraries as were used in the original project. And then there is the question of interoperability between operating systems. Docker solves that one, but it requires a container runtime such as Docker or Podman to be available wherever the project runs, and each image bundles its own operating-system userland, with no version guarantees about what that contains.
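
The requirements.txt point is easy to check for yourself. Below is a minimal sketch, standard library only, that flags every line not pinned to an exact version with ==. The file contents are a made-up example, and the pattern is a deliberate simplification of the real requirement syntax (it ignores environment markers and URLs, for instance).

```python
import re

# Matches "name==version" (with optional extras). Anything else counts as unpinned,
# including wildcard pins such as "==1.4.*".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^=\s*]+$")


def unpinned(requirements_text: str) -> list:
    """Return the requirement lines that do not name one exact version."""
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


# Hypothetical requirements.txt for illustration.
example = """\
# project dependencies
pandas
numpy>=1.24
scikit-learn==1.4.2
matplotlib~=3.8
"""

print(unpinned(example))  # ['pandas', 'numpy>=1.24', 'matplotlib~=3.8']
```

Every flagged line can resolve to a different version depending on the day it is installed. Even a file that passes this check says nothing about dependencies it does not list, about the Python interpreter, or about the system libraries underneath.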

Nix, uniquely, sidesteps all these problems by abandoning the whole FHS paradigm of shared libraries: it installs everything a package needs and does not rely on anything else being installed elsewhere.

© 2025 Brian Carey