Transition to a ceph nix store
As discussed with Aleix and Vicenç, we would benefit from placing the nix store directly in the ceph filesystem and letting the compute nodes mount it at /nix/store during boot. This would solve the cache problems of the overlay FS observed in #41, and it also prepares the way to export the nix store to other nodes (ahem, MN4/5). It would also give the nix store more space and make it more robust (3 redundant copies).
The nodes can boot directly from the network via PXE, so we don't have to worry about their disk state (they are essentially stateless). However, we must ensure that they don't write into the nix database. We can achieve this by mounting the nix store read-only.
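As a way to verify the read-only requirement on a booted node, here is a minimal sketch that inspects the mount table. It assumes the text comes from /proc/mounts (standard on Linux); the function name and the sample ceph device string are made up for illustration and are not part of our configuration.

```python
def is_mounted_read_only(mounts_text: str, mount_point: str) -> bool:
    """Return True if the most recent mount on mount_point has the 'ro' option.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    read_only = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if fields[1] == mount_point:
            # Later entries shadow earlier ones, so keep the last match.
            read_only = "ro" in fields[3].split(",")
    if read_only is None:
        raise LookupError(f"{mount_point} is not a mount point")
    return read_only


# Hypothetical sample; the device and options are placeholders.
SAMPLE_MOUNTS = (
    "10.0.0.1:6789:/nix/store /nix/store ceph ro,relatime,name=nix 0 0\n"
    "tmpfs /tmp tmpfs rw,nosuid 0 0\n"
)
```

On a node, the same function could be fed the contents of /proc/mounts with "/nix/store" as the mount point.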
However, we still need to be able to build some packages from inside the compute nodes (especially for debugging purposes), so they must be able to write to the store via the nix daemon running on hut. This is probably doable, as we already configured something similar for MN4.
Here is roughly the plan:
- Ensure that nix build/develop/shell don't modify the store directly, and that all writes are handled by the nix daemon (which will run on hut).
- Determine how to mount /nix/store via ceph early in the initrd, so we can continue the boot.
- Also mount some paths for state/logs (maybe /var and some others).
- Make a tunnel for the nix daemon socket.
- Configure the node to use the remote nix daemon.
- Fix references to /nix/var/nix/profiles/system to be per host.
- Test that we can start a package build locally from the node and that it is submitted to hut for building.
- Prepare the PXE boot instead of reading the kernel from disk.
- Switch the BIOS to boot via PXE.
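
For the tunnel and remote daemon steps, a small preflight check could confirm that the daemon socket is actually present on the node before trying a build. This is only a sketch: the function name is hypothetical, and the path where the tunneled socket ends up on the node is still to be decided (the default daemon socket on a regular install is /nix/var/nix/daemon-socket/socket).

```python
import os
import stat


def daemon_socket_ready(socket_path: str) -> bool:
    """Return True if socket_path exists and is a Unix domain socket.

    This only checks the file type; it does not verify that a nix
    daemon is listening on the other end of the tunnel.
    """
    try:
        mode = os.stat(socket_path).st_mode
    except FileNotFoundError:
        return False
    return stat.S_ISSOCK(mode)
```

A real test would still need to run an actual build from the node, since a stale socket file passes this check.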