Guix shared checkouts
Intro
In Guix, every user has their own .cache/guix directory under their home. Among other things, it stores the checkouts that are made when you run guix pull, use package transformations, or run guix system reconfigure. Those checkouts probably take up most of the space in the cache and also take the longest to download. When you first install Guix System, you have to wait quite a long time for the initial pull. Depending on your internet speed, most of that time can go to fetching the checkout and then authenticating every commit in it. On top of that, when you run the first reconfigure, the checkout is fetched once more, this time under root’s home.
Motivation
Most users of Guix System who are the only users of their computer will likely end up with at least two checkouts of the Guix channel: one in their own home and the other in root’s home. That is because on reconfigure, the script checks whether the commit you’re reconfiguring to is a descendant of the commit used in the current system generation, the so-called forward update check. For this check to work, it has to fetch the whole Guix repository into a local checkout and compare the two commits.
This means new users end up fetching the whole Guix repository twice. One mitigation is to run reconfigure through sudo with the -E flag. This flag passes the whole environment through sudo, including the XDG_CACHE_HOME and HOME environment variables, which Guix uses to decide where to store the checkouts. While this does work, the user risks ending up with root-owned files in their home cache, so they may have to chown the cache from time to time.
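A minimal sketch of that workaround, assuming the system configuration lives at /etc/config.scm (a placeholder path) and that the cache location follows XDG_CACHE_HOME with a fallback to ~/.cache. The helper names guix_cache_dir and reconfigure_keeping_cache are made up for this illustration.

```shell
# Print the directory Guix is assumed to use for its cache:
# $XDG_CACHE_HOME/guix, falling back to $HOME/.cache/guix.
guix_cache_dir() {
    printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/guix"
}

# Reconfigure with the caller's environment preserved, then hand any
# root-owned files in the cache back to the invoking user.
# $1 is the system configuration file (placeholder default).
reconfigure_keeping_cache() {
    config=${1:-/etc/config.scm}
    cache=$(guix_cache_dir)
    status=0
    sudo -E guix system reconfigure "$config" || status=$?
    # Fix ownership even if the reconfigure failed part-way.
    sudo chown -R "$(id -u):$(id -g)" "$cache"
    return "$status"
}
```

Calling reconfigure_keeping_cache with no argument uses the placeholder path. The chown step is exactly the periodic cleanup described above, just automated, so it does not remove the underlying problem of two users writing to one cache.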
That’s why I started thinking about a different solution to this issue. The Guix cache is probably not going away, and it wouldn’t be easy to share it between users directly, as the files would end up with different owners and permissions.
Solution
I came up with the idea of using a FUSE filesystem. Specifically, there is just one shared checkout somewhere in the filesystem, and it is mounted into each user’s home. Thanks to the capabilities of FUSE filesystems, it’s possible to specify the owner for each mount. This means that users see all files as being owned by them in their own home. In the shared checkout itself, all files are owned by root, so no one else can modify them directly.
As an initial test, I wrote a simple shell script that uses bindfs:
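#!/usr/bin/env bash
set -euxo pipefail
# Arguments: the main user owning the shared directory, then the target user.
MAIN=$1
# Use dedicated names so the USER and HOME environment variables are not clobbered
TARGET_USER=$2
GROUP=users
TARGET_HOME=/home/$TARGET_USER
MOUNTDIR=$TARGET_HOME/.cache/guix
SHAREDIR=/shared/guix-cache
# for each user
mkdir -p "$MOUNTDIR"
mkdir -p "$SHAREDIR"
chown "$TARGET_USER:$GROUP" "$MOUNTDIR"
# Files created through the mount belong to MAIN in the shared directory,
# while everything appears owned by the target user inside the mount.
bindfs --create-for-group="$MAIN" \
    --create-for-user="$MAIN" \
    --force-user="$TARGET_USER" \
    --force-group="$GROUP" \
    "$SHAREDIR" \
    "$MOUNTDIR"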
This shell script uses /shared/guix-cache as the shared folder and mounts it into the user’s home under .cache/guix. This means that not only the checkouts are shared, but also the authenticated commits, the substitute and HTTP caches, inferiors, shell profiles and the locate database. Some of those may be desirable, while others may not. For example, sharing shell profiles shouldn’t really cause any issues, and better yet, users don’t have to wait for a profile that another user has already built. On the other hand, sharing the locate database means that if the users are on different Guix revisions and one of them updates the database, the others no longer get hits that match the Guix they are currently using.
I then tried pulling with two different users. The first user had to fetch the whole repository, and the second was able to reuse the checkout the first one had already fetched!
To unmount, use fusermount -u "$MOUNTDIR".
To make this fit better into the Guix System ecosystem, it is better to turn it into a Guile script and make a Shepherd service that mounts the shared Guix cache on boot. There should also be a couple more options for choosing what exactly to mount, to mitigate the issue with sharing undesired folders.
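Before moving to Guile, the idea of mounting only selected subdirectories can be sketched in shell. The following is a dry-run planner under the same assumptions as the test script: it prints one bindfs command per chosen subdirectory instead of running it. The function name plan_mounts and its argument order are hypothetical.

```shell
# Print one bindfs command per shared subdirectory instead of running it.
# Arguments: main user, target user, target group, shared base directory,
# target home, then the names of the subdirectories to share.
plan_mounts() {
    main=$1
    user=$2
    group=$3
    share=$4
    home=$5
    shift 5
    for sub in "$@"; do
        printf 'bindfs --create-for-group=%s --create-for-user=%s --force-user=%s --force-group=%s %s %s\n' \
            "$main" "$main" "$user" "$group" \
            "$share/$sub" "$home/.cache/guix/$sub"
    done
}
```

For example, plan_mounts root alice users /shared/guix-cache /home/alice checkouts authentication would print two commands, leaving the locate database and the other folders private to each user.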
Service implementation
The service should support per-user configuration, which is why I went with a record called user-info:
(define-record-type* <user-info>
  user-info make-user-info
  user-info?
  (user user-info-user)
  (home user-info-home)
  (group user-info-group (default "users"))
  (files user-info-files
         (default '("authentication" "checkouts" "http" "inferiors"
                    "locate" "profiles" "substitute"))))
Since it’s not possible to read one service’s configuration from another, the service cannot know the user’s home, which is why home is needed here as well. It will probably be under /home/ for most users, except for root.
As for the configuration of the service itself, it takes the bindfs and fuse packages to use, the main user and group that own the shared directory, the shared directory path, and the list of users:
(define-configuration/no-serialization guix-shared-cache-config
  (bindfs (file-like bindfs) "The bindfs package to use.")
  (fuse (file-like fuse-2) "The fuse package to use.")
  (main-user (string "root") "The user that owns the main shared directory.")
  (main-group (string "root") "The group that owns the main shared directory.")
  (shared-directory (string "/shared/guix-cache") "The directory that is shared between users.")
  (users (list-of-user-info %default-guix-shared-users) "The users that have the directory shared."))
Now for the main part: the Shepherd service that mounts and unmounts the directories. It just iterates over the folders to share and mounts each one with a bindfs invocation. To make sure everything can be mounted, the directories are created first. I made a simple procedure that creates a directory recursively while making sure the created directories are owned by the given user. So, for example, if ~/.cache doesn’t exist, it gets created and chowned to the user. This ensures consistent permissions in users’ homes even if the system has just been installed and the user has never used Guix.
Also note the -o nonempty option. It is there for cases when users already have their own Guix cache. I didn’t want the service to remove the contents of those directories. If you want to get rid of them, do so manually: stop the service, clean up, and start it again. Without this option, bindfs would fail with an error when the mount directory is not empty.
(define (shared-guix-cache-shepherd-services config)
  (map
   (lambda (user)
     (let* ((fuse (guix-shared-cache-config-fuse config))
            (bindfs (guix-shared-cache-config-bindfs config))
            (user-name (user-info-user user))
            (user-home (user-info-home user))
            (user-group (user-info-group user))
            (user-files (user-info-files user))
            (main-group (guix-shared-cache-config-main-group config))
            (main-user (guix-shared-cache-config-main-user config))
            (shared-dir-base (guix-shared-cache-config-shared-directory config))
            (mount-dir-base (string-append user-home "/.cache/guix")))
       (shepherd-service
        ;; Each user has their own service
        (provision (list (symbol-append 'shared-guix-cache-
                                        (string->symbol user-name))))
        ;; Make sure the homes are already present
        (requirement '(file-systems user-homes))
        (stop #~(lambda args
                  ;; For each mounted directory, unmount it
                  (for-each
                   (lambda (dir)
                     (let ((mount-dir (string-append #$mount-dir-base "/" dir)))
                       (invoke
                        #$(file-append fuse "/bin/fusermount")
                        "-u"
                        mount-dir)))
                   '#$user-files)
                  #f))
        (start #~(lambda args
                   ;; Like mkdir-p, but chown all created directories
                   ;; to the specified user and group.
                   (define (mkdir-recursively dir user group)
                     ;; Strings are compared by content, so use
                     ;; string=? rather than eq?.
                     (unless (string=? dir "/")
                       (when (not (file-exists? dir))
                         (mkdir-recursively (dirname dir) user group)
                         (mkdir dir)
                         ;; Look up the gid of GROUP itself instead of
                         ;; taking the user's primary group.
                         (let ((uid (passwd:uid (getpw user)))
                               (gid (group:gid (getgr group))))
                           (chown dir uid gid)))))
                   ;; For each mount directory, mount it to the shared directory
                   (for-each
                    (lambda (dir)
                      (let ((mount-dir (string-append #$mount-dir-base "/" dir))
                            (shared-dir (string-append #$shared-dir-base "/" dir)))
                        (mkdir-recursively shared-dir #$main-user #$main-group)
                        (mkdir-recursively mount-dir #$user-name #$user-group)
                        (invoke
                         #$(file-append bindfs "/bin/bindfs")
                         (string-append "--create-for-group=" #$main-group)
                         (string-append "--create-for-user=" #$main-user)
                         (string-append "--force-user=" #$user-name)
                         (string-append "--force-group=" #$user-group)
                         "-o" "nonempty"
                         shared-dir mount-dir)))
                    '#$user-files)
                   #t)))))
   ;; Accessor named consistently with the other config accessors above
   (guix-shared-cache-config-users config)))
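The manual cleanup mentioned earlier (stop the service, clean up, start it again) could look roughly like the sketch below. It assumes the per-user service name follows the provision symbol shared-guix-cache-<user>, that herd returns a non-zero status when stopping fails, and that the mountpoint utility from util-linux is available. The helper names are hypothetical, and the function is meant to be run as root.

```shell
# Name of the per-user Shepherd service, following the provision
# symbol used by the service definition: shared-guix-cache-<user>.
shared_cache_service() {
    printf 'shared-guix-cache-%s\n' "$1"
}

# Stop the service, delete the stale local directories that were hidden
# under the mounts, and start the service again.
# $1 = user name, $2 = that user's home, remaining args = subdirectories.
reset_local_cache() {
    user=${1:?user name required}
    home=${2:?home directory required}
    shift 2
    service=$(shared_cache_service "$user")
    herd stop "$service" || return 1
    for sub in "$@"; do
        dir="$home/.cache/guix/$sub"
        # Refuse to delete anything that is still a mount point,
        # otherwise the shared copy itself would be wiped.
        if mountpoint -q "$dir"; then
            echo "$dir is still mounted, aborting" >&2
            return 1
        fi
        rm -rf -- "$dir"
    done
    herd start "$service"
}
```

A call such as reset_local_cache ruther /home/ruther checkouts authentication would clear only those two folders. The removed directories are recreated by the service on start, owned by the user.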
The service type itself is then quite simple. The extension is written so that the service can be extended with lists of user-info records. Apart from that, it just extends the Shepherd root service to provide the Shepherd services.
(define guix-shared-cache-service-type
  (service-type
   (name 'shared-guix-cache)
   (extensions (list
                (service-extension shepherd-root-service-type
                                   shared-guix-cache-shepherd-services)))
   ;; Each extension is a list of user-info records; flatten them into
   ;; a single list (concatenate comes from SRFI-1).
   (compose concatenate)
   (extend (lambda (original extensions)
             (guix-shared-cache-config
              (inherit original)
              ;; Add the users contributed by extensions to the configured ones
              (users (append (guix-shared-cache-config-users original)
                             extensions)))))
   (default-value (guix-shared-cache-config))
   (description "Share ~/.cache/guix between
multiple users. The root user is going to own the shared checkout,
and will be part of the users who can use the shared checkout.
If you want to change the default user, set main-user of the
configuration. This user owns the shared checkout folder.")))
Usage
Normal usage of the service should be quite simple: just add the users you want to share with.
(service guix-shared-cache-service-type
         (guix-shared-cache-config
          (users
           (cons*
            (user-info
             (user "ruther")
             (home "/home/ruther"))
            %default-guix-shared-users))))
To extend it instead:
(simple-service 'shared-checkout-ruther
                guix-shared-cache-service-type
                (list
                 (user-info
                  (user "ruther")
                  (home "/home/ruther"))))
Risks
I have only been using this for a short time, so I haven’t mapped all the possible risks yet. I was a bit afraid there could be issues if multiple guix processes access those files at once. On the other hand, it should probably be fine, as multiple processes can access those files even when you run guix as a single user.
Another thing to keep in mind is that any of the users that have the folder shared can delete anything in it. That means one user can remove another user’s checkout. This might be undesirable for some. I think it can even happen by mistake, when one user clears out their .cache and forgets about the shared mount.