02 May 2024

Planet Debian

Paul Wise: FLOSS Activities April 2024

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Administration

Communication

Sponsors

The SWH work was sponsored. All other work was done on a volunteer basis.

02 May 2024 8:03am GMT

01 May 2024

Planet Debian

Dirk Eddelbuettel: RcppInt64 0.0.5 on CRAN: Minor Maintenance

The new-ish package RcppInt64 (announced last fall in this post, with three small updates following) arrived on CRAN yesterday as release 0.0.5. RcppInt64 collects some of the previous conversions between 64-bit integer values in R and C++, and regroups them in a single package. It offers two interfaces: the more standard as<>() converter from R values (along with its companion wrap() to return values to R), as well as the more dedicated functions 'from' and 'to'.

This release addresses a new nag from CRAN, which no longer wants us to use the 'non-API' header function SET_S4_OBJECT, so a small change was made.

The brief NEWS entry follows:

Changes in version 0.0.5 (2024-04-30)

  • Minor refactoring of internal code to not rely on SET_S4_OBJECT.

Courtesy of my CRANberries, there is a diffstat report relative to the previous release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

01 May 2024 11:44pm GMT

Bits from Debian: Debian welcomes the 2024 GSOC contributors/students

GSoC logo

We are very excited to announce that Debian has selected seven contributors to work under mentorship on a variety of projects with us during the Google Summer of Code.

Here is the list of projects, students, and details of the tasks to be performed.


Project: Android SDK Tools in Debian

Deliverables of the project: Make the entire Android toolchain, Android Target Platform Framework, and SDK tools available in the Debian archives.


Project: Benchmarking Parallel Performance of Numerical MPI Packages

Deliverables of the project: Deliver an automated method for Debian maintainers to test selected numerical Debian packages for their parallel performance in clusters, in particular to catch performance regressions from updates, and to verify expected performance gains, such as those predicted by Amdahl's and Gustafson's laws, from increased cluster resources.


Project: Debian MobCom

Deliverables of the project: Update the outdated mobile packages and recreate aged packages due to new dependencies. Bring in more mobile communication tools by adding about 5 new packages.


Project: Improve support of the Rust coreutils in Debian

Deliverables of the project: Make uutils behave more like GNU's coreutils by improving compatibility with the GNU coreutils test suite.


Project: Improve support of the Rust findutils in Debian

Deliverables of the project: A safer and more performant implementation of the GNU suite's xargs, find, locate and updatedb tools in Rust.


Project: Expanding ROCm support within Debian and derivatives

Deliverables of the project: Building, packaging, and uploading missing ROCm software into Debian repositories, starting with simple tools and progressing to high-level applications like PyTorch, with the final deliverables comprising a series of ROCm packages meeting community quality assurance standards.


Project: procps: Development of System Monitoring, Statistics and Information Tools in Rust

Deliverables of the project: Improve the usability of the entire Rust-based implementation of the procps utility on Linux.


Congratulations and welcome to all the contributors!

The Google Summer of Code program is possible in Debian thanks to the efforts of Debian Developers and Debian Contributors who dedicate part of their free time to mentoring contributors and to outreach tasks.

Join us and help extend Debian! You can follow the contributors' weekly reports on the debian-outreach mailing-list, chat with us on our IRC channel or reach out to the individual projects' team mailing lists.

01 May 2024 9:56pm GMT

Antoine Beaupré: Tor migrates from Gitolite/GitWeb to GitLab

Note: I've been awfully silent here for the past ... (checks notes) oh dear, 3 months! But that's not because I've been idle, quite the contrary, I've been very busy but just didn't have time to write about anything. So I've taken it upon myself to write something about my work this week, and published this post on the Tor blog which I copy here for a broader audience. Let me know if you like this or not.

Tor has finally completed a long migration from legacy Git infrastructure (Gitolite and GitWeb) to our self-hosted GitLab server.

Git repository addresses have therefore changed. Many of you probably have made the switch already, but if not, you will need to change:

https://git.torproject.org/

to:

https://gitlab.torproject.org/

in your Git configuration.
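
If you have an existing clone, the easiest way to make that change is with git remote set-url. A minimal example, assuming your remote is called origin and you are working on the main Tor repository (tpo/core/tor):

git remote -v     # check which URL the remote currently points at
git remote set-url origin https://gitlab.torproject.org/tpo/core/tor.git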

The GitWeb front page is now an archived listing of all the repositories before the migration. Inactive git repositories were archived in GitLab's legacy/gitolite namespace, and the gitweb.torproject.org and git.torproject.org web sites now redirect to GitLab.

Best effort was made to reproduce the original gitolite repositories faithfully and also avoid duplicating too much data in the migration. But it's possible that some data present in Gitolite has not migrated to GitLab.

User repositories are particularly at risk, because they were migrated en masse, and they were "re-forked" from their upstreams to avoid wasting disk space. If a user had a project with a matching name, it was assumed to have the right data, which might be inaccurate.

The two virtual machines responsible for the legacy service (cupani for git-rw.torproject.org and vineale for git.torproject.org and gitweb.torproject.org) have been shut down. Their disks will remain for 3 months (until the end of July 2024) and their backups for another year after that (until the end of July 2025), after which point all the data from those hosts will be destroyed, with only the GitLab archives remaining.

The rest of this article expands on how this was done and what kind of problems we faced during the migration.

Where is the code?

Normally, nothing should be lost. All repositories in gitolite have been either explicitly migrated by their owners, forcibly migrated by the sysadmin team (TPA), or explicitly destroyed at their owner's request.

An exhaustive rewrite map translates gitolite projects to GitLab projects. Some of those projects actually redirect to their parent in cases of empty repositories that were obvious forks. Destroyed repositories redirect to the GitLab front page.

Because the migration happened progressively, it's technically possible that commits pushed to gitolite were lost after the migration. We took great care to avoid that scenario. First, we adopted a proposal (TPA-RFC-36) in June 2023 to announce the transition. Then, in March 2024, we locked down all repositories from any further changes. Around that time, only a handful of repositories had changes made after the adoption date, and we examined each repository carefully to make sure nothing was lost.

Still, we built a diff of all the changes in the git references that archivists can peruse to check for data loss. It's large (6MiB+) because a lot of repositories were migrated before the mass migration and then kept evolving in GitLab. Many other repositories were rebuilt in GitLab from their parent to recreate a fork relationship, which added extra references to those clones.

A note to amateur archivists out there: it's probably too late for one last crawl now. The Git repositories now all redirect to GitLab and are effectively unavailable in their original form.

That said, the GitWeb site was crawled into the Internet Archive in February 2024, so at least some copy of it is available in the Wayback Machine. At that point, however, many developers had already migrated their projects to GitLab, so the copies there were already possibly out of date compared with the repositories in GitLab.

Software Heritage also has a copy of all repositories hosted on Gitolite since June 2023 and has continuously kept mirroring them, where they will hopefully be kept for eternity. There's an issue where the main website can't find the repositories when you search for gitweb.torproject.org; instead, search for git.torproject.org.

In any case, if you believe data is missing, please do let us know by opening an issue with TPA.

Why?

This is an old project in the making. The first discussion about migrating from gitolite to GitLab started in 2020 (almost 4 years ago). But going further back, the first GitLab experiment was in 2016, almost a decade ago.

The current GitLab server dates from 2019, replacing Trac for issue tracking in 2020. It was originally supposed to host only mirrors for merge requests and issue trackers but, naturally, one thing led to another and eventually, GitLab had grown a container registry, continuous integration (CI) runners, GitLab Pages, and, of course, hosted most Git repositories.

There were hesitations about moving to GitLab for code hosting. We had discussions about the increased attack surface and ways to mitigate that, but, ultimately, it seems the issues were not that serious and the community embraced GitLab.

TPA actually migrated its most critical repositories out of shared hosting entirely, into specific servers (e.g. the Puppet Git repository is just on the Puppet server now), leveraging Git's decentralized nature and removing an entire attack surface from our infrastructure. Some of those repositories are mirrored back into GitLab, but the authoritative copy is not on GitLab.

In any case, the proposal to migrate from Gitolite to GitLab was effectively just formalizing a fait accompli.

How to migrate from Gitolite / cgit to GitLab

The progressive migration was a challenge. If you intend to migrate between hosting platforms, we strongly recommend making a "flag day" during which you migrate all repositories at once. This ensures a smoother transition and avoids elaborate rewrite rules.

When Gitolite access was shut down, we had repositories on both GitLab and Gitolite, without a clear relationship between the two. A priori, the plan then was to import all the remaining Gitolite repositories into the legacy/gitolite namespace, but that seemed wasteful, particularly for large repositories like Tor Browser which uses nearly a gigabyte of disk space. So we took special care to avoid duplicating repositories.

When the mass migration started, only 71 of the 538 Gitolite repositories were marked "Migrated to GitLab" in the gitolite.conf file. So, given that we had hundreds of repositories to migrate, we developed some automation to "save time". We already automate similar ad-hoc tasks with Fabric, so we used that framework here as well. (Our normal configuration management tool is Puppet, which is a poor fit here.)

So a relatively large amount of Python code was produced to basically do the following:

  1. check if all on-disk repositories are listed in gitolite.conf (and vice versa) and either add missing repositories or delete them from disk if garbage
  2. for each repository in gitolite.conf, if its category is marked Migrated to GitLab, skip; otherwise:
  3. find a matching GitLab project by name, prompting the user when there are multiple matches
  4. if a match is found, redirect if the repository is non-empty
    • we have GitLab projects that look like the real thing, but are only present to host migrated Trac issues
    • in such cases we cloned the Gitolite project locally and pushed to the existing repository instead
  5. otherwise, a new repository is created in the legacy/gitolite namespace, using the "import" mechanism in GitLab to automatically import the repository from Gitolite, creating redirections and updating gitolite.conf to document the change

User repositories (those under the user/ directory in Gitolite) were handled specially. First, the existing redirection map was checked to see if a similarly named project was migrated (so that, e.g. user/dgoulet/tor is properly treated as a fork of tpo/core/tor). Then the parent project was forked in GitLab and the Gitolite project force-pushed to the fork. This allows us to show the fork relationship in GitLab and, more importantly, benefit from the "pool" feature in GitLab which deduplicates disk usage between forks.

Sometimes, we found no such relationships. Then we simply imported multiple repositories with similar names in the legacy/gitolite namespace, sometimes creating forks between user repositories, on a first-come-first-served basis from the gitolite.conf order.

The code used in this migration is now available publicly. We encourage other groups planning to migrate from Gitolite/GitWeb to GitLab to use (and contribute to) our fabric-tasks repository, even though it does have its fair share of hard-coded assertions.

The main entry point is the gitolite.mass-repos-migration task. A typical migration job looked like:

anarcat@angela:fabric-tasks$ fab -H cupani.torproject.org gitolite.mass-repos-migration 
[...]
INFO: skipping project project/help/infra in category Migrated to GitLab
INFO: skipping project project/help/wiki in category Migrated to GitLab
INFO: skipping project project/jenkins/jobs in category Migrated to GitLab
INFO: skipping project project/jenkins/tools in category Migrated to GitLab
INFO: searching for projects matching fastlane
INFO: Successfully connected to https://gitlab.torproject.org
import gitolite project project/tor-browser/fastlane into gitlab legacy/gitolite/project/tor-browser/fastlane with desc 'Tor Browser app store and deployment configuration for Fastlane'? [Y/n] 
INFO: importing gitolite project project/tor-browser/fastlane into gitlab legacy/gitolite/project/tor-browser/fastlane with desc 'Tor Browser app store and deployment configuration for Fastlane'
INFO: building a new connect to cupani
INFO: defaulting name to fastlane
INFO: importing project into GitLab
INFO: Successfully connected to https://gitlab.torproject.org
INFO: loading group legacy/gitolite/project/tor-browser
INFO: archiving project
INFO: creating repository fastlane (fastlane) in namespace legacy/gitolite/project/tor-browser from https://git.torproject.org/project/tor-browser/fastlane into https://gitlab.torproject.org/legacy/gitolite/project/tor-browser/fastlane
INFO: migrating Gitolite repository project/tor-browser/fastlane to GitLab project legacy/gitolite/project/tor-browser/fastlane
INFO: uploading 399 bytes to /srv/git.torproject.org/repositories/project/tor-browser/fastlane.git/hooks/pre-receive
INFO: making /srv/git.torproject.org/repositories/project/tor-browser/fastlane.git/hooks/pre-receive executable
INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project project/tor-browser/fastlane to category Migrated to GitLab
INFO: skipping project project/bridges/bridgedb-admin in category Migrated to GitLab
[...]

In the above, you can see already-migrated repositories being skipped, then the fastlane project being archived into GitLab. Another example with a later version of the script, processing only user repositories and showing the interactive prompt and a force-push into a fork:

$ fab -H cupani.torproject.org  gitolite.mass-repos-migration --include 'user/.*' --exclude '.*tor-?browser.*'
INFO: skipping project user/aagbsn/bridgedb in category Migrated to GitLab
[...]
INFO: skipping project user/phw/atlas in category Migrated to GitLab
INFO: processing project user/phw/obfsproxy (Philipp's obfsproxy repository) in category Users' development repositories (Attic)
INFO: Successfully connected to https://gitlab.torproject.org
INFO: user repository detected, trying to find fork phw/obfsproxy
WARNING: no existing fork found, entering user fork subroutine
INFO: found 6 GitLab projects matching 'obfsproxy' (https://gitweb.torproject.org/user/phw/obfsproxy.git)
0 legacy/gitolite/debian/obfsproxy
1 legacy/gitolite/debian/obfsproxy-legacy
2 legacy/gitolite/user/asn/obfsproxy
3 legacy/gitolite/user/ioerror/obfsproxy
4 tpo/anti-censorship/pluggable-transports/obfsproxy
5 tpo/anti-censorship/pluggable-transports/obfsproxy-legacy
select parent to fork from, or enter to abort: ^G4
INFO: repository is not empty: in-pack: 2104, packs: 1, size-pack: 414
fork project tpo/anti-censorship/pluggable-transports/obfsproxy into legacy/gitolite/user/phw/obfsproxy^G [Y/n] 
INFO: loading project tpo/anti-censorship/pluggable-transports/obfsproxy
INFO: forking project user/phw/obfsproxy into namespace legacy/gitolite/user/phw
INFO: waiting for fork to complete...
INFO: fork status: started, sleeping...
INFO: fork finished
INFO: cloning and force pushing from user/phw/obfsproxy to legacy/gitolite/user/phw/obfsproxy
INFO: deleting branch protection: <class 'gitlab.v4.objects.branches.ProjectProtectedBranch'> => {'id': 2723, 'name': 'master', 'push_access_levels': [{'id': 2864, 'access_level': 40, 'access_level_description': 'Maintainers', 'deploy_key_id': None}], 'merge_access_levels': [{'id': 2753, 'access_level': 40, 'access_level_description': 'Maintainers'}], 'allow_force_push': False}
INFO: cloning repository git-rw.torproject.org:/srv/git.torproject.org/repositories/user/phw/obfsproxy.git in /tmp/tmp6orvjggy/user/phw/obfsproxy
Cloning into bare repository '/tmp/tmp6orvjggy/user/phw/obfsproxy'...
INFO: pushing to GitLab: https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
remote: 
remote: To create a merge request for bug_10887, visit:        
remote:   https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy/-/merge_requests/new?merge_request%5Bsource_branch%5D=bug_10887        
remote: 
[...]
To ssh://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
 + 2bf9d09...a8e54d5 master -> master (forced update)
 * [new branch]      bug_10887 -> bug_10887
[...]
INFO: migrating repo
INFO: migrating Gitolite repository https://gitweb.torproject.org/user/phw/obfsproxy.git to GitLab project https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project user/phw/obfsproxy to category Migrated to GitLab
INFO: processing project user/phw/scramblesuit (Philipp's ScrambleSuit repository) in category Users' development repositories (Attic)
INFO: user repository detected, trying to find fork phw/scramblesuit
WARNING: no existing fork found, entering user fork subroutine
WARNING: no matching gitlab project found for user/phw/scramblesuit
INFO: user fork subroutine failed, resuming normal procedure
INFO: searching for projects matching scramblesuit
import gitolite project user/phw/scramblesuit into gitlab legacy/gitolite/user/phw/scramblesuit with desc 'Philipp's ScrambleSuit repository'?^G [Y/n] 
INFO: checking if remote repo https://git.torproject.org/user/phw/scramblesuit exists
INFO: importing gitolite project user/phw/scramblesuit into gitlab legacy/gitolite/user/phw/scramblesuit with desc 'Philipp's ScrambleSuit repository'
INFO: importing project into GitLab
INFO: Successfully connected to https://gitlab.torproject.org
INFO: loading group legacy/gitolite/user/phw
INFO: creating repository scramblesuit (scramblesuit) in namespace legacy/gitolite/user/phw from https://git.torproject.org/user/phw/scramblesuit into https://gitlab.torproject.org/legacy/gitolite/user/phw/scramblesuit
INFO: archiving project
INFO: migrating Gitolite repository https://gitweb.torproject.org/user/phw/scramblesuit.git to GitLab project https://gitlab.torproject.org/legacy/gitolite/user/phw/scramblesuit
INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project user/phw/scramblesuit to category Migrated to GitLab
[...]

Acute eyes will also notice the bell (^G) used as a notification mechanism in this transcript.

A lot of the code is now useless for us, but some of it, like "commit and push" or is-repo-empty, lives on in the git module and, of course, the gitlab module has grown some legs along the way. We've also found fun bugs, like a file descriptor exhaustion in bash, among other oddities. The retirement milestone and issue 41215 have a detailed log of the migration, for those curious.

This was a challenging project, but it feels nice to have this behind us. It gets rid of 2 of the 4 remaining machines running Debian "old-old-stable", which moves us a bit further ahead in our late bullseye upgrades milestone.

Full transparency: we tested GPT-3.5, GPT-4, and other large language models to see if they could answer the question "write a set of rewrite rules to redirect GitWeb to GitLab". This has become a standard LLM test for your faithful writer to figure out how good an LLM is at technical responses. None of them gave an accurate, complete, and functional response, for the record.

The actual rewrite rules as of this writing follow, for humans who actually like working answers provided by expert humans instead of artificial intelligences, which currently seem to be glorified, mansplaining interns.

git.torproject.org rewrite rules

Those rules are relatively simple in that they rewrite a single URL to its equivalent GitLab counterpart in a 1:1 fashion. They rely on the rewrite map mentioned above, of course.
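
For reference, that map is a plain-text Apache RewriteMap file with one "gitolite-path gitlab-path" pair per line. A short excerpt might look like the following (the first two entries correspond to repositories seen in the migration transcripts below; the last one is invented for illustration):

# gitolite project             GitLab project
project/tor-browser/fastlane   legacy/gitolite/project/tor-browser/fastlane
user/phw/scramblesuit          legacy/gitolite/user/phw/scramblesuit
some/old/project               tpo/team/new-project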

RewriteEngine on
# this RewriteMap connects the gitweb projects to their GitLab
# equivalent
RewriteMap gitolite2gitlab "txt:/etc/apache2/gitolite2gitlab.txt"
# if this becomes a performance bottleneck, convert to a DBM map with:
#
#  $ httxt2dbm -i mapfile.txt -o mapfile.map
#
# and:
#
# RewriteMap mapname "dbm:/etc/apache/mapfile.map"
#
# according to reports lavamind found online, we hit such a
# performance bottleneck only around millions of entries, which is not our case

# those two rules can go away once all the projects are
# migrated to GitLab
#
# this matches the request URI so we can check the RewriteMap
# for a match next
#
# WARNING: this won't match URLs without .git in them, which
# *do* work now. one possibility would be to match the request
# URI (without query string!) with:
#
# /git/(.*)(.git)?/(((branches|hooks|info|objects/).*)|git-.*|upload-pack|receive-pack|HEAD|config|description)?.
#
# I haven't been able to figure out the actual structure of
# those URLs, so it's really hard to figure out the boundaries
# of the project name here. I stopped after poring over the
# http-backend.c code in git
# itself. https://www.git-scm.com/docs/http-protocol is also
# kind of incomplete and unsatisfying.
RewriteCond %{REQUEST_URI} ^/(git/)?(.*).git/.*$
# this makes the RewriteRule match only if there's a match in
# the rewrite map
RewriteCond ${gitolite2gitlab:%2|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(git/)?(.*).git/(.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$2}.git/$3 [R=302,L]

# Fallback everything else to GitLab
RewriteRule (.*) https://gitlab.torproject.org [R=302,L]

gitweb.torproject.org rewrite rules

Those are the vastly more complicated GitWeb to GitLab rewrite rules.

Note that we say "GitWeb" but we were actually not running GitWeb but cgit, as the former didn't actually scale for us.

RewriteEngine on
# this RewriteMap connects the gitweb projects to their GitLab
# equivalent
RewriteMap gitolite2gitlab "txt:/etc/apache2/gitolite2gitlab.txt"

# special rule to process targets of the old spec.tpo site and
# bring them to the right redirect on the new spec.tpo site. that should turn, for example:
#
# https://gitweb.torproject.org/torspec.git/tree/address-spec.txt
#
# into:
#
# https://spec.torproject.org/address-spec
RewriteRule ^/torspec.git/tree/(.*).txt$ https://spec.torproject.org/$1 [R=302]

# list of endpoints taken from cgit's cmd.c

# those two RewriteCond are necessary because we don't move
# all repositories at once. once the migration is completed,
# they can be removed.
#
# and yes, they are copied all over the place below
#
# create a match for the project name to check if the project
# has been moved to GitLab
RewriteCond %{REQUEST_URI} ^/(.*).git(/.*)?$
# this makes the RewriteRule match only if there's a match in
# the rewrite map
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
# main project page, like summary below
RewriteRule ^/(.*).git/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]

# summary
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/summary/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]

# about
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/about/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]

# commit
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond "%{QUERY_STRING}" "(.*(?:^|&))id=([^&]*)(&.*)?$"
RewriteRule ^/(.*).git/commit/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%2 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/commit/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L]

# diff, incomplete because can diff arbitrary refs and files in cgit but not in GitLab, hard to parse
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} id=([^&]*)
RewriteRule ^/(.*).git/diff/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1 [R=302,L,QSD]

# patch
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} id=([^&]*)
RewriteRule ^/(.*).git/patch/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1.patch [R=302,L,QSD]

# rawdiff, incomplete because can show only one file diff, which GitLab cannot
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} id=([^&]*)
RewriteRule ^/(.*).git/rawdiff/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1.diff [R=302,L,QSD]

# log
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} h=([^&]*)
RewriteRule ^/(.*).git/log/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/%1 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/log/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/log(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD$2 [R=302,L]

# atom
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} h=([^&]*)
RewriteRule ^/(.*).git/atom/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/%1 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/atom/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L,QSD]

# refs, incomplete because two pages in GitLab, defaulting to "tags"
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/refs/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tags [R=302,L]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} h=([^&]*)
RewriteRule ^/(.*).git/tag/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tags/%1 [R=302,L,QSD]

# tree
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} id=([^&]*)
RewriteRule ^/(.*).git/tree(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/%1$2 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/tree(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/HEAD$2 [R=302,L]

# /-/tree has no good default in GitLab, revert to HEAD which is a good
# approximation (we can't assume "master" here anymore)
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/tree/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/HEAD [R=302,L]

# plain
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteCond %{QUERY_STRING} h=([^&]*)
RewriteRule ^/(.*).git/plain(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/raw/%1$2 [R=302,L,QSD]
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/plain(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/raw/HEAD$2 [R=302,L]

# blame: disabled
#RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
#RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
#RewriteCond %{QUERY_STRING} h=([^&]*)
#RewriteRule ^/(.*).git/blame(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/blame/%1$2 [R=302,L,QSD]
# same default as tree above
#RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
#RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
#RewriteRule ^/(.*).git/blame(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/blame/HEAD/$2 [R=302,L]

# stats
RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^/(.*).git/stats/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/graphs/HEAD [R=302,L]

# still TODO:
# repolist: once migration is complete
#
# cannot be done:
# atom: needs a feed token, user must be logged in
# blob: no direct equivalent
# info: not working on main cgit website?
# ls_cache: not working, irrelevant?
# objects: undocumented?
# snapshot: pattern too hard to match on cgit's side

# special case, we keep a copy of the main index on the archive
RewriteRule ^/?$ https://archive.torproject.org/websites/gitweb.torproject.org.html [R=302,L]
# Fallback: everything else to GitLab
RewriteRule .* https://gitlab.torproject.org [R=302,L]

The reference copy of those is available in our (currently private) Puppet git repository.

01 May 2024 2:55pm GMT

Guido Günther: Free Software Activities April 2024

A short status update of what happened on my side last month. Maintenance and code review continue to be the top time sinks (in a positive way).

01 May 2024 11:41am GMT

Colin Watson: Free software activity in April 2024

My Debian contributions this month were all sponsored by Freexian.

You can support my work directly via Liberapay.

01 May 2024 11:34am GMT

Bits from Debian: Infomaniak First Platinum Sponsor of DebConf24

Infomaniak logo

We are pleased to announce that Infomaniak has committed to sponsor DebConf24 as a Platinum Sponsor.

Infomaniak is an independent cloud service provider recognised throughout Europe for its commitment to privacy, the local economy and the environment. Recording growth of 18% in 2023, the company is developing a suite of online collaborative tools and cloud hosting, streaming, marketing and events solutions.

Infomaniak uses exclusively renewable energy, builds its own data centers and develops its solutions in Switzerland at the heart of Europe, without relocating. The company powers the website of the Belgian radio and TV service (RTBF) and provides streaming for more than 3,000 TV and radio stations in Europe.

With this commitment as Platinum Sponsor, Infomaniak is contributing to the Debian annual Developers' conference, directly supporting the progress of Debian and Free Software. Infomaniak helps strengthen the community that collaborates on Debian projects from all around the world throughout the year.

Thank you very much, Infomaniak, for your support of DebConf24!

Become a sponsor too!

DebConf24 will take place from 28th July to 4th August 2024 in Busan, South Korea, and will be preceded by DebCamp, from 21st to 27th July 2024.

DebConf24 is accepting sponsors! Interested companies and organizations should contact the DebConf team through sponsors@debconf.org, or visit the DebConf24 website at https://debconf24.debconf.org/sponsors/become-a-sponsor/.

01 May 2024 10:08am GMT

Russ Allbery: Review: To Each This World

Review: To Each This World, by Julie E. Czerneda

Publisher: DAW
Copyright: November 2022
ISBN: 0-7564-1543-8
Format: Kindle
Pages: 676

To Each This World is a standalone science fiction novel.

Henry m'Yama t'Nowak is the Arbiter of New Earth. This is somewhat akin to a president, but only in very specific ways. Henry's job is to deal with the Kmet.

New Earth was settled by a slower-than-light colony ship from old Earth, our Earth. It is, so far as they know, the last of humanity in the universe. Origin Earth fell silent hundreds of years earlier, before the colonists even landed. New Earth is now a carefully and thoughtfully managed world where humans survived, thrived, and at one point sent out six slower-than-light colony ships of its own. All were feared lost after a rushed launch due to a solar storm.

As this story opens, a probe from one of those ships arrives.

This is cause for rejoicing, but there are two small problems. The first is that the culture of New Earth has changed drastically since the days when they launched the Halcyon colony ships. New Earth is now part of the Duality, a new alliance with aliens painstakingly negotiated after their portal appeared in orbit. The Kmet were peaceful, eager to form an alliance and offer new technology, although they struggled with concepts such as individuality and insisted on interacting only with the Arbiter. Their technological gifts and the apparent loss of the Halcyon colony ships refocused New Earth on safety and caution. This unexpected message is a somewhat tricky political problem, a reminder of the path not taken.

The other small problem is that the reaction of the Kmet to this message is... dramatic.

This book has several problems, but the most serious is that it is simply too long. If you have read any other Czerneda novels, you know that she tends towards sprawling world-building, but usually there are enough twists and turns in the plot to keep the story moving while the protagonists slowly puzzle out the scientific mysteries. To Each This World is not sufficiently twisty for 676 pages. I think you could have cut half the novel without losing any major plot points.

The interesting parts of this book, to me, were figuring out what's going on with the Kmet, some of the political tensions within the New Earth government, and understanding what Henry and Pilot Killian's story had to do with the apparently-unrelated but intriguing interludes following Beth Seeker in a strange place called Doublet. All that stuff is in here, but it's alongside a whole lot of Henry wrestling with lifeboat ethics in situations where he thinks he needs to lie to and manipulate people for their own good. We also get several extended tours of societies that, while vaguely interesting in a science fiction world-building way, have essentially nothing to do with the plot.

We also get a whole lot of Henry's eagerly helpful AI polymorph Flip. I wanted to like this character, and I occasionally managed, but I felt like there was a constant mismatch between, in hindsight, how Czerneda meant for me to see Flip and what I thought she was signaling while I was reading. I wanted Flip to either be a fascinatingly weird companion or to be directly relevant to the plot, but instead there were hundreds of pages of unnerving creepiness mixed with obsequiousness and emotional neediness, all of which I think I read more into than Czerneda had intended. The overall experience was more exhausting than fun.

The core of the plot is solid, and if you like SF novels built around world-building and scientific mysteries, there's a lot here to enjoy. I think Czerneda's Species Imperative series (starting with Survival) is a better execution of some of the same ideas, but I liked that series a lot and was willing to read another take on it. Czerneda is one of the SF writers who takes biology seriously and is willing to write very alien aliens, and that leads to a few satisfying twists. Also, Beth Seeker is a great character (I wish we'd seen more of her), and Killian, while a bit generic, is a serviceable protagonist when Czerneda needs someone to go poke things with a stick.

Henry... I'm not sure what I think of Henry, and your enjoyment of this book may depend on how much you click with him.

Henry is a diplomat and an extrovert. His greatest joy and talent is talking to people, navigating political situations, and negotiating. Science fiction is full of protagonists who should be this character, but they rarely are this character, probably because a lot of writers are introverts. I think Czerneda deserves real credit for making her charismatic politician sufficiently accurate that his thought processes occasionally felt alien. For me, Henry was easiest to appreciate when Killian was the viewpoint protagonist and I could look at him through someone else's eyes, but Henry's viewpoint mostly worked as well. There's a lot of competence porn enjoyment in watching him do his thing.

The problem for me is that I thought several of his actions were unforgivably unethical, but no one in the book who matters seems to agree. I can see why he reached those unethical decisions, but they were profound violations of consent. He directly lies to people because he thinks telling the truth would be too risky and not get them to do what he wants them to do, and Czerneda sets up the story to imply that he might be right.

This is not necessarily a bad choice in a novel, but the author has to do some work to bring me along, and Czerneda didn't do enough of that work. I kept wanting there to be some twist or sting or complication that forced Henry to come to terms with what he was doing, but it never happens. He has to pick between two moral principles that I consider rather finely balanced, if not tilted in the opposite direction that he does, and he treats one principle as inviolable and the other as mostly unimportant. The plans he makes on that basis work fine, and those on the other side of that decision are never heard from again. It left a bad taste in my mouth, particularly given how much of the book is built around Henry making tough, tricky decisions under pressure.

I don't know about this book. I have a lot of mixed feelings. Parts of it I quite enjoyed. Parts of it I mostly enjoyed but wish were much less dragged out. Parts of it frustrated or bored me. It's one of those books where the more I thought about it after reading it, the more the parts I disliked annoyed me.

If you like Czerneda's style of world-building and biology, and if you have more tolerance for Henry's decisions than I did, you may well like this, but read Species Imperative first. I should probably also warn that there is a lot of magical technology in this book that blatantly violates some core principles of physics. I have a high tolerance for that sort of thing, but if you don't, you're going to be grumbling.

Rating: 6 out of 10

01 May 2024 3:39am GMT

Matthew Palmer: The Mediocre Programmer's Guide to Rust

Me: "Hi everyone, my name's Matt, and I'm a mediocre programmer."

Everyone: "Hi, Matt."

Facilitator: "Are you an alcoholic, Matt?"

Me: "No, not since I stopped reading Twitter."

Facilitator: "Then I think you're in the wrong room."

Yep, that's my little secret - I'm a mediocre programmer. The definition of the word "hacker" I most closely align with is "someone who makes furniture with an axe". I write simple, straightforward code because trying to understand complexity makes my head hurt.

Which is why I've always avoided the more "academic" languages, like OCaml, Haskell, Clojure, and so on. I know they're good languages - people far smarter than me are building amazing things with them - but the moment I hear the word "endofunctor", I've lost all focus (and most of my will to live). My preferred languages are the ones that come with less intellectual overhead, like C, PHP, Python, and Ruby.

So it's interesting that I've embraced Rust with significant vigour. It's by far the most "complicated" language that I feel at least vaguely comfortable with using "in anger". Part of that is that I've managed to assemble a set of principles that allow me to almost completely avoid arguing with Rust's dreaded borrow checker, lifetimes, and all the rest of the dark, scary corners of the language. It's also, I think, that Rust helps me to write better software, and I can feel it helping me (almost) all of the time.

In the spirit of helping my fellow mediocre programmers to embrace Rust, I present the principles I've assembled so far.

Neither a Borrower Nor a Lender Be

If you know anything about Rust, you probably know about the dreaded "borrow checker". It's the thing that makes sure you don't have two pieces of code trying to modify the same data at the same time, or using a value when it's no longer valid.

While Rust's borrowing semantics allow excellent performance without compromising safety, for us mediocre programmers it gets very complicated, very quickly. So, the moment the compiler wants to start talking about "explicit lifetimes", I shut it up by just using "owned" values instead.

It's not that I never borrow anything; I have some situations that I know are "borrow-safe" for the mediocre programmer (I'll cover those later). But any time I'm not sure how things will pan out, I'll go straight for an owned value.

For example, if I need to store some text in a struct or enum, it's going straight into a String. I'm not going to start thinking about lifetimes and &'a str; I'll leave that for smarter people. Similarly, if I need a list of things, it's a Vec<T> every time - no &'b [T] in my structs, thank you very much.
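
As a minimal illustration of that rule of thumb (the type and field names here are invented):

struct Message {
    subject: String,          // owned text, not a borrowed &'a str
    recipients: Vec<String>,  // owned list, not a borrowed &'b [String]
}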

Attack of the Clones

Following on from the above, I've come to not be afraid of .clone(). I scatter them around my code like seeds in a field. Life's too short to spend time trying to figure out who's borrowing what from whom, if I can just give everyone their own thing.

There are warnings in the Rust book (and everywhere else) about how a clone can be "expensive". While it's true that, yes, making clones of data structures consumes CPU cycles and memory, it very rarely matters. CPU cycles are (usually) plentiful and RAM (usually) relatively cheap. Mediocre programmer mental effort is expensive, and not to be spent on premature optimisation. Also, if you're coming from most any other modern language, Rust is already giving you so much more performance that you're probably ending up ahead of the game, even if you .clone() everything in sight.

If, by some miracle, something I write gets so popular that the "expense" of all those spurious clones becomes a problem, it might make sense to pay someone much smarter than I to figure out how to make the program a zero-copy masterpiece of efficient code. Until then… clone early and clone often, I say!

Derive Macros are Powerful Magicks

If you start .clone()ing everywhere, pretty quickly you'll be hit with this error:


error[E0599]: no method named `clone` found for struct `Foo` in the current scope

This is because not everything can be cloned, and so if you want your thing to be cloned, you need to implement the method yourself. Well… sort of.

One of the things that I find absolutely outstanding about Rust is the "derive macro". These allow you to put a little marker on a struct or enum, and the compiler will write a bunch of code for you! Clone is one of the available so-called "derivable traits", so you add #[derive(Clone)] to your structs, and poof! you can .clone() to your heart's content.

But there are other things that are commonly useful, and so I've got a set of traits that basically all of my data structures derive:


#[derive(Clone, Debug, Default)]
struct Foo {
    // ...
}

Every time I write a struct or enum definition, that line #[derive(Clone, Debug, Default)] goes at the top.

The Debug trait allows you to print a "debug" representation of the data structure, either with the dbg!() macro, or via the {:?} format in the format!() macro (and anywhere else that takes a format string). Being able to say "what exactly is that?" comes in handy so often that not having a Debug implementation is like programming with one arm tied behind your Aeron.
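
Here's a small, hypothetical sketch of what that looks like in practice (the Point type is invented for the example, and the exact dbg!() output will vary by file and line):

#[derive(Clone, Debug, Default)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 3, y: 4 };
    dbg!(&p);  // prints something like: [src/main.rs:9:5] &p = Point { x: 3, y: 4 }
    println!("what exactly is that? {:?}", p);  // prints: what exactly is that? Point { x: 3, y: 4 }
}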

Meanwhile, the Default trait lets you create an "empty" instance of your data structure, with all of the fields set to their own default values. This only works if all the fields themselves implement Default, but a lot of standard types do, so it's rare that you'll define a structure that can't have an auto-derived Default. Enums are easily handled too; you just mark one variant as the default:


#[derive(Clone, Debug, Default)]
enum Bar {
    Something(String),
    SomethingElse(i32),
    #[default]   // <== mischief managed
    Nothing,
}
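
With those derives in place, conjuring up "empty" values is a one-liner. A quick sketch, reusing the Foo and Bar definitions above:

let foo = Foo::default();              // every field gets its own Default value
let bar = Bar::default();              // the #[default] variant, i.e. Bar::Nothing
assert!(matches!(bar, Bar::Nothing));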

Borrowing is OK, Sometimes

While I previously said that I like and usually use owned values, there are a few situations where I know I can borrow without angering the borrow checker gods, and so I'm comfortable doing it.

The first is when I need to pass a value into a function that only needs to take a little look at the value to decide what to do. For example, if I want to know whether any values in a Vec<u32> are even, I could pass in a Vec, like this:


fn main() {
    let numbers = vec![0u32, 1, 2, 3, 4, 5];

    if has_evens(numbers) {
        println!("EVENS!");
    }
}

fn has_evens(numbers: Vec<u32>) -> bool {
    numbers.iter().any(|n| n % 2 == 0)
}

However, this gets ugly if I'm going to use numbers later, like this:


fn main() {
    let numbers = vec![0u32, 1, 2, 3, 4, 5];

    if has_evens(numbers) {
        println!("EVENS!");
    }

    // Compiler complains about "value borrowed here after move"
    println!("Sum: {}", numbers.iter().sum::<u32>());
}

fn has_evens(numbers: Vec<u32>) -> bool {
    numbers.iter().any(|n| n % 2 == 0)
}

Helpfully, the compiler will suggest I use my old standby, .clone(), to fix this problem. But I know that the borrow checker won't have a problem with lending that Vec<u32> into has_evens() as a borrowed slice, &[u32], like this:


fn main() {
    let numbers = vec![0u32, 1, 2, 3, 4, 5];

    if has_evens(&numbers) {
        println!("EVENS!");
    }
}

fn has_evens(numbers: &[u32]) -> bool {
    numbers.iter().any(|n| n % 2 == 0)
}

The general rule I've got is that if I can take advantage of lifetime elision (a fancy term meaning "the compiler can figure it out"), I'm probably OK. In less fancy terms, as long as the compiler doesn't tell me to put 'a anywhere, I'm in the green. On the other hand, the moment the compiler starts using the words "explicit lifetime", I nope the heck out of there and start cloning everything in sight.

Another example of using lifetime elision is when I'm returning the value of a field from a struct or enum. In that case, I can usually get away with returning a borrowed value, knowing that the caller will probably just be taking a peek at that value, and throwing it away before the struct itself goes out of scope. For example:


struct Foo {
    id: u32,
    desc: String,
}

impl Foo {
    fn description(&self) -> &str {
        &self.desc
    }
}

Returning a reference from a function is practically always a mortal sin for mediocre programmers, but returning one from a struct method is often OK. In the rare case that the caller does want the reference I return to live for longer, they can always turn it into an owned value themselves, by calling .to_owned().
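
A tiny sketch of both uses, reusing the Foo struct above (the string contents are invented):

let foo = Foo { id: 1, desc: "a mediocre widget".to_string() };
let quick_look: &str = foo.description();           // borrowed: fine for a quick peek
let keeper: String = foo.description().to_owned();  // owned copy that can outlive foo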

Avoid the String Tangle

Rust has a couple of different types for representing strings - String and &str being the ones you see most often. There are good reasons for this; however, it complicates method signatures when you just want to take some sort of "bunch of text", and don't care so much about the messy details.

For example, let's say we have a function that wants to see if the length of the string is even. Using the logic that since we're just taking a peek at the value passed in, our function might take a string reference, &str, like this:


fn is_even_length(s: &str) -> bool {
    s.len() % 2 == 0
}

That seems to work fine, until someone wants to check a formatted string:


fn main() {
    // The compiler complains about "expected `&str`, found `String`"
    if is_even_length(format!("my string is {}", std::env::args().next().unwrap())) {
        println!("Even length string");
    }
}

Since format! returns an owned string, String, rather than a string reference, &str, we've got a problem. Of course, it's straightforward to turn the String from format!() into a &str (just prefix it with an &). But as mediocre programmers, we can't be expected to remember which sort of string all our functions take and add & wherever it's needed, and having to fix everything when the compiler complains is tedious.

The converse can also happen: a method that wants an owned String, and we've got a &str (say, because we're passing in a string literal, like "Hello, world!"). In this case, we need to use one of the plethora of available "turn this into a String" mechanisms (.to_string(), .to_owned(), String::from(), and probably a few others I've forgotten), on the value before we pass it in, which gets ugly real fast.
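
For the record, the common "turn this into a String" spellings are all equivalent for our purposes; a quick sketch:

let a: String = "Hello, world!".to_string();
let b: String = "Hello, world!".to_owned();
let c: String = String::from("Hello, world!");
assert_eq!(a, b);
assert_eq!(b, c);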

For these reasons, I never take a String or an &str as an argument. Instead, I use the Power of Traits to let callers pass in anything that is, or can be turned into, a string. Let us have some examples.

First off, if I would normally use &str as the type, I instead use impl AsRef<str>:


fn is_even_length(s: impl AsRef<str>) -> bool {
    s.as_ref().len() % 2 == 0
}

Note that I had to throw in an extra as_ref() call in there, but now I can call this with either a String or a &str and get an answer.
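
For example, all of the following calls now compile and work:

assert!(is_even_length("abcd"));                      // a &str literal
assert!(is_even_length(String::from("abcdef")));      // an owned String
assert!(is_even_length(format!("{}{}", "ab", "cd"))); // the format!() case from earlier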

Now, if I want to be given a String (presumably because I plan on taking ownership of the value, say because I'm creating a new instance of a struct with it), I use impl Into<String> as my type:


struct Foo {
    id: u32,
    desc: String,
}

impl Foo {
    fn new(id: u32, desc: impl Into<String>) -> Self {
        Self { id, desc: desc.into() }
    }
}

We have to call .into() on our desc argument, which makes the struct building a bit uglier, but I'd argue that's a small price to pay for being able to call both Foo::new(1, "this is a thing") and Foo::new(2, format!("This is a thing named {name}")) without caring what sort of string is involved.
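
Spelled out as code, that's simply (the name value is invented):

let a = Foo::new(1, "this is a thing");                        // a &str literal
let name = "frobnicator";
let b = Foo::new(2, format!("This is a thing named {name}"));  // an owned String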

Always Have an Error Enum

Rust's error handling mechanism (Results… everywhere), along with the quality-of-life sugar surrounding it (like the short-circuit operator, ?), is a delightfully ergonomic approach to error handling. To make life easy for mediocre programmers, I recommend starting every project with an Error enum that derives thiserror::Error, and using that in every method and function that returns a Result.

How you structure your Error type from there is less cut-and-dried, but typically I'll create a separate enum variant for each type of error I want to have a different description. With thiserror, it's easy to then attach those descriptions:


#[derive(Clone, Debug, thiserror::Error)]
enum Error {
    #[error("{0} caught fire")]
    Combustion(String),
    #[error("{0} exploded")]
    Explosion(String),
}

I also implement functions to create each error variant, because that allows me to do the Into<String> trick, and can sometimes come in handy when creating errors from other places with .map_err() (more on that later). For example, the impl for the above Error would probably be:


impl Error {
    fn combustion(desc: impl Into<String>) -> Self {
        Self::Combustion(desc.into())
    }

    fn explosion(desc: impl Into<String>) -> Self {
        Self::Explosion(desc.into())
    }
}

It's a tedious bit of boilerplate, and you can use the thiserror-ext crate's thiserror_ext::Construct derive macro to do the hard work for you, if you like. It, too, knows all about the Into<String> trick.

Banish map_err (well, mostly)

The newer mediocre programmer, who is just dipping their toe in the water of Rust, might write file handling code that looks like this:


fn read_u32_from_file(name: impl AsRef<str>) -> Result<u32, Error> {
    let mut f = File::open(name.as_ref())
        .map_err(|e| Error::FileOpenError(name.as_ref().to_string(), e))?;

    let mut buf = vec![0u8; 30];
    f.read(&mut buf)
        .map_err(|e| Error::ReadError(e))?;

    String::from_utf8(buf)
        .map_err(|e| Error::EncodingError(e))?
        .parse::<u32>()
        .map_err(|e| Error::ParseError(e))
}

This works great (or it probably does, I haven't actually tested it), but there are a lot of .map_err() calls in there. They take up over half the function, in fact. With the power of the From trait and the magic of the ? operator, we can make this a lot tidier.

First off, assume we've written boilerplate error creation functions (or used thiserror_ext::Construct to do it for us). That allows us to simplify the file handling portion of the function a bit:


fn read_u32_from_file(name: impl AsRef<str>) -> Result<u32, Error> {
    let mut f = File::open(name.as_ref())
        // We've dropped the `.to_string()` out of here...
        .map_err(|e| Error::file_open_error(name.as_ref(), e))?;

    let mut buf = vec![0u8; 30];
    f.read(&mut buf)
        // ... and the explicit parameter passing out of here
        .map_err(Error::read_error)?;

    // ...

If that latter .map_err() call looks weird, without the |e| and such, it's passing a function-as-closure, which just saves on a few characters typing. Just because we're mediocre, doesn't mean we're not also lazy.

Next, if we implement the From trait for the other two errors, we can make the string-handling lines significantly cleaner. First, the trait impl:


impl From<std::string::FromUtf8Error> for Error {
    fn from(e: std::string::FromUtf8Error) -> Self {
        Self::EncodingError(e)
    }
}

impl From<std::num::ParseIntError> for Error {
    fn from(e: std::num::ParseIntError) -> Self {
        Self::ParseError(e)
    }
}

(Again, this is boilerplate that can be autogenerated, this time by adding a #[from] tag to the variants you want a From impl on, and thiserror will take care of it for you)

In any event, no matter how you get the From impls, once you have them, the string-handling code becomes practically error-handling-free:


    Ok(
        String::from_utf8(buf)?
            .parse::<u32>()?
    )

The ? operator will automatically convert the error from the types returned from each method into the return error type, using From. The only tiny downside to this is that the ? at the end strips the Result, and so we've got to wrap the returned value in Ok() to turn it back into a Result for returning. But I think that's a small price to pay for the removal of those .map_err() calls.

In many cases, my coding process involves just putting a ? after every call that returns a Result, and adding a new Error variant whenever the compiler complains about not being able to convert some new error type. It's practically zero effort - outstanding outcome for the mediocre programmer.
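
Putting all of the above together, the finished function ends up looking something like this sketch. It assumes the Error enum, the file_open_error/read_error constructor functions, and the two From impls from earlier are in place, and, like the original, it hasn't actually been tested:

use std::fs::File;
use std::io::Read;

fn read_u32_from_file(name: impl AsRef<str>) -> Result<u32, Error> {
    let mut f = File::open(name.as_ref())
        .map_err(|e| Error::file_open_error(name.as_ref(), e))?;

    let mut buf = vec![0u8; 30];
    f.read(&mut buf).map_err(Error::read_error)?;

    // Both of these `?`s rely on the From impls to convert the error types
    Ok(String::from_utf8(buf)?.parse::<u32>()?)
}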

Just Because You're Mediocre, Doesn't Mean You Can't Get Better

To finish off, I'd like to point out that mediocrity doesn't imply shoddy work, nor does it mean that you shouldn't keep learning and improving your craft. One book that I've recently found extremely helpful is Effective Rust, by David Drysdale. The author has very kindly put it up to read online, but buying a (paper or ebook) copy would no doubt be appreciated.

The thing about this book, for me, is that it is very readable, even by us mediocre programmers. The sections are written in a way that really "clicked" with me. Some aspects of Rust that I'd had trouble understanding for a long time - such as lifetimes and the borrow checker, and particularly lifetime elision - actually made sense after I'd read the appropriate sections.

Finally, a Quick Beg

I'm currently subsisting on the kindness of strangers, so if you found something useful (or entertaining) in this post, why not buy me a refreshing beverage? It helps to know that people like what I'm doing, and helps keep me from having to sell my soul to a private equity firm.

01 May 2024 12:00am GMT

30 Apr 2024

feedPlanet Debian

Russell Coker: Links April 2024

Ron Garret wrote an insightful refutation to 2nd amendment arguments [1].

Interesting article from the UK about British Gas losing a civil suit about bill collecting techniques that are harassment [2]. This should be a criminal offence investigated by the police and prosecuted by the CPS.

David Brin wrote a new version of his essay about dealing with blackmail in the US political system [3].

Cory Doctorow gave an insightful lecture about Enshittification for the Transmediale festival in Berlin [4]. This link has video and a transcript, I read the transcript.

The Cut has an insightful article by a journalist who gave $50k in cash to a scammer and compares the scam to techniques used to extort false confessions [5].

Truth Dig has an informative article about how Nick Bostrom is racist and how his advocacy of eugenics influences Effective Altruism and a lot of Silicon Valley [6].

Bruce Schneier and Nathan Sanders wrote an insightful article about the problems with the "frontier" slogan for AI development [7].

Brian Krebs wrote an informative article about the links between Chinese APT companies and the Chinese government [8].

30 Apr 2024 1:50pm GMT

29 Apr 2024

feedPlanet Debian

Tim Retout: seL4 Microkit Tutorial

Recently I revisited my previous interest in seL4 - a fast, highly assured operating system microkernel for building secure systems.

The seL4 Microkit tutorial uses a simple Wordle game example to teach the basics of seL4 Microkit (formerly known as the seL4 Core Platform), which is a framework for creating static embedded systems on top of the seL4 microkernel. Microkit is also at the core of LionsOS, a new project to make seL4 accessible to a wider audience.

The tutorial is easy to follow, needing few prerequisites beyond a QEMU emulator and an AArch64 cross-compiler toolchain (Microkit being limited to 64-bit ARM systems currently). Use of an emulator makes for a quick test-debug cycle with a couple of Makefile targets, so time is spent focusing on walking through the Microkit concepts rather than on tooling issues.

This is an unusually good learning experience, probably because of the academic origins of the project itself. The Diátaxis documentation framework would class this as truly a "tutorial" rather than a "how-to guide" - you do learn a lot by implementing the exercises.

29 Apr 2024 9:02pm GMT

28 Apr 2024

feedPlanet Debian

Russell Coker: USB PSUs

I just bought a new USB PSU from AliExpress [1]. I got this to reduce the clutter in my bedroom, I charge my laptop, PineTime, and a few phones at the same time and a single PSU with lots of ports makes it easier. Also I bought a couple of really short USB-C cables as it's been proven by both real life tests and mathematical modelling that shorter cables get tangled less. This power supply is based on Gallium Nitride (GaN) [2] technology which makes it efficient and cool.

One thing I only learned about after that purchase is the new USB PPS standard (see the USB Wikipedia page for details [3]). The PPS (Programmable Power Supply) standard allows (quoting Wikipedia) "a voltage range of 3.3 to 21 V in 20 mV steps, and a current specified in 50 mA steps, to facilitate constant-voltage and constant-current charging". What this means in practice (when phones support it, which for me will probably be 2029 or something) is that the phone could receive power exactly matching the voltage needed for the battery and not have any voltage conversion inside the phone. Phones are designed to stop charging at a certain temperature; this probably doesn't concern people in places like Northern Europe, but in Australia it can be an issue. Removing the heat dissipation from inefficiencies in voltage conversion circuitry means the phone will be cooler when charging and can charge at a higher rate.

There is a "Certified USB Fast Charger" logo for chargers which do this, but it seems that at the moment they just include "PPS" in the feature list. So I highly recommend that GaN and PPS be on your feature list for your next USB PSU, but failing that the 240W PSU I bought for $36 was a good deal.

28 Apr 2024 10:02pm GMT

Evgeni Golov: Running Ansible Molecule tests in parallel

Or "How I've halved the execution time of our tests by removing ten lines". Catchy, huh? Also not exactly true, but quite close. Enjoy!

Molecule?!

"Molecule project is designed to aid in the development and testing of Ansible roles."

No idea about the development part (I have vim and mkdir), but it's really good for integration testing. You can write different test scenarios where you define an environment (usually a container), a playbook for the execution and a playbook for verification. (And a lot more, but that's quite unimportant for now, so go read the docs if you want more details.)

If you ever used Beaker for Puppet integration testing, you'll feel right at home (once you've thrown away Ruby and DSLs and embraced YAML for everything).

I'd like to point out one thing, before we continue. Have another look at the quote above.

"Molecule project is designed to aid in the development and testing of Ansible roles."

That's right. The project was started in 2015 and was always about roles. There is nothing wrong about that, but given the Ansible world has moved on to collections (which can contain roles), you start facing challenges.

Challenges using Ansible Molecule in the Collections world

The biggest challenge didn't change since the last time I looked at the topic in 2020: running tests for multiple roles in a single repository ("monorepo") is tedious.

Well, guess what a collection is? Yepp, a repository with multiple roles in it.

It did get a bit better though. There is pytest-ansible now, which has integration for Molecule. This allows the execution of Molecule and even provides reasonable logging with something as short as:

% pytest --molecule roles/

That's much better than the shell script I used in 2020!

However, being able to execute tests is one thing. Being able to execute them fast is another one.

Given Molecule was initially designed with single roles in mind, it has switches to run all scenarios of a role (--all), but it has no way to run these in parallel. That's fine if you have one or two scenarios in your role repository. But what if you have 10 in your collection?

"No way?!" you say after quickly running molecule test --help, "But there is…"

% molecule test --help
Usage: molecule test [OPTIONS] [ANSIBLE_ARGS]...

  --parallel / --no-parallel      Enable or disable parallel mode. Default is disabled.

Yeah, that switch exists, but it only tells Molecule to place things in separate folders; you still need to parallelize yourself with GNU parallel or pytest.

And here our actual journey starts!

Running Ansible Molecule tests in parallel

To run Molecule via pytest in parallel, we can use pytest-xdist, which allows pytest to run the tests in multiple processes.

With that, our pytest call becomes something like this:

% MOLECULE_OPTS="--parallel" pytest --numprocesses auto --molecule roles/

What does that mean?

- MOLECULE_OPTS="--parallel" passes Molecule's --parallel switch through to every molecule run that pytest-ansible starts, so each scenario works in its own ephemeral directory instead of clashing with the others.
- --numprocesses auto comes from pytest-xdist and starts one pytest worker process per available CPU.
- --molecule roles/ is the pytest-ansible integration from above, which discovers and runs the Molecule scenarios under roles/.

However, once we actually execute it, we see:

% MOLECULE_OPTS="--parallel" pytest --numprocesses auto --molecule roles/

WARNING  Driver podman does not provide a schema.
INFO     debian scenario test matrix: dependency, cleanup, destroy, syntax, create, prepare, converge, idempotence, side_effect, verify, cleanup, destroy
INFO     Performing prerun with role_name_check=0...
WARNING  Retrying execution failure 250 of: ansible-galaxy collection install -vvv --force ../..
ERROR    Command returned 250 code:

OSError: [Errno 39] Directory not empty: 'roles'

FileExistsError: [Errno 17] File exists: b'/home/user/namespace.collection/collections/ansible_collections/namespace/collection'

FileNotFoundError: [Errno 2] No such file or directory: b'/home/user/namespace.collection//collections/ansible_collections/namespace/collection/roles/my_role/molecule/debian/molecule.yml'

You might see other errors, other paths, etc, but they will all have one thing in common: they indicate that either files or directories are present while the tool expects them not to be, or vice versa.

Ah yes, that fine smell of race conditions.

I'll spare you the wild-goose chase I went on when trying to find out what the heck was calling ansible-galaxy collection install here. Instead, I'll just point at the following line:

INFO     Performing prerun with role_name_check=0...

What is this "prerun" you ask? Well… "To help Ansible find used modules and roles, molecule will perform a prerun set of actions. These involve installing dependencies from requirements.yml specified at the project level, installing a standalone role or a collection."

Turns out, this step is not --parallel-safe (yet?).

Luckily, it can easily be disabled, for all our roles in the collection:

% mkdir -p .config/molecule
% echo 'prerun: false' >> .config/molecule/config.yml

This works perfectly, as long as you don't have any dependencies.

And we don't have any, right? We didn't define any in a molecule/collections.yml, and our collection has none.

So let's push a PR with that and see what our CI thinks.

OSError: [Errno 39] Directory not empty: 'tests'

Huh?

FileExistsError: [Errno 17] File exists: b'remote.sh' -> b'/home/runner/work/namespace.collection/namespace.collection/collections/ansible_collections/ansible/posix/tests/utils/shippable/aix.sh'

What?

ansible_compat.errors.InvalidPrerequisiteError: Found collection at '/home/runner/work/namespace.collection/namespace.collection/collections/ansible_collections/ansible/posix' but missing MANIFEST.json, cannot get info.

Okay, okay, I get the idea… But why?

Well, our collection might not have any dependencies, BUT MOLECULE HAS! When using Docker containers it uses community.docker; when using Podman, containers.podman; etc…

So we have to install those before running Molecule, and everything should be fine. We even can use Molecule to do this!

$ molecule dependency --scenario <scenario>

And with that knowledge, the patch to enable parallel Molecule execution on GitHub Actions using pytest-xdist becomes:

diff --git a/.config/molecule/config.yml b/.config/molecule/config.yml
new file mode 100644
index 0000000..32ed66d
--- /dev/null
+++ b/.config/molecule/config.yml
@@ -0,0 +1 @@
+prerun: false
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
index 0f9da0d..df55a15 100644
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -58,9 +58,13 @@ jobs:
       - name: Install Ansible
         run: pip install --upgrade https://github.com/ansible/ansible/archive/${{ matrix.ansible }}.tar.gz
       - name: Install dependencies
-        run: pip install molecule molecule-plugins pytest pytest-ansible
+        run: pip install molecule molecule-plugins pytest pytest-ansible pytest-xdist
+      - name: Install collection dependencies
+        run: cd roles/repository && molecule dependency -s suse
       - name: Run tests
-        run: pytest -vv --molecule roles/
+        run: pytest -vv --numprocesses auto --molecule roles/
+        env:
+          MOLECULE_OPTS: --parallel

   ansible-lint:
     runs-on: ubuntu-latest

But you promised us you'd delete ten lines, and that's just a +7-2 patch!

Oh yeah, sorry, the +10-20 (so a net -10) is the foreman-operations-collection version of the patch, that also migrates from an ugly bash script to pytest-ansible.

And yes, that cuts down the execution from ~26 minutes to ~13 minutes.

In the collection I originally tested this with, it's a more moderate "from 8-9 minutes to 5-6 minutes", which is still good though :)

28 Apr 2024 7:04pm GMT

Russell Coker: Galaxy Note 9 Droidian

Droidian Support for Note 9

Droidian only supported the version of this phone with the Exynos chipset. The GSM Arena specs page for the Note 9 shows that it's the SM-N960F part number [1]. In Australia all Note 9 phones should have the Exynos but it doesn't hurt to ask for the part number before buying.

The status of the Note9 in Droidian went from fully supported to totally unsupported in the time I was working on this blog post. Such a rapid change is disappointing, it would be good if they at least kept the old data online. It would also be good if they didn't require a hash character in the URL for each phone which breaks the archive.org mirroring.

Installing Droidian

Firstly, Power+VolumeDown will reboot in some situations where the Power button on its own won't. The Note 9 hardware keys are Power, Volume Up, Volume Down, and Bixby.

The Droidian install document for the Galaxy Note 9 (now deleted) is a bit confusing and unclear. Here is the install process that worked for me.

  1. The doc says to start by installing "Android 10 (Q) stock firmware", but apparently a version of Android 10 that's already on the phone will do for that.
  2. Download the rescue.img file and the "Droidian's image" files from the Droidian page and extract the "Droidian's image" zip.
  3. Connect your phone to your workstation by USB, preferably USB 3 because it will take a few minutes to transfer the image at USB 2 speed. Install the Debian package adb on the workstation.
  4. To "Unlock the bootloader" you can apparently use a PC and the Samsung software but the unlock option in the Android settings gives the same result without proprietary software, here's how to do it:
    1. Connect the phone to Wifi. Then in settings go to "Software update", then click on "Download and install". Refuse to install if it offers you a new version (the unlock menu item will never appear unless you do this, so you can't unlock without Internet access).
    2. In settings go to "About phone", then "Software information", then tap on "Build number" repeatedly until "Developer mode" is enabled.
    3. In settings go to the new menu "Developer options" then turn on the "OEM unlocking" option, this does a factory reset of the phone.
  5. To flash the recovery.img you apparently use Odin on Windows. I used the heimdall-flash package on Debian. On your Linux workstation run the commands:
    adb reboot download
    heimdall flash --RECOVERY recovery.img

    Then press VOLUME-UP+BIXBY+POWER as soon as it reboots to get into the recovery image. If you don't do it soon enough it will do a default Android boot which will wipe the recovery.img you installed and also do a factory reset which will disable "Developer mode" and you will need to go back to step 4.

  6. If the above step works correctly you will have a RECOVERY menu where the main menu has options "Reboot system now", "Apply update", "Factory reset", and "Advanced" in a large font. If you failed to install recovery.img then you will get a similar menu but with a tiny font - that is the Samsung recovery image, which won't work, so reboot and try again.
  7. When at the main recovery menu select "Advanced" and then "Enter fastboot". Note that this doesn't run a different program or do anything obviously different, just gives a menu - that's OK we want it at this menu.
  8. Run "./flash_all.sh" on your workstation.
  9. Then it should boot Droidian! This may take a bit of time.

First Tests

Battery

The battery and its charge and discharge rates are very important to me, it's what made the PinePhonePro and Librem5 unusable as daily driver phones.

After running for about 100 minutes of which about 40 minutes were playing with various settings the phone was at 89% battery. The output of "upower -d" isn't very accurate as it reported power use ranging from 0W to 25W! But this does suggest that the phone might last for 400 minutes of real use that's not CPU intensive, such as reading email, document editing, and web browsing. I don't think that 6.5 hours of doing such things non-stop without access to a power supply or portable battery is something I'm ever going to do. Samsung when advertising the phone claimed 17 hours of video playback which I don't think I'm ever going to get - or want.

After running for 11 hours it was at 58% battery. Then after just over 21 hours of running it had 13% battery. Generally I don't trust the upower output much but the fact that it ran for over 21 hours shows that its battery life is much better than the PinePhonePro and the Librem5. During that 21 hours I've had an ssh session open with the client set to send ssh keep-alive messages every minute. So it had to remain active. There is an option to suspend on Droidian but they recommend you don't use it. There is no need for the "caffeine mode" that you have on Mobian. For comparison my previous tests suggested that when doing nothing a PinePhonePro might last for 30 hours on battery while the Librem5 might only last 10 hours [2]. This test with Droidian was done with the phone within my reach for much of that time and subject to my desire to fiddle with new technology - so it wasn't just sleeping all the time.

When charging from the USB port on my PC it went from 13% to 27% charge in half an hour and then after just over an hour it claimed to be at 33%. It ended up taking just over 7 hours to fully charge from empty, which is not great but not too bad for a PC USB port. This is the same USB port that my Librem5 couldn't charge from. Also the discharge:charge ratio of 21:7 is better than I could get from the PinePhonePro with Caffeine mode enabled.

rndis0

The rndis0 interface used for IP over USB doesn't work. Droidian bug #36 [3].

Other Hardware

The phone I bought for testing is the model with 6G of RAM and 128G of storage; it has a minor screen crack and significant screen burn-in. It's a good test system for $109. The screen burn-in is very obvious when running the default Android setup but when running the default Droidian GNOME setup set to the Dark theme (which is a significant power saving with an AMOLED screen) I can't see it at all. Buying a cheap phone with screen burn-in is something I recommend.

The stylus doesn't work, this isn't listed on the Droidian web page. I'm not sure if I tested the stylus when the phone was running Android, I think I did.

D State Processes

I get a kernel panic early in the startup for unknown reasons and some D state kernel threads which may or may not be related to that. Droidian bug #37 [4].

Second Phone

The Phone

I ordered a second Note9 on ebay, it had been advertised at $240 for a month and the seller accepted my offer of $200. With postage that's $215 for a Note9 in decent condition with 8G of RAM and 512G of storage. But Droidian dropped support for the Note9 before I got to install it. At the moment I'm not sure what I'll do with this, maybe I'll keep it on Android.

I also bought four phone cases for $16. I got spares because of the high price of postage relative to the case cost and the fact that they may be difficult to get in a few years.

The Tests

For the next phone my plan was to do more tests on Android before upgrading it to Debian. Here are the ones I can think of now, please suggest any others I should do.

Droidian and Security

When I tell technical people about Droidian a common reaction is "great you can get a cheap powerful phone and have better security than Android". This is wrong in several ways. Firstly Android has quite decent security. Android runs most things in containers and uses SE Linux. Droidian has the Debian approach for most software (i.e. it all runs under the same UID without any special protections) and the developers have no plans to use SE Linux. I've previously blogged about options for Sandboxing for Debian phone use; my blog post is NOT a solution to the problem but an analysis of the different potential ways of going about solving it [5].

The next issue is that Droidian has no way to update the kernel and the installation instructions often advise downgrading Android (running a less secure kernel) before the installation. The Android Generic Kernel Image project [6] addresses this by allowing a separation between drivers supplied by the hardware vendor and the kernel image supplied by Google. This also permits running the hardware vendor's drivers with a GKI kernel released by Google after the hardware vendor dropped security support. But this only applies to Android 11 and later, so Android 10 devices (like the Note 9 image for Droidian) miss out on this.

28 Apr 2024 11:40am GMT

Russell Coker: Kitty and Mpv

6 months ago I switched to Kitty for terminal emulation [1]. So far there's only been one thing that I couldn't effectively do with Kitty that I did with Konsole in the past: watching a music video in 1/4 of the screen while using the rest for terminals. I could set up multiple Kitty windows taking up the rest of the screen but I wanted to keep using a single Kitty with multiple terminals and just have mpv go over one of them. Kitty supports its own graphical interface so "mpv -vo=kitty" works but took 6* the CPU power in my tests, which isn't good for a laptop.

For X11 there's a -ontop option for mpv that does what you expect, but that doesn't work on Wayland. Not working is mostly Wayland's fault, as there is a long tail of less commonly used graphical operations that work in X11 but aren't yet implemented in Wayland. I have filed a Debian bug report about this; the mpv man page should note that it's only going to work on X11 on Linux.

I have discovered a solution to that, in the KDE settings there's a "Window Rules" section, I created an entry for "Window class" exactly matching "mpv" and then added a rule "Keep above other windows" and set it for "force" and "yes".

After that I can just resize mpv to occlude just one terminal and keep using the rest. Also one noteworthy thing with this is that it makes mpv go on top of the KDE taskbar, which can be a feature.

28 Apr 2024 5:38am GMT

27 Apr 2024

feedPlanet Debian

Dirk Eddelbuettel: qlcal 0.0.11 on CRAN: Calendar Updates

The eleventh release of the qlcal package arrived at CRAN today.

qlcal delivers the calendaring parts of QuantLib. It is provided (for the R package) as a set of included files, so the package is self-contained and does not depend on an external QuantLib library (which can be demanding to build). qlcal covers over sixty country / market calendars and can compute holiday lists, its complement (i.e. business day lists) and much more. Examples are in the README at the repository, the package page, and of course at the CRAN package page.

This release synchronizes qlcal with the QuantLib release 1.34 and contains more updates to 2024 calendars.

Changes in version 0.0.11 (2024-04-27)

  • Synchronized with QuantLib 1.34

  • Calendar updates for Brazil, India, Singapore, South Africa, Thailand, United States

  • Minor continuous integration update

Courtesy of my CRANberries, there is a diffstat report for this release. See the project page and package documentation for more details, and more examples. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

27 Apr 2024 9:58pm GMT