The still very new package zigg, which arrived on CRAN a week ago, just received a micro-update at CRAN. zigg provides
the Ziggurat
pseudo-random number generator (PRNG) for Normal, Exponential and
Uniform draws proposed by Marsaglia and
Tsang (JSS, 2000),
and extended by Leong et al. (JSS, 2005). This PRNG
is lightweight and very fast: on my machine the Normal, Exponential, and Uniform generators are on the order of 7.4, 5.2 and 4.7 times faster than the default generators in R, as illustrated in the benchmark
chart borrowed from the git repo.
As I wrote last week in the initial announcement, I had picked up their work in package RcppZiggurat and updated its code for the 64-bit world we now live in. That package already provided the Normal generator along with several competing implementations which it compared and timed rigorously. As one of
the generators was based on the GNU GSL via the
implementation of Voss, we always ended
up with a run-time dependency on the GSL too. No more: this new package
is zero-dependency, zero-suggests and hence very easy to deploy.
Moreover, we also include a demonstration of four distinct ways of
accessing the compiled code from another R package: pure and straight-up
C, similarly pure C++, inclusion of the header in C++ as well as via Rcpp. The other advance is the
resurrection of the second generator for the Exponential distribution.
And following Burkardt we expose the
Uniform too. The main upside of these generators is their excellent
speed, as can be seen in the comparison against the default R generators
generated by the example script timings.R:
Needless to say, speed is not everything. This PRNG comes from the time of
32-bit computing so the generator period is likely to be shorter than
that of newer high-quality generators. If in doubt, forgo speed and
stick with the high-quality default generators.
This release essentially just completes the DESCRIPTION file and
README.md now that this is a CRAN package. The short NEWS entry
follows.
Changes in version 0.0.2
(2025-02-07)
Complete DESCRIPTION and README.md following initial CRAN
upload
Armadillo is a powerful
and expressive C++ template library for linear algebra and scientific
computing. It aims towards a good balance between speed and ease of use,
has a syntax deliberately close to Matlab, and is useful for algorithm
development directly in C++, or quick conversion of research code into
production environments. RcppArmadillo
integrates this library with the R environment and language, and is
widely used by (currently) 1215 other packages on CRAN, downloaded 38.2 million
times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint
/ vignette) by Conrad and myself has been cited 612 times according
to Google Scholar.
Conrad released a minor
version 14.2.3 yesterday. As it has been two months since the last
minor release, we prepared a new version for CRAN too which arrived there early
this morning.
The changes since the last CRAN release are summarised
below.
Changes in
RcppArmadillo version 14.2.3-1 (2025-02-05)
Upgraded to Armadillo release 14.2.3 (Smooth Caffeine)
Minor fix for declaration of xSYCON and
xHECON functions in LAPACK
Cookiecutter is a tool for building coding project templates. It’s often used to provide scaffolding to build lots of similar projects. I’ve seen it used to create Symfony projects and several cloud infrastructures deployed with Terraform. This tool was useful to accelerate the creation of new projects.
Since these templates were bound to evolve, the teams providing these templates relied on cruft to update the code provided by the template in their users’ code. In other words, they wanted their users to apply a diff of the template modifications to their code.
At the beginning, all was fine. But problems began to appear during the lifetime of these projects.
What went wrong?
In both cases, we had the following scenario:
user team:
creates new project with cookiecutter template
makes modifications to their code, including code provided by the template
meanwhile, provider team:
makes modifications to cookiecutter template
releases new template version
asks its users to update the code brought by the template using cruft
user team then:
runs cruft to update template code
discovers a lot of code conflicts (similar to git merge conflicts)
often rolls back cruft update and gives up on template update
User teams giving up on updates is a major problem because these updates may bring security or compliance fixes.
Note that code conflicts seen with cruft are similar to git merge conflicts, but harder to resolve because, unlike with a git merge, there’s no common ancestor, so 3-way merges are not possible.
From an organisational point of view, the main problem is the ambiguous ownership of the functionalities brought by the template code: who owns this code? The provider team who writes the template, or the user team who owns the repository of the code generated from the template? Conflicts are bound to happen.
Possible solutions to get out of this tar pit:
Assume that templates are one-shot. Template updates are not practical in the long run.
Make sure that templates are as thin as possible. They should contain minimal logic.
Move most if not all logic into separate libraries or scripts that are owned by the provider team. This way, updates coming from the provider team can be managed like external dependencies, by upgrading the version of a dependency.
Of course your users won’t be happy to be faced with a manual migration from the old big template to the new one with external dependencies. On the other hand, this may be easier to sell than updates based on cruft since the painful work will happen once. Further updates will be done by incrementing dependency versions (which can be automated with renovate).
If many projects are to be created with this template, it may be more practical to provide a CLI that will create a skeleton project. See for instance the terragrunt scaffold command.
My name is Dominique Dumont, I’m a devops freelance. You can find the devops and audit services I propose on my website or reach out to me on LinkedIn.
We are pleased to announce that Proxmox has
committed to sponsor DebConf25 as a
Platinum Sponsor.
Proxmox develops powerful, yet easy-to-use Open Source server software. The
product portfolio from Proxmox, including server virtualization, backup, and
email security, helps companies of any size, sector, or industry to simplify
their IT infrastructures. The Proxmox solutions are based on the great Debian
platform, and we are happy that we can give back to the community by sponsoring
DebConf25.
With this commitment as Platinum Sponsor, Proxmox is contributing to the Debian
annual Developers' conference, directly supporting the progress of Debian and
Free Software. Proxmox contributes to strengthening the community that collaborates on Debian projects from all around the world throughout the year.
Thank you very much, Proxmox, for your support of DebConf25!
Become a sponsor too!
DebConf25 will take place from 14 to 20
July 2025 in Brest, France, and will be preceded by DebCamp, from 7 to 13
July 2025.
Just a "warn your brothers" for people foolish enough to
use GKE and run on the Rapid release channel.
Updating from version 1.31.1-gke.1146000 to 1.31.1-gke.1678000 is causing
trouble whenever NetworkPolicy resources and a readinessProbe (or health check)
are configured. As a workaround we started to remove the NetworkPolicy
resources. E.g. when kustomize is involved with a patch like this:
We tried to update to the latest version - right now 1.31.1-gke.2008000 - which
did not change anything.
The behaviour is pretty erratic: sometimes it still works and sometimes the traffic
is denied. It also seems that there is some relevant fix in 1.31.1-gke.1678000
because that is now the oldest release of 1.31.1 which I can find in the regular and
rapid release channels. The last known good version 1.31.1-gke.1146000 is not
available to try a downgrade.
Update: 1.31.4-gke.1372000 in late January 2025 seems to finally fix it.
If you use SteamOS and you like to install third-party tools or modify the system-wide configuration some of your changes might be lost after an OS update. Read on for details on why this happens and what to do about it.
As you all know SteamOS uses an immutable root filesystem and users are not expected to modify it because all changes are lost after an OS update.
However this does not include configuration files: the /etc directory is not part of the root filesystem itself. Instead, it’s a writable overlay and all modifications are actually stored under /var (together with all the usual contents that go in that filesystem such as logs, cached data, etc).
/etc contains important data that is specific to that particular machine like the configuration of known network connections, the password of the main user and the SSH keys. This configuration needs to be kept after an OS update so the system can keep working as expected. However the update process also needs to make sure that other changes to /etc don’t conflict with whatever is available in the new version of the OS, and there have been issues due to some modifications unexpectedly persisting after a system update.
SteamOS 3.6 introduced a new mechanism to decide what to keep after an OS update, and the system now keeps a list of configuration files that are allowed to be kept in the new version. The idea is that only the modifications that are known to be important for the correct operation of the system are applied, and everything else is discarded [1].
However, many users want to be able to keep additional configuration files after an OS update, either because the changes are important for them or because those files are needed for some third-party tool that they have installed. Fortunately the system provides a way to do that, and users (or developers of third-party tools) can add a configuration file to /etc/atomic-update.conf.d, listing the additional files that need to be kept.
There is an example in /etc/atomic-update.conf.d/example-additional-keep-list.conf that shows what this configuration looks like.
Sample configuration file for the SteamOS updater
Developers who are targeting SteamOS can also use this same method to make sure that their configuration files survive OS updates. As an example of an actual third-party project that makes use of this mechanism you can have a look at the DeterminateSystems Nix installer:
As usual, if you encounter issues with this or any other part of the system you can check the SteamOS issue tracker. Enjoy!
[1] A copy is actually kept under /etc/previous to give the user the chance to recover files if necessary, and up to five previous snapshots are kept under /var/lib/steamos-atomupd/etc_backup.
Our monthly reports outline what we’ve been up to over the past month and highlight items of news from elsewhere in the world of software supply-chain security when relevant. As usual, though, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website.
The last few months saw the introduction of reproduce.debian.net. Announced at the recent Debian MiniDebConf in Toulouse, reproduce.debian.net is an instance of rebuilderd operated by the Reproducible Builds project. rebuilderd is our server designed to monitor the official package repositories of Linux distributions and attempt to reproduce the observed results there.
Giacomo Benedetti, Oreofe Solarin, Courtney Miller, Greg Tystahl, William Enck, Christian Kästner, Alexandros Kapravelos, Alessio Merlo and Luca Verderame published an interesting article recently. Titled An Empirical Study on Reproducible Packaging in Open-Source Ecosystem, the abstract outlines its optimistic findings:
[We] identified that with relatively straightforward infrastructure configuration and patching of build tools, we can achieve very high rates of reproducible builds in all studied ecosystems. We conclude that if the ecosystems adopt our suggestions, the build process of published packages can be independently confirmed for nearly all packages without individual developer actions, and doing so will prevent significant future software supply chain attacks.
A second recent article, this one studying the Nix functional package manager, reports strongly positive results; its abstract reads as follows:
In this work, we perform the first large-scale study of bitwise reproducibility, in the context of the Nix functional package manager, rebuilding 709,816 packages from historical snapshots of the nixpkgs repository[. We] obtain very high bitwise reproducibility rates, between 69 and 91% with an upward trend, and even higher rebuildability rates, over 99%. We investigate unreproducibility causes, showing that about 15% of failures are due to embedded build dates. We release a novel dataset with all build statuses, logs, as well as full diffoscopes: recursive diffs of where unreproducible build artifacts differ.
As above, the entire PDF of the article is available to view online.
Distribution work
There has been the usual work in various distributions this month, such as:
10+ reviews of Debian packages were added, 11 were updated and 10 were removed this month, adding to our knowledge about identified issues. A number of issue types were updated as well.
The FreeBSD Foundation announced that “a planned project to deliver zero-trust builds has begun in January 2025”. Supported by the Sovereign Tech Agency, this project is centered on the various build processes; the “primary goal of this work is to enable the entire release process to run without requiring root access, and that build artifacts build reproducibly – that is, that a third party can build bit-for-bit identical artifacts.” The full announcement can be found online, which includes an estimated schedule and other details.
Following up on a substantial amount of previous work pertaining to the Sphinx documentation generator, James Addison asked a question about the relationship between the SOURCE_DATE_EPOCH environment variable and testing, which generated a number of replies.
Adithya Balakumar of Toshiba asked a question about whether it is possible to make ext4 filesystem images reproducible. Adithya’s issue is that even the smallest amount of post-processing of the filesystem results in the modification of the “Last mount” and “Last write” timestamps.
FUSE (Filesystem in USErspace) filesystems such as disorderfs do not delete files from the underlying filesystem when they are deleted from the overlay. This can cause seemingly straightforward tests — for example, cases that expect directory contents to be empty after deletion is requested for all files listed within them — to fail.
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 285, 286 and 287 to Debian:
Security fixes:
Validate the --css command-line argument to prevent a potential Cross-site scripting (XSS) attack. Thanks to Daniel Schmidt from SRLabs for the report. […]
Prevent XML entity expansion attacks. Thanks to Florian Wilkens from SRLabs for the report. […][…]
Print a warning if we have disabled XML comparisons due to a potentially vulnerable version of pyexpat. […]
Bug fixes:
Correctly identify changes to only the line-endings of files; don’t mark them as Ordering differences only. […]
When passing files on the command line, don’t call specialize(…) before we’ve checked that the files are identical or not. […]
Do not exit with a traceback if paths are inaccessible, either directly, via symbolic links or within a directory. […]
Don’t cause a traceback if cbfstool extraction failed. […]
Use the surrogateescape mechanism to avoid a UnicodeDecodeError and crash when decoding any zipinfo output that is not UTF-8 compliant. […]
Testsuite improvements:
Don’t mangle newlines when opening test fixtures; we want them untouched. […]
In addition, fridtjof added support for the ASAR .tar-like archive format. […][…][…][…] and lastly, Vagrant Cascadian updated diffoscope in GNU Guix to version 285 […][…] and 286 […][…].
strip-nondeterminism is our sister tool to remove specific non-deterministic results from a completed build. This month version 1.14.1-1 was uploaded to Debian unstable by Chris Lamb, making the following changes:
Clarify the --verbose and non-verbose output of bin/strip-nondeterminism so we don’t imply we are normalizing files that we are not. […]
Update the website’s README to make the setup command copy & paste friendly. […]
Reproducibility testing framework
The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen, including:
Ed Maste modified the FreeBSD build system to clean the object directory before commencing a build. […]
Gioele Barabucci updated the rebuilder stats to first add a category for network errors […] as well as to categorise failures without a diffoscope log […].
Jessica Clarke also made some FreeBSD-related changes, including:
Ensuring we clean up the object directory for the second build as well. […][…]
Updating the sudoers for the relevant rm -rf command. […]
Update the cleanup_tmpdirs method to match other removals. […]
Update the reproducible_debstrap job to call Debian’s debootstrap with the full path […] and to use eatmydata as well […][…].
Make some changes to deduce the CPU load in the debian_live_build job. […]
Lastly, both Holger Levsen […] and Vagrant Cascadian […] performed some node maintenance.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:
In my last blog, I explained how we resolved a throttling issue involving Azure storage API. In the end, I mentioned that I was not sure of the root cause of the throttling issue.
Even though we no longer had any problem in the dev and preprod clusters, we still faced throttling issues in prod. The main difference between prod and the other environments is that we have about 80 PVs in prod versus 15 elsewhere. Given that we manage 1500 pods in prod, 80 PVs does not look like a lot.
To continue the investigation, I’ve modified k8s-scheduled-volume-snapshotter to limit the number of snapshots done in a single cron run (see the add maxSnapshotCount parameter pull request).
In prod, we used the modified snapshotter to trigger snapshots one by one.
Even with all previous snapshots cleaned up, we could not trigger a single new snapshot without being throttled. I guess that, in the cron job, just checking the list of PVs to snapshot was enough to exhaust our API quota.
The Azure documentation mentions that a leaky bucket algorithm is used for throttling. A full bucket holds tokens for 250 API calls, and the bucket gets 25 new tokens per second. Looks like that is not enough.
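As a rough mental model of those numbers (an illustrative sketch only, not Azure's actual implementation), a token bucket with capacity 250 and a refill rate of 25 tokens per second behaves like this:

    /* Illustrative token-bucket sketch of the throttling described above:
     * capacity 250 API calls, refilled at 25 tokens per second. */
    #include <stdbool.h>

    typedef struct {
        double tokens;       /* currently available calls        */
        double capacity;     /* 250                              */
        double refill_rate;  /* 25 tokens per second             */
        double last_ts;      /* time of the last refill, seconds */
    } bucket_t;

    static bool try_call(bucket_t *b, double now) {
        b->tokens += (now - b->last_ts) * b->refill_rate;
        if (b->tokens > b->capacity)
            b->tokens = b->capacity;
        b->last_ts = now;
        if (b->tokens >= 1.0) {
            b->tokens -= 1.0;    /* consume one API call */
            return true;
        }
        return false;            /* throttled */
    }

A burst of more than 250 calls empties the bucket, and at 25 new tokens per second it takes a while to recover.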
I was puzzled and out of ideas.
I looked for similar problems in AKS issues on GitHub, where I found this comment that recommends using the useDataPlaneAPI parameter in the CSI file driver. That was it!
I was flabbergasted by this parameter: why is the CSI file driver able to use 2 APIs? Why is one of them so limited? And more importantly, why is the limited API the default one?
Anyway, setting useDataPlaneAPI: "true" in our VolumeSnapshotClass manifest was the right solution. This indeed solved the throttling issue in our prod cluster.
But not the snapshot issue. Amongst the 80 PVs, I still had 2 snapshots failing.
Fortunately, the error was mentioned in the description of the failed snapshots: we had too many (200) snapshots for these shared volumes.
What?? All these snapshots were cleaned up last week.
I then tried to delete these snapshots through the Azure console. But the console failed to delete these snapshots due to API throttling. Looks like the Azure console is not using the right API.
Anyway, I went back to the solution explained in my previous blog and listed all snapshots with the az command. I indeed had a lot of snapshots, many of them dated Jan 19 and 20. There was often a new bogus snapshot created every minute.
These were created during the first attempt at fixing the throttling issue. I guess that even though the CSI file driver was throttled, a snapshot was still created in the storage account, but the CSI driver did not see it and retried a minute later. What a mess.
Anyway, I’ve again cleaned up these bogus snapshots, and now snapshot creation is working fine.
I was recently pointed to Technologies and Projects supported by the
Sovereign Tech Agency which is financed by the German Federal
Ministry for Economic Affairs and Climate Action. It is a subsidiary of
the Federal Agency for Disruptive Innovation, SPRIND GmbH.
It is worth sending applications there for distinct projects as that is
their preferred method of funding. Distinguished developers can also
apply for a fellowship position that pays up to 40hrs / week (32hrs when
freelancing) for a year. This is especially open to maintainers of larger
numbers of packages in Debian (or any other Linux distribution).
There might be a chance that some of the Debian-related projects
submitted to the Google Summer of Code that did not get funded could be
retried with those foundations. As per the FAQ of the project:
"The Sovereign Tech Agency focuses on securing and strengthening open
and foundational digital technologies. These communities working on
these are distributed all around the world, so we work with people,
companies, and FOSS communities everywhere."
Similar funding organizations include the Open Technology Fund and
FLOSS/fund. If you have a Debian-related project that fits these
funding programs, they might be interesting options. This list is by no
means exhaustive—just some hints I’ve received and wanted to share. More
suggestions for such opportunities are welcome.
Year of code reviews
On the debian-devel mailing list, there was a long thread titled
"Let's make 2025 a year when code reviews became common in Debian".
It initially suggested something along the lines of:
"Let's review MRs in Salsa." The discussion quickly expanded to
include patches that have
been sitting in the BTS for years, which deserve at least the same
attention. One idea I'd like to emphasize is that associating BTS bugs
with MRs could be very convenient. It’s not only helpful for
documentation but also the easiest way to apply patches.
I’d like to emphasize that no matter what workflow we use—BTS, MRs, or a
mix—it is crucial to uphold Debian’s reputation for high quality.
However, this reputation is at risk as more and more old issues
accumulate. While Debian is known for its technical excellence,
long-standing bugs and orphaned packages remain a challenge. If we don’t
address these, we risk weakening the high standards that Debian is
valued for. Revisiting old issues and ensuring that unmaintained
packages receive attention is especially important as we prepare for the
Trixie release.
Debian Publicity Team will no longer post on X/Twitter
The team is in charge of deciding the most suitable publication
venue or venues for announcements and when they are published.
The team once decided to join Twitter, but circumstances have since
changed. The current Press delegates have the institutional authority to
leave X, just as their predecessors had the authority to join. I
appreciate that the team carefully considered the matter, reinforced by
the arguments developed on the debian-publicity list, and communicated
its reasoning openly.
The RcppUUID package
on CRAN has been providing
UUIDs (based on the underlying Boost
library) for several years. Written by Artem Klemsov and maintained
in this gitlab
repo, the package is a very nice example of clean and
straightforward library binding.
When we did our annual
BH upgrade to 1.87.0 and checked reverse dependencies, we noticed that
RcppUUID
needed a small and rather minor update which we showed as a short diff
in an
issue filed. Neither I nor CRAN heard from Artem, so the
package ended up being archived last week. Which in turn led me to
make this minimal update to 1.1.2 to resurrect it, which CRAN processed more or less like a
regular update given this explanation and so it arrived last Friday.
Most of my Debian contributions this month were
sponsored by
Freexian. If you appreciate this sort of work and are at a company that
uses Debian, have a look to see whether you can pay for any of
Freexian‘s services; as well as the direct
benefits, that revenue stream helps to keep Debian development sustainable
for me and several other lovely
people.
You can also support my work directly via
Liberapay.
Python team
We finally made Python 3.13 the default version in testing! I fixed various
bugs that got in the way of this:
I helped with some testing of a debian-installer-utils
patch
as part of the /usr move. I need to get around to uploading this, since
it looks OK now.
Other small things
Helmut Grohne reached out for help debugging a multi-arch coinstallability
problem (you know it’s going to be complicated when even Helmut can’t
figure it out on his own …) in
binutils, and we had a call about that.
For many years I wished I had a setup that would allow me to work (that is, code) productively outside in the bright sun. It’s winter right now, but summer will come again, and this weekend I got closer to that goal.
TL;DR: Using code-server on a beefy machine seems to be quite neat.
Passively lit coding
Personal history
Looking back at my own old blog entries I find one from 10 years ago describing how I bought a Kobo eBook reader with the intent of using it as an external monitor for my laptop. It seems that I got a proof-of-concept setup working, using VNC, but it was tedious to set up, and I never actually used that. I subsequently noticed that the eBook reader is rather useful to read eBooks, and it has been in heavy use for that ever since.
Four years ago I gave this old idea another shot and bought an Onyx BOOX Max Lumi. This is an A4-sized tablet running Android, with the very promising feature of an HDMI input. So hopefully I’d attach it to my laptop and it just works™. Turns out that this never worked as well as I hoped: Even if I set the resolution to exactly the tablet’s screen’s resolution I got blurry output, and it also drained the battery a lot, so I gave up on this. I subsequently noticed that the tablet is rather useful to take notes, and it has been in sporadic use for that.
Going off on this tangent: I later learned that the HDMI input of this device appears to the system like a camera input, and I don’t have to use Boox’s “monitor” app but could use other apps like FreeDCam as well. This somehow managed to fix the resolution issues, but the setup still wasn’t as convenient to be used regularly.
I also played around with pure terminal approaches, e.g. SSH’ing into a system, but since my usual workflow was never purely text-based (I was at least used to using a window manager instead of a terminal multiplexer like screen or tmux) that never led anywhere either.
My colleagues have said good things about using VSCode with the remote SSH extension to work on a beefy machine, so I gave this a try now as well, and while it’s not a complete game changer for me, it does make certain tasks (rebuilding everything after switching branches, running the test suite) very convenient. And it’s a bit spooky to run these work loads without the laptop’s fan spinning up.
In this setup, the workspace is remote, but VSCode still runs locally. But it made me wonder about my old goal of being able to work reasonably efficient on my eInk tablet. Can I replicate this setup there?
VSCode itself doesn’t run on Android directly. There are projects that run a Linux chroot or run inside termux on the Android system, and then you can use VNC to connect to it (e.g. on Andronix)… but that did not seem promising. It seemed fiddly, and I probably should take it easy on the tablet’s system.
code-server, running remotely
A more promising option is code-server. This is a fork of VSCode (actually of VSCodium) that runs completely on the remote machine, and the client machine just needs a browser. I set that up this weekend and found that I was able to do a little bit of work reasonably.
Access
With code-server one has to decide how to expose it safely enough. I decided against the tunnel-over-SSH option, as I expected that to be somewhat tedious to set up (both initially and for each session) on the android system, and I liked the idea of being able to use any device to work in my environment.
I also decided against the more involved “reverse proxy behind proper hostname with SSL” setups, because they involve a few extra steps, and some of them I cannot do as I do not have root access on the shared beefy machine I wanted to use.
That left me with the option of using code-server’s built-in support for self-signed certificates and a password:
(I am using nix as a package manager on a Debian system there, hence the additional PATH and complex ExecStart. If you have a more conventional setup then you do not have to worry about Environment and can likely use ExecStart=code-server.)
For this to survive me logging out I had to ask the system administrator to run loginctl enable-linger joachim, so that systemd allows my jobs to linger.
Git credentials
The next issue to be solved was how to access the git repositories. The work is all on public repositories, but I still need a way to push my work. With the classic VSCode-SSH-remote setup from my laptop, this is no problem: My local SSH key is forwarded using the SSH agent, so I can seamlessly use that on the other side. But with code-server there is no SSH key involved.
I could create a new SSH key and store it on the server. That did not seem appealing, though, because SSH keys on Github always have full access. It wouldn’t be horrible, but I still wondered if I could do better.
I thought of creating fine-grained personal access tokens that only allow me to push code to specific repositories, and nothing else, and just store them permanently on the remote server. Still a neat and convenient option, but creating PATs for our org requires approval and I didn’t want to bother anyone on the weekend.
So I am experimenting with Github’s git-credential-manager now. I have configured it to use git’s credential cache with an elevated timeout, so that once I log in, I don’t have to again for one workday.
To log in, I have to go to https://github.com/login/device on an authenticated device (e.g. my phone) and enter an 8-character code. Not too shabby in terms of security. I only wish that webpage would not require me to press Tab after each character…
This still grants rather broad permissions to the code-server, but at least only temporarily.
Android setup
On the client side I could now open https://host.example.com:8080 in Firefox on my eInk Android tablet, click through the warning about self-signed certificates, log in with the fixed password mentioned above, and start working!
I switched to a theme that supposedly is eInk-optimized (eInk by Mufanza). It’s not perfect (e.g. git diffs are unhelpful because it is not possible to distinguish deleted from added lines), but it’s a start. There are more eInk themes on the official Visual Studio Marketplace, but because code-server is a fork it cannot use that marketplace, and for example this theme isn’t on Open-VSX.
For some reason the F11 key doesn’t work, but going fullscreen is crucial, because screen estate is scarce in this setup. I can go fullscreen using VSCode’s command palette (Ctrl-P) and invoking the command there, but Firefox often jumps out of the fullscreen mode, which is annoying. I still have to pay attention to when that’s happening; maybe it’s the Esc key, which I am of course using a lot due to me using vim bindings.
A more annoying problem was that on my Boox tablet, sometimes the on-screen keyboard would pop up, which is seriously annoying! It took me a while to track this down: The Boox has two virtual keyboards installed: The usual Google AOSP keyboard, and the Onyx Keyboard. The former is clever enough to stay hidden when there is a physical keyboard attached, but the latter isn’t. Moreover, pressing Shift-Ctrl on the physical keyboard rotates through the virtual keyboards. Now, VSCode has many keyboard shortcuts that require Shift-Ctrl (especially on an eInk device, where you really want to avoid using the mouse). And the limited settings exposed by the Boox Android system do not allow you to configure that or disable the Onyx keyboard! To solve this, I had to install the KISS Launcher, which would allow me to see more Android settings, and in particular allow me to disable the Onyx keyboard. So this is fixed.
I was hoping to improve the experience even more by opening the web page as a Progressive Web App (PWA), as described in the code-server FAQ. Unfortunately, that did not work. Firefox on Android did not recognize the site as a PWA (even though it recognizes a PWA test page). And I couldn’t use Chrome either because (unlike Firefox) it would not consider a site with a self-signed certificate as a secure context, and then code-server does not work fully. Maybe this is just some bug that gets fixed in later versions.
I did not work enough with this yet to assess how much the smaller screen estate, the lack of colors and the slower refresh rate will bother me. I probably need to hide Lean’s InfoView more often, and maybe use the Error Lens extension, to avoid having to split my screen vertically.
I also cannot easily work on a park bench this way, with a tablet and a separate external keyboard. I’d need at least a table, or some additional piece of hardware that turns tablet + keyboard into some laptop-like structure that I can put on my, well, lap. There are cases for Onyx products that include a keyboard, and maybe they work on the lap, but they don’t have the Trackpoint that I have on my ThinkPad TrackPoint Keyboard II, and how can you live without that?
Conclusion
After this initial setup chances are good that entering and using this environment is convenient enough for me to actually use it; we will see when it gets warmer.
A few bits could be better. In particular logging in and authenticating GitHub access could be both more convenient and more safe – I could imagine that when I open the page I confirm that on my phone (maybe with a fingerprint), and that temporarily grants access to the code-server and to specific GitHub repositories only. Is that easily possible?
Version 0.0.20 of RcppSpdlog arrived
on CRAN early this morning and
has been uploaded to Debian. RcppSpdlog
bundles spdlog, a
wonderful header-only C++ logging library with all the bells and
whistles you would want that was written by Gabi Melman, and also includes fmt by Victor Zverovich. You can learn
more at the nice package
documentation site.
This release updates the code to the version 1.15.1 of spdlog which was released
this morning as well. It also contains a contributed PR which
illustrates logging in a multithreaded context.
The NEWS entry for this release follows.
Changes in
RcppSpdlog version 0.0.20 (2025-02-01)
New multi-threaded logging example (Young Geun Kim and Dirk via
#22)
Another short status update of what happened on my side last
month. Mostly focused on quality of life improvements in phosh and
cleaning up and improving phoc this time around (including catching up
with wlroots git) but some improvements for other things like
phosh-osk-stub happened on the side too.
Allow events to override the sound feedback with custom sounds
(MR). Allows
desktop/mobile shells like phosh to honour application prefs for notifications.
udev regression affecting gmobile (Bug). Many thanks to Yu Watanabe
for providing the fix so quickly
Reviews
This is not code by me but reviews of other people’s code. The list is
incomplete, but I hope to improve on this in the upcoming
months. Thanks for the contributions!
I recently discovered “Architecture Decision Logs” or Architecture Decision Records (ADL/ADR) in a software project and really like the idea of explicitly writing down such decisions. You can find complex templates, theory and process for these documents that might be appropriate for big projects. For a small project I don’t think that these are necessary and a free-form text file is probably fine.
Besides benefits of an ADL listed elsewhere, I also see benefits especially for free software projects:
Potential contributors can quickly decide whether they want to align with the decisions.
Discussions about project directions can be handled with less emotion.
A decision to fork a project can be based on diverging architecture decisions.
Potential users can decide whether the software aligns with their needs.
Code readers might have fewer WTF moments in which they believe the code author was stupid, incompetent and out of their mind.
The purpose of ADLs overlaps with Project Requirements and Design documents (PRD, DD). While the latter should in theory be written before coding starts, ADLs are written during the development of the project and capture the thought process.
Thus ADLs are in my opinion more aligned with the reality of (agile) software development, while PRDs and DDs are more aligned with hierarchical organizations in which development is driven by management decisions. As a consequence, PRDs and DDs often don’t have much in common with the real software or hinder the development process, since programmers feel pressured not to deviate from them.
As people around the world come to understand how LLMs behave, more and more wonder why these models hallucinate and what can be done to reduce it. This provocatively named article by Michael Townsen Hicks, James Humphries and Joe Slater is an excellent primer for better understanding how LLMs work and what to expect from them.
As humans carrying out our relations using our language as the main tool, we are easily in awe of the apparent ease with which ChatGPT (the first widely available, and to this day probably the best known, LLM-based automated chatbot) simulates human-like understanding and how it helps us to easily carry out even daunting data aggregation tasks. It is common that people ask
ChatGPT for an answer and, if it gets part of the answer wrong, they justify it
by stating that it’s just a hallucination. Townsen et al. invite us to switch
from that characterization to a more correct one: LLMs are bullshitting. This
term is formally presented by Frankfurt [1]. To bullshit is not the same as to lie, because lying requires knowing (and wanting to cover up) the truth. A bullshitter does not necessarily know the truth; they just have to provide a compelling description, regardless of whether it is actually aligned with the truth.
After introducing Frankfurt’s ideas, the authors explain the fundamental ideas behind LLM-based chatbots such as ChatGPT: a Generative Pre-trained Transformer (GPT) has as its only goal the production of human-like text, which it achieves mainly by mapping the input to a high-dimensional abstract vector representation and probabilistically emitting the next token (word), iterating over the text produced so far. Clearly, a GPT’s task is not to seek truth or to convey useful information: it is built to provide a normal-seeming response to the prompts provided by its user. No core data is queried to find optimal solutions for the user’s requests; text is generated on the requested topic, attempting to mimic the style of the document set it was trained with.
Erroneous data emitted by an LLM is thus not comparable to what a person might hallucinate, but appears because the model has no understanding of truth; in a way, this is very fitting with the current state of the world, a time often termed the age of post-truth [2]. Requesting an LLM to provide truth in its answers is basically impossible, given the difference between intelligence and consciousness: following Harari’s definitions [3], LLM systems, or any AI-based system, can be seen as intelligent, as they have the ability to attain goals in various, flexible ways, but they cannot be seen as conscious, as they have no ability to experience subjectivity. That is, the LLM is, by definition, bullshitting its way towards an answer: its goal is to provide an answer, not to interpret the world in a trustworthy way.
The authors close their article with a plea for literature on the topic to adopt
the more correct “bullshit” term instead of the vacuous, anthropomorphizing
“hallucination”. Of course, with the word already carrying a negative meaning, it is an unlikely request.
This is a great article that mixes together Computer Science and Philosophy, and
can shed some light on a topic that is hard to grasp for many users.
[1] Frankfurt, Harry (2005). On Bullshit. Princeton University Press.
[2] Zoglauer, Thomas (2023). Constructed truths: truth and knowledge in a
post-truth world. Springer.
[3] Harari, Yuval Noah (2023). Nexus: A Brief History of Information Networks
From the Stone Age to AI. Random House.
Thrilled to announce a new package: zigg. It arrived
on CRAN today after a few days
of review in the ‘newbies’ queue. zigg provides
the Ziggurat
pseudo-random number generator for Normal, Exponential and Uniform draws
proposed by Marsaglia and
Tsang (JSS, 2000),
and extended by Leong et al. (JSS, 2005).
I had picked up their work in package RcppZiggurat
and updated its code for the 64-bit world we now live in. That package already provided the Normal generator along with several competing implementations which it compared and timed rigorously. As one of
the generators was based on the GNU GSL via the
implementation of Voss, we always ended
up with a run-time dependency on the GSL too. No more: this new package
is zero-dependency, zero-suggests and hence very easy to deploy.
Moreover, we also include a demonstration of four distinct ways of
accessing the compiled code from another R package: pure and straight-up
C, similarly pure C++, inclusion of the header in C++ as well as via Rcpp.
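For the straight-up C route, the usual mechanism for one R package to call another package's compiled code is R's registered-routine interface. The sketch below is only an illustration: the symbol name "zigg_norm" and its signature are assumptions, and the actual registered names are documented in the package's installed header.

    /* Hypothetical sketch: fetch a C routine registered by zigg and call it.
     * The name "zigg_norm" and its signature are assumed for illustration. */
    #include <R_ext/Rdynload.h>

    static double (*p_zigg_norm)(void) = NULL;

    double my_normal_draw(void) {
        if (p_zigg_norm == NULL)
            p_zigg_norm = (double (*)(void)) R_GetCCallable("zigg", "zigg_norm");
        return p_zigg_norm();
    }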
The other advance is the resurrection of the second generator for the
Exponential distribution. And following Burkardt we expose the
Uniform too. The main upside of these generators is their excellent
speed, as can be seen in the comparison against the default R generators
generated by the example script timings.R:
Needless to say, speed is not everything. This PRNG comes from the time of
32-bit computing so the generator period is likely to be shorter than
that of newer high-quality generators. If in doubt, forgo speed and
stick with the high-quality default generators.
There are two major internationalization APIs in the C library:
locales and iconv. Iconv is an isolated component which only performs
charset conversion in ways that don't interact with anything else in
the library. Locales affect pretty much every API that deals with
strings and covers charset conversion along with a huge range of
localized information from character classification to formatting of
time, money, people's names, addresses and even standard paper sizes.
Picolibc inherits its implementation of both of these from
newlib. Given that embedded applications rarely need advanced
functionality from either of these APIs, I hadn't spent much time
exploring this space.
Newlib locale code
When run on Cygwin, Newlib's locale support is quite complete as it
leverages the underlying Windows locale support. Without Windows
support, everything aside from charset conversion and character
classification data is stubbed out at the bottom of the stack. Because
the implementation can support full locale functionality, it is designed for that, with large data structures and lots of code.
Charset conversion and character classification data for locales is
all built-in; none of that can be loaded at runtime. There is support
for all of the ISO-8859 charsets, three JIS variants, a bunch of
Windows code pages and a few other single-byte encodings.
One oddity in this code is that when using a JIS locale, wide
characters are stored in EUC-JP rather than Unicode. Every other
locale uses Unicode. This means APIs like wctype are implemented by
mapping the JIS-encoded character to Unicode and then using the
underlying Unicode character classification tables. One consequence of
this is that there isn't any Unicode to JIS mapping provided as it
isn't necessary.
When testing the charset conversion and Unicode character
classification data, I found numerous minor errors and a couple of
pretty significant ones. The JIS conversion code had the most serious
issue I found; most of the conversions are in a 2d array which is
manually indexed with the wrong value for the length of each row. This
led to nearly every translated value being incorrect.
The charset conversion tables and Unicode classification data are now
generated using python charset support and the standard Unicode data
files. In addition, tests have been added which compare Picolibc to
the system C library for every supported charset.
Newlib iconv code
The iconv charset support is completely separate from the locale
charset support with a much wider range of supported targets. It also
supports loading charset data from files at runtime, which reduces the
size of application images.
Because the iconv and locale implementations are completely separate,
the charset support isn't the same. Iconv supports a lot more
charsets, but it doesn't support all of those available to
locales. For example, Iconv has Big5 support which locale
lacks. Conversely, locale has Shift-JIS support which iconv does not.
There's also a difference in how charset names are mapped in the two
APIs. The locale code has a small fixed set of aliases, which doesn't
include things like US-ASCII or ANSI X3.4. In contrast, the iconv
code has an extensive database of charset aliases which are compiled
into the library.
Picolibc has a few tests for the iconv API which verify charset names
and perform some translations. Without an external reference, it's
hard to know if the results are correct.
POSIX vs C internationalization
In addition to including the iconv API, POSIX extends locale support
in a couple of ways:
Exposing locale objects via the newlocale, uselocale, duplocale
and freelocale APIs.
uselocale sets a per-thread locale, rather than the process-wide locale (a short sketch of these calls follows).
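Here is that sketch, using only the POSIX APIs named above (whether a locale name such as "C.UTF-8" is available depends on the C library and its build configuration):

    #include <locale.h>

    int main(void) {
        /* Build a locale object instead of changing the global locale. */
        locale_t utf8 = newlocale(LC_ALL_MASK, "C.UTF-8", (locale_t) 0);
        if (utf8 == (locale_t) 0)
            return 1;

        locale_t previous = uselocale(utf8);  /* per-thread, unlike setlocale() */
        /* ... locale-sensitive work on this thread ... */
        uselocale(previous);                  /* restore the previous thread locale */

        locale_t copy = duplocale(utf8);      /* independent copy of the object */
        freelocale(copy);
        freelocale(utf8);
        return 0;
    }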
Goals for Picolibc internationalization support
For charsets, supporting UTF-8 should cover the bulk of embedded
application needs, and even that is probably more than what most
applications require. Most (all?) compilers use Unicode for wide
character and string constants. That means wchar_t needs to be Unicode
in every locale.
Aside from charset support, the rest of the locale infrastructure is
heavily focused on creating human-consumable strings. I don't think
it's a stretch to say that none of this is very useful these days,
even for systems with sophisticated user interactions. For picolibc,
the cost to provide any of this would be high.
Having two completely separate charset conversion datasets makes
for a confusing and error-prone experience for developers. Replacing
iconv with code that leverages the existing locale support for
translating between multi-byte and wide-character representations will
save a bunch of source code and improve consistency.
Embedded systems can be very sensitive to memory usage, both read-only
and read-write. Applications not using internationalization
capabilities shouldn't pay a heavy premium even when the library
binary is built with support. For the most sensitive targets, the
library should be configurable to remove unnecessary functionality.
Picolibc needs to conform to at least the C language standard,
and as much of POSIX as makes sense. Fortunately, the requirements for
C are modest as it only includes a few locale-related APIs and doesn't
include iconv.
Finally, picolibc should test these APIs to make sure they conform
with relevant standards, especially character set translation and
character classification. The easiest way to do this is to reference
another implementation of the same API and compare results.
Switching to Unicode for JIS wchar_t
This involved ripping the JIS to Unicode translations out of all of
the wide character APIs and inserting them into the translations
between multi-byte and wide-char representations. The missing Unicode
to JIS translation was kludged by iterating over all JIS code points
until a matching Unicode value was found. That's an obvious place for
a performance improvement, but at least it works.
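A sketch of that brute-force reverse lookup (the table and function names here are hypothetical, not picolibc's internal ones):

    #include <stdint.h>

    /* Existing forward mapping, assumed for illustration. */
    extern uint32_t jis_to_unicode(uint16_t jis);

    /* Walk every JIS code point until the forward mapping matches. */
    uint16_t unicode_to_jis(uint32_t ucs) {
        for (uint32_t jis = 0; jis <= 0xffff; jis++) {
            if (jis_to_unicode((uint16_t) jis) == ucs)
                return (uint16_t) jis;
        }
        return 0;  /* no mapping found */
    }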
Tiny locale
This is a minimal implementation of locales which conforms with the C
language standard while providing only charset translation and
character classification data. It handles all of the existing
charsets, but splits things into three levels:
ASCII
UTF-8
Extended, including any or all of:
a. ISO 8859
b. Windows code pages and other 8-bit encodings
c. JIS (JIS, EUC-JP and Shift-JIS)
When built for ASCII-only, all of the locale support is
short-circuited, except for error checking. In addition, support in
printf and scanf for wide characters is removed by default (it can be
re-enabled with the -Dio-wchar=true meson option). This offers the
smallest code size. Because the wctype APIs (e.g. iswupper) are all locale-specific,
this mode restricts them to ASCII-only, which means they become
wrappers on top of the ctype APIs with added range checking.
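The ASCII-only wctype behaviour amounts to something like the following (hypothetical function name, not picolibc's actual source):

    #include <ctype.h>
    #include <wctype.h>

    /* In ASCII-only mode a wide classification function reduces to a
     * range check plus the narrow ctype equivalent. */
    int ascii_iswupper(wint_t wc) {
        return wc < 0x80 && isupper((int) wc);
    }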
When built for UTF-8, character classification for wide characters
uses tables that provide the full Unicode range. Setlocale now selects
between two locales, "C" and "C.UTF-8". Any locale name other than "C"
selects the UTF-8 version. If the locale name contains "." or "-",
then the rest of the locale name is taken to be a charset name and
matched against the list of supported charsets. In this mode, only
"us_ascii", "ascii" and "utf-8" are recognized.
Because a single byte of a utf-8 string with the high-bit set is not a
complete character, all of the ctype APIs in this mode can use the
same implementation as the ASCII-only mode. This means the small ctype
implementation is available.
Calling setlocale(LC_ALL, "C.UTF-8") will allow the application to use
the APIs which translate between multi-byte and wide-characters to
deal with UTF-8 encoded strings. In addition, scanf and printf can
read and write UTF-8 strings into wchar_t strings.
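A small, self-contained example of that multi-byte to wide-character path using only standard C calls (assuming the "C.UTF-8" locale is available in the build):

    #include <locale.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <wchar.h>

    int main(void) {
        setlocale(LC_ALL, "C.UTF-8");

        const char *utf8 = "caf\xc3\xa9";   /* "café" encoded as UTF-8 */
        mbstate_t state;
        memset(&state, 0, sizeof state);

        const char *p = utf8;
        wchar_t wc;
        size_t len;
        /* Decode one character at a time until the terminating NUL (0),
         * an invalid sequence ((size_t) -1) or an incomplete one ((size_t) -2). */
        while ((len = mbrtowc(&wc, p, MB_CUR_MAX, &state)) != 0 &&
               len != (size_t) -1 && len != (size_t) -2) {
            printf("U+%04lX\n", (unsigned long) wc);
            p += len;
        }
        return 0;
    }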
Locale names are converted into locale IDs, an enumeration which lists
the available locales. Each ID implies a specific charset as that's
the only thing which differs between them. This means a locale can be
encoded in a few bytes rather than an array of strings.
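In other words, something with roughly this shape (hypothetical names, only to illustrate the idea):

    /* Each supported locale is an enumerator that implies its charset,
     * so a locale fits in a byte or two rather than an array of strings. */
    enum locale_id {
        LOCALE_C,            /* ASCII */
        LOCALE_C_UTF_8,      /* UTF-8 */
        LOCALE_C_ISO_8859_1, /* ... one enumerator per supported charset ... */
    };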
In terms of memory usage, applications not using locales and not using
the wctype APIs should see only a small increase in code space. That's
due to the wchar_t support added to printf and scanf which need to
translate between multi-byte and wide-character representations. There
aren't any tables required as ASCII and UTF-8 are directly convertible
to Unicode. On ARM-v7m, the added code in printf and scanf adds up to
about 1kB and another 32 bytes of RAM is used.
The big difference when enabling extended charset support is that all
of the charset conversion and character classification operations
become table driven and dependent on the locale. Depending on the
extended charsets supported, these can be quite large. With all of the
extended charsets included, this adds an additional 30kB of code and
static data and uses another 56 bytes of RAM.
There are two known gaps in functionality compared with the newlib
code:
Locale strings that encode different locales for different
categories. That's nominally required by POSIX as LC_ALL is
supposed to return a string sufficient to restore the locale, but
the only category which actually matters is LC_CTYPE.
No nl_langinfo support. This would be fairly easy to add,
returning appropriate constant values for each parameter.
Tiny locale was merged to picolibc main in this PR
Tiny iconv
Replacing the bulky newlib iconv code was far easier than swapping
locale implementations. Essentially all that iconv does is compute two
functions, one which maps from multi-byte to wide-char in one locale
and another which maps from wide-char to multi-byte in another locale.
Once the JIS locales were fixed to use Unicode, the new iconv
implementation was straightforward. POSIX doesn't provide any _l
version of mbrtowc or wcrtomb, so using standard C APIs would have
been clunky. Instead, the implementation uses the internal
APIs to compute the correct charset conversion functions. The entire
implementation fits in under 200 lines of code.
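Conceptually, the conversion loop looks something like the sketch below; the function-pointer types and names are hypothetical, not picolibc's internal API:

    #include <limits.h>
    #include <stddef.h>
    #include <string.h>
    #include <wchar.h>

    /* One decoder for the source charset, one encoder for the destination. */
    typedef size_t (*mb_to_wc_fn)(wchar_t *wc, const char *src, size_t n);
    typedef size_t (*wc_to_mb_fn)(char *dst, wchar_t wc);

    size_t convert(mb_to_wc_fn decode, wc_to_mb_fn encode,
                   const char *src, size_t srclen,
                   char *dst, size_t dstlen) {
        size_t written = 0;
        while (srclen > 0) {
            wchar_t wc;
            size_t in = decode(&wc, src, srclen);      /* one character in */
            if (in == 0 || in == (size_t) -1)
                break;                                 /* end of input or error */
            char buf[MB_LEN_MAX];
            size_t out = encode(buf, wc);              /* same character out */
            if (out == (size_t) -1 || written + out > dstlen)
                break;
            memcpy(dst + written, buf, out);
            written += out;
            src += in;
            srclen -= in;
        }
        return written;
    }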
Right now, both of these new bits of code sit in the source tree
parallel to the old versions. I'm not seeing any particular reason to
keep the old versions around; they have provided a useful point of
comparison in developing the new code, but I don't think they offer
any compelling benefits going forward.
The Sky Road is the fourth book in the Fall Revolution series, but
it represents an alternate future that diverges after (or during?) the
events of The Star Fraction. You probably
want to read that book first, but I'm not sure reading
The Stone Canal or
The Cassini Division adds anything to
this book other than frustration. Much more on that in a moment.
Clovis colha Gree is an aspiring doctoral student in history with a summer
job as a welder. He works on the platform for the project, which the
reader either slowly discovers from the book or quickly discovers from the
cover is a rocket to get to orbit. As the story opens, he meets (or, as he describes it, is targeted by) a woman named Merrial, a tinker who works on
the guidance system. The early chapters provide only a few hints about
Clovis's world: a statue of the Deliverer on a horse that forms the
backdrop of their meeting, the casual carrying of weapons, hints that
tinkers are socially unacceptable, and some division between the white
logic and the black logic in programming.
Also, because this is a Ken MacLeod novel, everyone is obsessed with
smoking and tobacco the way that the protagonists of erotica are obsessed
with sex.
Clovis's story is one thread of this novel. The other, told in the
alternating chapters, is the story of Myra Godwin-Davidova, chair of the
governing Council of People's Commissars of the International Scientific
and Technical Workers' Republic, a micronation embedded in post-Soviet
Kazakhstan. Series readers will remember Myra's former lover, David Reid,
as the villain of The Stone Canal and the head of the corporation
Mutual Protection, which is using slave labor (sort of) to support a
resurgent space movement and its attempt to take control of a balkanized
Earth. The ISTWR is in decline and a minor power by all standards except
one: They still have nuclear weapons.
So, first, we need to talk about the series divergence.
I know from reading about this book on-line that The Sky Road is an
alternate future that does not follow the events of The Stone Canal
and The Cassini Division. I do not know this from the text of the
book, which is completely silent about even being part of a series.
More annoyingly, while the divergence in the Earth's future compared to
The Cassini Division is obvious, I don't know what the
Jonbar
hinge is. Everything I can find on-line about this book is maddeningly
coy. Wikipedia claims the divergence happens at the end of The Star Fraction. Other reviews and the Wikipedia talk page claim it happens in
the middle of The Stone Canal. I do have a guess, but it's an
unsatisfying one and I'm not sure how to test its correctness. I suppose I
shouldn't care and instead take each of the books on their own terms, but
this is the type of thing that my brain obsesses over, and I find it
intensely irritating that MacLeod didn't explain it in the books
themselves. It's the sort of authorial trick that makes me feel dumb, and
books that gratuitously make me feel dumb are less enjoyable to read.
The second annoyance I have with this book is also only partly its fault.
This series, and this book in particular, is frequently mentioned as good
political science fiction that explores different ways of structuring
human society. This was true of some of the earlier books in a
surprisingly superficial way. Here, I would call it hogwash.
This book, or at least the Myra portion of it, is full of people doing
politics in a tactical sense, but like the previous books of this series,
that politics is mostly embedded in personal grudges and prior romantic
relationships. Everyone involved is essentially an authoritarian whose
ability to act as they wish is only contested by other authoritarians and
is largely unconstrained by such things as persuasion, discussions,
elections, or even theory. Myra and most of the people she meets are
profoundly cynical and almost contemptuous of any true discussion of
political systems. These are the trappings and mechanisms of politics
without the intellectual debate or attempt at consensus, turning it into a
zero-sum game won by whoever can threaten the others more effectively.
Given the glowing reviews I've seen in relatively political SF circles,
presumably I am missing something that other people see in MacLeod's
approach. Perhaps this level of pettiness and cynicism is an accurate
depiction of what it's like inside left-wing political movements. (What an
appalling condemnation of left-wing political movements, if so.) But many
of the on-line reviews lead me to instead conclude that people's
understanding of "political fiction" is stunted and superficial. For
example, there is almost nothing Marxist about this book — it contains
essentially no economic or class analysis whatsoever — but MacLeod uses a
lot of Marxist terminology and sets half the book in an explicitly
communist state, and this seems to be enough for large portions of the
on-line commentariat to conclude that it's full of dangerous, radical
ideas. I find this sadly hilarious given that MacLeod's societies tend, if
anything, towards a low-grade libertarianism that would be at home in a
Robert Heinlein novel. Apparently political labels are all that's needed
to make political fiction; substance is optional.
So much for the politics. What's left in Clovis's sections is a classic
science fiction adventure in which the protagonist has a radically
different perspective from the reader and the fun lies in figuring out the
world-building through the skewed perspective of the characters. This was
somewhat enjoyable, but would have been more fun if Clovis had any
discernible personality. Sadly he instead seems to be an empty receptacle
for the prejudices and perspective of his society, which involve a lot of
quasi-religious taboos and an essentially magical view of the world.
Merrial is a more interesting character, although as always in this series
the romance made absolutely no sense to me and seemed to be conjured by
authorial fiat and weirdly instant sexual attraction.
Myra's portion of the story was the part I cared more about and was more
invested in, aided by the fact that she's attempting to do something more
interesting than launch a crewed space vehicle for no obvious reason. She
at least faces some true moral challenges with no obviously correct
response. It's all a bit depressing, though, and I found Myra's
unwillingness to ground her decisions in a more comprehensive moral
framework disappointing. If you're going to make a protagonist the ruler
of a communist state, even an ironic one, I'd like to hear some real
political philosophy, some theory of sociology and economics that she used
to justify her decisions. The bits that rise above personal animosity and
vibes were, I think, said better in The Cassini Division.
This series was disappointing, and I can't say I'm glad to have read it.
There is some small pleasure in finishing a set of award-winning genre
books so that I can have a meaningful conversation about them, but the
awards failed to find me better books to read than I would have found on
my own. These aren't bad books, but the amount of enjoyment I got out of
them didn't feel worth the frustration. Not recommended, I'm afraid.
Moose Madness is a sapphic shifter romance novella (on the short
side for a novella) by the same author as Wolf Country. It was originally published in the anthology
Her Wild Soulmate, which appears to be very out of print.
Maggie (she hates the nickname Moose) grew up in Moose Point, a tiny
fictional highway town in (I think) Alaska. (There is, unsurprisingly, an
actual Moose Point in Alaska, but it's a geographic feature and not a
small town.) She stayed after graduation and is now a waitress in the
Moose Point Pub. She's also a shifter; specifically, she is a moose
shifter like her mother, the town mayor. (Her father is a fox shifter.) As
the story opens, the annual Moose Madness festival is about to turn the
entire town into a blizzard of moose kitsch.
Fiona Barton was Maggie's nemesis in high school. She was the cool,
popular girl, a red-headed wolf shifter whose friend group teased and
bullied awkward and uncoordinated Maggie mercilessly. She was also
Maggie's impossible crush, although the very idea seemed laughable. Fi
left town after graduation, and Maggie hadn't thought about her for years.
Then she walks into Moose Point Pub dressed in biker leathers, with
piercings and one side of her head shaved, back in town for a wedding in
her pack.
Much to the shock of both Maggie and Fi, they realize that they're
soulmates as soon as their eyes meet. Now what?
If you thought I wasn't going to read the moose and wolf shifter romance
once I knew it existed, you do not know me very well. I have been saving
it for when I needed something light and fun. It seemed like the right
palate cleanser after a very
disappointing book.
Moose Madness takes place in the same universe as Wolf
Country, which means there are secret shifters all over Alaska (and
presumably elsewhere) and they have the strong magical version of love at
first sight. If one is a shifter, one knows immediately as soon as one
locks eyes with one's soulmate and this feeling is never wrong. This is
not my favorite romance trope, but if I get moose shifter romance out of
it, I'll endure.
As you can tell from the setup, this is enemies-to-lovers, but the whole
soulmate thing shortcuts the enemies to lovers transition rather abruptly.
There's a bit of apologizing and air-clearing at the start, but most of
the novella covers the period right after enemies have become lovers and
are getting to know each other properly. If you like that part of the arc,
you will probably enjoy this, but be warned that it's slight and somewhat
obvious. There's a bit of tension from protective parents and annoying
pack mates, but it's sorted out quickly and easily. If you want the
characters to work for the relationship, this is not the novella for you.
It's essentially all vibes.
I liked the vibes, though! Maggie is easy to like, and Fi does a solid job
apologizing. I wish there was quite a bit more moose than we get, but
Delaney captures the combination of apparent awkwardness and raw power of
a moose and has a good eye for how beautiful large herbivores can be. This
is not the sort of book that gives a moment's thought to wolves being
predators and moose being, in at least some sense, prey animals, so if you
are expecting that to be a plot point, you will be disappointed. As with
Wolf Country, Delaney elides most of the messier and more ethically
questionable aspects of sometimes being an animal.
This is a sweet, short novella about two well-meaning and fundamentally
nice people who are figuring out that middle school and high school are
shitty and sometimes horrible but don't need to define the rest of one's
life. It's very forgettable, but it made me smile, and it was indeed a
good palate cleanser.
If you are, like me, the sort of person who immediately thought "oh, I
have to read that" as soon as you saw the moose shifter romance, keep your
expectations low, but I don't think this will disappoint. If you are not
that sort of person, you can safely miss this one.
The House That Fought is the third and final book of the
self-published space fantasy trilogy starting with
The House That Walked Between
Worlds. I read it as part of the Uncertain Sanctuary omnibus,
which is reflected in the sidebar metadata.
At the end of the last book, one of Kira's random and vibe-based trust
decisions finally went awry. She has been betrayed! She's essentially
omnipotent, the betrayal does not hurt her in any way, and, if anything,
it helps the plot resolution, but she has to spend some time feeling bad
about it first. Eventually, though, the band of House residents return to
the problem of Earth's missing magic.
By Earth here, I mean our world, which technically isn't called Earth in
the confusing world-building of this series. Earth within this universe is
an archetypal world that is the origin world for humans, the two types
of dinosaurs, and Neanderthals. There are numerous worlds that have split
off from it, including Human, the one world where humans are dominant,
which is what we think of as Earth and what Kira calls Earth half the
time. And by worlds, I mean entire universes (I think?), because traveling
between "worlds" is dimensional travel, not space travel. But there is
also space travel?
The world building started out confusing and has degenerated over the
course of the series. Given that the plot, such as it is, revolves around
a world-building problem, this is not a good sign.
Worse, though, is that the writing has become unedited, repetitive
drivel. I liked the first book and enjoyed a few moments of the
second book, but this conclusion is just bad. This is the sort of book
that the maxim "show, don't tell" was intended to head off. The dull,
thudding description of the justification for every character emotion
leaves no room for subtlety or reader curiosity.
Evander was elf and I was human. We weren't the same. I had magic. He
had the magic I'd unconsciously locked into his augmentations. We were
different and in love. Speaking of our differences could be a trigger.
I peeked at him, worried. My customary confidence had taken a hit.
"We're different," he answered my unspoken question. "And we work
anyway. We'll work to make us work."
There is page after page after page of this sort of thing: facile
emotional processing full of cliches and therapy-speak, built on the most
superficial of relationships. There's apparently a romance now, which
happened with very little build-up, no real discussion or communication
between the characters, and only the most trite and obvious relationship
work.
There is a plot underneath all this, but it's hard to make it suspenseful
given that Kira is essentially omnipotent. Schwartz tries to turn the
story into a puzzle that requires Kira to figure out what's going on before
she can act, but this is undermined by the confusing world-building. The
loose ends the plot has accumulated over the previous two books are mostly
dropped, sometimes in a startlingly casual way. I thought Kira would care
who killed her parents, for example; apparently, I was wrong.
The previous books caught my attention with a more subtle treatment of
politics than I expect from this sort of light space fantasy. The
characters had, I thought, a healthy suspicion of powerful people and a
willingness to look for manipulation or ulterior motives. Unfortunately,
we discover here that this is not due to an appreciation of the complexity
of power and motive in governments. Instead, it's a reflexive bias against
authority and structured society that sounds like an Internet libertarian
complaining about taxes. Powerful people should be distrusted because all
governments are corrupt and bad and steal your money in order to waste it.
Oh, except for the cops and the military; they're generally good people
you should trust.
In retrospect, I should have expected this turn given the degree to which
Schwartz stressed the independence of sorcerers. I thought that was going
somewhere more interesting than sorcerers as self-appointed vigilantes who
are above the law and can and should do anything they damn well please.
Sadly, it was not.
Adding to the lynch mob feeling, the ending of this book is a deeply
distasteful bit of magical medieval punishment that I thought was vile,
and which is, of course, justified by bad things happening to children. No
societal problems were solved, but Kira got her petty revenge and got to
be gleeful and smug about it. This is apparently what passes for a happy
ending.
I don't even know what to say about the bizarre insertion of Christianity,
which makes little sense given the rest of the world-building. It's
primarily a way for Kira to avoid understanding or thinking about an
important part of the plot. As sadly seems to often be the case in books
like this, Kira's faith doesn't appear to prompt any moral analysis or
thoughtful ethical concern about her unlimited power, just certainty that
she's right and everyone else is wrong.
This was dire. It is one of those self-published books that I feel a
little bad about writing this negative of a review about, because I think
most of the problem was that the author's skill was not up to the story
that she wanted to tell. This happens a lot in self-published fiction,
particularly since Kindle Unlimited has started rewarding quantity over
quality. But given how badly the writing quality degraded over the course
of the series, and how offensive the ending was, I do want to warn other
people off of the series.
There is so much better fiction out there. Avoid this one, and probably
the rest of the series unless you're willing to stop after the first book.
Dark Matters is the fourth book in the science fiction semi-romance
Class 5 series. There are spoilers for all of the previous books, and
although enough is explained that you could make sense of the story
starting here, I wouldn't recommend it. As with the other books in the
series, it follows new protagonists, but the previous protagonists make an
appearance.
You will be unsurprised to hear that the Tecran kidnapped yet
another Earth woman. The repetitiveness of the setup would be more
annoying if the book took itself too seriously, but it doesn't, and so I
mostly find it entertaining. I thought Diener was going to dodge the
obvious series structure, but now I am wondering if we're going to end up
with one woman per Class 5 ship after all.
Lucy is not on a ship, however, Tecran or otherwise. She is a captive in a
military research facility on the Tecran home world. The Tecran are in
very deep trouble given the events of the previous book and have decided
that Lucy's existence is a liability. Only the intervention of some
sympathetic Tecran scientists she partly befriended during her captivity
lets her escape the facility before it's destroyed. Now she's alone, on an
alien world, being hunted by the military.
It's not entirely the fault of this book that it didn't tell the story
that I wanted to read. The setup for Dark Matters implies this book
will see the arrival of consequences for the Tecran's blatant violations
of the Sentient Beings Agreement. I was looking forward to a more
political novel about how such consequences could be administered. This is
the sort of problem that we struggle with in our politics: Collective
punishment isn't acceptable, but there have to be consequences sufficient
to ensure that a state doesn't repeat the outlawed behavior, and yet
attempting to deliver those consequences feels like occupation and can set
off worse social ruptures and even atrocities. I wasn't expecting that
deep of political analysis of what is, after all, a lighthearted SF
adventure series, but Diener has been willing to touch on hard problems.
The ethics of violence has been an ongoing theme of the series.
Alas for me, this is not what we get. The arriving cavalry, in the form of
a Class 5 and the inevitable Grih hunk to serve as the love interest du
jour, quickly become more interested in helping Lucy elude pursuers (or
escape captors) than in the delicate political situation. The conflict
within the local population is a significant story element, but only as
backdrop. Instead, this reads like a thriller or an action movie, complete
with alien predators and a cinematic set piece finale.
The political conflict between the Tecran and the United Council does
reach a conclusion of sorts, but it's not that satisfying. Perhaps some of
the political fallout will happen in future books, but here Diener
simplifies the morality of the story in the climax and dodges out of the
tricky ethical and social challenge of how to punish a sovereign nation.
One of the things I like about this series is that it takes moral
indignation seriously, but now that Diener has raised the (correct)
complication that people have strong motivations to find excuses for the
actions of their own side, I hope she can find a believable political
resolution that isn't simple brute force.
This entry in the series wasn't bad, but it didn't grab me. Lucy was fine
as a protagonist; her ability to manipulate the Tecran into making
mistakes fits the longer time she's had to study them and keeps her
distinct from the other protagonists. But the small bit of politics we do
see is unsatisfying and conveniently simplistic, and this book mostly
degenerates into generic action sequences. Bane, the Class 5 ship featured
in this story, is great when he's active, and I continue to be entertained
by the obsession the Class 5 ships have with Earth women, but he's
sidelined for too much of the story. I felt like Diener focused on the
least interesting part of the story setup.
If you've read this far, there's nothing wrong with this entry. You'll
probably want to keep reading. But it felt like a missed opportunity.
Followed in publication order by Dark Ambitions, a novella that
returns to Rose to tell a side story. The next novel is Dark Class,
in which we'll presumably see the last kidnapped Earth woman.
I keep saying I'm "done" with my CP/M emulator, but then I keep overhauling it in significant ways. Today is no exception. In the past the emulator used breakpoints to detect when calls to the system BIOS, or BDOS, were made. That was possible because the BIOS and BDOS entry points are at predictable locations. For example a well-behaved program might make a system call with code like this:
LD A,42
LD C,4
CALL 0x0005
So setting a breakpoint on 0x0005 would let you detect a system-call was being made, inspect the registers to see which system-call was being made and then carry out the appropriate action in your emulator before returning control back to the program. Unfortunately some binaries patch the RAM, changing the contents of the entry points, or changing internal jump-tables, etc. The end result is that sometimes code running at the fixed addresses is not your BIOS at all, but something else. By trapping/faulting/catching execution here you break things, badly.
So today's new release fixes that! No more breakpoints. Instead we deploy a "real BDOS" in RAM that will route system-calls to our host emulator via a clever trick. For BDOS functions the C-register contains the number of the system-call to invoke, so our complete BDOS implementation is:
OUT (C),C    ; write the BDOS function number (held in C) to the I/O port numbered C
RET          ; return to the calling program
The host program can catch writes to output ports, and will know that "OUT (3), 3" means "Invoke system call #3", for example. This means binary patches to entry-points, or any internal jump-tables won't confuse things and so long as control eventually reaches my BIOS or BDOS code areas things will work.
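For illustration only, here is a rough sketch of what the host side of that trick can look like (this is not the emulator's actual code; the CPU class and handler registration are hypothetical stand-ins for a real Z80 core):
# Hedged sketch of the host side of the OUT trick.
class CPU:
    """Minimal stand-in for the Z80 register state we need."""
    def __init__(self):
        self.e = 0                      # E register: argument for many BDOS calls

def bdos_console_output(cpu):
    # BDOS function 2 (C_WRITE): print the character held in E.
    print(chr(cpu.e), end="")

BDOS_HANDLERS = {
    2: bdos_console_output,
    # ... one entry per BDOS function the emulator implements ...
}

def out_handler(cpu, port, value):
    """Called by the Z80 core for every OUT (n),r the guest executes."""
    handler = BDOS_HANDLERS.get(value)  # value == C == BDOS function number
    if handler:
        handler(cpu)

# Example: the guest ran the stub with C=2 and E=ord('*'), i.e. "OUT (2),C".
cpu = CPU()
cpu.e = ord("*")
out_handler(cpu, 2, 2)                  # prints "*"
The only thing that matters is that the function number reaches the host intact; where the guest jumped from, or what it patched on the way, no longer matters.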
I also added a new console-input driver which just reads input from a file (I have a notion of pluggable input and output devices). Now I can prove that my code works. Pass the following file to the input-driver and we have automated testing:
A:
ERA HELLO.COM
ERA HELLO.HEX
ERA HELLO.PRN
hello
ASM HELLO
LOAD HELLO
DDT HELLO.com
t
t
t
t
t
t
t
t
t
C-c
EXIT
Here we:
Erase "HELLO.COM", "HELLO.HEX", "HELLO.PRN"
Invoke "hello[.com]" (which will fail, as we've just deleted it).
Then we assemble "HELLO.ASM" to "HELLO.HEX", then to "HELLO.COM".
Invoke DDT, the system debugger, and tell it to trace execution a bunch of times.
Finally we exit the debugger with "Ctrl-C"
And exit the emulator with "exit"
I can test the output and confirm there are no regressions. Neat.
We are pleased to announce that
Infomaniak has committed to sponsor
DebConf25 as a Platinum Sponsor.
Infomaniak is Switzerland’s leading developer of Web technologies. With
operations all over Europe and based exclusively in Switzerland, the company
designs and manages its own data centers powered by 100% renewable energy,
and develops all its solutions locally, without outsourcing. With millions of
users and the trust of public and private organizations across Europe - such
as RTBF, the United Nations, central banks, over 3,000 radio and TV stations,
as well as numerous cities and security bodies - Infomaniak stands for
sovereign, sustainable and independent digital technology. The company offers
a complete suite of collaborative tools, cloud hosting, streaming, marketing
and events solutions, while being owned by its employees and self-financed
exclusively by its customers.
With this commitment as Platinum Sponsor, Infomaniak is contributing to
the Debian annual Developers' conference, directly supporting the
progress of Debian and Free Software. Infomaniak helps strengthen the
community that collaborates on Debian projects from all around the world
throughout the year.
Thank you very much, Infomaniak, for your support of DebConf25!
Become a sponsor too!
DebConf25 will take place from
July 14th to 20th 2025 in Brest, France, and will be preceded by DebCamp,
from 7th to 13th July 2025.
I watch a lot of films. Since “completing” the IMDB Top 250 back in 2016 I’ve kept an eye on it, and while I don’t go out of my way to watch the films that newly appear in it I generally sit at over 240 watched. I should note I don’t consider myself a film buff/critic, however. I watch things for enjoyment, and a lot of the time that’s kicking back and relaxing and disengaging my brain. So I don’t get into writing reviews, just high level lists of things I’ve watched, sometimes with a few comments.
With that in mind, let’s talk about Christmas movies. Yes, I appreciate it’s the end of January, but generally during December we watch things that have some sort of Christmas theme. New ones if we can find them, but also some of what we consider “classics”. This almost always starts with Scrooged after we’ve put up the tree. I don’t always like Bill Murray (I couldn’t watch The Life Aquatic with Steve Zissou and I think Lost in Translation is overrated), but he’s in a bunch of things I really like, and Scrooged is one of those.
I don’t care where you sit on whether Die Hard is a Christmas movie or not, it’s a good movie and therefore regularly gets a December watch. Die Hard 2 also fits into that category of “sequel at least as good as the original”, though Helen doesn’t agree. We watched it anyway, and I finally made the connection between the William Sadler hotel scene and Michael Rooker’s in Mallrats.
It turns out I’m a Richard Curtis fan. Love Actually has not aged well; most times I watch it I find something new questionable about it, and I always end up hating Alan Rickman for cheating on Emma Thompson, but I do like watching it. He had a new one, That Christmas, out this year, so we watched it as well.
Another new-to-us film this year was Spirited. I generally like Ryan Reynolds, and Will Ferrell is good as long as he’s not too overboard, so I had high hopes. I enjoyed it, but for some reason not as much as I’d expected, and I doubt it’s getting added to the regular watch list.
Larry doesn’t generally like watching full length films, but he (and we) enjoyed The Grinch, which I actually hadn’t seen before. He’s not as fussed on The Muppet Christmas Carol, but we watch it every year, generally on Christmas or Boxing Day. Favourite thing I saw on the Fediverse in December was “Do you know there’s a book of The Muppet Christmas Carol, and they don’t mention that there’s muppets in it once?”
There are various other light-hearted Christmas films we regularly watch. This year included The Holiday (I have so many issues with even just the practicalities of a short notice house swap) and Last Christmas (lots of George Michael music, what’s not to love? Also it was only on this watch through that we realised the lead character is the Mother of Dragons).
We started, but could not finish, Carry On. I saw it described somewhere as copaganda, and that feels accurate. It does not accurately reflect any of my interactions with TSA at airports, especially during busy periods.
Things we didn’t watch this year, but are regularly in the mix, include Fatman, Violent Night (looking forward to the sequel, hopefully this year), and Lethal Weapon. Klaus is kinda at the other end of the spectrum, but very touching, and we’ve watched it a couple of years now.
Given what we seem to like, any suggestions for other films to add? It’s nice to have enough in the mix that we get some variety every year.
Pretty much exactly a year ago, I posted about how I was trying out this
bcachefs thing, being cautiously optimistic (but reminding you to keep
backups). Now I'm going the other way; I've converted my last bcachefs
filesystem to XFS, and I don't intend to look at it again in the near
future.
What changed in the meantime? Well, the short version is: I no longer
trust bcachefs' future. Going into a new filesystem is invariably
filled with rough edges, and I totally accepted that (thus the backups).
But you do get a hope that things will get better, and for a filesystem
developed by a single person (Kent Overstreet), that means you'll need
to trust that person to a fairly large degree. Having both hung out in
#bcache and seen how this plays out
on LKML and against Debian, I don't
really have that trust anymore.
To be clear: I've seen my share of bugs. Whenever you see Kent defending
his filesystem, he usually emphasizes how he has a lot of happy users
and very few bugs left and things are going to be great after Just The
Next Refactor. (Having to call out this explicitly all the time is usually
a bad sign in itself.) But, well, I've had catastrophic data loss bugs that went unfixed
for weeks despite multiple people reporting them. I've seen strange read
performance issues. I've had oopses. And when you go and ask about why you get e.g. hang
messages in the kernel log, you get “oh, yeah, that's a known issue with
compression, we're not entirely sure what to do about it”.
There are more things: SSD promotion/demotion doesn't always work.
Erasure coding is known-experimental. Userspace fsck frequently hangs my
system during boot (I need to go into a debug console and kill mount,
and then the kernel mounts the filesystem). umount can take minutes sometimes.
The filesystem claims to support quotas, but there's no way to actually
make the Linux standard quota tools enable quotas on a multi-device
filesystem. And you'll generally need to spend 8% on spare space for
copygc, even if your filesystem consists entirely of static files.
You could argue that since I didn't report all of these bugs, I cannot
expect them to be fixed either. But here's the main issue for me: Reporting bugs to bcachefs
is not a pleasant experience. You hang around in #bcache on IRC, and perhaps
Kent is awake, perhaps he's not, perhaps things get fixed or perhaps other
things take priority. But you can expect to get flamed about running Debian,
or perhaps more accurately, not being willing to spearhead Kent's crusade
against Debian's Rust packaging policies. (No, you cannot stay neutral.
No, you cannot just want to get your filesystem fixed. You are expected
to actively go and fight the Rust team on Kent's behalf.) Kent has made
it clear that for distributions to carry bcachefs-tools (which you need
to, among other things, mount filesystems), it's his way
or the highway; ship exactly what he wants in the way that he wants it,
or just go away. (Similarly, it's the “kernel's culture” and “an mm
maintainer” that are the problems; it's not like Kent ought to change
the way he wants to develop or anything.)
So I simply reverted back to something tried and trusted. It's a bit sad
to lose the compression benefits, but I can afford those 100 extra gigabytes
of disk space. And it was nice to have partial-SSD-partial-HDD filesystems
(it worked better than dm-cache for me), but it turns out 1TB SSDs are cheap
now and I could have my key filesystems (/home and /var/lib/postgres)
entirely on SSD instead.
Look, I'm not saying bcachefs sucks. Nor that it is doomed; perhaps Kent
will accept that he needs to work differently for it to thrive in the kernel
(and the Linux ecosystem as a whole), no matter how right he feels he is.
But with a filesystem this young, you inevitably have to accept some rough
edges in return for some fun. And at some point, the fun just stopped for me.
dsafilter is a mail filter I wrote two decades ago to solve a problem I had:
I was dutifully subscribed to
debian-security-announce
to learn of new security package updates, but most were not relevant to me.
The filter creates a new, summarizing mail, reporting on whether the DSA was
applicable to any package installed on the system running the filter, and
attaches the original DSA mail for reference. Users can then choose to drop
mails for packages that aren't relevant.
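For illustration only, the core idea can be sketched roughly like this (the real dsafilter is a Ruby script and differs in the details; treating the "Package :" header line as the package name is an assumption of this sketch):
# Hedged sketch of the dsafilter idea: is the package in this DSA installed here?
import re
import subprocess
import sys

def installed_packages() -> set[str]:
    out = subprocess.run(
        ["dpkg-query", "-W", "-f", "${Package}\n"],
        capture_output=True, text=True, check=True,
    ).stdout
    return set(out.split())

def package_from_dsa(mail_text: str) -> str | None:
    # DSA mails carry a "Package : <name>" line; parsing it this way is an
    # assumption, not the real script's logic.
    match = re.search(r"^Package\s*:\s*(\S+)", mail_text, re.MULTILINE)
    return match.group(1) if match else None

mail = sys.stdin.read()
pkg = package_from_dsa(mail)
if pkg is None:
    print("Could not find a package name in this mail")
elif pkg in installed_packages():
    print(f"DSA for {pkg}: installed here, worth reading")
else:
    print(f"DSA for {pkg}: not installed, probably safe to skip")
Feeding a DSA mail to something like this on standard input yields a one-line verdict that can drive the summary mail.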
In 2005 I'd been a Debian user for about six years, had met a few Debian
developers in person, and was interested in getting involved. I started my
journey to Developer later that same year. I published dsafilter, and I think
I sent an announcement to debian-devel, but didn't do a great deal to
socialise it, so I suspect nobody else is using it.
That said, I have been using it for those two decades, and I still am! What's notable
to me about that is that I haven't had to modify the script at all to keep
up with software changes, in particular, from the interpreter. I wrote it as
a Ruby script. If I had chosen Perl, it would probably be the same story, but
if I'd chosen Python, there's no chance at all that it would still be working
today.
If it sounds interesting to you, please give it a
try. I think it might be due some spring
cleaning.
Despite comments on my ikiwiki blog being fully moderated, spammers have
been increasingly posting link spam comments on my blog. While I used to use
the blogspam plugin, the
underlying service was likely retired circa
2017 and its public
repositories are all archived.
It turns out that there is a relatively simple way to drastically reduce the
amount of spam submitted to the moderation queue: ban the datacentre IP
addresses that spammers are using.
Looking up AS numbers
It all starts by looking at the IP address of a submitted comment. From
there, we can look it up using whois:
$ whois -r 2a0b:7140:1:1:5054:ff:fe66:85c5
% This is the RIPE Database query service.
% The objects are in RPSL format.
%
% The RIPE Database is subject to Terms and Conditions.
% See https://docs.db.ripe.net/terms-conditions.html
% Note: this output has been filtered.
% To receive output for a database update, use the "-B" flag.
% Information related to '2a0b:7140:1::/48'
% Abuse contact for '2a0b:7140:1::/48' is 'abuse@servinga.com'
inet6num: 2a0b:7140:1::/48
netname: EE-SERVINGA-2022083002
descr: servinga.com - Estonia
geoloc: 59.4424455 24.7442221
country: EE
org: ORG-SG262-RIPE
mnt-domains: HANNASKE-MNT
admin-c: CL8090-RIPE
tech-c: CL8090-RIPE
status: ASSIGNED
mnt-by: MNT-SERVINGA
created: 2020-02-18T11:12:49Z
last-modified: 2024-12-04T12:07:26Z
source: RIPE
% Information related to '2a0b:7140:1::/48AS207408'
route6: 2a0b:7140:1::/48
descr: servinga.com - Estonia
origin: AS207408
mnt-by: MNT-SERVINGA
created: 2020-02-18T11:18:11Z
last-modified: 2024-12-11T23:09:19Z
source: RIPE
% This query was served by the RIPE Database Query Service version 1.114 (SHETLAND)
The important bit in that output is the origin: AS207408 line, which refers
to Autonomous System 207408, operated by a hosting company called Servinga.
Autonomous Systems are essentially organizations to which IPv4 and IPv6
blocks have been allocated, and those allocations can be looked up easily,
either using a third-party service or a local database downloaded from IPtoASN.
Preventing comment submission
While I do want to eliminate this source of spam, I don't want to block
these datacentre IP addresses outright since legitimate users could be using
these servers as VPN endpoints or crawlers.
I therefore added the following to my Apache config to restrict the CGI
endpoint (used only for write operations such as commenting):
<Location /blog.cgi>
Include /etc/apache2/spammers.include
Options +ExecCGI
AddHandler cgi-script .cgi
</Location>
and then put the following in /etc/apache2/spammers.include:
<RequireAll>
Require all granted
# https://ipinfo.io/AS207408
Require not ip 46.11.183.0/24
Require not ip 80.77.25.0/24
Require not ip 194.76.227.0/24
Require not ip 2a0b:7140:1::/48
</RequireAll>
Finally, I can restart the website and commit my changes:
$ apache2ctl configtest && systemctl restart apache2.service
$ git commit -a -m "Ban all IP blocks from Servinga"
Future improvements
I will likely automate this process in the future, but at the moment my
blog can go for a week without a single spam message (down from dozens every
day). It's possible that I've already cut off the worst offenders.
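One possible way to automate the AS-to-prefix step is to query RIPEstat's public announced-prefixes endpoint and emit the corresponding Require not ip lines. This is a sketch only; the response shape is assumed from the RIPEstat documentation and may need adjusting:
# Hedged sketch: turn an AS number into "Require not ip" lines for Apache.
import json
import urllib.request

def announced_prefixes(asn: str) -> list[str]:
    url = f"https://stat.ripe.net/data/announced-prefixes/data.json?resource={asn}"
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    # The "data" -> "prefixes" -> "prefix" structure is an assumption based on
    # the RIPEstat API documentation.
    return [entry["prefix"] for entry in data["data"]["prefixes"]]

if __name__ == "__main__":
    print("<RequireAll>")
    print("    Require all granted")
    for prefix in announced_prefixes("AS207408"):
        print(f"    Require not ip {prefix}")
    print("</RequireAll>")
The output could then be dropped into /etc/apache2/spammers.include and the usual configtest/restart cycle run.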
While most podcasts are available on multiple platforms and either offer an
RSS feed or have one that can be
discovered, some are only available in
the form of a YouTube channel. Thankfully, it's possible to both monitor
them for new episodes (i.e. new videos), and time-shift the audio for
later offline listening.
When it comes to downloading the audio, the most reliable tool I have found is
yt-dlp. Since the exact arguments needed
to download just the audio as an MP3 are a bit of a mouthful, I wrote a wrapper
script
which also does a few extra things (a sketch of a similar wrapper appears after the list):
cleans up the filename so that it can be stored on any filesystem
adds ID3 tags so that MP3 players can have the metadata they need to
display and group related podcast episodes together
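A minimal sketch of a similar wrapper (not the actual script linked above) using yt-dlp's Python API could look like this; the channel URL and output template are placeholders:
# Hedged sketch: fetch a channel's videos as MP3s with safe filenames and tags.
import yt_dlp

CHANNEL_URL = "https://www.youtube.com/@example-podcast/videos"  # placeholder

options = {
    "format": "bestaudio/best",
    "restrictfilenames": True,                # filesystem-safe filenames
    "outtmpl": "%(uploader)s - %(title)s.%(ext)s",
    "download_archive": "downloaded.txt",     # skip already-fetched episodes
    "postprocessors": [
        {"key": "FFmpegExtractAudio",         # requires ffmpeg on the host
         "preferredcodec": "mp3",
         "preferredquality": "192"},
        {"key": "FFmpegMetadata"},            # write tags from the video metadata
    ],
}

with yt_dlp.YoutubeDL(options) as ydl:
    ydl.download([CHANNEL_URL])
Run from cron, the download archive keeps already-fetched episodes from being downloaded again, which is what makes the monitoring part work.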
This issue was quite puzzling, so I’m sharing how we investigated it. I hope it can be useful for you.
My client informed me that he was no longer able to install new instances of his application.
k9s showed that only some pods could not be created: the ones that needed a new persistent volume (PV). The description of these pods showed an HTTP 429 error: new PVCs could not be created because we were being throttled by the Azure storage API.
This was confirmed by the Azure diagnostics console for Kubernetes (menu “Diagnose and solve problems” → “Cluster and Control Plane Availability and Performance” → “Azure Resource Request Throttling”).
We had a lot of throttling, which was explained by the high call rate.
The first clue was found at the bottom of the Azure diagnostics page. According to this page, the throttled requests came from services whose user agent contains Azure-SDK-For-Go, which means the program making all these calls to the storage API is written in Go. All our services are written in TypeScript or Rust, so they were not suspects.
That left the controllers running in the kube-system namespace. I could not find anything suspicious in the logs of these services.
At that point I was convinced that a component of the Kubernetes control plane was making all those calls. Unfortunately, AKS is managed by Microsoft and I don’t have access to the control plane logs.
However, we realized that we had quite a lot of volumesnapshots created in our clusters by k8s-scheduled-volume-snapshotter:
about 1800 on dev instead of 240
1070 on preprod instead of 180
6800 on prod instead of 2400
We suspected that the Kubernetes reconciliation loop was being throttled while checking the status of all these snapshots. Maybe so, but we had the same issues and throttle rates on preprod and prod, where the numbers of snapshots were quite different.
We tried to get more information using Azure console on our snapshot account, but it was also broken by the throttling issue.
First, we removed all our applications: no change.
Then, all ancillary components like rabbitmq and cert-manager were removed: no change. 😶
Then, we tried to remove the namespace containing our applications. But we faced another issue: Kubernetes was unable to remove the namespace because it could not destroy some PVCs and volumesnapshots. That was actually good news, because it meant that we were close to the actual issue. 🤗
🪓 We managed to destroy the PVCs and volumesnapshots by removing their finalizers. Finalizers are markers that tell Kubernetes that something needs to be done before a resource can actually be deleted.
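A minimal sketch of one way to do that is shown below (the resource kind, name and namespace are made up for the example):
# Hedged sketch: strip finalizers from a stuck resource so Kubernetes can
# finish deleting it.  The kind/name/namespace below are hypothetical.
import json
import subprocess

def remove_finalizers(kind: str, name: str, namespace: str) -> None:
    patch = json.dumps({"metadata": {"finalizers": None}})
    subprocess.run(
        ["kubectl", "patch", kind, name, "-n", namespace,
         "--type=merge", "-p", patch],
        check=True,
    )

remove_finalizers("volumesnapshot", "daily-backup-20240101", "apps")
Removing finalizers bypasses whatever cleanup they were guarding, so it is a last resort for when the controller responsible for them is stuck or throttled.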
So we looked for a better fix to try on our preprod cluster.
Poking around in the PVCs and volumesnapshots, I finally found this error message in the description of a volumesnapshotcontent:
Code="ShareSnapshotCountExceeded" Message="The total number of snapshots
for the share is over the limit."
The number of snapshots found in our cluster was not that high, so I wanted to check the snapshots present in our storage account using the Azure console, which was still broken. ⚰️
Fortunately, the Azure CLI is able to retry HTTP calls when it gets 429 errors. I managed to get a list of snapshots with
az storage share list --account-name [redacted] --include-snapshots \
| tee preprod-list.json
There, I found a lot of snapshots dating back to 2024. These were no longer managed by Kubernetes and should have been cleaned up. That was our smoking gun.
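A quick way to quantify the problem from that listing is sketched below; the "name" and "snapshot" fields are assumptions about the az JSON output and may need adjusting:
# Hedged sketch: count share snapshots older than a cutoff in the output of
# "az storage share list --include-snapshots".
import json
from collections import Counter

CUTOFF = "2025-01-01"              # ISO timestamps compare correctly as strings

with open("preprod-list.json") as fh:
    shares = json.load(fh)

old = Counter()
for share in shares:
    stamp = share.get("snapshot")  # assumed: only present on snapshot entries
    if stamp and stamp < CUTOFF:
        old[share["name"]] += 1

for name, count in old.most_common():
    print(f"{name}: {count} snapshots older than {CUTOFF}")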
I guess that we had a chain of events like:
too many snapshots in some volumes
Kubernetes control plane tries to reconcile its internal status with Azure resources and frequently retries snapshot creation
API throttling kicks in
client not happy ☹️
To make things worse, k8s-scheduled-volume-snapshotter creates new snapshots when it cannot list the old ones. So we had 4 new snapshots per day instead of one. 🌊
Since we understood the chain of events, fixing the issue was not too difficult (but quite long 😵‍💫):
stop k8s-scheduled-volume-snapshotter by disabling its cron job
delete all volumesnapshots and volumesnapshotcontents from k8s.
since the Azure API was throttled, we also had to remove their finalizers
delete all snapshots from Azure using the az command and a Perl script (this step took several hours)
re-enable k8s-scheduled-volume-snapshotter
After these steps, preprod was back to normal. 🎯 I’m now applying the same recipe on prod. 💊
We still don’t know why we had all these stale snapshots. It may have been a human error or a bug in k8s-scheduled-volume-snapshotter.
Anyway, to avoid this problem in the future, we will:
set up an alert on the number of snapshots per volume
check with the k8s-scheduled-volume-snapshotter author how to better cope with throttling
My name is Dominique Dumont, I’m a devops freelance. You can find the devops and audit services I propose on my website or reach out to me on LinkedIn.
HMAC stands for Hash-Based Message Authentication Code. It’s a specific way to use a cryptographic hash function (like SHA-1, SHA-256, etc.) along with a secret key to produce a unique “fingerprint” of some data. This fingerprint allows someone else with the same key to verify that the data hasn’t been tampered with.
How HMAC Works
Keyed Hashing: The core idea is to incorporate the secret key into the hashing process. This is done in a specific way to prevent clever attacks that might try to bypass the security.
Inner and Outer Hashing: HMAC uses two rounds of hashing. First, the message and a modified version of the key are hashed together. Then, the result of that hash, along with another modified version of the key, are hashed again. This two-step process adds an extra layer of protection.
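As a small, generic illustration (unrelated to OpenSSH's internals), Python's standard library can compute and verify an HMAC:
import hashlib
import hmac

key = b"a-shared-secret-key"                # known only to sender and receiver
message = b"some data worth protecting"

# Sender: compute the authentication tag over the message.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# Receiver: recompute the tag and compare in constant time.
expected = hmac.new(key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, expected))   # True unless the data or key differ
Swapping hashlib.sha256 for hashlib.sha1 gives HMAC-SHA1; the construction is identical, only the underlying hash changes.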
HMAC in OpenSSH
OpenSSH uses HMAC to ensure the integrity of messages sent back and forth during an SSH session. This prevents an attacker from subtly modifying data in transit.
HMAC-SHA1 with OpenSSH: Is it Weak?
SHA-1 itself is considered cryptographically broken. This means that with enough computing power, it’s possible to find collisions (two different messages that produce the same hash). However, HMAC-SHA1 is generally still considered secure for most purposes. This is because exploiting weaknesses in SHA-1 to break HMAC-SHA1 is much more difficult than just finding collisions in SHA-1.
Should you use it?
While HMAC-SHA1 might still be okay for now, it’s best practice to move to stronger alternatives like HMAC-SHA256 or HMAC-SHA512. OpenSSH supports these, and they provide a greater margin of safety against future attacks.
In Summary
HMAC is a powerful tool for ensuring data integrity. Even though SHA-1 has weaknesses, HMAC-SHA1 in OpenSSH is likely still safe for most users. However, to be on the safe side and prepare for the future, switching to HMAC-SHA256 or HMAC-SHA512 is recommended.
Following are instructions for creating Dataproc clusters with SHA-1 MAC support removed.
I can appreciate an excess of caution, and I can offer you some code to produce Dataproc instances which do not allow HMAC authentication using SHA-1.
#!/bin/bash
# remove mac specification from sshd configuration
sed -i -e 's/^macs.*$//' /etc/ssh/sshd_config
# place a new mac specification at the end of the service configuration
ssh -Q mac | perl -e \
'@mac = grep { chomp; ! /sha1/ } <STDIN>; print("macs ", join(",", @mac), $/)' >> /etc/ssh/sshd_config
# reload the new ssh service configuration
systemctl reload ssh.service
If this code is hosted on GCS, you can refer to it with
A feature of systemd is the ability to reduce the access that daemons have to the system. The restrictions include access to certain directories, system calls, capabilities, and more. The systemd.exec(5) man page describes them all [1]. To see an overview of the security of daemons run “systemd-analyze security” and to get details of one particular daemon run a command like “systemd-analyze security mon.service”.
I created a Debian wiki page for a systemd-analyze security goal [2]. At this time release goals aren’t a serious thing for Debian so this won’t result in release critical bug reports, but it is still something we can aim for.
For a simple daemon (e.g. BIND, dhcpd, or syslogd) this isn't difficult to do. It might be difficult to understand the implications of some changes (especially when restricting system calls) but you can do some quick tests. The functionality of such programs has a limited scope and once you get it basically working it's done.
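As a generic starting point (a sketch only, not tuned for any particular daemon), a drop-in like the following turns on several of the restrictions that “systemd-analyze security” scores:
# /etc/systemd/system/example.service.d/hardening.conf (hypothetical unit name)
[Service]
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
CapabilityBoundingSet=
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
SystemCallFilter=@system-service
After a daemon-reload and restart, re-run “systemd-analyze security example.service” and loosen whatever the daemon actually turns out to need.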
For some daemons it's harder. Network-Manager is one of the well known slightly more difficult cases as it could do things like starting a VPN connection. The larger scope and the use of plugins make it difficult to test the combinations. The systemd restrictions apply to child processes too, unlike restrictions by SE Linux and AppArmor which permit a child process to run in a different security context.
The messages when a daemon fails due to systemd restrictions are usually unclear which makes things harder to setup and makes it more important to get it right.
My “mon” package (which I forked upstream as etbe-mon [3]) is one of the difficult daemons, as local tests can involve probing large parts of the system. But I have got that working reasonably well for most cases.
I have a bug report about running mon with Exim [4]. The problem with this is that Exim has a single process model which means that the process doing local delivery can be a child of the process that initially received the message. So the main mon process needs all the access for delivering mail (writing to /home etc). This also means that every other child of mon will get such access including programs that receive untrusted data from the Internet. Most of the extra access needed by Exim is not a problem, but /home access is a potential risk. It also means that more effort is needed when reviewing the access control.
The problem with this Exim design is that it applies to many daemons. Every daemon that sends email or that potentially could send email in some configuration needs extra access to be granted.
Can Exim be configured to have its “sendmail -T” type operation just write a file in a spool directory for another program to process? Do we need to grant permissions to most of the system just for Exim?
It’s been 5 years since I created auto-cpufreq. Today, it has over 6000 stars on GitHub, attracting 97 contributors, releasing 47 versions, and reaching what...
The new year starts with a FAI release. FAI 6.2.5 is available
and contains many small improvements. A new feature is that the
command fai-cd can now create ISOs for the ARM64 architecture.
The FAIme service uses the newest
FAI version and the most recent Debian point release, 12.9.
The FAI CD images were also updated.
The Debian packages of FAI 6.2.5 are available for Debian stable (aka
bookworm) via the FAI repository by adding this line to sources.list:
deb https://fai-project.org/download bookworm koeln
Using the tool extrepo, you can also add the FAI repository to your host:
# extrepo enable fai
FAI 6.2.5 will soon be available in Debian testing via the official
Debian mirrors.
The opening of the Royalmount shopping center nearly doubled the traffic
of the De La Savane station.
The Montreal subway continues to grow, but has not yet recovered from the
pandemic. Berri-UQAM station (the largest one) is still below 1 million
entries per quarter compared to its pre-pandemic record.
By clicking on a subway station, you'll be redirected to a graph of the
station's foot traffic.