04 Feb 2025

Planet Debian

Dominique Dumont: Azure API throttling strikes back

Hi

In my last blog post, I explained how we resolved a throttling issue involving the Azure storage API. In the end, I mentioned that I was not sure of the root cause of the throttling issue.

Even though we no longer had any problem in the dev and preprod clusters, we still faced a throttling issue in prod. The main difference between prod and the other environments is that we have about 80 PVs in prod versus 15 in the others. Given that we manage 1500 pods in prod, 80 PVs does not look like a lot. 🤨

To continue the investigation, I've modified k8s-scheduled-volume-snapshotter to limit the number of snapshots done in a single cron run (see add maxSnapshotCount parameter pull request).

In prod, we used the modified snapshotter to trigger snapshots one by one.

Even with all previous snapshots cleaned up, we could not trigger a single new snapshot without being throttled🕳. I guess that, in the cron job, just checking the list of PVs to snapshot was enough to exhaust our API quota. 😒

The Azure documentation mentions that a leaky bucket algorithm is used for throttling. A full bucket holds tokens for 250 API calls, and the bucket gets 25 new tokens per second. It looks like that is not enough.🐌
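To get a feel for these numbers, here is a minimal simulation of the bucket described above. Only the capacity (250) and the refill rate (25 per second) come from the post. The burst sizes and the simulated clock are assumptions made for illustration, and this is not Azure's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float = 250.0     # a full bucket allows 250 calls
    refill_rate: float = 25.0   # tokens added per second
    tokens: float = 250.0       # start with a full bucket
    now: float = 0.0            # simulated clock, in seconds

    def advance(self, seconds: float) -> None:
        """Move the simulated clock forward and refill, capped at capacity."""
        self.now += seconds
        self.tokens = min(self.capacity, self.tokens + seconds * self.refill_rate)

    def try_call(self) -> bool:
        """Spend one token if available; False means the call is throttled."""
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def burst(bucket: TokenBucket, calls: int) -> int:
    """Fire `calls` requests at the same instant; return how many are throttled."""
    return sum(0 if bucket.try_call() else 1 for _ in range(calls))


bucket = TokenBucket()
throttled_first = burst(bucket, 300)   # hypothetical burst of 300 calls
bucket.advance(2.0)                    # two seconds later
throttled_second = burst(bucket, 60)   # hypothetical follow-up burst
print(throttled_first, throttled_second)
```

In this sketch a burst larger than the bucket is throttled immediately, and a follow-up burst two seconds later only finds the 50 tokens that were refilled in the meantime.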

I was puzzled 😵‍💫 and out of ideas 😶.

I looked for similar problems in AKS issues on GitHub where I found this comment that recommends using the useDataPlaneAPI parameter in the CSI file driver. That was it! 😃

I was flabbergasted 🤯 by this parameter: why is the CSI file driver able to use 2 APIs? Why is one of them so limited? And more importantly, why is the limited API the default one?

Anyway, setting useDataPlaneAPI: "true" in our VolumeSnapshotClass manifest was the right solution. This indeed solved the throttling issue in our prod cluster. ⚕
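For illustration, such a manifest could look roughly like the structure below, built here as a Python dictionary and printed as JSON. Only the useDataPlaneAPI parameter comes from the post. The class name is hypothetical, and the driver name and deletion policy are assumptions that should be checked against the cluster's actual setup.

```python
import json

# Everything below except useDataPlaneAPI is an assumption for illustration.
snapshot_class = {
    "apiVersion": "snapshot.storage.k8s.io/v1",
    "kind": "VolumeSnapshotClass",
    "metadata": {"name": "azurefile-snapclass-example"},  # hypothetical name
    "driver": "file.csi.azure.com",   # assumed Azure File CSI driver name
    "deletionPolicy": "Delete",       # assumed, not stated in the post
    "parameters": {
        # class parameters are strings, hence "true" rather than True
        "useDataPlaneAPI": "true",
    },
}

manifest_json = json.dumps(snapshot_class, indent=2)
print(manifest_json)
```

Note that the value is the string "true", as written in the post, and not a boolean.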

But not the snapshot issue 😑. Amongst the 80 PVs, I still had 2 snapshots failing.🦗

Fortunately, the error was mentioned in the description of the failed snapshots: we had too many (200) snapshots for these shared volumes.

What ?? 😤 All these snapshots were cleaned up last week.

I then tried to delete these snapshots through the Azure console. But the console failed to delete these snapshots due to API throttling. Looks like the Azure console is not using the right API. 🤡

Anyway, I went back to the solution explained in my previous blog post and listed all snapshots with the az command. I indeed had a lot of snapshots, many of them dated Jan 19 and 20. There was often a new bogus snapshot created every minute.

These were created during the first attempt at fixing the throttling issue. I guess that even though the CSI file driver was throttled, a snapshot was still created in the storage account, but the CSI driver did not see it and retried a minute later💥. What a mess.

Anyway, I've cleaned up these bogus snapshots again 🧨, and now, snapshot creation is working fine 🤸🏻‍♂️.

For now.

All the best.

04 Feb 2025 1:23pm GMT

Paul Wise: FLOSS Activities January 2025

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Sponsors

All work was done on a volunteer basis.

04 Feb 2025 2:43am GMT

Valhalla's Things: Conference Talk Timeout Ring, Part One

Posted on February 4, 2025
Tags: madeof:atoms, madeof:bits

low quality video of a ring of rgb LED in front of a computer: the LED light up one at a time in colours that go from yellow to red.

Some time ago I may have accidentally bought a ring of 12 RGB LEDs; I soldered temporary leads on it, connected it to a CircuitPython-supported board and played around for a while.

Then we had a couple of friends come over to follow FOSDEM remotely together, and I had talked with one of them about WS2812 / NeoPixels, so I brought the LEDs to the living room, in case there was a chance to show them in sort-of-use.

Then I was dealing with playing the various streams as we moved from one room to the next, which led to me being called "video team", which led to me wearing a video team shirt (from an old DebConf, not FOSDEM, but still video team), which led to somebody asking me whether I also had the sheet with the countdown to the end of the talk, and the answer was sort-of-yes (I should have the ones we used to use for our Linux Day), but not handy.

But I had a thing with twelve things in a clock-like circle.

A bit of fiddling on the CircuitPython REPL resulted, if I remember correctly, in something like:

import board
import neopixel
import time

num_pixels = 12

pixels = neopixel.NeoPixel(board.GP0, num_pixels)
pixels.brightness = 0.1

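def end(minutes):
    """Light one LED at a time, from yellow to red, over the given minutes."""
    pixels.fill((0, 0, 0))
    for i in range(12):
        # red ramps up and green ramps down as the countdown advances
        pixels[i] = (127 + 10 * i, 8 * (12 - i), 0)
        # switch off the previous LED (for i == 0 this is LED 11, already off)
        pixels[i-1] = (0, 0, 0)
        time.sleep(minutes * 5)  # minutes * 60 / 12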

Now, I wasn't very consistent in running end, especially since I wasn't sure whether I wanted to run it at the beginning of the talk with the full duration or just in the last 5 - 10 minutes depending on the length of the slot, but I've had at least one person agree that the general idea has potential, so I'm taking these notes to be able to work on it in the future.
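To compare the two options without the hardware attached, the timing and the colour ramp can be computed in plain Python. This is only a sketch: the names led_color and schedule are made up here, and the colour formula is copied from the REPL snippet above.

```python
NUM_PIXELS = 12


def led_color(i):
    """Colour of LED i, using the same ramp as the REPL snippet."""
    return (127 + 10 * i, 8 * (NUM_PIXELS - i), 0)


def schedule(minutes):
    """Return (seconds_from_start, led_index, colour) for each step."""
    step = minutes * 60 / NUM_PIXELS
    return [(i * step, i, led_color(i)) for i in range(NUM_PIXELS)]


# Example: a countdown covering only the last 5 minutes of a talk
plan = schedule(5)
for offset, index, colour in plan:
    print(f"{offset:6.0f}s  LED {index:2d}  {colour}")
```

With a 5-minute countdown each LED stays lit for 25 seconds; for a full 50-minute slot the same call with 50 gives 250 seconds per LED.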

One thing that needs to be fixed is the fact that with the ring just attached with temporary wires and left on the table it isn't clear which LED is number 0, so it will need a bit of a case or something, but that's something that can be dealt with before the next FOSDEM.

And I should probably add some input interface, so that it is self-contained and not tethered to a computer and run from the REPL.

(And then I may also have a vague idea for putting that ring into some wearable thing: good thing that I actually bought two :D )

04 Feb 2025 12:00am GMT

02 Feb 2025

Bits from Debian: Bits from the DPL

Dear Debian community,

this is bits from DPL for January.

Sovereign Tech Agency

I was recently pointed to Technologies and Projects supported by the Sovereign Tech Agency which is financed by the German Federal Ministry for Economic Affairs and Climate Action. It is a subsidiary of the Federal Agency for Disruptive Innovation, SPRIND GmbH.

It is worth sending applications there for distinct projects as that is their preferred method of funding. Distinguished developers can also apply for a fellowship position that pays up to 40hrs / week (32hrs when freelancing) for a year. This is esp. open to maintainers of larger numbers of packages in Debian (or any other Linux distribution).

There might be a chance that some of the Debian-related projects submitted to the Google Summer of Code that did not get funded could be retried with those foundations. As per the FAQ of the project: "The Sovereign Tech Agency focuses on securing and strengthening open and foundational digital technologies. These communities working on these are distributed all around the world, so we work with people, companies, and FOSS communities everywhere."

Similar funding organizations include the Open Technology Fund and FLOSS/fund. If you have a Debian-related project that fits these funding programs, they might be interesting options. This list is by no means exhaustive; these are just some hints I've received and wanted to share. More suggestions for such opportunities are welcome.

Year of code reviews

On the debian-devel mailing list, there was a long thread titled "Let's make 2025 a year when code reviews became common in Debian". It initially suggested something along the lines of: "Let's review MRs in Salsa." The discussion quickly expanded to include patches that have been sitting in the BTS for years, which deserve at least the same attention. One idea I'd like to emphasize is that associating BTS bugs with MRs could be very convenient. It's not only helpful for documentation but also the easiest way to apply patches.

I'd like to emphasize that no matter what workflow we use (BTS, MRs, or a mix), it is crucial to uphold Debian's reputation for high quality. However, this reputation is at risk as more and more old issues accumulate. While Debian is known for its technical excellence, long-standing bugs and orphaned packages remain a challenge. If we don't address these, we risk weakening the high standards that Debian is valued for. Revisiting old issues and ensuring that unmaintained packages receive attention is especially important as we prepare for the Trixie release.

Debian Publicity Team will no longer post on X/Twitter

The Press Team has my full support in its decision to stop posting on X. As per the Publicity delegation, the team once decided to join Twitter, but circumstances have since changed. The current Press delegates have the institutional authority to leave X, just as their predecessors had the authority to join. I appreciate that the team carefully considered the matter, reinforced by the arguments developed on the debian-publicity list, and communicated its reasoning openly.

Kind regards,

Andreas.

02 Feb 2025 11:00pm GMT

Dirk Eddelbuettel: RcppUUID 1.1.2 on CRAN: Newly Adopted Package

The RcppUUID package on CRAN has been providing UUIDs (based on the underlying Boost library) for several years. Written by Artem Klemsov and maintained in this gitlab repo, the package is a very nice example of clean and straightforward library binding.

When we did our annual BH upgrade to 1.87.0 and checked reverse dependencies, we noticed that RcppUUID needed a small and rather minor update, which we showed as a short diff in an issue filed. Neither I nor CRAN heard from Artem, so the package ended up being archived last week. That in turn led me to make this minimal update to 1.1.2 to resurrect it, which CRAN processed more or less like a regular update given this explanation, and so it arrived last Friday.

Courtesy of my CRANberries, there is also a 'new package' note (no diffstat report yet). More detailed information is on the RcppUUID page, or the github repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

02 Feb 2025 10:38pm GMT

Dave Hibberd: SOTA Trip Reports: Feb 02, 2025 - Bennachie

To quote @MM0EFI and the GM0ESS gang, today was a particularly Amateur showing!

Having spent all weekend locked in the curling rink ruining my knees and inflicting mild liver damage in the Aberdeen City Open competition, I needed some outside time away from people to stretch the legs and loosen my knees.

With my teammates/guests shipped off early on account of our quality performance and the days fair drawin' out now, I found myself with a free afternoon to have a quick run up something nearby before a 1640 sunset! Up the back of Bennachie is a quick steady ascent and in 13 years of living up here I've never summited the big hill! Now is as good a time as any. In SOTA terms, this hill is GM/ES-061. In Geographical terms, it's around 20 miles inland from Aberdeen city here.

I've been experimenting with these Aliexpress whips since the end of last year and the forecast wind was low enough to take one into the hills. I cut and terminated 8x 2.5m radials for an effective ground plane last week and wanted to try that against the flat ribbon that it came with.

The ascent was pleasant enough, I got to the summit in good time, and out came my Quansheng radio to get the GM/ES-Society on 2m. First my Nagoya whip: I called CQ and heard nothing. With generally poor reports in WhatsApp, I opted to get the slim-g up my Aliexpress fibreglass mast.

In an amateur showing last week, I broke the tip of the mast on Cat Law helping 2M0HSK do his first activation due to the wind, and had forgotten this until I summited this week. Squeezing my antenna on was tough, and after many failed attempts to get it up (the mast kept collapsing as I was rushing and not getting the friction hold on each section correctly) and still not hearing anything at all, I changed location and tried again.

In my new position, I received 2M0RVZ 4/4 at best, but he was hearing me 5/9. Similarly GM5ALX and GM4JXP were patiently receiving me loud and clear but I couldn't hear them at all. I fiddled with settings and decided the receive path of the Quansheng must be fried or sad somehow, but I don't yet have a full set of diagnostics run.

I'll take my Anytone on the next hill and compare them against each other I think.

I gave up and moved to HF, getting my whip and new radials into the ground.

Quick to deploy, which is what I was after. My new 5m of coax with a choke fitted attached to the radio and we were off to the races - a convenient thing of beauty when it's up.

I've made a single guy with a sotabeams top insulator to brace against wind if need be, but that didn't need to be used today.

I hit tune, and the G90 spent ages clicking away. In fact, tuning to 14.074, I could only see the famed FT8 signals at S2.

What could be wrong here? Was it my new radials? The whip has behaved before… Minutes turned into tens of minutes playing with everything, and eventually I worked out what was up - my coax only passed signal when I held the PL259 connector at the antenna juuuust right. Once I did that, I could take the tuner out the system and work 20 spectacularly well. Until now, I'd been tuning the coax only.

Another Quality Hibby Build Job™️. That's what's wrong!

I managed to struggle my way through a touch of QRM and my wonky cable woes to make enough contacts with some very patient chasers and a summit to summit before my frustration at the situation won out, and down the hill I went after a quick pack up period. I managed to beat the sunset - I think if the system had worked fine, I'd have stayed on the hill for sunset.

I think it's time for a new mast and a coax retermination!

02 Feb 2025 8:00pm GMT

Colin Watson: Free software activity in January 2025

Most of my Debian contributions this month were sponsored by Freexian. If you appreciate this sort of work and are at a company that uses Debian, have a look to see whether you can pay for any of Freexian's services; as well as the direct benefits, that revenue stream helps to keep Debian development sustainable for me and several other lovely people.

You can also support my work directly via Liberapay.

Python team

We finally made Python 3.13 the default version in testing! I fixed various bugs that got in the way of this:

As with last month, I fixed a few more build regressions due to the removal of a deprecated intersphinx_mapping syntax in Sphinx 8.0:

I ported a few packages to Django 5.1:

I ported python-pypump to IPython 8.0.

I fixed python-datamodel-code-generator to handle isort 6, and contributed that upstream.

I fixed some packages to tolerate future versions of dh-python that will drop their dependency on python3-setuptools:

I removed the old python-celery-common transitional package from celery, since nothing in Debian needs it any more.

I fixed or helped to fix various other build/test failures:

I upgraded these packages to new upstream versions:

Rust team

I fixed rust-pyo3-ffi to avoid explicit Python version dependencies that were getting in the way of making Python 3.13 the default version.

Security tools packaging team

I uploaded libevt to fix a build failure on i386 and to tolerate future versions of dh-python that will drop their dependency on python3-setuptools.

Installer team

I helped with some testing of a debian-installer-utils patch as part of the /usr move. I need to get around to uploading this, since it looks OK now.

Other small things

Helmut Grohne reached out for help debugging a multi-arch coinstallability problem (you know it's going to be complicated when even Helmut can't figure it out on his own …) in binutils, and we had a call about that.

I reviewed and applied a new Romanian translation of debconf's manual pages.

I did my twice-yearly refresh of debmirror's mirror_size documentation, and applied a contribution to improve the example debmirror.conf.

I fixed an arguable preprocessor string handling bug in man-db, and applied a fix for out-of-tree builds.

02 Feb 2025 7:48pm GMT

Joachim Breitner: Coding on my eInk Tablet

For many years I wished I had a setup that would allow me to work (that is, code) productively outside in the bright sun. It's winter right now, but when it's summer again it's always a bit. This weekend I got closer to that goal.

TL;DR: Using code-server on a beefy machine seems to be quite neat.

Passively lit coding

Personal history

Looking back at my own old blog entries I find one from 10 years ago describing how I bought a Kobo eBook reader with the intent of using it as an external monitor for my laptop. It seems that I got a proof-of-concept setup working, using VNC, but it was tedious to set up, and I never actually used that. I subsequently noticed that the eBook reader is rather useful to read eBooks, and it has been in heavy use for that ever since.

Four years ago I gave this old idea another shot and bought an Onyx BOOX Max Lumi. This is an A4-sized tablet running Android and had the very promising feature of an HDMI input. So hopefully I'd attach it to my laptop and it just works™. Turns out that this never worked as well as I hoped: Even if I set the resolution to exactly the tablet's screen's resolution I got blurry output, and it also drained the battery a lot, so I gave up on this. I subsequently noticed that the tablet is rather useful to take notes, and it has been in sporadic use for that.

Going off on this tangent: I later learned that the HDMI input of this device appears to the system like a camera input, and I don't have to use Boox's "monitor" app but could use other apps like FreeDCam as well. This somehow managed to fix the resolution issues, but the setup still wasn't convenient enough to be used regularly.

I also played around with pure terminal approaches, e.g. SSH'ing into a system, but since my usual workflow was never purely text-based (I was at least used to using a window manager instead of a terminal multiplexer like screen or tmux) that never led anywhere either.

VSCode, working remotely

Since these attempts I have started a new job working on the Lean theorem prover, and working on or with Lean basically means using VSCode. (There is a very good neovim plugin as well, but I'm using VSCode nevertheless, if only to make sure I am dogfooding our default user experience).

My colleagues have said good things about using VSCode with the remote SSH extension to work on a beefy machine, so I gave this a try now as well, and while it's not a complete game changer for me, it does make certain tasks (rebuilding everything after switching branches, running the test suite) very convenient. And it's a bit spooky to run these workloads without the laptop's fan spinning up.

In this setup, the workspace is remote, but VSCode still runs locally. But it made me wonder about my old goal of being able to work reasonably efficiently on my eInk tablet. Can I replicate this setup there?

VSCode itself doesn't run on Android directly. There are projects that run it in a Linux chroot or in termux on the Android system, and then you can use VNC to connect to it (e.g. on Andronix)… but that did not seem promising. It seemed fiddly, and I probably should take it easy on the tablet's system.

code-server, running remotely

A more promising option is code-server. This is a fork of VSCode (actually of VSCodium) that runs completely on the remote machine, and the client machine just needs a browser. I set that up this weekend and found that I was able to do a little bit of work reasonably.

Access

With code-server one has to decide how to expose it safely enough. I decided against the tunnel-over-SSH option, as I expected that to be somewhat tedious to set up (both initially and for each session) on the Android system, and I liked the idea of being able to use any device to work in my environment.

I also decided against the more involved "reverse proxy behind proper hostname with SSL" setups, because they involve a few extra steps, and some of them I cannot do as I do not have root access on the shared beefy machine I wanted to use.

That left me with the option of using code-server's built-in support for self-signed certificates and a password:

$ cat .config/code-server/config.yaml
bind-addr: 1.2.3.4:8080
auth: password
password: xxxxxxxxxxxxxxxxxxxxxxxx
cert: true

With trust-on-first-use this seems reasonably secure.
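What trust-on-first-use amounts to can be sketched as follows: remember the fingerprint of the certificate seen on the first connection and flag any later change. This is an illustration only, not how the browser stores the exception. The TofuStore class and the fetch_der helper are hypothetical names, and the demonstration at the end uses stand-in bytes rather than a real certificate.

```python
import hashlib
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as hex."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_der(host: str, port: int) -> bytes:
    """Fetch the server certificate without validating it against any CA."""
    pem = ssl.get_server_certificate((host, port))
    return ssl.PEM_cert_to_DER_cert(pem)


class TofuStore:
    """Remember the first fingerprint seen per host; flag any later change."""

    def __init__(self):
        self.pins = {}

    def check(self, host: str, der_cert: bytes) -> bool:
        seen = fingerprint(der_cert)
        pinned = self.pins.setdefault(host, seen)  # first use: pin it
        return pinned == seen


# Offline demonstration with stand-in bytes instead of real certificates
store = TofuStore()
first = store.check("host.example.com:8080", b"certificate A")
again = store.check("host.example.com:8080", b"certificate A")
swapped = store.check("host.example.com:8080", b"certificate B")
print(first, again, swapped)
```

Against a live server, the bytes passed to check would come from fetch_der(host, port). The weak point is the very first connection, which has to be trusted blindly.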

Service

To keep code-server running I created a systemd service that's managed by my user's systemd instance:

~ $ cat ~/.config/systemd/user/code-server.service
[Unit]
Description=code-server
After=network-online.target

[Service]
Environment=PATH=/home/joachim/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
ExecStart=/nix/var/nix/profiles/default/bin/nix run nixpkgs#code-server

[Install]
WantedBy=default.target

(I am using nix as a package manager on a Debian system there, hence the additional PATH and complex ExecStart. If you have a more conventional setup then you do not have to worry about Environment and can likely use ExecStart=code-server.)

For this to survive me logging out I had to ask the system administrator to run loginctl enable-linger joachim, so that systemd allows my jobs to linger.

Git credentials

The next issue to be solved was how to access the git repositories. The work is all on public repositories, but I still need a way to push my work. With the classic VSCode-SSH-remote setup from my laptop, this is no problem: My local SSH key is forwarded using the SSH agent, so I can seamlessly use that on the other side. But with code-server there is no SSH key involved.

I could create a new SSH key and store it on the server. That did not seem appealing, though, because SSH keys on GitHub always have full access. It wouldn't be horrible, but I still wondered if I could do better.

I thought of creating fine-grained personal access tokens that only allow me to push code to specific repositories, and nothing else, and just storing them permanently on the remote server. Still a neat and convenient option, but creating PATs for our org requires approval and I didn't want to bother anyone on the weekend.

So I am experimenting with GitHub's git-credential-manager now. I have configured it to use git's credential cache with an elevated timeout (36000 seconds, i.e. 10 hours), so that once I log in, I don't have to do so again for one workday.

$ nix-env -iA nixpkgs.git-credential-manager
$ git-credential-manager configure
$ git config --global credential.credentialStore cache
$ git config --global credential.cacheOptions "--timeout 36000"

To log in, I have to open https://github.com/login/device on an authenticated device (e.g. my phone) and enter an 8-character code. Not too shabby in terms of security. I only wish that webpage would not require me to press Tab after each character…

This still grants rather broad permissions to the code-server, but at least only temporarily.

Android setup

On the client side I could now open https://host.example.com:8080 in Firefox on my eInk Android tablet, click through the warning about self-signed certificates, log in with the fixed password mentioned above, and start working!

I switched to a theme that supposedly is eInk-optimized (eInk by Mufanza). It's not perfect (e.g. git diffs are unhelpful because it is not possible to distinguish deleted from added lines), but it's a start. There are more eInk themes on the official Visual Studio Marketplace, but because code-server is a fork it cannot use that marketplace, and for example this theme isn't on Open-VSX.

For some reason the F11 key doesn't work, but going fullscreen is crucial, because screen estate is scarce in this setup. I can go fullscreen using VSCode's command palette (Ctrl-P) and invoking the command there, but Firefox often jumps out of the fullscreen mode, which is annoying. I still have to pay attention to when that's happening; maybe it's the Esc key, which I am of course using a lot due to me using vim bindings.

A more annoying problem was that on my Boox tablet, sometimes the on-screen keyboard would pop up, which is seriously annoying! It took me a while to track this down: The Boox has two virtual keyboards installed: The usual Google AOSP keyboard, and the Onyx Keyboard. The former is clever enough to stay hidden when there is a physical keyboard attached, but the latter isn't. Moreover, pressing Shift-Ctrl on the physical keyboard rotates through the virtual keyboards. Now, VSCode has many keyboard shortcuts that require Shift-Ctrl (especially on an eInk device, where you really want to avoid using the mouse). And the limited settings exposed by the Boox Android system do not allow you to configure that or disable the Onyx keyboard! To solve this, I had to install the KISS Launcher, which would allow me to see more Android settings, and in particular allow me to disable the Onyx keyboard. So this is fixed.

I was hoping to improve the experience even more by opening the web page as a Progressive Web App (PWA), as described in the code-server FAQ. Unfortunately, that did not work. Firefox on Android did not recognize the site as a PWA (even though it recognizes a PWA test page). And I couldn't use Chrome either because (unlike Firefox) it would not consider a site with a self-signed certificate as a secure context, and then code-server does not work fully. Maybe this is just some bug that gets fixed in later versions.

I did not work enough with this yet to assess how much the smaller screen estate, the lack of colors and the slower refresh rate will bother me. I probably need to hide Lean's InfoView more often, and maybe use the Error Lens extension, to avoid having to split my screen vertically.

I also cannot easily work on a park bench this way, with a tablet and a separate external keyboard. I'd need at least a table, or some additional piece of hardware that turns tablet + keyboard into some laptop-like structure that I can put on my, well, lap. There are cases for Onyx products that include a keyboard, and maybe they work on the lap, but they don't have the Trackpoint that I have on my ThinkPad TrackPoint Keyboard II, and how can you live without that?

Conclusion

After this initial setup chances are good that entering and using this environment is convenient enough for me to actually use it; we will see when it gets warmer.

A few bits could be better. In particular logging in and authenticating GitHub access could be both more convenient and safer - I could imagine that when I open the page I confirm that on my phone (maybe with a fingerprint), and that temporarily grants access to the code-server and to specific GitHub repositories only. Is that easily possible?

02 Feb 2025 3:07pm GMT

Anuradha Weeraman: DeepSeek-R1, at the cusp of an open revolution

DeepSeek R1, the new entrant to the Large Language Model wars, has created quite a splash over the last few weeks. Its entrance into a space dominated by the Big Corps, while pursuing asymmetric and novel strategies, has been a refreshing eye-opener.

GPT AI improvement was starting to show signs of slowing down, and has been observed to be reaching a point of diminishing returns as it runs out of the data and compute required to train and fine-tune increasingly large models. This has turned the focus towards building "reasoning" models that are post-trained through reinforcement learning, with techniques such as inference-time and test-time scaling and search algorithms to make the models appear to think and reason better. OpenAI's o1-series models were the first to achieve this successfully with their inference-time scaling and Chain-of-Thought reasoning.

Intelligence as an emergent property of Reinforcement Learning (RL)

Reinforcement Learning (RL) has been successfully used in the past by Google's DeepMind team to build highly intelligent and specialized systems where intelligence is observed as an emergent property through a rewards-based training approach that yielded achievements like AlphaGo (see my post on it here - AlphaGo: a journey to machine intuition).

DeepMind went on to build a series of Alpha* projects that achieved many notable feats using RL:

All of these systems achieved mastery in their own areas through self-training/self-play and by maximizing the cumulative reward over time while interacting with their environment, where intelligence was observed as an emergent property of the system.

The RL feedback loop

RL mimics the process through which a baby would learn to walk, through trial, error and first principles.
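As a toy illustration of that feedback loop (act, observe a reward, update, repeat), here is tabular Q-learning on a five-cell corridor with a reward in the last cell. This is a textbook sketch with made-up parameters; it is not the training method of AlphaGo or DeepSeek-R1, which operate at an entirely different scale.

```python
import random

N_STATES = 5          # corridor cells 0..4; the reward sits in cell 4
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1


def step(state, action):
    """Environment: move one cell, reward 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == LEFT else state + 1
    return nxt, (1.0 if nxt == GOAL else 0.0)


def choose(q, state, epsilon, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < epsilon or q[state][LEFT] == q[state][RIGHT]:
        return rng.choice((LEFT, RIGHT))
    return LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = choose(q, state, epsilon, rng)
            nxt, reward = step(state, action)
            # Q-learning update: nudge the estimate towards
            # reward + discounted best value of the next state
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q_table = train()
policy = [RIGHT if q[RIGHT] > q[LEFT] else LEFT for q in q_table[:GOAL]]
print(policy)
```

Nothing tells the agent that walking right is the goal; the preference for RIGHT in every cell comes purely from the reward signal and repeated trial and error.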

R1 model training pipeline

At a technical level, DeepSeek-R1 leverages a combination of Reinforcement Learning (RL) and Supervised Fine-Tuning (SFT) for its training pipeline:

DeepSeek-R1 Model Training Pipeline

Using RL and DeepSeek-v3, an interim reasoning model was built, called DeepSeek-R1-Zero, purely based on RL without relying on SFT, which demonstrated superior reasoning capabilities that matched the performance of OpenAI's o1 in certain benchmarks such as AIME 2024.

The model was however affected by poor readability and language-mixing and is only an interim-reasoning model built on RL principles and self-evolution.

DeepSeek-R1-Zero was then used to generate SFT data, which was combined with supervised data from DeepSeek-v3 to re-train the DeepSeek-v3-Base model.

The new DeepSeek-v3-Base model then underwent additional RL with prompts and scenarios to come up with the DeepSeek-R1 model.

The R1-model was then used to distill a number of smaller open source models such as Llama-8b, Qwen-7b, 14b which outperformed bigger models by a large margin, effectively making the smaller models more accessible and usable.

Key contributions of DeepSeek-R1

  1. RL without the need for SFT for emergent reasoning capabilities

R1 was the first open research project to validate the efficacy of RL directly on the base model without relying on SFT as a first step, which resulted in the model developing advanced reasoning capabilities purely through self-reflection and self-verification.

Although it did degrade in its language capabilities during the process, its Chain-of-Thought (CoT) capabilities for solving complex problems were later used for further RL on the DeepSeek-v3-Base model which became R1. This is a significant contribution back to the research community.

The below analysis of DeepSeek-R1-Zero and OpenAI o1-0912 shows that it is viable to attain robust reasoning capabilities purely through RL alone, which can be further augmented with other techniques to deliver even better reasoning performance.

Source: https://github.com/deepseek-ai/DeepSeek-R1

It's quite interesting that the application of RL gives rise to seemingly human capabilities of "reflection", and arriving at "aha" moments, causing it to pause, ponder and focus on a specific aspect of the problem, resulting in emergent capabilities to problem-solve as humans do.

  2. Model distillation

DeepSeek-R1 also demonstrated that larger models can be distilled into smaller models, which makes advanced capabilities accessible to resource-constrained environments, such as your laptop. While it's not possible to run a 671b model on a stock laptop, you can still run a 14b model distilled from the larger model, which still performs better than most publicly available models out there. This enables intelligence to be brought closer to the edge, to allow faster inference at the point of experience (such as on a smartphone, or on a Raspberry Pi), which paves the way for more use cases and possibilities for innovation.

Source: https://github.com/deepseek-ai/DeepSeek-R1
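The laptop claim can be checked with back-of-the-envelope arithmetic on the memory needed for the weights alone. The bit widths below (16-bit and 4-bit quantised weights) are assumptions for illustration, and the estimate ignores everything else a running model needs, such as activations and the context cache.

```python
def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return params_billions * bits_per_weight / 8


full_fp16 = weights_gb(671, 16)     # full model at 16 bits per weight
full_4bit = weights_gb(671, 4)      # full model, 4-bit quantised
small_fp16 = weights_gb(14, 16)     # distilled 14b at 16 bits per weight
small_4bit = weights_gb(14, 4)      # distilled 14b, 4-bit quantised

print(full_fp16, full_4bit, small_fp16, small_4bit)
```

Under these assumptions the 671b model needs on the order of a terabyte at 16 bits and still hundreds of gigabytes at 4 bits, while the 14b model fits in roughly 7 to 28 GB, which is within reach of a well-equipped laptop.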

Distilled models are very different from R1, which is a massive model with a completely different architecture than the distilled variants. They are therefore not directly comparable in terms of capability, but are instead built to be smaller and more efficient for constrained environments. This technique of distilling a larger model's capabilities down to a smaller model for portability, accessibility, speed, and cost will open up many possibilities for applying artificial intelligence in places where it would otherwise not have been possible. This is another key contribution of this technology from DeepSeek, which I believe has even further potential for democratization and accessibility of AI.

Why is this moment so significant?

DeepSeek-R1 was a pivotal contribution in many ways.

  1. The contributions to the state-of-the-art and the open research help move the field forward where everybody benefits, not just a few highly funded AI labs building the next billion dollar model.
  2. Open-sourcing and making the model freely available follows an asymmetric strategy to the prevailing closed nature of much of the model-sphere of the larger players. DeepSeek should be commended for making their contributions free and open.
  3. It reminds us that it's not just a one-horse race, and it incentivizes competition, which has already resulted in OpenAI o3-mini, a cost-effective reasoning model which now shows its Chain-of-Thought reasoning. Competition is a good thing.
  4. We stand at the cusp of an explosion of small models that are hyper-specialized and optimized for a specific use case, and that can be trained and deployed cheaply for solving problems at the edge. It raises a lot of exciting possibilities and is why DeepSeek-R1 is one of the most pivotal moments of tech history.

Truly exciting times. What will you build?

02 Feb 2025 2:37pm GMT

Junichi Uekawa: February.

February. This is entrance exam season for Tokyo Junior High Schools. Good luck to those who are going through it now.

02 Feb 2025 5:56am GMT

Dirk Eddelbuettel: RcppSpdlog 0.0.20 on CRAN: New Upstream, New Features

Version 0.0.20 of RcppSpdlog arrived on CRAN early this morning and has been uploaded to Debian. RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want that was written by Gabi Melman, and also includes fmt by Victor Zverovich. You can learn more at the nice package documentation site.

This release updates the code to the version 1.15.1 of spdlog which was released this morning as well. It also contains a contributed PR which illustrates logging in a multithreaded context.

The NEWS entry for this release follows.

Changes in RcppSpdlog version 0.0.20 (2025-02-01)

  • New multi-threaded logging example (Young Geun Kim and Dirk via #22)

  • Upgraded to upstream release spdlog 1.15.1

Courtesy of my CRANberries, there is also a diffstat report. More detailed information is on the RcppSpdlog page, or the package documentation site.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

02 Feb 2025 1:48am GMT

01 Feb 2025

feedPlanet Debian

Guido Günther: Free Software Activities January 2025

Another short status update of what happened on my side last month. Mostly focused on quality of life improvements in phosh and cleaning up and improving phoc this time around (including catching up with wlroots git), but some improvements for other things like phosh-osk-stub happened on the side too.

phosh

phoc

phosh-osk-stub

xdg-desktop-portal-phosh

phosh-recipes

libcmatrix

phrog

Debian

git-buildpackage

livi

feedbackd

Wayland protocols

Wlroots

Bug reports

Reviews

This is not code by me but reviews of other people's code. The list is incomplete, but I hope to improve on this in the upcoming months. Thanks for the contributions!

Help Development

If you want to support my work see donations.

Comments?

Join the Fediverse thread

01 Feb 2025 11:24am GMT

31 Jan 2025

feedPlanet Debian

Gunnar Wolf: ChatGPT is bullshit

This post is an unpublished review for ChatGPT is bullshit

As people around the world understand how LLMs behave, more and more people wonder why these models hallucinate, and what can be done to reduce it. This provocatively named article by Michael Townsen Hicks, James Humphries and Joe Slater is an excellent primer to better understand how LLMs work and what to expect from them.

As humans carrying out our relations using our language as the main tool, we are easily in awe of the apparent ease with which ChatGPT (the first widely available, and to this day probably the best known, LLM-based automated chatbot) simulates human-like understanding and how it helps us to easily carry out even daunting data aggregation tasks. It is common that people ask ChatGPT for an answer and, if it gets part of the answer wrong, they justify it by stating that it's just a hallucination. Hicks et al. invite us to switch from that characterization to a more correct one: LLMs are bullshitting. This term is formally presented by Frankfurt [1]. To bullshit is not the same as to lie, because lying requires knowing (and wanting to cover) the truth. A bullshitter does not necessarily know the truth; they just have to provide a compelling description, regardless of whether it is really aligned with truth.

After introducing Frankfurt's ideas, the authors explain the fundamental ideas behind LLM-based chatbots such as ChatGPT: Generative Pre-trained Transformers (GPTs) have as their only goal to produce human-like text. This is carried out mainly by presenting output that matches the input's high-dimensional abstract vector representation, probabilistically producing the next token (word) iteratively from the text produced so far. Clearly, a GPT's task is not to seek truth or to convey useful information: they are built to provide a normal-seeming response to the prompts provided by their user. Core data are not queried to find optimal solutions for the user's requests; instead, text is generated on the requested topic, attempting to mimic the style of the document set the model was trained with.

Erroneous data emitted by an LLM is, thus, not comparable with what a person could hallucinate, but appears because the model has no understanding of truth; in a way, this is very fitting with the current state of the world, a time often termed the age of post-truth [2]. Requesting an LLM to provide truth in its answers is basically impossible, given the difference between intelligence and consciousness: following Harari's definitions [3], LLM systems, or any AI-based system, can be seen as intelligent, as they have the ability to attain goals in various, flexible ways, but they cannot be seen as conscious, as they have no ability to experience subjectivity. That is, the LLM is, by definition, bullshitting its way towards an answer: its goal is to provide an answer, not to interpret the world in a trustworthy way.

The authors close their article with a plea for literature on the topic to adopt the more correct "bullshit" term instead of the vacuous, anthropomorphizing "hallucination". Of course, the word being already loaded with a negative meaning, it is an unlikely request.

This is a great article that mixes together Computer Science and Philosophy, and can shed some light on a topic that is hard to grasp for many users.

[1] Frankfurt, Harry (2005). On Bullshit. Princeton University Press.

[2] Zoglauer, Thomas (2023). Constructed truths: truth and knowledge in a post-truth world. Springer.

[3] Harari, Yuval Noah (2023). Nexus: A Brief History of Information Networks From the Stone Age to AI. Random House.

31 Jan 2025 6:52pm GMT

Dirk Eddelbuettel: zigg 0.0.1 on CRAN: New Package!

Thrilled to announce a new package: zigg. It arrived on CRAN today after a few days of review in the 'newbies' queue. zigg provides the Ziggurat pseudo-random number generator for Normal, Exponential and Uniform draws proposed by Marsaglia and Tsang (JSS, 2000), and extended by Leong et al. (JSS, 2005).

I had picked up their work in package RcppZiggurat and updated its code for the 64-bit world we now live in. That package already provided the Normal generator along with several competing implementations which it compared rigorously and timed. As one of the generators was based on the GNU GSL via the implementation of Voss, we always ended up with a run-time dependency on the GSL too. No more: this new package is zero-dependency, zero-suggests and hence very easy to deploy. Moreover, we also include a demonstration of four distinct ways of accessing the compiled code from another R package: pure and straight-up C, similarly pure C++, inclusion of the header in C++ as well as via Rcpp.

The other advance is the resurrection of the second generator for the Exponential distribution. And following Burkardt we expose the Uniform too. The main upside of these generators is their excellent speed, as can be seen in the comparison against the default R generators generated by the example script timings.R:

Needless to say, speed is not everything. This PRNG comes from the time of 32-bit computing, so the generator period is likely to be shorter than that of newer high-quality generators. If in doubt, forgo speed and stick with the high-quality default generators.

The short NEWS entry follows.

Changes in version 0.0.1 (2021-01-30)

  • Initial version and CRAN upload

For more, see the package page or the git repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

31 Jan 2025 5:49pm GMT

Daniel Lange: Seagate old hard disks sold as new, smartmontools v7.4 for Debian Bullseye and Bookworm

Apparently somebody managed to resell Seagate hard disks that have 2-5 years of operations on them as brand new.

They did this by using some new shrink wrap bags and resetting the used hard disk SMART attributes to factory-new values.

Image of Seagate Exos X24 hard disk

Luckily Seagate has a proprietary extension "Seagate FARM (Field Access Reliability Metrics)" implemented in their disks that ... the crooks did not reset.

Luckily ... because other manufacturers do not have that extension. And you think the crooks only re-sell used Seagate disks? Lol.

To get access to the Seagate FARM extension, you need smartctl from smartmontools v7.4 or later.

For Debian 12 (Bookworm) you can add the backports archive and then install with apt install smartmontools/bookworm-backports.
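As a minimal sketch, adding the backports archive usually means one extra APT source line. The file name and the deb.debian.org mirror below are assumptions, not taken from this post; adjust them to your setup:

```
# /etc/apt/sources.list.d/bookworm-backports.list  (file name is an assumption)
deb http://deb.debian.org/debian bookworm-backports main
```

After saving the file, run apt update once so that the bookworm-backports suite is known to apt before installing from it.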

For Debian 11 (Bullseye) you can use a backport we created at my company:

File sha256
smartmontools_7.4-2~bpo11+1_amd64.deb e09da1045549d9b85f2cd7014d1f3ca5d5f0b9376ef76f68d8d303ad68fdd108

You can also download static builds from https://builds.smartmontools.org/ which keeps the latest CI builds of the current development branch (v7.5 at the time of writing).

To check the state of your drives, compare the output from smartctl -x and smartctl -l farm. Double checking Power_On_Hours vs. "Power on Hours" is the obvious first step. But the other values around "Head Flight Hours" and "Power Cycle Count" should also roughly match what you expect from a hard disk of a certain age. All near zero, of course, for a factory-new hard disk.

This is what it looks like for a hard disk that has gracefully serviced 4 years and 8 months so far. The smartctl -x and smartctl -l farm data match within some small margins:

$ smartctl -x /dev/sda

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.1.0-30-amd64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Exos X14
Device Model: ST10000NM0568-2H5110
[..]
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
[..]
4 Start_Stop_Count -O--CK 100 100 020 - 26
[..]
9 Power_On_Hours -O--CK 054 054 000 - 40860
10 Spin_Retry_Count PO--C- 100 100 097 - 0
12 Power_Cycle_Count -O--CK 100 100 020 - 27
[..]
192 Power-Off_Retract_Count -O--CK 100 100 000 - 708
193 Load_Cycle_Count -O--CK 064 064 000 - 72077
[..]
240 Head_Flying_Hours ------ 100 253 000 - 21125h+51m+45.748s
$ smartctl -l farm /dev/sda

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.1.0-30-amd64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

Seagate Field Access Reliability Metrics log (FARM) (GP Log 0xa6)
FARM Log Page 0: Log Header
FARM Log Version: 2.9
Pages Supported: 6
Log Size: 98304
Page Size: 16384
Heads Supported: 24
Number of Copies: 0
Reason for Frame Capture: 0
FARM Log Page 1: Drive Information
[..]
Power on Hours: 40860
Spindle Power on Hours: 34063
Head Flight Hours: 24513
Head Load Events: 72077
Power Cycle Count: 28
Hardware Reset Count: 193

You may like to run the command below on your systems to capture the state. Remember FARM is only supported on Seagate drives.

for i in /dev/sd{a,b,c,d,e,f,g,h} ; do { smartctl -x "$i" ; smartctl -l farm "$i" ; } >> "$(date +'%y%m%d')_smartctl_$(basename "$i").txt" ; done
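The comparison of Power_On_Hours against the FARM "Power on Hours" value can be sketched as a small shell function that reads one of the capture files written by the loop above. The function name check_power_on_hours and the default tolerance of 24 hours are assumptions made for illustration; neither is part of smartmontools, and the parsing only covers the output layout shown in this post.

```shell
# Hypothetical helper, not part of smartmontools: compare SMART attribute 9
# with the FARM "Power on Hours" value found in one capture file.
check_power_on_hours() {
    capture=$1
    tolerance=${2:-24}   # allowed difference in hours (assumed default)

    # Attribute table of `smartctl -x`: RAW_VALUE is the 8th column.
    smart_hours=$(awk '$2 == "Power_On_Hours" { print $8 + 0; exit }' "$capture")
    # FARM log of `smartctl -l farm`; the anchored pattern skips
    # the "Spindle Power on Hours" line.
    farm_hours=$(awk -F': *' '/^[ \t]*Power on Hours:/ { print $2 + 0; exit }' "$capture")

    # One of the two values is missing, e.g. no FARM support on this drive.
    if [ -z "$smart_hours" ] || [ -z "$farm_hours" ]; then
        echo "unknown"
        return 2
    fi

    # Absolute difference between the two counters.
    diff=$((smart_hours - farm_hours))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi

    if [ "$diff" -le "$tolerance" ]; then
        echo "match smart=$smart_hours farm=$farm_hours"
    else
        echo "MISMATCH smart=$smart_hours farm=$farm_hours"
        return 1
    fi
}
```

It would be called with a capture file name, for example check_power_on_hours 250131_smartctl_sda.txt. A non-zero return code means the two counters disagree or could not be found, which is a hint to look at the full output by hand rather than a verdict on the drive.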

31 Jan 2025 5:42pm GMT

Russell Coker: Links January 2025

Aaron Quigley's Everything Open lecture about Intelligent Interfaces is one of the most interesting research reports I've seen in a long time [1]. This one can be understood and appreciated by people who don't have a strong background in computer science.

Statites (satellites that don't orbit the sun but use solar sails to hover in place) could be used to catch up to interstellar objects [2].

Slashgear has an interesting article about an AI piloted F16 beating a human piloted F16 [3]. Given the serious handicaps of flying a plane designed for humans and flying to minimise risk to itself and other crewed aircraft this is a serious victory. Hopefully crewed military aircraft will be obsolete soon.

Amusing video about the performance of cats with MMORPG style descriptions [4].

John Goerzen wrote an interesting blog post about censorship and the changes to Facebook [5].

Ron Garret wrote an interesting blog post 15 years ago when going through what he now describes as an existential crisis [6].

A comment on Ron's post references Alan Crowe's blog post about whether the "self" exists, which is an interesting philosophical post [7]. But I'm still going to think of myself as a person.

Another comment on Ron's post references Aaron Swartz' blog post about Noam Chomsky etc [8]. I have to watch Manufacturing Consent: Noam Chomsky and the Media.

Ron Garret wrote an interesting blog post about his failed attempts to start a company and how it all worked out well for him anyway [9].

Amusing video about a failed crowdfunded e-bike [10].

Cory Doctorow wrote an insightful article about how Enshittification is not caused by VCs but by lack of controls [11].

31 Jan 2025 1:36pm GMT