15 Dec 2025
Fedora People
Vojtěch Trefný: Partitioning with Ansible Storage Role: Partitions
The storage role has always allowed creating and managing different storage technologies like LVM, LUKS encryption or MD RAID, but one technology seemed to be missing for a long time, and surprisingly, it was the most basic one: actual partitioning. Support for partition management was always something that was planned for the storage role, but it was never a high priority. From the start, the role could create partitions. When creating a more complex storage setup on an empty disk, for example creating a new LVM volume group or adding a new physical volume to an existing LVM setup, the role would always automatically create a single partition on the disk. But that was all the role could do: just one single partition spanning the entire disk.
The reason for this limitation was simple: creating multiple partitions is something usually reserved for the OS installation process, where users need the separate partitions required by the bootloader, like /boot and /boot/efi. The more advanced "partitioning" is then delegated to more complex storage technologies like LVM, which is where most of the changes are made on an existing system and where users will usually employ Ansible to make changes later.
But the requirement for more advanced partition management was always there, and since the 1.19 release, the role can now create and manage partitions in the Ansible way.
Partition Management with Storage Role
The usage of the role for partition management is simple and follows the same logic as the other storage technologies, with the management divided into two parts: managing the storage_pools, which in the case of partitions is the underlying disk (or to be more precise, the partition table), and the volumes, which are the partitions themselves. A simple playbook to create two partitions on a disk can look like this:
roles:
  - name: linux-system-roles.storage
    storage_pools:
      - name: sdb
        type: partition
        disks: sdb
        volumes:
          - name: sdb1
            type: partition
            size: 1 GiB
            fs_type: ext4
          - name: sdb2
            type: partition
            size: 10 GiB
            fs_type: ext4
and the partitions it creates will look like this
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS FSTYPE
sdb 8:16 0 20G 0 disk
├─sdb1 8:17 0 1G 0 part ext4
└─sdb2 8:18 0 10G 0 part ext4
Other filesystem-related properties (like mount_point or fs_label) can be specified, and these work in the same way as for any other volume type.
The only property that is specific to partitions is part_type, which allows you to choose a partition type when using the MBR/MSDOS partition table. Supported types are primary, logical and extended. If you don't specify the partition type, the role will create the first three partitions as primary and for the fourth one, add an extended partition and create it as a logical partition inside it. On GPT, which is used as the default partition table, the partition type is ignored.
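As a quick illustration (a sketch, not taken from the original post), part_type is set per volume; it only has an effect when the disk uses an MBR/MSDOS partition table, and the disk name here is hypothetical:
roles:
  - name: linux-system-roles.storage
    storage_pools:
      - name: sdc                  # hypothetical disk that uses an MBR/MSDOS label
        type: partition
        disks: sdc
        volumes:
          - name: sdc1
            type: partition
            part_type: primary     # primary, logical or extended; ignored on GPT
            size: 2 GiB
            fs_type: ext4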
Encrypted partitions can be created by adding the encryption: true option for the partition and setting the passphrase:
roles:
  - name: linux-system-roles.storage
    storage_pools:
      - name: sdb
        type: partition
        disks: sdb
        volumes:
          - name: sdb1
            type: partition
            size: 1 GiB
            fs_type: ext4
            encryption: true
            encryption_password: "aaaaaaaaa"
          - name: sdb2
            type: partition
            size: 10 GiB
            fs_type: ext4
            encryption: true
            encryption_password: "aaaaaaaaa"
Don't forget that adding the encryption layer is a destructive operation - if you run the two playbooks above one after another, the filesystems created by the first one will be removed, and all data on them will be lost. Adding the LUKS encryption layer (so-called re-encryption) is currently not supported by the role.
Idempotency and Partition Numbers
One of the core principles of Ansible is idempotency: the ability to re-run the same playbook and, if the system is already in the state specified by the playbook, make no changes.
This is true for partitioning with the storage role as well. When running the playbook from our example above for the second time, the role will check the sdb disk and look for the two specified partitions. And if there are two partitions 1 and 10 GiB large, it won't do anything. This is how the role works in general, but with partitions, there is a new challenge: partitions don't have unique names and using partition numbers for idempotency can be tricky.
Did you know that partition numbers for logical partitions are not stable? If you have two logical partitions sdb5 and sdb6, removing the sdb5 partition will automatically re-number the sdb6 partition to sdb5.
Predicting the partition name is not always straightforward. For example, disks that end in a number (common with NVMe drives) require adding a p separator before the partition number (nvme0n1 becomes nvme0n1p1).
For these reasons, the role requires explicitly using the state: absent option to remove a partition, and partitions can be referred to by their numbers in the playbooks as well as their full names. So, for example, the following playbook will resize the sdb2 partition from our first example
roles:
  - name: linux-system-roles.storage
    storage_pools:
      - name: sdb
        type: partition
        disks: sdb
        volumes:
          - name: 2
            type: partition
            size: 15 GiB
            fs_type: ext4
and the first partition won't be removed, because it is not explicitly mentioned as absent, only omitted in the playbook:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS FSTYPE
sdb 8:16 0 20G 0 disk
├─sdb1 8:17 0 1G 0 part ext4
└─sdb2 8:18 0 15G 0 part ext4
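Based on the state: absent behavior described above, a removal playbook might look like the following sketch (not from the original post); the second partition is referred to by its number, and the first partition stays untouched because it is simply omitted:
roles:
  - name: linux-system-roles.storage
    storage_pools:
      - name: sdb
        type: partition
        disks: sdb
        volumes:
          - name: 2                # refer to sdb2 by its partition number
            type: partition
            state: absent          # explicitly remove it; omitted partitions are left alone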
Feedback and Future Features
With this change, the storage role can now manage all basic storage technologies. We are of course not yet covering all the potential features, but we are always looking for more ideas from our users. If you have any features you'd like to see in the role, please don't hesitate and let us know.
15 Dec 2025 9:13am GMT
Fedora Magazine: Creating Data Analysis Pipelines using DuckDB and RStudio

Motivation and Vision
The core motivation behind data analysis pipelines, and the focus of this article, is the need to establish a clear path from unprocessed data to actionable insights for contributor engagement and impact. The key question is "what are we trying to measure to ensure the continuity of community work?"
As a side note, my preparation for the ADSP (Advanced Data Analysis Semi-Professional) certification in Korea utilized RStudio Desktop, running on a Fedora Linux environment. I got hands-on with R's core statistical toolkit, leveraging base functions. Among these were summary()1 and lm()2 as the basis for fundamental hypothesis testing and regression analysis3. I became more intrigued by R's power after testing its data manipulation packages (especially the key package dplyr).
With this background in mind, the article focuses on the design of an analysis pipeline that fulfills three objectives:
- it leverages the power of DuckDB4 and S3 storage,
- it redefines the workflow,
- it ensures scalable data transformation and analysis capabilities
Establishing such a robust foundation is essential for producing reliable and validated metrics for the contributor community, which itself is subject to ongoing definition and validation.
Acknowledgement: I extend my sincere gratitude to Justin Wheeler for connecting me with the Fedora Data Working Group (FDWG), and to Michael Winters and K Naraian for their guidance, discussion, and support throughout the design and validation of this data analysis pipeline.
Scope and Tool Selection: Please note that this analysis pipeline represents a combination of tools and methods chosen from my perspective as a data analyst, particularly one with a background in the CRM (Customer Relationship Management) domain and consumer electronics industry. Therefore, other analysts or data engineers may utilize different combinations of technologies based on their own expertise or project requirements.
Why data analysts must engage with ELT5/ETL6
The role of the analyst is undergoing a fundamental transformation in 2025. We are moving beyond the traditional responsibility of performing statistical analysis and presenting visualizations of pre-cleaned data. Today, the modern analyst must evolve into a "Data Ops (Operations)" practitioner. This requires a holistic understanding of the data lifecycle and bridging the gap between business context and data engineering. This expansion mandates familiarity with ELT/ETL processes to examine the quality and structure of the data source.
Moreover, data analysts must be adept at processing diverse data types such as semi-structured data (for example, schema-less JSON strings or variant) and understanding various data access methods such as leveraging the efficiency of in-situ processing over the constraints of in-memory loading of datasets.
RStudio: The Unified IDE for Hybrid R and Python workflows
My ADSP examination requirements motivated my initial deep dive into RStudio. However, it is worth highlighting its utility as a tool for any data professional. The most significant benefit of using RStudio is its seamless ability to leverage the best tools from both the R and Python language ecosystems. This eliminates the need for the analyst to switch environments which leads to dramatically higher operational efficiency. This unified approach streamlines the analysis lifecycle from code execution to final reporting.
Python for data engineering
Use Python's libraries like Pandas for efficient ETL/ELT operations, data manipulation on large datasets, and integrating with production environments or machine learning workflows (TensorFlow/PyTorch).
R for analysis and visualization
Utilize R's statistical packages and its superior data visualization capabilities (ggplot2, R Shiny) for data analysis modeling, beautiful reporting, and creating customized, publication-ready graphics.
RStudio Desktop: Installation Instructions7 for Fedora Linux
Install R base packages using the terminal and verify:
$ sudo dnf install R
$ R --version
Now, install RStudio from the Fedora COPR repository. Note that these COPR packages conflict with the binary package provided by Posit. Remove the existing Posit RStudio binary package if you installed it.
$ sudo dnf copr enable iucar/rstudio
$ sudo dnf install rstudio-desktop
Dependencies and dev tools are required so install them:
$ sudo dnf install @development-tools
$ sudo dnf install libxml2-devel openssl-devel curl-devel
Launch RStudio. When the > prompt appears on the RStudio Console, enter the following commands. Note that this prompt should appear in the bottom-left pane of the default layout.
Install the reticulate package and execute the function reticulate::py_install() to manage Python dependencies:
install.packages("reticulate")
reticulate::py_install(packages = c("duckdb", "pandas"))
Set the global options for all code chunks within the R Markdown Canvas:
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
ragg is an indirect but critical dependency of core Tidyverse packages (such as ggplot2):
install.packages("ragg")
Install base packages for data manipulation:
install.packages("tidyverse")
DBI (database interface) is an essential R package that provides a standardized, vendor-agnostic interface for connecting to and interacting with various database systems (both local and remote):
install.packages("DBI")
Install tools for Parquet files and S3 data lake access:
install.packages("arrow")
Install R Markdown and Quarto for combining R/Python/SQL code, its execution results, and explanatory text into reproducible data pipelines directly within the environment. A Quarto (.qmd) file runs code chunks in R, Python, and SQL in a single document.
install.packages(c("rmarkdown","quarto"))
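To illustrate, here is a minimal Quarto sketch (the file contents are placeholders, not part of the original article) mixing an R chunk and a Python chunk that runs through reticulate:
---
title: "Pipeline sketch"
format: html
---

```{r}
# R chunk: the packages installed above are available here
library(tidyverse)
```

```{python}
# Python chunk, executed through reticulate; pandas comes from py_install() above
import pandas as pd
pd.DataFrame({"x": [1, 2, 3]}).describe()
```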
Load packages for ELT and EDA:
library(tidyverse)
library(arrow)
Data architecture pipeline
The specific dataset chosen, Datanommer (Fedora Messaging Streams), aligns with the strategic objectives of the Fedora Data Working Group, where I contribute. The data is stored in the Bronze Data Layer where raw data from source systems is ingested and stored, as-is, for scalable data lake storage. The Bronze Layer allows for schema evolution without breaking downstream processes.
To provide the Working Group with transparent access and initial insight into this data, I have prepared a shared Initial Exploratory Data Analysis (EDA) Notebook. This notebook serves as the initial public view of the data quality and patterns, and it informed the subsequent architectural decisions for the scalable pipeline I am about to outline.
Given the complexity of the architecture, I will proceed with an outline of the core components, organized by their role in the ELT pipeline:
This restructured pipeline, leveraging the new Lakehouse architecture, unlocks several core benefits crucial for scaling contributor analysis and enabling future insights:
Elimination of Memory Constraints via In-Situ Processing
DuckDB acts as a high-performance analytical engine that enables In-Situ Processing. It queries data directly from storage (specifically the Parquet files) without requiring the entire dataset to be loaded into RAM. This not only solves the memory problem but also delivers rapid query execution and significantly lowers operational costs associated with large computational clusters hosted on the OpenShift/Fedora AWS infrastructure.
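As a rough sketch of what in-situ querying can look like from R (this assumes the duckdb R package is installed with install.packages("duckdb") and that S3 credentials are configured; the bucket path and column names are hypothetical):
library(DBI)

# In-process DuckDB connection; no separate server or cluster needed
con <- dbConnect(duckdb::duckdb())

# httpfs lets DuckDB scan Parquet files directly on S3, without loading them into RAM first
dbExecute(con, "INSTALL httpfs; LOAD httpfs;")

top_topics <- dbGetQuery(con, "
  SELECT topic, COUNT(*) AS messages
  FROM read_parquet('s3://example-bucket/bronze/datanommer/*.parquet')  -- hypothetical path
  GROUP BY topic
  ORDER BY messages DESC
  LIMIT 10
")

dbDisconnect(con, shutdown = TRUE)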
Future-Proofing
The shift to a Lakehouse model ensures the pipeline is ready for growth and evolving data complexity. Future integration of Apache Iceberg and Apache Polaris will provide schema evolution capabilities. This ensures the pipeline is fully future-proofed against changes in underlying data structures.
Streamlined ELT Workflow and Multi-Lingual Access
I have redefined the processing workflow from a bottlenecked ETL model to a resilient Extract-Load-Transform (ELT) pattern. Storing semi-structured data (like JSON/nested structures) in Parquet files with the variant type, loaded raw into S3, simplifies the ingestion stage. When using R, it is recommended to read Parquet files using the Apache Arrow library.
The parsed data is then accessible by multiple analytical platforms (R Shiny, Python, BI tools) without duplication or manual preparation. This multi-lingual access maximizes the utility of the clean data layer, supporting a growing number of analytical users and more complex queries necessary for defining long-term contributor metrics.
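On the R side, the Arrow-based access recommended above can stay lazy until the final step; a minimal sketch (the dataset location and column names are hypothetical, and arrow must be built with S3 support) looks like this:
library(arrow)
library(dplyr)

# Open the Parquet files as a lazy dataset; nothing is pulled into RAM yet
msgs <- open_dataset("s3://example-bucket/bronze/datanommer/")  # hypothetical location

msgs |>
  filter(year == 2025) |>            # hypothetical partition column
  group_by(topic) |>
  summarise(messages = n()) |>
  arrange(desc(messages)) |>
  collect()                          # only the aggregated result is materialized in memory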
Initial EDA Notebook
The preliminary Exploratory Data Analysis (EDA) was conducted within the Jupyter Notebook format. This allowed broad compatibility with the existing execution and review environment of the Fedora Data Working Group.
The Initial EDA Notebook is documented to ensure complete reproducibility, including all necessary steps for the Python library installation and environment setup. Any standard Python script containing ELT logic can be run seamlessly within RStudio's Python mode, by "knitting8" an R Markdown document, or by rendering a Quarto file.
Conclusion
The establishment of this analysis pipeline represents a crucial step in transforming unprocessed Fedora data into actionable insights. By addressing the core challenges of scaling and in-memory processing through DuckDB, and enabling transparent analysis via the hybrid RStudio/Jupyter workflow, I have demonstrated viable methods for performing Exploratory Data Analysis (EDA) and Extract, Load, Transform (ELT) processes on vast community datasets. In conclusion, the purpose of this work is to foster deeper engagement across a broader community by analyzing data with a view that relates to the Fedora Project community.
I hope this pipeline will serve as the technical foundation that activates and focuses the community discussion around the specific variables and metrics needed to define and ensure the continuity of community contributions.
AI Assistance
The ideation, structural planning, and terminology refinement of the pipelines were assisted by Gemini and Figma.
Software version
RStudio Desktop 2025.05.1 Build 513 (Fedora COPR repository)
R version 4.5.2 (2025-10-31) / Python 3.14.0
Notes
- summary(): When used on a data object (for example, DataFrame), it provides basic statistics (min, max, mean, median). When used on a fitted linear model object (lm), it delivers key diagnostic information like coefficient estimates and p-values.
- lm(): Stands for Linear Model. This is the core function for fitting linear regression models in R, allowing the user to examine and model the linear relationship between variables.
- Regression analysis examines which factors affect the other and which ones are irrelevant for statistical and business context.
︎ - DuckDB is a column-oriented database architecture.
- Direct Querying: It directly queries data from file formats such as Parquet, CSV, and JSON.
- Local compute engine: It is widely used as a high-performance local compute engine for analytical workloads. It runs in-process, meaning it operates within your application (like a Python script or R session) without needing a separate server or cluster management.
- Cloud Integration: It supports querying data stored in cloud storage services like AWS S3, GCS (Google Cloud Storage), and Azure Blob Storage.
- ELT (Extract, Load, Transform): In a modern data environment like a Lakehouse, ELT is preferred: data is first extracted from the source and loaded raw into the cloud data lake (S3), and then transformed in place by the processing engine like DuckDB.
- ETL (Extract, Transform, Load): transformations occur before loading the data into the final destination.
- Key Advantages of RStudio over Jupyter Notebook for Production Workflows:
Even with its slightly more complex initial setup compared to Jupyter Notebooks, the advantages become significant when moving from exploration (Jupyter's strength) to reproducible, production-ready workflows (RStudio's strength).
- Integrated Console, Source, Environment, and Files: RStudio offers a cohesive, four-pane layout that allows for seamless navigation between writing code, running commands, inspecting variables, and managing files/plots. Jupyter requires constant shifting between code cells and external tabs.
- Superior Debugging Tools: RStudio includes a powerful, visual debugger that allows you to set breakpoints, step through code line-by-line, and inspect variable states directly in the environment pane. Jupyter's debugging is typically cell-based and less intuitive.
- Native Project Management: RStudio Projects (.Rproj files) automatically manage the working directory and history. This makes it easy to switch between different analytical tasks without conflicts.
- Integrated Environment Management (renv): RStudio integrates seamlessly with tools like renv (R Environment) to create isolated, reproducible R environments. This addresses dependency hell by ensuring the exact package versions used in development are used in production, which is crucial for data pipeline version control.
- Quarto/R Markdown Integration: RStudio provides dedicated tools and buttons for easily compiling and rendering complex analytical documents (like your Quarto file) into HTML, PDF, or presentation slides.
- Shiny Integration: RStudio is the native environment for developing Shiny web applications-interactive dashboards and tools that turn analysis into deployable products. Jupyter requires separate frameworks (like Dash or Streamlit) for similar deployment.
- Focus on Scripting: RStudio's source editor is optimized for writing clean, structured R/Python scripts, which are preferred for building robust, scheduled pipeline components (like those managed by Airflow).
- Code Chunk Execution (Quarto): Even when using Quarto, RStudio allows for superior navigation and execution of code chunks compared to the often sequential and state-dependent nature of Jupyter Notebook cells.
- knitr executes code in an R Markdown (.Rmd) file by chunks or as a whole (typically by clicking the "Knit" button in RStudio or using rmarkdown::render() in R).
15 Dec 2025 8:00am GMT
14 Dec 2025
Fedora People
Evgeni Golov: Home Assistant, Govee Lights Local, VLANs, Oh my!
14 Dec 2025 3:48pm GMT
Kushal Das: Johnnycanencrypt 0.17.0 released
14 Dec 2025 8:16am GMT
13 Dec 2025
Fedora People
Kevin Fenzi: infra weekly recap: mid december 2025
Another busy week for me and Fedora infrastructure in general, and also the last working week of the year for me. I am out on vacation for the holidays and back 2026-01-12.
Of course I will be around and checking in for outages/urgent issues and working on things in the community that I find enjoyable.
( see: https://www.scrye.com/blogs/nirik/posts/2023/12/13/time-off-when-youre-paid-to-work-in-a-community/ )
RDU2-CC to RDU3 datacenter move
This last Monday was the physical datacenter move. It had been pushed back for various reasons, but I am glad we could get it done and over with this year.
Things didn't go as smoothly as planned unfortunately.
-
There was bad weather in the area of the datacenters (snow and ice). The truck transporting things took a bit longer to arrive, and the folks doing the move had to head home before things became impassable and then took longer to get back in to finish cabling. :(
-
There was a miscommunication between planning folks and datacenter folks on the ground: we thought that everything was moving to dual 10G network (so networking can upgrade/reboot switches and we are fine). The folks doing the actual cabling thought that we were just reconnecting things the way the old datacenter was set up (a single 1G connection). So, it took a while to get 10G all connected and configured.
-
Of course there were some casualties too: One machine (our retrace server) had a broken rail. DC folks got it set up anyhow, but new rails are going to need to be installed soon. And another of our servers for some reason refuses to accept any mgmt passwords. That will need to be reset locally.
-
There's one cursed machine that has a 10G network card in it, and lspci on the server shows it, but it has no interfaces for it, and the mgmt interface doesn't show it at all. Probably the card is dead or otherwise needs vendor intervention.
Otherwise, important things are back up, with some reinstalling and cleanup still to be done. Here's hoping for a datacenter-move-free 2026!
Scraper news
I did a bunch of tweaking since last week to try and get things into a state where we don't need manual intervention for scraper handling. This included some httpd changes, some proxy changes and a bit of hardware tuning. So far, we are doing good. I haven't had to manually look at it much this week. We have still been under scraper load, but blocking the blame endpoint really helped along with the other tuning. I hope it will be a quiet holidays.
December of docs
We are just under halfway through December, and so far I have kept up working on at least one docs PR/ticket every day.
We have moved infra docs over to forge now also!
You can see activity here:
https://forge.fedoraproject.org/infra/docs/activity/monthly
Hopefully I can keep it up. We are down about 21 tickets now. Perhaps I can even do a bit more now that I am on holidays.
Happy holidays everyone
Happy holidays everyone!
comments? additions? reactions?
As always, comment on mastodon: https://fosstodon.org/@nirik/115713891385384001
13 Dec 2025 7:03pm GMT
Phil Wyett: Fedora 41 support
Fedora 41 went End Of Life (EOL) on December 10, 2025. I have removed my Virtual Machines (VMs) for Fedora 41 and will be doing no further maintenance or builds for this release.
13 Dec 2025 2:46pm GMT
12 Dec 2025
Fedora People
Rénich Bon Ćirić: The illusion of sovereignty: Why curl | bash destroys your investment in Cybersecurity
Your company just finished Series B. You have cash to spend. You have a great product with a Go stack that compiles to a single static binary.
Your engineering team spends 25% of their time securing that stack. You invest millions in infrastructure defense: Stateful firewalls, AI-augmented scanning, Red Teams, Blue Teams, and rigorous DevSecOps pipelines. You even partner with major cloud providers to ensure your supply chain is audited.
You are serious about security. You have built a fortress.
And then, to install your product, you tell enterprise customers to run this:
curl -fsSL https://my-really-cool-company.com/install.sh | sh
This single line undermines your entire security architecture.
Why is this so dangerous?
First, consider the distribution mechanism. Teams automate releases. They push to Git, CI runs tests, and the binary is pushed to an object store or CDN. The install.sh script is just a pointer to that location.
The problem is Mutability
When a sysadmin runs that command, they are piping an unverified, unsigned script directly into a shell; often as root.
If any point in that supply chain is compromised, whether your CDN is hijacked, your build server is breached (like the Codecov or SolarWinds attacks), or a rogue maintainer inserts a backdoor (like the recent XZ Utils / liblzma incident), your customer downloads the malware instantly.
There is no audit trail. There is no cryptographic signature verifying the author. There is just a script that can change content between the time you audit it and the time you run it.
The Trust Fallacy
We operate in good faith. You assume the vendor is secure. But supply chain security isn't about trusting the vendor; it's about verifying the artifact.
If you are selling to Government, Defense, or Finance, "trust" is not a strategy. Sovereignty is the strategy. These clients need:
- Immutability: A guarantee that the binary hasn't changed.
- Provenance: Cryptographic proof of origin (GPG).
- Sovereignty: The ability to mirror the software in an air-gapped environment without reaching out to the public internet during installation.
The Solution: Native Packaging
In the Enterprise Linux ecosystem (RHEL, CentOS, Fedora, etc), we solved this decades ago.
- RPMs allow for offline installation.
- GPG Signatures ensure the binary was built by you.
- Repositories allow clients to mirror and scan the software before it touches their production servers.
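As a rough sketch of what that looks like on the consumer side (the repository name, URLs, and package names below are hypothetical placeholders):
# Import the vendor's signing key (verify its fingerprint out-of-band first)
sudo rpm --import https://my-really-cool-company.com/RPM-GPG-KEY-mycompany

# /etc/yum.repos.d/mycompany.repo - a repo definition that enforces signature checking
[mycompany]
name=My Really Cool Company
baseurl=https://repo.my-really-cool-company.com/el9/$basearch/
enabled=1
gpgcheck=1
gpgkey=https://my-really-cool-company.com/RPM-GPG-KEY-mycompany

# Verify a downloaded package's signature before installing it
rpm -K my-product-1.2.3-1.el9.x86_64.rpm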
If you are asking enterprise clients to pipe shell scripts, you are asking them to surrender their sovereignty. It is time to treat your delivery mechanism with the same rigor as your source code.
12 Dec 2025 4:00pm GMT
Rénich Bon Ćirić: SSH Certificates: A Complete Guide
I've been wondering for a while: "What else is there to SSH and its certificates, keys, and all the rest?" OpenSSH surely has more to offer than what we use day to day. Just take a dive into its man pages and you'll see that's the case.
Honestly, certificates are great at handling authentication at scale, with automatic expiration and seriously strict policies. On Fedora 43, with SELinux watching our back, it's even more secure.
This complete guide is your map (and mine) to mastering SSH certificates. It includes in-depth comparisons, security best practices, automation tips, and extensive examples. Whether you're securing a data center or a home lab, this will level up your SSH game.
Note
This guide assumes OpenSSH 7.0+ for full certificate support. Check your version with ssh -V. For production, use hardware security modules (HSMs) for the CA keys.
Parallels with Standard SSH
Standard SSH with user keys:
- Workflow: Users generate key pairs; admins manually add the public keys to ~/.ssh/authorized_keys on each server.
- Pros: Eliminates passwords; strong cryptography; works offline.
- Cons: Scalability nightmare; adding/removing users means touching every server; no automatic expiration or restrictions; compromised keys stay around until cleaned up manually; audit trail limited to logs.
Standard SSH with passwords:
- Workflow: Users authenticate with passwords stored on servers (usually hashed).
- Pros: Zero setup for users; simple to implement.
- Cons: Susceptible to brute-force attacks; weak passwords are common; no auditing of successful logins; passwords can be phished or reused; centralized password changes are error-prone.
SSH certificates:
- Workflow: A CA signs keys into portable, policy-rich credentials.
- Better than user keys: Centralized issuance/revocation; certificates expire automatically; options for command/IP restrictions; easier auditing via key IDs and serial numbers.
- Better than passwords: Cryptographically strong; no shared secrets; supports multi-factor (e.g. with FIDO); audit-friendly.
- Worse than both: Initial CA setup complexity; the CA private key is a single point of failure (if it is compromised, every certificate is invalid; rotate it immediately); requires OpenSSH support; not compatible with old SSH clients/servers.
In short, certificates shine for organizations that need scalable, policy-driven authentication. For personal use or small deployments, traditional keys are usually enough.
How SSH Certificates Work
An SSH certificate is a signed wrapper around a public key, containing:
- Public key: The user/host key being certified.
- Key ID: A human-readable identifier (e.g. "juan@empresa").
- Principals: Allowed usernames (for user certificates) or hostnames (for host certificates).
- Options: Permissions/restrictions (for example, "no-port-forwarding").
- Validity period: Start/end dates for automatic expiration.
- Serial number: Unique ID for revocation.
- CA signature: Proves authenticity.
Servers check the CA's public key to verify signatures, eliminating per-user key storage. Certificates are portable and self-contained.
Prerequisites
- OpenSSH 7.0+ (certificates were introduced in v5.4, but weren't fully finished until v7.0). On Fedora 43 it is installed by default (openssh-10.0p1-5.fc43.x86_64 at the time of writing).
- Access to a secure machine for CA operations (ideally offline).
- Basic knowledge of SSH key generation.
Tip
On Fedora, SELinux can prevent access to CA files or certificates. Always check contexts with ls -Z and adjust them if necessary.
Generating a CA Key
The CA key is the foundation; its private half signs every certificate, so guard it fiercely.
Choose a strong key type (ed25519 recommended for speed/security):
ssh-keygen -t ed25519 -f ca_key -C "SSH CA para ejemplo.tld"
This creates ca_key (private; never share it) and ca_key.pub (public; distribute it to your servers and clients).
For RSA (if ed25519 is not supported):
ssh-keygen -t rsa -b 8192 -f ca_key -C "SSH CA para ejemplo.tld"
Important
Back up the private key securely.
Creating User Certificates
User certificates let users authenticate to servers without adding keys individually. Let's go step by step, starting with the basics and adding functionality to make things more secure and flexible. That way it's easier to understand what each option does.
Step 1: Basic Certificate
First, generate the user's key if you don't already have one:
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -C "juan@ejemplo.tld"
Now, sign the public key with the CA to create a basic certificate. The -s option specifies the CA private key, -I is the identity (a readable ID to identify the cert), and -z is a unique serial number to avoid collisions.
ssh-keygen -s ca_key -I "juan@ejemplo.tld" -z 1 ~/.ssh/id_ed25519.pub
This creates ~/.ssh/id_ed25519-cert.pub, a certificate valid for any user on any host that trusts the CA. Basic, but functional.
Additional Options
Limiting Principals
For more security, limit the certificate to specific users with -n (principals). This prevents the certificate from being used for unauthorized logins.
ssh-keygen -s ca_key -I "juan@ejemplo.tld" -n juan,respaldo -z 2 ~/.ssh/id_ed25519.pub
Now only "juan" or "respaldo" can use this certificate to authenticate. Useful for shared teams.
Adding Restrictions
Add options with -O to limit what the user can do. For example, no-port-forwarding blocks tunnels and no-agent-forwarding prevents SSH agent forwarding.
ssh-keygen -s ca_key -I "juan@ejemplo.tld" -n juan,respaldo -O no-port-forwarding -O no-agent-forwarding -z 3 ~/.ssh/id_ed25519.pub
This makes the certificate more secure, preventing abuses like unauthorized port forwarding.
Setting a Validity Period
Set a validity period with -V so the certificate expires automatically. Use formats like +30d (30 days from now) or absolute dates.
ssh-keygen -s ca_key -I "juan@ejemplo.tld" -n juan,respaldo -O no-port-forwarding -O no-agent-forwarding -V +30d -z 4 ~/.ssh/id_ed25519.pub
Now the certificate lasts only 30 days, forcing periodic renewal to keep things secure.
Forcing a Command
For automation (like backups), use -O force-command to limit the certificate to a specific command. Ideal for scripts that don't need a full shell.
ssh-keygen -s ca_key -I "respaldo@ejemplo.tld" -n respaldo -O force-command="/usr/bin/rsync --server --daemon" -V +1d -z 5 ~/.ssh/id_ed25519.pub
This certificate only allows rsync, nothing else. I changed the ID to "respaldo" to tell it apart.
Hardware-backed Keys
For hardware-backed keys, use PKCS#11 tokens with -D (path to the library):
ssh-keygen -s ca_key.pub -D /usr/lib/opensc-pkcs11.so -I "juan@ejemplo.tld" -z 6 ~/.ssh/id_ed25519.pub
Or, to keep the CA in memory with ssh-agent, add the key and use -U:
ssh-add ca_key
ssh-keygen -Us ca_key.pub -I "juan@ejemplo.tld" -z 7 ~/.ssh/id_ed25519.pub
These are for more advanced deployments; not always necessary.
Creating Host Certificates
Host certificates prove the server's identity to clients, preventing MitM attacks.
Steps (run on each host):
-
Generate host keys (if not already present; specify the type to avoid clutter):
sudo ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ""
Note
Your distribution most likely does this for you. On Fedora 43 it's handled by a service called sshd-keygen, which runs on every boot and generates the host key if it doesn't exist. It actually generates 3 of them.
-
Basic signing:
ssh-keygen -s ca_key -I "web01.ejemplo.tld" -h -z 1001 /etc/ssh/ssh_host_ed25519_key.pub
Output: /etc/ssh/ssh_host_ed25519_key-cert.pub.
-
Limit it to principals (hostnames and IPs to make it more robust):
ssh-keygen -s ca_key -I "web01.ejemplo.tld" -h -n web01.ejemplo.tld,web01,10.0.0.5 -z 1002 /etc/ssh/ssh_host_ed25519_key.pub
-
Set the validity:
ssh-keygen -s ca_key -I "web01.ejemplo.tld" -h -V 20240101:20241231 -z 1003 /etc/ssh/ssh_host_ed25519_key.pub
Revoking Certificates
Use Key Revocation Lists (KRLs) to invalidate compromised certificates. Always use unique serial numbers (-z) so revocation actually works.
-
Create a KRL (from a certificate file):
ssh-keygen -k -f revoked.krl ~/.ssh/id_ed25519-cert.pub
-
Update the KRL (add another certificate to it):
ssh-keygen -k -u -f revoked.krl another-cert.pub
-
Revoke by serial number (you don't need the certificate itself; list the serial in a KRL specification file and point at the CA public key):
echo "serial: 10" > revoke.spec
ssh-keygen -k -u -f revoked.krl -s ca_key.pub revoke.spec
-
Distribute the KRL to servers (e.g. via config):
In /etc/ssh/sshd_config:
RevokedKeys /path/to/revoked.krl
-
Check revocation:
# Exit != 0 if it was revoked
ssh-keygen -Q -f revoked.krl ~/.ssh/id_ed25519-cert.pub
Configuring Trust
Trust is configured so that servers and clients recognize the CA and verify certificates. Without it, the certificates are worthless; the server rejects logins because it doesn't trust the CA's signature. Honestly, it's like handing someone a letter of recommendation from a person they've never heard of. Let's go through it piece by piece so it's crystal clear.
User certificates (server side)
Here, the centralized method is the one you want. Your users are already listed in the certificate, and you only have to distribute the CA public key to all the nodes. This is ideal for companies or large clusters, because you avoid touching per-user files.
I list the other methods just so you know about them, in case you need them in small setups.
-
Centralized method: Use TrustedUserCAKeys in /etc/ssh/sshd_config for centralized management. This directive tells SSH to trust the CA to sign user certificates, with no need for individual authorized_keys files.
TrustedUserCAKeys /etc/ssh/ca.pub
Then put the CA public key in /etc/ssh/ca.pub. Distribute it to all servers (e.g. with scp or Ansible). This avoids touching user files and simplifies revocations.
-
Per-user method: Add it to ~/.ssh/authorized_keys (useful for personal servers, but it doesn't scale). The "cert-authority" line indicates that any cert signed by that CA is valid for that user.
echo "cert-authority $(cat ca_key.pub)" >> ~/.ssh/authorized_keys
-
Decoupled identities: Use AuthorizedPrincipalsFile in sshd_config to map certificate principals to local users (e.g. map "juan@corp" to "centos"). Principals are like IDs in the cert that say who you are, without depending on the system username.
AuthorizedPrincipalsFile /etc/ssh/principals/%u
Create /etc/ssh/principals/centos containing:
juan@empresa
pancho@empresa
Note
Make sure the file is owned by root:root with 600 permissions to prevent privilege escalation. On Fedora, SELinux may require the correct contexts (e.g. restorecon -Rv /etc/ssh/principals). This allows shared accounts without per-user authorized_keys, but check that the principals match exactly.
Host certificates (server side)
Host certificates prove the server's identity to the client. Configure them by adding the certificate path to sshd_config. SSH will present it automatically during connections to prevent MitM attacks.
-
Add the certificate path to /etc/ssh/sshd_config:
HostCertificate /etc/ssh/ssh_host_ed25519_key-cert.pub
Client side:
The client also needs to trust the CA in order to verify certificates.
-
For host certificates: Add the CA to ~/.ssh/known_hosts or /etc/ssh/ssh_known_hosts. The "@cert-authority" pattern with a wildcard (*.ejemplo.tld) trusts any host in that domain as long as it is signed by the CA, preventing spoofing.
echo "@cert-authority *.ejemplo.tld $(cat ca_key.pub)" >> ~/.ssh/known_hosts
-
For user certificates: SSH loads certificates automatically if they are named appropriately: id_ed25519-cert.pub next to the private key. If you use ssh-agent, add it with ssh-add so it's available.
Reload the SSH service after making changes: systemctl reload sshd. Test with ssh -v user@host to see whether the trust setup works.
Examples
Example 1: Basic User Certificate
-
Scenario: Standard user access.
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519
ssh-keygen -s ca_key -I "pancho@empresa" -z 10 ~/.ssh/id_ed25519.pub
# Distribute ca_key.pub to the servers
# Login: ssh pancho@servidor
Example 2: Restricted Backup User
-
Scenario: Automated rsync with IP/command limits. (Note: source-address can be fragile in dynamic cloud environments.)
ssh-keygen -s ca_key -I "respaldo@servidor" -n respaldo -O force-command="/usr/bin/rsync --server" -O source-address="10.0.0.0/8" -V +1d -z 11 ~/.ssh/id_ed25519.pub
Example 3: Host Certificate for a Cluster
-
Scenario: Secure cluster nodes. (Secure flow: Bring the host keys to the CA machine, sign locally, deploy the certificates.)
for host in node1 node2; do
  # Copy the host key to the secure CA machine
  scp $host:/etc/ssh/ssh_host_ed25519_key.pub /tmp/$host.pub
  # Sign locally with the CA
  ssh-keygen -s ca_key -I "$host.cluster" -h -z 1004 /tmp/$host.pub
  # Deploy the certificate back
  scp /tmp/$host-cert.pub $host:/etc/ssh/ssh_host_ed25519_key-cert.pub
done
# Clients add @cert-authority to known_hosts
Example 4: FIDO User Certificate
-
Scenario: Hardware-backed keys.
ssh-keygen -t ecdsa-sk -f ~/.ssh/id_ecdsa_sk
ssh-keygen -s ca_key -I "juan@fido" -z 12 ~/.ssh/id_ecdsa_sk.pub
Example 5: Certificate with Extensions
-
Scenario: Custom options.
ssh-keygen -s ca_key -I "dev@team" -O critical:source-address="192.168.1.0/24" -O extension:permit-pty -z 13 ~/.ssh/id_ed25519.pub
Example 6: Inspecting Certificates
-
View the details:
ssh-keygen -L -f ~/.ssh/id_ed25519-cert.pub
Example 7: Revoking a Certificate
-
Add it to the KRL and deploy:
ssh-keygen -k -u -f revoked.krl ~/.ssh/id_ed25519-cert.pub
scp revoked.krl server:/etc/ssh/
Best Practices and Security Considerations
-
CA private key security: Store the CA private key somewhere safe. Never use it on production servers. Rotate CAs annually or on compromise; use a dual CA: add both the old and the new CA public keys to TrustedUserCAKeys during the transition to avoid lockouts.
-
Validity periods: Use short lifetimes (days/weeks) for user certificates and longer ones for hosts (months/years). Automate renewal.
-
Auditing: Log certificate usage via sshd; monitor for anomalies.
-
Integration: Automate certificate distribution with Ansible (or whatever janky Ansible clone you prefer). An example Ansible task:
- name: Deploy the CA public key
  copy:
    content: "{{ ca_pub_key }}"
    dest: /etc/ssh/ca.pub
  notify: reload sshd
-
Avoid common mistakes: Don't sign certificates with weak options; test revocation; use ed25519.
Automation and Scaling
For large deployments, avoid exposing the CA private key. Use secure tooling:
-
Manual signing with a script (only on the secure CA machine):
#!/usr/bin/bash
# ssh-signer.bash - Run on the machine that holds the CA
user=$1
key=$2
serial=$( date +%s%N )  # Unique serial based on a timestamp
ssh-keygen -s ca_key -I "$user" -V +7d -O no-port-forwarding -z "$serial" "$key"
-
Host certificates in bulk: For provisioning, use OpenTofu to inject the CA public key. To set it up manually:
for host in $(cat hosts.txt); do
  serial=$( date +%s%N )  # Unique serial per host, based on a timestamp
  scp $host:/etc/ssh/ssh_host_ed25519_key.pub /tmp/$host.pub
  ssh-keygen -s ca_key -I "$host" -h -z $serial /tmp/$host.pub
  scp /tmp/$host-cert.pub $host:/etc/ssh/ssh_host_ed25519_key-cert.pub
done
-
Monitoring: Use ssh-audit or custom scripts to verify certificates (see the sketch below).
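A minimal monitoring sketch along those lines (hostnames and paths are placeholders); it fetches each host certificate and prints its validity window as reported by ssh-keygen -L:
#!/usr/bin/bash
# cert-expiry-check.bash - print the validity window of each host certificate
for host in node1 node2; do
  scp -q "$host":/etc/ssh/ssh_host_ed25519_key-cert.pub /tmp/"$host"-cert.pub
  echo "== $host =="
  ssh-keygen -L -f /tmp/"$host"-cert.pub | grep -i valid
done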
Conclusion
SSH certificates turn SSH from a simple protocol into a robust authentication system. By centralizing trust and enabling policies, they offer unmatched scalability and security. Start small, try it with a single user/host, then scale up. Remember, the CA is your crown jewel; protect it.
RTFM!
- man ssh-keygen
- man sshd_config
- Official OpenSSH docs: https://www.openssh.org/manual.html
- RFC 4251: https://datatracker.ietf.org/doc/html/rfc4251
Contribute improvements at:
https://github.com/openssh/openssh-portable
Cool, now go secure your deployments!
12 Dec 2025 4:00pm GMT
Tomasz Torcz: Fedora 42→43, more eventful than usual
12 Dec 2025 10:39am GMT
Fedora Magazine: 4 cool new projects to try in Copr for December 2025

This article series takes a closer look at interesting projects that recently landed in Copr.
Copr is a build-system for anyone in the Fedora community. It hosts thousands of projects with a wide variety of purposes, targeting diverse groups of users. Some of them should never be installed by anyone, some are already transitioning into the official Fedora repositories, and others fall somewhere in between. Copr allows you to install third-party software not found in the standard Fedora repositories, try nightly versions of your dependencies, use patched builds of your favourite tools to support some non-standard use-cases, and experiment freely.
If you don't know how to enable a repository, or if you are concerned about whether it is safe to use Copr, please consult the project documentation.
Vicinae
Vicinae is a fast application launcher written in C++/Qt. Inspired by Raycast, it provides instant app and file search and clipboard history. It also includes built-in utilities such as a calculator and web search, along with support for extensions written in TypeScript. It is designed to be highly responsive and native to the Wayland environment. If you like keeping your hands on the keyboard or want a customizable, extensible launcher for your desktop, Vicinae may be worth trying.

Installation instructions
The repo currently provides vicinae for Fedora 42, 43, and Fedora Rawhide. To install it, use these commands:
sudo dnf copr enable scottames/vicinae
sudo dnf install vicinae
UZDoom
UZDoom is a modern DOOM source port that builds upon the classic GZDoom engine, offering hardware-accelerated rendering, an updated scripting system, improved mod support, and high-quality audio playback. At the same time, it maintains compatibility with classic WAD files while making the experience smooth on current systems.
Whether you are playing the original episodes or diving into extensive mod packs, UZDoom offers a convenient way to enjoy them.
Installation instructions
The repo currently provides uzdoom for Fedora 42, 43, and Fedora Rawhide. To install it, use these commands:
sudo dnf copr enable nalika/uzdoom
sudo dnf install uzdoom
Plasma Panel Colorizer
Plasma Panel Colorizer is a widget for KDE Plasma that allows you to customize the panel's appearance. In addition, it offers options for background tinting, blur, custom opacity levels, shadows, floating panels, or themes that differ from the stock Plasma look. It also includes full blur support and is updated for Plasma 6, making it easy to adjust your panel exactly the way you want.

Installation instructions
The repo currently provides plasma-panel-colorizer for Fedora 42, 43, and Fedora Rawhide. To install it, use these commands:
sudo dnf copr enable peridot-augustus/plasma-panel-colorizer
sudo dnf install plasma-panel-colorizer
sfizz-ui
Sfizz-ui is the graphical interface for the sfizz sampler engine, which is an open-source player for SFZ instrument libraries. The UI provides an accessible way to load SFZ instruments, adjust parameters, and integrate the sampler into your workflow. It also includes plugin support such as LV2 and VST3, making it suitable for music creation in a Linux DAW environment.
For musicians, sound designers, or anyone using SFZ sample libraries, sfizz-ui offers a polished interface.
Installation instructions
The repo currently provides sfizz-ui for Fedora 41, 42, and 43. To install it, use these commands:
sudo dnf copr enable lexridge/sfizz-ui
sudo dnf install sfizz-ui
12 Dec 2025 8:00am GMT
11 Dec 2025
Fedora People
Peter Czanik: The syslog-ng Insider 2025-12: logrotation; release RPM; nightly RPM
11 Dec 2025 12:15pm GMT
10 Dec 2025
Fedora People
Fedora Infrastructure Status: matrix server upgrades
10 Dec 2025 12:30pm GMT
Ben Cotton: The do’s and don’ts of measuring contributions “outside of working hours”
LFX Insights is a handy platform from the Linux Foundation that provides a variety of data on open source projects. Among the statistics it reports is contributions outside of working hours. Some users reported errors with how this information is reported, which got me thinking about the value of this measure. The short version: there's very little value.
Why measure outside-of-working-hours contributions?
LFX Insights includes this measure as a signal of a project's sustainability. Projects that rely heavily on people making after hours contributions, the thinking goes, will have a harder time attracting and retaining contributors.
As a software consumer, you don't want your upstreams to suddenly disappear because that will present supply chain risks. It could mean vulnerabilities go unpatched. It could also mean that new features aren't added. Either way, this puts the onus on the project's users to carry the load.
As a project leader, you may be less concerned about whether or not a company downstream has to devote extra engineering time, but you probably do want your contributors to stick around anyway. Onboarding, mentoring, and growing contributors takes a lot of time and effort. You want to make sure people can stick around.
Why this measure fails
Despite the good intentions of measuring contributions outside working hours, the reality fails to deliver. There are some straightforward reasons for this. Not everyone's working hours are the same. Not everyone's working hours are consistent. Some people use a different time zone on their computer. Not everyone's working days are the same. Holidays vary widely across countries and religions. People (hopefully) take time off.
Then there's the cultural mismatch. Linux Foundation projects are, to a first approximation, by companies for companies. The Linux Foundation is a 501(c)(6), not a charity, so it makes sense that it would view the world through a business lens. I don't fault them for that. LF project contributors are more likely to make contributions during the working day than contributors to a "hobbyist" project.
But that workday tendency doesn't necessarily mean people will stick around projects longer if the project is tied to their job. As the last few years have shown, tech sector layoffs can come for anyone at any time. If someone is only working on an open source project because it's part of their job, then when the job changes, they'll probably stop. People who work on an open source project for non-job reasons will likely stick around through job changes.
Thus one could argue that a project with a high degree of outside-working-hours contributions is more sustainable.
What to measure instead
If measuring contributions outside of working hours isn't helpful, what is? Focus on what you're worried about. Worried that everyone will disappear? Measure the activity over time. Worried that when a new vulnerability is discovered the lone maintainer will be backpacking through the Alps? Measure the spread of the contributions. Worried that the project doesn't have enough people to follow secure coding practices? Measure the security posture.
Of course, the best answer is to stop trying to measure sustainability and contribute to making the project more sustainable instead.
This post's featured photo by Joshua Olsen on Unsplash.
10 Dec 2025 12:00pm GMT
Fedora Magazine: Talk to your Fedora system with the linux-mcp-server!

Generative AI systems are changing the way people interact with computers. MCP (model context protocol) is a way that enables LLMs to run commands and use tools to enable live, conversational interaction with systems. Using the new linux-mcp-server, let's walk through how you can talk with your Fedora system for understanding your system and getting help troubleshooting it!
Introduction
Large language models (LLMs) can be an invaluable tool when investigating an issue on a Linux system. However, this can involve a lot of copy/pasting of information from the Linux terminal into a web based interface to an LLM model.
The model context protocol (MCP) acts as a bridge, enabling LLMs to interact with external tools and data sources. The linux-mcp-server utilizes this protocol to give LLMs the ability to interact with a Fedora Linux system. Instead of you manually copying and pasting terminal output, the linux-mcp-server enables the LLM to directly query system information and log entries.
By giving an LLM direct access to system information and logs, it is transformed into an active part of the investigation process when troubleshooting an issue. It empowers the LLM to directly query the system state, helping it identify performance bottlenecks and spot important log entries that might be missed in a manual review.
What is the model context protocol (MCP)?
Anthropic introduced MCP in November 2024 as an open standard for LLM tool use. This provides a way for LLMs to interact with outside systems and data sources.
Prior to MCP, there wasn't as strong a standard and ecosystem for LLM systems to call tools. LLMs were thus frequently limited to have only the information contained in their training. They were isolated from the outside world. For example, if you asked an LLM "what is the weather going to be next week", the LLM would respond with a message indicating that it doesn't know what the weather will be, as it doesn't have access to that information. MCP helps solve this problem by enabling a standardized way for an LLM to access an outside data source, such as the weather forecast.
At a high level, users can use an AI agent application, such as Goose (open source), or Claude Desktop, and specify which MCP servers they would like to use. The AI agent application informs the LLM that there are tools available via these MCP servers that can be used to help answer the requests from the user. The LLM model can then decide when to invoke these tools.
MCP is an open standard. You have the flexibility to use MCP servers, such as linux-mcp-server, with either open source-licensed LLM models, or hosted proprietary LLM models.
What is the linux-mcp-server?
The linux-mcp-server is a project started by Red Hat's RHEL Engineering team. It provides a number of tools that enable an LLM to query information from a Linux system, such as system info, service information and logs, process information, journald and other logs, network information, and storage and disk information. For a full list of the tools provided, refer to the project's Github page.
These tools, provided by linux-mcp-server, are focused on providing the LLM access to read-only information. In the future, we'll be exploring expanding past these read-only use cases.
The linux-mcp-server can be used to interact with the local Fedora Linux system that it is running on. It can also be used to interact with remote Fedora Linux systems over SSH. For example, if you have SSH key authentication set up with the remote systems, you could make a request to your AI agent application such as "Determine the current memory usage on the fedora1.example.com, fedora2.example.com, and fedora3.example.com servers".
Prerequisites
The main components needed are an AI agent application, access to LLM model inference, and the linux-mcp-server.
There are a number of options for the AI agent application, both open source and proprietary. An example of an open source AI agent is Goose, which provides an RPM that can be installed on Fedora.
There are several LLM model inference options, ranging from locally hosted open source models, to proprietary hosted LLM models. If hosting a model locally, you generally need to have GPU/accelerator hardware available. Open source tools that can be used to locally host LLM models include RamaLama, llama.cpp, and Ollama. There are a number of open source LLM models capable of tool calling. These include models such as gpt-oss-20b and Granite 4 small.
Once you've installed your preferred AI agent application, the next step is to install the linux-mcp-server and configure your AI agent application to use it. The linux-mcp-server Github page includes installation instructions, and configuration details for Goose and Claude Desktop.
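For illustration only, an MCP server entry in a Claude Desktop-style claude_desktop_config.json generally has the following shape; the command and arguments here are placeholders, so use whatever the linux-mcp-server README specifies for your installation method:
{
  "mcpServers": {
    "linux-mcp-server": {
      "command": "linux-mcp-server",
      "args": []
    }
  }
}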
Use cases for linux-mcp-server
NOTE
The example scenarios below include LLM generated output. The output is presented here as it was generated by the AI systems cited so you can assess the quality of the output they are capable of, particularly in a non-cloud, self-hosted scenario.
The sample LLM output provided here is not intended to be run by readers without the same level of review we recommend anyone perform on AI generated content prior to use on their systems.
Dear Fedora, why are you so slow today?
Has this happened to you? Time to hop on a call and give a demo. You're pumped. It's your go-time aaaaand… your system starts lagging. Folks on the call say your video is choppy or they can't understand you. OH NO.
You've got to assess what's up, quickly. Run `top`, check `dmesg`, check your network connection (read the man page for `netstat` because you can't remember the flag you want), check `journalctl`, cross-reference what you're seeing, then start killing processes, restarting services, reloading kernel modules willy-nilly based on hastily-crafted theories, see if it improves…
This is a case where you could actually ask the system to tell you why it's slow and give it the ability to do the analysis via the linux-mcp-server. The LLM you connect with won't just run the commands through the MCP server; it will understand what the numbers it gets back mean in context and be able to explain them in human terms.
So let's start working with a relatively new laptop. It's a ThinkPad T14s with the AMD Ryzen AI Pro 7 (Ryzen AI 360 chip). Recently, one of the Wi-Fi networks it must connect to regularly was upgraded, and it now has issues with connection stability. So let's talk to Goose + gpt-oss:20b + linux-mcp-server about it. Here's the initial prompt I used:
My Wifi doesn't work very well. It cuts out after a few minutes. Help me fix it. Can you find the error messages in the system logs and diagnose the problem?
It responded with the following:
LLM output 
I'll help you diagnose your WiFi connectivity issues by examining the system logs for error messages related to your network interfaces and wireless connectivity. Let me start by gathering information about your system and checking the relevant logs.
It then made a series of tool calls, including Get System Information, Get Network Interfaces, Get Journal Logs, Get Hardware Information, Get Service Status service_name:NetworkManager, Get Service Status service_name:wpa_supplicant, Read Log File. The result was this analysis, which is the actual problem I was hoping it could find!
LLM output 
Problem Summary
Your ATH12K PCI (Qualcomm 802.11ax) wireless card repeatedly reports "failed to enqueue rx buf: -28" every few minutes, which triggers a drop of the wireless link. The driver is trying to enqueue an RX buffer for the hardware but fails with errno ‑28.
‑28 → -ENOMEM (out‑of‑memory for the driver buffer); the kernel cannot allocate a DMA‑coherent RX descriptor.
Likely Root Causes
| Cause | Evidence / Typical Scenario | Fix / Work‑around |
|---|---|---|
| Driver bug or missing patch | The kernel's ath12k driver has had a long‑running issue with RX ring management in older 5.15/5.4 series. | Update the kernel or apply the latest patch from the linux tree (5.18+ or the ath12k backports). |
| Out‑of‑memory in the driver | The error itself is "‑28" - a memory‑allocation failure inside the driver. | Reduce the number of RX descriptors, or enable fast‑RX disabled mode (modprobe ath12k_pci ath12k_pci_disable_fast_rx=1). |
| Firmware mismatch / corruption | Old firmware (ath12k_hwgen files) don't match the driver version. | Reinstall firmware from your distro (sudo apt‑install --reinstall linux‑firmware) or copy the latest from kernel source. |
| System memory pressure | If many big processes run, DMA‑coherent memory pool can be exhausted. | Free RAM, close memory‑heavy apps, or add swap. |
| ACPI/PCIe power‑management quirk | Some laptops throttle the Wi‑Fi device after idle, resetting the buffer queue. | Add pci_bus_reset = off or set ath12k_pci_disable_ps=1 in a modprobe conf. |
| Hardware issue (e.g., faulty PCIe slot or NIC) | Persistent failure after multiple driver/kernel upgrades. | Test on another machine/PCIe slot or replace the card. |
Diagnostic Checklist
- Kernel & Driver Version
  uname -r
  modinfo ath12k_pci | grep ^version:
- Driver Parameters
  cat /sys/module/ath12k_pci/parameters/*
  See if ath12k_pci_disable_fast_rx or similar is already set.
- Firmware Location
  ls /lib/firmware/ath12k/ath12k/ar9009/ | grep wlan0 | head
  If the files are missing or older than the driver, copy newer ones from /lib/firmware of a newer kernel.
- Kernel Log Around the Error
  sudo dmesg --ctime | grep -A3 'ath12k_pci 0000:c2:00.0'
  Observe if there are any other "-ENOMEM" or "resource exhaustion" messages.
- Memory Usage Stats
  free -h
  vmstat 1 5
- PCIe Latency / Link State
  sudo lspci -vvv -s c2:00.0 | grep -iE "link|latency"
Quick‑Fix Path (Try in this order)
| Step | Command | What it does |
|---|---|---|
| 1 | sudo apt update && sudo apt upgrade linux-firmware | Refresh firmware to the latest distro version. |
| 2 | sudo apt upgrade linux-image-$(uname -r) | Apply latest kernel with patched ath12k driver (or switch to 5.18+ if still on 5.15). |
| 3 | echo "ath12k_pci ath12k_pci_disable_fast_rx=1" \| sudo tee /etc/modprobe.d/ath12k.conf | Force the driver to use a more conservative RX scheme. |
| 4 | Reboot and monitor: dmesg -T \| grep ath12k_pci | Verify the error disappears. |
| 5 | Optional: test on another machine or use a USB‑WiFi dongle (USB 802.11ax / 802.11ac) as a temporary workaround. | Ensure the issue is not hardware‑specific. |
**Tip:** After reboot, you can also verify the driver parameters again (see the cat /sys/module/ath12k_pci/parameters/* check in the diagnostic checklist above).
Advanced Debugging (if basic fixes don't help)
- Enable debug logs
  sudo bash -c 'echo 1 > /sys/module/ath12k_pci/parameters/debug'
  sudo dmesg -T | grep ath12k
- Check PCIe device state
  sudo lspci -vv -s c2:00.0
- Use ethtool on the Wi‑Fi interface
  sudo ethtool -S wlp4s0
- Look for firmware logs
  grep -i 'ath12k' /var/log/syslog
Bottom Line
The "failed to enqueue rx buf: -28" is an internal driver allocation failure that typically follows one of two paths:
- The driver or firmware is out of date → upgrade kernel/firmware.
- The driver configuration is too aggressive for your system's memory budget → tweak module parameters or free RAM.
If you're still stuck after following the quick‑fix path, it's worth capturing more detailed logs with the above diagnostic commands and sharing them. We can then cross‑reference the exact kernel version/driver commit to see if a known patch exists.
Feel free to let me know which step you've tried or if you encounter any new log messages, and we'll dig deeper!
Not too bad an analysis and set of debugging steps! I'll follow up in the comments and let you know if the solution works.
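One caveat: the model's quick-fix table assumes a Debian-style system and suggests apt commands, but this laptop runs Fedora. A rough sketch of the Fedora-native equivalents of the firmware and kernel steps (assuming the stock linux-firmware and kernel packages):
# refresh metadata and update the Wi-Fi firmware package
sudo dnf upgrade --refresh linux-firmware
# pull in the latest available kernel, then reboot into it
sudo dnf upgrade --refresh kernel
# the module option from step 3 would still go in /etc/modprobe.d/ath12k.conf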
Where'd my disk space go, Fedora?
Trawling through `du` on the command line, converting the numbers in your head, sorting them - it can be tedious. We even have graphical tools like `baobab`, but it takes time to load in the results and explore the data. And sometimes you might have questions it can't answer, such as the risk of deleting a specific directory or the age of a specific file. It's easier to be sloppy and just let the disk fill up. But there comes a time when that is no longer an option.
With the linux-mcp-server, you can ask the system to analyze your disk usage, give you a hitlist of the top space-eating places on the filesystem, and offer advice, with a risk analysis, on how to clean it up.
Here's my prompt:
I need help analyzing disk usage on my system. Can you: 1. Show me the current disk usage across all filesystems using get_disk_usage 2. Use list_directories_by_size to show me the 10 largest directories (this will help identify where disk space is being consumed) 3. Check if there are any large log files consuming space using get_journal_logs or read_log_file 4. Review running processes with list_processes to see if any are consuming excessive resources or creating temporary files After gathering this information, help me identify potential space-saving opportunities and assess the safety of cleaning up specific directories. Please don't only examine top-level directories. I want you to go deep if needed to find out where the largest files are!
I ran this prompt on my Fedora 42 system using linux-mcp-server with Goose connected to locally-served gpt-oss:20b, and here's the output I got:
LLM output 
Disk‑space overview (as of Nov 24 2025 15:21)
| Filesystem | Size | Used | Avail | Use % | Mounted on |
|---|---|---|---|---|---|
| /dev/mapper/luks‑61cedb0e… | 952.3 GB | 548.9 GB | 399.6 GB | 57.9 % | / |
| /dev/mapper/luks‑61cedb0e… | 952.3 GB | 548.9 GB | 399.6 GB | 57.9 % | /home |
| /dev/nvme0n1p2 | 973.4 MB | 462.8 MB | 443.4 MB | 51.1 % | /boot |
| /dev/nvme0n1p1 | 598.8 MB | 19.3 MB | 579.6 MB | 3.2 % | /boot/efi |
Overall usage is comfortably below 60 % of a 952 GB volume, so immediate capacity pressure isn't a concern. The biggest contributors to disk space today are in the ~/. hierarchy.
1. Biggest directories on the system
| Path (relative to the mount point) | Size (≈) | Notes |
|---|---|---|
| ~/ - User home (2 mounts: /home + /) | 1 TB total | 57 % used; the bulk of the space is in ~/ |
| ~/duffy/.local/share | 238 GB | Containers sub-dir, followed by Trash (7GB) - typical podman/docker container images |
| ~/duffy/.local/share/containers | 238 GB | Container layers, images, root filesystems - the prime candidate for cleanup |
| ~/duffy/.local | 266 GB | share (237 GB) dominates it |
| ~/duffy/.local/share/containers (inside share) | 238 GB | Same as above - just a different view |
| ~/duffy/.local/lib | 13 GB | Likely a virtual‑environment library set |
| ~/duffy/.local/bin | 4 GB | Helper binaries |
| ~/duffy/.local/share/Trash | 7 GB | User‑deleted items still resident on disk |
| ~/duffy/.local/share/containers (nested) | 238 GB | Same as above; redundant listings from recursive depth‑1 scans |
| ~/duffy/.local/share (top‑level) | 238 GB | containers is the absolute largest item |
Take‑away - 75 % of the 238 GB live inside three nested directories:
- ~/duffy/.local/share/containers (238 GB) → Rooted in Podman/Container‑Store
- ~/duffy/.local/share (238 GB) → Same data exposed at a higher level
- ~/duffy/.local/share/Trash (7 GB)
2. Possible space savers
| Category | Recommendation | Rationale |
|---|---|---|
| Containers | podman system prune -a (or docker system prune -a if Docker is in use) | Removes unreferenced images, containers, and caches. This can reclaim 200-250 GB if most containers are outdated or no longer referenced. |
| Trash | gio trash --empty | The 7 GB in Trash is still taking physical space; simply emptying it will free that amount. |
| Large libs/binaries | Check ~/duffy/.local/lib - 13 GB, and ~/duffy/.local/bin - 4 GB | If these are not actively used by a project, consider uninstalling or moving them to a secondary, less‑frequented store. |
| Ramalama & UV | 9 GB (ramalama), 8 GB (uv) | Possibly temporary files or caches from development tools. Inspecting the contents of these directories (du -h) can reveal if they are still needed. |
| Other caches | pnpm store prune, virtualenv or pip cache purge | Development tool caches often accumulate; they're usually safe to delete once you're done with the current development sessions. |
| Logs | The system journal shows normal activity, and syslog access is blocked by policy. | Systemd journal stores are rotated automatically; no urgent action required. |
| Boot partitions | Not relevant - they're low‑usage (boot/efi only 3 % used). | No action needed. |
3. Disk‑IO perspective
- Reads: 111.8 GB; Writes: 85.6 GB since boot (≈ 1.2 TB total).
- If writes are high relative to free space, consider enabling a tmpfs overlay for temporary files or moving large data to a separate, faster SSD. But in this case, writes are not approaching the 399 GB free.
4. Suggested cleanup workflow
# 1. Empty Trash
gio trash --empty
# 2. Prune unused containers/images
podman system prune -a # or docker system prune -a
# 3. Remove leftover caches
pnpm store prune # if pnpm is installed
pip cache purge # for Python
virtualenv --clean # optional
# 4. Inspect large "RAM" directories manually if you need deeper cleanup
du -h --max-depth=1 ~/duffy/.local/share/containers
Tip: After each step run df -h /home or du -sh ~/duffy/.local/* to see the impact.
5. Safety checklist
| Action | Why it's safe (or risk) |
|---|---|
| Emptying Trash | Deletes only user‑deleted files; confirm you don't need the content. |
| podman system prune -a | Deletes container images, containers, and volumes that are not referenced by any running or stopped container. Verify with podman ps -a before pruning. |
| Removing lib/bin from .local | Only safe if you're sure those libraries/bin files aren't part of a running application. |
| Caution with tmpfs or moving large directories | If you move a directory on‑the‑fly, ensure no running application depends on that path. |
Bottom line
- Current free space: ≈ 400 GB on both / and /home.
- Largest consumptions: User‑space under ~/duffy/.local/share/containers (≈ 238 GB).
- Actionable clean‑ups: Empty Trash, prune unused container images, and review large libraries/binaries. This should reclaim over 200 GB and bring the usage from 57.9 % down to roughly 40‑45 % - plenty of headroom for the future.
Feel free to ask what next steps you'd like to dig into (e.g., detailed contents of the containers dir, specific application logs, or targeted pruning commands).
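Before acting on the container-image suggestion, a quick sanity check is worthwhile: podman system df reports how much of that space is actually reclaimable, and podman ps -a shows which containers you might still care about. A minimal sketch of a cautious cleanup pass:
# how much space images, containers and volumes use, and how much is reclaimable
podman system df
# review running and stopped containers before deleting anything
podman ps -a
# prune unreferenced images and containers, then confirm the effect
podman system prune -a
df -h /home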
Upgrade planning
So you may have noticed a little detail above - the system is a Fedora 42 system… and Fedora 43's been out a month now!
So, time to upgrade.
This example shows where we have some tools missing from the set provided in the linux-mcp-server. We're including it for two reasons:
- So you can see how this works manually. Even when specific tools you might need are not available in the MCP server, you can have the response give you instructions on commands to run on your own, and copy/paste the command output back into your chat to get analysis alongside the data the system is able to pull via the MCP tool calls.
- To encourage you to submit additional tools to our project!
We would love your additions to the project! Here's where to get started: https://github.com/rhel-lightspeed/linux-mcp-server/blob/main/docs/CONTRIBUTING.md
Here's the prompt I started with, with the same Goose + gpt-oss:20b + linux-mcp-server combination:
You are a Linux system administrator assistant analyzing a Fedora system for upgrade readiness. TASK: Examine this Fedora 42 system and provide a comprehensive upgrade readiness report for Fedora 43. ANALYSIS CHECKLIST: 1. Check current Fedora version: cat /etc/fedora-release 2. Review system updates status: dnf check-update 3. Identify third-party repositories: dnf repolist 4. List installed packages from non-Fedora repos: dnf list installed | grep -v @fedora 5. Check for broken dependencies: dnf check 6. Review disk space on root partition: df -h / 7. Check for unsupported or deprecated packages 8. Identify custom kernel modules: lsmod | grep -v "^Module" 9. Review SELinux status: sestatus 10. Check for pending system updates: dnf upgrade --refresh --assumeno REPORT FORMAT: # Fedora 43 Upgrade Readiness Report ## Current System Status - Fedora version: [version] - Kernel: [kernel version] - Updates status: [current/outdated] ## Potential Issues [List blocking issues with HIGH/MEDIUM/LOW severity] ## Third-Party Software [List non-Fedora packages that may need attention] ## Recommendations [Specific pre-upgrade steps needed] ## Overall Readiness: [READY/NEEDS ATTENTION/NOT READY] Run the necessary commands and provide this analysis. Be specific about any issues found and give actionable recommendations.
Now, right away the model came back to me to complain it doesn't have access to `dnf`, `cat`, etc. And that's expected here. What it did was give me a list of homework to run for it to complete the analysis… for example, `dnf check-update` and `cat /etc/fedora-release`. I had a little back and forth in Goose with the model where it would ask me to run a command and I'd copy the output into the context. This resulted, finally, in the following report:
LLM output 
Fedora 43 Upgrade Readiness Report
Analyzed System - Fedora 42 (Kernel 6.17.7‑200.fc42.x86_64)
Date - 2025‑11‑24
1. Current System Status
| Item | Value |
|---|---|
| Fedora release | 42 (Adams) |
| Kernel | 6.17.7‑200.fc42.x86_64 |
| Root filesystem | 953 GB LUKS, 58 % used (411 GB free) - > 15 % free |
| /var /usr | Same as root partition (mounted the same LUKS device) |
| SELinux | enabled (policy targeted), currently permissive |
| System‑wide packages | No held packages (dnf versionlock list empty) |
| System modules | No modules held. |
TL;DR - The base system is clean; the kernel is up‑to‑date for 42, there is ample disk space, and there are no version‑locked packages.
2. Third‑Party Repos & Packages
| Repository | Description | Current dnf repolist status |
|---|---|---|
| rpm-fusion-free, rpm-fusion-nonfree, rpm-fusion-updates | Widely used RPM‑Fusion components | Enabled |
| google-cloud-cli | Google Cloud SDK (RPM built for EL 10) | Enabled |
| visual-studio-code | Microsoft VS Code RPMs | Enabled |
| command‑line‑assistant, google‑chrome, PyCharm, unsupported‑fedora‑packages, goose, leaktk | Various Copr builds (mostly Fedora‑42 based) | Enabled |
What we need to verify
- Is a Fedora 43 build available for each of the non‑Fedora repos above?
- Does google-cloud-cli have a Fedora 43‑compatible package? (EL 10 build may not be usable).
- Are there any el10‑specific dependencies that could surface during the upgrade?
- Any custom kernel modules that will require re‑compilation against the 43 kernel.
3. Potential Issues & Severity
| Issue | Severity | Why it matters | Immediate Remedy |
|---|---|---|---|
| Non‑Fedora packages (e.g., google‑cloud‑cli) | MEDIUM | May contain packages built for EL 10; could conflict with Fedora 43 libs. | Verify if a Fedora 43 build exists or plan to remove/re‑build it. |
| Third‑party repos without 43 builds | MEDIUM | dnf system-upgrade pulls all enabled repos; lacking 43 packages causes transaction failure. | Disable such repos temporarily (dnf config-manager --set-disabled <repo>) or enable the 43 stream when available. |
| Custom kernel modules | LOW | Modules compiled against 42 kernel won't load on 43; services may drop. | Re‑compile or patch modules after upgrade. |
| Broken dependencies reported by dnf check | HIGH | Would block the upgrade transaction. | Run sudo dnf check now; fix any reported issues. |
| Insufficient disk space | HIGH | Upgrade scripts need > 15 % free. | Not an issue (58 % free). |
Current High‑Risk Items - None detected yet.
Next Action - Confirm each third‑party repo has 43‑compatible packages and set SELinux to enforcing.
4. Recommendations (Pre‑Upgrade Checklist)
| Step | Command | Purpose |
|---|---|---|
| Confirm 43‑stream packages | `dnf list --available * \| grep <repo-id>` | |
| Verify SELinux enforcement | sestatus | Should show enforcing. |
| Check for broken dependencies | sudo dnf check | No errors should appear. |
| Refresh all repos | sudo dnf upgrade --refresh | Pull latest metadata. |
| Dry‑run upgrade | sudo dnf system-upgrade download --releasever=43 --allowerasing | Detect any missing packages or conflicts. |
| Optional: Disable problematic repos | sudo dnf config-manager --set-disabled <repo> | Prevent upgrade crash if a repo has no 43 build. |
| Persist SELinux config | sudo sed -i 's/SELINUX=permissive/SELINUX=enforcing/' /etc/selinux/config | Make enforcement permanent. |
5. Post‑Upgrade Actions
- Reboot into the new kernel (dnf system-upgrade reboot).
- Re‑compile any custom kernel modules against the new kernel (yum install kernel-devel-$(uname -r) then rebuild modules).
- Re‑enable any disabled repos once the upgrade completes.
- Verify critical services (systemctl status <service>) and check logs.
- Run sudo dnf distro-sync to ensure all packages are the latest from Fedora 43.
6. Overall Readiness
| Criteria | Status | Comment |
|---|---|---|
| Disk space | Ready | 58 % free |
| Package health | Ready | No held or broken packages |
| Third‑party repos | Conditional | Verify 43‑compatibility |
| Custom modules | Pending | Need to identify modules and re‑compile |
| Final verdict | READY | after addressing above points |
Next step - Run the three verification commands below and share the output with me:
sudo dnf check
sudo dnf repolist
dnf list installed | grep -v @fedora
With those outputs I can pinpoint any remaining blockers and give you a final, ready‑to‑proceed plan.
Happy upgrading!
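For reference, once the readiness items above check out, the actual Fedora 42 to 43 upgrade is the standard dnf system-upgrade flow from the Fedora docs; a minimal sketch (with dnf5 the system-upgrade command should be available out of the box; on older dnf you may need to install the dnf-plugin-system-upgrade package first):
# bring the current release fully up to date
sudo dnf upgrade --refresh
# download the Fedora 43 package set, then reboot into the offline upgrade
sudo dnf system-upgrade download --releasever=43
sudo dnf system-upgrade reboot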
Give it a try, let us know what you think!

You can see from these few examples that tool calling with LLMs is a valuable aid for troubleshooting Linux systems. We could use your help building this and making it awesome! How can you help?
- Give it a try, let us know what you think, file any bugs you find, and let us know what tools are missing that you could use! You can respond in the comments here, file a bug, or chat with us on Fedora matrix.
- We're primarily working with Goose as our client, but if you have another preferred client and want to help us support it better, we'd love to work with you!
- Get involved in the linux-mcp-server project generally; we'd love to see your PRs!
- Let us know what you'd like to see in the future. What workflows would you like to see supported? How do you see this making your Fedora or overall Linux experience better? What larger workflows do you see this plugging into?
Join us upstream at https://github.com/rhel-lightspeed/linux-mcp-server! And come chat with us in the Fedora AI/ML SIG chat room on Matrix!
10 Dec 2025 8:00am GMT
Fedora Infrastructure Status: Fedora Copr outage - updating servers
10 Dec 2025 7:00am GMT
Chris Short: Short Family's Ultimate Holiday Movie Marathon: 29 Holiday Films
10 Dec 2025 5:00am GMT