03 Jan 2020


Gentoo News: FOSDEM 2020

FOSDEM logo

It's FOSDEM time again! Join us at the Université libre de Bruxelles, Campus du Solbosch, in Brussels, Belgium, where FOSDEM 2020 will be held on February 1st and 2nd.

Our developers will be happy to greet all open source enthusiasts at our Gentoo stand in building K where we will also celebrate 20 years compiling! Visit this year's wiki page to see who's coming.

03 Jan 2020 12:00am GMT

28 Dec 2019


Alexys Jacob: Scylla Summit 2019

I've had the pleasure of attending and presenting at the Scylla Summit in San Francisco again, and the honor of being awarded the Most innovative use case of Scylla.

It was a great event, full of friendly people and passionate conversations. Peter did a great full write-up of it already so I wanted to share some of my notes instead…

This is a curated set of topics that I happened to question or discuss in depth, so this post is not meant to be taken as full coverage of the conference.

Scylla Manager version 2

The upcoming version of scylla-manager is dropping its dependency on an SSH setup, which will be replaced by an agent, most likely shipped as a separate package.

On the features side, I was a bit puzzled by the fact that ScyllaDB is advertising that its manager will provide a repair scheduling window so that you can control when it's running or not.

Why did it strike me, you ask?

Because MongoDB does the same thing within its balancer process and I always thought of this as a patch to a feature that the database should be able to cope with by itself.

And that database-do-it-better-than-you motto is exactly one of the promises of Scylla, the boring database, so smart at handling workload impacts on performance that you shouldn't have to start playing tricks to mitigate them… I don't want this time window feature in scylla-manager to be a Trojan horse heralding the demise of that promise!

Kubernetes

They were almost late to this one, but they are working hard to play well with the new toy of everyone in tech around the world. Helm charts are also being worked on!

The community-developed Scylla operator by Yannis is now being worked on and backed by ScyllaDB. It can deploy a cluster and scale it up and down.

A few things to note:

Change Data Capture

Oh boy this one was awaited… but it's now coming soon!

I inquired about its performance impact, since every operation will be written to a table. Clearly my questioning was a bit alpha, since CDC is still being worked on.

I had the chance to discuss ideas with Kamil, Tzach and Dor: one of the things my colleague Julien asked for was the ability for CDC to generate an event when a tombstone is written, so we could actually know when a specific piece of data expired!

I want to stress a few other things too:

LightWeight Transactions

Another long-awaited feature is also coming, thanks to the amazing work and knowledge of Konstantin. We had a great conversation about the differences between the Paxos-based LWT implementation currently being worked on and a possible later Raft-based one.

So yes, the first LWT implementation will be using Paxos as its consensus algorithm. This will make the LWT feature very consistent, while being slower than what could be achieved using Raft. That's why ScyllaDB has plans for another implementation that could be faster, with weaker data consistency guarantees.
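To make that concrete, here is a sketch of what a conditional (LWT) write could look like once the feature lands, issued through cqlsh; the keyspace and table names are made up for the example:

cqlsh -e "INSERT INTO ks.users (id, name) VALUES (1, 'alice') IF NOT EXISTS;"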

User Defined Functions / Aggregations

This one is bringing the Lua language inside Scylla!

To be precise, it will be LuaJIT, as its footprint is low and Lua can be cooperative enough; the ScyllaDB people made sure to monitor its violations (when it should yield but does not) and to act strongly upon them.

I got into implementation details with Avi; this is what I noted:

Text search

Dejan is the text search guy at ScyllaDB, and the one who kindly implemented the LIKE feature we asked for, which will be released in the upcoming 3.2 version.
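As a rough illustration of what the feature enables (the table and column names are made up, and the exact indexing requirements depend on the Scylla version), a LIKE query through cqlsh could look like this:

cqlsh -e "SELECT name FROM ks.products WHERE name LIKE 'gentoo%' ALLOW FILTERING;"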

We discussed ideas and projected use cases to make sure that what's going to be worked on will be used!

Redis API

I've always been frustrated with Redis: while I love the technology, I never trusted its clustering and scaling capabilities.

What if you could scale your Redis like Scylla without giving up on performance? That's what the implementation of the Redis API backed by Scylla will get us!

I'm desperately looking forward to seeing this happen!

28 Dec 2019 7:04pm GMT

24 Dec 2019


Michał Górny: Handling PEP 517 (pyproject.toml) packages in Gentoo

So far, the majority of Python packages have either used distutils, or a build system built upon it. Most frequently, this was setuptools. All those solutions provided a setup.py script with a semi-standard interface, and we were able to handle them reliably within distutils-r1.eclass. PEP 517 changed that.

Instead of a setup script, packages now only need to supply declarative project information in a pyproject.toml file (fun fact: a TOML parser is not even part of the Python stdlib yet). The build system used is specified as a combination of a package requirement and a backend object to use. The backends are expected to provide a very narrow API: it's limited to building wheel packages and source distribution tarballs.

The new build systems built around this concept are troublesome to Gentoo. They are more focused on being standalone package managers than build systems. They lack the APIs matching our needs. They have large dependency trees, including circular dependencies. Hence, we've decided to try an alternate route.

Instead of trying to tame the new build systems, or work around their deficiencies (e.g. by making them build wheel packages, then unpacking and repackaging them), we've explored the possibility of converting the pyproject.toml files into setup.py scripts. Since the new formats are declarative, this should not be that hard.

We've found the poetry-setup project, which seemed to have a similar goal. However, it had already been discontinued at the time in favor of dephell. The latter project looked pretty powerful, but the name was rather ominous: we did not need most of its functions, and it was hell to package.

Finally, I've managed to dedicate some time to building an in-house solution instead. pyproject2setuppy is a small-ish (<100 SLOC) pyproject.toml-to-setuptools adapter which allows us to run flit- or poetry-based projects as if they used regular distutils. While it's quite limited, it's good enough to build and install the packages that we needed to deal with so far.

The design is quite simple - it reads pyproject.toml and calls setuptools' setup() function with the metadata read. As such, the package can even be used to provide a backwards-compatible setup.py script in other packages. In fact, this is how its own setup.py works - it carries flit-compatible pyproject.toml and uses itself to install itself via setuptools.

dev-python/pyproject2setuppy is already packaged in Gentoo. I've sent eclass patches to easily integrate it into distutils-r1. Once they are merged, installing pyproject.toml packages should be as simple as adding the following declaration into ebuilds:

DISTUTILS_USE_SETUPTOOLS=pyproject.toml

This should make things easier both for us (as it saves us from having to hurriedly add new build systems and their NIH dependencies) and for users who will not have to suffer from more circular dependencies in the Python world. It may also help some upstream projects to maintain backwards compatibility while migrating to new build systems.
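Once the eclass patches are merged, an ebuild for a pyproject.toml-based package could look roughly like the following sketch (the package name, version and metadata here are made up for illustration):

# dev-python/example-1.0.ebuild (hypothetical)
EAPI=7

PYTHON_COMPAT=( python3_{6,7} )
DISTUTILS_USE_SETUPTOOLS=pyproject.toml

inherit distutils-r1

DESCRIPTION="Hypothetical flit/poetry-based package"
HOMEPAGE="https://example.org/"
SRC_URI="https://example.org/${P}.tar.gz"

LICENSE="MIT"
SLOT="0"
KEYWORDS="~amd64"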

24 Dec 2019 10:59pm GMT

19 Dec 2019


Michał Górny: A distribution kernel for Gentoo

The traditional Gentoo way of getting a kernel is to install the sources, and then configure and build one yourself. For those who didn't want to go through the tedious process of configuring it manually, an alternative route of using genkernel was provided. However, neither of those variants was able to really provide the equivalent of kernels provided by binary distributions.

I manually configured the kernels for my private systems a long time ago. Today, I wouldn't really have bothered. In fact, I realized that for some time I've been really hesitant to even upgrade them because of the effort needed to update the configuration. The worst part is, whenever a new kernel does not boot, I have to ask myself: is it a real bug, or is it my fault for configuring it wrong?

I'm not alone in this. Recently Михаил Коляда (zlogene) talked to me about providing binary kernels for Gentoo. While I have not strictly implemented what he had in mind, he inspired me to start working on a distribution kernel. The goal was to create a kernel package that users can install to get a working kernel with minimal effort, and that would be upgraded automatically as part of regular @world upgrades.

Pros and cons of your own kernel

If I am to justify switching from the old tradition of custom kernels to a universal kernel package, I should start by discussing the reasons why you may want to configure a custom kernel in the first place.

In my opinion, the most important feature of a custom kernel is that you can fine-tune it to your hardware. You just have to build the drivers you need (or may need), and the features you care about. The modules for my last custom kernel occupied 44 MiB; the modules for the distribution kernel occupy 294 MiB. Such a difference in size also comes with a proportional increase in build time. This can be an important argument for people with low-end hardware. On the other hand, the distribution kernel permits building reusable binary packages that can save more computing power.

The traditional Gentoo argument is performance. However, these days I would be very careful arguing about that. I suppose you are able to reap benefits if you know how to configure your kernel towards a specific workload. But then - a misconfiguration can have the exact opposite effect. We must not forget that binary distributions are important players in the field - and the kernel must also be able to achieve good performance when not using a dedicated configuration.

At some point I have worked on achieving a very fast startup. For this reason I've switched to using LILO as the bootloader, and a kernel suitable for booting my system without an initramfs. A universal kernel naturally needs an initramfs, and is slower to boot.

The main counterargument is the effort. As mentioned earlier, I've personally grown tired of having to manually deal with my kernel. Do the potential gains mentioned outweigh the loss of human time on configuring and maintaining a custom kernel?

Creating a truly universal kernel

A distribution kernel makes sense only if it works on a wide range of systems. Furthermore, I didn't forget the original idea of binary kernel packages. I didn't want to merely write an ebuild that can install a working kernel anywhere; I wanted to create an ebuild that can be used to build a binary package that's going to work on a wide range of setups - including not only different hardware but also different bootloaders and /boot layouts. A package that would work fine both for my 'traditional' LILO setup and for a UEFI systemd-boot setup.

The first part of a distribution kernel is the right configuration. I wanted to use a well-tested configuration known to build kernels used on many systems, while at the same time minimizing the maintenance effort on our end. Reusing the configuration from a binary distro was the obvious solution. I went for using the config from Arch Linux's kernel package with minimal changes (e.g. changing the default hostname to Gentoo).

The second part is an initramfs. Since we need to support a wide variety of setups, we can't get away without it. To follow the configuration used, Dracut was the natural choice.

The third and hardest part is installing it all. Since I've already set a goal of reusing the same binary package on different filesystem layouts, the actual installation needed to be moved to postinst phase. Our distribution kernel package installs the kernel into an interim location which is entirely setup-independent, rendering the binary packages setup-agnostic as well. The initramfs is created and installed into the final location along with the kernel in pkg_postinst.

Support for different install layouts is provided by reusing the installkernel tool, originally installed by debianutils. As part of the effort, it was extended with initramfs support and moved into a separate sys-kernel/installkernel-gentoo package. Furthermore, an alternative sys-kernel/installkernel-systemd-boot package was created to provide out-of-the-box support for the systemd-boot layout. If neither of those two works for you, you can easily create your own /usr/local/bin/installkernel that follows your own layout.
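For illustration (the package names are the ones above; the exact invocation is just a sketch), picking the layout helper is a one-liner before pulling in the kernel:

# for a systemd-boot /boot layout; the default sys-kernel/installkernel-gentoo covers the traditional layout
emerge --ask sys-kernel/installkernel-systemd-boot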

Summary

The experimental versions of the distribution kernel are packaged as sys-kernel/vanilla-kernel (as distinct from sys-kernel/vanilla-sources, which installs the sources). Besides providing the default zero-effort setup, the package supports using your own configuration via savedconfig (but with no easy way to update it at the moment). It also provides a forced flag that can be used by expert users to disable the initramfs.
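For the savedconfig route, a minimal sketch looks like this (the config path follows the usual savedconfig eclass convention; adjust where you take your .config from):

echo "sys-kernel/vanilla-kernel savedconfig" >> /etc/portage/package.use
mkdir -p /etc/portage/savedconfig/sys-kernel
cp /usr/src/linux/.config /etc/portage/savedconfig/sys-kernel/vanilla-kernel
emerge --ask --oneshot sys-kernel/vanilla-kernel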

The primary goal at the moment is to test the package and find bugs that could prevent our users from using it. In the future, we're planning to extend it to other architectures, kernel variants (Gentoo patch set in particular) and LTS versions. We're also considering providing prebuilt binary packages - however, this will probably be a part of a bigger effort into providing an official Gentoo binhost.

19 Dec 2019 12:32pm GMT

12 Dec 2019


Michał Górny: A better ebuild workflow with pure git and pkgcheck

Many developers today continue using repoman commit as their primary way of committing to Gentoo. While this tool was quite helpful, if not indispensable, in the times of CVS, today it's a burden. The workflow of using a single serial tool to check your packages and commit to them is not very efficient. Not only does it waste your time and slow you down - it discourages you from splitting your changes into more atomic commits.

Upon hearing the pkgcheck advocacy, many developers ask whether it can commit for you. It won't do that; that's not its purpose. Not only would it be a waste of time to implement - it would actually make pkgcheck a worse tool. With its parallel engine, pkgcheck really shines when dealing with multiple packages; forcing it to work on one package at a time is a waste of its potential.

Rather than trying to perpetuate your bad old habits, you should learn how to use git and pkgcheck efficiently. This post aims to give you a few tips.

pkgcheck after committing

Repoman was built under the assumption that checks should be done prior to committing. That is understandable when you're working on a 'live' repository such as the ones used by CVS or Subversion. However, in the case of VCSes with staged commits, such as Git, there is no real difference between checking before or after committing. The most efficient pkgcheck workflow is to check once all changes are committed and you are ready to push.

The most recent version of pkgcheck has a command just for that:

$ pkgcheck scan --commits

Yes, it's that simple. It checks what you've committed compared to origin (note: you'll need to have a correct origin remote), and runs scan on all those packages. Now, if you're committing changes to multiple packages (which should be pretty common), the scan is run in parallel to utilize your CPU power better.

You might say: but repoman ensures that my commit message is neat these days! Guess what. The --commits option does exactly that - it raises warnings if your commit message is bad. Admittedly, it only checks the summary line at the moment, but that's something that can (and will) be improved easily.

And I almost forgot the coolest thing of all: pkgcheck also reports if you accidentally remove the newest ebuild with stable keywords on a given arch!

One more tip. You can use the following option to include full live verification of URLs:

$ pkgcheck scan --net --commits

Again, this is a feature missing entirely from repoman.

pkgcommit to ease committing to ebuilds

While the majority of repoman's VCS support is superficial or better implemented elsewhere, there's one killer feature worth keeping: automatically prepending the package name to the summary line. Since that is a really trivial thing, I've reimplemented it in a few lines of bash as pkgcommit.

When run in a package directory, it runs an editor with a pre-filled commit message template to let you type the message in, then passes it along with its own arguments to git. Usually, I use it as follows (I like to be explicit about signoffs and signing; you can make .git/config take care of that):

$ pkgcommit -sS .

Its extra feature is that it processes -m option and lets you skip the editor for simple messages:

$ pkgcommit -sS . -m 'Bump to 1.2.3'

Note that it does not go out of its way to figure out what to commit. You need to either stage changes yourself via git add, or pass appropriate paths to the command. What's important is that it does not limit you to committing to one directory - you can e.g. include some profile changes easily.

You'll also need the pkg script from the same repository. Or you can just install the whole bundle: app-portage/mgorny-dev-scripts.
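For the curious, the core idea fits in a couple of lines of bash. This is a simplified sketch of that idea, not the actual mgorny-dev-scripts code:

#!/bin/bash
# prepend "category/package: " to the commit summary, based on the current directory
pkg=$(basename "$(dirname "${PWD}")")/$(basename "${PWD}")
exec git commit --edit --message "${pkg}: " "${@}"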

Amending commits via fixups

Most of you probably know that you can update commits via git commit --amend. However, that's useful only for editing the most recent commit. You can also use interactive rebase to choose specific commits for editing, and then amend them. Yet, usually there's a much more convenient way of doing that.

In order to commit a fixup to a particular past commit, use:

$ git commit --fixup OLD_COMMIT_ID

This will create a specially titled commit that will be automatically picked up and ordered by the interactive rebase (provided that autosquash is enabled, either via the --autosquash option or the rebase.autoSquash config setting):

$ git rebase -i -S origin

Again, I have a tool of greater convenience. Frequently, I just want to update the latest commit to a particular package (directory). git-fixup does exactly that - it finds the identifier of the latest commit to a particular file/directory (or the current directory when no parameter is given) and commits a fixup to that:

$ git-fixup .
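Again, a simplified sketch of the idea (not the actual tool):

#!/bin/bash
# create a fixup commit targeting the newest commit that touched the given path
target=${1:-.}
commit=$(git rev-list -1 HEAD -- "${target}")
[[ -n ${commit} ]] || { echo "no commit found for ${target}" >&2; exit 1; }
exec git commit --fixup "${commit}"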

Note that if you try to push fixups into the repository, nothing will stop you. This is one of the reasons that I don't enable signoffs and signing on all commits by default. This way, if I forget to rebase my fixups, the git hook will reject them as lacking signoff and/or signature.

Again, it is part of app-portage/mgorny-dev-scripts.

Interactive rebase to the rescue

When trivial tools are no longer sufficient, interactive rebase is probably one of the best tools for editing your commits. Start by initiating it for all commits since the last push:

$ git rebase -i -S origin

It will bring up your editor with a list of all those commits. Using this list, you can do a lot: reorder commits, drop them, reword their commit messages, use squash or fixup to merge them into other commits, and finally: edit them (open them for amending).
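For reference, the todo list looks roughly like this (the hashes and packages are made up); changing the leading keyword on a line is all it takes:

pick   1a2b3c4 dev-python/foo: Bump to 1.2.3
fixup  5d6e7f8 fixup! dev-python/foo: Bump to 1.2.3
reword 9a0b1c2 dev-libs/bar: Fix building against >=dev-libs/baz-2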

The interactive rebase is probably the most powerful porcelain git command. I've personally found the immediate tips given by git good enough but I realize that many people find it hard nevertheless. Since it's not my goal here to provide detailed instructions on using git, I'm going to suggest looking online for tutorials and guides. The Rewriting History section of the Git Book also has a few examples.

Before pushing: git log

git log seems to be one of the most underappreciated pre-push tools. However, it can be of great service to you. When run prior to pushing, it can help you verify that what you're pushing is actually what you've meant to push.

$ git log --stat

will list the commits along with a pretty summary of the affected files. This can help you notice that you've forgotten to git add a patch, or that you've accidentally committed some extraneous change, or that you've mixed up changes from two commits.

Of course, you can go even further and take a look at the changes in patch form:

$ git log -p

While I realize this is nothing new or surprising to you, sometimes it's worthwhile to reiterate the basics in a different context to make you realize something obvious.

12 Dec 2019 8:04pm GMT

11 Nov 2019


Craig Andrews: HTTP/3 Support Added to cURL in Gentoo

HTTP/3 may still be in the draft state but that isn't stopping software from adding support for it. As a Gentoo developer, I decided to maintain Gentoo's reputation for not being one to shy away from the bleeding edge by adding (optional) support for HTTP/3 to cURL. I believe that this makes Gentoo the first Linux distribution to ship support for this new protocol outside of the Firefox and Chrome/Chromium browsers.

cURL is a command line tool as well as a library (libcurl) that is used by a wide variety of software. It's commonly used by applications written in PHP, it's used by the Kodi media center, and it's at least an optional dependency of everything from git to systemd to cmake and dovecot. By adding support for HTTP/3 to cURL, potentially everything that uses cURL will instantly start supporting HTTP/3 as well.

cURL added HTTP/3 support in version 7.66.0. Rather than writing an entire implementation of the large, complex, and evolving HTTP/3 protocol again (and having to maintain it forever), cURL leverages libraries. The two options it currently supports for this purpose are quiche and the combination of ngtcp2 and nghttp3.

Quiche is an HTTP/3 implementation first released by Cloudflare in January 2019. Since Cloudflare is using it to add support for HTTP/3 to its entire CDN (Content Distribution Network), they're actively developing it, keeping track of the latest changes being made in the HTTP/3 drafts. Quiche uses Google's boringssl for cryptography, which allows it to evolve faster, not having to wait for OpenSSL to implement features. It's written in Rust, which is great for security and maintainability. However, being written in Rust is also a problem, as that means quiche is only available on the platforms that Rust supports (amd64, arm64, ppc64, and x86), which is a much reduced subset of what cURL and the C language support (which is pretty much everything).

ngtcp2 (which implements IETF QUIC, the transport protocol underlying HTTP/3) and nghttp3 (which implements the higher level HTTP/3 protocol) together form an HTTP/3 implementation. They are closely modeled on nghttp2, which is already used by cURL as well as the Apache web server (httpd). Therefore, they're easier for existing software to use. They are written in C using standard build tools, making them highly portable and able to run on essentially any architecture. ngtcp2 uses OpenSSL, but the changes necessary for HTTP/3 support are not yet available in OpenSSL. This situation is also preventing HTTP/3 support from being available in other software that uses OpenSSL, including nodejs (see the nodejs issue). Therefore, for the moment, in order to use ngtcp2, a patched version of OpenSSL must also be used. That isn't a tenable solution for a Linux distribution such as Gentoo for a variety of reasons, including the maintainability and security concerns involved with carrying a non-upstream version of such a critical package as OpenSSL. In the meantime, I've included the net-libs/ngtcp2 and net-libs/nghttp3 packages in Gentoo but masked them; that way, when OpenSSL is updated, the packages are ready and can simply be unmasked.

To enable HTTP/3 support in Gentoo, add the quiche USE flag to the net-misc/curl package and re-emerge curl:

echo "net-misc/curl quiche" >> /etc/portage/package.use
emerge -1 net-misc/curl

After that, use the curl command's new --http3 argument when making HTTPS requests. See the cURL documentation for more information.
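For example, once the rebuilt curl is in place, you can request headers over HTTP/3 explicitly (the URL here is just a placeholder - the server has to support HTTP/3 for this to work):

curl --http3 -I https://www.example.org/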

11 Nov 2019 5:03pm GMT

06 Nov 2019


Craig Andrews: Linters: Keys To Secure, Maintainable, Quality DevSecOps

Linters are static analysis tools that analyze source code and report problems. The term goes all the way back to Bell Labs in 1978 but the concept is still very important today. In my opinion, linters are a key ingredient of a successful DevSecOps implementation, and yet not enough people are aware of linters, how easy they are to use, and how important to quality and security they are.

Linters can be syntax checkers or more complex tools. "Lint" is more or less a term for lightweight, simple static analysis. For example, yamllint checks the syntax of YAML files. At first, this tool may seem nice but not really necessary; YAML is pretty easy to read and understand. However, it's simple to make a mistake in YAML and introduce a hard-to-discover problem. Take this .gitlab-ci.yml for instance:

---
variables:
  password: "swordfish"
include:
  - template: Code-Quality.gitlab-ci.yml
build:
  stage: build
  image:
    name: python:buster
  script:
    - ./build.sh
  artifacts:
    paths:
      - target/beanstalk-app.zip
variables:
  password: "taco"
include:
  - template: SAST.gitlab-ci.yml

This file is valid and GitLab will process it. However, it's not clear what it actually does - did you spot all the errors? In this case, an unexpected password is specified, among other issues. This error may introduce a security vulnerability. And this example is short and relatively easy to check manually - what if the YAML file were much longer?
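As a quick sanity check, running yamllint over the file catches this class of mistake - its key-duplicates rule flags the repeated top-level keys:

yamllint .gitlab-ci.yml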

For more complex languages than YAML, linting is even more important. With a more expressive language, errors are easier to introduce (through misunderstandings and typos) and harder to notice. Linters also make code more consistent, understandable, and maintainable. They not only improve security but also reduce cost and improve quality.

For a real world example, I've been doing a lot of CloudFormation work lately. It's easy to accidentally create security groups and network access control lists that are more open than necessary, to forget to enable encryption, or to make other such mistakes. cfn_nag and cfn-lint have caught many errors for me, as well as encouraged me to improve quality by setting resource descriptions and being explicit about intentions.

Another example is with Java. By using PMD to check for logic, syntax, and convention violation errors, the code is more likely to work as expected. By using Checkstyle, the code is all consistently formatted, follows the same naming conventions, has the required comments, and gains other such benefits that make it easy to understand and maintain. And easy to understand and maintain inherently means more secure.

Therefore, always add as many linters as possible and have them run as often as possible. Running linters in the git pre-commit hook is ideal (as then detected errors are never even committed). Running them from the build process (maven, msbuild, make, grunt, gulp, etc) is really important. But ultimately, running them in continuous integration is an absolute requirement. Running them daily or weekly is simply not enough.
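As a minimal sketch of the pre-commit idea (assuming yamllint is installed and the repository contains YAML worth checking), a hook can be as simple as:

#!/bin/sh
# .git/hooks/pre-commit -- abort the commit if yamllint finds problems
yamllint -s . || {
    echo "yamllint reported problems; aborting commit" >&2
    exit 1
}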

A common scenario I've seen is that static analysis is only done periodically (once per day or once per week) instead of for every commit (via a commit hook or continuous integration). For example, I've seen SonarQube set up to run daily for many projects. The problem with this approach is that errors are reported much later than they're introduced making them lower priority to fix and harder to fix. If a daily SonarQube scan discovers a security vulnerability, management will triage the issue and perhaps put fixing it on the backlog, then eventually an engineer is tasked with fixing it but before they can do so they have to study and understand the relevant code. A superior approach leading to better efficiency and better security is to perform this scanning for every commit and fail the build if the scan fails - that way, the person who introduced the problem has to immediately fix it. This reduces exposure (as detected issues can never progress in the pipeline) and improves efficiency (as the same person who introduced the issue fixes it immediately without having to re-learn anything).

Here's how a few linters are run on GitLab CI for the VersionPress on AWS project:

lint_ebextensions:
  stage: test
  image: sdesbure/yamllint
  script:
    - yamllint -s ./beanstalk/.ebextensions/*.config .

lint_dockerfile:
  stage: test
  image: hadolint/hadolint:latest-debian
  script:
    - hadolint beanstalk/Dockerfile

cfn_lint:
  stage: test
  image: aztek/cfn-lint
  script:
    - cfn-lint target/cloudformation.json

validate_cloudformation:
  only:
    variables:
      - $AWS_ACCESS_KEY_ID
      - $AWS_SECRET_ACCESS_KEY
  stage: test
  image: python:latest
  script:
    - pip install awscli
    - aws cloudformation validate-template --template-body file://target/cloudformation.json

cfn_nag:
  stage: test
  image:
    name: stelligent/cfn_nag
    entrypoint: [""]
  script:
    - cfn_nag target/cloudformation.json

Note that Docker is used to run the linters. This approach allows them to be run quickly and easily, and it's much easier to maintain than having to manually install each one.

And finally, here are my favorite linters that I use frequently:

  • shellcheck is a static analysis tool for shell scripts.
  • cfn-lint validates templates against the CloudFormation spec and additional checks. Includes checking valid values for resource properties and best practices.
  • cfn-nag tool looks for patterns in CloudFormation templates that may indicate insecure infrastructure.
  • yamllint does not only check for syntax validity, but also for weirdnesses like key repetition and cosmetic problems such as line length, trailing spaces, indentation, etc.
  • PMD is an extensible cross-language static code analyzer. Easy to use from Java via Maven with the Maven PMD Plugin.
  • Checkstyle is a development tool to help programmers write Java code that adheres to a coding standard. Easy to use from Java via Maven with the Maven Checkstyle Plugin.
  • PHP_CodeSniffer tokenizes PHP, JavaScript and CSS files and detects violations of a defined set of coding standards.
  • Hadolint is a Dockerfile linter which validates inline bash.
  • CSSLint is a tool to help point out problems with your CSS code.
  • ESLint is the pluggable linting utility for JavaScript and JSX (prefer this tool over JSLint)
  • JSLint is The Javascript Code Quality Tool.
  • pkgcheck and repoman check Gentoo ebuilds (packages).
  • GitLab offers SAST (which is a bit more than the usual lightweight linter)

06 Nov 2019 7:21pm GMT

Michał Górny: Gentoo eclass design pitfalls

I have written my share of eclasses, and I have made my share of mistakes. Designing good eclasses is a non-trivial problem, and there are many pitfalls you should be watching for. In this post, I would like to highlight three of them.

Not all metadata variables are combined

PMS provides a convenient feature for eclass writers: cumulative handling of metadata variables. Quoting the relevant passage:

The IUSE, REQUIRED_USE, DEPEND, BDEPEND, RDEPEND and PDEPEND variables are handled specially when set by an eclass. They must be accumulated across eclasses, appending the value set by each eclass to the resulting value after the previous one is loaded. Then the eclass-defined value is appended to that defined by the ebuild. […]

Package Manager Specification (30th April 2018), 10.2 Eclass-defined Metadata Keys

That's really handy! However, the important thing that's not obvious from this description is that not all metadata variables work this way. The following multi-value variables don't: HOMEPAGE, SRC_URI, LICENSE, KEYWORDS, PROPERTIES and RESTRICT. Surely, some of them are not supposed to be set in eclasses but e.g. the last two are confusing.

This means that technically you need to append when defining them, e.g.:

# my.eclass
RESTRICT+=" !test? ( test )"

However, that's not the biggest problem. The real issue is that those variables are normally set in ebuilds after inherit, so you actually need to make sure that all ebuilds append to them. For example, the ebuild needs to do:

# my-1.ebuild
inherit my
RESTRICT+=" bindist"

Therefore, this design is prone to mistakes at ebuild level. I'm going to discuss an alternative solution below.

Declarative vs functional

It is common to use declarative style in eclasses - create a bunch of variables that ebuilds can use to control the eclass behavior. However, this style has two significant disadvantages.

Firstly, it is prone to typos. If someone recalls the variable name wrong, and its effects are not explicitly visible, it is very easy to commit an ebuild with a silly bug. If the effects are visible, it can still give you some quality debugging headache.

Secondly, in order to affect global scope, the variables need to be set before inherit. This is not trivially enforced, and it is easy to miss that the variable doesn't work (or partially misbehaves) when set too late.
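A tiny illustration of that pitfall (the eclass and variable names are hypothetical): the first snippet works, while the second silently fails to affect anything the eclass does at inherit time.

# my-1.ebuild - correct: the control variable is set before inherit
MY_ECLASS_ENABLE_TESTS=yes
inherit my

# my-1.ebuild - broken: set after inherit, too late to affect global scope
inherit my
MY_ECLASS_ENABLE_TESTS=yes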

The alternative is to use functional style, especially for affecting global scope variables. Instead of immediately editing variables in global scope and expecting ebuilds to control the behavior via variables, give them a function to do it:

# my.eclass
my_enable_pytest() {
  IUSE+=" test"
  RESTRICT+=" !test? ( test )"
  BDEPEND+=" test? ( dev-python/pytest[${PYTHON_USEDEP}] )"
  python_test() {
    pytest -vv || die
  }
}

Note that this function is evaluated in ebuild context, so all variables need appending. Its main advantage is that it works independently of where in the ebuild it's called (but if you call it early, remember to append!), and in case of a typo you get an explicit error. Example use in an ebuild:

# my-1.ebuild
inherit my
IUSE="randomstuff"
RDEPEND="randomstuff? ( dev-libs/random )"
my_enable_pytest

Think what phases to export

Exporting phase functions is often a matter of convenience. However, doing it poorly can cause ebuild writers more pain than if they weren't exported in the first place. An example of this is vala.eclass as of today: it wrongly exports a dysfunctional src_prepare(), and all ebuilds have to redefine it anyway.

It is often a good idea to consider how your eclass is going to be used. If there are use cases both for having the phases exported and for providing utility functions without any phases, it is probably a good idea to split the eclass in two: into a -utils eclass that just provides the functions, and a main eclass that combines them with phase functions. Good examples today are the xdg and xdg-utils eclasses.

When you do need to export phases, it is worthwhile to consider how different eclasses are going to be combined. Generally, a few eclass types could be listed:

Generally, it's best to fit your eclass into as few of those as possible. If you do that, there's a good chance that the ebuild author would be able to combine multiple eclasses easily:

# my-1.ebuild
PYTHON_COMPAT=( python3_7 )
inherit cmake-utils git-r3 python-single-r1

Note that since each of those eclasses uses a different phase function set to do its work, they combine just fine! The inherit order is also irrelevant. If we e.g. need to add llvm to the list, we just have to redefine pkg_setup().

06 Nov 2019 7:57am GMT

13 Oct 2019


Michał Górny: Improving distfile mirror structure

The Gentoo distfile mirror network is essential in distributing sources to our users. It offloads upstream download locations, improves throughput and reliability, and guarantees distfile persistence.

The current structure of distfile mirrors dates back to 2002. It might have worked well back when we mirrored around 2500 files but it proved not to scale well. Today, mirrors hold almost 70 000 files, and this number has been causing problems for mirror admins.

The most recent discussion on restructuring mirrors started in January 2015. I have started the preliminary research in January 2017, and it resulted in GLEP 75 being created in January 2018. With the actual implementation effort starting in October 2019, I'd like to summarize all the data and update it with fresh statistics.

Continue reading

13 Oct 2019 2:34pm GMT

04 Oct 2019


Thomas Raschbacher: [gentoo] Network Bridge device for Qemu kvm

So I needed to set up qemu+kvm on a new server (after the old one died).

Seems like I forgot to mention how I set up the bridge network in my previous blog post, so here you go:

First, let me mention that I am using the second physical interface on the server for the bridge. Depending on your available hardware or use case you might need or want to change this.

So this is fairly simple (if one has a properly configured kernel of course - details are in the Gentoo Wiki article on Network Bridges):

First add this in your /etc/conf.d/net (adjust the interface names as needed):

# QEMU / KVM bridge
bridge_br0="enp96s0f1"
config_br0="dhcp"

then add an init script and start it:

cd /etc/init.d; ln -s net.lo net.br0
/etc/init.d/net.br0 start # to test it

So then I get this error when trying to start my kvm/qemu instance:

 * creating qtap (auto allocating one) ...
/usr/sbin/qtap-manipulate: line 28: tunctl: command not found
tunctl failed
* failed to create qtap interface

Seems like I was missing sys-apps/usermode-utilities, so I just emerged that, only to get this:

 * creating qtap (auto allocating one) ...
/usr/sbin/qtap-manipulate: line 29: brctl: command not found
brctl failed
* failed to create qtap interface

Yep, I forgot to install that too ^^ .. so install net-misc/bridge-utils and now it starts the VM.
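For reference, installing both packages up front (they provide the tunctl and brctl tools the qtap script was missing) avoids the two failed starts above:

emerge --ask sys-apps/usermode-utilities net-misc/bridge-utils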

04 Oct 2019 8:53am GMT

30 Sep 2019


Thomas Raschbacher: "Network is unreachable" error - Gentoo, silly netifrc configuration mistake

So on a new install I was just sitting there and wondering .. what did I do wrong .. why do I keep getting those errors:

# ping lordvan.com
connect: Network is unreachable

Then I realized something when checking the output of route -n:

 # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 enp96s0f0
192.168.0.254   0.0.0.0         255.255.255.255 UH    2      0        0 enp96s0f0

It should be:

 # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.0.254   0.0.0.0         UG    2      0        0 enp96s0f0
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 enp96s0f0

Turns out I had forgotten something quite simple, yet important: adding "default via <router IP>" to /etc/conf.d/net. So after changing it from

routes_enp96s0f0="192.168.0.254"

to

routes_enp96s0f0="default via 192.168.0.254"

and restarting the interface, everything works just fine ;)
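For completeness, restarting the interface with netifrc is just (interface name as used above):

/etc/init.d/net.enp96s0f0 restart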

Silly mistake, easy fix .. but it can be a pain to realize what went wrong. Maybe someone else will make the same mistake, find this blog post, and hopefully fix it faster than I did ;)

30 Sep 2019 8:36am GMT

22 Sep 2019


Alexys Jacob: py3status v3.20 – EuroPython 2019 edition

Shame on me for posting this so long after it happened… Still, it's a funny story to tell and there are a lot of thank-yous to give, so let's go!

The py3status EuroPython 2019 sprint

I've attended all EuroPython conferences since 2013. It's a great event and I encourage everyone to get there!

The last two days of the conference week are dedicated to collaboration on Open Source projects: these are called sprints.

I don't know why but this year I decided that I would propose a sprint to welcome anyone willing to work on py3status to come and help…

To be honest, I was expecting that nobody would be interested, so when I sat down at an empty table on Saturday I thought that it would remain empty… but hey, I would have worked on py3status anyway, so every option was okay!

Then two students came. They ran Windows and Mac OS and had never heard of i3wm or py3status, but they were curious, so I showed them. They could read C, so I asked them if they could understand how i3status was reading its horrible configuration file… and they did!

Then Oliver Bestwalter (main maintainer of tox) came and told me he was a long-time py3status user… followed by Hubert Bryłkowski and Ólafur Bjarni! Wow..

We joined forces to create a py3status module that allows the use of the great PewPew hardware device created by Radomir Dopieralski (which was given to all attendees) to control i3!

And we did it and had a lot of fun!

Oliver's major contribution

The module itself is awesome, okay… but thanks to Oliver's experience with tox, he proposed and contributed one of the most significant features py3status has ever had: the ability to import modules from other PyPI packages!

The idea is that you have your module or set of modules. Instead of having to contribute them to py3status, you can just publish them to PyPI and py3status will automatically be able to detect and load them!

The use of entry points allows custom and more distributed module creation for our project!

Read more about this amazing feature on the docs.

All of this happened during EuroPython 2019 and I want to extend once again my gratitude to everyone who participated!

Thank you contributors

Version 3.20 is also the work of cool contributors.
See the changelog.

22 Sep 2019 2:27pm GMT

13 Sep 2019


Nathan Zachary: Vim pulling in xorg dependencies in Gentoo

Today I went to update one of my Gentoo servers, and noticed that it wanted to pull in a bunch of xorg dependencies. This is a simple music server without any type of graphical environment, so I don't really want any xorg libraries or other GUI components installed. Looking through the full output, I couldn't see a direct reason that these components were now requirements.

To troubleshoot, I started adding packages to /etc/portage/package.mask, starting with cairo (which was the package directly requesting the 'X' USE flag be enabled). That didn't get me very far as it still just indicated that GTK+ needed to be installed. After following the dependency chain for a bit, I noticed that something was pulling in libcanberra and found that the default USE flags now include 'sound' and that vim now had it enabled. It looks like this USE flag was added between vim-8.1.1486 and vim-8.1.1846.

For my needs, the most straightforward solution was to just remove the 'sound' USE flag from vim by adding the following to /etc/portage/package.use:

# grep vim /etc/portage/package.use 
app-editors/vim -sound
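After that change, vim needs to be rebuilt for the new USE flag to take effect; a one-shot rebuild does the trick:

# emerge --oneshot app-editors/vim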

13 Sep 2019 5:10pm GMT

06 Sep 2019


Nathan Zachary: Adobe Flash and Firefox 68+ in Gentoo Linux

Though many sites have abandoned Adobe Flash in favour of HTML5 these days, there are still some legacy applications (e.g. older versions of VMWare's vSphere web client) that depend on it. Recent versions of Firefox in Linux (68+) started failing to load Flash content for me, and it took some digging to find out why. First off, I noticed that the content wouldn't load even on Adobe's Flash test page. Second off, I found that the plugin wasn't listed in Firefox's about:plugins page.

So, I realised that the problem was due to the Adobe Flash plugin not integrating properly with Firefox. I use Gentoo Linux, so these instructions may not directly apply to other distributions, but I would imagine that the directory structures are at least similar. To start, I made sure that I had installed the www-plugins/adobe-flash ebuild with the 'nsplugin' USE flag enabled:

$ eix adobe-flash
[I] www-plugins/adobe-flash
     Available versions:  (22) 32.0.0.238^ms
       {+nsplugin +ppapi ABI_MIPS="n32 n64 o32" ABI_RISCV="lp64 lp64d" ABI_S390="32 64" ABI_X86="32 64 x32"}
     Installed versions:  32.0.0.238(22)^ms(03:13:05 22/08/19)(nsplugin -ppapi ABI_MIPS="-n32 -n64 -o32" ABI_RISCV="-lp64 -lp64d" ABI_S390="-32 -64" ABI_X86="64 -32 -x32")
     Homepage:            https://www.adobe.com/products/flashplayer.html https://get.adobe.com/flashplayer/ https://helpx.adobe.com/security/products/flash-player.html
     Description:         Adobe Flash Player

That ebuild installs the libflashplayer.so (shared object) in the /usr/lib64/nsbrowser/plugins/ directory by default.

However, through some digging, I found that Firefox 68+ was looking in another directory for the plugin (in my particular situation, that directory was /usr/lib64/mozilla/plugins/, which didn't actually exist on my system). Seeing as the target directory didn't exist, I had to first create it, and then I decided to symlink the shared object there so that future updates to the www-plugins/adobe-flash package would work without any further manual intervention:

mkdir -p /usr/lib64/mozilla/plugins/
cd $_
ln -s /usr/lib64/nsbrowser/plugins/libflashplayer.so .

After restarting Firefox, the Adobe Flash test page started working as did other sites that use Flash. So, though your particular Linux distribution, version of Firefox, and version of Adobe Flash may require the use of different directories than the ones I referenced above, I hope that these instructions can help you troubleshoot the problem with Adobe Flash not showing in the Firefox about:plugins page.

06 Sep 2019 7:17pm GMT

11 Aug 2019


Gentoo News: AArch64 (arm64) profiles are now stable!

Packet.com logo

The ARM64 project is pleased to announce that all ARM64 profiles are now stable.

While our developers and users have contributed significantly to this accomplishment, we must also thank our sponsor Packet for their contribution. Providing the Gentoo developer community with access to bare metal hardware has accelerated progress in achieving the stabilization of the ARM64 profiles.

About Packet.com

This access has been kindly provided to Gentoo by the bare metal cloud provider Packet via their Works on Arm project. Learn more about their commitment to supporting open source here.

About Gentoo

Gentoo Linux is a free, source-based, rolling release meta distribution that features a high degree of flexibility and high performance. It empowers you to make your computer work for you, and offers a variety of choices at all levels of system configuration.

As a community, Gentoo consists of approximately two hundred developers and over fifty thousand users globally.

11 Aug 2019 12:00am GMT

09 Jul 2019


Michał Górny: Verifying Gentoo election results via Votrify

Gentoo elections are conducted using custom software called votify. During the voting period, the developers place their votes in their respective home directories on one of the Gentoo servers. Afterwards, the election officials collect the votes, count them, compare their results and finally announce them.

The simplified description stated above suggests two weak points. Firstly, we rely on the honesty of the election officials: if they chose to conspire, they could fake the result. Secondly, we rely on the honesty of all Infrastructure members, as they could use root access to manipulate the votes (or the collection process).

To protect against possible fraud, we make the elections transparent (but pseudonymous). This means that all votes cast are public, so everyone can count them and verify the result. Furthermore, developers can verify whether their personal vote has been included. Ideally, all developers would do that and therefore confirm that no votes were manipulated.

Currently, we are pretty much implicitly relying on developers doing that, and assuming that no protest implies successful verification. However, this is not really reliable, and given the unfriendly nature of our scripts I have reasons to doubt that the majority of developers actually verify the election results. In this post, I would like to shortly explain how Gentoo elections work, how they could be manipulated and introduce Votrify - a tool to explicitly verify election results.

Gentoo voting process in detail

Once the nomination period is over, an election official sets the voting process up by creating control files for the voting scripts. Those control files include election name, voting period, ballot (containing all vote choices) and list of eligible voters.

There are no explicit events corresponding to the beginning or the end of the voting period. The votify script used by developers reads election data on each execution, and uses it to determine whether the voting period is open. During the voting period, it permits the developer to edit the vote, and finally to 'submit' it. Both the draft and the submitted vote are stored as appropriate files in the developer's home directory; 'submitted' votes are not collected automatically. This means that the developer can still manually manipulate the vote once the voting period concludes and before the votes are manually collected.

Votes are collected explicitly by an election official. When run, the countify script collects all vote files from developers' home directories. A unique 'confirmation ID' is generated for each voting developer. All votes along with their confirmation IDs are placed in the so-called 'master ballot', while the mapping from developer names to confirmation IDs is stored separately. The latter is used to send developers their respective confirmation IDs, and can be discarded afterwards.

Each of the election officials uses the master ballot to count the votes. Afterwards, they compare their results and if they match, they announce the election results. The master ballot is attached to the announcement mail, so that everyone can verify the results.

Possible manipulations

The three methods of manipulating the vote that I can think of are:

  1. Announcing fake results. An election result may be presented that does not match the votes cast. This is actively prevented by having multiple election officials, and by making the votes transparent so that everyone can count them.
  2. Manipulating votes cast by developers. The result could be manipulated by modifying the votes cast by individual developers. This is prevented by including pseudonymous vote attribution in the master ballot. Every developer can therefore check whether his/her vote has been reproduced correctly. However, this presumes that the developer is active.
  3. Adding fake votes to the master ballot. The result could be manipulated by adding votes that were not cast by any of the existing developers. This is a major problem, and such manipulation is entirely plausible if the turnout is low enough, and developers who did not vote fail to check whether they have not been added to the casting voter list.

Furthermore, the efficiency of the last method can be improved if the attacker is able to restrict communication between voters and/or reliably deliver different versions of the master ballot to different voters, i.e. convince the voters that their own vote was included correctly while manipulating the remaining votes to achieve the desired result. The former is rather unlikely but the latter is generally feasible.

Finally, the results could be manipulated via manipulating the voting software. This can be counteracted through verifying the implementation against the algorithm specification or, to some degree, via comparing the results with a third-party tool. Robin H. Johnson and myself were historically working on this (or more specifically, on verifying whether the Gentoo implementation of the Schulze method is correct) but neither of us was able to finish the work. If you're interested in the topic, you can look at my election-compare repository. For the purpose of this post, I'm going to consider this possibility out of scope.

Verifying election results using Votrify

Votrify uses a two-stage verification model. It consists of individual verification, which is performed by each voter separately and produces signed confirmations, and community verification, which uses the aforementioned files to provide the final verified election result.

The individual verification part involves:

  1. Verifying that the developer's vote has been recorded correctly. This takes part in detecting whether any votes have been manipulated. The positive result of this verification is implied by the fact that a confirmation is produced. Additionally, developers who did not cast a vote also need to produce confirmations, in order to detect any extraneous votes.
  2. Counting the votes and producing the election result. This produces the election results as seen from the developer's perspective, and therefore prevents manipulation via announcing fake results. Furthermore, comparing the results between different developers helps finding implementation bugs.
  3. Hashing the master ballot. The hash of master ballot file is included, and comparing it between different results confirms that all voters received the same master ballot.

If the verification is positive, a confirmation is produced and signed using developer's OpenPGP key. I would like to note that no private data is leaked in the process. It does not even indicate whether the dev in question has actually voted - only that he/she participates in the verification process.

Afterwards, confirmations from different voters are collected. They are used to perform community verification which involves:

  1. Verifying the OpenPGP signature. This is necessary to confirm the authenticity of the signed confirmation. The check also involves verifying that the key owner was an eligible voter and that each voter produced only one confirmation. Therefore, it prevents attempts to fake the verification results.
  2. Comparing the results and master ballot hashes. This confirms that everyone participating received the same master ballot, and produced the same results.

If the verification for all confirmations is positive, the election results are repeated, along with explicit quantification of how trustworthy they are. The number indicates how many confirmations were used, and therefore how many of the votes (or non-votes) in master ballot were confirmed. The difference between the number of eligible voters and the number of confirmations indicates how many votes may have been altered, planted or deleted. Ideally, if all eligible voters produced signed confirmations, the election would be 100% confirmed.

09 Jul 2019 2:15pm GMT