17 Aug 2018

Luca Barbato: Gentoo on Integricloud

Integricloud gave me access to their infrastructure to track some issues on ppc64 and ppc64le.

Since some of the issues are related to the compilers, I obviously installed Gentoo on it and in the process I started to fix some issues with catalyst to get a working install media, but that's for another blogpost.

Today I'm just giving a walk-through on how to get a ppc64le (and ppc64 soon) VM up and running.

Preparation

Read this and make your install media available to your instance.

Install Media

I'm using the Gentoo install CD I'm currently refining.

Booting

You have to append console=hvc0 to your boot command; the boot process might figure it out for you on newer install media (I still have to send patches to update livecd-tools).

Network configuration

You have to manually set up the network.
You can use ifconfig and route or ip as you like; refer to your instance setup for the parameters.

ifconfig enp0s0 ${ip}/16
route add -net default gw ${gw}
echo "nameserver 8.8.8.8" > /etc/resolv.conf

ip a add ${ip}/16 dev enp0s0
ip l set enp0s0 up
ip r add default via ${gw}
echo "nameserver 8.8.8.8" > /etc/resolv.conf

Disk Setup

You may use fdisk to create:
- a PowerPC PReP boot partition of 8M
- root partition with the remaining space

Device     Start      End  Sectors Size Type
/dev/sda1   2048    18431    16384   8M PowerPC PReP boot
/dev/sda2  18432 33554654 33536223  16G Linux filesystem

I'm using btrfs, with zstd compression on /usr/portage and /usr/src/.

mkfs.btrfs /dev/sda2
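
For reference, one way to apply that compression once the filesystem is mounted and the stage3 unpacked (a sketch; needs a kernel with btrfs zstd support, 4.14+):

mount -o compress=zstd /dev/sda2 /mnt/gentoo    # compress everything, or per directory:
btrfs property set /mnt/gentoo/usr/portage compression zstd
btrfs property set /mnt/gentoo/usr/src compression zstd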

Initial setup

It is pretty much the usual.

mount /dev/sda2 /mnt/gentoo
cd /mnt/gentoo
wget https://dev.gentoo.org/~mattst88/ppc-stages/stage3-ppc64le-20180810.tar.xz
tar -xpf stage3-ppc64le-20180810.tar.xz
mount -o bind /dev dev
mount -t devpts devpts dev/pts
mount -t proc proc proc
mount -t sysfs sys sys
cp /etc/resolv.conf etc
chroot .

You just have to emerge grub and gentoo-sources; I diverge from the defconfig by making btrfs built-in.

My /etc/portage/make.conf:

CFLAGS="-O3 -mcpu=power9 -pipe"
# WARNING: Changing your CHOST is not something that should be done lightly.
# Please consult https://wiki.gentoo.org/wiki/Changing_the_CHOST_variable before changing.
CHOST="powerpc64le-unknown-linux-gnu"

# NOTE: This stage was built with the bindist Use flag enabled
PORTDIR="/usr/portage"
DISTDIR="/usr/portage/distfiles"
PKGDIR="/usr/portage/packages"

USE="ibm altivec vsx"

# This sets the language of build output to English.
# Please keep this setting intact when reporting bugs.
LC_MESSAGES=C
ACCEPT_KEYWORDS=~ppc64

MAKEOPTS="-j4 -l6"
EMERGE_DEFAULT_OPTS="--jobs 10 --load-average 6 "

The minimal set of packages I need before booting:

emerge grub gentoo-sources vim btrfs-progs openssh

NOTE: You want to re-emerge openssh afterwards and make sure bindist is not in its USE flags.
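
A sketch of what that could look like (using package.use here; dropping bindist from USE in make.conf works too):

mkdir -p /etc/portage/package.use
echo 'net-misc/openssh -bindist' >> /etc/portage/package.use/openssh
emerge --oneshot net-misc/openssh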

Kernel & Bootloader

cd /usr/src/linux
make defconfig
make menuconfig # I want btrfs builtin so I can avoid a initrd
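# equivalently, CONFIG_BTRFS_FS=y can be set non-interactively with the
# kernel's own scripts/config helper (a sketch; run after make defconfig):
./scripts/config --enable BTRFS_FS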
make -j 10 all && make install && make modules_install
grub-install /dev/sda1
grub-mkconfig -o /boot/grub/grub.cfg

NOTE: make sure you pass /dev/sda1; otherwise grub will happily assume OpenFirmware knows about btrfs and just point it at your directory.
Unfortunately, that's not the case.

Networking

I'm using netifrc, with the classic eth0 naming convention.

touch /etc/udev/rules.d/80-net-name-slot.rules
ln -sf /etc/init.d/net.{lo,eth0}
echo -e "config_eth0=\"${ip}/16\"\nroutes_eth0=\"default via ${gw}\"\ndns_servers_eth0=\"8.8.8.8\"" > /etc/conf.d/net

Password and SSH

Even if the mticlient is quite nice, you'd rather use ssh as much as you can.

passwd 
rc-update add sshd default

Finishing touches

Right now sysvinit does not add the hvc0 console as it should, due to a profile quirk. For now, check /etc/inittab and add it if missing:

echo 'hvc0:2345:respawn:/sbin/agetty -L 9600 hvc0' >> /etc/inittab

Add your user and add your ssh key and you are ready to use your new system!

17 Aug 2018 10:44pm GMT

15 Aug 2018

Michał Górny: new* helpers can read from stdin

Did you know that new* helpers can read from stdin? Well, now you know! So instead of writing to a temporary file you can install your inline text straight to the destination:

src_install() {
  # old code
  cat <<-EOF >"${T}"/mywrapper || die
    #!/bin/sh
    exec do-something --with-some-argument
  EOF
  dobin "${T}"/mywrapper

  # replacement
  newbin - mywrapper <<-EOF
    #!/bin/sh
    exec do-something --with-some-argument
  EOF
}

15 Aug 2018 9:21am GMT

13 Aug 2018

Michał Górny: OpenPGP key expiration is not a security measure

There seems to be some recurring confusion among Gentoo developers regarding the topic of OpenPGP key expiration dates. Some developers seem to believe them to be some kind of security measure - and start arguing about their weaknesses. Furthermore, some people seem to think of them as a rotation mechanism, and believe that they are expected to generate new keys. The truth is, the expiration date is neither of those.

The key expiration date can be updated at any time (lengthened or shortened), including past the previous expiration date. This is a feature, not a bug. In fact, you are expected to update your expiration dates periodically. You certainly should not rotate your primary key unless really necessary, as switching to a new key usually involves a lot of hassle.
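
For reference, a minimal sketch of extending an expiration date with GnuPG (requires --quick-set-expire, GnuPG >= 2.1.22; the fingerprint is a placeholder):

gpg --quick-set-expire <FINGERPRINT> 1y       # extend the primary key by one year
gpg --quick-set-expire <FINGERPRINT> 1y '*'   # newer GnuPG: likewise for all subkeys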

If an attacker manages to compromise your primary key, he can easily update the expiration date as well (even if the key has expired in the meantime). Therefore, the expiration date does not really provide any added protection here. Revocation is the only way of dealing with compromised keys.

Expiration dates really serve two purposes: naturally eliminating unused keys, and enforcing periodical checks on the primary key. By requiring the developers to periodically update their expiration dates, we also implicitly force them to check whether their primary secret key (which we recommend storing offline, in a secure place) is still present and working. Now, if it turns out that the developer can neither update the expiration date nor revoke the key (because the key, its backups and the revocation certificate are all lost or damaged, or the developer goes MIA), the key will eventually expire and stop being a 'ghost'.

Even then, developers argue that we have LDAP and retirement procedures to deal with that. However, OpenPGP keys go beyond Gentoo and beyond Gentoo Infrastructure. We want to encourage good practices that will also affect our users and other people with whom developers are communicating, and who have no reason to know about internal Gentoo key management.

13 Aug 2018 3:35pm GMT

12 Aug 2018

Gentoo News: Gentoo booth at the FrOSCon, St. Augustin, Germany

FrOSCon logo

As last year, there will be a Gentoo booth again at the upcoming FrOSCon "Free and Open Source Conference" in St. Augustin near Bonn! Visitors can meet Gentoo developers to ask any question, get Gentoo swag, and prepare, configure, and compile their own Gentoo buttons.

The conference is 25th and 26th of August 2018, and there is no entry fee. See you there!

12 Aug 2018 12:00am GMT

Gentoo News: Congratulations: Hanno Böck and co-authors win Pwnie!

Pwnies logo

Congratulations to security researcher and Gentoo developer Hanno Böck and his co-authors Juraj Somorovsky and Craig Young for winning one of this year's coveted Pwnie awards!

The award is for their work on the Return Of Bleichenbacher's Oracle Threat or ROBOT vulnerability, which at the time of discovery affected such illustrious sites as Facebook and Paypal. Technical details can be found in the full paper published at the Cryptology ePrint Archive.

12 Aug 2018 12:00am GMT

09 Aug 2018

Michał Górny: Inlining path_exists

The path_exists function in eutils was meant as a simple hack to check for existence of files matching a wildcard. However, it was kinda ugly and never became used widely. At this very moment, it is used correctly in three packages, semi-correctly in one package and totally misused in two packages. Therefore, I think it's time to replace it with something nicer.

The replacement snippet is rather trivial (from the original consumer, eselect-opengl):

local shopt_saved=$(shopt -p nullglob)
shopt -s nullglob
local opengl_dirs=( "${EROOT%/}"/usr/lib*/opengl )
${shopt_saved}  # restores the saved nullglob state

# -n on a multi-element array expansion errors out; test the element count instead
if [[ ${#opengl_dirs[@]} -gt 0 ]]; then
        # ...
fi

By using nullglob, you disable the old POSIX default of leaving the wildcard unexpanded when it does not match anything. Instead, you either simply get an empty array or a list of matched files/directories. If your code requires at least one match, you check whether the array is empty; if it handles empty argument lists just fine (e.g. for loops), you can avoid any conditionals, as in the sketch below. As a side effect, you get the expanded match in an array, so you don't have to repeat the wildcard multiple times.
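
A minimal sketch of the no-conditional case (the path and helper function are hypothetical):

local shopt_saved=$(shopt -p nullglob)
shopt -s nullglob
local f
for f in "${ED%/}"/usr/share/foo/*.conf; do
    # with nullglob, the loop body simply never runs when nothing matches
    do_something "${f}"
done
${shopt_saved}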

Also note using shopt directly instead of estack.eclass, which is broken and does not restore options correctly. You can read more on option handling in Mangling shell options in ebuilds.

09 Aug 2018 3:01pm GMT

03 Aug 2018

Michał Górny: Verifying repo/gentoo.git with gverify

Git commit signatures are recursive by design - that is, each signature covers not only the commit in question but also indirectly all past commits, via tree and parent commit hashes. This makes user-side commit verification much simpler, as the user needs only to verify the signature on the most recent commit; with the assumption that the developer making it has verified the earlier commit and so on. Sadly, this is usually not the case at the moment.

Most of the Gentoo developers do not really verify the base upon which they are making their commits. While they might verify the commits when pulling before starting to work on their changes, it is rather unlikely that they verify the correctness when they repeatedly need to rebase before pushing. Usually this does not cause problems as Gentoo Infrastructure is verifying the commit signatures before accepting the push. Nevertheless, the recent attack on our GitHub mirrors made me realize that if a smart attacker was able to inject a single malicious commit without valid signature, then a Gentoo developer would most likely make a signed commit on top of it without even noticing the problem.

In this article, I would like to shortly present my quick solution to this problem - app-portage/gverify. gverify is a trivial reimplementation of gkeys in <200 lines of code. It uses the gkeys seed data (yes, this means it relies on manual updates) combined with autogenerated developer keyrings to provide strict verification of commits. Unlike gkeys, it works out-of-the-box without root privileges and automatically updates the keys on use.

The package installs a gv-install tool that installs two hooks on your repo/gentoo.git working copy. Those are post-merge and pre-rebase hooks that verify the tip of the upstream master branch: the former every time a merge on master finishes, the latter every time a rebase is about to start. This covers the two main cases - git pull and git pull --rebase. The former causes a verbose error after the update, the latter prevents a rebase from proceeding.

While this is far from perfect, it seems a reasonably good solution given the limitations of available git hooks. Most importantly, it should prevent the git pull --rebase -S && git push --sign loop from silently accepting a malicious commit. Currently the hook verifies the top upstream commit only; however, in the future I want to implement incremental verification of all new commits.

03 Aug 2018 12:04pm GMT

19 Jul 2018

Alexys Jacob: Authenticating and connecting to a SSL enabled Scylla cluster using Spark 2

This quick article is a wrap up for reference on how to connect to ScyllaDB using Spark 2 when authentication and SSL are enforced for the clients on the Scylla cluster.

We encountered multiple problems, all the more since we distribute our workload using a YARN cluster, so our worker nodes need to have everything required to connect properly to Scylla.

We found very little help online, so I hope this will serve anyone facing similar issues (that's also why I copy/pasted the error messages here).

The authentication part is easygoing by itself and was not the source of our problems; SSL on the client side was.

Environment

SSL cipher setup

The Datastax spark cassandra driver uses the TLS_RSA_WITH_AES_256_CBC_SHA cipher by default, which the JVM does not support out of the box. This raises the following error when connecting to Scylla:

18/07/18 13:13:41 WARN channel.ChannelInitializer: Failed to initialize a channel. Closing: [id: 0x8d6f78a7]
java.lang.IllegalArgumentException: Cannot support TLS_RSA_WITH_AES_256_CBC_SHA with currently installed providers

According to the ssl documentation we have two ciphers available:

  1. TLS_RSA_WITH_AES_256_CBC_SHA
  2. TLS_RSA_WITH_AES_128_CBC_SHA

We can get rid of the error by lowering the cipher to TLS_RSA_WITH_AES_128_CBC_SHA using the following configuration:

.config("spark.cassandra.connection.ssl.enabledAlgorithms", "TLS_RSA_WITH_AES_128_CBC_SHA")\

However, this is not really a good solution; instead, we'd be inclined to use the TLS_RSA_WITH_AES_256_CBC_SHA version. For this we need to follow this Datastax procedure.

Then we need to deploy the JCE security jars on all our client nodes; if using YARN like us, this means that you have to deploy these jars to all your NodeManager nodes.

For example by hand:

# unzip jce_policy-8.zip
# cp UnlimitedJCEPolicyJDK8/*.jar /opt/oracle-jdk-bin-1.8.0.144/jre/lib/security/

Java trust store

When connecting, the clients need to be able to validate the Scylla cluster's self-signed CA. This is done by setting up a trustStore JKS file and providing it to the spark connector configuration (note that you should protect this file with a password).

keyStore vs trustStore

In an SSL handshake, the purpose of the trustStore is to verify credentials, while the purpose of the keyStore is to provide them. The keyStore in Java stores the private key and the certificates corresponding to the public keys, and is required if you are an SSL server or if SSL requires client authentication. The trustStore stores certificates from third parties or your own self-signed certificates; your application identifies and validates them using this trustStore.

The spark-cassandra-connector documentation has two options to handle keyStore and trustStore.

When we did not use the trustStore option, we would get some obscure error when connecting to Scylla:

com.datastax.driver.core.exceptions.TransportException: [node/1.1.1.1:9042] Channel has been closed

When enabling DEBUG logging, we get a clearer error which indicated a failure in validating the SSL certificate provided by the Scylla server node:

Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
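
For reference, one way to enable that DEBUG logging on a log4j-based Spark 2 setup (the exact config path is deployment-specific):

echo 'log4j.logger.com.datastax.driver=DEBUG' >> "${SPARK_HOME}/conf/log4j.properties"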

setting up the trustStore JKS

You need to have the self-signed CA public certificate file; then issue the following command:

# keytool -importcert -file /usr/local/share/ca-certificates/MY_SELF_SIGNED_CA.crt -keystore COMPANY_TRUSTSTORE.jks -noprompt
Enter keystore password:  
Re-enter new password: 
Certificate was added to keystore

using the trustStore

Now you need to configure spark to use the trustStore like this:

.config("spark.cassandra.connection.ssl.trustStore.password", "PASSWORD")\
.config("spark.cassandra.connection.ssl.trustStore.path", "COMPANY_TRUSTSTORE.jks")\

Spark SSL configuration example

This wraps up the SSL connection configuration used for spark.

This example uses pyspark2 and reads a table in Scylla from a YARN cluster:

$ pyspark2 --packages datastax:spark-cassandra-connector:2.0.1-s_2.11 --files COMPANY_TRUSTSTORE.jks

>>> spark = SparkSession.builder.appName("scylla_app")\
.config("spark.cassandra.auth.password", "test")\
.config("spark.cassandra.auth.username", "test")\
.config("spark.cassandra.connection.host", "node1,node2,node3")\
.config("spark.cassandra.connection.ssl.clientAuth.enabled", True)\
.config("spark.cassandra.connection.ssl.enabled", True)\
.config("spark.cassandra.connection.ssl.trustStore.password", "PASSWORD")\
.config("spark.cassandra.connection.ssl.trustStore.path", "COMPANY_TRUSTSTORE.jks")\
.config("spark.cassandra.input.split.size_in_mb", 1)\
.config("spark.yarn.queue", "scylla_queue").getOrCreate()

>>> df = spark.read.format("org.apache.spark.sql.cassandra").options(table="my_table", keyspace="test").load()
>>> df.show()

19 Jul 2018 11:37am GMT

06 Jul 2018

Alexys Jacob: A botspot story

I felt like sharing a recent story that allowed us to identify a bot in a haystack thanks to Scylla.

The scenario

While working on loading 2B+ rows into Scylla from Hive (using Spark), we noticed strange behaviour in the performance of one of our nodes:

So we started wondering why the server in blue was having those load peaks, clearly diverging from the other two… As we obviously expect the three nodes to behave the same, there were two options on the table:

  1. hardware problem on the node
  2. bad data distribution (bad schema design? consistent hash problem?)

We shared this with our pals from ScyllaDB and started working on finding out what was going on.

The investigation

Hardware?

A hardware problem was ruled out pretty quickly; nothing showed up in the monitoring or the kernel logs, and I/O queues and throughput were good:

Data distribution?

Avi Kivity (ScyllaDB's CTO) quickly got the feeling that something was wrong with the data distribution and that we could be facing a hotspot situation. He quickly nailed it down to shard 44 thanks to the scylla-grafana-monitoring platform.

Data is distributed between shards that are stored on nodes (consistent hash ring). This distribution is done by hashing the primary key of your data which dictates the shard it belongs to (and thus the node(s) where the shard is stored).

If one of your keys is over represented in your original data set, then the shard it belongs to can be overly populated and the related node overloaded. This is called a hotspot situation.

tracing queries

The first step was to trace queries in Scylla to try to dig deeper into the hotspot analysis. So we enabled tracing, using the following formula to get about 1 trace per second in the system_traces keyspace.

tracing probability = 1 / expected requests per second throughput

In our case, we were doing between 90K req/s and 150K req/s, so we settled on 100K req/s to be safe and enabled tracing on our nodes like this:

# nodetool settraceprobability 0.00001

Turns out tracing didn't help very much in our case because traces do not include the query parameters in Scylla 2.1; that is becoming available in the soon-to-be-released 2.2 version.

NOTE: traces expire on the tables; make sure you TRUNCATE the events and sessions tables while iterating. Otherwise you will have to wait for the next gc_grace_period (10 days by default) before they are actually removed. If you do not do that and generate millions of traces like we did, querying the mentioned tables will likely time out because of the "tombstoned" rows, even if no traces remain.
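
A minimal sketch of that cleanup (adjust host and credentials for your cluster):

cqlsh -e "TRUNCATE system_traces.events; TRUNCATE system_traces.sessions;"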

looking at cfhistograms

Glauber Costa was also helping on the case and got us looking at the cfhistograms of the tables we were pushing data to. That clearly highlighted the hotspot problem:

histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                             (micros)          (micros)           (bytes)                  
50%             0,00              6,00              0,00               258                 2
75%             0,00              6,00              0,00               535                 5
95%             0,00              8,00              0,00              1916                24
98%             0,00             11,72              0,00              3311                50
99%             0,00             28,46              0,00              5722                72
Min             0,00              2,00              0,00               104                 0
Max             0,00          45359,00              0,00          14530764            182785

What this basically means is that at the 99th percentile our partitions are small (~5KB) while the biggest is 14MB! That's a huge difference and clearly shows that we have a hotspot on a partition somewhere.

So now we know for sure that we have an over represented key in our data set, but what key is it and why?

The culprit

So we looked at the cardinality of our data set keys, which are SHA256 hashes, and found out that we indeed had one with more than 1M occurrences while the second highest was around 100K!…

Now that we had the main culprit hash, we turned to our data streaming pipeline to figure out what kind of event was generating the data associated to the given SHA256 hash… and surprise! It was a client's quality assurance bot that was constantly browsing their own website with legitimate behaviour and identity credentials associated to it.

So we modified our pipeline to detect this bot and discard its events so that it stops polluting our databases with fake data. Then we cleaned up the millions of events' worth of mess and traces we had stored about the bot.

The aftermath

Finally, we cleared out the data in Scylla and tried again from scratch. Needless to say, the curves got way better and are exactly what we would expect from a well-balanced cluster:

Thanks a lot to the ScyllaDB team for their thorough help and high spirited support!

I'll quote them to conclude this quick blog post:

06 Jul 2018 2:50pm GMT

03 Jul 2018

Thomas Raschbacher: php 5.6 -> php 7.1 upgrade issues .. (internal server error)

Turns out I should read things like the gentoo wiki upgrade guide *first* to avoid issues ..

After installing php 7.1 to replace the (quite old) php 5.6, I did check php.ini .. but forgot to check for compiled modules and to set PHP_TARGETS .. then wondered why I just got Internal Server Error messages..
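
For the record, a rough sketch of the bits I had forgotten (the apache2 target and version strings are examples; the guide below is authoritative):

echo 'PHP_TARGETS="php7-1"' >> /etc/portage/make.conf
emerge --ask --changed-use --deep @world
eselect php set apache2 php7.1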

Thanks to the php team for writing this nice guide to remind people like me of what to do:

https://wiki.gentoo.org/wiki/PHP/Upgrading_to_PHP_7.1

As always the Gentoo Wiki is a great source of information, and I like to use it as a reminder of the things needed when installing/upgrading .. ;)

03 Jul 2018 9:41am GMT

29 Jun 2018

Kristian Fiskerstrand: My comments on the Gentoo Github hack

Several news outlets are reporting on the takeover of the Gentoo GitHub organization that was announced recently. Today 28 June at approximately 20:20 UTC unknown individuals have gained control of the Github Gentoo organization, and modified the content of repositories as well as pages there. We are still working to determine the exact extent and … Continue reading "My comments on the Gentoo Github hack"

29 Jun 2018 4:00pm GMT

28 Jun 2018

Gentoo News: Github Gentoo organization hacked - resolved

2018-07-04 14:00 UTC

We believe this incident is now resolved. Please see the incident report for details about the incident, its impact, and resolution.

2018-06-29 15:15 UTC

The community raised questions about the provenance of Gentoo packages. Gentoo development is performed on hardware run by the Gentoo Infrastructure team (not GitHub). The Gentoo hardware was unaffected by this incident. Users using the default Gentoo mirroring infrastructure should not be affected.

If you are still concerned about provenance or are unsure what solution you are using, please consult https://wiki.gentoo.org/wiki/Project:Portage/Repository_Verification. This will instruct you on how to verify your repository.

2018-06-29 06:45 UTC

The Gentoo GitHub organization remains temporarily locked down by GitHub support, pending fixes to pull-request content.

For ongoing status, please see the Gentoo infra-status incident page.

For later followup, please see the Gentoo Wiki page for GitHub 2018-06-28. An incident post-mortem will follow on the wiki.

28 Jun 2018 12:00am GMT

03 Jun 2018

Sebastian Pipping: Upstream release notification for package maintainers

Repology is monitoring package repositories across Linux distributions. The Atom feeds of per-maintainer outdated packages that I had been waiting for have now been implemented.

So I subscribed to my own Gentoo feed using net-mail/rss2email and now Repology notifies me via e-mail of new upstream releases that other Linux distros have packaged that I still need to bump in Gentoo. In my case, it brought an update of dev-vcs/svn2git to my attention that I would have missed (or heard about later), otherwise.
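
A minimal sketch of such a subscription with rss2email (the feed URL format is an assumption; copy the exact link from your Repology maintainer page):

r2e new you@example.com
r2e add repology 'https://repology.org/maintainer/you%40example.com/feed-for-repo/gentoo/atom'
r2e run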

Based on this comment, Repology may soon do release detection upstream similar to what euscan does, as well.

03 Jun 2018 1:22pm GMT

01 Jun 2018

Domen Kožar: Announcing Cachix - Binary Cache as a Service

In the last 6 years of working with Nix, and mostly full-time in the last two, I've noticed a few patterns.

These are mostly a direct or indirect result of not having "good enough" infrastructure to support how much Nix has grown (1600+ contributors, 1500 pull requests per month).

Without further ado, I am announcing https://cachix.org - Binary Cache as a Service that is ready to be used after two months of work.

What problem(s) does cachix solve?

The main motivation is to save you time and compute resources waiting for your packages to build. By using a shared cache of already built packages, you'll only have to build your project once.
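
A minimal sketch of the intended workflow (the cache name is an example):

cachix use mycache                  # trust the cache and configure Nix to use it
nix-build | cachix push mycache     # push build results so others can reuse them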

This should also speed up CI builds, as Nix can take advantage of granular per-package caching, rather than caching the whole build.

Another one (which I personally consider even more important) is decentralization of work produced by Nix developers. Up until today, most devs pushed their software updates into the nixpkgs repository, which has the global binary cache at https://cache.nixos.org.

But as the community grew, fitting different ideologies into one global namespace became impossible. I consider the nixpkgs community to be mature, but sometimes a clash of ideologies with rational backing occurs. Some want packages to be featureful by default, some prefer them to be minimalist. Some might prefer lots of configuration knobs available (for example cross-compilation support or musl/glibc swapping), some might prefer the build system to do just one thing, as it's easier to maintain.

These are not right or wrong opinions, but rather a specific view of use cases that software might or might not cover.

There are also many projects that don't fit into nixpkgs because their releases are too frequent, they are not available under a permissive license, they are simpler to manage with complete control, or maintainers simply disagree with requirements that nixpkgs developers impose on contributors.

And that's fine. What we've learned in the past is not to fight these ideas, but allow them to co-exist in different domains.

If you're interested:

Domen (domen@enlambda.com)

01 Jun 2018 10:00am GMT

25 May 2018

Michał Górny: The story of Gentoo management

I have recently made a tabular summary of (probably) all Council members and Trustees in the history of Gentoo. I think that this table provides a very succinct way of expressing the changes within management of Gentoo. While it can't express the complete history of Gentoo, it can serve as a useful tool of reference.

What questions can it answer? For example, it provides an easy way to see how many terms individuals have served, or how long Trustee terms were. You can clearly see who served both on the Council and on the Board and when those two bodies had common members. Most notably, it collects a fair amount of hard-to-find data in a single table.

Can you trust it? I've put effort into making the developer lists correct, but given the bad quality of the data (see below), I can't guarantee complete correctness. The Trustee term dates are approximate at best, and oriented around elections rather than actual terms (which are hard to find). Finally, I've merged a few short-time changes such as empty seats between a resignation and the appointment of a replacement, as expressing them one by one made little sense and would cause the tables to grow even longer.

This article aims to be the text counterpart to the table. I would like to tell the history of the presented management bodies, explain the sources that I've used to get the data and the problems that I've found while working on it.

As you could suspect, the further back I had to go, the worse the data I was able to find. The problems included the limited scope of our archives and some apparent secrecy of decision-making processes in the early days (judging by some cross-posts, the traffic on the -core mailing list was significant, and it was not archived before late 2004). Both due to the lack of data and due to the specific interest in developer self-government, this article starts in mid-2003.

Continue reading

25 May 2018 3:43pm GMT

20 May 2018

Michał Górny: Empty directories, *into, dodir, keepdir and tmpfiles.d

There seems to be some serious confusion around the way directories are installed in Gentoo. In this post, I would like to shortly explain the differences between different methods of creating directories in ebuilds, and instruct how to handle the issues related to installing empty directories and volatile locations.

Empty directories are not guaranteed to be installed

First things first. The standards are pretty clear here:

Behaviour upon encountering an empty directory is undefined. Ebuilds must not attempt to install an empty directory.

PMS 13.2.2 Empty directories (EAPI 7 version)

What does that mean in practice? It means that if an empty directory is found in the installation image, it may or may not be installed. Or it may be installed, and incidentally removed later (that's the historical Portage behavior!). In any case, you can't rely on either behavior. If you really need a directory to exist once the package is installed, you need to make it non-empty (see: keepdir below). If you really need a directory not to exist, you need to rmdir it from the image.

That said, this behavior does make sense. It guarantees that the Gentoo installation is secured against empty-directory pruning tools.

*into

The *into family of functions is used to control install destination for other ebuild helpers. By design, either they or the respective helpers create the install directories as necessary. In other words, you do not need to call dodir when using *into.

dodir

dodir is not really special in any way. It is just a convenient wrapper for install -d that prepends ${ED} to the path. It creates an empty directory the same way the upstream build system would have created it, and if the directory is left empty, it is not guaranteed to be preserved.

So when do you use it? You use it when you need to create a directory that will not be created otherwise and that will become non-empty at the end of the build process. Example use cases are working around broken build systems (that fail due to non-existing directories but do not create them), and creating directories when you want to manually write to a file there.

src_install() {
    # build system is broken and fails
    # if ${D}/usr/bin does not exist
    dodir /usr/bin
    default

    dodir /etc/foo
    sed -e "s:@libdir@:$(get_libdir):" \
        "${FILESDIR}"/foo.conf.in \
        > "${ED}"/etc/foo/foo.conf || die
}

keepdir

keepdir is the function specifically meant for installing empty directories. It creates the directory, and a keep-file inside it. The directory becomes non-empty, and therefore guaranteed to be installed and preserved. When using keepdir, you do not call dodir as well.

Note that actually preserving the empty directories is not always necessary. Sometimes packages are perfectly capable of recreating the directories themselves. However, make sure to verify that the permissions are correct afterwards.

src_install() {
    default

    # install empty directory
    keepdir /var/lib/foo
}

Volatile locations

The keepdir method works fine for persistent locations. However, it will not work correctly in directories such as /run that are volatile, or /var/cache that may be subject to wiping by the user. On Gentoo, this also includes /var/run (which OpenRC maintainers unilaterally decided to turn into a /run symlink), and /var/lock.

Since the package manager does not handle recreating those directories e.g. after a reboot, something else needs to. There are three common approaches to it, most preferred first:

  1. Application creates all necessary directories at startup.
  2. tmpfiles.d file is installed to create the files at boot.
  3. Init script creates the directories before starting the service (checkpath).

The preferred approach is for applications to create those directories themselves. However, not all applications do that, and not all actually can. For example, applications that are running unprivileged generally can't create those directories.

The second approach is to install a tmpfiles.d file to create (and maintain) the directory. Those files work out of the box both for systemd and OpenRC users (via opentmpfiles). The directories are (re-)created at boot, and optionally cleaned up periodically. The ebuild should also use tmpfiles.eclass to trigger directory creation after installing the package, as in the sketch below.
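
A minimal sketch of that approach (foo and its paths are hypothetical):

# files/foo.tmpfiles contains a standard tmpfiles.d line, e.g.:
# d /run/foo 0755 foo foo -

inherit tmpfiles

src_install() {
    default
    newtmpfiles "${FILESDIR}"/foo.tmpfiles foo.conf
}

pkg_postinst() {
    tmpfiles_process foo.conf
}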

The third approach is to make the init script create the directory. This was the traditional way but nowadays it is generally discouraged as it causes duplication between different init systems, and the directories are not created when the application is started directly by the user.

Summary

To summarize:

  1. when you install files via *into, installation directories are automatically created for you;
  2. when you need to create a directory into which files are installed in other way than ebuild helpers, use dodir;
  3. when you need to install an empty directory in a non-volatile location (and application can't just create it on start), use keepdir;
  4. when you need to install a directory into a volatile location (and application can't just create it on start), use tmpfiles.d.

20 May 2018 8:03am GMT