10 Apr 2024

Planet Gentoo

Gentoo Linux becomes an SPI associated project


As of this March, Gentoo Linux has become an Associated Project of Software in the Public Interest, see also the formal invitation by the Board of Directors of SPI. Software in the Public Interest (SPI) is a non-profit corporation founded to act as a fiscal sponsor for organizations that develop open source software and hardware. It provides services such as accepting donations, holding funds and assets, … SPI qualifies for 501(c)(3) (U.S. non-profit organization) status. This means that all donations made to SPI and its supported projects are tax deductible for donors in the United States. Read on for more details…

Questions & Answers

Why become an SPI Associated Project?

Gentoo Linux, as a collective of software developers, is pretty good at being a Linux distribution. However, becoming a US federal non-profit organization would increase the non-technical workload.

The current Gentoo Foundation has bylaws that restrict its activities to those of a non-profit, and it is a recognized non-profit only in New Mexico; at the US federal level it remains a for-profit entity. A direct conversion to a federally recognized non-profit would be unlikely to succeed without significant effort and cost.

Finding Gentoo Foundation trustees to take care of the non-technical work is an ongoing challenge. Robin Johnson (robbat2), our current Gentoo Foundation treasurer, spent a huge amount of time and effort getting the bookkeeping and taxes in order after the prior treasurers lost interest and retired from Gentoo.

For these reasons, Gentoo is moving the non-technical organizational overhead to Software in the Public Interest (SPI). As noted above, SPI is already recognized at the US federal level as a full-fledged 501(c)(3) non-profit. It also handles several projects of similar type and size (e.g., Arch and Debian) and as such has exactly the experience and background that Gentoo needs.

What are the advantages of becoming an SPI Associated Project in detail?

Financial benefits to donors:

Financial benefits to Gentoo:

Non-financial benefits to Gentoo:

[1] Presently, almost no donations to the Gentoo Foundation provide a tax benefit for donors anywhere in the world. Becoming an SPI Associated Project enables tax benefits for donors located in the USA. Some other countries do recognize donations made to non-profits in other jurisdictions and provide similar tax credits.

[2] This also depends on jurisdictions and local tax laws of the donor, and is often tied to tax deductions.

[3] The Gentoo Foundation currently pays $1500/year in tax preparation costs.

[4] In recent fiscal years, through careful budgetary planning on the part of the Treasurer and advice of tax professionals, the Gentoo Foundation has used depreciation expenses to offset taxes owing; however, this is not a sustainable strategy.

[5] Non-profits are eligible for reduced fees from, e.g., PayPal (savings of 0.9-1.29% per donation) and other services.

[6] Some sponsorship programs are only available to verified 501(c)(3) organizations.

Can I still donate to Gentoo, and how?

Yes, of course, and please do so! To start, you can go to SPI's Gentoo page and scroll down to the PayPal and Click&Pledge donation links. More information and additional donation methods will be set up soon. Keep in mind, donations to Gentoo via SPI are tax-deductible in the US!

In time, Gentoo will contact existing recurring donors to help them transition to SPI's donation systems.

What will happen to the Gentoo Foundation?

Our intention is to eventually transfer the existing assets to SPI and dissolve the Gentoo Foundation. The precise steps needed on the way to this objective are still under discussion.

Does this affect in any way the European Gentoo e.V.?

No. Förderverein Gentoo e.V. will continue to exist independently. It is also recognized as serving public-benefit purposes (§ 52 Fiscal Code of Germany), meaning that donations to it are tax-deductible in Germany.

10 Apr 2024 5:00am GMT

01 Apr 2024

Planet Gentoo

The interpersonal side of the xz-utils compromise

While everyone is busy analyzing the highly complex technical details of the recently discovered xz-utils compromise that is currently rocking the internet, it is worth looking at the underlying non-technical problems that make such a compromise possible. A very good write-up can be found on the blog of Rob Mensching...

"A Microcosm of the interactions in Open Source projects"

01 Apr 2024 2:54pm GMT

15 Mar 2024

Planet Gentoo

Optimizing parallel extension builds in PEP517 builds

The distutils (and therefore setuptools) build system supports building C extensions in parallel through the -j (--parallel) option, passed either to the build_ext or the build command. The Gentoo distutils-r1.eclass has always passed these options to speed up builds of packages that feature multiple C files.
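As a quick illustration (not taken from the eclass, and with an arbitrary job count), the option can be passed on a manual setup.py invocation:

# build C extensions using 8 parallel jobs
python setup.py build_ext -j 8

# the build command accepts the same option and forwards it to build_ext
python setup.py build -j 8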

However, the switch to the PEP517 build backend made this problematic. While the backend uses the respective commands internally, it doesn't provide a way to pass options to them. In this post, I'd like to explore the different ways we attempted to resolve this problem, trying to find an optimal solution that would let us benefit from parallel extension builds while keeping the overhead minimal for packages that wouldn't benefit from them (e.g. pure Python packages). I will also include fresh benchmark results comparing these methods.

The history

The legacy build mode utilized two ebuild phases: the compile phase, during which the build command was invoked, and the install phase, during which the install command was invoked. An explicit command invocation made it possible to simply pass the -j option.
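Schematically, the two phases boiled down to something like the following. This is a simplified sketch rather than the actual eclass code: the real eclass derives the job count from the user's MAKEOPTS and passes further options to install, whereas nproc and the bare --root here are stand-ins.

# compile phase: build sources and extensions, with parallel extension builds
esetup.py build -j "$(nproc)"

# install phase: install the build results into the image directory
esetup.py install --root="${D}"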

When we initially implemented the PEP517 mode, we simply continued calling esetup.py build, prior to calling the PEP517 backend. The former call built all the extensions in parallel, and the latter simply reused the existing build directory.

This was a bit ugly, but it worked most of the time. However, it suffered from the significant overhead of invoking the build command. This meant significantly slower builds for the vast majority of packages, which do not feature multiple C source files and therefore could not benefit from parallel builds.

The next optimization was to replace the build command invocation with the more specific build_ext. While the former also involved copying all .py files to the build directory, the latter only built C extensions - and therefore could be pretty much a no-op if there were none. As a side effect, we started hitting rare bugs where custom setup.py scripts assumed that build_ext would never be called directly. For a relatively recent example, there is my pull request to fix build_ext -j… in pyzmq.

I followed this immediately with another optimization: skipping the call if there were no source files. To be honest, the code started looking messy at this point, but it was an optimization nevertheless. For the no-extension case, the overhead of calling esetup.py build_ext was replaced by the overhead of calling find to scan the source tree. Of course, this still had some risk of false positives and false negatives.

The next optimization was to call build_ext only if there were at least two files to compile. This mostly addressed the added overhead for packages building only one C file - but of course it couldn't resolve all false positives.
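Put together, the logic at this stage looked conceptually like this. It is a rough approximation rather than the actual eclass code; the set of file extensions and the job count are illustrative only.

# count compilable sources, stopping after two - a single file gains nothing
# from a parallel build
count=$(find . -name '*.c' -o -name '*.cc' -o -name '*.cpp' | head -n 2 | wc -l)
if [[ ${count} -ge 2 ]]; then
    esetup.py build_ext -j "$(nproc)"
fi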

One more optimization was to make the call conditional on the DISTUTILS_EXT variable. While the variable was introduced for another reason (to control adding the debug flag), it provided a nice solution to avoid both most of the false positives (even if they were extremely rare) and the overhead of calling find.
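With that variable in the picture, the whole pre-build could be gated on it, again as a conceptual sketch rather than the literal eclass code:

# only ebuilds that declare C extensions via DISTUTILS_EXT pay any extra cost
if [[ ${DISTUTILS_EXT} ]]; then
    esetup.py build_ext -j "$(nproc)"
fi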

The last step wasn't mine. It was Eli Schwartz's patch to pass build options via DIST_EXTRA_CONFIG. This provided the ultimate optimization - instead of trying to hack an extra build_ext call around the backend, we were finally able to pass the necessary options to the PEP517 backend itself. Needless to say, it meant not only no false positives and no false negatives, but it effectively almost eliminated the overhead in all cases (except for the cost of writing the configuration file).
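Conceptually, this approach amounts to writing a small supplementary config file and pointing setuptools at it through the environment, roughly along these lines. This is a minimal sketch: the temporary path and job count are placeholders, and the actual eclass may set additional options.

# write an extra distutils/setuptools config file enabling parallel extension builds
cat > "${T}/extra-setup.cfg" <<EOF
[build_ext]
parallel = $(nproc)
EOF

# setuptools reads the file pointed to by this environment variable
export DIST_EXTRA_CONFIG="${T}/extra-setup.cfg"

# the PEP517 backend invoked afterwards picks up the option
python3.12 -m build -nwx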

The timings

Table 1. Benchmark results for various modes of package builds
                          Django 5.0.3                 Cython 3.0.9
                       command  PEP517   total      command  PEP517   total
Serial PEP517 build       -        -      5.4 s        -        -     46.7 s
build                   3.1 s    5.3 s    8.4 s      20.8 s    2.7 s  23.5 s
build_ext               0.6 s    5.4 s    6 s        20.8 s    2.7 s  23.5 s
find + build_ext        0.06 s   5.4 s    5.5 s      20.9 s    2.7 s  23.6 s
Parallel PEP517 build     -        -      5.4 s        -        -     22.8 s

(command = explicit esetup.py call, PEP517 = backend invocation, total = sum)
Figure 1. Benchmark times plot

For a pure Python package (Django here), the table clearly shows how successive iterations reduced the overhead of parallel build support, from roughly 3 seconds in the earliest approach (resulting in a total build time of 8.4 s) down to the same 5.4 s as a regular PEP517 build.

For Cython, all but the ultimate solution result in roughly 23.5 s total, half of the time needed for a serial build (46.7 s). The ultimate solution saves another 0.8 s on the double invocation overhead, giving the final result of 22.8 s.

Test data and methodology

The methods were tested against two packages:

  1. Django 5.0.3, representing a moderate size pure Python package, and
  2. Cython 3.0.9, representing a package with a moderate number of C extensions.

Python 3.12.2_p1 was used for testing. The timings were done using the time command in bash. The results were averaged over 5 warm-cache test runs. Testing was done on an AMD Ryzen 5 3600, with pstates boost disabled.

The PEP517 builds were performed using the following command:

python3.12 -m build -nwx

The remaining commands and conditions were copied from the eclass. The test scripts, along with the results, spreadsheet and plot source can be found in the distutils-build-bench repository.

15 Mar 2024 3:41pm GMT