20 Sep 2017

Planet Grep

Philip Van Hoof: Post Iceland holiday

I'm filled up with new inspiration.

20 Sep 2017 8:59pm GMT

Dries Buytaert: Announcing Node.js on Acquia Cloud

Today, Acquia announced that it expanded Acquia Cloud to support Node.js, the popular open-source JavaScript runtime. This is a big milestone for Acquia as it is the first time we have extended our cloud beyond Drupal. I wanted to take some time to explain the evolution of Acquia's open-source stack and why this shift is important for our customers' success.

From client-side JavaScript to server-side JavaScript

JavaScript was created at Netscape in 1995, when Brendan Eich wrote the first version in just 10 days. It took around 10 years for JavaScript to reach enterprise maturity, however. Adoption accelerated in 2004 when Google used JavaScript to build the first release of Gmail. In comparison to e-mail competitors like Yahoo! Mail and Hotmail, Gmail showed what was possible with client-side JavaScript, which enables developers to update pages dynamically and reduces full-page refreshes and round trips to the server. The benefit is an improved user experience that is usually faster, more dynamic in its behavior, and generally more application-like.

In 2008, Google released the V8 JavaScript engine, which was embedded into its Chrome browser to make both Gmail and Google Maps faster. In 2009, Ryan Dahl used the V8 runtime as the foundation of Node.js, which enabled server-side JavaScript, breaking the language out of the boundaries of the browser. Node.js is event-driven and provides asynchronous, non-blocking I/O - things that help developers build modern web applications, especially those with real-time capabilities and streamed data. It ushered in the era of isomorphic applications, which means that JavaScript applications can now share code between the client side and server side. The introduction of Node.js has spurred a JavaScript renaissance and contributed to the popularity of JavaScript frameworks such as AngularJS, Ember and React.

Acquia's investment in Headless Drupal

In the web development world, few trends are spreading more rapidly than decoupled architectures using JavaScript frameworks and headless CMS. Decoupled architectures are gaining prominence because architects are looking to take advantage of other front-end technologies, most commonly JavaScript based front ends, in addition to those native to Drupal.

Acquia has been investing in the development of headless Drupal for nearly five years, going back to when we began contributing to the addition of web service APIs to Drupal core. A year ago, we released Waterwheel, an ecosystem of software development kits (SDKs) that enables developers to build Drupal-backed applications in JavaScript and Swift, without needing extensive Drupal expertise. This summer, we released Reservoir, a Drupal distribution for decoupled Drupal. Over the past year, Acquia has helped to support a variety of headless architectures, with and without Node.js. While not always required, Node.js is often used alongside a headless Drupal application to provide server-side rendering of JavaScript applications or real-time capabilities.

Managed Node.js on Acquia Cloud

Previously, if an organization wanted to build a decoupled architecture with Node.js, it was not able to host the Node.js application on Acquia Cloud. This meant that the organization had to run Node.js with a separate vendor. In many instances, this required organizations to monitor, troubleshoot and patch the infrastructure supporting the Node.js application on their own. Separating the management of the Node.js application and the Drupal back end not only introduces a variety of complexities, including security risk and governance challenges, but it also creates operational strain. Organizations must rely on two vendors, two support teams, and multiple contacts to build decoupled applications using Drupal and Node.js.

To eliminate this inefficiency, Acquia Cloud can now support both Drupal and Node.js. Our goal is to offer the best platform for developing and running Drupal and Node.js applications. This means that organizations only need to rely on one vendor and one cloud infrastructure when using Drupal and Node.js. Customers can access Drupal and Node.js environments from a single user interface, in addition to tools that enable continuous delivery, continuous integration, monitoring, alerting and support across both Drupal and Node.js.

Acquia Cloud now supports Node.js for headless Drupal. On Acquia Cloud, customers can access Drupal and Node.js environments from a single user interface.

Delivering on Acquia's mission

When reflecting on Acquia's first decade this past summer, I shared that one of the original corporate values our small team dreamed up was to "empower everyone to rapidly assemble killer websites". After ten years, we've evolved our mission to "build the universal platform for the world's greatest digital experiences". While our focus has expanded as we've grown, Acquia's enduring aim is to provide our customers with the best tools available. Adding Node.js to Acquia Cloud is a natural evolution of our mission.

20 Sep 2017 2:48pm GMT

Mattias Geniar: DNS Research: using SPF to query internal DNS resolvers

The post DNS Research: using SPF to query internal DNS resolvers appeared first on ma.ttias.be.

Using SPF records to trigger a response from an internal DNS server: a clever way to extract otherwise closed data!

In response to the spread of cache poisoning attacks, many DNS resolvers have gone from being open to closed resolvers, meaning that they will only perform queries on behalf of hosts within a single organization or Internet Service Provider.

As a result, measuring the security of the DNS infrastructure has been made more difficult. Closed resolvers will not respond to researcher queries to determine if they utilize security measures like port randomization or transaction id randomization.

However, we can effectively turn a closed resolver into an open one by sending an email to a mail server (MTA) in the organization. This causes the MTA to make a query on the external researchers' behalf, and we can log the security features of the DNS resolver using information gained by a nameserver and email server under our control.

Source: DNS Research
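As a concrete illustration, the researcher's own domain might publish an SPF record along these lines (probe.example and the macro layout are hypothetical, not taken from the paper). When the probe mail arrives, the target's MTA asks its internal resolver to evaluate the record, and the exists: macro forces a lookup against the researcher's authoritative nameserver, exposing the resolver's source port and transaction ID behavior:

```
; hypothetical zone snippet on the researcher-controlled domain
; the exists: macro expands per connecting IP (RFC 7208 %{i}), so the
; target's internal resolver must query the researcher's nameserver
probe.example.  3600  IN  TXT  "v=spf1 exists:%{i}.probe.example -all"
```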


20 Sep 2017 1:28pm GMT

18 Sep 2017


Mattias Geniar: A proposal for cryptocurrency addresses in DNS

The post A proposal for cryptocurrency addresses in DNS appeared first on ma.ttias.be.

By now it's pretty clear that the idea of a cryptocurrency probably isn't going away. It might not be Bitcoin or Litecoin, it might not have the same value as it does today, but the concept of cryptocurrency is here to stay: digital money.

Just like the beginning of IP addresses, using them raw was fine at first. But with crypto you get long hexadecimal strings that no one can remember by heart. It's far from user friendly.

It's like trying to remember that 2a03:a800:a1:1952::ff is the address for this site. Doesn't work very well, does it? It's far easier to say ma.ttias.be than the cryptic representation of IPv6.

I think we need something similar for cryptocurrencies. Something independent and -- relatively -- secure. So here's my proposal I came up with in the car on the way home.

Example: cryptocurrency in DNS

Here's the simplest example I can give.

$ dig ma.ttias.be TXT | sort
ma.ttias.be.    3600    IN    TXT   "ico:10 btc:1AeCyEczAFPVKkvausLSQWP1jcqkccga9m"
ma.ttias.be.    3600    IN    TXT   "ico:10 ltc:Lh1TUmh2WP4LkCeDTm3kMX1E7NQYSKyMhW"
ma.ttias.be.    3600    IN    TXT   "ico:20 eth:0xE526E2Aecb8B7C77293D0aDf3156262e361BfF0e"
ma.ttias.be.    3600    IN    TXT   "ico:30 xrp:rDsbeomae4FXwgQTJp9Rs64Qg9vDiTCdBv"

Cryptocurrency addresses get published as TXT records to a domain of your choosing. Want to receive a payment? Simply say "send it to ma.ttias.be"; the client will resolve the TXT records, extract the accompanying addresses and use the priority field as a guideline for choosing which address to pick first.

Think MX records, but implemented as TXT. The lower the priority, the more preferred it is.

The TXT format explained

A TXT format can contain pretty much anything, so it needs some standardization in order for this to work. Here's my proposal.

ico:[priority] space [currency]:[address]

Let's pick the first result as an example and tear it down.

$ dig ma.ttias.be TXT | sort | head -n 1
ma.ttias.be.    3600    IN    TXT   "ico:10 btc:1AeCyEczAFPVKkvausLSQWP1jcqkccga9m"

Translates to;

- ico: marks this TXT record as a cryptocurrency record, with priority 10
- btc: the currency
- 1AeCyEczAFPVKkvausLSQWP1jcqkccga9m: the address on which to receive payments

Simple, versatile format.

The priority allows for round robin implementations, if you wish to diversify your portfolio. Adding multiple cryptocurrencies gives the sender the freedom to choose which currency he/she prefers, while still honoring your priority.

Technically, I published 2 records with a priority of 10. It's up to the sender to determine which currency he/she prefers, if it's available to them. If it isn't, they can move down the chain & try other addresses published.

It means only addresses on which you want to receive currency should ever be posted as DNS records.
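To make the selection logic concrete, here's a minimal sketch in Python of how a wallet client might consume these records, assuming the TXT strings have already been fetched via DNS (the record values are the examples from above; the function names are mine, not part of the proposal):

```python
def parse_record(record):
    # "ico:10 btc:1AeCy..." -> (10, "btc", "1AeCy...")
    ico_part, addr_part = record.split(" ", 1)
    priority = int(ico_part.split(":", 1)[1])
    currency, address = addr_part.split(":", 1)
    return priority, currency, address

def pick_address(txt_records, accepted_currencies):
    # Lowest priority wins, like MX preference; skip currencies the
    # sender can't pay with and fall through to the next record.
    for priority, currency, address in sorted(map(parse_record, txt_records)):
        if currency in accepted_currencies:
            return currency, address
    return None  # no published address in a currency the sender supports

records = [
    "ico:30 xrp:rDsbeomae4FXwgQTJp9Rs64Qg9vDiTCdBv",
    "ico:10 btc:1AeCyEczAFPVKkvausLSQWP1jcqkccga9m",
    "ico:20 eth:0xE526E2Aecb8B7C77293D0aDf3156262e361BfF0e",
]
print(pick_address(records, {"eth", "xrp"}))
```

A sender that only supports eth and xrp ends up at the eth record, since its priority (20) beats xrp's (30).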

DNSSEC required

To avoid DNS cache poisoning or other man-in-the-middle attacks, DNSSEC would have to be a hard requirement in order to guarantee integrity.

This should not be optional.

If smart people ever end up implementing something like this, an additional GPG/PKI-like solution might be added for increased security, by signing the addresses once more.

Currency agnostic

This isn't a solution for Bitcoin, Litecoin or Ripple. It's a solution for all of them. And all the new currencies to come.

It's entirely currency agnostic and can be used for any virtual currency.

Multi tenancy

If you want multiple users on the same domain, you could solve this via subdomains, e.g. "john.domain.tld", "mary.domain.tld", ...

It makes the format of the TXT record plain and simple and uses basic DNS records for delegation of accounts.

Why not a dedicated resource record?

For the same reason the SPF resource record went away and was replaced by a TXT alternative: availability.

Every DNS server and client already understands TXT records. If we have to wait for servers, clients and providers to implement something like an ICO resource record, it'll take ages. Just look at the current state of CAA records: only a handful of providers offer them, even though CAs are already required to check them.

There are already simpler naming schemes for cryptocurrency!

Technically, yes, but they all have a deep flaw: you have to trust someone else's system.

There's BitAlias, onename, ens for ethereum, okTurtles, ... and they all build on top of their own, custom system.

But it turns out we already have a name translation system called DNS; we'd be far better off implementing a readable cryptocurrency variant in DNS than in someone else's closed system.

The validator regex

With the example given above, it can easily be validated with the following regex.

ico:([0-9]+) ([a-z]{3}):([a-zA-Z0-9]+)

And it translates to;

- ([0-9]+): the priority
- ([a-z]{3}): the three-letter currency code
- ([a-zA-Z0-9]+): the address itself

A complete validator with the dig DNS client would translate to;

$ dig ma.ttias.be TXT | sort | grep -P 'ico:([0-9]+) ([a-z]{3}):([a-zA-Z0-9]+)'
ma.ttias.be.    3600    IN    TXT   "ico:10 btc:1AeCyEczAFPVKkvausLSQWP1jcqkccga9m"
ma.ttias.be.    3600    IN    TXT   "ico:10 ltc:Lh1TUmh2WP4LkCeDTm3kMX1E7NQYSKyMhW"
ma.ttias.be.    3600    IN    TXT   "ico:20 eth:0xE526E2Aecb8B7C77293D0aDf3156262e361BfF0e"
ma.ttias.be.    3600    IN    TXT   "ico:30 xrp:rDsbeomae4FXwgQTJp9Rs64Qg9vDiTCdBv"
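The same validation works outside the shell, too. A short Python sketch, assuming the TXT values have already been extracted from the dig output (sample strings copied from above):

```python
import re

# Anchored version of the proposed validator regex.
VALIDATOR = re.compile(r"^ico:([0-9]+) ([a-z]{3}):([a-zA-Z0-9]+)$")

samples = [
    "ico:10 btc:1AeCyEczAFPVKkvausLSQWP1jcqkccga9m",
    "ico:30 xrp:rDsbeomae4FXwgQTJp9Rs64Qg9vDiTCdBv",
    "ico:10 bitcoin:abc",  # invalid: currency code must be exactly 3 letters
]
for sample in samples:
    match = VALIDATOR.match(sample)
    print(sample, "->", match.groups() if match else "invalid")
```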

Now, who's going to make this an RFC? I certainly won't, I've got too many things to do already.


18 Sep 2017 7:43pm GMT

Xavier Mertens: [SANS ISC] Getting some intelligence from malspam

I published the following diary on isc.sans.org: "Getting some intelligence from malspam".

Many of us are receiving a lot of malspam every day. By "malspam", I mean spam messages that contain a malicious document. This is one of the classic infection vectors today and aggressive campaigns are started every week. Usually, most of them are blocked by modern antivirus or anti-spam, but these files could help us to get some intelligence about the topics used by attackers to fool their victims. By checking the names of malicious files (often .rar, .zip or .7z archives), we found classic words like 'invoice', 'reminder', 'urgent', etc… [Read more]

[The post [SANS ISC] Getting some intelligence from malspam has been first published on /dev/random]

18 Sep 2017 10:09am GMT

17 Sep 2017


Mattias Geniar: Chrome to force .dev domains to HTTPS via preloaded HSTS

The post Chrome to force .dev domains to HTTPS via preloaded HSTS appeared first on ma.ttias.be.

tl;dr: one of the next versions of Chrome is going to force all domains ending in .dev (and .foo) to be redirected to HTTPS via a preloaded HTTP Strict Transport Security (HSTS) header.


This very interesting commit just landed in Chromium:

Preload HSTS for the .dev gTLD.

This adds the following line to Chromium's preload lists;

{ "name": "dev", "include_subdomains": true, "mode": "force-https" },
{ "name": "foo", "include_subdomains": true, "mode": "force-https" },

It forces any domain under the .dev gTLD to HTTPS.

Wait, there's a legit .dev gTLD?

Yes, unfortunately.

It's been bought by Google as one of their 100+ new gTLDs. What do they use it for? No clue. But it's going to cause a fair bit of confusion and pain to web developers.

The .dev gTLD has nameservers and is basically like any other TLD out there, we as developers just happen to have chosen that name as a good placeholder for local development, too, overwriting the public DNS.

$ dig +trace dev. NS
dev.                    172800  IN      NS      ns-tld4.charlestonroadregistry.com.
dev.                    172800  IN      NS      ns-tld5.charlestonroadregistry.com.
dev.                    172800  IN      NS      ns-tld3.charlestonroadregistry.com.
dev.                    172800  IN      NS      ns-tld2.charlestonroadregistry.com.
dev.                    172800  IN      NS      ns-tld1.charlestonroadregistry.com.

Google publishes some of their domains on there, too;

$ dig +trace google.dev A
google.dev.             3600    IN      A       127.0.53.53

So yes, it's a legit TLD.

Consequences of redirecting .dev to HTTPS

A lot of (web) developers use a local .dev TLD for their own development. Either by adding records to their /etc/hosts file or by using a system like Laravel Valet, which runs a dnsmasq service on your system to translate *.dev to 127.0.0.1.

In those cases, if you browse to http://site.dev, you'll be redirected to https://site.dev, the HTTPS variant.

That means your local development machine needs to;

- serve every local site over HTTPS,
- generate certificates for each of those sites,
- and trust those (self-signed) certificates in your browser or OS.

Such fun.

What should we do?

With .dev being an official gTLD, we're most likely better off changing our preferred local development suffix from .dev to something else.

There's an excellent proposal to add the .localhost domain as a new standard, which would be more appropriate here. It would mean we no longer have site.dev, but site.localhost. And everything at *.localhost would automatically translate to 127.0.0.1, without /etc/hosts or dnsmasq workarounds.

Alternatively, if you're looking for a quick "search and replace" alternative for existing setups, consider the .test gTLD, which is a reserved name by IETF for testing (or development) purposes.
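For the dnsmasq-based setups mentioned above, the switch amounts to a one-line rule change; a sketch of the relevant dnsmasq directive (the config file location varies per tool and OS):

```
# dnsmasq: resolve every *.test name to localhost instead of *.dev
address=/test/127.0.0.1
```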

I do hope the Chromium team reconsiders the preloaded HSTS as it's going to have rather big implications for local web development.


17 Sep 2017 8:04am GMT

15 Sep 2017


Dries Buytaert: Don't blame open-source software for poor security practices

Last week, Equifax, one of the largest American credit agencies, was hit by a cyberattack that may have compromised the personal data of nearly 143 million people, including name, address, social security numbers, birth dates and more. The compromised information reveals everything required to steal someone's identity or to take out a loan in someone else's name. Considering that the current US population is 321 million, this cyberattack is now considered to be one of the largest and most intrusive breaches in US history.

It's Equifax that is to blame, not open-source

As Equifax began to examine how the breach occurred, many unsubstantiated reports and theories surfaced in an attempt to pinpoint the vulnerability. One such theory targeted Apache Struts as the software responsible for the breach. Because Apache Struts is an open-source framework used for developing Java applications, this resulted in some unwarranted open-source shaming.

Yesterday, Equifax confirmed that the security breach was due to an Apache Struts vulnerability. However, here is what is important: it wasn't because Apache Struts is open-source or because open-source is less secure. Equifax was hacked because the firm failed to patch a well-known Apache Struts flaw that was disclosed months earlier in March. Running an old, insecure version of software - open-source or proprietary - can and will jeopardize the security of any site. It's Equifax that is to blame, not open-source.

The importance of keeping software up-to-date

The Equifax breach is a good reminder of why organizations need to remain vigilant about properly maintaining and updating their software, especially when security vulnerabilities have been disclosed. In an ideal world, software would update itself the moment a security patch is released. WordPress, for example, offers automatic updates in an effort to promote better security, and to streamline the update experience overall. It would be interesting to consider automatic security updates for Drupal (just for patch releases, not for minor or major releases).

In absence of automatic updates, I would encourage users to work with PaaS companies that keep not only your infrastructure secure, but also your Drupal application code. Too many organizations underestimate the effort and expertise it takes to do it themselves.

At Acquia, we provide customers with automatic security patching of both the infrastructure and Drupal code. We monitor our customers' sites for intrusion attempts, DDoS attacks, and other suspicious activity. If you prefer to do the security patching yourself, we offer continuous integration or continuous delivery tools that enable you to get security patches into production in minutes rather than weeks or months. We take pride in assisting our customers to keep their sites current with the latest patches and upgrades; it's good for our customers and helps dispel the myth that open-source software is more susceptible to security breaches.

15 Sep 2017 2:28pm GMT

14 Sep 2017


Mattias Geniar: Laravel Horizon: requires ext-posix, missing from CentOS

The post Laravel Horizon: requires ext-posix, missing from CentOS appeared first on ma.ttias.be.

Here's what I ran into when I tried to install a project that required laravel/horizon via Composer.

$ composer install
Loading composer repositories with package information
Installing dependencies (including require-dev) from lock file
Your requirements could not be resolved to an installable set of packages.

  Problem 1
    - Installation request for laravel/horizon v1.0.3 -> satisfiable by laravel/horizon[v1.0.3].
    - laravel/horizon v1.0.3 requires ext-posix * -> the requested PHP extension posix is missing from your system.
...

The error message requires ext-posix * -> the requested PHP extension posix is missing from your system is actually confusing. On CentOS, there's no PHP package called 'posix', even though the PHP module is called POSIX.

$ php -m | grep posix

(If that doesn't return any results, the posix extension is missing.)

On CentOS, the package you're looking for is called process, as it contains a set of functions/methods to help with creating child processes, sending signals, parsing UIDs/GIDs, ...

If you're using the IUS repositories on CentOS/Red Hat, you can install them via;

$ yum install php71u-process

Afterwards, if you run composer again, it'll work. To verify if the posix extension is installed properly, run php -m again.

$ php -m | grep posix
posix

Now, the posix extension is installed.


14 Sep 2017 7:55pm GMT

Xavier Mertens: [SANS ISC] Another webshell, another backdoor!

I published the following diary on isc.sans.org: "Another webshell, another backdoor!".

I'm still busy following how webshells are evolving… I recently found another backdoor in another webshell called "cor0.id". The best place to find webshells remains pastebin.com[1]. When I'm testing a webshell, I copy it in a VM located on a "wild Internet" VLAN in my home lab with, amongst other controls, full packet capture enabled. This way, I can spot immediately if the VM is trying to "phone home" to some external hosts. This was the case this time! [Read more]

[The post [SANS ISC] Another webshell, another backdoor! has been first published on /dev/random]

14 Sep 2017 10:57am GMT

Mattias Geniar: Presentation: Code Obfuscation, PHP shells & more

The post Presentation: Code Obfuscation, PHP shells & more appeared first on ma.ttias.be.

"What hackers do once they get past your code."

I gave this talk a while back and refreshed it a bit for DrupalCamp Antwerpen last Friday. It's a very fun topic where I get to show the results of a compromised website or server, from a hosting point of view.

The slides are pretty self-explanatory, but last time I checked there was also a video recording of the talk. If that makes it online, I'll make sure to add it here.


14 Sep 2017 7:28am GMT

13 Sep 2017


Dries Buytaert: Who sponsors Drupal development? (2016-2017 edition)

Last year, Matthew Tift and I examined Drupal.org's commit data to understand who develops Drupal, how much of that work is sponsored, and where that sponsorship comes from. We published our analysis in a blog post called "Who Sponsors Drupal Development?". A year later, I wanted to present an update. This year's report will also cover additional data, including gender and geographical diversity, and project sponsorship.

Understanding how an open-source project works is important because it establishes a benchmark for project health and scalability. Scaling an open-source project is a difficult task. As an open-source project's rate of adoption grows, the number of people that benefit from the project also increases. Often the open-source project also becomes more complex as it expands, which means that the economic reward of helping to improve the project decreases.

A recent article on the Bitcoin and Ethereum contributor communities illustrates this disparity perfectly. Ethereum and Bitcoin have market capitalizations valued at $30 billion and $70 billion, respectively. However, both projects have fewer than 40 meaningful contributors, and contribution isn't growing despite the rising popularity of cryptocurrency.

Number of Bitcoin contributors between 2010 and 2017: according to Bitcoin's GitHub data, Bitcoin has fewer than 40 active contributors.

Number of Ethereum contributors between 2014 and 2017: according to Ethereum's GitHub data, Ethereum has fewer than 20 active contributors.

Drupal, by comparison, has a diverse community of contributors. In the 12-month period between July 1, 2016 and June 30, 2017 we saw code contributions on Drupal.org from 7,240 different individuals and 889 different companies. This does not mean that Drupal is exempt from the challenges of scaling an open-source project. We hope that this report provides transparency about Drupal project development and gives more individuals and organizations an incentive to contribute. We will also highlight areas where our community can and should do better.

What is the Drupal.org credit system?

In the spring of 2015, after proposing ideas for giving credit and discussing various approaches at length, Drupal.org added the ability for people to attribute their work to an organization or customer in the Drupal.org issue queues. Maintainers of Drupal modules, themes and distributions can award issues credits to people who help resolve issues with code, translations, documentation, design and more.

Example issue credit on Drupal.org: a screenshot of an issue comment. You can see that jamadar worked on this patch as a volunteer, but also as part of his day job working for TATA Consultancy Services on behalf of their customer, Pfizer.

Credits are a powerful motivator for both individuals and organizations. Accumulating credits provides individuals with a way to showcase their expertise. Organizations can utilize credits to help recruit developers or to increase their visibility in the Drupal.org marketplace.

While the benefits are evident, it is important to note a few of the limitations in Drupal.org's current credit system:

Who is working on Drupal?

For our analysis we looked at all the issues that were marked "closed" or "fixed" in the 12-month period from July 1, 2016 to June 30, 2017. What we learned is that there were 23,238 issues marked "closed" or "fixed", a 22% increase from the 19,095 issues in the 2015-2016 period. Those 23,238 issues had 42,449 issue credits, a 30% increase from the 32,711 issue credits recorded in the previous year. Issue credits against Drupal core remained roughly the same year over year, meaning almost all of this growth came from increased activity in contributed projects. This is no surprise. Drupal development is cyclical, and during this period of the Drupal 8 development cycle, most of the Drupal community has been focused on porting modules from Drupal 7 to Drupal 8. Of the 42,449 issue credits reported this year, 20% (8,619 credits) were for Drupal core, while 80% (33,830 credits) went to contributed themes, modules and distributions.

Compared to the previous year, we also saw an increase in both the number of people contributing and the number of organizations contributing. Drupal.org received code contributions from 7,240 different individuals and 889 different organizations.

Contributions by individuals vs. organizations: the number of individual contributors is up 28% year over year and the number of organizations contributing is up 26% year over year.

While the number of individual contributors rose, a relatively small number of individuals still do the majority of the work. Approximately 47% of individual contributors received just one credit. Meanwhile, the top 30 contributors (the top 0.4%) account for over 17% of the total credits, indicating that these individuals put an incredible amount of time and effort in developing Drupal and its contributed projects:

Rank Username Issues
1 jrockowitz 537
2 dawehner 421
3 RenatoG 408
4 bojanz 351
5 Berdir 335
6 mglaman 334
7 Wim Leers 332
8 alexpott 329
9 DamienMcKenna 245
10 jhodgdon 242
11 drunken monkey 238
12 naveenvalecha 196
13 Munavijayalakshmi 192
14 borisson_ 191
15 yongt9412 189
16 klausi 185
17 Sam152 184
18 miro_dietiker 182
19 Pavan B S 180
20 ajay_reddy 176
21 phenaproxima 172
22 sanchiz 162
23 slashrsm 161
24 jhedstrom 155
25 xjm 151
26 catch 147
27 larowlan 145
28 rakesh.gectcr 141
29 benjy 139
30 dhruveshdtripathi 138
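As a rough sanity check on the concentration claim, the top-30 share can be recomputed from the table above; note the post's exact denominator may differ slightly from the 42,449 total issue credits used here:

```python
# Issue counts for the top 30 contributors, copied from the table above.
top30_credits = [537, 421, 408, 351, 335, 334, 332, 329, 245, 242,
                 238, 196, 192, 191, 189, 185, 184, 182, 180, 176,
                 172, 162, 161, 155, 151, 147, 145, 141, 139, 138]
total_credits = 42449  # total issue credits in the 2016-2017 period

top30_total = sum(top30_credits)
share = 100 * top30_total / total_credits
print(top30_total, round(share, 1))
```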


Out of the top 30 contributors featured, 19 were also recognized as top contributors in our 2015-2016 report. These Drupalists' dedication and continued contribution to the project has been crucial to Drupal's development. It's also exciting to see 11 new names on the list. This mobility is a testament to the community's evolution and growth.

Next, we looked at both the gender and geographic diversity of Drupal.org code contributors. While these are only two examples of diversity, this is the only available data that contributors can choose to share on their Drupal.org profiles. The reported data shows that only 6% of the recorded contributions were made by contributors that identify as female, which indicates a steep gender gap. Like in most open-source projects, the gender imbalance in Drupal is profound and underscores the need to continue fostering diversity and inclusion in our community.

Contributions by gender: the gender representation behind the issue credits. Only 6% of the recorded contributions are by women.

When measuring geographic diversity, we saw individual contributors from 6 different continents and 116 different countries.

Contributions by continent.

Contributions by country: the top 20 countries from which contributions originate. The data is compiled by aggregating the countries of all individual contributors behind each commit. Note that the geographical location of contributors doesn't always correspond with the origin of their sponsorship. Wim Leers, for example, works from Belgium, but his funding comes from Acquia, which has the majority of its customers in North America.

How much of the work is sponsored?

Drupal is used by more than one million websites. The vast majority of the individuals and organizations behind these Drupal websites never participate in the development of the project. They might use the software as it is or might not feel the need to help drive its development. We have to provide more incentive for these individuals and organizations to contribute back to the project.

Issue credits can be marked as "volunteer" and "sponsored" simultaneously (shown in jamadar's screenshot near the top of this post). This could be the case when a contributor does the minimum required work to satisfy the customer's need, in addition to using their spare time to add extra functionality.

While Drupal started out as a 100% volunteer-driven project, today the majority of the code on Drupal.org is sponsored by organizations. Only 11% of the commit credits that we examined in 2016-2017 were "purely volunteer" credits (4,498 credits), in stark contrast to the 46% that were "purely sponsored". In other words, there were four times as many "purely sponsored" credits as "purely volunteer" credits.
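To make the "purely volunteer" versus "purely sponsored" distinction concrete, here is a minimal sketch of how credits carrying both flags get bucketed. The `Credit` class and `categorize` function are hypothetical illustrations, not Drupal.org's actual data model:

```python
from dataclasses import dataclass

@dataclass
class Credit:
    # Hypothetical shape of a single issue credit.
    username: str
    volunteer: bool  # the contributor ticked the "volunteer" box
    sponsored: bool  # the credit was attributed to a sponsoring organization

def categorize(credits):
    """Bucket credits: the volunteer and sponsored flags are independent,
    so a credit can land in the 'both' bucket."""
    buckets = {"purely volunteer": 0, "purely sponsored": 0,
               "both": 0, "not specified": 0}
    for c in credits:
        if c.volunteer and c.sponsored:
            buckets["both"] += 1
        elif c.volunteer:
            buckets["purely volunteer"] += 1
        elif c.sponsored:
            buckets["purely sponsored"] += 1
        else:
            buckets["not specified"] += 1
    return buckets
```

Only the credits in the first two buckets count toward the 11% and 46% figures above; dual-flagged credits belong to neither "pure" category.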

A few comparisons with the 2015-2016 data:

[Chart: Contributions by volunteer vs sponsored]

No data is perfect, but it feels safe to conclude that most of the work on Drupal is sponsored. At the same time, the data shows that volunteer contribution remains very important to Drupal. Maybe most importantly, while the number of volunteers and sponsors has grown year over year in absolute terms, sponsored contributions appear to be growing faster than volunteer contributions. This is consistent with how open source projects grow and scale.

Who is sponsoring the work?

Now that we have established that most of the work on Drupal is sponsored, we want to study which organizations contribute to Drupal. While 889 different organizations contributed to Drupal, approximately 50% of them received four credits or fewer. The top 30 organizations (roughly the top 3%) account for about 48% of the total credits, which implies that the top 30 companies play a crucial role in the health of the Drupal project. The graph below shows the top 30 organizations and the number of credits they received between July 1, 2016 and June 30, 2017:

[Chart: Top 30 organizations contributing to Drupal] The top 30 contributing organizations based on the number of Drupal.org commit credits.

While not immediately obvious from the graph above, different types of companies are active in Drupal's ecosystem:

Traditional Drupal businesses: Small-to-medium-sized professional services companies that make money primarily using Drupal. They typically employ fewer than 100 people, and because they specialize in Drupal, many of them contribute frequently and are a huge part of our community. Examples are Chapter Three and Lullabot (both shown on the graph).
Digital marketing agencies: Larger full-service agencies that have marketing-led practices using a variety of tools, typically including Drupal, Adobe Experience Manager, Sitecore, WordPress, etc. They tend to be larger, with the largest agencies employing thousands of people. Examples are Wunderman and Mirum.
System integrators: Larger companies that specialize in bringing together different technologies into one solution. Examples are Accenture, TATA Consultancy Services, Capgemini and CI&T.
Technology and infrastructure companies: Examples are Acquia (shown on the graph), Lingotek, BlackMesh, Rackspace, Pantheon and Platform.sh.
End-users: Examples are Pfizer (shown on the graph) and NBCUniversal.


A few observations:

[Chart: Contributions by technology companies] Sponsored code contributions to Drupal.org from technology and infrastructure companies. The chart does not reflect sponsored code contributions on GitHub, Drupal event sponsorships, or the many other forms of value that these companies add to Drupal and other open-source communities.

We can conclude that technology and infrastructure companies, digital marketing agencies, system integrators and end-users are not meaningfully contributing code to Drupal.org today. How can we explain this disparity in comparison to the traditional Drupal businesses, which contribute the most? We believe the biggest reasons are:

  1. Drupal's strategic importance. Many of the traditional Drupal agencies have been involved with Drupal for 10 years and depend almost entirely on Drupal to support their business. Given both their expertise and their dependence on Drupal, they are the most likely to look after Drupal's development and well-being. These organizations are typically recognized as Drupal experts and are sought out by organizations that want to build a Drupal website. Contrast this with most of the digital marketing agencies and system integrators, which are sized to work with a diversified portfolio of content management platforms and which, historically, are only getting started with Drupal and open source. They deliver digital marketing solutions and aren't necessarily sought out for their Drupal expertise. As their Drupal practices grow in size and importance, this could change. In fact, contributing to Drupal can help grow their Drupal business, because it makes their name stand out as Drupal experts and gives them a competitive edge with their customers.
  2. The level of experience with Drupal and open source. Drupal aside, many organizations have little or no experience with open source, so it is important that we motivate and teach them to contribute.
  3. Legal reservations. We recognize that some organizations are not legally permitted to contribute, let alone attribute their customers. We hope that will change as open source continues to get adopted.
  4. Tools and process barriers. Drupal contribution still involves a patch-based workflow on Drupal.org's unique issue queue system. This presents a fairly steep learning curve for most developers, who primarily work with more modern and common tools such as GitHub. Getting the code change proposal uploaded is just the first step; getting code changes accepted into an upstream Drupal project - especially Drupal core - is hard work. Peer reviews, gates such as automated testing and documentation, required sign-offs from maintainers and committers, and knowledge of best practices and other community norms are a few of the challenges a contributor must face to get code accepted into Drupal.

Consequently, this data shows that the Drupal community can do more to entice companies to contribute code to Drupal.org. The Drupal community has a long tradition of encouraging organizations to share code rather than keep it behind firewalls. While the spirit of the Drupal project cannot be reduced to any single ideology - not every organization can or will share their code - we would like to see organizations continue to prioritize collaboration over individual ownership. Our aim is not to criticize those who do not contribute, but rather to help foster an environment worthy of contribution. Given the vast number of Drupal users, we believe continuing to encourage organizations and end-users to contribute could be a big opportunity.

There are substantial benefits and business drivers for organizations that contribute: (1) it improves their ability to sell and win deals and (2) it improves their ability to hire. Companies that contribute to Drupal tend to promote their contributions in RFPs and sales pitches. Contributing to Drupal also results in being recognized as a great place to work for Drupal experts.

The uneasy alliance with corporate contributions

As mentioned above, when community-driven open-source projects grow, there is a greater need for organizations to help drive their development. This almost always creates an uneasy alliance between volunteers and corporations.

This theory played out in the Linux community well before it played out in the Drupal community. The Linux project is 25 years old and has seen a steady increase in the number of corporate contributors for roughly 20 years. While Linux companies like Red Hat and SUSE rank high on the contribution list, so do non-Linux-centric companies such as Samsung, Intel, Oracle and Google. All of these corporate contributors are (or were) using Linux as an integral part of their business.

The 889 organizations that contributed to Drupal (a group that includes corporations) number more than four times the organizations that sponsor development of the Linux kernel. This is significant because Linux is considered "one of the largest cooperative software projects ever attempted". In fairness, Linux has a different ecosystem than Drupal. The Linux business ecosystem includes various large organizations (Red Hat, Google, Intel, IBM and SUSE) for whom Linux is very strategic. As a result, many of them employ dozens of full-time Linux contributors and invest millions of dollars in Linux each year.

What projects have sponsors?

In total, the Drupal community worked on 3,183 different projects (modules, themes and distributions) in the 12-month period between July 1, 2016 and June 30, 2017. To understand where the organizations sponsoring Drupal put their money, I've listed the top 20 most sponsored projects:

Rank Project name Issues
1 Drupal core 4745
2 Drupal Commerce (distribution) 526
3 Webform 361
4 Open Y (distribution) 324
5 Paragraphs 231
6 Inmail 223
7 User guide 218
8 JSON API 204
9 Paragraphs collection 200
10 Entity browser 196
11 Diff 190
12 Group 170
13 Metatag 157
14 Facets 155
15 Commerce Point of Sale (PoS) 147
16 Search API 143
17 Open Social (distribution) 133
18 Drupal voor Gemeenten (distribution) 131
19 Solr Search 122
20 Geolocation field 118


Who is sponsoring the top 30 contributors?

Rank Username Issues Volunteer Sponsored Not specified Sponsors
1 jrockowitz 537 88% 45% 9% The Big Blue House (239), Kennesaw State University (6), Memorial Sloan Kettering Cancer Center (4)
2 dawehner 421 67% 83% 5% Chapter Three (328), Tag1 Consulting (19), Drupal Association (12), Acquia (5), Comm-press (1)
3 RenatoG 408 0% 100% 0% CI&T (408)
4 bojanz 351 0% 95% 5% Commerce Guys (335), Adapt A/S (38), Bluespark (2)
5 Berdir 335 0% 93% 7% MD Systems (310), Acquia (7)
6 mglaman 334 3% 97% 1% Commerce Guys (319), Thinkbean, LLC (48), LivePerson, Inc (46), Bluespark (22), Universal Music Group (16), Gaggle.net, Inc. (3), Bluehorn Digital (1)
7 Wim Leers 332 14% 87% 2% Acquia (290)
8 alexpott 329 7% 99% 1% Chapter Three (326), TES Global (1)
9 DamienMcKenna 245 2% 95% 4% Mediacurrent (232)
10 jhodgdon 242 0% 1% 99% Drupal Association (2), Poplar ProductivityWare (2)
11 drunken monkey 238 95% 11% 1% Acquia (17), Vizala (8), Wunder Group (1), Sunlime IT Services GmbH (1)
12 naveenvalecha 196 74% 55% 1% Acquia (152), Google Summer of Code (7), QED42 (1)
13 Munavijayalakshmi 192 0% 100% 0% Valuebound (192)
14 borisson_ 191 66% 39% 22% Dazzle (70), Acquia (6)
15 yongt9412 189 0% 97% 3% MD Systems (183), Acquia (6)
16 klausi 185 9% 61% 32% epiqo (112)
17 Sam152 184 59% 92% 7% PreviousNext (168), amaysim Australia Ltd. (5), Code Drop (2)
18 miro_dietiker 182 0% 99% 1% MD Systems (181)
19 Pavan B S 180 0% 98% 2% Valuebound (177)
20 ajay_reddy 176 100% 99% 0% Valuebound (180), Drupal Bangalore Community (154)
21 phenaproxima 172 0% 99% 1% Acquia (170)
22 sanchiz 162 0% 99% 1% Drupal Ukraine Community (107), Vinzon (101), FFW (60), Open Y (52)
23 slashrsm 161 6% 95% 3% MD Systems (153), Acquia (47)
24 jhedstrom 155 4% 92% 4% Phase2 (143), Workday, Inc. (134), Memorial Sloan Kettering Cancer Center (1)
25 xjm 151 0% 91% 9% Acquia (137)
26 catch 147 3% 83% 16% Third and Grove (116), Tag1 Consulting (6)
27 larowlan 145 12% 92% 7% PreviousNext (133), University of Technology, Sydney (30), amaysim Australia Ltd. (6), Australian Competition and Consumer Commission (ACCC) (1), Department of Justice & Regulation, Victoria (1)
28 rakesh.gectcr 141 100% 91% 0% Valuebound (128)
29 benjy 139 0% 94% 6% PreviousNext (129), Brisbane City Council (8), Code Drop (1)
30 dhruveshdtripathi 138 15% 100% 0% DevsAdda (138), OpenSense Labs (44)


We observe that the top 30 contributors are sponsored by 46 organizations. This kind of diversity is aligned with our desire not to see Drupal controlled by a single organization. These top contributors and organizations are from many different parts of the world and work with customers large and small. Nonetheless, we would benefit from even more diversity.

Evolving the credit system

Like Drupal itself, the credit system on Drupal.org is an evolving tool. Ultimately, the credit system will only be useful when the community uses it, understands its shortcomings, and suggests constructive improvements. In highlighting the organizations that sponsor the development of code on Drupal.org, we hope to elicit responses that help evolve the credit system into something that incentivizes businesses to sponsor more work and enables more people to participate in our community, learn from others, teach newcomers and make positive contributions. Drupal is a positive force for change and we wish to use the credit system to highlight (at least some of) the work of our diverse community, which includes volunteers, companies, nonprofits, governments, schools, universities, individuals, and other groups.

One of the challenges with the existing credit system is it has no way of "weighting" contributions. A typo fix counts just as much as giving multiple detailed technical reviews on a critical core issue. This appears to have the effect of incentivizing organizations' employees to work on "lower-hanging fruit issues", because this bumps their companies' names in the rankings. One way to help address this might be to adjust the credit ranking algorithm to consider things such as issue priority, patch size, and so on. This could help incentivize companies to work on larger and more important problems and save coding standards improvements for new contributor sprints. Implementing a scoring system that ranks the complexity of an issue would also allow us to develop more accurate reports of contributed work.
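As a rough sketch of what such a weighting could look like: the priority weights, the logarithmic size factor and the review bonus below are invented for illustration and are not part of any existing Drupal.org algorithm.

```python
import math

# Hypothetical priority weights; Drupal.org does not implement this scheme.
PRIORITY_WEIGHT = {"critical": 4.0, "major": 2.0, "normal": 1.0, "minor": 0.5}

def weighted_credit(priority, patch_lines, reviews_given=0):
    """Score one issue credit by issue priority, patch size and reviews.

    The log10 term gives diminishing returns on patch size, so a
    5,000-line patch does not drown out everything else."""
    size_factor = 1.0 + math.log10(max(patch_lines, 1))
    return PRIORITY_WEIGHT[priority] * size_factor + 0.5 * reviews_given
```

Under this toy scheme, a one-line typo fix on a minor issue scores 0.5 while a 100-line patch on a critical issue scores 12.0 - exactly the kind of gap that would discourage chasing low-hanging fruit for ranking purposes.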

Conclusion

Our data confirms Drupal is a vibrant community full of contributors who are constantly evolving and improving the software. While we have amazing geographic diversity, we need greater gender diversity. Our analysis of the Drupal.org credit data concludes that most contributions to Drupal are sponsored. At the same time, the data shows that volunteer contribution remains very important to Drupal.

As a community, we need to understand that a healthy open-source ecosystem includes more than traditional Drupal businesses that contribute the most. For example, we don't see a lot of contribution from the larger digital marketing agencies, system integrators, technology companies, or end-users of Drupal - we believe that might come as these organizations build out their Drupal practices and Drupal becomes more strategic for them.

To grow and sustain Drupal, we should support those that contribute to Drupal and find ways to get those that are not contributing involved in our community. We invite you to help us continue to strengthen our ecosystem.

Special thanks to Tim Lehnen and Neil Drumm from the Drupal Association for providing us with the Drupal.org credit system data and for supporting us during our research. I would also like to extend a special thanks to Matthew Tift for helping to lay the foundation for this research, collaborating on last year's blog post, and for reviewing this year's edition. Finally, thanks to Angie Byron, Gábor Hojtsy, Jess (xjm), Preston So, Ted Bowman, Wim Leers and Gigi Anderson for providing feedback during the writing process.

13 Sep 2017 11:46am GMT

Dries Buytaert: Who sponsors Drupal development? (2016-2017 edition)

Last year, Matthew Tift and I examined Drupal.org's commit data to understand who develops Drupal, how much of that work is sponsored, and where that sponsorship comes from. We published our analysis in a blog post called "Who Sponsors Drupal Development?". A year later, I wanted to present an update. This year's report will also cover additional data, including gender and geographical diversity, and project sponsorship.

Understanding how an open-source project works is important because it establishes a benchmark for project health and scalability. Scaling an open-source project is a difficult task. As an open-source project's rate of adoption grows, the number of people that benefit from the project also increases. Often the open-source project also becomes more complex as it expands, which means that the economic reward of helping to improve the project decreases.

A recent article on the Bitcoin and Ethereum contributor communities illustrates this disparity perfectly. Ethereum and Bitcoin have market capitalizations valued at $30 billion and $70 billion, respectively. However, both projects have fewer than 40 meaningful contributors, and contribution isn't growing despite the rising popularity of cryptocurrency.

[Chart: Number of Bitcoin contributors between 2010 and 2017] According to Bitcoin's GitHub data, Bitcoin has fewer than 40 active contributors.

[Chart: Number of Ethereum contributors between 2014 and 2017] According to Ethereum's GitHub data, Ethereum has fewer than 20 active contributors.

Drupal, by comparison, has a diverse community of contributors. In the 12-month period between July 1, 2016 and June 30, 2017, we saw code contributions on Drupal.org from 7,240 different individuals and 889 different companies. This does not mean that Drupal is exempt from the challenges of scaling an open-source project. We hope that this report provides transparency about Drupal project development and gives more individuals and organizations an incentive to contribute. We will also highlight areas where our community can and should do better.

What is the Drupal.org credit system?

In the spring of 2015, after proposing ideas for giving credit and discussing various approaches at length, Drupal.org added the ability for people to attribute their work to an organization or customer in the Drupal.org issue queues. Maintainers of Drupal modules, themes and distributions can award issue credits to people who help resolve issues with code, translations, documentation, design and more.

[Screenshot: Example issue credit on Drupal.org] An issue comment on Drupal.org. You can see that jamadar worked on this patch as a volunteer, but also as part of his day job working for TATA Consultancy Services on behalf of their customer, Pfizer.

Credits are a powerful motivator for both individuals and organizations. Accumulating credits provides individuals with a way to showcase their expertise. Organizations can utilize credits to help recruit developers or to increase their visibility in the Drupal.org marketplace.

While the benefits are evident, it is important to note a few of the limitations in Drupal.org's current credit system:

Who is working on Drupal?

For our analysis we looked at all the issues that were marked "closed" or "fixed" in the 12-month period from July 1, 2016 to June 30, 2017. What we learned is that there were 23,238 issues marked "closed" or "fixed", a 22% increase from the 19,095 issues in the 2015-2016 period. Those 23,238 issues had 42,449 issue credits, a 30% increase from the 32,711 issue credits recorded in the previous year. Issue credits against Drupal core remained roughly the same year over year, meaning almost all of this growth came from increased activity in contributed projects. This is no surprise. Drupal development is cyclical, and during this period of the Drupal 8 development cycle, most of the Drupal community has been focused on porting modules from Drupal 7 to Drupal 8. Of the 42,449 issue credits reported this year, 20% (8,619 credits) were for Drupal core, while 80% (33,830 credits) went to contributed themes, modules and distributions.
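The selection step described above can be sketched in a few lines of Python. The issue records here are fabricated stand-ins (the real analysis worked from the Drupal Association's credit data), but the filtering logic is the same idea:

```python
from datetime import date

# Fabricated stand-ins for exported issue data.
issues = [
    {"status": "fixed", "closed": date(2016, 8, 1), "credits": ["alice", "bob"]},
    {"status": "closed", "closed": date(2017, 3, 14), "credits": ["alice"]},
    {"status": "active", "closed": None, "credits": []},  # excluded: still open
]

START, END = date(2016, 7, 1), date(2017, 6, 30)

# Keep issues marked "closed" or "fixed" inside the reporting window.
in_window = [i for i in issues
             if i["status"] in ("closed", "fixed")
             and i["closed"] is not None
             and START <= i["closed"] <= END]

issue_count = len(in_window)                              # analogous to 23,238
credit_count = sum(len(i["credits"]) for i in in_window)  # analogous to 42,449
```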

Compared to the previous year, we also saw an increase in both the number of people contributing and the number of organizations contributing. Drupal.org received code contributions from 7,240 different individuals and 889 different organizations.

[Chart: Contributions by individuals vs organizations] The number of individual contributors is up 28% year over year, and the number of contributing organizations is up 26% year over year.

While the number of individual contributors rose, a relatively small number of individuals still do the majority of the work. Approximately 47% of individual contributors received just one credit. Meanwhile, the top 30 contributors (the top 0.4%) account for over 17% of the total credits, indicating that these individuals put an incredible amount of time and effort into developing Drupal and its contributed projects:

Rank Username Issues
1 jrockowitz 537
2 dawehner 421
3 RenatoG 408
4 bojanz 351
5 Berdir 335
6 mglaman 334
7 Wim Leers 332
8 alexpott 329
9 DamienMcKenna 245
10 jhodgdon 242
11 drunken monkey 238
12 naveenvalecha 196
13 Munavijayalakshmi 192
14 borisson_ 191
15 yongt9412 189
16 klausi 185
17 Sam152 184
18 miro_dietiker 182
19 Pavan B S 180
20 ajay_reddy 176
21 phenaproxima 172
22 sanchiz 162
23 slashrsm 161
24 jhedstrom 155
25 xjm 151
26 catch 147
27 larowlan 145
28 rakesh.gectcr 141
29 benjy 139
30 dhruveshdtripathi 138


21 phenaproxima 172 0% 99% 1% Acquia (170)
22 sanchiz 162 0% 99% 1% Drupal Ukraine Community (107), Vinzon (101), FFW (60), Open Y (52)
23 slashrsm 161 6% 95% 3% MD Systems (153), Acquia (47)
24 jhedstrom 155 4% 92% 4% Phase2 (143), Workday, Inc. (134), Memorial Sloan Kettering Cancer Center (1)
25 xjm 151 0% 91% 9% Acquia (137)
26 catch 147 3% 83% 16% Third and Grove (116), Tag1 Consulting (6)
27 larowlan 145 12% 92% 7% PreviousNext (133), University of Technology, Sydney (30), amaysim Australia Ltd. (6), Australian Competition and Consumer Commission (ACCC) (1), Department of Justice & Regulation, Victoria (1)
28 rakesh.gectcr 141 100% 91% 0% Valuebound (128)
29 benjy 139 0% 94% 6% PreviousNext (129), Brisbane City Council (8), Code Drop (1)
30 dhruveshdtripathi 138 15% 100% 0% DevsAdda (138), OpenSense Labs (44)


We observe that the top 30 contributors are sponsored by 46 organizations. This kind of diversity is aligned with our desire not to see Drupal controlled by a single organization. These top contributors and organizations are from many different parts of the world and work with customers large and small. Nonetheless, we will continue to benefit from more diversity.

Evolving the credit system

Like Drupal itself, the credit system on Drupal.org is an evolving tool. Ultimately, the credit system will only be useful when the community uses it, understands its shortcomings, and suggests constructive improvements. In highlighting the organizations that sponsor the development of code on Drupal.org, we hope to elicit responses that help evolve the credit system into something that incentivizes business to sponsor more work and enables more people to participate in our community, learn from others, teach newcomers and make positive contributions. Drupal is a positive force for change and we wish to use the credit system to highlight (at least some of) the work of our diverse community, which includes volunteers, companies, nonprofits, governments, schools, universities, individuals, and other groups.

One of the challenges with the existing credit system is that it has no way of "weighting" contributions. A typo fix counts just as much as multiple detailed technical reviews on a critical core issue. This appears to have the effect of incentivizing organizations' employees to work on "lower-hanging fruit" issues, because doing so bumps their companies' names up the rankings. One way to help address this might be to adjust the credit ranking algorithm to take into account things such as issue priority, patch size, and so on. This could help incentivize companies to work on larger and more important problems and save coding-standards improvements for new-contributor sprints. Implementing a scoring system that ranks the complexity of an issue would also allow us to produce more accurate reports of contributed work.
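To make the idea concrete, here is one possible shape such a weighting could take. This is purely an illustrative sketch; the priority weights, the logarithmic size damping, and the field names are all assumptions of mine, not part of Drupal.org's actual credit system:

```python
import math

# Hypothetical priority weights; these values are illustrative only.
PRIORITY_WEIGHTS = {"critical": 4.0, "major": 2.0, "normal": 1.0, "minor": 0.5}

def credit_score(issue_priority, patch_lines_changed):
    """Combine issue priority with patch size into a single credit weight."""
    base = PRIORITY_WEIGHTS.get(issue_priority, 1.0)
    # Dampen patch size logarithmically so huge patches don't dominate:
    # a 1-line fix scores close to base, a 500-line patch roughly 3.7x base.
    size_factor = 1.0 + math.log10(1 + patch_lines_changed)
    return base * size_factor

# A typo fix on a minor issue vs. a substantial patch on a critical issue:
typo = credit_score("minor", 1)      # ~0.65
big = credit_score("critical", 500)  # ~14.8
assert big > typo
```

A scheme along these lines would still credit small contributions, while making a sustained investment in hard problems visible in the rankings.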

Conclusion

Our data confirms Drupal is a vibrant community full of contributors who are constantly evolving and improving the software. While we have amazing geographic diversity, we need greater gender diversity. Our analysis of the Drupal.org credit data concludes that most contributions to Drupal are sponsored. At the same time, the data shows that volunteer contribution remains very important to Drupal.

As a community, we need to understand that a healthy open-source ecosystem includes more than traditional Drupal businesses that contribute the most. For example, we don't see a lot of contribution from the larger digital marketing agencies, system integrators, technology companies, or end-users of Drupal - we believe that might come as these organizations build out their Drupal practices and Drupal becomes more strategic for them.

To grow and sustain Drupal, we should support those that contribute to Drupal and find ways to get those that are not contributing involved in our community. We invite you to help us continue to strengthen our ecosystem.

Special thanks to Tim Lehnen and Neil Drumm from the Drupal Association for providing us with the Drupal.org credit system data and for supporting us during our research. I would also like to extend a special thanks to Matthew Tift for helping to lay the foundation for this research, collaborating on last year's blog post, and for reviewing this year's edition. Finally, thanks to Angie Byron, Gábor Hojtsy, Jess (xjm), Preston So, Ted Bowman, Wim Leers and Gigi Anderson for providing feedback during the writing process.

13 Sep 2017 11:46am GMT

11 Sep 2017

Sven Vermeulen: Authenticating with U2F

In order to further secure access to my workstation, after the switch to Gentoo sources, I now enabled two-factor authentication through my Yubico U2F USB device. Well, at least for local access - remote access through SSH requires both userid/password as well as the correct SSH key, by chaining authentication methods in OpenSSH.

Enabling U2F on (Gentoo) Linux is fairly easy. The various guides online which talk about the pam_u2f setup are indeed correct that it is fairly simple. For completeness' sake, I've documented what I know on the Gentoo Wiki, in the pam_u2f article.

The setup, basically

The setup of U2F is done in a number of steps:

1. Validate that the kernel is ready for the USB device
2. Install the PAM module and supporting tools
3. Generate the necessary data elements for each user (keys and such)
4. Configure PAM to require authentication through the U2F key

For the kernel, the configuration item needed is the raw HID device support. Now, in current kernels, two settings are available that both talk about raw HID device support: CONFIG_HIDRAW is the general raw HID device support, while CONFIG_USB_HIDDEV is the USB-specific raw HID device support.

It is quite possible that only one of the two is needed, but both were already active in my kernel configuration, and Internet sources are not clear on which one is required, so let's assume for now that both are.

Next, the PAM module needs to be installed. On Gentoo, this is a matter of installing the pam_u2f package, as the necessary dependencies will be pulled in automatically:

~# emerge pam_u2f

Next, for each user, a registration has to be made. This registration is needed for the U2F components to be able to correctly authenticate the use of a U2F key for a particular user. This is done with pamu2fcfg:

~$ pamu2fcfg -u<username> > ~/.config/Yubico/u2f_keys

The U2F USB key must be plugged in when the command is executed, as a successful keypress (on the U2F device) is needed to complete the operation.

Finally, enable the use of the pam_u2f module in PAM. On my system, this is done through the /etc/pam.d/system-local-login PAM configuration file used by all local logon services.

auth     required     pam_u2f.so

Consider the problems you might face

When fiddling with PAM, it is important to keep in mind what could fail. During the setup, it is recommended to have an open administrative session on the system so that you can validate if the PAM configuration works, without locking yourself out of the system.

But other issues need to be considered as well.

My Yubico U2F USB key might have a high MTBF (Mean Time Between Failures) value, but once it fails, it would lock me out of my workstation (and even out of remote services and servers that use it). For that reason, I own a second key, safely stored, which is nonetheless registered as a valid key for my workstation and remote systems/services. Given the low cost of a simple U2F key, this is a simple mitigation for that threat.

Another issue that could come up is a malfunction in the PAM module itself. For me, this is handled by having remote SSH access done without this PAM module (although other PAM modules are still involved, so a generic PAM failure itself wouldn't resolve this). Of course, worst case, the system needs to be rebooted in single user mode.

One issue that I faced was the SELinux policy. Some applications that provide logon services don't have the proper rights to handle U2F, and because PAM runs in the address space (and thus SELinux domain) of the application, the necessary privileges need to be added to these services. My initial investigation revealed the following necessary policy rules (refpolicy-style):

udev_search_pids(...)
udev_read_db(...)
dev_rw_generic_usb_dev(...)

The first two rules are needed because the operation that triggers the USB key uses the udev tables to find out where the key is located/attached before interacting with it. That interaction itself is then controlled through the third rule.

Simple yet effective

Enabling U2F authentication on the system is very simple, and gives higher confidence that malicious activity through a regular account will find it considerably more challenging to switch to a more privileged session (the SELinux policy is one control, but for the domains that are allowed to switch, the PAM-based authentication is another), as even eavesdropping on my password (or extracting it from memory) won't suffice for a successful authentication.

If you want to use a different two-factor authentication method, check out the Google authenticator, covered in another nice article on the Gentoo wiki. It is also possible to use Yubico keys for remote authentication, but that uses the OTP (One Time Password) functionality, which isn't active on the Yubico keys that I own.

11 Sep 2017 4:25pm GMT

10 Sep 2017

Mattias Geniar: Coming soon: Oh Dear! – Monitoring for the encrypted web

The post Coming soon: Oh Dear! - Monitoring for the encrypted web appeared first on ma.ttias.be.

I'm excited to announce a new project I'm working on: Oh Dear!

The goal of Oh Dear! is to provide modern monitoring & feedback for sites that run on HTTPS. With Chrome's soon-to-be-released version that marks any input on non-HTTPS pages as "Not Secure", that target audience is huge.

The baseline below I think sums it up very accurately.

Many users only look at their certificate expiration dates when running HTTPS sites and -- hopefully -- renew in time. But that's only a small part of the journey to HTTPS. I've ranted about messing up HTTPS often enough that I don't want to repeat myself anymore.

What does Oh Dear! offer?

From my old rant, the summary from way-back-then still stands today:

Browsers don't care if your HTTPS config is 95% perfect. They'll destroy the visitor's experience if you don't nail it for the full 100%.

There are many things that can go wrong when deploying HTTPS, including:

Oh Dear! monitors for each and every one of those things, and more.

Included in Oh Dear! is Certificate Transparency reporting, so you can get notified whenever a new certificate is issued for one of your domains, intentional or otherwise.

Meet the team

Unlike my usual projects, this time I'm working together with smart folks to help make Oh Dear! a success.

The team consists of Dries Vints, Freek Van der Herten & me. We're all active in the Laravel community. Dries & Freek go way back; I only got to know these smart men a little over a year ago.

Join the beta

We're not open to the public yet, but there's a beta program you can subscribe to in order to get access to Oh Dear!.

If you run a website on HTTPS -- and chances are, you do -- don't let a bad certificate or configuration ruin your day. Trust us to monitor it for you and report any errors, before your visitors do.

Go check out our app at ohdearapp.com or follow us on Twitter via @OhDearApp.

The post Coming soon: Oh Dear! - Monitoring for the encrypted web appeared first on ma.ttias.be.

10 Sep 2017 7:50pm GMT

Kristof Provost: PR 219251

As I threatened to do in my previous post, I'm going to talk about PR 219251 for a bit. The bug report dates from only a few months ago, but the first report (that I can remember) actually came from Shawn Webb on Twitter, of all places: backtrace

Despite there being a stacktrace it took quite a while (nearly 6 months in fact) before I figured this one out.

Real progress only came once Reshad Patuck managed to distill the problem down to a small-ish test script. His testcase meant that I could get core dumps and experiment. It also provided valuable clues, because it could be tweaked to see which elements were required to trigger the panic.

This test script starts a (vnet) jail, adds an epair interface to it, sets up pf in the jail, and then reloads the pf rules on the host. Interestingly the panic does not seem to occur if that last step is not included.

Time to take a closer look at the code that breaks:

u_int32_t
pf_state_expires(const struct pf_state *state)
{
        u_int32_t       timeout;
        u_int32_t       start;
        u_int32_t       end;
        u_int32_t       states;

        /* handle all PFTM_* > PFTM_MAX here */
        if (state->timeout == PFTM_PURGE)
                return (time_uptime);
        KASSERT(state->timeout != PFTM_UNLINKED,
            ("pf_state_expires: timeout == PFTM_UNLINKED"));
        KASSERT((state->timeout < PFTM_MAX),
            ("pf_state_expires: timeout > PFTM_MAX"));
        timeout = state->rule.ptr->timeout[state->timeout];
        if (!timeout)
                timeout = V_pf_default_rule.timeout[state->timeout];
        start = state->rule.ptr->timeout[PFTM_ADAPTIVE_START];
        if (start) {
                end = state->rule.ptr->timeout[PFTM_ADAPTIVE_END];
                states = counter_u64_fetch(state->rule.ptr->states_cur);
        } else {
                start = V_pf_default_rule.timeout[PFTM_ADAPTIVE_START];
                end = V_pf_default_rule.timeout[PFTM_ADAPTIVE_END];
                states = V_pf_status.states;
        }
        if (end && states > start && start < end) {
                if (states < end)
                        return (state->expire + timeout * (end - states) /
                            (end - start));
                else
                        return (time_uptime);
        }
        return (state->expire + timeout);
}
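The adaptive-timeout arithmetic in the middle of that function is easier to follow with concrete numbers. Here is a minimal re-implementation of just that scaling logic in Python (a sketch for illustration; the 6000/12000 values correspond to pf's default adaptive.start/adaptive.end settings):

```python
def adaptive_expiry(expire, timeout, states, start, end):
    """Mirror pf_state_expires()' adaptive scaling: as the state count
    climbs from 'start' toward 'end', the effective timeout shrinks
    linearly from 'timeout' down to zero."""
    if end and states > start and start < end:
        if states < end:
            # Integer division, matching the u_int32_t arithmetic in C.
            return expire + timeout * (end - states) // (end - start)
        return expire  # at or past the limit: expires immediately
    return expire + timeout

# With adaptive.start=6000 and adaptive.end=12000, a 60-second timeout
# is halved once 9000 states are being tracked:
print(adaptive_expiry(expire=100, timeout=60, states=9000,
                      start=6000, end=12000))  # -> 130 (timeout cut to 30)
# ...and is untouched below the start threshold:
print(adaptive_expiry(expire=100, timeout=60, states=1000,
                      start=6000, end=12000))  # -> 160 (full 60s timeout)
```

In other words, the busier the state table gets, the more aggressively pf expires existing states, which is what keeps the table from filling up under load.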

Specifically, the following line:

       states = counter_u64_fetch(state->rule.ptr->states_cur);

We try to fetch a counter value here, but instead we dereference a bad pointer. There are two pointer dereferences on this line, so we already need more information to pin down which one faults. Inspection of the core dump reveals that the state pointer is valid and contains sane information. The rule pointer (rule.ptr) points to a sensible location, but the data there is mostly 0xdeadc0de. This is the memory allocator being helpful (in debug mode) and writing garbage over freed memory, to make use-after-free bugs like this one easier to find.

In other words: the rule has been free()d while there was still a state pointing to it. Somehow we have a state (describing a connection pf knows about) which points to a rule which no longer exists.

The core dump also shows that the problem always occurs with states and rules in the default vnet (i.e. the host pf instance), not one of the pf instances in one of the vnet jails. That matches with the observation that the test script does not trigger the panic unless we also reload the rules on the host.

Great, we know what's wrong, but now we need to work out how we can get into this state. At this point we're going to have to learn something about how rules and states get cleaned up in pf. Don't worry if you had no idea, because before this bug I didn't either.

The states keep a pointer to the rule they match, so when rules are changed (or removed) we can't just delete them. States get cleaned up when connections are closed or they time out. This means we have to keep old rules around until the states that use them expire.

When rules are removed, pf_unlink_rule() adds them to the V_pf_unlinked_rules list (more on that funny V_ prefix later). From time to time the pf purge thread runs over all states and marks the rules that are used by a state. Once that's done for all states, we know that any rule not marked as in-use can be removed (because no state uses it).

That can be a lot of work if we've got a lot of states, so pf_purge_thread() breaks it up into smaller chunks, iterating over only part of the state table on every run. Let's look at that code:

void
pf_purge_thread(void *unused __unused)
{
        VNET_ITERATOR_DECL(vnet_iter);
        u_int idx = 0;

        sx_xlock(&pf_end_lock);
        while (pf_end_threads == 0) {
                sx_sleep(pf_purge_thread, &pf_end_lock, 0, "pftm", hz / 10);

                VNET_LIST_RLOCK();
                VNET_FOREACH(vnet_iter) {
                        CURVNET_SET(vnet_iter);


                        /* Wait until V_pf_default_rule is initialized. */
                        if (V_pf_vnet_active == 0) {
                                CURVNET_RESTORE();
                                continue;
                        }

                        /*
                         *  Process 1/interval fraction of the state
                         * table every run.
                         */
                        idx = pf_purge_expired_states(idx, pf_hashmask /
                            (V_pf_default_rule.timeout[PFTM_INTERVAL] * 10));

                        /*
                         * Purge other expired types every
                         * PFTM_INTERVAL seconds.
                         */
                        if (idx == 0) {
                                /*
                                 * Order is important:
                                 * - states and src nodes reference rules
                                 * - states and rules reference kifs
                                 */
                                pf_purge_expired_fragments();
                                pf_purge_expired_src_nodes();
                                pf_purge_unlinked_rules();
                                pfi_kif_purge();
                        }
                        CURVNET_RESTORE();
                }
                VNET_LIST_RUNLOCK();
        }

        pf_end_threads++;
        sx_xunlock(&pf_end_lock);
        kproc_exit(0);
}

We iterate over all of our virtual pf instances (VNET_FOREACH()), check whether the instance is active (the V_pf_vnet_active check introduced for FreeBSD-EN-17.08, where we've seen this code before) and then process the expired states with pf_purge_expired_states(). We start at state 'idx' and only process a certain number of states (determined by the PFTM_INTERVAL setting). The pf_purge_expired_states() function returns a new idx value to tell us how far we got.

So, remember when I mentioned the odd V_ prefix? Those are per-vnet variables. They work a bit like thread-local variables. Each vnet (virtual network stack) keeps its state separate from the others, and the V_ variables use a pointer that's changed whenever we change the currently active vnet (say with CURVNET_SET() or CURVNET_RESTORE()). That's tracked in the 'curvnet' variable. In other words: there are as many V_pf_vnet_active variables as there are vnets: number of vnet jails plus one (for the host system).

Why is that relevant here? Note that idx is not a per-vnet variable, yet we handle multiple pf instances here; we run through all of them, in fact. That means we end up checking the first X states in the first vnet, then the second X states in the second vnet, the third X states in the third, and so on.

That of course means that we think we've run through all of the states in a vnet while we've really only checked some of them. So when pf_purge_unlinked_rules() runs, it can end up free()ing rules that are actually still in use, because pf_purge_thread() skipped over the state(s) that used them.
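The effect of sharing one idx across all vnets can be seen in a few lines of simulation. This is a toy model, not the kernel code: each vnet has its own state table, the purge thread scans a fixed chunk per pass, but the scan index is advanced once per vnet instead of being kept per-vnet:

```python
# Toy model of the bug: 3 vnets with 12 states each, the purge thread
# scans 4 states per vnet per pass, but the index is SHARED across vnets.

NUM_VNETS = 3
STATES_PER_VNET = 12
CHUNK = 4

def purge_pass(idx, visited):
    """One run of the purge thread over every vnet, advancing a shared idx."""
    for vnet in range(NUM_VNETS):
        for i in range(idx, min(idx + CHUNK, STATES_PER_VNET)):
            visited[vnet].add(i)
        idx = (idx + CHUNK) % STATES_PER_VNET  # bumped once per vnet!
    return idx

visited = {v: set() for v in range(NUM_VNETS)}
idx = 0
for _ in range(STATES_PER_VNET // CHUNK):  # "enough" passes for a full sweep
    idx = purge_pass(idx, visited)

# Every vnet believes it was fully swept, yet each one only ever saw the
# same third of its state table; the other states were never marked in-use:
for vnet, seen in visited.items():
    print(vnet, sorted(seen))  # vnet 0 only ever sees states 0-3, etc.
```

Any rule referenced only by one of the never-visited states is never marked, so the mark-and-sweep in pf_purge_unlinked_rules() considers it unused and frees it, which is exactly the use-after-free in the panic.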

The problem only happened if we reloaded rules on the host, because the active ruleset is never free()d, even if there are no states pointing to its rules; only rules that have been replaced (unlinked) are ever freed.

That explains the panic, and the fix is actually quite straightforward: idx needs to be a per-vnet variable, V_pf_purge_idx, and then the problem is gone.

As is often the case, the solution to a fairly hard problem turns out to be really simple.

10 Sep 2017 5:00pm GMT