09 Jan 2019

feedPlanet Gentoo

Gentoo News: FOSDEM 2019

FOSDEM logo

It's FOSDEM time again! Join us at Université libre de Bruxelles, Campus du Solbosch, in Brussels, Belgium. This year's FOSDEM 2019 will be held on February 2nd and 3rd.

Our developers will be happy to greet all open source enthusiasts at our Gentoo stand in building K. Visit this year's wiki page to see who's coming. So far eight developers have specified their attendance, with most likely many more on the way!

09 Jan 2019 12:00am GMT

06 Dec 2018

feedPlanet Gentoo

Alexys Jacob: Scylla Summit 2018 write-up

It's been almost one month since I had the chance to attend and speak at Scylla Summit 2018 so I'm relieved to finally publish a short write-up on the key things I wanted to share about this wonderful event!

Make Scylla boring

This statement of Glauber Costa sums up what looked to me to be the main driver of the engineering efforts put into Scylla lately: making it work so consistently well on any kind of workload that it's boring to operate 🙂

I will follow up on this statement to highlight the things I heard and (hopefully) understood during the summit. I hope you'll find it insightful.

Reduced operational efforts

The thread-per-core and queues design still has a lot of possibilities to be leveraged.

The recent addition of RPC streaming capabilities to seastar allows a drastic reduction in the time it takes the cluster to grow or shrink (data rebalancing / resynchronization).

Incremental compaction is also very promising as this background process is one of the most expensive there is in the database's design.

I was happy to hear that scylla-manager will soon be made available and free to use with basic features while retaining more advanced ones for enterprise version (like backup/restore).
I also noticed that the current version was not supporting SSL enabled clusters to store its configuration. So I directly asked Michał for it and I'm glad that it will be released on version 1.3.1.

Performant multi-tenancy

Why choose between real-time OLTP & analytics OLAP workloads?

The goal here is to be able to run both on the same cluster by giving users the ability to assign "SLA" shares to ROLES. That's basically like pools on Hadoop at a much finer grain since it will create dedicated queues that will be weighted by their share.

Having one queue per usage and full accounting will allow to limit resources efficiently and users to have their say on their latency SLAs.

But Scylla also has a lot to do in the background to run smoothly. So while this design pattern was already applied to tamper compactions, a lot of work has also been done on automatic flow control and back pressure.

For instance, Materialized Views are updated asynchronously which means that while we can interact and put a lot of pressure on the table its based on (called the Main Table), we could overwhelm the background work that's needed to keep MVs View Tables in sync. To mitigate this, a smart back pressure approach was developed and will throttle the clients to make sure that Scylla can manage to do everything at the best performance the hardware allows!

I was happy to hear that work on tiered storage is also planned to better optimize disk space costs for certain workloads.

Last but not least, columnar storage optimized for time series and analytics workloads are also something the developers are looking at.

Latency is expensive

If you care for latency, you might be happy to hear that a new polling API (named IOCB_CMD_POLL) has been contributed by Christoph Hellwig and Avi Kivity to the 4.19 Linux kernel which avoids context switching I/O by using a shared ring between kernel and userspace. Scylla will be using it by default if the kernel supports it.

The iotune utility has been upgraded since 2.3 to generate an enhanced I/O configuration.

Also, persistent (disk backed) in-memory tables are getting ready and are very promising for latency sensitive workloads!

A word on drivers

ScyllaDB has been relying on the Datastax drivers since the start. While it's a good thing for the whole community, it's important to note that the shard-per-CPU approach on data that Scylla is using is not known and leveraged by the current drivers.

Discussions took place and it seems that Datastax will not allow the protocol to evolve so that drivers could discover if the connected cluster is shard aware or not and then use this information to be more clever in which write/read path to use.

So for now ScyllaDB has been forking and developing their shard aware drivers for Java and Go (no Python yet… I was disappointed).

Kubernetes & containers

The ScyllaDB guys of course couldn't avoid the Kubernetes frenzy so Moreno Garcia gave a lot of feedback and tips on how to operate Scylla on docker with minimal performance degradation.

Kubernetes has been designed for stateless applications, not stateful ones and Docker does some automatic magic that have rather big performance hits on Scylla. You will basically have to play with affinities to dedicate one Scylla instance to run on one server with a "retain" reclaim policy.

Remember that the official Scylla docker image runs with dev-mode enabled by default which turns off all performance checks on start. So start by disabling that and look at all the tips and literature that Moreno has put online!

Scylla 3.0

A lot has been written on it already so I will just be short on things that important to understand in my point of view.

Random notes

Support for LWT (lightweight transactions) will be relying on a future implementation of the Raft consensus algorithm inside Scylla. This work will also benefits Materialized Views consistency. Duarte Nunes will be the one working on this and I envy him very much!

Support for search workloads is high in the ScyllaDB devs priorities so we should definitely hear about it in the coming months.

Support for "mc" sstables (new generation format) is done and will reduce storage requirements thanks to metadata / data compression. Migration will be transparent because Scylla can read previous formats as well so it will upgrade your sstables as it compacts them.

ScyllaDB developers have not settled on how to best implement CDC. I hope they do rather soon because it is crucial in their ability to integrate well with Kafka!

Materialized Views, Secondary Indexes and filtering will benefit from the work on partition key and indexes intersections to avoid server side filtering on the coordinator. That's an important optimization to come!

Last but not least, I've had the pleasure to discuss with Takuya Asada who is the packager of Scylla for RedHat/CentOS & Debian/Ubuntu. We discussed Gentoo Linux packaging requirements as well as the recent and promising work on a relocatable package. We will collaborate more closely in the future!

06 Dec 2018 10:53pm GMT

25 Nov 2018

feedPlanet Gentoo

Michał Górny: Portability of tar features

The tar format is one of the oldest archive formats in use. It comes as no surprise that it is ugly - built as layers of hacks on the older format versions to overcome their limitations. However, given the POSIX standarization in late 80s and the popularity of GNU tar, you would expect the interoperability problems to be mostly resolved nowadays.

This article is directly inspired by my proof-of-concept work on new binary package format for Gentoo. My original proposal used volume label to provide user- and file(1)-friendly way of distinguish our binary packages. While it is a GNU tar extension, it falls within POSIX ustar implementation-defined file format and you would expect that non-compliant implementations would extract it as regular files. What I did not anticipate is that some implementation reject the whole archive instead.

This naturally raised more questions on how portable various tar formats actually are. To verify that, I have decided to analyze the standards for possible incompatibility dangers and build a suite of test inputs that could be used to check how various implementations cope with that. This article describes those points and provides test results for a number of implementations.

Please note that this article is focused merely on read-wise format compatibility. In other words, it establishes how tar files should be written in order to achieve best probability that it will be read correctly afterwards. It does not investigate what formats the listed tools can write and whether they can correctly create archives using specific features.

Continue reading

25 Nov 2018 2:26pm GMT