14 Apr 2014

feedPlanet Twisted

Future Foundries: Crochet 1.2.0, now with a better API!

Crochet is a library for using Twisted more easily from blocking programs and libraries. The latest version, released here at PyCon 2014, includes a much improved API for calling into Twisted from threads. In particular, a timeout is passed in; if it is hit, the underlying operation is cancelled and an exception is raised. Not all APIs in Twisted support cancellation, but for those that do (or APIs you implement) this is a really nice feature. You get high-level timeouts (instead of blocking sockets' timeout-per-socket-operation) and automatic cleanup of resources if something takes too long.

#!/usr/bin/python
"""
Do a DNS lookup using Twisted's APIs.
"""
from __future__ import print_function

# The Twisted code we'll be using:
from twisted.names import client

from crochet import setup, wait_for
setup()


# Crochet layer, wrapping Twisted's DNS library in a blocking call.
@wait_for(timeout=5.0)
def gethostbyname(name):
"""Lookup the IP of a given hostname.

Unlike socket.gethostbyname() which can take an arbitrary amount
of time to finish, this function will raise crochet.TimeoutError
if more than 5 seconds elapse without an answer being received.
"""
d = client.lookupAddress(name)
d.addCallback(lambda result: result[0][0].payload.dottedQuad())
return d


if __name__ == '__main__':
    # Application code using the public API - notice it works in a normal
    # blocking manner, with no event loop visible:
    import sys
    name = sys.argv[1]
    ip = gethostbyname(name)
    print(name, "->", ip)

14 Apr 2014 2:52pm GMT

02 Apr 2014

feedPlanet Twisted

Duncan McGreggor: Hash Maps in LFE: Request for Comment

As you may have heard, hash maps are coming to Erlang in R17. We're all pretty excited about this. The LFE community (yes, we have one... hey, being headquartered on Gutland keeps us lean!) has been abuzz with excitement: do we get some new syntax for Erlang maps? Or just record-like macros?

That's still an open question. There's a good chance that if we find an elegant solution, we'll get some new syntax.

In an effort to (re)start this conversation and get us thinking about the possibilities, I've drawn together some examples from various Lisps. At the end of the post, we'll review some related data structures in LFE... as a point of contrast and possible guidance.

Note that I've tried to keep the code grouped in larger gists, not split up with prose wedged between them. This should make it easier to compare and contrast whole examples at a glance.

Before we dive into the Lisps, let's take a look at maps in Erlang:

Erlang Maps

Common Lisp Hash Tables

Racket Hash Tables

Clojure Hash Maps

Shen Property Lists

OpenLisp Hash Tables

LFE Property Lists

LFE orddicts

I summarized some very basic usability and aesthetic thoughts on the LFE mail list, but I'll restate them here:

One of the things that makes Clojure such a joy to work with is the unified aspect of core functions and how one uses these to manipulate data structures of different types. Most other implementations have functions/macros that are dedicated to working with just maps. While that's clean and definitely has a strong appeal, Clojure reflects a great deal of elegance.

That being said, I don't think today is the day to propose unifying features for LFE/Erlang data types ;-) (To be honest, though, it's certainly in the back of my mind... this is probably also true for many folks on the mail list.)

Given my positive experience with maps (hash tables) in Racket, and Robert's initial proposed functions like map-new, map-set, I'd encourage us to look to Racket for some inspiration:

Additional thoughts:

With this done, I then did a thought experiment in potential syntax additions for LFE. Below are the series of gists that demonstrate this.

Looking at this Erlang syntax:

My fingers want to do something like this in LFE:

That feels pretty natural, from the LFE perspective. However, it looks like it might require hacking on the tuple-parsing logic (or splitting that into two code paths: one for regular tuple-parsing, and the other for maps...?).

The above syntax also lends itself nicely to these:

The question that arises for me is "how would we do this when calling functions?" Perhaps one of these:

Then, for Joe's other example:

We'd have this for LFE:

Before we pattern match on this, let's look at Erlang pattern matching for tuples:

Compare this with pattern matching elements of a tuple in LFE:

With that in our minds, we turn to Joe's matching example against a specific map element:

And we could do the same in LFE like this:

I'm really uncertain about add-pair and update-pair, both the need for them and the names. Interested to hear from others who know how map is implemented in Erlang and the best way to work with that in LFE...

02 Apr 2014 5:33pm GMT

Moshe Zadka: GPS Navigators, morality and the human utility function

GPS navigators used to be simple. You plug in the desired address, they calculate the quickest route. At some point, many started getting a feed of traffic information, so that they can route against big jams. Still pretty simple.

Enter Waze. Waze hyper-micro-optimizes for traffic conditions, using other users as roaming sensors and, probably, complicated AI on the back-end to predict how long any given route will take. This means Waze learns from experience how fast people drive where, under what conditions - and, as far as I can tell, takes all this information into account. This means that on a residential street that is constantly being sped through, Waze will learn that it is a fast path, and will start directing drivers through it - drivers who are using Waze, which means they want to get to their destination as fast as possible.

There are a lot of valid reasons to want to get somewhere fast. Maybe you scheduled a meeting. Maybe someone is having an emergency. Maybe you just want to cuddle your kid before she goes to bed. Waze does not care. It tries to get users to their destination as fast as possible, no matter which residential streets they speed through. It has no concern for safety, either the driver's or the pedestrian's, neatly divesting itself of all responsibility by "just suggesting". What if a nice street starts ranking higher-up on the Waze paths, and this lowers property values? Waze has no concern for that.

I'm not picking on Waze especially. Their only fault was that they took the optimization criteria that all GPS navigators use (find fastest path) and used superior sensors and superior algorithms to implement them better. Whoever wins the "navigator wars" would have had to do at least as well, so in that sense, there is nothing special about Waze except hiring smart people who made good decisions.

But this does show how AI moves forward: smart engineers optimize the hell out of whatever target function they are given. At some point, it starts having real costs that humans would understand, but an AI would not care about because the AI cares about optimizing the target function. The only way to solve it is to figure out what humans care about in terms that computers can understand and make sure that any AI takes those into account - even if it is not smart enough to self-modify, it can do plenty of damage without it.

In other words, the Friendliness apocalypse, if we include not just existential risk but also a general raising of human risk factors, is not in some nebulous future until we figure out self-modification. We are in the middle of it, and we should make sure that we take it into account when building AI - even if that AI is limited to "just suggesting things", because humans are suggestible, and have learned to trust computers' suggestions.

02 Apr 2014 4:03pm GMT

27 Mar 2014

feedPlanet Twisted

Glyph Lefkowitz: Panopticon Rift

Greg Price expresses a common opinion here:

The bile for Facebook among Oculus fans is eye-opening. I think every one of the top 200 comments is negative. http://t.co/6j3x9zrEta

- Greg Price (@gnprice) March 26, 2014

I myself had a similar reaction; despite not being particularly invested in VR specifically (I was not an Oculus backer) I felt pretty negatively about the fact that Facebook was the acquirer. I wasn't quite sure why, at first, and after some reflection I'd like to share my thoughts with you on both why this is a bad thing and also why gamers, in particular, were disturbed by it.

The Oculus Rift really captured the imagination of the gaming community's intelligentsia. John Carmack's imprimatur alone was enough to get people interested, but the real power of the Oculus was to finally deliver on the promise of all those unbelievably clunky virtual reality headsets that we've played around with at one time or another.

Virtual Boy

The promise of Virtual Reality is, of course, to transport us so completely to another place and time that we cease to even be aware of the real world. It is that experience, that complete and overwhelming sense of being in an imagined place, that many gamers are looking for when they sit down in front of an existing game. It is the aspiration to that experience that makes "immersive" such a high compliment in game criticism.

Personally, immersion in a virtual world was the beginning of my real interest in computers. I've never been the kind of gamer who felt the need to be intensely competitive. I didn't really care about rules and mechanics that much, at least not for their own sake. In fact, the term "gamer" is a bit of a misnomer - I'm more a connoisseur of interactive experiences. The very first "game" I remember really enjoying was Zork. Although Zork is a goal-directed game you can "win", that didn't interest me. Instead, I enjoyed wandering around the environments it provided, and experimenting with different actions to see what the computer would do.

Computer games are not really just one thing. The same term, "games", encompasses wildly divergent experiences including computerized Solitaire, Silent Hill, Dyad, and Myst. Nevertheless, I think that pursuit of immersion - on really, fully, presently paying attention to an interactive experience - is a primary reason that "gamers" feel the need to distinguish themselves (ourselves?) from the casual computing public. Obviously, the fans of Oculus are among those most focused on immersive experiences.

Gamers feel the need to set themselves apart because computing today is practically defined by a torrential cascade of things that we're only barely paying attention to. What makes computing "today" different than the computing of yesteryear is that computing used to be about thinking, and now it's about communication.

The advent of social media has left us not only focused on communication, but focused on constant, ephemeral, shallow communication. This is not an accident. In our present economy there is, after all, no such thing as a "social media business" (or, indeed, a "search engine business"); there are only ad agencies.

The purveyors of social media need you to be engaged enough with your friends that you won't leave their sites, so there is some level of entertainment or interest they must bring to their transactions with you, of course; but they don't want you to be so engaged that a distracting advertisement would be obviously crass and inappropriate. Lighthearted banter, pictures of shiba inus, and shallow gossip are fantastic fodder for this. Less so is long, soul-searching writing that explains and examines your feelings.

An ad for cat food might seem fine if you're chuckling at a picture of a dog in a hoodie saying "wow, very meme, such meta". It's less likely to drive you through to the terminus of that purchase conversion funnel if you're intently focused on supporting a friend who is explaining to you how a cancer scare drove home how they're doing nothing with their life, or how your friend's sibling has run away from home and your friend doesn't know if they're safe.

Even if you're a highly social extrovert, the dominant emotion that this torrent of ephemeral communication produces is, at best, sarcastic amusement. More likely, it produces constant anxiety. We do not experience our better selves when we're not really paying focused attention to anything. As Community Season 5 Episode 8 recently put it, somewhat more bluntly: "Mark Zuckerberg is Fidel Castro in flip-flops."

I think that despite all the other reasons we're all annoyed - the feelings of betrayal around the Kickstarter, protestations of Facebook's creepyness, and so on - the root of the anger around the Facebook acquisition is the sense that this technology with so much potential to reverse the Balkanization of our attention has now been put directly into the hands of those who created the problem in the first place.

So now, instead of looking forward to a technology that will allow us to visit a world of pure imagination, we now eagerly await something that will shove distracting notifications and annoying advertisements literally an inch away from our eyeballs. Probably after charging us several hundred dollars for the privilege.

It seems to me that's a pretty clear justification for a few hundred negative reddit comments.

27 Mar 2014 9:25am GMT

20 Mar 2014

feedPlanet Twisted

Duncan McGreggor: lfetool T-Shirts?!

So, yeah... waaaaay too early for a T-shirt; the code has just gone from 0.1.x to 0.2.x. But, how could we resist: the project logo is retro like woodblock prints!

Earlier today, lfetool converted to using plugins (easier for contributors to play!), and hopefully later tonight YAWS + Bootstrap support will land. Regardless, there's tons more work to do, and what's more motivating for new work than a T-shirt? As you can see, we had no choice.

So let's get some. T-shirts, that is.

The lfetool logo is on the front, and Robert Virding's example Fibonacci lfescript code is on the back (with the lfetool command that generates the script at the top). Here's the front of the T-shirt:




And here's the back:



We've got a CustomInk sign-up sheet that you can add your name to if you want a shirt. I'll coordinate with folks individually, once we meet our minimum number (we're going to need to use paypal with payment upfront). Due to the colors of the source code on the back, the minimum order is 15. This will put the shirts at $22 apiece + $5 to send one to you. I've just ordered 2; 13 more to go!


20 Mar 2014 3:58am GMT

18 Mar 2014

feedPlanet Twisted

Duncan McGreggor: lfetool 0.2 Released

This release of lfetool has a bunch of new interesting features -- too much to cover in a tweet, so it's getting a little blog post ;-)

Here are the high-level bits that should make users' lives better:

Note that the version jumped from 0.1.2 to 0.2.0 due to the strong improvements in quality and feature set of the tool. Forthcoming features include:

If you've got more ideas for the lfetool wishlist, post a comment here or send an email to the LFE mailing list.


18 Mar 2014 11:29pm GMT

15 Mar 2014

feedPlanet Twisted

Future Foundries: Signal/GC-safe cross-thread queueing in Python

I've just released a new version of Crochet, and one of the bugs fixed involves an interesting problem - reentrancy. In this particular case I'm talking about garbage collection and signal reentrancy - any function your Python program is running may be interrupted at any time (on bytecode boundaries) to do garbage collection or handle a signal. A signal handler can run arbitrary Python code, as can GC via __del__ or weakref callbacks. Once that code finishes running, control is returned to the original location in the code.


Unfortunately, due to a bug in Python, Queue.put() can deadlock in the following situation:
  1. As part of calling Queue.put(), a thread acquires the Queue's lock. This lock does not support being acquired more than once by the same thread.
  2. GC or a signal handler interrupts the function call.
  3. If the GC or signal handler code then also does Queue.put(), it will try to acquire the lock again... and since it's already locked it blocks waiting for the lock to be released.
  4. Since the signal handler/GC code is now blocked, control is never returned to the original code, so the lock is never released there.
  5. The thread is now deadlocked and will never recover.
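
Here is a minimal sketch of the scenario above (my illustration, not Crochet's code; the names log_queue and emergency_log are made up), assuming a SIGALRM handler that logs into the same standard-library Queue the main code uses:

import signal
import Queue  # "queue" on Python 3

log_queue = Queue.Queue()

def emergency_log(signum, frame):
    # If this handler runs while the main thread is inside log_queue.put()
    # and still holds the Queue's non-reentrant lock, this put() blocks
    # forever waiting for that same lock: deadlock.
    log_queue.put("signal received")

signal.signal(signal.SIGALRM, emergency_log)
log_queue.put("normal log message")  # interrupting this call reentrantly can deadlock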
Unfortunately there was no way to prevent the Queue.put() in GC; the Queue accepts log messages, and this is a GC-caused logging message coming out of code that is not under Crochet's control.

The obvious short-term solution is to reimplement a simplified Queue using Python's RLock, which allows the same thread to acquire the lock multiple times. But... RLock isn't reentrancy-safe either, due to another bug in Python! I could wrap OS-specific reentrant lock implementations, but that's a bigger project than I want to start.

The solution I ended up with (suggested by Jean-Paul Calderone I believe) was giving up on using threading primitives to communicate across threads. Instead I used the self-pipe trick. That is, the thread uses select() (or poll() or epoll()) to wait on one end of the pipe; to wake the thread up and tell it to check for new messages to process we simply write a byte to the other end of the pipe. Since Crochet uses Twisted, I had a pre-written event loop that already implemented self-pipe waking, and now the logging thread runs another Twisted reactor in addition to the regular reactor thread.
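
For illustration, here is a bare-bones sketch of the self-pipe trick (my own simplification, not Crochet's actual implementation, which reuses a Twisted reactor): the consumer thread blocks in select() on the read end of a pipe, and producers append to a plain deque and write one byte to wake it up, taking no locks that a signal handler or __del__ could re-enter.

import os
import select
from collections import deque

class SelfPipeQueue(object):
    """Illustrative sketch only; the names and details are assumptions."""

    def __init__(self):
        self._read_fd, self._write_fd = os.pipe()
        self._messages = deque()  # append/popleft take no Python-level locks

    def put(self, message):
        # Safe to call from signal handlers or GC callbacks: no lock to re-enter.
        self._messages.append(message)
        os.write(self._write_fd, b"x")  # wake the consumer thread

    def run(self, handle):
        # Consumer loop, intended to run in a dedicated thread.
        while True:
            select.select([self._read_fd], [], [])  # sleep until woken
            os.read(self._read_fd, 4096)            # drain the wake-up bytes
            while self._messages:
                handle(self._messages.popleft())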

As far as I can tell this works, but it feels a bit like overkill. I'd welcome suggestions for other solutions.

15 Mar 2014 12:45pm GMT

14 Mar 2014

feedPlanet Twisted

Duncan McGreggor: The Future of LFE

Erlang Factory

First of all, Erlang Factory this year was just phenomenal: great talks, great energy, and none of the feared/anticipated "acquisition feeding frenzy." Instead, everyone was genuinely happy for WhatsApp and Cloudant, and with celebrations aside, they were ready to get on with the conference and dive into the tech :-)

And gosh, there was a bunch of good tech. If you don't believe me, check out the schedule. Also on that page are the speaker pics. For talks that have video or slides pushed up, the speaker pic is visually annotated and linked.

There's so much good stuff there -- I've definitely got my watching queue set up for the next few weeks ...

LFE Presentation

I gave a presentation on LFE which covered everything from motivational basics for using a Lisp in the 21st century, gave a taste of LFE in small chunks, and then took folks on a quick tour of creating projects in LFE. There was also some dessert of fun side/research projects that are currently in-progress. The slides for the presentation are here; also the slide source code is available (related demo project). I'd also like to give a shout out to the Hoplon crew for their help in making sure I could create this presentation in a Lisp (Clojure), and not HTML ;-) (It uses a Hoplon-based Reveal.js library.)

The Good Stuff

After the presentation, several of us chatted about Lisp and Erlang for a while. Robert and I later continued along these lines after heading over to the quiet of the ever-cool Marines Memorial 11th-floor library (complete with fireplace). Here we sketched out some of the interesting areas for future development in LFE. I'm not sure if I'm remembering everything (I've also added stuff that we didn't discuss in the library, like Sean Chalmers' recent experiments with types; blog and discussion):

Additional efforts planned:

I'll dump all these into github issues so they'll be easier to track.

If this stuff is exciting to you, feel free to jump into the low-volume discussions we have on the mail list.

More soon!


14 Mar 2014 8:11pm GMT

09 Mar 2014

feedPlanet Twisted

Future Foundries: Twisted on Python 3 now pip installable

The subset of Twisted that has been ported to Python 3 can now be pip installed. By either pointing at a version control URL or requiring Twisted 14.0 (once it's released), you can now have Twisted as a dependency for your Python 3 packages.

Here's a slightly edited version of my Travis-CI config for Crochet, demonstrating how I run unit tests on both Python 2 and Python 3 versions of Twisted (slightly tricky because the trial test runner hasn't been ported yet):

language: python

env:
  - TWISTED=Twisted==13.0 RUNTESTS=trial
  - TWISTED=Twisted==13.1 RUNTESTS=trial
  - TWISTED=Twisted==13.2 RUNTESTS=trial

python:
  - 2.6
  - 2.7
  - pypy

matrix:
  include:
    - python: 3.3
      env: TWISTED=git+https://github.com/twisted/twisted.git RUNTESTS="python -m unittest discover"

install:
  - pip install -q --no-use-wheel $TWISTED --use-mirrors
  - python setup.py -q install

script: $RUNTESTS crochet.tests

09 Mar 2014 6:27pm GMT

08 Mar 2014

feedPlanet Twisted

Ashwini Oruganti: Kneel and Disconnect: Getting the fastest connection out of a hostname

A Caterpillar crawled to me one day and said
"Oh what the hell goes on inside your swollen head?
I don't believe that you can see as much as I
Now close your eyes and tell me what do you spy?"

A detailed account of the development of HostnameEndpoint follows, also leading to my talk at PyCon US 2014 in Montreal. During the course of its creation, many wise words were exchanged, thanks to exarkun and glyph (and itamar and tomprince!). I repeat them here, in hopes that it will help make you wiser too, should you aim for that goal. So well, indulge me whilst I rant on.

Soon after I finished working on a name-resolving TCP IPv6 client endpoint, I heard of a certain crazy endpoint, described (not in as much detail as it is now, but to some extent) in a ticket. In those days, seeing as the description was not too elaborate, and the goals not too clear, any mention of working on it would lead to a discussion such as:

<exarkun> I suspect it is obvious to anyone with more than ten years network programming experience.
<glyph> exarkun: Completely. ashfall should hurry up and get ten years of that, then.

This continued until one day Glyph decided to write a nice spec for it, and that's where our story begins.

Bornlivedie

<kenaan> ashfall r35028 A(branches/gai-endpoint-4859/): Branching to 'gai-endpoint-4859' …
<ashfall> Okay, this one is a little bit scary to start.

The gist of this endpoint is that it takes a hostname, does a getaddrinfo lookup on it, and picks the first two addresses out of the returned list. Using some time-involving algorithm, it tries connecting to both of them.

Doing things based on time in Twisted meant using reactor.callLater. That in turn meant using Clock as the reactor in the unit tests, since real time is slow and unpredictable, and tests are ideally fast and predictable.

This was also the day when my appreciation of the reactor grew.

My initial understanding of the task at hand, I remember, was that the endpoint would connect to the first address, record how long it took, then connect to the next address, record how long it took, compare the times taken, pick the one that took less time, and call connect one final time. After discussing it further with exarkun, I realized that this doesn't actually need to record any durations, since we don't care how long it took, as long as it was first.

Now, the high-level objective was to establish the connection as quickly as possible, and 3 connects would take a lot of time. So we make the connection attempts in parallel, drop the one that doesn't connect first and use the one that does connect first. So that there are two connection attempts, one (hopefully) connection success, and one discarded connection attempt. And no third connection attempt.

And how do we do that?

"Well, fortunately Twisted is good at doing network operations in parallel", explained exarkun. For example:

reactor.connectTCP('1.2.3.4', 456, f)
reactor.connectTCP('1:2::3:4', 567, f)

For my part I wondered, how is that parallel, with one reactor?

I learnt the reactor can do many things at once, and that that's its job, essentially. "reactor.connectTCP returns immediately, after all. It doesn't wait for the connection to be established first," exarkun elaborated. "That's why we need a factory with a buildProtocol method, so that the reactor can call it once the connection has succeeded. Because by the time it does succeed, the application will have moved on to doing something else."

"A secondary objective of the endpoint is to be conservative of network resources," he added. "If the connection to 1.2.3.4 succeeds quickly enough, then it would be wasteful of network resources to try connecting to 1:2::3:4 as well. That's why the relevant RFC talks about a 300 ms delay. If the first one is done really fast and you can avoid the second connection attempt, you should. So you won't actually start both connection attempts at the same time. You'll start the 2nd one a short while after the first one, if and only if the first one has not succeeded. reactor.callLater is how you would do that, and delayedCall.cancel() is how you would stop a delayed call from happening when the first one succeeds.someDeferred.cancel() is how you would make a connection attempt give up early, if the other connection attempt succeeds."

But now I wondered, what if 1.2.3.4 takes 254 ms and 1:2::3:4 would have taken, say, 230 ms, had we tried it (but we will never know?). And exarkun said, "[Then] Tough luck for us. We're not going to try to solve that problem." Turns out, it's not as much of a problem as the problem this RFC is aimed at fixing, which is the problem where 1:2::3:4 takes 120 seconds and 1.2.3.4 takes 254 ms, and so a purely serial approach would take 120.254 seconds to succeed, whereas this one will only take 0.254 seconds to succeed. That's a much more noticeable difference than 0.254 seconds vs 0.230 seconds.

"I watched nine cats dance on the moon
A flamingo stalked into my room
It bowed its head to me and knelt
To reveal the cards it had dealt

Now, where does Clock come into the picture? The clock is actually a replacement for the reactor, at least in a very restricted scope. It has a callLater method, but whereas reactor.callLater(5, f) will only call f after roughly 5 seconds have passed, clock.callLater(5, f) will call f the instant you call clock.advance(5). (or any combination of clock.advance calls so that the sum of the arguments passed to all calls is 5 or more). So if you're writing a unit test for code that does callLater(5, f), the test can complete in fewer than 5 seconds if you pass a Clock instance in for the reactor.

Using Clock also makes the results more predictable, since a real reactor really does call f in only roughly 5 seconds, and the variation implied by "roughly" can sometimes make a difference (in a bad way) to how the code behaves. The idea is that maybe 5 seconds is close to some other critical amount of time. Perhaps sometimes the reactor gets around to calling f very soon after 5 seconds, perhaps 5.001 seconds. That exercises one code path through the code under test. And perhaps sometimes, the reactor doesn't get around to calling f until 5.1 seconds have elapsed; meanwhile, something else happened in those extra .099 seconds that didn't have time to happen in the first case, and so now you're exercising a second, different code path through the code under test. But this is a single test method, and it should only ever exercise a single code path. Since it is exercising two, now, it can have two different outcomes, and that makes it much less useful. Perhaps one of the outcomes is a success and the other is a failure, so when you run the tests, the results flip flop between good and bad, even though you're not changing any code. Or perhaps both outcomes are success - for now. And someone breaks one of the code paths accidentally. But they run the test, and it accidentally goes through the other code path, the one they didn't break, and so the test passes. And they think their code works.

So, yes. Clock and writing two tests is the solution. Because with Clock, you can write a test where it appears that 5.001 seconds elapse before f is called, and a second test where it appears that 5.1 seconds elapse. (just by doing clock.advance(5.001) in one of them and clock.advance(5.1) in the other)
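
A small sketch of what those two tests might look like (codeUnderTest is a made-up stand-in for whatever real code schedules f with callLater(5, f)):

from twisted.internet.task import Clock
from twisted.trial import unittest

def codeUnderTest(reactor, f):
    # Stand-in for the real code: schedule f to run five seconds from now.
    reactor.callLater(5, f)

class CallLaterTests(unittest.TestCase):
    def test_justAfterFiveSeconds(self):
        clock = Clock()
        fired = []
        codeUnderTest(clock, lambda: fired.append(True))
        clock.advance(5.001)   # pretend 5.001 seconds passed
        self.assertEqual(fired, [True])

    def test_wellAfterFiveSeconds(self):
        clock = Clock()
        fired = []
        codeUnderTest(clock, lambda: fired.append(True))
        clock.advance(5.1)     # pretend 5.1 seconds passed
        self.assertEqual(fired, [True])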

Nomenclature

An ace, three jacks, two queens, four kings
Then turned them into burning rings
The flames jumped out and chased four mice
Caught by their tails they turned to ice

The new super-smart TCP endpoint, what should it be called?

The first name glyph suggested was HostnameEndpoint, because it connects you to a hostname, over whatever protocol is appropriate. Another suggestion was to call it TCPEndpoint, since we haven't used that name yet. And then it was pointed out that hostnames aren't part of TCP. Nor even IP. They're part of DNS. 99% of this endpoint is DNS functionality. So… DNSEndpoint? But you can look up hostnames for use with SCTP or UDP via DNS as well! And this won't let you do that. And, in fact, this endpoint does not necessarily use DNS: it could be doing name resolution via /etc/hosts, or, it could be doing name resolution using WINS… you don't know.

The important lesson to be learnt here is that naming things for how they work is generally worse than naming things for what they do. DNS and TCP are both implementation details (albeit kind of important details in this case); the important thing about this endpoint is that it takes a hostname and gives you a stream-oriented connection.

So HostnameEndpoint seemed like the best and the least ambiguous choice.

Deform to form a Star

Reading the spec, the whole idea sounded (to me, at least) too convoluted and it wasn't very obvious where to begin. On seeking help, exarkun suggested:

Part of the idea with proceeding from simplest to most complicated is that once you've done all the simple things, the complicated things generally end up pretty simple. They're only complicated if you have to tackle all the complexity at once.

We moved on to talk about what test cases to write first, because… TDD! Starting with the easy cases, I planned to begin by using the EndpointTestCaseMixin - Try making the test methods from that mixin that exercise failures pass. Then the success cases. Perhaps then try a test that exercises the case where getaddrinfo returns one IPv4 address. And then the case where it returns one IPv6 address. And when all that works, then maybe it's time to start thinking about the interesting cases, where getaddrinfo returns an IPv4 and an IPv6 address, e.g. What if it returns IPv4 then IPv6? What if it returns IPv6 then IPv4? What if the first connection attempt succeeds in < 300ms? What if it succeeds in > 300ms, but before the second? What if the second succeeds before the first? What if they both fail? But start with the simpler cases, and work on them one at a time.

Deferreds and Raising Exceptions

A cloud appeared outside my door
And through the window saw four more
And on the back of each cloud sat
Two rainbow smiles in wizard's hats

There was a part of the endpoint where I was using "except" to work with a Deferred firing with an exception. Needless to say, the unit test that was to check this was failing, and I wasn't sure why. This led exarkun to elaborate on how Deferreds work, and where I was going wrong.

We have Deferreds so that functions can return before they have finished the abstract task they are responsible for. They return a Deferred and the Deferred becomes responsible for indicating the result of the abstract task, once it is complete. This lets multiple tasks proceed concurrently. You can just call one function to start one task, then it will return (before the task is complete), and you can call another function to start another task (which will also return before that task is complete). Now you have two Deferreds for the two tasks, and you can use them to learn the results of those tasks.

You learn the result by adding a callback to the Deferred. And when the result is available, the callback is called with it.

Not all results are "successes" though. But for the same reason that these functions can't return their result directly, neither can they raise an exception directly. By the time they return, they might not have gotten far enough in their abstract task to know that there is going to be an error. So they can still only return a Deferred. So you can also attach error callbacks, or errbacks, to Deferreds as well. An errback is just like a callback, except it is called when the Deferred gets its result and that result is not a success.

So, to handle results from asynchronous, Deferred-returning functions, you take the returned Deferred and call addCallback(callbackFunction) on it. And to handle error results from asynchronous, Deferred-returning functions, you take the returned Deferred and call addErrback(errbackFunction) on it.

The errback is ignored if the result is not a failure, just like an "except" statement is ignored if no exception is raised. Similarly, the callback is ignored if the result is a failure.

Hence, "except" will not work for a Deferred that fires with exception, because the Deferred was returned. If anything was returned at all, an exception could not possibly have been raised. Because of how Python functions work - they can raise or they can return, they cannot do both.

Returning Deferreds From Tests

At one point, I had written a test that was returning a Deferred, and then the whole test case was timing out, never reaching completion. I learnt that since it doesn't require any external resources, there was no reason this test needed to spin a reactor.

I asked, then, is returning Deferreds from tests a bad thing?

"It's not 'good' or 'bad' really," Glyph explained, "but if you're going to think of it as one of those, let's think of it as 'bad' by default".

Returning a Deferred spins the global reactor. You return Deferreds from tests that need to interface with real, external resources or connect to real sockets. In this case you absolutely do not want to do that. You want to very carefully control everything that your HostnameEndpoint is talking to. You want to fire the connection attempt success and failure in a very specific order - in several different orders, as a matter of fact. and to verify that attempts are in progress at certain times and not in progress at other times. That means you want to use Clock and you want to use MemoryReactor (possibly smashed together on one object so that you only need to have one reactor).
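
The "smashed together" fake reactor can be as simple as a class mixing the two (a sketch; Twisted's test helpers may already provide an equivalent):

from twisted.internet.task import Clock
from twisted.test.proto_helpers import MemoryReactor

class MemoryReactorWithClock(MemoryReactor, Clock):
    """A fake reactor that records connection attempts (MemoryReactor)
    and lets the test control the passage of time (Clock)."""

    def __init__(self):
        MemoryReactor.__init__(self)
        Clock.__init__(self)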

Then, at precise points in your test, you want to say "this deferred should be fired by now" not "whenever this Deferred happens to fire…". As you can see, if you do it by returning Deferreds, then your failure mode is "uh… the test didn't finish", and not "Oh, of course, I forgot to callback deferred X when event Y happened, and my test failed telling me that that Deferred wasn't fired".

If you return an unfired Deferred from your test, then the test never ends. It just sits around waiting for that Deferred forever, where Trial defines "forever" to be 120 seconds, after which the test times out. There are times when it's appropriate, but you won't likely encounter them when testing Twisted itself, I was told.

Basically if you're originating the Deferreds that you're testing, you should definitely not be yielding them, or returning them from your test. Tests which do return Deferreds depend on the things that return the Deferreds that they're returning being fully tested and having no chance of timing out.

So as a rule of thumb, only return Deferreds from a test if the test interacts with external stuff, e.g. writing tests that talk to, say, Postgres, and have to do multiple asynchronous operations. And, if you're doing that, see if there's a way you could reasonably interact with less external stuff first.

This is just a special case of a general rule though. The general rule is, your tests should not depend on any more stuff than they need to. Especially, they should not depend on unpredictable external factors that may make your tests behave differently under different conditions. A real reactor is by definition an unpredictable external factor; the whole point is that it's for talking to unpredictable external stuff.

Glyph also shared an excellent analogy: "Think of it like an actual nuclear reactor. Before you hook your thermal regulator up to the coolant rods, you really want to have a high level of confidence that it is going to react in the appropriate way when the "uranium is too hot" wire gets some voltage. So you give that wire some voltage using a hand crank, not a critical mass."

0 seconds later is still "some amount of time"

They threw five clocks down on my bed
The chimes danced out on golden threads
And turned to footprints on my wall
Sequined tears began to fall"

Now, I had a function iterateEndpoint, which iterates through a generator that yields different endpoints for different addresses returned by gai. I needed to keep calling this function for each endpoint, and as per the algorithm we are implementing, we introduce a time lag of 0.3 seconds between two endpoints' connection attempts. So, each call to iterateEndpoint needs to occur 0.3 seconds after the previous call. I needed a technique to introduce this delay in the calls. There's reactor.callLater, which does exactly this, so I tried using it, and it didn't work.

By "didn't work", I mean that all my tests were failing, but not failing in a way that would help me identify what's wrong. But mostly, MemoryReactor.tcpClients was empty, when it should hold some host address, i.e., (in terms of real endpoint) a connection wasn't being established.

I blamed callLater for this, because there was at least one connection attempt when I hadn't used callLater and introduced the delay, whereas now it wasn't even trying the first endpoint's host address.

Then I made the delay in the looped calls 0 seconds to check, and I learnt that I needed to "advance" the clock for this to work. Even when the delay was 0 seconds. And this just didn't make any sense.

That was when glyph started quoting Morpheus and said, "Free your mind!"

"Advancing the clock means calling Clock.advance()", he explained further. "Specifically when I say 'free your mind' I mean 'stop thinking about everything in terms of real stuff happening'. Everything here is just a method calling another method. Some of those methods represent things (like the passage of time on a clock) but in a test it's all a simulation. I'd also say that ultimately when it's not in the test it's just a higher fidelity simulation but that's a thorny philosophical issue we probably shouldn't get into. At any rate it's always just methods calling other methods. So your LoopingCall specifies a function to call to advance to the next connection attempt; it asks an IReactorTime to call that thing via callLater.

But then, I wondered, if I set the time delay between two consecutive calls to the looped code as 0 seconds, why would I need to advance the clock now?

"Probably the easiest way to understand the answer to that is to look at the implementation of Clock", Glyph said patiently. "Particularly the implementation of Clock.callLater. When you call callLater, it just remembers the function you asked it to run, puts it in a list, and sorts the list. Sorting a list does not call your function. Similarly, a "real" reactor's callLater will never run your function reentrantly. 0 seconds later is still 'some amount of time' later, even if it's not a very big amount of time. the behavior of Clock is that it only invokes application code when 'time passes' which is to say when you call .advance(). Real reactors do the same thing and only invoke timed calls when you get back to the main loop.

But… it's zero. It's supposed to mean "nothing"! I protested.

"You're thinking too hard about it", said Glyph. "It's a number. The behavior of Clock is such that it remembers a callable, sorts them by number, and then calls them. It stuffs it on the stack of callLaters and re-evaluates that when the reactor gets control again. The magnitude of the number isn't important except in that negative numbers aren't allowed because that's non-sensical. In actual fact it is actually an orderable object, 'number' is a convention that we apply to it. callLater means 'when time advances, no sooner than N seconds later, call this function'."

"So, when the reactor is told to perform a task 0 seconds later, does it not do that immediately?", I asked, still not quite convinced.

"Well of course it does it immediately. But not reentrantly. 'As soon as possible but no sooner'." was Glyph's answer. "'immediately' is defined by some time-keeping thing, so in order to figure out when "immediately" occurs, it has to consult some time-keeping thing. And the point in the code - not in time - where it does that consultation is the reactor loop for reactors and the advance() method for Clock. Practically speaking, callLater is not a hard realtime API, because Python isn't a hard realtime system, so callLater(n, f) cannot physically run f precisely n seconds after its invocation; it can't even get a zero-error concept of when its own invocation was. clocks keep moving while computers are computing. so if you say callLater(1, f) really you're likely for f to get run at 1.000000001 seconds later, or something close to that. so if you say callLater(0, f) you're going to get 0.000000001 seconds later, if you're comparing against real time. if you're using a task.Clock(), time is frozen, so you don't get to 0 seconds later until you allow time to advance."

"The most important thing though is that you understand the (very simple) implementation within Clock. there's nothing complicated going on in there, no magic. it's almost as simple as l = []; l.append(f); for g in l: g(). All the stuff about how time actually works is useful to understand but ultimately peripheral to the very simple rule that functions don't get called until you call them."

"One last time", I said, now almost convinced, "what exactly does 'reentrant' mean here?"

"It means that def forever(): reactor.callLater(0, forever) will not cause a 'maximum recursion depth exceeded' error, because callLater does not call forever below its stack frame. forever() doesn't re-enter itself upon the call to callLater, hence, callLater does not invoke its argument reentrantly. In other words, it doesn't mutually recurse."

The Calculus of Testing

Now, in my tests, I was just popping the result I wanted from MemoryReactor.tcpClients, and matching it against a value. So, the tests ensured that the connection attempts were being made, but they weren't checking the algorithm.
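
The kind of assertion in question looked roughly like this (my reconstruction, assuming the memory reactor records each connectTCP() call as a (host, port, factory, timeout, bindAddress) tuple on its tcpClients list):

from twisted.internet.protocol import ClientFactory
from twisted.test.proto_helpers import MemoryReactor
from twisted.trial import unittest

class ConnectionAttemptTests(unittest.TestCase):
    def test_connectionAttemptMade(self):
        reactor = MemoryReactor()
        # Stand-in for the code under test, which asks the reactor to connect:
        reactor.connectTCP("1.2.3.4", 456, ClientFactory())
        host, port, factory, timeout, bindAddress = reactor.tcpClients[0]
        self.assertEqual((host, port), ("1.2.3.4", 456))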

Turns out, you can't check an algorithm with a unit test. "Some people think you can check an algorithm with formal logic, but they're probably wrong in practice. An algorithm is too complicated to check in any general way.", explained exarkun.

So I can only hope that the implementation of the algorithm is correct?

"No, you can test that the output of the algorithm is correct for some number of inputs", said he. "And you can hope that your selection of inputs exercises all meaningful branches (and branch combinations) in the algorithm."

"The idea with unit tests is that you constrain the domain of possible incorrect outputs", Glyph added. "You test what you believe to be a representative sample of each 'interesting' region of behavior for your algorithm, and by 'interesting' I mean each range of input that requires more code."

A general lesson I learnt was that I was trying really hard to understand how the whole thing fits together all at once, but it helps to compartmentalize. It helps to just figure out what functions call what other functions; later, once you know what's going on, learn why they're getting called in that way. You can do a whole lot without knowing why. And in some cases there isn't really a "why", it's just an arbitrary decision because the reactor has a bunch of work to do and it has to choose some order, so it does things in whatever order we thought to implement first.

And just like that, we had our endpoint written in a respectful, TDD-way, with well-tested parts, etc. But it still wasn't perfect. Later in the reviews, exarkun suggested that connect could be implemented more simply - probably as an independent object with instance state (and perhaps as an explicit state machine) rather than a collection of nested functions with shared, closed-over mutable objects. That led me to file another ticket, and work on getting a finite state machine into Twisted. But that's another story!

The caterpillar gasped at me and said
"My god if that's what's going on inside your head
You can see so much more than I
I think it's time to turn into a butterfly."

08 Mar 2014 2:24pm GMT

24 Feb 2014

feedPlanet Twisted

Glyph Lefkowitz: Unyielding

The Oak and the Reed by Achille Michallon

… that which is hard and stiff
is the follower of death
that which is soft and yielding
is the follower of life …

- the Tao Te Ching, chapter 76

Problem: Threads Are Bad

As we know, threads are a bad idea, (for most purposes). Threads make local reasoning difficult, and local reasoning is perhaps the most important thing in software development.

With the word "threads", I am referring to shared-state multithreading, despite the fact that there are languages, like Erlang and Haskell which refer to concurrent processes - those which do not implicitly share state, and require explicit coordination - as "threads".

My experience is mainly (although not exclusively) with Python but the ideas presented here should generalize to most languages which have global shared mutable state by default, which is to say, quite a lot of them: C (including Original Recipe, Sharp, Extra Crispy, Objective, and Plus Plus), JavaScript, Java, Scheme, Ruby, and PHP, just to name a few.

With the phrase "local reasoning", I'm referring to the ability to understand the behavior (and thereby, the correctness) of a routine by examining the routine itself rather than examining the entire system.

When you're looking at a routine that manipulates some state, in a single-tasking, nonconcurrent system, you only have to imagine the state at the beginning of the routine, and the state at the end of the routine. To imagine the different states, you need only to read the routine and imagine executing its instructions in order from top to bottom. This means that the number of instructions you must consider is n, where n is the number of instructions in the routine. By contrast, in a system with arbitrary concurrent execution - one where multiple threads might concurrently execute this routine with the same state - you have to read the method in every possible order, making the complexity n^n.

Therefore it is - literally - exponentially more difficult to reason about a routine that may be executed from an arbitrary number of threads concurrently. Instead, you need to consider every possible caller across your program, understanding what threads they might be invoked from, or what state they might share. If you're writing a library designed to be thread-safe, then you must place some of the burden of this understanding on your caller.

The importance of local reasoning really cannot be overstated. Computer programs are, at least for the time being, constructed by human beings who are thinking thoughts. Correct computer programs are constructed by human beings who can simultaneously think thoughts about all the interactions that the portion of the system they're developing will have with other portions.

A human being can only think about seven things at once, plus or minus two. Therefore, although we may develop software systems that contain thousands, millions, or billions of components over time, we must be able to make changes to that system while only holding in mind an average of seven things. Really bad systems will make us concentrate on nine things and we will only be able to correctly change them when we're at our absolute best. Really good systems will require us to concentrate on only five, and we might be able to write correct code for them even when we're tired.

Aside: "Oh Come On They're Not That Bad"

Those of you who actually use threads to write real software are probably objecting at this point. "Nobody would actually try to write free-threading code like this," I can hear you complain, "Of course we'd use a lock or a queue to introduce some critical sections if we're manipulating state."

Mutexes can help mitigate this combinatorial explosion, but they can't eliminate it, and they come with their own cost; you need to develop strategies to ensure consistent ordering of their acquisition. Mutexes should really be used to build queues, and to avoid deadlocks those queues should be non-blocking but eventually a system which communicates exclusively through non-blocking queues effectively becomes a set of communicating event loops, and its problems revert to those of an event-driven system; it doesn't look like regular programming with threads any more.

But even if you build such a system, if you're using a language like Python (or the ones detailed above) where modules, classes, and methods are all globally shared, mutable state, it's always possible to make an error that will affect the behavior of your whole program without even realizing that you're interacting with state at all. You have to have a level of vigilance bordering on paranoia just to make sure that your conventions around where state can be manipulated and by whom are honored, because when such an interaction causes a bug it's nearly impossible to tell where it came from.

Of course, threads are just one source of inscrutable, brain-bending bugs, and quite often you can make workable assumptions that preclude you from actually having to square the complexity of every single routine that you touch; for one thing, many computations don't require manipulating state at all, and you can (and must) ignore lots of things that can happen on every line of code anyway. (If you think not, when was the last time you audited your code base for correct behavior in the face of memory allocation failures?) So, in a sense, it's possible to write real systems with threads that perform more or less correctly for the same reasons it's possible to write any software approximating correctness at all; we all need a little strength of will and faith in our holy cause sometimes.

Nevertheless I still think it's a bad idea to make things harder for ourselves if we can avoid it.

Solution: Don't Use Threads

So now I've convinced you that if you're programming in Python (or one of its moral equivalents with respect to concurrency and state) you shouldn't use threads. Great. What are you going to do instead?

There's a lot of debate over the best way to do "asynchronous" programming - that is to say, "not threads". Four options are often presented.

  1. Straight callbacks: Twisted's IProtocol, JavaScript's on<foo> idiom, where you give a callback to something which will call it later and then return control to something (usually a main loop) which will execute those callbacks,
  2. "Managed" callbacks, or Futures: Twisted's Deferred, JavaScript's Promises/A[+], E's Promises, where you create a dedicated result-that-will-be-available-in-the-future object and return it for the caller to add callbacks to,
  3. Explicit coroutines: Twisted's @inlineCallbacks, Tulip's yield from coroutines, C#'s async/await, where you have a syntactic feature that explicitly suspends the current routine,
  4. and finally, implicit coroutines: Java's "green threads", Twisted's Corotwine, eventlet, gevent, where any function may switch the entire stack of the current thread of control by calling a function which suspends it.

One of these things is not like the others; one of these things just doesn't belong.

Don't Use Those Threads Either

Options 1-3 are all ways of representing the cooperative transfer of control within a stateful system. They are a semantic improvement over threads. Callbacks, Futures, and Yield-based coroutines all allow for local reasoning about concurrent operations.

So why does option 4 even show up in this list?

Unfortunately, "asynchronous" systems have often been evangelized by emphasizing a somewhat dubious optimization which allows for a higher level of I/O-bound concurrency than with preemptive threads, rather than the problems with threading as a programming model that I've explained above. By characterizing "asynchronousness" in this way, it makes sense to lump all 4 choices together.

I've been guilty of this myself, especially in years past: saying that a system using Twisted is more efficient than one using an alternative approach using threads. In many cases that's been true, but:

  1. the situation is almost always more complicated than that, when it comes to performance,
  2. "context switching" is rarely a bottleneck in real-world programs, and
  3. it's a bit of a distraction from the much bigger advantage of event-driven programming, which is simply that it's easier to write programs at scale, in both senses (that is, programs containing lots of code as well as programs which have many concurrent users).

A system that presents "implicit coroutines" - those which may transfer control to another concurrent task at any layer of the stack without any syntactic indication that this may happen - is simply the dubious optimization by itself.

Despite the fact that implicit coroutines masquerade under many different names, many of which don't include the word "thread" - for example, "greenlets", "coroutines", "fibers", "tasks" - green or lightweight threads are indeed threads, in that they present these same problems. In the long run, when you build a system that relies upon them, you eventually have all the pitfalls and dangers of full-blown preemptive threads. Which, as shown above, are bad.

When you look at the implementation of a potentially concurrent routine written using callbacks or yielding coroutines, you can visually see exactly where it might yield control, either to other routines, or perhaps even re-enter the same routine concurrently. If you are using callbacks - managed or otherwise - you will see a return statement, or the termination of a routine, which allows execution of the main loop to potentially continue. If you're using explicit coroutines, you'll see a yield (or await) statement which suspends the coroutine. Because you can see these indications of potential concurrency, they're outside of your mind, in your text editor, and you don't need to actively remember them as you're working on them.

You can think of these explicit yield-points as places where your program may gracefully bend to the needs of concurrent inputs. Crumple zones, or relief valves, for your logic, if you will: a single point where you have to consider the implications of a transfer of control to other parts of your program, rather than a rigid routine which might transfer (break) at any point beyond your control.

Like crumple zones, you shouldn't have too many of them, or they lose their effectiveness. A long routine which has an explicit yield point before every single instruction requires just as much out-of-order reasoning, and is therefore just as error-prone as one which has none, but might context switch before any instruction anyway. The advantage of having to actually insert the yield point explicitly is that at least you can see when a routine has this problem, and start to clean up and consolidate the management of its concurrency.

But this is all pretty abstract; let me give you a specific practical example, and a small theoretical demonstration.

The Buggiest Bug

Brass Cockroach - Image Credit GlamourGirlBeads http://www.etsy.com/listing/62042780/large-antiqued-brass-cockroach1-ants3074

When we wrote the very first version of Twisted Reality in Python, the version we had previously written in Java was already using green threads; at the time, the JVM didn't have any other kind of threads. The advantage to the new networking layer that we developed was not some massive leap forward in performance (the software in question was a multiplayer text adventure, which at the absolute height of its popularity might have been played by 30 people simultaneously) but rather the dramatic reduction in the number and severity of horrible, un-traceable concurrency bugs. One, in particular, involved a brass, mechanical cockroach which would crawl around on a timer, leaping out of a player's hands if it was in their inventory, moving between rooms if not. In the multithreaded version, the cockroach would leap out of your hands but then also still stay in your hands. As the cockroach moved between rooms it would create shadow copies of itself, slowly but inexorably creating a cockroach apocalypse as tens of thousands of pointers to the cockroach, each somehow acquiring their own timer, scuttled their way into every player's inventory dozens of times.

Given that the feeling that this particular narrative feature was supposed to inspire was eccentric whimsy and not existential terror, the non-determinism introduced by threads was a serious problem. Our hope for the even-driven re-write was simply that we'd be able to diagnose the bug by single-stepping through a debugger; instead, the bug simply disappeared. (Echoes of this persist, in that you may rarely hear a particularly grizzled Twisted old-timer refer to a particularly intractable bug as a "brass cockroach".)

The original source of the bug was so completely intractable that the only workable solution was to re-write the entire system from scratch. Months of debugging and testing and experimenting could still reproduce it only intermittently, and several "fixes" (read: random, desperate changes to the code) never resulted in anything.

I'd rather not do that ever again.

Ca(sh|che Coherent) Money

Despite the (I hope) entertaining nature of that anecdote, it still might be somewhat hard to visualize how concurrency results in a bug like that, and the code for that example is far too sprawling to be useful as an explanation. So here's a smaller in vitro example. Take my word for it that the source of the above bug was the result of many, many intersecting examples of the problem described below.

As it happens, this is the same variety of example Guido van Rossum gives when he describes why chose to use explicit coroutines instead of green threads for the upcoming standard library asyncio module, born out of the "tulip" project, so it's happened to more than one person in real life.

Photo Credit: Ennor https://www.flickr.com/photos/ennor/441394582/sizes/l/

Let's say we have this program:

1
2
3
4
5
6
7
8
9
def transfer(amount, payer, payee, server):
    if not payer.sufficient_funds_for_withdrawl(amount):
        raise InsufficientFunds()
    log("{payer} has sufficient funds.", payer=payer)
    payee.deposit(amount)
    log("{payee} received payment", payee=payee)
    payer.withdraw(amount)
    log("{payer} made payment", payer=payer)
    server.update_balances([payer, payee])

(I realize that the ordering of operations is a bit odd in this example, but it makes the point easier to demonstrate, so please bear with me.)

In a world without concurrency, this is of course correct. If you run transfer twice in a row, the balance of both accounts is always correct. But if we were to run transfer with the same two accounts in an arbitrary number of threads simultaneously, it is (obviously, I hope) wrong. One thread could update a payer's balance below the funds-sufficient threshold after the check to see if they're sufficient, but before issuing the withdrawl.

So, let's make it concurrent, in the PEP 3156 style. That update_balances routine looks like it probably has to do some network communication and block, so let's consider that it is as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
@coroutine
def transfer(amount, payer, payee, server):
    if not payer.sufficient_funds_for_withdrawl(amount):
        raise InsufficientFunds()
    log("{payer} has sufficient funds.", payer=payer)
    payee.deposit(amount)
    log("{payee} received payment", payee=payee)
    payer.withdraw(amount)
    log("{payer} made payment", payer=payer)
    yield from server.update_balances([payer, payee])

So now we have a trivially concurrent, correct version of this routine, although we did have to update it a little. Regardless of what sufficient_funds_for_withdrawl, deposit and withdrawl do - even if they do network I/O - we know that we aren't waiting for any of them to complete, so they can't cause transfer to interfere with itself. For the sake of a brief example here, we'll have to assume update_balances is a bit magical; for this to work our reads of the payer and payee's balance must be consistent.

But if we were to use green threads as our "asynchronous" mechanism rather than coroutines and yields, we wouldn't need to modify the program at all! Isn't that better? And only update_balances blocks anyway, so isn't it just as correct?

Sure: for now.

But now let's make another, subtler code change: our hypothetical operations team has requested that we put all of our log messages into a networked log-gathering system for analysis. A reasonable request, so we alter the implementation of log to write to the network.

Now, what will we have to do to modify the green-threaded version of this code? Nothing! This is usually the point where fans of various green-threading systems will point and jeer, since once the logging system is modified to do its network I/O, you don't even have to touch the code for the payments system. Separation of concerns! Less pointless busy-work! Looks like the green-threaded system is winning.

Oh well. Since I'm still a fan of explicit concurrency management, let's do the clearly unnecessary busy-work of updating the ledger code.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
@coroutine
def transfer(amount, payer, payee, server):
    if not payer.sufficient_funds_for_withdrawl(amount):
        raise InsufficientFunds()
    yield from log("{payer} has sufficient funds.", payer=payer)
    payee.deposit(amount)
    yield from log("{payee} received payment", payee=payee)
    payer.withdraw(amount)
    yield from log("{payer} made payment", payer=payer)
    yield from server.update_balances([payer, payee])

Well okay, at least that wasn't too hard, if somewhat tedious. Sigh. I guess we can go update all of the ledger's callers now and update them too…

…wait a second.

In order to update this routine for a non-blocking version of log, we had to type a yield keyword between the sufficient_funds_for_withdrawl check and the withdraw call, between the deposit and the withdraw call, and between the withdraw and update_balances call. If we know a little about concurrency and a little about what this program is doing, we know that every one of those yield froms are a potential problem. If those log calls start to back up and block, a payer may have their account checked for sufficient funds, then funds could be deducted while a log message is going on, leaving them with a negative balance.

If we were in the middle of updating lots of code, we might have blindly added these yield keywords without noticing that mistake. I've certainly done that in the past, too. But just the mechanical act of typing these out is an opportunity to notice that something's wrong, both now and later. Even if we get all the way through making the changes without realizing the problem, when we notice that balances are off, we can look only (reasoning locally!) at the transfer routine and realize, when we look at it, based on the presence of the yield from keywords, that there is something wrong with the transfer routine itself, regardless of the behavior of any of the things it's calling.

In the process of making all these obviously broken modifications, another thought might occur to us: do we really need to wait before log messages are transmitted to the logging system before moving on with our application logic? The answer would almost always be "no". A smart implementation of log could simply queue some outbound messages to the logging system, then discard if too many are buffered, removing any need for its caller to honor backpressure or slow down if the logging system can't keep up. Consider the way syslog says "and N more" instead of logging certain messages repeatedly. That feature allows it to avoid filling up logs with repeated messages, and decreases the amount of stuff that needs to be buffered if writing the logs to disk is slow.

All the extra work you need to do when you update all the callers of log when you make it asynchronous is therefore a feature. Tedious as it may be, the asynchronousness of an individual function is, in fact, something that all of its callers must be aware of, just as they must be aware of its arguments and its return type.

In fact you are changing its return type: in Twisted, that return type would be Deferred, and in Tulip, that return type is a new flavor of generator. This new return type represents the new semantics that happen when you make a function start having concurrency implications.

Haskell does this as well, by embedding the IO monad in the return type of any function which needs to have side-effects. This is what certain people mean when they say Deferreds are a Monad.

The main difference between lightweight and heavyweight threads is that it is that, with rigorous application of strict principles like "never share any state unnecessarily", and "always write tests for every routine at every point where it might suspend", lightweight threads make it at least possible to write a program that will behave deterministically and correctly, assuming you understand it in its entirety. When you find a surprising bug in production, because a routine that is now suspending in a place it wasn't before, it's possible with a lightweight threading system to write a deterministic test that will exercise that code path. With heavyweight threads, any line could be the position of a context switch at any time, so it's just not tractable to write tests for every possible order of execution.

However, with lightweight threads, you still can't write a test to discover when a new yield point might be causing problems, so you're still always playing catch-up.

Although it's possible to do this, it remains very challenging. As I described above, in languages like Python, Ruby, JavaScript, and PHP, even the code itself is shared, mutable state. Classes, types, functions, and namespaces are all shared, and all mutable. Libraries like object relational mappers commonly store state on classes.

No Shortcuts

Despite the great deal of badmouthing of threads above, my main purpose in writing this was not to convince you that threads are, in fact, bad. (Hopefully, you were convinced before you started reading this.) What I hope I've demonstrated is that if you agree with me that threading has problematic semantics, and is difficult to reason about, then there's no particular advantage to using microthreads, beyond potentially optimizing your multithreaded code for a very specific I/O bound workload.

There are no shortcuts to making single-tasking code concurrent. It's just a hard problem, and some of that hard problem is reflected in the difficulty of typing a bunch of new concurrency-specific code.

So don't be fooled: a thread is a thread regardless of its color. If you want your program to be supple and resilient in the face of concurrency, when the storm of concurrency blows, allow it to change. Tell it to yield, just like the reed. Otherwise, just like the steadfast and unchanging oak tree in the storm, your steadfast and unchanging algorithms will break right in half.

24 Feb 2014 10:05am GMT

11 Feb 2014

feedPlanet Twisted

The Moon Research Blog: Twisted Names – now with a (private) EDNS(0) message parser

This is the first in a series of articles describing my progress towards introducing EDNS(0) and DNSSEC support to Twisted Names.

In this article, I'll describe two new private classes designed to parse EDNS(0) messages.

But first a little background.

The EDNS protocol extensions depend on the use of a special DNS record which crucially contains the advertised maximum datagram size, extended return codes, a DNSSECOK flag, and a payload which contains zero or more extensible headers.

This special record is called an OPT record. It isn't a real record though; it doesn't convey any domain names or domain records. It's a hack to work around the lack of space in a standard 512 byte DNS message. All the header flags and fields that couldn't be shoehorned into the 96-bit header section of a DNS message are instead encoded into an OPT "pseudo" record.

EDNS clients and servers append one OPT record to the "additional" records in a sent DNS message. This record is then extracted and interpreted by an EDNS aware receiver.
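
As a rough illustration (my own sketch based on RFC 6891, not a transcription of Twisted's code), the OPT pseudo record re-purposes the ordinary resource record fields something like this:

# How an OPT pseudo record re-uses the standard resource record fields
# (per RFC 6891; the values shown here are only illustrative defaults).
OPT_PSEUDO_RECORD = {
    "NAME": b"",             # always the root domain
    "TYPE": 41,              # the OPT type code
    "CLASS": 4096,           # re-used: sender's maximum UDP payload size
    "TTL": {                 # re-used: extended flags and return code bits
        "extended_rcode": 0,
        "version": 0,        # EDNS version 0
        "dnssec_ok": True,   # the DO flag
    },
    "RDATA": [],             # zero or more (option-code, option-data) pairs
}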

Twisted already has a class for parsing DNS records and it already has a class for parsing DNS messages. I've now created some wrappers which add the EDNS special behaviour to those existing APIs.

The first wrapper is a class called twisted.names.dns._OPTHeader, which wraps an instance of twisted.names.dns.RRHeader. This allows us to re-use the existing record parsing code, and means that the code in _OPTHeader is purely concerned with the special behaviour of an OPT record. Which seems neat! It also made the unit tests much more focused.

The second wrapper is a class called twisted.names.dns._EDNSMessage which is in trunk but not yet in a released version. This wraps an instance of twisted.names.dns.Message. And again, this allows us to re-use the existing parsing code while clearly separating the EDNS specific behaviour.

So Twisted Names can now interpret OPT records and EDNS messages, but we've decided to keep these private for the time being; until we are satisfied with the APIs.

This first version of _EDNSMessage has been written as a drop in replacement for dns.Message. But as a result it has inherited some of the ugliness of that class, namely:

Tom has some ideas about introducing a new Interface for message constructors, and I've done some work on improving the abbreviated names, but I got stuck trying to figure out how to introduce the changes while maintaining backwards compatibility.

I'd originally thought those things were blockers, because I intended that users would pass curried message constructors / factories to the various protocols and client APIs, as a way of configuring / overriding EDNS flags and fields in their requests. But that's not a good API (as Tom points out in another comment).

So I've decided that those cleanups would be nice to have, but probably not necessary right now.

A much simpler strategy - and my current priority - is to just switch all Twisted Names code (client, protocols, server) to use the new dns._EDNSMessage class in place of dns.Message. This shouldn't introduce any backwards incompatibilities and I'll explicitly disable the use of EDNS during encoding (for now). That way Twisted Names clients and servers should continue to behave exactly as before.

But I'll then be free to start work on a DNS client which can be configured to send EDNS(0) requests and which is capable of negotiating with EDNS(0) servers. That's the key requirement of the first project milestone.

(twisted.names.server and the twistd dns plugin will remain non-EDNS for the time being.)

In fact I've already started that work, but first I had to fix a bug in the way twisted.names.server generates response messages. My proposed fix includes new functions and methods for generating response messages from a request message. (Previously, the request message was being re-used as the basis for the response, and inevitably, client specific flags and fields were leaking into the response.)

Happily, these new APIs will also allow me to explicitly disable EDNS(0) in twisted.names.server, until such time as we implement server side EDNS(0) support.

So that's all for now. In my next post I hope to be able to report that Twisted Names is using the new _EDNSMessage class internally and summarise some of the other work that I've done on Twisted Names in support of the EDNS(0) and DNSSEC project.

Thanks once again to The NLnet Foundation for funding this project.

11 Feb 2014 4:50pm GMT

03 Feb 2014

feedPlanet Twisted

The Moon Research Blog: The Bristol Twisted Sprint – a belated thankyou

I'd like to thank Luke, Rob and everyone at HybridCluster for organising the Bristol Twisted Sprint in December last year and for their hospitality on the day.

I got quite a bit of work done and it was a rare opportunity to meet Tom and JP, who had flown in from the Americas, and Hynek who had flown in from Berlin.

Earlier that week I'd been amazed to receive an email from Tom with code-reviews of three of my EDNS branches. He'd done them during his flight from Canada! Not sure I'd have had the stamina, given the lack of elbow room and the temptation of in-flight entertainment and complimentary booze. But Tom's a pro and it meant that we had lots to discuss on the day of the sprint. Tom had raised some concerns about my proposed mechanism for setting EDNS parameters from the high level twisted.names.client API - with which I agree. But I'm still struggling to implement the alternative mechanism that we discussed on the day. More on that in a later blog post.

It was also nice to see Hynek, who showed me an interesting custom DNS server which he'd written despite the lack of documentation and despite having to subclass and override the private _lookup method of twisted.names.authority. He laid it on pretty thick, and guilt-tripped me into writing some decent documentation for twisted.names.server. Conscience is what motivates me, so I'm pleased to report that we now have a little more narrative documentation for Twisted Names. And it looks pretty nice too, thanks to khorn's and glyph's work on porting the Twisted documentation to ReST and Sphinx.

JP adopted a "headmasterly" role; supervising things from a lectern at the front of the room (an improvised standing desk), and intimidating us (the pupils) with astonishing Typespeed scores between builds - the keyboard noise was deafening! But I'm exaggerating, JP was as helpful and as thoughtful as usual. Switching context effortlessly to assist us with all areas of Twisted - from development environment tips, to Twisted OpenSSL integration, to EDNS, to PyPy tests and benchmarking, to Tun/Tap APIs. Very impressive!

Let's hope there are more UK Twisted sprints in 2014.

03 Feb 2014 12:32pm GMT

01 Feb 2014

feedPlanet Twisted

The Moon Research Blog: Twisted Names EDNS — a progress update (or lack thereof)

Apologies to those of you who have been eagerly awaiting news of EDNS and DNSSEC support in Twisted Names. I haven't really got a good excuse; just that I'm a bad communicator and I've been distracted by some mundane (and some exciting) non-computer activities. Sorry.

So, a progress report is long overdue and progress has been a little slower than expected, but let me try and compensate by promising a series of short updates and demos which I'll publish over the next few weeks, demonstrating what has been achieved so far.

Stay tuned!

01 Feb 2014 9:43pm GMT

24 Jan 2014

feedPlanet Twisted

Glyph Lefkowitz: Unyielding

The Oak and the Reed by Achille Michallon

… that which is hard and stiff
is the follower of death
that which is soft and yielding
is the follower of life …

- the Tao Te Ching, chapter 76

Problem: Threads Are Bad

As we know, threads are a bad idea (for most purposes). Threads make local reasoning difficult, and local reasoning is perhaps the most important thing in software development.

With the word "threads", I am referring to shared-state multithreading, despite the fact that there are languages, like Erlang and Haskell which refer to concurrent processes - those which do not implicitly share state, and require explicit coordination - as "threads".

My experience is mainly (although not exclusively) with Python but the ideas presented here should generalize to most languages which have global shared mutable state by default, which is to say, quite a lot of them: C (including Original Recipe, Sharp, Extra Crispy, Objective, and Plus Plus), JavaScript, Java, Scheme, Ruby, and PHP, just to name a few.

With the phrase "local reasoning", I'm referring to the ability to understand the behavior (and thereby, the correctness) of a routine by examining the routine itself rather than examining the entire system.

When you're looking at a routine that manipulates some state, in a single-tasking, nonconcurrent system, you only have to imagine the state at the beginning of the routine, and the state at the end of the routine. To imagine the different states, you need only to read the routine and imagine executing its instructions in order from top to bottom. This means that the number of instructions you must consider is n, where n is the number of instructions in the routine. By contrast, in a system with arbitrary concurrent execution - one where multiple threads might concurrently execute this routine with the same state - you have to read the method in every possible order, making the complexity nⁿ.

Therefore it is - literally - exponentially more difficult to reason about a routine that may be executed from an arbitrary number of threads concurrently. Instead, you need to consider every possible caller across your program, understanding what threads they might be invoked from, or what state they might share. If you're writing a library designed to be thread-safe, then you must place some of the burden of this understanding on your caller.

The importance of local reasoning really cannot be overstated. Computer programs are, at least for the time being, constructed by human beings who are thinking thoughts. Correct computer programs are constructed by human beings who can simultaneously think thoughts about all the interactions that the portion of the system they're developing will have with other portions.

A human being can only think about seven things at once, plus or minus two. Therefore, although we may develop software systems that contain thousands, millions, or billions of components over time, we must be able to make changes to that system while only holding in mind an average of seven things. Really bad systems will make us concentrate on nine things and we will only be able to correctly change them when we're at our absolute best. Really good systems will require us to concentrate on only five, and we might be able to write correct code for them even when we're tired.

Aside: "Oh Come On They're Not That Bad"

Those of you who actually use threads to write real software are probably objecting at this point. "Nobody would actually try to write free-threading code like this," I can hear you complain, "Of course we'd use a lock or a queue to introduce some critical sections if we're manipulating state."

Mutexes can help mitigate this combinatorial explosion, but they can't eliminate it, and they come with their own cost; you need to develop strategies to ensure consistent ordering of their acquisition. Mutexes should really be used to build queues, and to avoid deadlocks those queues should be non-blocking. But eventually a system which communicates exclusively through non-blocking queues effectively becomes a set of communicating event loops, and its problems revert to those of an event-driven system; it doesn't look like regular programming with threads any more.
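
One such ordering strategy - sketched below as my own illustration, not taken from any particular library - is to acquire locks in a single global order keyed by identity, so that no two threads can each hold one lock while waiting for the other:

import threading

def acquire_in_order(*locks):
    # Sort by id() so every caller acquires the same locks in the same order.
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release_all(ordered):
    for lock in reversed(ordered):
        lock.release()

payer_lock = threading.Lock()
payee_lock = threading.Lock()

held = acquire_in_order(payer_lock, payee_lock)
try:
    pass  # manipulate the state guarded by both locks here
finally:
    release_all(held)

Note that discipline like this only addresses deadlock; it does nothing for the combinatorial reasoning problem.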

But even if you build such a system, if you're using a language like Python (or the ones detailed above) where modules, classes, and methods are all globally shared, mutable state, it's always possible to make an error that will affect the behavior of your whole program without even realizing that you're interacting with state at all. You have to have a level of vigilance bordering on paranoia just to make sure that your conventions around where state can be manipulated and by whom are honored, because when such an interaction causes a bug it's nearly impossible to tell where it came from.

Of course, threads are just one source of inscrutable, brain-bending bugs, and quite often you can make workable assumptions that preclude you from actually having to square the complexity of every single routine that you touch; for one thing, many computations don't require manipulating state at all, and you can (and must) ignore lots of things that can happen on every line of code anyway. (If you think not, when was the last time you audited your code base for correct behavior in the face of memory allocation failures?) So, in a sense, it's possible to write real systems with threads that perform more or less correctly for the same reasons it's possible to write any software approximating correctness at all; we all need a little strength of will and faith in our holy cause sometimes.

Nevertheless I still think it's a bad idea to make things harder for ourselves if we can avoid it.

Solution: Don't Use Threads

So now I've convinced you that if you're programming in Python (or one of its moral equivalents with respect to concurrency and state) you shouldn't use threads. Great. What are you going to do instead?

There's a lot of debate over the best way to do "asynchronous" programming - that is to say, "not threads". Four options are often presented:

  1. Straight callbacks: Twisted's IProtocol, JavaScript's on<foo> idiom, where you give a callback to something which will call it later and then return control to something (usually a main loop) which will execute those callbacks,
  2. "Managed" callbacks, or Futures: Twisted's Deferred, JavaScript's Promises/A[+], E's Promises, where you create a dedicated result-that-will-be-available-in-the-future object and return it for the caller to add callbacks to,
  3. Explicit coroutines: Twisted's @inlineCallbacks, Tulip's yield from coroutines, C#'s async/await, where you have a syntactic feature that explicitly suspends the current routine,
  4. and finally, implicit coroutines: Java's "green threads", Twisted's Corotwine, eventlet, gevent, where any function may switch the entire stack of the current thread of control by calling a function which suspends it.

One of these things is not like the others; one of these things just doesn't belong.

Don't Use Those Threads Either

Options 1-3 are all ways of representing the cooperative transfer of control within a stateful system. They are a semantic improvement over threads. Callbacks, Futures, and Yield-based coroutines all allow for local reasoning about concurrent operations.
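
To make that concrete, here is a minimal sketch of my own (with a made-up fetch_balance function standing in for real network I/O) of the same check written in style 2 and style 3. In both cases the point where control may be transferred away is visible in the text of the routine:

from twisted.internet import defer

def fetch_balance(account):
    # Stand-in for something that would really do network I/O and
    # return a Deferred that fires later.
    return defer.succeed(100)

# Style 2: a "managed" callback. The routine ends by returning a Deferred;
# anything that happens afterwards lives in a separately visible callback.
def has_sufficient_funds(account, amount):
    d = fetch_balance(account)
    d.addCallback(lambda balance: balance >= amount)
    return d

# Style 3: an explicit coroutine. The possible suspension point is marked
# by the yield right there in the routine's body.
@defer.inlineCallbacks
def has_sufficient_funds_coro(account, amount):
    balance = yield fetch_balance(account)
    defer.returnValue(balance >= amount)

In both versions you can find the only place where other code might run before the comparison happens, which is exactly the property that option 4 gives up.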

So why does option 4 even show up in this list?

Unfortunately, "asynchronous" systems have often been evangelized by emphasizing a somewhat dubious optimization which allows for a higher level of I/O-bound concurrency than with preemptive threads, rather than the problems with threading as a programming model that I've explained above. By characterizing "asynchronousness" in this way, it makes sense to lump all 4 choices together.

I've been guilty of this myself, especially in years past: saying that a system using Twisted is more efficient than one using an alternative approach using threads. In many cases that's been true, but:

  1. the situation is almost always more complicated than that, when it comes to performance,
  2. "context switching" is rarely a bottleneck in real-world programs, and
  3. it's a bit of a distraction from the much bigger advantage of event-driven programming, which is simply that it's easier to write programs at scale, in both senses (that is, programs containing lots of code as well as programs which have many concurrent users).

A system that presents "implicit coroutines" - one which may transfer control to another concurrent task at any layer of the stack without any syntactic indication that this may happen - is simply the dubious optimization by itself.

Despite the fact that implicit coroutines masquerade under many different names, many of which don't include the word "thread" - for example, "greenlets", "coroutines", "fibers", "tasks" - green or lightweight threads are indeed threads, in that they present these same problems. In the long run, when you build a system that relies upon them, you eventually have all the pitfalls and dangers of full-blown preemptive threads. Which, as shown above, are bad.

When you look at the implementation of a potentially concurrent routine written using callbacks or yielding coroutines, you can visually see exactly where it might yield control, either to other routines, or perhaps even re-enter the same routine concurrently. If you are using callbacks - managed or otherwise - you will see a return statement, or the termination of a routine, which allows execution of the main loop to potentially continue. If you're using explicit coroutines, you'll see a yield (or await) statement which suspends the coroutine. Because you can see these indications of potential concurrency, they're outside of your mind, in your text editor, and you don't need to actively remember them as you're working on them.

You can think of these explicit yield-points as places where your program may gracefully bend to the needs of concurrent inputs. Crumple zones, or relief valves, for your logic, if you will: a single point where you have to consider the implications of a transfer of control to other parts of your program, rather than a rigid routine which might transfer (break) at any point beyond your control.

Like crumple zones, you shouldn't have too many of them, or they lose their effectiveness. A long routine which has an explicit yield point before every single instruction requires just as much out-of-order reasoning, and is therefore just as error-prone as one which has none, but might context switch before any instruction anyway. The advantage of having to actually insert the yield point explicitly is that at least you can see when a routine has this problem, and start to clean up and consolidate the management of its concurrency.

But this is all pretty abstract; let me give you a specific practical example, and a small theoretical demonstration.

The Buggiest Bug

Brass Cockroach - Image Credit GlamourGirlBeads http://www.etsy.com/listing/62042780/large-antiqued-brass-cockroach1-ants3074

When we wrote the very first version of Twisted Reality in Python, the version we had previously written in Java was already using green threads; at the time, the JVM didn't have any other kind of threads. The advantage to the new networking layer that we developed was not some massive leap forward in performance (the software in question was a multiplayer text adventure, which at the absolute height of its popularity might have been played by 30 people simultaneously) but rather the dramatic reduction in the number and severity of horrible, un-traceable concurrency bugs. One, in particular, involved a brass, mechanical cockroach which would crawl around on a timer, leaping out of a player's hands if it was in their inventory, moving between rooms if not. In the multithreaded version, the cockroach would leap out of your hands but then also still stay in your hands. As the cockroach moved between rooms it would create shadow copies of itself, slowly but inexorably creating a cockroach apocalypse as tens of thousands of pointers to the cockroach, each somehow acquiring their own timer, scuttled their way into every player's inventory dozens of times.

Given that the feeling that this particular narrative feature was supposed to inspire was eccentric whimsy and not existential terror, the non-determinism introduced by threads was a serious problem. Our hope for the event-driven re-write was simply that we'd be able to diagnose the bug by single-stepping through a debugger; instead, the bug simply disappeared. (Echoes of this persist, in that you may rarely hear a particularly grizzled Twisted old-timer refer to a particularly intractable bug as a "brass cockroach".)

The original source of the bug was so completely intractable that the only workable solution was to re-write the entire system from scratch. Months of debugging and testing and experimenting could still reproduce it only intermittently, and several "fixes" (read: random, desperate changes to the code) never resulted in anything.

I'd rather not do that ever again.

Ca(sh|che Coherent) Money

Despite the (I hope) entertaining nature of that anecdote, it still might be somewhat hard to visualize how concurrency results in a bug like that, and the code for that example is far too sprawling to be useful as an explanation. So here's a smaller in vitro example. Take my word for it that the source of the above bug was the result of many, many intersecting examples of the problem described below.

As it happens, this is the same variety of example Guido van Rossum gives when he describes why he chose to use explicit coroutines instead of green threads for the upcoming standard library asyncio module, born out of the "tulip" project, so it's happened to more than one person in real life.

Photo Credit: Ennor https://www.flickr.com/photos/ennor/441394582/sizes/l/

Let's say we have this program:

def transfer(amount, payer, payee, server):
    if not payer.sufficient_funds_for_withdrawl(amount):
        raise InsufficientFunds()
    log("{payer} has sufficient funds.", payer=payer)
    payee.deposit(amount)
    log("{payee} received payment", payee=payee)
    payer.withdraw(amount)
    log("{payer} made payment", payer=payer)
    server.update_balances([payer, payee])

(I realize that the ordering of operations is a bit odd in this example, but it makes the point easier to demonstrate, so please bear with me.)

In a world without concurrency, this is of course correct. If you run transfer twice in a row, the balance of both accounts is always correct. But if we were to run transfer with the same two accounts in an arbitrary number of threads simultaneously, it is (obviously, I hope) wrong. One thread could update a payer's balance below the funds-sufficient threshold after the check to see if they're sufficient, but before issuing the withdrawal.
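
To see that failure concretely, here's a small sketch of my own (not part of the original example) that plays out one bad interleaving in straight-line code, with each "thread" written out one step at a time:

class Account(object):
    def __init__(self, balance):
        self.balance = balance
    def sufficient_funds_for_withdrawl(self, amount):
        return self.balance >= amount
    def withdraw(self, amount):
        self.balance -= amount

payer = Account(100)

# Thread A and thread B each try to transfer 80 at the same time.
assert payer.sufficient_funds_for_withdrawl(80)  # A's check passes
assert payer.sufficient_funds_for_withdrawl(80)  # B's check also passes
payer.withdraw(80)                               # A withdraws
payer.withdraw(80)                               # B withdraws
assert payer.balance == -60                      # overdrawn: the bug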

So, let's make it concurrent, in the PEP 3156 style. That update_balances routine looks like it probably has to do some network communication and block, so let's consider that it is as follows:

@coroutine
def transfer(amount, payer, payee, server):
    if not payer.sufficient_funds_for_withdrawl(amount):
        raise InsufficientFunds()
    log("{payer} has sufficient funds.", payer=payer)
    payee.deposit(amount)
    log("{payee} received payment", payee=payee)
    payer.withdraw(amount)
    log("{payer} made payment", payer=payer)
    yield from server.update_balances([payer, payee])

So now we have a trivially concurrent, correct version of this routine, although we did have to update it a little. Regardless of what sufficient_funds_for_withdrawl, deposit and withdraw do - even if they do network I/O - we know that we aren't waiting for any of them to complete, so they can't cause transfer to interfere with itself. For the sake of a brief example here, we'll have to assume update_balances is a bit magical; for this to work our reads of the payer and payee's balance must be consistent.

But if we were to use green threads as our "asynchronous" mechanism rather than coroutines and yields, we wouldn't need to modify the program at all! Isn't that better? And only update_balances blocks anyway, so isn't it just as correct?

Sure: for now.

But now let's make another, subtler code change: our hypothetical operations team has requested that we put all of our log messages into a networked log-gathering system for analysis. A reasonable request, so we alter the implementation of log to write to the network.

Now, what will we have to do to modify the green-threaded version of this code? Nothing! This is usually the point where fans of various green-threading systems will point and jeer, since once the logging system is modified to do its network I/O, you don't even have to touch the code for the payments system. Separation of concerns! Less pointless busy-work! Looks like the green-threaded system is winning.

Oh well. Since I'm still a fan of explicit concurrency management, let's do the clearly unnecessary busy-work of updating the ledger code.

@coroutine
def transfer(amount, payer, payee, server):
    if not payer.sufficient_funds_for_withdrawl(amount):
        raise InsufficientFunds()
    yield from log("{payer} has sufficient funds.", payer=payer)
    payee.deposit(amount)
    yield from log("{payee} received payment", payee=payee)
    payer.withdraw(amount)
    yield from log("{payer} made payment", payer=payer)
    yield from server.update_balances([payer, payee])

Well okay, at least that wasn't too hard, if somewhat tedious. Sigh. I guess we can go update all of the ledger's callers now, too…

…wait a second.

In order to update this routine for a non-blocking version of log, we had to type a yield keyword between the sufficient_funds_for_withdrawl check and the withdraw call, between the deposit and the withdraw call, and between the withdraw and update_balances call. If we know a little about concurrency and a little about what this program is doing, we know that every one of those yield froms is a potential problem. If those log calls start to back up and block, a payer may have their account checked for sufficient funds, then funds could be deducted while a log message is going on, leaving them with a negative balance.

If we were in the middle of updating lots of code, we might have blindly added these yield keywords without noticing that mistake. I've certainly done that in the past, too. But just the mechanical act of typing these out is an opportunity to notice that something's wrong, both now and later. Even if we get all the way through making the changes without realizing the problem, when we notice that balances are off, we can look only (reasoning locally!) at the transfer routine and realize, when we look at it, based on the presence of the yield from keywords, that there is something wrong with the transfer routine itself, regardless of the behavior of any of the things it's calling.

In the process of making all these obviously broken modifications, another thought might occur to us: do we really need to wait for log messages to be transmitted to the logging system before moving on with our application logic? The answer would almost always be "no". A smart implementation of log could simply queue some outbound messages to the logging system, then discard them if too many are buffered, removing any need for its caller to honor backpressure or slow down if the logging system can't keep up. Consider the way syslog says "and N more" instead of logging certain messages repeatedly. That feature allows it to avoid filling up logs with repeated messages, and decreases the amount of stuff that needs to be buffered if writing the logs to disk is slow.
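
Here's a rough sketch - mine, not a real logging library - of what such a log implementation might look like: callers never block, and the buffer simply drops and counts messages when the logging backend falls behind:

from collections import deque

class DroppingLogBuffer(object):
    def __init__(self, limit=1000):
        self.limit = limit
        self.buffer = deque()
        self.dropped = 0

    def log(self, message, **fields):
        # Cheap and synchronous: the caller never waits on the network here.
        if len(self.buffer) >= self.limit:
            self.dropped += 1   # remembered for an "and N more" record later
            return
        self.buffer.append((message, fields))

    def flush(self, send):
        # Called from the event loop whenever the logging connection can
        # accept more data; `send` is whatever transmits a single record.
        while self.buffer:
            send(self.buffer.popleft())
        if self.dropped:
            send(("...and {n} more messages were dropped", {"n": self.dropped}))
            self.dropped = 0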

All the extra work you need to do to update all of the callers of log when you make it asynchronous is therefore a feature. Tedious as it may be, the asynchronousness of an individual function is, in fact, something that all of its callers must be aware of, just as they must be aware of its arguments and its return type.

In fact you are changing its return type: in Twisted, that return type would be Deferred, and in Tulip, that return type is a new flavor of generator. This new return type represents the new semantics that happen when you make a function start having concurrency implications.

Haskell does this as well, by embedding the IO monad in the return type of any function which needs to have side-effects. This is what certain people mean when they say Deferreds are a Monad.

The main difference between lightweight and heavyweight threads is that, with rigorous application of strict principles like "never share any state unnecessarily", and "always write tests for every routine at every point where it might suspend", lightweight threads make it at least possible to write a program that will behave deterministically and correctly, assuming you understand it in its entirety. When you find a surprising bug in production because a routine is now suspending in a place it wasn't before, it's possible with a lightweight threading system to write a deterministic test that will exercise that code path. With heavyweight threads, any line could be the position of a context switch at any time, so it's just not tractable to write tests for every possible order of execution.

However, with lightweight threads, you still can't write a test to discover when a new yield point might be causing problems, so you're still always playing catch-up.

Although it's possible to do this, it remains very challenging. As I described above, in languages like Python, Ruby, JavaScript, and PHP, even the code itself is shared, mutable state. Classes, types, functions, and namespaces are all shared, and all mutable. Libraries like object relational mappers commonly store state on classes.

No Shortcuts

Despite the great deal of badmouthing of threads above, my main purpose in writing this was not to convince you that threads are, in fact, bad. (Hopefully, you were convinced before you started reading this.) What I hope I've demonstrated is that if you agree with me that threading has problematic semantics, and is difficult to reason about, then there's no particular advantage to using microthreads, beyond potentially optimizing your multithreaded code for a very specific I/O bound workload.

There are no shortcuts to making single-tasking code concurrent. It's just a hard problem, and some of that hard problem is reflected in the difficulty of typing a bunch of new concurrency-specific code.

So don't be fooled: a thread is a thread regardless of its color. If you want your program to be supple and resilient in the face of concurrency, when the storm of concurrency blows, allow it to change. Tell it to yield, just like the reed. Otherwise, just like the steadfast and unchanging oak tree in the storm, your steadfast and unchanging algorithms will break right in half.

24 Jan 2014 8:28am GMT

21 Jan 2014

feedPlanet Twisted

Glyph Lefkowitz: And Now For Something Completely Different

It seems that the constant reminders of all the wonderful new features that my previous web publishing host had in store for me finally motivated me to set up a somewhat more do-it-yourself publishing arrangement, as behooves someone of my particular talents and skills.

Desk

I'm using Pelican now, and from what I've seen of it so far, it's a promising tool. I particularly like their approach to publishing multiple content types; for content that I'm writing new, I can use markdown, but for content that I want to process automatically, it accepts HTML just fine, which means that the conversion process from my previous host has been relatively painless. (Relatively painless, that is. Remember to calibrate pain on the Extract-Transform-Load scale, which starts at 8 or so.)

One interesting consequence of this change is that you may now access this blog securely if that's the sort of thing you are interested in.

Another change is that I've made the conscious decision to eliminate the comments section. While over the years I have been lucky to receive many interesting and engaging comments on my posts, I think that the best feedback I've gotten has always been from people writing in their own spaces. So I've mainly removed the temptation of the comment box to prevent the motivation to write something thoughtful from being burned away in a quick riposte.

If you would like to comment on something I've said here, now or in the future, just send me an email and I'll be happy to either post your comments inline, in a new post, or ideally, link to your own published response elsewhere.

Or, of course, ignore it entirely, as is my right as the lord of this particular digital fiefdom.

Sadly, this change makes some of the older posts imported to this blog read slightly oddly, as they invite you to "comment below", but nobody's been doing that for some time, so I think that overall this change is likely to provoke a better quality of discussion and use of time. The full indictment of how You Kids Today ruined the Internet with Your Horrible Comments, and Back In My Day Things Were Better And So Forth will have to wait for a future article. (Or you could go read any New York Times article with the word "millennials" in the title, as it will say basically the same thing.)

I hope you enjoy the new format.

21 Jan 2014 10:01am GMT