26 Nov 2013
The current project plan contains the list of milestones and tickets.
If there are any DNS enthusiasts / experts reading this, I'd appreciate your feedback on the plan and on the implementation (as it evolves). Please get in touch.
But the more eyes on the code the better, so please consider helping out with code reviews if you can. (I'm happy to trade reviews if you've got your own branch / patch waiting to be reviewed.)
I've been given a head start in this project by the patches contributed by Bob Novas and Phil Mayers, so thank you both. I hope you'll be able to monitor what I'm doing and steer me in the right direction.
Thanks also to Itamar who encouraged me to apply for the funding and to Tom Prince and everyone who helped me draft the proposal.
I'll be working on this at the Twisted Sprint in Bristol, UK on December 7th, where I'll be delighted to discuss the project and demonstrate what I've been up to.
Hope to see you there!
26 Nov 2013 1:30pm GMT
08 Nov 2013
On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 13.2!
The highlights of this release are:
- Twisted now includes a HostnameEndpoint implementation which uses IPv4 and IPv6 in parallel, speeding up the connection by using whichever connects first (the 'Happy Eyeballs'/RFC 6555 algorithm). (#4859)
- Improved support for Cancellable Deferreds by kaizhang, our GSoC student. (#4320, #6532, #6572, #6639)
- Improved Twisted.Mail documentation by shira, our Outreach Program for Women intern. (#6649, #6652)
- twistd now waits for the application to start successfully before exiting after daemonization. (#823)
- SSL server endpoint string descriptions now support the specification of chain certificates. (#6499)
- Over 70 closed tickets since 13.1.0.
For more information, check the NEWS file.
You can find the downloads on PyPI (or alternatively the Twisted Matrix Downloads page).
Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!
08 Nov 2013 8:18pm GMT
19 Oct 2013
It's been a busy 6 months since I first released Crochet, and now it's up to v1.0. Along the way I've expanded the documentation quite a bit and moved it to Sphinx, fixed a whole bunch of bug reports from users, added some new APIs and probably introduced some new bugs. What is Crochet, you ask?
Crochet is an MIT-licensed library that makes it easier for blocking or threaded applications like Flask or Django to use the Twisted networking framework. Crochet provides the following features:
- Runs Twisted's reactor in a thread it manages.
- The reactor shuts down automatically when the process' main thread finishes.
- Hooks up Twisted's log system to the Python standard library logging framework. Unlike Twisted's built-in logging bridge, this includes support for blocking Handler instances.
- A blocking API to eventual results (i.e. Deferred instances). This last feature can be used separately, so Crochet is also useful for normal Twisted applications that use threads.
Bugs and feature requests should be filed at the project Github page.
19 Oct 2013 3:57pm GMT
18 Oct 2013
I'm pleased to announce that there is a new git mirror hosted on twistedmatrix.com infrastructure.
git clone https://code.twistedmatrix.com/git/Twisted
For those that prefer bzr, the bzr mirror is also available as
bzr branch https://code.twistedmatrix.com/bzr/Twisted/trunk Twisted
The build bots are now using these mirrors for checking out code.
18 Oct 2013 5:11pm GMT
10 Oct 2013
Logitech is my brand of choice for input devices. Unfortunately, though, Logitech seems to focus on their unifying receiver for most of their stuff, to the detriment of their Bluetooth offering. Every now and then, they do come out with a nice Bluetooth device, usually targeting ultrabooks or tablets. Last month I stumbled upon the new Logitech Ultrathin Touch Mouse (t630). As usual, it is marketed for Windows compatibility, with Linux officially not supported. They do have a second model targeted to Mac users with the t631, but I suspect the only difference is its color.
Fortunately, this device mostly works fine on my Ubuntu 13.04 laptops. Plural, because this tiny mouse can be set up to pair with two devices, switchable with a switch on the bottom. The only problem is that, out-of-the-box, gnome-bluetooth cannot reconnect with the device when it has been powered down or switched back from the other channel. It turns out that Logitech might not be following standards, and requires re-pairing every time. In my search for similar cases, I found a bug report for another device that has had similar issues, and the solution presented there also works for the Ultrathin Touch Mouse.
The trick is to tell gnome-bluetooth to always send the pincode (0000, as usual) upon connecting. For this, it needs an entry in /usr/share/gnome-bluetooth/pin-code-database.xml like this:
<!-- Logitech Ultrathin Touch Mouse -->
<device oui="00:1F:20:" name="Ultrathin Touch Mouse" pin="0000"/>
I filed a bug report to have this included by default. After adding the entry, add the mouse as a new input device and it should work as expected.
On to the mouse's features. Besides detecting motion with its bottom laser, the surface is a touch pad that can be depressed as a whole. Pressing in the top left and top right corner will trigger the left and right mouse button events (button 1 and 3). To do a middle-click, you have to press in the center of the touch pad, not at the top middle, as you'd expect. Vertical and horizontal scrolling can be done with swipe gestures, respectively up/down and left/right. This will trigger buttons 4 through 7.
On top of that, there are some additional gestures, which Logitech has pictured in a handy overview. First, there is a two-finger left or right swipe for doing previous and next actions. In X11 this will trigger buttons 8 and 9, and Firefox, for example, will respond by moving back and forth in a tab's history. The other three gestures generate keyboard events, instead of the usual mouse events. A double-finger double-tap yields a press and release of the Super_L key. In Unity this brings up the dash home by default. Finally there are swipes from the left edge and from the right edge. The former triggers Ctrl_L Super_L Tab, which switches between the two last used tabs in Firefox, the latter Alt_L Super_L XF86TouchpadOff, which doesn't have a default action bound to it, as far as I can tell. Logitech also mentions the single-finger double tap, but that doesn't seem to register any event in the input handler.
The mouse can be charged via its micro-USB connector, also on the bottom, with a convenient short USB cable in the box. The micro-USB connector on that cable is also angled so the mouse doesn't have to be upright when charging. The battery state is reported to the kernel, but there is another bug in upower that will make batteries in bluetooth input devices show up as laptop batteries.
Having used the mouse for a few days now, I like it a lot. It is really tiny, but not in a bad way (for me). The two-finger swipe gestures are a bit tricky to get right, but I don't really use them anyway. I also tried hooking it up to my Nexus 7, and that works nicely. All-in-all a great little device, especially while travelling.
10 Oct 2013 12:00pm GMT
Kicking off the revival of this publication, I recently did a guest post on our use of Elasticsearch at Mailgun. Since I joined the Mailgun team at Rackspace in May, my primary project has been to reimplement the Mailgun customer logs so that we can serve billions of searchable events. Head over to the HackerNews page for some additional details.
10 Oct 2013 11:47am GMT
27 Sep 2013
Summer is over!
Well, at least the Google Summer of Code is over. I had a great time working on Twisted. Here is what I did:
My project, Deferred Cancellation, involves adding cancellation support, as far as possible, to Twisted APIs that return an uncancellable Deferred. This summer, I added cancellation support to 8 APIs.
When working on the project I found two bugs and fixed them:
- The "_loopFinshed" in TimerService.volatile should be changed to "_loopFinished" #6657
- twisted.names.test.test_dns.TestTCPController should override the messageReceived method. #6655
We also added a timeout implementation to Deferred, based on cancellation support. #5786
In total, there are 11 tickets, 4 of which have already been merged into trunk. I will keep tracking the rest of the tickets and continue to work on Twisted.
It's my first time working on an open source project, and I've had a great time. Thanks to all the people who helped me during this summer. I've learned so much from this project!
27 Sep 2013 4:46pm GMT
08 Sep 2013
My latest Twisted adventure began with a comment I came across in the Twisted Mail source code.
This seemed like a worthy problem to investigate so that, at the very least, I could write a ticket to track the issue.
The first challenge was to set up a smart host configuration with Twisted. A smart host is a mail server which accepts mail to any address and then determines the mail exchange for the address and connects to it to relay the mail. Unlike an open relay, a smart host imposes restrictions on the source of messages. While some may accept mail only from authenticated senders, Twisted's default is to relay any mail received over a Unix socket or from localhost.
It was easy enough to run a smart host on my development machine. I just had to invoke twistd mail with the relay option and specify a directory to hold messages to be relayed.
The smart host uses DNS to look up mail exchanges and contacts them via SMTP on port 25. Because my ISP does not allow outgoing traffic on port 25 and because I did not want to relay test messages to real mail servers, I needed to make some changes to the Twisted source so that the email messages would be relayed to a Twisted mail server that I ran on a second computer. I modified relaymanager.py to relay to port 8025 and to use a hosts file for DNS resolution.
The hosts file maps example.net to the IP address of the computer running the target mail server. I configured that server to run on the default port, 8025, and accept mail for a few users on the domains example.com and example.net.
When I used telnet on the development machine to send mail to the smart host running on the same machine and addressed it to one of the configured users on example.com or example.net, the smart host relayed it to the mail server on the second machine.
Now that I had a usable configuration, I wanted to explore the implications of the comment that RelayerMixin opened a large number of files and never closed them. RelayerMixin is used to introduce a set of functions for relaying mail to another class, a relayer, through inheritance. On initialization, the relayer calls one of the RelayerMixin functions, loadMessages, with a list of the pathnames of messages which it is responsible for relaying. loadMessages opens each message file and stores the file object in a list. I hypothesized that if I sent a lot of messages to the smart host at once, its relayers would open files for all the messages and hit the operating system limit for open files.
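The hypothesis can be illustrated with a minimal stdlib-only sketch. This is not Twisted's code: EagerLoader, LazyLoader and make_messages are hypothetical names invented here. The eager version holds one open file per queued message, as described above, while the lazy version only remembers pathnames and opens one file at a time.

```python
import os


class EagerLoader:
    """Hypothetical stand-in: opens every message file up front."""

    def __init__(self):
        self.messages = []

    def load_messages(self, paths):
        # One open file descriptor is held for each queued message.
        for path in paths:
            self.messages.append(open(path, "rb"))

    def close_all(self):
        for fobj in self.messages:
            fobj.close()
        self.messages = []


class LazyLoader:
    """Hypothetical alternative: keeps pathnames, opens one file at a time."""

    def __init__(self):
        self.paths = []

    def load_messages(self, paths):
        self.paths.extend(paths)

    def iter_contents(self):
        for path in self.paths:
            # The file is closed again before the next one is opened.
            with open(path, "rb") as fobj:
                yield fobj.read()


def make_messages(directory, count):
    """Write `count` small message files and return their pathnames."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, "msg-%d" % i)
        with open(path, "wb") as fobj:
            fobj.write(b"message %d" % i)
        paths.append(path)
    return paths
```

With the eager loader, the number of simultaneously open files grows with the queue length, which is what would eventually run into the operating system limit.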
I wrote a short program to send the SMTP commands for a series of messages to the smart host running on port 8025 of the same machine. The messages are randomly destined to one of two addresses on each of the two domains served by the mail server on the other machine.
As I increased the number of messages sent, I expected to eventually see an exception occur when too many files were opened but that did not occur no matter how many messages were sent. From the server log, I observed that instead of opening one connection to the mail server for each domain and sending all the queued messages for that domain, the smart host was repeatedly connecting to the mail server and sending no more than a few messages at a time. That explained why the limit on open files was not being reached. The relayers were being handed only a few messages at a time so there was no need to open a lot of files at once.
This strategy for allocating work to relayers did not seem very efficient so I started exploring further. SmartHostSMTPRelayingManager, which implements the smart host functionality, has a function, checkState, which is called periodically to see if there are messages waiting to be relayed and if there is capacity to create new relayers. If so, it calls _checkStateMX to create relayers and allocate messages to them. It turns out that _checkStateMX contains a subtle bug which is the cause of the allocation behavior.
_checkStateMX asks the relay queue for a list of waiting messages. Then it loops through the messages, grouping them by target domain. Eventually, each group will be handed off to a relayer. The problem is that _checkStateMX breaks out of the loop as soon as it has at least one message for the maximum number of domains it can concurrently contact. That value, maxConnections, is an optional parameter to SmartHostSMTPRelayingManager.__init__. Its default value is 2. As _checkStateMX loops through the waiting messages, it creates a list of messages for the first domain it sees and keeps adding messages for that domain to the list. When it sees a second domain, it creates another list for that domain but since it has hit the limit on connections, it breaks out of the loop. So, any other messages in the queue for either domain must wait to be sent even though they could be handled by the same relayers. Instead of breaking out of the loop when it reaches the connection limit, _checkStateMX should continue to add messages to the lists for the domains it has already seen and ignore messages for other domains.
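The difference between the two allocation strategies can be sketched in a few lines of plain Python. This is an illustration of the behaviour described above, not the Twisted implementation: the function names and the (domain, message) tuple representation of the queue are assumptions made for the example.

```python
from collections import OrderedDict


def group_with_early_break(messages, max_connections=2):
    """Mimic the described bug: stop scanning once enough domains are seen."""
    groups = OrderedDict()
    for domain, message in messages:
        groups.setdefault(domain, []).append(message)
        if len(groups) >= max_connections:
            break
    return groups


def group_without_break(messages, max_connections=2):
    """Suggested behaviour: keep filling known domains, skip new ones."""
    groups = OrderedDict()
    for domain, message in messages:
        if domain not in groups and len(groups) >= max_connections:
            # No free connection for another domain; leave it in the queue.
            continue
        groups.setdefault(domain, []).append(message)
    return groups


# Hypothetical queue: five waiting messages for two domains.
queue = [
    ("example.com", "m1"),
    ("example.com", "m2"),
    ("example.net", "m3"),
    ("example.com", "m4"),
    ("example.net", "m5"),
]
```

With this queue, the early-break version hands out only m1, m2 and m3, while the second version allocates all five messages to the same two relayers.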
With the understanding of how messages are allocated to relayers, I was now easily able to trigger an exception for too many open files by sending a large number of messages to one domain instead of splitting them between two.
As a result of this exploration, I filed and submitted fixes for two issue tickets: a defect ticket for the handling of open files by RelayerMixin, and an enhancement ticket to improve how messages are allocated to relayers.
08 Sep 2013 7:27pm GMT
30 Aug 2013
Programs must be written for people to read, and only incidentally for machines to execute. - Abelson & Sussman, Structure and Interpretation of Computer Programs
Code readability gets talked about a lot these days. I haven't yet heard from anyone who's opposed to it. Unfortunately, learning to read code is a skill rarely discussed and even more rarely taught. As the SICP quote above points out, readability is perhaps the most important quality to aim for when writing. But what is code readability, exactly?
I propose there are three essential levels of code readability:
- "What is this code about?"
- "What is this code supposed to do?"
- "What does this code do?"
The first level is important when you're skimming code, trying to develop a picture of its overall structure. Good organization into modules tends to help a lot with this; if you have modules named util then this is harder than it has to be. Supporting docs describing architecture and organization can assist with this as well, along with usage examples if the code you're reading is a library.
The second level is what you encounter once you've found some code and you want to use or modify it. Maybe it's a library with weak or missing documentation and you're trying to discover how a function wants to be called or which methods to override in a class you need to inherit from. Good style, good class/function names, docstrings, and comments can all be very helpful in making your code readable for this case. There's been some research which associates poor quality identifier names and obvious bugs, so even if your code works well, poor naming can make it look like code that doesn't.
However, neither of these is the most important sense in which readability matters. They're more about communicating intent, so that later readers of your code can figure out what you were thinking. Sometimes it's a way to share hard-won insight that leaves indelible scars. But the final level is what matters most and matters longest.
To understand a program you must become both the machine and the program. - Alan Perlis, Epigrams In Programming #23
The most important level of readability is being able to look at code and understand what it actually does when run. All the factors discussed above - naming, style, documentation - cannot help with this task at all. In fact, they can be actively harmful to this. Even if the comment accurately described the code when it was written, there's no reason it has to now. The most obvious time you'll need to engage in this level of code reading is debugging - when the code looks like it does the right thing, but actually doesn't.
Language design plays a big role in supporting or detracting from the creation of readable code. As the link above shows, C provides a myriad of features that either fight against or outright destroy readability. So when picking a language, include readability as a factor in your decision making - and not just how the syntax looks. Mark Miller provides some excellent thoughts on how to design a language for readability in his notes on The Power of Irrelevance. There are also several people studying how to build tools to help us read code more effectively; Clarity In Code touches on some of the issues being considered in that field.
But given the constraints of the language you're currently using, what can we do to improve readability in our code? The key is promoting local reasoning. The absolute worst case for readability occurs when you have to understand all the code to understand any of the code. So we want to preserve as many barriers between portions of the program as are needed to prevent this.
Since local reasoning is good, global state is bad. The more code that can affect the behavior of the function or class you're looking at right now, the more work it takes to actually discern what it'll do, and when. Similarly, threads (and coroutines) destroy readability, since they destroy the ability to understand control flow locally in a single piece of code. Other forms of "magic" like call stack inspection, adding behavior to a class from other modules, metaclass shenanigans, preprocessor hacks, or macros all detract from local reasoning as well.
Since this perspective on programming is so little discussed or taught, it leads to a communications gap between inexperienced and veteran programmers. Once an experienced programmer has spent enough time picking apart badly designed code, it can become the dominant factor in his assessment of all the code he sees. I've experienced this myself fairly often: code that could be easily made a little more readable makes me itch; code that can't easily be fixed that was written with no thought for readability can be quite upsetting. But writing readable code is extra work, so people who haven't spent dozens of hours staring at a debugger prompt are sometimes baffled by the strong emotions these situations inspire. Why is it such a big deal to make that variable global? It works, after all.
Every program has at least one bug and can be shortened by at least one instruction - from which, by induction, it is evident that every program can be reduced to one instruction that does not work. - Ken Arnold
When considering how to write readable code, choice of audience matters a lot. Who's going to read what you're writing? When? For writing prose, we do this all the time. We use quite different style in a chat message or email than in a blog post, and a different style again in a formal letter or article. The wider your audience and the longer the duration you expect the message to be relevant, the more work you put into style, clarity, and readability. The same applies to code. Are you writing a one-off script that's only a few dozen lines long? Using global variables and one-letter identifiers is probably not going to hurt you, because you will most likely delete the code rather than read it again. Writing a library for use in more than one program? Be very careful about using any global state at all. (Also pay special attention to the names you give your classes/modules/functions; they may be very difficult to change later.) If new programmers were taught this idea as well as they're taught how to, e.g., override methods in a class or invoke standard libraries, it would be a lot easier for both new programmers and those mentoring them to relax.
So when you do set out to write readable code, consider your audience. There are some obvious parties to consider. If you're writing a library, your users will certainly read some of your code, even if it's just examples of how to use your library. Anyone who wants to modify your code later will need to read it. If your code is packaged by an OS distribution or other software collection, the packager will often need to read parts of your code to see how it interacts with other elements of the system. Security auditors will want to read your code to know how it handles the authority it's granted or the secrets it protects. And not least, you're writing for your future self! So even if those other people don't exist or don't matter to you - make life easier on that last guy. He'll thank you for it.
30 Aug 2013 7:08pm GMT
27 Aug 2013
I have merged branches/documenta-4320 into trunk. Now the howto page of deferred has documentation about cancellation.
I have merged branches/timerservice-typo-6657 into trunk. Now twisted.application.internet.TimerService can be pickled correctly.
I have merged branches/deferredlist-cancellation-6639 into trunk.
I have finished the patch for ticket #6656 (twisted.internet.task.LoopingCall.start should return a cancellable Deferred) and submitted it for review.
I have revised the patch for ticket #5786 (Add timeout implementation to Deferred, based on cancellation support) according to the comments and submitted it for review.
Plan for the next week
In the next week, I will write an email to solicit opinions on raising an exception from cancel(). Also, I will revise my patches according to the comments.
27 Aug 2013 3:20pm GMT
21 Aug 2013
I hope you enjoy it!
21 Aug 2013 10:06am GMT
15 Aug 2013
One of the first Twisted mail issue tickets I tackled involved what seemed to be the simple matter of adding missing unit tests but led to some refactoring of Twisted code. The ticket highlights the missing unit tests for the exists functions of both the AbstractMaildirDomain and DomainQueuer classes.
AbstractMaildirDomain is a base class which is meant to be subclassed to create mail domains where the emails are stored in the Maildir format. Most of the functions it provides are placeholders that need to be overridden in subclasses. However, it does provide implementations for two functions, exists and startMessage. exists checks whether a user exists in the domain or an alias of it and, if so, returns a callable which returns a MaildirMessage to store a message for the user. Otherwise, it raises an exception. startMessage returns a MaildirMessage which stores a message for the user.
The existing unit tests for AbstractMaildirDomain were minimal, checking just that it fully implemented the IAliasableDomain interface. Additional test cases were needed to verify that:
- for a valid user, exists returns a callable which returns a MaildirMessage and that the callable returns distinct messages when called multiple times
- for an invalid user, exists raises an exception
AbstractMaildirDomain could not be used directly in testing exists and startMessage because both call a function which has only a placeholder in the base class. So for testing purposes, I created a subclass of AbstractMaildirDomain, TestMaildirDomain, which overrides the placeholder functions.
Since each test would need a TestMaildirDomain to exercise, I wrote a setUp function which runs prior to the test and creates a TestMaildirDomain as well as a temporary Maildir directory for it to use. I also added a tearDown function which runs after each test to remove the temporary directory.
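The setUp/tearDown pattern described here can be sketched with the standard library's unittest module. This is only an illustration under stated assumptions: FakeDomain is a hypothetical stand-in, not a Twisted class, and the original tests would have used Twisted's own test tooling rather than this exact code.

```python
import os
import shutil
import tempfile
import unittest


class FakeDomain:
    """Hypothetical stand-in for the domain under test; not a Twisted class."""

    def __init__(self, root):
        self.root = root

    def user_directory(self, name):
        # Create the per-user directory on first use.
        path = os.path.join(self.root, name)
        if not os.path.isdir(path):
            os.mkdir(path)
        return path


class FakeDomainTests(unittest.TestCase):
    def setUp(self):
        # Runs before every test: fresh temporary directory and domain.
        self.root = tempfile.mkdtemp()
        self.domain = FakeDomain(self.root)

    def tearDown(self):
        # Runs after every test: remove the temporary directory again.
        shutil.rmtree(self.root)

    def test_user_directory_created(self):
        path = self.domain.user_directory("alice")
        self.assertTrue(os.path.isdir(path))

    def test_directory_starts_empty(self):
        self.assertEqual(os.listdir(self.root), [])


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FakeDomainTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because each test gets its own directory, the tests cannot influence one another through leftover files.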
The tests for exists turned out to be pretty straightforward.
DomainQueuer acts as a domain, but instead of saving emails for users, it puts messages in a queue for relaying. Its exists and startMessage functions are meant to be used in the same way as those of AbstractMaildirDomain. It seemed like it would be an easy matter to adapt the unit tests for AbstractMaildirDomain to work for DomainQueuer.
It turned out, however, that the test for DomainQueuer failed because a function was being called on a variable set to None. Something clearly hadn't been properly configured for the test.
In the DomainQueuer code being exercised in the test, exists calls willRelay with the protocol which was passed into it as part of the User object. willRelay tries to call getPeer on the transport instance variable of the protocol. But the minimal User object passed to exists, which worked for the AbstractMaildirDomain test, is not sufficient for the DomainQueuer test, because willRelay needs the protocol to determine whether it should relay the message.
I had two thoughts about how to get around this problem, but I wasn't sure either of them was satisfactory. One option would be to provide the User object with a fake protocol and fake transport so that willRelay could be configured to return the desired value. That seemed to be a very roundabout way to solve the problem of getting willRelay to return a specific boolean value.
A more direct way to get willRelay to return the desired value would be to monkeypatch it. That is, as part of the unit test, to replace the DomainQueuer.willRelay function with one that returns the desired value. The problem with monkeypatching is that, even though willRelay is part of the public interface of DomainQueuer, it could change in the future. For example, it could receive additional optional arguments. Then, the unit test which used the monkeypatched version might not reflect the behavior of the real version of willRelay.
I got some great advice about how to approach this problem and unit testing in general on the Twisted developers IRC channel. The idea behind unit testing is to test each unit of code independently. Here we can consider exists and willRelay to be two different units. But the way willRelay is implemented means that it is difficult to test exists independent of it. I was advised that sometimes the best thing to do when adding unit tests is to refactor the code itself. So, I attempted to do that without changing the public interface of DomainQueuer.
I introduced a base class, AbstractRelayRules, whose purpose is to encapsulate the rules for determining whether a message should be relayed. Then, I defined a subclass, DomainQueuerRelayRules, which contains the default rules for the DomainQueuer.
In DomainQueuer, I changed the __init__ function to take a new optional parameter specifying the rules to be used to determine whether to relay. When relayRules is not provided, the default DomainQueuerRelayRules is used. I also changed the willRelay function to ask the relayRules whether to relay instead of determining that itself. Existing code which creates a DomainQueuer without providing the extra argument works in the exact same way as the old code.
Finally, I created another subclass of AbstractRelayRules to be used for test purposes. TestRelayRules can be configured with a boolean value which determines whether relaying is allowed or not.
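The shape of this refactoring can be sketched in plain Python. The class names follow the text above, but the bodies are simplified assumptions written for illustration and do not reproduce the Twisted source: in particular, the localhost check in the default rules, the LookupError, and the string returned by exists are placeholders.

```python
class AbstractRelayRules:
    """Encapsulates the decision of whether a message should be relayed."""

    def willRelay(self, address, protocol):
        raise NotImplementedError()


class DomainQueuerRelayRules(AbstractRelayRules):
    """Simplified default (assumption): relay only for localhost peers."""

    def willRelay(self, address, protocol):
        peer = protocol.transport.getPeer()
        return getattr(peer, "host", None) == "127.0.0.1"


class TestRelayRules(AbstractRelayRules):
    """Test double: always answers with the configured boolean."""

    def __init__(self, relay):
        self.relay = relay

    def willRelay(self, address, protocol):
        return self.relay


class DomainQueuer:
    def __init__(self, relayRules=None):
        # Old callers pass nothing and get the default rules.
        if relayRules is None:
            relayRules = DomainQueuerRelayRules()
        self.relayRules = relayRules

    def willRelay(self, address, protocol):
        # Delegate the decision instead of inspecting the protocol here.
        return self.relayRules.willRelay(address, protocol)

    def exists(self, user):
        # `user` is assumed to expose `dest` and `protocol` attributes.
        if self.willRelay(user.dest, user.protocol):
            return lambda: "queued message for %s" % (user.dest,)
        raise LookupError(user.dest)
```

With the rules injected, a test for exists no longer needs a fake protocol or transport: passing TestRelayRules(True) or TestRelayRules(False) is enough to steer it down either branch.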
Now, the unit tests for DomainQueuer using TestRelayRules are quite simple.
Refactoring to decouple the rules for relaying messages from the DomainQueuer certainly made the unit testing code much cleaner. As a general matter, difficulty in writing unit tests may highlight dependencies in the code which should be refactored.
15 Aug 2013 11:04am GMT
12 Aug 2013
In the last week, I revised my patch for ticket 6657 (The "_loopFinshed" in TimerService.volatile should be changed to "_loopFinished"). Now instead of testing the implementation detail, we test that TimerService is pickleable by testing the pickle result.
I also revised my patch for ticket 6656 (twisted.internet.task.LoopingCall.start should return a cancellable Deferred). I added a utility method to LoopingCall so that reset() can share the code. Then I added a test for cancellation after stop() is called.
After revising my patches, I looked at ticket 4632 (ability to cascade canceling inlineCallbacks's deferred). inlineCallbacks helps one write Deferred-using code that looks like a regular sequential function. When one calls anything that results in a Deferred, one can simply yield it, and the generator will automatically be resumed when the Deferred's result is available. Ticket 4632 aims to add the ability to cascade cancellation through inlineCallbacks. The work is almost done by tracktor and glyph. Some docstrings are needed, and the code needs to be revised to meet the coding standard.
I also began to work on ticket 4320 (Deferred cancellation documentation). I have added some documentation about how to use Deferred cancellation from itamar's blog posts to the howto documentation. Because there is already some documentation in the branch, I need to make some adjustments before I submit it for review.
Plan for the next week
In the next week, I will continue to revise my patches according to the comments and merge the branches that have passed review into trunk. I will also try to find a new ticket to work on.
12 Aug 2013 12:35pm GMT
05 Aug 2013
In the last week, I submitted my patch for ticket 6644. When working on it, I found a bug in twisted.names.test.test_dns.TestTCPController.
twisted.names.test.test_dns.TestTCPController pretends to be a DNS query processor for a twisted.names.dns.DNSProtocol. It inherits from twisted.names.test.test_dns.TestController, a testing DNS query processor for a twisted.names.dns.DNSDatagramProtocol. However, the messageReceived method for a DNSProtocol is different from the method for a DNSDatagramProtocol. There should be no addr parameter in the messageReceived method for a DNSProtocol, so TestTCPController should override the messageReceived method.
I have written a patch for it and submitted the patch for review (ticket 6655).
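The mismatch can be shown with a small stand-alone sketch. The class names mirror those in the text, but the bodies are assumptions made for illustration (BrokenTCPController and deliver_stream_message are invented here), and the calling conventions are taken from the description above: the datagram protocol passes an addr argument, the stream protocol does not.

```python
class TestController:
    """Records messages as a datagram-style protocol would deliver them."""

    def __init__(self):
        self.messages = []

    def messageReceived(self, msg, proto, addr):
        self.messages.append((msg, proto, addr))


class BrokenTCPController(TestController):
    """Hypothetical: inherits the three-argument method unchanged."""


class TestTCPController(TestController):
    """Overrides messageReceived to match the stream-style call."""

    def messageReceived(self, msg, proto):
        # No addr parameter: a stream connection has a single peer.
        self.messages.append((msg, proto))


def deliver_stream_message(controller, msg, proto):
    """Call the controller the way a stream protocol is assumed to."""
    controller.messageReceived(msg, proto)
```

Delivering a message to BrokenTCPController through deliver_stream_message raises a TypeError for the missing addr argument, while TestTCPController accepts it.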
I also finished my patch for adding cancellation support for the Deferred returned by twisted.internet.task.LoopingCall.start (ticket 6656). Calling twisted.internet.task.LoopingCall.start will start running a target function every interval seconds. It returns a Deferred whose callback will be invoked when self.stop is called, or whose errback will be invoked when the function raises an exception or returns a Deferred that has its errback invoked. If a call of the target function hasn't returned yet when the Deferred is cancelled, we should cancel the running call. If a call of the target function is scheduled when the Deferred is cancelled, we should cancel the scheduled call. The running flag should be reset as well.
The patch is finished. However, due to a bug in twisted.application.internet.TimerService, the new patch won't pass the test. More specifically, twisted.test.test_application.TestInternet2.testPickledTimer will fail.
The bug is due to a typo. The "_loopFinshed" in TimerService.volatile should be "_loopFinished". So when pickling a TimerService it will actually try to pickle _loopFinished (the Deferred returned by LoopingCall.start), which it shouldn't. In the patch for ticket 6656, a reference to LoopingCall is added to the Deferred returned by LoopingCall.start. So when trying to pickle _loopFinished, it will try to pickle an instance of LoopingCall, which is unpickleable. This makes the test fail.
I've submitted a patch to fix this bug(ticket 6657).
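To show how a one-letter typo can have this effect, here is a small standard-library sketch. It is not TimerService: SketchService and can_pickle are hypothetical names, and the sketch assumes that a volatile list names attributes to leave out of the pickled state. A threading.Lock stands in for the unpickleable object.

```python
import pickle
import threading


class SketchService:
    """Hypothetical service that drops 'volatile' attributes when pickled."""

    volatile = ["_loopFinshed"]  # misspelled, as in the bug described above

    def __init__(self):
        self.step = 1
        # A lock stands in for any attribute that cannot be pickled.
        self._loopFinished = threading.Lock()

    def __getstate__(self):
        state = self.__dict__.copy()
        for name in self.volatile:
            state.pop(name, None)  # a misspelled name removes nothing
        return state


class FixedSketchService(SketchService):
    volatile = ["_loopFinished"]  # corrected spelling


def can_pickle(state):
    """Report whether a state dictionary survives pickle.dumps."""
    try:
        pickle.dumps(state)
    except (TypeError, pickle.PicklingError):
        return False
    return True


# The state dictionaries are pickled directly so that the sketch does not
# depend on how the surrounding module is loaded.
broken_state = SketchService().__getstate__()
fixed_state = FixedSketchService().__getstate__()
```

With the misspelled name, the lock stays in the state and pickling fails. With the corrected name, only step remains and pickling succeeds.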
Plan for the next week
In the next week, I will revise all my patches according to the comments. I will also look at ticket 4632 and make it ready for review.
05 Aug 2013 11:15am GMT
01 Aug 2013
Much of the work of documenting the Twisted Mail API has involved searching through the Python code to determine the types for parameters and return values. It often involves comparing functions in different classes which inherit from the same base class or implement the same interface. In some cases, I've resorted to looking at unit tests or example code to see how objects are used. After a recent experience while tracking down types, I'm more convinced than ever of the value of the API documentation.
I was documenting the alias module, which contains classes for redirecting mail from one user to another user, to a file, to a process, and to a group of aliases. Four different classes inherit from the base class AliasBase and implement the interface IAlias, which contains the function createMessageReceiver. The class hierarchy looks like this:
twisted.mail.alias.AliasBase
    twisted.mail.alias.AddressAlias
    twisted.mail.alias.AliasGroup
    twisted.mail.alias.FileAlias
    twisted.mail.alias.ProcessAlias
I was trying to determine the return value of IAlias.createMessageReceiver. The return value was clear for three of the four classes that implement IAlias because the object to be returned was created in the return statement.
FileAlias -> FileWrapper
ProcessAlias -> MessageWrapper
AliasGroup -> MultiWrapper
The objects returned are all message receivers which implement the smtp.IMessage interface. They deliver a message to the appropriate place: a file, a process or a group of message receivers. It seemed pretty clear that the return value of the createMessageReceiver function in the IAlias interface should be smtp.IMessage. However, there was one more class that implemented the interface, AddressAlias, and the return value from that wasn't so clear.
AddressAlias.createMessageReceiver returns the result of a call to exists on the result of a call to domain. domain is a base class function which returns an object which implements the IDomain interface. Fortunately, the IDomain interface was documented: exists returns a callable which takes no arguments and returns an object implementing IMessage. Unfortunately, this return value didn't match the pattern of the other three classes implementing IAlias.createMessageReceiver, all of which return an object implementing IMessage.
Although messy, it was possible that the return value of IAlias.createMessageReceiver was either an smtp.IMessage provider or a callable which takes no arguments and returns an smtp.IMessage provider. Or, it might have been a mistake.
At this point, I fortuitously happened to be looking at this code in an old branch and noticed a difference. There, the AddressAlias.createMessageReceiver function appeared as follows:
After some investigation, I found a ticket that had been fixed earlier this year to remove calls to the deprecated IDomain.startMessage function. In the old code, startMessage also returns an IMessage provider. So, it seemed that a bug had been introduced in the switch from startMessage to exists. The result of the call to exists must be invoked to get the proper message receiver to return. The code should read:
I filed a ticket in the issue tracking system and subsequently submitted a fix. While reworking the unit tests, I relied heavily on the API documentation I had written for the alias module. I think it's safe to say that had the API been fully documented when the original change was made, this error would have been easy to spot during code review or to avoid in the first place.
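The difference between the two return values can be illustrated with a self-contained sketch. It does not reproduce the Twisted Mail code: SketchDomain, SketchMessage and both helper functions are hypothetical names, and the only behaviour assumed is the one described above, namely that exists returns a callable which takes no arguments and returns a message receiver.

```python
class SketchMessage:
    """Hypothetical stand-in for an object providing the receiver interface."""

    def __init__(self, user):
        self.user = user


class SketchDomain:
    """Hypothetical domain whose exists() returns a no-argument callable."""

    def exists(self, user):
        # The caller must invoke the returned callable to get the receiver.
        return lambda: SketchMessage(user)


def create_receiver_buggy(domain, user):
    # Returns the callable itself, unlike the other alias classes.
    return domain.exists(user)


def create_receiver_fixed(domain, user):
    # Invokes the callable, so a message receiver is returned.
    return domain.exists(user)()


domain = SketchDomain()
buggy = create_receiver_buggy(domain, "alice")
fixed = create_receiver_fixed(domain, "alice")
```

Here buggy is a callable rather than a message receiver, while fixed is a SketchMessage. A documented return type on the interface makes that mismatch visible at review time.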
01 Aug 2013 6:42pm GMT
31 Jul 2013
Let me start with the good stuff. First of all, I think it's great that we have yet another asynchronous contender in the Python world. Every time something like this comes out, it means that Twisted has to fight that much less hard to get over the huge hump of event-driven programming being too hard, or too weird, or whatever. It's good to have an endorsement of the general message "if you need a web server to handle COMET requests, it needs to be asynchronous to perform acceptably" from such a high-profile company as Facebook.
Unfortunately I think the larger picture here is a failure of communication in the open source community. In the course of developing Tornado, there are several things that FriendFeed could have done to move the Twisted community forward, at no cost to themselves. I don't want to rag on FriendFeed, or Bret Taylor, or Facebook here; they're not the first to re-write something without communicating. In fact I recently had almost this exact same discussion with another project that did the same thing. Since Tornado is such a high-profile example, though, I want to draw attention to the problem so that there's some hope that maybe the next project won't forget to communicate first.
My main point here is that if you're about to undergo a re-write of a major project because it didn't meet some requirements that you had, please tell the project that you are rewriting what you are doing. In the best case scenario, someone involved with that project will say, "Oh, you've misunderstood the documentation, actually it does do that". In the worst case, you go ahead with your rewrite anyway, but there is some hope that you might be able to cooperate in the future, as the project gradually evolves to meet your requirements. Somewhere in the middle, you might be able to contribute a few small fixes rather than re-implementing the whole thing and maintaining it yourself.
This is especially important if you are later going to make claims about that project not living up to your vaguely-described requirements, and thereby damage its reputation. Bret Taylor claims in his blog:
We ended up writing our own web server and framework after looking at existing servers and tools like Twisted because none matched both our performance requirements and our ease-of-use requirements.
First and foremost, it would have been great to hear from Bret when he started off using Twisted about any performance problems or ease-of-use problems. I'm guessing that Twisted itself had only ease-of-use problems, and other "tools like Twisted" were the ones with performance problems, since later, in a comment on the same post, he says:
I can't imagine there is much of a performance difference [between Twisted Web and Tornado]. The bottom is not that complex in my opinion.
It would also be great if he had explicitly said that Twisted didn't have performance problems rather than making me guess, because I'm sure that is what lots of developers will take away from this. When you have the bully pulpit, off-the-cuff comments like this can do serious damage to smaller projects.
Later, in yet another comment, Bret points out the root problem:
... the HTTP/web support in Twisted is very chaotic (see http://twistedmatrix.com/trac/wiki/WebDevelopme... - even they acknowledge this)...
This is true. However, as I frequently like to note, Twisted is starved for resources. Reconciling the chaos described on the page about web development with Twisted is an ongoing process. For a tiny fraction of the effort invested in Tornado, FriendFeed could have worked with us to resolve many of the issues creating that chaos.
This is the main thing I want to reinforce here. If half a dozen occasional contributors with a real focused interest in web development showed up to help us on Twisted, we'd have an awesome, polished web story within a few months. If even one person really took responsibility for twisted.web, things would pick up. But if everyone who wants an asynchronous webserver either uses twisted.web (because it's great!) without talking to us or decides not to use it (because it doesn't meet their unstated requirements) without talking to us, it's going to continue to improve at the same sluggish pace.
Even at the current rate, by the time we have an excellent HTTP story, I somehow doubt that Tornado will have a good SSHv2 protocol story ;-).
In his comment, Bret also takes a couple of pot-shots at Twisted that I think are unnecessary, and I'd like to address those too.
In general, it seems like Twisted is full of demo-quality stuff, but most of the protocols have tons of bugs.
We're not talking about "most" of the protocols here; Tornado is only concerned with HTTP. And the HTTP implementation(s) in Twisted do not have "tons of bugs". They are production quality, used on lots of different websites, and have lots of automated tests. While much of the code in twisted.web doesn't have complete test coverage, since it's old enough to predate our testing requirements, I note that Tornado appears to have zero test coverage.
There's a kernel of truth here - some of the older, less frequently used protocols have a few problems - but in most cases the "bugs" are really just a lack of functionality. Twisted overall has very few protocol-related bugs, and again, our test policy makes sure that new bugs are introduced very rarely.
Given all those factors, it didn't seem to provide a lot of value. Our core I/O loop is actually pretty small and simple, and I think resulted in fewer bugs than would have come up if we had used Twisted.
I must respectfully disagree. Again, I don't want to rag on FriendFeed here, but here are several features that Tornado would have, and bugs that it wouldn't have, if it used Twisted for the event loop and none of the HTTP stuff:
- EINTR wouldn't cause your application to exit if run in a non-US-english locale.
- You wouldn't have the opportunity to forget to set a socket to be non-blocking and thereby make your entire application stop.
- It would be possible to run your application on Windows.
- Firewalled connections and running out of file descriptors wouldn't cause your server to spew errors forever (at least, it won't any more).
- You could write a TCP client that didn't block for an arbitrary amount of time in connect().
- Finally, of course, you could use all of Twisted's other protocols, client and server: IMAP, POP, SMTP, IRC, AIM, etc. You could also use external protocol implementations like Thrift.
- You could spawn asynchronous subprocesses.
This list is a great example of why projects like Tornado really should use Twisted. Tornado implements some innovative web-framework stuff, but absolutely nothing interesting that I can see at the level of async I/O. Using Twisted would have allowed them to focus exclusively on cool web things and left the never-ending stream of incremental surprising platform-specific, only-happens-in-weird-situations bugfixes to a single, common source.
What To Do Now
I hope that someone at FriendFeed will be a little heavier on detail and a little lighter on FUD in some future conversation about Twisted. However, I'm sure they're going to have their hands full maintaining their own code, so I don't have high expectations in this area. I'm sure Bret wasn't intentionally slamming Twisted, either; it wasn't like he wrote a big screed about it, he just dropped a few unsubstantiated comments into a much larger post about Tornado. So I just want to be clear: I don't have sore feelings, and I don't need anybody to apologize to me or to Twisted.
If any of you out there are fans of both Tornado and of Twisted, it would be great if you could contribute a patch to Tornado which would allow it to at least optionally use Twisted as an I/O back-end. It would be great, of course, if lots of people interested in web stuff would help us out with our web situation, but supporting the Twisted event loop would be good regardless. It would mean that when people wanted to speak multiple protocols, they wouldn't need to re-write or kludge in their existing Tornado application, so it would increase the chances that we could get some help with our SSH, FTP, IRC, or XMPP code instead. It would also open up a much wider multi-protocol landscape to users of Tornado, even if Tornado's default mode of operation still used ioloop.py.
Even better would be to hook up something that made a Tornado IResource implementation, so that Tornado applications and twisted.web and Nevow applications could all be seamlessly integrated into one server.
The whole point of Twisted is to have a common I/O layer that lots of different libraries can use, share, and build on, so that we can solidify the common and highly complex abstraction required of a comprehensive, cross-platform, event-driven I/O layer. In order to realize that vision, we need help not just with the code; we need more Twisted ambassadors to go out into the community and help us integrate these disparate applications, help us find out where real users are finding the documentation inadequate or the organization confusing.
Tornado could be an excellent opportunity for those ambassadors to go out and introduce others to the wonders of Twisted, because its endorsement from FriendFeed guarantees it an audience of tens of thousands of developers, at least for its first few months of life. If you've shied away from contributing to Twisted itself because of our aggressive testing and documentation requirements, well, Tornado apparently doesn't have any, so it would be a great place for you to start :).
31 Jul 2013 11:28pm GMT