27 Jun 2015

feedPlanet Twisted

Moshe Zadka: ncolony 0.0.2 released

Fixed:

Mostly internal cleanup release: running scripts is now nicer (all through "python -m ncolony <command>"), added Code of Conduct, releasing based on versioneer, cleaned up tox.ini, added HTTP healthchecker.

Available via PyPI and GitHub!

27 Jun 2015 4:38am GMT

26 Jun 2015

Moshe Zadka: mainland: the main of your Python

I don't like console-scripts. Among what I dislike is their magicality and the actual code produced. I mean, the autogenerated Python looks like:

#!/home/moshez/src/mainland/build/toxenv/bin/python2

# -*- coding: utf-8 -*-
import re
import sys

from tox import cmdline

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(cmdline())

Notice the encoding header on an ASCII-only file, and the weird Windows-compatibility code generated alongside a UNIX-style shebang line that hard-codes the interpreter's path. Truly, a file that only its generator could love.

Then again, I don't much like what we have in Twisted, with the preamble:

#!/home/moshez/src/ncolony/build/env/bin/python2
# Copyright (c) Twisted Matrix Laboratories.
# See LICENSE for details.
import os, sys

try:
    import _preamble
except ImportError:
    sys.exc_clear()

sys.path.insert(0, os.path.abspath(os.getcwd()))

from twisted.scripts.twistd import run
run()

This time, it's not auto-generated, it is artisanal. What it lacks in codegen ugliness it makes up in importing an under-module at top-level, and who doesn't like a little bit of sys.path games?

I will note, furthermore, that neither of these options would play nice with something like pex. Worst of all, all of these options, regardless of their internal beauty (or horrible lack thereof), must be named veeeeery caaaaarefully to avoid collision with existing UNIX, Mac or Windows commands. Think this is easy? My Ubuntu has two commands called "chromium-browser" and "chromium-bsu" because Google didn't check what popular games there are on Ubuntu before naming their browser.

Enter the best thing about Python, "-m" which allows executing modules. What "-m" gains in awesomeness, it loses in the horribleness of the two-named-module, where a module is executed twice, once resulting in sys.modules['__main__'] and once in sys.modules[real_name], with hilarious consequences for class hierarchies, exceptions and other things relying on identity of defined things.

Luckily, packages will automatically execute "package.__main__" if "python -m package" is called. Ideally, nobody would try to import __main__, but this means packages can contain only one command. Enter 'mainland', which defines a main which will, carefully and accurately, dispatch to the right module's main without any code generation. It has not been released yet, but is available from GitHub. Pull requests and issues are happily accepted!
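To make the dispatch idea concrete, here is a minimal sketch of a package-level main that looks up a subcommand and calls that module's main. It is not mainland's actual API: the dispatch function, the commands mapping, and the convention that every command module exposes main(argv) are all assumptions made for illustration.

```python
import importlib


def dispatch(argv, commands):
    """Call main(argv) in the module registered for the subcommand argv[1].

    ``commands`` maps a subcommand name to a dotted module name. Both the
    mapping and the ``main(argv)`` convention are illustrative assumptions.
    """
    if len(argv) < 2 or argv[1] not in commands:
        names = ", ".join(sorted(commands))
        raise SystemExit("usage: %s <command>; commands: %s" % (argv[0], names))
    # Import by real name, so the module is looked up in sys.modules under
    # that name and is never executed a second time as __main__.
    module = importlib.import_module(commands[argv[1]])
    # Hand the subcommand its own argv, with the program name adjusted.
    return module.main(["%s %s" % (argv[0], argv[1])] + list(argv[2:]))
```

A package's __main__.py could then contain little more than a call such as sys.exit(dispatch(sys.argv, COMMANDS)), so that "python -m package command" runs the right module under its real name.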

Edit: Released! 0.0.1 is up on PyPI and on GitHub

26 Jun 2015 3:13am GMT

23 Jun 2015

Moshe Zadka: On Codes of Conducts and “Protection”

Related: In Favor of Niceness, Community, and Civilization

I've seen elsewhere (thankfully, not my project, and no, I won't link to it) that people want the code of conduct to "protect contributors from 3rd parties as well as 3rd parties from contributors". I would like to first suggest the idea that this is…I don't even know. The US government has literal nuclear bombs, as well as aircraft carriers, and it cannot enforce its law everywhere. The idea that an underfunded OSS project can do literally anything to 3rd parties is ludicrous. The only enforcement mechanism any project can have is "you can't contribute here" (which, by necessity, only applies to those who wanted to contribute in the first place).

So why would a bunch of people take on a code of conduct that will only limit them?

Because, to quote the related article above, "the death cry of liberalism is not 'death to the unbelievers', it is 'if you're nice, you can join our cuddle pile'." Perhaps describing open source projects as "cuddle piles" is somewhat of an exaggeration, but the point remains. A code of conduct defines what "nice enough" is. The cuddle piles? Those conquer the world. WebKit is powering both Android and iPhone browsers, for example, making open source be, literally, the way people see the online world.

Adopting a code of conduct that says that we do not harass, we do not retaliate and we accept all people is a powerful statement. This is how we keep our own garden clear of the pests of prejudice, of hatred. Untended gardens end up wilderness. Well-tended gardens grow until we need to keep a fence around the wild and call it a "preserve".

When I first saw the Rust code of conduct I thought "cool, but why does an open source project need a code of conduct"? Now I wonder how any open source project will survive without one. It will have to survive without me - in three years, I commit to not contribute to any project without a code of conduct, and I hope others will follow. Any project that does not have a CoC, in the interim, better behave as though it has one.

Twisted is working hard on adopting a code of conduct, and I will check one into NColony soon (a simpler one, appropriate to what is, so far, a one-person show).

23 Jun 2015 4:22am GMT

18 Jun 2015

Moshe Zadka: Everything but the code

(Or, "So you want to build a new open-source Python project")

This is not going to be a "here is a list of options, but you do what is right for you" pandering post. This is going to be a "this is 2015, there are right ways to do things, and here they are" post. This is going to be an opinionated, in-your-face, THIS IS BEST post. That said, if you think that anything here does not represent the best practices in 2015, please do leave a comment. Also, if you have specific opinions on most of these, you're already ahead of the curve - the main target audience are people new to creating open-source Python projects, who could use a clear guide.

This post will also emphasize the new project. It is not worth it, necessarily, to switch to these things in an existing project - certainly not as a "whole-sale, stop the world and let's change" thing. But when starting a new project with zero legacy, there is no reason to do things wrong.

tl;dr: Use GitHub, put MIT license in "LICENSE", README.rst with badges, use py.test and coverage, flake8 for static checking, tox to run tests, package with setuptools, document with sphinx on RTD, Travis CI/Coveralls for continuous integration, SemVer and versioneer for versioning, support Py2+Py3+PyPy, avoid C extensions, avoid requirements.txt.

Where

When publishing kids' photos, you do it on Facebook, because that's where everybody is. LinkedIn is where you connect with colleagues and lead your professional life. When publishing projects, put them on GitHub, for exactly the same reason. Have GitHub pull requests be the way contributors propose changes. (There is no need to use the "merge" button on GitHub - it is fine to merge changes via git and push. But the official "how do I submit a change" should be "pull request", because that's what people know).

License

Put the license in a file called "LICENSE" at the root. If you do not have a specific reason to choose otherwise, MIT is reasonably permissive and compatible. Otherwise, use something like the license chooser and remember the three most important rules:

At the end of the license file, you can have a list of the contributors. This is an easy place to credit them. It is a good idea to ask people who send in pull requests to add themselves to the contributor list in their first one (this allows them to spell their name and e-mail exactly the way they want to).

Note that if you use the GPL or LGPL, they will recommend putting it in a file called "COPYING". Put it in "LICENSE" (the licenses explicitly allow it as an option, and it makes it easier for people to find the license if it always has the same name).

README

The GitHub default is README.md, but README.rst (reStructuredText) is perfectly supported via Sphinx, and is a better place to put Python-related documentation, because reST plays better with Pythonic toolchains. It is highly encouraged to put badges on top of the document to link to CI status (usually Travis), ReadTheDocs and PyPI.

Testing

There are several reasonably good test runners. If there is no clear reason to choose one, py.test is a good default. "Using Twisted" is a good reason to choose trial. Using the built-in unittest runner is not a good option - there is a reason the cottage industry of "test runner" evolved. Using coverage is a no-brainer. It is good to run some functional tests too. Test runners should be able to help with this too, but even writing a Python program that fails if things are not working can be useful.

Distribute your tests alongside your code, by putting them under a subpackage called "tests" of the main package. This allows people who "pip install …" to run the tests, which means sending you bug reports is a lot easier.

Static checking

There are a lot of tools for static checking of Python programs - pylint, flake8 and more. Use at least one. Using more is not completely free (more ways to have to say "ignore this, this is ok") but can be useful to catch more style and static issues. At worst, if there are local conventions that are not easily plugged into these checkers, write a Python program that will check for them and fail if those are violated.
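As a rough illustration of such a home-grown checker, the sketch below flags three example conventions (no tabs, no trailing whitespace, a maximum line length). The conventions themselves, the function names and the 79-character limit are assumptions chosen for the example, not rules from any particular project.

```python
def find_violations(source, max_length=79):
    """Return (line_number, message) pairs for each broken local convention."""
    violations = []
    for number, line in enumerate(source.splitlines(), start=1):
        if "\t" in line:
            violations.append((number, "tab character"))
        if line != line.rstrip():
            violations.append((number, "trailing whitespace"))
        if len(line) > max_length:
            violations.append((number, "line longer than %d" % max_length))
    return violations


def check(source):
    """Exit non-zero when any convention is violated, so tox or CI fails."""
    problems = find_violations(source)
    for number, message in problems:
        print("line %d: %s" % (number, message))
    if problems:
        raise SystemExit(1)
```

Because check() exits with a non-zero status on any violation, it can be run over each source file from the same place the other static checks are run.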

Meta testing

Use tox. Put tox.ini at the root of your project, and make sure that "tox" (with no arguments) works and runs your entire test-suite. All unit tests, functional tests and static checks should be run using tox. It is not a bad idea to write a tox clause that builds and tests an installed wheel. This will require including all test code in the deployed package, which is a good idea.

Set tox to put all build artifacts in a build/ top-level directory.

Packaging

Have a setup.py file that uses setuptools. Tox will need it anyway to work correctly.

Structure

It is unlikely that you have a good reason to take more than one top-level name in the package namespace. Barring completely unavoidable name conflicts, your PyPI package name should be the same as your Python package name should be the same as your GitHub project. Your Python package should live at the top-level, not under "src/" or "py/".

Documentation

Use sphinx for prose documentation. Put it in doc/ with a relevant conf.py. Use either pydoctor or sphinx autodoc for API docs. "Pydoctor" has the potential for nicer docs, sphinx is well integrated with ReadTheDocs. Configure ReadTheDocs to auto-build the documentation on every check-in.

Continuous Integration

If you enjoy owning your own machines, or platform diversity in testing really matters, use buildbot. Otherwise, take advantage of free Travis CI and configure your project with a .travis.yml that breaks your tox tests into one test per Travis clause. Integrate with coveralls to have coverage monitored.

Version management

Use SemVer. Take advantage of versioneer to help you manage it.
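For readers new to SemVer, the numbering rules can be sketched in a few lines. This is only an illustration of the MAJOR.MINOR.PATCH rules: the bump function is hypothetical, it ignores pre-release and build suffixes, and it is unrelated to how versioneer itself derives a version from git tags.

```python
def bump(version, part):
    """Return the next SemVer version string after bumping ``part``."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":      # incompatible API change
        return "%d.0.0" % (major + 1)
    if part == "minor":      # backwards-compatible new functionality
        return "%d.%d.0" % (major, minor + 1)
    if part == "patch":      # backwards-compatible bug fix
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("unknown part: %r" % (part,))
```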

Release

A full run of "tox" should leave in its wake tested .zip and .whl files. A successful, post-tag run of tox, combined with versioneer, should leave behind tested .zip and .whl. The release script could be as simple as "tox && git tag $1 && (tox || (git tag -d $1; exit 1)) && cp …whl and zip locations… dist/"

GPG sign dist/ files, and then use "twine" to upload them to PyPI. Make sure to upload to TestPyPI first, and verify the upload, before uploading to PyPI. Twine is a great tool, but badly documented - among other things, it is hard to find information about .pypirc. ".pypirc" is an ini file, which needs to have the following sections:

.gitignore

Python versions

If all your dependencies support Python 2 and 3, support Python 2 and 3. That will almost certainly require using "six" (or one of its competitors, like "future"). Run your unit tests under both Python 2 and 3. Make sure to run your unit tests under PyPy, as well.

C extensions

Avoid, if possible. Certainly do not use C extensions for performance improvements before (1) making sure they're needed (2) making sure they're helpful (3) trying other performance improvements. Ideally structure your C extensions to be optional, and fall back to a slow(er) Python implementation if they are unavailable. If they speed up something more general than your specific needs, consider breaking them out into a C-only project which your Python will depend on.
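The "optional, with a slow(er) Python fallback" structure usually comes down to a guarded import. The sketch below assumes a hypothetical compiled module named _myproject_speedups exporting a checksum function; neither name refers to a real package.

```python
try:
    # Hypothetical compiled module; the name is an assumption for this sketch.
    from _myproject_speedups import checksum
except ImportError:
    def checksum(data):
        """Pure-Python fallback: sum of the byte values, modulo 65536."""
        total = 0
        for byte in bytearray(data):
            total = (total + byte) % 65536
        return total
```

Callers import checksum from one place and never need to know which implementation they got, which also makes it easy to run the same tests against both.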

If using C extensions, regardless of whether to improve performance or integrate with 3rd party libraries, use CFFI.

If C extensions have successfully been avoided, and Python 3 compatibility kept, build universal wheels.

requirements.txt

The only good "requirements.txt" file is a non-existing one. The "setup.py" file should have the dependencies (ideally as weak-versioned as possible, usually just a ">=" for a library that tends not to break backwards compatibility a lot). Tox will maintain the virtualenv needed based on the things in the tox file, and if needing to test against specific versions, this is where specific versions belong. The "requirements.txt" file belongs in Salt-style (Chef, Fab, …) configurations, Docker-style (Vagrant-, etc.) configurations or Pants-style (Buck-, etc.) build scripts when building a Pex. This is the "deployment configuration", and needs to be decided by a deployer.

If your package has dependencies, they belong in a setup.py. Use extras_require for test-only dependencies. Lists of test dependencies, and reproducible tests, can be configured in tox.ini. Tox will take care of pip-installing relevant packages in a virtualenv when running the tests.
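A sketch of what this might look like in practice, using setuptools' install_requires and extras_require keywords. The project name, version and dependency names are hypothetical, and the actual setup() call is left as a comment so that the snippet can be read and run on its own.

```python
# Sketch of the keyword arguments a setup.py might pass to setuptools.setup().
# The project name, version and dependency names are hypothetical.
SETUP_KWARGS = dict(
    name="myproject",
    version="0.0.1",
    packages=["myproject", "myproject.tests"],
    # Runtime dependencies, versioned as loosely as is safe.
    install_requires=["six>=1.9"],
    # Optional extras: installed with `pip install myproject[test]`.
    extras_require={"test": ["pytest", "coverage"]},
)

# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```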

Thanks

Thanks to John A. Booth, Dan Callahan, Robert Collins, Jack Diedrich, Steve Holden for their reviews and insightful comments. Any mistakes that remain are mine!

18 Jun 2015 2:31am GMT

09 Jun 2015

Glyph Lefkowitz: Sorry I Unfollowed You

Since Alex Gaynor wrote his seminal thinkpiece on the subject, "I Hope Twitter Goes Away", I've been wrestling to define my relationship to this often problematic product.

On the one hand, Twitter has provided me with delightful interactions with human beings who I would not otherwise have had the opportunity to meet or interact with. If you are the sort of person who likes following people, four suggestions I'd make on that front are Melissa 🔔, Gary Bernhardt, Eevee and Matt Blaze, all of whom have blogs but none of whom I would have discovered without Twitter.

Twitter has also allowed me to reach a larger audience with my writing than I otherwise would have been able to. Lots of people click on links to this blog from Twitter either from following me directly or from a retweet. (Thank you, retweeters, one and all.)

On the other hand, the effect of using Twitter on my productivity is like having a constant, low-grade headache. While Twitter has never been a particularly bad distraction as measured by hours spent on it (I keep metrics on that, and it's rarely even in the top 10), I feel like consulting Twitter is something I do when I am stuck, or having to think about something hard. "I'll just check Twitter" is an easy way to "take a break" right at the moment that I ought to be thinking harder, eliminating distractions, mustering my will to focus.

This has been particularly stark for me as I've been trying to get some real writing done over the last couple of weeks and have been consistently drawing a blank. Given that I have a deadline coming up on Wednesday and another next Monday, something had to give.

Or, as Joss Whedon put it, when he quit Twitter:

If I'm going to start writing again, I have to go to the quiet place, and this is the least quiet place I've ever been in my life.

I'm an introvert, and using Twitter is more like being at a gigantic, awkward party all the time than any other online space I've ever been in.

There's an irony here. Mostly what people like that I put on Twitter (and yes, I've checked) are announcements that link to other things, accomplishments in other areas, like a blog post, or a feature in Twisted, but using Twitter itself is inimical to completing those things.

I'm loath to abandon the positive aspects of Twitter. Some people also use Twitter as a replacement for RSS, and I don't want to break the way they choose to pay attention to the stuff that I do. And a few of my friends communicate exclusively through direct messages.

The really "good" thing about Twitter is discovery. It enables you to discover people, content, and, eugh, "brands" that appeal to you. I have discovered things that I enjoy many times. The fundamental problem I am facing, which is a little bit hard to admit to oneself, is that I have discovered enough. I have enough games to play, enough books and articles to read, enough podcasts to listen to, enough movies to watch, enough code to write, enough open source libraries to investigate, that I will be busy for years based on what I already know.

For me, using Twitter's timeline at this point to "discover" more things is like being at a delicious buffet, being so full I'm nauseous, and stuffing my pockets with shrimp "just in case" I'm hungry "when I get home" - and then, of course, not going home.

Even disregarding my desire to produce useful content, if I just want to enjoy consuming content more deeply, I have to take the time to engage with it properly.

So here's what I'm doing:

  1. I am turning on the "anyone can direct message me" feature. We'll see how that goes; I may have to turn it off again later. As always, I'd prefer you send email (or text me, if it's time-critical).
  2. I am unfollowing literally everyone, and will not follow people in the future. Checking my timeline was the main information junk-food I want to avoid.
  3. Since my timeline, rather than mentions and replies, was my main source of distraction, I'll continue paying attention to mentions and replies (at least for now; I'll have to see if that becomes a problem in the absence of a timeline).
  4. In order to avoid producing such information junk-food myself, I'm going to try to directly tweet less, and put more things into brief blog posts so I have enough room to express them. I won't say "not at all", but most of the things that I put on Twitter would really be better as longer, more thoughtful articles.

Please note that there's nothing prescriptive here. I'm outlining what I'm doing in the hopes that others might recognize similar problems with themselves - if everyone used Twitter this way, there would hardly be a point to the site.

Also, if I've unfollowed you, that doesn't mean I'm not interested in what you have to say. I already have a way of keeping in touch with people's more fully-formed ideas: I use Blogtrottr to deliver relevant blog articles to my email. If I previously followed you and you think I might not be reading your blog already (in most cases I believe I already am), please feel free to drop me a line with an RSS link.

09 Jun 2015 12:41am GMT

07 Jun 2015

Twisted Matrix Laboratories: Twisted Fellowship 2015: Call for proposals

On behalf of the Software Freedom Conservancy and the Twisted project I'm happy to announce that we're looking for a paid maintainer for the Twisted project.

Funding a software developer to work as a maintainer will help Twisted grow as a project, and enable Twisted's development community to increase their output of innovative code for the public's benefit. Twisted has strict coding, testing, documentation, and review standards, which ensures excellent code quality, continually improving documentation and code test coverage, and minimal regressions. Code reviews are historically a bottleneck for getting new code merged. A funded maintainer will help alleviate this bottleneck, and speed Twisted's development.

You can read more about the 2015 fellowship at https://twistedmatrix.com/trac/wiki/Fellowship2015

07 Jun 2015 11:12pm GMT

06 Jun 2015

Moshe Zadka: (Somewhat confused) Thoughts on Languages, Eco-systems and Development

There are a few interesting new languages that would be fun to play with: Rust, Julia, LFE, Go and D all have interesting takes on some domain. But ultimately, a language is only as interesting as the things it can do. It used to be that "things it can do" referred to the standard library. I am old enough to remember when "batteries included" was one of the most interesting benefits Python had - the language had dared doing such avant-garde things in '99 as having a *built-in* url fetching library. That behaved, pretty much, the same on all platforms. Out of the box. (It's hard to convey how amazing this was in '99.)

This is no longer the case. Now what distinguishes Python as "can do lots of things" is its built-in package management. Python, pip and virtualenv together give the ability to work on multiple Python projects, that need different versions of libraries, without any interference. They are, to a first approximation, 100% reliable. With pip 7.0 supporting caching wheels, virtualenvs are even more disposable (I am starting to think pip uninstall is completely unnecessary). In fact, except for virtualenv itself, it is rare nowadays to install any Python module globally. There are some rusty corners, of course:

I'll note npm, for example, clearly does better on these three (while doing worse on some things that Python does better). Of the languages mentioned above, it is nice to see that most have out-of-the-box built-in tools for ecosystem creation. Julia's hilariously piggybacks on GitHub's. Go's…well…uses URLs as the equivalent PyPI-level-names. Rust has a pretty decent system in cargo and crates.io (the .toml file is kind of weird, but I've seen worse). While there is justifiable excitement about Rust's approach to memory and thread-safety, it might be that it will win in the end based on having a better internal package management system than the low-level alternatives.

Note: OS package mgmt (yum, apt-get, brew and whatever Windows uses nowadays) is not a good fit for what developers need from a language-level package manager. Locality, quick package refreshes - these matter more to developers than to OS end-users.

06 Jun 2015 7:55am GMT

05 Jun 2015

Duncan McGreggor: Scientific Computing and the Joy of Language Interop

The scientific computing platform for Erlang/LFE has just been announced on the LFE blog. Though written in the Erlang Lisp syntax of LFE, it's fully usable from pure Erlang. It wraps the new py library for Erlang/LFE, as well as the ErlPort project. More importantly, though, it wraps Python 3 libs (e.g., math, cmath, statistics, and more to come) and the ever-eminent NumPy and SciPy projects (those are in-progress, with matplotlib and others to follow).

(That LFE blog post is actually a tutorial on how to use lsci for performing polynomial curve-fitting and linear regression, adapted from the previous post on Hy doing the same.)

With the release of lsci, one can now start to easily and efficiently perform computationally intensive calculations in Erlang/LFE (and any other Erlang Core-compatible language, e.g., Elixir, Joxa, etc.) That's super-cool, but it's not quite the point ...

While working on lsci, I found myself experiencing a great deal of joy. It wasn't just the fact that supervision trees in a programming language are insanely great. Nor just the fact that scientific computing in Python is one of the best in any language. It wasn't only being able to use two syntaxes that I love (LFE and Python) cohesively, in the same project. And it wasn't the sum of these either -- you probably see where I'm going with this ;-) The joy of these and many other fantastic aspects of inter-operation between multiple powerful computing systems is truly greater than the sum of its parts.

I've done a bunch of Julia lately and am a huge fan of this language as well. One of the things that Julia provides is explicit interop with Python. Julia is targeted at the world of scientific computing, aiming to be a compelling alternative to Fortran (hurray!), so their recognition of the enormous contribution the Python scientific computing community has made to the industry is quite wonderful to see.

A year or so ago I did some work with Clojure and LFE using Erlang's JInterface. Around the same time I was using LFE on top of Erjang, calling directly into Java without JInterface. This is the same sort of Joy that users of Jython have, and there are many more examples of languages and tools working to take advantage of the massive resources available in the computing community.

Obviously, language inter-op is not new. Various FFIs have existed for quite some time (I'm a big fan of the Common Lisp CFFI), but what is new (relatively, that is ... as I age, anything in the past 10 years is new) is that we are seeing this not just for programs reaching down into C/C++, but reaching across, to other higher-level languages, taking advantage of their great achievements -- without having to reinvent so many wheels.

When this level of cooperation, credit, etc., is done in the spirit of openness, peer-review, code-reuse, and standing on the shoulders of giants (or enough people to make giants!), we get joy. Beautiful, wonderful coding joy.

And it's so much greater than the sum of the parts :-)


05 Jun 2015 2:53pm GMT

24 May 2015

Twisted Matrix Laboratories: Twisted 15.2.1 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 15.2.1.

This is a bugfix release for the 15.2 series that fixes a regression in the new logging framework.

You can find the downloads on PyPI (or alternatively on the Twisted Matrix Labs website).

Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

Twisted Regards,
HawkOwl

24 May 2015 12:22pm GMT

23 May 2015

Moshe Zadka: Unicode, UTF-8 and you

Unicode is not a panacea. Some people's names can't even be written in unicode. However, as far as universal encodings go, it is the best we have got - warts and all. It is the only reasonable way to represent text inside programs, except for very very specialized needs (no, you don't qualify).

Now, programs are made of libraries, and often there are several layers of abstraction between the library and the program. Sometimes, some weird abstraction layer in the middle will make it hard to convey user configuration into the library's guts. Code should figure things out itself, most of the time.

So, there are several ways to make dealing with unicode not-horrible.

Unicode internally

I've already mentioned it, but it bears repeating. Internal representation should use the language's built-in type (str in Python 3, String in Java, unicode in Python 2). All formatting, templating, etc. should be, internally, represented as taking unicode parameters and returning unicode results.
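In Python terms, this means decoding bytes once at the input edge, working purely on text, and encoding once at the output edge. The sketch below shows that shape; the function names are invented for the example.

```python
def shout(text):
    """Pure text-in, text-out logic; it never sees bytes."""
    return text.upper() + u"!"


def handle(raw_bytes, encoding="utf-8"):
    """Decode at the input edge, work on text, encode at the output edge."""
    text = raw_bytes.decode(encoding)
    result = shout(text)
    return result.encode(encoding)
```

Only handle() knows that an encoding exists; everything it calls takes and returns the language's text type.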

Standards

Obviously, when interacting with an external protocol that allows the other side to specify encoding, follow the encoding it specifies. Your program should support, at least, UTF-8, UTF-16, UTF-32 and Latin-1 through Latin-9. When choosing output encoding, choose UTF-8 by default. If there is some way for the user to specify an encoding, allow choosing between that and UTF-16. Anything else should be under "Advanced…" or, possibly, not at all.

Non-standards

When reading input that is not marked with an encoding, attempt to decode as UTF-8, then as UTF-16 (most UTF-16 decoders will auto-detect endianness, but it is pretty easy to hand-hack if people put in the BOM). UTF-8/16 are unlikely to have false positives, so if either succeeds, it's likely correct. Otherwise, as-ASCII-and-ignore-high-order is often the best that can be done. If it is reasonable, allow user-intervention in specifying the encoding.
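The fallback chain for unmarked input can be sketched as follows. The function name is made up, and dropping undecodable bytes with errors="ignore" is only one possible reading of "as-ASCII-and-ignore-high-order".

```python
def decode_unmarked(data):
    """Best-effort decoding of bytes that carry no encoding label."""
    for encoding in ("utf-8", "utf-16"):
        try:
            # The "utf-16" codec honours a BOM if one is present.
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep the ASCII bytes and drop everything else.
    return data.decode("ascii", errors="ignore")
```

Where user intervention is possible, an explicit encoding supplied by the user would simply bypass this function.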

When writing output, the default should be UTF-8. If it is non-trivial to allow user specification of the encoding, that is fine. If it is possible, UTF-16 should be offered (and BOM should be prepended to start-of-output). Other encodings are not recommended if there is no way to specify them: the reader will have to guess correctly. At the least, giving the user such options should be hidden behind an "Advanced…" option.
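A minimal sketch of that output policy: UTF-8 unless told otherwise, UTF-16 as the only alternative on offer. The function and constant names are assumptions; the behaviour relied upon is that Python's "utf-16" codec prepends a BOM when encoding.

```python
OFFERED_ENCODINGS = ("utf-8", "utf-16")


def encode_output(text, encoding="utf-8"):
    """Encode text for output, offering only UTF-8 (default) and UTF-16."""
    if encoding not in OFFERED_ENCODINGS:
        raise ValueError("unsupported output encoding: %r" % (encoding,))
    # Python's "utf-16" codec prepends a BOM; "utf-8" does not.
    return text.encode(encoding)
```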

The most popular I/O that does not have explicit encoding, or any way to specify one, is file names on UNIX systems. UTF-8 should be assumed, and reasonably recovered from when it proves false. No other encoding is reasonable (UTF-16 is uniquely unsuitable since UNIX filenames cannot have NULs, and other encodings cannot encode some characters).
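One way to "assume UTF-8 and recover reasonably" in Python 3 is the surrogateescape error handler, which carries undecodable bytes through as lone surrogates so that the original name can be restored byte for byte. The helper names below are invented for this sketch, and it does not apply to Python 2, which lacks that handler.

```python
def filename_to_text(name_bytes):
    """Assume UTF-8; smuggle undecodable bytes through as lone surrogates."""
    return name_bytes.decode("utf-8", errors="surrogateescape")


def text_to_filename(name_text):
    """Inverse of filename_to_text: restores the original bytes exactly."""
    return name_text.encode("utf-8", errors="surrogateescape")
```

The standard library's os.fsdecode and os.fsencode follow the same idea, using the filesystem encoding rather than a hard-coded UTF-8.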

23 May 2015 4:04pm GMT

19 May 2015

Twisted Matrix Laboratories: Twisted 15.2.0 Released

On behalf of Twisted Matrix Labs, I'm honoured to announce the release of Twisted 15.2.

Bringing not only headlining features but also a lot of incremental improvements, this release has got plenty to like:


You can find the downloads on PyPI (or alternatively the Twisted website).

Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

Twisted Regards,
HawkOwl

19 May 2015 6:51am GMT

09 May 2015

Glyph Lefkowitz: Separate your Fakes and your Inspectors

When you are writing unit tests, you will commonly need to write duplicate implementations of your dependencies to test against systems which do external communication or otherwise manipulate state that you can't inspect. In other words, test fakes. However, a "test fake" is just one half of the component that you're building: you're also generally building a test inspector.

As an example, let's consider the case of this record-writing interface that we may need to interact with.

class RecordWriter(object):
    def write_record(self, record):
        "..."

    def close(self):
        "..."

This is a pretty simple interface; it can write out a record, and it can be closed.

Faking it out is similarly easy:

class FakeRecordWriter(object):
    def write_record(self, record):
        pass
    def close(self):
        pass

But this fake record writer isn't very useful. It's a simple stub; if our application writes any interesting records out, we won't know about it. If it closes the record writer, we won't know.

The conventional way to correct this problem, of course, is to start tracking some state, so we can assert about it:

class FakeRecordWriter(object):
    def __init__(self):
        self.records = []
        self.closed = False

    def write_record(self, record):
        if self.closed:
            raise IOError("cannot write; writer is closed")
        self.records.append(record)

    def close(self):
        if self.closed:
            raise IOError("cannot close; writer is closed")
        self.closed = True

This is a very common pattern in test code. However, it's an antipattern.

We have exposed 2 additional, apparently public attributes to application code: .records and .closed. Our original RecordWriter interface didn't have either of those. Since these attributes are public, someone working on the application code could easily, inadvertently access them. Although it's unlikely that an application author would think that they could read records from a record writer by accessing .records, it's plausible that they might add a check of .closed before calling .close(), to make sure they won't get an exception. Such a mistake might happen because their IDE auto-suggested the completion, for example.

The resolution for this antipattern is to have a separate "fake" object, exposing only the public attributes that are also on the object being faked, and an "inspector" object, which exposes only the functionality useful to the test.

class WriterState(object):
    def __init__(self):
        self.records = []
        self.closed = False

    def raise_if_closed(self):
        if self.closed:
            raise ValueError("already closed")


class _FakeRecordWriter(object):
    def __init__(self, writer_state):
        self._state = writer_state

    def write_record(self, record):
        self._state.raise_if_closed()
        self._state.records.append(record)

    def close(self):
        self._state.raise_if_closed()
        self._state.closed = True


def create_fake_writer():
    state = WriterState()
    return state, _FakeRecordWriter(state)

In this refactored example, we now have a top-level entry point of create_fake_writer, which always creates a pair of WriterState and thing-which-is-like-a-RecordWriter. The type of _FakeRecordWriter can now be private, because it's no longer interesting on its own; it exposes nothing beyond the methods it's trying to fake.

Whenever you're writing test fakes, consider writing them like this, to ensure that you can hand application code the application-facing half of your fake, and test code the test-facing half of the fake, and not get them mixed up.
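As a sketch of how the two halves get handed out, the test below passes the application only the writer and keeps the state for its own assertions. The save_all function and the test case name are hypothetical, invented for this illustration; the fake itself is repeated from above so the block runs on its own.

```python
import unittest


class WriterState(object):
    def __init__(self):
        self.records = []
        self.closed = False

    def raise_if_closed(self):
        if self.closed:
            raise ValueError("already closed")


class _FakeRecordWriter(object):
    def __init__(self, writer_state):
        self._state = writer_state

    def write_record(self, record):
        self._state.raise_if_closed()
        self._state.records.append(record)

    def close(self):
        self._state.raise_if_closed()
        self._state.closed = True


def create_fake_writer():
    state = WriterState()
    return state, _FakeRecordWriter(state)


def save_all(writer, records):
    # Hypothetical application code: it only ever sees the writer half.
    for record in records:
        writer.write_record(record)
    writer.close()


class SaveAllTests(unittest.TestCase):
    def test_writes_then_closes(self):
        state, writer = create_fake_writer()
        save_all(writer, ["a", "b"])
        # The test inspects the state half, never the writer itself.
        self.assertEqual(state.records, ["a", "b"])
        self.assertTrue(state.closed)
        # The application-facing half exposes no inspection attributes.
        self.assertFalse(hasattr(writer, "records"))
        self.assertFalse(hasattr(writer, "closed"))
```

Because save_all never receives the WriterState, it has no public way to start depending on .records or .closed.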

09 May 2015 6:52am GMT

30 Apr 2015

Moshe Zadka: Lessons learned in porting to PyPy

I'm running the ncolony tests with pypy so I can add PyPy support. I expected it to be a no-op, but it turned out I have ingrained expectations that are no longer true: moreover, PyPy is right, and I'm wrong.

That was it! It was actually a pleasant experience :)

30 Apr 2015 2:13am GMT

13 Apr 2015

Twisted Matrix Laboratories: Twisted 15.1.0 Released

On behalf of Twisted Matrix Laboratories, I am honoured to announce the release of Twisted 15.1.0 -- just in time for the PyCon sprints!

This is not a big release, but does have some nice-to-haves:

- You can now install Twisted's optional dependencies more easily -- for example, pip install twisted[tls] installs Twisted with TLS support.
- twisted.web.static.File allows defining a custom resource for rendering forbidden pages.
- Twisted's MSN support is now deprecated.
- More documentation has been added on how Trial finds tests.
- ...and 26 other closed tickets containing bug fixes, feature enhancements, and documentation.

For more information, check the NEWS file (link provided below).

You can find the downloads on PyPI (or alternatively the Twisted website). The NEWS file is also available for reading.

Many thanks to everyone who had a part in this release - the supporters of the Twisted Software Foundation, the developers who contributed code as well as documentation, and all the people building great things with Twisted!

Twisted Regards,
Hawkie Owl

13 Apr 2015 8:23am GMT

10 Apr 2015

Moshe Zadka: On Authentication in Web Applications

So you are writing a web application. Maybe it's "2.0", maybe it's "1.0". Maybe it's 3.0? I am not sure if that's a thing. Anyway, the important thing is that you want to know that people are who they say they are (I guess not if you're reimplementing 4Chan or other anonymous-only applications). So you have some buttons that say "Log in" and "Sign up". In the "Sign Up" page, you have the user enter their e-mail address (twice, of course, because otherwise they'll misspell it) and the password (twice, same reason), and possibly a username. You check the password is "strong enough", and you have a little widget that rates it "weak", "fair", "strong" or whatever descriptive words you feel like today. Of course, being a civilized person, you store the passwords on the backend salted and hashed. You make sure that the password comparison is resistant to side-channel attacks. You add a CAPTCHA for the "forgot password" page to prevent mass-attacking it. Since you know your users are almost certainly using the same password on other sites that do not do all of that, you also offer a 2-factor authentication scheme using Google Authenticator and falling back to SMS codes. Of course, before you store an SMS number as the fallback, you send it a trial code to make sure it is correct (right, WordPress.com?).

And, of course, no users use your 2-factor scheme, one day they get phished, and all their accounts are compromised.

Please note that the above paragraph is the absolute minimum you should do to be a responsible thing that says "sign up with your e-mail and password". Also highly recommended is participating in a white-hat bug bounty, hiring dedicated pen-testers and having a lot of server-side heuristics to detect a brute-force attack and shut it down immediately.
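Two items from that minimum, salted hashing and a comparison that resists timing side channels, can be sketched with the Python standard library. This is an illustration under stated assumptions rather than a vetted scheme: the function names, the choice of PBKDF2 with SHA-256, the 16-byte salt, and the iteration count are all picked for the example.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only; tune it for your own hardware.
ITERATIONS = 100000


def hash_password(password, salt=None):
    """Return (salt, digest), generating a fresh random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password, salt, expected_digest):
    """Re-derive the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    # hmac.compare_digest does not stop at the first mismatching byte,
    # so timing does not reveal how much of the digest matched.
    return hmac.compare_digest(digest, expected_digest)
```

You would store the salt and digest per user, never the password. Even with this in place, everything else on the list above still has to be done.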

Do you know who has the resources to do all of that correctly? I can think of two companies that get it all correct. Their names start with consecutive letters of the alphabet… :)

Yep, Facebook and Google actually have the security teams and expertise to check every single one of those boxes (with the exception of the silly little widget that rates your password strength which never in the history of mankind has ever caused a user to choose a different password, because they only remember one password for all the sites they use and it doesn't change.) Please, for the love of kittens, puppies and hedgehogs, put little "Sign-up with Facebook" and "Sign-up with Google+" widgets on your web-site. If you are worried about Facebook and Google "capturing" your users, just make sure to grab their (verified!) e-mail addresses when they sign up through OAuth, so that if you ever want to authenticate them yourself, all you need to do is have a "Forgot/recover password" widget, and you are on your way. You can even e-mail your users to tell them "Hey, FB/GOOG screwed us over, so start logging in with your password, and here is how you can recover the password".

There really is no excuse not to offer this, in 2015. Unless you think your security team is roughly as good as Facebook's.

10 Apr 2015 5:02am GMT

02 Apr 2015

Glyph Lefkowitz: Not Funny

What?

Today's "joke" from the PSF about PyCon Havana was not funny, and, speaking as a PSF Fellow, I do not endorse it.

What's Not Funny?

Honestly I'm not sure where I could find a punch-line in this. I just don't see much there.

But if I look for something that's supposed to be "funny", here's what I see:

  1. Cuba is a backward country without sufficient technology to host a technical conference, and it is absurd and therefore "funny" that we could hold PyCon there.
  2. We are talking about PyCon US; despite the recent thaw in relations, decades of hostility that have torn families apart make it "funny" that US citizens would go to Cuba for a conference.

These things aren't funny.

Some Non-Reasons I'm Writing This

A common objection when someone speaks up about a subject like this is that it's "just a joke". That anyone speaking up and saying that offensive things aren't funny somehow dislikes the very concept of humor. I don't know why people think that, but I guess I need to make it clear: I am not an enemy of joy. That is not why I'm saying something.

I'm also not Cuban, I have no Cuban relatives, and until this incident I didn't even know I had friends of Cuban extraction, so I am not personally insulted by this. That means another common objection will crop up: some will ask if I'm just looking for an excuse to get offended, to write about taking offense and get attention for it.

So let me assure you that, personally, this is not the kind of attention that I want. I really didn't want to write this post. It's awkward. I really don't want to be having these types of conversations. I want to get attention for the software I write, not for my opinions about tacky blog posts.

Why, Then?

I might not know many Cuban Python programmers personally, but I'd love to meet some. I'd love to meet anyone who cares about programming. Meeting diverse people from all over the world and working with them on code has been one of the great joys of my life. I love the fact that the Python community facilitates that and tries hard to reach out to people and to make them feel welcome.

I am writing this because I know that, somewhere out there, there's a Cuban programmer, or a kid who will grow up to be one, who might see that blog post, and think that the Python community, or the software industry, thinks that they're a throw-away punch line. I want them to know that I don't think they're a punch line. I want them to know that the Python community doesn't think they're a punch line. I want them to know that they are not a punch line, and I want them to pursue their interest in programming exactly as far as it takes them; I don't want us to push them away.

These people are real, they are listening, and if you tell me to just "lighten up" you are saying that your enjoyment of a joke is more important than their membership in our community.

It's Not Just Me

The PSF is paying attention. The chairman of the PSF has acknowledged the problematic nature of the "joke". Several of my friends in the Python community spoke up before I did (here, here, here, here, here, here, here), and I am very grateful for their taking the community to task and keeping us true to ideals of inclusiveness and empathy.

That doesn't excuse the public statement, made using official channels, which was in very poor taste. I am also very disappointed in certain people within the PSF[1] who seem intent on doubling down on this mistake rather than trying to do something to correct it.

[1] Names withheld to avoid a pile-on, but you know who you are and you should be ashamed.

02 Apr 2015 5:47am GMT