20 Dec 2014

Planet Python

Calvin Spealman: Handmade Hero



Handmade Hero looks like an amazing project.

Whether you're a long-time game developer, new to the field, or working in some other discipline entirely, I think you have to respect the goals Casey has laid out for himself here. Developing an old-style game from scratch, live, every weeknight is both a wonderful personal project and a beautiful piece of art.

Check it out!

20 Dec 2014 11:07pm GMT

Ionel Cristian: Compiling Python extensions on Windows

For Python 2.7*

For Python 2.7 you need to get Microsoft Visual C++ Compiler for Python 2.7. It's a special package from Microsoft that bundles everything you need to build extensions. It has been supported since setuptools 6.0 [1].

Unfortunately the latest virtualenv, 1.11.6 as of now, still bundles setuptools 3.6. This means that if you try to run python setup.py build_ext in a virtualenv it will fail, because setuptools can't detect the compiler. The solution is to force-upgrade setuptools, e.g.: pip install "setuptools>=6.0".

If you're using tox then just add it to your deps. Example:

[testenv]
deps =
    setuptools>=6.0

This seems to work fine for 64-bit extensions.
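
To sanity-check the toolchain, here's a minimal setup.py sketch for building a C extension (the module name and source file are hypothetical placeholders):

from setuptools import setup, Extension

setup(
    name='demo',
    version='0.1',
    # 'demo._cext' and 'src/cext.c' are placeholder names for this sketch
    ext_modules=[Extension('demo._cext', sources=['src/cext.c'])],
)

If python setup.py build_ext succeeds, the compiler is being picked up correctly.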

Note

Probably works for Python 3.3 too.

For Python 3.4*

This one gave me a headache. I tried to follow this guide but had some problems getting 64-bit extensions to compile. To get everything working you need to jump through these hoops:

  1. Install Visual C++ 2010 Express.

  2. Install the Windows SDK for Visual Studio 2010 (also known as the Windows SDK v7.1). This is required for 64-bit extensions.

    Before installing the Windows SDK v7.1, watch out for two things (they gave me a bad time):

    • Do not install Microsoft Visual Studio 2010 Service Pack 1 yet. If you already did, you have to reinstall everything: uninstalling the Service Pack removes components, so you have to reinstall Visual C++ 2010 Express again.
    • Remove all the Microsoft Visual C++ 2010 Redistributable packages from Control Panel\Programs and Features.

    If you skip those steps, the install is going to fail with an obscure "Fatal error during installation" error.

  3. Create a vcvars64.bat file in C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\amd64 that contains [2]:

    CALL "C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin\SetEnv.cmd" /x64
    

    If you don't do this you're going to get a mind-boggling error like this:

    Installing collected packages: whatever
      Running setup.py develop for whatever
        building 'whatever.cext' extension
    
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:/whatever\setup.py", line 112, in <module>
        for root, _, _ in os.walk("src")
      File "c:\python34\Lib\distutils\core.py", line 148, in setup
        dist.run_commands()
      File "c:\python34\Lib\distutils\dist.py", line 955, in run_commands
        self.run_command(cmd)
      File "c:\python34\Lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "C:\whatever\.tox\3.4\lib\site-packages\setuptools\command\develop.py", line 32, in run
        self.install_for_development()
      File "C:\whatever\.tox\3.4\lib\site-packages\setuptools\command\develop.py", line 117, in install_for_development
        self.run_command('build_ext')
      File "c:\python34\Lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "c:\python34\Lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "C:/whatever\setup.py", line 32, in run
        build_ext.run(self)
      File "C:\whatever\.tox\3.4\lib\site-packages\setuptools\command\build_ext.py", line 54, in run
        _build_ext.run(self)
      File "c:\python34\Lib\distutils\command\build_ext.py", line 339, in run
        self.build_extensions()
      File "c:\python34\Lib\distutils\command\build_ext.py", line 448, in build_extensions
        self.build_extension(ext)
      File "C:/whatever\setup.py", line 39, in build_extension
        build_ext.build_extension(self, ext)
      File "C:\whatever\.tox\3.4\lib\site-packages\setuptools\command\build_ext.py", line 187, in build_extension
        _build_ext.build_extension(self, ext)
      File "c:\python34\Lib\distutils\command\build_ext.py", line 503, in build_extension
        depends=ext.depends)
      File "c:\python34\Lib\distutils\msvc9compiler.py", line 460, in compile
        self.initialize()
      File "c:\python34\Lib\distutils\msvc9compiler.py", line 371, in initialize
        vc_env = query_vcvarsall(VERSION, plat_spec)
      File "C:\whatever\.tox\3.4\lib\site-packages\setuptools\msvc9_support.py", line 52, in query_vcvarsall
        return unpatched['query_vcvarsall'](version, *args, **kwargs)
      File "c:\python34\Lib\distutils\msvc9compiler.py", line 287, in query_vcvarsall
        raise ValueError(str(list(result.keys())))
    ValueError: ['path']
    Complete output from command C:\whatever\.tox\3.4\Scripts\python.exe -c "import setuptools, tokenize; __file__='C:/whatever\\setup.py'; exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" develop --no-deps:
    

    msvc9_support.py will run vcvarsall.bat amd64:

    c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC>vcvarsall.bat amd64
    The specified configuration type is missing.  The tools for the
    configuration might not be installed.
    

    Basically, that is caused by vcvarsall.bat not being able to run vcvars64.bat because, surprise, the Windows SDK doesn't ship that file.

  4. Now everything should work; try py -3 setup.py clean --all build_ext --force.

  5. Install Microsoft Visual Studio 2010 Service Pack 1. This step is optional; however, if you do it, you also have to do the next step:

  6. Install Microsoft Visual C++ 2010 Service Pack 1 Compiler Update for the Windows SDK 7.1.

[1] Support added in https://bitbucket.org/pypa/setuptools/issue/258
[2] See: http://stackoverflow.com/a/26513378

20 Dec 2014 10:00pm GMT

Invent with Python: Translate Your Python 3 Program with the gettext Module

You've written a Python 3 program and want to make it available in other languages. You could duplicate the entire codebase and then go painstakingly through each .py file, replacing any text strings you find. But then you'd have two separate copies of your code, which doubles your workload every time you need to make a change or fix a bug. And if you want your program in even more languages, it gets worse still.

Fortunately, Python provides a solution with the gettext module.

A Hack Solution

You could hack together your own solution. For example, you could replace every string in your program with a call to a function with a short name such as _(), which returns the string translated into the correct language. For example, if your program was:

print('Hello world!')

...you could change this to:

print(_('Hello world!'))

...and the _() function could return the translation for 'Hello world!' based on what language setting the program had. For example, if the language setting was stored in a global variable named LANGUAGE, the _() function could look like this:

def _(s):
    spanishStrings = {'Hello world!': 'Hola Mundo!'}
    frenchStrings = {'Hello world!': 'Bonjour le monde!'}
    germanStrings = {'Hello world!': 'Hallo Welt!'}

    if LANGUAGE == 'Spanish':
        return spanishStrings[s]
    if LANGUAGE == 'French':
        return frenchStrings[s]
    if LANGUAGE == 'German':
        return germanStrings[s]
    # English, or any unrecognized setting, falls back to the original string.
    return s

This would work, but you'd be reinventing the wheel. This is pretty much what Python's gettext module already does. gettext is a set of tools and file formats created in the early 1990s to standardize software internationalization (also called I18N). gettext was designed as a system for all programming languages, but we'll focus on Python in this article.

The Example Program

Say you have a simple "Guess the Number" game written in Python 3 that you want to translate. The source code to this program is here. There are four steps to internationalizing this program:

  1. Modify the .py file's source code so that the strings are passed to a function named _().
  2. Use the pygettext.py script that comes installed with Python to create a "pot" file from the source code.
  3. Use the free cross-platform Poedit software to create the .po and .mo files from the pot file.
  4. Modify your .py file's source code again to import the gettext module and set up the language setting.

Step 1: Add the _() Function

First, go through all of the strings in your program that will need to be translated and replace them with _() calls. The gettext system for Python uses _() as the generic name for getting the translated string since it is a short name.

Note that using string formatting instead of string concatenation will make your program easier to translate. For example, with string concatenation your program would have to look like this:

print('Good job, ' + myName + '! You guessed my number in ' + guessesTaken + ' guesses!')

print(_('Good job, ') + myName + _('! You guessed my number in ') + guessesTaken + _(' guesses!'))

This results in three separate strings that need to be translated, as opposed to the single string needed in the string formatting approach:

print('Good job, %s! You guessed my number in %s guesses!' % (myName, guessesTaken))

print(_('Good job, %s! You guessed my number in %s guesses!') % (myName, guessesTaken))
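
A further tweak that can help (my addition, not part of the original example): named placeholders let translators reorder the values for languages with different word order, which positional %s placeholders don't allow:

print(_('Good job, %(name)s! You guessed my number in %(guesses)s guesses!') % {'name': myName, 'guesses': guessesTaken})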

When you've gone through the "Guess the Number" source code, it will look like this. You won't be able to run this program yet since the _() function is undefined; this change is just so that the pygettext.py script can find all the strings that need to be translated.

Step 2: Extract the Strings Using pygettext.py

In the Tools/i18n folder of your Python installation (C:\Python34\Tools\i18n on Windows) is the pygettext.py script. While the xgettext unix command parses C/C++ (and several other languages) for translatable strings, pygettext.py knows how to parse Python source code. It will find all of these strings and produce a "pot" file.

On Windows, I've run this script like so:

C:\>py -3.4 C:\Python34\Tools\i18n\pygettext.py -d guess guess.py

This creates a pot file named guess.pot. This is just a normal plaintext file listing all the translatable strings it found in the source code by searching for _() calls. You can view the guess.pot file here.

Step 3: Translate the Strings Using Poedit

You could fill in the translations using a text editor, but the free Poedit software makes it easier. Download it from http://poedit.net. Select File > New from POT/PO file... and select your guess.pot file.

Poedit will ask what language you want to translate the strings to. For this example, we'll use Spanish:

Then fill in the translations. (I'm using http://translate.google.com, so it probably sounds a bit odd to actual Spanish-speakers.)

Now save the file in its gettext-formatted folder. Saving will create the .po file (a human-readable text file identical to the original .pot file, except with the Spanish translations filled in) and a .mo file (a machine-readable version which the gettext module will read). These files have to be saved in a certain folder structure for gettext to be able to find them. It looks like this (say I have "es" Spanish files and "de" German files):

./guess.py
./guess.pot
./locale/es/LC_MESSAGES/guess.mo
./locale/es/LC_MESSAGES/guess.po
./locale/de/LC_MESSAGES/guess.mo
./locale/de/LC_MESSAGES/guess.po

These two-character language names like "es" for Spanish and "de" for German are called ISO 639-1 codes and are standard abbreviations for languages. You don't have to use them, but it makes sense to follow that naming standard.

Step 4: Add gettext Code to Your Program

Now that you have the .mo file that contains the translations, modify your Python script to use it. Add the following to your program:

import gettext
es = gettext.translation('guess', localedir='locale', languages=['es'])
es.install()

The first argument 'guess' is the "domain", which basically means the "guess" part of the guess.mo filename. The localedir argument is the directory location of the locale folder you created; it can be either a relative or an absolute path. The 'es' string names the folder under the locale folder, and LC_MESSAGES is a standard name for the folder that holds the message catalogs.
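
One caveat worth noting (not from the original article): gettext.translation() raises an error if it can't find a matching .mo file, so passing fallback=True is a safer default. A sketch:

import gettext

# fallback=True substitutes a NullTranslations object when no matching .mo
# file is found, so _() passes the original English strings through
# instead of raising an exception.
es = gettext.translation('guess', localedir='locale', languages=['es'], fallback=True)
es.install()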

The install() method will cause all the _() calls to return the Spanish translated string. If you want to go back to the original English, just assign a lambda function value to _ that returns the string it was passed:

import gettext
es = gettext.translation('guess', localedir='locale', languages=['es'])
es.install()  # binds _() to the Spanish catalog
print(_('Hello! What is your name?'))  # prints Spanish

_ = lambda s: s  # rebind _ to a pass-through

print(_('Hello! What is your name?'))  # prints English

You can view the translation-ready source code for the "Guess the Number" program. If you want to run it, download and unzip this zip file with its locale folders and .mo file set up.

Further Reading

I am by no means an expert on I18N or gettext, so please leave comments if I'm breaking any best practices in this tutorial. Most of the time your software will not switch languages while it's running; instead it will read one of the LANGUAGE, LC_ALL, LC_MESSAGES, and LANG environment variables to figure out the locale of the computer it's running on. I'll update this tutorial as I learn more.
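
As a sketch of that environment-based lookup (assuming the same 'guess' domain and locale folder from above; gettext.translation() actually performs this search itself when the languages argument is omitted):

import gettext
import os

# Check the same variables gettext consults, in order of precedence.
for envar in ('LANGUAGE', 'LC_ALL', 'LC_MESSAGES', 'LANG'):
    lang = os.environ.get(envar)
    if lang:
        break
else:
    lang = 'en'
lang = lang.split(':')[0].split('.')[0]  # e.g. 'es_ES.UTF-8' -> 'es_ES'
trans = gettext.translation('guess', localedir='locale', languages=[lang], fallback=True)
trans.install()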

20 Dec 2014 9:24pm GMT

Jean-Paul Calderone: Asynchronous Object Initialization - Patterns and Antipatterns

I caught Toshio Kuratomi's post about asyncio initialization patterns (or anti-patterns) on Planet Python. This is something I've dealt with a lot over the years using Twisted (one of the sources of inspiration for the asyncio developers).

To recap, Toshio wondered about a pattern involving asynchronous initialization of an instance. He wondered whether it was a good idea to start this work in __init__ and then explicitly wait for it in other methods of the class before performing the distinctive operations required by those other methods. Using asyncio (and using Toshio's example with some omissions for simplicity) this looks something like:


class Microblog:
    def __init__(self, ...):
        loop = asyncio.get_event_loop()
        self.init_future = loop.run_in_executor(None, self._reading_init)

    def _reading_init(self):
        # ... do some initialization work,
        # presumably expensive or otherwise long-running ...

    @asyncio.coroutine
    def sync_latest(self):
        # Don't do anything until initialization is done
        yield from self.init_future
        # ... do some work that depends on that initialization ...

It's quite possible to do something similar to this when using Twisted. It only looks a little bit different:


class Microblog:
    def __init__(self, ...):
        self.init_deferred = deferToThread(self._reading_init)

    def _reading_init(self):
        # ... do some initialization work,
        # presumably expensive or otherwise long-running ...

    @inlineCallbacks
    def sync_latest(self):
        # Don't do anything until initialization is done
        yield self.init_deferred
        # ... do some work that depends on that initialization ...

Despite the differing names, these two pieces of code do basically the same thing.

Maintenance costs

One thing this pattern gives you is an incompletely initialized object. If you write m = Microblog() then m refers to an object that's not actually ready to perform all of the operations it supposedly can perform. It's either up to the implementation or the caller to make sure to wait until it is ready. Toshio suggests that each method should do this implicitly (by starting with yield self.init_deferred or the equivalent). This is definitely better than forcing each call-site of a Microblog method to explicitly wait for this event before actually calling the method.

Still, this is a maintenance burden that's going to get old quickly. If you want full test coverage, it means you now need twice as many unit tests (one for the case where the method is called before initialization is complete and another for the case where it is called afterwards). At least. Toshio's _reading_init method actually modifies attributes of self, which means there are potentially many more than just two possible cases. Even if you're not particularly interested in having full automated test coverage (... for some reason ...), you still have to remember to add this yield statement to the beginning of all of Microblog's methods. It's not exactly a ton of work but it's one more thing to remember any time you maintain this code. And it's the kind of mistake that creates a race condition you might not immediately notice - which means you may ship the broken code to clients and get to discover the problem when they start complaining about it.

Diminished flexibility

Another thing this pattern gives you is an object that does things as soon as you create it. Have you ever had a class with a __init__ method that raised an exception as a result of a failing interaction with some other part of the system? Perhaps it did file I/O and got a permission denied error or perhaps it was a socket doing blocking I/O on a network that was clogged and unresponsive. Among other problems, these cases are often difficult to report well because you don't have an object to blame the problem on yet. The asynchronous version is perhaps even worse since a failure in this asynchronous initialization doesn't actually prevent you from getting the instance - it's just another way you can end up with an incompletely initialized object (this time, one that is never going to be completely initialized and use of which is unsafe in difficult to reason-about ways).

Another related problem is that it removes one of your options for controlling the behavior of instances of that class. It's great to be able to control everything a class does just by the values passed in to __init__ but most programmers have probably come across a case where behavior is controlled via an attribute instead. If __init__ starts an operation then instantiating code doesn't have a chance to change the values of any attributes first (except, perhaps, by resorting to setting them on the class - which has global consequences and is generally icky).

Loss of control

A third consequence of this pattern is that instances of classes which employ it are inevitably doing something. It may be that you don't always want the instance to do something. It's certainly fine for a Microblog instance to create a SQLite3 database and initialize a cache directory if the program I'm writing which uses it is actually intent on hosting a blog. It's most likely the case that other useful things can be done with a Microblog instance, though. Toshio's own example includes a post method which doesn't use the SQLite3 database or the cache directory. His code correctly doesn't wait for init_future at the beginning of his post method - but this should leave the reader wondering why we need to create a SQLite3 database if all we want to do is post new entries.

Using this pattern, the SQLite3 database is always created - whether we want to use it or not. There are other reasons you might want a Microblog instance that hasn't initialized a bunch of on-disk state too - one of the most common is unit testing (yes, I said "unit testing" twice in one post!). A very convenient thing for a lot of unit tests, both of Microblog itself and of code that uses Microblog, is to compare instances of the class. How do you know you got a Microblog instance that is configured to use the right cache directory or database type? You most likely want to make some comparisons against it. The ideal way to do this is to be able to instantiate a Microblog instance in your test suite and use its == implementation to compare it against an object given back by some API you've implemented. If creating a Microblog instance always goes off and creates a SQLite3 database then at the very least your test suite is going to be doing a lot of unnecessary work (making it slow) and at worst perhaps the two instances will fight with each other over the same SQLite3 database file (which they must share since they're meant to be instances representing the same state). Another way to look at this is that inextricably embedding the database connection logic into your __init__ method has taken control away from the user. Perhaps they have their own database connection setup logic. Perhaps they want to re-use connections or pass in a fake for testing. Saving a reference to that object on the instance for later use is a separate operation from creating the connection itself. They shouldn't be bound together in __init__ where you have to take them both or give up on using Microblog.

Alternatives

You might notice that these three observations I've made all sound a bit negative. You might conclude that I think this is an antipattern to be avoided. If so, feel free to give yourself a pat on the back at this point.

But if this is an antipattern, is there a pattern to use instead? I think so. I'll try to explain it.

The general idea behind the pattern I'm going to suggest comes in two parts. The first part is that your object should primarily be about representing state and your __init__ method should be about accepting that state from the outside world and storing it away on the instance being initialized for later use. It should always represent complete, internally consistent state - not partial state as asynchronous initialization implies. This means your __init__ methods should mostly look like this:


class Microblog(object):
    def __init__(self, cache_dir, database_connection):
        self.cache_dir = cache_dir
        self.database_connection = database_connection

If you think that looks boring - yes, it does. Boring is a good thing here. Anything exciting your __init__ method does is probably going to be the cause of someone's bad day sooner or later. If you think it looks tedious - yes, it does. Consider using Hynek Schlawack's excellent characteristic package (full disclosure - I contributed some ideas to characteristic's design and Hynek occasionally says nice things about me (I don't know if he means them, I just know he says them)).
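
For instance, a sketch of the same boring __init__ using characteristic (based on my reading of its API around this time; the decorator also generates == and a readable repr, which pays off in the testing scenario above):

from characteristic import attributes

@attributes(['cache_dir', 'database_connection'])
class Microblog(object):
    pass

# Instances are created with keyword arguments and compare by value:
# Microblog(cache_dir='/tmp/cache', database_connection=conn)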

The second part of the idea is an acknowledgement that asynchronous initialization is a reality of programming with asynchronous tools. Fortunately __init__ isn't the only place to put code. Asynchronous factory functions are a great way to wrap up the asynchronous work sometimes necessary before an object can be fully and consistently initialized. Put another way:


class Microblog(object):
    # ... __init__ as above ...

    @classmethod
    @asyncio.coroutine
    def from_database(cls, cache_dir, database_path):
        # ... or make it a free function, not a classmethod, if you prefer
        loop = asyncio.get_event_loop()
        database_connection = yield from loop.run_in_executor(None, cls._reading_init)
        return cls(cache_dir, database_connection)

Notice that the setup work for a Microblog instance is still asynchronous but initialization of the Microblog instance is not. There is never a time when a Microblog instance is hanging around partially ready for action. There is setup work and then there is a complete, usable Microblog.
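
For completeness, a sketch of what calling code looks like under this pattern (the paths here are made up):

loop = asyncio.get_event_loop()
# Only a fully initialized Microblog ever escapes the factory.
blog = loop.run_until_complete(
    Microblog.from_database('/home/badger/cache', '/home/badger/blog.db'))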

This addresses the three observations I made above.

I hope these points have made a strong case that one of these approaches is an anti-pattern to avoid (in Twisted, in asyncio, or in any other asynchronous programming context) and that the other is a useful pattern for providing convenient, expressive constructors while keeping object initializers unsurprising and maximizing their usefulness.

20 Dec 2014 9:20pm GMT

Daniel Greenfeld: I did an Aú Batido in 2014, now what?

This is my summary of my resolutions for 2014 and an early pass at my resolutions for 2015. I'm doing this right now instead of at the end of the year because as of the afternoon of December 21, I'm going off the grid.

Resolutions Accomplished in 2014

  • Release the second edition of Two Scoops of Django.
  • Visited South America.
  • Visited two new nations, Argentina and Brazil.
  • Went back to the East Coast to visit Philadelphia and places in New Jersey.
  • Went back to the Philippines.
  • Took some awesome road trips around the USA. My favorite was driving up the Pacific Coast Highway.
  • Took a fun class with Audrey. We did woodshop!
  • Learned how to do Aú Batido!
http://pydanny.com/static/aubatido.jpg

General Accomplishments

  • Published Two Scoops of Django 1.6 with my awesome wife, Audrey Roy Greenfeld.
  • Experienced my first major surgery.
  • Started working at Eventbrite.
  • Found an Inland Empire Capoeira group but still remained friends with Capoeira Batuque.
  • Gave talks at:
    • Wharton School of Business Web Conference
    • PyDay Mendoza
    • Python Brazil
    • PyCon Argentina
  • Experienced the unbelievably tasty asado (steak) of Argentina.
  • Participated in organizing the first Django Bar Camp.
  • Played in 2 Capoeira Rodas in Brazil.
  • Participated in the successful effort to reboot LA Django.
  • Ate dinner below the waterline inside a fake island in San Francisco Bay.

Resolutions for 2015

  • Bring a new child into the world.
  • Write and publish at least 1 fiction book. This is a childhood dream that I would like to make a reality.
  • Learn Swift or some other interesting programming language.
  • Find sponsors for my open source projects.
  • Visit the two continents I've yet to see. That means Africa and Antarctica.
  • See the Grand Canyon.
  • Visit more family.
  • Do 1000 push-ups or similar exercises in a single day.
  • Pull off an Aú sem Mão (no-handed cartwheel).
  • Take another fun class with Audrey.
  • Learn how to surf or snowboard.
  • See all my friends. All of them.
  • Enjoy life with Audrey more.

20 Dec 2014 8:00pm GMT

Toshio Kuratomi: Pattern or Antipattern? Splitting up initialization with asyncio

"O brave new world, That has such people in't!" - William Shakespeare, The Tempest

Instead of spending the Thanksgiving weekend fighting crowds of shoppers I indulged my inner geek by staying at home on my computer. And not to shop online either - I was taking a look at Python-3.4's asyncio library to see whether it would be useful in general, run-of-the-mill code. After quite a bit of experimenting I do think every programmer will have a legitimate use for it from time to time. It's also quite sexy. I think I'll be a bit prone to overusing it for a little while ;-)

Something I discovered, though - there's a great deal of good documentation and blog posts about the underlying theory of asyncio and how to implement some broader concepts using asyncio's API. There are quite a few tutorials that skim the surface of what you can theoretically do with the library but don't go into much depth. And there's a definite lack of examples showing how people are taking asyncio's API and applying them to real-world problems.

That lack is both exciting and hazardous. Exciting because it means there's plenty of neat new ways to use the API that no one's made into a wide-spread and oft-repeated pattern yet. Hazardous because there's plenty of neat new ways to abuse the API that no one's thought to write a post explaining why not to do things that way before. My joke about overusing it earlier has a large kernel of truth in it… there's not a lot of information saying whether a particular means of using asyncio is good or bad.

So let me mention one way of using it that I thought about this weekend - maybe some more experienced tulip or twisted programmers will pop up and tell me whether this is a good use or bad use of the APIs.

Let's say you're writing some code that talks to a microblogging service. You have one class that handles both posting to the service and reading from it. As you write the code you realize that there are some time-consuming tasks (for instance, setting up an on-disk cache for posts) that you have to do in order to read from the service, but that you do not have to wait for if your first actions are going to be making new posts. After a bit of thought, you realize you can split up your initialization into two steps. Initialization needed for posting will be done immediately in the class's constructor, and initialization needed for reading will be set up in a future so that reading code will know when it can begin to process. Here's a rough sketch of what an implementation might look like:

import asyncio
import os
import sqlite3
import sys

import aiohttp

class Microblog:
    def __init__(self, url, username, token, cachedir):
        self.token = token
        self.username = username
        self.url = url
        loop = asyncio.get_event_loop()
        self.init_future = loop.run_in_executor(None, self._reading_init, cachedir)

    def _reading_init(self, cachedir):
        # Mainly setup our cache
        self.cachedir = cachedir
        os.makedirs(cachedir, mode=0o755, exist_ok=True)
        self.db = sqlite3.connect(os.path.join(cachedir, 'cache.sqlite'))
        # Create tables, fill in some initial data, you get the picture [....]

    @asyncio.coroutine
    def post(self, msg):
        data = dict(payload=msg)
        headers = dict(Authorization=self.token)
        reply = yield from aiohttp.request('post', self.url, data=data, headers=headers)
        # Manipulate reply a bit [...]
        return reply

    @asyncio.coroutine
    def sync_latest(self):
        # Synchronize with the initialization we need before we can read
        yield from self.init_future
        data = dict(per_page=100, page=1)
        headers = dict(Authorization=self.token)
        reply = yield from aiohttp.request('get', self.url, data=data, headers=headers)
        # Stuff the reply in our cache

if __name__ == '__main__':
    chirpchirp = Microblog('http://chirpchirp.com', 'a.badger', TOKEN, '/home/badger/cache/')
    loop = asyncio.get_event_loop()
    # Contrived -- real code would probably have a coroutine to take user input
    # and then submit that while interleaving with displaying new posts
    asyncio.async(chirpchirp.post(' '.join(sys.argv[1:])))
    loop.run_until_complete(chirpchirp.sync_latest())

Some of this code is just there to give an idea of how this could be used. The real questions revolve around splitting initialization into two steps.


20 Dec 2014 6:28pm GMT

Toshio Kuratomi: Pattern or Antipattern? Splitting up initialization with asyncio

"O brave new world, That has such people in't!" - William Shakespeare, The Tempest

Instead of spending the Thanksgiving weekend fighting crowds of shoppers I indulged my inner geek by staying at home on my computer. And not to shop online either - I was taking a look at Python-3.4's asyncio library to see whether it would be useful in general, run of the mill code. After quite a bit of experimenting I do think every programmer will have a legitimate use for it from time to time. It's also quite sexy. I think I'll be a bit prone to overusing it for a little while ;-)

Something I discovered, though - there's a great deal of good documentation and blog posts about the underlying theory of asyncio and how to implement some broader concepts using asyncio's API. There are quite a few tutorials that skim the surface of what you can theoretically do with the library without going into much depth. And there's a definite lack of examples showing how people are taking asyncio's API and applying it to real-world problems.

That lack is both exciting and hazardous. Exciting because it means there's plenty of neat new ways to use the API that no one's made into a wide-spread and oft-repeated pattern yet. Hazardous because there's plenty of neat new ways to abuse the API that no one's thought to write a post explaining why not to do things that way before. My joke about overusing it earlier has a large kernel of truth in it… there's not a lot of information saying whether a particular means of using asyncio is good or bad.

So let me mention one way of using it that I thought about this weekend - maybe some more experienced tulip or twisted programmers will pop up and tell me whether this is a good use or bad use of the APIs.

Let's say you're writing some code that talks to a microblogging service. You have one class that handles both posting to the service and reading from it. As you write the code you realize that there are some time-consuming tasks (for instance, setting up an on-disk cache for posts) that you have to do in order to read from the service, but that you do not have to wait for if your first actions are going to be making new posts. After a bit of thought, you realize you can split up your initialization into two steps. Initialization needed for posting will be done immediately in the class's constructor, and initialization needed for reading will be set up in a future so that reading code will know when it can begin to process. Here's a rough sketch of what an implementation might look like:

import os
import sys
import sqlite3
import asyncio

import aiohttp

class Microblog:
    def __init__(self, url, username, token, cachedir):
        self.token = token
        self.username = username
        self.url = url
        # Kick off the read-side initialization in the default executor;
        # posting can proceed without waiting for it
        loop = asyncio.get_event_loop()
        self.init_future = loop.run_in_executor(None, self._reading_init, cachedir)

    def _reading_init(self, cachedir):
        # Mainly setup our cache
        self.cachedir = cachedir
        os.makedirs(cachedir, mode=0o755, exist_ok=True)
        self.db = sqlite3.connect(os.path.join(cachedir, 'cache.sqlite'))
        # Create tables, fill in some initial data, you get the picture [....]

    @asyncio.coroutine
    def post(self, msg):
        data = dict(payload=msg)
        headers = dict(Authorization=self.token)
        reply = yield from aiohttp.request('post', self.url, data=data, headers=headers)
        # Manipulate reply a bit [...]
        return reply

    @asyncio.coroutine
    def sync_latest(self):
        # Synchronize with the initialization we need before we can read
        yield from self.init_future
        data = dict(per_page=100, page=1)
        headers = dict(Authorization=self.token)
        reply = yield from aiohttp.request('get', self.url, data=data, headers=headers)
        # Stuff the reply in our cache

if __name__ == '__main__':
    chirpchirp = Microblog('http://chirpchirp.com', 'a.badger', TOKEN, '/home/badger/cache/')
    loop = asyncio.get_event_loop()
    # Contrived -- real code would probably have a coroutine to take user input
    # and then submit that while interleaving with displaying new posts
    asyncio.async(chirpchirp.post(' '.join(sys.argv[1:])))
    loop.run_until_complete(chirpchirp.sync_latest())

Some of this code is just there to give an idea of how this could be used. The real questions revolve around splitting up initialization into two steps:


20 Dec 2014 6:28pm GMT

BangPypers: December 2014 Dev Sprint report

The December BangPypers meetup happened at the Apigee office in Koramangala. This time we didn't have workshops or talks, but a dev sprint.

The agenda was to contribute to open source projects. There were 53 participants. The event started with an introduction to the projects participants would like to work on. The participants were a mix of beginners and experienced contributors. We had five mentors: Baiju, Sayan, Siva, Elvis, and Krace.

Participants worked on the following open source projects: CPSLib, dubdubdub, uiautomator, fedora-infra, Network Visualization, genetic-drift, nidaba, Junction, pelican-pyembed, pssi, gridic, arrow, pynsq, rpmdev-assistant, and cloudlynt. The mentors helped some participants send their first pull request!

The participants were excited to see a demo by Elvis of a note/time grid that helps create small fragments of music.
Here is a list of pull requests and commits by participants (not in any order).

Comments by participants can be found on the meetup page.

Big thanks to Apigee for sponsoring the food and venue for the meetup, and to pssi for sponsoring the t-shirts. Complete photos of the event can be found on the G+ page.

We also have a mailing list where discussions about Python happen.

20 Dec 2014 10:14am GMT

Andre Roberge: Review of IPython Notebook Essentials

Disclaimer: following a post on Google+ by a Packt representative, I volunteered to do a review of IPython Notebook Essentials and got a free copy as an ebook (I used the pdf format).

The verdict: Not recommended in its current state.

IPython Notebook Essentials starts very well. It suggests using the Anaconda 3.4 distribution (which I have on my computer - more on that later) or connecting online using Wakari. Both methods worked fine for simple examples.

In chapter 1, we are given a quick tour of what the IPython notebook can do, using the modeling of a cup of coffee as an example. The idea, on its own, works well. [As a scientist, I found the use of degrees Fahrenheit a bit odd; it reminded me of books I read that were written 50 or more years ago.] However, while the author used variables that follow the normal Python convention, separating names with underscores as in temp_cream, the code formatting is at times atrocious, as variable names are sometimes split up after the underscore character, as in the following two lines of code:

initial_temp_mix_at_shop = temp_mixture(temp_coffee, vol_coffee, temp_
cream, vol_cream)

which are, ironically enough, in bold in the text, as the author wants us to focus our attention on their meaning (with no mention of the formatting error).

While I usually prefer holding a paper book over reading an ebook on my screen, I must admit that being able to copy-paste code samples beats having to retype everything. So it was possible to reproduce the examples quickly by copy-pasting and fixing the typos.

However, a much better way exists, one often used by Packt books: making code samples available for download. IPython notebooks have been designed for easy sharing. Here, the author has chosen not to make use of this, which, in my opinion, is a major flaw in this case.


Chapter 2 covers the notebook interface in more detail. This is undoubtedly the most important chapter of the book for anyone who wishes to learn about the IPython notebook. It covers a lot of ground.

However, I encountered many problems, some more serious than others. The first, which is perhaps more an annoyance than a real problem, is that one example intended to show code completion using the tab key is given as follows:

print am

Python programmers will immediately recognize this as being a Python 2 example. However, as far as I could tell (using ctrl-f "python 2") there isn't a single mention anywhere that Python 2 is used.
I happened to have the Anaconda 3.4 distribution installed (as recommended), but with Python 3.4 and not Python 2. Python 3 is now 6 years old and there is no excuse to focus on an old, and soon-to-be-obsolete, version of Python without at least mentioning why that choice was made. Minor syntax differences, like adding parentheses for print statements, are easily fixed; some more subtle ones are not. Furthermore, while I had the Anaconda distribution installed, I was still using the online Wakari site to work through the examples, so this was not a major problem at that point.

While still in chapter 2, we are invited to replace "%pylab inline" with "%pylab" and run an example again to see a plot appear in a window (I first assumed a separate browser window) instead of being shown in the document. This does not work using the online Wakari site: the window is actually a Qt window, not something that works in a browser. So, to try this example, I had to start my local version and recopy the code, which worked correctly.

Shortly thereafter, we are introduced to more "magic" commands and invited to try running a Cython example, loading from a local file. This did not work. The recommended "%%cython" magic no longer works in the latest IPython version included with the Python 3.4 Anaconda distribution. After a while, I found the proper way to run Cython code with the latest version, BUT the example provided raised a (numpy-related?) syntax error. I copy-pasted the code from my browser to the Wakari online version and it worked correctly, confirming that I had not made an error in reproducing the code given by the author. However, I was not able to figure out the source of the error using the local version.

After finishing Chapter 2, I stopped trying to run every single example and simply skimmed the book.

Chapter 3 focuses on producing plots with matplotlib, including animations. While not specific to the IPython notebook, this topic felt like an appropriate one to cover.

In Chapter 4, we learn about the pandas library which has very little to do with the IPython notebook. The situation is similar with Chapter 5 which covers SciPy, Numba and NumbaPro, again three libraries that have very little to do with the notebook as such. The choice of NumbaPro struck me as a bit odd since it is a commercial product. True enough, it can be tried for free - however, it is not something that I would consider to be an "essential" for the IPython notebook.

I know very little more about the IPython notebook than what I have learned from this book. However, I do know that it is possible to write extensions for the IPython notebook which is something that I would have thought should be included in any book titled "IPython Notebook Essentials", well before discussing specialized libraries such as Pandas, SciPy, Numba and NumbaPro.

There might very well be other, more notebook-specific topics that should be included, but I have no way to know from this book.

The book includes three appendices: a brief IPython notebook reference card, a brief review of Python, and an appendix on Numpy arrays. Both the Reference card and the Numpy arrays appendices seem like worthwhile additions. However, the brief review of Python seems a bit out of place. By including code like:

def make_lorenz(sigma, r, b):
    def func(statevec, t):
        x, y, z = statevec
        return [sigma * (y - x),
                r * x - y - x * z,
                x * y - b * z]
    return func


in Chapter 2, the author seems to assume, and rightly so in my opinion, that the reader will be familiar with Python. However, the appendix only covers the standard Python constructs that one may find in a beginner's book intended for programmers who are familiar with other languages. As such, the Python review appendix seems just like filler, increasing the page count while adding nothing of value to the book. Thankfully, it is relegated to an appendix instead of being inserted in an early chapter.

In summary, about half of the book contains information of value for someone who wants to learn about the IPython notebook itself; the choice of Python 2 over Python 3 is odd, and almost inexcusable given that it is not even mentioned anywhere; and the lack of downloadable code samples (mostly IPython notebooks in this case) greatly reduces the value of this book and is something that could be remedied by the author. In fact, given the typos mentioned above (where variable names are split over two lines), downloadable copies of the notebooks should be made available.

As I write this review, Packt is having a sale during which ebooks are available for $5. At that price, I would say that IPython Notebook Essentials is worth it if one wants to learn about the IPython Notebook; however, based on a quick scan of other Packt books covering the IPython notebook, I believe that better choices exist from the same publisher.

20 Dec 2014 6:54am GMT

19 Dec 2014

feedPlanet Python

Floris Bruynooghe: Pylint and dynamically populated packages

Python links the module namespace directly to the layout of the source locations on the filesystem. And this is mostly fine, certainly for applications. For libraries sometimes one might want to control the toplevel namespace or API more tightly. This also is mostly fine as one can just use private modules inside a package and import the relevant objects into the __init__.py file, optionally even setting __all__. As I said, this is mostly fine, if sometimes a bit ugly.

However, sometimes you have a library which may be loading a particular backend or platform support at runtime. An example of this is the Python zmq package. The apipkg module is also a very nice way of controlling your toplevel namespace more flexibly. The problem is that once you start using one of these things, Pylint no longer knows which objects your package provides in its namespace and will issue warnings about using non-existing things.

It turns out it is not too hard to write a plugin for Pylint which takes care of this. One just has to build the right AST nodes in the places where they would appear at runtime. Luckily the tools to do this easily are provided:

import importlib
import types

import astroid

def transform(mod):
    if mod.name == 'zmq':
        # Import the real package and mirror its runtime attributes
        module = importlib.import_module(mod.name)
        for name, obj in vars(module).copy().items():
            if (name in mod.locals or
                    not hasattr(obj, '__module__') or
                    not hasattr(obj, '__name__')):
                continue
            if isinstance(obj, types.ModuleType):
                ast_node = [astroid.MANAGER.ast_from_module(obj)]
            else:
                if hasattr(astroid.MANAGER, 'extension_package_whitelist'):
                    astroid.MANAGER.extension_package_whitelist.add(
                        obj.__module__)
                real_mod = astroid.MANAGER.ast_from_module_name(obj.__module__)
                ast_node = real_mod.getattr(obj.__name__)
                for node in ast_node:
                    fix_linenos(node)
            mod.locals[name] = ast_node

As you can see the hard work of knowing what AST nodes to generate is all done in the astroid.MANAGER.ast_from_module() and astroid.MANAGER.ast_from_module_name() calls. All that is left to do is add these new AST nodes to the module's globals/locals (they are the same thing for a module).

You may also notice the fix_linenos() call. This is a small helper needed when running on Python 3 and importing C modules (like for zmq). The reason is that Pylint tries to sort by line numbers, but for C code they are None and in Python 2 None and an integer can be happily compared but in Python 3 that is no longer the case. So this small helper simply sets all unknown line numbers to 0:

def fix_linenos(node):
    if node.fromlineno is None:
        node.fromlineno = 0
    for child in node.get_children():
        fix_linenos(child)

Lastly when writing this into a plugin for Pylint you'll want to register the transformation you just wrote:

def register(linter):
    astroid.MANAGER.register_transform(astroid.Module, transform)
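
If you save the transform(), fix_linenos() and register() functions together as a module somewhere on your Python path - say as pylint_zmq.py, though the name is just an example - you can enable it with pylint --load-plugins pylint_zmq.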

And that's all that's needed to make Pylint work fine with dynamically populated package namespaces. I've tried this on zmq as well as on a package using apipkg and it seems to work fine on both Python 2 and Python 3. Writing Pylint plugins seems not too hard!

19 Dec 2014 11:40pm GMT

Daniel Greenfeld: setup.py tricks

Setup.py tricks

Seasons greetings!

Before I begin, I want to make very clear that most of what I'm about to explain are 'tricks'. They aren't "best practices", and at least one of them is possibly inadvisable.

Speaking of inadvisable practices, at some point I'll write a 'setup.py traps' blog post, which are things I believe you should never, ever do in a setup.py module.

Tricks

These are tricks I use to make package management in Python a tiny bit easier. Before you attempt to implement them, I recommend you have at least basic experience with creating new packages. Two ways to learn about Python packaging are the New Library Sprint (beginner friendly) and the Python Packaging User Guide (more advanced).

'python setup.py publish'

This is where it all started. One day I was looking at some of Tom Christie's code and discovered the python setup.py publish command inside the setup.py module of Django Rest Framework. It goes something like this:

# setup.py
import os
import sys

# I'll discuss version tricks in a future blog post.
version = "42.0.0"

if sys.argv[-1] == 'publish':
    os.system("python setup.py sdist upload")
    os.system("python setup.py bdist_wheel upload")
    print("You probably want to also tag the version now:")
    print("  git tag -a %s -m 'version %s'" % (version, version))
    print("  git push --tags")
    sys.exit()

# Below this point is the rest of the setup() function

What's awesome about this is that using this technique I don't have to look up the somewhat cryptic python setup.py sdist upload command, or the actually cryptic python setup.py bdist_wheel upload. Instead, when it's time to push one of my packages to PyPI, I just type:

$ python setup.py publish

Much easier to remember!

'python setup.py tag'

The problem with Tom Christie's python setup.py publish command is that it forces me to type out the git tag command. Okay, let's be honest, it forces me to copy/paste the output of my screen. Therefore, all on my very own, I 'invented' the python setup.py tag command:

# setup.py

if sys.argv[-1] == 'tag':
    os.system("git tag -a %s -m 'version %s'" % (version, version))
    os.system("git push --tags")
    sys.exit()

Pretty nifty, eh? Now I don't have to remember so many cryptic git commands. And I get to shorten the python setup.py publish command:

if sys.argv[-1] == 'publish':
    os.system("python setup.py sdist upload")
    os.system("python setup.py bdist_wheel upload")
    sys.exit()

When I need to do a version release, I commit my code then type:

$ python setup.py publish
$ python setup.py tag

Why don't I combine the commands? Well, you aren't supposed to put things like 'RC1' or '-alpha' in your PyPI version names. By separating the commands I have finer-grained control over my package releases. I'm encouraged to place alpha, beta, and release candidates in git tags, rather than formal PyPI releases.

'python setup.py test'

I'm fairly certain some of my readers are going to have a serious problem with this trick. In fact, depending on the response of those who manage Python's packaging infrastructure, it might be moved to my forthcoming 'traps' blog post.

Alrighty then...

I like py.test. I've blogged about the use of py.test. I try to use it everywhere. Yet, I'm really not a fan of how we're supposed to tie it into python setup.py test. The precise moment I get uncomfortable with py.test is when it makes me add special classes into setup.py.

Fortunately, there is another way:

if sys.argv[-1] == 'test':
    test_requirements = [
        'pytest',
        'flake8',
        'coverage'
    ]
    try:
        # list() forces the imports to run on Python 3, where map() is lazy
        modules = list(map(__import__, test_requirements))
    except ImportError as e:
        # str(e) works on both Python 2 and 3; e.message is Python 2 only
        err_msg = str(e).replace("No module named ", "")
        msg = "%s is not installed. Install your test requirements." % err_msg
        raise ImportError(msg)
    os.system('py.test')
    sys.exit()

Which means I get to use py.test and python setup.py test with a trivial addition of code:

$ python setup.py test

In theory, one could run pip install on the missing requirements, or read them from a requirements file. However, since these are 'tricks', I like to keep things short and sweet. If I get enough positive responses for this one, I'll update this example to include calling pip for missing requirements.

What about subprocess?

There are those who will ask, "Why aren't you using the subprocess library for these shell commands?"

My answer to that question is, "Because if I need a nuclear weapon to kill a rabbit maybe I'm overdoing things." For these simple tricks, the os.system() function is good enough.
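
That said, for anyone curious what the alternative looks like, here's a sketch of one of the calls above rewritten with subprocess; the main practical difference is that a failing command raises an exception instead of being silently ignored:

import subprocess

# Roughly equivalent to os.system("git push --tags"), but a non-zero
# exit code raises subprocess.CalledProcessError instead of being ignored
subprocess.check_call(['git', 'push', '--tags'])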

Traps!

Stay tuned for my 'traps' blog post to come out early in 2015.

19 Dec 2014 8:00pm GMT

Mike Driscoll: eBook Review: Flask Framework Cookbook

Packt Publishing recently sent me a copy of the eBook version of Flask Framework Cookbook by Shalabh Aggarwal. I didn't read it in its entirety, as cookbooks don't usually make for a very interesting linear read. I just went through it and cherry-picked various recipes. But before I get into too much detail, let's do the quick review!


Quick Review

  • Why I picked it up: I was asked by the publisher to read the book.

Book Formats

You can get this book in paperback, epub, mobi, or PDF.


Book Contents

The book is split up into 12 chapters covering 258 pages with over 80 recipes.


Full Review

Packt is always putting out niche Python books. Flask is one of the more popular Python micro-web frameworks, so it probably has a good-sized audience. Let's spend a few moments looking at what the chapters cover. In chapter one, we find out the many ways you can configure Flask. It contains information about using class-based settings, static files, blueprints and more. Chapter two changes things up a bit with a group of recipes about Jinja, a templating language. In chapter three, we move into data modeling using SQLAlchemy. There are also recipes for Redis, Alembic and MongoDB. Chapter four is about working with views. It contains information on XHR requests, class-based views, custom 404 handlers and several other recipes.

In chapter five, the author focuses on webforms with WTForms. Here we learn about field validation, uploading files and cross-site request forgery. For chapter 6, we move into authentication recipes. There are recipes for the Flask-Login extension, OpenID, Facebook, Google and Twitter. Chapter 7 goes into RESTful API building. There are only four recipes in this chapter, with two on creating different types of REST interfaces. The last recipe is a complete REST API example, though. Chapter eight is all about the admin interface in Flask. Here you will learn about the Flask-Admin extension, custom forms, user roles and more!

Chapter nine takes us into internationalization and localization. It has the fewest recipes at just three. You learn how to add a new language, language switching and gettext/ngettext. Moving on to chapter ten, we learn about debugging, error handling and testing. Here we cover everything from emailing errors, to using the pdb debugger, to nose, mock and coverage tests. Chapter 11 is about deployment. It covers recipes about Apache, Gunicorn, Tornado, Fabric, Heroku, AWS Elastic Beanstalk, application monitoring and a few other items. Chapter 12 rounds out the book with other tips and tricks such as full-text search, working with signals, caching, Celery, etc.

Overall, I found the book fairly well written. There are some spots that are a bit choppy, as I don't believe the author is a native English speaker, but the prose doesn't suffer very much because of this. Most of the recipes work well as standalone snippets. Sometimes the snippets don't seem to be completely runnable, but you should be able to download the full code from Packt. I didn't always find the groupings of recipes to be completely cohesive, but for the most part, they made sense together. I would recommend this book for a beginner in Flask who wants to take their skills to the next level, and for those who need a more complete understanding of some of the things you can do with Flask.

Flask Framework Cookbook

by Shalabh Aggarwal

Amazon

Packt Publishing


Other Book Reviews

19 Dec 2014 6:15pm GMT

Mike Driscoll: $5 Python Books from Packt

Packt Publishing recently contacted me to let me know that they're having a $5 sale on their website for all their eBooks and Videos. Since they have a LOT of different Python and Python-related books, I thought my readers might want to know about that sale. Here's their press release:

Following the success of last year's festive offer, Packt Publishing will be celebrating the holiday season with an even bigger $5 offer. From Thursday 18th December, every eBook and video will be available on the publisher's website for just $5. Customers are invited to purchase as many as they like before the offer ends on Tuesday January 6th, making it the perfect opportunity to try something new or to take your skills to the next level as 2015 begins. With all $5 products available in a range of formats and DRM-free, customers will find great value content delivered exactly how they want it across Packt's website this Xmas and New Year.

Find out more at www.packtpub.com/packt5dollar

19 Dec 2014 2:33pm GMT

Python Anywhere: #VATMESS - or, how a taxation change took 4 developers a week to handle

A lot of people are talking about the problems that are being caused by a recent change to taxation in the EU; this TechCrunch article gives a nice overview of the issues. But we thought it would be fun just to tell the story of our own experience - for general interest, and as an example of what one UK startup had to do to implement these changes. Short version: it hasn't been fun.

If you know all about the EU VAT changes and just want to know what we did at PythonAnywhere, click here to skip the intro. Otherwise, read on...

The background

"We can fight over what the taxation levels should be, but the tax system should be very, very simple and not distortionary." - Adam Davidson

The tax change is, in its most basic form, pretty simple. But some background will probably help. The following is simplified, but hopefully reasonably clear.

VAT is Value Added Tax, a tax that is charged on pretty much all purchases of anything inside the EU (basic needs like food are normally exempt). It's not dissimilar to sales or consumption tax, but the rate is quite high: in most EU countries it's something like 20%. When you buy (say) a computer, VAT is added on to the price, so a PC that the manufacturer wants EUR1,000 for might cost EUR1,200 including VAT. Prices for consumers are normally quoted with VAT included, when the seller is targeting local customers. (Companies with large numbers of international customers, like PythonAnywhere, tend to quote prices without VAT and show a note to the effect that EU customers have to pay VAT too.)

When you pay for the item, the seller takes the VAT, and they have to pay all the VAT they have collected to their local tax authority periodically. There are various controls in place to make sure this happens, that people don't pay more or less VAT than they've collected, and in general it all works out pretty simply. The net effect is that stuff is more expensive in Europe because of tax, but that's a political choice on the part of European voters (or at least their representatives).

When companies buy stuff from each other (rather than sales from companies to consumers) they pay VAT on those purchases if they're buying from a company in their own country, but they can claim that VAT back from their local tax authorities (or offset it against VAT they've collected), so it's essentially VAT-free. And when they buy from companies in other EU countries or internationally, it's VAT-free. (The actual accounting is a little more complicated, but let's not get into that.)

What changed

"Taxation without representation is tyranny." - James Otis

Historically, for "digital services" -- a category that includes hosting services like PythonAnywhere, but also downloaded music, ebooks, and that kind of thing -- the rule was that the rate of VAT that was charged was the rate that prevailed in the country where the company doing the selling was based. This made a lot of sense. Companies would just need to register as VAT-collecting businesses with their local authorities, charge a single VAT rate for EU customers (apart from sales to other VAT-registered businesses in other EU countries), and pay the VAT they collected to their local tax authority. It wasn't trivially simple, but it was doable.

But, at least from the tax authorities' side, there was a problem. Different EU countries have different VAT rates. Luxembourg, for example, charges 15%, while Hungary is 27%. This, of course, meant that Hungarian companies were at a competitive disadvantage to Luxembourgeoise companies.

There's a reasonable point to be made that governments who are unhappy that their local companies are being disadvantaged by their high tax rates might want to consider whether those high tax rates are such a good idea, but (a) we're talking about governments here, so that was obviously a non-starter, and (b) a number of large companies had a strong incentive to officially base themselves in Luxembourg, even if the bulk of their business -- both their customers and their operations -- was in higher-VAT jurisdictions.

So, a decision was made that instead of basing the VAT rate for intra-EU transactions for digital services on the VAT rate for the seller, it should be based on the VAT rate for the buyer. The VAT would then be sent to the customer's country's tax authority -- though to keep things simple, each country's tax authority would set up a process where the company's local tax authority would collect all of the money from the company along with a file saying how much was meant to go to each other country, and they'd handle the distribution. (This latter thing is called, in a wonderful piece of bureaucratese, a "Mini One-Stop Shop" or MOSS, which is why "VATMOSS" has been turned into a term for the whole change.)

As these kind of things go, it wasn't too crazy a decision. But the knock-on effects, both those inherent in the idea of companies having to charge different tax rates for different people, but also those caused by the particular way the details of laws have been worked out, have been huge. What follows is what we had to change for PythonAnywhere. Let's start with the basics.

Different country, different VAT rate

"The wisdom of man never yet contrived a system of taxation that would operate with perfect equality." - Andrew Jackson

Previously we had some logic that worked out if a customer was "vattable". A user in the UK, where we're based, is vattable, as is a non-business anywhere else in the EU. EU-based businesses and anyone from outside the EU were non-vattable. If a user was vattable, we charged them 20%, a number that we'd quite sensibly put into a constant called VAT_RATE in our accounting system.
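
In code, the old check boiled down to something like the following sketch (the field names and the EU_COUNTRIES set are hypothetical stand-ins, not our actual billing code):

# Abbreviated here; the real set lists all 28 member states
EU_COUNTRIES = {'AT', 'BE', 'DE', 'FR', 'GB', 'HU', 'LU'}

def is_vattable(customer):
    if customer.country == 'GB':
        # Our home country: VAT is always charged
        return True
    if customer.country in EU_COUNTRIES:
        # EU businesses account for VAT themselves (reverse charge)
        return not customer.is_vat_registered_business
    return False  # outside the EU: no VAT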

What needed to change? Obviously, we have a country for each paying customer, from their credit card/PayPal details, so a first approximation of the system was simple. We created a new database table, keyed on the country, with VAT rates for each. Now, all the code that previously used the VAT_RATE constant could do a lookup into that table instead.
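
Sketched with sqlite3 for illustration (the real table lives in our main billing database and has different names), the change looks roughly like this:

import sqlite3

conn = sqlite3.connect('billing.db')  # illustrative path
conn.execute("""
    CREATE TABLE IF NOT EXISTS vat_rates (
        country TEXT NOT NULL,  -- ISO 3166-1 alpha-2 code, e.g. 'HU'
        rate    REAL NOT NULL   -- percentage, e.g. 27.0
    )
""")

def vat_rate_for(country):
    # Replaces the old VAT_RATE constant
    row = conn.execute(
        "SELECT rate FROM vat_rates WHERE country = ?", (country,)
    ).fetchone()
    return row[0] if row else None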

So now we're billing people the right amount. We also need to store the VAT rate on every invoice we generate so that we can produce the report to send to the MOSS, but there's nothing too tricky about that.

Simple, right? Not quite. Let's put aside the fact that there's no solid source for VAT rates across the EU (the UK tax authorities recommend that people look at a PDF on an EU website that's updated irregularly and has a table of VAT rates using non-ISO country identifiers, with the countries' names in English but sorted by their name in their own language, so Austria is written "Austria" but sorted under "O" for "Österreich").

No, the first problem is in dealing with evidence.

Where are you from?

"Extraordinary claims require extraordinary evidence." - Carl Sagan

How do you know which country someone is from? You'd think that for a paying customer, it would be pretty simple. Like we said a moment ago, they've provided a credit card number and an address, or a PayPal billing address, so you have a country for them. But for dealing with the tax authorities, that's just not enough. Perhaps they feared that half the population of the EU would be flocking to Luxembourgeoise banks for credit cards based there to save a few euros on their downloads.

What the official UK tax authority guidelines say regarding determining someone's location is this (and it's worth quoting in all its bureaucratic glory):

1.5 Record keeping

If the presumptions referred above don't apply, you'll be expected to obtain and to keep in your records 2 pieces of non-contradictory evidence from the following list to support your taxing decisions. Examples include:

  • the billing address of the customer
  • the Internet Protocol (IP) address of the device used by the customer
  • location of the bank
  • the country code of SIM card used by the customer
  • the location of the customer's fixed land line through which the service is supplied to him
  • other commercially relevant information (for example, product coding information which electronically links the sale to a particular jurisdiction)

Once you have 2 pieces of non-contradictory evidence that is all you need and you don't need to collect any further supporting evidence. This is the case even if, for example, you obtain a third piece of evidence which happens to contradict the other 2 pieces of information. You must keep VAT MOSS records for a period of 10 years from 31 December of the year during which the transaction was carried out.

For an online service paid with credit cards, like PythonAnywhere, the only "pieces of evidence" we can reasonably collect are your billing address and your IP address.

So, our nice simple checkout process had to grow another wart. When someone signs up, we have to look at their IP address and compare it against a GeoIP database to get a country. Once we have both that and their billing address, we have to check them against each other: if the two countries agree, the signup can proceed; if they don't, we can't safely determine a VAT rate.
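
A minimal sketch of that comparison, using the geoip2 library as one possible GeoIP source (the database path and helper name are hypothetical):

import geoip2.database
import geoip2.errors

reader = geoip2.database.Reader('/path/to/GeoLite2-Country.mmdb')

def evidence_matches(billing_country, ip_address):
    # Two pieces of non-contradictory evidence: billing address vs GeoIP
    try:
        ip_country = reader.country(ip_address).country.iso_code
    except geoip2.errors.AddressNotFoundError:
        return False  # no second piece of evidence; treat as a mismatch
    return ip_country == billing_country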

We've set things up so that when someone is blocked due to a location mismatch like that, we show the user an apologetic page, and the PythonAnywhere staff get an email so that we can talk to the customer and try to work something out. But this is a crazy situation: perfectly legitimate customers - someone signing up while travelling, say, or connecting through a VPN - can fail the check through no fault of their own.

As far as we can tell (and we've spoken to our [very expensive] accountants about this) there's no other way to implement the rules in a manner consistent with the guidelines.

This sucks.

And it's not all. Now we need to deal with change...

Ch-ch-changes

"Time changes everything except something within us which is always surprised by change." - Thomas Hardy

VAT rates change (normally upwards). When you're only dealing with one country's VAT rate, this is pretty rare - maybe once every few years - so as long as you've put it in a constant somewhere and you're willing to update your system when it changes, you're fine. But if you're dealing with 28 countries, it becomes something you need to plan for happening pretty frequently, and the rates have to be stored in data somewhere.

So, we added an extra valid_from field to our table of VAT rates. Now, when we look up the VAT rate, we plug today's date into the query to make sure that we've got the right VAT rate for this country today. (If you're wondering why we didn't just set things up with a simple country-rate mapping that would be updated at the precise point when the VAT rate changed, read on.)
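
To make that concrete, here's a sketch of the date-aware lookup with a hypothetical Django-style model (our real schema has more to it, but the idea is the same):

from datetime import date

from django.db import models

class VatRate(models.Model):
    country = models.CharField(max_length=2)                     # ISO 3166-1 alpha-2
    rate = models.DecimalField(max_digits=5, decimal_places=4)   # e.g. 0.2000 for 20%
    valid_from = models.DateField()

def vat_rate_for(country, on_date=None):
    # The rate in force on a given date is the one with the newest
    # valid_from that isn't later than that date.
    on_date = on_date or date.today()
    return (VatRate.objects
            .filter(country=country, valid_from__lte=on_date)
            .order_by("-valid_from")
            .values_list("rate", flat=True)
            .first())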

No big deal, right? Well, perhaps not, but it's all extra work. And of course we now need to check a PDF on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying "Beware of the Leopard" to see when it changes. We're hoping a solid API for this will crop up somewhere -- though there are obviously regulatory issues there, because we're responsible for making sure we use the right rate, and we can't push that responsibility off to someone else. The API would need to be provided by someone we were 100% sure would keep it updated at least as well as we would ourselves -- so, for example, a volunteer-led project would be unlikely to be enough.

So what went wrong next? Let's talk about subscription payments.

Reconciling ourselves to PayPal

"Nothing changes like changes, because nothing changes but the changes." - Gary Busey

Like many subscription services, at PythonAnywhere we use external companies to manage our billing. The awesome Stripe handle our credit card payments, and we also support PayPal. This means that we don't need to store credit card details in our own databases (a regulatory nightmare) or do any of the horrible integration work with legacy credit card processing systems.

So, when you sign up for a paid PythonAnywhere account, we set up a subscription on either Stripe or PayPal. The subscription is for a specific amount to be billed each month; on Stripe it's a "gross amount" (that is, including tax) while PayPal splits it into a separate "net amount" and "tax amount". But the common thing between them is that they don't do any tax calculations themselves. For international companies that have to deal with billing all over the world, this makes sense, especially given that historically, say, a UK company might have been billing through its Luxembourg subsidiary to get the lower VAT rate, so there are no safe assumptions that the billing companies could make on their customers' behalf.
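
The conversion between the two representations is trivial arithmetic -- gross = net * (1 + rate) -- but it's the kind of money calculation you want to keep in Decimal, never floats. A sketch (these helpers are illustrative, not our actual billing code):

from decimal import Decimal, ROUND_HALF_UP

TWO_PLACES = Decimal("0.01")

def stripe_gross_amount(net, vat_rate):
    # Stripe stores a single gross amount: net plus tax.
    return (net * (1 + vat_rate)).quantize(TWO_PLACES, rounding=ROUND_HALF_UP)

def paypal_amounts(net, vat_rate):
    # PayPal stores the net amount and the tax amount separately.
    tax = (net * vat_rate).quantize(TWO_PLACES, rounding=ROUND_HALF_UP)
    return net, tax

# A $12/month plan at 20% VAT:
#   stripe_gross_amount(Decimal("12.00"), Decimal("0.20")) -> Decimal("14.40")
#   paypal_amounts(Decimal("12.00"), Decimal("0.20"))      -> (Decimal("12.00"), Decimal("2.40"))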

Now, the per-country, date-based VAT rate lookup that we'd built earlier meant that when a customer signed up, we'd set up a subscription on PayPal/Stripe with the right VAT amount for the time when the subscription was created. But if and when the VAT rate changed in the future, it would be wrong. Our code would make sure that we were sending out billing reminders for the right amount, but it wouldn't fix the subscriptions on PayPal or Stripe.

What all that means is that when a VAT rate is going to change, we need to go through all of our customers in the affected country, work out whether each one is billed via PayPal or Stripe, and then tell the relevant payment processor that the amount we're charging them needs to change.

This is tricky enough as it is. What makes it even worse is an oddity with PayPal billing. You cannot update the billing amount for a customer in the 72 hours before a payment is due to be made. So, for example, let's imagine that the VAT rate for Malta is going to change from 18% to 20% on 1 May. At least three days before, you need to go through and update the billing amounts on subscriptions for all Maltese customers whose next billing date is after 1 May. And then sometime after you have to go through all of the other Maltese customers and update them too.
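
To make the timing rule concrete, here's a sketch of the decision each customer requires (just the date logic, with the 72 hours padded to three whole days; the helper is hypothetical):

from datetime import date, timedelta

PAYPAL_BLACKOUT = timedelta(days=3)  # no amount changes within 72 hours of a charge

def paypal_update_deadline(next_billing_date, rate_change_date):
    if next_billing_date >= rate_change_date:
        # The next charge is already at the new rate, so the subscription
        # must be corrected before the blackout window opens.
        return next_billing_date - PAYPAL_BLACKOUT
    # The next charge is still at the old rate; correct the subscription
    # once that charge has gone through.
    return next_billing_date + timedelta(days=1)

# Maltese example: rate change on 1 May, customer next billed on 3 May:
#   paypal_update_deadline(date(2015, 5, 3), date(2015, 5, 1)) == date(2015, 4, 30)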

Keeping all of this straight is a nightmare, especially when you factor in things like the fact that (again due to PayPal) we actually charge people one day after their billing date, and users can be "delinquent" -- that is, their billing date was several days ago but due to (for example) a credit card problem we've not been able to charge them yet (but we expect to be able to do so soon). And so on.

The solution we came up with was to write a reconciliation script. Instead of having to remember "OK, so Malta's going from 18% to 20% on 1 May so we need to run script A four days before, then update the VAT rate table at midnight on 1 May, then run script B four days after", and then repeat that across all 28 countries forever, we wanted something that would run regularly and would just make everything work, by checking when people are next going to be billed and what the VAT rate is on that date, then checking what PayPal and Stripe plan to bill them, and adjusting them if appropriate.

A code sample is worth a thousand words, so here's the algorithm we use for that. It's run once for every EU user (whose subscription details are in the subscription parameter). get_next_charge_date_and_gross_amount and update_vat_amount are functions passed in as dependencies, so that we can use the same algorithm for both PayPal and Stripe.

from datetime import datetime

def reconcile_subscription(
    subscription,
    get_next_charge_date_and_gross_amount,
    update_vat_amount
):
    profile = subscription.user.get_profile()

    # What we think we should invoice, and when...
    next_invoice_date = profile.next_invoice_date()
    # ...versus what the payment processor is actually set up to charge.
    next_charge_date, next_charge_gross_amount = get_next_charge_date_and_gross_amount(subscription)

    if next_invoice_date < datetime.now():
        # Cancelled subscription
        return

    if next_charge_date < next_invoice_date:
        # We're between invoicing and billing -- don't try to do anything
        return

    expected_next_invoice_vat_rate = profile.billing_vat_rate_as_of(next_invoice_date)
    expected_next_charge_vat_amount = profile.billing_net_amount * expected_next_invoice_vat_rate

    if next_charge_gross_amount == expected_next_charge_vat_amount + profile.billing_net_amount:
        # User has correct billing set up on the payment processor
        return

    # The processor's amount is stale -- push the corrected VAT amount to it
    update_vat_amount(subscription, expected_next_charge_vat_amount)

We're pretty sure this works. It's passed every test we've thrown at it. And for the next few days we'll be running it in our live environment in "nerfed" mode, where it will just print out what it's going to do rather than actually updating anything on PayPal or Stripe. Then we'll run it manually for the first few weeks, before finally scheduling it as a cron job to run once a day. And then we'll hopefully be in a situation where when we hear about a VAT rate change we just need to update our database with a new row with the appropriate country, valid from date, and rate, and it will All Just Work.
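
A pleasant side effect of passing the update function in as a dependency is that the "nerfed" mode needs no special-casing in the algorithm itself: we can just inject a logging stand-in for the real updater. Something like this (the adapter names in the comments are hypothetical):

def nerfed(update_vat_amount):
    # Dry-run wrapper: report what would change instead of touching the processor.
    def wrapper(subscription, new_vat_amount):
        print("Would set VAT amount for %s to %s" % (subscription, new_vat_amount))
    return wrapper

# Real run:  reconcile_subscription(sub, paypal_next_charge, paypal_update_vat)
# Dry run:   reconcile_subscription(sub, paypal_next_charge, nerfed(paypal_update_vat))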

(An aside: this probably all comes off as a bit of a whine against PayPal. And it is. But they do have positive aspects too. Lots of customers prefer to have their PythonAnywhere accounts billed via PayPal for the extra level of control it gives them -- about 50% of new subscriptions we get use it. And the chargeback model for fraudulent use under PayPal is much better -- even Stripe can't isolate you from the crazy-high chargeback fees that banks impose on companies when a cardholder claims that a charge was fraudulent.)

In conclusion

"You can't have a rigid view that all new taxes are evil." - Bill Gates

The changes in the EU VAT legislation came from the not-completely-unreasonable attempt by the various governments to stop companies from setting up businesses in low-VAT countries for the sole purpose of offering lower prices to their customers, despite having their operations in higher-VAT countries.

But the administrative load placed on small companies (including but not limited to tech startups) is large, it makes billing systems complex and fragile, and it imposes restrictions on sales that are likely to reduce trade. We've seen various government-sourced estimates of the cost of these regulations on businesses floating around, and they all seem incredibly low.

At PythonAnywhere we have a group of talented coders, and it still took over a week of development time that we could have spent working on stuff our customers wanted. Other startups will be in the same position; it's an irritation but not the end of the world.

For small businesses without deep tech talent, we dread to think what will happen.

19 Dec 2014 12:36pm GMT


Django Weblog: DjangoCon Europe 2015

2015's DjangoCon Europe will take place in Cardiff, Wales, from the 2nd to the 7th June, for six days of talks, tutorials and code. Here's a snapshot of what's in store.

Six whole days

For the first time ever, we're holding a six-day DjangoCon.

The event will begin with an open day. All sessions on the open day - talks and tutorials of various different kinds - will be free, and open to the public. (This follows the example of Django Weekend Cardiff, where the open day proved hugely successful.)

The open day will:

  • help introduce Django to a wider audience
  • give newer members of the community a head start, to help them get the most from the following five days

There'll be a DjangoGirls workshop on our open day, and lots more besides.

Two days of code

Following the three days of talks, we won't just have two days of sprints: we'll have two days of code - code sprints, code clinics and workshops. We want everyone to have a reason to stay on after the talks, and participate, whatever their level of experience.

All these sessions will be free of charge to conference attendees. Some of them will be worth the ticket price of the entire conference on their own.

Values

We aim to put on a first-class technical conference. We also want the event to embody three very important values:

Diversity

We aim to make DjangoCon Europe 2015 a milestone in the Django community's effort to improve diversity.

Accessibility

We want this DjangoCon to set the highest standards for accessibility, and to ensure that we do not inadvertently exclude anyone from participating fully in the event.

Social responsibility

DjangoCon Europe 2015 expresses the Django community's values of fairness, respect and consideration as undertakings of social responsibility.

Cardiff

We're sure you'll enjoy the city and our venues, and we're looking forward to welcoming you in June. Here's some comprehensive information on how to get to Cardiff.

Social events

We have a number of social events planned - more details will be published soon.

Registration, call for papers and other key milestones

Here's a list of key dates. Ticket prices will be published when registration opens.

And finally

We're seeking sponsorship, of course, and would love to hear from any organisations that can contribute financial support to the event.

There's contact information on the website. If there's anything you want to know, just ask - the organising committee are at your disposal.

19 Dec 2014 9:34am GMT


Python Anywhere: New PythonAnywhere update: Mobile, UI, packages, reliability, and the dreaded EU VAT change

We released a bunch of updates to PythonAnywhere today :-) Short version: we've made some improvements to the iPad and Android experience, applied fixes to our in-browser console, added a bunch of new pre-installed packages, done a big database upgrade that should make unplanned outages rarer and shorter, and made changes required by EU VAT legislation (EU customers will soon be charged their local VAT rate instead of UK VAT).

Here are the details:

iPad and Android

User interface

New packages

We've added loads of new packages to our "batteries included" list:

Additionally, for people who like alternative shells to the ubiquitous bash, we've added fish.

Reliability improvement

We've upgraded one of our underlying infrastructural databases to SSD storage. We've had a couple of outages recently caused by problems with this database, which were made much worse by the fact that it took a long time to start up after a failover. Moving it to SSD moved it to new hardware (which we think will make it less likely to fail) and will also mean that if it does fail, it should recover much faster.

EU VAT changes

For customers outside the EU, this won't change anything. But for non-business customers inside the EU, starting 1 January 2015, we'll be charging you VAT at the rate for your country, instead of using the UK VAT rate of 20%. This is the result of some (we think rather badly-thought-through) new EU legislation. We'll write an extended post about this sometime soon.

19 Dec 2014 9:23am GMT


Kay Hayen: Nuitka Release 0.5.6

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release brings bug fixes, important new optimization, newly supported platforms, and important compatibility improvements. Progress on all fronts.

Bug Fixes

  • Closure taking of global variables in member functions of classes that had a class variable of the same name was binding to the class variable as opposed to the module variable.

  • Overwriting compiled function's __doc__ attribute more than once could corrupt the old value, leading to crashes. Issue#156. Fixed in 0.5.5.2 already.

  • Compatibility Python2: The exec statement and execfile behave differently depending on whether locals() is given as an argument.

    def function():
        a = 1

        exec code in locals()  # Cannot change local "a".
        exec code in None      # Can change local "a".
        exec code              # Can change local "a".
    

    Previously Nuitka treated all 3 variants the same.

  • Compatibility: Empty branches with a condition were reduced to only the condition, but in fact they also need to check the truth value:

    if condition:
        pass
    # must be treated as
    bool(condition)
    # and not (bug)
    condition
    
  • Detection of Windows virtualenv was not working properly. Fixed in 0.5.5.2 already.

  • Large enough constant structures are now unstreamed via the marshal module, avoiding the pointless generation of large amounts of code. Fixed in 0.5.5.2 already.

  • Windows: Pressing CTRL-C gave two stack traces, one from the re-execution of Nuitka which was rather pointless. Fixed in 0.5.5.1 already.

  • Windows: Searching for virtualenv environments didn't terminate in all cases. Fixed in 0.5.5.1 already.

  • During installation from PyPI with Python3 versions, there were errors given for the Python2 only scons files. Issue#153. Fixed in 0.5.5.3 already.

  • Fix, the arguments of yield from expressions could be leaked.

  • Fix, closure taking of a class variable could happen in a sub class where the module variable was meant.

    var = 1

    class C:
        var = 2

        class D:
            def f():
                # was C.var, now correctly addresses the top-level var
                return var
    
  • Fix, setting the CXX environment variable (because the installed gcc version is too low) wasn't affecting the version check at all.

  • Fix, on Debian/Ubuntu with hardening-wrapper installed, the version check was always failing, because it reports a shortened version number to Scons.

New Optimization

  • Making use of SSA, accesses to local variables that must already be assigned are now known to have no side effects. This allows a host of optimizations to be applied to them as well, often yielding simpler access/assign code, and discovering in more cases that frames are not necessary.
  • Micro optimization to dict built-in for simpler code generation.

Organizational

  • Added support for ARM "hard float" architecture.
  • Added package for Ubuntu 14.10 for download.
  • Added package for openSUSE 13.2 for download.
  • Donations were used to buy a Cubox-i4 Pro. It got Debian Jessie installed on it, and will be used to run an even larger amount of tests.
  • Made it more clear in the user documentation that the .exe suffix is used for all platforms, and why.
  • Generally updated information in user manual and developer manual about the optimization status.
  • Using Nikola 7.1 with external filters instead of our own, outdated branch for the web site.

Cleanups

  • PyLint clean for the first time ever. We now have a Buildbot driven test that this stays that way.
  • Massive indentation cleanup of keyword argument calls. We have a rule to align the keywords, but as this was done manually, it could easily get out of sync. Now, with an "autoformat" tool based on RedBaron, it's correct. Also, spacing around arguments is now automatically corrected. More to come.
  • For exec statements, the copying back to local variables is now an explicit node in the tree, leading to cleaner code generation, as it now uses normal variable assignment code generation.
  • The MaybeLocalVariables became explicit about which variable they might be, and they now contribute to its SSA trace as well, which was incomplete before.
  • Removed some cases of code duplication that were marked as TODO items. This often resulted in cleanups.
  • Do not use replaceWith on child nodes that potentially were re-used during their computation.

Summary

The release is mainly the result of consolidation work. While the previous release contained many important enhancements, this is another important step towards full SSA, closing one loophole (class variables and exec functions), as well as applying it to local variables, largely extending its use.

The amount of cleanups is tremendous, in huge part due to infrastructure problems that repeatedly prevented releases. This greatly reduces the technical debt.

More importantly, it would appear that now eliminating local and temp variables that are not necessary is only a small step away. But as usual, while this may be easy to implement now, it will uncover more bugs in existing code, that we need to address before we continue.

19 Dec 2014 6:54am GMT


10 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: King Willams Town Bahnhof

Yesterday morning I had to go to the station in KWT to pick up our reserved bus tickets for the Christmas holidays in Capetown. The station itself has had no train service since December for cost reasons - but Translux and co - the long-distance buses - have their offices there.






© benste CC NC SA

10 Nov 2011 10:57am GMT

09 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein

Nobody is worried about that kind of thing - you just drive through by car, and in the city - near Gnobie - "nah, it only gets dangerous once the fire brigade is there" - 30 minutes later, on the way back, the fire brigade was there.




© benste CC NC SA

09 Nov 2011 8:25pm GMT

08 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Brai Party

Brai = barbecue evening or something similar.

Would-be technicians mending their SpeakOn / jack plug splitters...

The ladies ("mamas") of the settlement at the official opening speech

Even though fewer people came than expected: loud music and lots of people ...

And of course a fire with real wood for grilling.

© benste CC NC SA

08 Nov 2011 2:30pm GMT

07 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Lumanyano Primary

One of our missions was bringing Katja's Linux Server back to her room. While doing that we saw her new decoration.

Björn and Simphiwe carried the PC to Katja's school


© benste CC NC SA

07 Nov 2011 2:00pm GMT

06 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Nelisa Haircut

Today I went with Björn to Needs Camp to visit Katja's guest family for a special party. First of all we visited some friends of Nelisa - yeah, the one I'm working with in Quigney - Katja's guest father's sister - who gave her a haircut.

African women usually get their hair done by arranging extensions, not by just cutting some hair like Europeans do.

In between she looked like this...

And then she was done - looks amazing considering the amount of hair she had last week - doesn't it?

© benste CC NC SA

06 Nov 2011 7:45pm GMT

05 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Mein Samstag

Somehow it struck me today that I need to restructure my blog posts a bit - if I only ever report on new places, I'd have to be on a round trip. So here are a few things from my everyday life today.

First of all: Saturday counts as a day off, at least for us volunteers.

This weekend only Rommel and I are on the farm - Katja and Björn are at their placements by now, and my housemates Kyle and Jonathan are at home in Grahamstown - as is Sipho, who lives in Dimbaza.
Robin, Rommel's wife, has been in Woodie Cape since Thursday to take care of a few things there.
Anyway, this morning we first treated ourselves to a shared Weetbix/muesli breakfast and then set off for East London. Two things were on the checklist: Vodacom and Ethienne (the estate agent), plus bringing the missing things to NeedsCamp on the way back.

Just after setting off on the dirt road, we realised that we hadn't packed the things for NeedsCamp and Ethienne, but did have the pump for the water supply in the car.

So in East London we first drove to Farmerama - no, not the online game Farmville, but a shop with all sorts of things for a farm - in Berea, a northern part of town.

At Farmerama we got advice on a quick-release coupling that should make life with the pump easier, and we also brought a lighter pump in for repair, so that it isn't such a big effort every time the water has run out again.

Fego Caffé is in the Hemmingways Mall; there we had to get the PIN and PUK of one of our data SIM cards, because some digits unfortunately got transposed when the PIN was entered. Anyway, shops in South Africa store data as sensitive as a PUK - which in principle gives access to a locked phone.

In the cafe Rommel then carried out a few online transactions with the 3G modem, which was working again - and which, by the way, now works perfectly in Ubuntu, my Linux system.

On the side I went to 8ta to find out about their new deals, since we want to offer internet in some of Hilltop's centres. The picture shows the UMTS coverage in NeedsCamp, Katja's town. 8ta is a new phone provider from Telkom; after Vodafone bought Telkom's shares in Vodacom, they have to build up a network completely from scratch.
We decided to organise a free prepaid card to test, because who knows how accurate the coverage map above is ... Before you sign even the cheapest 24-month deal, you should know whether it works.

After that we went to Checkers in Vincent, looking for two hotplates for WoodyCape - R 129.00 each - so about €12 for a double hotplate.
As you can see in the background, there's already Christmas decoration - at the beginning of November, and that in South Africa at a sunny, warm 25°C minimum.

For lunch we treated ourselves to a Pakistani curry takeaway - highly recommended!
Well, and after we got back an hour or so ago, I cleaned the fridge, which I had simply put outside this morning to defrost. Now it's clean again and free of its 3 m thick layer of ice...

Tomorrow ... I'll report on that separately ... but probably not until Monday, because then I'll be back in Quigney (East London) again and have free internet.

© benste CC NC SA

05 Nov 2011 4:33pm GMT

31 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Sterkspruit Computer Center

Sterkspruit is one of Hilltops Computer Centres in the far north of Eastern Cape. On the trip to J'burg we used the opportunity to take a look at the centre.

Pupils in the big classroom


The Trainer


School in Countryside


Adult Class in the Afternoon


"Town"


© benste CC NC SA

31 Oct 2011 4:58pm GMT

Benedict Stein: Technical Issues

What do you do in an internet cafe if your ADSL and fax line have been discontinued before month's end? Well, my idea was sitting outside and eating some ice cream.
At least it's sunny and not as rainy as on the weekend.


© benste CC NC SA

31 Oct 2011 3:11pm GMT

30 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Nellis Restaurant

For those who are traveling through Zastron - there is a very nice restaurant which serves delicious food at reasonable prices.
In addition they're selling home-made juices, jams and honey.




interior


home made specialities - the shop in the shop


the Bar


© benste CC NC SA

30 Oct 2011 4:47pm GMT

29 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: The way back from J'burg

During the 10-12h trip from J'burg back to ELS I was able to take a lot of pictures, including these different roadsides

Plain Street


Orange River in its beginnings (near Lesotho)


Zastron Anglican Church


The Bridge in Between "Free State" and Eastern Cape next to Zastron


my new Background ;)


If you listen to GoogleMaps you'll end up traveling 50km of gravel road - as it was just renewed we didn't have that many problems and saved 1h compared to going the official way with all its construction sites




Freeway


getting dark


© benste CC NC SA

29 Oct 2011 4:23pm GMT

28 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: How does a construction site actually work?

Sure, some things may be different, though a lot is the same - but the everyday sight of a road construction site in Germany - how does that actually work in South Africa?

First of all - NO, no natives digging with their hands - even though more manpower is used here, they are busily working with technology.

A perfectly normal "federal road"


and how it is being widened


loooots of trucks


because here one side is completely closed over a long stretch, resulting in alternating traffic-light control with a waiting time of 45 minutes


But at least they seem to be having fun ;) - as did we, since luckily we never had to wait longer than 10 min.

© benste CC NC SA

28 Oct 2011 4:20pm GMT