25 May 2015

Planet Plone

David "Pigeonflight" Bain: Plone 5 Beta 2 in the Cloud

These are quick instructions for installing Plone 5 Beta 2 on Codio.com. I have found no faster (or more newbie-friendly) way to get a Plone sandbox up and running. I'll assume you have already signed up with Codio. If not, go ahead and do that first. Step 1 - Create a new Project: on the Codio dashboard, select Create Project, use Default as the Starting Point, and click Create. Step 2

25 May 2015 3:00pm GMT

Gil Forcada: WPOD is approaching, are you ready for it?

This Friday, May 29th, we are celebrating the 2nd edition of WPOD!

Look around your city, meet with friends, make new ones and contribute to Plone!

As usual IRC will be full of contributors willing to help smooth your path into Plone, so don't be shy and ask ;)

For Berliners: we will meet again at der Freitag offices, please RSVP at meetup.

See you there/in IRC!

25 May 2015 1:54pm GMT

Asko Soukka: Customize Plone 5 default theme on the fly

When I recently wrote about how to reintroduce ploneCustom for Plone 5 TTW (through the web) by yourself, I got some feedback that it was the wrong thing to do, and that the correct way would always be to create your own custom theme.

If you are ready to let the precious ploneCustom go, here's how you can currently customize the default Barceloneta theme on the fly by creating a new custom theme.

Inherit a new theme from Barceloneta

So, let's customize a brand new Plone 5 site by creating a new theme, which inherits everything from Barceloneta theme, yet allows us to add additional rules and styles:

  1. Open Site Setup and Theming control panel.

  2. Create a New theme, not yet activated, with the title mytheme (or your own title, once you get the concept).

  3. In the opened theme editor, replace the contents of rules.xml with the following code:


    <?xml version="1.0" encoding="UTF-8"?>
    <rules
        xmlns="http://namespaces.plone.org/diazo"
        xmlns:css="http://namespaces.plone.org/diazo/css"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xi="http://www.w3.org/2001/XInclude">

      <!-- Import Barceloneta rules -->
      <xi:include href="++theme++barceloneta/rules.xml" />

      <rules css:if-content="#visual-portal-wrapper">
        <!-- Placeholder for your own additional rules -->
      </rules>

    </rules>
  4. Still in the theme editor, add a New file with the name styles.less, then edit and Save it with the following content:


    /* Import Barceloneta styles */
    @import "++theme++barceloneta/less/barceloneta.plone.less";

    /* Customize navbar color */
    @plone-sitenav-bg: pink;
    @plone-sitenav-link-hover-bg: darken(pink, 20%);

    /* Customize navbar text color */
    .plone-nav > li > a {
      color: @plone-text-color;
    }

    /* Customize search button */
    #searchGadget_form .searchButton {
      /* Re-use mixin from Barceloneta */
      .button-variant(@plone-text-color, pink, @plone-gray-lighter);
    }

    /* Inspect Barceloneta theme (and its less-folder) for more... */

But before activating the new theme, there's one more manual step to do...

Register and build a new LESS bundle

We just created a new LESS file, which first imports the main Barceloneta LESS file and then adds our own additional styles, using some features of the LESS syntax. To actually turn that LESS file into usable CSS (through the browser), we need to register a new bundle for it and build it:

  1. Open Site Setup and Resource Registries control panel.

  2. Add a resource with the name mytheme and a single CSS/LESS file with the path ++theme++mytheme/styles.less to locate the file we just added to our theme.

  3. Save.

  4. Add a bundle with the name mytheme, requiring the mytheme resource we just created, and with Does your bundle contain any RequireJS or LESS files? checked.

  5. Save.

  6. Build mytheme bundle.

Now you should be ready to return to the Theming control panel, activate the theme, and see the gorgeous pink navigation bar.


Note: To really be a good citizen and follow the rules, there are a few additional steps:

  1. Add a production-css setting to your theme's manifest.cfg to point to the compiled CSS bundle:


    [theme]
    title = mytheme
    description =
    production-css = /++plone++static/mytheme-compiled.css
  2. In Resource Registries, disable mytheme bundle by unchecking its Enabled checkbox and clicking Save.

  3. Deactivate and activate the theme once.

Technically this changes the CSS bundle to be registered as a so-called Diazo bundle instead of a regular bundle. The difference is that a Diazo bundle is always rendered last and can therefore override any CSS rule introduced by the other enabled bundles. Also, as a Diazo bundle it gets disabled and enabled properly when the active theme is changed.

25 May 2015 1:05am GMT

23 May 2015

Planet Plone

Gil Forcada: Fast clones for faster CI

Did you ever get tired of waiting for a repository (managed by mr.developer) to be cloned?

Wait no more! mr.developer 1.32 comes with two new options to make your life easier:

What's the benefit of this? See it for yourself:

git clone https://github.com/plone/buildout.coredev.git test-full
du -sh test-full
36M test-full

git clone https://github.com/plone/buildout.coredev.git --depth=1 test-shallow
du -sh test-shallow
1,5M    test-shallow

That's a 24x saving.

So think about your CI environments, where the whole repository history is downloaded over and over while you only care about the very latest changes.

Think about remote environments, too (a mobile connection while you are on a train?).

Update: it seems that it is completely broken; sorry for that, working on a fix.

Update 2: mr.developer 1.33 released, thanks @fschulze for the new release!

23 May 2015 2:44pm GMT

22 May 2015

Planet Plone

Maurits van Rees: zest.releaser 4.0: major update


zest.releaser 4.0 has various improvements. We have better errors and warnings, did some cleanup, added little tricks, and have a recommended list of extra packages to use in combination with zest.releaser: check-manifest, pyroma, wheel, and twine.

Main improvements

Errors and warnings

We are better at showing errors. This is especially true for errors when uploading to PyPI. They were unintentionally swallowed before, so you did not notice when an upload failed. Oops, sorry about that.

Errors and warnings are more noticeable, because we colorize them. Errors are red, warnings are magenta. We use the colorama package for this.
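
For illustration, coloring output with colorama boils down to something like this (a minimal sketch, not the actual zest.releaser code):

import colorama
from colorama import Fore, Style

colorama.init()  # makes the ANSI escape codes work on Windows too

def print_error(message):
    # zest.releaser prints errors in red...
    print(Fore.RED + message + Style.RESET_ALL)

def print_warning(message):
    # ...and warnings in magenta.
    print(Fore.MAGENTA + message + Style.RESET_ALL)

print_error("ERROR: upload to PyPI failed")
print_warning("WARNING: some output may be missing colors")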

We not only do this for lines printed explicitly by zest.releaser itself, but we also try to do it for output from programs that we call, like check-manifest and python setup.py. This is a bit tricky though. Programs should print standard messages to the standard output stream and errors and warnings to the standard error stream. Not all do that. So we might be missing some colors.

We allow retrying some commands when there is an error. Currently this is only done for commands that talk to PyPI or another package index. We ask the user if she wants to retry: Yes, no, quit. If for example PyPI upload fails because you have the wrong credentials in your ~/.pypirc, you can edit this file and tell zest.releaser to retry the command.
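
Conceptually, the retry loop is just a question prompt around the command (a simplified sketch in Python 3 syntax, not the actual implementation):

import subprocess

def run_with_retry(args):
    # Keep offering a retry until the command succeeds or the user gives up.
    while True:
        if subprocess.call(args) == 0:
            return
        answer = input("Command failed. Retry? (Yes/no/quit) ").strip().lower()
        if answer.startswith("n"):
            return  # skip this step and continue the release
        if answer.startswith("q"):
            raise SystemExit(1)  # abort the whole release
        # Anything else, including just pressing Enter, means: try again.

# For example: run_with_retry(["python", "setup.py", "sdist"])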

Cleanup

Python 2.6 is not officially supported anymore. It will probably still work, but we are no longer testing against it, so issues may start creeping in. This says nothing about the packages that are created: zest.releaser on Python 2.7 is still perfectly capable of creating distributions for older Python versions. Or for Python 3.

Sorry, zest.releaser itself does not run on Python 3. At least I have not tried it, and certainly the tests will be a pain to fix for both 2 and 3.

We have removed code for support of collective.sdist. That package was a backport from distutils for Python 2.5 and earlier, which we do not support.

Little tricks

We do not accept y or n as an answer for a new version. I saw that with a few packages, and it seems an obvious mistake.

When doing a postrelease, we always edit the changelog file to add a new version section with the text '- Nothing changed yet'. Now, when you do a prerelease and no real changelog entry has been added, so that text is still there, we warn and ask if you really want a new release. Probably you want to stop, check your version control history, and add one or more proper changelog entries.
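
The check itself can be as simple as looking for that placeholder text (a sketch of the idea, assuming a CHANGES.rst; the real code is more careful about version sections):

def changelog_has_real_entries(path="CHANGES.rst"):
    # The placeholder written by postrelease is still there when
    # nobody has added a proper changelog entry since.
    with open(path) as changelog:
        return "- Nothing changed yet" not in changelog.read()

if not changelog_has_real_entries():
    print("WARNING: the changelog still says '- Nothing changed yet'.")
    if not input("Really release? (y/N) ").lower().startswith("y"):
        raise SystemExit(1)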

zest.releaser makes commits in the prerelease and postrelease phase. Something like Preparing release 1.0 and Back to development: 1.1. You can add extra text to these messages by configuration in your setup.cfg or global ~/.pypirc.

One nice use case for this is telling Travis or Jenkins to skip Continuous Integration builds, like this:

[zest.releaser]
extra-message = [ci skip]

This depends on how your testing server is set up. It might not have this enabled, or it might be looking for a different message.

Of course, you can also add your favorite geeky quotes there:

[zest.releaser]
extra-message =
    No one expects the Spanish inquisition!
    So long and thanks for all the fish.

Recommended extra

We have a list of recommended extra packages that we think are helpful for most users of zest.releaser. This is in the zest.releaser[recommended] extra. So if you use pip, you can install those like this:

pip install zest.releaser[recommended]

With buildout, you would use a section like this:

[script]
recipe = zc.recipe.egg
dependent-scripts = true
eggs =
   zest.releaser[recommended]

What is currently in this list? We have check-manifest, pyroma, wheel, and twine.

check-manifest

This checks if the distribution that zest.releaser will create of your package contains all the files that you expect to be there. In some cases your setup.py will contain enough information for Python to include the correct files. Or you rely on a helper like setuptools-git, but then you miss files when someone else who does not have this helper package makes the next release. So in a lot of cases you will want to use a MANIFEST.in file. As an example, zest.releaser itself currently has this:

recursive-include zest *
recursive-include doc *
include *
exclude .installed.cfg
global-exclude *.pyc

You can run check-manifest as a standalone command. When used in combination with zest.releaser, the prerelease part calls the check-manifest command for you. That may look like this:

$ fullrelease 
INFO: Starting prerelease.
Do you want to run check-manifest? (Y/n)? 
listing source files under version control:
6 files and directories
building an sdist: mauritstestpackage-0.2.dev0.tar.gz:
6 files and directories
copying source files to a temporary directory
building a clean sdist: mauritstestpackage-0.2.dev0.tar.gz:
5 files and directories
files in version control do not match the sdist!
missing from sdist:
  CHANGES.rst
suggested MANIFEST.in rules:
  include *.rst
MANIFEST.in is not in order.
Do you want to continue despite that? (y/N)? 

check-manifest may report some files as missing from the source distribution where you know this is fine. You can tell check-manifest to ignore those by adding a setup.cfg file next to your setup.py. zest.releaser itself currently has this:

[check-manifest]
ignore =
    doc/build
    doc/build/*

We could add some more there, like bootstrap.py, buildout.cfg, and .travis.yml.

For more info on check-manifest, see its PyPI page: https://pypi.python.org/pypi/check-manifest

pyroma

This does various checks on your package. Most of them are about your setup.py. This results in a rating between zero (bad) and ten (good). The rating is also given with the name of a cheese. If your package 'smells' good, you get a better cheese. Where does the name of the package come from? It checks the aroma of your package: Python + aroma = pyroma.

You can run pyroma as a standalone command. When used in combination with zest.releaser, the prerelease part calls the pyroma command for you. That may look like this:

$ fullrelease 
INFO: Starting prerelease.
Do you want to run check-manifest? (Y/n)? n
Run pyroma on the package before tagging? (Y/n)? 
INFO: ------------------------------
INFO: Checking /Users/mauritsvanrees/own/mauritstestpackage
INFO: Found mauritstestpackage
INFO: ------------------------------
INFO: Your package does not have classifiers data.
INFO: You should specify what Python versions you support.
INFO: Your package does not have keywords data.
INFO: Your package does not have author_email data.
INFO: Setuptools and Distribute support running tests.
      By specifying a test suite, it's easy to find and
      run tests both for automated tools and humans.
INFO: ------------------------------
INFO: Final rating: 6/10
INFO: Compté
INFO: ------------------------------
Continue? (Y/n)? n

For more info on pyroma, see its PyPI page: https://pypi.python.org/pypi/pyroma

wheel

There are various package formats for Python. Until now, zest.releaser was only doing source distributions, by calling python setup.py sdist. We still do this. Note that we have never released binary eggs (bdist_egg).

Wheels are the shiny new Python package distribution format. zest.releaser 4.0 supports creating them and pushing them to PyPI.

Should you want this? Maybe. See http://pythonwheels.com for deciding whether this is a good idea for your package. Briefly, if it is a pure Python 2 or pure Python 3 package: just do it.

Also, if you are using buildout: sorry, buildout currently (2.3.1) does not support wheels. It is fine to create them, but you should still create a source distribution as well. zest.releaser does that, like it always has.

Say you want zest.releaser to create a wheel. How do you do this? You add a setup.cfg file in the top level directory of your package, so next to setup.py:

[zest.releaser]
create-wheel = yes

Or if you know you want this for all your packages, you can also do this globally by adding the same text in your ~/.pypirc.

zest.releaser then takes care of the rest: when releasing, it creates a plain old source distribution and a shiny new wheel and uploads them to the package index.
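
Behind the scenes this is roughly equivalent to running the following yourself (sketched with subprocess; the actual invocations inside zest.releaser may differ in detail):

import glob
import subprocess

# Build a plain old source distribution plus a shiny new wheel...
subprocess.check_call(["python", "setup.py", "sdist", "bdist_wheel"])

# ...and upload everything in dist/ to the package index.
subprocess.check_call(["twine", "upload"] + glob.glob("dist/*"))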

For more info on wheel, see its PyPI page: https://pypi.python.org/pypi/wheel

twine

Since version 4.0, we prefer twine for uploading to the Python Package Index, because it is safer: it uses https for uploading. If the twine command is available, it is used for uploading to PyPI.

Note that we call the twine command directly. If the twine command is not available, you may need to change your system PATH. In the case of buildout, you may need to say dependent-scripts = true in the section where you add the zest.releaser[recommended] egg.

How does it look? When I used zest.releaser to release itself, it looked like this:

INFO: This package is registered on PyPI.
Register and upload to pypi (Y/n)? 
INFO: Running: twine upload dist/* -r pypi
Uploading distributions to https://pypi.python.org/pypi
Uploading zest.releaser-4.0-py2-none-any.whl
Uploading zest.releaser-4.0.tar.gz

When your package is not registered yet on PyPI, twine will currently fail. So you have to register the package manually. Use your browser to login at PyPI and then use the package registration form.

We could consider letting zest.releaser call the old-style python setup.py register instead, or ask the user. If you have thoughts about this, you can use our issue tracker to start a discussion.

As an aside, did you know that there is a test website for the Python Package Index? You can use this for making test releases of packages. If you want to take zest.releaser for a test run and do not want to publish your package on the real PyPI yet, you can release to https://testpypi.python.org. Edit your ~/.pypirc file to something like this:

[distutils]
index-servers =
    pypi
    testpypi

[pypi]
# default repository: https://pypi.python.org/pypi
username:maurits
password:secret

[testpypi]
repository:https://testpypi.python.org/pypi 
username:maurits
password:secret

Then when running our fullrelease or release command, answer 'no' when zest.releaser asks if you want to upload to pypi and answer 'yes' when asked to upload to testpypi.

For more info on twine, see its PyPI page: https://pypi.python.org/pypi/twine

That's it. I hope you enjoyed reading about the improvements. Now go use it! Get it with pip install zest.releaser or pip install zest.releaser[recommended] at https://pypi.python.org/pypi/ or read more at http://zestreleaser.readthedocs.org

22 May 2015 10:57pm GMT

Davide Moro: Kotti CMS - ElasticSearch integration

Announcing a new Kotti CMS (a Python web framework based on Pylons/Pyramid and SQLAlchemy) plugin that provides ElasticSearch integration for fulltext search and indexing: kotti_es.

Development status? It should be considered experimental, because this is the very first implementation. So any kind of help will be very much appreciated! Beer, testing, pull requests, feedback, improving test coverage and so on.

Acknowledgements

kotti_es is based on a pyramid_es fork (https://github.com/truelab/pyramid_es/tree/feature-wrapper, there is a PR in progress). The pyramid_es author is Scott Torborg (https://github.com/storborg).

Configuration

The configuration is very simple.

Just enable the kotti_es plugin, choose the index name and the ElasticSearch server addresses.

From the kotti_es README file:

kotti.configurators =
    kotti_es.kotti_configure

elastic.index = your_project
elastic.servers = localhost:9200
elastic.ensure_index_on_start = 1
kotti_es.blacklist =
    Image
    ...

kotti.search_content = kotti_es.util.es_search_content

Index already existing contents

With kotti_es you can reindex all your already existing contents without any change to the original Kotti code base with just one command:

$ reindex_es -c app.ini

So kotti_es plays well with models defined by third party plugins that are not ElasticSearch aware. You can install kotti_es on an already existing Kotti instance.

Custom behaviours

If you want, you can override/extend the default indexing policy by registering your own custom adapter. See the kotti_es tests for more info.

So no need to change existing models, no need to inherit from mixin classes and so on.
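
In spirit, such an adapter could look like the sketch below. Every name in it is hypothetical and made up for illustration; the real registration API is shown in the kotti_es tests:

class NewsItemIndexer(object):
    """Hypothetical custom indexing policy for an existing content type."""

    def __init__(self, context):
        self.context = context

    def elastic_document(self):
        # Index only the fields we care about.
        return {"title": self.context.title,
                "description": self.context.description}

# register_es_adapter(NewsItem, NewsItemIndexer)  # hypothetical registration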

Video

kotti_es in action:

Wanna know more about Kotti CMS?

If you want to know more about Kotti CMS have a look at:

All Kotti posts published by @davidemoro:

22 May 2015 10:51pm GMT

Reinout van Rees: Pygrunn: ZeroMQ - Pieter Hintjens

(One of the summaries of the 2015 Pygrunn conference)

Pieter Hintjens has quite some experience with distributed systems. Distributed systems are, to him, about making our systems look more like the real world. The real world is distributed.

Writing distributed systems is hard. You need a big stack. The reason we're using http so much is that it was one of the first protocols that was pretty simple and that we could understand. Almost everything seems to be http now.

Three comments:

  • So: the costs of such a system must be low. He really likes ZeroMQ, especially because it makes it cheap.

  • We lack a lot of knowledge. The people that can do it well are few. Ideally, the community should be bigger. We have to build the culture, build the knowledge. Zeromq is one of the first bigger open source projects that succeeded.

  • Conway's law: an organization will build software that looks like itself. A centralized power-hungry organization will probably build centralized power-hungry software.

    So: if you want to write distributed systems stuff, your organization has to be distributed!

    Who has meetings in his company? They are bad bad bad. They're blocking. You have to "synchronize state" and wait for agreement. A conference like pygrunn is fine: meeting people is fine. At pygrunn, there's no state synchronization. Imagine that it were a meeting to agree on a standard editor...

In a distributed system, what you really want is participation. Open source development needs pull requests, so to say.

A question about making money from open source resulted in a rant (I don't mean the term very negatively here) about open source software being the only way to produce valuable software. "You might as well ask how you can make money from a free school system". "It is idiotic to ask the question". And some things about people believing things because someone says it is so (like "you can only make money with ...") without thinking for themselves.

Something to emulate: our food system. Nobody owns the complete food system. But it works! Lots of smaller and bigger actors. And everyone had breakfast and lunch today. The system works. This kind of distributed system is an example to emulate in our open source software.

Nice comparison when asked about successful commercial software. Gmail is a successful example, but that's something that grew pretty organically. Compare that with Google Wave or Google Plus: who even remembers them? Those were vision-driven software. Made based on money. A failure.

22 May 2015 6:49pm GMT

Reinout van Rees: Pygrunn: Orchestrating Python projects using CoreOS - Oscar Vilaplana

(One of the summaries of the 2015 Pygrunn conference)

(Note: Oscar Vilaplana had a lot of info in his presentation and also a lot on his slides, so this summary is not as elaborate as what he told us. Wait for the video for the full version.)

"Orchestrating python": why? He cares about reliability. You need a static application environment. Reliable deployments. Easy and reliable continuous integration. And self-healing. Nice is if it is also portable.

A common way to make scalable systems is to use microservices. You compose, mix and extend them into bigger wholes. Ideally it is "cluster-first": also locally you test with a couple of instances. A "microservices architecture".

Wouldn't it be nice to take the "blue pill" and move to a different reality? One where you have small services, each running in a separate container without a care for what occurs around it? No sysadmin stuff? And similarly, the smart infrastructure people only have to deal with generic containers that can't break anything.

He did a little demo with rethinkdb and flask.

For the demo he used CoreOS: kernel + docker + etcd. CoreOS uses a read-only root filesystem and by design it doesn't have a package manager. Journald for logging (it automatically captures the stdout). Systemd for managing processes.

etcd? It is a distributed configuration store. It has an http API.
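
That http API is simple enough to use from any language; from Python, plain requests will do (a minimal sketch against etcd's v2 API, assuming a local etcd on the default port):

import requests

ETCD = "http://127.0.0.1:2379/v2/keys"

# Store a configuration value in the cluster...
requests.put(ETCD + "/services/web/port", data={"value": "8080"})

# ...and read it back from any machine in the cluster.
node = requests.get(ETCD + "/services/web/port").json()["node"]
print(node["value"])  # -> 8080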

Also: "fleet". "systemd for services". It starts up the containers. It coordinates accross the cluster. It will re-start containers if they die.

How do we get containers to talk to each other? They're containerized... For that there's "flannel": "dhcp for containers". Per-cluster specific subnet. Per-machine smaller subnet. The best system to run all this is Kubernetes.

Kubernetes uses "replication controllers". The basis is a "pod", from which multiple replicas are made, depending on the number of instances you need.

He then showed a demo, including a rolling update. Nice. Similarly for a rethinkdb cluster where he increased the number of nodes halfway through the demo. Nice, too.

In development, it might be easy to use "nspawn" instead of docker. It is mostly the same, only less isolated (which is handy for development).

22 May 2015 1:50pm GMT

Andreas Jung: XML-driven Plone portal "Onkopedia" finally online

Onkopedia is a medical guideline portal in the field of hematology and oncology. It is based on the Plone content management system and driven by an XML publishing workflow with the conversion from DOCX to XML/HTML and PDF.

22 May 2015 12:01pm GMT

Reinout van Rees: Pygrunn: Laurence de Jong - Towards a web framework for distributed apps

(One of the summaries of the 2015 Pygrunn conference)

Laurence de Jong is a graduate student.

Everyone uses the internet. Many of the most-used sites are centralized. Centralization means control. It also gives scale advantages, like with gmail's great spam filter.

It also has drawbacks. If the site goes down, it is really down. Another drawback is the control they have over our data and what they do with it. If you're not paying for it, you're the product being sold. Also: eavesdropping. Centralized data makes it easy for agencies to collect the data. And: censorship!

A better way would be decentralized websites. There are existing decentralized things like Freenet, but they're a pain to install and the content on there is not the content you want to see... And part of it is stored on your hard disk...

See also Maelstrom, which distributes websites as torrents. A problem there is the non-existence of proper decentralized DNS: you have unreadable hashes.

A solution could be the blockchain system from bitcoin, in the form of namecoin. This way, you could store secure DNS records pointing to torrent hashes in a decentralized way.

https://github.com/HelloZeroNet/ZeroNet uses namecoin to have proper DNS addresses and to download the website via bittorrent. Not many people use it right now.

And.... the websites you download right now are all static. We want dynamic content! You can do even that with blockchains. An example is the decentralized twitter alternative http://twister.net.co/. Mostly used by Chinese people because twitter is mostly unavailable there.

There are problems, of course. Where do you store your data? Agencies can still do traffic analysis. How do you manage your private keys? Aren't we getting browser wars all over again? And can your mom install it (answer: no, it is too hard)?

An extra problem is more technical: distributed hash tables are considered unsafe.

And... in the end, if you use hashes for everything (like every individual tweet, email and webpage), that's a lot of hashes to store, partially locally. So it isn't the solution, but at least it is a solution.

22 May 2015 11:58am GMT

Reinout van Rees: Pygrunn: Data acquisition with the Vlermv database - Thomas Levine

(One of the summaries of the 2015 Pygrunn conference)

Thomas Levine wrote vlermv. A simple "kind of database" by using folders and files. Python is always a bit verbose when dealing with files, so that's why he wrote vlermv.

Usage:

from vlermv import Vlermv
vlermv = Vlermv('/tmp/a-directory')

vlermv['filename'] = 'something'
# ^^^ This saves a python pickle with 'something' to /tmp/a-directory/filename

The advantage is that the results are always readable, even if you lose the original program.

You can choose a different serializer, for instance json instead of pickle.

You can also choose your own key_transformer. A key_transformer translates a key to a filename. Handy if you want to use a datetime or tuple as a key, for instance.
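
Both are keyword arguments when creating the store; roughly like this (kwarg names as mentioned in the talk, check the vlermv docs for the exact API):

import json
from vlermv import Vlermv

# The same folder-backed store, but values are saved as human-readable
# JSON instead of pickles; any module with dump/load works as serializer.
vlermv = Vlermv('/tmp/a-directory', serializer=json)
vlermv['filename'] = {'some': 'data'}
# /tmp/a-directory/filename now contains JSON.

# A key_transformer would additionally control how keys such as
# datetimes or tuples are mapped to file names.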

The two hard things in computer science are:

  • Cache invalidation.
  • Naming things.

Cache invalidation? Well, vlermv doesn't do cache invalidation, so that's easy. Naming things? Well, the name 'vlermv' comes from typing randomly on his (dvorak) keyboard... :-)

Testing an app that uses vlermv is easy: you can mock the entire database with a simple python dictionary.
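
That works because application code only relies on dictionary access:

def count_cached_pages(db):
    # 'db' can be a real Vlermv store...
    return len(db)

# ...or, in a test, just a plain dictionary.
assert count_cached_pages({'a': 1, 'b': 2}) == 2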

What if vlermv is too new for you? You can use the standard library shelve module that does mostly the same, only it stores everything in one file.

A drawback of vlermv: it is quite slow.

Fancy full-featured databases are fast and nice, but do you really need all those features? If not, wouldn't you be better served by a simple vlermv database? You might even use it as a replacement for mongodb! That one is used often only because it is so easy to start with and so easy to create a database. If you don't have a lot of data, vlermv might be a much better fit.

22 May 2015 11:07am GMT

Reinout van Rees: Pygrunn: Reliable distributed task scheduling - Niels Hageman

(One of the summaries of the 2015 Pygrunn conference)

Note: see Niels Hageman's somewhat-related talk from 2012. Niels works at Paylogic. Wow, the room was packed.

They discovered the normal problem of operations that took too long for the regular request/response cycle. The normal solution is to use a task queue. Some requirements:

  • Support python, as most of their code is in python.
  • It has to be super-reliable. It also needs to allow running in multiple data centers (for redundancy).
  • Ideally, a low-maintenance solution as they already have enough other work.

Option 1: celery + rabbitMQ. It is widely used and relatively easy to use. But rabbitMQ was unreliable. With alarming frequency, the two queues in the two datacenters lost sync. They also got clogged from time to time.

Option 2: celery + mysql. They already use mysql, which is an advantage. But... the combination was buggy and not production-ready.

Option 3: gearman with mysql. The Python bindings were buggy and unmaintained. And you could only run one gearman bundle, so multiple datacenters were out of the question.

Option 4: do it yourself. They did this and ended up with "Taskman" (which I couldn't find online, they're planning on making it open source later on: they still need to add installation documentation).

The backend? They started with mysql. It is a great relational database, but it isn't a great queue. There is a saying on the internet: Thou shalt not use thine database as a task queue. With some adjustments, like autocommit, they got it working nicely anyway.
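
The core trick is easy to sketch: claim a task with a single atomic UPDATE, so two workers can never grab the same row. (Hypothetical schema and code; Taskman itself is not published yet.)

import MySQLdb

conn = MySQLdb.connect(db="taskman")
conn.autocommit(True)  # one of the adjustments mentioned above

def claim_next_task(worker_id):
    # The UPDATE is atomic: only one worker can flip a given task
    # from 'pending' to 'running'.
    cursor = conn.cursor()
    claimed = cursor.execute(
        "UPDATE tasks SET status = 'running', worker = %s "
        "WHERE status = 'pending' ORDER BY id LIMIT 1", (worker_id,))
    if not claimed:
        return None  # the queue is empty
    # Assuming a worker runs one task at a time:
    cursor.execute(
        "SELECT id, payload FROM tasks WHERE worker = %s "
        "AND status = 'running'", (worker_id,))
    return cursor.fetchone()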

The task server consists of a python daemon (running under supervisor) and a separate task runner. It runs in a separate process to provide isolation and resource control.

Of course, the task server needs to be integrated with the main server. The task server is written as an independent application, so how does the task runner find the python functions it needs to run? They do this via "server plugins" that define which environment variables are needed, which python path you need, and which function and version you need. All this gets applied by the task runner, which can subsequently import and run the function.

Some additional features of their task runner:

  • Tasks can report progress.
  • Tasks can be aborted.
  • Task start time can be constrained.
  • There's exception handling.

Some of the properties of taskman: it is optimized for long running tasks. And: it is designed for reliability. Very necessary, as Paylogic is a payment processor.

It also means it is less suited when you have lots of little tasks. Running everything as a separate process is fine for longer-running processes, but it is too heavyweight for lots of small tasks. Oh, and there's no admin UI yet: he uses phpMyAdmin :-)

22 May 2015 10:28am GMT

Reinout van Rees: Pygrunn: Python, WebRTC and You - Saúl Ibarra Corretgé

(One of the summaries of the 2015 Pygrunn conference)

Saúl Ibarra Corretgé does telecom and VOIP stuff for his work, which is what webRTC calls legacy :-)

webRTC is Real-Time Communication for the web via simple APIs. So: voice calling, video chat, P2P file sharing without needing internal or external plugins.

Basically it is a big pile of C++ that sits in your browser. One of the implementations is http://www.webrtc.org/. Some people say that webRTC stands for Well, Everybody Better Restart Their Chrome, because browser support is mostly limited to chrome. There's a plugin for IE/safari, though.

There are several javascript libraries for webRTC. They help you set up a secure connection to another person (an "RTCPeerConnection"). The connection is direct, if possible. If not, due to firewalls for instance, you can use an external server. It uses ICE, which means Interactive Connectivity Establishment (see also trickle ICE, which he apparently used): a way to set up the connection.

Once you have a connection, you have an RTCDataChannel. Which you can use, for instance, to send a file from one browser to another.

As a testcase, he wrote Call Roulette. The app is in python, but in the browser javascript is used as that is more-or-less the native way to do it. The "call roulette" app connects a user to a random other user. Users will send simple json requests to the app. Once the app finds two candidates, both get the other's data to set up a subsequent webRTC connection.

He made the toy app in python 3.3 because it is new. It has websockets. And async via asyncio "because async is modern :-)". All nice, new and shiny.

So: users connect from their browser with a websocket connection to the app. They are paired up and the webRTC connection data is sent back. Very fast.
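
The matchmaking core of such an app fits in a few lines of asyncio. A toy sketch (illustrative only, and in today's async/await syntax rather than the yield from style of 2015):

import asyncio

async def matchmaker(waiting):
    # Users arrive on a queue; take them two at a time and hand each
    # side the other's data for the webRTC handshake.
    while True:
        first = await waiting.get()
        second = await waiting.get()
        first["future"].set_result(second["offer"])
        second["future"].set_result(first["offer"])

async def demo():
    waiting = asyncio.Queue()
    asyncio.ensure_future(matchmaker(waiting))
    loop = asyncio.get_event_loop()
    users = [{"offer": "sdp-offer-%d" % i, "future": loop.create_future()}
             for i in range(2)]
    for user in users:
        await waiting.put(user)
    print(await users[0]["future"])  # -> sdp-offer-1

asyncio.get_event_loop().run_until_complete(demo())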

Fun: light-weight django-models-like models via https://pypi.python.org/pypi/jsonmodels/! Look it up.
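
A jsonmodels model looks like this:

from jsonmodels import models, fields

class User(models.Base):
    # Declarative fields with validation, django-models style,
    # but plain JSON underneath instead of a database.
    name = fields.StringField(required=True)
    age = fields.IntField()

user = User(name='alice', age=30)
user.validate()          # raises on missing or invalid fields
print(user.to_struct())  # -> {'name': 'alice', 'age': 30}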

He did a live demo with web video with someone from the audience. Worked basically like a charm.

22 May 2015 9:15am GMT

Reinout van Rees: Pygrunn: IPython and MongoDB as big data scratchpads - Jens de Smit

(One of the summaries of the 2015 Pygrunn conference)

A show of hands: about half the people in the room have used mongodb and half have used ipython notebooks. There's not a lot of overlap.

Jens de Smit works for Optiver, a financial company. A "high-frequency trader", so they use a lot of data and they do a lot of calculations. They do a lot of financial transactions and they need to monitor whether they made the right trades.

Trading is now almost exclusively done electronically. Waving hands and shouting on the trading floor at a stock exchange is mostly a thing of the past. Match-making between supply and demand is done centrally. It started 15 years ago. The volume of transactions really exploded. Interesting fact: the response time has gone from 300ms to just 1ms!

So... being fast is important in electronic trading. If you're slow, you trade at the wrong prices. Trading at the wrong prices means losing money. So speed is important. Just as making the right choices.

What he had to do was figure out how fast an order was made and whether it was a good order. Non-intrusively. So: what market event did we react to? What was the automatic trade decision (made by an algorithm)? Was it a good one? How long did it all take?

So he monitors data going in and out of their system. He couldn't change the base system, so: log files, network data and an accounting database. Most of the data is poorly indexed and has a very low signal-to-noise ratio. And of course the logfiles aren't all consistent. And the documentation is bad.

Oh, and the data size is of course also too big to fit in memory :-)

He used mongodb. A schemaless json (well, bson, binary version of json) store. Great for messy data. Easy to use. Just put in a python dictionary, basically. The data is persisted to disk, but as long as you have enough RAM, it'll keep it in memory. Very fast that way. You get indexes and speedups by default.

After he managed to get everything into mongodb, he had to make sense of things. So: correlate decision logs to network data. This is easy for humans to spot, but hard for computers. Computers are good at exact matches, humans are better at inexact pattern matches.

He used ipython notebook, a nice interactive python shell with a browser interface. Including matplotlib integration for easy graphs. Syntax highlighting; you can render html inside the shell; you can save your work at the end of the day (which you can't with a regular python shell!); inline editing.

Nice: since last week, rendering such notebooks is supported by github. (I guess he means this announcement ).

Now mongodb. It is very simple to create a directory and start mongodb. If you stop mongo and delete the directory, it is gone as if it was never there. Easy. And with pymongo it is just a few lines of python code and you're set. Including a handy query language.

He showed a couple of code examples. Looked pretty handy.

Creating an index is a oneliner. If you know beforehand what kinds of queries you want to do, you can quickly create an index for it, which speeds up your queries a lot. You can make complex indexes, but in his experience, simple single-field indexes are often enough.
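
A minimal session covering all of that might look like this (hypothetical trading-log data; pymongo 3 API):

from pymongo import MongoClient

# Connect to a local mongod; database and collection spring into
# existence on first use.
events = MongoClient().trading.events

# Messy, schemaless records go straight in as dictionaries.
events.insert_one({"order_id": 42, "side": "buy", "latency_ms": 0.8})

# The handy query language...
slow = events.find({"latency_ms": {"$gt": 1.0}})

# ...and creating a single-field index really is a oneliner.
events.create_index("order_id")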

Something to watch out for: mongo never returns disk space to the OS. If you delete lots of objects, the OS doesn't get the space back unless you shut mongodb down and "repair" the database. What he does is simply delete the whole database at the end of the day!

He showed one of the outputs: a graph with response times which immediately showed that several responses were too slow. Good, useful information. One year ago he wouldn't have dreamt of being able to do this sort of analysis.

Mongo is very useful for this kind of work. You use mongodb's strengths and you aren't bothered by many of the drawbacks, like missing transactions.

22 May 2015 8:34am GMT

Reinout van Rees: Pygrunn: Leveraging procedural knowledge - K Rain Leander

(One of the summaries of the 2015 Pygrunn conference)

K Rain Leander works at Red Hat and yes, she wore a bright red hat :-) She's a python and django newbie. She knows how it is to be a newbie: there is so much in linux that there are always areas where you're a complete newbie. So everyone is helpful there.

"Amsterdam is the capital of the netherlands" is declarative knowledge. Procedural knowledge is things like learning to ride a bike or a knew language. So: What versus How. You might know declaratively how to swim, but procedurally you might still drown: you need to practice and try.

Some background: she was a dancer in the USA. Unless you're famous, you barely scrape by financially. So she started teaching herself new languages. Both real-life languages and computer languages. Css, html for starters. And she kept learning.

She got a job at Red Hat. You have to pass the RHCE certification test within 90 days of starting work there - or you're fired. She made it.

She has a military background. In bootcamp, the purpose is not the pushups and the long runs. The goal is to break you down so that you jump when they say "jump".

In the Red Hat bootcamp, the goal is not making the test. The goal is to figure out if you're able to drink from the firehose. Which means if you get a support request, you say "I'll figure it out for you" and you just dive in and try to figure it out. You have to be able to dive into a whole lot of new information without panicking. That's drinking from the firehose.

She re-used existing knowledge and previous skills to learn everything. The important part was not being afraid to dive in.

She moved towards programming. Python, django. She was new to it. One of the first steps? "Set up a virtualenv and....". It can frighten you, but it is just a question of RTFM. Just read the manual. Just read it and then start doing it.

She went to a Django Girls Workshop. (One of the results: http://leanderthalblog.herokuapp.com/). Django girls does a really good job of providing material and documentation. She had some problems installing it, but continued (and succeeded) anyway.

... and then someone challenged her to deploy it on openshift. http://django-leanderthal.rhcloud.com/ It hasn't succeeded completely yet. But she'll persevere and get it working.

She recommends http://learnpythonthehardway.org/ to learn python.

What's next: she'll practice, practice, practice. And she'll contribute to the community. Probably build one or two apps. And she'll be a coach at the upcoming Groningen django girls workshop ("as a coach. No, I'm not worried....")

So: re-use your existing knowledge and build from there. Don't be afraid. Just do it.

22 May 2015 7:45am GMT