18 Dec 2017

Planet Twisted

Moshe Zadka: Write Python like an expert

Ten tricks to level up your Python.

Trick 0 -- KISS

Experts know about the weird dark corners of Python -- but do not use them in production code. The first tip, then, is to remember that while Python has some interesting corners, they are best left out of production code.

Make your code as straightforward as possible.

Trick 1 -- The power of lists

The humble list, or even humbler [], packs a lot of punch -- for those who know how to use it.

It serves, of course, as a useful array type. It is also a good stack, using append() and pop(), with the correct (amortized) performance characteristics. The .sort() method is sophisticated enough that it is one of the few cases where Python actually broke new theoretical ground on a sorting algorithm -- timsort was originally invented for it.
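
For example, a list is a perfectly good stack out of the box:

stack = []
stack.append("first")       # push
stack.append("second")
top = stack.pop()           # pop: "second", in amortized O(1)

# .sort() is timsort: it takes advantage of runs that are already ordered.
names = ["carol", "alice", "bob"]
names.sort()                # ["alice", "bob", "carol"]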

Trick 2 -- The power of dicts

The humble dict, or even humbler {}, also packs a lot of punch.

While many use string keys, it is important to remember that any immutable (hashable) type can serve as a key, including tuples and frozensets. This helps with writing caches, memoizers, or even a passable sparse array.

The keyword argument constructor also gives it a lot of power for making simple and readable APIs.
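
Both points, in a short sketch:

# Tuple keys make a passable sparse array (or a memoizer keyed on arguments).
sparse = {}
sparse[(0, 3)] = 1.5
sparse[(997, 42)] = -2.0
value = sparse.get((5, 5), 0.0)  # default for missing cells

# The keyword-argument constructor reads like a small, explicit API.
connection_options = dict(host="localhost", port=5432, use_ssl=True)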

Trick 3 -- Iterators and generators

The iterator protocol is one of the most powerful aspects of Python. Experts understand it deeply, and know how to use it to make code shorter, more readable, more composable and more debuggable.

One of the easiest ways to accomplish this is to write functions that accept an iterator and return an iterator, remembering that generators are really just syntactic sugar for writing functions that return iterators.

If a code base has a lot of functions that return iterators, the iterator-algebra functions in itertools immediately become much more valuable.
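
A small sketch of that style (the helper names here are made up):

import itertools

def parse_numbers(lines):
    # Accept an iterable of strings, yield integers: a generator.
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)

def running_totals(numbers):
    # Accept an iterator, return an iterator of partial sums.
    return itertools.accumulate(numbers)

assert list(running_totals(parse_numbers(["1", "2", "", "3"]))) == [1, 3, 6]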

Trick 4 -- Collections

The collections module has a lot of wonderful functionality.

For code that needs defaults, defaultdict.

For code that needs counting, Counter.

For FIFOs, deque.
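
A few lines showing each of them:

import collections

# defaultdict: group words by first letter with no key-existence checks.
by_letter = collections.defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    by_letter[word[0]].append(word)

# Counter: count occurrences and ask for the most common.
counts = collections.Counter("abracadabra")
assert counts.most_common(1) == [("a", 5)]

# deque: a FIFO with O(1) appends and pops at both ends.
queue = collections.deque()
queue.append("first")
queue.append("second")
assert queue.popleft() == "first"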

Trick 5 -- attrs

One thing that is not wonderful about the collections module is the namedtuple class.

In almost every way imaginable, the attrs package is better. Also, for things that wouldn't be namedtuples otherwise, attrs is still better.
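
A minimal sketch of an attrs class, using its attr.s / attr.ib form:

import attr

@attr.s
class Point(object):
    x = attr.ib()
    y = attr.ib(default=0)

# attrs writes __init__, __repr__, __eq__ and friends for us.
p = Point(1, 2)
print(p)                  # Point(x=1, y=2)
assert p == Point(1, 2)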

Trick 6 -- First class functions and types

Return functions. Store them in lists or dictionaries. Keep classes in a double-ended queue. These are not "Python does what?" curiosities; they are ways to avoid boilerplate or needless indirection.
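
For example, a dictionary of functions replaces an if/elif dispatch ladder (the operation names here are made up):

def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

# Functions are values: store them and look them up like any other data.
OPERATIONS = {"add": add, "mul": multiply}

def apply_operation(name, a, b):
    return OPERATIONS[name](a, b)

assert apply_operation("mul", 6, 7) == 42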

Trick 7 -- Unit tests and lint

Experts hate having to waste time. Writing unit tests makes sure they have to fix any given bug only once. Correctly configuring a linter makes sure they do not have to comment on every pull request with a list of nitpicks.
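
A minimal illustration with the standard unittest module (normalize is just a stand-in for code under test):

import unittest

def normalize(name):
    # Stand-in for real application code: strip whitespace and lowercase.
    return name.strip().lower()

class NormalizeTest(unittest.TestCase):
    def test_strips_and_lowercases(self):
        self.assertEqual(normalize("  Alice "), "alice")

if __name__ == "__main__":
    unittest.main()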

Trick 8 -- Immutability

Immutable data structures, such as those available from the Pyrsistent library, are useful for avoiding a lot of bugs. "Global mutable state is the root of all evil" -- and if you cannot get rid of things being global (modules, function defaults and other things), it is often possible to make them immutable.

Immutable data structures are much easier to reason about, and make it much harder to introduce bugs that are hard to find and trigger.
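
A small sketch with Pyrsistent, assuming its pmap and pvector helpers:

from pyrsistent import pmap, pvector

defaults = pmap({"retries": 3, "timeout": 30})
# "Updating" returns a new map; the original is untouched, so it is
# safe to share as a module-level value or a function default.
overridden = defaults.set("timeout", 5)

numbers = pvector([1, 2, 3])
more = numbers.append(4)   # numbers is still pvector([1, 2, 3])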

Trick 9 -- Not reinventing the wheel

If something is available as a wheel, don't reinvent it. PyPI has ~125K packages at the time of writing. It is almost certain that it has something that takes care of part of the task you are currently working on.

How to know what's worthwhile?

Follow Planet Python, check Awesome Python and, if it is within reach, try to go to Python meetups or conferences. (If it's not, or even if it is, PyVideo has the videos -- but talking to other Python programmers is extremely useful.)

18 Dec 2017 6:00am GMT

17 Dec 2017

Planet Python

Weekly Python StackOverflow Report: (civ) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2017-12-17 11:01:06 GMT


  1. Triple nested lists Python - [11/9]
  2. Python sort(reverse=True) does not exactly reverse the output? - [11/4]
  3. List of tuples into a binary table? - [9/7]
  4. Comparison operators vs "rich comparison" methods in Python - [9/1]
  5. Is there an efficient way of creating a list of tuples using a range? - [8/5]
  6. Replace all but the last occurrence of a character in a dataframe - [8/2]
  7. When existence of nonlocal variables is checked? - [7/2]
  8. Why does operating on what seems to be a copy of data modify the original data? - [7/2]
  9. Are Pandas' dataframes (Python) closer to R's dataframes or datatables? - [7/1]
  10. Python list keep value only if equal to n predecessors - [6/2]

17 Dec 2017 11:01am GMT

Import Python: #155: Instagram opensourcing MonkeyType, Dict to retain insertion order and more

Worthy Read

GoCD is a continuous delivery tool supporting modern infrastructure with elastic on-demand agents and cloud deployments. With GoCD, you can easily model, orchestrate and visualize complex workflows from end to end. It's open source, free to use and download.
advert

Today we are excited to announce we're open-sourcing MonkeyType, our tool for automatically adding type annotations to your Python 3 code via runtime tracing of types seen.
instagram, opensource

Tweet
core-python

PrettyPrinter is a powerful, syntax-highlighting, and declarative pretty printer for Python 3.6+. It uses a modified Wadler-Leijen layout algorithm, similar to those used in Haskell pretty printer libraries prettyprinter and ansi-wl-pprint, JavaScript's Prettier, Ruby's prettyprinter.rb and IPython's IPython.lib.pretty. It combines the best parts of each and builds more on top to produce the most powerful pretty printer in Python to date.
prettify

Regular Expressions are commonly used in Linux command line tools like sed, awk, grep etc. Most programming languages support them either built-in or through an external library. The main problem with using them is that they are difficult to understand, but they are well worth the effort to learn. Using a regular expression can save you a lot of time.
regular expression

Improve DevOps Testing: 4-part eBook to learn how to detect problems earlier in your DevOps processes.
advert

Kenneth Reitz has contributed many things to the Python community, including projects such as Requests, Pipenv, and Maya. He also started the community written Hitchhiker's Guide to Python, and serves on the board of the Python Software Foundation. This week he talks about his career in the Python community and digs into some of his current work.
kenneth

Enlighten Progress Bar is a console progress bar module for Python. (Yes, another one.) The main advantage of Enlighten is it allows writing to stdout and stderr without any redirection.
command line

Compare two floats with math.isclose() to see if they are nearly equal #python https://t.co/y9QiKtpbNP
tweet

Microsoft is considering adding Python as one of the official Excel scripting languages, according to a topic on Excel's feedback hub opened last month.
excel

humor

If you've never coded before and want to learn how to make websites, we have good news for you: we are holding a one-day workshop for beginners! It will take place on February 25, 2018 at Collective Health in San Francisco.
django-girls

HereIsWally is a Tensorflow project that includes a model for solving Where's Wally puzzles. It uses the Faster RCNN Inception v2 model, initially trained on the COCO dataset and retrained to find Wally using transfer learning with the Tensorflow Object Detection API.
tensorflow


Jobs
Europe
TXODDS are a dynamic player in the Sports data business looking to grow aggressively in the next 18 months. We are currently seeking a Python Backend Developer to join a team responsible for the operation, development and maintenance of our current and future systems.


Projects

Anubis - 347 Stars, 21 Fork
Subdomain enumeration and information gathering tool.

robot-detect - 132 Stars, 26 Fork
Detection script for the ROBOT vulnerability

django-heroku - 122 Stars, 6 Fork
A Django library for Heroku apps.

pytorch-pose-hg-3d - 47 Stars, 3 Fork
PyTorch implementation for 3D human pose estimation.

minipos - 22 Stars, 2 Fork
A self-hosted, 0-confirmation Bitcoin Cash point-of-sale server.

Barcode-generator - 16 Stars, 8 Fork
Desktop app to generate EAN-13, EAN-8 and EAN-5 barcodes (other types are coming soon) automatically and save them as PDF or as PNG, JPEG or GIF image files in several sizes.

retox - 10 Stars, 0 Fork
For running a local continuous testing environment with tox.

LinkedInCommentAnalyzer - 10 Stars, 11 Fork
Extracts LinkedIn comments from any post and exports them to an Excel file.

Python 3 script for managing your Twitter account using Twitter's official APIs.

Example of implementing a machine learning microservice with gRPC and Docker in Python.

17 Dec 2017 7:32am GMT

Kushal Das: Share files securely using OnionShare

Sharing files securely is always an open topic of discussion. Security/privacy and usability often seem to stand on opposite sides, but OnionShare manages to build a bridge between them. It is a tool written by Micah Lee that helps you share files of any size securely and anonymously using Tor.

In the rest of the post I will talk about how you can use this tool in your daily life.

How to install OnionShare?

OnionShare is a Python application and already packaged for most of the Linux distributions. If you are using Windows or Mac OS X, then visit the homepage of the application, and you can find the download links there.

On Fedora, you can just install it using the dnf command.

sudo dnf install onionshare -y

For Ubuntu, use the ppa repository from Micah.

sudo add-apt-repository ppa:micahflee/ppa
sudo apt-get update
sudo apt-get install onionshare

How to use the tool?

When you start the tool, it will first try to connect to the Tor network. After a successful connection, it opens a window where you can select a number of files and then click on the Start sharing button. The tool will take some time to create a random onion URL, which you can then pass to the person who is going to download the files using the Tor Browser.

You can mark the share to stop after the first download (using the settings menu). Because the tool uses Tor, it can punch through standard NATs, which means you can share files directly from your laptop or home desktop, and the other person can still access them using the Tor Browser.

Because of the nature of Tor, the whole connection is end-to-end encrypted. This also makes the sharer and the downloader anonymous, but you have to make sure that you share the download URL in a secure way (for example, using Signal). OnionShare also has a rate limit, so that an attacker cannot make many attempts to guess the full download URL.

17 Dec 2017 5:15am GMT

16 Dec 2017

Django community aggregator: Community blog posts

Building APIs with Django and GraphQL

This tutorial will introduce you to GraphQL with Python, Django and Graphene. We'll create a simple Django project to demonstrate how to build an API server based on GraphQL (instead of REST), then we'll see how to use graphiql_django, an interface for testing GraphQL queries and mutations before building your front-end application, to send GraphQL queries (for getting data) and mutations (for posting and updating data). In this part we'll be dealing with building the backend. In the next tutorials we will see how to use frameworks and libraries such as Angular and React to build a front-end application that consumes and updates our GraphQL server, as well as advanced use cases such as user authentication, permissions and Relay.

Make sure to follow me on twitter (@techiediaries) to be notified once the next tutorial parts are ready.

GraphQL is a modern standard for building Web APIs, invented and used internally by Facebook for its native mobile applications, then later open sourced. GraphQL provides a more powerful and flexible alternative to REST.

Before we dive into GraphQL concepts, let's understand what REST is:

REST stands for Representational State Transfer and it's an architectural pattern for designing client/server distributed systems. Unlike GraphQL, it's not a standard but a set of constraints such as having a uniform interface or statelessness which means each HTTP request has to include all of the information needed by the server to fulfill the request, instead of being dependent on the server to keep track of the previous requests or saving the session state on the server.

These are the principles of REST:

  • Resources: expose easily understood directory structure URIs.
  • Representations: transfer JSON or XML to represent data objects and attributes.
  • Messages: use HTTP methods explicitly (for example, GET, POST, PUT, and DELETE).
  • Stateless: interactions store no client context on the server between requests. State dependencies limit and restrict scalability. The client holds session state. https://spring.io/understanding/REST

If you want to know more, watch this video, by Google Developers, that goes over the basic principles behind REST.

An API (Application Programming Interface) provides a way (or an interface) for clients to fetch data from servers. This establishes a relationship between a client and a server. In the case of REST, the established interface is inflexible and strongly coupled to the server implementation, so if the server's implementation changes, the depending clients will, more often than not, break.

In its essence, GraphQL allows developers to declaratively fetch data from a server. Most importantly, the clients are able to specify exactly what data they need. Also, unlike REST APIs, where you usually have multiple endpoints that provide fixed data shapes or structures, a GraphQL server only needs to expose a single endpoint that provides the requesting clients with exactly the data they are asking for, no less, no more.

GraphQL has been adopted by many big companies other than Facebook, such as GitHub, Twitter and Shopify.

Nowadays, companies have rapidly and frequently changing data requirements. For this reason, companies are investing more money and time rewriting clients that access data using REST APIs. GraphQL provides a better solution for developers to satisfy these uncertain data requirements.

REST vs. GraphQL

GraphQL allows you to query your server for the exact data that you need by sending a single request even for fetching data from related models.

With a REST API, you will usually communicate with multiple endpoints when trying to fetch data from a server. Suppose you have an endpoint for accessing users' data by their id, /users/<id>, an endpoint /users/<id>/posts for accessing a user's posts, and a last one, /users/<id>/followers, for accessing the user's followers.


A client application that needs data from this API must send three HTTP requests to the three available endpoints to fetch all the required data. First, this causes many round-trips to the server (consuming more resources); secondly, extra data will likely be fetched that may never be used (this is called overfetching), since the client has no control over the requested data: in REST, the server dictates the shape of the data that can be sent.


The same data can be fetched from the same API by sending only a single request to the GraphQL endpoint; the request contains a query describing all of the client's data requirements. The server parses the query and sends back the requested data as a JSON object in the exact format requested by the client.

So as a recap: when using REST-based APIs, the client will either get less data than expected (underfetching), in which case it needs to send more requests to retrieve all the required data, or it will get more data than it actually needs (overfetching), which consumes server resources for no reason.

Thanks to GraphQL, the client can describe the exact shape of the requested data with a JSON-like query, and the server takes care of sending the data back in that shape.

Let's take one more simple example to understand how GraphQL works.

Suppose we have a Django web application with two models, Product and Family:


class Product(models.Model):
    ## attributes

class Family(models.Model):
    ## attributes


Each product belongs to a family so they are related with a foreign key.

Now if you need to build a REST API for this web app, you will have to create multiple endpoints, such as /product/:id and /family/:id.

Now, let's suppose you want to get all products of a specified family; you will need to add another endpoint, for example something like /family/:id/products, where :id is the identifier of the family of products.

Let's also suppose that a request to the endpoint /product/1 returns a serialized object, for example:

{
    "id": 1,
    "reference": "PR001",
    "name": "Product 001",
    "description": "Description of Product 001"
}

The problem is: what if you need to build another front-end app, maybe for mobile devices, that needs more data, for example a quantity attribute? In this case, you have to add another endpoint or modify the existing endpoint to include the quantity.

In the case of REST-based APIs, the server API architecture is strongly coupled with the client implementation; as a result, if you need to change the API implementation on the server, you'll most likely end up breaking the existing clients. And if you need to add another client for your API that needs less or more data than some endpoint(s) serve, you'll have to change the server code responsible for serving the API in a way that doesn't break the existing clients. That means, in many cases, keeping the old endpoints and adding new ones.

If you have ever developed an API with Django or any other framework then you certainly experienced one or all of these issues we have talked about. Thanks to Facebook, GraphQL presents the solution for you!

Continuing with the simple example above. In the case of GraphQL, we can send a query that may look like:

query {  
    product(id:1) {
        id,
        reference,
        quantity
    }
}

Which is going to return something like:

{
    "id": 1,
    "reference": "PR001",
    "quantity": 1000
}

We have omitted two attributes (name and description) without causing any problems or changing the underlying server API.

Just query the data you want and the server will be able to send it back to you.

Now if you need to get all products of a specified family with GraphQL, say the family with id equal to 1, you can simply send:

query {  
    family(id:1) {
        id
        products {
            id,    
            reference,
            name,
            description,
            quantity
        }
    }
}

If you send this query to your GraphQL server, you'll get something similar to:


{
    "id":"1",
    "products":[{"id":"1","reference":"PR001","name":"Product1","description":"..."} , ... ]
}    

Even if this example is fairly simple, you can see how powerful this new technology can be for building Web APIs.

Building a GraphQL Django Application

After the introduction to GraphQL vs. REST, let's now learn how to build a simple real-world example web application with Django and Graphene, the Python implementation of GraphQL.

This tutorial assumes you have already set up your development machine to work with Python. You need to have Python, pip (the Python package manager) and optionally virtualenv installed. Python can be easily installed by grabbing the binaries for your operating system from the official website.

pip is already installed if you have Python 2 >= 2.7.9 or Python 3 >= 3.4 binaries downloaded from the official python.org website. Otherwise you can install it using get-pip.py:

python get-pip.py

For virtualenv, you can install virtualenvwrapper using pip:

pip install virtualenvwrapper

Let's start by creating a new, isolated virtual environment for our project dependencies, then install the required packages, including Django.

Head over to your terminal in Linux/Mac or command prompt in Windows then run the following:

virtualenv graphqlenv 
source graphqlenv/bin/activate 

This will create a new virtual environment and activate it.

Next install django and graphene packages with pip:

pip install django 
pip install graphene_django

You can also install graphiql_django which provides a user interface for testing GraphQL queries against your server.

pip install graphiql_django

Next let's create a Django project and add a single application:

django-admin startproject product_inventory_manager .
python manage.py startapp inventory 

Open settings.py then add inventory and graphene_django to the INSTALLED_APPS array:


INSTALLED_APPS = [
        'django.contrib.admin',
        'django.contrib.auth',
        'django.contrib.contenttypes',
        'django.contrib.sessions',
        'django.contrib.messages',
        'django.contrib.staticfiles',
        'graphene_django',
        'inventory'
]

Next migrate the database with:

python manage.py migrate 

Adding Django Models

Open inventory/models.py then add:


    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals

    from django.db import models

    class Product(models.Model):

        sku = models.CharField(max_length=13,help_text="Enter Product Stock Keeping Unit")
        barcode = models.CharField(max_length=13,help_text="Enter Product Barcode (ISBN, UPC ...)")

        title = models.CharField(max_length=200, help_text="Enter Product Title")
        description = models.TextField(help_text="Enter Product Description")

        unitCost = models.FloatField(help_text="Enter Product Unit Cost")
        unit = models.CharField(max_length=10,help_text="Enter Product Unit ")

        quantity = models.FloatField(help_text="Enter Product Quantity")
        minQuantity = models.FloatField(help_text="Enter Product Min Quantity")

        family = models.ForeignKey('Family')
        location = models.ForeignKey('Location')


        def __str__(self):

            return self.title


    class Family(models.Model):

        reference = models.CharField(max_length=13, help_text="Enter Family Reference")
        title = models.CharField(max_length=200, help_text="Enter Family Title")
        description = models.TextField(help_text="Enter Family Description")

        unit = models.CharField(max_length=10,help_text="Enter Family Unit ")

        minQuantity = models.FloatField(help_text="Enter Family Min Quantity")


        def __str__(self):

            return self.title


    class Location(models.Model):


        reference = models.CharField(max_length=20, help_text="Enter Location Reference")
        title = models.CharField(max_length=200, help_text="Enter Location Title")
        description = models.TextField(help_text="Enter Location Description")

        def __str__(self):

            return self.title


    class Transaction(models.Model):

        sku = models.CharField(max_length=13,help_text="Enter Product Stock Keeping Unit")
        barcode = models.CharField(max_length=13,help_text="Enter Product Barcode (ISBN, UPC ...)")

        comment = models.TextField(help_text="Enter Product Stock Keeping Unit")

        unitCost = models.FloatField(help_text="Enter Product Unit Cost")

        quantity = models.FloatField(help_text="Enter Product Quantity")

        product = models.ForeignKey('Product')

        date = models.DateField(null=True, blank=True)

        REASONS = (
            ('ns', 'New Stock'),
            ('ur', 'Usable Return'),
            ('nr', 'Unusable Return'),
        )


        reason = models.CharField(max_length=2, choices=REASONS, blank=True, default='ns', help_text='Reason for transaction')

        def __str__(self):

            return 'Transaction :  %d' % (self.id)


Next create migrations and apply them:

python manage.py makemigrations
python manage.py migrate

Adding the Admin Interface

The next thing is to add the models to the admin interface so we can add some test data:

Open inventory/admin.py and add:


    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals

    from django.contrib import admin

    from .models import Product ,Family ,Location ,Transaction  
    # Register your models here.

    admin.site.register(Product)
    admin.site.register(Family)
    admin.site.register(Location)
    admin.site.register(Transaction)

Next create a login to be able to access the admin app:

python manage.py createsuperuser 

Enter the username and password when prompted and hit enter.

Now run the local development server with:

python manage.py runserver

Navigate to http://127.0.0.1:8000/admin with your browser. Login and add some data for each model.

GraphQL Concepts: the Schema and Type System

GraphQL is a strongly typed query language that can be used to describe the data structures of an API. GraphQL uses the concepts of schemas and types. Types define what's exposed in the API and are grouped in a schema using GraphQL's SDL (Schema Definition Language).

The schema can be considered a contract between the client and the server that states how a client can access the data on the server.

Adding GraphQL Support: the Schema and the Object Types

To be able to execute GraphQL queries against your web application you need to add a Schema, Object Types and a view function that receives the GraphQL queries.

Creating the Schema

Create inventory/schema.py then:

First, create a subclass of DjangoObjectType for each model you want to query with GraphQL:


    import graphene

    from graphene_django.types import DjangoObjectType

    from .models import Family , Location , Product , Transaction 

    class FamilyType(DjangoObjectType):
        class Meta:
            model = Family 

    class LocationType(DjangoObjectType):
        class Meta:
            model = Location 

    class ProductType(DjangoObjectType):
        class Meta:
            model = Product 

    class TransactionType(DjangoObjectType):
        class Meta:
            model = Transaction


Next, create an abstract query, a subclass of AbstractType (it's abstract because it's an app-level query). For each app you have, you create an app-level abstract query and then combine all the abstract queries in a concrete project-level query.

You need to create a graphene.List field for each DjangoObjectType, then write a resolve_xxx() method for each Query member:


    class Query(graphene.AbstractType):
        all_families = graphene.List(FamilyType)
        all_locations = graphene.List(LocationType)
        all_products = graphene.List(ProductType)
        all_transactions = graphene.List(TransactionType)

        def resolve_all_families(self, args, context, info):
            return Family.objects.all()

        def resolve_all_locations(self, args, context, info):
            return Location.objects.all()

        def resolve_all_products(self, args, context, info):
            return Product.objects.all()

        def resolve_all_transactions(self, args, context, info):
            return Transaction.objects.all()

Creating the Project Level Query

Next create a project level Query. Create a project level schema.py file then add:


    import graphene

    import inventory.schema 


    class Query(inventory.schema.Query, graphene.ObjectType):
        # This class extends all abstract apps level Queries and graphene.ObjectType
        pass

    schema = graphene.Schema(query=Query)

So we first create a Query class which extends all the abstract queries as well as graphene.ObjectType, then we create a graphene.Schema object which takes the Query class as a parameter.

Now we need to add a GRAPHENE config object in settings.py:


GRAPHENE = {
        'SCHEMA': 'product_inventory_manager.schema.schema'
} 

Adding the GraphQL View

With GraphQL, you don't need multiple endpoints, only one, so let's create it:

Open urls.py then add:


    from django.conf.urls import url
    from django.contrib import admin

    from graphene_django.views import GraphQLView

    from product_inventory_manager.schema import schema

    urlpatterns = [
        url(r'^admin/', admin.site.urls),
        url(r'^graphql', GraphQLView.as_view(graphiql=True)),
    ]

We previously installed a GraphQL package that adds a user interface for testing GraphQL queries; to enable it, you just set the graphiql parameter to True.

Serving the App and Testing GraphQL

Now you are ready to test the GraphQL API, so start by serving your Django app with:

python manage.py runserver 

Then navigate to localhost:8000/graphql with your browser and run some queries.

Fetching Data with Queries

In traditional REST APIs you fetch data by sending HTTP GET requests to pre-determined endpoints, where each endpoint returns a pre-defined, rigid structure. The only way you have to express your client's data requirements is through the URLs of the available endpoints and their associated parameters, so the client doesn't have much flexibility for defining its data requirements.

For GraphQL, things are very different. The client only needs to communicate with a single endpoint, which can return all requested information with flexible data structures. But since there is only one endpoint, the server needs more information to properly figure out the data requirements of the client. This is the role of the query: a simple, JSON-like document that defines those requirements.

Example Queries

Let's take a simple example of a query that can be sent to our GraphQL server:


query {
    allProducts {
        id,
        sku
    }
}   

The allProducts field in the previous query is the root field. What follows the root field (i.e. id and sku) is the payload of the query.

This previous query returns an array of all products which are currently stored in the database. Here's an example response:

{
    "data": {
        "allProducts": [
        {
            "id": "1",
            "sku": "Product001"
        }
        /*...*/
        ]
    }
}   

You can see that each returned product has two fields (the only fields specified in the query), an id and a sku, even if a product has ten fields.

If the client needs more than that, all it has to do is add the field to the query.

You can experiment with the other models and you can also add fields.

Now, a question you should ask: how do you get the names of the queries?

It's simple: just take the name of the field you created in the abstract query and transform it to camel case.

For example:


all_families = graphene.List(FamilyType) => allFamilies

all_locations = graphene.List(LocationType) => allLocations 

all_products = graphene.List(ProductType) => allProducts 

all_transactions = graphene.List(TransactionType) => allTransactions 

Then for each query specify the model fields you want to retrieve.

How to Query the Relationships or Nested Data?

You can also query the relationships or nested data. So let's suppose that you need to get all families with their products. You can simply make this query:


query {
    allFamilies {
        id,
        reference, 
        productSet {
            id,
            sku 
        }
    }
}

Note that nested lists coming from reverse relations are exposed as <model>Set (productSet in this example).

An example response would look like the following:


    {
    "data": {
        "allFamilies": [
        {
            "id": "1",
            "reference": "FM001",
            "productSet": [
            {
                "id": "1",
                "sku": "Product001"
            }
            ]
        },
        {
            "id": "2",
            "reference": "FM001",
            "productSet": []
        }
        ]
    }
    }

Now, what if you need the parent family and the location of each product? That's also easy to do with GraphQL:

query {
        allProducts {
            id,
            sku, 
            family {
                id
            }
            location {
                id
            }

        }
    }

Querying Single Items Using Query Arguments

We have seen how to query all items, but what if you need just one item by id? Go back to the abstract query in your app's schema.py file, then update it to be able to query for a single product:


product = graphene.Field(ProductType,id=graphene.Int())


Then add a resolve_product() method:


def resolve_product(self, args, context, info):
        id = args.get('id')

        if id is not None:
            return Product.objects.get(pk=id)

        return None

In GraphQL, fields may have arguments that can be specified between parentheses in the schema (just like a function declaration). For example, the product field can have an id argument to return the product by its id. Here's what an example query may look like:


query {
    product(id: 1) {
        sku,
        barcode
    }
}

In the same way, you can add support for getting single families, locations and transactions.
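
For instance, querying a single family by id follows the same pattern (a sketch using the article's resolver signature; the field and method names here are illustrative additions to the same Query class):

family = graphene.Field(FamilyType, id=graphene.Int())

def resolve_family(self, args, context, info):
    # Look up a single Family by primary key, or return None if no id given.
    id = args.get('id')

    if id is not None:
        return Family.objects.get(pk=id)

    return None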

Writing Data with Mutations

A Mutation is a special ObjectType that can be used to create or modify data on the GraphQL server.

import graphene

class CreateProduct(graphene.Mutation):
    class Arguments:
        sku = graphene.String()
        barcode = graphene.String()

    result = graphene.Boolean()
    product = graphene.Field(ProductType)

    def mutate(self, info, sku, barcode):
        # A real mutation would also fill in the remaining required fields
        # and call product.save() to persist it.
        product = Product(sku=sku, barcode=barcode)
        result = True
        return CreateProduct(product=product, result=result)

product and result are the output fields of the Mutation when it's resolved.

The Arguments class declares the inputs that the CreateProduct mutation needs for resolving; in this case sku, barcode and so on.

mutate() is the function that will be invoked once the mutation is called.
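
The article stops at defining the mutation; to actually expose it, graphene also expects a root Mutation type to be passed to the schema. A sketch under that assumption, replacing the earlier schema = graphene.Schema(query=Query) line in the project-level schema.py (and assuming CreateProduct can be imported there):

class Mutation(graphene.ObjectType):
    # Expose CreateProduct; clients will call it as createProduct.
    create_product = CreateProduct.Field()

schema = graphene.Schema(query=Query, mutation=Mutation)

A client could then send something like mutation { createProduct(sku: "PR002", barcode: "0123456789012") { result } } to the same /graphql endpoint.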

Conclusion

GraphQL is a very powerful technology for building Web APIs, and thanks to Django Graphene you can easily add support for GraphQL to your Django project.

You can find the code in this GitHub repository.

Thanks for reading!

16 Dec 2017 7:05pm GMT

15 Dec 2017

feedDjango community aggregator: Community blog posts

Building APIs with Django, GraphQL and Graphene

This tutorial will introduce you to GraphQL with Python, Django and Graphene. We'll see how to create a simple Django project to demonstrate how to build an API server based on GraphQL (instead of REST) then we'll see how to use graphiql_django, an interface for testing GraphQL queries and mutations before building your front-end application, to send GraphQL Queries (for getting data) and Mutations (for posting and updating data). In this part we'll be dealing with building the backend. In the next tutorials we will see how to use frameworks and libraries such as Angular and React to build a front-end application that consumes and updates our GraphQL server and advanced use cases such as user authentication, permissions and Relay

Make sure to follow me on twitter (@techiediaries) to be notified once the next tutorial parts are ready.

GraphQL is a modern API standard for building Web APIs, invented and used internally by Facebook for its native mobile applications then later open sourced. GraphQL provides a better, powerful and flexible alternative to REST.

Before we dive into GraphQL concepts, let's understand what's REST:

REST stands for Representational State Transfer and it's an architectural pattern for designing client/server distributed systems. Unlike GraphQL, it's not a standard but a set of constraints such as having a uniform interface or statelessness which means each HTTP request has to include all of the information needed by the server to fulfill the request, instead of being dependent on the server to keep track of the previous requests or saving the session state on the server.

These are the principles of REST:

  • Resources: expose easily understood directory structure URIs.
  • Representations: transfer JSON or XML to represent data objects and attributes.
  • Messages: use HTTP methods explicitly (for example, GET, POST, PUT, and DELETE).
  • Stateless: interactions store no client context on the server between requests. State dependencies limit and restrict scalability. The client holds session state. https://spring.io/understanding/REST

If you want to know more, watch this video, by Google Developers, that goes over the basic principles behind REST.

An API (Application Programming Interface) provides a way (or an interface) for clients to fetch data from servers. This establishes a relationship between a client and a server. In the case of REST, the established interface is inflexible and strongly coupled to the server implementation so if the server's implementation changes the depending clients, most often than not, will break.

In its essence, GraphQL allows developers to declaratively fetch data from a server. Most importantly, the clients are able to specify exactly what data they need. Also, unlike REST APIs where you usually have multiple endpoints which provide fixed data shapes or structures, a GraphQL server needs to only expose a single endpoint that provides, the requesting clients, with exactly the data they are asking for, no less no more.

GraphQL is adopted by many big companies, other than Facebook, such as GitHub, Twitter and Shopify etc.

Nowadys, companies have rapidely and frequently changing data requirements. For this reason, companies are investing more money and time rewriting clients that access data using REST APIs. As such, GraphQL provides the better solution for developers to satisfy the needs for the uncertain data requirements.

REST vs. GraphQL

GraphQL allows you to query your server for the exact data that you need by sending a single request even for fetching data from related models.

With a REST API, you will usually communicate with multiple endpoints when trying to fetch data from a server so suppose you have this endpoint for accessing users data by their id /users/<id>, this endpoint /users/<id>/posts for accessing a user's posts and the last one for accessing the user's followers /users/<id>/followers

Source

A client application, that needs data from this API, must send three HTTP requests to the available three endpoints to fetch all required data. So first this will cause many round-trips to the server (consuming more resources) and secondly there will likely be extra data that will be fetched (this is called overfetching), which may not necessarily be used, as the client has no control over the requested data, that's because, in REST, the server dictates the shape of the data that can be sent.

Source

The same data can be fetched from the same API by sending only a single request to the GraphQL endpoint, the request contains a query of all data requirements from the client. The server parses the query and sends back the request data as a JSON object in the exact same format requested by the client.

So as a recap, when using the Rest-based APIs the client will either get less than expected data (underfetching). In this case it needs to send more requests to retrieve all required data, or it will get more data (overfetching)than what it actually needs, which consumes the server resources for no reasons.

Thanks to GraphQL, the client can exactly describe the shape of the requested data with a JSON object and the server takes care of sending the data in the requested shape.

Let's take one more simple example to understand how GraphQL works.

Suppose we have a Django web application with two models: Products and Families:


class Product(models.Model):
    ## attributes

class Family(models.Model):
    ## attributes


Each product belongs to a family so they are related with a foreign key.

Now if you need to build a Rest API for this web app, you will have to create multiple endpoints, such as:

Now, let's suppose you want to get all products of a specified family, you will need to add another endpoint. For example something like: /family/:id/products where :id is the identifier of the family of products.

Let's also suppose that a request to the endpoint /product/1 returns a serialized object, for example:

{
    id : 1 ,
    reference : 'PR001' , 
    name : 'Product 001' ,
    description : 'Description of Product 001'
}

The problem is: What if you need to build another front end app, maybe for mobile devices, that needs more data. For example a quantity attribute. In this case, you have to add another endpoint or modify the existing endpoint to include the quantity.

In the case of Rest-based APIs, the server API architecture is strongly coupled with the client implementation, as a result if you need to change the API implementation on the server, you'll definitely end up breaking the existing clients. And if you need to add another client for your API, which needs less or more data, that's served by some endpoint(s), you'll have to change the server code responsible for serving the API in a way that doesn't break the existing clients. That means, in many cases, conserving the old endpoints and adding new endpoints.

If you have ever developed an API with Django or any other framework then you certainly experienced one or all of these issues we have talked about. Thanks to Facebook, GraphQL presents the solution for you!

Continuing with the simple example above. In the case of GraphQL, we can send a query that may look like:

query {  
    product(id:1) {
        id,
        reference,
        quantity
    }
}

Which is going to return something like:

{
    id : 1,
    reference : 'PR001', 
    quantity : 1000
}

We have neglected two attributes without causing any problems or changing the underlying server API.

Just query the data you want and the server will be able to send it back to you.

Now if you need to get all products of a specified family with GraphQL, say for example for the family with the id equals to 1. You can simply send:

query {  
    family(id:1) {
        id
        products {
            id,    
            reference,
            name,
            description,
            quantity
        }
    }
}

If you send this query to your GraphQL server, you'll get something similar to:


{
    "id":"1",
    "products":[{"id":"1","reference":"PR001","name":"Product1","description":"..."} , ... ]
}    

Even if this example is fairly simple, you can see how powerful this new technology can be, for building Web APIs.

Building a GraphQL Django Application

After the introduction to GraphQL vs. REST. Let's now learn how to build a simple real world example web application with Django and Graphene: the Python implementation for GraphQL.

This tutorial assumes you have already setup your development machine to work with Python. You need to have Python, PIP (Python package manager) and optionally virtualenv installed. Python can be easily installed by grabbing the binaries for your operating system from the official website.

pip is installed Python if have Python 2 >=2.7.9 or Python 3 >=3.4 binaries downloaded from the official python.org website. Otherwise you can install it using get-pip.py.

python get-pip.py

For virtualenv you can use virtualenvwrapper using pip:

pip install virtualenvwrapper

Let's start by creating a new virtual and isolated environment for our project dependencies and then install the required packages including django.

Head over to your terminal in Linux/Mac or command prompt in Windows then run the following:

virtualenv graphqlenv 
source graphqlenv/bin/activate 

This will create a new virtual environment and activate it.

Next install django and graphene packages with pip:

pip install django 
pip install graphene_django

You can also install graphiql_django which provides a user interface for testing GraphQL queries against your server.

pip install graphiql_django

Next let's create a Django project and add a single application:

python django-admin.py startproject inventory . 
cd inventory
python manage.py startapp inventory 

Open settings.py then add inventory and graphene_django to the INSTALLED_APPS array:


INSTALLED_APPS = [
        'django.contrib.admin',
        'django.contrib.auth',
        'django.contrib.contenttypes',
        'django.contrib.sessions',
        'django.contrib.messages',
        'django.contrib.staticfiles',
        'graphene_django',
        'inventory'
]

Next migrate the database with:

python manage.py migrate 

Adding Django Models

Open inventory/models.py then add:


    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals

    from django.db import models

    class Product(models.Model):

        sku = models.CharField(max_length=13,help_text="Enter Product Stock Keeping Unit")
        barcode = models.CharField(max_length=13,help_text="Enter Product Barcode (ISBN, UPC ...)")

        title = models.CharField(max_length=200, help_text="Enter Product Title")
        description = models.TextField(help_text="Enter Product Description")

        unitCost = models.FloatField(help_text="Enter Product Unit Cost")
        unit = models.CharField(max_length=10,help_text="Enter Product Unit ")

        quantity = models.FloatField(help_text="Enter Product Quantity")
        minQuantity = models.FloatField(help_text="Enter Product Min Quantity")

        family = models.ForeignKey('Family')
        location = models.ForeignKey('Location')


        def __str__(self):

            return self.title


    class Family(models.Model):

        reference = models.CharField(max_length=13, help_text="Enter Family Reference")
        title = models.CharField(max_length=200, help_text="Enter Family Title")
        description = models.TextField(help_text="Enter Family Description")

        unit = models.CharField(max_length=10,help_text="Enter Family Unit ")

        minQuantity = models.FloatField(help_text="Enter Family Min Quantity")


        def __str__(self):

            return self.title


    class Location(models.Model):


        reference = models.CharField(max_length=20, help_text="Enter Location Reference")
        title = models.CharField(max_length=200, help_text="Enter Location Title")
        description = models.TextField(help_text="Enter Location Description")

        def __str__(self):

            return self.title


    class Transaction(models.Model):

        sku = models.CharField(max_length=13,help_text="Enter Product Stock Keeping Unit")
        barcode = models.CharField(max_length=13,help_text="Enter Product Barcode (ISBN, UPC ...)")

        comment = models.TextField(help_text="Enter Product Stock Keeping Unit")

        unitCost = models.FloatField(help_text="Enter Product Unit Cost")

        quantity = models.FloatField(help_text="Enter Product Quantity")

        product = models.ForeignKey('Product')

        date = models.DateField(null=True, blank=True)

        REASONS = (
            ('ns', 'New Stock'),
            ('ur', 'Usable Return'),
            ('nr', 'Unusable Return'),
        )


        reason = models.CharField(max_length=2, choices=REASONS, blank=True, default='ns', help_text='Reason for transaction')

        def __str__(self):

            return 'Transaction :  %d' % (self.id)


Next create migrations and apply them:

python manage.py makemigrations
python manage.py migrate

Adding the Admin Interface

The next thing is to add the models to the admin interface so we can add some test data:

Open inventory/admin.py and add:


    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals

    from django.contrib import admin

    from .models import Product ,Family ,Location ,Transaction  
    # Register your models here.

    admin.site.register(Product)
    admin.site.register(Family)
    admin.site.register(Location)
    admin.site.register(Transaction)

Next create a login to be able to access the admin app

python manage.py createsuperuser 

Enter the username and password when prompted and hit enter.

Now run the local development server with:

python manage.py runserver

Navigate to http://127.0.0.1:8000/admin with your browser. Login and add some data for each model.

GraphQL Concepts: the Schema and Type System

GraphQL is a strongly typed query language which can be used to describe the data structures of an API. GraphQL uses the concepts of schemas and types. Types define what's exposed in the API and grouped in a schema using the GraphQL's SDL language or the Schema Definition Language.

The schema can be considered as a contract between the client and the server which states how a client can access the data in the server.

Adding GraphQL Support: the Schema and the Object Types

To be able to execute GraphQL queries against your web application you need to add a Schema, Object Types and a view function that receives the GraphQL queries.

Creating the Schema

Create inventory/schema.py then:

First, create a subclass of DjangoObjectType for each model you want to query with GraphQL:


    import graphene

    from graphene_django.types import DjangoObjectType

    from .models import Family , Location , Product , Transaction 

    class FamilyType(DjangoObjectType):
        class Meta:
            model = Family 

    class LocationType(DjangoObjectType):
        class Meta:
            model = Location 

    class ProductType(DjangoObjectType):
        class Meta:
            model = Product 

    class TransactionType(DjangoObjectType):
        class Meta:
            model = Transaction


Next, create an abstract query, a subclass of graphene.AbstractType (it's abstract because it's an app-level query). For each app you have, you create an app-level abstract query, and then you combine all the abstract queries into a concrete project-level query.

You declare a graphene.List field for each DjangoObjectType, then write a resolve_xxx() method for each Query member:


    class Query(graphene.AbstractType):
        all_families = graphene.List(FamilyType)
        all_locations = graphene.List(LocationType)
        all_products = graphene.List(ProductType)
        all_transactions = graphene.List(TransactionType)

        def resolve_all_families(self, args, context, info):
            return Family.objects.all()

        def resolve_all_locations(self, args, context, info):
            return Location.objects.all()

        def resolve_all_products(self, args, context, info):
            return Product.objects.all()

        def resolve_all_transactions(self, args, context, info):
            return Transaction.objects.all()

Creating the Project Level Query

Next create a project level Query. Create a project level schema.py file then add:


    import graphene

    import inventory.schema 


    class Query(inventory.schema.Query, graphene.ObjectType):
        # This class extends all abstract apps level Queries and graphene.ObjectType
        pass

    schema = graphene.Schema(query=Query)

So we first create a Query class which extends all the abstract queries as well as ObjectType, and then we create a graphene.Schema object which takes the Query class as a parameter.

Now we need to add a GRAPHENE config object in settings.py:


GRAPHENE = {
        'SCHEMA': 'product_inventory_manager.schema.schema'
} 

Adding the GraphQL View

With GraphQL, you don't need multiple endpoints, only one, so let's create it:

Open urls.py then add:


    from django.conf.urls import url
    from django.contrib import admin

    from graphene_django.views import GraphQLView

    from product_inventory_manager.schema import schema

    urlpatterns = [
        url(r'^admin/', admin.site.urls),
        url(r'^graphql', GraphQLView.as_view(graphiql=True)),
    ]

We previously installed a package that provides a user interface for testing GraphQL queries; to enable it, just set the graphiql parameter to True.

Serving the App and Testing GraphQL

Now you are ready to test the GraphQL API, so start by serving your Django app with:

python manage.py runserver 

Then navigate to localhost:8000/graphql with your browser and run some queries.

Fetching Data with Queries

In traditional REST APIs you fetch data by sending HTTP GET requests to predetermined endpoints, where each endpoint returns a fixed, rigid structure. The only way to express the client's data requirements is through the URLs of the available endpoints and their associated parameters, so the client doesn't have much flexibility in defining the data it needs.

With GraphQL, things are very different. The client communicates with a single endpoint which can return all the requested information in flexible data structures. But since there is only one endpoint, the server needs more information to figure out the client's data requirements. That is the role of the query: a plain text document describing those requirements, usually sent to the server inside a JSON (JavaScript Object Notation) payload.

Example Queries

Let's take a simple example of a query that can be sent to our GraphQL server:


query {
    allProducts {
        id,
        sku
    }
}   

The allProducts field in the previous query is the root field. What follows the root field (i.e. id and sku) is the payload of the query.

This previous query returns an array of all products which are currently stored in the database. Here's an example response:

{
    "data": {
        "allProducts": [
        {
            "id": "1",
            "sku": "Product001"
        }
        /*...*/
        ]
    }
}   

You can see that each returned product has only the two fields specified in the query, an id and a sku, even if the product model has many more fields.

If the client needs more than that, all it has to do is add the extra fields to the query.

You can experiment with the other models and you can also add fields.

Now, one question you should ask: how do you get the names of the queries?

It's simple: take the name of the field you created in the abstract query and convert it to camel case.

For example:


all_families = graphene.List(FamilyType) => allFamilies

all_locations = graphene.List(LocationType) => allLocations 

all_products = graphene.List(ProductType) => allProducts 

all_transactions = graphene.List(TransactionType) => allTransactions 

Then for each query specify the model fields you want to retrieve.

How to Query the Relationships or Nested Data?

You can also query the relationships or nested data. So let's suppose that you need to get all families with their products. You can simply make this query:


query {
    allFamilies {
        id,
        reference, 
        productSet {
            id,
            sku 
        }
    }
}

Note that for nested lists you use the reverse relation name: Django calls it <model>_set by default, which graphene exposes in camel case as <model>Set (productSet here).

An example response would look like the following:


    {
    "data": {
        "allFamilies": [
        {
            "id": "1",
            "reference": "FM001",
            "productSet": [
            {
                "id": "1",
                "sku": "Product001"
            }
            ]
        },
        {
            "id": "2",
            "reference": "FM001",
            "productSet": []
        }
        ]
    }
    }

Now, what if you need the parent family and the location of each product? That's also easy to do with GraphQL:

query {
        allProducts {
            id,
            sku, 
            family {
                id
            }
            location {
                id
            }

        }
    }

Querying Single Items Using Query Arguments

We have seen how to query all items, but what if you need just one item by id? Go back to the abstract query in your app's schema.py file and update it so you can query for a single product:


product = graphene.Field(ProductType, id=graphene.Int())


Then add a resolve_product() method:


def resolve_product(self, args, context, info):
        id = args.get('id')

        if id is not None:
            return Product.objects.get(pk=id)

        return None

In GraphQL, fields may have arguments, which are specified between parentheses in the schema (just like a function declaration). For example, the product field takes an id argument so it can return the product with that id. Here's what an example query may look like:


query {
    product(id: 1) {
        sku,
        barcode
    }
}

In the same way, you can add support for getting single families, locations and transactions.
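
For example, a single-family lookup follows exactly the same pattern. The field and resolver below are our own illustration (not part of the original tutorial), simply mirroring the product example:

family = graphene.Field(FamilyType, id=graphene.Int())

def resolve_family(self, args, context, info):
    id = args.get('id')

    if id is not None:
        return Family.objects.get(pk=id)

    return None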

Writing Data with Mutations

A Mutation is a special ObjectType that can be used to create objects in the GraphQL server.

import graphene

class CreateProduct(graphene.Mutation):
    class Arguments:
        sku = graphene.String()
        barcode = graphene.String()

    result = graphene.Boolean()
    product = graphene.Field(lambda: ProductType)

    def mutate(self, info, sku, barcode):
        product = Product(sku=sku, barcode=barcode)
        product.save()
        result = True
        return CreateProduct(product=product, result=result)

product and result are the output fields of the Mutation when it's resolved.

The Arguments class declares the inputs that the CreateProduct mutation needs in order to resolve; in this case sku and barcode are the arguments for the mutation.

mutate() is the function that will be invoked once the mutation is called.
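
The mutation also has to be exposed on the schema before it can be called. The tutorial doesn't show that step, but a minimal sketch (assuming CreateProduct lives next to, or is imported into, the project-level schema.py) could look like this:

class Mutation(graphene.ObjectType):
    create_product = CreateProduct.Field()

schema = graphene.Schema(query=Query, mutation=Mutation)

After that, sending a document such as mutation { createProduct(sku: "Product002", barcode: "123456789") { result } } to the same /graphql endpoint will invoke mutate().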

Conclusion

GraphQL is a very powerful technology for building Web APIs and thanks to Django Graphene you can easily add the support for GraphQL to your django project.

You can find the code in this GitHub repository

15 Dec 2017 6:00am GMT

14 Dec 2017

feedPlanet Twisted

Moshe Zadka: Interesting text encodings (and the people who love them)

(Thanks to Tom Prince and Nelson Elhage for suggestions for improvement.)

Nowadays, almost all text will be encoded in UTF-8 -- for good reasons, it is a well thought out encoding. Some of it will be in Latin 1, AKA ISO-8859-1, which is popular in the western world. Less of it will be in other members of the ISO-8859 family (-2 or higher). Some text from Japan will occasionally still be in Shift-JIS. These encodings are all reasonable -- too reasonable.

What about more interesting encodings?

EBCDIC

Encodings turn a sequence of logical code points into a sequence of bytes. Bytes, in turn, are just sequences of ones and zeroes. Usually, we think of the ones and zeroes as mostly symmetric -- it wouldn't matter if the encoding was to the "dual" byte, where every bit was flipped. SSD drives do not like long sequences of zeroes -- but neither do they like long sequences of ones.

What if there was no symmetry? What if every "one" weakened your byte?

This is the history of one of the most venerable media to carry digital information -- predating the computer by its use in automated weaving machines -- the punched card. It was called so because to make a "one", you would punch a hole -- that was detected by the card reader by an electric circuit being completed. Punching too many holes made cards weak: likely to rip in the wear and tear the automated reading machines inflicted upon them, in the drive to read cards ever faster.

EBCDIC (Extended Binary Coded Decimal Interchange Code) was the solution. "Extended" because it extends the Binary Coded Decimal standard -- numbers are encoded using one punch, which makes them easy to read with a hex editor. Letters are encoded with two. Nothing sorts correctly, of course, but that was not a big priority. Quoting from Wikipedia:

"The distinct encoding of 's' and 'S' (using position 2 instead of 1) was maintained from punched cards where it was desirable not to have hole punches too close to each other to ensure the integrity of the physical card.

Of course, it wouldn't be IBM if there weren't a whole host of encodings, subtly incompatible, all called EBCDIC. If you live in the US, you are supposed to use code page 1140 for your EBCDIC needs.

Luckily, if you ever need to connect your Python interpreter to a card-punch machine, the Unicode encodings have got you covered:

>>> "hello".encode('cp1140')
b'\x88\x85\x93\x93\x96'

If you came to this post to learn skills that are immediately relevant to your day-to-day job and not at all obsolete, you're welcome.

KOI-8

Suppose you're a Russian speaker. You write your language using the Cyrillic alphabet, suspiciously absent from the American Standard Code for Information Interchange (ASCII), developed during the height of the cold war between the US of A and the USSR. Some computers are going to have Cyrillic fonts installed -- and some are not. Suppose that it is the 80s, and the only languages that run fast enough on most computers are assembly and C. You want to make a character encoding that

  • Will look fine if someone has the Cyrillic fonts installed
  • Can be converted to ASCII that will look kinda-sorta like the Cyrillic, with a program that is trivial to write in C.

KOI-8 is the result of this not-quite-thought experiment.

The code to convert from KOI-8 to kinda-sorta-look-a-like ASCII, written in Python, would be:

MASK = (1 << 7) - 1
with open('input', 'rb') as fin, open('output', 'wb') as fout:
    while True:
        c = fin.read(1)
        if not c:
            break
        c = bytes([c[0] & MASK]) # <--- this right here
        fout.write(c)

The MASK constant, written in binary, is just 0b1111111 (seven ones). The line with the arrow masks out the "high bit" in the input character.

Sorting KOI-8 by byte value gives you a sort that is not even a little bit right for the alphabet: the letters are all jumbled up. But it does mean that trivial programs in C or assembly -- or sometimes even things that would try to read words out of old MS Word files -- could convert it to something that looks semi-readable on a display that is only configured to display ASCII characters, possibly due to a deep hardware limitation.
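
Here is a quick illustration of that trick (ours, not from the original post), using Python's built-in koi8_r codec and the same seven-bit mask:

>>> data = 'привет'.encode('koi8_r')   # Russian for "hi"
>>> bytes(b & 0b1111111 for b in data)
b'PRIWET'

The result is not Russian, but a Russian speaker squinting at an ASCII-only terminal could still make out the word.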

Punycode

How lovely it is, of course, to live in 2017 -- the future. We might not have flying cars. We might not even be wearing silver clothing. But by jolly, at least our modern encodings make sense.

We send e-mails in UTF-8 to each other, containing wonderful emoji like "eggplant" or "syringe".

Of course, e-mail is old technology -- we send our eggplants, syringes and avocados via end-to-end encrypted Signal chat messages, unreadable by any but our intended recipient.

It is also easy to register our own site, and use an off-the-shelf SaaS offering, such as Wordpress or SquareSpace, to power it. And no matter what we want to put as our domain, we can...as long as it is ASCII-compatible, because DNS is also older than the end of the cold war, and assumes English only.

Seems like this isn't the future after all, which the suspicious lack of flying cars and silver clothing should really have alerted us to.

In our current times, which will be a future generation's benighted past, we must use yet another encoding to put our avocados and eggplants in the names of websites, where they rightly belong.

Enter Punycode, an encoding that is not afraid to ask the hard questions, like "are you sure the order of encoded bits in the input and the output has to be the same?"

That is, if one string is a prefix of another, should its encoding be a prefix of the other? Just because UTF-8, EBCDIC, KOI-8 or Shift-JIS adhere to this rule doesn't mean we can't think outside the box!

Punycode rearranges the encoding so that all ASCII compatible characters go to the beginning of the string, followed by a hyphen, followed by a complicated algorithm designed to minimize the number of output bytes by assuming the encoded non-ASCII characters are close together.

Consider a simple declaration of love: "I<Red heart emoji>U".

>>> source = b'I\xe2\x9d\xa4U'
>>> declaration = source.decode('utf-8')
>>> declaration.encode('punycode')
b'IU-ony'

Note how, like a well-worn pickup line, I and U were put together, while the part that encodes the heart is at the end.

Consider the slightly more selfish declaration of self-love:

>>> source = b'I\xe2\x9d\xa4me'
>>> source.decode('utf-8').encode('punycode')
b'Ime-4r6a'

Note that even though the selfish declaration and the true love declaration both share a two-character prefix, the result only shares one byte of prefix: the heart got moved to the end -- and not the same heart. Truly, every love is unique.

Punycode's romance with DNS, too, was fraught with drama: indeed, many browsers now will not display Unicode in the address bar, instead showing "xn--<punycode ASCII>" (the "xn--" at the beginning indicates this is a punycoded string) as a security measure against phishing: it turns out there are a lot of characters in Unicode that look a lot like "a", leading to many interesting variants on "Paypal.com" and "Gmail.com", which look indistinguishable to most humans -- and, as it turns out, most users of the web are indeed of the homo sapiens species.

14 Dec 2017 4:00am GMT

13 Dec 2017

feedDjango community aggregator: Community blog posts

Django Quiz 2017

Man feeding pony

Yesterday evening I gave a quiz at the London Django Meetup Group for the second year running. Here it is so you can do it at home (no cheating!). Answers are at the bottom.

Part 1: Trivia

1. What species is Django's unofficial spirit animal?

  1. Pegasus
  2. Unicorn
  3. Pony
  4. Seal
  5. Dolphin
  6. Elephant

2. Djangocon EU this year was in…

  1. Bologna
  2. Genoa
  3. Venice
  4. Florence

3. What does LTS stand for?

  1. Long Tail Support
  2. Long Term Support
  3. Life Time Support
  4. Life Term Support

4. What does WSGI stand for?

  1. Web Socket Gateway Interface
  2. Web Server Gateway Interface
  3. Web Server Gated Interface
  4. WebS GuardIan

5. What does ACID stand for?

  1. Atomicity Consistency Integrity Durability
  2. Atomicity Concurrency Isolation Durability
  3. Atomicity Consistency Isolation Durability
  4. All Carefully Inserted Data

6. When was the first commit on Django?

One point for year, one for month, one for day

7. When was the first commit in Python?

One point for year, one for month, one for day

8. What is the name of the current Django fellow?

One point for first name, one for last

Part 2: Coding with Django

1. What's the import for the new Django 2.0 URL syntax?

  1. from django.paths import url
  2. from django.urls import path
  3. from django.urls import url
  4. from django.urls import fantastic_new_url

2. When you run tests…

  1. settings.DEBUG is forced to True
  2. settings.DEBUG is forced to False
  3. They fail if settings.DEBUG is not True
  4. They fail if settings.DEBUG is not False

3. The email addresses in settings.ADMINS

  1. will be notified of 404 errors and exceptions
  2. will be notified of exceptions
  3. will be the only ones allowed to use the Admin
  4. will be notified of bots crawling the sites

4. Django 1.11 was the first version with a non-optional dependency - what was it on?

Give the PyPI package name.

5. What's the minimum supported version of Python for Django 2.0?

  1. 2.7
  2. 2.8
  3. 2.999999999999999
  4. 3.3
  5. 3.4
  6. 3.5
  7. 3.6

ANSWERS

But first, some vertical space.

A

N

S

W

E

R

S

B

E

L

O

W

Part 1: Trivia

1. What species is Django's unofficial spirit animal?

a) Pegasus

Although called the Django Pony, it's a winged horse, aka pegasus.

2. Djangocon EU this year was in…

d) Florence

See 2017.djangocon.eu. The youtube channel has some talks worth watching.

3. What does LTS stand for?

b) Long Term Support

Though some would like it to last a life time :)

4. What does WSGI stand for?

b) Web Server Gateway Interface

5. What does ACID stand for?

c) Atomicity Consistency Isolation Durability

6. When was the first commit on Django?

2005-07-13

Jacob Kaplan-Moss in SVN (now imported into Git): "Created basic repository structure". See the commit on GitHub.

7. When was the first commit in Python?

1990-08-09

"Initial revision" by Guido van Rossum - see GitHub.

8. What is the name of the current Django fellow?

Tim Graham

Part 2: Coding with Django

1. What's the import for the new Django 2.0 URL syntax?

from django.urls import path

As per the release notes which are worth reading if you haven't yet :)

2. When you run tests…

b) settings.DEBUG is forced to False.

As per the docs

3. The email addresses in settings.ADMINS

b) will be notified of exceptions

As per the docs.

4. Django 1.11 was the first version with a non-optional dependency - what was it on?

pytz

It was optional and highly recommended for many versions before.

5. What's the minimum supported version of Python for Django 2.0?

3.4

Again see the 2.0 release notes!

Fin

Hope you enjoyed doing/reading/skimming this quiz!

13 Dec 2017 6:00am GMT

11 Dec 2017

feedPlanet Twisted

Moshe Zadka: Exploration Driven Development

"It's ok to mess up your own room."

Sometime there is a problem where the design is obvious -- at least to you. Maybe it's simple. Maybe you've solved one like that many times. In those cases, just go ahead -- use Test-Driven-Development, lint your code as you're writing, and push a branch full of beautiful code, ready to be merged.

This is the exception, rather than the rule, though. In real life, half of the time, we have no idea what we are doing. Is a recursive or iterative solution better? How exactly does this SaaS work? What does Product Management actually want? What is, exactly, the answer to life, the universe and everything?

A lot of the time, solving a problem begins with exploratory programming. Writing little snippets, testing them out, writing more snippets, throwing some away when they seem to be going down a bad path, saving some from earlier now that we understand the structure better. Poking and prodding at the problem, until the problem's boundaries become clearer.

This is, after all, why dynamic languages became popular -- Python became popular in web development and scientific computing precisely because in both places, "exploratory programming" is important.

In those cases, every single rule about "proper software development" goes straight out the window. Massive functions are fine, when you don't know how to break them up. Code with one letter variables is fine, when you are likely to throw it away. Code with bad formatting is fine, when you are likely to refactor it. Code with no tests is fine, if it's doing the wrong thing anyway. Code with big commented out sections is fine, if those are likely to prove useful in an hour.

In short, every single rule of "proper software development" goes out the window when we are exploring a problem, testing its boundaries. All but one -- work on a branch, and keep your work backed up. Luckily, all modern version control systems have good branch isolation and easy branch pushing, so there is no problem. No problem, except the social one -- people are embarrassed at writing "bad code". Please don't be. Everyone does it. Just don't merge it into the production code -- that will legitimately annoy other people.

But as long as the mess is in your own room, don't worry about cleaning it up.

11 Dec 2017 4:00am GMT

11 Feb 2016

feedPlanet TurboGears

Christopher Arndt: Organix Roland JX-3P MIDI Expansion Kit

Foreign visitors: to download the Novation Remote SL template for the Roland JX-3P with the Organix MIDI Upgrade, see the link at the bottom of this post. For my last birthday I gave myself a Roland JX-3P, including a DT200 programmer (a PG-200 clone). The JX-3P is a 6-voice analog polysynth from 1983 and […]

11 Feb 2016 8:42pm GMT

13 Jan 2016

feedPlanet TurboGears

Christopher Arndt: Registration for PythonCamp 2016 opens Friday, 15 January 2016

PythonCamp 2016: free exchange of knowledge all around Python. (The following is an announcement for a Python "Un-Conference" in Cologne, Germany and is therefore directed at a German-speaking audience.) Dear Python fans, it's that time again: on Friday, 15 January we open online registration for participants of PythonCamp 2016! The now seventh edition of PythonCamp will once again be […]

13 Jan 2016 3:00pm GMT

02 Nov 2015

feedPlanet TurboGears

Matthew Wilson: Mary Dunbar is the best candidate for Cleveland Heights Council

I'll vote for Mary Dunbar tomorrow in the Cleveland Heights election.

Here's why:

02 Nov 2015 5:14pm GMT

03 Aug 2012

feedPySoy Blog

Juhani Åhman: YA Update

Managed to partially fix the shading rendering issues with the examples. I reckon the rest of the rendering issues are OpenGL ES related, and not something on the libsoy side.
I don't know OpenGL (ES) very well, so I didn't attempt to fix it any further.

I finished implementing a rudimentary pointer controller in pysoy's Client.
There is a pointer.py example program for testing it. Unfortunately it keeps crashing once in a while.
I reckon the problem is something with soy.atoms.Position. Regardless, the pointer controller works.

I started to work on getting the keyboard controller to work too, and of course mouse buttons for the pointer,
but I got stuck when writing the Python bindings for Genie's events (signals). There's no connect method in pysoy, so maybe that needs to be implemented, or perhaps some other solution is needed. I will look into this later.

Plan for this week is to finish documenting bodies, scenes and widgets. I'm about 50% done, and it should be straightforward. Next week I'm finally going to attempt to set up Sphinx and generate readable documentation. I reckon I need to refactor many of the docstrings as well.

03 Aug 2012 12:27pm GMT

10 Jul 2012

feedPySoy Blog

Mayank Singh: Mid-term and dualshock 3

Now that the SoC mid-term has arrived, here's a bit of an update about what I have done so far. The wiimote xinput driver IR update is almost done, though, as can be said about any piece of software, it's never fully complete.
I also corrected the code for Sphere in the libsoy repository to render an actual sphere.
For now I have started on integrating the DualShock 3 controller. I am currently studying the code given here: http://www.pabr.org/sixlinux/sixlinux.en.html and trying to understand how the DualShock works. I also need to write a controller class to be able to grab and move objects around without help from the physics engine.

10 Jul 2012 3:00pm GMT

04 Jul 2012

feedPySoy Blog

Juhani Åhman: Weeks 5-7 update

I have mostly finished writing unit tests for atoms now.
I didn't write tests for Morphs though, since those seem to still be a work in progress.
However, I did encounter a rare memory corruption bug that I'm unable to fix at this point,
because I don't know how to debug it properly.
I can't find the location where the error occurs.

I'm going to spend the rest of this week writing doctests and hopefully getting more examples to work.

04 Jul 2012 9:04am GMT

10 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: King William's Town station

Yesterday morning I had to go to the station in KWT to pick up the bus tickets we had reserved for the Christmas holidays in Cape Town. The station itself has had no train service since December for cost reasons -- but Translux and co., the long-distance buses, have their offices there.






© benste CC NC SA

10 Nov 2011 10:57am GMT

09 Nov 2011

feedPlanet Plone

Andreas Jung: Produce & Publish Plone Client Connector released as open-source

09 Nov 2011 9:30pm GMT

feedPython Software Foundation | GSoC'11 Students

Benedict Stein

Nobody is worried about that kind of thing -- you just drive through by car, and in the city, near Gnobie: "no, it only gets dangerous once the fire brigade is there" -- 30 minutes later, on the way back, the fire brigade was there.




© benste CC NC SA

09 Nov 2011 8:25pm GMT

feedPlanet Plone

ACLARK.NET, LLC: Plone secrets: Episode 4 – Varnish in front

This just in from the production department: use Varnish. (And please forgive the heavily meme-laden approach to describing these techniques :-) .)

Cache ALL the hosts

Our ability to use Varnish in production is no secret by now, or at least it shouldn't be. What is often less clear is exactly how to use it. One way I like[1] is to run Varnish on your public IP, port 80, and make Apache listen on your private IP, port 80. Then proxy from Varnish to Apache and enjoy easy caching goodness on all your virtual hosts in Apache.

Configuration

This should require less than five minutes of down time to implement. First, configure the appropriate settings. (Well, first install Apache and Varnish if you haven't already: `aptitude install varnish apache2` on Ubuntu Linux[0].)

Varnish

To modify the listen IP address and port, we typically edit a file like /etc/default/varnish (in Ubuntu). However you do it, configure the equivalent of the following on your system:

DAEMON_OPTS="-a 174.143.252.11:80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -s malloc,256m"

This environment variable is then passed to varnishd on the command line. Next, pass traffic to Apache like so (in /etc/varnish/default.vcl on Ubuntu):

backend default {
 .host = "127.0.0.1";
 .port = "80";
 }

Now on to Apache.

Please note that the syntax above is for Varnish 3.x and the syntax has (annoyingly) changed from 2.x to 3.x.

Apache

The Apache part is a bit simpler. You just need to change the listen port (on Ubuntu this is done in /etc/apache2/ports.conf), typically from something like:

Listen *:80

to:

Listen 127.0.0.1:80

Restart ALL the services

Now restart both services. If all goes well you shouldn't notice any difference, except better performance, and when you make a website change and need to clear the cache[2]. For this, I rely on telnetting to the varnish port and issuing the `ban.url` command (formerly `url.purge` in 2.x):

$ telnet localhost 6082
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
200 205     
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,2.6.35.4-rscloud,x86_64,-smalloc,-smalloc,-hcritbit

Type 'help' for command list.
Type 'quit' to close CLI session.

ban.url /
200 0
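
If you clear the cache often, the same conversation can be scripted. The following Python sketch is ours (not from the original post); it assumes the admin interface is on localhost:6082 with no CLI secret configured, so adjust for your setup:

import telnetlib

def purge(pattern='/'):
    """Issue a ban.url command over the Varnish admin (telnet) interface."""
    tn = telnetlib.Telnet('localhost', 6082)
    # Wait for the banner shown in the transcript above.
    tn.read_until(b"Type 'quit' to close CLI session.", timeout=2)
    tn.write(b'ban.url ' + pattern.encode('ascii') + b'\n')
    # A successful ban comes back with a "200 ..." status line.
    print(tn.read_until(b'\n', timeout=2).decode('ascii').strip())
    tn.write(b'quit\n')
    tn.close()

purge('/')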

Cache ALL the disks

This site has Varnish and Apache configured as described in this article. It also has disk caching in Apache enabled, thanks to Elizabeth Leddy's article:

As a result, it's PEPPY AS THE DICKENS™ on a 512MB "slice" (Cloud server) from Rackspace Cloud. And now you know yet another "Plone secret". Now go make your Plone sites faster, and let me know how it goes in the comments section below.

Notes

[0] Using the latest distribution, "oneiric".

[1] I first saw this technique at NASA when NASA Science was powered by Plone; I found it odd at the time but years later it makes perfect sense.

[2] Ideally you'd configure this in p.a.caching, but I've not been able to stomach this yet.


09 Nov 2011 5:50pm GMT

feedPlanet Zope.org

Updated MiniPlanet, now with meta-feed

My MiniPlanet Zope product has been working steadily and stably for some years, when suddenly a user request came along: would it be possible to get a feed of all the items in a miniplanet? With this update it became possible. MiniPlanet is an old-styl...

09 Nov 2011 9:41am GMT

08 Nov 2011

feedPlanet Plone

Max M: How to export all redirects from portal_redirection in an older Plone site

Just add the method below to the RedirectionTool and call it from the browser as:

http://localhost:8080/site/portal_redirection/getAllRedirects

Assuming that the site is running at localhost:8080, that is :-S

That will show a list of redirects that can be imported into Plone 4.x


security.declareProtected(View, 'getAllRedirects')
def getAllRedirects(self):
    "get'm'all"
    result = []
    reference_tool = getToolByName(self, 'reference_catalog')
    for k, uuid in self._redirectionmap.items():
        obj = reference_tool.lookupObject(uuid)
        if obj is None:
            print 'could not find redirect from: %s to %s' % (k, uuid)
        else:
            path = '/'.join(('',) + obj.getPhysicalPath()[2:])
            result.append('%s,%s' % (k, path))
    return '\n'.join(result)

08 Nov 2011 2:58pm GMT

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Brai Party

Brai = a barbecue evening or the like.

The would-be technicians, patching their SpeakOn / jack plug splitters...

The ladies ("mamas") of the settlement during the official opening speech

Even though fewer people came than expected: loud music and lots of people ...

And of course a fire with real wood for grilling.

© benste CC NC SA

08 Nov 2011 2:30pm GMT

07 Nov 2011

feedPlanet Zope.org

Welcome to Betabug Sirius

It has been quite some time that I announced_ that I'd be working as a freelancer. Lots of stuff had to be done in that time, but finally things are ready. I've founded my own little company and set up a small website: Welcome to Betabug Sirius!

07 Nov 2011 9:26am GMT

03 Nov 2011

feedPlanet Zope.org

Assertion helper for zope.testbrowser and unittest

zope.testbrowser is a valuable tool for integration tests. Historically, the Zope community used to write quite a lot of doctests, but we at gocept have found them to be rather clumsy and too often yielding neither good tests nor good documentation. That's why we don't use doctest much anymore, and prefer plain unittest.TestCases instead. However, doctest has one very nice feature, ellipsis matching, that is really helpful for checking HTML output, since you can only make assertions about the parts that interest you. For example, given this kind of page:

>>> print browser.contents
<html>
  <head>
    <title>Simple Page</title>
  </head>
  <body>
    <h1>Simple Page</h1>
  </body>
</html>

If all you're interested in is that the <h1> is rendered properly, you can simply say:

>>> print browser.contents
<...<h1>Simple Page</h1>...

We've now ported this functionality to unittest, as assertEllipsis, in gocept.testing. Some examples:

self.assertEllipsis('...bar...', 'foo bar qux')
# -> nothing happens

self.assertEllipsis('foo', 'bar')
# -> AssertionError: Differences (ndiff with -expected +actual):
#    - foo
#    + bar

self.assertNotEllipsis('foo', 'foo')
# -> AssertionError: "Value unexpectedly matches expression 'foo'."

To use it, inherit from gocept.testing.assertion.Ellipsis in addition to unittest.TestCase.
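
Put together, a (contrived) test case might look like this; the HTML is inlined here instead of coming from a real zope.testbrowser session, so treat it as a sketch:

import unittest

import gocept.testing.assertion


class SimplePageTest(gocept.testing.assertion.Ellipsis, unittest.TestCase):

    def test_heading_is_rendered(self):
        # In an integration test this would be browser.contents.
        contents = ('<html><head><title>Simple Page</title></head>'
                    '<body><h1>Simple Page</h1></body></html>')
        self.assertEllipsis('...<h1>Simple Page</h1>...', contents)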


03 Nov 2011 7:19am GMT

19 Nov 2010

feedPlanet CherryPy

Robert Brewer: logging.statistics

Statistics about program operation are an invaluable monitoring and debugging tool. How many requests are being handled per second, how much of various resources are in use, how long we've been up. Unfortunately, the gathering and reporting of these critical values is usually ad-hoc. It would be nice if we had 1) a centralized place for gathering statistical performance data, 2) a system for extrapolating that data into more useful information, and 3) a method of serving that information to both human investigators and monitoring software. I've got a proposal. Let's examine each of those points in more detail.

Data Gathering

Just as Python's logging module provides a common importable for gathering and sending messages, statistics need a similar mechanism, and one that does not require each package which wishes to collect stats to import a third-party module. Therefore, we choose to re-use the logging module by adding a statistics object to it.

That logging.statistics object is a nested dict:

import logging
if not hasattr(logging, 'statistics'): logging.statistics = {}

It is not a custom class, because that would 1) require apps to import a third-party module in order to participate, 2) inhibit innovation in extrapolation approaches and in reporting tools, and 3) be slow. There are, however, some specifications regarding the structure of the dict.

    {
   +----"SQLAlchemy": {
   |        "Inserts": 4389745,
   |        "Inserts per Second":
   |            lambda s: s["Inserts"] / (time() - s["Start"]),
   |  C +---"Table Statistics": {
   |  o |        "widgets": {-----------+
 N |  l |            "Rows": 1.3M,      | Record
 a |  l |            "Inserts": 400,    |
 m |  e |        },---------------------+
 e |  c |        "froobles": {
 s |  t |            "Rows": 7845,
 p |  i |            "Inserts": 0,
 a |  o |        },
 c |  n +---},
 e |        "Slow Queries":
   |            [{"Query": "SELECT * FROM widgets;",
   |              "Processing Time": 47.840923343,
   |              },
   |             ],
   +----},
    }

The logging.statistics dict has strictly 4 levels. The topmost level is nothing more than a set of names to introduce modularity. If SQLAlchemy wanted to participate, it might populate the item logging.statistics['SQLAlchemy'], whose value would be a second-layer dict we call a "namespace". Namespaces help multiple emitters to avoid collisions over key names, and make reports easier to read, to boot. The maintainers of SQLAlchemy should feel free to use more than one namespace if needed (such as 'SQLAlchemy ORM').

Each namespace, then, is a dict of named statistical values, such as 'Requests/sec' or 'Uptime'. You should choose names which will look good on a report: spaces and capitalization are just fine.

In addition to scalars, values in a namespace MAY be a (third-layer) dict, or a list, called a "collection". For example, the CherryPy StatsTool keeps track of what each worker thread is doing (or has most recently done) in a 'Worker Threads' collection, where each key is a thread ID; each value in the subdict MUST be a fourth-layer dict (whew!) of statistical data about each thread. We call each subdict in the collection a "record". Similarly, the StatsTool also keeps a list of slow queries, where each record contains data about each slow query, in order.

Values in a namespace or record may also be functions, which brings us to:

Extrapolation

def extrapolate_statistics(scope):
    """Return an extrapolated copy of the given scope."""
    c = {}
    for k, v in scope.items():
        if isinstance(v, dict):
            v = extrapolate_statistics(v)
        elif isinstance(v, (list, tuple)):
            v = [extrapolate_statistics(record) for record in v]
        elif callable(v):
            v = v(scope)
        c[k] = v
    return c

The collection of statistical data needs to be fast, as close to unnoticeable as possible to the host program. That requires us to minimize I/O, for example, but in Python it also means we need to minimize function calls. So when you are designing your namespace and record values, try to insert the most basic scalar values you already have on hand.

When it comes time to report on the gathered data, however, we usually have much more freedom in what we can calculate. Therefore, whenever reporting tools fetch the contents of logging.statistics for reporting, they first call extrapolate_statistics (passing the whole statistics dict as the only argument). This makes a deep copy of the statistics dict so that the reporting tool can both iterate over it and even change it without harming the original. But it also expands any functions in the dict by calling them. For example, you might have a 'Current Time' entry in the namespace with the value "lambda scope: time.time()". The "scope" parameter is the current namespace dict (or record, if we're currently expanding one of those instead), allowing you access to existing static entries. If you're truly evil, you can even modify more than one entry at a time.

However, don't try to calculate an entry and then use its value in further extrapolations; the order in which the functions are called is not guaranteed. This can lead to a certain amount of duplicated work (or a redesign of your schema), but that's better than complicating the spec.

After the whole thing has been extrapolated, it's time for:

Reporting

A reporting tool would grab the logging.statistics dict, extrapolate it all, and then transform it to (for example) HTML for easy viewing, or JSON for processing by Nagios etc (and because JSON will be a popular output format, you should seriously consider using Python's time module for datetimes and arithmetic, not the datetime module). Each namespace might get its own header and attribute table, plus an extra table for each collection. This is NOT part of the statistics specification; other tools can format how they like.
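
As a sketch (ours, not part of the proposal), the JSON flavour of such a reporting tool can be tiny; it only needs the extrapolate_statistics helper defined above:

import json
import logging

def statistics_as_json():
    """Extrapolate logging.statistics and render it for a monitoring tool."""
    stats = extrapolate_statistics(getattr(logging, 'statistics', {}))
    # default=str guards against values (datetimes, etc.) json can't serialize.
    return json.dumps(stats, indent=4, default=str)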

Turning Collection Off

It is recommended each namespace have an "Enabled" item which, if False, stops collection (but not reporting) of statistical data. Applications SHOULD provide controls to pause and resume collection by setting these entries to False or True, if present.

Usage

    import logging
    # Initialize the repository
    if not hasattr(logging, 'statistics'): logging.statistics = {}
    # Initialize my namespace
    mystats = logging.statistics.setdefault('My Stuff', {})
    # Initialize my namespace's scalars and collections
    mystats.update({
        'Enabled': True,
        'Start Time': time.time(),
        'Important Events': 0,
        'Events/Second': lambda s: (
            (s['Important Events'] / (time.time() - s['Start Time']))),
        })
    ...
    for event in events:
        ...
        # Collect stats
        if mystats.get('Enabled', False):
            mystats['Important Events'] += 1

Original post blogged on b2evolution.

19 Nov 2010 7:08am GMT

12 Nov 2010

feedPlanet CherryPy

Kevin Dangoor: Paver is now on GitHub, thanks to Almad

Paver, the project scripting tool for Python, has just moved to GitHub thanks to Almad. Almad has stepped forward and offered to properly bring Paver into the second decade of the 21st century (doesn't have the same ring to it as bringing something into the 21st century, does it? :)

Seriously, though, Paver reached the point where it was good enough for me and did what I wanted (and, apparently, what a good number of other people wanted as well). Almad has some thoughts on where the project should go next, and I'm looking forward to hearing more about them. Sign up for the Google group to see where Paver is going next.

12 Nov 2010 3:11am GMT

09 Nov 2010

feedPlanet CherryPy

Kevin Dangoor: Paver: project that works, has users, needs a leader

Paver is a Python project scripting tool that I initially created in 2007 to automate a whole bunch of tasks around projects that I was working on. It knows about setuptools and distutils, and it has some ideas on handling documentation with example code. It also has users who occasionally like to send in patches. The latest release has had more than 3700 downloads on PyPI.

Paver hasn't needed a lot of work, because it does what it says on the tin: helps you automate project tasks. Sure, there's always more that one could do. But, there isn't more that's required for it to be a useful tool, day-to-day.

Here's the point of my post: Paver is in danger of being abandoned. At this point, everything significant that I am doing is in JavaScript, not Python. The email and patch traffic is low, but it's still too much for someone that's not even actively using the tool any more.

If you're a Paver user and either:

1. want to take the project in fanciful new directions or,

2. want to keep the project humming along with a new .x release every now and then

please let me know.

09 Nov 2010 7:44pm GMT