16 Dec 2017

Django community aggregator: Community blog posts

Building APIs with Django and GraphQL

This tutorial will introduce you to GraphQL with Python, Django, and Graphene. We'll create a simple Django project to demonstrate how to build an API server based on GraphQL (instead of REST), then we'll use GraphiQL, an interface for testing GraphQL queries and mutations before building your front-end application, to send GraphQL queries (for fetching data) and mutations (for posting and updating data). In this part we'll deal with building the backend. In the next tutorials, we'll see how to use frameworks and libraries such as Angular and React to build a front-end application that consumes and updates our GraphQL server, and cover advanced use cases such as user authentication, permissions, and Relay.

Make sure to follow me on Twitter (@techiediaries) to be notified once the next tutorial parts are ready.

GraphQL is a modern standard for building Web APIs, invented and used internally by Facebook for its native mobile applications and later open sourced. GraphQL provides a more powerful and flexible alternative to REST.

Before we dive into GraphQL concepts, let's understand what REST is:

REST stands for Representational State Transfer, an architectural style for designing client/server distributed systems. Unlike GraphQL, it's not a standard but a set of constraints, such as having a uniform interface and statelessness, which means each HTTP request has to include all the information needed by the server to fulfill it, instead of relying on the server to keep track of previous requests or store session state.

These are the principles of REST:

  • Resources: expose easily understood directory structure URIs.
  • Representations: transfer JSON or XML to represent data objects and attributes.
  • Messages: use HTTP methods explicitly (for example, GET, POST, PUT, and DELETE).
  • Stateless: interactions store no client context on the server between requests; state dependencies limit and restrict scalability, so the client holds the session state. (Source: https://spring.io/understanding/REST)

If you want to know more, watch this video, by Google Developers, that goes over the basic principles behind REST.

An API (Application Programming Interface) provides a way (or an interface) for clients to fetch data from servers, establishing a relationship between a client and a server. In the case of REST, the established interface is inflexible and strongly coupled to the server implementation, so if the server's implementation changes, the dependent clients will, more often than not, break.

In essence, GraphQL allows developers to declaratively fetch data from a server. Most importantly, clients are able to specify exactly what data they need. Unlike REST APIs, where you usually have multiple endpoints that provide fixed data shapes, a GraphQL server exposes a single endpoint that provides the requesting clients with exactly the data they ask for, no less, no more.

Besides Facebook, GraphQL has been adopted by many big companies, such as GitHub, Twitter, and Shopify.

Nowadays, companies have rapidly and frequently changing data requirements, and they are investing more and more money and time rewriting clients that access data through REST APIs. GraphQL provides a better way for developers to satisfy these uncertain data requirements.

REST vs. GraphQL

GraphQL allows you to query your server for the exact data that you need by sending a single request even for fetching data from related models.

With a REST API, you will usually communicate with multiple endpoints to fetch data from a server. Suppose you have the endpoint /users/<id> for accessing a user's data by id, the endpoint /users/<id>/posts for accessing a user's posts, and a last one, /users/<id>/followers, for accessing the user's followers.


A client application that needs data from this API must send three HTTP requests to the three available endpoints to fetch all the required data. First, this causes many round-trips to the server (consuming more resources); second, there will likely be extra data fetched that is never used (this is called overfetching), because the client has no control over the requested data: in REST, the server dictates the shape of the data it sends.


The same data can be fetched by sending only a single request to the GraphQL endpoint; the request contains a query describing all the client's data requirements. The server parses the query and sends back the requested data as a JSON object in the exact shape requested by the client.
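To make the difference concrete, here is a small Python sketch using the requests library (the host, endpoints, and fields are hypothetical, just for illustration):

import requests

BASE = 'https://api.example.com'  # hypothetical API host

# REST: three round-trips to three endpoints
user = requests.get(BASE + '/users/42').json()
posts = requests.get(BASE + '/users/42/posts').json()
followers = requests.get(BASE + '/users/42/followers').json()

# GraphQL: one round-trip to a single endpoint; the query travels in the body
query = '''
query {
    user(id: 42) {
        name
        posts { title }
        followers { name }
    }
}
'''
data = requests.post(BASE + '/graphql', json={'query': query}).json()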

As a recap: with REST-based APIs, the client either gets less data than expected (underfetching), in which case it needs to send more requests to retrieve everything it needs, or it gets more data (overfetching) than it actually needs, which consumes server resources for no reason.

Thanks to GraphQL, the client can describe the exact shape of the requested data with a JSON-like query, and the server takes care of sending the data back in that shape.

Let's take one more simple example to understand how GraphQL works.

Suppose we have a Django web application with two models, Product and Family:


from django.db import models

class Product(models.Model):
    ...  # product attributes
    # each product belongs to a family:
    family = models.ForeignKey('Family', on_delete=models.CASCADE)

class Family(models.Model):
    ...  # family attributes


Each product belongs to a family, so the two models are related by a foreign key.

Now, if you need to build a REST API for this app, you will have to create multiple endpoints, such as /products for listing products and /product/:id for retrieving a single product.

Next, suppose you want to get all the products of a given family: you will need to add yet another endpoint, for example /family/:id/products, where :id is the identifier of the family.

Let's also suppose that a request to the endpoint /product/1 returns a serialized object, for example:

{
    "id": 1,
    "reference": "PR001",
    "name": "Product 001",
    "description": "Description of Product 001"
}

The problem: what if you need to build another front-end app, maybe for mobile devices, that needs more data, for example a quantity attribute? In this case, you have to add another endpoint or modify the existing one to include the quantity.

In the case of REST-based APIs, the server's API architecture is strongly coupled with the client implementation, so if you change the API implementation on the server, you'll likely end up breaking existing clients. And if you need to add another client that needs less or more data from some endpoint(s), you'll have to change the server code in a way that doesn't break the existing clients. In many cases, that means keeping the old endpoints and adding new ones.

If you have ever developed an API with Django or any other framework, then you have certainly experienced some or all of these issues. Thanks to Facebook, GraphQL presents a solution!

Continuing with the simple example above, in the case of GraphQL we can send a query that may look like this:

query {  
    product(id:1) {
        id,
        reference,
        quantity
    }
}

Which is going to return something like:

{
    "data": {
        "product": {
            "id": "1",
            "reference": "PR001",
            "quantity": 1000
        }
    }
}

We omitted two attributes (name and description) without causing any problems or changing the underlying server API.

Just query the data you want and the server will be able to send it back to you.

Now, if you need to get all the products of a given family with GraphQL, say the family with id equal to 1, you can simply send:

query {  
    family(id:1) {
        id
        products {
            id,    
            reference,
            name,
            description,
            quantity
        }
    }
}

If you send this query to your GraphQL server, you'll get something similar to:


{
    "data": {
        "family": {
            "id": "1",
            "products": [
                {"id": "1", "reference": "PR001", "name": "Product1", "description": "..."}
            ]
        }
    }
}

Even though this example is fairly simple, you can see how powerful this technology can be for building Web APIs.

Building a GraphQL Django Application

After this introduction to GraphQL vs. REST, let's now learn how to build a simple real-world web application with Django and Graphene, a Python implementation of GraphQL.

This tutorial assumes you have already set up your development machine to work with Python. You need to have Python, pip (the Python package manager), and optionally virtualenv installed. Python can be easily installed by grabbing the binaries for your operating system from the official website.

pip is installed with Python if you have Python 2 >= 2.7.9 or Python 3 >= 3.4 binaries downloaded from the official python.org website. Otherwise, you can install it using get-pip.py:

python get-pip.py

For managing virtual environments, you can install virtualenvwrapper (which brings in virtualenv) using pip:

pip install virtualenvwrapper

Let's start by creating a new, isolated virtual environment for our project's dependencies, and then install the required packages, including Django.

Head over to your terminal in Linux/Mac or command prompt in Windows then run the following:

virtualenv graphqlenv 
source graphqlenv/bin/activate 

This will create a new virtual environment and activate it.

Next, install the django and graphene_django packages with pip:

pip install django 
pip install graphene_django

The graphene_django package also ships with GraphiQL, an in-browser user interface for testing GraphQL queries against your server, so there is no separate package to install.

Next let's create a Django project and add a single application:

django-admin startproject product_inventory_manager .
python manage.py startapp inventory

Open settings.py then add inventory and graphene_django to the INSTALLED_APPS array:


INSTALLED_APPS = [
        'django.contrib.admin',
        'django.contrib.auth',
        'django.contrib.contenttypes',
        'django.contrib.sessions',
        'django.contrib.messages',
        'django.contrib.staticfiles',
        'graphene_django',
        'inventory'
]

Next migrate the database with:

python manage.py migrate 

Adding Django Models

Open inventory/models.py then add:


    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals

    from django.db import models

    class Product(models.Model):

        sku = models.CharField(max_length=13,help_text="Enter Product Stock Keeping Unit")
        barcode = models.CharField(max_length=13,help_text="Enter Product Barcode (ISBN, UPC ...)")

        title = models.CharField(max_length=200, help_text="Enter Product Title")
        description = models.TextField(help_text="Enter Product Description")

        unitCost = models.FloatField(help_text="Enter Product Unit Cost")
        unit = models.CharField(max_length=10,help_text="Enter Product Unit ")

        quantity = models.FloatField(help_text="Enter Product Quantity")
        minQuantity = models.FloatField(help_text="Enter Product Min Quantity")

        family = models.ForeignKey('Family', on_delete=models.CASCADE)  # on_delete is required as of Django 2.0
        location = models.ForeignKey('Location', on_delete=models.CASCADE)


        def __str__(self):

            return self.title


    class Family(models.Model):

        reference = models.CharField(max_length=13, help_text="Enter Family Reference")
        title = models.CharField(max_length=200, help_text="Enter Family Title")
        description = models.TextField(help_text="Enter Family Description")

        unit = models.CharField(max_length=10,help_text="Enter Family Unit ")

        minQuantity = models.FloatField(help_text="Enter Family Min Quantity")


        def __str__(self):

            return self.title


    class Location(models.Model):


        reference = models.CharField(max_length=20, help_text="Enter Location Reference")
        title = models.CharField(max_length=200, help_text="Enter Location Title")
        description = models.TextField(help_text="Enter Location Description")

        def __str__(self):

            return self.title


    class Transaction(models.Model):

        sku = models.CharField(max_length=13,help_text="Enter Product Stock Keeping Unit")
        barcode = models.CharField(max_length=13,help_text="Enter Product Barcode (ISBN, UPC ...)")

        comment = models.TextField(help_text="Enter a Comment")

        unitCost = models.FloatField(help_text="Enter Product Unit Cost")

        quantity = models.FloatField(help_text="Enter Product Quantity")

        product = models.ForeignKey('Product', on_delete=models.CASCADE)

        date = models.DateField(null=True, blank=True)

        REASONS = (
            ('ns', 'New Stock'),
            ('ur', 'Usable Return'),
            ('nr', 'Unusable Return'),
        )


        reason = models.CharField(max_length=2, choices=REASONS, blank=True, default='ns', help_text='Reason for transaction')

        def __str__(self):

            return 'Transaction :  %d' % (self.id)


Next create migrations and apply them:

python manage.py makemigrations
python manage.py migrate

Adding the Admin Interface

The next thing is to add the models to the admin interface so we can add some test data:

Open inventory/admin.py and add:


    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals

    from django.contrib import admin

    from .models import Product, Family, Location, Transaction
    # Register your models here.

    admin.site.register(Product)
    admin.site.register(Family)
    admin.site.register(Location)
    admin.site.register(Transaction)

Next, create a superuser login to be able to access the admin app:

python manage.py createsuperuser 

Enter the username and password when prompted and hit enter.

Now run the local development server with:

python manage.py runserver

Navigate to http://127.0.0.1:8000/admin with your browser. Login and add some data for each model.
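If you prefer the shell over the admin, you can also create test data with the Django shell; here's a quick sketch with made-up values (run python manage.py shell first):

from inventory.models import Family, Location, Product

family = Family.objects.create(reference='FM001', title='Family 001',
                               description='...', unit='kg', minQuantity=1)
location = Location.objects.create(reference='LC001', title='Location 001',
                                   description='...')
Product.objects.create(sku='Product001', barcode='1234567890123',
                       title='Product 001', description='...',
                       unitCost=10.0, unit='kg', quantity=100,
                       minQuantity=5, family=family, location=location)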

GraphQL Concepts: the Schema and Type System

GraphQL is a strongly typed query language that can be used to describe the data structures of an API. GraphQL uses the concepts of schemas and types: types define what is exposed by the API, and they are grouped into a schema using GraphQL's Schema Definition Language (SDL).

The schema can be considered as a contract between the client and the server which states how a client can access the data in the server.
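For example, the part of a schema covering our products might look like the following in SDL. This is only an illustrative sketch; with graphene_django we will generate the schema from Python code instead of writing SDL by hand:

type Product {
    id: ID!
    sku: String
    title: String
    family: Family
}

type Query {
    allProducts: [Product]
    product(id: Int): Product
}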

Adding GraphQL Support: the Schema and the Object Types

To be able to execute GraphQL queries against your web application you need to add a Schema, Object Types and a view function that receives the GraphQL queries.

Creating the Schema

Create inventory/schema.py then:

First, create a subclass of DjangoObjectType for each model you want to query with GraphQL:


    import graphene

    from graphene_django.types import DjangoObjectType

    from .models import Family , Location , Product , Transaction 

    class FamilyType(DjangoObjectType):
        class Meta:
            model = Family 

    class LocationType(DjangoObjectType):
        class Meta:
            model = Location 

    class ProductType(DjangoObjectType):
        class Meta:
            model = Product 

    class TransactionType(DjangoObjectType):
        class Meta:
            model = Transaction


Next, create an abstract query, i.e. a subclass of graphene.AbstractType (it's abstract because it's an app-level query). For each app you have, you create an app-level abstract query, then combine all the abstract queries into a concrete project-level query.

You need to define a graphene.List field for each DjangoObjectType, then create a resolve_xxx() method for each of those fields:


    class Query(graphene.AbstractType):
        all_families = graphene.List(FamilyType)
        all_locations = graphene.List(LocationType)
        all_products = graphene.List(ProductType)
        all_transactions = graphene.List(TransactionType)

        def resolve_all_families(self, args, context, info):
            return Family.objects.all()

        def resolve_all_locations(self, args, context, info):
            return Location.objects.all()

        def resolve_all_products(self, args, context, info):
            return Product.objects.all()

        def resolve_all_transactions(self, args, context, info):
            return Transaction.objects.all()
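Note that the resolver signature above is the Graphene 1.x style. If you are on Graphene/graphene_django 2.0 or later, AbstractType is deprecated (app-level queries subclass graphene.ObjectType directly) and resolvers receive an info object instead, so the equivalent would look something like this:

class Query(graphene.ObjectType):
    all_families = graphene.List(FamilyType)

    def resolve_all_families(self, info, **kwargs):
        # the request is available as info.context
        return Family.objects.all()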

Creating the Project Level Query

Next, create a project-level Query. Create a schema.py file inside the product_inventory_manager package (next to settings.py), then add:


    import graphene

    import inventory.schema 


    class Query(inventory.schema.Query, graphene.ObjectType):
        # This class extends all abstract apps level Queries and graphene.ObjectType
        pass

    schema = graphene.Schema(query=Query)

So we first create a Query class that extends all the abstract app-level queries as well as graphene.ObjectType, then we create a graphene.Schema object that takes the Query class as a parameter.

Now we need to add a GRAPHENE config object to settings.py:


GRAPHENE = {
        'SCHEMA': 'product_inventory_manager.schema.schema'
} 

Adding the GraphQL View

With GraphQL, you don't need multiple endpoints, only one, so let's create it:

Open urls.py then add:


    from django.conf.urls import url
    from django.contrib import admin

    from graphene_django.views import GraphQLView

    from product_inventory_manager.schema import schema

    urlpatterns = [
        url(r'^admin/', admin.site.urls),
        url(r'^graphql', GraphQLView.as_view(graphiql=True)),
    ]

As noted earlier, graphene_django ships with GraphiQL, a user interface for testing GraphQL queries; to enable it, you just set the graphiql parameter to True.

Serving the App and Testing GraphQL

Now you are ready to test the GraphQL API, so start by serving your Django app with:

python manage.py runserver 

Then navigate to http://localhost:8000/graphql with your browser and run some queries.

Fetching Data with Queries

In traditional REST APIs, you fetch data by sending HTTP GET requests to predetermined endpoints, where each endpoint returns a predefined, rigid data structure. The only way to express your client's data requirements is through the URLs of the available endpoints and their associated parameters, so the client doesn't have much flexibility in defining what it needs.

With GraphQL, things are very different. The client only needs to communicate with a single endpoint, which can return all the requested information in flexible data structures. But since there is only one endpoint, the server needs more information to properly figure out the client's data requirements. That is the role of the query: a JSON-like document (written in GraphQL syntax, not actual JSON) that defines those requirements.

Example Queries

Let's take a simple example of a query that can be sent to our GraphQL server:


query {
    allProducts {
        id,
        sku
    }
}   

The allProducts field in the previous query is the root field. What follows the root field (i.e. id and sku) is the payload of the query.

This previous query returns an array of all products which are currently stored in the database. Here's an example response:

{
    "data": {
        "allProducts": [
        {
            "id": "1",
            "sku": "Product001"
        }
        /*...*/
        ]
    }
}   

You can see that each returned product has only the two fields specified in the query, an id and a sku, even if the product model has ten fields.

If the client needs more than that, all it has to do is add the field to the query.

You can experiment with the other models and you can also add fields.

Now, one question you should ask: how do you get the names of the queries?

It's simple: take the name of the field you created in the abstract query and transform it to camel case.

For example:


all_families = graphene.List(FamilyType) => allFamilies

all_locations = graphene.List(LocationType) => allLocations 

all_products = graphene.List(ProductType) => allProducts 

all_transactions = graphene.List(TransactionType) => allTransactions 

Then for each query specify the model fields you want to retrieve.

How to Query the Relationships or Nested Data?

You can also query relationships, i.e. nested data. Let's suppose you need to get all families with their products. You can simply send this query:


query {
    allFamilies {
        id,
        reference, 
        productSet {
            id,
            sku 
        }
    }
}

Note that reverse foreign key relations are exposed as <model>Set fields for nested lists: here productSet, which comes from Django's default product_set related name rendered in camel case.

An example response would look like the following:


{
    "data": {
        "allFamilies": [
            {
                "id": "1",
                "reference": "FM001",
                "productSet": [
                    {
                        "id": "1",
                        "sku": "Product001"
                    }
                ]
            },
            {
                "id": "2",
                "reference": "FM001",
                "productSet": []
            }
        ]
    }
}

Now, what if you need the parent family and the location of each product? That's also easy with GraphQL:

query {
    allProducts {
        id,
        sku,
        family {
            id
        }
        location {
            id
        }
    }
}

Querying Single Items Using Query Arguments

We have seen how to query all items, but what if you need just one item by id? Go back to the abstract query in your app's schema.py file, then update it to be able to query for a single product:


product = graphene.Field(ProductType, id=graphene.Int())


Then add a resolve_product() method:


def resolve_product(self, args, context, info):
    id = args.get('id')

    if id is not None:
        return Product.objects.get(pk=id)

    return None

In GraphQL, fields may have arguments that are specified between parentheses in the schema (just like a function declaration). For example, the product field can take an id argument to return the product with that id. Here's what an example query may look like:


query {
    product(id: 1) {
        sku,
        barcode
    }
}

In the same way, you can add support for getting single families, locations and transactions.
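For example, the analogous field and resolver for a single family might look like this (the same pattern, with the type and model swapped):

family = graphene.Field(FamilyType, id=graphene.Int())

def resolve_family(self, args, context, info):
    id = args.get('id')

    if id is not None:
        return Family.objects.get(pk=id)

    return None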

Writing Data with Mutations

A Mutation is a special ObjectType that can be used to create or modify objects on the GraphQL server.

import graphene

class CreateProduct(graphene.Mutation):
    class Arguments:
        sku = graphene.String()
        barcode = graphene.String()

    result = graphene.Boolean()
    product = graphene.Field(ProductType)

    def mutate(self, info, sku, barcode):
        # create and save the new product (the other model fields are
        # omitted here for brevity; a real mutation would accept them too)
        product = Product(sku=sku, barcode=barcode)
        product.save()
        return CreateProduct(product=product, result=True)

product and result are the output fields of the mutation once it's resolved.

The Arguments class declares the inputs that the CreateProduct mutation needs for resolving; in this case, sku and barcode.

mutate() is the method that is invoked when the mutation is called.
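For the mutation to be callable, it also needs to be wired into the schema; here is a minimal sketch following the structure of our project-level schema.py:

class Mutation(graphene.ObjectType):
    create_product = CreateProduct.Field()

# extend the earlier schema definition to include mutations
schema = graphene.Schema(query=Query, mutation=Mutation)

You can then invoke it from GraphiQL with something like mutation { createProduct(sku: "PR002", barcode: "0123456789012") { result } }.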

Conclusion

GraphQL is a very powerful technology for building Web APIs, and thanks to graphene_django you can easily add GraphQL support to your Django project.

You can find the code in this GitHub repository

Thanks for reading!

16 Dec 2017 7:05pm GMT


13 Dec 2017


Django Quiz 2017


Yesterday evening I gave a quiz at the London Django Meetup Group for the second year running. Here it is so you can do it at home (no cheating!). Answers are at the bottom.

Part 1: Trivia

1. What species is Django's unofficial spirit animal?

  1. Pegasus
  2. Unicorn
  3. Pony
  4. Seal
  5. Dolphin
  6. Elephant

2. Djangocon EU this year was in…

  1. Bologna
  2. Genoa
  3. Venice
  4. Florence

3. What does LTS stand for?

  1. Long Tail Support
  2. Long Term Support
  3. Life Time Support
  4. Life Term Support

4. What does WSGI stand for?

  1. Web Socket Gateway Interface
  2. Web Server Gateway Interface
  3. Web Server Gated Interface
  4. WebS GuardIan

5. What does ACID stand for?

  1. Atomicity Consistency Integrity Durability
  2. Atomicity Concurrency Isolation Durability
  3. Atomicity Consistency Isolation Durability
  4. All Carefully Inserted Data

6. When was the first commit on Django?

One point for year, one for month, one for day

7. When was the first commit in Python?

One point for year, one for month, one for day

8. What is the name of the current Django fellow?

One point for first name, one for last

Part 2: Coding with Django

1. What's the import for the new Django 2.0 URL syntax?

  1. from django.paths import url
  2. from django.urls import path
  3. from django.urls import url
  4. from django.urls import fantastic_new_url

2. When you run tests…

  1. settings.DEBUG is forced to True
  2. settings.DEBUG is forced to False
  3. They fail if settings.DEBUG is not True
  4. They fail if settings.DEBUG is not False

3. The email addresses in settings.ADMINS

  1. will be notified of 404 errors and exceptions
  2. will be notified of exceptions
  3. will be the only ones allowed to use the Admin
  4. will be notified of bots crawling the sites

4. Django 1.11 was the first version with a non-optional dependency - what was it on?

Give the PyPI package name.

5. What's the minimum supported version of Python for Django 2.0?

  1. 2.7
  2. 2.8
  3. 2.999999999999999
  4. 3.3
  5. 3.4
  6. 3.5
  7. 3.6

ANSWERS

But first, some vertical space.

A

N

S

W

E

R

S

B

E

L

O

W

Part 1: Trivia

1. What species is Django's unofficial spirit animal?

a) Pegasus

Although called the Django Pony, it's a winged horse, aka pegasus.

2. Djangocon EU this year was in…

d) Florence

See 2017.djangocon.eu. The YouTube channel has some talks worth watching.

3. What does LTS stand for?

b) Long Term Support

Though some would like it to last a life time :)

4. What does WSGI stand for?

b) Web Server Gateway Interface

5. What does ACID stand for?

c) Atomicity Consistency Isolation Durability

6. When was the first commit on Django?

2005-07-13

Jacob Kaplan-Moss in SVN (now imported into Git): "Created basic repository structure". See the commit on GitHub.

7. When was the first commit in Python?

1990-08-09

"Initial revision" by Guido van Rossum - see GitHub.

8. What is the name of the current Django fellow?

Tim Graham

Part 2: Coding with Django

1. What's the import for the new Django 2.0 URL syntax?

from django.urls import path

As per the release notes which are worth reading if you haven't yet :)

2. When you run tests…

b) settings.DEBUG is forced to False.

As per the docs

3. The email addresses in settings.ADMINS

b) will be notified of exceptions

As per the docs.

4. Django 1.11 was the first version with a non-optional dependency - what was it on?

pytz

It was optional and highly recommended for many versions before.

5. What's the minimum supported version of Python for Django 2.0?

3.4

Again see the 2.0 release notes!

Fin

Hope you enjoyed doing/reading/skimming this quiz!

13 Dec 2017 6:00am GMT

09 Dec 2017


From MySQL to PostgreSQL

In this article I will guide you through the steps I had to take to migrate Django projects from MySQL to PostgreSQL.

MySQL has proven to be a good start for small and medium-scale projects. It is widely known and used, and has good documentation. There are also great clients for easy management, like phpMyAdmin (web), HeidiSQL (Windows), or Sequel Pro (macOS). However, in my professional life there were unfortunate moments when databases from different projects crashed because of large queries or file-system errors. I have also seen integrity errors creep into MySQL databases over the years because of various bugs at the application level.

When one thinks about scaling a project, they have to choose something more suitable: something fast and reliable that supports the ANSI SQL standard well. Something that most top Django developers use. The database of choice for most professionals happens to be PostgreSQL. PostgreSQL enables several vendor-specific features that are not possible with MySQL, e.g. multidimensional arrays, JSON fields, key-value pair fields, special case-insensitive text fields, range fields, special indexes, full-text search, etc. For a newcomer, the best client that I know, pgAdmin (macOS, Linux, Windows), might seem too complex at first compared with the MySQL clients, but as you have the Django administration and a handy ORM, you probably won't need to inspect the database in raw format too often.

So what does it take to migrate from MySQL to PostgreSQL? We will do that in a few steps and we will be using pgloader to help us with data migration. I learned about this tool from Louise Grandjonc, who was giving a presentation about PostgreSQL query optimization at DjangoCon Europe 2017.

One prerequisite for the migration are passing tests. You need to have functional tests to check if all pages are functioning correctly and unit tests to check at least the most critical or complex classes, methods, and functions.

1. Prepare your MySQL database

Make sure that your production MySQL database migration state is up to date:

(env)$ python manage.py migrate --settings=settings.production

Then create a local copy of your production MySQL database. We are going to use it for the migration.

2. Install pgloader

As I mentioned before, for the database migration we will use a tool called pgloader (version 3.4.1 or higher). This tool was programmed by Dimitri Fontaine and is available as an open source project on GitHub. You can compile the required version from the source. Or if you are using macOS, you can install it with Homebrew:

$ brew update
$ brew install pgloader

Note that PostgreSQL will also be installed as a dependency.

3. Create a new PostgreSQL user and database

Unlike with MySQL, creating new database users and databases with PostgreSQL usually happen in the shell rather than in the database client.

Let's create a user and database with the same name myproject.

$ createuser --createdb --password myproject
$ createdb --username=myproject myproject

The --createdb parameter will enable privilege to create databases. The --password parameter will offer to enter a password. The --username parameter will set the owner of the created database.

4. Create the schema

Link the project to this new PostgreSQL database in the settings, for example:

DATABASES = {
    'postgresql': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': get_secret("DATABASE_NAME"),
        'USER': get_secret("DATABASE_USER"),
        'PASSWORD': get_secret("DATABASE_PASSWORD"),
    },
}
DATABASES['default'] = DATABASES['postgresql']

Here the custom get_secret() function returns sensitive information from environment variables or a text file that is not tracked under version control. Its implementation is up to you.
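For example, a minimal implementation based on environment variables might look like this (just a sketch; yours may read from an untracked file or a secrets manager instead):

import os

from django.core.exceptions import ImproperlyConfigured

def get_secret(setting):
    """Return the secret value or fail loudly if it is not configured."""
    try:
        return os.environ[setting]
    except KeyError:
        raise ImproperlyConfigured('Set the {} environment variable'.format(setting))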

Run the migrations to create tables and foreign key constraints in the new PostgreSQL database:

(env)$ python manage.py migrate --settings=settings.local

5. Create the data migration script

The pgloader uses configuration files with the settings defining how to deal with migrations. Let's create the configuration file myproject.load with the following content:

LOAD DATABASE
FROM mysql://mysql_username:mysql_password@localhost/mysql_dbname
INTO postgresql:///myproject
WITH truncate, data only, disable triggers, preserve index names, include no drop, reset sequences
ALTER SCHEMA 'mysql_dbname' RENAME TO 'public'
;

6. Run data migration

Now it's time to copy the data:

$ pgloader myproject.load

Typically you will get a bunch of warnings about type conversions. These can usually be ignored, because the script takes its best guess at how to convert the data when importing. If, in addition, you get errors about duplicated data or tables with foreign keys to missing entries, you will need to fix those issues in the MySQL database and then repeat the process. In that case, clean up the MySQL database, update your local copy, recreate the PostgreSQL database with the dropdb and createdb commands, run the Django migrations to create the database schema, and copy the data again.
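After the copy finishes, it is worth sanity-checking the result, for example by comparing row counts per model through the ORM (a quick sketch to run in python manage.py shell against both databases):

from django.apps import apps

for model in apps.get_models():
    print(model.__name__, model.objects.count())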

7. Adapt the code

When the database is successfully migrated, we should run the Django project's tests and fix all PostgreSQL-specific problems in the project's code. Code using the Django ORM will run smoothly, but very likely there will be issues with raw SQL, the QuerySet's extra() method, and type conversions.

Typically, there are a few such differences between the two databases that you might have to keep in mind.

8. Repeat the process for production

When you are sure that the migration process is fluent and all Django tests pass, you can take your production website down, repeat the migration process locally with the latest production data, copy the migrated local database to the production server, update the production code, install the new dependencies, and take the website back online.

To create a database dump you can use command:

$ pg_dump --format=c --compress=9 --file=myproject.backup myproject

To restore or create the database from dump use commands:

$ dropdb --username=pgsql myproject
$ createdb --username=myproject myproject
$ pg_restore --dbname=myproject --role=myproject --schema=public myproject.backup

I have probably missed some points, and there are ways to automate the upgrade process for production, but you get the idea.

Conclusion

PostgreSQL is more restrictive than MySQL, but it provides greater performance, more stability, and better standards compliance. In addition, PostgreSQL has a bunch of features that are not available in MySQL. If you are lucky, you can switch your project from MySQL to PostgreSQL in one day.

09 Dec 2017 3:51am GMT

08 Dec 2017

feedDjango community aggregator: Community blog posts

Really simple Django view function timer decorator

I use this sometimes to get insight into how long some view functions take. Perhaps you find it useful too:

import functools
import time


def view_function_timer(prefix='', writeto=print):

    def decorator(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            try:
                t0 = time.time()
                return func(*args, **kwargs)
            finally:
                t1 = time.time()
                writeto(
                    'View Function',
                    '({})'.format(prefix) if prefix else '',
                    func.__name__,
                    args[1:],
                    'Took',
                    '{:.2f}ms'.format(1000 * (t1 - t0)),
                    args[0].build_absolute_uri(),
                )
        return inner

    return decorator

And to use it:

from wherever import view_function_timer


@view_function_timer()
def homepage(request, thing):
    ...
    return render(request, template, context)

And then it prints something like this:

View Function  homepage ('valueofthing',) Took 23.22ms http://localhost:8000/home/valueofthing

It's useful when you don't want a full-blown solution to measure all view functions with a middleware or something.
It can also be useful to see how a cache decorator might work:

from django.views.decorators.cache import cache_page
from wherever import view_function_timer


@view_function_timer('possibly cached')
@cache_page(60 * 60 * 2)  # two hours cache
@view_function_timer('not cached')
def homepage(request, thing):
    ...
    return render(request, template, context)

That way you can trace, with tail -f or something, how and whether the caching decorator works.

There are many better solutions that are more robust, but they might be a bigger investment. For example, I would recommend markus, which, if you don't have a statsd server, you can configure to log the timings with logger.info calls.

08 Dec 2017 9:07pm GMT

Django version 2.0 // A Few Key Features

I'll be updating this post fro...

08 Dec 2017 8:36am GMT

07 Dec 2017

feedDjango community aggregator: Community blog posts

Accessing Model's/Object's Verbose Name in Django Template

Let's say you have a model:

from django.db import models

class Snippet(models.Model):

    ....

    class Meta:
        verbose_name = 'Snippet'
        verbose_name_plural = 'Snippets'

And you want to print the model's verbose name in a Django template.

Try to do it as {{ object._meta.verbose_name }} and you will fail - Django template won ...
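
A common workaround (a minimal sketch, not necessarily the solution from the full post) is to expose the verbose name through a model method, since the Django template engine refuses to look up names starting with an underscore:

from django.db import models

class Snippet(models.Model):
    class Meta:
        verbose_name = 'Snippet'
        verbose_name_plural = 'Snippets'

    def get_verbose_name(self):
        # No leading underscore, so templates can call it:
        # {{ object.get_verbose_name }}
        return self._meta.verbose_name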

Read now

07 Dec 2017 9:45pm GMT

06 Dec 2017

feedDjango community aggregator: Community blog posts

My rules for releasing open source software

I maintain and help maintain quite a few open source Python packages. Possibly well-known packages include django-debug-toolbar, django-ckeditor, django-mptt, and FeinCMS and its successor feincms3.

Open source development used to stress me greatly. I was always worrying about whether the code was polished enough, whether I had introduced new bugs, and whether the documentation was sufficient.

These days I still think about these things, but I do not worry as much as I used to, thanks to the following principles:

  1. A fully passing test suite on Travis CI is a sufficient quality guarantee for a release.
  2. Do not worry about release notes, but always keep the CHANGELOG up to date.
  3. Put out patch releases even for the smallest bugfixes and feature additions (as long as they are backwards compatible). Nobody wants to wait for the next big release, it always takes longer than intended.
  4. Good enough is perfection.

06 Dec 2017 9:53pm GMT

Django 2.0 Window expressions tutorial

Django 2.0 was released recently, and among the most exciting things for me is the support for window expressions, which allows adding an OVER clause to querysets. We will use window expressions to analyze the commit data of the Django repo.

So what is an OVER clause?

An OVER clause looks like this:

SELECT depname, empno, salary,
  avg(salary)
    OVER (PARTITION BY depname)
FROM empsalary;

Compare this to a similar GROUP BY statement:

SELECT depname, avg(salary)
FROM empsalary
GROUP BY depname;

The difference is that a GROUP BY returns as many rows as there are grouping elements (here, the number of distinct depname values), while an OVER clause adds the aggregated result to each row of the select.

Postgres documentation says, "A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row - the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result." This is true for all other DB implementations as well.

What are the real-world uses of OVER expressions?

We will use the Django ORM with window expressions to do some analysis on the most prolific committers to Django. To do this, we will export the committer names and commit timestamps to a CSV file.

git log  --no-merges --date=iso --pretty=format:'%h|%an|%aI' > commits.iso.csv

This is not a ranking of Django developers, just of their number of commits, which gives us an interesting dataset. I am grateful to everyone who has contributed to Django - they have made my life immeasurably better.

With some light data wrangling using Pandas, we transform this into per-author, per-year data and import it into Postgres (a sketch of that step follows the table below). Our table structure looks like this:

experiments=# \d commits_by_year;
   Table "public.commits_by_year"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 id            | bigint  |
 author        | text    |
 commit_year   | integer |
 commits_count | integer |
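
The wrangling step itself might look like this (a minimal sketch; the column names and the utc flag are assumptions, not the author's code):

import pandas as pd

# Read the pipe-separated git log export; column names are assumptions.
commits = pd.read_csv("commits.iso.csv", sep="|",
                      names=["sha", "author", "timestamp"])
commits["commit_year"] = pd.to_datetime(commits["timestamp"], utc=True).dt.year
by_year = (
    commits.groupby(["author", "commit_year"])
           .size()
           .reset_index(name="commits_count")
)
by_year.to_csv("commits_by_year.csv", index=False)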

We define a model to interact with this table.

from django.db import models


class Committer(models.Model):
    author = models.CharField(max_length=100)
    commit_year = models.PositiveIntegerField()
    commits_count = models.PositiveIntegerField()

    class Meta:
        db_table = 'commits_by_year'

Let's quickly test if our data is imported. You can get the CSV from here and import it into Postgres to follow along.

In [2]: Committer.objects.all().count()
Out[2]: 2318

Let us set up our environment and get the imports we need.

## Some ORM imports which we are going to need

from django.db.models import Avg, F, Window
from django.db.models.functions import  Rank, DenseRank, CumeDist
from django_commits.models import Committer

# We will use pandas to display the queryset in tabular format
import pandas
pandas.options.display.max_rows=20

# A utility function to display querysets
def as_table(values_queryset):
    return pandas.DataFrame(list(values_queryset))

Let's quickly look at the data we have.

as_table(Committer.objects.all().values(
  "author", "commit_year", "commits_count"
))
author commit_year commits_count
0 Tim Graham 2017 373
1 Sergey Fedoseev 2017 158
2 Mariusz Felisiak 2017 113
3 Claude Paroz 2017 102
4 Mads Jensen 2017 55
5 Simon Charette 2017 40
6 Jon Dufresne 2017 33
7 Anton Samarchyan 2017 27
8 François Freitag 2017 17
9 Srinivas Reddy Thatiparthy 2017 14
... ... ... ...
2308 Malcolm Tredinnick 2006 175
2309 Georg Bauer 2006 90
2310 Russell Keith-Magee 2006 86
2311 Jacob Kaplan-Moss 2006 83
2312 Luke Plant 2006 20
2313 Wilson Miner 2006 12
2314 Adrian Holovaty 2005 1015
2315 Jacob Kaplan-Moss 2005 130
2316 Georg Bauer 2005 112
2317 Wilson Miner 2005 20

2318 rows × 3 columns

We will now use the Window expression to get the contributors ranked by number of commits within each year. We will go over the code in detail, but let's look at the queryset and results first.

# Find out who have been the most prolific contributors
# in the years 2010-2017

dense_rank_by_year = Window(
    expression=DenseRank(),
    partition_by=F("commit_year"),
    order_by=F("commits_count").desc()
)

commiters_with_rank = Committer.objects.filter(
        commit_year__gte=2010, commits_count__gte=10
    ).annotate(
        the_rank=dense_rank_by_year
    ).order_by(
        "-commit_year", "the_rank"
    ).values(
        "author", "commit_year", "commits_count", "the_rank"
    )
as_table(commiters_with_rank)
author commit_year commits_count the_rank
0 Tim Graham 2017 373 1
1 Sergey Fedoseev 2017 158 2
2 Mariusz Felisiak 2017 113 3
3 Claude Paroz 2017 102 4
4 Mads Jensen 2017 55 5
5 Simon Charette 2017 40 6
6 Jon Dufresne 2017 33 7
7 Anton Samarchyan 2017 27 8
8 François Freitag 2017 17 9
9 Srinivas Reddy Thatiparthy 2017 14 10
... ... ... ... ...
171 Joseph Kocherhans 2010 53 11
172 Ramiro Morales 2010 53 11
173 Jacob Kaplan-Moss 2010 42 12
174 Chris Beaven 2010 29 13
175 Malcolm Tredinnick 2010 26 14
176 Honza Král 2010 20 15
177 Carl Meyer 2010 17 16
178 Ian Kelly 2010 17 16
179 Simon Meers 2010 11 17
180 Gary Wilson Jr 2010 10 18

181 rows × 4 columns

Let's look at the ORM code in more detail here.

# We are creating the Window function part of our SQL query here
dense_rank_by_year = Window(
    # We want to get the Rank with no gaps
    expression=DenseRank(),
    # We want to partition the queryset on commit_year
    # Each distinct commit_year is a different partition
    partition_by=F("commit_year"),
    # This decides the ordering within each partition
    order_by=F("commits_count").desc()
)


commiters_with_rank = Committer.objects.filter(
        commit_year__gte=2010, commits_count__gte=10
    # Standard filter operation: limit rows to 2010-2017
    ).annotate(
    # For each committer, we annotate the rank
        the_rank=dense_rank_by_year
    ).order_by(
        "-commit_year", "the_rank"
    ).values(
        "author", "commit_year", "commits_count", "the_rank"
    )
as_table(commiters_with_rank)
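
For reference, this queryset translates to SQL of roughly this shape (hand-written for illustration, not the exact statement Django generates):

SELECT author, commit_year, commits_count,
       DENSE_RANK() OVER (
           PARTITION BY commit_year
           ORDER BY commits_count DESC
       ) AS the_rank
FROM commits_by_year
WHERE commit_year >= 2010 AND commits_count >= 10
ORDER BY commit_year DESC, the_rank ASC;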

Now let's try getting the average commits per committer for each year, along with the other data.

avg_commits_per_year = Window(
    # We want the average number of commits per committer, within each partition
    expression=Avg("commits_count"),
    # Each individual year is a partition.
    partition_by=F("commit_year")
)

commiters_with_yearly_average = Committer.objects.filter().annotate(
      avg_commit_per_year=avg_commits_per_year
    ).values(
        "author", "commit_year", "commits_count", "avg_commit_per_year"
    )
# We could do further operations with avg_commit_per_year,
# e.g. F("commits_count") - F("avg_commit_per_year")
# would tell us which committers commit more than average
as_table(commiters_with_yearly_average)

This gives us

author avg_commit_per_year commit_year commits_count
0 Wilson Miner 319.250000 2005 20
1 Adrian Holovaty 319.250000 2005 1015
2 Jacob Kaplan-Moss 319.250000 2005 130
3 Georg Bauer 319.250000 2005 112
4 Russell Keith-Magee 188.571429 2006 86
5 Jacob Kaplan-Moss 188.571429 2006 83
6 Luke Plant 188.571429 2006 20
7 Wilson Miner 188.571429 2006 12
8 Adrian Holovaty 188.571429 2006 854
9 Malcolm Tredinnick 188.571429 2006 175
... ... ... ... ...
2308 Adam Johnson 4.916084 2017 13
2309 Tom 4.916084 2017 13
2310 Srinivas Reddy Thatiparthy 4.916084 2017 14
2311 François Freitag 4.916084 2017 17
2312 Anton Samarchyan 4.916084 2017 27
2313 Jon Dufresne 4.916084 2017 33
2314 Simon Charette 4.916084 2017 40
2315 Mads Jensen 4.916084 2017 55
2316 Claude Paroz 4.916084 2017 102
2317 Mariusz Felisiak 4.916084 2017 113

2318 rows × 4 columns

You could try other Window functions such as CumeDist, Rank or Ntile.

from django.db.models.functions import CumeDist
cumedist_by_year = Window(
    expression=CumeDist(),
    partition_by=F("commit_year"),
    order_by=F("commits_count").desc()
)

commiters_with_rank = Committer.objects.filter(
        commit_year__gte=2010, commits_count__gte=10
    ).annotate(
        cumedist_by_year=cumedist_by_year
    ).order_by(
        "-commit_year", "the_rank"
    ).values(
        "author", "commit_year", "commits_count", "cumedist_by_year"
    )
as_table(commiters_with_rank)

Until now, we have partitioned on commit_year. We can partition on other fields too. We will partition on author to find out how each committer's contributions have changed over the years, using the Lag window expression.

from django.db.models.functions import Lag
from django.db.models import Value
commits_in_previous_year = Window(
    expression=Lag("commits_count", default=Value(0)),
    partition_by=F("author"),
    order_by=F("commit_year").asc(),
)

committers_with_previous_year_commit = Committer.objects.filter(
        commit_year__gte=2010, commits_count__gte=10
    ).annotate(
        commits_in_previous_year=commits_in_previous_year
    ).order_by(
        "author", "-commit_year"
    ).values(
        "author", "commit_year", "commits_count", "commits_in_previous_year"
    )
as_table(committers_with_previous_year_commit)
author commit_year commits_count commits_in_previous_year
0 Adam Chainz 2016 42 12
1 Adam Chainz 2015 12 0
2 Adam Johnson 2017 13 0
3 Adrian Holovaty 2012 40 98
4 Adrian Holovaty 2011 98 72
5 Adrian Holovaty 2010 72 0
6 Akshesh 2016 31 0
7 Alasdair Nicol 2016 13 19
8 Alasdair Nicol 2015 19 17
9 Alasdair Nicol 2013 17 0
... ... ... ... ...
171 Timo Graham 2012 13 70
172 Timo Graham 2011 70 60
173 Timo Graham 2010 60 0
174 Tom 2017 13 0
175 Unai Zalakain 2013 17 0
176 Vajrasky Kok 2013 14 0
177 areski 2014 15 0
178 eltronix 2016 10 0
179 wrwrwr 2014 21 0
180 Łukasz Langa 2013 15 0

181 rows × 4 columns

I hope this tutorial has been helpful in understanding window expressions. While still not as flexible as SQLAlchemy, the Django ORM has become extremely powerful with recent Django releases. Stay tuned for more advanced ORM tutorials.

06 Dec 2017 4:18pm GMT

Container Runtimes Part 1: An Introduction to Container Runtimes

One of the terms you hear a lot when dealing with containers is "container runtime". "Container runtime" can have different meanings to different people so it's no wonder that it's such a confusing and vaguely understood term, even within the container community. This post is the first in a series that will be in four parts: 1. Part 1: Intro to Container Runtimes: why are they so confusing? 2. Part 2: Deep Dive into Low-Level Runtimes 3. Part 3: Deep Dive into High-Level Runtimes 4. Part 4: Kubernetes Runtimes and the CRI This post will explain what container runtimes are and w[...]

06 Dec 2017 3:30pm GMT

04 Dec 2017

feedDjango community aggregator: Community blog posts

Caktus is Excited about Django 2.0

Did you know Django 2.0 is coming soon? The development team at Caktus knows and we're excited! You should be excited too if you work with or depend on Django. Here's what our Cakti have been saying about the recently-released 2.0 beta.

What are Cakti Excited About?

Django first supported Python 3 with the release of version 1.5 back in February 2013. Adoption of Python 3 has only grown since then and we're ready for the milestone that 2.0 marks: dropping support for Python 2. Legacy projects that aren't ready to make the jump can still enjoy the long-term support of Django 1.11 on Python 2, of course.

With the removal of Python 2 support, a lot of Django's internals have been simplified and cleaned up, no longer needing to support both major variants of Python. We've put a lot of work into moving our own projects forward to Python 3 and it's great to see the wider Django community moving forward, too.

In more concrete changes, some Caktus devs are enthused by the transitions Django is making away from positional arguments, which can be error-prone. Among the changes are the removal of optional positional arguments from form fields, the removal of positional arguments from indexes entirely, and the addition of keyword-only arguments to custom template tags.

Of course, the new responsive and mobile-friendly admin is a much-anticipated feature! Django's admin interface has always been a great out-of-the-box way to give staff and client users quick access to the data behind the sites we build with it. It can be a quick way to provide simple behind-the-scenes interfaces to control a wide variety of site content. Now it extends that accessibility to use on the go.

What are Cakti Cautious About?

While we're excited about a Python 3-only Django, the first thing on our list of cautions about the new release is also dropping support for Python 2. We've been upgrading a backlog of our own Django apps to support Python 3 in preparation, but our projects depend on a wide range of third-party apps among which we know we'll find holdouts. That's going to mean finding alternatives, pushing pull requests, and even forking some things to get them forward for any project we want to move to Django 2.0.

Is There Anything Cakti Actually Dislike?

While there's a lot to be excited about, every big change has its costs and its risks. There are certainly upsets in the Django landscape we wish had gone differently, even if we would never consider them reasons to avoid the new release.

Requiring ForeignKey's on_delete parameter

Some of us dislike the new requirement that the on_delete option to ForeignKey fields be explicit. By default, Django has always used the CASCADE rule to handle what happens when an object is deleted while other objects still reference it, deleting the whole chain of objects together to avoid broken state. There have also been other on_delete options for other behaviors, like prohibiting such deletions or setting the references to None when the target is deleted. As of Django 2.0, on_delete no longer defaults to CASCADE and you must pick an option explicitly.
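
Concretely, every ForeignKey now needs the argument spelled out (the model names here are illustrative):

from django.db import models

class Book(models.Model):
    # Django 2.0 makes on_delete a required argument.
    author = models.ForeignKey('Author', on_delete=models.CASCADE)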

While there are some benefits to the change, one of the most unfortunate results is that updating to Django 2.0 means updating all of your models with an explicit on_delete choice…including the entire history of your migrations, even the ones that have already been run, which will no longer be compatible without the update.

Adding a Second URL Format

A new URL format is now available. It offers a much more readable and understandable syntax than the old regular-expression based URL patterns Django has used for years. This is largely a welcome change that will make Django more accessible to newcomers and projects easier to maintain.
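
For example, here is the same route in the old regex style and the new style (the view name is illustrative):

from django.urls import path, re_path
from . import views

urlpatterns = [
    # Old style: a regular expression with a named group
    re_path(r'^articles/(?P<year>[0-9]{4})/$', views.year_archive),
    # New style: a readable path converter
    path('articles/<int:year>/', views.year_archive),
]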

However, the new format is introduced in addition to the old-style regular-expression version of patterns. You can use the new style in new or existing projects, and you can choose to replace all your existing patterns with the cleaner style, but you'll still have to contend with third-party apps that won't make the change. If you have a sufficiently large project, there's a good chance you'll forgo migrating all your URL patterns.

Maybe this will improve with time, but for now, we'll have to deal with the cognitive cost of both formats in our projects.

In Conclusion

Caktus is definitely ready to continue moving our client and internal projects forward with major Django releases. We have been diligently migrating projects between LTS releases. Django 2.0 will be an important stepping stone to the next LTS after 1.11, but we won't wait until then to start learning and experimenting with these changes for projects both big and small.

Django has come a long way and Caktus is proud to continue to be a part of that.

04 Dec 2017 8:30pm GMT

Setup & Install Ionic

Ionic is a framework for build...

04 Dec 2017 7:06am GMT

02 Dec 2017

feedDjango community aggregator: Community blog posts

Configure Django to log exceptions in production

Django's default logging behaviour for unhandled exceptions is:

With DEBUG=True (Development): the traceback is printed to the console.

With DEBUG=False (Production): nothing is printed to the console; an email with the traceback is sent to the addresses in ADMINS instead.

Usually, not logging the exception to the console isn't a problem, since an exception email is sent to you, which helps you find the source of the exception. But this assumes that your email settings are configured correctly; otherwise you will not receive the exception email.

You might not have email settings configured correctly and may not want to get into that right away. You might instead want to log the exception to the console, even with DEBUG=False. This post will help you in that scenario.

Default logging configuration

Django's default logging setting is:

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse',
        },
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        },
    },
    'formatters': {
        'django.server': {
            '()': 'django.utils.log.ServerFormatter',
            'format': '[%(server_time)s] %(message)s',
        }
    },
    'handlers': {
        'console': {
            'level': 'INFO',
            'filters': ['require_debug_true'],
            'class': 'logging.StreamHandler',
        },
        'django.server': {
            'level': 'INFO',
            'class': 'logging.StreamHandler',
            'formatter': 'django.server',
        },
        'mail_admins': {
            'level': 'ERROR',
            'filters': ['require_debug_false'],
            'class': 'django.utils.log.AdminEmailHandler'
        }
    },
    'loggers': {
        'django': {
            'handlers': ['console', 'mail_admins'],
            'level': 'INFO',
        },
        'django.server': {
            'handlers': ['django.server'],
            'level': 'INFO',
            'propagate': False,
        },
    }
}

Without any explicit settings.LOGGING configured in settings.py, this is the default logging configuration Django works with. You can ignore the django.server part.

Any unhandled Django exception is handled in the function handle_uncaught_exception. The relevant code is on GitHub.

The error is logged using logger.error in this function. The logger used is django.request. Since the django logger is a parent of django.request, the log records are propagated to the django logger.
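
To make the propagation concrete, here is a tiny standalone illustration (not from the original post):

import logging

# A handler attached to the parent logger 'django'...
logging.getLogger('django').addHandler(logging.StreamHandler())

# ...also receives records logged on the child 'django.request',
# because records propagate up the dotted-name hierarchy.
logging.getLogger('django.request').error("unhandled exception!")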

As you can see from DEFAULT_LOGGING, the django logger has two handlers: console and mail_admins.

As you can see from DEFAULT_LOGGING, the console handler has a filter called require_debug_true, because of which it doesn't handle log records in production, i.e. when DEBUG=False.

Logging to console in production

So you can create a new handler which directs ERROR log records to the console stream when DEBUG=False.

This handler would look like

'console_debug_false': {
    'level': 'ERROR',
    'filters': ['require_debug_false'],
    'class': 'logging.StreamHandler',
}

And you can ask the django logger to use this handler by adding it to loggers['django']['handlers'].

Your final settings.LOGGING would look like the following:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse',
        },
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        },
    },
    'formatters': {
        'django.server': {
            '()': 'django.utils.log.ServerFormatter',
            'format': '[%(server_time)s] %(message)s',
        }
    },
    'handlers': {
        'console': {
            'level': 'INFO',
            'filters': ['require_debug_true'],
            'class': 'logging.StreamHandler',
        },
        'console_debug_false': {
            'level': 'ERROR',
            'filters': ['require_debug_false'],
            'class': 'logging.StreamHandler',
        },
        'django.server': {
            'level': 'INFO',
            'class': 'logging.StreamHandler',
            'formatter': 'django.server',
        },
        'mail_admins': {
            'level': 'ERROR',
            'filters': ['require_debug_false'],
            'class': 'django.utils.log.AdminEmailHandler'
        }
    },
    'loggers': {
        'django': {
            'handlers': ['console', 'console_debug_false', 'mail_admins'],
            'level': 'INFO',
        },
        'django.server': {
            'handlers': ['django.server'],
            'level': 'INFO',
            'propagate': False,
        }
    }
}

If you don't want emails to be sent to admins, for example because the email settings aren't configured correctly, then you should remove mail_admins from loggers['django']['handlers'].
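
To check the new handler quickly, you can log an error yourself with DEBUG=False and confirm that it shows up on the console (a minimal sketch):

# e.g. in `python manage.py shell` with DEBUG=False
import logging

logging.getLogger('django').error("This should now appear on the console")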

02 Dec 2017 7:09am GMT

Building a CRUD Application with Django Rest Framework and Vue.js

In this tutorial, you will learn how to use Django and Vue.js to build a modern CRUD (Create, Read, Update, Delete) web application; these four operations are essential to the majority of web applications. You'll also learn how to integrate Django Rest Framework with Vue.js and how to make HTTP calls using vue-resource (you can also use Axios or the browser's fetch API).

Django + Vue.js

In a nutshell, Django is a back-end framework for building web applications with Python. Vue.js is a user interface library for creating JavaScript applications on the front-end. Django Rest Framework is a Django module for creating REST-based APIs that can then be consumed from browsers or mobile devices.

You can use any database management system, such as MySQL or PostgreSQL, since the Django ORM abstracts away the differences between database systems and lets you work with any of them without writing new code.

The integration part simply consists of using an HTTP client like Axios, vue-resource or even better the browser's fetch API to call APIs exposed by DRF from the Vue.js application. Both server and client applications are decoupled so you can swap any part with any other library in the future. You can also create mobile apps to consume your API without creating a new server or changing anything in the server side.

Introduction to Vue.js

Vue.js is a JavaScript library designed for building SPAs or single page web applications. It's a progressive JavaScript library that's primarily used for building user interfaces (like React).

Vue.js works at the view layer of the MVC (Model-View-Controller) architecture, so it has no knowledge about any back-end technology and therefore can be integrated easily with any server-side framework.

Vue.js has many modern features for creating view layers, such as reactive data binding, a component system, directives, and a virtual DOM.

You can find more information about Vue.js by visiting its official website

How to use Vue.js?

You can integrate Vue.js into your project in different ways: by including it with a <script> tag, by installing the vue package via npm, or by scaffolding a project with the Vue CLI.

For the sake of simplicity, we are going to use the <script> tag to include Vue.js in our Django project.

Introduction to Django and Django Rest Framework

Django is a Python-based web framework, designed for developers with deadlines. Django uses a variation of the Model View Controller or the MVC architectural design pattern called MTV, an abbreviation for Model, Template, View.

Getting started with Django is quite easy, first make sure you have python and pip (python package manager) installed. It's also preferred to use virtualenv to manage and isolate your development environments and also avoid conflicts between different versions of the same package.

Head over to your terminal on Mac and Linux or command prompt on Windows then run the following commands:

virtualenv env
source env/bin/activate
pip install django
django-admin startproject django_vuejs_demo
cd django_vuejs_demo
python manage.py startapp demoapp


This will create a new virtual environment, activate it, install the Django framework, generate a new Django project, then create a new app. Django apps are a way to organize a Django project into decoupled and reusable modules.

Next, open settings.py in the project's root folder, then add your newly created app to the INSTALLED_APPS array.

INSTALLED_APPS = [
    #...
    'demoapp'
]

One last step to properly set up the Django project: you need to migrate your database.


python manage.py migrate

This will create a sqlite database file in your project's root folder and create Django tables. If you want to use another database system such as PostgreSQL, make sure to update your settings.

Django Rest Framework

The official DRF website defines DRF as:

Django REST framework is a powerful and flexible toolkit for building Web APIs.

Some reasons you might want to use REST framework:

  * The Web browsable API is a huge usability win for your developers.
  * Authentication policies including packages for OAuth1a and OAuth2.
  * Serialization that supports both ORM and non-ORM data sources.
  * Customizable all the way down - just use regular function-based views if you don't need the more powerful features.
  * Extensive documentation, and great community support.
  * Used and trusted by internationally recognized companies including Mozilla, Red Hat, Heroku, and Eventbrite.

You also need to install DRF and add it to your project's settings file:


pip install djangorestframework
pip install django-filter  # Filtering support

Then add rest_framework to the INSTALLED_APPS array in settings.py.

INSTALLED_APPS = [
    #...
    'rest_framework',
    'rest_framework.authtoken',
    'django_filters',
    'demoapp'
]

Create First Django View

We don't need to create a separate view function or class-based view in demoapp/views.py, but you can do so if you want. Since the view is only used to render the template, we are going to use the TemplateView.as_view() method and map the view directly to ^$ in urls.py.

from django.conf.urls import url, include
from django.contrib import admin
from django.views.generic import TemplateView

urlpatterns = [
    url(r'^$', view=TemplateView.as_view(template_name='demoapp/home.html')),
    url(r'^admin/', admin.site.urls)
]

We open urls.py, import TemplateView, then create a URL mapping with template_name set to demoapp/home.html.

You can also create a view function in demoapp/views.py:

from django.shortcuts import render
def home(request):
    return render(request, 'demoapp/home.html')

then map it with:

from django.conf.urls import url
from django.contrib import admin
from demoapp.views import home

urlpatterns = [
    url(r'^$', home, name='home'),
    url(r'^admin/', admin.site.urls)
]

Open settings.py, then in the TEMPLATES array add the following setting if it's not present:

 'DIRS': [os.path.join(BASE_DIR, "templates"),]

This tells Django where to look for templates.

Next, you'll need to create a template file (home.html) in demoapp/templates/demoapp:

<!DOCTYPE html>

<html>
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Django Vue.js Demo</title>
</head>

<body>
   Hello Vue.js
</body>

</html>

You can start your development server to check if everything works as expected:

python manage.py runserver

Next open your browser and navigate to http://localhost:8000/.

How to Integrate Vue.js with Django Rest Framework?

You have two options to integrate Vue.js with Django Rest Framework: keep the front-end completely separate from Django, or serve the Vue.js application from a Django template.

In both cases, we use DRF to build an API server, then we consume the API from the Vue.js application using HTTP clients such as Axios, vue-resource or the browser's fetch API.

Adding Vue.js

Let's use a Jinja2 template to serve a Vue.js application.

Open home.html then follow these steps:

Include Vue.js in the <head>:

<script src="https://unpkg.com/vue"></script> 

The interpolation delimiters {{ and }} are the same for the Django/Jinja2 templating system and Vue.js. Luckily for us, Vue.js provides a way to change them, so we'll be using ${ and }$.

Add this to <body>:


<div id="app">
    <p> ${ message }$ </p>
</div>

<script>
new Vue({
  delimiters: ['${', '}$'],
  el: '#app',
  data: {
    message: 'Hello from Vue.js'
  }
})
</script>


That's it, we have integrated Vue.js with Django. If you run your Django server and navigate to your app, you should see the Hello from Vue.js message.

Now we need to create an API then consume data from our simple Vue.js application mounted from the home.html template.

Setting up Vue-resource

First make sure to download vue-resource.min.js and add it to your app's static folder (demoapp/static/demoapp).

You can also use it from a CDN

<script src="https://cdn.jsdelivr.net/npm/vue-resource@1.3.4"></script>

The vue-resource plugin for Vue.js provides an HTTP service that allows you to make HTTP requests (also called API calls or AJAX requests) and process HTTP responses, using the browser's XMLHttpRequest interface or JSONP.

You can find more information in the official GitHub repository for vue-resource.

Before you can use the HTTP service, you need to set it up:

new Vue({

  http: {
    root: 'http://localhost:8000',
    headers: {
      Authorization: 'Basic YXBpOnBhc3N3b3Jk'
    }
  }

})

All request URLs will then be resolved relative to root.

Please note that vue-resource was retired but it's still used and developed as a separate project. If you want other alternatives you can use Axios HTTP client.

Building an API with DRF

We are not going to reinvent the wheel here; instead we'll use a simple API that we have previously built in this tutorial. It's a simple API for managing product inventories, with four models: Product, Family, Location and Transaction, and four API endpoints: /products, /families, /locations and /transactions. So make sure to follow that tutorial to build this API if this is your first time working with Django Rest Framework.

Sending API Calls from Vue.js with vue-resource

Sending an API call to the Django Rest API server is easy. You simply use the injected $http service with an HTTP method: GET, POST, PUT or DELETE (the CRUD operations).

Add a method to your Vue.js application that fetches the list of products:

   new Vue({
        delimiters: ['${', '}$'],
        el: '#app',
        data: {
            products: []

        },
        http: {
            root: 'http://localhost:8000',
            headers: {
              Authorization: '<TOKEN_HERE>'
            }
        },
        methods: {
            getProducts: function () {
                this.$http.get('products/').then(function (response) {
                    // vue-resource passes a single response object
                    if (response.status == 200) {
                        this.products = response.body.results;
                    }
                })
            }
        },
        mounted: function () {
            this.getProducts();
        }
    })

So we added a products variable in the Vue.js data object to hold our fetched products, then we declared a getProducts() method (in the Vue.js methods object). In this method we use this.$http.get() to send a GET request to our DRF server, then we assign the result to the this.products array.

Next, we call the getProducts() method when the Vue.js application gets mounted.

Now let's see how we can display these products in the HTML template.

 <div id="app">
    <ul>
        <li v-for="product in products">
            <h1>${ product.title }$</h1>
            <p> ${ product.description }$ </p>
        </li>
    </ul>
</div>


So we use v-for to iterate through the products array, then we use the custom interpolation delimiters to show each product's title and description.
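
The other CRUD operations follow the same pattern; for example, a create operation is just a POST with the new product's payload (a minimal sketch; the method name and fields are assumptions):

// Add to the Vue instance's methods object:
addProduct: function () {
    this.$http.post('products/', {title: 'New product', description: '...'})
        .then(function (response) {
            if (response.status == 201) {  // DRF returns 201 Created
                this.products.push(response.body);
            }
        });
}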

Conclusion

In this tutorial we have seen how to get started using Django Rest Framework with Vue.js to build CRUD applications that consume an API from a Vue.js front-end, using the vue-resource plugin to send HTTP requests.

02 Dec 2017 6:00am GMT

28 Nov 2017

feedDjango community aggregator: Community blog posts

Django Logging, The Right Way

Good logging is critical to debugging and troubleshooting problems. Not only is it helpful in local development, but in production it's indispensable. When reviewing logs for an issue, it's rare to hear somebody say, "We have too much logging in our app", but common to hear the converse. So, with that in mind, let's get started.

A Crash Course in Python Loggers

At the top of every file, you should have something like this:

import logging

logger = logging.getLogger(__name__)

__name__ will evaluate to the dotted Python path of the module, e.g. myproject/myapp/views.py will use myproject.myapp.views. Now we can use that logger throughout the file like so:

# A simple string logged at the "warning" level
logger.warning("Your log message is here")

# A string with a variable at the "info" level
logger.info("The value of var is %s", var)

# Logging the traceback for a caught exception
try:
    function_that_might_raise_index_error()
except IndexError:
    # equivalent to logger.error(msg, exc_info=True)
    logger.exception("Something bad happened")

Note: Python's loggers will handle inserting variables into your log message if you pass them as arguments in the logging function. If the log does not need to be output, the variable substitution won't ever occur, helping avoid a small performance hit for a log that will never be used.

Pepper your code liberally with these logging statements. Here's the rule of thumb I use for log levels: debug for detailed information that is only useful while developing, info for routine events worth recording, warning for unexpected situations the code can recover from, and error for failures that need attention.

Where to Log

Your app should not be concerned with where its logs go. Instead, it should log everything to the console (stdout/stderr) and let the server decide what to do with it from there. Typically this is put in a dedicated (and logrotated) file, captured by the Systemd journal or Docker, sent to a separate service such as ElasticSearch, or some combination of those. Log storage is a deployment concern, not an application concern.

The only thing your app does need to concern itself with is the format of the logs. Typically this is just a string with the relevant data, but if your server already adds a timestamp to the log, you probably want to exclude it from your own formatter. Likewise, if your log aggregator can accept JSON, a formatter like python-json-logger may be more appropriate.

Configuring the Logger

Writing to loggers in Python is easy. Configuring them to go to the right place is more challenging than you'd expect. I'll start by bypassing Django's default logging as described in my previous post. That will provide us with a blank slate to work with.

Setting up Sentry

Error reporting services are critical for any non-trivial site. By default these catch uncaught exceptions, notify you of the problem (only once per incident), and provide a nice interface to see the state of the app when the exception occurred. My favorite service for this is Sentry.

We can take Sentry one step further by sending any log messages that are warning or higher to the service as well. These would otherwise be lost in a sea of log files that in practice rarely get reviewed. To do this, we'll add a "root" logger which serves as a catch-all for any logs that are sent from any Python module. That looks something like this in the Django settings:

import logging.config
LOGGING_CONFIG = None
logging.config.dictConfig({
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'console': {
            # exact format is not important, this is the minimum information
            'format': '%(asctime)s %(name)-12s %(levelname)-8s %(message)s',
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'console',
        },
        # Add Handler for Sentry for `warning` and above
        'sentry': {
            'level': 'WARNING',
            'class': 'raven.contrib.django.raven_compat.handlers.SentryHandler',
        },
    },
    'loggers': {
    # root logger
        '': {
            'level': 'WARNING',
            'handlers': ['console', 'sentry'],
        },
    },
})

Logging From Your Application

While you may only want to know about warnings and errors from your third-party dependencies, you typically want much deeper insight into your application code. Ideally, your code all lives under a single namespace so it can be captured with a single addition to the loggers. Let's assume your project uses the namespace myproject, building on the code from above you would add this:

logging.config.dictConfig({
    # ...
    'loggers': {
        '': {
            'level': 'WARNING',
            'handlers': ['console', 'sentry'],
        },
        'myproject': {
            'level': 'INFO',
            'handlers': ['console', 'sentry'],
            # required to avoid double logging with root logger
            'propagate': False,
        },
    },
})

But what if you want to investigate something in your application deeper with debug logging? Having to commit new code and deploy it feels like overkill. This is a great use-case for environment variables. We can modify the previous stanza to look like this:

import os
LOGLEVEL = os.environ.get('LOGLEVEL', 'info').upper()
logging.config.dictConfig({
    # ...
    'loggers': {
        '': {
            'level': 'WARNING',
            'handlers': ['console', 'sentry'],
        },
        'myproject': {
            'level': LOGLEVEL,
            'handlers': ['console', 'sentry'],
            # required to avoid double logging with root logger
            'propagate': False,
        },
    },
})

Now our application logging will default to info, but can easily be increased temporarily by setting the environment variable LOGLEVEL=debug. Alternatively, if log storage is not an issue, consider always logging at the debug level. They are easy enough to filter out with a simple grep or via your log visualization tool, e.g. Kibana.
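
For example, to raise the log level temporarily when starting the development server (the exact command depends on how you run your app):

$ LOGLEVEL=debug python manage.py runserver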

Cutting out the Noise

Once you have your logging set up and running, you may find some modules log information that you don't really care about and that only serves to create extra noise (I'm looking at you, newrelic). For these modules, we can add another logger to tune them out. The first option is to log them to the console but avoid propagating them to the root logger, which would send them to Sentry:

logging.config.dictConfig({
    # ...
    'loggers': {
        '': {
            'level': 'WARNING',
            'handlers': ['console', 'sentry'],
        },
        'myproject': {
            'level': LOGLEVEL,
            'handlers': ['console', 'sentry'],
            # required to avoid double logging with root logger
            'propagate': False,
        },
        # Don't send this module's logs to Sentry
        'noisy_module': {
            'level':'ERROR',
            'handlers': ['console'],
            'propagate': False,
        },
    },
})

If they are too noisy even for the console, we can drop them altogether:

logging.config.dictConfig({
    # ...
    'loggers': {
        # ...
        # Don't log this module at all
        'noisy_module': {
            'level': 'NOTSET',
            'propagate': False,
        },
    },
})

Local Request Logging

One nicety in Django's default config is request logging with runserver. By overriding Django's config, we lose it, but it is easy enough to add back in:

from django.utils.log import DEFAULT_LOGGING
logging.config.dictConfig({
    # ...
    'formatters': {
        # ...
        'django.server': DEFAULT_LOGGING['formatters']['django.server'],
    },
    'handlers': {
        # ...
        'django.server': DEFAULT_LOGGING['handlers']['django.server'],
    },
    'loggers': {
        # ...
        'django.server': DEFAULT_LOGGING['loggers']['django.server'],
    },
})

This technique is a little brittle, since it depends on some internals that could change between releases, but it should be easier to detect breakages than doing a direct copy/paste from Django's source.

Go Forth and Log!

For a complete example of Django logging from this post, see this gist. If you have any other tips or tricks to share, we'd love to hear them!

28 Nov 2017 11:12pm GMT