15 Jul 2019

Django community aggregator: Community blog posts

Django's Test Case Classes and a Three Times Speed-Up

Rearing Pony

This is a story about how I sped up a client's Django test suite to be three times faster, through swapping the test case class in use.

Speeding up test runs is rarely a bad thing. Even small teams can repeat their test run hundreds of times per week, so time saved there is time won back. This keeps developers fast, productive, and happy.

Django's Test Case Classes

A quick refresher on how Django's three basic test case classes affect the database: SimpleTestCase doesn't allow database queries at all, TransactionTestCase resets the database after each test by truncating all tables, and TestCase wraps each test in a transaction that's rolled back at the end.

The distinction between TransactionTestCase and TestCase can be confusing. Here's my attempt to summarize it in one sentence:

TransactionTestCase allows your code to use transactions, whilst TestCase uses transactions itself.

The Speed-Up Story

Recently I was helping my client ev.energy improve their Django project. A full test run took about six minutes on my laptop when using the test command's --parallel option. This isn't particularly long - I've worked on projects where it took up to 30 minutes! But it did give me a little time during runs to look for easy speed-ups.

Their project uses a custom test case class for all their tests, to add extra helper methods. It originally extended TransactionTestCase, with its slower but more complete database reset procedure. I wondered why this had been done.

I searched the Git history for the first use of TransactionTestCase with git log -S TransactionTestCase (a very useful Git option!). I found a developer had first used it in tests for their custom background task class called Task.

Task closed the database connection at the end of its process with connection.close(). This helped isolate the tasks. Since they're run in a long-running background process, using a fresh database connection for each task helped prevent a failure in one from affecting the others.
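
A minimal sketch of that pattern (the real Task class is more involved, and the process() method and other names here are assumptions):

from django.db import connection


class Task:
    def process(self):
        """The task's actual work goes here (placeholder)."""

    def run(self):
        try:
            self.process()
        finally:
            # Give the long-running worker a fresh database connection
            # for the next task.
            connection.close()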

Unfortunately the call to connection.close() prevented use of TestCase when testing Task classes. Closing the database connection also ends any transactions. So when TestCase ran its teardown process, it errored when trying to roll back the transactions it started in its setup process.

Because of this, the developers used TransactionTestCase for their custom test case class. And they stuck with it as the project grew.

This was all fair, and the speed difference would not have been noticeable when there were fewer tests. Fixing it then allowed them to focus on feature development.

But with things like test time, the seconds added up over time, much like the metaphorical frog in a slowly boiling pot of water.

Once I'd discovered this piece of history, I guessed most of the tests that didn't run Task classes would work with TestCase. I swapped the base of the custom test class to TestCase, reran, and only the Task tests failed!

After changing those broken test classes to remain on TransactionTestCase, I reran the suite and everything passed. The run time went down from 375 seconds to 120 seconds. A three times speed-up!
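
In code, the change was roughly the following (class names here are illustrative, not the client's actual ones):

from django.test import TestCase, TransactionTestCase


class BaseTestCase(TestCase):
    """Project-wide helper methods; previously this extended TransactionTestCase."""


class TaskTestCase(TransactionTestCase):
    """Only the Task tests keep the slower, full-flush database reset."""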

Fin

I hope this post helps you find the right test case class in your Django project. If you want help with this, email me - I'm happy to answer any questions, and am available for contracts. See my front page for details.

Thanks for reading,

-Adam

15 Jul 2019 4:00am GMT

11 Jul 2019

Django community aggregator: Community blog posts

SongSearch autocomplete rate now 2+ per second

By analyzing my Nginx logs, I've concluded that SongSearch's autocomplete JSON API now gets about 2.2 requests per second. I.e. these are XHR requests to /api/search/autocomplete?q=....

Roughly, 1.8 requests per second go back to the Django/Elasticsearch backend. That's a hit ratio of 16%. These Django/Elasticsearch requests take roughly 200ms on average. I suspect about 150-180ms of that time is spent querying Elasticsearch, the rest being Python request/response and JSON "paperwork".

Autocomplete counts in Datadog

Caching strategy

Caching is hard because the queries are so vastly different over time. Had I put a Redis cache decorator on the autocomplete Django view function I'd quickly bloat Redis memory and cause lots of evictions.

What I used to do was something like this:

from django import http
from django.core.cache import cache


def search_autocomplete(request):
    q = request.GET.get('q')

    cache_key = None
    if len(q) < 10:
        cache_key = 'autocomplete:' + q
        results = cache.get(cache_key)
        if results is not None:
            return http.JsonResponse(results)

    results = _do_elasticsearch_query(q)
    if cache_key:
        cache.set(cache_key, results, 60 * 60)

    return http.JsonResponse(results)

However, after some simple benchmarking it was clear that using Nginx's uwsgi_cache to let the cacheable queries terminate already at Nginx was much faster. So I changed the code to something like this:

from django import http
from django.utils.cache import patch_cache_control


def search_autocomplete(request):
    q = request.GET.get('q')
    results = _do_elasticsearch_query(q)
    response = http.JsonResponse(results)

    if len(q) < 10:
        patch_cache_control(response, public=True, max_age=60 * 60)

    return response

The only annoying thing about Nginx caching is that purging is hard unless you go for that Nginx Plus (or whatever their enterprise version is called). But more annoying, to me, is the fact that I can't really see what this means for my server. When I was caching with Redis I could just use redis-cli and...

> INFO
...
# Memory
used_memory:123904288
used_memory_human:118.16M
...

Nginx Amplify

My current best tool for keeping an eye on Nginx is Nginx Amplify. It gives me some basic insights about the state of things. Here are some recent screenshots:

NGINX Requests/s

NGINX Memory Usage

NGINX CPU Usage %

Thoughts and conclusion

Caching is hard. But it's also fun because it ties directly into performance work.

In my business logic, I chose that autocomplete queries that are between 1 and 9 characters are cacheable. And I picked a TTL of 60 minutes. At this point, I'm not sure exactly why I chose that logic but I remember doing some back-of-envelope calculations about what the hit ratio would be and roughly what that would mean in bytes in RAM. I definitely remember picking 60 minutes because I was nervous about bloating Nginx's memory usage. But as of today, I'm switching that up to 24 hours and let's see what that does to my current 16% Nginx cache hit ratio. At the moment, /var/cache/nginx-cache/ is only 34MB which isn't much.
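
The change itself is tiny. A sketch of the idea, pulled out into a helper for illustration (the helper name is made up):

from django.utils.cache import patch_cache_control

ONE_DAY = 60 * 60 * 24  # previously 60 * 60, i.e. one hour


def maybe_cache(response, q):
    # Only short queries (1-9 characters) are considered cacheable.
    if len(q) < 10:
        patch_cache_control(response, public=True, max_age=ONE_DAY)
    return response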

Ideal would be some user-friendly diagnostics tool that I can point somewhere, towards Nginx, that says how much my uwsgi_cache is hurting or saving me. Autocomplete is just one of many things going on on this single DigitalOcean server. There's also a big PostgreSQL server, a node-express cluster, a bunch of uwsgi workers, Redis, lots of cron job scripts, and of course a big honking Elasticsearch 6.

11 Jul 2019 10:21pm GMT

Single-file Python/Django Deployments

This post covers portions of my talk, Containerless Django, from DjangoCon US 2018.

Deploying Python has improved significantly since I started working with it over a decade ago. We have virtualenv, pip, wheels, package hash verification, and lock files. Despite all the improvements, it still feels harder than it needs to be. Installing a typical large project has many steps, each one easy to trip up on:

  1. Install Python
  2. Install build tools (pip/virtualenv, pipenv, poetry, etc.)
  3. Install build dependencies (C compiler, development libraries, etc.)
  4. Download the code
  5. Run the build tools
  6. If you're using Node to build client-side files for a website, repeat steps 1-5 for that

It's normal for ops teams to repeat this process as part of the automated testing and then again on every server the project is deployed to. It's no wonder Docker has become so popular, given the ease with which you can build once and deploy everywhere.

But Docker is a heavy-handed solution and doesn't fit every project. I envy the simplicity of languages like Go, where you can compile your project down to a single binary that runs without any external dependencies. Even Java's JAR file format, which requires Java to be preinstalled but otherwise only requires downloading a single file, would be a huge improvement.

A JAR File for Python

Turns out there are already a few projects solving this problem: PEX from Twitter, XAR from Facebook, and more ambitious projects like PyOxidizer. But shiv from LinkedIn hits the sweet spot for us. It is simple, stable, does not require special tooling, and incurs no runtime performance hit. It creates a ZIP file of all your code and dependencies that is executable with only a Python interpreter. In a future post, we'll do a deep-dive into how shiv works under-the-hood, but for brevity's sake, we'll treat it as a black box here.

Using Shiv with Django

Shiv works with any proper Python package. Since most Django developers don't think about packaging their projects, we'll give you a crash course here.

Packaging Your Project

Previously, the only viable packaging tool was setuptools, but since PEP-517 we now have a number of other options, including flit and poetry. At the moment, setuptools is still the de-facto standard, so, despite it being a little cruftier than the other options, we'll use it in our example.

You can use our post, Using setup.py in Your (Django) Project, as a starting point, but we need to take a couple more steps to ensure non-Python files (static files, templates, etc.) are included.

The easiest way to do this is with a MANIFEST.in file. It might look something like this:

graft your_project/collected_static
graft your_project/templates

Note that these directories need to live inside the top-level Python module to be included. Also note that the static files directory should be your STATIC_ROOT not your STATICFILES_DIRS. If you define templates inside your individual apps, you'll need to include those directories as well.
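
For reference, a minimal setup.py along those lines might look like this (the project name, version, and requirement are placeholders); include_package_data=True is what tells setuptools to pick up the files grafted in MANIFEST.in:

from setuptools import find_packages, setup

setup(
    name="your_project",
    version="1.0.0",
    packages=find_packages(),
    # Include the non-Python files grafted in MANIFEST.in
    # (collected_static, templates, ...).
    include_package_data=True,
    install_requires=[
        # Illustrative only; exact pins belong in your lock file.
        "Django>=2.2",
    ],
)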

Dealing with Dependencies

These days, every project should include some sort of a lock file which is machine generated and defines the exact version of every dependency and the hash verifications for them. You can do this via poetry, pip-compile from pip-tools, or pipenv.

I typically let one of these tools handle the install and then pass its site-packages directory to shiv via the --site-packages option. In that case, you'll also pass the pip arguments --no-deps . so your local project is installed without pulling in any of its defined dependencies (they're already in the site-packages directory).

Including an Entry Point

We need to provide shiv with a Python function that it will run when the zipapp is executed. The most logical one is the equivalent of manage.py. You can use django.core.management.execute_from_command_line directly, but I recommend writing a small wrapper which also sets the default DJANGO_SETTINGS_MODULE environment variable. You could create a __main__.py in your project and include the following:

import os
from django.core.management import execute_from_command_line

def main():
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "your_project.settings")
    execute_from_command_line()

if __name__ == "__main__":
    main()

Putting it in __main__.py is a Python convention that also allows you to execute management commands via python -m your_project ....

The Shebang

While not necessary, you can customize the shebang of your zipapp to have it executed with a specific version of Python or a Python from a specific location. I typically use /usr/bin/env python3.7 (or whatever version the project expects).

Putting this all together, you might have something like this in your CI script:

pipenv install --deploy
pipenv run python manage.py collectstatic --noinput
shiv --output-file=your_project.pyz \
     --site-packages=$(pipenv --venv)/lib/python3.7/site-packages \
     --python="/usr/bin/env python3.7" \
     --entry-point=your_project.__main__.main \
     --no-deps .

Production Webserver

Astute readers may have noticed there's not an easy way to run uwsgi or gunicorn when we want to deploy. Typically you execute your webserver, then point it at your project instead of the other way around. We created django-webserver so you'll have access to your favorite WSGI server as a management command. We also sponsored work on uWSGI to package it as a wheel making it quick and easy to use in this setup. 🎺

You'll also want to be sure that your zipapp can serve its own static files, either via whitenoise or by letting uwsgi handle your staticfiles (included by default in django-webserver).
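
If you go the WhiteNoise route, the relevant settings are roughly these (a sketch of WhiteNoise's documented setup, nothing shiv-specific):

# settings.py (excerpt)
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise should sit directly after SecurityMiddleware.
    "whitenoise.middleware.WhiteNoiseMiddleware",
    # ... the rest of your middleware ...
]

# Serve compressed, cache-busted files collected into STATIC_ROOT.
STATICFILES_STORAGE = "whitenoise.storage.CompressedManifestStaticFilesStorage"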

Settings

People do all sorts of strange things with Django settings. You can still use the DJANGO_SETTINGS_MODULE environment variable to pick the settings to use at runtime. You can also use environment variables to set different values in your settings file, but that can be tedious and, potentially, a security problem.

Instead, I prefer to use a file that is easily machine readable (JSON or YAML) and also easily generated from configuration management or a secret manager like chamber or Hashicorp Vault.
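
A bare-bones version of that idea, with no extra library at all, might look like this in a settings module (the file path and keys are invented for illustration):

import json
import os

# Machine-generated config, e.g. rendered by configuration management
# or a secret manager at deploy time.
CONFIG_FILE = os.environ.get("APP_CONFIG", "/etc/your_project/config.json")

_config = {}
if os.path.exists(CONFIG_FILE):
    with open(CONFIG_FILE) as f:
        _config = json.load(f)

DEBUG = _config.get("DEBUG", False)
SECRET_KEY = _config.get("SECRET_KEY", "dev-only-insecure-key")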

We built another package, goodconf, which lets you use a static configuration file (or environment variables) to adjust settings across environments. This lets us treat our zipapp more like a standard application and less like a special-case Django app. The people who handle your deployments should appreciate this.

Deployment

Once you have your zipapp, deployment is almost trivial:

  1. Install Python
  2. Create the configuration file
  3. Download the zipapp (we store ours in S3)
  4. Start server - ./myproject.pyz gunicorn or ./myproject.pyz pyuwsgi

Caveats

Zipapps created with shiv aren't a perfect solution. You should be aware of a few things before you start using them.

Extraction

On first execution, shiv will cache the zip file contents into a unique directory in ~/.shiv (path is configurable at runtime). This creates a small delay on first run. It also means you may need to periodically clean out the directory if you're doing lots of deploys.

System Dependencies

If you are using libraries which depend on system libraries, they will also need to be installed on the deployment target. For example, mysqlclient will require the MySQL library. Fortunately, the proliferation of the wheel format allows authors to bundle these libraries with their packages as is the case with Pillow, psycopg2-binary, lxml, and many others.

Flexibility

Your zipapp will define a single entry point. While it is possible to override it at runtime or even drop into a Python interpreter, I would save those options for debugging only. If you are used to running arbitrary scripts from your project or loading arbitrary files from your git repository, you'll need to use some more discipline to make management commands for the things you want to run once deployed.

Portability

Pure Python projects should be very portable across different operating systems and potentially different Python versions. As soon as you start compiling dependencies however, your best bet is to build on the same OS and Python as you intend to run the project.

Isolation

Your project will run with the system's site packages in sys.path, effectively the same as creating a virtualenv with --system-site-packages. For better isolation, use the -S flag for Python, e.g. python3.7 -S ./your_project.pyz. See this GitHub Issue for more details.

We've been successfully running this site and many of our client sites using shiv-generated zipapps for a few months now. We're very happy with the simplicity and speed it lends to rolling out new software. If you're using shiv or a similar technique to bundle your application, let us know in the comments below.

11 Jul 2019 3:29am GMT

10 Jul 2019

Django community aggregator: Community blog posts

WWDC 2019: How to Build the Best Product Strategy for iOS 13, watchOS 6, and iPadOS

While Apple's annual Worldwide Developers Conference (WWDC) is always exciting and informative, the WWDC 2019 conference was particularly jam-packed with new updates, capabilities, and ideas worth knowing about. Of course, most product owners don't have the time to sift through four days' worth of video presentations, or the hundreds of reaction articles that come out in the weeks that follow. That's exactly why Distillery has pulled together summaries of many of the most important updates from WWDC 2019. We've watched it and read it so you don't have to. With that in mind, what do product owners really need to know about iOS 13, iPadOS, watchOS, and the other new capabilities unveiled at the Apple WWDC 2019? And what impacts and opportunities do all these changes mean for product strategy and development? Finally, what does your business need to do to be ready for iOS 13's release this fall? Read on.

Privacy

Sign in with Apple

What is it? This is Apple's new single sign-on (SSO service) protected with biometric two-factor authentication, a feature that largely hasn't been adopted by other SSO services. It offers users with Apple IDs two options: either log in using their actual email addresses, or auto-generate a unique email address. Sign in with Apple ultimately wants users to be able to use it anywhere - on all iOS devices and apps, the web, and even on Android or Windows.

What does it mean? It means more privacy and security for users. If users elect to have Apple auto-generate a unique address, website or application developers/owners won't receive their actual email addresses. Apple will serve as an intermediary, forwarding any emails from the developer/owner to users' actual email addresses. Importantly, Apple has sworn that it will never track or profile users. In addition, there's no password to forget or steal.

What are the product strategy impacts and opportunities?

Required implementation. Any product using third-party authentication will be required to add Sign in with Apple. It's available cross-platform as a JavaScript SDK.

Verified authentication. Sign in with Apple alleviates the need to use third-party authentication options.

Improved product UX. It's convenient, fast, and entirely respectful of users' privacy. Apple is not only protecting users' email addresses, but giving them confidence that they won't be profiled or tracked. In addition, when using Sign in with Apple, developers can be certain they're receiving verified email addresses, so it's no longer necessary to send email verification messages.

Location Privacy

What is it? iOS 13 gives users more fine-tuned control over location data shared with applications. Now, users can choose whether to grant applications one-time access, or access anytime the app is used. They'll be notified when an app is using location data in the background, and new controls and API changes will keep apps from non-consensually accessing location data while users are on WiFi and Bluetooth.

What does it mean? Through all these privacy changes, Apple is positioning itself as the tech giant that really cares about ensuring user privacy. Now, it's truly up to users to decide how much location information they'd like to share. Also, the ongoing notifications keep location privacy top of mind and in users' control.

What are the product strategy impacts and opportunities?

Ensuring app support for all modes. Make sure your app supports all location-sharing modes (e.g., "always allow," "only while the app is active").

Careful evaluation of product location data strategy. How will your planned or existing product be impacted if users are hesitant to share location data? If your app requires ongoing access, how can you create UX and messaging that helps users feel secure?

Data Privacy for Kids

What is it? Any apps in the kids' category of the App Store are no longer permitted to include any third-party advertising or analytics software. This wasn't announced onstage at WWDC, but it's an important policy change worth understanding.

What does it mean? Apple's strengthened commitment to privacy extends to making sure children are protected from outside tracking and targeted advertising. Previously, Apple's guidelines were much more lax, banning only "behavioral advertising" based on user activity. So it was actually pretty common for app developers to integrate data-tracking software, even in kid-focused applications.

What are the product strategy impacts and opportunities?

Required compliance. If developers don't adhere to these guidelines, their apps will be removed from the App Store. They could even face permanent expulsion from Apple's Developer Program.

More user research. Going forward, product development efforts for kids' apps are likely to rely more heavily on actual user research.

The ability to showcase your app and business as respectful of kids' privacy. Apple is differentiating themselves from the competition by taking actions to ensure true privacy and security for users. Following their example will not only keep your kid-focused app on the App Store, but align you with Apple as a privacy champion.

Accessibility

Voice Control for Mac

What is it? The latest version of macOS, Catalina, allows users to control their Macs entirely with their voices. (Apple has indicated that this change is also coming to iOS.) They've improved dictation and editing features, and created more comprehensive commands that enable voice to open and interact with apps. Grids and numbers can also be used to allow users to work in apps.

What does it mean? For users who either can't - or don't prefer to - use more traditional inputs (e.g., keyboard, mouse, touch), writing and using applications via voice just became easier and more efficient. This is a clear step forward for accessibility, effectively removing or lowering barriers to usage for users with motor limitations.

What are the product strategy impacts and opportunities?

Expanding your product's user base. Can your existing or planned macOS app use these new capabilities to allow control entirely via voice? You now have the potential to serve an underserved market.

Opportunities to dream up new products that embrace these capabilities. This update changes what's possible with voice control. What possibilities can you imagine?

SiriKit Media […]

The post WWDC 2019: How to Build the Best Product Strategy for iOS 13, watchOS 6, and iPadOS appeared first on Distillery.

10 Jul 2019 10:21pm GMT

05 Jul 2019

Django community aggregator: Community blog posts

Tuples versus Lists in Python

"Tupleth verthuth lithts?" said the lispy snake

One thing I often ask for in code review is conversion of tuples to lists. For example, imagine we had this Django admin class:

class BookAdmin(ModelAdmin):
    fieldsets = (
        (
            None,
            {
                "fields": (
                    "id",
                    "name",
                    "created_time",
                )
            },
        ),
    )

    readonly_fields = ("created_time",)

I'd prefer the tuples to all be lists, like:

class BookAdmin(ModelAdmin):
    fieldsets = [
        [
            None,
            {
                "fields": [
                    "id",
                    "name",
                    "created_time",
                ]
            },
        ],
    ]

    readonly_fields = ["created_time"]

This is counter to the examples in the Django admin docs. So, why?

Tuples use parentheses, and parentheses have several uses in Python. Therefore it's easier to make typo mistakes with them.

First, it's easy to miss a single trailing comma and accidentally create a parenthesized string instead of a tuple. For example, if we missed it in the above code we would have:

readonly_fields = ("created_time")

This is the string "created_time", instead of a tuple containing that string. Woops.

This often works with code you pass the variable to because strings are iterable character-by-character. (Django guards against this in the particular case of ModelAdmin.readonly_fields with its check admin.E034. But in general Python code doesn't check for strings versus tuples/lists before iterating).
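
A quick illustration of how the mistake slips through:

readonly_fields = ("created_time")  # actually just the str "created_time"

for field in readonly_fields:
    print(field)  # prints one character per line: c, r, e, a, ...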

There's an argument Python shouldn't have made strings iterable character-by-character. There could instead be an iterator created with something like str.iter(). However, it's quite a backwards incompatible change to add now.

Second, it's easy to miss a couple of trailing commas and again accidentally create a parenthesized string. Imagine we missed the end-of-line commas in the above sub-element "fields":

"fields": (
    "id"
    "name"
    "created_time"
)

Now instead of the tuple ("id", "name", "created_time"), we have the string "idnamecreated_time". Woops again!

This too can silently fail because strings are iterable character-by-character.

Third, it's easy to create tuples accidentally. The following is a string:

x = "Hello,"

And this is a tuple, containing a string:

x = "Hello",

It's quite hard to spot the difference! Even with syntax highlighting, the comma is a small character so the difference in colour is easily missed.

(There's a dedicated flake8 plugin, flake8-tuple for banning exactly this construct, considering how harmful it can be.)

Fourth, they provide an extra step for new users of the language. Many languages have only one list-like type, so learning the subtle differences between tuples and lists takes time. Lists work in nearly every case tuples do, so it's simpler to stick to them.

I do believe there's a time and place to use tuples over lists though. The main times to do so are when you need an immutable, hashable value, for example a dictionary key or set member, and when a small fixed-size group of values has positional meaning, such as a namedtuple or a coordinate pair.

But for day-to-day application development, I think they tend to cause more problems than they're worth.

Fin

Thanks for reading this mini-rant. May you write more readable code,

-Adam

05 Jul 2019 4:00am GMT


04 Jul 2019

Django community aggregator: Community blog posts

Evennia 0.9 released

Last week we released Evennia 0.9, the next version of the open source Python MU* creation system.

This release is the result of about 10 months of development, featuring 771 commits, 70 closed pull requests from the community, and something like 80 issues and feature requests closed. Thanks everyone!



The main feature of Evennia 0.9 is that we have finally made the move to Python3. And we burn the bridges behind us; as announced in previous posts, we completely drop Python2 support and move to exclusively support the latest Python 3.7.

Overall the move to Python3 was not too bloody (and much work towards a never-published py2+3 version was already done by Evennia contributors in a separate branch earlier). The issues I ran into were mainly due to the changes in how Python3 separates strings from bytes. This became critical since Evennia implements several connection protocols; there were a lot of edge cases and weird errors appearing where data went to and from the wire.

A regular user has it a lot easier though. So far people have not had too much trouble converting their games from 2.7 to 3.7. The biggest Linux distros don't all have Py3.7 out of the box, however, so that may be a concern for some; we'll see.

... but Py3 is nowhere near all there is to find in this release! There is a plethora of further features in the latest Evennia, all to make it easier to build the text-based multiplayer game of your dreams.

You can see a summary of new features in the ML announcement and even more details in the actual CHANGELOG file.


So what's up next?

Now follows a period of bug-fixing and stabilizing. Maybe resolve some of those long-standing "tech-debt" issues and overall make Evennia more stable.

Eventually work will then commence (in the develop branch) on version 1.0 of Evennia. For this next release I think I'll step back from new features a bit and focus on refactoring and cleanup of the API as well as other things around the library's distribution, documentation and presentation.

But for now, onward to summer vacations.

04 Jul 2019 10:45pm GMT

02 Jul 2019

Django community aggregator: Community blog posts

What Is DevOps? Here Are the Core Concepts You Actually Need to Know

Over the past few years, DevOps has rapidly gained in popularity. We hear news near-constantly about how organizations worldwide are being transformed with the help of DevOps, and how engineers are mastering the new profession of DevOps. But what actually is the definition of DevOps? Is it a methodology, a profession, or both at once? What's the difference between CI and CD, or between IaaS, PaaS, and SaaS? How do DevOps and SRE relate to each other? And what other terms and concepts do you need to know to understand the language and meaning of DevOps?

DevOps Concepts You Need to Know - A DevOps Glossary

To answer the question, "What is DevOps?", Dmitry Stepanenko, Distillery's Head of DevOps, has prepared a glossary of core concepts and definitions you need to know. Our DevOps glossary cuts through the clutter, defining and explaining only the terms you really need to know, including:

A Definition of DevOps
CI and CD: The Backbone of DevOps
TDD and Microservices: The Development Side of DevOps
Virtual Machines and Cloud Computing
IaaS, PaaS, and SaaS
Containers and Kubernetes
Infrastructure as Code - the Confluence of "Dev" and "Ops"

What Is the Definition of DevOps?

DevOps - As a Methodology

DevOps can be defined as a two-pronged methodology that includes a combination of philosophies, tools, and practices related to both software development ("Dev") and IT operations ("Ops"). It is intended to accelerate the delivery of software products. With increased velocity, organizations are able to reduce their products' time to market and thus better satisfy the needs of their customers. The DevOps methodology relies on a culture of collaboration between people who have historically operated in separate organization silos - the development and operations departments. Now, all together, they are the "DevOps" team.

DevOps - As a Profession

DevOps can also be defined as a profession. A DevOps engineer is a person responsible for both the reliability of the software product (the "Ops" part) and the high pace of the product's development (the "Dev" part). DevOps is a multidisciplinary role that includes a wide range of duties previously performed by distinct individuals: system administrators, release engineers, and software developers. DevOps engineers implement continuous integration (CI) and continuous delivery (CD) processes, automate software build and deployment routines, and manage either on-premise or cloud infrastructure.

SRE (Site Reliability Engineering)

According to Ben Treynor Sloss, Google's VP of engineering and the founder of the Site Reliability Engineering term, "SRE is what you get when you treat operations as if it's a software problem." SRE can be defined as a more opinionated and prescriptive way of doing DevOps - a way pioneered by Google. (For a more detailed description of the relationship between DevOps and SRE, please refer to The Site Reliability Workbook.) Thinking in terms of programming languages, an SRE is a concrete class that implements a DevOps interface. (See also this video with Seth Vargo and Liz Fong-Jones: "What's the Difference Between DevOps and SRE?")

CI and CD - The Backbone of DevOps

Continuous Integration (CI)

Continuous integration, or CI, is a software development practice in which developers integrate source code changes into the shared repository as early and often as possible, ideally several times a day. Each source code integration invokes an automated process of building, testing, and verification of the code change. Continuous integration allows developers to locate and eliminate software defects at an early stage of software development, thus dramatically accelerating development speed.

Continuous Delivery (CD)

Continuous delivery, or CD, is a further extension of continuous integration in which software is ready to be released to production at any time. Apart from the automated build and test steps of the continuous integration, continuous delivery also includes fully automated release deployment, thus speeding up the development process even more.

Continuous Deployment (CD)

Continuous deployment, also known as CD, is the superlative form of software development practice. It extends the principles of continuous integration and continuous delivery to the extreme. The software isn't just ready to be released to production at any time, as in continuous delivery. It's actually being released as soon as any change is produced by the developer, usually multiple times per day.

TDD and Microservices - The Development Side of DevOps

Test-Driven Development (TDD)

Test-driven development, or TDD, is a software development practice that requires software engineers to write automated tests before they even write the actual code that will be validated by the tests. With TDD, software engineers follow a short development cycle:

  1. Write an initially failing automated test case that serves as a specification
  2. Ensure the test fails
  3. Write the minimum amount of code required to pass the test
  4. Refactor the code
  5. Repeat the cycle

TDD dramatically reduces the risks of software bugs associated with each individual release, making it possible to confidently release product quickly and often.

Microservices

Microservices are an architectural approach in software development. They structure complex applications as a collection of small, loosely coupled services. Unlike the more traditional monolithic architectural approach, microservices allow for the development of large, complex applications by small, autonomous teams following the two-pizza rule. Microservices and DevOps perfectly complement each other, because it is much easier to build the continuous delivery pipelines for independently developed, tested, and deployed microservices than to configure a continuous delivery pipeline for a large monolith with one stroke. Moreover, small, independent teams can deliver faster and be more Agile than their larger counterparts.

Virtual Machines and Cloud Computing

Virtual Machines

Virtual machines are emulations of the physical computer machine. The widespread proliferation of virtual machines was preceded by a long history of research and development that began with the work on the IBM CP-40 operating system in 1964. In the 2000s, virtualization products such as VMware and VirtualBox revolutionized software development, significantly accelerating computer machines' provisioning time and thus greatly simplifying the job of IT operations. Virtual machines reduced the typical time required to provision a computing resource from several days (the minimum required to approve a purchase and […]

The post What Is DevOps? Here Are the Core Concepts You Actually Need to Know appeared first on Distillery.

02 Jul 2019 8:54pm GMT

Celery, Rabbits, and Warrens

Stunned by Rabbit?

Every time I pick up the Python job queue Celery after not using it for a while, I find I've forgotten exactly how RabbitMQ works. I find the Advanced Message Queuing Protocol (AMQP) concepts drop out of my head pretty quickly: Exchanges, Routing Keys, Bindings, Queues, Virtual Hosts…

I've repeatedly returned to the blog post "Rabbits and Warrens" by Jason J. W. Williams to refresh my mind on the concepts. I saved it into my Evernote in December 2014, but unfortunately it has gone offline since. Luckily it's saved in the Web Archive.

If you use Celery with an AMQP backend, or RabbitMQ, it's worth reading.

My Favourite Bit

Here's the extract that helps clarify things the most for me.

First, the basic AMQP building blocks:

There are four building blocks you really care about in AMQP: virtual hosts, exchanges, queues and bindings. A virtual host holds a bundle of exchanges, queues and bindings. Why would you want multiple virtual hosts? Easy. A username in RabbitMQ grants you access to a virtual host…in its entirety. So the only way to keep group A from accessing group B's exchanges/queues/bindings/etc. is to create a virtual host for A and one for B. Every RabbitMQ server has a default virtual host named "/". If that's all you need, you're ready to roll.

Jason then goes on to explain in straightforward language how they work together. The image at the end really clarifies things:

Queues are where your "messages" end up. They're message buckets…and your messages sit there until a client (a.k.a. consumer) connects to the queue and siphons it off. However, you can configure a queue so that if there isn't a consumer ready to accept the message when it hits the queue, the message goes poof. But we digress…

The important thing to remember is that queues are created programmatically by your consumers (not via a configuration file or command line program). That's OK, because if a consumer app tries to "create" a queue that already exists, RabbitMQ pats it on the head, smiles gently and NOOPs the request. So you can keep your MQ configuration in-line with your app code…what a concept.

OK, so you've created and attached to your queue, and your consumer app is drumming its fingers waiting for a message…and drumming…and drumming…but alas no message. What happened? Well you gotta pump a message in first! But to do that you've got to have an exchange…

Exchanges are routers with routing tables. That's it. End stop. Every message has what's known as a "routing key", which is simply a string. The exchange has a list of bindings (routes) that say, for example, messages with routing key "X" go to queue "timbuktu". But we get slightly ahead of ourselves.

Your consumer application should create your exchanges (plural). Wait? You mean you can have more than one exchange? Yes, you can, but why? Easy. Each exchange operates in its own userland process, so adding exchanges, adds processes allowing you to scale message routing capacity with the number of cores in your server. As an example, on an 8-core server you could create 5 exchanges to maximize your utilization, leaving 3 cores open for handling the queues, etc.. Similarly, in a RabbitMQ cluster, you can use the same principle to spread exchanges across the cluster members to add even more throughput.

OK, so you've created an exchange…but it doesn't know what queues the messages go in. You need "routing rules" (bindings). A binding essentially says things like this: put messages that show up in exchange "desert" and have routing key "ali-baba" into the queue "hideout". In other words, a binding is a routing rule that links an exchange to a queue based on a routing key. It is possible for two binding rules to use the same routing key. For example, maybe messages with the routing key "audit" need to go both to the "log-forever" queue and the "alert-the-big-dude" queue. To accomplish this, just create two binding rules (each one linking the exchange to one of the queues) that both trigger on routing key "audit". In this case, the exchange duplicates the message and sends it to both queues. Exchanges are just routing tables containing bindings.

Now for the curveball: there are multiple types of exchanges. They all do routing, but they accept different styles of binding "rules". Why not just create one type of exchange for all style of rules? Because each rule style has a different CPU cost for analyzing if a message matches the rule. For example, a "topic" exchange tries to match a message's routing key against a pattern like "dogs.*". Matching that wildcard on the end takes more CPU than simply seeing if the routing key is "dogs" or not (e.g. a "direct" exchange). If you don't need the extra flexibility of a "topic" exchange, you can get more messages/sec routed if you choose the "direct" exchange type. So what are the types and how do they route?

Fanout Exchange - No routing keys involved. You simply bind a queue to the exchange. Any message that is sent to the exchange is sent to all queues bound to that exchange. Think of it like a subnet broadcast. Any host on the subnet gets a copy of the packet. Fanout exchanges route messages the fastest.

Direct Exchange - Routing keys are involved. A queue binds to the exchange to request messages that match a particular routing key exactly. This is a straight match. If a queue binds to the exchange requesting messages with routing key "dog", only messages labelled "dog" get sent to that queue (not "dog.puppy", not "dog.guard"…only "dog").

Topic Exchange - Matches routing keys against a pattern. Instead of binding with a particular routing key, the queue binds with a pattern string. The symbol # matches one or more words, and the symbol * matches any single word (no more, no less). So "audit.#" would match "audit.irs.corporate", but "audit.*" would only match "audit.irs". Our friends at RedHat have put together a great image to express how topic exchanges work:

The post goes on to explain persistence, durability, and some demo Python code.
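
To make that concrete, here's a minimal sketch using the pika client (this is my own example, not from Jason's post; the exchange, queue, and routing key names just echo his):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# An exchange, a queue, and a binding that routes by key.
channel.exchange_declare(exchange="desert", exchange_type="direct")
channel.queue_declare(queue="hideout")
channel.queue_bind(exchange="desert", queue="hideout", routing_key="ali-baba")

# Publish a message; the exchange routes it to every queue bound with this key.
channel.basic_publish(exchange="desert", routing_key="ali-baba", body=b"open sesame")

connection.close()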

Summary

Thanks to Jason for an explanation worth reblogging. Check out the full content on Web Archive, and if you need to learn more about RabbitMQ, Jason co-authored the book RabbitMQ in Action.

Enjoy,

-Adam

02 Jul 2019 4:00am GMT


01 Jul 2019

Django community aggregator: Community blog posts

Python 3 GUI: wxPython 4 Tutorial - Urllib & JSON Example

In this tutorial, we'll learn to build a Python 3 GUI app from scratch using wxPython and Urllib. We'll be consuming a third-party news REST API available from newsapi.org which provides breaking news headlines, and allows you to search for articles from over 30,000 news sources and blogs worldwide. We'll use Urllib for sending HTTP requests to the REST API and the json module to parse the response.

Throughout this tutorial, you'll learn how to create desktop user interfaces in Python 3, including adding widgets and managing data.

First of all, head over to the registration page and create a new account, then take note of the provided API key, which we'll be using later to access the news data.

What is wxPython

wxPython is a Python wrapper around wxWidgets - the cross-platform C++ library for building desktop apps for macOS, Linux, and Windows. wxPython was created by Robin Dunn.

Prerequisites

You will need Python 3 and pip installed, along with a basic knowledge of Python.

Installing wxPython 4

Let's start by installing wxPython 4 using pip. Open a new terminal and simply run the following command:

$ pip install wxpython

If the installation fails, you may be missing some dependencies, depending on your operating system. Check out the prerequisites section in the official GitHub repository for more information.

Creating your First wxPython 4 GUI Window

After installing wxPython, you can easily create your first GUI window by creating a single Python file and instantiating the wx.App and wx.Frame classes.

Inside your working folder, create a newsy.py file and add the following code:

import wx

app = wx.App()
frame = wx.Frame(parent=None, title='Newsy: Read the World News!')
frame.Show()
app.MainLoop()

In this example, we use two essential classes - wx.App and wx.Frame.

The wx.App class is used to instantiate a wxPython application object.

From the wx.App object, you can call the MainLoop() method, which starts the event loop used to listen for events in your application.

wx.Frame is used to create a window. In our example, we created a window with no parent and the title Newsy: Read the World News!.

Now, run your GUI app using the following command from your terminal:

$ python newsy.py

This is a screenshot of our GUI window:

wxPython 4 GUI Window

Let's refactor our code and create a menu and status bars. First, we create a MainWindow class that extends the wx.Frame class:

class MainWindow(wx.Frame):
    def __init__(self, parent, title):

        super(MainWindow, self).__init__(parent, title = title, size = (600,500))
        self.Centre()
        self.CreateStatusBar()
        self.createMenu()

    def createMenu(self):

        menu= wx.Menu()
        menuExit = menu.Append(wx.ID_EXIT, "E&xit", "Quit application")

        menuBar = wx.MenuBar()
        menuBar.Append(menu,"&File")
        self.SetMenuBar(menuBar)

        self.Bind(wx.EVT_MENU, self.OnExit, menuExit)

    def OnExit(self, event):
        self.Close(True) #Close the frame

In the __init__() method, we call the Centre() method of wx.Frame to center the window on the screen. Next, we call the CreateStatusBar() method to create a status bar. Finally, we define and call the createMenu() method, which creates a File menu with an Exit item, adds it to a menu bar, attaches the menu bar to the frame with SetMenuBar(), and binds the Exit item's menu event to the OnExit() handler.

Next, refactor the code for creating the app as follows:

if __name__ == '__main__':
    app = wx.App()
    window= MainWindow(None, "Newsy - read worldwide news!")
    window.Show()
    app.MainLoop()

After running the app, this is a screenshot of our window at this point:

Adding a wxPython Panel

According to the docs:

A panel is a window on which controls are placed. It is usually placed within a frame. Its main feature over its parent class wx.Window is code for handling child windows and TAB traversal, which is implemented natively if possible (e.g. in wxGTK) or by wxWidgets itself otherwise.

Now, let's create a panel called NewsPanel that extends wxPanel:

class NewsPanel(wx.Panel):

    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        self.SetBackgroundColour("gray")

Next, let's instantiate the class in the MainWindow constructor for actually adding a panel to our window:

class MainWindow(wx.Frame):
    def __init__(self, parent, title):

        super(MainWindow, self).__init__(parent, title = title, size = (600,500))
        self.Centre()
        NewsPanel(self)
        self.CreateStatusBar()
        self.createMenu()         

Adding wxPython Lists for News and Sources

According to the docs:

A list control presents lists in a number of formats: list view, report view, icon view and small icon view. In any case, elements are numbered from zero. For all these modes, the items are stored in the control and must be added to it using wx.ListCtrl.InsertItem method.

After creating our panel, let's add two lists which will hold the sources and the news items:

class NewsPanel(wx.Panel):

    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        self.SetBackgroundColour("gray")

        self.sources_list = wx.ListCtrl(
            self, 
            style=wx.LC_REPORT | wx.BORDER_SUNKEN
        )
        self.sources_list.InsertColumn(0, "Source", width=200)

        self.news_list = wx.ListCtrl(
            self, 
            size = (-1 , - 1),
            style=wx.LC_REPORT | wx.BORDER_SUNKEN
        )
        self.news_list.InsertColumn(0, 'Link')
        self.news_list.InsertColumn(1, 'Title')

We use wx.ListCtrl to create a list in wxPython, then we call the InsertColumn() method to add columns to our lists. For our first list, we only add one Source column. For the second list, we add two columns: Link and Title.

Creating a Layout with Box Sizer

According to the docs:

Sizers ... have become the method of choice to define the layout of controls in dialogs in wxPython because of their ability to create visually appealing dialogs independent of the platform, taking into account the differences in size and style of the individual controls.

Next, let's place the two lists side by side using the BoxSizer layout. wxPython provides absolute positioning as well as advanced layout algorithms based on sizers.

wx.BoxSizer allows you to place several widgets into a row or a column.

box = wx.BoxSizer(wx.HORIZONTAL)

The orientation can be wx.VERTICAL or wx.HORIZONTAL.

You can add widgets into the wx.BoxSizer using the Add() method:

box.Add(window, proportion=0, flag=0, border=0)

In the __init__() method of our news panel, add the following code:

        sizer = wx.BoxSizer(wx.HORIZONTAL)
        sizer.Add(self.sources_list, 0, wx.ALL | wx.EXPAND)
        sizer.Add(self.news_list, 1, wx.ALL | wx.EXPAND)
        self.SetSizer(sizer)

This is a screenshot of our window with two lists:

Let's now start by populating the source list. First import the following modules:

import urllib.request 
import json

Next, define the API_KEY variable which will hold your API key that you received after creating an account with NewsAPI.org:

API_KEY = ''
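
If you'd rather not hard-code the key, one option is to read it from an environment variable instead; a small sketch (the NEWSAPI_KEY variable name is just an assumption, use whatever you like):

import os

# Fall back to an empty string if the NEWSAPI_KEY environment variable is not set
API_KEY = os.environ.get("NEWSAPI_KEY", "")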

Fetching JSON Data Using Urllib.request

Next, in NewsPanel, add a method for grabbing the news sources:

    def getNewsSources(self):
        with urllib.request.urlopen("https://newsapi.org/v2/sources?language=en&apiKey=" + API_KEY) as response:
            response_text = response.read()   
            encoding = response.info().get_content_charset('utf-8')
            JSON_object = json.loads(response_text.decode(encoding))            

            for el in JSON_object["sources"]:
                 print(el["description"] + ":")
                 print(el["id"] + ":")

                 print(el["url"] + "\n")
                 self.sources_list.InsertItem(0, el["name"])


Next, call the method in the constructor:

class NewsPanel(wx.Panel):

    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        # [...]
        self.getNewsSources()

That's it! If you run the application again, you should see a list of news sources displayed:

Now, when we select a news source from the list on the left, we want the news from this source to be displayed in the list on the right. We first need to define a method to fetch the news data. In NewsPanel, add the following method:

    def getNews(self, source):
        with urllib.request.urlopen("https://newsapi.org/v2/top-headlines?sources=" + source + "&apiKey=" + API_KEY) as response:
            response_text = response.read()
            encoding = response.info().get_content_charset('utf-8')
            JSON_object = json.loads(response_text.decode(encoding))
            # Clear out articles from any previously selected source
            self.news_list.DeleteAllItems()
            index = 0
            for el in JSON_object["articles"]:
                self.news_list.InsertItem(index, el["url"])
                self.news_list.SetItem(index, 1, el["title"])
                index += 1


Next, we need to call this method when a source is selected. Here comes the role of wxPython events.

Binding wxPython Events

In the __init__() constructor of NewsPanel, call the Bind() method on the sources_list object to bind the wx.EVT_LIST_ITEM_SELECTED event of the list to the OnSourceSelected() method:

class NewsPanel(wx.Panel):

    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        # [...]
        self.sources_list.Bind(wx.EVT_LIST_ITEM_SELECTED, self.OnSourceSelected)

Next, define the OnSourceSelected() method as follows:
    def OnSourceSelected(self, event):
         source = event.GetText().replace(" ", "-")
         self.getNews(source)

Now, run your application and select a news source; you should get a list of news from the selected source in the right-hand list:
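
A quick aside: deriving the source id by replacing spaces with dashes happens to work for many sources, but it isn't guaranteed to match the id field the API actually returns. A more robust sketch, not from the original tutorial, is to remember each source's real id while populating the list (the source_ids attribute below is an invented name):

    def getNewsSources(self):
        with urllib.request.urlopen("https://newsapi.org/v2/sources?language=en&apiKey=" + API_KEY) as response:
            response_text = response.read()
            encoding = response.info().get_content_charset('utf-8')
            JSON_object = json.loads(response_text.decode(encoding))
            self.source_ids = {}
            for el in JSON_object["sources"]:
                # Map the displayed name to the id the API expects
                self.source_ids[el["name"]] = el["id"]
                self.sources_list.InsertItem(0, el["name"])

    def OnSourceSelected(self, event):
        name = event.GetText()
        # Fall back to the dash-separated guess if the name is somehow missing
        source = self.source_ids.get(name, name.replace(" ", "-"))
        self.getNews(source)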

Open External URLs in Web Browsers

Now, we want to be able to open the news article, when selected, in the web browser to read the full article. First import the webbrowser module:

import webbrowser

Next, in NewsPanel define the OnLinkSelected() method as follows:

    def OnLinkSelected(self, event):
          webbrowser.open(event.GetText()) 

Finally, bind the method to the wx.EVT_LIST_ITEM_SELECTED event on the news_list object:

class NewsPanel(wx.Panel):

    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        # [...]
        self.news_list.Bind(wx.EVT_LIST_ITEM_SELECTED, self.OnLinkSelected)

Now, when you select a news item, its corresponding URL will be opened in your default web browser so you can read the full article.

Resizing the Lists when the Window is Resized

If you resize your window, you'll notice that the list columns are not resized accordingly. You can change this behavior by adding the following method to NewsPanel and binding it to the wx.EVT_PAINT event:

    def OnPaint(self, evt):
        width, height = self.news_list.GetSize()
        for i in range(2):
            # Column widths must be integers, hence the integer division
            self.news_list.SetColumnWidth(i, width // 2)
        evt.Skip()

Next, bind the method as follows:

class NewsPanel(wx.Panel):

    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        # [...]        
        self.Bind(wx.EVT_PAINT, self.OnPaint) 
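
Note that wx.EVT_PAINT fires on every repaint, so the columns are recalculated more often than strictly necessary. An alternative sketch, not from the original tutorial, is to bind wx.EVT_SIZE instead, which only fires when the panel is resized; wx.CallAfter defers the column adjustment until after the sizer has laid the lists out again:

class NewsPanel(wx.Panel):

    def __init__(self, parent):
        wx.Panel.__init__(self, parent)
        # [...]
        self.Bind(wx.EVT_SIZE, self.OnResize)

    def OnResize(self, evt):
        evt.Skip()  # let the sizer re-layout the children first
        wx.CallAfter(self.adjustColumns)

    def adjustColumns(self):
        width, height = self.news_list.GetSize()
        for i in range(2):
            self.news_list.SetColumnWidth(i, width // 2)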

Conclusion

In this tutorial, we've seen how to do desktop GUI development with Python 3 and wxPython. We've learned how to use wxPython to create windows, panels, and lists, how to lay them out with sizers, how to fetch JSON data with urllib.request, and how to listen for events.

01 Jul 2019 5:00am GMT

30 Jun 2019

feedDjango community aggregator: Community blog posts

Simple bash deployment script for Django

#!/bin/bash

remote=$1
cmd=$2

if [[ ! "${remote}" ]]; then
  echo No remote given, aborting, try username@host
  exit 1
fi
if [[ ! "${cmd}" ]]; then
  echo No command given, aborting, try switch or nop
  exit 1
fi

python=/usr/bin/python3.5
commit=$(git rev-parse HEAD)
date=$(date +%Y%m%d_%H%M%S)
name="${date}_${commit}"
src="${name}/git"
settings="${src}/src/conf/settings/"
venv="${name}/virtualenv"
archive="${name}.tar.gz"
previous="previous"
latest="latest"

set -e
echo "Transfer archive..."
git archive --format tar.gz -o "${archive}" "${commit}"
scp "${archive}" ${remote}:
rm -f "${archive}"

echo "Set up remote host..."
ssh "${remote}" mkdir -p "${src}"
ssh "${remote}" tar xzf "${archive}" -C "${src}"
ssh "${remote}" virtualenv --quiet "${venv}" -p ${python}
ssh "${remote}" "${venv}/bin/pip" install --quiet --upgrade pip setuptools
ssh "${remote}" "${venv}/bin/pip" install --quiet -r "${src}/requirements.txt"

echo "Set up django..."
ssh "${remote}" "${venv}/bin/python ${src}/src/manage.py check"
ssh "${remote}" "${venv}/bin/python ${src}/src/manage.py migrate --noinput"
ssh "${remote}" "${venv}/bin/python ${src}/src/manage.py collectstatic --noinput"

if [[ "$cmd" == "switch" ]]; then
  echo "Switching to new install..."
  ssh "${remote}" rm -f "${previous}"
  ssh "${remote}" mv "${latest}" "${previous}"
  ssh "${remote}" ln -s ${name} "${latest}"
  ssh "${remote}" 'kill -15 $(cat previous/git/src/gunicorn.pid)'
  echo "Deleting obsolete deploys"
  ssh "${remote}" /usr/bin/find . -maxdepth 1 -type d -name "2*" \
    grep -v $(basename $(readlink latest)) | \
    grep -v $(basename $(readlink previous )) \
    | /usr/bin/xargs /bin/rm -rfv
fi

echo "Cleaning up..."
ssh "${remote}" rm -f "${archive}"
rm -f "${archive}"
set +e

30 Jun 2019 1:34am GMT

27 Jun 2019

feedDjango community aggregator: Community blog posts

Testing email in Django

Sending email from a web app often seems like throwing stones into a black hole. You create a message, pass it to a mail send function and hope for the best. You don't control an inbox and mail server, so this whole process happens somewhere in between, and you ...
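
As a taste of what Django itself provides, its test runner swaps in an in-memory email backend, so anything your code sends can be inspected without a real mail server. A minimal sketch (the subject and addresses are invented):

from django.core import mail
from django.test import TestCase


class WelcomeEmailTests(TestCase):
    def test_welcome_email_is_sent(self):
        # During tests, Django routes email to the locmem backend,
        # so sent messages accumulate in mail.outbox.
        mail.send_mail(
            "Welcome!",
            "Thanks for signing up.",
            "noreply@example.com",
            ["user@example.com"],
        )
        self.assertEqual(len(mail.outbox), 1)
        self.assertEqual(mail.outbox[0].subject, "Welcome!")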

Read now

27 Jun 2019 2:51pm GMT

25 Jun 2019

feedDjango community aggregator: Community blog posts

Series A Funding: Why Outsourcing Should Be Part of Your Startup Growth Strategy

You've raised your seed funding round and are on the path to Series A - Your startup shows promise! Investors have taken note. That's fantastic, and we're excited for you. But we also want to make it clear: Nowadays, to be a successful startup, you need to get right back to thinking about how you can spend that seed funding as wisely as possible. And you need to start thinking ASAP about how you'll ensure your startup is deemed worthy of any additional funding.

Why the super-practical downer advice? While more startups than ever are getting seed money, fewer than ever are getting follow-on funding. So that Series A funding you're hoping for? It's not a given. Also, as accelerator Y Combinator always advises its startups, it's not your money. Getting more money is dependent on showing you can be a smart spender with good margins and a sustainable business model. Let me explain.

What's the "Series A Crunch"? Analyzing Seed vs. Series A Funding

The venture capital community coined the term "Series A Crunch" to describe a trend they saw in Series A funding. Essentially, they noted that, while huge numbers of startups were easily raising large sums in their seed rounds, much fewer startups were moving on to have successful Series A rounds. In the big picture, that means:

More startups are getting more seed funding.
More startups are competing for Series A funding.
PE and VC investors are becoming more selective, providing Series A funding to fewer startups.

Increasingly, startups only receive Series A funding if PEs and VCs assess that they're low-risk, high-reward business models with demonstrated traction in a growing market. And therein lies the "crunch." What does this mean for a successful startup looking to create sustainable competitive advantage?

Higher Expectations and Higher Stakes

The expectations for seed-stage performance are now higher. Only some startups will meet expectations. For example, VCs and PEs increasingly expect that seeded startups should be generating revenue before receiving Series A funding. TechCrunch offers up a compelling analysis from Silicon Valley VC firm Wing. In 2010, only 15% of seed-stage companies that had raised Series A rounds were already making money. Compare that to 2019, when Wing's data shows that "82 percent of companies that raised Series A rounds from top investors last year are already making money off their customers."

What does this mean? VCs and PEs are unlikely to offer Series A funding to startups that aren't showing sufficiently solid evidence of market traction. If a startup can't show the potential for significant growth in both revenues and customer base, that follow-on funding is unlikely to flow. The stakes are high, and the risk of failure is even higher. CB Insights found that "nearly 67% of startups stall at some point in the VC process and fail to exit or raise follow-on funding."

What Are the Series A Business Risks for Startups?

VCs and PEs aren't interested in funding risky investments. That's why smart startups are focused on reducing their business risk, and on identifying sources of competitive advantage. Risk impacts countless areas of any business. Below, we've covered the three most important business risks for startups looking to raise Series A funding.

Business Model Risk: Growth Matters

How's that business model looking? Long gone are the days when a sparkly new idea was sufficient to generate funding beyond the seed round. Nowadays, to make it to Series A, startups must have business models that demonstrate traction in several key areas, including:

Revenue growth: As mentioned above, even early-stage startups are expected to start making money. The next step is showing that the business model can create sustainable revenue growth.
Customer growth: Are your customers out there? Even more importantly, will they buy your product? And can your customer base continue to grow?
Market share growth: Can your business model compete and win in your chosen market?

Before VC and PE investors are willing to follow you to Series A, you must derisk your business model as thoroughly as possible in these areas. If investors don't see a strong, data-backed business plan that shows clear potential for growth, they'll mostly see… risk.

What you can do:

Reality-check your business plan, examining your assumptions. Can you test your assumptions? Are they sound? For example, can you demonstrate that your market is large enough to provide the type of growth investors seek?
Fully assess barriers to entry. How does your product stack up against the competition? Is your product truly differentiated against competitor products?
Build out your business plan with as much data as possible. If the economics don't make sense, investors are going to run the opposite direction.

Technology Risk: Speed Matters

Nowadays, technology development is part of a high percentage of startup business plans. That means VC and PE investors must consider the time, expertise, and resources required for design and development. After all, these variables all inform a startup's technology risk profile. When assessing startups' technology-related business risks, VCs and PEs are likely to focus on:

Development scope and type. Does your startup need to build technology from scratch, or can it leverage technologies that are already available? What alternatives already exist in the market, and is your product differentiated in meaningful ways? What is the risk that you'll be unable to build the technology? Does your product rely on third-party services? What happens if those third-party services go down?
Development speed. How quickly can development happen? How likely is it that development will take significantly longer than anticipated? Can you build and iterate quickly enough to deliver a reliable, functional product to customers?
Development expertise. Do you have access to the expertise and skill sets you need to develop the desired technology?
Development process. Do you have strong, proven, efficient processes in place?
Compliance, security, and intellectual property risk. What legal or regulatory risks impact development? Do you own all of the code?

After all, nothing changes the basic, underlying assumption that - […] The post Series A Funding: Why Outsourcing Should Be Part of Your Startup Growth Strategy appeared first on Distillery.

25 Jun 2019 9:49pm GMT

24 Jun 2019

feedDjango community aggregator: Community blog posts

Responses versus Requests-Mock

Last October, I swapped out using Responses for Requests-Mock in my Python projects. Both are libraries for achieving the same goal of mocking out HTTP requests made with the popular Requests library.

I swapped on my private and open source projects. For example of the transition, see my PR on ec2-metadata.

Responses is still vastly more popular than Requests-Mock. At time of writing, Responses has 2,376 GitHub stars to Requests-Mock's 257 (92 + 165 from its original repository). I'm writing this post as a bit of promotion for such a useful library, and also to give an example of my reasoning behind choosing one open source library over another. This is not to disparage Responses - it works really well, I've enjoyed using it, and have even contributed to it several times.
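
For anyone who hasn't used either library, here's a rough sketch of the same test written both ways (the URL and JSON payload are invented for illustration):

import requests
import requests_mock
import responses


@responses.activate
def test_with_responses():
    responses.add(responses.GET, "https://api.example.com/user", json={"id": 1})
    assert requests.get("https://api.example.com/user").json() == {"id": 1}


def test_with_requests_mock():
    with requests_mock.Mocker() as m:
        m.get("https://api.example.com/user", json={"id": 1})
        assert requests.get("https://api.example.com/user").json() == {"id": 1}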

Project Health

The more people use a project, the more likely it solves the problem well, its "sharp corners" have been "sanded off", and bugs have been fixed. We also want to see this continuing over time to know that we're likely to be supported going forwards.

Here are the statistics on each project:

24 Jun 2019 4:00am GMT

20 Jun 2019

feedDjango community aggregator: Community blog posts

How to Set Up a Centralized Log Server with rsyslog

For many years, we've been running an ELK (Elasticsearch, Logstash, Kibana) stack for centralized logging. We have a specific project that requires on-premise infrastructure, so sending logs off-site to a hosted solution was not an option. Over time, however, the maintenance requirements of this self-maintained ELK stack were staggering. Filebeat, for example, filled up all the disks on all the servers in a matter of hours, not once, but twice (and for different reasons) when it could not reach its Logstash/Elasticsearch endpoint. Metricbeat suffered from a similar issue: It used far too much disk space relative to the value provided in its Elasticsearch indices. And while provisioning a self-hosted ELK stack has gotten easier over the years, it's still a lengthy process, which requires extra care anytime an upgrade is needed. Are these problems solvable? Yes. But for our needs, a simpler solution was needed.

Enter rsyslog. rsyslog has been around since 2004. It's an alternative to syslog and syslog-ng. It's fast. And relative to an ELK stack, its RAM and CPU requirements are negligible.

This idea started as a proof-of-concept, and quickly turned into a production-ready centralized logging service. Our goals are as follows:

  1. Set up a single VM to serve as a centralized log aggregator. We want the simplest possible solution, so we're going to combine all logs for each environment into a single log file, relying on the source IP address, hostname, log facility, and tag in each log line to differentiate where logs are coming from. Then, we can use tail, grep, and other command-line tools to watch or search those files, like we might have through the Kibana web interface previously.
  2. On every other server in our cluster, we'll also use rsyslog to read and forward logs from the log files created by our application. In other words, we want an rsyslog configuration to mimic how Filebeat worked for us previously (or how the AWS CloudWatch Logs agent works, if you're using AWS).

Disclaimer: Throughout this post, we'll show you how to install and configure rsyslog manually, but you'll probably want to automate that with your configuration management tool of choice (Ansible, Salt, Chef, Puppet, etc.).

Log Aggregator Setup

On a central logging server, first install rsyslog and its relp module (for lossless log sending/receiving):

sudo apt install rsyslog rsyslog-relp

As of 2019, rsyslog is the default logger on current Debian and Ubuntu releases, but rsyslog-relp is not installed by default. We've included both for clarity.

Now, we need to create a minimal rsyslog configuration to receive logs and write them to one or more files. Let's create a file at /etc/rsyslog.d/00-log-aggregator.conf, with the following content:

module(load="imrelp")

ruleset(name="receive_from_12514") {
    action(type="omfile" file="/data/logs/production.log")
}

input(type="imrelp" port="12514" ruleset="receive_from_12514")

If needed, we can listen on one or more additional ports, and write those logs to a different file by appending new ruleset and input settings in our config file:

ruleset(name="receive_from_12515") {
    action(type="omfile" file="/data/logs/staging.log")
}

input(type="imrelp" port="12515" ruleset="receive_from_12515")

Rotating Logs

You'll probably want to rotate these logs from time to time as well. You can do that with a simple logrotate config. Create a new file /etc/logrotate.d/rsyslog_aggregator with the following content:

/data/logs/*.log {
  rotate 365
  daily
  compress
  missingok
  notifempty
  dateext
  dateformat .%Y-%m-%d
  dateyesterday
  postrotate
      /usr/lib/rsyslog/rsyslog-rotate
  endscript
}

This configuration will rotate log files daily, compressing older files, and rename the rotated files with the applicable date.

To see what this logrotate configuration will do (without actually doing anything), you can run it with the --debug option:

logrotate --debug /etc/logrotate.d/rsyslog_aggregator

To customize this configuration further, look at the logrotate man page (or type man logrotate on your UNIX-like operating system of choice).

Sending Logs to Our Central Server

We can also use rsyslog to send logs to our central server, with the help of the imfile module. First, we'll need the same packages installed on the server:

sudo apt install rsyslog rsyslog-relp

Create a file /etc/rsyslog.d/90-log-forwarder.conf with the following content:

# Poll each file every 2 seconds
module(load="imfile" PollingInterval="2")

# Create a ruleset to send logs to the right port for our environment
module(load="omrelp")
ruleset(name="send_to_remote") {
    action(type="omrelp" target="syslog" port="12514")  # production
}

# Send all files on this server to the same remote, tagged appropriately
input(
    type="imfile"
    File="/home/myapp/logs/myapp_django.log"
    Tag="myapp_django:"
    Facility="local7"
    Ruleset="send_to_remote"
)
input(
    type="imfile"
    File="/home/myapp/logs/myapp_celery.log"
    Tag="myapp_celery:"
    Facility="local7"
    Ruleset="send_to_remote"
)

Again, I listed a few example log files and tags here, but you may wish to create this file with a configuration management tool that allows you to templatize it (and create each input() in a Jinja2 {% for %} loop, for example).

Be sure to restart rsyslog (i.e., sudo service rsyslog restart) any time you change this configuration file, and inspect /var/log/syslog carefully for any errors reading and/or sending your log files.
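
For context, if the forwarded files come from a Django project, the application side is just ordinary file logging. A minimal LOGGING setting along these lines (a sketch; the path and logger names mirror the hypothetical myapp_django.log above) would produce the file that the imfile input then ships off-host:

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "verbose": {
            "format": "%(asctime)s %(levelname)s %(name)s %(message)s",
        },
    },
    "handlers": {
        "file": {
            "class": "logging.handlers.WatchedFileHandler",
            "filename": "/home/myapp/logs/myapp_django.log",
            "formatter": "verbose",
        },
    },
    "loggers": {
        "django": {"handlers": ["file"], "level": "INFO"},
        "myapp": {"handlers": ["file"], "level": "INFO"},
    },
}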

Watching & Searching Logs

Since we've given up our fancy Kibana web interface, we need to search logs through the command line now. Thankfully, that's fairly easy with the help of tail, grep, and zgrep.

To watch logs come through as they happen, just type:

tail -f /data/logs/staging.log

You can also pipe that into grep, to narrow down the logs you're watching to a specific host or tag, for example:

tail -f /data/logs/staging.log | grep django_celery

If you want to search previous log entries from today, you can do that with grep, too:

grep myapp_django /data/logs/staging.log

If you want to search the logs for a few specific days, you can do that with zgrep:

zgrep myapp_celery /data/logs/staging.log.2019-05-{23,24,25}.gz

Of course, you could search all logs from all time with the same method, but that might take a while:

zgrep myapp_django /data/logs/staging.log.*.gz

Conclusion

There are a myriad of ways to configure rsyslog (and centralized logging generally), often with little documentation about how best to do so. Hopefully this helps you consolidate logs with minimal resource overhead. Feel free to comment below with feedback, questions, or the results of your tests with this method.

20 Jun 2019 5:00pm GMT