15 Dec 2019

Planet Python

S. Lott: Functional programming design pattern: Nested Iterators == Flattening

Here's a functional programming design pattern I uncovered. This may not be news to you, but it was a surprise to me. It cropped up when looking at something that needs parallelization to reduce the elapsed run time.

Consider this data collection process.

for h in some_high_level_collection(arg1):
    for l in h.some_low_level_collection(arg2):
        if some_filter(l):
            logger.info("Processing %s %s", h, l)
            some_function(h, l)

This is pretty common in the devops world. You might be looking at all repositories in all GitHub organizations. You might be looking at all keys in all AWS S3 buckets under a specific account. You might be looking at all tables owned by all schemas in a database.

It's helpful -- for the moment -- to stay away from taller tree structures like the file system. Traversing the file system involves recursion, and the pattern is slightly different there. We'll get to it, but what made this clear to me was a "simpler" walk through a two-layer hierarchy.

The nested for-statements aren't really ideal. We can't apply any itertools techniques here. We can't trivially hand this to something like multiprocessing.Pool.map().

In fact, the more we look at this, the worse it is.

Here's something that's a little easier to work with:

def h_l_iter(arg1, arg2):
    for h in some_high_level_collection(arg1):
        for l in h.some_low_level_collection(arg2):
            if some_filter(l):
                logger.info("Processing %s %s", h, l)
                yield h, l

itertools.starmap(some_function, h_l_iter(arg1, arg2))

The data gathering has expanded to a few more lines of code, but it has gained a lot of flexibility. Once we have something that can be used with starmap, it can also be used with other itertools functions to do additional processing steps without breaking the loops into horrible pieces.
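One practical note (mine, not from the original post): itertools.starmap() is lazy, so nothing actually runs until the result is consumed. A minimal sketch of driving the pipeline, using the same placeholder names as above:

import itertools
from collections import deque

# The starmap object is a lazy iterator; consuming it drives the whole pipeline.
results = itertools.starmap(some_function, h_l_iter(arg1, arg2))
processed = list(results)  # keep the return values...

# ...or, if some_function is called only for its side effects, drain it cheaply:
deque(itertools.starmap(some_function, h_l_iter(arg1, arg2)), maxlen=0)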

I think the pattern here is a kind of "Flattened Map" transformation. The initial design, with nested loops wrapping a process, wasn't a good plan. A better plan is to think of the nested loops as a way to flatten the two tiers of the hierarchy into a single iterator. Then a mapping can be applied to process each item from that flat iterator.

Extracting the Filter

We can now tease apart the nested loops to expose the filter. In the version above, the body of the h_l_iter() function binds log-writing with the yield. If we take those two apart, we gain the flexibility of being able to change the filter (or the logging) without an awfully complex rewrite.

from typing import TypeVar, Iterable, Iterator

T = TypeVar('T')

def logging_iter(source: Iterable[T]) -> Iterator[T]:
    for item in source:
        logger.info("Processing %s", item)
        yield item

def h_l_iter(arg1, arg2):
    for h in some_high_level_collection(arg1):
        for l in h.some_low_level_collection(arg2):
            yield h, l

raw_data = h_l_iter(arg1, arg2)
filtered_subset = logging_iter(filter(some_filter, raw_data))
itertools.starmap(some_function, filtered_subset)

Yes, this is still longer, but all of the details are now exposed in a way that lets me change filters without further breakage.

Now, I can introduce various forms of multiprocessing to improve concurrency.
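As a sketch of what that might look like (my example, not from the original post, assuming some_function is a picklable module-level function and that result order doesn't matter):

from concurrent.futures import ProcessPoolExecutor

def process_pair(pair):
    # Unpack one (h, l) item from the flattened iterator and apply the work.
    h, l = pair
    return some_function(h, l)

raw_data = h_l_iter(arg1, arg2)
filtered_subset = logging_iter(filter(some_filter, raw_data))

# The worker pool consumes the flat sequence of pairs in parallel.
with ProcessPoolExecutor() as pool:
    results = list(pool.map(process_pair, filtered_subset))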

This transformed a hard-wired set of nested loops, an if-statement, and a function evaluation into a "Flattener" that can be combined with off-the-shelf filtering and mapping functions.

I've snuck in a kind of "tee" operation that writes an iterable sequence to a log. This can be injected at any point in the processing.

Logging the entire "item" value isn't really a great idea. Another mapping is required to create sensible log messages from each item. I've left that out to keep this exposition more focused.

I'm sure others have seen this pattern, but it was eye-opening to me.

Full Flattening

The h_l_iter() function can actually be replaced by a generator expression. A function isn't needed.

h_l_iter = (
    (h, l)
    for h in some_high_level_collection(arg1)
    for l in h.some_low_level_collection(arg2)
)

This simplification doesn't add much value, but it seems to be a general truth. In Python, it's a small change in syntax and, therefore, an easy optimization to make.

What About The File System?

When we're working with a more deeply nested structure, like the file system, we'll make a small change. We'll replace the h_l_iter() function with a recursive_walk() function.

def recursive_walk(path: Path) -> Iterator[Path]:
    # glob() needs a pattern; "*" lists the immediate children of this directory.
    for item in path.glob("*"):
        if item.is_file():
            yield item
        elif item.is_dir():
            yield from recursive_walk(item)

This function has, effectively, the same signature as h_l_iter(). It walks a complex structure, yielding a flat sequence of items. The other functions used for filtering, logging, and processing don't change, allowing us to build new features from various combinations of these functions.
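A quick usage sketch (my example, with a hypothetical is_python_source filter and report action): because recursive_walk() yields single items rather than pairs, the final step uses map() instead of starmap(), but the rest of the pipeline is unchanged.

from pathlib import Path

def is_python_source(path: Path) -> bool:
    # Hypothetical filter: keep only .py files.
    return path.suffix == ".py"

def report(path: Path) -> None:
    # Hypothetical action applied to each file.
    print(path)

candidates = logging_iter(filter(is_python_source, recursive_walk(Path.cwd())))

# map() is lazy; iterating over it drives the processing.
for _ in map(report, candidates):
    pass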

tl;dr

The tl;dr version of this is:

Replace for item in iter: process(item) with map(process, iter).

This pattern works for simple, flat items, nested structures, and even recursively-defined trees. It introduces flexibility with no real cost.

The other pattern in play is:

Any for item in iter: for sub-item in item: processing is "flattening" a hierarchy into a sequence. Replace it with (sub-item for item in iter for sub-item in item).

These felt like blinding revelations to me.

15 Dec 2019 10:29am GMT

Anwesha Das: Rootconf Hyderabad, 2019

What is Rootconf?

Rootconf is the conference for sysadmins, DevOps engineers, SREs, and network engineers. Rootconf started its journey in 2012 in Bangalore, and 2019 was its 7th edition. Over the years, a community has developed around Rootconf. Now people come not just to attend the conference but also to meet friends and peers and to discuss projects and ideas.

Need for more Rootconf

Over all these years, we have witnessed changes in networks, infrastructure, and security threats, and we have designed Rootconf keeping in mind the changing needs of the community. Lately, we have realized that those needs vary with geographic location. In Pune, there is considerable demand for sessions dealing with small infrastructure suited to startups and SMEs, as there is a growing startup industry there. In Delhi, there is demand for discussion around data centers, network design, and so on. And in Hyderabad, there is a want for solutions around large-scale infrastructure. The Bangalore event alone could not meet all these needs. So, the more the merrier: we decided to have more than one Rootconf a year.

Rootconf Pune was the first of this 'outstation Rootconf journey'. The next was Rootconf Hyderabad. It was the first event for which I organized the editorial and community work all by myself.
I joined HasGeek as Community Manager and Editorial Coordinator. After my first Rootconf in Bangalore, Zainab set a goal for me.

Z: 'Anwesha, I want you to organize Rootconf Hyderabad all by yourself; you must do it with no or minimal help from me.'
A: "Ummm hmmm ooops"
Z: 'Do not worry, I will be there to guide you. We will have our test run with you in Pune. So buck up, girl.'

Rootconf Hyderabad, the conference

The preparation for Rootconf Hyderabad started there. After months of the editorial process - scouting for proposals, reviewing them, having several rehearsals - and after passing the iron test in Pune, I reached Hyderabad to join my colleague Mak. Mak runs sales at HasGeek. Behind the camera, we had our excellent AV captain Amogh. So I felt utterly secure and was not worried about those two aspects.

A day before the conference, Damini, our emcee, and I chalked out the plans for navigating the schedule and coordinating the conference. We met the volunteers at the venue after a humongous lunch of Hyderabadi delicacies (honest confession: food is the primary reason why I love to attend conferences in Hyderabad). We had had several volunteer calls in which our volunteer coordinator Jyotsna briefed them on their duties, but it is always essential to introduce the volunteers to the ground reality. We had a meetup at ThoughtWorks.
The day of a conference starts early, much too early for the organizers and volunteers. Rootconf Hyderabad was no different. We opened registration, and people started flocking into the auditorium. I opened the conference with a short address.

Then our emcee Damini took over. The first half of our schedule was designed keeping in mind the problems of large-scale infrastructure: observability, maintainability, scalability, performance, taming large systems, and networking issues. Piyush, our first speaker, started with a talk on observability and control theory. Next was Flipkart's journey of "Fast object distribution using P2P" by Ankur Jain. After a quick beverage break, Anubhav Mishra shared his take on "Taming infrastructure workflow at scale", the story of HashiCorp, followed by Tasdik Rahman and his story of "Achieving repeatable, extensible and self serve infrastructure" at Gojek.

The second half of the day was planned to address issues shared by all infrastructure regardless of size or complexity: security, DevSecOps, scaling, and of course, microservices (an infrastructure conference seems incomplete without a discussion of moving from monolith to microservices). Our very own security expert Lavakumar started it with "Deploying and managing CSP: the browser-side firewall", describing the security complexities of the post-Magecart-attack days. Jambunathan shared the tale of "Designing microservices around your data design". For the last talk of the day, we had Gaurav Kamboj. He told us what happens to the systems engineers at Hotstar when Virat Kohli is batting in his 90s, in "Scaling hotstar.com for 25 million concurrent viewers".
Birds of a Feather (BOF) sessions have always been a favorite at Rootconf. These non-recorded sessions give participants a chance to be frank. We have facilitators, not presenters, to move the discussion forward. While talks were going on in the main auditorium, there was a dedicated BOF area where we held sessions.

This time, gauging the popularity of the BOFs, we tried something new: we had a BOF session planned in the main auditorium. It was on "Doing DevSecOps in your organization," facilitated by Lava and Hari. It was one session that our emcee Damini had a difficult time ending. People had so many stories to share and questions to ask, but there was no time. I also got some angry looks (which I do not mind at all) :).

In India, I have noticed that most conferences fail to have good, up-to-the-mark flash talks. Invariably they are community information, conference, or meetup announcements (the writer is guilty of this too). So I proposed that we accept proposals for flash talks as well: half pre-selected and the rest selected on the spot. Zainab agreed. We have been following this rule since Rootconf Pune, and the quality of the flash talks has improved a lot. We had some fantastic flash talks. You can check them out for yourself at https://www.youtube.com/watch?v=AlREWUAEMVk.

Thank you

Organizing a conference is not a one-person job. In a large infrastructure, it is the small tools and microservices that keep the whole system working. Consider the conference as a system and the tasks as microservices: each task has to be done well for the conference to be successful and flawless. And I am blessed to have had an amazing team. Thank you to each of the amazing volunteers, the Null Hyderabad, Mozilla, and AWS communities, our emcee Damini, hall manager Geetanjali, the speakers, sponsors, attendees, and my team at HasGeek. Last but not least, thank you, Zainab, for trusting me, being by my side, and not letting me fall.

The experience

Organizing a conference has been a journey of estrogen and adrenaline overflow for me: the nightmares, the excitement of each ticket sale, the long chats with reviewers about talks and BOFs, the discussions with communities about what they want from Rootconf, the jitters before the conference starts, and the tweets and blog posts from people saying they enjoyed the conference and found it useful. It was an exciting, scary, happy, and satisfying journey for me. And guess what, my life continues to be so, as Rootconf is ready with its Delhi edition. I hope to meet you there.

15 Dec 2019 10:15am GMT

Catalin George Festila: Python 3.7.5 : Simple intro in CSRF.

CSRF, or Cross-Site Request Forgery, is a technique used by cyber-criminals to force users into executing unwanted actions on a web application. To protect against web form CSRF attacks, it isn't sufficient for web applications to trust authenticated users; each form must be equipped with a unique identifier called a CSRF token, similar to a session identifier. Django 3.0 can be used with CSRF, see the
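As a quick, hedged illustration (my own minimal sketch, not from the original post): Django's CsrfViewMiddleware validates a token that you emit in every POST form with the {% csrf_token %} template tag, and a view can also opt in explicitly with the csrf_protect decorator.

# views.py -- minimal sketch; comment_form.html is a hypothetical template whose
# <form> element includes the {% csrf_token %} tag.
from django.shortcuts import render
from django.views.decorators.csrf import csrf_protect

@csrf_protect  # redundant when CsrfViewMiddleware is enabled, but explicit
def comment_form(request):
    if request.method == "POST":
        # Requests with a missing or invalid token were already rejected (403).
        ...
    return render(request, "comment_form.html")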

15 Dec 2019 6:03am GMT

14 Dec 2019

Planet Python

Andre Roberge: A Tiny Python Exception Oddity

Today, while working on Friendly-traceback (improved documentation!), as I have been doing a lot recently, I came across an odd SyntaxError case:

You have been warned.

Normal behaviour


When Python finds a SyntaxError, it flags its location. Let's have a look at a simple case, using CPython 3.7.

Notice how it indicates where it found the error, as shown by the red arrow: this happened when it reached a token that was inconsistent with the code entered so far. In my experience until today, this seemed to always be the case. Note that using CPython 3.6 yields exactly the same behaviour and the same unhelpful error message.

Before discussing the case with a different behaviour, let's make a detour and look at Pypy's handling of the same case.

Same location indicated, but a much more helpful error message, even though this is version 3.6. This improved error message was discussed in this Pypy blog post. I strongly suspect that this is what led to the improved error message in CPython 3.8.

Same error message as Pypy ... but the exact location of the error, previously indicated by ^, no longer appears - which could be unfortunate when nested parentheses (including square and curly brackets) are present.

What about Friendly-traceback you ask? I thought you never would! ;-)

Well, here's the information when using CPython 3.7.


The line about not having enough information from Python refers to the unhelpful message ("invalid syntax"). Hopefully you will agree that the information given by Friendly-traceback would be generally more useful, and especially more so for beginners.

But enough about this case. It is time to look at the odd behaviour one.

Odd case


Consider the following:

Having a variable declared both as a global and nonlocal variable is not allowed. Let's see what happens when this is executed by Pypy.
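The examples in the post are screenshots, which aren't reproduced here; a hypothetical sketch of the kind of file being described (my reconstruction, not the author's code) would be something like:

# oddity.py -- hypothetical reconstruction
def outer():
    x = 1
    def inner():
        nonlocal x   # per the post, CPython points at the first of the two statements
        global x     # per the post, Pypy points at the statement that contradicts what came before
        x += 1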


So, Pypy processed the file past the nonlocal statement and flagged the location where it encountered a statement that was inconsistent with everything that had been read so far: it thus flagged that as the location of the error.

Now, what happens with CPython:


The location flagged is one line earlier. The nonlocal statement is flagged as problematic but, reading the code up to that point, there is no indication that a global statement was encountered before.

Note that, changing the order of the two statements does not change the result: pypy shows the beginning of the second statement (line 6) as the problem, whereas CPython always shows the line before.

Why does it matter to me?

If you go back to the first case I discussed, with the unmatched parenthesis, in Friendly-traceback, I rely on the location of the error shown by Python to indicate where the problem arose and, when appropriate, I look *back* to also show where the potential problem started. Unfortunately, I cannot do that in this case with CPython.

Why is this case handled differently by CPython?

While I have some general idea of how the CPython interpreter works, I absolutely do not understand it well enough to claim with certainty how this situation arises. Please, feel free to leave a comment to correct the description below if it is incorrect.

My understanding is the following:

After breaking a file down into tokens and parsing it according to the rules of the Python grammar, an abstract syntax tree (AST) is constructed if no syntax error is found. The nonlocal/global problem noted here is not picked up by CPython up to that point - which also explains why flake8 would not find it, as it relies on the AST and does not actually execute the code. (I'm a bit curious as to how Pylint does it ... I'll probably have to look into it when I have more time.)

Using the AST, a control flow graph is created and various "frames" are created with links (GOTOs, under a different name...) joining different parts. It is at that point that the relationships between variables in different frames are examined in detail. Pictorially, this can be represented as follows:


(This image was taken from this blog post by Eli Bendersky) In terms of the actual code, it is in the CPython symtable.c file. At that point, errors are not found by scanning lines of code linearly, but rather by visiting nodes in the AST in some deterministic fashion ... which leads to the oddity mentioned previously: CPython consistently shows the first of two statements as the source of the problem, whereas Pypy (which relies on some other method) shows the second, which is consistent with the way it shows the location of all SyntaxError messages.

Conclusion

For Friendly-traceback, this likely means that for such cases, unlike the mismatched parentheses case, I will not attempt to figure out which two lines are problematic, and will simply expand slightly on the terse one-liner given by Python (and in a way that can be translated into languages other than English).

14 Dec 2019 10:09pm GMT

Weekly Python StackOverflow Report: (ccvi) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2019-12-14 12:53:35 GMT


  1. Python __getitem__ and in operator result in strange behavior - [22/1]
  2. What is the reason for difference between integer division and float to int conversion in python? - [18/1]
  3. Match multiple(3+) occurrences of each character - [7/5]
  4. Pandas Dataframe: Multiplying Two Columns - [7/3]
  5. What does !r mean in Python? - [7/1]
  6. python how to find the number of days in each month from Dec 2019 and forward between two date columns - [6/4]
  7. Itertools zip_longest with first item of each sub-list as padding values in stead of None by default - [6/4]
  8. Writing more than 50 millions from Pyspark df to PostgresSQL, best efficient approach - [6/0]
  9. How to merge and groupby between seperate dataframes - [5/3]
  10. Setting the index after merging with pandas? - [5/3]

14 Dec 2019 12:54pm GMT

Catalin George Festila: Python 3.7.5 : Django admin shell by Grzegorz Tężycki.

Today I tested another Python package for Django named django-admin-shell. This package, created by Grzegorz Tężycki, can be found on GitHub and comes with this intro: "Django application can execute python code in your project's environment on django admin site. You can use similar as python manage shell without reloading the environment." [mythcat@desk ~]$ cd projects/ [mythcat@desk projects]$ cd

14 Dec 2019 11:18am GMT

13 Dec 2019

Planet Python

Peter Bengtsson: A Python and Preact app deployed on Heroku

Heroku is great but it's sometimes painful when your app isn't just in one single language. What I have is a project where the backend is Python (Django) and the frontend is JavaScript (Preact). The folder structure looks like this:

/
  - README.md
  - manage.py
  - requirements.txt
  - my_django_app/
     - settings.py
     - asgi.py
     - api/
        - urls.py
        - views.py
  - frontend/
     - package.json
     - yarn.lock
     - preact.config.js
     - build/
        ...
     - src/
        ...

A bunch of things are omitted for brevity, but people familiar with Django and preact-cli/create-react-app should recognize the structure.
The point is that the root is a Python app and the front-end is exclusively inside a sub folder.

When you do local development, you start two servers:

- the Django dev server on http://localhost:8000
- the preact dev server on http://localhost:3000

The latter is what you open in your browser. That preact app will do things like:

const response = await fetch('/api/search');

and, in preact.config.js I have this:

export default (config, env, helpers) => {

  if (config.devServer) {
    config.devServer.proxy = [
      {
        path: "/api/**",
        target: "http://localhost:8000"
      }
    ];
  }

};

...which is hopefully self-explanatory. So, calls like GET http://localhost:3000/api/search actually go to http://localhost:8000/api/search.

That's when doing development. The interesting thing is going into production.

Before we get into Heroku, let's first "merge" the two systems into one, and the trick used is Whitenoise. Basically, Django's web server will be responsible not only for things like /api/search but also for static assets such as / --> frontend/build/index.html and /bundle.17ae4.js --> frontend/build/bundle.17ae4.js.

This is basically all you need in settings.py to make that happen:

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "whitenoise.middleware.WhiteNoiseMiddleware",
    ...
]

WHITENOISE_INDEX_FILE = True

STATIC_URL = "/"
STATIC_ROOT = BASE_DIR / "frontend" / "build"

However, this isn't quite enough because the preact app uses preact-router, which uses pushState() and other code-splitting magic, so you might have a URL that users see, like this: https://myapp.example.com/that/thing/special, and there's nothing about that in any of the Django urls.py files. Nor is there any file called frontend/build/that/thing/special/index.html or something like that.
So for URLs like that, we have to take a gamble on the Django side and basically hope that the preact-router config knows how to deal with it. So, to make that happen with Whitenoise we need to write a custom middleware that looks like this:

from whitenoise.middleware import WhiteNoiseMiddleware


class CustomWhiteNoiseMiddleware(WhiteNoiseMiddleware):
    def process_request(self, request):
        if self.autorefresh:
            static_file = self.find_file(request.path_info)
        else:
            static_file = self.files.get(request.path_info)

            # These two lines are the magic.
            # Basically, the URL didn't lead to a file (e.g. `/manifest.json`);
            # it's either an API path or a custom browser path that only
            # makes sense within preact-router. If that's the case, we just don't
            # know, but we'll give the client-side preact-router code the benefit
            # of the doubt and let it through.
            if not static_file and not request.path_info.startswith("/api"):
                static_file = self.files.get("/")

        if static_file is not None:
            return self.serve(static_file, request)

And in settings.py this change:

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
-   "whitenoise.middleware.WhiteNoiseMiddleware",
+   "my_django_app.middleware.CustomWhiteNoiseMiddleware",
    ...
]

Now, all traffic goes through Django. Regular Django view functions and static assets are served as usual, and everything else falls back to frontend/build/index.html.

Heroku

Heroku tries to make everything so simple for you. You basically create the app (via the CLI or the Heroku web UI), and when you're ready you just do git push heroku master. However, that won't be enough because there's more to this than Python.

Unfortunately, I didn't take notes of my hair-pulling excruciating journey of trying to add buildpacks and hacks and Procfiles and custom buildpacks. Nothing seemed to work. Perhaps the answer was somewhere in this issue: "Support running an app from a subdirectory" but I just couldn't figure it out. I still find buildpacks confusing when it's beyond Hello World. Also, I didn't want to run Node as a service, I just wanted it as part of the "build process".

Docker to the rescue

Finally I get a chance to try "Deploying with Docker" in Heroku which is a relatively new feature. And the only thing that scared me was that now I need to write a heroku.yml file which was confusing because all I had was a Dockerfile. We'll get back to that in a minute!

So here's how I made a Dockerfile that mixes Python and Node:

FROM node:12 as frontend

COPY . /app
WORKDIR /app
RUN cd frontend && yarn install && yarn build


FROM python:3.8-slim

WORKDIR /app

RUN groupadd --gid 10001 app && useradd -g app --uid 10001 --shell /usr/sbin/nologin app
RUN chown app:app /tmp

RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y --no-install-recommends \
    gcc apt-transport-https python-dev

# Gotta try moving this to poetry instead!
COPY ./requirements.txt /app/requirements.txt
RUN pip install --upgrade --no-cache-dir -r requirements.txt

COPY . /app
COPY --from=frontend /app/frontend/build /app/frontend/build

USER app

ENV PORT=8000
EXPOSE $PORT

CMD uvicorn gitbusy.asgi:application --host 0.0.0.0 --port $PORT

If you're not familiar with it, the critical trick is on the first line, where the Node stage is named with as frontend. That gives me a thing I can then copy from into the Python image with COPY --from=frontend /app/frontend/build /app/frontend/build.

Now, at the very end, it starts a uvicorn server with all the static .js, index.html, favicon.ico etc. available to it, served ultimately through Whitenoise.

To run and build:

docker build . -t my_app
docker run -t -i --rm --env-file .env -p 8000:8000 my_app

Now, opening http://localhost:8000/ is a production grade app that mixes Python (runtime) and JavaScript (static).

Heroku + Docker

Heroku says to create a heroku.yml file, and that makes sense, but what didn't make sense is why I would add a cmd line in there when it's already in the Dockerfile. The solution is simple: omit it. Here's what my final heroku.yml file looks like:

build:
  docker:
    web: Dockerfile

Check in the heroku.yml file and git push heroku master and voila, it works!

To see a complete demo of all of this check out https://github.com/peterbe/gitbusy and https://gitbusy.herokuapp.com/

13 Dec 2019 4:55pm GMT

Programiz: Python Dictionary Comprehension

In this tutorial, we will learn about Python dictionary comprehension and how to use it with the help of examples.

13 Dec 2019 6:23am GMT

Kushal Das: Highest used usernames in break-in attempts to my servers 2019

list of usernames

A few days ago, I wrote about different IP addresses trying to break into my servers. Today, I looked into another server to find the user names most frequently used in the SSH break-in attempts.

I never knew that admin is such an important user name for Linux servers; I thought I would see root there. Also, why alex? I can understand the reason behind pi. If you want to find similar details, you can use the following command.

last -f /var/log/btmp
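To turn that raw log into a frequency count like the one in the image, a small Python sketch (mine, not from the post) can tally the first column of the output:

import subprocess
from collections import Counter

# `last -f /var/log/btmp` lists failed login attempts; it usually needs root.
output = subprocess.run(
    ["last", "-f", "/var/log/btmp"], capture_output=True, text=True
).stdout

counts = Counter(
    line.split()[0]
    for line in output.splitlines()
    if line.strip() and not line.startswith("btmp")  # skip the trailing summary line
)

for name, count in counts.most_common(10):
    print(f"{count:6d}  {name}")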

13 Dec 2019 6:08am GMT

Steve Dower: What makes Python a great language?

I know I'm far from the only person who has opined about this topic, but figured I'd take my turn.

A while ago I hinted on Twitter that I have Thoughts(tm) about the future of Python, and while this is not going to be that post, this is going to be important background for when I do share those thoughts.

If you came expecting a well researched article full of citations to peer-reviewed literature, you came to the wrong place. Similarly if you were hoping for unbiased and objective analysis. I'm not even going to link to external sources for definitions. This is literally just me on a soap box, and you can take it or leave it.

I'm also deliberately not talking about CPython the runtime, pip the package manager, venv the %PATH% manipulator, or PyPI the ecosystem. This post is about the Python language.

My hope is that you will get some ideas for thinking about why some programming languages feel better than others, even if you don't agree that Python feels better than most.

Need To Know

What makes Python a great language? It gets the need to know balance right.

When I use the term "need to know", I think of how the military uses the term. For many, "need to know" evokes thoughts of power imbalances, secrecy, and dominance-for-the-sake-of-dominance. But even in cases that may look like or actually be as bad as these, the intent is to achieve focus.

In a military organisation, every individual needs to make frequent life-or-death choices. The more time you spend making each choice, the more likely you are choosing death (specifically, your own). Having to factor in the full range of ethical factors into every decision is very inefficient.

Since no army wants to lose their own men, they delegate decision-making up through a series of ranks. By the time individuals are in the field, the biggest decisions are already made, and the soldier has a very narrow scope to make their own decisions. They can focus on exactly what they need to know, trusting that their superiors have taken into account anything else that they don't need to know.

Software libraries and abstractions are fundamentally the same. Another developer has taken the broader context into account, and has provided you - the end-developer - with only what you need to know. You get to focus on your work, trusting that the rest has been taken care of.

Memory management is probably the easiest example. Languages that decide how memory management is going to work (such as through a garbage collector) have taken that decision for you. You don't need to know. You get to use the time you would have been thinking about deallocation to focus on your actual task.

Does "need to know" ever fail? Of course it does. Sometimes you need more context in order to make a good decision. In a military organisation, there are conventions for requesting more information, ways to get promoted into positions with more context (and more complex decisions), and systems for refusing to follow orders (which mostly don't turn out so well for the person refusing, but hey, there's a system).

In software, "need to know" breaks down when you need some functionality that isn't explicitly exposed or documented, when you need to debug library or runtime code, or just deal with something not behaving as it claims it should. When these situations arise, not being able to incrementally increase what you know becomes a serious blockage.

A good balance of "need to know" will actively help you focus on getting your job done, while also providing the escape hatches necessary to handle the times you need to know more. Python gets this balance right.

Python's Need To Know levels

There are many levels of what you "need to know" to use Python.

At the lowest level, there's the basic syntax and most trivial semantics of assignment, attributes and function calls. These concepts, along with your project-specific context, are totally sufficient to write highly effective code.



Example source: matplotlib.org/gallery/statistics/histogram_features

The example at the link above (source) generates a histogram from a random distribution. By my count, there are two distinct words in it that are not specific to the task at hand ("import" and "as"), and the places they are used are essentially boiler-plate - they were likely copied by the author rather than created by the author. Everything else in the sample code relates to specifying the random distribution and creating the plot.

The most complex technical concept used is tuple unpacking, but all the user needs to know here is that they're getting multiple return values. The fact that there's really only a single return value and that the unpacking is performed by the assignment isn't necessary or useful knowledge.

Find a friend who's not a developer and try this experiment on them: show them x, y = get_points() and explain how it works, without ever mentioning that it's returning multiple values. Then point out that get_points() actually just returns two values, and x, y = is how you give them names. Turns out, they won't need to know how it works, just what it does.

As you introduce new functionality, you will see the same pattern repeated. for x in y: can (and should) be explained without mentioning iterators. open() can (and should) be explained without mentioning the io module. Class instantiation can (and should) be explained without mentioning __call__. And so on.

Python very effectively hides unnecessary details from those who just want to use it.

Think about basically any other language you've used. How many concepts do you need to express the example above?

Basically every other language is going to distinguish between declaring a variable and assigning a variable. Many are going to require nominal typing, where you need to know about types before you can do assignment. I can't think of many languages with fewer than the three concepts Python requires to generate a histogram from a random distribution with certain parameters (while also being readable from top to bottom - yes, I thought of LISP).

When Need To Know breaks down

But when need to know starts breaking down, Python has some of the best escape hatches in the entire software industry.

For starters, there are no truly private members. All the code you use in your Python program belongs to you. You can read everything, mutate everything, wrap everything, proxy everything, and nobody can stop you. Because it's your program. Duck typing makes a heroic appearance here, enabling new ways to overcome limiting abstractions that would be fundamentally impossible in other languages.

Should you make a habit of doing this? Of course not. You're using libraries for a reason - to help you focus on your own code by delegating "need to know" decisions to someone else. If you are going to regularly question and ignore their decisions, you completely spoil any advantage you may have received. But Python also allows you to rely on someone else's code without becoming a hostage to their choices.

Today, the Python ecosystem is almost entirely publicly-visible code. You don't need to know how it works, but you have the option to find out. And you can find out by following the same patterns that you're familiar with, rather than having to learn completely new skills. Reading Python code, or interactively inspecting live object graphs, are exactly what you were doing with your own code.

Compare Python to languages that tend towards sharing compiled, minified, packaged or obfuscated code, and you'll have a very different experience figuring out how things really (don't) work.

Compare Python to languages that emphasize privacy, information hiding, encapsulation and nominal typing, and you'll have a very different experience overcoming a broken or limiting abstraction.

Features you don't Need To Know about

In the earlier plot example, you didn't need to know about anything beyond assignment, attributes and function calls. How much more do you need to know to use Python? And who needs to know about these extra features?

As it turns out, there are millions of Python developers who don't need much more than assignment, attributes and function calls. Those of us in the 1% of the Python community who use Twitter and mailing lists like to talk endlessly about incredibly advanced features, such as assignment expressions and positional-only parameters, but the reality is that most Python users never need these and should never have to care.

When I teach introductory Python programming, my order of topics is roughly assignment, arithmetic, function calls (with imports thrown in to get to the interesting ones), built-in collection types, for loops, if statements, exception handling, and maybe some simple function definitions and decorators to wrap up. That should be enough for 90% of Python careers (syntactically - learning which functions to call and when is considerably more effort than learning the language).

The next level up is where things get interesting. Given the baseline knowledge above, Python's next level allows 10% of developers to provide the other 90% with significantly more functionality without changing what they need to know about the language. Those awesome libraries are written by people with deeper technical knowledge, but (can/should) expose only the simplest syntactic elements.

When I adopt classes, operator overloading, generators, custom collection types, type checking, and more, Python does not force my users to adopt them as well. When I expand my focus to include more complexity, I get to make decisions that preserve my users' need to know.

For example, my users know that calling something returns a value, and that returned values have attributes or methods. Whether the callable is a function or a class is irrelevant to them in Python. But compare with most other languages, where they would have to change their syntax if I changed a function into a class.

When I change a function to return a custom mapping type rather than a standard dictionary, it is irrelevant to them. In other languages, the return type is also specified explicitly in my user's code, and so even a compatible change might force them outside of what they really need to know.

If I return a number-like object rather than a built-in integer, my users don't need to know. Most languages don't have any way to replace primitive types, but Python provides all the functionality I need to create a truly number-like object.

Clearly the complexity ramps up quickly, even at this level. But unlike most other languages, complexity does not travel down. Just because some complexity is used within your codebase doesn't mean you will be forced into using it everywhere throughout the codebase.

The next level adds even more complexity, but its use also remains hidden behind normal syntax. Metaclasses, object factories, decorator implementations, slots, __getattribute__ and more allow a developer to fundamentally rewrite how the language works. There's maybe 1% of Python developers who ever need to be aware of these features, and fewer still who should use them, but the enabling power is unique among languages that also have such an approachable lowest level.

Even with this ridiculous level of customisation, the same need to know principles apply, and in a way that only Python can do it. Enums and data classes in Python are based on these features, but the knowledge required to use them is not the same as the knowledge required to create them. Users get to focus on what they're doing, assisted by trusting someone else to have made the right decision about what they need to know.

Summary and foreshadowing

People often cite Python's ecosystem as the main reason for its popularity. Others claim the language's simplicity or expressiveness is the primary reason.

I would argue that the Python language has an incredibly well-balanced sense of what developers need to know. Better than any other language I've used.

Most developers get to write incredibly functional and focused code with just a few syntax constructs. Some developers produce reusable functionality that is accessible through simple syntax. A few developers manage incredible complexity to provide powerful new semantics without leaving the language.

By actively helping library developers write complex code that is not complex to use, Python has been able to build an amazing ecosystem. And that amazing ecosystem is driving the popularity of the language.

But does our ecosystem have the longevity to maintain the language…? Does the Python language have the qualities to survive a changing ecosystem…? Will popular libraries continue to drive the popularity of the language, or does something need to change…?

(Contact me on Twitter for discussion.)

13 Dec 2019 4:00am GMT

12 Dec 2019

feedPlanet Python

Anwesha Das: Circuit Python at PyConf Hyderabad

Introduction

Coding in/with hardware has become my biggest stress buster ever since I was introduced to it at PyCon Pune 2017 by John. Coding with hardware provides a real-life interaction with the code you write. It nurtures creativity. I can do all of this while I learn something new. Now I look for occasions that offer me a chance to code in/with hardware. It gives me the chance to escape the muggle world.

Diwali and Circuit Python

Diwali is the festival of flowers, food, and lights. So why not take Diwali as an opportunity to create some magic? Since 2017 I have tried to light up my house with hardware that I operate with code. This year at PyCon US, all the participants got an Adafruit Circuit Playground Express. For Diwali 2019, I chose the Circuit Playground Express as my wand, and the spell is abracadabra Circuit Python.

Circuit Python in PyConf Hyderabad

A week before the conference, I got a call from the organizers of PyConf Hyderabad 2019 asking if I wanted to deliver a talk there. Initially, I thought of a talk titled "Limiting the legal risk of your open source project"; it had been selected for PyCon India 2019, which, unfortunately, I could not deliver due to my grandfather's demise. I went for it as the talk was ready. But an afterthought invoked an idea: why shouldn't I give a talk on Circuit Python and share what I did for Diwali with it? The organizers also liked it. (Of course, a talk about hardware is more interesting than a legal talk at a Python conference, right?) In hindsight, it meant much work within 6 days, but I dove in.

The day of the talk

The morning of the conference started with a lovely surprise: my work had been featured on a blog post by Adafruit. It is a fantastic feeling when your work gets recognition from the organization you admire the most. Thank you, Adafruit.

Mine was the 4th talk of the day, the one just before lunch. It was my first talk about hardware and hardware projects, and the first talk I was giving a year and a half after my battle with depression. So, in a word, I was nervous.

The talk

I started with why programming with/on hardware is essential for me, then moved on to what Circuit Python and the Circuit Playground Express are. I use mu as my editor for all my hardware projects, and this was no exception. I opened up the editor and started to code. Then came the time to showcase my work/projects with Circuit Python.

In the first example (one_led_red.py), I turned the first NeoPixel of the CPX red.

from adafruit_circuitplayground.express import cpx

cpx.pixels.brightness = 0.5

# COLOUR

RED = (255, 0, 0)
YELLOW = (255, 150, 0)
GREEN = (0, 255, 0)
CYAN = (0, 255, 255)
BLUE = (0, 0, 255)
PURPLE = (180, 0, 255)
WHITE = (255, 255, 255)
MAROON = (128,0,0)
VIOLET = (128, 0, 128)
TEAL = (0,128,128)
OFF = (0, 0, 0)


while True:
    cpx.pixels[0] = RED

On the CPX, we have 10 individually addressable LEDs. We import cpx from the adafruit_circuitplayground.express module so that it becomes easier to type. Since there are 10 NeoPixels on the board, we can access them as cpx.pixels. We can also set the brightness of the pixels on the board; instead of full, I am setting it to half, as the lights are really bright. Then I have defined a few colours.

I want my light to stay lit forever, so while True, and I am setting the first NeoPixel's colour to red.

all_led_rgb.py

One LED is a bit dull; let us lighten up the room. Here we light up all the NeoPixels of the CPX. With cpx.pixels.fill I fill them with red, green, and blue in turn, with a 1-second gap so you can see the colours change.

from adafruit_circuitplayground.express import cpx
import time

cpx.pixels.brightness = 0.5

#COLOUR

RED = (255, 0, 0)
YELLOW = (255, 150, 0)
GREEN = (0, 255, 0)
CYAN = (0, 255, 255)
BLUE = (0, 0, 255)
PURPLE = (180, 0, 255)
WHITE = (255, 255, 255)
MAROON = (128,0,0)
VIOLET = (128, 0, 128)
TEAL = (0,128,128)
OFF = (0, 0, 0)


while True:
    cpx.pixels.fill(RED)
    time.sleep(1)
    cpx.pixels.fill(GREEN)
    time.sleep(1)
    cpx.pixels.fill(BLUE)
    time.sleep(1)


For the third example, I showed the code where I light up the NeoPixels one by one with random colours and then turn them off one by one.

import time
from adafruit_circuitplayground.express import cpx
from random import randint

cpx.pixels.brightness = 0.5

#  COLOUR

RED = (255, 0, 0)
YELLOW = (255, 150, 0)
GREEN = (0, 255, 0)
CYAN = (0, 255, 255)
BLUE = (0, 0, 255)
PURPLE = (180, 0, 255)
WHITE = (255, 255, 255)
MAROON = (128,0,0)
VIOLET =(128,0,128)
TEAL = (0,128,128)
OFF = (0, 0, 0)

COLOURS = [RED, YELLOW, GREEN, CYAN, BLUE, PURPLE, WHITE, MAROON, VIOLET, TEAL]

i = 0
complete = False

while True:
    if complete:
        c = (0,0,0)
    else:
        r = randint(0,9)
        c = COLOURS[r]
    cpx.pixels[i] = c
    time.sleep(1)
    i += 1
    if i == 10:

        i = 0
        if complete:
            complete = False
        else:
            complete = True


Next was the code I used to light my Diwali Kandel. I needed the CPX to stay continuously lit, so I wrote this code.

import time
from adafruit_circuitplayground.express import cpx
from random import randint

cpx.pixels.brightness = 0.5

# COLOUR

RED = (255, 0, 0)
YELLOW = (255, 150, 0)
GREEN = (0, 255, 0)
CYAN = (0, 255, 255)
BLUE = (0, 0, 255)
PURPLE = (180, 0, 255)
WHITE = (255, 255, 255)
MAROON = (128,0,0)
VIOLET =(128,0,128)
TEAL = (0,128,128)
OFF = (0, 0, 0)

COLOURS= [RED, YELLOW, GREEN, CYAN, BLUE, PURPLE, WHITE, MAROON, VIOLET, TEAL]

i = 0


while True:
    r = randint(0,9)
    c = COLOURS[r]
    cpx.pixels[i] = c
    time.sleep(1)
    i += 1
    if i == 10:
        i = 0

It is customary to play a game during Diwali. I replaced card games with a game on the CPX: a guessing game.

import time
from adafruit_circuitplayground.express import cpx
from random import randint

cpx.pixels.brightness = 0.5

# COLOUR

RED = (255, 0, 0)
YELLOW = (255, 150, 0)
GREEN = (0, 255, 0)
CYAN = (0, 255, 255)
BLUE = (0, 0, 255)
PURPLE = (180, 0, 255)
WHITE = (255, 255, 255)
MAROON = (128,0,0)
VIOLET =(128,0,128)
TEAL = (0,128,128)
OFF = (0, 0, 0)

COLOURS = [RED, YELLOW, GREEN, CYAN, BLUE, PURPLE, WHITE, MAROON, VIOLET, TEAL]

i = 0


while True:
    if cpx.button_a:
        r = randint(0,9)
        c = COLOURS[r]
        cpx.pixels[i] = c
        time.sleep(0.5)
        i += 1
    elif cpx.button_b:
        cpx.pixels.fill( (0,0,0))
        i = 0

    if i == 10:
        i = 0

What is Diwali without music? Circuit Python and Adafruit have options for that also. I demonstrated the Adafruit NeoTrellis M4 for creating some beats.
It is a backlit keypad driver system. We can use the Adafruit NeoTrellis with

The Adafruit module enables one to write Python code controlling the NeoPixels and reading button presses on a single Trellis board, or on a matrix of up to eight Trellis boards.

This board can be used with any

Coloring the NeoPixel strip

At PyCon US 2019, while discussing projects from my previous Diwali with Nina, she got excited and gifted me a NeoPixel strip. And as the lady commanded, I lit my Diwali rangoli with the NeoPixel strip, connecting it to the CPX with alligator clips:

and ran this code:

import time
from adafruit_circuitplayground.express import cpx
import board
import neopixel
pixels = neopixel.NeoPixel(board.D6, 30)
red = (255,0,0)
green = (0,255,0)
blue = (0,0,255)


while True:
    for i in range(0,30):
        pixels[i] = red
        time.sleep(0.1)
    for i in range(29,-1, -1):
        pixels[i] = green
        time.sleep(0.1)
        

Finally, it was time to light the Diwali diyas. I placed 3 diyas (coloured by my toddler), fixed the NeoPixel strip by taping it onto the diyas, and then ran this code:

import time
from adafruit_circuitplayground.express import cpx
import board
import neopixel
pixels = neopixel.NeoPixel(board.D6, 30)
red = (255,0,0)
green = (0,255,0)
blue = (0,0,255)

# Light pairs of LEDs on the 30-pixel strip (the strip, not the CPX's own 10 pixels)
pixels[1] = red
pixels[2] = red
pixels[9] = green
pixels[10] = green
pixels[18] = red
pixels[19] = red
pixels[27] = green
pixels[26] = green
time.sleep(30)


I did not show this code during the talk itself due to the time constraint.

The slides:

My slides are public at https://slides.com/dascommunity/my-diwali-with-circuit-python#/

My Gratitude:

I would like to show my deepest gratitude to Nina Zakharenko, Kattni, and Carol Willing for the hardware, Scott Shawcroft for guiding me into Circuit Python, Nicholas Tollervey for giving us mu, and John Hawley for dragging me into hardware. Moreover, thank you everyone for helping me, supporting me, standing by me, and inspiring me when I broke down.

Conclusion

I really enjoyed giving the talk on/with/in Circuit Python. I will be here, coding simple, fun, useless and creative stuff.

12 Dec 2019 6:46pm GMT

testmon: New in testmon 1.0.0

Testmon in editor

Significant portions of testmon have been rewritten for v 1.0.1. Although the UI is mostly the same, there are some significant differences.

End of python 2.7 support

Testmon requires python 3.6 or higher and pytest 5 or higher.

No subprocess measurement

--testmon-singleprocess was removed (because it's the only option now)

It's not easy to automate the setup of subprocess measurement and therefore to automate testing it. We're not sure if it worked outside of our environment and if anybody used it. If you miss subprocess measurement, please let us know.

Renames

Playing nice with -m, -k and all other selectors

Old versions of testmon got confused if you deselected some tests through other means than testmon itself. If there are bugs in this they will be squashed with priority.

Quickest tests first

Testmon reorders the test files according to their average tests-per-second rate so that the quickest tests go first, but the order within a test file or test class is not changed.

New algorithm

Last but not least, we developed a new algorithm and database schema for selecting tests affected by changes. It will allow us to add new functionality and continue to improve testmon. If there are not many changes, determining affected tests should take hundreds of milliseconds at most. If you have to wait for testmon to find out nothing has changed, it's a bug; please report it. Whitespace and comments are now also taken into account when detecting changes. We hope to improve this in the future and make them insignificant.

12 Dec 2019 2:16pm GMT

Stack Abuse: Merge Sort in Python

Introduction

Merge Sort is one of the most famous sorting algorithms. If you're studying Computer Science, Merge Sort, alongside Quick Sort is likely the first efficient, general-purpose sorting algorithm you have heard of. It is also a classic example of a divide-and-conquer category of algorithms.

Merge Sort

The way Merge Sort works is:

An initial array is divided into two roughly equal parts. If the array has an odd number of elements, one of those "halves" is one element larger than the other.

The subarrays are divided over and over again into halves until you end up with arrays that have only one element each.

Then you combine the pairs of one-element arrays into two-element arrays, sorting them in the process. Then these sorted pairs are merged into four-element arrays, and so on, until you end up with the initial array sorted.

Here's a visualization of Merge Sort:


As you can see, the fact that the array couldn't be divided into equal halves isn't a problem, the 3 just "waits" until the sorting begins.

There are two main ways we can implement the Merge Sort algorithm, one is using a top-down approach like in the example above, which is how Merge Sort is most often introduced.

The other approach, i.e. bottom-up, works in the opposite direction, without recursion (works iteratively) - if our array has N elements we divide it into N subarrays of one element and sort pairs of adjacent one-element arrays, then sort the adjacent pairs of two-element arrays and so on.

Note: The bottom-up approach provides an interesting optimization which we'll discuss later. We'll be implementing the top-down approach as it's simpler and more intuitive, coupled with the fact that there's no real difference in time complexity between the two without specific optimizations.

The main part of both these approaches is how we combine (merge) the two smaller arrays into a larger array. This is done fairly intuitively; let's say we examine the last step in our previous example, where we have the arrays A = {2, 4, 7, 8} and B = {1, 3, 11}:

The first thing we do is look at the first element of both arrays. We find the one that's smaller, in our case that's 1, so that's the first element of our sorted array, and we move forward in the B array:

Then we look at the next pair of elements 2 and 3; 2 is smaller so we put it in our sorted array and move forward in array A. Of course, we don't move forward in array B and we keep our pointer at 3 for future comparisons:

Using the same logic, we move through the rest and end up with an array of {1, 2, 3, 4, 7, 8, 11}.

The two special cases that can occur are:

Keep in mind that we can sort however we want - this example sorts integers in ascending order but we can just as easily sort in descending order, or sort custom objects.

Implementation

We'll be implementing Merge Sort on two types of collections - on arrays of integers (typically used to introduce sorting) and on custom objects (a more practical and realistic scenario).

We'll implement the Merge Sort algorithm using the top-down approach. The algorithm doesn't look very "pretty" and can be confusing, so we'll go through each step in detail.

Sorting Arrays

Let's start with the easy part. The base idea of the algorithm is to divide (sub)arrays into halves and sort them recursively. We want to keep doing this as much as possible, i.e. until we end up with subarrays that have only one element:

def merge_sort(array, left_index, right_index):
    if left_index >= right_index:
        return

    middle = (left_index + right_index)//2
    merge_sort(array, left_index, middle)
    merge_sort(array, middle + 1, right_index)
    merge(array, left_index, right_index, middle)

By calling the merge method last, we make sure that all the divisions will happen before we start the sorting. We use the // operator to be explicit about the fact that we want integer values for our indices.

The next step is the actual merging, which has to handle a few steps and scenarios.

With our requirements laid out, let's go ahead and define a merge() function:

def merge(array, left_index, right_index, middle):
    # Make copies of both arrays we're trying to merge

    # The second parameter is non-inclusive, so we have to increase by 1
    left_copy = array[left_index:middle + 1]
    right_copy = array[middle+1:right_index+1]

    # Initial values for variables that we use to keep
    # track of where we are in each array
    left_copy_index = 0
    right_copy_index = 0
    sorted_index = left_index

    # Go through both copies until we run out of elements in one
    while left_copy_index < len(left_copy) and right_copy_index < len(right_copy):

        # If our left_copy has the smaller element, put it in the sorted
        # part and then move forward in left_copy (by increasing the pointer)
        if left_copy[left_copy_index] <= right_copy[right_copy_index]:
            array[sorted_index] = left_copy[left_copy_index]
            left_copy_index = left_copy_index + 1
        # Opposite from above
        else:
            array[sorted_index] = right_copy[right_copy_index]
            right_copy_index = right_copy_index + 1

        # Regardless of where we got our element from
        # move forward in the sorted part
        sorted_index = sorted_index + 1

    # We ran out of elements either in left_copy or right_copy
    # so we will go through the remaining elements and add them
    while left_copy_index < len(left_copy):
        array[sorted_index] = left_copy[left_copy_index]
        left_copy_index = left_copy_index + 1
        sorted_index = sorted_index + 1

    while right_copy_index < len(right_copy):
        array[sorted_index] = right_copy[right_copy_index]
        right_copy_index = right_copy_index + 1
        sorted_index = sorted_index + 1

Now let's test our program out:

array = [33, 42, 9, 37, 8, 47, 5, 29, 49, 31, 4, 48, 16, 22, 26]
merge_sort(array, 0, len(array) -1)
print(array)

And the output is:

[4, 5, 8, 9, 16, 22, 26, 29, 31, 33, 37, 42, 47, 48, 49]

Sorting Custom Objects

Now that we have the basic algorithm down we can take a look at how to sort custom classes. We can override the __eq__, __le__, __ge__ and other operators as needed for this.

This lets us use the same algorithm as above but limits us to only one way of sorting our custom objects, which in most cases isn't what we want. A better idea is to make the algorithm itself more versatile, and pass a comparison function to it instead.

First we'll implement a custom class, Car and add a few fields to it:

class Car:
    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = year

    def __str__(self):
        return str.format("Make: {}, Model: {}, Year: {}", self.make, self.model, self.year)

Then we'll make a few changes to our Merge Sort methods. The easiest way to achieve what we want is by using lambda functions. You can see that we only added an extra parameter and changed the method calls accordingly, and only one other line of code to make this algorithm a lot more versatile:

def merge(array, left_index, right_index, middle, comparison_function):
    left_copy = array[left_index:middle + 1]
    right_copy = array[middle+1:right_index+1]

    left_copy_index = 0
    right_copy_index = 0
    sorted_index = left_index

    while left_copy_index < len(left_copy) and right_copy_index < len(right_copy):

        # We use the comparison_function instead of a simple comparison operator
        if comparison_function(left_copy[left_copy_index], right_copy[right_copy_index]):
            array[sorted_index] = left_copy[left_copy_index]
            left_copy_index = left_copy_index + 1
        else:
            array[sorted_index] = right_copy[right_copy_index]
            right_copy_index = right_copy_index + 1

        sorted_index = sorted_index + 1

    while left_copy_index < len(left_copy):
        array[sorted_index] = left_copy[left_copy_index]
        left_copy_index = left_copy_index + 1
        sorted_index = sorted_index + 1

    while right_copy_index < len(right_copy):
        array[sorted_index] = right_copy[right_copy_index]
        right_copy_index = right_copy_index + 1
        sorted_index = sorted_index + 1


def merge_sort(array, left_index, right_index, comparison_function):
    if left_index >= right_index:
        return

    middle = (left_index + right_index)//2
    merge_sort(array, left_index, middle, comparison_function)
    merge_sort(array, middle + 1, right_index, comparison_function)
    merge(array, left_index, right_index, middle, comparison_function)

Let's test our modified algorithm on a few Car instances:

car1 = Car("Alfa Romeo", "33 SportWagon", 1988)
car2 = Car("Chevrolet", "Cruze Hatchback", 2011)
car3 = Car("Corvette", "C6 Couple", 2004)
car4 = Car("Cadillac", "Seville Sedan", 1995)

array = [car1, car2, car3, car4]

merge_sort(array, 0, len(array) -1, lambda carA, carB: carA.year < carB.year)

print("Cars sorted by year:")
for car in array:
    print(car)

print()
merge_sort(array, 0, len(array) -1, lambda carA, carB: carA.make < carB.make)
print("Cars sorted by make:")
for car in array:
    print(car)

We get the output:

Cars sorted by year:
Make: Alfa Romeo, Model: 33 SportWagon, Year: 1988
Make: Cadillac, Model: Seville Sedan, Year: 1995
Make: Corvette, Model: C6 Couple, Year: 2004
Make: Chevrolet, Model: Cruze Hatchback, Year: 2011

Cars sorted by make:
Make: Alfa Romeo, Model: 33 SportWagon, Year: 1988
Make: Cadillac, Model: Seville Sedan, Year: 1995
Make: Chevrolet, Model: Cruze Hatchback, Year: 2011
Make: Corvette, Model: C6 Couple, Year: 2004

Optimization

Let's elaborate on the difference between top-down and bottom-up Merge Sort now. Bottom-up works like the second half of the top-down approach: instead of recursively calling the sort on halved subarrays, we iteratively sort adjacent subarrays.

One thing we can do to improve this algorithm is to consider sorted chunks instead of single elements before breaking the array down.

What this means is that, given an array such as {4, 8, 7, 2, 11, 1, 3}, instead of breaking it down into {4}, {8}, {7}, {2}, {11}, {1} ,{3} - it's divided into subarrays which may already be sorted: {4,8}, {7}, {2,11}, {1,3}, and then sorting them.

With real life data we often have a lot of these already sorted subarrays that can noticeably shorten the execution time of Merge Sort.

Another thing to consider with Merge Sort, particularly the top-down version is multi-threading. Merge Sort is convenient for this since each half can be sorted independently of its pair. The only thing that we need to make sure of is that we're done sorting each half before we merge them.

Merge Sort is however relatively inefficient (both time and space) when it comes to smaller arrays, and is often optimized by stopping when we reach an array of ~7 elements, instead of going down to arrays with one element, and calling Insertion Sort to sort them instead, before merging into a larger array.

This is because Insertion Sort works really well with small and/or nearly sorted arrays.

Conclusion

Merge Sort is an efficient, general-purpose sorting algorithm. Its main advantage is the reliable runtime of the algorithm and its efficiency when sorting large arrays. Unlike Quick Sort, it doesn't depend on any unfortunate decisions that lead to bad runtimes.

One of the main drawbacks is the additional memory that Merge Sort uses to store the temporary copies of arrays before merging them. However, Merge Sort is an excellent, intuitive example to introduce future Software Engineers to the divide-and-conquer approach to creating algorithms.

We've implemented Merge Sort both on simple integer arrays and on custom objects via a lambda function used for comparison. In the end, possible optimizations for both approaches were briefly discussed.

12 Dec 2019 1:46pm GMT

Stack Abuse: Merge Sort in Python

Introduction

Merge Sort is one of the most famous sorting algorithms. If you're studying Computer Science, Merge Sort, alongside Quick Sort is likely the first efficient, general-purpose sorting algorithm you have heard of. It is also a classic example of a divide-and-conquer category of algorithms.

Merge Sort

The way Merge Sort works is:

An initial array is divided into two roughly equal parts. If the array has an odd number of elements, one of those "halves" is by one element larger than the other.

The subarrays are divided over and over again into halves until you end up with arrays that have only one element each.

Then you combine the pairs of one-element arrays into two-element arrays, soring them in the process. Then these sorted pairs are merged into four-element arrays, and so on until you end up with the initial array sorted.

Here's a visualization of Merge Sort:

alt

As you can see, the fact that the array couldn't be divided into equal halves isn't a problem, the 3 just "waits" until the sorting begins.

There are two main ways we can implement the Merge Sort algorithm, one is using a top-down approach like in the example above, which is how Merge Sort is most often introduced.

The other approach, i.e. bottom-up, works in the opposite direction, without recursion (works iteratively) - if our array has N elements we divide it into N subarrays of one element and sort pairs of adjacent one-element arrays, then sort the adjacent pairs of two-element arrays and so on.

Note: The bottom-up approach provides an interesting optimization which we'll discuss later. We'll be implementing the top-down approach as it's simpler and more intuitive couples with the fact that there's no real difference between the time complexity between them without specific optimizations.

The main part of both of these approaches is how we combine (merge) two smaller sorted arrays into a larger one. This is done fairly intuitively; let's examine the last step in our previous example, where we have the arrays A = {2, 4, 7, 8} and B = {1, 3, 11}:

The first thing we do is look at the first element of both arrays. We find the one that's smaller - in our case that's 1 - so that's the first element of our sorted array, and we move forward in array B.

Then we look at the next pair of elements, 2 and 3; 2 is smaller, so we put it in our sorted array and move forward in array A. Of course, we don't move forward in array B, and we keep our pointer at 3 for future comparisons.

Using the same logic we move through the rest and end up with an array of {1, 2, 3, 4, 7, 8, 11}.

The two special cases that can occur are:

Both of the elements being compared are equal - we can take either one first; the implementation below takes the one from the left copy.

We run out of elements in one of the subarrays - the remaining elements of the other subarray are simply copied over, since they're already sorted.

Keep in mind that we can sort however we want - this example sorts integers in ascending order but we can just as easily sort in descending order, or sort custom objects.

Implementation

We'll be implementing Merge Sort on two types of collections - on arrays of integers (typically used to introduce sorting) and on custom objects (a more practical and realistic scenario).

We'll implement the Merge Sort algorithm using the top-down approach. The algorithm doesn't look very "pretty" and can be confusing, so we'll go through each step in detail.

Sorting Arrays

Let's start with the easy part. The basic idea of the algorithm is to divide (sub)arrays into halves and sort them recursively. We want to keep doing this until we end up with subarrays that have only one element:

def merge_sort(array, left_index, right_index):
    if left_index >= right_index:
        return

    middle = (left_index + right_index)//2
    merge_sort(array, left_index, middle)
    merge_sort(array, middle + 1, right_index)
    merge(array, left_index, right_index, middle)

We stop recursing once a subarray has one element or fewer, i.e. when left_index >= right_index, since such a subarray is already sorted. By calling the merge method last, we make sure that all of the divisions happen before we start merging. We use the // operator to be explicit about the fact that we want integer values for our indices.

The next step is the actual merging, which boils down to a few steps and scenarios:

Make copies of the two subarrays we want to merge.

Keep track of where we are in each copy, as well as where the next sorted element should go in the original array.

Repeatedly compare the current elements of the two copies and write the smaller one back into the original array.

Once one copy runs out of elements, copy over whatever remains in the other one.

With our requirements laid out, let's go ahead and define a merge() function:

def merge(array, left_index, right_index, middle):
    # Make copies of both arrays we're trying to merge

    # The second index of a slice is non-inclusive, so we have to increase it by 1
    left_copy = array[left_index:middle + 1]
    right_copy = array[middle+1:right_index+1]

    # Initial values for variables that we use to keep
    # track of where we are in each array
    left_copy_index = 0
    right_copy_index = 0
    sorted_index = left_index

    # Go through both copies until we run out of elements in one
    while left_copy_index < len(left_copy) and right_copy_index < len(right_copy):

        # If our left_copy has the smaller element, put it in the sorted
        # part and then move forward in left_copy (by increasing the pointer)
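        # (using <= rather than < keeps equal elements in their original
        # relative order, which makes the sort stable)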
        if left_copy[left_copy_index] <= right_copy[right_copy_index]:
            array[sorted_index] = left_copy[left_copy_index]
            left_copy_index = left_copy_index + 1
        # Opposite from above
        else:
            array[sorted_index] = right_copy[right_copy_index]
            right_copy_index = right_copy_index + 1

        # Regardless of where we got our element from
        # move forward in the sorted part
        sorted_index = sorted_index + 1

    # We ran out of elements either in left_copy or right_copy
    # so we will go through the remaining elements and add them
    while left_copy_index < len(left_copy):
        array[sorted_index] = left_copy[left_copy_index]
        left_copy_index = left_copy_index + 1
        sorted_index = sorted_index + 1

    while right_copy_index < len(right_copy):
        array[sorted_index] = right_copy[right_copy_index]
        right_copy_index = right_copy_index + 1
        sorted_index = sorted_index + 1

Now let's test our program out:

array = [33, 42, 9, 37, 8, 47, 5, 29, 49, 31, 4, 48, 16, 22, 26]
merge_sort(array, 0, len(array) -1)
print(array)

And the output is:

[4, 5, 8, 9, 16, 22, 26, 29, 31, 33, 37, 42, 47, 48, 49]
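
As a quick additional sanity check (not part of the original snippet), we can compare the result against Python's built-in sorted() on some random data:

import random

data = [random.randint(0, 100) for _ in range(1000)]
expected = sorted(data)

merge_sort(data, 0, len(data) - 1)
assert data == expected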

Sorting Custom Objects

Now that we have the basic algorithm down, we can take a look at how to sort custom classes. One option is to override comparison methods such as __eq__, __le__, and __ge__ on the class itself.

This lets us use the same algorithm as above but limits us to only one way of sorting our custom objects, which in most cases isn't what we want. A better idea is to make the algorithm itself more versatile, and pass a comparison function to it instead.

First we'll implement a custom class, Car, and add a few fields to it:

class Car:
    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = year

    def __str__(self):
        return "Make: {}, Model: {}, Year: {}".format(self.make, self.model, self.year)

Then we'll make a few changes to our Merge Sort functions. The easiest way to achieve what we want is to pass in a comparison function - in our case, a lambda. You can see that we only added an extra parameter, changed the calls accordingly, and changed one other line of code to make this algorithm a lot more versatile:

def merge(array, left_index, right_index, middle, comparison_function):
    left_copy = array[left_index:middle + 1]
    right_copy = array[middle+1:right_index+1]

    left_copy_index = 0
    right_copy_index = 0
    sorted_index = left_index

    while left_copy_index < len(left_copy) and right_copy_index < len(right_copy):

        # We use the comparison_function instead of a simple comparison operator
        if comparison_function(left_copy[left_copy_index], right_copy[right_copy_index]):
            array[sorted_index] = left_copy[left_copy_index]
            left_copy_index = left_copy_index + 1
        else:
            array[sorted_index] = right_copy[right_copy_index]
            right_copy_index = right_copy_index + 1

        sorted_index = sorted_index + 1

    while left_copy_index < len(left_copy):
        array[sorted_index] = left_copy[left_copy_index]
        left_copy_index = left_copy_index + 1
        sorted_index = sorted_index + 1

    while right_copy_index < len(right_copy):
        array[sorted_index] = right_copy[right_copy_index]
        right_copy_index = right_copy_index + 1
        sorted_index = sorted_index + 1


def merge_sort(array, left_index, right_index, comparison_function):
    if left_index >= right_index:
        return

    middle = (left_index + right_index)//2
    merge_sort(array, left_index, middle, comparison_function)
    merge_sort(array, middle + 1, right_index, comparison_function)
    merge(array, left_index, right_index, middle, comparison_function)

Let's test out our modified algorithm on a few Car instances:

car1 = Car("Alfa Romeo", "33 SportWagon", 1988)
car2 = Car("Chevrolet", "Cruze Hatchback", 2011)
car3 = Car("Corvette", "C6 Couple", 2004)
car4 = Car("Cadillac", "Seville Sedan", 1995)

array = [car1, car2, car3, car4]

merge_sort(array, 0, len(array) -1, lambda carA, carB: carA.year < carB.year)

print("Cars sorted by year:")
for car in array:
    print(car)

print()
merge_sort(array, 0, len(array) -1, lambda carA, carB: carA.make < carB.make)
print("Cars sorted by make:")
for car in array:
    print(car)

We get the output:

Cars sorted by year:
Make: Alfa Romeo, Model: 33 SportWagon, Year: 1988
Make: Cadillac, Model: Seville Sedan, Year: 1995
Make: Corvette, Model: C6 Couple, Year: 2004
Make: Chevrolet, Model: Cruze Hatchback, Year: 2011

Cars sorted by make:
Make: Alfa Romeo, Model: 33 SportWagon, Year: 1988
Make: Cadillac, Model: Seville Sedan, Year: 1995
Make: Chevrolet, Model: Cruze Hatchback, Year: 2011
Make: Corvette, Model: C6 Couple, Year: 2004
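
Since the ordering now lives entirely in the comparison function we pass in, sorting in descending order only requires flipping that comparison - for example, to get the newest cars first:

merge_sort(array, 0, len(array) - 1, lambda carA, carB: carA.year > carB.year)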

Optimization

Let's elaborate on the difference between top-down and bottom-up Merge Sort now. Bottom-up works like the second half of the top-down approach: instead of recursively splitting the array and then merging on the way back up, we iteratively merge adjacent subarrays of growing size.

One thing we can do to improve this algorithm is to start from chunks that are already sorted, instead of breaking the array all the way down to single elements.

What this means is that, given an array such as {4, 8, 7, 2, 11, 1, 3}, instead of breaking it down into {4}, {8}, {7}, {2}, {11}, {1}, {3}, it's divided into subarrays that are already sorted - {4, 8}, {7}, {2, 11}, {1, 3} - which are then merged.

With real-life data we often have a lot of these already-sorted subarrays, which can noticeably shorten the execution time of Merge Sort.
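
To make this concrete, here is a minimal sketch of such a run-based, bottom-up variant. It is not part of the implementation above, and the names find_runs, merge_two and natural_merge_sort are purely illustrative:

def find_runs(array):
    # Split the array into maximal already-sorted runs, e.g.
    # [4, 8, 7, 2, 11, 1, 3] -> [[4, 8], [7], [2, 11], [1, 3]]
    runs = []
    start = 0
    for i in range(1, len(array)):
        if array[i] < array[i - 1]:
            runs.append(array[start:i])
            start = i
    runs.append(array[start:])
    return runs

def merge_two(left, right):
    # Merge two sorted lists into a new sorted list
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def natural_merge_sort(array):
    # Repeatedly merge adjacent runs until only one (fully sorted) run remains
    runs = find_runs(array)
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs) - 1, 2):
            merged.append(merge_two(runs[k], runs[k + 1]))
        if len(runs) % 2 == 1:
            merged.append(runs[-1])
        runs = merged
    return runs[0]

print(natural_merge_sort([4, 8, 7, 2, 11, 1, 3]))  # [1, 2, 3, 4, 7, 8, 11]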

Another thing to consider with Merge Sort, particularly the top-down version, is multi-threading. Merge Sort is convenient for this since each half can be sorted independently of its pair. The only thing we need to make sure of is that we're done sorting each half before we merge them.
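
As a rough sketch of that idea (again, not part of the implementation above): in CPython, threads won't actually speed up this CPU-bound work because of the GIL, so the example below hands each half to a separate process instead. It reuses the three-argument merge_sort from the integer-array section via a made-up helper called sort_half:

import heapq
from concurrent.futures import ProcessPoolExecutor

def sort_half(chunk):
    # Made-up helper: sorts a copy of one half with the merge_sort defined
    # earlier and returns it to the parent process
    merge_sort(chunk, 0, len(chunk) - 1)
    return chunk

def parallel_merge_sort(array):
    # Sort each half in its own process, then merge the two sorted halves
    middle = len(array) // 2
    with ProcessPoolExecutor(max_workers=2) as executor:
        left_future = executor.submit(sort_half, array[:middle])
        right_future = executor.submit(sort_half, array[middle:])
        left = left_future.result()
        right = right_future.result()
    return list(heapq.merge(left, right))

Unlike the in-place version, this returns a new sorted list; it also needs to run under an if __name__ == "__main__": guard on platforms that spawn worker processes, and it only pays off for fairly large inputs, since each process receives a copy of its half.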

Merge Sort is, however, relatively inefficient (in both time and space) when it comes to smaller arrays. It's therefore often optimized by stopping once we reach a subarray of roughly 7 elements, instead of going all the way down to single-element arrays, and calling Insertion Sort on it before merging into a larger array.

This is because Insertion Sort works really well with small and/or nearly sorted arrays.
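
As an illustration of that optimization - a sketch, not the article's implementation - we can add a cutoff below which a subarray is handed to Insertion Sort. The names insertion_sort and hybrid_merge_sort, and the cutoff value of 7, are just for this example; merge() is the function from the integer-array implementation above:

def insertion_sort(array, left_index, right_index):
    # In-place insertion sort of array[left_index..right_index] (inclusive)
    for i in range(left_index + 1, right_index + 1):
        key = array[i]
        j = i - 1
        while j >= left_index and array[j] > key:
            array[j + 1] = array[j]
            j -= 1
        array[j + 1] = key

def hybrid_merge_sort(array, left_index, right_index, cutoff=7):
    # Below the cutoff, hand the subarray to Insertion Sort instead of recursing further
    if right_index - left_index + 1 <= cutoff:
        insertion_sort(array, left_index, right_index)
        return

    middle = (left_index + right_index) // 2
    hybrid_merge_sort(array, left_index, middle, cutoff)
    hybrid_merge_sort(array, middle + 1, right_index, cutoff)
    merge(array, left_index, right_index, middle)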

Conclusion

Merge Sort is an efficient, general-purpose sorting algorithm. Its main advantages are its reliable runtime and its efficiency when sorting large arrays. Unlike Quick Sort, it doesn't depend on any unfortunate decisions that lead to bad runtimes.

One of the main drawbacks is the additional memory that Merge Sort uses to store the temporary copies of arrays before merging them. However, Merge Sort is an excellent, intuitive example to introduce future Software Engineers to the divide-and-conquer approach to creating algorithms.

We've implemented Merge Sort both on simple integer arrays and on custom objects via a lambda function used for comparison. In the end, possible optimizations for both approaches were briefly discussed.

12 Dec 2019 1:46pm GMT

Python Bytes: #160 Your JSON shall be streamed

12 Dec 2019 8:00am GMT

Kushal Das: Updates on Unoon in December 2019

This Saturday evening, I sat down with the Unoon project after a few weeks; I had kept running it continuously, but had not resumed development. This time Bhavin also joined me. Together, we fixed an issue with the location of the whitelist files, and Unoon now also has a database (using SQLite) which stores all the historical process and connection information. In the future, we will provide some way to query this information.

As usual, we learned many new things about different Linux processes while doing this development. One of the important ones is about running podman processes and how the user ID maps to the real system. Bhavin added a patch that fixes a previously known crash caused by a missing user name. Now, Unoon shows the real user ID when it cannot find the username in the /etc/passwd file.

You can read about Unoon more in my previous blog post.

12 Dec 2019 3:44am GMT

11 Dec 2019

feedPlanet Python

Python Insider: Python 3.7.6rc1 and 3.6.10rc1 are now available for testing

Python 3.7.6rc1 and 3.6.10rc1 are now available. 3.7.6rc1 is the release preview of the next maintenance release of Python 3.7; 3.6.10rc1 is the release preview of the next security-fix release of Python 3.6. Assuming no critical problems are found prior to 2019-12-18, no code changes are planned between these release candidates and the final releases. These release candidates are intended to give you the opportunity to test the new security and bug fixes in 3.7.6 and security fixes in 3.6.10. While we strive to not introduce any incompatibilities in new maintenance and security releases, we encourage you to test your projects and report issues found to bugs.python.org as soon as possible. Please keep in mind that these are preview releases and, thus, their use is not recommended for production environments.

You can find the release files, a link to their changelogs, and more information here:

https://www.python.org/downloads/release/python-376rc1/
https://www.python.org/downloads/release/python-3610rc1/

11 Dec 2019 6:36pm GMT

10 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: King Willams Town Station

Yesterday morning I had to go to the station in KWT to pick up our reserved bus tickets for the Christmas holidays in Cape Town. The station itself has been without train service since December for cost reasons - but Translux and co., the long-distance buses, have their offices there.






© benste CC NC SA

10 Nov 2011 10:57am GMT

09 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein

Nobody is worried about something like this - you just drive straight through by car, and in the city - near Gnobie - "nah, it only gets dangerous once the fire brigade is there" - 30 minutes later, on the way back, the fire brigade was there.




© benste CC NC SA

09 Nov 2011 8:25pm GMT

08 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Brai Party

Brai = a barbecue evening or similar.

They'd love to be technicians, patching up their SpeakOn / jack plug splitters...

The ladies - the "Mamas" of the settlement - during the official opening speech

Even though fewer people came than expected: loud music and lots of people ...

And of course a fire with real wood for grilling.

© benste CC NC SA

08 Nov 2011 2:30pm GMT

07 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Lumanyano Primary

One of our missions was bringing Katja's Linux Server back to her room. While doing that we saw her new decoration.

Björn and Simphiwe carried the PC to Katja's school


© benste CC NC SA

07 Nov 2011 2:00pm GMT

06 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Nelisa Haircut

Today I went with Björn to Needs Camp to visit Katja's guest family for a special party. First of all we visited some friends of Nelisa - yeah, the one I'm working with in Quigney, Katja's guest father's sister - who gave her a haircut.

African women usually get their hair done by arranging extensions, rather than just cutting some hair like Europeans do.

In between she looked like this...

And then she was done - looks amazing considering the amount of hair she had last week, doesn't it?

© benste CC NC SA

06 Nov 2011 7:45pm GMT

05 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: My Saturday

Somehow it occurred to me today that I need to restructure my blog posts a bit - if I only ever reported on new places, I would have to be on a round trip. So here are a few things from my everyday life today.

First of all, Saturday counts as a day off, at least for us volunteers.

This weekend only Rommel and I are on the farm - Katja and Björn are now at their placements, and my housemates Kyle and Jonathan are at home in Grahamstown, as is Sipho, who lives in Dimbaza. Robin, Rommel's wife, has been in Woodie Cape since Thursday to take care of a few things there. Anyway, this morning we first treated ourselves to a shared Weetbix/muesli breakfast and then set off for East London. Two things were on the checklist - Vodacom and Ethienne (the estate agent) - plus bringing the missing things to NeedsCamp on the way back.

Just after setting off on the dirt road we realized that we hadn't packed the things for NeedsCamp and Ethienne, but did have the pump for the water supply in the car.

So in East London we first drove to Farmerama - no, not the online game Farmville, but a shop with all sorts of things for a farm - in Berea, a northern part of town.

At Farmerama we got advice on a quick-release coupling that should make life with the pump easier, and we also dropped off a lighter pump for repair, so that it isn't such a big effort every time the water runs out again.

Fego Caffé is in the Hemmingways Mall; there we had to get the PIN and PUK of one of our data SIM cards, because two digits had unfortunately been mixed up when entering the PIN. In any case, shops in South Africa store data as sensitive as a PUK - which in principle gives access to a locked phone.

In the cafe Rommel then carried out a few online transactions with the 3G modem, which was working again - and which, by the way, now works perfectly in Ubuntu, my Linux system.

On the side I went to 8ta to find out about their new deals, since we want to offer internet in some of Hilltop's centres. The picture shows the UMTS coverage in NeedsCamp, Katja's place. 8ta is a new phone provider from Telkom; after Vodafone bought Telkom's shares in Vodacom, they have to build their network up completely from scratch.
We decided to organize a free prepaid card to test, because who knows how accurate the coverage map above is ... Before signing even the cheapest 24-month deal, you should know whether it works.

After that we went to Checkers in Vincent, looking for two hotplates for WoodyCape - R 129.00 each, i.e. about 12€ for a two-ring hotplate.
As you can see in the background, there's already Christmas decoration - at the beginning of November, and that in South Africa at a sunny, warm 25°C minimum.

We treated ourselves to lunch at a Pakistani curry takeaway - highly recommended!
Well, and after we got back an hour or so ago, I cleaned the fridge, which I had simply put outside to defrost this morning. Now it's clean again, without a 3-metre-thick layer of ice...

Tomorrow ... yes, I'll report on that separately ... but probably not until Monday, because then I'll be back in Quigney (East London) and have free internet.

© benste CC NC SA

05 Nov 2011 4:33pm GMT

31 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Sterkspruit Computer Center

Sterkspruit is one of Hilltop's Computer Centres in the far north of the Eastern Cape. On the trip to J'burg we used the opportunity to take a look at the centre.

Pupils in the big classroom


The Trainer


School in Countryside


Adult Class in the Afternoon


"Town"


© benste CC NC SA

31 Oct 2011 4:58pm GMT

Benedict Stein: Technical Issues

What do you do in an internet cafe when your ADSL and fax line have been discontinued before month's end? Well, my idea was sitting outside and eating some ice cream.
At least it's sunny and not as rainy as on the weekend.


© benste CC NC SA

31 Oct 2011 3:11pm GMT

30 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Nellis Restaurant

For those who are traveling through Zastron - there is a very nice restaurant serving delicious food at reasonable prices.
In addition, they sell home-made juices, jams and honey.




interior


home made specialities - the shop in the shop


the Bar


© benste CC NC SA

30 Oct 2011 4:47pm GMT

29 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: The way back from J'burg

During the 10-12h trip from J'burg back to ELS I was able to take a lot of pictures, including these different roadsides

Plain Street


Orange River in its beginnings (near Lesotho)


Zastron Anglican Church


The Bridge in Between "Free State" and Eastern Cape next to Zastron


my new Background ;)


If you listen to GoogleMaps you'll end up traveling 50km of gravel road - as it was just renewed we didn't have that many problems and saved 1h compared to going the official way with all its construction sites




Freeway


getting dark


© benste CC NC SA

29 Oct 2011 4:23pm GMT

28 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: How does a construction site actually work?

Sure, some things may be different and a lot may be the same - but a road construction site is an everyday sight in Germany - how does it actually work in South Africa?

First of all - NO, there are no natives digging with their hands - even though more manpower is used here, they are busily working with technology.

A perfectly normal "federal road"


and how it is being widened


loooots of trucks


because here one side is completely closed over a long stretch, resulting in a traffic-light arrangement with a waiting time of 45 minutes


But at least they seem to be having fun ;) - as did we, since luckily we never had to wait longer than 10 minutes.

© benste CC NC SA

28 Oct 2011 4:20pm GMT