09 Dec 2014


Glyph Lefkowitz: Docker Dev to Prod in Just A Few Easy Steps

It seems that docker is all the rage these days. Docker has popularized a powerful paradigm for repeatable, isolated deployments of pretty much any application you can run on Linux. There are numerous highly sophisticated orchestration systems which can leverage Docker to deploy applications at massive scale. At the other end of the spectrum, there are quick ways to get started with automated deployment or orchestrated multi-container development environments.

When you're just getting started, this dazzling array of tools can be as bewildering as it is impressive.

A big part of the promise of docker is that you can build your app in a standard format on any computer, anywhere, and then run it. As docker.com puts it:

"... run the same app, unchanged, on laptops, data center VMs, and any cloud ..."

So when I started approaching docker, my first thought was: before I mess around with any of this deployment automation stuff, how do I just get an arbitrary docker container that I've built and tested on my laptop shipped into the cloud?

There are a few documented options that I came across, but they all had drawbacks, and didn't really make the ideal tradeoff for just starting out:

  1. I could push my image up to the public registry and then pull it down. While this works for me on open source projects, it doesn't really generalize.
  2. I could run my own registry on a server, and push my image there. I can run it plain-text and risk the unfortunate security implications, deal with the administrative hassle of running my own certificate authority and propagating trust out to my deployment node, or spend money on a real TLS certificate. Since I'm just starting out, I don't want to deal with any of these hassles right away.
  3. I could re-run the build on every host where I intend to run the application (sketched just below this list). This is easy and repeatable, but unfortunately it means that I'm missing part of that great promise of docker - I'm running potentially subtly different images in development, test, and production.
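
For reference, option 3 would look something like the following. This is my own illustrative sketch, and the repository URL is a hypothetical stand-in:

me@laptop$ ssh mycloudserver.example.com \
    'git clone https://example.com/me/MyDockerApp.git && \
     cd MyDockerApp && \
     docker build -t mydockerapp .'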

I think I have figured out a fourth option that is super fast to get started with, as well as being reasonably secure.

What I have done is:

  1. run a local registry
  2. build an image locally - testing it until it works as desired
  3. push the image to that registry
  4. use SSH port forwarding to "pull" that image onto a cloud server, from my laptop

Before running the registry, you should set aside a persistent location for the registry's storage. Since I'm using boot2docker, I stuck this in my home directory, like so:

me@laptop$ mkdir -p ~/Documents/Docker/Registry

To run the registry, you need to do this:

me@laptop$ docker pull registry
...
Status: Image is up to date for registry:latest
me@laptop$ docker run --name registry --rm=true -p 5000:5000 \
    -e GUNICORN_OPTS=[--preload] \
    -e STORAGE_PATH=/registry \
    -v "$HOME/Documents/Docker/Registry:/registry" \
    registry

To briefly explain each of these arguments: --name is just there so I can quickly identify this as my registry container in docker ps and the like; --rm=true is there so that I don't create detritus from subsequent runs of this container; -p 5000:5000 exposes the registry to the docker host; -e GUNICORN_OPTS=[--preload] is a workaround for a small bug; STORAGE_PATH=/registry tells the registry to look in /registry for its images; and the -v option points /registry at the directory we created above.

It's important to understand that this registry container only needs to be running for the duration of the commands below. Spin it up, push and pull your images, and then you can just shut it down.
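
Shutting it down is a single command - and because of the --rm=true option above, the stopped container cleans up after itself:

me@laptop$ docker stop registry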

Next, you want to build your image, tagging it with the registry's address and port (localhost.localdomain:5000):

me@laptop$ cd MyDockerApp
me@laptop$ docker build -t localhost.localdomain:5000/mydockerapp .

Assuming the image builds without incident, the next step is to send the image to your registry.

me@laptop$ docker push localhost.localdomain:5000/mydockerapp

Once that has completed, it's time to "pull" the image on your cloud machine, which - again, if you're using boot2docker like me - can be done like so:

me@laptop$ ssh -t -R 127.0.0.1:5000:"$(boot2docker ip 2>/dev/null)":5000 \
    mycloudserver.example.com \
    'docker pull localhost.localdomain:5000/mydockerapp'

If you're on Linux and simply running Docker on the local host, then you don't need the boot2docker ip substitution:

me@laptop$ ssh -t -R 127.0.0.1:5000:127.0.0.1:5000 \
    mycloudserver.example.com \
    'docker pull localhost.localdomain:5000/mydockerapp'

Finally, you can now run this image on your cloud server. You will of course need to decide on appropriate configuration options for your application, such as -p, -v, and -e:

me@laptop$ ssh mycloudserver.example.com \
    'docker run -d --restart=always --name=mydockerapp \
        -p ... -v ... -e ... \
        localhost.localdomain:5000/mydockerapp'

To avoid network round trips, you can even run the previous two steps as a single command:

me@laptop$ ssh -t -R 127.0.0.1:5000:"$(boot2docker ip 2>/dev/null)":5000 \
    mycloudserver.example.com \
    'docker pull localhost.localdomain:5000/mydockerapp && \
     docker run -d --restart=always --name=mydockerapp \
        -p ... -v ... -e ... \
        localhost.localdomain:5000/mydockerapp'

I would not recommend setting up any intense production workloads this way; those orchestration tools I mentioned at the beginning of this article exist for a reason, and if you need to manage a cluster of servers you should probably take the time to learn how to set up and manage one of them.

However, as far as I know, there's also nothing wrong with putting your application into production this way. If you have a simple single-container application, then this is a reasonably robust way to run it: the docker daemon will take care of restarting it if your machine crashes, and running this command again (with a docker rm -f mydockerapp before docker run) will re-deploy it in a clean, reproducible way.
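
Concretely, a re-deploy of an updated image might look like this - the same pattern as the combined command above, with the docker rm -f inserted (on the very first deploy you would omit the rm -f, since there is no old container to remove yet):

me@laptop$ ssh -t -R 127.0.0.1:5000:"$(boot2docker ip 2>/dev/null)":5000 \
    mycloudserver.example.com \
    'docker pull localhost.localdomain:5000/mydockerapp && \
     docker rm -f mydockerapp && \
     docker run -d --restart=always --name=mydockerapp \
        -p ... -v ... -e ... \
        localhost.localdomain:5000/mydockerapp'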

So if you're getting started exploring docker and you're not sure how to get a couple of apps up and running just to give it a spin, hopefully this can set you on your way quickly!

(Many thanks to my employer, Rackspace, for sponsoring the time for me to write this post. Thanks also to Jean-Paul Calderone, Alex Gaynor, and Julian Berman for their thoughtful review. Any errors are surely my own.)

09 Dec 2014 2:14am GMT

28 Nov 2014


Glyph Lefkowitz: Public or Private?

If I am creating a new feature in library code, I have two choices with the implementation details: I can make them public - that is, exposed to application code - or I can make them private - that is, for use only within the library.

(photo credit: https://www.flickr.com/photos/skyrim/6518329775/)

If I make them public, then the structure of my library is very clear to its clients. Testing is easy, because the public structures may be manipulated and replaced as the tests dictate. Inspection is easy, because all the data is exposed for clients to manipulate. Developers are happy when they can manipulate and test things easily. If I select "public" as the general rule, then developers using my library will be happy, because they'll be able to inspect and test everything quite easily whether I specifically designed in support for that or not.

However, now that they're public, I have to support them in their current form into the foreseeable future. Since I tend to maintain the libraries I work on, and maintenance necessarily means change, a large public API surface means a lot of ongoing changes to exposed functionality, which means a constant stream of deprecation warnings and deprecated feature removals. Without private implementation details, there's no axis on which I can change my software without deprecating older versions. Developers hate keeping up with deprecation warnings or having their applications break when a new version of a library comes out, so if I adopt "public" as the general rule, developers will be unhappy.

(photo credit: https://www.flickr.com/photos/lisasaunders18/14061397865)

If I make them private, then the structure of my library is a lot easier to understand by developers, because the API surface is much smaller, and exposes only the minimum necessary to accomplish their tasks. Because the implementation details are private, when I maintain the library, I can add functionality "for free" and make internal changes without requiring any additional work from developers. Developers like it when you don't waste their time with trivia and make it easier to get to what they want right away, and they love getting stuff "for free", so if I adopt "private" as the general rule, developers will be happy.

However, now that they're private, there's potentially no way to access that functionality for unforeseen use-cases, and testing and inspection may be difficult unless the functionality in question was designed with an above-average level of care. Since most functionality is, by definition, designed with an average level of care, that means that there will inevitably be gaps in these meta-level tools until the functionality has already been in use for a while, which means that developers will need to report bugs and wait for new releases. Developers don't like waiting for the next release cycle to get access to functionality that they need to get work done right now, so if I adopt "private" as the general rule, developers will be unhappy.

Hmm.

28 Nov 2014 11:25pm GMT

Duncan McGreggor: Scientific Computing with Hy and IPython

This blog post is a bit different from other technical posts I've done in the past, in that the majority of the content is not on the blog or in gists; instead, it is in an IPython notebook. Having adored Mathematica back in the 90s, you can imagine how much I love the IPython Notebook app. I'll have more to say on that at a future date.

I've been doing a great deal of NumPy and matplotlib again lately, every day for hours at a time. In conjunction with the new features in Python 3, this has been quite a lot of fun -- the most fun I've had with Python in years (thanks Guido, et al!). As you might have guessed, I'm also using it with Erlang (specifically, LFE), but that too is for a post yet to come.

With all this matplotlib and numpy work in standard Python, I've been going through Lisp withdrawals and needed to work with it from a fresh perspective. Needless to say, I had an enormous amount of fun doing this. Naturally, I decided to share with folks how one can do the latest and greatest with the tools of Python scientific computing, but in the syntax of the Python community's best kept secret: Clojure-Flavoured Python (Github, Twitter, Wikipedia).

Spoiler: observed data and polynomial curve fitting

Looking about for ideas, I decided to see what Clojure's Incanter project had for tutorials, and immediately found what I was looking for: Linear regression with higher-order terms, a 2009 post by David Edgar Liebke.

Nearly every cell in the tutorial notebook is in Hy, and for that we owe a huge thanks to yardsale8 for his Hy IPython magics code. For those that love Python and Lisp equally, who are familiar with the ecosystems' tools, Hy offers a wonderful option for being highly productive with a language supporting Lisp- and Clojure-style macros. You can get your work done, have a great time doing it, and let that inner code artist out!

(In fact, I've started writing a macro for one of the examples in the tutorial, offering a more Lisp-like syntax for creating class methods. We'll see what Paul Tagliamonte has to say about it when it's done ... !)

If you want to check out the notebook code, you can clone the repo and run it locally.



If you just want to read along, you're more than welcome to do that as well, thanks to the IPython NBViewer service. Here's the link: Scientific Computing with Hy: Linear Regressions.

One thing I couldn't get working was the community-provided code for generating tables of contents in IPython notebooks. If you have any expertise in this area, I'd love to get your feedback to see how I need to configure the custom ihy IPython profile for this tutorial.

Without that, I've opted for the manual approach and have provided a table of contents.

If all goes well, you will enjoy that as much as I did :-)

More soon ...


28 Nov 2014 8:11pm GMT

21 Nov 2014


Duncan McGreggor: ErlPort: Using Python from Erlang/LFE

This is a short little blog post I've been wanting to get out there ever since I ran across the erlport project a few years ago. Erlang was built for fault-tolerance. It had a goal of unprecedented uptimes, and these have been achieved. It powers 40% of our world's telecommunications traffic. It's capable of supporting amazing levels of concurrency (remember the 2007 announcement about the performance of YAWS vs. Apache?).

With this knowledge in mind, a common mistake by folks new to Erlang is to think these performance characteristics will be applicable to their own particular domain. This has often resulted in failure, disappointment, and the unjust blaming of Erlang. If you want to process huge files, do lots of string manipulation, or crunch tons of numbers, Erlang's not your bag, baby. Try Python or Julia.

But then, you may be thinking: I like supervision trees. I have long-running processes that I want to be managed per the rules I establish. I want to run lots of jobs in parallel on my 64-core box. I want to run jobs in parallel over the network on 64 of my 64-core boxes. Python's the right tool for the jobs, but I wish I could manage them with Erlang.

(There are sooo many other options for the use cases above, many of them really excellent. But this post is about Erlang/LFE :-)).

Traditionally, if you want to run other languages with Erlang in a reliable way that doesn't bring your Erlang nodes down with badly behaved code, you use Ports. (more info is available in the Interoperability Guide). This is what JInterface builds upon (and, incidentally, allows for some pretty cool integration with Clojure). However, this still leaves a pretty significant burden for the Python or Ruby developer for any serious application needs (quick one-offs that only use one or two data types are not that big a deal).

erlport was created by Dmitry Vasiliev in 2009 in an effort to solve just this problem, making it easier to integrate Erlang with more common languages like Python and Ruby. The project is maintained, and in fact has just received a few updates. Below, we'll demonstrate some usage in LFE with Python 3.

If you want to follow along, there's a demo repo you can check out. The steps are, roughly, as follows (a hedged sketch of the commands appears below this list):

  1. Clone the demo repo.
  2. Change into the repo directory and set up your Python environment.
  3. Switch over to the LFE directory and fire up a REPL. Note that this will first download the necessary dependencies and compile them.
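
A minimal sketch of those steps - every specific name here (the repo URL, directory layout, and make target) is a hypothetical stand-in, not necessarily the demo's actual contents:

$ git clone https://github.com/example/erlport-demo.git
$ cd erlport-demo
$ python3 -m venv .venv && . .venv/bin/activate
$ pip install erlport
$ cd lfe
$ make repl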

Now we're ready to take erlport for a quick trip down to the local:
And that's all there is to it :-)

Perhaps in a future post we can dive into the internals, showing you more of the glory that is erlport. Even better, we could look at more compelling example usage, approaching some of the functionality offered by such projects as Disco or Anaconda.


21 Nov 2014 11:08pm GMT

Duncan McGreggor: The Secret History of Lambda

Being a bit of an origins nut (I always want to know how something came to be or why it is a certain way), one of the things that always bothered me with regard to Lisp was that no one seemed to be talking about the origin of lambda in the lambda calculus. I suppose if I wasn't lazy, I'd have gone to a library and spent some time looking it up. But since I was lazy, I used Wikipedia. Sadly, I never got what I wanted: no history of lambda. [1] Well, certainly some information about the history of the lambda calculus, but not the actual character or term in that context.

Why lambda? Why not gamma or delta? Or Siddham ṇḍha?

To my great relief, this question was finally answered when I was reading one of the best Lisp books I've ever read: Peter Norvig's Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp. I'll save my discussion of that book for later; right now I'm going to focus on the paragraph at location 821 of my Kindle edition of the book. [2]

The story goes something like this: Church adapted Russell and Whitehead's caret notation for bound variables, moving the caret in front of the variable, where it morphed first into an uppercase Λ and finally into the lowercase λ (the full passage from Norvig is quoted in footnote [2] below).

It seems that our beloved lambda [11], then, is an accident in typography more than anything else.


Somehow, this endears lambda to me even more ;-)



[1] As you can see from the rest of the footnotes, I've done some research since then and have found other references to this history of the lambda notation.

[2] Norvig, Peter (1991-10-15). Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp (Kindle Locations 821-829). Elsevier Science. Kindle Edition. The paragraph in question is quoted here:

The name lambda comes from the mathematician Alonzo Church's notation for functions (Church 1941). Lisp usually prefers expressive names over terse Greek letters, but lambda is an exception. A better name would be make-function. Lambda derives from the notation in Russell and Whitehead's Principia Mathematica, which used a caret over bound variables: x̂(x + x). Church wanted a one-dimensional string, so he moved the caret in front: ^x(x + x). The caret looked funny with nothing below it, so Church switched to the closest thing, an uppercase lambda, Λx(x + x). The Λ was easily confused with other symbols, so eventually the lowercase lambda was substituted: λx(x + x). John McCarthy was a student of Church's at Princeton, so when McCarthy invented Lisp in 1958, he adopted the lambda notation. There were no Greek letters on the keypunches of that era, so McCarthy used (lambda (x) (+ x x)), and it has survived to this day.

[3] http://plato.stanford.edu/entries/pm-notation/#4

[4] Norvig, 1991, Location 821.

[5] History of Lambda-calculus and Combinatory Logic, page 7.

[6] Ibid.

[7] Norvig, 1991, Location 821.

[8] Ibid.

[9] Looking at Church's works online, he uses lambda notation in his 1932 paper A Set of Postulates for the Foundation of Logic. His preceding papers, upon which the seminal 1932 paper is based - On the Law of Excluded Middle (1928) and Alternatives to Zermelo's Assumption (1927) - make no reference to lambda notation. As such, A Set of Postulates for the Foundation of Logic seems to be his first paper that references lambda.

[10] Norvig indicates that this is simply due to the limitations of the keypunches in the 1950s that did not have keys for Greek letters.

[11] Alex Martelli is not a fan of lambda in the context of Python, and though a good friend of Peter Norvig, I've heard Alex refer to lambda as an abomination :-) So, perhaps not beloved for everyone. In fact, Peter Norvig himself wrote (see above) that a better name would have been make-function.


21 Nov 2014 9:12pm GMT

Duncan McGreggor: Maths and Programming: Whence Recursion?

As a manager in the software engineering industry, one of the things that I see on a regular basis is a general lack of knowledge from less experienced developers (not always "younger"!) with regard to the foundations of computing and the several related fields of mathematics. There is often a great deal of focus on what the hottest new thing is, or how the industry can be changed, or how we can innovate on the decades of profound research that has been done. All noble goals.

Notably, another trend I've recognized is that in a large group of devs, there are often a committed few who really know their field and its history. That is always so amazing to me and I have a great deal of admiration for the commitment and passion they have for their art. Let's have more of that :-)

As for myself, these days I have many fewer hours a week which I can dedicate to programming compared to what I had 10 years ago. This is not surprising, given my career path. However, what it has meant is that I have to be much more focused when I do get those precious few hours a night (and sometimes just a few per week!). I've managed this in an ad hoc manner by taking quick notes about fields of study that pique my curiosity. Over time, these get filtered and a few pop to the top that I really want to give more time.

One of the driving forces of this filtering process is my never-ending curiosity: "Why is it that way?" "How did this come to be?" "What is the history behind that convention?" I tend to keep these musings to myself, exploring them at my leisure, finding some answers, and then moving on to the next question (usually this takes several weeks!).

However, given the observations of the recent years, I thought it might be constructive to ponder aloud, as it were. To explore in a more public forum, to set an example that the vulnerability of curiosity and "not knowing" is quite okay, that even those of us with lots of time in the industry are constantly learning, constantly asking.

My latest curiosity has been around recursion: who first came up with it? How did it make its way from abstract maths to programming languages? How did it enter the consciousness of so many software engineers (especially those who are at ease in functional programming)? It turns out that an answer to this is actually quite closely related to a previous post I wrote on the secret history of lambda. A short version goes something like this:

Giuseppe Peano wanted to establish a firm foundation for logic and maths in general. As part of this, he ended up creating consistent axioms around the hard-to-define natural numbers, counting, and arithmetic operations (which utilized recursion). While visiting a conference in Europe, Bertrand Russell was deeply impressed by the dialectic talent of Peano and his unfailing clarity; he queried Peano as to his secret for success (Peano told him) and then asked for all of his published works. Russell proceeded to study these quite deeply. With this in his background, he eventually co-wrote the Principia Mathematica. Later, Alonzo Church (along with his grad students) sought to improve upon this, and in the process ended up developing the lambda calculus. His student, John McCarthy, later created the first functional programming language, Lisp, utilizing concepts from the lambda calculus (recursion and function composition).
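
To make the recursion concrete: the standard primitive-recursive definition of addition over the natural numbers (zero and a successor operation S; my notation, not Peano's original) is

\[ a + 0 = a, \qquad a + S(b) = S(a + b) \]

so computing a sum just peels successors off the second argument until it hits zero.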

In the course of reading 40-50 mathematics papers (including various histories) over the last week, I have learned far more than I had originally intended. So much so, in fact, that I'm currently working on a very fun recursion tutorial that not only covers the usual practical stuff, but also steps the reader through programming implementations of the Peano axioms, arithmetic definitions, the Ackermann function, and parts of the lambda calculus.

I've got a few more blog post ideas cooking that dive into functions, their history and evolution. We'll see how those pan out. Even more exciting, though, was having found interesting papers discussing the evolution of functions and the birth of category theory from algebraic topology. This, needless to say, spawned a whole new trail of research, papers, and books... and I've got some great ideas for future blog posts/tutorials around this topic as well. (I've encountered category theory before, but watching it appear unsearched and unbidden in the midst of the other reading was quite delightful).

In closing, I enjoy reading not only the original papers (and correspondence between great thinkers of a previous era), but also the meanderings and rediscoveries of my peers. I've run across blog posts like this in the past, and they were quite enchanting. I hope that we continue to foster that in our industry, and that we see more examples of it in the future.

Keep on questing ;-)


21 Nov 2014 9:10pm GMT

22 Oct 2014


Moshe Zadka: Clash of Clans: War strategy tips

[I wanted a central place to put my CoC Clan War strategy tips so I can refer people to them.]

Defense

Offense

22 Oct 2014 3:52pm GMT

13 Oct 2014


Ashwini Oruganti: Open Source Retreat: Afterword

As I wrap up my time with Stripe's Open Source Retreat, I have quite some things to be thankful for. (Also, I just always wanted to give a thank-you speech.) So here goes:

I am grateful to Glyph Lefkowitz, Ying Li, and Christopher Armstrong for their long-suffering patience and support. Thanks also to Allen Short, who coaxed work out of me when I badly needed to feel capable of it.

I would like to thank Jessica McKellar, for being generally and specifically brilliant, for teaching me how to be a meticulous manager, and for helping me find my things when I lost them in a scary building.

I am grateful to Stripe, for giving me a place to be. And the monitor. To all the people there who showed me where the good coffee is hidden: thank you! Thanks too to those who went climbing with me, pretended to not hear me when I said I wanted to stop, and patiently helped me improve.

Thanks to friends near and far - for helping me deal with dead things, for teaching me all about alligators and crocodiles, and for helping me pronounce words correctly.

I am grateful to the authors of the good books I steal phrases from.

And lastly, I am grateful to my family, for spoiling me always.

13 Oct 2014 10:54pm GMT

07 Oct 2014


Glyph Lefkowitz: Thank You Lennart

I (along with about 6 million other people, according to the little statistics widget alongside it) just saw this rather heartbreaking post from Lennart Poettering.

I have not had much occasion to interact with Lennart personally, and (like many people) I have experienced bugs in the software he has written. I have been frustrated by those bugs. I may not have always been charitable in my descriptions of his software. I have, however, always assumed good faith on his part and been happy that he continues making valuable contributions to the free software ecosystem. I haven't felt the need to make that clear in the past because I thought it was understood.

Apparently, not only is it not understood, there is active hostility directed against his participation. There are constant, aggressive, bad-faith attempts to get him to stop working on the important problems he is working on.

So, Lennart,

Thank you for your work on GNOME, for working on the problem of getting free software into the hands of normal people.

Thank you for furthering the cause of free software by creating PulseAudio, so that we can at least attempt to allow users to play sound on their Linux computers from multiple applications simultaneously without writing tedious configuration files.

Thank you for your work on SystemD, attempting to bring modern system-startup and service-management practices to the widest possible free software audience.

Thank you, Lennart, for putting up with all these vile personal attacks while you have done all of these things. I know you could have walked away; I'm sure that at times, you wanted to. Thank you for staying anyway and continuing to do the good work that you've done.

While the abuse is what prompted me to write this, I should emphasize that my appreciation is real. As a long-time user of Linux both on the desktop and in the cloud, I know that my life has been made materially better by Lennart's work.

This shouldn't be read as an endorsement of any specific technical position that Mr. Poettering holds. The point is that it doesn't have to be: this isn't about whether he's right or not, it's about whether we can have the discussion about whether he's right in a calm, civil, technical manner. In fact I don't agree with all of his technical choices, but I'm not going to opine about that here, because he's putting in the work and I'm not, and he's fighting many battles for software freedom (most of them against our so-called "allies") that I haven't been involved in.

The fact that he felt the need to write an article on the hideous state of the free software community is as sad as it is predictable. As a guest on a podcast recently, I praised the Linux community's technical achievements while critiquing its poisonous culture. Now I wonder if "critiquing" is strong enough; I wonder if I should have given any praise at all. We should all condemn this kind of bilious ad-hominem persecution.

Today I am saying "thank you" to Lennart because the toxicity in our communities is not a vague, impersonal force that we can discuss academically. It is directed at specific individuals, in an attempt to curtail their participation. We need to show those targeted people, regardless of how high-profile they are, or whether they're paid for their work or not, that they are not alone, that they have our gratitude. It is bullying, pure and simple, and we should not allow it to stand.

Software is made out of feelings. If we intend to have any more free software, we need to respect and honor those feelings, and, frankly speaking, stop trying to make the people who give us that software feel like shit.

07 Oct 2014 8:32am GMT

25 Sep 2014


Moshe Zadka: A Problem With High School Math Education

When I studied math in high-school, I absolutely hated geometry. It pretty much made me give up on math entirely, until I renewed my fascination in university. Geometry and I have had a hate/disgust relationship ever since. I am putting it out there so if you want to blame me for being biased, hey, I just made your day easier.

But it made me question: why are we even teaching geometry? Unlike the ancient Egyptians, few of us will need to collect land taxes on rapidly changing land sizes. The usual answer is that geometry is used to teach the art of mathematical proofs. There are lemmas-upon-lemmas building up to interesting results. They show how mathematical rigor is achieved, and what it is useful for.

This is bullshit.

It is bullshit because there is an algorithm one can run to manufacture a proof for every true theorem, or a counter-example for every non-theorem. True, understanding how the algorithm works takes about two years of math undergrad studies. The hand-waving proof is here: first, build up the proof that real number pairs as points and equations of lines as lines obey the Euclidean axioms. Then show that the first-order theory of the real numbers is complete and admits quantifier elimination, which together yield a decision procedure. QED.

When we are teaching kids geometry, we are teaching them tricks to solve something, when an algorithm would do it at well. We are teaching them to be slightly-worse than computers.

The proof above is actually incorrect: Euclidean geometry is missing an axiom for it to work. (The axiom is "there is no line and a triangle such that the line intersects the triangle in exactly one edge".) Euclid missed it, as have generations after him, because geometry is visually seductive: it is hard to verify proofs formally when the drawings look "correct". Teaching kids rigor through geometry is teaching them something not even Euclid achieved!

Is there an alternative? I think there is. The theory of finite sets is a beautiful little gem of mathematics. The axioms are the classic ZF axioms, with the infinity axiom inverted ("every one-to-one function from a set to itself is onto"). One can start building the theory the same as building the ZF theory: proving that ordinals exist, building up simple sets and the like - and then introduce the infinity axiom and its inversion, show a tiny taste of the infinity side of the theory, and then introduce the "finiteness axiom", and show how you can build the natural numbers on top of it.
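
For concreteness, the inverted infinity axiom mentioned above can be written (in my own first-order paraphrase, not a quotation of any particular formalization) as

\[ \forall A \,\forall f\colon A \to A \;\bigl( f \text{ injective} \implies f \text{ surjective} \bigr) \]

which is exactly the Dedekind-style characterization of finiteness.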

For the coup de grâce, finite set theory gives a more natural language in which to talk about Gödel's incompleteness theorem, showing that we cannot have an algorithm for deciding questions about finite sets - or the natural numbers.

Achieving rigor is much easier: one uses Venn diagrams as intuitions, but a formal derivation as proof. It's beautiful, it's nice, and it is the actual basis of all math.

25 Sep 2014 6:42pm GMT

22 Sep 2014


Twisted Matrix Laboratories: Twisted 14.0.1 & 14.0.2 Released

On behalf of Twisted Matrix Laboratories, I'm releasing Twisted 14.0.1, a security release for Twisted 14.0. It is strongly suggested that users of 14.0.0 upgrade to this release.

This patches a bug in Twisted Web's Agent, where BrowserLikePolicyForHTTPS would not honour the trust root given, and would use the system trust root instead. This would have broken, for example, attempting to pin the issuer for your HTTPS application because you only trust one issuer.

Note: on OS X, with the system OpenSSL, you still can't fully rely on this API for issuer pinning, due to modifications by Apple - please see https://hynek.me/articles/apple-openssl-verification-surprises/ for more details.

You can find the downloads at https://pypi.python.org/pypi/Twisted (or alternatively http://twistedmatrix.com/trac/wiki/Downloads). The NEWS file is also available at https://twistedmatrix.com/trac/browser/tags/releases/twisted-14.0.1/NEWS?format=raw.

Thanks to Alex Gaynor for discovering the bug, Glyph & Alex for developing a patch, and David Reid for reviewing it.

14.0.2 is a bugfix patch for distributors that corrects a failing test in the patch for 14.0.1.

Twisted Regards,
HawkOwl

Edit, 22nd Sep 2014: CVE-2014-7143 has been assigned for this issue.

22 Sep 2014 1:49pm GMT

20 Sep 2014


Glyph Lefkowitz: Ungineering

I am not an engineer.

I am a computer programmer. I am a software developer. I am a software author. I am a coder.

I program computers. I develop software. I write software. I code.

I'd prefer that you not refer to me as an engineer, but this is not an essay about how I'm going to heap scorn upon you if you do so. Sometimes, I myself slip and use the word "engineering" to refer to this activity that I perform. Sometimes I use the word "engineer" to refer to myself or my peers. It is, sadly, fairly conventional to refer to us as "engineers", and avoiding this term in a context where it's what everyone else uses is a constant challenge.

Nevertheless, I do not "engineer" software. Neither do you, because nobody has ever known enough about the process of creating software to "engineer" it.

According to dictionary.com, "engineering" is:

"the art or science of making practical application of the knowledge of pure sciences, as physics or chemistry, as in the construction of engines, bridges, buildings, mines, ships, and chemical plants."

When writing software, we typically do not apply "knowledge of pure sciences". Very little science is germane to the practical creation of software, and the places where it is relevant (firmware for hard disks, for example, or analytics for physical sensors) are highly rarified. The one thing that we might sometimes use called "science", i.e. computer science, is a subdiscipline of mathematics, and not a science at all. Even computer science, though, is hardly ever brought to bear - if you're a working programmer, what was the last project where you had to submit formal algorithmic analysis for any component of your system?

Wikipedia has a heaping helping of criticism of the terminology behind software engineering, but rather than focusing on that, let's see where Wikipedia tells us software engineering comes from in the first place:

The discipline of software engineering was created to address poor quality of software, get projects exceeding time and budget under control, and ensure that software is built systematically, rigorously, measurably, on time, on budget, and within specification. Engineering already addresses all these issues, hence the same principles used in engineering can be applied to software.

Most software projects fail; as of 2009, 44% are late, over budget, or out of specification, and an additional 24% are cancelled entirely. Only a third of projects succeed according to those criteria of being under budget, within specification, and complete.

What would that look like if another engineering discipline had that sort of hit rate? Consider civil engineering. Would you want to live in a city where almost a quarter of all the buildings were simply abandoned half-constructed, or fell down during construction? Where almost half of the buildings were missing floors, had rents in the millions of dollars, or both?

My point is not that the software industry is awful. It certainly can be, at times, but it's not nearly as grim as the metaphor of civil engineering might suggest. Consider this: despite the statistics above, is using a computer today really like wandering through a crumbling city where a collapsing building might kill you at any moment? No! The social and economic costs of these "failures" are far lower than most process consultants would have you believe. In fact, the cause of many such "failures" is a clumsy, ham-fisted attempt to apply engineering-style budgetary and schedule constraints to a process that looks nothing whatsoever like engineering. I have to use scare quotes around "failure" because many of these projects classified as failed have actually delivered significant value. For example, if the initial specification for a project is overambitious due to lack of information about the difficulty of the tasks involved - an extremely common problem at the beginning of a software project - that would still be a failure according to the metric of "within specification", but it's a problem with the specification and not the software.

Certain missteps notwithstanding, most of the progress in software development process improvement in the last couple of decades has been in acknowledging that it can't really be planned very far in advance. Software vendors now have to constantly present works in progress to their customers, because the longer they go without doing that, the greater the risk that the software will not meet the somewhat arbitrary goals for being "finished", and may never be presented to customers at all.

The idea that we should not call ourselves "engineers" is not a new one. It is a minority view, but I'm in good company in that minority. Edsger W. Dijkstra points out that software presents what he calls "radical novelty" - it is too different from all the other types of things that have come before to try to construct it by analogy to those things.

One of the ways in which writing software is different from engineering is the matter of raw materials. Skyscrapers and bridges are made of steel and concrete, but software is made out of feelings. Physical construction projects can be made predictable because the part where creative people are creating the designs - the part of that process most analogous to software - is a small fraction of the time required to create the artifact itself.

Therefore, in order to create software you have to have an "engineering" process that puts its focus primarily upon the psychological issue of making your raw materials - the brains inside the human beings you have acquired for the purpose of software manufacturing - happy, so that they may be efficiently utilized. This is not a common feature of other engineering disciplines.

The process of managing the author's feelings is a lot more like what an editor does when "constructing" a novel than what a foreperson does when constructing a bridge. In my mind, that is what we should be studying, and modeling, when trying to construct large and complex software systems.

Consequently, not only am I not an engineer, I do not aspire to be an engineer, either. I do not think that it is worthwhile to aspire to the standards of another entirely disparate profession.

This doesn't mean we shouldn't measure things, or have quality standards, or try to agree on best practices. We should, by all means, have these things, but we authors of software should construct them in ways that make sense for the specific details of the software development process.

While we are on the subject of things that we are not, I'm also not a maker. I don't make things. We don't talk about "building" novels, or "constructing" music, nor should we talk about "building" and "assembling" software. I like software specifically because of all the ways in which it is not like "making" stuff. Making stuff is messy, and hard, and involves making lots of mistakes.

I love how software is ethereal, and mistakes are cheap and reversible, and I don't have any desire to make it more physical and permanent. When I hear other developers use this language to talk about software, it makes me think that they envy something about physical stuff, and wish that they were doing some kind of construction or factory-design project instead of making an application.

The way we use language affects the way we think. When we use terms like "engineer" and "builder" to describe ourselves as creators, developers, maintainers, and writers of software, we are defining our role by analogy and in reference to other, dissimilar fields.

Right now, I think I prefer the term "developer", since the verb develop captures both the incremental creation and ongoing maintenance of software, which is so much a part of any long-term work in the field. The only disadvantage of this term seems to be that people occasionally think I do something with apartment buildings, so I am careful to always put the word "software" first.

If you work on software, whichever particular phrasing you prefer, pick one that really calls to mind what software means to you, and don't get stuck in a tedious metaphor about building bridges or cars or factories or whatever.

To paraphrase a wise man:

I am developer, and so can you.

20 Sep 2014 5:00am GMT

17 Sep 2014


Glyph Lefkowitz: The Most Important Thing You Will Read On This Blog All Year

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

I have two PGP keys.

One, 16F13480, is signed by many people in the open source community.  It is a
4096-bit RSA key.

The other, 0FBC4A07, is superficially worse.  It doesn't have any signatures on
it.  It is only a 3072-bit RSA key.

However, I would prefer that you all use 0FBC4A07.

16F13480 lives encrypted on disk, and occasionally resident in memory on my
personal laptop.  I have had no compromises that I'm aware of, so I'm not
revoking the key - I don't want to lose all the wonderful trust I build up.  In
order to avoid compromising it in the future, I would really prefer to avoid
decrypting it any more often than necessary.

By contrast, aside from backups which I have not yet once had occasion to
access, 0FBC4A07 exists only on an OpenPGP smart card, it requires a PIN, it is
never memory resident on a general purpose computer, and is only plugged in
when I'm actively Doing Important Crypto Stuff.  Its likelyhood of future
compromise is *extremely* low.

If said smart card had supported 4096-bit keys I probably would have just put
the old key on the more secure hardware and called it a day.  Sadly, that is
not the world we live in.

Here's what I'd like you to do, if you wish to interact with me via GnuPG:

    $ gpg --recv-keys 0FBC4A07 16F13480
    gpg: requesting key 0FBC4A07 from hkp server keys.gnupg.net
    gpg: requesting key 16F13480 from hkp server keys.gnupg.net
    gpg: key 0FBC4A07: "Matthew "Glyph" Lefkowitz (OpenPGP Smart Card) <glyph@twistedmatrix.com>" 1 new signature
    gpg: key 16F13480: "Matthew Lefkowitz (Glyph) <glyph@twistedmatrix.com>" not changed
    gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
    gpg: depth: 0  valid:   2  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 2u
    gpg: next trustdb check due at 2015-08-18
    gpg: Total number processed: 2
    gpg:              unchanged: 1
    gpg:         new signatures: 1
    $ gpg --edit-key 16F13480
    gpg (GnuPG/MacGPG2) 2.0.22; Copyright (C) 2013 Free Software Foundation, Inc.
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.


    gpg: checking the trustdb
    gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
    gpg: depth: 0  valid:   2  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 2u
    gpg: next trustdb check due at 2015-08-18
    pub  4096R/16F13480  created: 2012-11-16  expires: 2016-04-12  usage: SC
                         trust: unknown       validity: unknown
    sub  4096R/0F3F064E  created: 2012-11-16  expires: 2016-04-12  usage: E
    [ unknown] (1). Matthew Lefkowitz (Glyph) <glyph@twistedmatrix.com>

    gpg> disable

    gpg> save
    Key not changed so no update needed.
    $

If you're using keybase, "keybase encrypt glyph" should be pointed at the
correct key.

Thanks for reading,

- -glyph
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)

iQGcBAEBCgAGBQJUGRfNAAoJEH7CgSUPvEoHwg8L/0MHoG4FLzr1U3Ulu45sX/QO
VDUC4wJp4dpUKW2Yvjyw3LBYtFvsJfqUhM2oBURDPPgVfC5aOz7qevuBndlOYPB+
8dK//lPLZvYMAx2AlTGhz0wQokl0Cdlo+vK5E+Ex5oDJYhaPI9YPsSDbvynb6yhI
DK+EXRBtra7ev4hHDiucLGvqlSQnV+eOSijZfHgm6aBImfMUM7SM3UGFtE8oEJDE
XhTwW93L/c2epZOEFkfSLzQLlIcV5Ll2B6KOQLsdMuvgSkVX3NN+efuLFy4diD0U
HvL8nxxBpM98Jj+0PAucLbw4JwyAwEF6viEwXWiwngFTKeU60kUjUoSMFecMQuEz
IFqR7e9J7OaK9pMLrimwYtfCHVCx5WIXRJShuvcHhRCjwJb1N6rAffGOiwzkY+3w
mvpfEwQoC8F1wu2SOUpgQlvHUxxYbdibNTUvlaO74nyzE9fQZSXbZcGH2Skf9tHi
DjPhd2kLnyOzOy+BAOGcTKJ+ldOpbdmsnpcFTDA/MA==
=k0u9
-----END PGP SIGNATURE-----

17 Sep 2014 5:14am GMT

15 Sep 2014


Glyph Lefkowitz: The Horizon

Sometimes, sea sickness is caused by a sort of confusion. Your inner ear can feel the motion of the world around it, but if your eyes can't see the outside world, your brain can't reconcile its visual input with its proprioceptive input, and the result is nausea.

This is why it helps to see the horizon. If you can see the horizon, your eyes will talk to your inner ears by way of your brain, they will realize that everything is where it should be, and that everything will be OK.

As a result, you'll stop feeling sick.

photo credit: https://secure.flickr.com/people/reallyterriblephotographer/

I have a sort of motion sickness too, but it's not seasickness. Luckily, I do not experience nausea due to moving through space. Instead, I have a sort of temporal motion sickness. I feel ill, and I can't get anything done, when I can't see the time horizon.

I think I'm going to need to explain that a bit, since I don't mean the end of the current fiscal quarter. I realize this doesn't make sense yet. I hope it will, after a bit of an explanation. Please bear with me.

Time management gurus often advise that it is "good" to break up large, daunting tasks into small, achievable chunks. Similarly, one has to schedule one's day into dedicated portions of time where one can concentrate on specific tasks. This appears to be common sense. The most common enemy of productivity is procrastination. Procrastination happens when you are working on the wrong thing instead of the right thing. If you consciously and explicitly allocate time to the right thing, then chances are you will work on the right thing. Problem solved, right?

Except, that's not quite how work, especially creative work, works.

ceci n’est pas un task

I try to be "good". I try to classify all of my tasks. I put time on my calendar to get them done. I live inside little boxes of time that tell me what I need to do next. Sometimes, it even works. But more often than not, the little box on my calendar feels like a little cage. I am inexplicably, involuntarily, haunted by disruptive visions of the future that happen when that box ends.

Let me give you an example.

Let's say it's 9AM Monday morning and I have just arrived at work. I can see that at 2:30PM, I have a brief tax-related appointment. The hypothetical person doing my hypothetical taxes has an office that is a 25 minute hypothetical walk away, so I will need to leave work at 2PM sharp in order to get there on time. The appointment will last only 15 minutes, since I just need to sign some papers, and then I will return to work. With a 25 minute return trip, I should be back in the office well before 4, leaving me plenty of time to deal with any peripheral tasks before I need to leave at 5:30. Aside from an hour break at noon for lunch, I anticipate no other distractions during the day, so I have a solid 3 hour chunk to focus on my current project in the morning, an hour from 1 to 2, and an hour and a half from 4 to 5:30. Not an ideal day, certainly, but I have plenty of time to get work done.

The problem is, as I sit down in front of my nice, clean, empty text editor to sketch out my excellent programming ideas with that 3-hour chunk of time, I will immediately start thinking about how annoyed I am that I'm going to get interrupted in 5 and a half hours. It consumes my thoughts. It annoys me. I unconsciously attempt to soothe myself by checking email and getting to a nice, refreshing inbox zero. Now it's 9:45. Well, at least my email is done. Time to really get to work. But now I only have 2 hours and 15 minutes, which is not as nice of an uninterrupted chunk of time for a deep coding task. Now I'm even more annoyed. I glare at the empty window on my screen. It glares back. I spend 20 useless minutes doing this, then take a 10-minute coffee break to try to re-set and focus on the problem, and not this silly tax meeting. Why couldn't they just mail me the documents? Now it's 10:15, and I still haven't gotten anything done.

By 10:45, I manage to crank out a couple of lines of code, but the fact that I'm going to be wasting a whole hour with all that walking there and walking back just gnaws at me, and I'm slogging through individual test-cases, mechanically filling docstrings for the new API and for the tests, and not really able to synthesize a coherent, holistic solution to the overall problem I'm working on. Oh well. It feels like progress, albeit slow, and some days you just have to play through the pain. I struggle until 11:30 at which point I notice that since I haven't been able to really think about the big picture, most of the test cases I've written are going to be invalidated by an API change I need to make, so almost all of the morning's work is useless. Damn it, it's 2014, I should be able to just fill out the forms online or something, having to physically carry an envelope with paper in it ten blocks is just ridiculous. Maybe I could get my falcon to deliver it for me.

It's 11:45 now, so I'm not going to get anything useful done before lunch. I listlessly click on stuff on my screen and try to relax by thinking about my totally rad falcon until it's time to go. As I get up, I glance at my phone and see the reminder for the tax appointment.

Wait a second.

The appointment has today's date, but the subject says "2013". This was just some mistaken data-entry in my calendar from last year! I don't have an appointment today! I have nowhere to be all afternoon.

For pointless anxiety over this fake chore which never even actually happened, a real morning was ruined. Well, a hypothetical real morning; I have never actually needed to interrupt a work day to walk anywhere to sign tax paperwork. But you get the idea.

To a lesser extent, upcoming events later in the week, month, or even year bother me. But the worst is when I know that I have only 45 minutes to get into a task, and I have another task booked up right against it. All this trying to get organized, all this carving out uninterrupted time on my calendar, all of this trying to manage all of my creative energies and marshal them at specific times for specific tasks, annihilates itself when I start thinking about how I am eventually going to have to stop working on the seemingly endless, sprawling problem set before me.

The horizon I need to see is the infinite time available before me to do all the thinking I need to do to solve whatever problem has been set before me. If I want to write a paragraph of an essay, I need to see enough time to write the whole thing.

Sometimes - maybe even, if I'm lucky, increasingly frequently - I manage to fool myself. I hide my calendar, close my eyes, and imagine an undisturbed millennium in front of my text editor ... during which I may address some nonsense problem with malformed utf-7 in mime headers.

... during which I can complete a long and detailed email about process enhancements in open source.

... during which I can write a lengthy blog post about my productivity-related neuroses.

I imagine that I can see all the way to the distant horizon at the end of time, and that there is nothing between me and it except dedicated, meditative concentration.

That is on a good day. On a bad day, trying to hide from this anxiety manifests itself in peculiar and not particularly healthy ways. For one thing, I avoid sleep. One way I can always extend the current block of time allocated to my current activity is by just staying awake a little longer. I know this is basically the wrong way to go about it. I know that it's bad for me, and that it is almost immediately counterproductive. I know that ... but it's almost 1AM and I'm still typing. If I weren't still typing right now, instead of sleeping, this post would never get finished, because I've spent far too many evenings looking at the unfinished, incoherent draft of it and saying to myself, "Sure, I'd love to work on it, but I have a dentist's appointment in six months and that is going to be super distracting; I might as well not get started".

Much has been written about the deleterious effects of interrupting creative thinking. But what interrupts me isn't an interruption; what distracts me isn't even a distraction. The idea of a distraction is distracting; the specter of a future interruption interrupts me.

This is the part of the article where I wow you with my life hack, right? Where I reveal the one weird trick that will make productivity gurus hate you? Where you won't believe what happens next?

Believe it: the surprise here is that this is not a set-up for some choice productivity wisdom or a sales set-up for my new book. I have no idea how to solve this problem. The best I can do is that thing I said above about closing my eyes, pretending, and concentrating. Honestly, I have no idea even if anyone else suffers from this, or if it's a unique neurosis. If a reader would be interested in letting me know about their own experiences, I might update this article to share some ideas, but for now it is mostly about sharing my own vulnerability and not about any particular solution.

I can share one lesson, though. The one thing that this peculiar anxiety has taught me is that productivity "rules" are not revealed divine truth. They are ideas, and those ideas have to be evaluated exclusively on the basis of their efficacy, which is to say, on the basis of how much stuff that you want to get done that they help you get done.

For now, what I'm trying to do is to un-clench the fearful, spasmodic fist in my mind that grips the need to schedule everything into these small boxes and allocate only the time that I "need" to get something specific done.

Maybe the only way I am going to achieve anything of significance is with opaque, 8-hour slabs of time with no more specific goal than "write some words, maybe a blog post, maybe some fiction, who knows" and "do something work-related". As someone constantly struggling to force my own fickle mind to accomplish any small part of my ridiculously ambitious creative agenda, it's really hard to relax and let go of anything which might help, which might get me a little further a little faster.

Maybe I should be trying to schedule my time into tiny little blocks. Maybe I'm just doing it wrong somehow and I just need to be harder on myself, madder at myself, and I really can get the blood out of this particular stone.

Maybe it doesn't matter all that much how I schedule my own time because there's always some upcoming distraction that I can't control, and I just need to get better at meditating and somehow putting them out of my mind without really changing what goes on my calendar.

Maybe this is just as productive as I get, and I'll still be fighting this particular fight with myself when I'm 80.

Regardless, I think it's time to let go of that fear and try something a little different.

15 Sep 2014 8:27am GMT

04 Sep 2014

feedPlanet Twisted

Glyph Lefkowitz: Techniques for Actually Distributed Development with Git

The Setup

I have a lot of computers. Sometimes I'll start a project on one computer and get interrupted, then later find myself wanting to work on that same project, right where I left off, on another computer. Sometimes I'm not ready to publish my work, and I might still want to rebase it a few times (because if you're not arbitrarily rewriting history all the time you're not really using Git) so I don't want to push to @{upstream} (which is how you say "upstream" in Git).

I would like to be able to use Git to synchronize this work in progress between multiple computers. I would like to be able to easily automate this synchronization so that I don't have to remember any special steps to sync up; one of the most focused times that I have available to get coding and writing work done is when I'm disconnected from the Internet, on a cross-country train or a plane trip, or sitting near a beach, for 5-10 hours. It is very frustrating to realize only once I'm settled in and unable to fetch the code that I don't actually have it on my laptop, because I was last doing something on my desktop. I would particularly like to be able to use that offline time to resolve tricky merge conflicts; if there are two versions of a feature, I want to have them both available locally while I'm disconnected.

Completely Fake, Made-Up History That Is A Lie

As everyone knows, Git is a centralized version control system created by the popular website GitHub as a command-line client for its "forking" HTML API. Alternate central Git authorities have been created by other startups riding the wave of GitHub's success, such as BitBucket and GitLab.

It may surprise many younger developers to know that when GitHub first created Git, it was originally intended to be a distributed version control system, where it was possible to share code with no particular central authority!

Although the feature has been carefully hidden from the casual user, with a bit of trickery you can re-enable it!

Technique 0: Understanding What's Going On

It's a bit confusing to have to actually set up multiple computers to test these things, so one useful thing to understand is that, to Git, the place you can fetch revisions from and push revisions to is simply a repository. Normally these are identified by URLs pointing at hosts on the Internet, but you can also just indicate a path on your own computer. So for example, we can simulate a "network" of three computers with three separate repositories like this:

$ mkdir tmp
$ cd tmp/
$ mkdir a b c
$ for repo in a b c; do (cd $repo; git init); done
Initialized empty Git repository in .../tmp/a/.git/
Initialized empty Git repository in .../tmp/b/.git/
Initialized empty Git repository in .../tmp/c/.git/
$ 

This creates three separate repositories. But since they're not clones of each other, none of them have any remotes, and none of them can push or pull from each other. So how do we teach them about each other?

$ cd a
$ git remote add b ../b
$ git remote add c ../c
$ cd ../b
$ git remote add a ../a
$ git remote add c ../c
$ cd ../c
$ git remote add a ../a
$ git remote add b ../b

Now, you can go into a and type git fetch --all and it will fetch from b and c, and similarly for git fetch --all in b and c.
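
For instance, running this from inside a while all three repositories are still empty is uneventful, but it confirms that the wiring is in place. Here is a sketch of what you'd expect to see (exact output may vary between Git versions):

$ cd ../a
$ git fetch --all
Fetching b
Fetching c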

To turn this into a practical multiple-machine scenario, rather than specifying a path like ../b, you would specify an SSH URL as your remote URL, and enable SSH on each of your machines ("Remote Login" in the Sharing preference pane, if you aren't familiar with doing that on a Mac).

So, for example, if you have a home desktop tweedledee and a work laptop tweedledum, you can do something like this:

tweedledee:~ neo$ mkdir foo; cd foo
tweedledee:foo neo$ git init .
# ...
tweedledum:~ m_anderson$ mkdir bar; cd bar
tweedledum:bar m_anderson$ git init .
tweedledum:bar m_anderson$ git remote add tweedledee neo@tweedledee.local:foo
# ...
tweedledee:foo neo$ git remote add tweedledum m_anderson@tweedledum.local:bar

I don't know the names of the hosts on your network. So, in order to make it possible for you to follow along exactly, I'll use the repositories that I set up above, with path-based remotes, in the following examples.

Technique 1 (Simple): Always Only Fetch, Then Merge

Git repositories are pretty boring without any commits, so let's create a commit:

$ cd ../a
$ echo 'some data' > data.txt
$ git add data.txt
$ git commit -m "data"
[master (root-commit) 8dc3db4] data
 1 file changed, 1 insertion(+)
 create mode 100644 data.txt

Now on our "computers" b and c, we can easily retrieve this commit:

$ cd ../b/
$ git fetch --all
Fetching a
remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From ../a
 * [new branch]      master     -> a/master
Fetching c
$ git merge a/master
$ ls
data.txt
$ cd ../c
$ ls
$ git fetch --all
Fetching a
remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From ../a
 * [new branch]      master     -> a/master
Fetching b
From ../b
 * [new branch]      master     -> b/master
$ git merge b/master
$ ls
data.txt

If we make a change on b, we can easily pull it into a as well.

$ cd ../b
$ echo 'more data' > data.txt 
$ git commit data.txt -m "more data"
[master f3d4165] more data
 1 file changed, 1 insertion(+), 1 deletion(-)
$ cd ../a
$ git fetch --all
Fetching b
remote: Counting objects: 5, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From ../b
 * [new branch]      master     -> b/master
Fetching c
From ../c
 * [new branch]      master     -> c/master
$ git merge b/master
Updating 8dc3db4..f3d4165
Fast-forward
 data.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

This technique is quite workable, except for one minor problem.

The Minor Problem

Let's say you're sitting on your laptop and your battery is about to die, and you want to get some changes from your laptop over to your desktop before that happens. Your SSH key, however, is plugged in to your laptop, so pushing from the laptop to the desktop is the only direction available to you. Unfortunately, if you try, you'll see something like this:

$ cd ../a/
$ echo 'even more data' >> data.txt 
$ git commit data.txt -m "even more data"
[master a9f3d89] even more data
 1 file changed, 1 insertion(+)
$ git push b master
Counting objects: 7, done.
Writing objects: 100% (3/3), 260 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
remote: error: refusing to update checked out branch: refs/heads/master
remote: error: By default, updating the current branch in a non-bare repository
remote: error: is denied, because it will make the index and work tree inconsistent
remote: error: with what you pushed, and will require 'git reset --hard' to match
remote: error: the work tree to HEAD.
remote: error: 
remote: error: You can set 'receive.denyCurrentBranch' configuration variable to
remote: error: 'ignore' or 'warn' in the remote repository to allow pushing into
remote: error: its current branch; however, this is not recommended unless you
remote: error: arranged to update its work tree to match what you pushed in some
remote: error: other way.
remote: error: 
remote: error: To squelch this message and still keep the default behaviour, set
remote: error: 'receive.denyCurrentBranch' configuration variable to 'refuse'.
To ../b
 ! [remote rejected] master -> master (branch is currently checked out)
error: failed to push some refs to '../b'

While you're reading this, you fail your will save and become too bored to look at a computer any more.

Too late, your battery is dead! Hopefully you didn't lose any work.

In other words: sometimes it's nice to be able to push changes as well.

Technique 1.1: The Manual Workaround

The problem that you're facing here is that b has its master branch checked out, and is therefore rejecting changes to that branch. Your commits have actually all been "uploaded" to b, and are present in that repository, but there is no branch pointing to them. Doing either of the configuration things that Git warns you about in order to force it to accept the push is a bad idea, though; if your working tree, your index, and your commits don't agree with each other, you're just asking for trouble. Git is confusing enough as it is.

In order to work around this, you can just push your changes from master on a to a different branch on b, and then merge it later, like so:

$ git push b master:master_from_a
Counting objects: 7, done.
Writing objects: 100% (3/3), 260 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To ../b
 * [new branch]      master -> master_from_a
$ cd ../b
$ git merge master_from_a 
Updating f3d4165..a9f3d89
Fast-forward
 data.txt | 1 +
 1 file changed, 1 insertion(+)
$ 

This works just fine, but you always have to remember which branches you want to push, where to push them, and which machine you're pushing from.
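
If you find yourself doing this a lot, one way to cut down on the bookkeeping (just a sketch; the branch-naming convention here is my own invention, not part of any standard workflow) is to derive the scratch branch name from the machine you're pushing from:

$ git push b "master:master_from_$(hostname -s)"

You still have to remember to merge, and eventually delete, those scratch branches on the receiving side, which is exactly the chore the next technique eliminates.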

Technique 2: Push To Reverse Pull To Remote Remotes

The astute reader will have noticed at this point that Git already has a way of tracking "other places that changes came from": they're called remotes! And in fact b already has a remote called a pointing at ../a. Once you're sitting in front of your b computer again, wouldn't you rather just have those changes already in the a remote, instead of in some branch you have to specifically look for?

What if you could just push your branches from a into that remote? Well, friend, I'm here today to tell you you can.

First, head back over to a...

$ cd ../a

And now, all you need is this entirely straightforward and obvious command:

$ git config remote.b.push '+refs/heads/*:refs/remotes/a/*'

and now, when you git push b from a, you will push those branches into b's "a" remote, as if you had done git fetch a while in b.

$ git push b
Total 0 (delta 0), reused 0 (delta 0)
To ../b
   8dc3db4..a9f3d89  master -> a/master

So, if we make more changes:

$ echo 'YET MORE data' >> data.txt
$ git commit data.txt -m "You get the idea."
[master c641a41] You get the idea.
 1 file changed, 1 insertion(+)

we can push them to b...

$ git push b
Counting objects: 5, done.
Writing objects: 100% (3/3), 272 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To ../b
   a9f3d89..c641a41  master -> a/master

and when we return to b...

$ cd ../b/

there's nothing to fetch; it's all been pre-fetched already, so

$ git fetch a

produces no output.
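
If you want to reassure yourself that those refs really did arrive ahead of time, you can inspect the remote-tracking branches; you should see something like this (a sketch; your exact output may differ):

$ git branch --remotes
  a/master
$ git log --oneline -1 a/master
c641a41 You get the idea.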

But there is some stuff to merge, so if we took b on a plane with us:

$ git merge a/master
Updating a9f3d89..c641a41
Fast-forward
 data.txt | 1 +
 1 file changed, 1 insertion(+)

we can merge those changes in whenever we please!

More importantly, unlike the manual-syncing solution, this allows us to push multiple branches on a to b without worrying about conflicts: the a remote on b is only ever updated to reflect the present state of a, and should therefore never have conflicts. (And if it does, it's because you rewrote history, and you should be able to force push with no particular repercussions.)
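
One detail worth spelling out (this is my reading of the refspec syntax, not something demonstrated in the transcripts above): the leading + in +refs/heads/*:refs/remotes/a/* is what marks the push as forced, so even after a history rewrite a plain git push b should succeed, with output along these lines (hashes illustrative):

$ git rebase -i HEAD~2
# ... rewrite some history ...
$ git push b
 + c641a41...0c0ffee master -> a/master (forced update)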

Generalizing

Here's a shell function which takes 2 parameters, "here" and "there". "here" is the name of the current repository - meaning, the name of the remote in the other repository that refers to this one - and "there" is the name of the remote which refers to another repository.

function remoteremote () {
    local here="$1"; shift;
    local there="$1"; shift;

    git config "remote.$there.push" "+refs/heads/*:refs/remotes/$here/*";
}

In the above example, we could have used this shell function like so:

$ cd ../a
$ remoteremote a b
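
Presumably you would want the mirror-image configuration on the other machine as well, so that pushes work in both directions (same function, arguments swapped):

$ cd ../b
$ remoteremote b a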

I now use this all the time when I check out a repository on multiple machines for the first time; I can then always easily push my code to whatever machine I'm going to be using next.
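
Since the whole point is that syncing should be automatic, you could go one step further and wrap the push-to-everywhere step in another small helper. This is just a sketch, and it assumes every configured remote has already been set up with remoteremote as above:

function syncup () {
    # Push the current repository's branches to every configured remote;
    # unreachable machines are reported rather than aborting the loop.
    local remote;
    for remote in $(git remote); do
        git push "$remote" || echo "could not reach $remote";
    done;
}

Run syncup before you close your laptop, and every machine you can reach gets a fresh copy of your branches.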

I really hope at least one person out there has equally bizarre usage patterns of version control systems and finds this post useful. Let me know!

Acknowledgements

Thanks very much to Tom Prince for the information that led to this post being worth sharing, and Jenn Schiffer for teaching me that it is OK to write jokes sometimes, except about javascript, which is very serious and should not be made fun of ever.

04 Sep 2014 5:44am GMT

12 Aug 2014

feedPlanet Twisted

Jonathan Lange: Isn't it Pythonic?

I really should be over this by now.

Here's how you test exceptions in Python 2.7 and later:

with self.assertRaises(SomeException) as cm:
    do_something()

the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)

(as recommended in the unittest documentation)

Here's how you test exceptions in earlier Pythons, using libraries like testtools.

the_exception = self.assertRaises(SomeException, do_something)
self.assertEqual(the_exception.error_code, 3)

I still like the second way. It's simpler, uses less code, and doesn't rely on new syntax. I'm sorry it didn't work out that way.

I guess this is how Barry feels about the diamond operator.

12 Aug 2014 11:00pm GMT