18 Jul 2014

feedPlanet Twisted

Moshe Zadka: Clash of Clans and the Prisoners’ Dilemma

I have started playing, recently, the online game "Clash of Clans". Clash of Clans has the option of "donating" troops to other clan members. Troops donated stay in a separate area (so they do not take up area for one's own troops), and also participate in base defense (as opposed to one's own troops, which are offensive only). In short, being donated-to makes both one's offense and one's defense stronger. However, donating troops means paying the cost of training them and reaping no benefit from it.

Does we have the pay-off matrix: assume paying the cost of troops is -1, and the benefit is +2 (it has to be bigger than the cost, since players do build troops for themselves with even less benefits):

Donate/Donate: 1/1
Don't donate/Don't donate: 0/0
Donate/Don't donate: -1/2
Don't donate/Donate: 2/-1

What I love about this is this is a massive experiment with prisoners' dilemma on more-or-less average people. As expected, one sees both equilibrium: clans where hardly anyone donates, and clans where everyone donates. In the clans where hardly anyone donates, there is still some reciprocity: (some) people will try to donate "back" to people who donated to them. Since in practice the cost-benefit analysis turns that way, it is sometimes worthwhile to donate even for a .2 chance of a reciprocation, which allows the loop to be fed.

There are reputation effects (the number of troops one donated and one had donated to appears where everyone can see it), which would be assumed to improve the donation rate. However, from my anecdotal experience, being in two clans, the most distinguishing feature is "do you know these people in real life?" Clans made up of people whose reputation effects extend beyond the game are in the good equilibrium. Since I've joined the Facebook-employee-only clan, my life has been much better.

I started thinking that Facebook is like that too. In situations where you can help a colleague, where the benefit to the colleague would be disproportional to one's self, we encourage that kind of thinking. Facebook is in the good equilibrium of the dilemma!

[Plug: We're hiring engineers and eng. managers. Please contact me on FB with further questions or resumes.]

18 Jul 2014 4:10pm GMT

Duncan McGreggor: Uncovering Inherent Structures in Organizations

Vladimir Levenshtein

This post should have a subtitle: "Using Team Analysis and Levenshtein Distance to Reveal said Structure." It's the first part of that subtitle that is the secret, though: being able to correctly analyze and classify individual teams. Without that, using something clever like Levenshtein distance isn't going to be very much help.

But that's coming in towards the end of the story. Let's start at the beginning.

What You're Going to See

This post is a bit long. Here are the sections I've divided it into:


However, you don't need to read the whole thing to obtain the main benefits. You can get the Cliff Notes version by reading the Premise, Categorizing Teams, Interpretation, and the Conclusion.

Premise

Companies grow. Teams expand. If you're well-placed in your industry and providing in-demand services or products, this is happening to you. Individuals and small teams tend to deal with this sort of change pretty well. At an organizational level, however, this sort of change tends to have an impact that can bring a group down, or rocket it up to the next level.

Of the many issues faced by growing companies (or rapidly growing orgs within large companies), the structuring one can be most problematic: "Our old structures, though comfortable, won't scale well with all these new teams and all the new hires joining our existing teams. How do we reorganize? Where do we put folks? Are there natural lines along which we can provide better management (and vision!) structure?"

The answer, of course, is "yes" -- but! It requires careful analysis and a deep understanding every team in your org.

The remainder of this post will set up a scenario and then figure out how to do a re-org. I use a software engineering org as an example, but that's just because I have a long and intimate knowledge of them and understand the ways in which one can classify such teams. These same methods could be applied a Sales group, Marketing groups, etc., as long as you know the characteristics that define the teams of which these orgs are comprised.



Introducing ACME

ACME Corporation is the leading producer of some of the most innovative products of the 20th century. The CTO had previously tasked you, the VP of Software Development to bring this product line into the digital age -- and you did! Your great ideas for the updated suite are the new hottness that everyone is clamouring for. Subsequently, the growth of your teams has been fast, and dare we say, exponential.

More details on the scenario: your Software Development Group has several teams of engineers, all working on different products or services, each of which supports ACME Corporation in different ways. In the past 2 years, you've built up your org by an order of magnitude in size. You've started promoting and hiring more managers and directors to help organize these teams into sensible encapsulating structures. These larger groups, once identified, would comprise the whole Development Group.

Ideally, the new groups would represent some aspect of the company, software development, engineering, and product vision -- in other words, some sensible clustering of teams doing related work. How would you group the teams in the most natural way?

Simply dividing along language or platform lines may seem like the obvious answer, but is it the best choice? There are some questions that can help guide you in figuring this out:

There are many more questions you could ask (some are implicit in the analysis data linked below), but this should give a taste.

ACME Software Development has grown the following teams, some of which focus on products, some on infrastructure, some on services, etc.:

Early SW Dev team hacking the ENIAC

Categorizing Teams

Each of those teams started with 2-4 devs hacking on small skunkworks projects. They've now blossomed to the extent that each team has significant sub-teams working on new features and prototyping for the product they support. These large teams now need to be characterized using a method that will allow them to be easily compared. We need the ability to see how closely related one team is to another, across many different variables. (In the scheme outlined below, we end up examining 50 bits of information for each team.)

Keep in mind that each category should be chosen such that it would make sense for teams categorized similarly to be grouped together. A counter example might be "Team Size"; you don't necessarily want all large teams together in one group, and all small teams in a different group. As such, "Team Size" is probably a poor category choice.

Here are the categories which we will use for the ACME Software Development Group:
  • Language
  • Syntax
  • Platform
  • Implementation Focus
  • Supported OS
  • Deployment Type
  • Product?
  • Service?
  • License Type
  • Industry Segment
  • Stakeholders
  • Customer Type
  • Corporate Priority
Each category may be either single-valued or multi-valued. For instance, the categories ending in question marks will be booleans. In contrast, multiple languages might be used by the same team, so the "Language" category will sometimes have several entries.

Category Example

(Things are going to get a bit more technical at this point; for those who care more about the outcomes than the methods used, feel free to skip to the section at the end: Sorting and Interpretation.)

In all cases, we will encode these values as binary digits -- this allows us to very easily compare teams using Levenshtein distance, since the total of all characteristics we are filtering on can be represented as a string value. An example should illustrate this well.

(The Levenshtein distance between two words is the minimum number of single-character edits -- such as insertions, deletions or substitutions -- required to change one word into the other. It is named after Vladimir Levenshtein, who defined this "distance" in 1965 when exploring the possibility of correcting deletions, insertions, and reversals in binary codes.)

Let's say the Software Development Group supports the following languages, with each one assigned a binary value:
  • LFE - #b0000000001
  • Erlang - #b0000000010
  • Elixir - #b0000000100
  • Ruby - #b0000001000
  • Python - #b0000010000
  • Hy - #b0000100000
  • Clojure - #b0001000000
  • Java - #b0010000000
  • JavaScript - #b0100000000
  • CoffeeScript - #b1000000000
A team that used LFE, Hy, and Clojure would obtain its "Language" category value by XOR'ing the three supported languages, and would thus be #b0001100001. In LFE, that could be done by entering the following code the REPL:


We could then compare this to a team that used just Hy and Clojure (#b0001100000), which has a Levenshtein distance of 1 with the previous language category value. A team that used Ruby and Elixir (#b0000001100) would have a Levenshtein distance of 5 with the LFE/Hy/Clojure team (which makes sense: a total of 5 languages between the two teams with no languages shared in common).


Calculating the Levenshtein Distance of Teams

As a VP who is keen on deeply understanding your teams, you have put together a spreadsheet with a break-down of not only languages used in each team, but lots of other categories, too. For easy reference, you've put a "legend" for the individual category binary values is at the bottom of the linked spreadsheet.

In the third table on that sheet, all of the values for each column are combined into a single binary string. This (or a slight modification of this) is what will be the input to your calculations. Needless to say, as a complete fan of LFE, you will be writing some Lisp code :-)

Partial view of the spreadsheet's first page.
(If you would like to try the code out yourself while reading, and you have lfetool installed, simply create a new project and start up the REPL: $ lfetool new library ld; cd ld && make-shell
That will download and compile the dependencies for you. In particular, you will then have access to the lfe-utils project -- which contains the Levenshtein distance functions we'll be using. You should be able to copy-and-paste functions, vars, etc., into the REPL from the Github gists.)

Let's create a couple of data structures that will allow us to more easily work with the data you collected about your teams in the spreadsheet:


We can use a quick copy and paste into the LFE REPL for two of those numbers to do a sanity check on the distance between the Community Team and the Digital Iron Carrot Team:


That result doesn't seem unreasonable, given that at a quick glance we can see both of these strings have many differences in their respective character positions.

It looks like we're on solid ground, then, so let's define some utility functions to more easily work with our data structures:


Now we're ready to roll; let's try sorting the data based on a comparison with a one of the teams:


It may not be obvious at first glance, but what the levenshtein-sort function did for us is compare our "control" string to every other string in our data set, providing both the distance and the string that the control was compared to. The first entry in the results is the our control string, and we see what we would expect: the Levenshtein distance with itself is 0 :-)

The result above is not very easily read by most humans ... so let's define a custom sorter that will take human-readable text and then output the same, after doing a sort on the binary strings:


(If any of that doesn't make sense, please stop in and say "hello" on the LFE mail list -- ask us your questions! We're a friendly group that loves to chat about LFE and how to translate from Erlang, Common Lisp, or Clojure to LFE :-) )



Sorting and Interpretation

Before we try out our new function, we should ponder which team will be compared to all the others -- the sort results will change based on this choice. Looking at the spreadsheet, we see that the "Digital Iron Carrot Team" (DICT) has some interesting properties that make it a compelling choice:
  • it has stakeholders in Sales, Engineering, and Senior Leadership;
  • it has a "Corporate Priority" of "Business critical"; and
  • it has both internal and external customers.
Of all the products and services, it seems to be the biggest star. Let's try a sort now, using our new custom function -- inputting something that's human-readable:


Here we're making the request "Show me the sorted results of each team's binary string compared to the binary string of the DICT." Here are the human-readable results:


For a better visual on this, take a look at the second tab of the shared spreadsheet. The results have been applied to the collected data there, and then colored by major groupings. The first group shares these things in common:
  • Lisp- and Python-heavy
  • Middleware running on BSD boxen
  • Mostly proprietary
  • Externally facing
  • Focus on apps and APIs
It would make sense to group these three together.

A sort (and thus grouping) by comparison to critical team.
Next on the list is Operations and QA -- often a natural pairing, and this process bears out such conventional wisdom. These two are good candidates for a second group.

Things get a little trickier at the end of the list. Depending upon the number of developers in the Java-heavy Giant Rubber Band App Team, they might make up their own group. However, both that one and the next team on the list have frontend components written in Angular.js. They both are used internally and have Engineering as a stakeholder in common, so let's go ahead and group them.

The next two are cloud-deployed Finance APIs running on the Erlang VM. These make a very natural pairing.

Which leaves us with the oddball: the Community Team. The Levenshtein distance for this team is the greatest for all the teams ... but don't be mislead. Because it has something in common with all teams (the Community Team supports every product with docs, example code, Sales and TAM support, evangelism for open source projects, etc.), it will have many differing bits with each team. This really should be in a group all its own so that structure represents reality: all teams depend upon the Community Team. A good case could also probably be made for having the manager of this team report directly up to you.

The other groups should probably have directors that the team managers report to (keeping in mind that the teams have grown to anywhere from 20 to 40 per team). The director will be able to guide these teams according to your vision for the Software Group and the shared traits/common vision you have uncovered in the course of this analysis.

Let's go back to the Community Team. Perhaps in working with them, you have uncovered a hidden fact: the community interactions your devs have are seriously driving market adoption through some impressive and passionate service and open source docs+evangelism. You are curious how your teams might be grouped if sorted from the perspective of the Community Team.

Let's find out!


As one might expect, most of the teams remain grouped in the same way ... the notable exception being the split-up of the Anvil and Rubber Band teams. Mostly no surprises, though -- the same groupings persist in this model.

A sort (and thus grouping) by comparison to highly-connected team.
To be fair, if this is something you'd want to fully explore, you should bump the "Corporate Priority" for the Community Team much higher, recalculate it's overall bits, regenerate your data structures, and then resort. It may not change too much in this case, but you'd be applying consistent methods, and that's definitely the right thing to do :-) You might even see the Anvil and Rubber Band teams get back together (left as an exercise for the reader).

As a last example, let's throw caution and good sense to the wind and get crazy. You know, like the many times you've seen bizarre, anti-intuitive re-orgs done: let's do a sort that compares a team of middling importance and a relatively low corporate impact with the rest of the teams. What do we see then?


This ruins everything. Well, almost everything: the only group that doesn't get split up is the middleware product line (Jet Propelled and Iron Carrot). Everything else suffers from a bad re-org.

A sort (and thus grouping) by comparison to a non-critical team.

If you were to do this because a genuine change in priority had occurred, where the Giant Rubber Band App Team was now the corporate leader/darling, then you'd need to recompute the bit values and do re-sorts. Failing that, you'd just be falling into a trap that has beguiled many before you.


Conclusion

If there's one thing that this exercise should show you, it's this: applying tools and analyses from one field to fresh data in another -- completely unrelated -- field can provide pretty amazing results that turn mystery and guesswork into science and planning.

If we can get two things from this, the other might be: knowing the parts of the system may not necessarily reveal the whole (c.f. Complex Systems), but it may provide you with the data that lets you better predict emergent behaviours and identify patterns and structure where you didn't see them (or even think to look!) before.


18 Jul 2014 6:09am GMT

Duncan McGreggor: Interview with Erlang Co-Creators

A few weeks back -- the week of the PyCon sprints, in fact -- was the San Francisco Erlang conference. This was a small conference (I haven't been to one so small since PyCon was at GW in the early 2000s), and absolutely charming as a result. There were some really nifty talks and a lot of fantastic hallway and ballroom conversations... not to mention Robert Virding's very sweet Raspberry Pi Erlang-powered wall-sensing Lego robot.

My first Erlang Factory, the event lasted for two fun-filled days and culminated with a stroll in the evening sun of San Francisco down to the Rackspace office where we held a Meetup mini-conference (beer, food, and three more talks). Conversations lasted until well after 10pm with the remaining die-hards making a trek through the nighttime streets SOMA and the Financial District back to their respective abodes.

Before the close of the conference, however, we managed to sneak a ride (4 of us in a Mustang) to Scoble's studio and conduct an interview with Joe Armstrong and Robert Virding. We covered some of the basics in order to provide a gentle overview for folks who may not have been exposed to Erlang yet and are curious about what it has to offer our growing multi-core world. This wend up on the Rackspace blog as well as the Building 43 site (also on YouTube). We've got a couple of teams using Erlang in Rackspace; if you're interested, be sure to email Steve Pestorich and ask him what's available!


18 Jul 2014 5:49am GMT

16 Jul 2014

feedPlanet Twisted

Thomas Vander Stichele: morituri 0.2.3 ‘moved’ released!

It's two weeks shy of a year since the last morituri release. It's been a pretty crazy year for me, getting married and moving to New York, and I haven't had much time throughout the year to do any morituri hacking at all. I miss it, and it was time to do something about it, especially since there's been quite a bit of activity on github since I migrated the repository to it.

I wanted to get this release out to combine all of the bug fixes since the last release before I tackle one of the number one asked for issues - not ripping the hidden track one audio if it's digital silence. There are patches floating around that hopefully will be good enough so I can quickly do another release with that feature, and there are a lot of minor issues that should be easy to fix still floating around.

But the best way to get back into the spirit of hacking and to remove that feeling of it's-been-so-long-since-a-release-so-now-it's-even-harder-to-do-one is to just Get It Done.

I look forward to my next hacking stretch!

Happy ripping everybody.

flattr this!

16 Jul 2014 4:01am GMT

11 Jul 2014

feedPlanet Twisted

AmvTek Blog: Comparing Python performance of Protobuf/Thrift serialization…

When in need to get two software processes to exchange datas, some sort of protocol is necessary to define how to encode/decode the datas to be transported. A large number of serialization formats are available (json, xml, ASN.1…) so as to tackle the cross process/cross programming language datas encoding problem and Python provides a large number of libraries to leverage them…

What distinguishes the solution provided by protocol buffers or thrift is the need to describe the datas to be exchanged in a central schema file written using an easy to read idl language. Such schema file is then compiled so as to provide data representation for a certain programming language. Not all problems will benefit from this approach, but when developping services that needs to be accessed by clients written in different programming languages ( eg Objective C, Java…) we have found that relying on a well defined schema file allows to save a lot of time.

The need for benchmarking

We write a large part of our server side code using Python and some of the projects we support are accessed mainly by native clients over TCP or UDP. Over the years we moved gradually from our home grown custom serialization solution built on top of Python struct module to protocol buffer…

The move to protocol buffer allowed us to cut down development time required to support new types of client to a minimum. We also realized how the reliance of a central schema was valuable in that everybody can vizualize what the datas are.

One thing however we are regretting from the previous home grown solution are the performances, and at the server performances matter tremendously even more if you are using an asynchronous networking framework like twisted.

For a long while we reinssured ourselves observing that a google supported extension module was available, and that deploying it, would allow us to accelerate serialization/deserialization by a factor 10 at least. Deploying such extension module was delayed till reaching stagging development phase, because it is quite cumbersome to do so. You need to build things from source and manage some environment variables in your server processes, to force the use of the implementation it provides.

Once we activated the protobuf extension module on our stagging server we started to observe random crashes of the server processes. It took us time to understand that those crashes were related to the use of such extension module. Well, we should not have underestimated the fact that Google was labelling this extension module as experimental, but here again we assumed that Google playing in a different category than the rest of us they were probably referring to some pretty advanced usecases :(

After all those hurdles, we realized that selecting the proper serialization technology for your projects is a decision that shall not be taken lightly. Thrift provides an obvious alternative to google protocol buffer, but how does its Python implementation performs ? They exist extensive performance benchmarks of java serialization frameworks, but we found nothing similar for Python.

The benchmark

We have published on GitHub, what we consider to be a good basis to compare the various serialization frameworks which one may want to leverage. The repository for the project can be reached here.

The benchmark for now compares the performances of protobuf and thrift serializations for messages defined in the StuffTotest schema . We welcome suggestions to extend such reference schema so as to explore performances variations more in the details or external help so as to cover more serialization frameworks…

Preliminary results

We have published on GitHub a result run obtained on a low end development machine. If we consider performance to be the average in between serialization and deserialization time for a certain message of the schema, Thrift :

  • outperforms protocol buffers in 75% of the cases.
  • is stable.
  • is much easier to deploy (pip install thrift and you are done…)

So there is currently a clear winner to this benchmark. We will be happy to rerun it so as to validate that things have changed.

11 Jul 2014 9:00pm GMT

10 Jul 2014

feedPlanet Twisted

Jonathan Lange: Cutting edge debugging

One of the great pleasures of advancing in one's craft is the thrill of solving newer, more difficult problems. In the course of my work, I frequently encounter many bugs which appear at first glance to be impossible. Not just impossible to solve, but intrinsically impossible. It's as if being at the very frontiers of technology exposes one to strange new logics that the human mind is not fit to grasp.

Each of these bugs is, of course, unique, and it would be folly to suggest that there could be any sort of systematic method that could be applied at the cutting edge of technology, as if one could push back the boundaries of ignorance in much the same way as one fills in a tax return. Nevertheless, the following principles have served me pretty well, and perhaps deserve a wider audience.

I offer up these principles humbly: no single approach can suffice at the very fringes of the known. I earnestly hope that by meditating on them we can achieve things as yet undreamt of.

10 Jul 2014 11:00pm GMT

07 Jul 2014

feedPlanet Twisted

Jonathan Lange: conda and binstar: initial experiences

A while ago my friend teh recommended that I use conda for Python package management, because of its intrinsic superiority to pip & virtualenv. I'm no fan of either, and teh generally has pretty good taste, so I thought I'd check it out.

I haven't formed an overall opinion yet, but I can say without reservation that the overall experience for someone who has become used to pip & virtualenv is very frustrating.

What I found myself wanting was a one pager explaining conda & binstar to someone familiar with pip & virtualenv, which are unquestionably vastly more popular. Alas, there's none to be found.

There are some guidelines here and there on the net about how to install a package that's available on PyPI into a conda system, but they don't always work.

For example, I'd like to play around with pyrsistent. The recommended approach is to do the following:

$ conda skeleton pypi pyrsistent
$ conda build pyrsistent
$ binstar upload //anaconda/conda-bld/osx-64/pyrsistent-0.3.1-py27_0.tar.bz2
$ conda install pyrsistent

Everything works fine up until the build step, but the upload fails because someone has already uploaded the package. Fine. So conda install pyrsistent should work, right?

$ conda install pyrsistent
Fetching package metadata: ...
Error: No packages found matching: pyrsistent

Wut.

Searching binstar finds pyristent. Apparently I have to add some kind of channel (a concept that's not well explained, and that does not seem to appear at all on the binstar UI).

I try this way:

$ conda config --add channels https://binstar.org/auto

But apparently that's not a channel:

$ conda install pyrsistent
Fetching package metadata: ....Error: HTTPError: 404 Client Error: NOT FOUND: https://binstar.org/auto/osx-64/

So I try this way:

$ conda config --add channels https://binstar.org/auto

And that is a channel, but pyrsistent it doth not find:

$ conda install pyrsistent
Error: No packages found matching: pyrsistent

I take a look at the files page for pyrsistent and it looks like the only file is linux-64/pyrsistent-0.1.0-py27_0.tar.bz2. I'm running OS X (for my sins) and my guess is that the "No packages found matching" error refers to a lack of binary builds.

So, what am I supposed to do? At this point, the documentation and the error messages from the tools leave me with no obvious way forward. Perhaps there's some underlying concept that would illuminate all for me, if only I could grasp it. Perhaps it's simply a bug.

In any case, I'm not abandoning conda yet (in part because I have no idea how to extricate it from my system), but it has some way to go before it is properly better than pip & virtualenv.

07 Jul 2014 11:00pm GMT

05 Jul 2014

feedPlanet Twisted

Jonathan Lange: Migrated to Pelican

A while ago, I got sick of Blogger, so I switched to Octopress.

I really could not be bothered learning how to manage Ruby virtual environments, and kept running into version issues, so I have followed glyph's example and migrated to Pelican.

No comments yet, and still quite a few things that ought to be done. If you notice something broken or missing, please file a bug.

I'm afraid I have neither the energy nor the inclination to do a full write-up of the migration process. However, I'll provide the following notes in case they help anyone who wants to do the same thing.

Patches required

Make the blog

$ pelican-quickstart
Welcome to pelican-quickstart v3.3.1.dev.

This script will help you create a new Pelican-based website.

Please answer the following questions so this script can generate the files
needed by Pelican.


> Where do you want to create your new web site? [.] code-blog
> What will be the title of this web site? Mere Code
> Who will be the author of this web site? Jonathan M. Lange
> What will be the default language of this web site? [en]
> Do you want to specify a URL prefix? e.g., http://example.com   (Y/n) Y
> What is your URL prefix? (see above example; no trailing slash) http://code.mumak.net
> Do you want to enable article pagination? (Y/n) Y
> How many articles per page do you want? [10] 20
> Do you want to generate a Fabfile/Makefile to automate generation and publishing? (Y/n) Y
> Do you want an auto-reload & simpleHTTP script to assist with theme and site development? (Y/n) Y
> Do you want to upload your website using FTP? (y/N) n
> Do you want to upload your website using SSH? (y/N)
> Do you want to upload your website using Dropbox? (y/N)
> Do you want to upload your website using S3? (y/N)
> Do you want to upload your website using Rackspace Cloud Files? (y/N)
> Do you want to upload your website using GitHub Pages? (y/N) Y
> Is this your personal page (username.github.io)? (y/N) N

Extract the things

$ pelican-import --blogger -m markdown \
  -o output-directory/ --wp-custpost --dir-page \
  blog-08-29-2013.xml

Transfer the things

$ cp output-directory/* blog-directory/content/
$ rm -rf blog-directory/content/settings blog-directory/content/templates/
$ mv blog-directory/content/comments blog-directory

Tweak the blog

Only basic tweaks. Note that the pelican_comment_system plugin is used in order to nicely render the imported blogger comments.

I had to manually tweak the theme to support the comment system.

05 Jul 2014 2:21pm GMT

30 Jun 2014

feedPlanet Twisted

Moshe Zadka: Thoughts on Warrants

Warrants (and similar things, like indictments at preliminary hearings) rely on the concept of "probable cause". The definition of probable cause varies, but in general requires for something to be more probable than not - or in terms of Bayesian probability (which applies probability to states of mind, not just to repeatable experiments), higher than 0.5 probability. Do our warrants actually obey this ruling? This is a fairly simple test - at least 50% of warrants handed should, eventually, result in evidence leading to conviction[1]. Do they? Hell, in less than 50% of cases does it lead to an indictment[2]!

Since warrant judges hand out a lot of warrants, we can decide to actually measure and automatically disbar judges whose warrant rates fall significantly below 50% rates (e.g., using p values of 0.005 would correspond to having a pretty high Bayesian prior that judges are "good"). The criteria could simply be - "of all warrants handed out where the case has ended, how many convictions have there been" (so any cases still in trial would be eliminated from consideration). After doing this for a while, we can update the Bayesian prior, periodically, to how many judges actually get disbarred by this.

In a perfect world, we would also apply this to convictions - however, since convictions are handled by one-offs (jurors), it is hard to place blame for overturning convictions. However, at least for warrants and non-grand-jury-indictments, this allows a process of automatic continuous improvement of our standards, avoiding the "judges and police co-operate to come up with lots of warrants" problem.

[1] Technically, since conviction is "beyond reasonable doubt", this is a lighter standard than the true standard of actual guilt. However, based on the "better a hundred guilty people go free…" (Benjamin Franklin) standard, reasonable doubt being >=1% probability, the effect should be minor.

[2] http://www.insurancejournal.com/news/east/2014/06/27/333106.htm

30 Jun 2014 4:29pm GMT

Ashwini Oruganti: Heads Up, San Francisco!

I will be visiting San Francisco this August and staying until November, spending my days working on a TLS v1.2 implementation in Python, using existing crypto math, thanks to Stripe's Open Source Retreat. Keep an eye out for a blog post with more details on that.

I just wrapped up work at HippyVM, and have gathered numerous stories looking at the PHP source code every day for the past six months. I've also just graduated, and it feels great to be able to spend the next few months working full-time on open source software.

What after, you ask? Some of you reading this right now have intimated that you'd like to offer me a job, if I were available. Well, now's your chance. Get in touch, and let's talk.

To my various acquaintances in the bay area, let's get together and do something fun. To others, if grabbing a coffee with me is an interesting idea to you, please drop me a line. I would love to hear your story about how PHP ruined your summer, or how Twisted changed your life, or how you had to stay up all night to fix an OpenSSL vulnerability. Really, I love stories; if you've got an interesting one to share, I'd be happy to lend an ear.

30 Jun 2014 12:53pm GMT

29 Jun 2014

feedPlanet Twisted

Thomas Vander Stichele: mach 1.0.3 ‘moved’ released

It's been very long since I last posted something. Getting married, moving across the Atlantic, enjoying the city, it's all taken its time. And the longer you don't do something, the harder it is to get back into.

So I thought I'd start simple - I updated mach to support Fedora 19 and 20, and started rebuilding some packages.

Get the source, update from my repository, or wait until updates hit the Fedora repository.

Happy packaging!

flattr this!

29 Jun 2014 9:09pm GMT

22 Jun 2014

feedPlanet Twisted

Duncan McGreggor: lfest: A Routing Party for LFE Web Apps

The Clojure community (in particular, James Reeves) has produced one of the best APIs I've seen for creating RESTful services, or any application using path dispatching (routing), really: Compojure.

I've been really missing this in LFE. Despite the fact that creating RESTful apps in LFE with YAWS is so simple it doesn't even need a framework, the routes can be a bit difficult to read, due to some of the destructuring that's done (e.g., URL paths).

Take the following route declaration from the yaws-rest-starter project:

Note that this example delegates additional routing to other functions, since all the routing in one function was pretty difficult to read. This, however, detracts from the overall readability in a different way: there's not one place to look to see what functions map to the complete URL-space for this service.

I wanted to be able to do something like what you can do in Clojure:

LFE is a Lisp with macros, so why not? Also, nox or mheise (I forget who) on the #erlang-lisp channel had previously noted all the lfe* projects (and these too), and suggested that someone create the inevitable "lfest" repo on Github.

Enter lfest. After a weekend of hacking and debugging some tiny macros, surprisingly little code now supports routes definitions like the following in LFE+YAWS web apps:

If you're curious to see what this actually expands to before getting compiled in a .beam file, you can view it here.

Bonus for LFE webdevs: this project also has a module of HTTP status codes, for easy re-use in other projects.

The recent work I've been doing with macros has really helped shape the section I've been planning to write in the LFE User Guide. I haven't wanted it to be yet another dry description of macros in a Lisp-2. Instead, my hope is to help jump-start people into how to think about macros, how to start writing them immediately (without years of study first), and how to debug them.

The trick, as with all complicated subjects, is how to remove the barrier to entry and get folks that necessary hands-on experience without dumbing-down the material. It's almost ready...

Also, it's Bay to Breakers today, and Ocean Beach is bumpin. The synchronicity is just eery. May you party just as hard with routes in LFE.


22 Jun 2014 3:16am GMT

10 Jun 2014

feedPlanet Twisted

Jp Calderone: Oh, Imaginary!

Still here, still hacking away at this vision/clothing issue[1][2][3][4]. I hope this isn't getting tedious. If it is, hang in there! There's some light up ahead.

Over the last couple weeks a few interesting things have happened to Imaginary. For one, a number of the fixes that Glyph and I made along the way have actually made their way into master now. Most significantly (I think) the documentation I started and Glyph and I worked to finish is there now. Seven other fixes landed too, though. This is the most activity Imaginary has seen in a handful of years now. Crufty, dead code has been deleted. Several interfaces has been improved. Useful functionality has been factored out so it can actually be re-used. Good stuff. ashfall gets the credit for doing the legwork of splitting out these good pieces from the larger branch where most work has been going on.

Something else came about last week which is worthy of note: the code in that branch now passes the full test suite. That is pretty big news. I think maybe it means we actually figured out how to fix the bug (I'm totally hedging here; Imaginary's test suite is pretty good). This came about after a relatively big change and a relatively little change.

The bigger change was an expansion of the interface for exits. Wait,, you're surely going to exclaim, exits? Recall that the general idea for the fix to the way clothing is rendered was to preserve information about the path from the observer to the clothing and then use that path information to inform the renderer. As you might guess, this involved some code changes to the renderer. In the process of doing this, we disturbed the way the renderer shows you exits from a place. The previous implementation was rather hacky and just included some extra code to find exits. It did so without properly traversing the simulation graph and so potentially produced incorrect results. In the new renderer, exit information comes along with all of its related path information - automatically, even! - in the same collection of paths-to-ideas that represents everything else you might be able to see.

This generalization resulted in a bit more information being rendered than we really wanted. Containers (boxes, chairs, rooms, submarines, etc) almost all have implicit "in" and "out" exits (okay exit might not be the best name for how you go in to something; it's just a name, okay?) and it's usually not very interesting to include this information along with other more interesting exits. Anyway, that's a decision we'd made previously in the renderer and we wanted to preserve it. So the exit interface got expanded to include a method which can be used to determine whether it makes sense to present that particular exit when rendering something near that exit. This removed a bunch of extra, not-very-interesting "into" and "out of" exits from the text which is generated to describe most things you might encounter in an Imaginary simulation.

The smaller change was a fix for a bug in how actions resolve targets. One of the fixes we needed to make to the garment system was to make it possible for the person wearing some clothing to always be able to refer to it. This was originally possible by virtue of the duplicate links to all clothing that I discussed in the first post I wrote about this bug. Once we got rid of those duplicate links, referring to clothing you were wearing became impossible if you couldn't see the clothing. At some level this makes sense, but even if you can't see your socks at the moment you still know they're there and you can manipulate them in various ways (pull them up, for example). The long term fix for this will most likely be to add an "awareness" component (to be specified) to action target resolution (the system that turns the "my socks" string into an Idea instance representing your socks). The short term fix was just to just add an observer argument to an existing method, isOpaque and hack the garment system to make clothing not opaque to the person wearing it. This lets you see all your own layers but still prevents other people from seeing anything but your top layer. It has problems - for example, if you look in a mirror, you'll see your socks, even if they're covered by your shoes and your pants - but it'll do for now (because we don't have mirrors yet, ha ha).

And that fix we actually made a few weeks ago. What we did last week was remember to update the code that calls isOpaque to actually pass the correct observer object.

So, pending some refactoring, some documentation, probably some new automated tests, clothing and vision are now playing nicely together - in a way with fewer net hacks than we had before. Woot.

After working on that stuff with Glyph I also tried to get the documentation that has been merged into master actually published somewhere so it can be read. Unfortunately this proved to be something of a challenge. After fighting with distutils for a couple hours I still couldn't get readthedocs.org to build Imaginary's Sphinx-based documentation and had to give up (for my sanity as much as for anything). There is a solution in sight but it involves the long hard road of actually bringing both the packaging state of Imaginary and (here's the real bummer) all of Imaginary's dependencies up to contemporary standards. After cooling off for a couple days I did take the first steps towards this, doing some work on Nevow (yes, that's a github link). There's more work to do on that front but perhaps mostly just getting a new release out. So, watch this space (but please don't hold your breath).

10 Jun 2014 12:24am GMT

27 May 2014

feedPlanet Twisted

Jp Calderone: Imaginary Graph Traversal Woes

So we worked on Imaginary some more. Unsurprisingly, the part that was supposed to be easier was surprisingly hard.

As part of the change Glyph and I are working on, we needed to rewrite the code in the presentation layer that shows a player what exits exist from a location. The implementation strategy we chose to fix looking at stuff involved passing a lot more information to this presentation code and teaching the presentation code how to (drumroll) present it.

Unfortunately, we ran into the tiny little snag that we haven't actually been passing the necessary information to the presentation layer! Recall that Imaginary represents the simulation as a graph. The graph is directed and most certainly cyclic. To avoid spending all eternity walking loops in the graph, Imaginary imposes a constraint that when using obtain, no path through the graph will be returned as part of the result if the target (a node) of the last link (a directed edge in the graph path) is the source (a node) of any other link in the path.

If Alice is in room A looking through a door at room B then the basic assumption is that she should be able to see that there is a door in room B leading to room A (there are several different grounds on which this assumption can be challenged but that's a problem for another day). The difficulty we noticed is that in this scenario, obtain gets to the path Alice -> room A -> door -> room B -> door and then it doesn't take the next step (door -> room A) because the next step is the first path that qualifies as cyclic by Imaginary's definition: it includes room A in two places.

Having given this a few more days thought, I'm curious to explore the idea that the definition of cyclic is flawed. Perhaps Alice -> room A -> door -> room B -> door -> room A should actually be considered a viable path to include in the result of obtain? It is cyclic but it represents a useful piece of information that can't easily be discovered otherwise. Going any further than this in the graph would be pointless because obtain is already going to make sure to give back paths that include all of the other edges away from room A - we don't need duplicates of those edges tacked on after the path has gone to room B and come back again.

Though there are other narrower solutions as well, such as just making the presentation layer smart enough to be able to represent Alice -> room A -> door -> room B -> door correctly even though it hasn't been directly told that the door from room B is leading back to room A. This would have a smaller impact on Imaginary's behavior overall. I'm not yet sure if avoiding the big impact here makes the most sense - if obtain is omitting useful information then maybe fixing that is the best course of action even if it is more disruptive in the short term.

27 May 2014 1:13pm GMT

25 May 2014

feedPlanet Twisted

Duncan McGreggor: LFE Design Summit

Well, maybe not summit, but certainly a meeting of minds :-)

The LFE Community is pleased to announce that we will be holding our first ever community design meeting in Stockholm, Sweden this June during the Erlang User Conference! We have set aside 1 hour for a kick-off discussion that we can continue in breakout sessions, hallways, meadhalls, in #erlang-lisp on Freenode, and on the LFE mail list

Our goals are to not only share plans, but to discover what you want in LFE, how you want to use it. This is an opportunity for us to work together, explore interesting problems to solve, build community awareness around language goals, and identify efforts around which each of us are interested in collaborating with others in the course of subsequent months.

We've opened up a Google Docs form so you can add your ideas for session topics. Here are some discussion topics that we've come up with so far:

Add your favorites!


In the next couple weeks, we'll identify the top session topics the folks have proposed and attempt to cover these during the Erlang User Conference. Robert was able to get us some meeting space & time for this; he's hoping to have something viewable on the EUC site soon.

This is a first step; let's see how this goes, and if people really enjoy it, we can have a real summit next time!

Update: Robert has announced the time and place.


25 May 2014 10:47pm GMT

21 May 2014

feedPlanet Twisted

Jp Calderone: De-cruft (Divmod Imaginary Update)

Over the weekend Glyph and I had another brief hack session on Imaginary. We continued our efforts towards fixing the visibility problem I've described over the last couple of posts. This really was a brief session but we took two excellent steps forward.

First, we made a big chunk of progress towards fixing a weird hack in the garment system. If you recall, I previously mentioned that the way obtain works in Imaginary now, the system will find two links to a hat someone is wearing. In a previous session, Glyph and I removed the code responsible for creating the second link. This was a good thing because it was straight-up deletion of code. The first link still exists and that's enough to have a path between an observer and a hat being worn by someone nearby.

The downside of this change is that the weird garment system hack was getting in the way of that remaining link being useful. The purpose of the hack was to prevent hats from showing up twice in the old system - once for each link to them. Now, with only one link, the hack would sometimes prevent the hat from even showing up once. The fix was fairly straightforward. To explain it, I have to explain annotations first.

As I've mentioned before, Imaginary represents a simulation as a graph. One powerful tool that simulation developers have when using Imaginary is that edges in the graph can be given arbitrary annotations. The behavior of simulation code will be informed by the annotations on each link along the path through the graph the simulation code is concerned with. Clear as mud, right?

Consider this example. Alice and Bob are standing in a room together. Alice is wearing a pair of white tennis shoes. Alice and Bob are standing in two parts of the room which are divided from each other by a piece of red tinted glass. A realistic vision simulation would have Bob observing Alice's shoes as red. Alice, being able to look directly at those same shoes without an intervening piece of tinted glass, perceives them as white. In Imaginary, for Bob to see the tennis shoes, the vision system uses obtain to find the path from Bob to those shoes. The resulting path necessarily traverses the Idea representing the glass - the path looks very roughly like Bob to glass to Alice to shoes. The glass, being concerned with the implementation of tinting for the vision system, annotates the link from itself to Alice with an object that participates in the vision simulation. That object takes care of representing the spectrum which is filtered out of any light which has to traverse that link. When the vision system shows the shoes to Bob, the light spectrum he uses to see them has been altered so that he now perceives them as red.

Clearly I've hand-waved over a lot of details of how this works but I hope this at least conveys the very general idea of what's going on inside Imaginary when a simulation system is basing its behavior on the simulation graph. Incidentally, better documentation for how this stuff works is one of the things that's been added in the branch Glyph and I are working on.

Now, back to hats. The hack in the garment system annotates links between clothing and the person wearing that clothing. The annotation made the clothing invisible. This produced a good effect - when you look around you, you're probably not very interested in seeing descriptions of your own clothing. Unfortunately, this annotation is on what is now the only link to your clothing. Therefore, this annotation makes it impossible for you to ever see your own clothing. You could put it on because when you're merely holding it, it doesn't receive this annotation. However, as soon as you put it on it vanishes. This poses some clear problems: for example, you can never take anything off.

The fix for this was simple and satisfying. We changed the garment system so that instead of annotating clothing in a way that says "you can't see this" it annotates clothing in a way that merely says "you're wearing this". This is very satisfying because, as you'll note, the new annotation is a much more obviously correct piece of information. Part of implementing a good simulation system in Imaginary is having a good model for what's being simulated. "Worn clothing is worn clothing" sounds like a much better piece of information to have in the model than "Worn clothing can't be seen". There's still some polish to do in this area but we're clearly headed in the right direction.

By comparison, the second thing we did is ridiculously simple. Before Imaginary had obtain it had search. Sounds similar (well, maybe...) and they were similar (well, sort of...). search was a much simpler, less featureful graph traversal API from before Imaginary was as explicitly graph-oriented as it is now. Suffice it to say most of what's cool about Imaginary now wasn't possible in the days of search. When obtain was introduced, search was re-implemented as a simple wrapper on top of obtain. This was neat - both because it maintained API compatibility and because it demonstrated that obtain was strictly more expressive than search. However, that was a while ago and search is mostly just technical baggage at this point. We noticed there were only three users of search left (outside of unit tests) and took the opportunity to delete it and update the users to use obtain directly instead. Hooray, more code deleted.

Oh, and I think we're now far enough beneath the fold that I can mention that the project has now completed most of a migration from Launchpad to github (the remaining work is mostly taking things off of Launchpad). Thanks very much to ashfall for doing very nearly all the work on this.

21 May 2014 12:25am GMT