23 Jul 2008
Planet Ruby
Nick Sieger: Jazzers and Programmers
At RubyFringe, I could have gone the easy route and talked about any manner of tech that I've been working on or associated with, including JRuby, JRuby-Rack, Warbler, image_voodoo, Glassfish, activerecord-jdbc, Connection pooling for Rails, or Ruby-Processing. Instead, I took a risk and went in a completely different direction. What follows are some thoughts I expressed in the talk. I'd be interested to hear your take as well.
Jazzers and Programmers
Why do we program? Why do you program? Not the 9-5 systems analyst working for a paycheck. I'm talking about you and me; we, the passionate fringe of the programming world. Fame and fortune? A knee-jerk answer, but that's not the one I'm looking for. What I think drives great programmers is the desire to learn, share, collaborate, and constantly get better at what you do. Which might explain, to a large degree, why all of you came to RubyFringe.
RubyFringe bills itself as deep nerd tech with punk rock spirit. So what are some elements of punk rock spirit that we identify with? While I enjoy and appreciate punk and its culture of DIY, I'm not an expert. However, I am passionate and knowledgeable about jazz.
At this point you may be shaking your head, "Sieger didn't get the memo!"
At which point I counter: if you think this clown and the music he plays is jazz, then prepare to be corrected, because you need to hear what I'm about to tell you. Listen. Jazz is punk before punk even existed. And if we're associating ourselves with punk, the fringe, or, to use another word, vanguard, perhaps we can learn a few things from this truly unique American art form.
How did I pick this topic for my talk? I was thinking about the fringe, the vanguard and growing up nerdy and a misfit, which we probably all can relate to. In addition to my interest in computers, I was also a band geek. I wore ties on random days at school just to accentuate myself away from the "in" crowd. And after school, I played in the jazz band. Playing in the jazz bands continued through college, where despite my majoring (and graduating) in electrical engineering, I decided to move to New York City to experience the big city and hopefully continue to play jazz. Instead, in an odd but practical twist, NYC became my transition into a programming career. Because you gotta pay the bills somehow, after all.
Recently, after taking in David's "Surplus" talk at RailsConf, although it sounds trite, I really took his suggestion to heart to look outside the programming world for insights on how to become a better programmer. I soon realized that jazz, my dormant passion, was that place right in front of me that might hold some of those crucial insights. And as odd as it may appear that I am speaking about jazz at a programming conference, I thought I'd share some thoughts about this from my vantage point as programmer and jazz musician. (Plus, I get a kick out of saying that I came to RubyFringe, stood up in front of a captive audience, and auditioned jazz clips for 30 odd minutes.)
Styles of Jazz
It's taken me all my life to learn what not to play. - Dizzy Gillespie
Jazz as a unique musical style is incredibly diverse. Its history spans over a hundred years, and is remarkable in its breadth. From Ragtime, New Orleans "Hot" Jazz, Swing, Bebop, Cool, Hard Bop, Free/Avant Garde, Jazz Rock/Fusion to splintering into a thousand directions, the music was (and is) constantly evolving and changing with the times.
The history of programming is undergoing a similar arc, but is nowhere near as developed as a communication medium as jazz is. Whereas modern programming only 40-50 years old depending on when you start counting, jazz was just getting a head of steam at that point. You could break it down like this: C is akin to New Orleans "hot" jazz: foundation for much that followed, and still relevant in a lot of ways, but definitely showing its age. Java is like swing (literally! c.f javax.swing.*): appeals to the masses, anyone gets it without a lot of cognitive load. Ruby is like bebop: favors minimalism, more powerful, more intellectual, and less understood.
So what does that mean for the future of programming? We're already seeing an increase in DSLs, polyglotism, and languages and tools used for specialized purposes. I would expect that trend to continue, and using the development of the many styles of jazz as a ruler, would say that there will not be a Next Big Language (NBL). Specialization and niche skills will be increasingly important as differentiators in our careers, but that means there will also be plenty of opportunity to stake out that new ground for ourselves and become leaders in these new specializations.
Jazz Fundamentals
Anyone can make the simple complicated. Creativity is making the complicated simple." - Charles Mingus
Jazz as a creative medium has a number of compelling features that surround the music itself as well as the players. Let's examine a few of these, using them as a mirror to reflect on programming.
Instrumentation. The modern jazz small ensemble is built upon the rhythm section -- piano, bass, and drums. The bass is the starting point, and is responsible for marking the beat as well as laying the harmonic foundation, often by "walking". Because of this critical role, the bass player in a jazz band is almost always positioned in the rear-center of the stage, between the piano and the drums. The piano and drums also have a role in keeping time, but have more freedom in creating tension and dissonance by layering rhythms on top of the bass, in an activity known as Comping.
The easiest parallel to programming is libraries, frameworks and patterns. These are the building blocks upon which we code. A well-designed library makes your work as a programmer easier and more enjoyable in the same way a solid, experienced rhythm section complements a soloist.
To use an example from web programming, may I suggest Bass-Drums-Piano == Model-View-Controller?
When you hit a wrong note it's the next note that makes it good or bad. - Miles Davis
Musical Structures. Much of bebop and straight-ahead jazz employs the well-established structure of the head, followed by solos, and finally repeating the head. The head is basically a crib sheet that consists of a tune or melody (usually a well-known jazz standard) and a set of chord changes that forms the harmonic basis for the entire piece. During the solo section, one or more musicians improvise over the changes for one or more "choruses".
The 12-bar blues is easily the most common form found in jazz. It is usually the first form that a budding jazz musician practices when learning how to improvise.
Every working jazz musician is expected to memorize and be able to play a large number of tunes and harmonic structures. This body of tunes is exemplified by the Real Book, an underground, illegal transcription of a large number of tunes by some Berklee College students in the 70's that has become a standard part of a jazz musician's toolbox.
Collectively, these musical conventions and rules form a jazz vocabulary and a shared ritual that allow musicians to play together in a jam session without having played together before.
The takeaway is that conventions and standards, when organically grown, tend to be powerful, liberating forces for communication and collaboration. There was no OASIS- or W3C-like committee that sat down when the roots of jazz were taking hold and dictated that it would be played this way. It just happened, night in, night out, on bandstands and in sessions all over New Orleans and Chicago and New York.
I believe that some of the best examples of programming and technology embody those same ideas. Certainly Rails is a shining example in the programming world of such a set of conventions and how they allow you to be productive no matter what Rails project you're working on.
It was when I found out I could make mistakes that I knew I was on to something. - Ornette Coleman
Improvisation. Making it up as you go. Jazz wouldn't be jazz without improvisation. The whole concept of improvisation in jazz builds upon the conventions and structure, but gives you the license to do whatever you want. Improvisation is what turns those constraints into freedom. Conviction is the key. Get up and play, tell your story, tell it with conviction, and there will be no wrong notes.
Improvisation is also where the musical conversation happens. Where the soloist connects with the crowd and his/her fellow musicians. As a listener, following along with that conversation sometimes takes a trained ear. But there are still some musical devices that are easy for anyone to appreciate and can give you something to listen for.
Trading fours is a conversational device between players in the group. Most commonly, it's used to give the drummer a chance to stretch out by alternating four bars of soloing with one or all of the instrumentalists. It's a great way to enjoy the inventiveness of the group collaboration element of jazz by listening to how the soloists play off each other in short spurts.
An outstanding example of trading fours can be found in "The Blues Walk" between Clifford Brown and Harold Land (at about the 5:30 mark in the tune). They trade fours for two choruses, then twos, then ones, then halfs. The lines they weave are unreal!
Quoting is humor and inventiveness applied to improvisation. It's named for a device where a player spontaneously inserts a recognizable melody into the middle of a solo. It's basically easter eggs applied to jazz. Dexter Gordon and Sonny Rollins are two players known for their penchant for breaking out and quoting in the middle of a solo.
Improvisation and Programming
Learn the changes, then forget them. - Charlie Parker
As far as programming is concerned, it's harder to see how improvisation can translate. Chad Fowler recently posted about programming as performance, asking for examples. Unfortunately, most of the examples given use programming as a means, but with some form of multimedia art as the end. I'd like to see programming performance where the code itself is the end result.
Do programmers improvise? I certainly believe that the best ones do. While it may be hard to capture the spontaneity of pure, real-time improvisation in a way that maps to how we write code, I think we can get close.
Continuous rewrites. Consider Fred Brooks' axiom, "Plan to throw one away; you will, anyhow." Jazz musicians do this on the bandstand every night! What if, rather than throwing one away, you rewrote a non-trivial piece of code five, ten, twenty times? What do you think would come out of that exercise as a result?
Live coding as a performance and teaching mechanism. I think live coding is an awesome example of spontaneity and risk-taking in programming, and one that not enough people attempt. Have you ever looked over the shoulder of a great programmer? It's fun to watch, because you learn a lot. I could easily sit for an hour watching a seasoned programmer create code while working in their own environment. On the other hand, there's a trend in conference talks where the presenter, to avoid the wrath of the demo gods, stands idly by while playing back a pre-recorded video of the demo. Sorry, I don't buy this. If it's not live, it's not real. Let's make live-coding something we all strive to do! Don't be afraid to show the audience that you're not perfect!
Do not fear mistakes. There are none. - Miles Davis
Instead of a hackfest, try a Coding Jam Session. Hackfests only rarely unite people to work toward a common cause; people enjoy the communal feel but too often people revert to working on their own thing instead of collaborating. Sprints are better, where the general topic or project is shared but each person is still working on their separate pieces. Rather, what I'd like to see is a group of programmers working toward a single shared goal. Plan ahead for a gathering of a handful of people for a lengthy period of time, say 8 or 12 or even 24 hours. Agree upon a theme or rough plan ahead of time, but allow for serendipity and improvisation to take hold of the group during the session. Get together, and start writing code. Try "trading fours", allowing each person in the session to drive and add their bit of code into the mix in short spurts. The overarching theme should be collaboration and building on each other's contributions. Avoid negative responses, such as saying "No" or "Yes, but..." or taking over and deleting or substantially rewriting someone else's code. Instead, say "Yes, and...". Imagine yourself on the bandstand, where once a note is played, it can't be taken back. Really try hard to put away your devil's advocate side or need to take control and instead play up to each others' strengths. Depending on the blend of personalities, this could be a great way to spike a idea for a new website or a startup!
Group Gitjour repository sharing. Crazy idea: what about using gitjour at a coding jam in a mode where everyone broadcasts their own repository and adds everyone else as remotes? Commit notifications could be broadcast on the local network with Growl or a similar tool, and pulling in someone else's changes could be a simple short one-line command, or perhaps even automatic. Imagine writing code together where suddenly a git merge is done for you automatically and the text changes right before your eyes right in your editor! Real-time collaboration!
Join the conversation!
I hope you'll join me and post your thoughts about some of these ideas, either in the comments below or on your own blogs. And find some inspiration in your passions and pastimes you have outside of the programming world that help you become a better programmer and collaborator!
Discography
Those who saw my talk in person may be interested in this list of song samples. I personally recommend any of these albums. Information is organized as Section: Artist, Song, Album.
- Intro. Naked City, Thrash Jazz Assassin, Torture Garden.
- Ragtime. Jelly Roll Morton, Maple Leaf Rag (Morton style), The Library of Congress Recordings, Vol. 1: Kansas City Stomp.
- New Orleans "Hot". King Oliver's Creole Jazz Band, Dipper Mouth Blues, Louis Armstrong And King Oliver.
- Swing. Duke Ellington & His Orchestra, Take The "A" Train, The Fabulous Swing Collection.
- Bebop. Charlie Parker, Koko, The Complete Savoy and Dial Studio Recordings.
- Cool. Miles Davis, Israel, The Birth of the Cool.
- Hard Bop. Herbie Hancock, Watermelon Man, Takin' Off.
- Free. Ornette Coleman, Free Jazz, Free Jazz.
- Jazz Rock. Miles Davis, Black Satin, On the Corner. (Also remixed on Panthalassa: The Music of Miles Davis, 1969-1974.)
- Contemporary. The Bad Plus, Iron Man, Give.
- Walking. Sonny Rollins, Blue Seven, Saxophone Colossus.
- Trading Fours. Clifford Brown, The Blues Walk, Clifford Brown and Max Roach.
Update. I've posted a muxtape of the songs here. Enjoy! (Also, I had to substitute Focus On Sanity from Ornette's Shape of Jazz to Come album due to the length of Free Jazz. SoJtC is an early free album, a little more digestible than FJ.)
23 Jul 2008 3:57am GMT
21 Jul 2008
Planet Ruby
Vladimir Sizikov: New and Noteworthy in JRuby 1.1.3
JRuby 1.1.3 has been released a couple of days ago, and I wanted to highlight some of the most interesting changes in this release, from my perspective.
- RubyGems 1.2. If you ever installed even a single Ruby gem, you know the blues, it was taking way too long, consuming lots of memory and felt pretty heavyweight. In the past, we even had to increase the memory limits for JRuby up to 500Mb so that RubyGems could work without out of memory errors. Not anymore! RubyGems 1.2 is a fantastic release that speed things up dramatically, and JRuby 1.1.3 comes with it by default. Just try it and you'll be amazed, I promise!
Not only that, but RubyGems 1.2 is much easier to customize to suit the needs of particular implementations/platforms, and we've taken full advantage of that, eliminating essentially all custom JRuby-specific patches over the RubyGems sources. All in all, kudos to RubyGems team! - Performance. Some of us are obviously obsessed with performance, so there was a lot of work put into that area, yielding out nice results. And the first measurements are pretty encouraging. Here are some of the things that were improved in this release:
- Interpreter performance is greatly improved (Tom landed a couple of nice bombs on that one!)
- Core IO performance and memory requirements significantly improved, memory leaks in IO eliminated.
- Time's methods are much faster now.
- Some loops and block invocations improvements. Things like 10000.times {} are faster now.
- General performance fixes to use faster methods where possible, JIT tweaks, reduced object churn.
- Memory leaks in RubyArray are fixed.
- Compatibility/Conformance. There were *LOADS* or compatibility test fixes, most of them were driven by RubySpec tests and test failures. For StringIO alone, we had fixed at least 50 failures. Then, lots of fixes for Socket, ARGF, IO, Kernel.select, for problems with various empty expressions, etc. I tell you, if you feel serious about some important functionality in Ruby and want it to be properly tested and consistent across ever-growing list of Ruby implementations, there is no other good way but to join the RubySpec revolution!
- Usability. New command line options like
--serveror--clientinstead of -J-server or -J-client, also--debugand--jdbnow functional on Windows. Better out of memory error messages. Some 1.8.7-level improvements, like better Tempfile (with ability to specify the suffix/prefix for the file). Also, some classloader fixes so that JRuby could work better with 3rd party libraries like Spring, XMLDecoder, Activerecord-JDBC drivers shipped via RubyGems, JXTable, etc, would work without clunky workaround (see JRUBY-2495 for more details).
I think, that's quite enough for 1.5 month work… Enjoy!
21 Jul 2008 8:25pm GMT
20 Jul 2008
Planet Ruby
Ruby on Rails: Internationalization in edge Rails and more
There won't be a Living on the Edge this week, but you won't be starved for info because the Rails community is keeping up.
Ryan Daigle has been keeping up with some of the changes on edge Rails and has done a few awesome explanatory posts on them:
- Easy memoization with ActiveSupport::Memoizable
- Nested model mass assignments
- Metaclass accessor (aka the eigenclass)
Perhaps even more noteworthy is the introduction of I18n (internationalization) support to Rails core. Sven Fuchs explains the technical details and API as well as the history of Rails and I18n. I18n support is scheduled to be fully stable in the Rails 2.2 release. Have an interest in internationalization in Rails? Lend a hand with your ideas, feedback, and patches at the Google Group.
20 Jul 2008 11:39am GMT
Tomasz Wegrzanowski: The New Law of Demeter

Once upon a time I've stumbled upon The Law of Demeter, which says that you can only call methods on:
- self
- self's fields
- current method's parameters
- objects you created
- global objects
It's often shortened to never use two dots in a row. The implication of this law is that if you legitimately obtained order object, then order.customer and order.customer_name are valid, but order.customer.name is not.
At first I interpretted this law literally and (just like you did a moment ago) reached the obvious conclussion that the idea is total bollocks - writing (or even worse autogenerating) thousands of proxy methods like def customer_name; customer.name; end or even delegate :name, :to => :customer is not going to improve quality of your code at all.
It sounded like thousands of those little silly rules like "Use factories instead of constructors" that Java code monkeys make up all the time so they can think "If only everybody adhered to these rules, Java coding would be just fine" and keep blaming other people for their problems instead of blaming themselves for being so daft and sticking with Java. The same Java monkeys don't even adhere to their rules, and so Java coding isn't fine, but even if they did adhere it wouldn't help as you cannot fix technical problems on cultural level. So I forgot about the whole thing and moved along - another stupid rule for people who are afraid of the code.
Of course if this was how the story ended I wouldn't even bother blogging about it. On the contrary, I had a sudden realization - sure, the Law of Demeter has many obvious counterexamples of perfectly valid code that's illegal according to it, and would be made much worse if it was enforced. But most of them go away once it's subtly reinterpretted. There's a loophole for "objects you created" which doesn't have to mean "object you created by ClassName.new" - every object that was created at your request and the producer doesn't hold any reference to it any more can be considered to be "created by you" (old objects whose ownership was relinquished to you should also qualify, but this is a very rare case and I'm not going to talk about it any more). Let's call this interpretation "The New Law of Demeter" and apply it to an exaggerated example:
message = hash.keys.sort_by(&:abs).map(&:to_s).join(" ")
Naively interpretting the Law of Demeter it should not only be illegal but sent straight to the Guantanamo Bay - there are 4 explicit and 2 implicit dots in it, all in a single line.
But this is perfectly sensible piece of code. It does something useful and it's definitely far saner than hash.keys_sorted_by_absolute_value_stringified_and_joined_by_spaces, hash.keys_sorted_stringified_and_joined(&:abs, " ") or any other Demeter transmogrification I can think of.
But why is it really illegal? Every single dot creates a fresh object - Hash#keys does so, and so do Array#sort_by, Numeric#abs, Array#map, Numeric#to_s, and Array#join. The code doesn't violate the New Law of Demeter in any way!
The Law of Demeter is much more permissive when interpretted this way, but I wouldn't stop here yet. I really think order.customer.name is perfectly sensible even if customer is owned by the order. What counterargument some proponents of the Law make? Mostly that it makes customer's name harder to mock if you ever want to make order return customer name not tied to any genuine customer. But why should Order#customer be limited to returning real Customer objects? How about making it return read-only snapshots of current customer? This is exactly how order.customer.XXX is used vast majority of the time. As snapshots don't need any write behaviour, it's very easy to return a snapshot with overridden #name, and as a customer snapshot is a fresh object you've just created, you are allowed to call any methods on it you want.
So let's pretend that every time you do order.customer you really mean order.customer_snapshot, and code like this becomes legal:
puts order.customer.address.postcode
The New Law of Demeter lets you do a lot more than the old, including legalizing almost all sensible view code, but it still delegalizes a lot of dubious code. For example code like this is definitely not legal, as it expects read-write objects, not fresh snapshots:
order.customer.first_name = "Bob"
This is also not legal, as snapshots should be fresh objects and not be auto-updatable:
customer = order.customer
order.add(item)
puts customer.balance # item included or not?
but it can easily be transformed into this legal code:
order.add(item)
customer = order.customer
puts customer.balance # item definitely included
Reinterpretted this way, the New Law of Demeter is really useful at detecting code smell. Unfortunately it also becomes much more difficult to apply, as you need to look not only at the syntax but also at the sema an object to which the original creator still holds a reference; and does the method affect state or does it have snapshot access semantics and so on. I'd love to see a Demeter lint which would try to check compliance with The New Law of Demeter in a meaningful way.
20 Jul 2008 1:28am GMT
19 Jul 2008
Planet Ruby
Ruby on Rails: RailsConf Europe schedule now available
The schedule for RailsConf Europe sessions have now been made public. If you still haven't signed up to go, there is still time to make it. This is the last year we'll be in Berlin, so it's a great time to come by.
19 Jul 2008 7:09pm GMT
18 Jul 2008
Planet Ruby
Dave Thomas: If you work for Apple, we need your help...

So, it's been a week. The iPhone 3G was released, and the iPhone 2.0 software upgrade shipped. We can now all write applications that will run on what is undeniably a cool platform. You can even download the iPhone SDK for free.
We love Apple stuff. Andy and I both run Macs as our main machines (at last count, I have 8 Macs in my home). When the App Store was announced, we were keen to spread the word. We have a new book on iPhone software development. We have a brand new section in Bill Dudney's Core Animation book. And Mike Clark has a set of screencasts on how to write code for the little fella.
And we can't show you any of them.
Why? Because in order to get the iPhone SDK, you have to agree to the terms and conditions-you know, that standard box of legalese that you skip over before pressing I AGREE. Except, in this case, the legalese has some unfortunate consequences. For example, it says:
3. Confidentiality. As a Registered iPhone Developer, you may be receiving information from Apple. You agree that all information disclosed by Apple to you that relates to Apple's products, designs, business plans, business opportunities, finances, research, development, know-how, personnel, or third-party confidential information, will be considered and referred to collectively as "Confidential Information." ..... You agree not to disclose, publish, or disseminate any Confidential Information to anyone other than to other Registered iPhone Developers who are employees and contractors working for the same entity as you and then only to the extent that Apple does not otherwise prohibit such disclosure in this Agreement. Except for your authorized purposes as a Registered iPhone Developer, you agree not to use Confidential Information in any way, including, without limitation, for your own or any third party's benefit without the prior written approval of an authorized representative of Apple in each instance.
Then, later on, it says
10. Use Of Apple Trademarks, Logos, etc. You agree to follow Apple's Guidelines For Using Apple Trademarks and Copyrights as published on Apple's website at www.apple.com/legal/guidelinesfor3rdparties.html and as may be modified from time to time. You agree not to use the marks "Apple," the Apple Logo, "iPhone," "iPod touch" or any other marks belonging or licensed to Apple in any way except as expressly authorized in writing by Apple in each instance.
So, to write a book about the iPhone SDK, you have to download it. In order to download it, you have to accept the agreement. And the agreement says that the download will contain confidential information that you can't pass on to third parties. That makes it hard to publish the book. And, if that wasn't enough, it also appears that you can't even use the word "iPhone" (for example, in a book title).
We'd hoped these restrictions would be lifted at the announcement of the 3G. They weren't. We'd hoped they'd be lifted when the 2.0 software shipped last week. They weren't. We'd heard rumors they'd be lifted on July 14th. They weren't. And, what's worse, we can't pin down anyone who'll tell us just what is going on.
We're not the only people in this particular boat. Manning, for example, has an iPhone development book. They've published the part on doing web-based iPhone development (using the iPhone as a browser), but they're having to hold back the SDK-related material.
So, here's where you can help. If you work for Apple, and have any ideas on who we can contact to find out what's going on, we'd really appreciate knowing. We'll keep your personal information confidential. Just drop me an e-mail at dave@pragprog.com.
Thanks
Dave
18 Jul 2008 3:42pm GMT
17 Jul 2008
Planet Ruby
Mauricio Fernandez: Reexamining qsort, eager vs. lazy algorithm analysis and Ruby's (and other's) GC
Yesterday's post on The Comonad.Reader referred to my analysis of the list-based "quicksort" and described a faster function based on difference lists. It gave an example that troubled me for a while, and added that "the author [referring to me] was counting conses, unfortunately that isn't a valid metric":
reverse (a:as) = reverse as ++ [a] reverse [] = []
This function exhibits
performance, even though there seems to be one cons per level; this would show that just counting conses isn't enough, and invalidate my analysis. The catch is that there is not only one cons per level, and one must be careful and not forget the conses implied by the list concatenation operation. In my analysis of qsort, the number of conses verified

with
, where the final
term represents the conses needed to concatenate the sorted sublists.
If the above function is analyzed the same way I did with qsort, the number of conses satisfies

which clearly implies
.
The key term is
, which represents the conses associated to the concatenation, the same way
did in qsort's expression; had it not been there, the function would be
. So, while this example doesn't invalidate my proof, it serves as a good reminder of the cost of list concatenation and how easily it turns linear-time algorithms into
.
(Incidentally, it turns out that qsort would be
even without the extra term caused by list concatenation, so the proof, albeit incorrect, would have reached the correct conclusion.)
Strict vs. non-strict, eager vs. lazy analysis
While it seems the original analysis was sound, a new question was raised indirectly when Edward Kmett wrote that
it's not the number of times you cons that matters, but the number of thunks that have to be forced to get to the answer
I've shown above that this is only true if you aren't careful when counting conses, but the problem it suggests deserves to be considered by itself: can the analysis under a strict semantics be taken as an upper bound of the complexity in the non-strict or lazy cases?
I always assumed that the answer was a straight "yes". Intuitively, under lazy evaluation you have thunks with the data necessary to evaluate their associated expressions only when needed, so surely at most as much work as if they were always evaluated is performed, right?
After pondering about this longly, I've found a case where this wouldn't be true, suggested directly by an alternative definition of non-strict semantics:
Non-strict semantics means that a function can have a definite value although its argument is undefined.
What if an algorithm depended on an undefined value for termination? In practice, this would mean that it relies on an expression raising an exception to stop the computation and return some result after recovery. Whereas in the strict case the expression would be evaluated unconditionally and the computation would terminate, with a non-strict semantics it would be left unevaluated if not needed, thus failing to raise the exception and finish in a timely manner.
Edit (2008-07-18): There's an example below for an impure language. Anyway impurity + laziness is a nonsensical combination and the example represents 2 different programs in one that happen to give the same answer, so I'll be happy to analyze pure programs and assume that "lazy evaluation always needs less reduction steps than eager evaluation" to obtain upper bounds (thanks apfelmus).
The above case is contrived, and such a program would be needlessly complex. In fact, in OCaml it would be expressed awkwardly with a ref, and in Haskell pattern matching over an Either value (implicit in bind) would normally force evaluation, thus behaving as the strict counterpart.
Barring this exception, it seems to me that analyzing under a strict semantics remains a good way to get an upper bound for the non-strict case, but lacking a formal proof (and even with one) you can never be too sure. Any pointers?
Something relevant might be found by following the trail given by this paper by P. Wadler, whose abstract states the following:
In a strict functional language, analysis of time complexity is straightforward, because of the following compositional rule: The time to evaluate (ƒ(g x)) equals the time to evaluate (g x) plus the time to evaluate (ƒ y), where y is the value of (g x). However, in a lazy language, this rule only gives an upper bound, possibly a crude one.
I haven't read the paper yet nor followed its references/citations to see if there's a proof for that. The problem I can foresee is that the counterexample I came up with might not happen in the simplified languages/computational models used in most papers.
At any rate, it seems that, for all but a few contrived programs (relying on "useless" expressions being evaluated to raise an exception and terminate), simple "eager" analysis remains a good way to obtain an upper bound for the non-strict case. If it is low enough, we can stop there and enjoy.
Superlinear GC work: why Ruby scales badly, and a possible explanation of qsort's behavior?
17 Jul 2008 11:38am GMT
15 Jul 2008
Planet Ruby
Mauricio Fernandez: Quicksort erratum
A few days ago, I called the following Haskell function, reminiscent of Quicksort and considered the epitome of beautiful code by many, unusable for taking "always
space and time":
qsort [] = [] qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)
Moreover, I referred to this version (using a C-like DSEL in Haskell) due to Lennart Augustsson as a usable one (as opposed to a
algorithm):
qsortM :: forall i a m r arr .
(MonadRef m r, Num i, Ix i, MArray arr a m, Ord a Bool) =>
arr i a -> m (arr i a)
qsortM ma = runE $ do
(lb, ub) <- embed $ getBounds ma
let mlb = pure0 lb
mub = pure0 ub
a <- liftArray ma
let qsort' l r =
if1 (r > l) $ do
i <- auto l
j <- auto (r+1)
let v = a[l] :: E m a
iLTj = i < (j :: E m i)
while iLTj $ do
while ((i += 1) < mub && a[i] < v)
skip
while (a[j -= 1] > v)
skip
if1 iLTj $ do
a[i] =:= a[j]
a[l] =:= a[j]
qsort' l (j-1)
qsort' i r
qsort' mlb mub
return ma
I was wrong. Instead of analyzing it with some care, I believed the
claims I read long ago in some now forgotten blog. I just saw the list concatenations and reacted without thinking.
Finding an upper bound for the number of comparisons in the best case is easy. In fact, while I'am at it, I'll find a bound for the number of cons operations, which is clearly greater than that of comparisons.
Noting the number of conses C(n) for a list of size n, for n > 3 (there are a few irregularities for n = 1 to 3 because no cons is needed when appending or prepending [] to another list),
, where the problem for a list of length n has been divided into two halves (lists of length p and q) plus a lone pivot element. p (resp. q) conses are needed to build the list passed to the recursive call to qsort, C(p) (C(q)) conses are needed for the subproblem, and finally p + 1 conses are needed to concatenate the resulting sorted sublists.
The problem is how to find p and q. For the best case analysis,
. The n list can be divided into the desired three parts as follows:


Therefore,

(There are two choices for the number of conses in the final list concatenation, differing by 1. The above is a conservative one, which will give an upper bound.)
Even after referring to Graham & Knuth & Patashnik, I saw no immediate way to solve this recurrence (please drop a comment if you have a closed-form solution). Fortunately, I just need an upper bound so I can use

which is easily proved by induction, since all I've done is to remove the floor operations.
Then, using the master theorem


(Note the use of
for C(n) and
for D(n).)
The above program thus uses at most
space and time. The upper bound is satisfactory enough; in fact, it's the same as for the real quicksort, and, furthermore, the lowest achievable bound for a comparison sort.
So far, I've assumed that the program evaluates everything eagerly, which might not be the case in Haskell, as it has a non-strict semantics (which is not exactly the same as being lazy). Let's see if GHC is really smart and can recognize that qsort(l) is just a permutation of l to skip the sort altogether when some operation that doesn't depend on the order is performed (this is quite close to proving that the qsort function actually sorts the list, and would be an amazing exploit indeed).
To that end, I first rewrite the function in OCaml; if GHC is able to do less work thanks to its non-strict semantics, the solution will scale better than with OCaml:
let rec qsort = function [] -> [] | x::xs -> qsort (List.filter ((>) x) xs) @ [x] @ qsort (List.filter ((<=) x) xs)
It's very similar to the Haskell one; the main difference is in the signs used with List.filter: ((>) x) means "x is greater than _", as opposed to "_ is greater than x" in Haskell.
The main function just creates a list of random integers, sums them up, and finally adds the numbers from the sorted list --- obtaining the same number, I hope (the sums are printed to stdout to make sure that computation is not omitted).
Here are the times I get:

The GHC binary takes nearly 800MB to sort a 3 million list, which explains why it suffers as n increases.
Well, this isn't... what was it, GHC 11? Being lazy does pay off if you try to get the nth (and in particular the first) element, though; using
15 Jul 2008 10:52am GMT
12 Jul 2008
Planet Ruby
Ruby on Rails: This Week in Rails (July 11, 2008)
Welcome to the third edition of This Week in Rails, a weekly report with highlights from the Rails community. My apologies for the delay of this post, the past two weeks have been pretty crazy, so this edition covers the most interesting articles and news from the past two weeks.
Let's kick off this report with a couple of maintenance releases by Jamis Buck. Both Capistrano 2.4.3 and Net::SSH 2.0.3 were published two weeks ago. If you use them, consider upgrading.
Rails 2.1 has been out for a while now, but in case you didn't have a chance to catch up yet, this post collects several links to useful resources which will help bring you up-to-date.
The Pathfinder Development's blog put out three highly interesting posts. The first is More Named Scope Awesomeness by Noel Rappin, while the second and third ones are Pretty blocks in Rails views and DRYing up Rails Controllers: Polymorphic and Super Controllers, both by Josh Symonds. Another good (and quick) recent read about controllers, was MVC: How to write controllers.
The same Noel also published the second part of "Developing iPhone applications using Ruby on Rails and Eclipse" for DeveloperWorks (part 1 and 2).
FiveRuns released a valuable gem called data_fabric which adds support for sharding and replication to Active Record. The same company also has a contest up and they're offering two free tickets to RailsConf Europe in Berlin. Speaking of conferences, Fabio Akita announced that there will be a Rails Summit Brazil 2008 this coming October in São Paulo. This will be the first event of its kind for the Rails community in South America.
An improved version (i.e. 1.1.1) of the Oracle enhanced adapter was released, as well as version 0.9.5 of the IBM_DB adapter for DB2 and Informix, which adds support for Rails 2.1.
In purely chronological order, I found the following articles to be worth pointing out: Speed up slow Rails development in vista - a handy tip for developers using Vista, Adding Google Maps To Your Rails Applications, Live fulltext search in Ruby on Rails and Useful Flash Messages in Rails.
The Railscasts website published two new episodes, one on testing through Selenium, and another on semi-static pages.
Finally, let's close this edition on a lighter note. The next time you are about to create an acts_as_an_evil_genius plugin or other imaginatively named one, think about this post. ;-)
If you'd like to read more updates from the Ruby side of things, please head over to This Week in Ruby.
12 Jul 2008 2:04am GMT
11 Jul 2008
Planet Ruby
Jim Weirich: Rails Conf 2008 Summary
Conference Summary Video
Wow, what a great conference! There was a lot of energy flowing at RailsConf this year. Overall I'd rate this year as head and shoulders above last year. I'm not going cover much here, but will direct you attention to a Rails Envy VideoCase that Greg Pollack put together. The video is a series of very short interviews with a number of presenters giving summaries of their own talks. The only downside with the video is that I wish it was available before the conference. I see there were a number of interesting talks that I missed.
Followup on the "Modelling Dialogue"
Joe O'Brien, Chris Nelson and myself did a dialogue style presentation on the difference between object modelling and data modelling. The most common question I got after the talk was requests for book titles to learn more about object oriented modelling. Here are the books that Joe, Chris and I have recommended:
- Domain Driven Design-Eric Evans
- AgileSoftware Development, Principles, Patterns, and Practices-Bob Martin
- Refactoring: Improving the Design of Existing Code-Martin Fowler
11 Jul 2008 11:34am GMT
Jim Weirich: How did you get started in software development.
Tagged
Looks like Joe O'Brien tagged me for answers to the following questions. He, in turn, was tagged by Josh Owens, who in turn was tagged by Jeff Blankenburg. It looks like Sarah Dutkiewicz and Micheal Eaton started this.
OK, sounds like fun. Here goes.
How old were you when you started programming?
I was introduced to programming in high school by reading a book on the topic. The book taught me how to write machine code for a strange decimal-based machine. Unfortunately, there was no actual computer involved in the process. Shoot, who had computers back then? Certainly not our high school (the personal computers? not invented yet!)
In college, I learned a smattering of FORTRAN. Just enough to drive a Calcomp plotter to plot data from my undergraduate physics courses. But didn't really get into programming until my junior year in college. (Story continued in next question)
How did you get started in programming?
So, I was planning out the courses for my junior year in college and I had a hole in my math courses. The math class I needed was not offered that semester, so my adviser suggested taking a computer programming course. He said it would be useful and, who knows, I might enjoy it.
So I signed up for an introduction to FORTRAN course, figuring it would be easy because I already knew a little bit of FORTRAN. I show up on the first day of class and after a few preliminaries the instructor jumps right into some code, that looked like this:
(de member (pip deck) (cond
((null deck) nil)
((eq pip (car deck)) t)
(t (member pip (cdr deck)))))
I remember scratching my head and thinking this was the strangest FORTRAN I had ever seen. I was totally confused for about three days, then something clicked on the third day of class. I suddenly "got" what the instructor was trying to get across and it all made perfect sense.
If you haven't figured it out yet, the instructor taught us Lisp as part of an introduction to FORTRAN. The instructor turned out to be Daniel Friedman, the author of The Little Lisper, and was well known in the Lisp community. That small exposure to Lisp hooked me on programming from that point on. I took as many CompSci courses as I could in my remaining year and a half in college. I eventually graudated with a BS in Physics, but had a strong background in Computer Science as well.
What was your first language?
Technically, FORTRAN was my first language. But Lisp is the language I fell in love with and is what got me hooked on programming.
What was the first real program you wrote?
I have a very clear memory of the very first program I wrote professionally. The reason it is so clear is that this was the first program I wrote that was intended for actual use by someone who wanted it. Everything else up to that time was done for my own personal enjoyment or to satisfy some course requirement.
The program calculated the "critical angles" of "pieces". I was given the requirements by Anne Exline, a senior programmer, and proceeded to write the program to spec. It took a few days, but when I was done I showed the result to Anne and she was pleased with the result.
The funny thing is that I had no idea what a "piece" was nor what was so critical about the angles I was calculating. I was so excited about writing an actual program that I did not ask until the software was done. When asked, Anne just looked at me funny and said "Rocket Pieces". When Cape Canaveral lauches a rocket, they track it very carefully to make sure it stays on course. If it strays, the range safety officer is required to activate the self destruct. The critical angles are those angles that would cause the "rocket pieces" to land outside the safety area of the flight path.
So, my very first professional program was not only useful, it might actually save lives.
What languages have you used since you started programming?
Languages I have used as part of my professional career (in roughly chronological order) include FORTRAN, various assembly languages, FORTH, C, PL/M, C++, Java, Ruby.
Languages I have used in addition to those mentioned above: Pascal, Perl, Eiffel, and Lisp/Scheme.
Languages I can read, but never wrote anything significant in them: Ada, Python, Erlang, Smalltalk, SNOBOL, Algol, Pascal.
What was your first professional programming gig?
I was hired by the RCA Missile Test project in Cape Canaveral, Florida as a Near Real Time Analyst. Duties included programming various launch related software (e.g. the critical angle program mentioned above) and working launch support.
The launch support was the "Near Real Time" part of the job description. From the moment a rocket is launched until it reaches orbital velocity, any malfunction could cause it to fall back to earth. During this initial portion of the launch, the launch is monitored in "real-time" so that we know exactly where it would land if the engines were to cut off NOW. Trajectory calculations had to be done in fractions of a second and updated constantly in real time.
After the rocket reaches oribital velocity (usually somewhere between 8 and 14 minutes into its flight), it won't fall back to earth. At this point the real time trajectory program is shut down and the near real time program is started. The near real time program can take a few minutes to calculate a more exact orbital prediction and then send that prediction to downrange radars (e.g. the the Ascension Island station) that won't see the rocket until about 20 minutes after launch. It was the job of the Near Real Time analyst to run that program and provide oribital predictions for downrange station.
If there is one thing you learned along the way that you would tell new developers, what would it be?
Find something that you enjoy and do that. Life is too short to work in a job that you dislike.
What's the most fun you've ever had … programming?
Oh, the fun I have had. This story still makes me smile.
My first computer was a single board Z80 microcomputer with 4 KB of memory. I wrote a small FORTH-like interpreter for it and hacked a version of the animal game in FORTH. The animal game is a program that plays 20 questions to figure out what animal you are thinking of. It constructs a binary tree where each node is a question and the subtrees are the yes and no answers to the question. To play the game, all the program does is walk the tree, ask the question at the current node and follow either the YES branch or the NO branch as appropriate.
If the program guesses wrong, it will ask you for your animal and a question that will distinguish your animal from the one it guessed. It then adds your question to the tree. By this extremely simple mechanism, it is able to expand its knowledge base. (see Ruby Quiz #15 for more details).
I had just finished the program and had seeded it with a single animal, a mouse. I turned to my wife and asked her to play the game. She thinks of an animal and starts the program, which immediately asked her "Is it a mouse?". She turned to me with surprise and said "How did it know?". Of course, the animal she picked was a mouse.
I don't think I have ever impressed anyone with my programming skills as much as she was impressed with that game.
Who's up next?
I'm tagging the following people. Remember, this is entirely voluntary so don't feel obligated to answer. But I'm betting the answers are interesting:
11 Jul 2008 11:34am GMT
Jim Weirich: Artichoke Music Rocks
The Musician's Birds of a Feather gathering at RailsConf was great. We had a room full people, two guitars, a ukulele, a flute, several harmonicas and an improvised drum set. Unfortunately, one of the guitars was an electric travel guitar which had a dead battery, therefore no way to really hear it.
However, the other guitar was a nice Epiphone accoustic which was passed from player to player. It became the quickly became the basis for most of the music performed that night.
I want to thank Artichoke Community Music for supplying the guitar. Travelling with a guitar by plane is a big pain, so I arrived with nothing to bring to the music BOF. I called several local music stores looking for a guitar that I could rent for an evening. Artichoke music said they had a "not-for-profit" guitar that they would let me borrow for a day. Not many stores would do that for an out-of-town stranger.
So, if you're in Portland looking for a good guitar store, check out the great people at Artichoke Community Music.
11 Jul 2008 11:34am GMT
Ruby Weekly News: 8th - 14th July 2007
11 Jul 2008 11:32am GMT
Ruby Weekly News: 22nd - 28th July 2007
11 Jul 2008 11:32am GMT
Ruby Weekly News: 15th - 21st July 2007
11 Jul 2008 11:32am GMT
Tom Copeland: Browsing git repositories on RubyForge
Thanks to gitweb and an assist from Garry Dolley, you can now browse the Git repositories that are hosted on RubyForge. For example, Garry's EBay4R project's scm page now links to a browsable repo. Good times.
The speed is decent, especially considering it's a CGI script. The GitHub browser is much nicer, but, still, 'tis a start. Enjoy!
11 Jul 2008 3:52am GMT




