09 Nov 2011

Planet Perl Six

Jonathan Worthington (6guts): Slides from my Optimizing Rakudo Perl 6 talk

Over the weekend, I visited Bratislava, the beautiful city I once called home, for a few days. It felt oddly familiar, and I found myself noticing all kinds of little changes here and there - where one shop had given way to another, or a statue had appeared or changed. Happily, my favorite eating and watering holes were still there, and my sadly somewhat rusted Slovak language skills were still up to decoding menus and ordering tasty beer, and I did plenty of both. :-)

I was there to attend the Twin City Perl Workshop. I repeated my Perl 6 grammars talk, and gave a new one about optimizing Rakudo. This covered both the optimization work that I and others have been doing, and some details about the optimizer itself. I also made a couple of nice diagrams of Rakudo's overall architecture and what it does with a program.

You can get the slides here, or if you're heading to the London Perl Workshop this coming Saturday, I'll be delivering it there too. Enjoy! :-)


09 Nov 2011 10:41pm GMT

Solomon Foster: Fixing Tags

So, in my last post I identified 3906 files in my MP3 collection with missing tags. This time I set out to fix some of them.

So, first I went through the list I generated with the last script and singled out all 2294 files which used a standard pattern of Artist / Album / Track Number - Track Name. Then I wrote this script:

my $for-real = Bool::True;

constant $TAGLIB  = "taglib-sharp,  Version=2.0.4.0, Culture=neutral, PublicKeyToken=db62eba44689b5b0";
constant TagLib-File    = CLR::("TagLib.File,$TAGLIB");
constant String-Array   = CLR::("System.String[]");

for lines() -> $filename {
    my @path-parts = $filename.split('/').map(&Scrub);
    my $number-and-title = @path-parts.pop;
    next unless $number-and-title ~~ m/(\d+) \- (.*) '.mp3'/;   # '.mp3' is quoted so the dot matches literally
    my $track-number = ~$0;
    my $title = ~$1;
    my $album = @path-parts.pop;
    my $artist = @path-parts.pop;
    say "$artist: $album: $title (track $track-number)";

    if $for-real {
        my $file;
        try {
            $file = TagLib-File.Create($filename);
            CATCH { say "Error reading $filename" }
        }
        next unless $file.defined;   # Create failed and was caught above; skip this file

        $file.Tag.Track = $track-number.Int;
        $file.Tag.Album = $album;
        $file.Tag.Title = $title;
        $file.Tag.Performers = MakeStringArray($artist);
        
        try {
            $file.Save;
            CATCH { say "Error saving changes to $filename" }
        }
    }
}

sub Scrub($a) {
    $a.subst('_', ' ', :global);
}

sub MakeStringArray(Str $a) {
    my $sa = String-Array.new(1);
    $sa.Set(0, $a);
    $sa;
}


For the main loop, the first half uses standard Perl techniques to extract the artist, album, and track info from the path. The second half sets the tags. Opening the file is the same as last time, and then setting Track, Album, and Title is as simple as could be. The Performers tag is a bit tricky, because it's a string array (the others are simple strings or integers) and Niecza doesn't know how to do the coercion automatically. MakeStringArray gets the job done nicely.
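If a track ever needs more than one performer, the same trick generalizes; here's a sketch building on the String-Array constant above (MakeStringArrayN is an invented name, and this is untested):

sub MakeStringArrayN(*@items) {
    my $sa = String-Array.new(@items.elems);   # CLR array of the right size
    $sa.Set($_, @items[$_]) for ^@items.elems;
    $sa;
}

# e.g. $file.Tag.Performers = MakeStringArrayN('Simon', 'Garfunkel');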

So, if you've done this sort of thing in Perl 5 using the MP3 CPAN modules, there's nothing at all revolutionary about this code. But it feels really good to be able to do it with Perl 6!


09 Nov 2011 3:12am GMT

08 Nov 2011

Planet Perl Six

Carl Masak: Macro grant accepted

A while ago, I applied for a Hague grant to give Rakudo macros. It has now been accepted. My reaction:

Yay! \o/

I haven't blogged much lately, and I suspect it might be because my blogging plans are too ambitious to fit in my fragmented schedule. So I'll stop here. But expect the first status report of the grant soonish.

08 Nov 2011 10:20am GMT

06 Nov 2011

Planet Perl Six

Solomon Foster: Examining MP3 Tags

I've been playing around with Niecza's ability to handle CLR libraries. It's actually kind of intoxicating; it's the closest thing yet to having a CPAN for Perl 6. So I decided to see what I could do with the TagLib# library for dealing with media file tags.

Now, I've got an MP3 library with 23564 MP3 files in it, the majority of which were created by older ripping programs that didn't do anything with the ID3 tags. Most of those have been updated to include tags, but every now and then I add one of the old directories to iTunes and get a bunch of "Unknown Artist" / "Unknown Album" tracks.

So I thought a nice first project would be figuring out which of the tracks were correctly tagged. The first thing to do was to get TagLib# properly installed on my MacBook Pro. make install didn't add the DLL to the main GAC; I ended up installing it there by hand, which was trivially easy once I knew what to do:

sudo gacutil -i taglib-sharp.dll
sudo gacutil -i policy.2.0.taglib-sharp.dll

Once I had that done, I experimented with it for a bit, and ended up with this script:

constant $TAGLIB  = "taglib-sharp,  Version=2.0.4.0, Culture=neutral, PublicKeyToken=db62eba44689b5b0";
constant TagLib-File    = CLR::("TagLib.File,$TAGLIB");

for lines() -> $filename {
    try {
        my $file = TagLib-File.Create($filename);
        unless $file.Tag.JoinedPerformers ~~ m/\S/ && $file.Tag.Title ~~ m/\S/ {
            say $filename;
        }
        CATCH { say "Error reading $filename" }
    }
}

The first line specifies the exact assembly we want to use; you can get the details from gacutil -l. The next line effectively imports the TagLib::File class into Niecza. I get my filenames from stdin, as that allows me to use find to generate the list of MP3 files.

This was my first use of exception handling in Perl 6. I needed it because TagLib-File.Create throws an exception when it isn't happy with the MP3 file. When it is happy with it, $file is an object of type CLR::TagLib::Mpeg::AudioFile. $file.Tag.JoinedPerformers gives the list of performers (AKA artists) as a single string; $file.Tag.Title gives the title as a string. Unless we find a valid non-space character in both of them, we flag the file by printing it out.

Really, the only way it could be significantly simpler than this would be if the constant TagLib-File line were unnecessary!

End result: I have a list of 3906 files it flagged, 77 of which were read errors.

My next step is to write some code which translates the filenames (which are mostly of the form /Volumes/colomon/Albums/Dervish/Live_in_Palma/04-Slow_Reels.mp3) into artist, album, and track name fields, and then sets those tags. Based on my initial experiments, I think it's going to be incredibly easy…


06 Nov 2011 8:54am GMT

01 Nov 2011

Planet Perl Six

perl6.announce: Announce: Niecza Perl 6 v11 by Stefan O'Rear


Announce: Niecza Perl 6 v11

This is the eleventh release of Niecza Perl 6, as usual scheduled on
the last Monday of the month, at least in the US west coast time zone.

You can obtain a build of Niecza from [1]. This build contains a
working compiler as a set of .exe and .dll files suitable for use with
Mono or Microsoft .NET. If you wish to follow latest developments,
you can obtain the source from [2]; however, you will still need a
binary for bootstrapping, so you gain nothing from a "source is
better" perspective.

Niecza is a Perl 6 compiler project studying questions about the
efficient implementability of Perl 6 features. It currently targets
the Common Language Runtime; both Mono and Microsoft .NET are known to
work. On Windows, Cygwin is required for source builds only; see the
README for details.


List of changes


[Major features / Breaking changes]

The compile time / runtime metamodel unification mentioned in the v10
announce is done now. Real Sub and ClassHOW objects are created at
compile time and stored in a .ser (serialized data) file alongside the
.dll. When using modules the .ser file is loaded; .nam files are gone.
Some operations are faster and others are slower.

All non-dotnet backends no longer work and have been removed, since
niecza now requires much closer integration between the front and back
ends. Restoring them would not be impossible.

BEGIN time code execution is now supported! In particular, a BEGIN in
a module is run *once*; any objects it creates will be serialized, and
the BEGIN is not re-run at runtime. It is erroneous to modify objects
owned by a different module at BEGIN time.


[Minor new features]

The x*+ syntax is now supported as a synonym of x**, doing possessive
quantification like Perl 5.

"make spectest" now respects TEST_JOBS. (Will Coleda)

todo is now supported in tests.

Various improvements to the p5 interop system. (Paweł Murias)

CLR interop example improvements: make clock resizable, add notepad,
tetris, webbrowser. (Martin Berends)

Fleshed out TextWriter and open to some semblance of usability.
Added close, unlink, mkdir. (Solomon Foster)

Added .pick and .roll. (Solomon Foster)

Added log, log10, exp, cis, polar, gcd, lcm. (Solomon Foster)

Handling of variable types that are constrained to Mu but default to
Any is now more consistent.

[Selected bug fixes]

grep /regex/, @list no longer crashes.

" (a source file with an unclosed string) no longer crashes the compiler.


Getting involved

Contact sorear in irc.freenode.net #perl6 or via the sender address of
this mailing. Also check out the TODO file; whether you want to work
on stuff on it, or have cool ideas to add to it, both are good.

Future directions

In the wake of the /serialize branch merge there will be a lot of
bugs to fix and documents to update. I am also looking at designing
a good practical realization of S11 and module packaging.


[1] https://github.com/downloads/sorear/niecza/niecza-11.zip
[2] https://github.com/sorear/niecza

01 Nov 2011 12:04pm GMT

24 Oct 2011

Planet Perl Six

Tadeusz Sośnierz (tadzik): MuEvent: AnyEvent lookalike for Perl 6

While struggling with Select in Parrot, I accidentally discovered that its Socket has a .poll method. What a trivial, yet satisfying way to have some simple non-blocking IO. Thus, MuEvent was born.

Why MuEvent? Well, in Perl 6, Mu can do much less than Any. MuEvent, as expected, can do much less than AnyEvent, but it's trying to keep the interface similar.

You're welcome to read the code, and criticise it all the way. Keep in mind that I have no idea how I should properly write an event loop, so bonus points if you tell me what could have been done better. I don't expect MuEvent to be an ultimate solution for event-driven programming in Perl 6, but I hope it will encourage people to play around. Have an appropriate amount of fun!


24 Oct 2011 9:06pm GMT

18 Oct 2011

Planet Perl Six

perl6.announce: Parrot 3.9.0 "Archaeopteryx" Released by Jonathan "Duke" Leto

On behalf of the Parrot team, I'm proud to announce Parrot 3.9.0
"Archaeopteryx".
Parrot (http://parrot.org/) is a virtual machine aimed at running all
dynamic languages.

Parrot 3.9.0 is available on Parrot's FTP site
(ftp://ftp.parrot.org/pub/parrot/releases/supported/3.9.0/), or by following the
download instructions at http://parrot.org/download. For those who would like
to develop on Parrot, or help develop Parrot itself, we recommend using Git to
retrieve the source code to get the latest and best Parrot code.

Parrot 3.9.0 News:
- Core
  + The whiteknight/kill_threads branch was merged, which removes the old
    and broken thread/concurrency implementation. Better and more flexible
    concurrency primitives are currently being worked on. This also
    involved removing some of the last vestiges of assembly code from
    Parrot, as well as removing the share and share_ro vtables.
  + random_lib.pir was removed, since better alternatives already exist
  + The freeze and thaw vtables were removed from the Default PMC, because
    they weren't useful and caused hard-to-find bugs.
  + A new subroutine profiling runcore was added. It can be enabled with
    the command-line argument -R subprof. The resulting data can be
    analyzed with kcachegrind.
  + Added a get_string VTABLE to the FixedIntegerArray and
    FixedFloatArray PMCs
  + The update() method was added to the Hash PMC, which updates one Hash
    with the contents of another. This speeds up rakudo/nqp startup time.
- Languages
  + Winxed
    - Updated snapshot to version 1.3.0
    - Added the builtin sleep
    - Modifier 'multi' allows some more multi functionality
- Community
  + New repo for the Parrot Alternate Compiler Toolkit, a
    re-implementation of PCT in Winxed: https://github.com/parrot/PACT
- Documentation
  + We are in the process of migrating our Trac wiki at
    http://trac.parrot.org/ to Github at
    https://github.com/parrot/parrot/wiki
  + Packfile PMC documentation was updated
- Tests
  + Select PMC tests improved to pass on non-Linuxy platforms

The SHA256 message digests for the downloadable tarballs are:

923b5ef403c26dd94c04127940659aea94516f79243a80de65fbababff44bfad  parrot-3.9.0.tar.bz2
568bfffad0bc7595164f342cd39c33ac967286423844491e85a8f9767f15871c  parrot-3.9.0.tar.gz

Many thanks to all our contributors for making this possible. This release
comprises 182 commits by 17 authors on the master branch since the
previous release:

Michael Schroeder, Whiteknight, soh_cah_toa, Jonathan "Duke" Leto,
Brian Gernhardt, Andy Lester, Christoph Otto, Peter Lobsinger,
jkeenan, NotFound,
Jimmy Zhuo, Stefan Seifert, Andrew Whitworth, Francois Perrad, Moritz Lenz,
Tadeusz Sośnierz, gerd

Our next scheduled release is 15 November 2011.

Enjoy and may the force be with you!

Duke

--
Jonathan "Duke" Leto <jonathan@leto.net>
Leto Labs LLC
209.691.DUKE // http://labs.leto.net
NOTE: Personal email is only checked twice a day at 10am/2pm PST,
please call/text for time-sensitive matters.

18 Oct 2011 8:19pm GMT

Solomon Foster: Ease of FatRat construction

So, on #perl6 today someone tried using the numeric literal .3333333333333333333333333333333. (Warning: exact number of 3's may not match original example.) By the spec (as I understand it), this is a Num, because a Rat isn't accurate enough to represent it. (Not that a Num is, mind you!)

And that got me to thinking: What if you really wanted a FatRat, so you actually got that exact number? Well, if you're using Niecza (the only p6 to implement FatRat so far), the answer is FatRat.new(3333333333333333333333333333333, 10000000000000000000000000000000). IMO, that's ridiculously awkward.

The spec may imply you can do it with ".3333333333333333333333333333333".FatRat. That at least avoids the problem of counting the zeros, but it's still on the ugly side. Likewise FatRat.new(".3333333333333333333333333333333") is awkward. Still, we should certainly support at least one of these options.
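In the meantime, the zero counting can at least be hidden in a helper; a minimal sketch (the sub name decimal-fatrat is invented, and it only handles plain unsigned decimals):

sub decimal-fatrat(Str $s) {
    $s ~~ / ^ (\d*) '.' (\d+) $ / or die "not a plain decimal literal";
    my $denom = 10 ** (~$1).chars;   # one zero in the denominator per digit after the point
    FatRat.new((~$0 || '0').Int * $denom + (~$1).Int, $denom);
}

say decimal-fatrat(".3333333333333333333333333333333").perl;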

I would like to propose again adding an F suffix to indicate a numeric literal should be a FatRat. I don't think this is something that can reasonably be done with a postfix operator, because if you treat .3333333333333333333333333333333 like a normal numeric value and then try to FatRat it, you will lose the precision you want.

Just as a quick comparison, here's a bit of the old endless pi code using the FatRat constructor:

sub unit() { LFT.new(q => FatRat.new(1, 1),
                     r => FatRat.new(0, 1),
                     s => FatRat.new(0, 1),
                     t => FatRat.new(1, 1)); }


I'm proposing we should be able to write that as

sub unit() { LFT.new(q => 1F,
                     r => 0F,
                     s => 0F,
                     t => 1F); }


Much shorter and much clearer. I think that's a big win.

(Note: I'm in no way particularly attached to the letter "F" for this, that was just the first thing that came to mind.)


18 Oct 2011 5:18pm GMT

15 Oct 2011

Planet Perl Six

Jonathan Worthington (6guts): An optimizer lands, bringing native operators

For some weeks now, I've been working on adding an optimizer pass to Rakudo and implementing an initial set of optimizations. The work has been taking place in a branch, which I'm happy to have just merged into our main development branch. This means that the optimizer will be included in the October release! :-) In this post, I want to talk a little about what the optimizer can do so far.

When Optimization Happens

When you feed a Perl 6 program to Rakudo, it munches its way through your code, simultaneously parsing it and building an AST for the executable bits, and a bunch of objects that represent the declarative bits. These are in various kinds of relationship; a code object knows about the bit of as-yet uncompiled AST that corresponds to its body (which it needs to go and compile just in time should it get called at BEGIN time), and the AST has references to declarative objects (types, subs, constants). Normally, the next step is to turn this AST into intermediate code for the target VM (so for Parrot, that's PIR). The optimizer nudges its way in between the two: it gets to see the fully constructed AST for the compilation unit, as well as all of the declarative objects. It can twiddle with either before we go ahead and finish the compilation process. This means that the optimizer gets to consider anything that took place at BEGIN and CHECK time also.

Using The Optimizer

The optimizer has three levels. The default level is 2. This is "optimizations we're pretty comfortable with having on by default". It's possible to pass -optimize=3, in which case we'll throw everything we've got at your program. If it breaks as a result, please tell us by filing an RT ticket; this is the pool of candidate optimizations to make it into group 2. After an optimization has had a while at level 2, combined with a happy and trouble-free history, we'll promote it into level 1. Using -optimize=1 at the moment gets you pretty much nothing - the analysis but no transformations. In the long run, it should get you just the optimizations we feel are really safe, so you won't lose everything if you need to switch down from -optimize=2 for some reason. Our goal is that you should never have to do that, of course. However, it's good to provide options. My thanks go to pmichaud++ for suggesting this scheme.

Compile Time Type Checking of Sub Arguments

One thing the optimizer can do is consider the arguments that will be passed to a subroutine. If it has sufficient type information about those arguments, it may be able to determine that the call will always be successful. In this case, it can flag to the binder that it need never do the type checks at run time. This one can actually help untyped programs too. Since the default argument type is Any, if you pass a parameter of one subroutine as an argument to another, it can know that this would never be a junction, so it never has to do the junction fail-over checks.
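For example, a sketch of the kind of call this helps (illustrative only, not literal optimizer output):

sub shout(Str $s) { say $s.uc }
my Str $greeting = "hello";
shout($greeting);   # $greeting is declared Str, so the binder's
                    # run-time type check on $s can be skipped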

Compile Time Multiple Dispatch Resolution

While the multiple dispatch cache that current Rakudo has is by some margin the best it has ever had in terms of lookup performance, it still implies work at run time. When enough information about the types of the arguments is present, the optimizer is able to resolve some multiple dispatches at compile time, by working out cases where the dispatch must always lead to a certain candidate getting invoked. Of course, how well it can do this depends on the type information it has to hand and the nature of the candidates. This is a double saving: we don't have to do the multiple dispatch, and we don't have to do the type checks in the binding of the chosen candidate either.
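A small illustration (invented for this purpose, not from Rakudo's test suite):

multi sub scale(Int $n) { $n * 2 }
multi sub scale(Str $s) { $s x 2 }

my Int $i = 21;
say scale($i);   # 42; only the Int candidate can ever match here, so the
                 # dispatch can be settled once, at compile time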

Basic Inlining

In some (currently very constrained) cases, if we know what code is going to be called at compile time, and we know that the types of arguments being passed are all OK, we can avoid making the call altogether and just inline the body of the subroutine right into the caller. Of course, this is only beneficial in the case where the work the subroutine does is dominated by the overhead of calling it, and there are some cases where inlining is impossible to do without causing semantic differences. For now, the focus has been on doing enough to be able to inline various of the setting built-ins, but it's in no way restricted to just doing that. With time, the inline analysis will be made much smarter and more capable.

Native Operators

As part of getting the optimizer in place, moritz++ and I have also worked on native operators (that is, operators that operate on native types). This boils down to extra multiple dispatch candidates for various operators, in order to handle the natively typed case. However, something really nice happens here: because you always have to explicitly declare when you are using native types, we always have enough type information to inline them. Put another way, the native operator multis we've declared in the setting will always be inlined.

We've some way to go on this yet. However, this does already mean that there are some nice performance wins to be had by using native types in your program (int and num) where it makes sense to do so.

As an example, with -optimize=3 (the maximum optimization level, not the default one), we can compare:

my $i = 0; while $i < 10000000 { $i = $i + 1 }; say $i

Against:

my int $i = 0; while $i < 10000000 { $i = $i + 1 }; say $i

On my box, the latter typed version completes in 4.17 seconds, as opposed to the untyped version, which crawls in at 33.13 (so, a factor of 8 performance gain). If you're curious how this leaves us stacking up against Perl 5, on my box it does:

my $i = 0; while ($i < 10000000) { $i = $i + 1 }; say $i

In 0.746 seconds. This means that, with type information provided and for this one benchmark, Rakudo can get within a factor of six of Perl 5 - and the optimizer still has some way to go yet on this benchmark. (Do not read any more into this. This performance factor is certainly not generally true of Rakudo at the moment.)

We'll be continuing to work on native operators in the weeks and months ahead.

Immediate Block Inlining

We've had this in NQP for a while, but now Rakudo has it too. Where appropriate, we can now flatten simple immediate blocks (such as the bodies of while loops) into the containing block. This happens when they don't require a new lexical scope (that is, when they don't declare any lexicals).
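Roughly, the distinction looks like this (an invented illustration):

my $i = 0;
while $i < 5 { $i = $i + 1 }    # body declares no lexicals; it can be
                                # flattened into the enclosing block
while $i < 9 { my $j = $i; $i = $j + 1 }    # declares $j, so it keeps
                                            # its own scope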

That Could Never Work!

There's another nice fallout of the analysis that the optimizer does: as well as proving dispatches that will always work out at compile time, it can also identify some that could never possibly work. The simplest case is calling an undeclared routine, something that STD has detected for a while. However, Rakudo goes a bit further. For example, suppose you have this program:

sub foo($x) { say $x }
foo()

This will now fail at compile time:

CHECK FAILED:
Calling 'foo' will never work with no arguments (line 2)
    Expected: :(Any $x)

It can also catch some simple cases of type errors. For example:

sub foo(Str $s) { say $s }
foo(42)

Will also fail at compile time:

CHECK FAILED:
Calling 'foo' will never work with argument types (int) (line 2)
    Expected: :(Str $s)

It can handle some basic cases of this with multiple dispatch too.

Propagating Type Information

If we know what routine we're calling at compile time, we can take the declared return type of it and use it in further analysis. To give an example of how this aids failure analysis, consider the program:

sub foo() returns Int { 42 }
sub bar(Str $s) { say $s }
bar(foo())

This inevitable failure is detected at compile time now:

CHECK FAILED:
Calling 'bar' will never work with argument types (Int) (line 3)
    Expected: :(Str $s)

The real purpose of this is for inlining and compile time multi-dispatch resolution though; otherwise, we could never fully inline complex expressions like $x + $y * $z.

Optimizing The Setting

Since we have loads of tests for the core setting (many of the spectests cover it), we compile it with -optimize=3. This means that a bunch of the built-ins will now perform better. We'll doubtless be taking advantage of native types and other optimizations to further improve the built-ins.

Gradual Typing

Many of these optimizations are a consequence of Perl 6 being a gradually typed language. You don't have to use types, but when you do, we make use of them to generate better code and catch more errors for you at compile time. After quite a while just talking about these possible wins, it's nice to actually have some of them implemented. :-)

The Future

Of course, this is just the start of the work - over the coming weeks and months, we should gain plenty of other optimizations. Some will focus on type-driven optimizations, others will not depend on this. And we'll probably catch more of those inevitable run time failures at compile time too. In the meantime, enjoy what we have so far. :-)


15 Oct 2011 3:56pm GMT

Carl Masak: Macros — what are they, really?

Apparently, if you schedule all of my talks at YAPC::EU 2011 on the first day, I will spend the remaining time of the conference thinking intently about how macros work. (I did some socializing too, don't worry. I even distinctly remember talking to people about other things than macros on at least one occasion.)

Like most of the rest of you, I'd heard about C preprocessor macros (and how they're both useful and kinda dangerous if you don't know what you're doing), and Lisp macros (and how they're part of what makes Lisp the awesomest programming language in the universe forever). Which one of these types does Perl 6 specify?

Both, duh. 哈哈

But I'm going to talk about the latter kind. I'll call them "AST macros", to differentiate them from "textual macros". ("AST" simply means "Abstract Syntax Tree". Forget the "abstract" part, it's just been put there to scare you into thinking this is tricky.)

Why ASTs matter

When the complexity of a codebase increases, it inevitably becomes a part of the problem it is trying to solve. We need to combat the complexity in the code itself, and we need to start talking about the code in the code. There are three broad ways we can describe code: as a string of source code, as an AST, or as a runnable block of code.

These three forms - string, AST, code block - reflect what a compiler does when it prepares your source code for execution: it starts from the source string, parses it into an AST, and then generates executable code from that AST.

The reason the compiler takes the detour through ASTs when creating your code is that trees are much easier to reason about and manipulate than the "flat" representations of code. An AST contains a lot of explicit relations that don't stand out in the original or final, "flat" representations of code. ASTs can be manipulated, stitched together, optimized, etc. It's this strength that AST macros make use of.

Since ASTs are the way code looks before code generation, AST macros give you a say in what code will be generated in your program.

Macros are a way to transform code. AST macros transform code by giving you the tools to build your own AST.

How to make an AST

How to construct an AST in Perl 6? Using the quasi keyword:

quasi { say "OH HAI" }

What this evaluates to is a Perl6::AST object holding a tree structure representing the program code say "OH HAI". Exactly how that tree structure looks may or may not be implementation-dependent.

quasi stands for "quasi-quote", a concept invented by Quine, the logician, who liked to think about self-reference, paradox, and words starting with the letter Q. Just as we quote code with a string literal and the result is a Str, so we can quote code with the quasi keyword and the result, in the case of Perl 6, is a Perl6::AST object.

What macros are

Macros work just like subroutines, but AST macros are expected to return a Perl6::AST. How the AST is created is the macro author's business. But we can use quasi to create them:

macro LOG {
    quasi {
        $*ERR.say(DateTime.now, ": some logging information here");
    }
}

# Meanwhile, later in the code:
LOG();

You see, it looks just like a subroutine call. But the call is made by the compiler, not by the runtime as with ordinary subroutines. And the return value is a Perl6::AST object containing the code to print something to $*ERR.

But wait, there's more!

It's a pretty useless LOG macro that doesn't take an argument with a $message. We'll fix that. There's one twist, though: AST macros deal in AST, so the $message that gets passed to the macro won't be a Str. It'll be a Perl6::AST:

macro LOG($message) {
    quasi {
        $*ERR.say(DateTime.now, ": ", {{{$message}}});
    }
}

LOG("Evacuation complete.");

When we call LOG, we do it with a Str, just as with a usual subroutine. The parser sees the string literal and does its thing with turning stuff into ASTs. The compiler then calls the macro with one argument: the resulting Perl6::AST object. In the quasi, we make sure to take this object and stitch it right into the code that says "print a bunch of stuff to $*ERR". It's right there, at the end of that line, enclosed in triple curly braces.

What do the triple curly braces do, exactly? They allow you to say "I want you to incorporate this already-parsed AST into this currently-being-parsed code". Triple curly braces are only recognized inside of quasi-quote blocks. In fact, this is what quasi-quotes specialize in: allowing an escape hatch from code to ASTs, so we can mix them. (This is what Quine used quasi-quoting for too, except in the domain of logic.)

If we didn't write {{{$message}}} there, but just the normal form $message, guess what? The LOG function would stringify the Perl6::AST object, probably to something boring like Perl6::AST()<0x80681e0>, and print that.

Right, so AST macros take ASTs, allow us to manipulate ASTs, and return ASTs. Fine. We get the message. But what makes them so powerful?

The real power comes from the fact that we can steer this process any which way we want. For example, maybe we'd like to turn logging on and off at the switch of a constant:

constant LOGGING_ENABLED = True;

macro LOG($message) {
    if LOGGING_ENABLED {
       quasi {
           $*ERR.say(DateTime.now, ": ", {{{$message}}});
        }
    }
    else {
        quasi {}
    }
}

LOG(crazily-expensive-computation());

Turn LOGGING_ENABLED off, and the crazily-expensive-computation() call will be parsed, but never executed.

This is the essence of AST macros. There's much more to it than that, but we'll get to the other parts in later posts.

15 Oct 2011 2:13pm GMT

12 Oct 2011

Planet Perl Six

Moritz Lenz (Perl 6): The Three-Fold Function of the Smart Match Operator

In Perl 5, if you want to match a regex against a particular string, you write $string =~ $regex.

In the design process of Perl 6, people have realized that you cannot only match against regexes, but lots of other things can act as patterns too: types (checking type conformance), numbers, strings, junctions (composites of values), subroutine signatures and so on. So smart matching was born, and it's now written as $topic ~~ $pattern. Being a general comparison mechanism is the first function of the smart match operator.

But behold, there were problems. One of them was the perceived need for special syntactic forms on the right hand side of the smart match operator to cover some cases. Those were limited and hard to implement. There was also the fact that now we had two different ways to invoke regexes: smart matching, and direct invocation as m/.../, which matches against the topic variable $_. That wasn't really a problem as such, but it was an indicator of design smell.

And that's where the second function of the smart match operator originated: topicalization. Previously, $a ~~ $b mostly turned into a method call, $b.ACCEPTS($a). The new idea was to set the topic variable to $a in a small scope, which allowed many special cases to go away. It also nicely unified with given $topic { when $matcher { ... } }, which was already specified as being a topicalizer.

In the new model, MATCH ~~ PAT becomes something like do { $_ = MATCH; PAT.ACCEPTS($_) } -- which means that if MATCH accesses $_, it automatically does what the user wants.
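To make the first function concrete: any type can take part in smart matching by providing an ACCEPTS method. A tiny sketch (the Even class is invented for illustration):

class Even {
    method ACCEPTS($topic) { $topic %% 2 }
}

say 4 ~~ Even;          # True: this is Even.ACCEPTS(4)

given 7 {
    when Even { say "even" }
    default   { say "odd" }     # when uses the same ACCEPTS mechanism
}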

Awesomeness reigned, and it worked out great.

Until the compiler writers actually started to implement a few more cases of regex matching. The first thing we noticed was that if $str ~~ $regex { ... } behaved quite unexpectedly. What happened was that $_ got set to $str, the match was conducted and returned a Match object. The smart match then called $match.ACCEPTS($str), which failed. A quick hack around that was to modify Match.ACCEPTS to always return the invocant (ie the Match on which it was called), but of course that was only a stopgap solution.

The reason it doesn't work for other, more involved cases of regex invocations is that they don't fit into the "does $a match $b?" schema. Two examples:

# :g for "global", all matches
my @matches = $str ~~ m:g/pattern/; 

if $str ~~ s/pattern/substitution/ { ... }

People expect those to work. But global matching of a regex isn't a simple conformance check, and that is reflected in the return value: a list. So should we special-case smart-matching against a list, just because we can't get global matching to work in smart-matching otherwise? (People have also proposed to return a kind of aggregate Match object instead of a list; that comes with the problem that Match objects aren't lazy, but lists are. You could "solve" that with a LazyMatch type; watch the pattern of workarounds unfold...)

A substitution is also not a simple matching operation. In Perl 5, a s/// returns the number of successful substitutions. In Perl 6, that wouldn't work with the current setup of the smart match operator, where it would then smart-match the string against the returned number of matches.

So to summarize, the smart match operator has three functions: comparing values to patterns, topicalization, and conducting regex matches.

These three functions are distinct enough to start to interact in weird ways, which limits the flexibility in choice of return values from regex matches and substitutions.

I don't know what the best way forward is. Maybe it is to reintroduce a dedicated operator for regex matching, which seems to be the main feature with which topicalization interacts badly. Maybe there are other good ideas out there. If so, I'd love to hear about them.

12 Oct 2011 6:37pm GMT

27 Sep 2011

Planet Perl Six

perl6.announce: Announce: Niecza Perl 6 v10 by Stefan O'Rear


Announce: Niecza Perl 6 v10

This is the tenth release of Niecza Perl 6, as usual scheduled on
the last Monday of the month.

You can obtain a build of Niecza from [1]. This build contains a
working compiler as a set of .exe and .dll files suitable for use with
Mono or Microsoft .NET. If you wish to follow latest developments,
you can obtain the source from [2]; however, you will still need a
binary for bootstrapping, so you gain nothing from a "source is
better" perspective.

Niecza is a Perl 6 compiler project studying questions about the
efficient implementability of Perl 6 features. It currently targets
the Common Language Runtime; both Mono and Microsoft .NET are known to
work. On Windows, Cygwin is required for source builds only; see the
README for details.


List of changes



[Major features]

CLR interoperation is now fairly well supported! You can create
objects, call methods, get and set fields and properties, create
delegates, etc from Perl 6 code. See examples/ for usage ideas.
(Examples by Martin Berends)

The Mono.Posix dependency has been relaxed from load time to run
time, meaning .NET support is back if you don't use file tests.



[Minor new features]

\qq[] syntax is now implemented.

qp|| now returns a path object.

New Test.pm6 methods succeeds_ok and fails_ok (and eval_ variants) to
catch warnings. (Design by flussence)

@foo? and %foo? in signatures are now correctly supported.

Many more trig functions now implemented. (Solomon Foster)

Standard grammar has been updated, in particular bringing the new
concept of regex separators; x ** y is now spelled x+ % y. Do
not expect other forms of % and %% to work just yet.
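For example, the new separator form looks like this (a small sketch):

say ?( "1,2,3" ~~ / ^ \d+ % ',' $ / );   # True: digits separated by commas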



[Selected bug fixes]

sqrt now returns the correct value for arguments with a negative
imaginary part. Also sqrt(0) returns Num not Complex now.



[Other]

docs/compiler.pod is more current. (Martin Berends)

Prototyping has begun on Perl 5 interoperation. (Paweł Murias)


Getting involved

Contact sorear in irc.freenode.net #perl6 or via the sender address of
this mailing. Also check out the TODO file; whether you want to work
on stuff on it, or have cool ideas to add to it, both are good.

Future directions

I have an active branch (started this month) to unify compile-time and
run-time metamodel representations, using serialization to bridge the
gap. It doesn't work yet, but when it does it will enable many
improvements, most importantly real support for BEGIN and roles.

[1] https://github.com/downloads/sorear/niecza/niecza-10.zip
[2] https://github.com/sorear/niecza

27 Sep 2011 6:19am GMT

20 Sep 2011

Planet Perl Six

perl6.announce: Parrot 3.8.0 "Magrathea" Released by Kevin Polulak

On behalf of the Parrot team, I'm proud to announce Parrot 3.8.0, also known
as "Magrathea". Parrot (http://parrot.org/) is a virtual machine aimed at
running all dynamic languages.

Parrot 3.8.0 is available on Parrot's FTP site (
ftp://ftp.parrot.org/pub/parrot/releases/devel/3.8.0/), or by following the
download instructions at http://parrot.org/download. For those who would
like to develop on Parrot, or help develop Parrot itself, we recommend using
Git to retrieve the source code to get the latest and best Parrot code.

Parrot 3.8.0 News:
- Core
  + New tools/release/auto_release.pl script automates most of the release
- Languages
  + Winxed
    - Updated snapshot to version 1.2.0
    - allowtailcall modifier in try
    - --debug command-line option, __DEBUG__ predefined constant and
      __ASSERT__ builtin
    - namespace, class, and ~ (bitwise not) operators
    - Implicit nested namespace in namespace and class declarations
    - -X command-line arg
- Documentation
  + Improved release manager guide
- Tests
  + New Makefile target "resubmit_smolder" to resubmit test results
  + New Makefile target "all_hll_test" runs the test suite of all HLLs
    and libraries known to work on Parrot
  + New Makefile target "interop_tests" runs language interoperability
    tests, which run as part of the normal "make test" as well

The SHA256 message digests for the downloadable tarballs are:

f26d9c1a5d7723b1e778394f87f8bb993e188fb05a719a78eb0204612329cd75  parrot-3.8.0.tar.bz2
ae10e52eaf150870949aa51c7588e3a09f8f0588c9e0a7a76c2201672b7c5c7a  parrot-3.8.0.tar.gz

Many thanks to all our contributors for making this possible, and our
sponsors for supporting this project. Our next scheduled release is 18
October 2011.

Enjoy!

Excerpt from The Hitchhiker's Guide to the Galaxy, page 634784, section
5a. Entry: Magrathea

Far back in the mists of ancient time, in the great and glorious days of
the former Galactic Empire, life was wild, rich and largely tax free. Mighty
starships plied their way between exotic suns, seeking adventure and reward
among the farthest reaches of Galactic space. In those days spirits were
brave, the stakes were high, men were real men, women were real women and
small furry creatures from Alpha Centauri were real small furry creatures
from Alpha Centauri. And all dared to brave unknown terrors, to do mighty
deeds, to boldly split infinitives that no man had split before - and thus
was the Empire forged.

Many men of course became extremely rich, but this was perfectly natural and
nothing to be ashamed of because no one was really poor - at least no one
worth speaking of. And for all the richest and most successful merchants
life inevitably became rather dull and niggly, and they began to imagine
that this was therefore the fault of the worlds they'd settled on. None of
them was entirely satisfactory: either the climate wasn't quite right in the
later part of the afternoon, or the day was half an hour too long, or the
sea was exactly the wrong shade of pink.

And thus were created the conditions for a staggering new form of specialist
industry; custom-made luxury planet building. The home of this industry was
the planet Magrathea, where hyperspatial engineers sucked matter through
white holes in space to form it into dream planets - gold planets, platinum
planets, soft rubber planets with lots of earthquakes - all lovingly made to
meet the exacting standards that the Galaxy's richest men naturally came to
expect.

But so successful was this venture that Magrathea itself soon became the
richest planet of all time and the rest of the Galaxy was reduced to abject
poverty. And so the system broke down, the Empire collapsed, and a long
sullen silence settled over a billion hungry worlds, disturbed only by the
pen scratchings of scholars as they labored into the night over smug little
treatises on the value of a planned political economy.

Magrathea itself disappeared and its memory soon passed into the obscurity
of legend.

In these enlightened days, of course, no one believes a word of it.
--
- Kevin Polulak (soh_cah_toa)

20 Sep 2011 9:07pm GMT

17 Sep 2011

Planet Perl Six

Jonathan Worthington (6guts): This is not enough!

The time for some shiny new hardware came around. Sat next to me, purring decidedly more quietly than its predecessor, is my new main development machine: a quad core Intel Core i7, pimped out with 16 GB of RAM and a sufficiently generous SSD that it can hold the OS, compiler toolchain and projects I work most actively on. It's nice having a $dayjob that likes keeping their hackers…er, consultants…well kitted out. :-)

So, the question I had to ask was: how fast can this thing run the Rakudo spectests? I tried, and with -jobs=8 (the sweet spot, it seems) it chugged its way through them in 220s. That's vastly better than I'd ever been able to do before, and I could immediately see it was going to be a boon for my Rakudo productivity. 3 minutes 40 seconds. Not so long to wait to know a patch is fine to push. But…what if it was less? It's fast but…this is not enough!

A while ago, moritz++ showed how the nom branch of Rakudo ran mandelbrot 5 times faster than master. This was a fairly nice indicator. Around the time my new hardware arrived, an update was posted on #perl6: mandelbrot was now down to 2 minutes on the same machine the original tests were done on. Again, I was happy to see progress in the right direction but I couldn't help but feel…this is not enough!

So, I took a few days break from bug fixing and features, and decided to see if things could get faster.

Faster Attribute Access

One of the things I've had planned since the early days of working on 6model is being able to look up attributes by index in the single inheritance case, rather than by name. I finally got around to finishing this up (I'd already put in most of the hooks, just not done the final bits). It's not an entirely trivial thing to make work; at the point we parse an attribute access we don't know enough about how the eventual memory layout of the object will be, or whether an indexed lookup will even work. Further, we have to involve the representation in the decision, since we can't assume all types will use the same one. Mostly, it just involves a later stage of the code generation (PAST => POST in this case) having the type object reachable from the AST and asking it for a slot index, if possible.

Since I implemented it at the code-gen level, it meant the improvement was available to both NQP and Rakudo, so we get compiler and runtime performance improvements from it. Furthermore, I was able to improve various places where the VM interface does attribute lookups (for example, invocation of a code object involves grabbing the underlying VM-level thingy that represents an executable thing, and that "grabbing" is done by an attribute access on the code object). Attribute lookups never really showed up that high in the (C-level) profile, but now they're way, way down the list.

The P6opaque Diet

P6opaque is by far the most common object representation used in NQP and Rakudo. It's generally pretty smart; it has a header, and then lays out attributes - including natively typed ones - just like a C structure would be laid out in memory. In fact, it mimics C structures well enough that for a couple of parts of the low-level parts of Rakudo we have C struct definitions that let us pretend that full-blown objects are just plain old C structures. We don't have to compromise on having first class objects in order to write fast low-level code that works against them any more. Of course, you do commit to a representation - but for a handful of built-in types that's fine.

So, that's all rainbows and butterflies, so what was the problem? Back last autumn, I thought I knew how implementing mix-ins and multiple inheritance attribute storage was going to look; it involved some attributes going into a "spill hash" if they were added dynamically, or all of them would go there apart from any in a common SI prefix. Come this spring when I actually did it for real, a slightly smarter me realized I could do much better. It involved a level of indirection - except that the level already existed, so there was actually no added cost at all. Thing is, I'd already put the spill slot in there, and naughtily used the difference between NULL and PMCNULL as the thing that marked out whether the object was a type object or not.

This week, I shuffled that indicator to be a bit in the PMC object header (Parrot makes several such bits available for us to use for things like that). This meant the spill slot in the P6opaque header could go away. Result: every object using the P6opaque representation got 4 (32-bit) or 8 (64-bit) bytes lighter. This has memory usage benefits, but also some speed ones: we get more in the CPU cache for one, and for another we can pack more objects into fixed-size pools, meaning they have fewer arenas to manage. Win.

Constant Pain

In Perl 6 we have Str objects. Thanks to 6model's capability to embed a native Parrot string right into an object, these got about three times cheaper in terms of memory already in nom. Well, hopefully. The thing is, there's a very painful way to shoot yourself in the foot at the implementation level. 6model differentiates coercion (a high level, language sensitive operation) from unboxing (given this object, give me the native thingy inside of it). Coercion costs somewhat more (a method call or two) than unboxing (mostly just some pointer follows). If you manage to generate code that wants a VM-level string, and it just has an object, it'll end up doing a coercion (since at that level, it doesn't know the much cheaper unbox is possible/safe). After reading some of the compiler output, I spotted a bunch of cases where this was happening - worst of all, with constant strings in places we could have just emitted VM-level constant strings! Fixing that, and some other unfortunate cases of coercion instead of unbox, meant I could make the join method a load faster. Mandelbrot uses this method heavily, and it was a surprisingly big win. String concatenation had a variant of this kind of issue, so I fixed that up too.

Optimizing Lexical Lookup

We do a lot of lexical lookups. I'm hopeful that at some point we'll have an optimizer that can deal with this (the analysis is probably quite tricky for full-blown Perl 6; in NQP it's much more tractable). In the meantime, it's nice if they can be faster. After a look over profiler output, I found a way to get a win by caching a low-level hash pointer directly in the lexpad rather than looking it up each time. Profilers. They help. :-)

Optimized MRO Computation

The easiest optimizations for me to do are…the ones somebody else does. Earlier this week, after looking over the output from a higher level profiler that he's developing for Parrot, mls++ showed up with a patch that optimized a very common path of C3 MRO computation. Curiously, we were spending quite a bit of time at startup doing that. Of course, once we can serialize stuff fully, we won't have to do it at all, but this patch will still be a win for compile time, or any time we dynamically construct classes by doing meta-programming. A startup time improvement gets magnified by a factor of 450 times over a spectest run (that's how many files we have), and it ended up being decidedly noticeable. Again, not where I'd have thought to look…profiling wins again.

Multi-dispatch Cache

We do a lot of multiple dispatch in Perl 6. While I expect an optimizer, with enough type information to hand, will be able to decide a bunch of them at compile time, we'll always still need to do some at runtime, and they need to be fast. While we've cached the sorted candidate list for ages, it still takes a time to walk through it to find the best one. When I was doing the 6model on CLR work, I came up with a design for a multi-dispatch cache that seemed quite reasonable (of note, it does zero heap allocations in order to do a lookup and has decent cache properties). I ported this to C and…it caused loads of test failures. After an hour of frustration, I slept on it, then fixed the issue within 10 minutes the next morning. Guess sleep helps as well as profilers. Naturally, it was a big speed win.

Don't Do Stuff Twice

Somehow, in the switch over to the nom branch, I'd managed to miss setting the flag that causes us not to do type checks in the binder if the multi-dispatcher already calculated they'd succeed. Since the multi-dispatch cache, when it gets a hit, can tell us that much faster than actually doing the checks, not re-doing them is a fairly notable win.

Results

After all of this, I now have a spectest run in just short of 170 seconds (for running 14267 tests). That's solidly under the three minute mark, down 50s from earlier this week. And if it's that much of a win for me on this hardware, I expect it's going to amount to an improvement measured in some minutes for some of our other contributors.

And what of mandelbrot? Earlier on today, moritz reported a time of 51 seconds. The best we ever got it to do in the previous generation of Rakudo was 16 minutes 14 seconds, making for a 19 times performance improvement for this benchmark.

This is not enough!

Of course, these are welcome improvements, and will make the upcoming first release of Rakudo from this new "nom" development branch nicer for our users. But it's just one step on the way. These changes make Rakudo faster - but there's still plenty to be done yet. And note that this work doesn't deliver any of the "big ticket" items I mentioned in my previous post, which should also give us some good wins. Plus there's parsing performance improvements in the pipeline - but I'll leave those for pmichaud++ to tell you about as they land. :-)


17 Sep 2011 12:11am GMT

12 Sep 2011

Planet Perl Six

Jonathan Worthington (6guts): What’s coming up in September/October

So, YAPC has come, been great, gone and been recovered from, I'm done with my summer visiting, the $dayjob speaking trip has taken place, I've shaken off the obligatory start of autumn cold and it's time to get back to hacking on stuff. Actually, I've no more workshops or other trips until November and I've got Perl 6 time marked in my schedule, so all being well there should be plenty of time to Get Stuff Done. :-) So what have I got planned for the next couple of months?

Ship a "nom"-based Release

This is the current priority. Day by day, we're fixing up test files we lost in the refactor. This weekend I got most of the missing bits of parametric role support back in place (and I'm overall happy with the resulting factoring; it's a massive amount better than what we had before). Our biggest remaining holes are in the regex and grammar handling, which pmichaud++ is on with (I'm quite excited about what's coming here). Other than that, it's little bits here and there. We're getting there. :-)

A Basic Rakudo Optimizer

I've started playing with this a bit, in a branch. Nothing interesting to see yet, other than an optimizer that only knows one optimization, makes a few things a little faster and regresses a couple of spectests (so, some weird bug in the analysis or transform somewhere). The good news is that it does successfully make it through applying that optimization to CORE.setting. This is an important part of developing the optimizer: if we're going to write our built-ins in Perl 6, we really want an optimizer to go over them too. My aim is to teach this a couple more things and have it in the October release.

A Basic NQP Optimizer

NQP is the subset of Perl 6 that we write most of the compiler in. We also implement the various built-in meta-objects in it (so we want it to be fast here, and of course we want faster compiles!) It currently has no optimizer, a situation I plan to change. NQP has many restrictions that full-blown Perl 6 does not have, and as a result we'll be able to do some more aggressive optimizations, or be able to apply them with far simpler analysis. My goal is to have some form of basic optimizer in NQP by the October release. Of course, since NQP is bootstrapped, NQP's optimizer can be used to optimize NQP itself (yes, it can optimize the optimizer…) "So we just keep running NQP on itself until it runs crazy fast?" Er, no, sorry, it doesn't work like that. :-)

Bounded Serialization

Currently, as we compile programs, we build up a complete "model" of the runtime environment (for example, we build Signature/Parameter objects, meta-objects to represent classes, and so forth). If we're going to just run the program, we carry these objects over to runtime and use them. If we're in compilation mode, like we are with the setting, then we generate a bunch of "deserialization code". This gets run first of all when we load the setting/module in question. This introduces a couple of problems.

The solution is to find a way to efficiently serialize everything we create, and then be able to deserialize it efficiently at load time and do the few needed fixups. While in theory it's "easy" (keep iterating over a worklist until you've serialized everything, basically), there's a bunch of really tricky things that also come up (especially any closures that got taken at compile time). I hope to dig into this before the end of the month, and my target is to have it for the November release (October one is a bit ambitious, unless it goes crazily well; in reality, I'd like to land it late October, so we have a couple of weeks before the November release to get it in shape).

Revive the CLR Backend

Running NQP on the CLR got a long, long way. At the time I last touched it, the majority of the non-regex tests in the NQP test suite were passing, and diakopter++ was making progress on the regex ones too. It's been dormant for a while, but it's time to get back to work on it. Getting NQP, and then Rakudo, to run on the CLR is now a vastly more tractable task than it was back then.

Of course, some problems I chose to ignore will need to be dealt with (like how to best harness the way the CLR thinks of types in order to implement 6model representations more efficiently, and how to support gather/take). It'll need to generate IL rather than C#. And…plenty more. I'm not going to set any targets for this just yet; I'll just dig in, have fun, and we'll see where things land up in a month or two. I found it great fun to work on this last time - the CLR is a nice VM and well tuned - so it shouldn't be hard to get a round tuit. :-)

Other Bits

Of course, there's still bits of the Perl 6 spec that needs implementing, and things in module space. Amongst things I'd like to hack on soon are big integer support (so we can do Int right), natively typed operators and teaching NativeCall to handle structures.

Other 6model Bits

I want to vastly improve the state of 6model's documentation. Taking a moment to look further ahead than the next couple of months, once the updated CLR implementation of it comes together, and when some other language's object systems have been shown to be buildable on 6model, I also want to think about declaring a "6model API v1" or so. This is so that if/when Parrot integrates 6model, or other implementations show up that I won't have a close hand in, there's a clear idea of what it's expected to look like from the outside, and what are implementation details (and thus can be done in whatever way is appropriate). I also expect further extensions, refinements, and so forth, and I think it'd be best to give folks who implement 6model - myself included! - some coarser grained way of saying "we support this set of things" than just listing off implemented features. This is some way off, though I do already have a slowly forming picture of what I'd like to tackle in the area of meta-model design in the future.

So, that's what I've got in mind. Oh, and I should be sure to blog as I work on this stuff! Feel free to prod me if I forget. :-)


12 Sep 2011 11:41pm GMT

09 Sep 2011

feedPlanet Perl Six

Carl Masak: -n and -p, part three

(This blog post is part three of a series; there's also a part one and a part two.)

Shortly after I wrote the last post on -n and -p, and how I didn't really understand what a setting was in Perl 6, sorear++ and TimToady++ filled me in on all of the details. So here I am, a third time, to pass the knowledge on.

I wrote last time that I found the term "setting" confusing and overloaded. That's because I thought it was a single, defined thing. In effect, there's no "the setting" in Perl 6; there can be many at the same time.

A setting is simply something that surrounds your code on the outside. (Haskell has a Prelude, but a prelude only comes before your code. A setting envelops your code both before and after.) Your code simply finds itself lexically inside some setting or other. In technical parlance, whatever is the OUTER:: of your code is a setting.

So, you can have several settings, just like I wished for. They stack, you see. Or rather peel, like onion layers. One man's setting is another man's code, all the way outwards into the final OUTER:: nothingness of empty space.

And - the final piece of the puzzle - the big default "here, friend, are all of your builtins" setting is called CORE. I'd always wondered why we keep saying both "setting" and CORE. (And why Rakudo calls the directory src/core, not src/setting.) That's why; CORE is just a setting among others.
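
In Perl 6 terms, here's a rough sketch of what the -n flag effectively does with a setting (illustrative only - the real wrapping happens inside the compiler, as a proper lexical scope rather than literal text):

# your one-liner:    .say
# what -n conceptually compiles, with a setting wrapped around it:
for lines() {    # the setting opens before your code
    .say;        # your program sits in the innermost lexical scope
}                # ...and the setting closes after it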

The discussion - which consists mostly of sorear and TimToady telling how things really are - can be found here.

And now I think I'm finally done writing on how -n and -p work. 哈哈

09 Sep 2011 10:20pm GMT

21 Mar 2011

feedPlanet Perl

Perl NOC Log: Planet Perl is going dormant

Planet Perl is going dormant. This will be the last post there for a while.

image from planet.perl.org

Why? There are better ways to get your Perl blog fix these days.

You might enjoy some of the following:

Will Planet Perl awaken again in the future? It might! The universe is a big place, filled with interesting places, people and things. You never know what might happen, so keep your towel handy.

21 Mar 2011 2:04am GMT

Ricardo Signes: improving on my little wooden "miniatures"

A few years ago, I wrote about cheap wooden discs as D&D minis, and I've been using them ever since. They do a great job, and cost nearly nothing. For the most part, we've used a few for the PCs, marked with the characters' initials, and the rest for NPCs and enemies, usually marked with numbers.

With D&D 4E, we've tended to have combats with more and more varied enemies. (Minions are wonderful things.) Numbering has become insufficient. It's too hard to remember which numbers go with which monster, and to keep initiative order separate from token numbers. In the past, I've colored a few tokens in with the red or green whiteboard markers, and that has been useful. So, this afternoon I found my old paints and painted six sets of five colors. (The black ones I'd already made with sharpies.)

D&D tokens: now in color

I'm not sure what I'll want next: either I'll want five more of each color or I'll want five more colors. More colors will require that I pick up some white paint, while more of those colors will only require that I re-match the secondary colors when mixing. I think I'll wait to see which I end up wanting during real combats.

These colored tokens should work together well with my previous post about using a whiteboard for combat overview. Like-type monsters will get one color, and will all get grouped to one slot on initiative. Last night, for example, the two halfling warriors were red and acted in the same initiative slot. The three halfling minions were unpainted, and acted in another, later slot. Only PCs get their own initiative.

I think it did a lot to speed up combat, and that's even though I totally forgot to bring the combat whiteboard (and the character sheets!) with me. Next time, we'll see how it works when it's all brought together.

21 Mar 2011 12:47am GMT

20 Mar 2011

feedPlanet Perl

Dave Cross: Perl Vogue T-Shirts

Is Plack the new Black?

In Pisa I gave a lightning talk about Perl Vogue. People enjoyed it and for a while I thought that it might actually turn into a project.

I won't though. It would just take far too much effort. And, besides, a couple of people have pointed out to me that the real Vogue are rather protective of their brand.

So it's not going to happen, I'm afraid. But as a subtle reminder of the ideas behind Perl Vogue I've created some t-shirts containing the article titles from the talk. You can get them from my Spreadshirt shop.

20 Mar 2011 12:02pm GMT

Perl NOC Log: Big CPAN.org update

CPAN has gotten its first real update in a while tonight; the content is from the cpanorg git repository.

We tried to get the FAQ cleaned up a bit (though there's plenty of work left) and Leo Lapworth pretty heroically also did a first pass on cleaning up the ports page.

You might also notice a search box for search.cpan.org which we find appropriate, a list of recently uploaded modules on the homepage and a new page on how to mirror CPAN.

If you read the latter page, you'll see that the master mirror is now cpan-rsync.perl.org::CPAN (rsync only). In the coming weeks we'll work on encouraging the CPAN mirrors to switch to mirror from here to ease the load on FUnet, the sponsor of the master mirror for the last 15 years.

Work is also coming along well on the instant update mirroring system.

- ask

20 Mar 2011 9:10am GMT

19 Mar 2011

feedPlanet Perl

David Golden: With LWP 6, you probably need Mozilla::CA

LWP 6 makes hostname verification the default -- so note this from LWP::UserAgent:

If hostname verification is requested, and neither SSL_ca_file nor SSL_ca_path is set, then SSL_ca_file is implied to be the one provided by Mozilla::CA. If the Mozilla::CA module isn't available SSL requests will fail. Either install this module, set up an alternative SSL_ca_file or disable hostname verification.

If you use LWP and want SSL, you need IO::Socket::SSL (recommended) and Mozilla::CA.
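
A minimal verified request might look like this (a sketch; once Mozilla::CA is installed the SSL_ca_file line is implied and can be dropped):

use LWP::UserAgent;
use Mozilla::CA;

# Verify the server's certificate against Mozilla's CA bundle.
my $ua = LWP::UserAgent->new(
    ssl_opts => {
        verify_hostname => 1,
        SSL_ca_file     => Mozilla::CA::SSL_ca_file(),
    },
);
my $res = $ua->get('https://example.com/');
print $res->status_line, "\n";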

19 Mar 2011 2:57am GMT

17 Mar 2011

feedPlanet Perl

Dave Cross: Perl News

Remember use.perl? It's moth-balled now, but for years it provided two valuable services to the Perl community.

Firstly it provided a hosted blog platform which many people used to write about many things - sometimes even Perl. Of course we now have blogs.perl.org which provides a very similar service.

And secondly, it provided a place where people could submit stories related to Perl and then editors would approve the stories and publish them on the front page. Since use.perl closed down, the Perl community hasn't really had a centralised site for that.

Over the last eighteen months or so I've had conversations with people about building a site that replaced that part of use.perl. But there's always been something more interesting to work on.

Then, at the start of this week, Leo asked if I knew of a good Perl news feed that he could use on the front page of perl.org. And I realised that I'd been putting it off for too long. A few hours of WordPress configuration and Perl News was ready to go.

So if you have any interesting Perl news to share, please submit it to the site.

17 Mar 2011 2:01pm GMT

Leo Lapworth: New Perl news site launches

http://perlnews.org/ has just launched and will be providing a source for major announcements related to The Perl Programming Language (http://www.perl.org/). Find out more at http://perlnews.org/about/ - or if you have a story submit it http://perlnews.org/submit/.

All stories are approved to ensure relevance.

Thanks

The Perl News Team.

17 Mar 2011 1:44pm GMT

Curtis Poe: 80% Hacks

I'm still blogging five days a week, but obviously not here. That's largely because my new daughter is forcing me to choose where I spend my time and I can't blog too much about what I do lest I reveal trade secrets. So, just to keep my hand in, here's an ugly little "80% hack" that lets me find bugs like mad in OO code. I should really combine this with my warnings::unused hack and start building up a tool to find issues in legacy code.

First, an "80% Hack" is based on the Pareto Principle which states that 80% of the results stem from 20% of the effort. So I often write what I call 80% hacks which are simply quick and dirty tools which get things done.

The idea is simple. In legacy OO code where we're not using Moose, we have a nasty tendency to reach inside a blessed hashref. However, as classes start getting old and crufty, particularly in legacy code which is earning the company a ton of money, it's easy for someone to either misspell a hash key or refer to keys which are no longer used. What I've done is assume that each of these suspect keys is used once and only once, and I also assume the accesses look like this:

$self->{ foo }
$_[0]  ->  { "bar" } # yeah, we need arbitrary whitespace
shift->{'something'} # and quotes

Yes, this code could be improved tremendously, but 80% hacks are personal hacks which I simply don't pour a lot of time and effort into. Besides, they're fun.

#!/usr/bin/env perl

use strict;
use warnings;
use autodie ':all';
use Regexp::Common;

my $module = shift or die "usage: $0 pm_file";

my $key_found = qr/
    (?: \$self | \$_\[0\] | shift )  # $self or $_[0] or shift
    \s* ->                           # ->
    \s* \{                           # literal { (escaped for newer perls)
    \s* ($RE{quoted}|\w+)            # $hash_key (quoted or bareword)
    \s* \}                           # literal }
/x;

open my $fh, '<', $module;

my %count_for;
while (<$fh>) {
    while (/$key_found/g) {
        my $key = $1;
        $key =~ s/^["']|['"]$//g;    # try and strip the quotes

        no warnings 'uninitialized';
        $count_for{$key}{count}++;
        $count_for{$key}{line} = $.;
    }
}

foreach my $key ( sort keys %count_for ) {
    next if $count_for{$key}{count} > 1;
    print "Possibly unused key '$key' at line $count_for{$key}{line}\n";
}

I run that with a .pm file as an argument and I get a report like:

Possibly unused key '_key1' at line 1338
Possibly unused key '_key2' at line 5325
...
Possibly unused key '_keyX' at line 4031

It's amazing how many bugs I've found with this.

Leïla and Lilly-Rose. Lilly-Rose is 3 weeks old in this photo.

I can't blog as much as I used to, but they make it all worth it.

17 Mar 2011 9:33am GMT

brian d foy: Recreating a Perl installation with MyCPAN

A goal of the MyCPAN work was to start with an existing Perl installation and work backward to the MiniCPAN that would re-install the same thing. I hadn't had time to work on that part of the project until this month.

The first step I've had for a while. I've created a database of any information I can collect about a file in the 150,000 distributions on BackPAN. There are about 3,000,000 candidate Perl module or script files. That includes basics such as the MD5 digest of the file, the file size, the Perl packages declared in the file, and the package versions.
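
The per-file basics are cheap to gather; here's a sketch of the digest-and-size step (with $path standing in for one candidate file):

use Digest::MD5;

open my $fh, '<:raw', $path or die "$path: $!";
my $md5  = Digest::MD5->new->addfile($fh)->hexdigest;  # content digest
my $size = -s $path;                                   # file size in bytes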

The next step is what I've been doing this week: collect the same information on the files in a Perl installation, which is much easier to do. There's no wacky distribution stuff involved.

Putting those two together should find the distributions that could make up the installation. With that list of distros, it's just a matter of creating the right 02packages file that a CPAN client can use. Easy peasy, I thought.

But, it's not that easy. Each file in the existing installation might have come from several distributions. That is, between different versions of a distribution, it's likely that many of the modules didn't change. So, looking at a single file doesn't lead to a single distribution. It might list several possible distributions.

But that's a start. Other files from that distribution should be present, and they each might come from several distributions even if one of them changed. If there's any file that only belongs to one distribution, that collapses everything for that distribution. If not, I have to find the overlap in possible distributions. There should be one distribution that overlaps more than all of the others, and that should be the right distribution.
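
That narrowing step might be sketched like this (illustrative; distros_matching_md5 stands in for the real database lookup, and @installed_files and %digest_of for the data gathered above):

# Count how many installed files each candidate distribution could
# account for; the distribution with the most overlap wins.
my %overlap;
for my $file (@installed_files) {
    $overlap{$_}++ for distros_matching_md5( $digest_of{$file} );
}
my ($best) = sort { $overlap{$b} <=> $overlap{$a} } keys %overlap;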

That's not quite right either though, because some distribution versions don't change the module files. They update a test or the build file or something besides whatever is in lib. You'd think that at least the $VERSION would change, but think of any exception and you'll probably find it on BackPAN. That's not as horrible as it seems though. If all of the module files are the same, it doesn't matter which distribution I use, does it?

But then, there are some files that not only might come from more than one version of a particular distribution, but might also be in a completely different distribution. Some distributions have lifted files from other distributions. Files from the URI and LWP modules show up in other distributions. How should I figure out which one should be the candidate distribution?

The database I was using was just an extract of all of the information I have on each distribution and it's oriented to individual files. I select records to match up MD5 digests. However, when I get records back with different distributions, which one might be installed? If an installed file might have come from both Foo-Bar and Baz-Quux, I have to remove one of the distributions somehow. In that case, I have to step back to look at what else either distribution might have been installed. If the other files from Foo-Bar aren't there, it's probably not Foo-Bar.

That might be the end of the story, but what if both Foo-Bar and Baz-Quux are installed? That part I haven't figured out, but it's likely that the previous step will be inconclusive since the files from both distributions will all be there. However, there's also the chance that an older version of Foo-Bar and a newer Baz-Quux are there. If they both install a Foo.pm file, the older version from Foo-Bar might have been overwritten by an updated version from Baz-Quux. So, every file except one from Foo-Bar is there. That means there's possibly some path dependence there, so I would have to make sure I install modules in the right order to recreate the installation.

If the module installation order matters, I think that might rule out creating a Task::* distribution, which can't guarantee the installation order. A Bundle::* might be able to do it though.

So, you think that's the end of it? Think about configure_requires and build_requires. Anything those need has to be in the MiniCPAN too, even if it isn't in the installation. You have the option of not permanently installing those modules, so you might not see them in the analysis. Even when I get a list of distributions, I then have to check their dependencies to see if there's anything extra I need to add.

So, not so bad.

17 Mar 2011 8:37am GMT

16 Mar 2011

feedPlanet Perl

Dave Rolsky: Who Are the Perl 5 Core Docs For?

I've been spending a fair bit of time working on Perl 5 core documentation. I started by editing and reorganizing some of the documents related to core hacking. This update will be in 5.14, due to be released in April. I'm also working on replacing the existing OO tutorials and updating the OO reference docs, though this won't make it into the 5.14 release.

There's been a lot of discussion on my OO doc changes, some of it useful, some of it useless, and some of it very rude (welcome to p5p!). Many of the people in the discussion don't have a clear vision of who the docs are for. Without that vision, it's really not possible to say whether a particular piece of documentation is good or not. A piece of documentation has to be good for a particular audience.

There's a number of audiences for the Perl 5 core docs, and they fall along several axes. Here are the axes I've identified.

Newbies vs experienced users

Newbie-ness is about being new to a particular concept. You could be an experienced Perl user and still be new to OO programming in general, or new to OO in Perl.

For my OO docs, I'm writing for two audiences. First, I'm writing for people who are learning OO. That's why the document starts with a general introduction to OO concepts. Second, I'm writing for people who want to learn more about how to do OO in Perl 5. For those people, the tutorial points them at several good OO systems on CPAN.

I'm not writing for people who already know Perl 5 OO and want to learn more, that's what the perlobj document is for.

From the discussion on p5p, I can see that many people there have trouble understanding how newbies think. I like how chromatic addresses these issues in a couple of his blog posts.

How the reader uses Perl

Perl is used for lots of different tasks, including sysadmin scripts, glue code in a mostly non-Perl environment, full app development, etc.

Ideally, we'd have tutorial documents that are appropriate for each of these areas. I think the OO tutorial is most likely to be of interest to people writing full Perl applications. If you're just whipping up some glue code, OO is probably overkill.

It would also be great to see some job-focused tutorials, like "Basic Perl Concepts for Sysadmins" or "Intro to Web Dev in Perl 5". Yes, I know there are books on these topics, but having at least some pointers to modules/books/websites in the core docs is useful.

Constraints on the reader's coding

If you're doing green field development, you have the luxury of using the latest and greatest stuff on CPAN. If you're maintaining a 10-year old Perl web app (I'm so sorry), then you probably don't. Some readers may not be able to install CPAN modules. Some readers are stuck with in house web frameworks.

People stuck with old code need good reference docs that explain all the weird shit they come across. People writing new code should be guided to modern best practices. They don't need to know that you can implement Perl 5 OO by hand using array references, ties, and lvalue methods.

My OO tutorial is obviously aimed toward the green field developers. It's all about pointing them at good options on CPAN. As I revise perlobj, I'm trying to make sure that I cover every nook and cranny so that the poor developer stuck with 2001 Perl OO code can understand what they're maintaining.

(Sadly, that's probably my code they're stuck with.)

Conclusion

I'd like to see more explicit discussion of who the intended readers are when we discuss core documentation. Any major doc revision should start with a vision of who the docs are for.

There are probably other axes we can think about when writing documentation as well. Comments on this are most welcome.

16 Mar 2011 8:13pm GMT

15 Mar 2011

feedPlanet Perl

perl.com: Facebook Authentication with Perl and Facebook::Graph

Basic integration of software and web sites with Facebook, Twitter, and other social networking systems has become a litmus test for business these days. Depending on the software or site you might need to fetch some data, make a post, create events, upload photos, or use one or more of the social networking sites as a single sign-on system. This series will show you how to do exactly those things on Facebook using Facebook::Graph.

This first article starts small by using Facebook as an authentication mechanism. There are certainly simpler things to do, but this is one of the more popular things people want to be able to do. Before you can do anything, you need to have a Facebook account. Then register your new application (Figure 1).

registering a Facebook application
Figure 1. Registering a Facebook application.

Then fill out the "Web Site" section of your new app (Figure 2).

registering your application's web site
Figure 2. Registering your application's web site.

Registering an application with Facebook gives you a unique identifier for your application as well as a secret key. This allows your app to communicate with Facebook and use its API. Without it, you can't do much (besides screen scraping and hoping).

Now you're ready to start creating your app. I've used the Dancer web app framework, but feel free to use your favorite. Start with a basic Dancer module:

package MyFacebook;

use strict;
use Dancer ':syntax';
use Facebook::Graph;

get '/' => sub {
  template 'home.tt'
};

true;

That's sufficient to give the app a home page. The next step is to force people to log in if they haven't already:

before sub {
    if (request->path_info !~ m{^/facebook}) {
        if (!session('access_token')) {
            request->path_info('/facebook/login')
        }
    }
};

This little bit of Dancer magic says that if the path doesn't start with /facebook and the user has no access_token attached to their session, then reroute them to our login page. Speaking of our login page, create that now:

get '/facebook/login' => sub {
    my $fb = Facebook::Graph->new( config->{facebook} );
    redirect $fb->authorize->uri_as_string;
};

This creates a page that will redirect the user to Facebook, and ask them if it's ok for the app to use their basic Facebook information. That code passes Facebook::Graph some configuration information, so remember to add a section to Dancer's config.yml to keep track of that:

facebook:
    postback: "http://www.madmongers.org/facebook/postback/"
    app_id: "XXXXXXXXXXXXXXXX"
    secret: "XXXXXXXXXXXXXXXXXXXXXXXXXXX"

Remember, you get the app_id and the secret from Facebook's developer application after you create the app. The postback tells Facebook where to post back to after the user has granted the app authorization. Note that Facebook requires a slash (/) on the end of the URL for the postback. With Facebook ready to post to a URL, it's time to create it:

get '/facebook/postback/' => sub {
    my $authorization_code = params->{code};
    my $fb                 = Facebook::Graph->new( config->{facebook} );

    $fb->request_access_token($authorization_code);
    session access_token => $fb->access_token;
    redirect '/';
};

NOTE: I know it's called a postback, but for whatever reason Facebook does the POST as a GET.

Facebook's postback passes an authorization code (a sort of temporary password). Use that code to ask Facebook for an access token (like a session id). An access token allows you to request information from Facebook on behalf of the user, so all of those steps are, essentially, your app logging in to Facebook. However, unless you store that access token to use again in the future, the next request to Facebook will log you out. Therefore, the example shoves the access token into a Dancer session to store it for future use before redirecting the user back to the front page of the site.

NOTE: The access token we have will only last for two hours. After that, you have to request it again.

Now you can update the front page to include a little bit of information from Facebook. Replace the existing front page with this one:

get '/' => sub {
    my $fb = Facebook::Graph->new( config->{facebook} );

    $fb->access_token(session->{access_token});

    my $response = $fb->query->find('me')->request;
    my $user     = $response->as_hashref;
    template 'home.tt', { name => $user->{name} }
};

This code fetches the access token back out of the session and uses it to find out some information about the current user. It passes the name of that user into the home template as a template parameter so that the home page can display the user's name. (How do you know what to request and what responses you get? See the Facebook Graph API documentation.)
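
For completeness, a minimal views/home.tt to go with that might be (assuming Dancer's default <% %> tag style, which both its simple engine and its Template Toolkit wrapper use):

<h1>Hello, <% name %>!</h1>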

While there is a bit of a trick to using Facebook as an authentication system, it's not terribly difficult. Stay tuned for Part II where I'll show you how to post something to a user's wall.

15 Mar 2011 6:36pm GMT

CPAN Testers: Metabase SSL Certificate

For anyone who may have been affected by the upgrade to LWP, the situation should now be resolved. David has put in place a 3rd-party-verified SSL certificate on the Metabase server, so all submissions should now be able to verify the certificate's authenticity.

If you have implemented any short term fixes, you may need to remove them, before accepting the new certificate.

We now return you to your scheduled programming :)

Cross-posted from the CPAN Testers Blog

15 Mar 2011 2:08pm GMT

David Golden: Fixed CPAN Testers reporting with LWP 6

As Barbie reported, CPAN Testers broke under LWP version 6, as this version of LWP now defaults to rejecting unverifiable SSL connections (e.g. self-signed certificates). That meant that CPAN Testers upgrading their LWP could no longer submit reports (at least via https). The quick and obvious solution was to buy an SSL certificate, and that's now done. If you visit https://metabase.cpantesters.org/, you can see the new certificate in action.

15 Mar 2011 1:19pm GMT

14 Mar 2011

feedPlanet Perl

Chris Williams: Mangling Exchange GUIDs

I spent a good few hours today attempting to use the MailboxGUID returned from the WMI Exchange provider to search for the associated Active Directory account, using the msExchMailboxGuid attribute.

Here are two functions I came up with in the end. The first converts a MailboxGUID to something that a search on msExchMailboxGuid will like:

sub exch_to_ad {
  my $guid = shift;
  $guid =~ s/[\{\}]+//g;
  my $string = '';
  my $count = 0;
  foreach my $part ( split /\-/, $guid ) {
    $count++;
    if ( $count >= 4 ) {
      $string .= "\\$_" for unpack "(A2)*", $part;
    }
    else {
      $string .= "\\$_" for reverse unpack "(A2)*", $part;
    }
  }
  return $string;
}

And another that takes a msExchMailboxGuid field, which is a byte array, and converts it back to a MailboxGUID.

sub ad_to_exch {
  my $guid = shift;
  my @vals = map { sprintf("%.2X", ord $_) } unpack "(a1)*", $guid;
  my $string = '{';
  $string .= join '', @vals[3,2,1,0], '-', @vals[5,4], '-', 
     @vals[7,6], '-', @vals[8,9], '-', @vals[10..$#vals], '}';
  return $string;
}
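
Round-trip usage might look something like this (the GUID and variable names here are made up for illustration):

# Build an LDAP filter for the AD search from a WMI MailboxGUID...
my $escaped = exch_to_ad('{1B2E3D4C-5A6F-4711-8899-AABBCCDDEEFF}');
my $filter  = "(msExchMailboxGuid=$escaped)";

# ...and convert a fetched msExchMailboxGuid byte array back again.
my $mailbox_guid = ad_to_exch($raw_attribute_value);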

Hopefully this should save other people some time.

14 Mar 2011 1:50pm GMT

CPAN Testers: LWP v6.00 & Self-signed Certificates.

If you're an existing CPAN Tester and have recently upgraded LWP, you may have noticed that your report submissions have been failing. The reason is that LWP::UserAgent now requires that any https request verify the certificate associated with it. With the Metabase having a self-signed certificate, that doesn't provide enough verification, and so submissions fail.

In the short term, if you don't need to update LWP (libwww-perl), refrain from doing so. For those who have already done so, or have recently built test machines from a clean starting point, you will either need to wait until we have a long-term solution in place, or you may wish to look at a solution from Douglas Wilson. Douglas has created a "hypothetical distribution", which you can see via a gist.

Others have also blogged about the problem, with suggestions and insights on how to overcome this in the short term.
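
One common stopgap (a sketch; only sensible for an endpoint you already trust, such as the Metabase) is to switch hostname verification back off for the agent doing the submitting:

use LWP::UserAgent;

# Disable certificate verification for this agent only.
my $ua = LWP::UserAgent->new( ssl_opts => { verify_hostname => 0 } );

Setting the PERL_LWP_SSL_VERIFY_HOSTNAME environment variable to 0 has the same effect process-wide.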

We will have more details of the longer term solution soon.

Cross-posted from the CPAN Testers Blog

14 Mar 2011 9:08am GMT

Sawyer X: Dancer release codename "The Schwern Cometh"

We've decided we're gonna start releasing Dancer under codenames that relate to people who've worked on the release.

In this release (1.3020) we've seen the continued (and blessed!) involvement of a one "Michael G. Schwern". To some of you he might just be a "mike" or "michael" (or perhaps "the schwern"), but none of us in the core knew Schwern personally before his involvement with Dancer, and this came as a very welcome and pleasant surprise.

Considering the storm of issues and pull requests done by Schwern, we decided the next version should be named after him, hence "The Schwern Cometh". :)

The latest version represents only a week or so of development, but carries the following statistics:

I really do see this as exceptional work. Other than Schwern I also want to thank Naveed Massjouni and Maurice Mengel for their contributions to this release (and any previous release!).

In the near future we'll also unveil the most elaborate hooks subsystem in the micro web framework world. I already know whose names will be splashed on that release. :)

14 Mar 2011 8:32am GMT