3 posts tagged “monads”
I've been away for a while from Haskell so I thought I should do some revision and really get my head around Monads. While I plodded through the wonderful "meet the monads" tutorial, I decided that the best way to learn would be to do. By implementing Monads in Perl. I'd highly recommend trying to implement monads in Your Favourite Language, if it supports lambdas. Perl has already been done by Greg Buchholz and rather nicely too, but there's no Monad library on CPAN so I thought it would be worth a try.
First of all, the question of how to model "types" is easily resolved. We bless each monad into the Monad class or a subclass. These can then have methods for bind and return etc.
Now I do like the haskell >> and by a stroke of good fortune, Perl allows us to overload that symbol too.
use overload '>>' => 'Bind';
I use the string 'Bind' rather than the reference \&Bind, so that the subclasses can easily override it.
Some default bind methods in Monad.pm and Monad::Maybe etc., available here and we have some simple examples like this one (in test.pl):
my $result =
(Writer 2) >>
L { my $x = shift; (Writer $x*2, "Doubled. ") >>
L { my $y = shift; (Writer $y+1, "Plus 1. ") >>
L { my $z = shift; (Writer $z*3, "Tripled $z. ")
}}};
Woot! OK, that's not entirely beautiful, but it's been slightly improved by the overloading of >>.
The L lambda generator is also there for readability. It's basically defined as
sub L (&) { shift }i.e. it's an identity function, but it's an L (like lambda) and to my mind, lined up on the left, it looks pleasingly like "and then".
Nests
This didn't just fall straight out of the text editor into fully working code, of course. A blow-by-blow account of me getting confused wouldn't be especially interesting, but one big "aha" moment is worth pointing out. I realised that I was thinking of monads as being a chain of lambdas, each one passing control to the next, like OO chaining:
But that doesn't work, as of course then the $x, $y, $z of each scope would be separate, whereas in fact, in "later" sections, you can refer to $x too. This implies that the model is more like a nest of lambdas:
This is made fairly clear in the Perl above, with its delimited braces, if you look at where the closing "}" are, and which opening "{" they match up with.
This is an interesting mind shift, and one that I still haven't really fully grasped, as I'll demonstrate a bit later.
Polymorphic functions on monads
In Haskell, you can call "return" in a monadic block to "lift" a value to the appropriate monad. Similarly, you can call "fail", and the function will fail in the right way (returning Nothing in a Maybe, throwing an error in IO). This is a function call, not a method, so how does it know which monad to behave as?
Of course Haskell does this with its strong inferencing typechecker. The compiler "knows" that we are in Maybe, so "fail" will be fail :: Maybe.
Perl on the other hand doesn't have a strong type-inferencing compiler... Right now I'm doing some shonky magic with caller() that works in this very simple test case (and I believe only in this test case). I think I could just simplify things and set a dynamic variable "$Monad::current_monad" on the first occurrence of Bind. Yeah, global variables, yuck. The final alternative that occurs to me would be to run the whole thing in a Reader monad which just passes the name of the monad... but I'm fairly sure that's slightly insane.
So what can it do right now?
The test script shows the current capabilities. As of r246, I have Writer, Maybe, and List implemented (the Monad superclass is effectively Identity).
I think Maybe is very useful - with some wrapper functions that raise Perl functions to monadic ones using a variety of strategies (fail on undef/0/die etc.) it could be a useful addition to the toolbox, simplifying a nested set of if checks.
The List monad already does list comprehensions, albeit with a rather yucky syntax. Which is of course the big problem, 'cos Perl programmers (and this statement may surprise non Perl programmers :-) are often obsessive about syntax.
Making it look pretty
OK, so we already added a bit of sugar with the >> overloading, and the L function for lambda generators, but it's still rather ugly with the mix of Perlish argument unpacking (my $x = shift), scope delimiters (}}}) etc.
Source filters!
The original Perl monad tutorial used a source filter to give a monadic Do notation. It's a fairly nice one as they go, but I don't really want to treat my program as a string if I can help it, so let's look at some other techniques first!
Devel::Declare
Matt Trout has been working on some crazy parsing magic in Devel::Declare. This isn't a source filter, but (I think) hooks into Perl's parser to change the way that subroutine declarations are parsed. It'd designed to give us parameter unpacking, so that we could substitute:
with:L {my $x = shift; .... }
L ($x) { .... }
In the current version this doesn't work (you can define L like that easily, but the overloaded >> evidences a minor parsing bug (you'd have to put the expression between parentheses to get the precedence right, which loses the syntactic advantage we gain).
Still, hopefully will be fixed in a future release.
Generators
"Valued Lessons" has a beautiful post on Monads in Python (with nice syntax!). The parenthesis is not hyperbole: the post describes a monadic do block which looks about as pretty as Haskell's, but which works in a different way. We spell 'bind' (Haskell's <-) as 'yield'. So a control sub calls the 'do' block, gets out monadic values one by one as they are yielded back, and deals with the nitty gritty of binding them to the rest of the generator.
It took quite a while to understand the Python code: in fact I'm not sure I understand it fully, I really don't buy into the "Python is so easy to read" meme, and certainly the "@whatever" syntax, which seems to be 'decorators' that modify the subroutine that follows them, are rather confusing at first. But it's quite impressive, and it took me a while to replicate in Perl.
First hurdle: Perl doesn't have generators. OK, that shouldn't be an issue, I thought, because we have the CPAN. And yes, I found Brock Wilcox's Coro::Generator.
This doesn't quite do what I want though. The yield only works one way, so
doesn't actually bind $x to 3. I asked Brock on IRC, and apparently this behaviour is desired (I'm not quite sure why) so I forked his code to play with it :-) Also, the coroutine restarts immediately it finishes, which is inconvenient. Brock suggested yielding undef at the end, which is fine, I can do that from the control sub. (The Python version deals with finishing by throwing an exception, so perhaps it has the same semantics?)my $x = yield (Monad 3);
After a lot of ugly pain, I finally got this working, and we can now do:
my $result = Do {
my $x = yield (Just 3);
my $y = yield (Nothing);
my $z = yield (Just 5);
warn "x=$x, y=$y, z=$z";
Just 6;
Why the pain? Failing to understand coroutines while trying to use them to implement monads (which I understand only very slightly) was a bad start. I found myself using the Do function to repeatedly take a value from the generator and bind it with the next value (rather than letting the monadic bind deal with those details). And even when I'd realised that the sub that I needed to bind was a lambda that would abstract the details of invoking the coroutine, I still ended up flailing around more or less at random till I finally got it working.
The current code is ugly (declared inline in test.pl rather than modularized) but the result is pleasantly magical and readable.
Props of course to Python for having powerful techniques like yield and decorators in core!
Hold the champagne
Of course the final test example, in the List monad doesn't work. Why? The List monad's bind strategy is to call the function on every element of the list, so the coroutine will get called repeatedly. And every time it's called, the execution pointer will move on.
I wonder whether the Python version has the same problem? I looked again at the Coro modules on CPAN, and noted that they are advertised as being able to implement "(non-clonable) continuations". I think this is the problem: I want to be able to take the point at which the next Bind will be called, and call exactly that same point multiple times (for the List monad). I asked various people including Brock again, and Scott Walters (the authors of Continuity, a continuation-based web application framework in Perl) and got the answer that Perl really doesn't do proper continuations. (As far as I understood it, they're more or less practically impossible, due to the way Perl models its execution context).
So, unless I've misunderstood (and please let me know if I have!) this technique is limited to monads that only call the bound function once (e.g. most of them except List). That's a shame though, as the List comprehension semantics would be lovely to express in a monadic do block.
Meta continuations
The Valued Lesson post does implement continuations monadically... Could we do that and then implement monadic do using these monadic continuations? I think the answer might be "Yes but my brain would explode trying to implement it".
Plan B
I think that the most sensible method may be to take the contents of the monadic do block and use the B:: modules to convert them from what looks like
tomy $x = bind ...;
. Which is pretty much the approach of Greg Buchholz's source filter. But I think a parse tree transformation may be more elegant. (This said, I don't know the Perl source or understand the opcodes, so it may just be slightly crazy).... >> sub { my $x = shift;
Update: Some discussion on reddit, as Vox still doesn't support OpenID
I'd been out of Liverpool for 3 years or so, and I completely missed GeekUp: a loosely affiliated, grassroots tech meetup society in the North West. The Liverpool branch is pretty active, and linked with various other groups, such as the DotNet user group, where Chris Alcock gave a very interesting talk on F#.
Not sure what the Dot Net programmers made of it, there were some questions afterwards amounting to "What's the point? Are academics going to use it?" :-) Which I thought was amusing as I was looking at it from the perspective of a Haskell newbie, thinking a) that's cool, b) it's simpler than Haskell, c) it plugs into the .Net libraries and development environment, and had concluded that it could (possibly) become a massive hit in real-world programming too...
Some notes, mainly comparisons with Haskell:
- The #light pragma adds syntactic sugar, very much like the Haskell do notation. (No need for open/close/statement delimiters, whitespace significant, skip in off let statements)
- The REPL is multiline by default (dons on #haskell noted that ghci supports this with :{
- Functions aren't recursive by default and have to be introduced as such (let rec factorial = ...)
- Lists aren't lazy, but there's a separate datatype called Seq which is.
- You can use yield in a function to define lazy sequences.
- Like Ocaml, F# supports mutable data, though the default is pure data.
- It also supports reference types - I'm not sure I understand what these are for in a functional programming language, or if they're just for updating, then how they're different from plain old mutable data. The example Chris gave was the classic closure example of a counter function.
- The examples he made of pipelining with |> and >> looked very much Haskellish monads (especially in a #light style block). I quickly leafed through Chris's copy of till I found the section on monads. They're called "Workflows", because that's less scary than Monads. They're also largely transparent, which on one hand is nice, as there's none of the painful and ugly "lifting" of values to the appropriate monad. (On the other, it means that you can't easily separate action and non-action code).
- This of course means that you don't get some of the benefits of FP's purity. Martin Owen, who is keen on Erlang and its capacity for massively scalable, high performance networking, also pointed out that allowing mutability means that you can't guarantee that the application is threadsafe, and is the wrong default as we're coming up against multicore programming.
All in all, it was a very interesting talk, and I'm looking forward to playing with F#. I apt-got Mono, and promptly failed to install F# on it, as the provided install.sh script whined about something to do with gac and aot. Ah well, perhaps that's a good excuse to boot up into Windows...
The next obvious functions after “Pick” are “Unpick” to eject a value randomly from the list, and “Pick N” values from the list.
unpick is fairly straight-forward. To do in a single pass, you have to thread through the list of values, but otherwise it's more or less the same as the previous code for pick
> unpick :: [a] -> IO [a]
> unpick [] = undefined
> unpick [x] = do return []
> unpick (x:xs) = do zs <- unpick' [] [x] xs 2
> return (reverse zs)
>
> unpick' :: (Num p, Random p) => [t] -> [t] -> [t] -> p -> IO [t]
> unpick' curr orig [] _
> = do return curr
> unpick' curr orig (next:rest) prob
> = do r <- getStdRandom (randomR (1,prob))
> let curr' = if r == 1 then orig else (next:curr)
> unpick' curr' (next:orig) rest (prob+1)
To do pickN I'd originally thought of picking the required number, and then following an algorithm similar to that of pick but dropping one (with unpick) every time I selected a new entry. That would be quadratic, and also insane. I discussed with dakkar, and we agreed that it might just be simpler to call something like pick N times (maybe with a function that returned the picked and the rejected part).
But, if we know the length of the array things are much easier. We can just scan the array, selecting elements with a probability of (r/e) where r is the number of required elements left to be selected, and e the number of elements left in the array.
> pickN :: Int -> [a] -> IO [a]
> pickN n xs = let len = length xs
> in pickN' n len xs
And we just need
> pickN' :: Int -> Int -> [a] -> IO [a]
> pickN' n l [] = do return []
> pickN' n l (x:xs)
> = do b <- roll n l
> if b
> then x: (pickN' (n-1) (l-1) xs)
> else pickN' n (l-1) xs
and to define the auxhiliary function roll, because I got tired of writing it out all over the place:
> roll p q = do r <- getStdRandom (randomR (1,q))
> return $ r <= p
Which is all great apart from not working. The problem is that pickN’ returns a monadic value, while x is a pure one. I asked on #haskell and got help from oerjan, scook0, and DRMacIver.
First of all they suggested:
> if b
> then (x:) `liftM` (pickN' (n-1) (l-1) xs)
> else pickN' n (l-1) xs
Now I intuitively sort-of-understood what this was doing but not why. (Bear in mind that I'm still confused about things like “why do I return this but not that?” and “How does it know which Monad it's in?”)
It turns out that liftM lifts a function with one pure argument into a function with one monadic argument. But I'm concatenating x (pure) with picked xs (monadic)! So the section (x:) is actually a pure function specialised to concatenate “x” with a pure list (which is the single argument).
My first reaction was a mix of wonder and “good god that's horrible”. As they pointed out though, I could just use another do block.
> if b
> then do xs <- pickN' (n-1) (l-1) xs
> return (x:xs)
> else pickN' n (l-1) xs
Hurrah! Now I should go and check out a random monad like MonadRandom...
