25 posts tagged “haskell”
The Countdown code I showed you isn't really taking advantage of Haskell's laziness. We should only have to check entries up until the point that we have enough matches (in the current code 'take 4 $ getAnagrams') and that's good. However, we have to generate the whole powerset first, so that we can sort it in reverse order of length. Ideally, we'd generate the powerset breadth-first, in order of length.
OK, so generating the powerset doesn't take all that much time and this isn't a good optimization as such, but I did think it might be fun trying to do the breadth first powerset, as all the examples I'd seen had been depth first.
But first: an interlude to marvel at one of the scariest things I've seen this year. A monadic definition of powerset.
import Control.Monad (filterM)
powerset = filterM (const [True, False])
I'm not even going to attempt to twist my head around that now, but it's very beautiful, though it's impossible to tell just by reading it what the intent of the code is.
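For what it's worth, here is one hedged way to read that definition. In the list monad, filterM asks "keep this element?" for each item, and the predicate answers both True and False, so every combination of answers gets explored. The name powersetExplicit is made up here, purely for comparison.

```haskell
import Control.Monad (filterM)

-- The monadic one-liner from above.
powerset :: [a] -> [[a]]
powerset = filterM (const [True, False])

-- Hypothetical hand-written equivalent: for each element, first every
-- subset that keeps it, then every subset that drops it.
powersetExplicit :: [a] -> [[a]]
powersetExplicit []     = [[]]
powersetExplicit (x:xs) = map (x:) rest ++ rest
  where rest = powersetExplicit xs
```

Both give ["abc","ab","ac","a","bc","b","c",""] for "abc", which also shows the depth-first ordering: "a" turns up before "bc".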
I asked on #london.pm if anyone knew good breadth-first algorithms for powerset. Joel helpfully pasted the following:
(define (combinations set n)
  (if (zero? n)
      (list '())
      (let ((n2 (- n 1)))
        (pair-fold-right
         (lambda (pr acc)
           (let ((first (car pr)))
             (append (map (cut cons first <>)
                          (combinations (cdr pr) n2))
                     acc)))
         '()
         set))))
(define (power-set set)
  (let ((size (length set)))
    (let loop ((i 0))
      (if (> i size)
          '()
          (append (combinations set i)
                  (loop (+ i 1)))))))
I think it surprised Joel (and actually, it surprised me a little) that I was more or less unable to read this at all. Yes, I know that the syntax of Lisp is incredibly simple, and you can learn all the syntax of Scheme in 20 minutes or whatever. The noise of the parentheses, the cdrs etc. is just noise, you can filter it out if you look at it calmly. But I still don't understand where the pairs are in pair-fold-right, and cut apparently does something similar to currying, but what does that mean in context?
To cut a long story short, I was reading this as a foreign language rather than a piece of generic pseudocode. When I was very little, I read with my mother a picture book about frogs, in German. With the help of the pictures and a little imagination, it was easy to tell which word meant "frog", which meant "tree", and what the sense of the story was. After we finished, very puffed up with just how damn clever I was, I started trying to read it again, and got utterly confused about all these strange words like 'in' and 'auf' and 'dem' that I just hadn't worried about the first time around.
So... trying to see the frog for the trees, we can see that combinations gives every set of combinations of a particular length, and that power-set merely loops through 0..length, appending the combinations found for that length.
We can write combinations as a variant on the original powerset function, but which refuses to carry on once it's got enough letters:
combinations xs n = comb' xs n (length xs)
comb' xss@(x:xs) n l | n == 0    = [[]]
                     | l == n    = [xss]
                     | otherwise = (map (x:) $ comb' xs (n-1) (l-1)) ++ (comb' xs n (l-1))
comb' [] 0 _ = [[]]
comb' [] _ _ = []   -- asking for more letters than are left yields nothing
And powerset is as easy as:
powerset xs = powerset' xs (length xs)
powerset' xs l = if l < minLength
                   then []
                   else (combinations xs l) ++ (powerset' xs (l-1))
minLength = 3
We can now remove the lines with sortBy and filter longEnough, as the new definition already presents the items in the right order.
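To try the length-ordered idea in isolation, here is a rough self-contained sketch. The names combinationsOf and powersetByLength are invented for this example, and unlike the code above it doesn't track the remaining length: it simply lets the empty-list case prune impossible branches.

```haskell
-- All ways of choosing n elements from a list, keeping their order.
combinationsOf :: Int -> [a] -> [[a]]
combinationsOf 0 _      = [[]]
combinationsOf _ []     = []
combinationsOf n (x:xs) =
  map (x:) (combinationsOf (n - 1) xs) ++ combinationsOf n xs

-- Subsets grouped by size, longest first, stopping at minLen.
powersetByLength :: Int -> [a] -> [[a]]
powersetByLength minLen xs =
  concat [combinationsOf n xs | n <- [len, len - 1 .. minLen]]
  where len = length xs
```

With a minimum length of 3, "abcd" gives ["abcd","abc","abd","acd","bcd"], and anything shorter than the minimum gives an empty list.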
Does this make it any faster? Apparently not: as I guessed, powerset is not the hotspot. I guess that the problem is the repeated lookups in the Data.Map — any suggestions on how to profile the Haskell code, and better algorithms to deal with it?
In the last post, I mentioned that we might be able to improve the performance of our sort using a "Schwartzian transform" (http://en.wikipedia.org/wiki/Schwartzian_transform).
This basically involves precaching the expensive calculation (in this case, length), sorting by the cached value, then taking just the original values.
Let's test with a list of words:
words "If on a winter's night a traveller"
So instead of the original sortBy (flip $ comparing length), we'd have something like:
  map fst
  . sortBy (flip $ comparing snd)
  . map (id &&& length)
Let's read it from the bottom. First of all we create a list of tuples using the rather wonderful &&& operator from Control.Arrow.
[("If",2),("on",2),("a",1),("winter's",8),("night",5),("a",1),("traveller",9)]
Then we sort comparing the new second field of this tuple.
[("traveller",9),("winter's",8),("night",5),("If",2),("on",2),("a",1),("a",1)]
Finally we map again, getting just the first element of the tuple.
["traveller","winter's","night","If","on","a","a"]
And we can easily abstract this with a new function
sortST cmp f = map fst
             . sortBy (cmp snd)
             . map (id &&& f)
Now we can write:
sortST comparing length listOfWords
sortST (flip.comparing) length listOfWords
This is very similar to the sortBy syntax, except that we've separated out the "comparing" from the "length" clause, in order to compose the two separately for the new transformation.
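Here is the same function as a self-contained sketch with its imports and a type signature, so it can be loaded into ghci directly. The names longestFirst and longestFirst' are made up for the example. As far as I know, Data.List.sortOn in base does the same decorate-sort-undecorate dance, so it is included for comparison.

```haskell
import Control.Arrow ((&&&))
import Data.List (sortBy, sortOn)
import Data.Ord (Down (..), comparing)

-- Decorate each element with (f x), sort on the cached value, undecorate.
sortST :: (((a, b) -> b) -> (a, b) -> (a, b) -> Ordering)
       -> (a -> b) -> [a] -> [a]
sortST cmp f = map fst . sortBy (cmp snd) . map (id &&& f)

-- Longest strings first, using the hand-rolled transform.
longestFirst :: [String] -> [String]
longestFirst = sortST (flip . comparing) length

-- The same ordering using the library function.
longestFirst' :: [String] -> [String]
longestFirst' = sortOn (Down . length)
```

Both versions are stable, so words of equal length keep their original relative order.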
Will on #geekup has been working on a Countdown letters and numbers game solver written in Python. I thought it'd be fun to try to do it in Haskell, and started with the letters game (anagram) solver.
Starting with a string of jumbled letters, the goal is to make the longest possible anagram. I remember the first time I tried to solve anagrams I jumped into the problem without thinking and got mixed up in all kinds of complicated combinatorial mess. The actual answer is very simple: let's take two words which are anagrams of each other:
- monad
- nomad
Both of them contain the same letters, so they are identical in some form of "canonical representation", for example
- {a:1, d:1, m:1, n:1, o:1} -- dictionary mapping letter to number of times used
- "admno" -- a string with the letters sorted
So for example:
pmqnrdzoa
...
 m n d oa
...
This function is called a powerset. I'm lazy so I googled a definition. We want the longest words first. The definition of powerset I found does a depth first search so it's not in order of length. What we want to do is to work on a list like this
list s =
    sortBy (flip $ comparing length) -- longest first
  . nub                              -- unique entries only
  . powerset                         -- all combinations of
  . canonicalize                     -- canonical (sorted) string
  $ s
where canonicalize is just sort . map toLower . filter isLetter.
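The canonical form is easy to try out on its own. A minimal sketch, where isAnagramOf is a hypothetical helper added only for illustration:

```haskell
import Data.Char (isLetter, toLower)
import Data.List (sort)

-- Lower-case the letters, drop anything else, and sort what remains.
canonicalize :: String -> String
canonicalize = sort . map toLower . filter isLetter

-- Two words are anagrams when their canonical forms agree.
isAnagramOf :: String -> String -> Bool
isAnagramOf a b = canonicalize a == canonicalize b
```

So canonicalize "monad" and canonicalize "Damon" both give "admno", and punctuation is dropped along the way ("winter's" becomes "einrstw").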
comparing is a nice little utility sub that makes the above effectively the same as (\a b -> length a `compare` length b). We then flip it to reverse the ordering (and this is actually a good use for flip ;-).
Ordering by length is potentially inefficient — it checks the length of each element twice, and unlike Perl (where a string knows its own length), a string is just a list, so it has to descend the list to find it out. This is easy to optimize by precalculating the lengths, using a technique that in Perl we call the "Schwartzian transform", and I'll probably come back to this.
OK, so we have a list of subsets to compare, now we need to find a dictionary of canonical representations of words. Luckily most unixy distributions ship with one, often /usr/dict/words, but Ubuntu sticks it elsewhere.
I asked on #haskell, and was told I should use a Data.Map, Haskell's basic equivalent of a hash or associative array, but implemented using a Functional Programming friendly tree representation. In actual fact, quicksilver, mrs, and mmorrow told me the answer straight away, but let's pretend for the purpose of this post that we're working it out now :-)
Assuming I load that module, as is common, as M, I'd essentially want to call M.insertWith (++) on each element. The (++) is the concatenation operator, and it's the right thing to use because the dictionary is mapping String -> [String], for example
fromList [("admno", ["nomad","monad","Damon"]),...]
insertWith returns a new copy of the Map each time. It's like an accumulator which gradually takes on the entries from the list of words. And whenever we think about accumulators, we can think about folds.
foldl' (\m x -> M.insertWith (++) (canonicalize x) [x] m) mempty listOfWords
mempty is shorthand here for "an empty Data.Map". But we can go one better as apparently fold/insertWith is so common that there is a shorthand, fromListWith!
fromListWith (++) . map (canonicalize &&& return)
Woah! That's quite compact, and I just introduced some new syntax too: The &&& is basically saying "let's make a tuple with the result of calling these 2 functions on my input!" so it's the same as
fromListWith (++) . map (\a -> (canonicalize a, return a))
And return just means "wrap this value in the appropriate Monad". So it's a scary way of saying [a], because we're "in" the List monad. (In a similar way, mempty above was an empty Data.Map because of the type expected there, though that one comes from Monoid rather than Monad.)
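To check that the fold and the shorthand really do build the same dictionary, here is a small self-contained sketch. The names mkdictFold and mkdictFromList are made up for the comparison, and the word list in the usage note is just sample data.

```haskell
import Control.Arrow ((&&&))
import Data.Char (isLetter, toLower)
import Data.List (foldl', sort)
import qualified Data.Map as M

canonicalize :: String -> String
canonicalize = sort . map toLower . filter isLetter

-- Explicit accumulator: insert each word under its canonical key.
mkdictFold :: [String] -> M.Map String [String]
mkdictFold = foldl' (\m x -> M.insertWith (++) (canonicalize x) [x] m) M.empty

-- The fromListWith shorthand; return here means "singleton list".
mkdictFromList :: [String] -> M.Map String [String]
mkdictFromList = M.fromListWith (++) . map (canonicalize &&& return)
```

Given ["nomad","monad","Damon","frog"], both functions map "admno" to ["Damon","monad","nomad"]: later words end up at the front, because (++) is applied with the new value on the left.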
Whenever I play with Map, I get angry errors about the monomorphism restriction. The way around that is to add an explicit type signature. If, like me, you're not quite sure what to put there, you can add a compiler directive to quell the error, then work out what the signature would be by calling :t my_function from the GHCi command line. (You'll often find afterwards that you can remove the signatures if you wanted to, because later on the compiler has more information to work out the types of things. It's only really during incremental development that you get the problem.)
{-# LANGUAGE NoMonomorphismRestriction #-}
-- (that's the compiler directive, you can comment this out later)
makeAnag s = do
    d <- dict
    return $ take 4 $ getAnagrams s d
dict = do file <- readFile "/etc/dictionaries-common/words"
          return $ mkdict $ lines file
mkdict :: [String] -> M.Map String [String]
mkdict = M.fromListWith (++) . map (canonicalize &&& return) . filter longEnough
longEnough = (>=3) . length
As you can see, for all the perceived difficulty of doing IO in a pure language like Haskell, it doesn't seem all that hard in this simple case. readFile reads the file, and lines splits it into a list of lines.
The final thing is to check each powerset against the dictionary. To extract the value, we use M.lookup. This function fails if it can't find a value. So we could do
- For each powerset in the list
- Check if it's present
- And add it to the list if so
[ ["anagram"], [], [], ["anagram 1", "anagram 2"], [] ]
With an empty list for each failure. We can use concatMap to join these together. So it's:
concatMap (\v -> M.lookup v dict) listOfPowersets
Though that actually returns:
[ ["anagram"], ["anagram 1", "anagram 2"] ]
which I hadn't expected. (M.lookup returned a list like ["anagram 1", "anagram 2"]. Quite literally it returned it, which in List context means it actually passed [["anagram 1", "anagram 2"]], which is why the list isn't completely flattened by concatMap.) I get around this by using join. This is another of those monadic functions: in List context it does exactly what we want here, flattening this list.
getAnagrams s d = join
    . concatMap (flip M.lookup $ d)
    $ filter longEnough                -- 3 or more letters
    . sortBy (flip $ comparing length) -- longest first
    . nub                              -- unique entries only
    . powerset                         -- all combinations of
    . canonicalize                     -- canonical (sorted) string
    $ s
You can look at the final Haskell Countdown code. I'll look at optimizing the sort and the powersets soon, any comments on other improvements (including better algorithms) very welcome. (Sorry, comments require Vox signup...)
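The join step depends on M.lookup returning its result in whatever monad the caller asks for. In current versions of the containers package, M.lookup returns a plain Maybe instead, so the lookup stage would look a little different today. A hedged sketch of just that stage follows; findAnagrams is a made-up name, and the candidate keys are passed in rather than generated.

```haskell
import Data.Maybe (mapMaybe)
import qualified Data.Map as M

-- Look up every candidate key, drop the misses, and flatten the hits.
findAnagrams :: M.Map String [String] -> [String] -> [String]
findAnagrams dict = concat . mapMaybe (`M.lookup` dict)
```

mapMaybe throws away the Nothing results, so a single concat is enough to flatten what remains. The result is still produced lazily, so take 4 on it only forces as many lookups as it needs.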
I've been away for a while from Haskell so I thought I should do some revision and really get my head around Monads. While I plodded through the wonderful "meet the monads" tutorial, I decided that the best way to learn would be to do. By implementing Monads in Perl. I'd highly recommend trying to implement monads in Your Favourite Language, if it supports lambdas. Perl has already been done by Greg Buchholz and rather nicely too, but there's no Monad library on CPAN so I thought it would be worth a try.
First of all, the question of how to model "types" is easily resolved. We bless each monad into the Monad class or a subclass. These can then have methods for bind and return etc.
Now I do like the Haskell >> and, by a stroke of good fortune, Perl allows us to overload that symbol too.
use overload '>>' => 'Bind';
I use the string 'Bind' rather than the reference \&Bind, so that the subclasses can easily override it.
There are some default bind methods in Monad.pm, Monad::Maybe etc. (available here), and we have some simple examples like this one (in test.pl):
my $result =
(Writer 2) >>
L { my $x = shift; (Writer $x*2, "Doubled. ") >>
L { my $y = shift; (Writer $y+1, "Plus 1. ") >>
L { my $z = shift; (Writer $z*3, "Tripled $z. ")
}}};
Woot! OK, that's not entirely beautiful, but it's been slightly improved by the overloading of >>.
The L lambda generator is also there for readability. It's basically defined as
sub L (&) { shift }
i.e. it's an identity function, but it's an L (like lambda) and to my mind, lined up on the left, it looks pleasingly like "and then".
Nests
This didn't just fall straight out of the text editor into fully working code, of course. A blow-by-blow account of me getting confused wouldn't be especially interesting, but one big "aha" moment is worth pointing out. I realised that I was thinking of monads as being a chain of lambdas, each one passing control to the next, like OO chaining:
But that doesn't work, as of course then the $x, $y, $z of each scope would be separate, whereas in fact, in "later" sections, you can refer to $x too. This implies that the model is more like a nest of lambdas:
This is made fairly clear in the Perl above, with its delimited braces, if you look at where the closing "}" are, and which opening "{" they match up with.
This is an interesting mind shift, and one that I still haven't really fully grasped, as I'll demonstrate a bit later.
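The nesting can be seen from the Haskell side too. The sketch below is only an illustration, not the Perl library's behaviour: it uses the pair monad from base (a log paired with a value) as a stand-in for Writer, and the names chained and nested are made up. The last step deliberately refers back to x, which only works because each lambda sits inside the previous one.

```haskell
-- do-notation version; the String is the log, the Int is the value.
chained :: (String, Int)
chained = do
  x <- ("", 2)
  y <- ("Doubled. ", x * 2)
  z <- ("Plus 1. ", y + 1)
  ("Tripled " ++ show z ++ ", started from " ++ show x ++ ". ", z * 3)

-- The same thing with the nest of lambdas written out by hand.
nested :: (String, Int)
nested =
  ("", 2) >>= (\x ->
    ("Doubled. ", x * 2) >>= (\y ->
      ("Plus 1. ", y + 1) >>= (\z ->
        ("Tripled " ++ show z ++ ", started from " ++ show x ++ ". ", z * 3))))
```

Both evaluate to ("Doubled. Plus 1. Tripled 5, started from 2. ", 15).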
Polymorphic functions on monads
In Haskell, you can call "return" in a monadic block to "lift" a value to the appropriate monad. Similarly, you can call "fail", and the function will fail in the right way (returning Nothing in a Maybe, throwing an error in IO). This is a function call, not a method, so how does it know which monad to behave as?
Of course Haskell does this with its strong inferencing typechecker. The compiler "knows" that we are in Maybe, so "fail" will be fail :: Maybe.
Perl on the other hand doesn't have a strong type-inferencing compiler... Right now I'm doing some shonky magic with caller() that works in this very simple test case (and I believe only in this test case). I think I could just simplify things and set a dynamic variable "$Monad::current_monad" on the first occurrence of Bind. Yeah, global variables, yuck. The final alternative that occurs to me would be to run the whole thing in a Reader monad which just passes the name of the monad... but I'm fairly sure that's slightly insane.
So what can it do right now?
The test script shows the current capabilities. As of r246, I have Writer, Maybe, and List implemented (the Monad superclass is effectively Identity).
I think Maybe is very useful - with some wrapper functions that raise Perl functions to monadic ones using a variety of strategies (fail on undef/0/die etc.) it could be a useful addition to the toolbox, simplifying a nested set of if checks.
The List monad already does list comprehensions, albeit with a rather yucky syntax. Which is of course the big problem, 'cos Perl programmers (and this statement may surprise non Perl programmers :-) are often obsessive about syntax.
Making it look pretty
OK, so we already added a bit of sugar with the >> overloading, and the L function for lambda generators, but it's still rather ugly with the mix of Perlish argument unpacking (my $x = shift), scope delimiters (}}}) etc.
Source filters!
The original Perl monad tutorial used a source filter to give a monadic Do notation. It's a fairly nice one as they go, but I don't really want to treat my program as a string if I can help it, so let's look at some other techniques first!
Devel::Declare
Matt Trout has been working on some crazy parsing magic in Devel::Declare. This isn't a source filter, but (I think) hooks into Perl's parser to change the way that subroutine declarations are parsed. It's designed to give us parameter unpacking, so that we could substitute:
L {my $x = shift; .... }
with:
L ($x) { .... }
In the current version this doesn't work: you can define L like that easily, but the overloaded >> exposes a minor parsing bug (you'd have to put the expression between parentheses to get the precedence right, which loses the syntactic advantage we gain).
Still, hopefully this will be fixed in a future release.
Generators
"Valued Lessons" has a beautiful post on Monads in Python (with nice syntax!). The parenthesis is not hyperbole: the post describes a monadic do block which looks about as pretty as Haskell's, but which works in a different way. We spell 'bind' (Haskell's <-) as 'yield'. So a control sub calls the 'do' block, gets out monadic values one by one as they are yielded back, and deals with the nitty gritty of binding them to the rest of the generator.
It took quite a while to understand the Python code: in fact I'm not sure I understand it fully. I really don't buy into the "Python is so easy to read" meme, and certainly the "@whatever" syntax, which seems to be for 'decorators' that modify the subroutine that follows them, is rather confusing at first. But it's quite impressive, and it took me a while to replicate in Perl.
First hurdle: Perl doesn't have generators. OK, that shouldn't be an issue, I thought, because we have the CPAN. And yes, I found Brock Wilcox's Coro::Generator.
This doesn't quite do what I want though. The yield only works one way, so
my $x = yield (Monad 3);
doesn't actually bind $x to 3. I asked Brock on IRC, and apparently this behaviour is desired (I'm not quite sure why) so I forked his code to play with it :-) Also, the coroutine restarts immediately it finishes, which is inconvenient. Brock suggested yielding undef at the end, which is fine, I can do that from the control sub. (The Python version deals with finishing by throwing an exception, so perhaps it has the same semantics?)
After a lot of ugly pain, I finally got this working, and we can now do:
my $result = Do {
    my $x = yield (Just 3);
    my $y = yield (Nothing);
    my $z = yield (Just 5);
    warn "x=$x, y=$y, z=$z";
    Just 6;
};
Why the pain? Failing to understand coroutines while trying to use them to implement monads (which I understand only very slightly) was a bad start. I found myself using the Do function to repeatedly take a value from the generator and bind it with the next value (rather than letting the monadic bind deal with those details). And even when I'd realised that the sub that I needed to bind was a lambda that would abstract the details of invoking the coroutine, I still ended up flailing around more or less at random till I finally got it working.
The current code is ugly (declared inline in test.pl rather than modularized) but the result is pleasantly magical and readable.
Props of course to Python for having powerful techniques like yield and decorators in core!
Hold the champagne
Of course the final test example, in the List monad, doesn't work. Why? The List monad's bind strategy is to call the function on every element of the list, so the coroutine will get called repeatedly. And every time it's called, the execution pointer will move on.
I wonder whether the Python version has the same problem? I looked again at the Coro modules on CPAN, and noted that they are advertised as being able to implement "(non-clonable) continuations". I think this is the problem: I want to be able to take the point at which the next Bind will be called, and call exactly that same point multiple times (for the List monad). I asked various people including Brock again, and Scott Walters (the authors of Continuity, a continuation-based web application framework in Perl) and got the answer that Perl really doesn't do proper continuations. (As far as I understood it, they're more or less practically impossible, due to the way Perl models its execution context).
So, unless I've misunderstood (and please let me know if I have!) this technique is limited to monads that only call the bound function once (e.g. most of them except List). That's a shame though, as the List comprehension semantics would be lovely to express in a monadic do block.
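For comparison, here is what the List monad does on the Haskell side, as a minimal sketch (pairsDo and pairsExplicit are names invented here). The explicit version makes the problem visible: the "rest of the block" is a function that gets run once per element, which is exactly what a single-shot coroutine can't offer.

```haskell
-- A list comprehension written as a monadic do block.
pairsDo :: [(Int, Char)]
pairsDo = do
  x <- [1, 2, 3]
  y <- "ab"
  return (x, y)

-- The same thing with bind spelled out as concatMap: each continuation
-- (the lambda) is called once for every element of the list before it.
pairsExplicit :: [(Int, Char)]
pairsExplicit =
  concatMap (\x -> concatMap (\y -> [(x, y)]) "ab") [1, 2, 3]
```

Both give [(1,'a'),(1,'b'),(2,'a'),(2,'b'),(3,'a'),(3,'b')].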
Meta continuations
The Valued Lessons post does implement continuations monadically... Could we do that and then implement monadic do using these monadic continuations? I think the answer might be "Yes but my brain would explode trying to implement it".
Plan B
I think that the most sensible method may be to take the contents of the monadic do block and use the B:: modules to convert them from what looks like
my $x = bind ...;
to
... >> sub { my $x = shift;
Which is pretty much the approach of Greg Buchholz's source filter. But I think a parse tree transformation may be more elegant. (This said, I don't know the Perl source or understand the opcodes, so it may just be slightly crazy.)
Update: Some discussion on reddit, as Vox still doesn't support OpenID
I'd been out of Liverpool for 3 years or so, and I completely missed GeekUp: a loosely affiliated, grassroots tech meetup society in the North West. The Liverpool branch is pretty active, and linked with various other groups, such as the DotNet user group, where Chris Alcock gave a very interesting talk on F#.
Not sure what the Dot Net programmers made of it, there were some questions afterwards amounting to "What's the point? Are academics going to use it?" :-) Which I thought was amusing as I was looking at it from the perspective of a Haskell newbie, thinking a) that's cool, b) it's simpler than Haskell, c) it plugs into the .Net libraries and development environment, and had concluded that it could (possibly) become a massive hit in real-world programming too...
Some notes, mainly comparisons with Haskell:
- The #light pragma adds syntactic sugar, very much like the Haskell do notation. (No need for open/close/statement delimiters, whitespace is significant, and you can skip the in of let statements.)
- The REPL is multiline by default (dons on #haskell noted that ghci supports this with :{ and :})
- Functions aren't recursive by default and have to be introduced as such (let rec factorial = ...)
- Lists aren't lazy, but there's a separate datatype called Seq which is.
- You can use yield in a function to define lazy sequences.
- Like OCaml, F# supports mutable data, though the default is pure data.
- It also supports reference types - I'm not sure I understand what these are for in a functional programming language, or if they're just for updating, then how they're different from plain old mutable data. The example Chris gave was the classic closure example of a counter function.
- The examples he made of pipelining with |> and >> looked very much like Haskellish monads (especially in a #light style block). I quickly leafed through Chris's copy of till I found the section on monads. They're called "Workflows", because that's less scary than Monads. They're also largely transparent, which on one hand is nice, as there's none of the painful and ugly "lifting" of values to the appropriate monad. (On the other, it means that you can't easily separate action and non-action code).
- This of course means that you don't get some of the benefits of FP's purity. Martin Owen, who is keen on Erlang and its capacity for massively scalable, high performance networking, also pointed out that allowing mutability means that you can't guarantee that the application is threadsafe, and is the wrong default as we're coming up against multicore programming.
All in all, it was a very interesting talk, and I'm looking forward to playing with F#. I apt-got Mono, and promptly failed to install F# on it, as the provided install.sh script whined about something to do with gac and aot. Ah well, perhaps that's a good excuse to boot up into Windows...
Chessguy pointed out that it's currently hard to play along with the monad wars code.
It would be nice for the posts to be “literate haskell”, where sections preceded by “>“ characters are valid Haskell. The idea is great - that you can mix sections of introduction and description with sections of actual code, ending up with an article that is also executable code! Which is very much the style of these posts, but right now I'm being too lazy to go that extra step:
- sometimes I show multiple versions of the same function (some of them might not work)
- I tend to introduce imports as needed but (I believe) they need to be at the top of the file
- I don't always repeat functions from earlier posts but just refer to them
So I've posted the current source to my subversion repo. As you can see they're currently related to the number of the associated post, and contain different areas of functionality: this is actually how I'm working on it for the moment - I'm hoping to put together some of the pieces in part 5 or so (Blog Driven Development is a rather odd way to structure your work but there you go...)
Update: Vincenz on #haskell convinced me that I really should try literate haskell - watch this space...
I'm now probably going to fall off the internet for a week while I move country and job. I'll be at the London Perl Workshop this Saturday and will talk (for a whole 5 minutes!) about Monad Wars. Maybe see you there :D
After the last post, we have parser actions that can recognise an integer or an item of merchandise. Now we need to be able to process a command, like “jet bronx” or “buy 4 lambdas”. Let's start off with this basis:
> parseCommand = parseMap commandMap
> commandMap = getPrefixMap [
>     ( "buy",  cmdBuy  ),
>     ( "sell", cmdSell ),
>     ( "jet",  cmdJet  ),
>     ( "quit", cmdQuit )
>   ]
Now, what we roughly want to do is:
- Tokenise the line
- Check if the first token maps to a command
- Check if the rest of the tokens can be handled by that command.
- The result (if applicable) is a function that maps an original GameState into a new state.
We might come up with something like this:
> parseLine s = do let (cmd:pars) = tokenizeLine s
>                  c  <- parseCommand cmd
>                  c' <- c pars
>                  return c'
But I'd promised that we'd check if the parsing worked! Can you see any checks like that above?
Possibly Maybe
If we check the type of parseCommand, we'll notice it returns a Maybe
parseCommand :: [Char]
-> Maybe ([String] -> Maybe (GameState -> Maybe GameState))
This means that the do expression starts with a Maybe (we don't count the let expression) and so is in the Maybe monad. If any of the sequence fails, parseLine will return Nothing, without us having to specify anything! This is quite cute and, once you get used to it, rather intuitive (the definition above fell naturally out of my text editor).
Here's another example of Maybe - parsing the expressions like “buy 4 curry” or “sell 10 stm”. First of all, we notice that cmdBuy and cmdSell both have the same form, so we'll share the code in a common parser called cmdMerchandise.
> cmdBuy = cmdMerchandise doBuy
> cmdSell = cmdMerchandise doSell
This parser looks at the first 2 parameters, and tries to parse them respectively as an Int or an item of Merchandise.
If either of the parses fails, it will magically return Nothing.
If it succeeds, it will return the result of, for example doBuy 10 lambdas. (The result is of course a function that takes a GameState in input, and returns a GameState that is the result of having bought 10 lambdas. Very meta.)
> cmdMerchandise f (n:m:_) = do n' <- parseInt n
>                               m' <- parseMerchandise m
>                               Just $ f n' m'
> cmdMerchandise _ _ = Nothing
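To see what the Maybe monad is saving us from, here is a self-contained sketch that sets the do block next to the explicit checks it replaces. Everything in it is a stand-in invented for illustration: the Merchandise type, the prefix-matching parseMerchandise, and parseOrder are assumptions, not the game's real definitions.

```haskell
import Data.Char (isDigit)
import Data.List (isPrefixOf)

-- Hypothetical merchandise type, just for this example.
data Merchandise = Lambdas | Curry | Stm deriving (Eq, Show)

parseInt :: String -> Maybe Int
parseInt s
  | not (null s) && all isDigit s = Just (read s)
  | otherwise                     = Nothing

-- Accept any unambiguous, non-empty prefix of a merchandise name.
parseMerchandise :: String -> Maybe Merchandise
parseMerchandise s =
  case [m | (name, m) <- table, not (null s), s `isPrefixOf` name] of
    [m] -> Just m
    _   -> Nothing
  where table = [("lambdas", Lambdas), ("curry", Curry), ("stm", Stm)]

-- With the Maybe monad: any failed step makes the whole block Nothing.
parseOrder :: [String] -> Maybe (Int, Merchandise)
parseOrder (n:m:_) = do
  n' <- parseInt n
  m' <- parseMerchandise m
  Just (n', m')
parseOrder _ = Nothing

-- The same logic with every check written out by hand.
parseOrderExplicit :: [String] -> Maybe (Int, Merchandise)
parseOrderExplicit (n:m:_) =
  case parseInt n of
    Nothing -> Nothing
    Just n' -> case parseMerchandise m of
      Nothing -> Nothing
      Just m' -> Just (n', m')
parseOrderExplicit _ = Nothing
```

parseOrder ["4", "la"] gives Just (4,Lambdas), while ["four", "la"], ["4", "x"] and ["4"] all give Nothing, without a single explicit check in the do version.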
Let's play
This is getting a little abstract if we can't test it. Right now our GameState record doesn't have a “list of merchandise” structure, so let's keep it simple for the sake of argument and add a debug string instead.
> data GameState = GameState {
>     turn     :: Integer,
>     score    :: Integer,
>     location :: Location,
>     debug    :: String
>   } deriving Show
>
> type Location = Int
(and add a debug = "" to the startState declaration.)
We'll make the doBuy and doSell functions just modify the debug string:
> doBuy n m gs = return gs {
>     debug = "You bought " ++
>             (show n) ++ " " ++
>             (name m)
>   }
> doSell n m gs = return gs {
>     debug = "You sold " ++
>             (show n) ++ " " ++
>             (name m)
>   }
OK, we could factor these out as an exercise, but we'll be replacing them soon. Now, what we really want to do is to test it! I'll look at plugging this into the prompt structure next time, for now let's just create a test function. This just takes a GameState and a line, and returns the new state if it all worked out.
> test gs s = do c <- parseLine s
>                c gs
We can play with this to see if it worked:
*Main> test startState "sell 3 la"
Just (GameState {turn = 1, score = 0, location = 0,
debug = "You sold 3 Lambdas"})
*Main> test startState "panic"
Nothing
The other actions are similar. We'll use the record mutators modScore and nextTurn that we saw last time.
> -- Just a stub: We'll probably want to set an "endflag" or similar.
> cmdQuit _ = Just doQuit
> doQuit gs = return $ modScore (-10) gs {
>                 debug = "Quitter!" }
>
> cmdJet (n:_) = do n' <- parseInt n
>                   Just $ doJet n'
> cmdJet _     = Nothing
>
> doJet n gs | n == location gs
>                = return $ gs {
>                    debug = "You are already in location "
>                            ++ show n }
>            | otherwise
>                -- Jetting increments the turn counter
>                = return $ nextTurn gs {
>                    location = n,
>                    debug = "You have moved to "
>                            ++ show n }
And we can now test the remaining actions:
*Main> test startState "jet"
Nothing
*Main> test startState "jet 1"
Just (GameState {turn = 2, score = 0, location = 1,
debug = "You have moved to 1"})
*Main> test startState "jet 0"
Just (GameState {turn = 1, score = 0, location = 0,
debug = "You are already in location 0"})
*Main> test startState "qu"
Just (GameState {turn = 1, score = -10, location = 0,
debug = "Quitter!"})
Next time around, we'll plug these actions into our prompt, and we'll work on representing the game state.
One of the advantages of demonstrating your ignorance in public is that you may receive useful corrections... thanks to everyone who replied on these recent posts, I found the comments very instructive, and thought it was worth writing up as a new post.
Strict records
ddarius got in touch to mention that I might want to use “strict fields”. This might be an issue if I'm incrementing, say, turn, but not actually using the value. I'd end up building up a “thunk” (an unevaluated expression) like 1+1+1+1+1+1+1, which will get evaluated later (and if too much later, it could cause some problems like stack overflow). Actually, I don't think this will happen in this particular case (I'll be printing the turn count every time) but it's not hard to implement (just need to put a “!” before the strict fields)
> data GameState = GameState {
>     turn     :: !Integer,
>     score    :: Integer,
>     location :: Location
>   } deriving Show
Also, as nominolo suggested, in conjunction with -funbox-strict-fields, it can open up some possible optimizations.
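A quick way to see what the "!" changes is to put something that blows up into the field. This is only a sketch with made-up types (LazyState, StrictState) and a made-up helper (failsWhenForced), not the game's real record: a lazy field happily holds the unevaluated expression, while a strict field evaluates it as soon as the record itself is evaluated.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Two hypothetical one-field records, differing only in strictness.
data LazyState   = LazyState   { lazyTurn   :: Integer }
data StrictState = StrictState { strictTurn :: !Integer }

-- Evaluate a value just far enough to see its outermost constructor,
-- and report whether doing so raised an exception.
failsWhenForced :: a -> IO Bool
failsWhenForced x = do
  r <- try (evaluate (x `seq` ())) :: IO (Either SomeException ())
  return (either (const True) (const False) r)
```

failsWhenForced (LazyState (error "thunk")) reports False, because the field is never looked at, whereas failsWhenForced (StrictState (error "thunk")) reports True. The same mechanism is what stops a chain like 1+1+1+1 from piling up in a strict turn field.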
Not Just Maybe
Now this is an interesting one. I was whining in my last post about Data.Map.lookup
but which monad is it in, and more to the point, why?
As you might imagine, I really wasn't getting it... and the code I wrote around it rather reflects that...
Vincenz, Rich Neswold, and “rm” all pointed out in rapid succession that the function I'd created for parseMap was completely redundant. Here, for comparison, is my first version.
> parseMap m s | M.member s m = do v <- M.lookup s m
>                                  Just v
>              | otherwise    = Nothing
I wrote this because, from the ghci command line, it looked like M.lookup threw an error if it couldn't find the key. The suggestion, which is rather briefer, is as follows:
> parseMap = flip M.lookup
The flip is only there because parseMap and Data.Map.lookup take their arguments in opposite orders. Otherwise lookup and parseMap are identical!
But how does this return a “Just” or a “Nothing” appropriately? Apparently, on success, it hands back the value using return, in whichever monad the caller asks for. If on the other hand the key isn't there, it calls fail.
The IO monad maps a fail to an error (which is why I saw the exception I mentioned in the post!). But Maybe will map it to Nothing, and the list monad will map it to the empty list.
So from the ghci command line, we can create a small test Data.Map and run some lookups against it “in” various monads.
Prelude Data.Map> let m = Data.Map.fromList [("one", "uno")]
-- success
Prelude Data.Map> Data.Map.lookup "one" m :: Maybe String
Just "uno"
Prelude Data.Map> Data.Map.lookup "one" m :: [String]
["uno"]
Prelude Data.Map> Data.Map.lookup "one" m :: IO String
"uno"
-- fail
Prelude Data.Map> Data.Map.lookup "two" m :: Maybe String
Nothing
Prelude Data.Map> Data.Map.lookup "two" m :: [String]
[]
Prelude Data.Map> Data.Map.lookup "two" m :: IO String
*** Exception: user error (Data.Map.lookup: Key not found)
ddarius gave a name to this technique, “Not Just Maybe”. That is, if you were going to write a function that returns either “Just 1” or “Nothing” in Maybe, then you might as well write it for a generic monad. It will then be usable within Maybe, as planned, but also in IO and List too.
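For anyone trying this on a more recent setup: in current versions of the containers package, Data.Map.lookup simply returns a Maybe, and fail has moved into the MonadFail class. The sketch below rebuilds the “Not Just Maybe” behaviour on top of that. It assumes GHC 8.8 or later (where the Prelude exports MonadFail), and lookupM, spanish, inMaybe and inList are hypothetical names of my own, not library functions.

```haskell
import qualified Data.Map as M

-- lookupM is a hypothetical helper, not part of Data.Map.
-- Assumes GHC 8.8 or later, where the Prelude exports MonadFail.
lookupM :: (Ord k, MonadFail m) => k -> M.Map k a -> m a
lookupM k m = case M.lookup k m of
  Just v  -> return v
  Nothing -> fail "lookupM: key not found"

-- A small test map, as in the ghci session.
spanish :: M.Map String String
spanish = M.fromList [("one", "uno")]

-- The same generic function, pinned to two different monads.
inMaybe :: String -> Maybe String
inMaybe k = lookupM k spanish

inList :: String -> [String]
inList k = lookupM k spanish
```

Here inMaybe "two" gives Nothing and inList "two" gives the empty list, because that is what fail means in each of those monads. Run in IO, the same failure becomes an exception.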
This sparked an interesting discussion about “Common Idioms” in Haskell. Apparently the pages that used to exist on this topic haven't yet been migrated to the new wiki. But there are some notes, for example on this snapshot of the NotJustMaybe page.
ddarius also suggested I could rewrite parseInt similarly
> import Control.Monad
>
> parseInt :: MonadPlus m => String -> m Int
> parseInt s | all isDigit s = return $ read s
>            | otherwise     = mzero
Using MonadPlus 1) requires the Control.Monad import, 2) seems to require the type signature, and 3) it looks like IO doesn't have an mzero, so you can't now type
*Main> parseInt "4"
<interactive>:1:0:
Ambiguous type variable `m' in the constraint:
`MonadPlus m'
at the command line and have it Do The Right Thing. I'd read that fail is considered bad style (for some reason), but it seems to be rather more convenient on these 3 counts at least:
> -- type will be inferred if omitted
> parseInt :: Monad m => String -> m Int
> parseInt s | all isDigit s = return $ read s
>            | otherwise     = fail "not an int"
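As far as I can tell, the “Ambiguous type variable” complaint goes away as soon as something pins down which monad is meant, for example by typing parseInt "4" :: Maybe Int at the prompt. Here is a sketch of the MonadPlus version with two such pinned-down wrappers. The names parseIntPlus, parseIntMaybe and parseIntList are hypothetical, chosen so that they don't clash with the parseInt above.

```haskell
import Control.Monad (MonadPlus, mzero)
import Data.Char (isDigit)

-- Generic version: succeeds with return, fails with mzero.
parseIntPlus :: MonadPlus m => String -> m Int
parseIntPlus s
  | all isDigit s = return (read s)
  | otherwise     = mzero

-- Fixing the result type chooses the monad, so nothing is ambiguous.
parseIntMaybe :: String -> Maybe Int
parseIntMaybe = parseIntPlus

parseIntList :: String -> [Int]
parseIntList = parseIntPlus
```

In Maybe a failed parse comes back as Nothing, and in the list monad as the empty list, since those are the mzero values of the two instances.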
This time around, we're going to look at how we'll turn user input into commands in Monad Wars.
I think that the easiest option to implement will also be very convenient to play with: a command line where we issue commands like:
$ buy 4 foo
$ sell 20 bar
$ jet bronx
or with abbreviations
$ b 4 f
and where it's unambiguous, collapse spaces:
$ b4f
Tokenising
We'll start by tokenising the command line string. This is almost as easy as using the Prelude function words. The only complication is that we want to tokenise alternate letters and numbers separately, like the “b4f” example above.
We'll use groupBy from Data.List, and isLetter from Data.Char.
> import Data.List
> import Data.Char
>
> tokenizeLine :: [Char] -> [[Char]]
> tokenizeLine = concatMap
>                  (groupBy ((==) `on` isLetter))
>                . words
Annoyingly, on (which lives in Data.Function as of GHC 6.8) doesn't exist in my 6.6.1 installation, so we'll have to define it:
> op `on` p = (\a b -> p a `op` p b)
Let's just see how groupBy works:
*Main> groupBy (==) "aabbbcdd"
["aa","bbb","c","dd"]
*Main> groupBy (\a b -> isLetter a == isLetter b) "abc123def"
["abc","123","def"]
The on expression is a nicer way of writing the second case. We then map this grouping over each of the tokens, getting our final list.
*Main> tokenizeLine "jet quuxville"
["jet","quuxville"]
*Main> tokenizeLine "b3p"
["b","3","p"]
Parsing
Now we'll want to do things with the tokens. Yes, there are libraries to do this (Parsec etc.). I know that in the Perl world I'd often tell other people not to reinvent the wheel and to use CPAN, so I do feel a little naughty that I'm going to ignore these and handroll something myself. In my defence m'ludd,
- I'm only parsing very simple commands
- Learning a new library requires cognitive effort. I have limited time for this task, and I believe (possibly wrongly) that I will be able to “roll my own” more quickly.
- Reimplementing functionality can be instructive in and of itself
- It also makes you appreciate how simple the “official” solution really is, when you finally get around to learning it.
In an expression like buy 4 foo I'm imagining that “buy” will map to some command. Then we'll need to parse “4” as a number, and “foo” as some merchandise. The first case is the simplest:
> parseInt :: String -> Maybe Int
> parseInt s | all isDigit s = Just $ read s
>            | otherwise     = Nothing
This reads tantalisingly close to English: if all the string is made up of digits, we just read it as an Int. Otherwise we return nothing. OK, I'm rather glossing over the “Just” and “Nothing”, which indicate whether the parse succeeded using the “Maybe” monad.
*Main> parseInt "42"
Just 42
*Main> parseInt "wibble"
Nothing
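One edge case worth knowing about: all isDigit "" is True (all holds vacuously on an empty list), so calling parseInt directly on an empty string would pass it to read, which raises an error. That can't happen via tokenizeLine, since words never produces empty tokens, but a more defensive variant is cheap. The sketch below uses readMaybe from Text.Read, so it assumes a base library recent enough to provide it; parseIntSafe is a hypothetical name.

```haskell
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- Accept only non-empty strings of digits; readMaybe returns Nothing
-- rather than raising an error if the read itself fails.
parseIntSafe :: String -> Maybe Int
parseIntSafe s
  | not (null s) && all isDigit s = readMaybe s
  | otherwise                     = Nothing
```

The isDigit guard is kept so that, like the original, this rejects input such as "-3", which readMaybe on its own would happily accept.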
Now for parsing the merchandise: if there is an item called “foo”, we'd want to match “foo”, “fo”, “f”. By amazing coincidence, I recently wrote about a function that does exactly what we need:
> import qualified Data.Map as M
> import Control.Arrow
> getPrefixes :: [([a], t)] -> [([a], t)]
> getPrefixes = concatMap $ uncurry zip .
>                 (tail . inits . fst &&& repeat . snd)
> getPrefixMap :: (Ord a) => [([a], t)] -> M.Map [a] t
> getPrefixMap = M.fromList . getPrefixes
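To see what these two functions actually produce, here is a self-contained sketch with explicit imports. It is the same idea as above, except that I've swapped tail for drop 1 (which behaves identically here, since inits always returns at least the empty prefix) purely to keep newer compilers from warning about partial functions.

```haskell
import Control.Arrow ((&&&))
import Data.List (inits)
import qualified Data.Map as M

-- Pair every non-empty prefix of the key with the original value.
getPrefixes :: [([a], t)] -> [([a], t)]
getPrefixes = concatMap (uncurry zip . (drop 1 . inits . fst &&& repeat . snd))

-- Build a lookup table from prefix to value.
getPrefixMap :: Ord a => [([a], t)] -> M.Map [a] t
getPrefixMap = M.fromList . getPrefixes
```

So getPrefixes [("foo", 1)] gives [("f",1),("fo",1),("foo",1)]. Note that if two names share a prefix, M.fromList keeps the value that comes last in the list, so with entries for "ab" and "ac" the key "a" would silently resolve to the second one.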
So we'd need a list of merchandise. Which means we should think a little about what the datatype will look like. I'm going with a record type again, because I know there is more information that we'll need to store:
> data Merchandise = Merchandise {
>       name :: String,
>       min  :: Int, -- minimum price
>       max  :: Int  -- maximum price
>   }
>   deriving Show
Next, we define the products on offer. By looking at the Dope Wars configuration pages, we can copy the prices, but of course we have to theme the names...
> merchandise :: [Merchandise]
> merchandise = [
>       Merchandise "Arrows"   1000  4400,
>       Merchandise "Curry"   15000 29000,
>       Merchandise "Kleisli"   480  1280,
>       Merchandise "Haskell"  5500 13000,
>       Merchandise "Lambdas"    11    60,
>       Merchandise "STM"      1500  4400,
>       Merchandise "Monads"    540  1250,
>       Merchandise "GHC"      1000  2500,
>       Merchandise "Peyton"    220   700,
>       Merchandise "Fundeps"   630  1300,
>       Merchandise "Zipper"     90   250,
>       Merchandise "Endo"      315   890
>   ]
Now this isn't in the form we need it yet. First of all, we'll map this list to a list of tuples of [(string, thing),...], which is exactly what getPrefixMap is expecting:
> merchandiseMap = getPrefixMap $
>                    map (\i -> (toLowerS $ name i, i))
>                        merchandise
> -- toLowerS over a string isn't defined by default, but it's just:
> toLowerS = map toLower
OK, for a bit of fun, we could write the map as:
> -- map (toLowerS . name &&& id)
(I don't yet understand why “arrows” are supposed to be an “even more generic model of computation than monads”, but they're certainly good for putting things in tuples :-)
Now we can build the parser parseMerchandise. Or rather (as we'll probably use this technique again, for example for the names of locations, and even the commands like “buy” and “jet”), we'll create parseMap.
Interestingly Data.Map's lookup function throws an exception (as a “user error”!) if you try to look up a key that doesn't exist. (This is rather different from Perl hashes, but it makes sense in a strongly typed language - there is no “undef” value which is of the requested type!) So that I can avoid having to learn exceptions just yet, I'm going to check first of all if the key exists using member
> parseMap m s | M.member s m = do v <- M.lookup s m
>                                  Just v
>              | otherwise    = Nothing
I spent about half an hour trying to get the above to work. After a number of rather unhelpful error messages about monads, I changed from my original attempt v = M.lookup s m to the do-notation form. This is rather odd, as it implies that lookup is monadic. And its type does indeed suggest that the result is in some monad...
*Main> :t M.lookup
M.lookup :: (Ord k, Monad m) => k -> M.Map k a -> m a
but which monad is it in, and more to the point, why?
In any case, now it's easy as pie(*) to create our merchandise parser as a specialization of our general function:
> parseMerchandise :: String -> Maybe Merchandise
> parseMerchandise = parseMap merchandiseMap
and we can now recognize the names and abbreviations of elements in the list!
*Main> parseMerchandise "pey"
Just (Merchandise {name = "Peyton", min = 220, max = 700})
*Main> parseMerchandise "vb"
Nothing
Next time around, we'll create “buy” and “sell” handlers that parse the whole command line, and stub in the actual interaction with the game state!
(*) Well, I say easy, but at this point (and ongoing) we get bitten by the “monomorphism restriction”, whatever that is. The error message suggests that you add explicit types to all the functions involved, but when I try that, I regularly get even stranger error messages about rigid variables and monads. The easier solution seems to be to add -fno-monomorphism-restriction to your ghci call. (I don't know enough to know whether this is a Bad Thing). Of course, now this error message doesn't come up. Pah!
A lot of learning projects involve writing games: people have written clones of Tetris, Asteroids, Space Invaders, and even first person shooters (Frag) in Haskell. As I'm far less clever than these people, I thought I'd start with something a bit simpler: Dope Wars.
Dope Wars is basically a trading game. In 30 turns, you move from one location to another, buying and selling, er, drugs on the streets of New York. It's a fairly simple concept, but one which includes elements like:
- Input and Output
- Game state
- Random numbers
all of which seem like a good way to learn Monads and the other building blocks that you need to actually do anything useful in a functional programming language like Haskell.
So I had the idea, wrote up a couple of datatypes and some functions, and then forgot about it. Then, when Greg McCarroll mentioned that he'd accept talks about other languages for the London Perl Workshop on December 1st, I thought it would be a great opportunity to push myself to do it by proposing a lightning talk.
Only problem: I now have to actually write the game in order to talk about it (*). So... here goes. In this post, I'm going to show a first draft for the game prompt.
State
OK, people often find it most convenient to do State using “Monads”. I think I'm going to leave that for now, and just thread state explicitly. Mainly because I haven't yet got around to learning how to use the State Monad. Hopefully this will eventually become annoying enough that it will give me impetus to learn the monadic version.
Anyway, the idea here is that we'd have some function playTurn that will look a bit like this:
> playTurn :: GameState -> IO ()
> playTurn gs = let gs' = doSomething gs
>               in playTurn gs'
(There is an interesting post on haskell-cafe about a more sophisticated monadic representation of the prompt).
So we just need to work out how to represent this GameState object. To start off with, we'll want to store information like
- Which turn it is
- What our score is (how much money we have)
- Where we are on the game map
We could create a normal tuple:
> data GameState = GameState Integer Integer Location
> --                         turn    score   location
and then pattern match on this, but it's going to get horrible if we add any fields later on! In Perl I'd just use a hash, but remember that Haskell Data.Map objects map from one type to another, and we might well have values of various types.
When I asked on #haskell, dons and firefly told me about the record syntax:
> data GameState = GameState {
>       turn     :: Integer,
>       score    :: Integer,
>       location :: Location
>   } deriving Show
>
> -- for now:
> type Location = Integer
Defining the original state is easy:
> startState = GameState {
>       turn     = 1,
>       score    = 0,
>       location = 0
>   }
And to “modify” it (or rather to clone it, overriding certain fields) there is a convenient syntax that just lets us declare those fields which have changed:
> movedToBronx = gs { location = bronx }
Setting a field to a value is easy, but we might want to define some mutators to change the field relative to its current value:
> nextTurn :: GameState -> GameState
> nextTurn gs = gs { turn = succ $ turn gs }
>
> modScore :: Integer -> GameState -> GameState
> modScore d gs = gs { score = score gs + d }
The record syntax will work even when we inevitably add new fields later. Yay!
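Putting the pieces of this section together, here is a self-contained sketch of the record, the starting state and the two mutators. It follows the definitions above, except that I've also derived Eq (my addition, just so that states can be compared), and afterOneTurn is a hypothetical name for the result of composing the mutators.

```haskell
-- For now, as in the post, a location is just a number.
type Location = Integer

data GameState = GameState
  { turn     :: Integer
  , score    :: Integer
  , location :: Location
  } deriving (Show, Eq)  -- Eq is an addition for this sketch

startState :: GameState
startState = GameState { turn = 1, score = 0, location = 0 }

-- Each mutator copies the record, overriding a single field.
nextTurn :: GameState -> GameState
nextTurn gs = gs { turn = succ (turn gs) }

modScore :: Integer -> GameState -> GameState
modScore d gs = gs { score = score gs + d }

-- Mutators are ordinary functions, so they compose.
afterOneTurn :: GameState
afterOneTurn = (nextTurn . modScore 10) startState
```

Since every “update” builds a new record, startState itself is untouched: afterOneTurn has turn 2 and score 10, while startState still has turn 1 and score 0.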
Prompt
Now, the game cycle is a bit more complicated than the version I suggested above, as it will allow IO actions in it. Something perhaps like this:
> playTurn gs = do showStatus gs
>                  putStr prompt
>                  s <- getLine
>                  let f = parseLine s
>                  let gs' = f gs
>                  if isEnd gs'
>                      then endGame gs'
>                      else playTurn gs'
For now we'll just stub some of the declarations we need. showStatus can just show the GameState record (which is why we derived the Show class).
> showStatus gs = putStrLn $ show gs
We may as well set the prompt to the dollar sign, appropriately:
> prompt = "$ "
Though we'll need to parse the line read from standard input to one of the commands like “Buy 4 X” or “Goto location 3”, right now we'll just stub in a function that increments the score and the turn counter:
> parseLine s = nextTurn
>             . modScore 10
We need to know if we're at the end of the game, and take action appropriately.
> isEnd gs = turn gs > maxTurns
>
> maxTurns = 3
>
> endGame gs = do putStrLn "Game over!"
>                 putStrLn $ "Your score was " ++ (show $ score gs)
>                 return ()
And here's a transcript
*Main> playTurn startState
GameState {turn = 1, score = 10, location = 0}
> buy 2 foo
GameState {turn = 2, score = 20, location = 0}
> sell 4 bar
GameState {turn = 3, score = 30, location = 0}
> goto quuxtown
Game over!
Your score was 40
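Because the state is threaded explicitly, the same loop logic can also be exercised without any IO at all, which is handy for checking it from ghci. The following is only a sketch under my own assumptions: the record is trimmed to the two fields the stub touches, and playPure is a hypothetical function that takes canned input lines instead of reading them from a prompt.

```haskell
-- Trimmed-down state: just the fields that the stubbed commands change.
data GameState = GameState
  { turn  :: Integer
  , score :: Integer
  } deriving (Show, Eq)

maxTurns :: Integer
maxTurns = 3

isEnd :: GameState -> Bool
isEnd gs = turn gs > maxTurns

-- Stub, as in the post: ignore the input, bump the score and the turn.
parseLine :: String -> GameState -> GameState
parseLine _ gs = gs { turn = succ (turn gs), score = score gs + 10 }

-- Hypothetical pure version of the game loop: feed it a list of input
-- lines and it returns the state at which the game stops.
playPure :: [String] -> GameState -> GameState
playPure [] gs = gs
playPure (l:ls) gs
  | isEnd gs' = gs'
  | otherwise = playPure ls gs'
  where gs' = parseLine l gs
```

For example, playPure ["buy 2 foo", "sell 4 bar", "goto quuxtown"] applied to a starting state of your choice returns the state in which the IO version would have called endGame. Any input lines left over after the game ends are simply ignored.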
(*) to be fair, it's only a 5 minute “Lightning Talk”, so I could probably get away without even writing the whole game, but I'll feel better if I actually know what I'm talking about...
