2 posts tagged “random”
The next obvious functions after “Pick” are “Unpick” to eject a value randomly from the list, and “Pick N” values from the list.
unpick is fairly straight-forward. To do in a single pass, you have to thread through the list of values, but otherwise it's more or less the same as the previous code for pick
> unpick :: [a] -> IO [a]
> unpick [] = undefined
> unpick [x] = do return []
> unpick (x:xs) = do zs <- unpick' [] [x] xs 2
> return (reverse zs)
>
> unpick' :: (Num p, Random p) => [t] -> [t] -> [t] -> p -> IO [t]
> unpick' curr orig [] _
> = do return curr
> unpick' curr orig (next:rest) prob
> = do r <- getStdRandom (randomR (1,prob))
> let curr' = if r == 1 then orig else (next:curr)
> unpick' curr' (next:orig) rest (prob+1)
To do pickN I'd originally thought of picking the required number, and then following an algorithm similar to that of pick but dropping one (with unpick) every time I selected a new entry. That would be quadratic, and also insane. I discussed with dakkar, and we agreed that it might just be simpler to call something like pick N times (maybe with a function that returned the picked and the rejected part).
But, if we know the length of the array things are much easier. We can just scan the array, selecting elements with a probability of (r/e) where r is the number of required elements left to be selected, and e the number of elements left in the array.
> pickN :: Int -> [a] -> IO [a]
> pickN n xs = let len = length xs
> in pickN' n len xs
And we just need
> pickN' :: Int -> Int -> [a] -> IO [a]
> pickN' n l [] = do return []
> pickN' n l (x:xs)
> = do b <- roll n l
> if b
> then x: (pickN' (n-1) (l-1) xs)
> else pickN' n (l-1) xs
and to define the auxhiliary function roll, because I got tired of writing it out all over the place:
> roll p q = do r <- getStdRandom (randomR (1,q))
> return $ r <= p
Which is all great apart from not working. The problem is that pickN’ returns a monadic value, while x is a pure one. I asked on #haskell and got help from oerjan, scook0, and DRMacIver.
First of all they suggested:
> if b
> then (x:) `liftM` (pickN' (n-1) (l-1) xs)
> else pickN' n (l-1) xs
Now I intuitively sort-of-understood what this was doing but not why. (Bear in mind that I'm still confused about things like “why do I return this but not that?” and “How does it know which Monad it's in?”)
It turns out that liftM lifts a function with one pure argument into a function with one monadic argument. But I'm concatenating x (pure) with picked xs (monadic)! So the section (x:) is actually a pure function specialised to concatenate “x” with a pure list (which is the single argument).
My first reaction was a mix of wonder and “good god that's horrible”. As they pointed out though, I could just use another do block.
> if b
> then do xs <- pickN' (n-1) (l-1) xs
> return (x:xs)
> else pickN' n (l-1) xs
Hurrah! Now I should go and check out a random monad like MonadRandom...
Random numbers seem to be a bit of a pain in Haskell. Because they are not a “pure” function (they involve hooking up with a system or user-supplied random number generator) there is a little scaffolding involved. Also when the number comes back, it's wrapped in an IO Ref, making the whole of your function impure — causing me at least a fair amount of grief.
I got stuck trying to do something more complicated and then went back to a simpler task: implementing a function pick
> pick :: [a] -> a
or rather
> pick :: [a] -> IO a
which will return an element from [a], chosen at random
The first, obvious implementation is
> pick a = do r <- getStdRandom (randomR (1, length a))
> return $ a !! (r-1)
The getStdRandom is the boilerplate I was mentioning (there may be better ways of doing this — I just cargo culted from the Haskell Report) while randomR gives a random integer in the range (min,max) provided.
This is (for once) rather longer than the equivalent Perl
sub pick { $_[rand( @_ )] }
It's also inefficient, as we have to scan all the elements of a to get the length and then another r elements to do the indexing.
There's a lovely algorithm to choose a random element of a sequence that only involves scanning it once. You start with your first element and select it. You move to the second element, and have a probability of (1/2) of selecting it. You move to the third, with a probability of (1/3)
Here's my code for it:
> pick :: [a] -> IO a
> pick [] = undefined
> pick [x] = do return x
> pick (x:xs) = pick' x xs 2
>
> pick' :: (Num p, Random p) => t -> [t] -> p -> IO t
> pick' curr [] _ = do return curr
> pick' curr (next:rest) prob
> = do r <- getStdRandom (randomR (1,prob))
> let curr' = if r == 1 then next else curr
> pick' curr' rest (prob+1)
Why in the auxhiliary function pick’ does p get typed as Random? As far as I can see, it's just a number, it increases monotonically and only participates in the function randomR as an upper bound.
Now this works in ghci:
*Main> pick [1,2,3,4,5]
3
Now if we want a list of random numbers:
*Main> map (\_ -> pick [1..5]) [1..5]
<interactive>:1:0:
No instance for (Show (IO t))
arising from use of `print' at <interactive>:1:0-32
Possible fix: add an instance declaration for (Show (IO t))
In the expression: print it
In a 'do' expression: print it
This is annoying. OK, maybe an IO value can't be showable because show returns a String, not an IO String, so using Show would allow you to subvert the IO sandbox. Or something. But if ghci can fake it in the case of a single IO value, why can't it do something sensible for a list of IO values?
Some interesting comments also on the algorithm to choose random numbers. Fair comment that my description of it is a little confusing - by "select", I mean "change the currently selected number for the new number". But the code is quite clear - evir and Sorear pointed out that Nethack (at least 3.4.3) has a citation of the algorithm and a proof in the comments! (in eat.c Elliott Kleinrock, October 5, 1990)
