Random Code, Permutations, and `unsafePerformIO`

Note. I first published this post to another corner of the Internet on 26 December 2007, and had completely forgotten about it until a couple days ago. I’ve decided to edit, refresh, and republish it here in the hopes that it will help newcomers to Haskell get a feel for unsafePerformIO and how it interacts with the rest of GHC.

There’s more than one way to shuffle cards, but the Knuth Shuffle is one of the better ways you can find. It’ll produce outputs that are uniformly distributed across all the possible shuffles of the deck, and it’ll do it in linear time. But to implement it efficiently you need a mutable array. If your implementation language is Haskell, that means using the IO monad. That also means you’re doing computer science, so you have to call shuffles permutations. Here’s the code.

module Permutation ( permute ) where

import Data.Array.MArray
import Data.Array.IO

import System.Random


permute :: [a] -> IO [a]
permute xs =
  let len = length xs in
    do a <- newListArray (0, len - 1) xs
       mapM_ (\i -> randomRIO (i, len - 1) >>= swap a i) [0..len-1]
       getElems a

swap :: (Ix i) => IOArray i a -> i -> i -> IO ()
swap a i j = do
    e1 <- readArray a j
    e2 <- readArray a i
    writeArray a i e1
    writeArray a j e2

If you’re like me when I first discovered unsafePerformIO, you might wonder if you can use it to get permute out of the IO monad. Here’s what the documentation says.

unsafePerformIO :: IO a -> a

This is the “back door” into the IO monad, allowing IO computation to be performed at any time. For this to be safe, the IO computation should be free of side effects and independent of its environment.

If the I/O computation wrapped in unsafePerformIO performs side effects, then the relative order in which those side effects take place (relative to the main I/O trunk, or other calls to unsafePerformIO) is indeterminate.

Haskell is a lazy language, and the second paragraph points out the problem of mixing laziness with side effects. You don’t have a guarantee of when an expression will be evaluated, so you don’t know in what order side effects will take place. Hence the IO monad, which requires you to explicitly sequence computations that depend on or mutate some implicit state.

unsafePeformIO is a way to circumvent that requirement. When you use it, you’re telling Haskell that you don’t care when you program performs a particular computation relative to other side effects. But caveat emptor, the documentation claims this is only “safe” if “the IO computation … [is] free of side effects and independent of its environment.” This seems like a sufficient condition for its safe use. If you can say that about a computation, then it certainly doesn’t matter when or in what relative order your program performs that computation. But is that a necessary condition? No, and I believe the permute function above demonstrates that.

There is one qualification to the claim that “the order in which my program mutes and reads from theStdGen is irrelevant.” Jump to it if you’re impatient.

In permute, every execution of a randomRIO computation mutates and reads from theStdGen, which is where Haskell stores the state of its random number generator. So clearly, permute mutates implicit state and depends on its environment. But whether you get this random number or some other one doesn’t really matter. You still get a random number. The order in which your program mutates and reads from theStdGen is irrelevant. So it seems you should be able to define permute like this.

permute :: [a] -> [a]
permute l = unsafePerformIO $
  let len = length l in
    do a <- newListArray (0, len - 1) l
       mapM_ (\i -> randomRIO (i, len - 1) >>= swap a i) [0..len-1]
       getElems a

A Sequence of Precautions

That’s an easy change to make, but it’s not the only change that’s necessary. Expressions involving unsafePerformIO violate one of GHC’s core assumptions: that all expressions are referentially transparent. Referential transparency allows GHC to perform optimizations by reasoning about program behavior equationally. If an expression e is referentially transparent, and it evaluates to some value v, then you can replace any occurrence of e with v in the text of your program without changing its meaning. More generally if two expressions are referentially transparent and they evaluate to the same value, then you can use them interchangeably.

Since permute mutates the state of Haskell’s random number generator, substituting an application of permute with its result would discard those mutations. That would change the result of subsequent calls to the random number generator, thus changing the result of your program. So it’s not referentially transparent. This doesn’t seem like a big deal—one random number’s as good as the next, after all —until you consider the sorts of optimizations that GHC may perform on your program.

Common Subexpression Elimination

Consider the following expression.

(permute [1..4], permute [1..4])

GHC assumes that the two applications of permute are referentially transparent. Since they’re identical expressions, they should evaluate to the same result. So we should be able to just evaluate the expression once, and substitute the result in place of those two expressions. This is called common subexpression elimination, and it would transform the expression like this.

let e = permute [1..4] in (e, e)

While the original expression is likely to produce two distinct permutations, the transformed program will always produce two identical ones. That’s less work at runtime, but it doesn’t preserve the intent of the original program. You can turn off common subexpression elimination using GHC’s -fno-cse compiler flag. If a module uses permute, you should compile it with this flag.

Avoid `let`-floating

If you’re let-binding an expression in a function that does’t depend on the function’s arguments…

compute x = let n = 1 + 1
              in x / (x + n)

…GHC may avoid recomputing n on every application by “floating the let” outside of the function. It’ll turn your program into something like this.

n = 1 + 1
compute x = x / (x + n)

If n is bound to an expression that involves unsafePerformIO, then you likely intended for n to be evaluated for each application of compute. But if GHC applies this optimization, this won’t happen. compute will always return the same result for the same inputs. You can turn this off by compiling with the -fno-full-laziness flag.

In the case of permute’s module you don’t have to use that flag since all the expressions directly or indirectly depend on the argument l. You may have to use it in other modules to prevent calls to permute from floating out of your functions.

`NOINLINE`-ing

When the GHC inliner inlines a function, it may completely eliminate the application, substituting the syntactic arguments at the call site into the body of the function. So GHC might optimize a call to compute (f 2) like this:

let n = 1 + 1
  in (f 2) / ((f 2) + n)

It can even go a step further and inline n, leaving nothing but

(f 2) / ((f 2) + (1 + 1))

…which it can then subject to further optimization passes. If the argument itself reads from or mutates implicit state—imagine if f were were a function like permute—then those operations may be duplicated by the inliner. What your program assumed would be identical permutations may in fact be distinct, which is roughly the opposite effect that common subexpression elimination might have. Mark permute with NOINLINE to avoid this.

Seeds of Doubt

Everything up until now has been an attempt to make the fact that permute violates GHC’s assumption of referential transparency irrelevant. GHC’s optimizations may throw out necessary work—as in the case of common subexpression elimination—or it might duplicate work—as in the case of inlining. That may make random program too random or not random enough. But even when your program’s just random enough, there’s still the question of when it’s random.

You might want to seed theStdGen manually using setStdGen. If you use setStdGen after you’ve called permute, then calls to permute might be floating around in unevaluated thunks. When your program forces those thunks, permute will read from the new state of theStdGen—not the old state, as you might have intended. This is a legitimate concern, so it’s best to either seed theStdGen at the beginning of your program, or not at all. Otherwise, it may be hard to get the timing right.

Having to worry about the behavior of GHC optimizations when you use unsafePerformIO is not pleasant. It’s fascinating, but it makes you question the correctness of your program. GHC is a complex system, and using unsafePerformIO violates one of its core assumptions. That’s hard to recover from, but that’s no reason not to try. In the process, you acquire a better understanding and appreciation for the system you’re breaking.

A Sequence of Precautions

Common Subexpression Elimination

Avoid let-floating

NOINLINE-ing

Seeds of Doubt

Avoid `let`-floating

`NOINLINE`-ing