# Analysis of lazy-stream programs.

What’s the difference between the following two functions, a and b:

import Control.Monad.Instances import Control.Arrow   a, b :: (Int->Int) -> (Int,Int) -> (Int,Int) a = fmap b = second

Can you figure it out? If not, here’s a hint

f :: ((Int -> Int) -> Int -> (Int, Int)) -> Int -> Int f sec a = case sec (+1) a of (x,y) -> 6

The two functions, a and b, are identical for every single input, except one. Did you figure it out?

*Main> f a undefined *** Exception: Prelude.undefined *Main> f b undefined 6

Undefined, also called bottom or $\bot$, is a subtle value in lazy languages like haskell and for certain types of programs, this can make a big difference. The functions a and b apply to $\bot$ as follows:

$a f \bot \Rightarrow \bot$
$b f \bot \Rightarrow (\bot,\bot)$

We say a function $f$ is strict if $f \bot \Rightarrow \bot$ and is non-strict otherwise. Looking at the definition of fmap for tuples, we can see why it is strict in its second argument:

instance Functor ((,) a) where fmap f (x,y) = (x, f y)

Before evaluation even gets to the right hand side of the equal sign, it will return $\perp$ because it cannot pattern match (x,y) on undefined. We can make this non-strict in a couple ways, the first uses functions as an alternative to pattern matching:

instance Functor ((,) a) where fmap f t = (fst t, f (snd t))

The second uses the special tilde (~) symbol to delay the pattern matching until the result is actually used.

instance Functor ((,) a) where fmap f ~(x,y) = (x, f y)

And we see, the non-strict second function, is implemented as follows, using the tilde approach.

instance Arrow (->) where -- ... first f = f *** id second f = id *** f (f *** g) ~(x,y) = (f x, g y)

## Why non-strict?

At first glance, it might seem silly to implement functions non-strictly. After all, the earlier these functions fail out with bottom, the more efficient the program is in these cases. However, one particular point of interest is lazy streams.

Recall that the getContents :: IO [Char]  function returns a lazy list of input characters. Lets create a couple functions that motivate extra laziness.

import IO   withNext :: [Char] -> [(Char,Char)] withNext (a:a':as) = (a,a'):withNext (a':as)   main = do hSetBuffering stdout NoBuffering hSetBuffering stdin NoBuffering fmap withNext getContents >>= putStr . map fst

When I run this program (sorry windows users, this won’t work for you), and type “Hello World!”, here is what I get.

 
 

runhaskell test HeHlellol oW oWrolrdl!d 

The underlined letters are the ones I typed. How did this happen? Note the following about withNext:

$\mbox{\bf withNext} (x:\bot) \Rightarrow \perp$

It is strict in the second half of its argument. Lets modify our function to be more lazy in this respect.

withNext' :: [Char] -> [(Char,Char)] withNext' (a:as) = (a, head as):withNext as -- Or, the tilde version -- withNext' (a:(~(a':as))) = (a,a'):withNext (a':as)

As was the case with withNext, withNext' is strict in its argument.

$\mbox{\bf withNext' } \bot \Rightarrow \bot$

Using this, we can see that withNext' is not strict in the second half of its argument.

$\mbox{\bf withNext' } (x:\bot) \Rightarrow (x,\bot):\bot$

When we run our more lazy version of this program, we get what we’d expect:

 
 

runhaskell test HHeelllloo WWoorrlldd!! 

## Connecting Strictness to Streams

As can be seen above, strictness and streams are tightly coupled. Here we present a theoretical evaluation ($\leadsto$) of a function at a given point in time. Here we introduce a new symbol we’ll call partial, $\partial$, that represents a value that we have no information about at the given moment. $\hat{x}$ represents a symbol where everything is known and all other variables could mean either. So, for our withNext' function above, we may write the following equations:

$\mbox{\bf withNext' } \partial \leadsto \partial$
$\mbox{\bf withNext' } (x:\partial) \leadsto (x,\partial):\partial$

Note that our $\leadsto$ equations are the same as our strictness equations. Thus we may use these functions to prove invariants on our stream-based code.

## Example from Reactive

Take the following mappend code from reactive:

instance Ord t => Monoid (FutureG t a) where -- ... -- Pick the earlier future. Future (s,a) mappend Future (t,b) = Future (s min t, if s <= t then a else b)

Using strictness analysis, we can come up with the following $\leadsto$ equations.

$\mbox{\bf mappend } (\mbox{\bf Future } \partial) (\mbox{\bf Future } x) \leadsto \partial$
$\mbox{\bf mappend } (\mbox{\bf Future }x) (\mbox{\bf Future }\partial) \leadsto \partial$

This tells us that if we do not know anything about the tuples in the Future arguments to mappend, then we cannot know anything about the result. So a developer might want to maintain an invariant that $\mbox{\bf Future } x$ implies that $x\leadsto (y,z)$.

Continuing on…

$\mbox{\bf mappend } (\mbox{\bf Future } (\partial, \partial)) (\mbox{\bf Future } (\hat{x},\hat{y})) \leadsto$
$\mbox{\bf Future }(\mbox{\bf min }\partial\;\hat{x}, \mbox{\bf if } \partial <= \hat{x}\; \mbox{\bf then } \partial \;\mbox{\bf else } \hat{y} )$

If we add some more invariants, we can go even further. Lets say that if we have the invariant that $\mbox{\bf Future t}$ implies that min ($\partial$ :: t) (x :: t) $\leadsto$ x and (<=) ($\partial$ :: t) (x :: t) $\leadsto$ False. Then we get a Future that allows us to mappend now with a past future and a future future.

$\mbox{\bf mappend } (\mbox{\bf Future } (\partial, \partial)) (\mbox{\bf Future } (\hat{x},\hat{y})) \leadsto\mbox{\bf Future }(\hat{x}, \hat{y} )$

Verifying the the code is sufficiently lazy is now just a matter of seeing that these invariants are met.

noneNovember 9th, 2008 at 12:04 am

But, there is a spreading sentiment that lazy i/o is a gross hack, relying at its root on unsafeInterleaveIO. Oleg has a big rant about this in his recent Iteratee paper.

adminNovember 10th, 2008 at 10:02 am

I’ve found that working with lazy IO can be difficult due to the subtlety of laziness issues like the one mentioned in the above post. Fortunately, projects like Reactive are doing much to remove the difficulty and enable programmers to develop highly interactive applications in a DSEL that actually models the problem domain directly instead exposing the machine’s low-level imperative operations.

Peter VerswyvelenNovember 10th, 2008 at 1:21 pm

Hi David,

Yes this lazy pattern matching stuff is really tricky.

When I first read about it in the Haskell School of Expression, I said wtf… here I am trying to escape from the imperative and object oriented world, and now I have to deal with this! But it did make a lot of sense - to match against some thing one must observe that thing.

Unfortunately this took away a bit of fun when programming Haskell; writing pattern matches on anything that has a “spine” felt like carefully walking through a minefield… albeit a pure minefield

The same tricky lazy pattern matching is also described in detail in the Programming with Arrows paper from John Hughes, at http://www.cs.chalmers.se/~rjmh/afp-arrows.pdf

Keep up the great blog!

[...] instance of pairing, we can also use fmap in place second. (The two, however, are not quite the same.) An advantage of names like result, element and second is that they’re more specifically [...]

[...] Netsuperbrain.com has a very good article on lazy pattern matching. [...]