A non-regular data type challenge

While playing around with generalized functional references I encountered the following list-like data type:

data FunList a b
    = Done b
    | More a (FunList a (a -> b))

This is a non-regular data type, meaning that inside the FunList a b there is a FunList a not-b. So, what does a value of this type look like? Well, it can be

We either have just b, or an a and a function a->b, or two as (i.e. a2) and a function a2->b, or a3 and a3->b, etc.

A FunList a b is therefore a list of as together with a function that takes exactly that number of as to give you a b. Extracting the single represented b value is easy:

getB :: FunList a b -> b
getB (Done b)   = b
getB (More a z) = getB z a

As is getting to the list of as:

getAs :: FunList a b -> [a] 
getAs (Done _)   = []
getAs (More a z) = a : getAs z

But then things quickly get much trickier. Since a FunList a b holds exactly one b, we might ask how much access we have to it. First of, FunList a is a Functor, so the b value can be changed:

instance Functor (FunList a) where
    fmap f (Done b)   = Done (f b)
    fmap f (More a z) = More a (fmap (f .) z)

The above case for More looks a bit strange, but remember that the data type is non-regular, so we recurse with a different function f. In this case instead of having type b -> c as the outer f does, we need something with type (a -> b) -> (a -> c).

The Applicative instance is even stranger. There is a flip there, where the heck did that come from?

instance Applicative (FunList a) where
    pure = Done
    Done b   <*> c = fmap b c                    -- follows from Applicative laws
    More a z <*> c = More a (flip <$> z <*> c)   -- flip??

Aside from manipulating the b value we can also do more list like things to the list of as, such as zipping:

zipFun :: FunList a b -> FunList c d -> FunList (a,c) (b,d)
zipFun (Done b)   d          = Done (b,getB d)
zipFun b          (Done d)   = Done (getB b,d)
zipFun (More a b) (More c d) = More (a,c) (applyPair <$> zipFun b d)
    where applyPair (f,g) (x,y) = (f x,g y)

Surprisingly, the applicative operator defined above can be used as a kind of append, just look at the type:

(<*>) :: FunList a (b -> c) -> FunList a b -> FunList a c

it takes two 'lists' and combines them into one. It is indeed true that getAs a ++ getAs b == getAs (a <*> b).

This is as far as I got, so I will end this post with a couple of challenges:

Comments

Joeri van Eekelenx

You could take "join" = "getB" for the Monad definition. I'm currently trying to prove the monad laws.

Dan Doelx

I did some cheating with Agda. It implements O(n) (I think) reverse by splitting the FunList into a vector and a function, reversing the vector, and combining them back. You could do that in Haskell, too, but the proofs are more of a pain to do.

I fooled around for a while prior to this trying to write reverse directly with polymorphic recursion and such, but didn't get anywhere. Defeated by non-regular types again. :(

http://hpaste.org:80/fastcgi/hpaste.fcgi/view?id=4158#a4158

Dan Doelx

I added to that paste with a version that works directly with FunLs. It still requires some pretty hairy proof manipulation, though. The central issue is getting the system to recognize that when constructing:

a -> a -> ..n times.. -> b

it doesn't matter if we do it like:

construct 0     a b = b
construct (n+1) a b = a -> construct n a b

or like:

construct 0     a b = b
construct (n+1) a b = construct n a (a -> b)

Which is obvious, because all the as are the same, but it isn't obvious to, say, Agda.

Maybe I'm over-complicating things, though.

The dog ate my types
It was the dog, I swear!

reverseFun :: FunList a b -> FunList a b
reverseFun xs = putAs xs . reverse . getAs $ xs
putAs :: FunList a b -> [a] -> FunList a b putAs (Done b) _ = Done b putAs (More _ z) (a:as) = More a (putAs z as)</pre>

Also, the internet ate my formatting. :-/

Joeri van Eekelenx

Sorry for the identical double post earlier, and it breaks the monad laws. Using "join = fmap getB" also breaks things.

I tried the same thing as Dan with splitting a FunList up in a vector and a function, which gives me some ideas.

Meanwhile, instance Comonad (FunList a) anyone?

Joeri van Eekelenx

Note: While I sometimes use ">" to prefix lines, it's not literal Haskell. It's just to make the newlines with code/types on them clear, in case this post gets mangled in the same way as apfelmus' post.

I don't think there's a Monad instance, here is my proof, about as formal as I am currently willing to. It comes down to using the vector+function equivalence Dan described, and then prove that the existential length of the vector will be changed by operations that should be the identity according to the monad laws.

join :: FunList a (Funlist a b) -> FunList a b

Once we have this, we have our Monad instance. As Dan mentioned, "FunList a b" is equivalent to "exists n. (Vec a n, Vec a n -> b)". So, our input type is equivalent to:

"exists n . (Vec a n, Vec a n -> (exists m . (Vec a m, Vec a m -> b)))"

And the output type is equivalent to:

"exists k . (Vec a k, Vec a k -> b)"

The trick is that the 'm' can depend on the value of the "Vec a n" argument.

So which values can we give 'k' in the output type? If we try "n", we get the "join = fmap getB" case. If we try to use "m", we also must specify what "Vec a n" it depends on. We only have the first component of the input for this, so whatever 'm' is, it's the length of the vector of "getB input".

So let's try k = k' + m, with m the length of "getB input", and k' some number (can be 'n', can be a constant, it does not matter right now). The types here are in some pseudo type language that supports dependent types. This is needed to make the dependency of 'm' on the value of "Vec a n" explicit.

return :: x -> (Nil :: Vec a 0, Vec a 0 -> x) -- Added Nil, because that's the only valid constructor.
join :: (r :: Vec a n, Vec a n -> (Vec a (m r), Vec a (m r) -> b))
     -> (Vec a (k' + m r), Vec a (k' + m r) -> b) -- for some n, k', and m.

One of the monad laws says:

join . return == id

The type of "join . return" will become:

join . return :: (Vec a (m Nil), Vec a (m Nil) -> b)
              -> (Vec a (k' + m Nil), Vec a (k' + m Nil) -> b)

(done by hand-typechecking & inferring, someone please double-check it)

From this, we can see that unless "m Nil" == "k' + m Nil", we cannot get the existential length of the vectors to match. From this, we conclude that, whatever it is that "m" does, k' should be equal to 0. With this information, we refine the type of join to:

join :: (r :: Vec a n, Vec a n -> (Vec a (m r), Vec a (m r) -> b))
     -> (Vec a (m r), Vec a (m r) -> b) -- for some n and m

The only possible way to get this, is to apply the function directly to the input vector. This is essentially the same as using "join = getB" in the original FunList definition. When we have that, we can check the monad laws of FunList, instead of the pair:

join . return == id  -- holds, because getB (Done x) = x;
join . fmap return == id -- doesn't hold.

Try it with "More x (Done f)", we get:

  getB $ fmap Done (More x (Done f))
= getB $ More x (fmap (Done .) (Done f))
= getB $ More x (Done (Done . f))
= getB (Done (Done . f)) x
= (Done . f) x
= (Done (f x))
/= More x (Done f)

And done! No Monad instance for FunList.

The layout probably looks awful when posted, if it does the same thing it did to apfelmus.

Jared Putnamx

The conclusion from the type of join . return does not follow. join sees the result of return x, not x itself, so the only thing we can conclude is that k' is 0 whenever n is 0. This supports the intuition that k' is n rather than undermining it.

apfelmus: nice, I hadn't thought of that. But the patterns in putAs are not exhaustive, can we do something about that?

Joeri: Your proof seems to check out, so FunList is not a Monad. Indeed for some more intuition we can define

replicateFun :: a -> Int -> FunList a [a]
replicateFun a 0 = Done []
replicateFun a n = More a (flip (:)  replicateFun a (n-1))

also define

lots :: FunList Int (FunList Int [Int])
lots = More 100 (Done (replicateFun 1))

Clearly join lots = replicateFun 1 100, since that is the only we we can get something of type FunList Int [Int]. So join has to evaluate the outer layer. But as Joeri's post shows this violates the monad laws. The problem is that the shape of the inner FunList can depend on the contents of the outer one in an unpredictable way.

A Comonad instance is surprisingly easy:

extract :: FunList a b -> b
extract = getB</pre>
cojoin :: FunList a b -> FunList a (FunList a b)
cojoin (Done b) = Done (Done b)
cojoin (More a f) = More a (fmap (flip More) (cojoin f))

Do we have all the laws?

First: extract . fmap f = f . extract

base case:

  extract (fmap f (Done b))
= extract (Done (f b))
= f b

induction step, assume that for all g, extract (fmap g z) = g (extract z):

  extract (fmap f (More a z))
= extract (More a (fmap (f.) z))
= extract (fmap (f.) z) a
= f (extract z) a
= f (extract (More a z))

Next law: extract . cojoin = id

base case:

  extract (cojoin (Done b))
= extract (Done (Done b))
= Done b

induction step, assume that extract (cojoin z) = z:

  extract (cojoin (More a z))
= extract (More a (fmap (flip More) (cojoin z)))
= extract (fmap (flip More) (cojoin z)) a
= flip More (extract (cojoin z)) a
= flip More z a
= More a z

Finally: fmap extract . cojoin = id

base case:

  fmap extract (cojoin (Done b))
= fmap extract (Done (Done b))
= Done (extract (Done b))
= Done b

induction step, assume that fmap extract (cojoin z) = z:

  fmap extract (cojoin (More a z))
= fmap extract (More a (fmap (flip More) (cojoin z)))
= More a (fmap (extract.) (fmap (flip More) (cojoin z)))
= More a (fmap ((extract.) . flip More) (cojoin z))
= More a (fmap (\x y -> extract (More y x)) (cojoin z))
= More a (fmap (\x y -> extract x y) (cojoin z))
= More a (fmap extract (cojoin z))
= More a z

So, FunList is a comonad!

You also want a constraint on what happens to the last element, otherwise the following is fine:

makeFun :: [a] -> b -> FunList a b
makeFun [] x = Done x
makeFun (x:xs) y = More x (makeFun xs (const y))
reverseCheat x = makeFun (reverse $ getAs x) (getB x)

I assume, based on the quadratic reverse you give in the source file, that you want

reverseProp comp reverseCandidate hd tl = 
    case reverseCandidate $ reverseCandidate $ More hd tl of 
      More _ y -> comp (getB y) (getB tl)

where comp is extensional equality.

data List n a where
    Nil :: List Z a
    Cons :: a -> List n a -> List (S n) a
data Z data S n
data ListFun a b = forall n. ListFun (List n a) (List n a -> b)
togadt :: FunList a b -> ListFun a b togadt (Done x) = ListFun Nil (const x) togadt (More x y) = case togadt y of ListFun l f -> ListFun (Cons x l) (\(Cons c d) -> f d c)
fromgadt :: ListFun a b -> FunList a b fromgadt (ListFun Nil f) = Done (f Nil) fromgadt (ListFun (Cons x y) f) = More x (fromgadt $ ListFun y (\p q -> f (Cons q p)))
reverseL :: List n a -> List n a reverseL x = reverseHelp (plus0 $ lengthL x) x Nil
data Nat n where Zero :: Nat Z Succ :: Nat n -> Nat (S n)
lengthL :: List n a -> Nat n lengthL Nil = Zero lengthL (Cons _ xs) = Succ $ lengthL xs
plus0 :: Nat n -> Sum n Z n plus0 Zero = SumZ plus0 (Succ n) = plusSucc $ plus0 n
plusSucc :: Sum n v m -> Sum (S n) v (S m) plusSucc SumZ = SumS SumZ plusSucc (SumS x) = SumS (plusSucc x)
data Sum a b c where SumZ :: Sum Z a a SumS :: Sum a (S b) c -> Sum (S a) b c
reverseHelp :: Sum n m r -> List n a -> List m a -> List r a reverseHelp SumZ Nil x = x reverseHelp (SumS v) (Cons x y) z = reverseHelp v y (Cons x z)
funReverse :: FunList a b -> FunList a b funReverse x = case togadt x of ListFun x y -> fromgadt $ ListFun (reverseL x) y
snocFun :: FunList a (a -> b) -> a -> FunList a b snocFun (Done b) z = More z (Done b) snocFun (More a f) z = More a (snocFun f z)
reverseFun :: FunList a b -> FunList a b reverseFun (Done b) = (Done b) reverseFun (More a z) = reverseFun z `snocFun` a
Joeri van Eekelenx

I don't think we can do anything about the non-exhaustive patterns in apfelmus' code; at least, not without radically changing the whole function. If we use the equivalence with the pair of vector and function again, we see that putAs has the type:

(exists n. (Vec a n, Vec a n -> b)) -> [a] -> (exists m. (Vec a m, Vec a m -> b)

The intuition here is that n == m, which is indeed enforced by saying that the function in the output pair is the same as the function in the input pair. To be able to statically check that we have enough 'a's in the list, we'd like the second argument to have type "Vec a n" (or "Vec a m") instead. Alas, this is not possible, because neither n nor m are in scope.

Thanks for the interesing post, I'm having much fun with this :)

Dan Doelx

You should be able to eliminate the incomplete patterns by adding a case:

putAs fl [] = Done (getB fl)

or even:

putAs fl [] = fl

The function will then throw away extra as from the list, use as from the old values if there aren't enough, and if there are exactly the right amount, just replace the old elements with the new ones.

In the case of reverse, the third case always happens, and everything works out fine, even if not provably so.

I'm trying HTML in this comment. Hopefully that's how I'm supposed to mark things up appropriately.

I think FunList is the same as the FunArg comonad introduced in Uustalu and Vene's "The Essence of Dataflow Programming:" http://www.cs.ioc.ee/~tarmo/papers/aplas05.pdf

I also recall that the connection with Applicative has been noted, but not in that paper (which I believe predates the introduction Applicative anyway).

Reply

(optional)
(optional, will not be revealed)
0 + 9 =
Use > code for code blocks, @code@ for inline code. Some html is also allowed.