Twan van Laarhoven's blog

Type theory with indexed equality - the theory

2018-01-04T21:18:00Z

In a previous post I introduced the TTIE language, along with a type checker and interpreter. My motivation for writing that (aside from it being fun!) was to explore the type system. At the time I started this project, formalizing this system as a shallow embedding in Agda was not easy. But with the addition of a rewriting mechanism, it has become much easier to use Agda without going insane from having to put substitutions everywhere. So, in this post I will formalize the TTIE type system.

This post is literate Agda, and uses my own utility library. The utility library mainly defines automatic rewrite rules like trans x (sym x) ≡ refl, which make life a bit more pleasant. All these rewrites use the standard library propositional equality, which I will denote as ⟹ and call meta equality.

{-# OPTIONS --rewriting #-}
module _ where

open import Util.Equality as Meta using (_∎) renaming (_≡_ to _⟹_; refl to □; _≡⟨_⟩_ to _⟹⟨_⟩_; _≡⟨_⟩⁻¹_ to _⟸⟨_⟩_)
open import Data.Product
open import Data.Sum
open import Data.Nat using (ℕ; zero; suc)
open import Data.Vec
open import Function
open import Level renaming (zero to lzero; suc to lsuc)

First we postulate the existence of the interval. I will abbreviate the interval type as I.

postulate I : Set
postulate i₀ : I
postulate i₁ : I

The canonical eliminator for the interval needs equalities, to show that i₀ and i₁ are mapped to equal values. But we haven't defined those yet. However, there is one eliminator that we can define, namely into I, since values in I are always equal.

postulate icase : I → I → I → I
postulate icase-i₀ : ∀ a b → icase a b i₀ ⟹ a
postulate icase-i₁ : ∀ a b → icase a b i₁ ⟹ b
{-# REWRITE icase-i₀ icase-i₁ #-}

And with this icase construct, we can define conjunction, disjunction, and negation

_&&_ : I → I → I
i && j = icase i₀ j i

_||_ : I → I → I
i || j = icase j i₁ i

inot : I → I
inot = icase i₁ i₀
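To see that these definitions behave like the usual boolean connectives at the endpoints, here is a minimal sketch in Haskell rather than Agda. It assumes a two-point model of the interval (only i₀ and i₁, nothing in between), so it says nothing about continuity; the names I0, I1, iand, ior and truthTable are made up for this illustration.

```haskell
-- Endpoint-only model of the interval: just i₀ and i₁ (an assumption of this sketch).
data I = I0 | I1
  deriving (Eq, Show)

-- icase a b i picks a at i₀ and b at i₁, mirroring icase-i₀ and icase-i₁.
icase :: I -> I -> I -> I
icase a _ I0 = a
icase _ b I1 = b

-- Conjunction, disjunction and negation, written as in the Agda definitions.
iand :: I -> I -> I
iand i j = icase I0 j i

ior :: I -> I -> I
ior i j = icase j I1 i

inot :: I -> I
inot = icase I1 I0

-- All endpoint combinations, as (i, j, i && j, i || j).
truthTable :: [(I, I, I, I)]
truthTable = [ (i, j, iand i j, ior i j) | i <- [I0, I1], j <- [I0, I1] ]
```

Evaluating truthTable in GHCi lists the four rows of the usual and/or table, with I0 playing the role of false and I1 the role of true.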

We can define some extra computation rules based on the principle that when evaluating icase a b c, if we use the a branch then c = i₀, and similarly for b.

postulate icase-same : ∀ (a b c : I → I) d → a i₀ ⟹ c i₀ → b i₁ ⟹ c i₁
                     → icase (a d) (b d) d ⟹ c d

icase-const : ∀ a b → icase a a b ⟹ a
icase-id    : ∀ a   → icase i₀ i₁ a ⟹ a
icase-i₀-x  : ∀ b   → icase i₀ b b ⟹ b
icase-i₁-x  : ∀ b   → icase i₁ b b ⟹ i₁
icase-x-i₀  : ∀ a   → icase a i₀ a ⟹ i₀
icase-x-i₁  : ∀ a   → icase a i₁ a ⟹ a

icase-const a b = icase-same (const a) (const a) (const a) b □ □
icase-id    a   = icase-same (const i₀) (const i₁) id a □ □
icase-i₀-x  b   = icase-same (const i₀) id id b □ □
icase-i₁-x  b   = icase-same (const i₁) id (const i₁) b □ □
icase-x-i₀  a   = icase-same id (const i₀) (const i₀) a □ □
icase-x-i₁  a   = icase-same id (const i₁) id a □ □

{-# REWRITE icase-const #-}
{-# REWRITE icase-id #-}
{-# REWRITE icase-i₀-x #-}
{-# REWRITE icase-i₁-x #-}
{-# REWRITE icase-x-i₀ #-}
{-# REWRITE icase-x-i₁ #-}

The equality type

We can now define the indexed equality type

data Eq {a} (A : I → Set a) : A i₀ → A i₁ → Set a where
  refl : ∀ (x : (i : I) → A i) → Eq A (x i₀) (x i₁)

For convenience we write the non-indexed object level equality as

_≡_ : ∀ {a} {A : Set a} → A → A → Set a
_≡_ {A = A} x y = Eq (\_ → A) x y

And now that we have equalities, we can write down the general dependent eliminator for the interval,

postulate _^_ : ∀ {a A x y} → Eq {a} A x y → (i : I) → A i
postulate ^-i₀   : ∀ {a A x y} x≡y → _^_ {a} {A} {x} {y} x≡y i₀ ⟹ x
postulate ^-i₁   : ∀ {a A x y} x≡y → _^_ {a} {A} {x} {y} x≡y i₁ ⟹ y
postulate ^-refl : ∀ {a A} x → _^_ {a} {A} {x i₀} {x i₁} (refl x) ⟹ x
{-# REWRITE ^-i₀ ^-i₁ ^-refl #-}
infixl 6 _^_

At the same time, the _^_ operator also functions as an eliminator for Eq, projecting out the argument to refl. This also means that we have the following eta contraction rule

refl-eta : ∀ {a A x y} (x≡y : Eq {a} A x y) → refl (\i → x≡y ^ i) ⟹ x≡y -- HIDE a
refl-eta (refl x) = □
{-# REWRITE refl-eta #-}

These definitions are enough to state some object level theorems, such as function extensionality

ext′ : ∀ {a} {A B : Set a} {f g : A → B} → (∀ x → f x ≡ g x) → f ≡ g -- HIDE a
ext′ f≡g = refl \i → \x → f≡g x ^ i

congruence,

cong′ : ∀ {a b} {A : Set a} {B : Set b} (f : A → B) {x y} → x ≡ y → f x ≡ f y -- HIDE a|b
cong′ f x≡y = refl \i → f (x≡y ^ i)

and symmetry of ≡,

sym′ : ∀ {a} {A : Set a} {x y : A} → x ≡ y → y ≡ x -- HIDE a
sym′ x≡y = refl \i → x≡y ^ inot i

We can also define dependent versions of all of the above, which are the same, only with more general types. I'll leave these as an exercise for the reader.

spoiler
sym : ∀ {a} {A : I → Set a} {x y} → Eq A x y → Eq (A ∘ inot) y x
sym x≡y = refl \i → x≡y ^ inot i

Transport

In general, to make full use of equalities, you would use substitution, also called transport. I will formalize this as

postulate tr : ∀ {a} (A : I → Set a) → A i₀ → A i₁ -- HIDE a

Here tr stands for transport, since we transport a value of type A i₀ along A to a value of type A i₁. This should be possible because there is a path between i₀ and i₁ (that is, they are indistinguishable) and because functions are continuous, so A is a continuous path between A i₀ and A i₁. In a previous blog post I used a more general cast primitive, which can be defined in terms of tr,

cast : ∀ {a} (A : I → Set a) → (j₀ j₁ : I) → A j₀ → A j₁ -- HIDE a
cast A j₀ j₁ = tr (\i → A (icase j₀ j₁ i))

And now we can define things like the usual substitution

subst : ∀ {a b} {A : I → Set a} (B : {i : I} → A i → Set b) {x} {y} → Eq A x y → B x → B y -- HIDE a|b
subst B xy = tr (\i → B (xy ^ i))

and the J axiom

jay : ∀ {A : Set} {x : A} (B : {y : A} → x ≡ y → Set) → {y : A} → (x≡y : x ≡ y)
    → B (refl (\_ → x)) → B x≡y
jay B xy = tr (\i → B {xy ^ i} (refl \j → xy ^ (j && i)))

Yay, jay!

Evaluating transport

To be useful as a theory of computation, all primitives in our theory should reduce. In particular, we need to know how to evaluate tr, at least when it is applied to arguments without free variables. We do this by pattern matching on the first argument of tr, and defining transport for each type constructor.

The simplest case is if the type being transported along doesn't depend on the index at all

postulate tr-const : ∀ {a} {A : Set a} {x} → tr (\_ → A) x ⟹ x -- HIDE a
{-# REWRITE tr-const #-}

Much more interesting is the case when the type is a function type. To cast function types, we first transport the argument 'back', apply the function, and then transport the result forward. First look at the non-dependent case, i.e. going from A i₀ → B i₀ to A i₁ → B i₁:

postulate tr-arrow : ∀ {a b} {A : I → Set a} {B : I → Set b} {f} -- HIDE a|b
                   → tr (\i → A i → B i) f
                   ⟹ (\x → tr B (f (cast A i₁ i₀ x)))
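The shape of this rule (argument backward, result forward) can be sketched in plain Haskell. This is only an analogy: it assumes a path between types is represented by a pair of conversion functions, and the names Iso, trArrow, intString, boolInt and transported are hypothetical, not part of TTIE.

```haskell
-- A two-way conversion between a and b, standing in for a path between types.
data Iso a b = Iso { fw :: a -> b, bw :: b -> a }

-- Transport a function along a pair of conversions: move the argument
-- backward along the domain, apply f, then move the result forward.
trArrow :: Iso a a' -> Iso b b' -> (a -> b) -> (a' -> b')
trArrow dom cod f = fw cod . f . bw dom

-- Example conversions, chosen only for illustration
-- (read is partial, so this is not a true inverse of show on all strings).
intString :: Iso Int String
intString = Iso show read

boolInt :: Iso Bool Int
boolInt = Iso fromEnum toEnum

-- even :: Int -> Bool, transported to a function String -> Int.
transported :: String -> Int
transported = trArrow intString boolInt even
```

Here transported first reads its argument back into an Int, applies even, and then converts the Bool result to an Int, which is the same order of operations as on the right-hand side of tr-arrow.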

The dependent case is a bit more complicated, since the type of the result depends on the transported argument. The result of the function has type B i₀ (cast A i₁ i₀ x), and we have to transport this to B i₁ x. So as we go from i₀ to i₁, we want to "undo" the cast operation. We can do this by changing both i₀'s to i₁'s, to get a value of the type B i₁ (cast A i₁ i₁ x). Because cast A i₁ i₁ x ⟹ x by icase-const and tr-const, this is equivalent to B i₁ x.

postulate tr-pi : ∀ {a b} {A : I → Set a} {B : (i : I) → (A i) → Set b} {f} -- HIDE a|b
                → tr (\i → (x : A i) → B i x) f
                ⟹ (\x → tr (\i → B i (cast A i₁ i x)) (f (cast A i₁ i₀ x)))

Besides function/pi types, there are also product/sigma types. The idea here is similar: transport both parts of the pair independently. Again, the type of the second part can depend on the transported first part,

postulate tr-sigma : ∀ {a b} {A : I → Set a} {B : (i : I) → A i → Set b} {x y} -- HIDE a|b
                      → tr (\i → Σ (A i) (B i)) (x , y)
                      ⟹ (tr A x , tr (\i → B i (cast A i₀ i x)) y)

Finally, let's look at sum types, for which we use simple recursion,

postulate tr-sum₁ : ∀ {a b} {A : I → Set a} {B : I → Set b} {x} -- HIDE a|b
                  → tr (\i → A i ⊎ B i) (inj₁ x) ⟹ inj₁ (tr A x)
postulate tr-sum₂ : ∀ {a b} {A : I → Set a} {B : I → Set b} {x} -- HIDE a|b
                  → tr (\i → A i ⊎ B i) (inj₂ x) ⟹ inj₂ (tr B x)
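For the non-dependent versions of pairs and sums, the same idea can be sketched in Haskell, again assuming that a path between types is modelled by a pair of conversion functions. Iso, trPair, trSum and the example conversions are hypothetical names for this illustration, and the dependency of the second component on the first (the cast in tr-sigma) is not captured.

```haskell
-- A two-way conversion between a and b, standing in for a path between types.
data Iso a b = Iso { fw :: a -> b, bw :: b -> a }

-- Pairs: transport both components independently.
trPair :: Iso a a' -> Iso b b' -> (a, b) -> (a', b')
trPair ia ib (x, y) = (fw ia x, fw ib y)

-- Sums: look at the constructor and transport the payload.
trSum :: Iso a a' -> Iso b b' -> Either a b -> Either a' b'
trSum ia _  (Left x)  = Left (fw ia x)
trSum _  ib (Right y) = Right (fw ib y)

-- Example conversions, chosen only for illustration.
intString :: Iso Int String
intString = Iso show read

flipBool :: Iso Bool Bool
flipBool = Iso not not
```

For instance, trPair intString flipBool converts the first component with show and negates the second, while trSum intString flipBool does one or the other depending on the constructor.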

Transport for equality types

The final type constructors in our language are equality types, and this is where things get more hairy. The idea is that a type like Eq A x y behaves like A in many respects. Its values will just be wrapped in a refl constructor.

Consider the case of equalities over (dependent) function types. The evaluation rule could look like

postulate tr-eq-pi
           : ∀ {a b} {A : I → I → Set a} -- HIDE a|b
               {B : ∀ i j → A i j → Set b} -- HIDE a|b
               {u : ∀ i → (x : A i i₀) → B i i₀ x}
               {v : ∀ i → (x : A i i₁) → B i i₁ x}
               {f₀ : Eq (\j → (x : A i₀ j) → B i₀ j x) (u i₀) (v i₀)}
           → tr (\i → Eq (\j → (x : A i j) → B i j x) (u i) (v i)) f₀
           ⟹ refl \j → \x →
             let x' = \i' j' → tr (\i → A (icase i₁ i' i) (icase j j' i)) x in
             (tr (\i → Eq (\j' → B i j' (x' i j')) (u i (x' i i₀)) (v i (x' i i₁)))
                 (refl \j' → (f₀ ^ j') (x' i₀ j'))) ^ j

Of course the A in Eq A x y could again be an equality type, and we would have to repeat the construction. To do this systematically, I start by collecting all the 'sides' of the equality type recursively. For example the sides of Eq (\i → Eq (\j → _) x y) u v are eq (\i → eq (\j → done) x y) u v,

mutual
  data Sides {a} : ∀ n (A : Vec I n → Set a) → Set (lsuc a) where
    done : ∀ {A} → Sides zero A
    eq   : ∀ {n A}
         → (sides : (i : I) → Sides n (\is → A (i ∷ is)))
         → Eqs (sides i₀)
         → Eqs (sides i₁)
         → Sides (suc n) A

  Eqs : ∀ {a n A} → Sides {a} n A → Set a
  Eqs {A = A} done = A []
  Eqs {A = A} (eq sides x y) = Eq (\i → Eqs (sides i)) x y

Since I → A are the continuous functions out of the 1-dimensional interval, you can think of a Vec I n → A as a continuous function out of the n-dimensional hypercube. So in geometric terms, we can draw such a function as assigning a value to all elements of the hypercube. Similarly, you can think of Sides {n = n} as a function out of the n-dimensional hypercube with the central cell removed, and Eqs as filling in that central cell.

[Figure: diagrams of Eqs 0, Sides 1, Eqs 1, Vec I 1 → A, Sides 2, Eqs 2 and Vec I 2 → A]

I will spare you the details; see the source code of this post if you are interested. Suffice it to say that if we generalize _^_, icase, etc. from I to Vec I n and from Eq to Eqs, then we can generalize tr-eq-pi to arbitrarily deep Eqs.

tr-eqs-pi-rhs : ∀ {a b n} {A : I → Vec I n → Set a} -- HIDE a|b
                {B : (i : I) → (is : Vec I n) → A i is → Set b} -- HIDE a|b
              → (sides : (i : I) → Sides n (\js → (x : A i js) → B i js x))
              → Eqs (sides i₀)
              → Eqs (sides i₁)

postulate tr-eqs-pi : ∀ {a b n}
                        {A : I → Vec I n → Set a}
                        {B : (i : I) → (is : Vec I n) → A i is → Set b}
                        (sides : (i : I) → Sides n (\js → (x : A i js) → B i js x))
                        (f₀ : Eqs (sides i₀))
                    → tr (Eqs ∘ sides) f₀
                    ⟹ tr-eqs-pi-rhs sides f₀

You can do a similar thing for sigma types, except that the types get even messier there because we need a dependently typed map function for Eqs and Sides.

This is the evaluation strategy implemented in the current TTIE interpreter. But it has two issues: 1) it is error-prone and ugly, and 2) we still haven't defined tr (Eq Set u v), which is what remains to be done.

A note about transitivity

Note that transitivity can be defined by transporting along an equality,

trans′ : ∀ {a} {A : Set a} {x y z : A} → x ≡ y → y ≡ z → x ≡ z -- HIDE a
trans′ {y = y} x≡y y≡z = tr (\i → (x≡y ^ inot i) ≡ (y≡z ^ i)) (refl \_ → y)

There are several ways to generalize this to dependent types. I'll use a variant that is explicit about the type

trans : ∀ {a} (A : I → I → Set a) {x y z} -- HIDE a
      → Eq (\i → A i₀ i) x y
      → Eq (\i → A i i₁) y z
      → Eq (\i → A i i) x z
trans A {y = y} x≡y y≡z = tr (\i → Eq (\j → A (icase i₀ i j) (icase (inot i) i₁ j)) (x≡y ^ inot i) (y≡z ^ i)) (refl \_ → y)

Just as transitivity can be defined in terms of tr, the converse is also true. Instead of specifying transport for nested equality types, we could define tr for Eq types in terms of transitivity and symmetry.

The most general case of such a transport is

xy = tr (\i → Eq (\j → A i j) (ux ^ i) (vy ^ i)) uv

where

ux : Eq (\i → A i i₀) u x
vy : Eq (\i → A i i₁) v y
uv : Eq (\j → A i₀ j) u v

These form three sides of a square with corners u, v, x and y; the transported path xy is the fourth side, opposite uv.

If you ignore the types for now, it seems obvious that

xy = trans (trans (sym ux) uv) vy

So, we could take

postulate tr-eq : ∀ {a} {A : I → I → Set a} -- HIDE a
                    (ux : ∀ i → A i i₀)
                    (vy : ∀ i → A i i₁)
                    (uv : Eq (A i₀) (ux i₀) (vy i₀))
                → tr (\i → Eq (A i) (ux i) (vy i)) uv
                ⟹ trans (\i j → A (icase i₁ i j) (icase i i j))
                    (refl (ux ∘ inot)) (trans A uv (refl vy))

I will stick to taking tr as primitive. However, this definition will come in handy for defining transport along paths between types.

Inductive types

It is straightforward to extend the theory with inductive types and higher inductive types. Here are some concrete examples, taken from the HoTT book.

The homotopy circle

postulate Circle : Set
postulate point  : Circle
postulate loop   : Eq (\_ → Circle) point point
postulate Circle-elim : ∀ {a} {A : Circle → Set a} -- HIDE a
                      → (p : A point)
                      → (l : Eq (\i → A (loop ^ i)) p p)
                      → (x : Circle) → A x

with the computation rules

postulate elim-point : ∀ {a A p l} → Circle-elim {a} {A} p l point ⟹ p -- HIDE a
postulate elim-loop  : ∀ {a A p l i} → Circle-elim {a} {A} p l (loop ^ i) ⟹ l ^ i -- HIDE a
{-# REWRITE elim-point #-}
{-# REWRITE elim-loop #-}

Technically we would also need to specify elim for transitive paths (or paths constructed with tr). First the non-dependent version,

postulate Circle-elim′-tr-eq : ∀ {a A p l} (x y : I → Circle) xy i -- HIDE a
            → Circle-elim {a} {\_ → A} p l (tr (\j → x j ≡ y j) xy ^ i) -- HIDE a
            ⟹ tr (\j → Circle-elim {a} {\_ → A} p l (x j) -- HIDE a
                      ≡ Circle-elim {a} {\_ → A} p l (y j)) -- HIDE a
                  (refl \k → Circle-elim {a} {\_ → A} p l (xy ^ k)) ^ i -- HIDE a

To write down the dependent version, it is helpful to first define a generalized version of transport over equality types. This generalized equality transport doesn't just give the final path, but also any of the sides, depending on the argument. Fortunately, it can be defined in terms of the existing transport primitive tr.

treq : ∀ {a} (A : I → I → Set a) -- HIDE a
     → (x : ∀ i → A i i₀) (y : ∀ i → A i i₁) (xy : Eq (\j → A i₀ j) (x i₀) (y i₀))
     → (i j : I) → A i j
treq A x y xy i j = tr (\k → Eq (A (i && k)) (x (i && k)) (y (i && k))) xy ^ j

Note that we have

treq-i-i₀ : ∀ {a} A x y xy i → treq {a} A x y xy i i₀ ⟹ x i -- HIDE a
treq-i-i₁ : ∀ {a} A x y xy i → treq {a} A x y xy i i₁ ⟹ y i -- HIDE a
treq-i₀-j : ∀ {a} A x y xy j → treq {a} A x y xy i₀ j ⟹ xy ^ j -- HIDE a
treq-i₁-j : ∀ {a} A x y xy j → treq {a} A x y xy i₁ j ⟹ tr (\i → Eq (A i) (x i) (y i)) xy ^ j -- HIDE a

Now the dependent version of commuting Circle-elim for transitive paths looks like this:

postulate Circle-elim-tr-eq : ∀ {a A p l} (x y : I → Circle) xy i -- HIDE a
            → Circle-elim {a} {A} p l (tr (\j → x j ≡ y j) xy ^ i)
            ⟹ tr (\j → Eq (\k → A (treq _ x y xy j k)) 
                          (Circle-elim {a} {A} p l (x j))
                          (Circle-elim {a} {A} p l (y j)))
                 (refl \k → Circle-elim {a} {A} p l (xy ^ k)) ^ i

We also need to continue this for higher paths, but that should be straightforward, if tedious.

tedious next step...
postulate Circle-elim-tr-eq-eq : ∀ {a A p ll} (x y : I → I → Circle) -- HIDE a
                                   (xy₀ : ∀ k → x k i₀ ≡ y k i₀) (xy₁ : ∀ k → x k i₁ ≡ y k i₁)
                                   xy i j
            → Circle-elim {a} {A} p ll (tr (\k → Eq (\l → x k l ≡ y k l) (xy₀ k) (xy₁ k)) xy ^ i ^ j)
            ⟹ tr (\k → Eq (\l → Eq (\m → A (tr (\k' → Eq (\l' → x (k && k') l' ≡ y (k && k') l')
                                                         (xy₀ (k && k'))
                                                         (xy₁ (k && k'))) xy ^ l ^ m) )
                                   (Circle-elim {a} {A} p ll (x k l))
                                   (Circle-elim {a} {A} p ll (y k l)))
                          (refl \l → Circle-elim {a} {A} p ll (xy₀ k ^ l))
                          (refl \l → Circle-elim {a} {A} p ll (xy₁ k ^ l)))
                 (refl \k → refl \l → Circle-elim {a} {A} p ll (xy ^ k ^ l)) ^ i ^ j

Truncation

postulate Truncate : Set → Set
postulate box  : ∀ {A} → A → Truncate A
postulate same : ∀ {A} x y → Eq (\_ → Truncate A) x y

module _ {p} {A} {P : Truncate A → Set p} -- HIDE p
         (b : (x : A) → P (box x))
         (s : ∀ {x y} (px : P x) (py : P y) → Eq (\i → P (same x y ^ i)) px py) where

  postulate Truncate-elim : (x : Truncate A) → P x

  postulate elim-box  : ∀ x → Truncate-elim (box x) ⟹ b x
  postulate elim-same : ∀ x y i → Truncate-elim (same x y ^ i)
                                ⟹ s (Truncate-elim x) (Truncate-elim y) ^ i

Notice that in the eliminator for every path constructor, we expect an argument of type P "along that path constructor".

Quotient types

postulate _/_      : (A : Set) → (R : A → A → Set) → Set
postulate quot     : ∀ {A R} → A → A / R
postulate eqn      : ∀ {A R} → (x y : A) → R x y → Eq (\_ → A / R) (quot x) (quot y)
postulate truncate : ∀ {A R} → (x y : A / R) → (r s : Eq (\_ → A / R) x y) → r ≡ s

module _ {A R} {P : A / R → Set}
         (q : (x : A) → P (quot x))
         (e : ∀ {x y} → (r : R x y) → Eq (\i → P (eqn x y r ^ i)) (q x) (q y))
         (t : ∀ {x y r s}
            → (px : P x) (py : P y) (pr : Eq (\i → P (r ^ i)) px py) (ps : Eq (\i → P (s ^ i)) px py)
            → Eq (\i → Eq (\j → P (truncate x y r s ^ i ^ j)) px py) pr ps) where

  postulate /-elim : (x : A / R) → P x

  postulate elim-quot : ∀ x → /-elim  (quot x) ⟹ q x
  postulate elim-eqn  : ∀ x y r i → /-elim (eqn x y r ^ i) ⟹ e r ^ i
  postulate elim-truncate : ∀ x y r s i j
                          → /-elim (truncate x y r s ^ i ^ j)
                          ⟹ t (/-elim x) (/-elim y) (refl \k → /-elim (r ^ k)) (refl \k → /-elim (s ^ k)) ^ i ^ j

Indexed types

One caveat to the support for inductive types is indexed types. These are the types with parameters whose value can depend on the constructor, written after the colon in Agda. An obvious example is the standard inductive equality type as it is defined in the standard library,

data _≡_ {A : Set} (x : A) : A → Set where
  refl : x ≡ x

Another example is length-indexed vectors,

data Vec (A : Set) : ℕ → Set where
  [] : Vec A zero
  _∷_ : ∀ {n} → A → Vec A n → Vec A (suc n)

Such inductive types introduce a new kind of equality, and we can't have that in TTIE.

Fortunately, outlawing such definitions is not a big limitation, since any indexed type can be rewritten to a normal inductive type by making the equalities explicit. For example

data Vec (A : Set) (n : ℕ) : Set where
  [] : n ≡ zero → Vec A n
  _∷_ : ∀ {m} → A → Vec A m → n ≡ suc m → Vec A n
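The same encoding can be sketched in Haskell with GHC extensions, where the length is an ordinary type parameter and each constructor stores an explicit equality proof. One caveat is that GHC's `:~:` from Data.Type.Equality is itself an indexed type, so here it only plays the role that the built-in Eq would play in TTIE. The names Nat, Vec, vlength, toList and example are chosen for this illustration.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ExistentialQuantification, TypeOperators #-}
import Data.Type.Equality ((:~:) (Refl))

-- Natural numbers, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- The length n is a plain parameter; each constructor carries a proof
-- that n has the required shape, as in the rewritten Agda Vec.
data Vec a (n :: Nat)
  = Nil (n :~: 'Z)
  | forall m. Cons a (Vec a m) (n :~: 'S m)

-- Functions that ignore the length can simply skip over the proofs.
vlength :: Vec a n -> Int
vlength (Nil _)       = 0
vlength (Cons _ xs _) = 1 + vlength xs

toList :: Vec a n -> [a]
toList (Nil _)       = []
toList (Cons x xs _) = x : toList xs

-- A two-element vector; every proof is Refl because the lengths line up.
example :: Vec Char ('S ('S 'Z))
example = Cons 'a' (Cons 'b' (Nil Refl) Refl) Refl
```

The price of this encoding is visible in example: every constructor application has to supply its proof explicitly, where the indexed version would leave it to unification.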

Univalence

The final ingredient to turn TTIE into a homotopy type theory is the univalence axiom. A univalence primitive might look like this:

postulate univalence : ∀ {a} {A B : Set a} -- HIDE a
                     → (f : A → B)
                     → (g : B → A)
                     → (gf : ∀ x → g (f x) ≡ x)
                     → (fg : ∀ x → f (g x) ≡ x)
                     → (fgf : ∀ x → cong′ f (gf x) ≡ fg (f x))
                     → Eq (\_ → Set a) A B -- HIDE a

By using an equality constructed with univalence in a transport, you can recover the forward and backward functions,

fw : ∀ {a} {A B : Set a} → A ≡ B → A → B -- HIDE a
fw A≡B = tr (_^_ A≡B)

bw : ∀ {a} {A B : Set a} → A ≡ B → B → A -- HIDE a
bw A≡B = tr (_^_ A≡B ∘ inot)

as well as the proofs of left and right-inverse,

bw∘fw : ∀ {a} {A B : Set a} → (A≡B : A ≡ B) → ∀ x → bw A≡B (fw A≡B x) ≡ x -- HIDE a
bw∘fw A≡B x = refl \j → tr (\i → A≡B ^ icase (inot j) i₀ i)
                       (tr (\i → A≡B ^ icase i₀ (inot j) i) x)

fw∘bw : ∀ {a} {A B : Set a} → (A≡B : A ≡ B) → ∀ x → fw A≡B (bw A≡B x) ≡ x -- HIDE a
fw∘bw A≡B x = refl \j → tr (\i → A≡B ^ icase j i₁ i)
                       (tr (\i → A≡B ^ icase i₁ j i) x)

Here the trick is that when j = i₁, the transports become the identity, while otherwise they become fw and bw.

Getting out the adjunction fgf is a bit harder. You need to come up with an expression that reduces to f (gf x ^ k) when j = i₀ and that reduces to (fg (f x) ^ k) when j = i₁. The following does the trick

not-quite-fw∘bw∘fw : ∀ {a} {A B : Set a} → (A≡B : A ≡ B) → ∀ x -- HIDE a
                   → cong′ (fw A≡B) (bw∘fw A≡B x) ≡ fw∘bw A≡B (fw A≡B x)
not-quite-fw∘bw∘fw A≡B x = refl \j →
  refl \k → tr (\i → A≡B ^ icase                          (icase i₀ k j) i₁ i)
          $ tr (\i → A≡B ^ icase    (icase (inot k) i₁ j) (icase i₀ k j)    i)
          $ tr (\i → A≡B ^ icase i₀ (icase (inot k) i₁ j)                   i) x)

but the type is not right. We want an equality between two equalities, both of type fw (bw (fw x)) ≡ x. But instead we get a dependent equality type that mirrors the body of the definition.

To resolve this, we need to add another reduction rule to the language, which states that if you transport from i₀ to i and then to i₁, this is the same as going directly from i₀ to i₁. This should hold regardless of what i is.

postulate tr-tr : ∀ {a} (A : I → Set a) i x → tr (A ∘ icase i i₁) (tr (A ∘ icase i₀ i) x) ⟹ tr A x -- HIDE a
postulate tr-tr-i₀ : ∀ {a} A x → tr-tr {a} A i₀ x ⟹ □ -- HIDE a
postulate tr-tr-i₁ : ∀ {a} A x → tr-tr {a} A i₁ x ⟹ □ -- HIDE a
{-# REWRITE tr-tr-i₀ tr-tr-i₁ #-}

fw∘bw∘fw : ∀ {a} {A B : Set a} → (A≡B : A ≡ B) → ∀ x 
         → cong′ (fw A≡B) (bw∘fw A≡B x) ≡ fw∘bw A≡B (fw A≡B x)
fw∘bw∘fw A≡B x = 
-- same as above, with ugly rewriting details...
  Meta.subst id (cong-Eq
    (ext \j → cong-Eq □ □ (tr-tr (\i → A≡B ^ i) (j) x)) □ □)
    (refl \j → refl \k
          → tr (\i → A≡B ^ icase                          (icase i₀ k j) i₁ i)
          $ tr (\i → A≡B ^ icase    (icase (inot k) i₁ j) (icase i₀ k j)    i)
          $ tr (\i → A≡B ^ icase i₀ (icase (inot k) i₁ j)                   i) x)

Computation rules

The computation rules are now obvious: when fw, bw, etc. are applied to a univalence primitive, return the appropriate field.

module _ {a} {A B} f g gf fg fgf (let AB = univalence {a} {A} {B} f g gf fg fgf) where -- HIDE a
  postulate tr-univalence-f : ∀ x → tr (\i → AB ^ i) x ⟹ f x
  postulate tr-univalence-g : ∀ x → tr (\i → AB ^ inot i) x ⟹ g x
  {-# REWRITE tr-univalence-f #-}
  {-# REWRITE tr-univalence-g #-}

  postulate tr-univalence-gf : ∀ x j
                             → tr (\i → AB ^ icase j i₀ i) (tr (\i → AB ^ icase i₀ j i) x)
                             ⟹ gf x ^ inot j
  postulate tr-univalence-fg : ∀ x j
                             → tr (\i → AB ^ icase j i₁ i) (tr (\i → AB ^ icase i₁ j i) x)
                             ⟹ fg x ^ j
  {-# REWRITE tr-univalence-gf #-}
  {-# REWRITE tr-univalence-fg #-}
  -- tr-univalence-fgf omitted

Ideally, we would be able to compute tr for AB ^ f i for any function f, and even

tr (\i → AB ^ f₁ i) ∘ ⋯ ∘ tr (\i → AB ^ f_n i)

But we quickly run into problems. Consider

  problem : I → I → A → B
  problem j k = tr (\i → AB ^ icase k i₁ i)
              ∘ tr (\i → AB ^ icase j k i)
              ∘ tr (\i → AB ^ icase i₀ j i)

When j=i₁, this reduces to

problem i₁ k = fg ^ k ∘ f

and when k=i₀, it reduces to

problem j i₀ = f ∘ gf ^ j

These two types look a lot like the adjunction fgf, but there are two differences:

1. For the two reductions of problem to be confluent, the two right hand sides should be equal in the meta language (judgementally equal). But an adjunction inside the theory doesn't guarantee this.

2. Even when using fgf, we can not get an expression for problem with the right reductions. The issue is that depending on j and k, problem can represent any of the following compositions

problem i₀ i₀ = f  ∘ id ∘ id
problem i₀ i₁ = id ∘ f  ∘ id
problem i₁ i₀ = f  ∘ g  ∘ f
problem i₁ i₁ = id ∘ id ∘ f
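The table above can be reproduced mechanically with a small symbolic sketch in Haskell. It assumes only the endpoint behaviour of a single transport along AB ^ icase a b i: the identity when a and b coincide, f from i₀ to i₁, and g from i₁ to i₀. The names End, Step, step and problem are specific to this sketch, and the compositions are represented as lists of step names rather than as actual functions.

```haskell
-- Interval endpoints and the possible meanings of one transport step.
data End = I0 | I1
  deriving (Eq, Show)

data Step = Id | F | G
  deriving (Eq, Show)

-- Transport along AB ^ icase a b i, evaluated at endpoint values of a and b.
step :: End -> End -> Step
step I0 I1 = F    -- forward along the equivalence
step I1 I0 = G    -- backward along the equivalence
step _  _  = Id   -- constant path

-- The three transports that make up problem j k, outermost first.
problem :: End -> End -> [Step]
problem j k = [step k I1, step j k, step I0 j]
```

For example, problem I1 I0 evaluates to [F, G, F], matching the row f ∘ g ∘ f.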

Transporting univalent paths

Finally, we also need to decide how to transport along equality types involving univalence. As I showed previously, transporting along equalities can be defined in terms of transitivity. So that is what we will do here. The idea is that to transport along trans AB BC, you first transport along AB, and then along BC. The same goes for other directions of using this transitive path (bw, fw∘bw, etc.)

module _ {a} {A B C : Set a} (A≡B : A ≡ B) (B≡C : B ≡ C) where
  trans-f : A → C
  trans-f = fw B≡C ∘ fw A≡B

  trans-g : C → A
  trans-g = bw A≡B ∘ bw B≡C

  trans-gf : ∀ x → trans-g (trans-f x) ≡ x
  trans-gf x = cong′ (bw A≡B) (bw∘fw B≡C (fw A≡B x)) ⟨ trans′ ⟩ bw∘fw A≡B x

  trans-fg : ∀ x → trans-f (trans-g x) ≡ x
  trans-fg x = cong′ (fw B≡C) (fw∘bw A≡B (bw B≡C x)) ⟨ trans′ ⟩ fw∘bw B≡C x

  postulate trans-fgf : ∀ x → cong′ trans-f (trans-gf x) ≡ trans-fg (trans-f x)
  -- trans-fgf should be provable, but proof is omitted here

  trans-equivalence : A ≡ C
  trans-equivalence = univalence trans-f trans-g trans-gf trans-fg trans-fgf
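Ignoring the proof components, the function parts of this construction are just composition of two-way conversions. A minimal Haskell sketch, assuming an equivalence is modelled by a pair of functions with no inverse proofs attached (Iso, transIso, symIso and the example conversions are hypothetical names):

```haskell
-- A two-way conversion between a and b; the proofs gf, fg and fgf
-- have no counterpart in this sketch.
data Iso a b = Iso { fw :: a -> b, bw :: b -> a }

-- Forward: first along ab, then along bc. Backward: the other way around.
transIso :: Iso a b -> Iso b c -> Iso a c
transIso ab bc = Iso (fw bc . fw ab) (bw ab . bw bc)

-- Reversing a conversion swaps its two directions.
symIso :: Iso a b -> Iso b a
symIso (Iso f g) = Iso g f

-- Example conversions, chosen only for illustration.
boolInt :: Iso Bool Int
boolInt = Iso fromEnum toEnum

intString :: Iso Int String
intString = Iso show read

boolString :: Iso Bool String
boolString = transIso boolInt intString
```

Note the reversed order in the backward direction, which corresponds to trans-g = bw A≡B ∘ bw B≡C.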

And we use this transitivity to define transport,

postulate tr-eq-Set : ∀ {a} (A B : I → Set a) (A₀≡B₀ : A i₀ ≡ B i₀)
                    → tr (\i → Eq (\_ → Set a) (A i) (B i)) A₀≡B₀
                    ⟹ trans-equivalence (refl (A ∘ inot)) (trans-equivalence A₀≡B₀ (refl B))

-- special case for fw
tr-tr-eq-Set : ∀ {a} (A B : I → Set a) (A₀≡B₀ : A i₀ ≡ B i₀) x
             → tr (\j → tr (\i → Eq (\_ → Set a) (A i) (B i)) A₀≡B₀ ^ j) x
             ⟹ tr B (tr (_^_ A₀≡B₀) (tr (A ∘ inot) x))
tr-tr-eq-Set A B A₀≡B₀ x = Meta.cong (\A₁≡B₁ → tr (_^_ A₁≡B₁) x) (tr-eq-Set A B A₀≡B₀)

Note that tr-eq-Set cannot be used as a rewrite rule. Agda incorrectly complains about universe levels, and when removing those the rule is accepted, but the file takes more than 10 minutes to type check.

Reduction rules spoiled by univalence

While we are at it, it would be nice if we could add some additional judgemental equalities to the type system. For instance, trans xy (sym xy) = refl \_ → x should hold for all xy.

However, we cannot add this as a reduction. The reason is that for paths built with univalence, transporting along the left hand side reduces to bw∘fw, and this is not necessarily the same as reflexivity. Here is an example

-- A path that flips the interval in one direction, but not in the other
-- so fw ∘ bw ≠ refl
flip-I : I ≡ I
flip-I = univalence id inot
  (\i → refl (icase (inot i) i))
  (\i → refl (icase (inot i) i))
  (\i → refl \_ → refl (icase (inot i) i))

module _ (trans-sym : ∀ {a A x y} xy → trans′ {a} {A} {x} {y} xy (sym xy) -- hide {a}
                                     ⟹ (refl \_ → x)) where
  problem2 : i₀ ⟹ i₁
  problem2 = 
    Meta.begin
      i₀
    ⟸⟨ tr-tr-eq-Set (_^_ flip-I ∘ inot) (_^_ flip-I ∘ inot) (refl \_ → I) i₁ ⟩
      tr (\i → trans′ flip-I (sym flip-I) ^ i) i₁
    ⟹⟨ Meta.cong (\AB → tr (\i → AB ^ i) i₁) (trans-sym flip-I) ⟩
      i₁
    ∎

tr (\i → trans′ flip-I (sym flip-I) ^ i) i₁ evaluates to i₀ with tr-eq-Set, since we follow the equivalence forward (id) and then backward (inot). But according to trans-sym it is an identity path, and so this expression evaluates to i₁. So, we have a term that can evaluate to either i₀ or to i₁, depending on the evaluation order. In other words, reduction is no longer confluent.

This might not seem too bad, since i₀ ≡ i₁ inside the theory. But note that the reduction relation ⟹ is not a homotopy equality. And it might even be untyped if we were using an untyped meta-theory, like the Haskell TTIE implementation. With a non-confluent reduction relation, it is easy to break the type system,

flip-Bool : Bool ≡ Bool
flip-Bool = univalence not not not-not not-not not-not-not

bad : i₀ ⟹ i₁ → (Bool , false) ⟹ (_,_ {B = id} Bool true)
bad x = Meta.cong {B = Σ Set id} (\i → flip-Bool ^ i , tr (\j → flip-Bool ^ i && j) false) x

worse : i₀ ⟹ i₁ → ⊥
worse x with bad x
... | ()

So, trans-sym is out.

Another seemingly sensible reduction is that cong f (trans xy yz) ≡ trans (cong f xy) (cong f yz). But, if we also postulate that all paths over the interval can be defined in terms of icase, we end up in the same problematic situation.

module _ (trans-cong : ∀ {a b A B x y z} (f : A → B) xy yz -- HIDE a|b
                     → cong′ f (trans′ {a} {A} {x} {y} {z} xy yz) -- HIDE a
                     ⟹ trans′ {b} (cong′ f xy) (cong′ f yz)) -- HIDE b
         (tr-eq-I : ∀ (j k : I → I) jk₀ → tr (\i → Eq (\_ → I) (j i) (k i)) jk₀
                                        ⟹ refl (icase (j i₁) (k i₁))) where
  trans-sym : ∀ {a A x y} xy → trans′ {a} {A} {x} {y} xy (sym xy) ⟹ (refl \_ → x) -- HIDE a
  trans-sym {x = x} xy =
    Meta.begin
      trans′ xy (sym xy)
    ⟸⟨ trans-cong (_^_ xy) (refl id) (refl inot) ⟩
      cong′ (\i → xy ^ i) (trans′ (refl id) (refl inot))
    ⟹⟨ Meta.cong (cong′ (_^_ xy)) (tr-eq-I inot inot (refl \_ → i₁)) ⟩
      refl (\_ → x)
    ∎

  problem3 : i₀ ⟹ i₁
  problem3 = problem2 trans-sym

I don't have any solution to these problems, aside from not adding the problematic reductions.

Reductions that do seem fine are those involving only a single path. For instance, things like trans xy (refl \_ → y) ⟹ xy.

Conclusion

What I have presented is the type theory with indexed equality. As mentioned before, there is also a prototype implementation in Haskell.

The theory is quite similar to the cubical system, but it was developed mostly independently.

Some areas I haven't discussed or investigated yet, and some issues with the theory are:

1. Transitive paths involving HIT path constructors are not reduced, so trans loop (sym loop) is not the same as refl \_ → point. However, the two are provably equal inside the theory. As with a general trans-sym rule, adding such a reduction would break confluence.

2. I have defined a function treq that generalizes tr (Eq ..). This could be taken as a primitive instead of tr. In that case we should further generalize it to take Sides, so that it also works for higher paths.

3. It is possible to combine transports to write terms that do not reduce, for example

x : A
AB : A ≡ B
f : A → Set
y : f (bw AB (fw AB x))
tr (\i → f (tr (\j → AB ^ icase (inot i) i₀ j)
           (tr (\j → AB ^ icase i (inot i) j)
           (tr (\j → AB ^ icase i₀ i j) x)))) y

The tr-tr rule handles one such case, but more are possible. For well-behaved equalities flattening all this out is not a problem, but with univalence the intermediate steps become important.

4. I am not entirely happy with univalence breaking confluence in combination with trans-sym. It means that you have to be really careful about which seemingly benign reductions are allowed.

Traversing syntax trees

2017-08-23T00:26:00Z

When working with syntax trees (such as in a type theory interpreter) you often want to apply some operation to all subtrees of a node, or to all nodes of a certain type. Of course you can do this easily by writing a recursive function. But then you would need to have a case for every constructor, and there can be many constructors.

Instead of writing a big recursive function for each operation, it is often easier to use a traversal function. Which is what this post is about. In particular, I will describe my favorite way to handle such traversal, in the hope that it is useful to others as well.

As a running example we will use the following data type, which represents expressions in a simple lambda calculus

-- Lambda calculus with de Bruijn indices
data Exp
  = Var !Int
  | Lam Exp
  | App Exp Exp
  | Global String
  deriving Show

example₁ :: Exp
example₁ = Lam $ Var 0 -- The identity function

example₂ :: Exp
example₂ = Lam $ Lam $ Var 1 -- The const function

example₃ :: Exp
example₃ = Lam $ Lam $ Lam $ App (Var 2) (App (Var 1) (Var 0)) -- Function composition

Now, what do I mean by a traversal function? The base library comes with the Traversable class, but that doesn't quite fit our purposes, because that class is designed for containers that can contain any type a. But expressions can only contain other sub-expressions. Instead we need a monomorphic variant of traverse for our expression type:

traverseExp :: Applicative f => (Exp -> f Exp) -> (Exp -> f Exp)

The idea is that traverseExp applies a given function to all direct children of an expression.

The uniplate package defines a similar function, descendM. But it has two problems: 1) descendM has a Monad constraint instead of Applicative, and 2) the class actually requires you to implement a uniplate method, which is more annoying to do.

The ever intimidating lens package has a closer match in plate. But aside from the terrible name, that function also lacks a way to keep track of bound variables.

For a language with binders, like the lambda calculus, many operations need to know which variables are bound. In particular, when working with de Bruijn indices, it is necessary to keep track of the number of bound variables. To do that we define

type Depth = Int
-- Traverse over immediate children, with depth
traverseExpD :: Applicative f => (Depth -> Exp -> f Exp) -> (Depth -> Exp -> f Exp)
traverseExpD _ _ (Var i)    = pure (Var i)
traverseExpD f d (Lam x)    = Lam <$> f (d+1) x
traverseExpD f d (App x y)  = App <$> f d x <*> f d y
traverseExpD _ _ (Global x) = pure (Global x)

Once we have written this function, other traversals can be defined in terms of traverseExpD

-- Traverse over immediate children
traverseExp :: Applicative f => (Exp -> f Exp) -> (Exp -> f Exp)
traverseExp f = traverseExpD (const f) 0

And map and fold are just traversals with a specific applicative functor, Identity and Const a respectively. Recent versions of GHC are smart enough to know that it is safe to coerce from a traversal function to a mapping or folding one.

-- Map over immediate children, with depth
mapExpD :: (Depth -> Exp -> Exp) -> (Depth -> Exp -> Exp)
mapExpD = coerce (traverseExpD :: (Depth -> Exp -> Identity Exp) -> (Depth -> Exp -> Identity Exp))

-- Map over immediate children
mapExp :: (Exp -> Exp) -> (Exp -> Exp)
mapExp = coerce (traverseExp :: (Exp -> Identity Exp) -> (Exp -> Identity Exp))

-- Fold over immediate children, with depth
foldExpD :: forall a. Monoid a => (Depth -> Exp -> a) -> (Depth -> Exp -> a)
foldExpD = coerce (traverseExpD :: (Depth -> Exp -> Const a Exp) -> (Depth -> Exp -> Const a Exp))

-- Fold over immediate children
foldExp :: forall a. Monoid a => (Exp -> a) -> (Exp -> a)
foldExp = coerce (traverseExp :: (Exp -> Const a Exp) -> (Exp -> Const a Exp))

After doing all this work, it is easy to answer questions like "how often is a variable used?"

varCount :: Depth -> Exp -> Sum Int
varCount i (Var j)
  | i == j   = Sum 1
varCount i x = foldExpD varCount i x

or "what is the set of all free variables?"

freeVars :: Depth -> Exp -> Set Int
freeVars d (Var i)
  | i < d     = Set.empty             -- bound variable
  | otherwise = Set.singleton (i - d) -- free variable
freeVars d x = foldExpD freeVars d x

Or to perform (silly) operations like changing all globals to lower case

lowerCase :: Exp -> Exp
lowerCase (Global x) = Global (map toLower x)
lowerCase x = mapExp lowerCase x

These functions follow a common pattern of specifying how a particular constructor, in this case Var or Global, is handled, while for all other constructors traversing over the child expressions.
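To see the pattern run end to end, here is a minimal self-contained sketch using only base. It repeats Exp and traverseExpD from above, but writes foldExpD with an explicit Const wrapper instead of coerce (an assumption made to avoid language extensions; the behaviour should be the same). The globals function is a hypothetical extra example, not something from the original code.

```haskell
import Data.Functor.Const (Const (..))
import Data.Monoid (Sum (..))

data Exp = Var !Int | Lam Exp | App Exp Exp | Global String
  deriving (Show, Eq)

type Depth = Int

-- Traverse over immediate children, with depth (as defined in the post)
traverseExpD :: Applicative f => (Depth -> Exp -> f Exp) -> (Depth -> Exp -> f Exp)
traverseExpD _ _ (Var i)    = pure (Var i)
traverseExpD f d (Lam x)    = Lam <$> f (d + 1) x
traverseExpD f d (App x y)  = App <$> f d x <*> f d y
traverseExpD _ _ (Global x) = pure (Global x)

-- Fold over immediate children: a traversal in the Const applicative,
-- written with explicit wrapping rather than coerce
foldExpD :: Monoid a => (Depth -> Exp -> a) -> (Depth -> Exp -> a)
foldExpD f d = getConst . traverseExpD (\d' -> Const . f d') d

-- Count the uses of the variable that has index i at the starting depth
varCount :: Depth -> Exp -> Sum Int
varCount i (Var j)
  | i == j   = Sum 1
varCount i x = foldExpD varCount i x

-- Hypothetical example: collect the names of all globals, ignoring depth
globals :: Exp -> [String]
globals (Global g) = [g]
globals x          = foldExpD (const globals) 0 x
```

In both functions only one constructor is mentioned explicitly; every other case is delegated to foldExpD.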

As another example, consider substitution, a very important operation on syntax trees. In its most general form, we can combine substitution with raising expressions to a larger context (also called weakening). And we should also consider leaving the innermost, bound, variables alone. This means that there are three possibilities for what to do with a variable.

substRaiseByAt :: [Exp] -> Int -> Depth -> Exp -> Exp
substRaiseByAt ss r d (Var i)
  | i < d           = Var i -- A bound variable, leave it alone
  | i-d < length ss = raiseBy d (ss !! (i-d)) -- substitution
  | otherwise       = Var (i - length ss + r) -- free variable, raising
substRaiseByAt ss r d x = mapExpD (substRaiseByAt ss r) d x

Similarly to varCount, we use mapExpD to handle all constructors besides variables. Plain substitution and raising are just special cases.

-- Substitute the first few free variables, weaken the rest
substRaiseBy :: [Exp] -> Int -> Exp -> Exp
substRaiseBy ss r = substRaiseByAt ss r 0

raiseBy :: Int -> Exp -> Exp
raiseBy r = substRaiseBy [] r

subst :: [Exp] -> Exp -> Exp
subst ss = substRaiseBy ss 0

λ> raiseBy 2 (App (Var 1) (Var 2))
App (Var 3) (Var 4)

λ> subst [Global "x"] (App (Var 0) (Lam (Var 0)))
App (Global "x") (Lam (Var 0))

λ> substRaiseBy [App (Global "x") (Var 0)] 2 $ App (Lam (App (Var 1) (Var 0))) (Var 2)
App (Lam (App (App (Global "x") (Var 1)) (Var 0))) (Var 3)
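One place where the combined operation pays off is beta reduction. The following sketch is self-contained and repeats the definitions above; betaStep is a hypothetical helper that is not part of the original code, and mapExpD is written with an explicit Identity wrapper instead of coerce.

```haskell
import Data.Functor.Identity (Identity (..))

data Exp = Var !Int | Lam Exp | App Exp Exp | Global String
  deriving (Show, Eq)

type Depth = Int

traverseExpD :: Applicative f => (Depth -> Exp -> f Exp) -> (Depth -> Exp -> f Exp)
traverseExpD _ _ (Var i)    = pure (Var i)
traverseExpD f d (Lam x)    = Lam <$> f (d + 1) x
traverseExpD f d (App x y)  = App <$> f d x <*> f d y
traverseExpD _ _ (Global x) = pure (Global x)

-- Map over immediate children: a traversal in the Identity applicative
mapExpD :: (Depth -> Exp -> Exp) -> (Depth -> Exp -> Exp)
mapExpD f d = runIdentity . traverseExpD (\d' -> Identity . f d') d

substRaiseByAt :: [Exp] -> Int -> Depth -> Exp -> Exp
substRaiseByAt ss r d (Var i)
  | i < d             = Var i                     -- bound variable, leave it alone
  | i - d < length ss = raiseBy d (ss !! (i - d)) -- substitution
  | otherwise         = Var (i - length ss + r)   -- free variable, raising
substRaiseByAt ss r d x = mapExpD (substRaiseByAt ss r) d x

substRaiseBy :: [Exp] -> Int -> Exp -> Exp
substRaiseBy ss r = substRaiseByAt ss r 0

raiseBy :: Int -> Exp -> Exp
raiseBy r = substRaiseBy [] r

-- Hypothetical helper: one beta step at the root of the expression.
-- The argument replaces variable 0 of the body; all other free
-- variables of the body move down by one because the binder disappears.
betaStep :: Exp -> Maybe Exp
betaStep (App (Lam body) arg) = Just (substRaiseBy [arg] 0 body)
betaStep _                    = Nothing
```

Because the substituted expression is raised by the current depth, an argument with free variables stays correct when it ends up under another binder.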

As a slight generalization, it can also make sense to put traverseExpD into a type class. That way we can traverse over the subexpressions inside other data types. For instance, if the language uses a separate data type for case alternatives, we might write

data Exp
  = ...
  | Case [Alt]

data Alt
  = Alt Pat Exp

class TraverseExp a where
  traverseExpD :: Applicative f => (Depth -> Exp -> f Exp) -> (Depth -> a -> f a)

instance TraverseExp a => TraverseExp [a] where
  traverseExpD f d = traverse (traverseExpD f d)

instance TraverseExp Exp where
  traverseExpD f d ...
  traverseExpD f d (Case xs) = Case <$> traverseExpD f d xs

instance TraverseExp Alt where
  traverseExpD f d (Alt x y) = Alt x <$> traverseExpD f (d + varsBoundByPat x) y

Another variation is to track other things besides the number of bound variables. For example we might track the names and types of bound variables for better error messages. And with a type class it is possible to track different aspects of bindings as needed,

class Env env where
  extend :: VarBinding -> env -> env

instance Env Depth where
  extend _ = (+1)

instance Env [VarBinding] where
  extend = (:)

instance Env () where
  extend _ _ = ()

traverseExpEnv :: (Env env, Applicative f) => (env -> Exp -> f Exp) -> (env -> Exp -> f Exp)
traverseExpEnv f env (Lam name x) = Lam <$> f (extend name env) x
traverseExpEnv f env ...
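As a rough, self-contained illustration of this variation, the sketch below uses a cut-down expression type in which Lam carries a name. Several things here are assumptions made for the example: Name stands in for VarBinding, Depth and Names are newtypes so that no language extensions are needed, and foldExpEnv and usedNames are hypothetical helpers.

```haskell
import Data.Functor.Const (Const (..))

type Name = String -- stand-in for VarBinding (assumption)

data Exp = Var Int | Lam Name Exp | App Exp Exp
  deriving (Show, Eq)

class Env env where
  extend :: Name -> env -> env

newtype Depth = Depth Int
newtype Names = Names [Name]

-- Only count the binders
instance Env Depth where
  extend _ (Depth d) = Depth (d + 1)

-- Remember the names of the binders, innermost first
instance Env Names where
  extend n (Names ns) = Names (n : ns)

traverseExpEnv :: (Env env, Applicative f) => (env -> Exp -> f Exp) -> (env -> Exp -> f Exp)
traverseExpEnv _ _   (Var i)   = pure (Var i)
traverseExpEnv f env (Lam n x) = Lam n <$> f (extend n env) x
traverseExpEnv f env (App x y) = App <$> f env x <*> f env y

foldExpEnv :: (Env env, Monoid a) => (env -> Exp -> a) -> (env -> Exp -> a)
foldExpEnv f env = getConst . traverseExpEnv (\e -> Const . f e) env

-- Names of the bound variables that are actually used;
-- indices that point outside the environment are ignored
usedNames :: Names -> Exp -> [Name]
usedNames (Names ns) (Var i)
  | i < length ns = [ns !! i]
  | otherwise     = []
usedNames env x = foldExpEnv usedNames env x
```

The same traverseExpEnv serves both environments; which one is used is decided by the type of the function that is passed in.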

Overall, I have found that after writing traverseExpD once, I rarely have to look at all constructors again. I can just handle the default cases by traversing the children.

A nice thing about this pattern is that it is very efficient. The traverseExpD function is not recursive, which means that the compiler can inline it. So after optimization, a function like lowerCase or varCount is exactly what you would have written by hand.

A type theory based on indexed equality - Implementation

2017-04-06T11:15:47Z

In this post I would like to present the type theory I have been working on, where the usual equality is replaced by an equality type indexed by the homotopy interval. This results in ideas very similar to those from the cubical system. I have a prototype implementation of this system in Haskell, which you can find on github. The system is unimaginatively called TTIE, a type theory with indexed equality. In this post I will focus on introducing the type system and its implementation. I will save the technical details for another post.

To recap: I have previously written about the 'indexed equality' type. The idea is that if we have the homotopy interval type with two points and a path between them,

-- Pseudo Agda notation
data Interval : Type where
  0 : Interval
  1 : Interval
  01 : Eq _ 0 1

then we can define a type of equality, 'indexed' by the interval:

data Eq (A : Interval → Type) : A 0 → A 1 → Type where
  refl : (x : (i : Interval) → A i) → Eq A (x 0) (x 1)

Rather than using lambdas all the time in the argument of Eq and refl, in TTIE I write the bound variable in a subscript. So refl_i (x i) means refl (\i → x i) and Eq_i (A i) x y means Eq (\i → A i) x y. If we represent all paths with this indexed equality type, then we can actually take 01 = refl_i i.

Now the (dependent) eliminator for the interval is

iv : ∀ {A} → {x : A 0} → {y : A 1} → (xy : Eq_i (A i) x y) → (i : Interval) → A i
iv {A} {x} {y} xy 0 = x
iv {A} {x} {y} xy 1 = y
iv {A} {x} {y} (refl_i (xy i)) i = xy i
refl_i (iv {A} {x} {y} xy i) = xy

For readability, I write iv xy i as xyⁱ. This combination already makes it possible to prove, for instance, congruence of functions without needing to use substitution (the J rule):

cong : ∀ {A B x y} → (f : A → B) → Eq_i A x y → Eq_i B (f x) (f y)
cong f xy = refl_i (f xyⁱ)

this can be generalized to dependent types

cong : ∀ {A B x y} → (f : (x : A) → B x) → (xy : Eq_i A x y) → Eq_i (B xyⁱ) (f x) (f y)
cong f xy = refl_i (f xyⁱ)

And we also get extensionality up to eta equality:

ext : ∀ {A B f g} → ((x : A) → Eq_i (B x) (f x) (g x)) → Eq_i ((x : A) → B x) (\x → f x) (\x → g x)
ext fg = refl_i (\x → (fg x)ⁱ)

So far, however, we can not yet represent general substitution. I have found that the most convenient primitive is

cast : (A : Interval → Type) → (i : Interval) → (j : Interval) → A i → A j

where cast_i A 0 0 x = x and cast_i A 1 1 x = x.

This generalized cast makes all kinds of proofs really convenient. For instance, we would like that cast A 1 0 ∘ cast A 0 1 = id. But it is already the case that cast A 0 0 ∘ cast A 0 0 = id. So we just have to change some of those 0s to 1s,

lemma : ∀ {A : Type} {x} → Eq _ (cast_i A 1 0 (cast_i A 0 1 x)) x
lemma {A} {x} = cast_j (Eq _ (cast_i A j 0 (cast_i A 0 j x)) x) 0 1 (refl_i x)

As another example, most type theories don't come with a built in dependent or indexed equality type. Instead, a common approach is to take

Eq_i (A i) x y = Eq (A 0) x (cast_i (A i) 1 0 y)

or

Eq_i (A i) x y = Eq (A 1) (cast_i (A i) 0 1 x) y

We can prove that these are equivalent:

lemma' : ∀ {A} {x y} → Eq Type (Eq (A 0) x (cast_i (A i) 1 0 y)) (Eq_i (A i) x y)
lemma' {A} {x} {y} = refl_j (Eq_k (A (j && k)) x (cast_i (A i) 1 j y))

where i && j is the and operator on intervals, i.e.

_&&_ : Interval → Interval → Interval
0 && j = 0
1 && j = j
i && 0 = 0
i && 1 = i

We can even go one step further to derive the ultimate in substitution, the J axiom:

J : ∀ {A : Type} {x : A} → (P : (y : A) → Eq A x y → Type)
  → (y : A) → (xy : Eq A x y) → P x (refl x) → P y xy
J P y xy pxy = cast_i (P xyⁱ (refl_j (xy^(j && i)))) 0 1 pxy

With the TTIE implementation, you can type check all of the above examples. The implementation comes with a REPL, where you can ask for types, evaluate expressions, and so on. Expressions and types can have holes, which will be inferred by unification, like in Agda.

On the other hand, this is by no means a complete programming language. For example, there are no inductive data types. You will instead have to work with product types (x , y : A * B) and sum types (value foo x : data {foo : A; bar : B}). See the readme for a full description of the syntax.

Stream fusion for streaming, without writing any code

2016-06-07T21:02:00Z

I recently came across the streaming library. This library defines a type Stream (Of a) m b for computations that produce zero or more values of type a in a monad m, and eventually produce a value of type b. This stream type can be used for efficient IO without having to load whole files into memory. The streaming library touts benchmark results showing superior performance compared to other libraries like conduit, pipes and machines.

Looking at the datatype definition,

data Stream f m r = Step !(f (Stream f m r))
                  | Effect (m (Stream f m r))
                  | Return r

it struck me how similar this type is to what is used in the stream fusion framework. The main differences are that the streaming library allows for interleaved monadic actions, and that it does not decouple the state from the stream, which is what allows for fusion. But the vector library actually also uses such a monadic stream fusion framework, to allow for writing into buffers and such. This type is defined in the module Data.Vector.Fusion.Stream.Monadic.

data Stream m a = forall s. Stream (s -> m (Step s a)) s
data Step s a where
   Yield :: a -> s -> Step s a
   Skip  :: s -> Step s a
   Done  :: Step s a
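To make the state machine representation more concrete, here is a small sketch that uses local copies of these two types rather than importing them from vector. The combinators fromListS, mapS, filterS and foldlS are hypothetical names written for this illustration; the real module has its own, more complete versions.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Functor.Identity (Identity (..))

-- Local copies of the types quoted above (not imported from vector)
data Step s a = Yield a s | Skip s | Done
data Stream m a = forall s. Stream (s -> m (Step s a)) s

-- A stream whose state is the remaining list
fromListS :: Monad m => [a] -> Stream m a
fromListS = Stream step
  where
    step []       = return Done
    step (x : xs) = return (Yield x xs)

-- Mapping only changes the step function; the state is untouched
mapS :: Monad m => (a -> b) -> Stream m a -> Stream m b
mapS f (Stream step s0) = Stream step' s0
  where
    step' s = do
      r <- step s
      return $ case r of
        Yield a s' -> Yield (f a) s'
        Skip s'    -> Skip s'
        Done       -> Done

-- Filtering turns rejected elements into Skip, so it needs no loop
filterS :: Monad m => (a -> Bool) -> Stream m a -> Stream m a
filterS p (Stream step s0) = Stream step' s0
  where
    step' s = do
      r <- step s
      return $ case r of
        Yield a s'
          | p a       -> Yield a s'
          | otherwise -> Skip s'
        Skip s' -> Skip s'
        Done    -> Done

-- The only recursive function: a strict left fold that runs the machine
foldlS :: Monad m => (b -> a -> b) -> b -> Stream m a -> m b
foldlS f z0 (Stream step s0) = go z0 s0
  where
    go z s = do
      r <- step s
      case r of
        Yield a s' -> let z' = f z a in z' `seq` go z' s'
        Skip s'    -> go z s'
        Done       -> return z
```

Since mapS and filterS are not recursive, a pipeline built from them can in principle be inlined into the single loop in foldlS, which is the idea behind fusion.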

So, why not try to use vector's stream type directly as a representation of streams? I added this type as an extra alternative to the benchmark, and without writing any more code, the results are pretty impressive.

The only function that could be improved is scanL. In vector this function is implemented in terms of prescan (scanL without the first element) and cons, which makes it pretty inefficient. So I made a specialized implementation.

And that's all. A simple streaming 'library' with state of the art performance, while writing hardly any new code. Now to be fair, there are some reasons why you wouldn't always want to use these fusing streams. In particular, the resulting code could get quite large, and without fusion they may not be the most efficient.

Extra unsafe sequencing of IO actions

2015-10-10T22:00:00Z

Warning: evil ahead!

A while ago Neil Mitchell wrote about a different implementation of sequence for the IO monad. The issue with the usual definition is that it is not tail recursive. Neil's version uses some hacks to essentially break out of the IO monad. But the solution does require two traversals of the list.

Now in any language other than Haskell this IO monad wouldn't exist at all, and with a bit of luck lists would be mutable. Then you could implement sequence by just appending items to the end of a list. In Haskell you can not do that. Or can you?

An obvious way to implement mutable lists in Haskell is with IORefs. But then you end up with something that is not an ordinary list, and you would have to use the IO monad even for reading from it. Instead, why not be unsafe? Just because Haskell doesn't let you change the tail of a list doesn't mean that it is impossible.
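For comparison, the safe IORef version might look roughly like the sketch below. MList, sequenceM and freeze are hypothetical names made up for this illustration. Note that converting the result back to an ordinary list takes a second traversal that is itself not tail recursive, so this does not solve the original problem.

```haskell
import Data.IORef

-- A mutable list cell: the tail lives in an IORef so it can be overwritten
data MList a = MNil | MCons a (IORef (MList a))

-- Run the actions in order, appending each result at the end of the list
sequenceM :: [IO a] -> IO [a]
sequenceM actions = do
  frontRef <- newIORef MNil
  let go _ [] = return ()
      go back (mx : mxs) = do
        x <- mx
        next <- newIORef MNil
        writeIORef back (MCons x next) -- fill in the tail of the last cell
        go next mxs
  go frontRef actions
  readIORef frontRef >>= freeze

-- Reading the mutable list back requires IO and another traversal
freeze :: MList a -> IO [a]
freeze MNil = return []
freeze (MCons x ref) = do
  rest <- readIORef ref >>= freeze
  return (x : rest)
```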

Now obviously this requires something beyond ordinary Haskell. And even doing it from C via the foreign function interface is hard, because GHC will try to marshal values you pass to C functions. But GHC also allows you to write primitive operations in C--, which is essentially a portable assembly language. In C-- you *can* just overwrite the tail pointer of a (:) constructor to point to something else.

So I wrote a simple unsafeSetField function.

unsafeSetFieldzh (W_ i, gcptr x, gcptr y) {
  W_ bd;
  x = UNTAG(x);
  P_[x + SIZEOF_StgHeader + WDS(i)] = y; // write in memory

  bd = Bdescr(x);
  if (bdescr_gen_no(bd) != 0 :: bits16) {
    recordMutableCap(x, TO_W_(bdescr_gen_no(bd)));
    return ();
  } else {
    return ();
  }
}

There are several things going on here. First of all, GHC uses pointer tagging, meaning that we need to untag the incoming pointer. Secondly, it might be the case that x lives in the old GC generation, in which case we have to mark the fact that we changed it, since otherwise y might get garbage collected. By the way, the zh at the end of the function name is the z encoding for the # character.

Now to use this function from Haskell we import it and add some unsafeCoerce,

foreign import prim "unsafeSetFieldzh" unsafeSetField#
  :: Int# -> Any -> Any -> (##)

unsafeSetField :: Int -> a -> b -> IO ()
unsafeSetField (I# i) !x y =
  case unsafeSetField# i (unsafeCoerce# x :: Any) (unsafeCoerce# y :: Any) of
    (##) -> return ()
{-# INLINEABLE unsafeSetField #-}

With it we can implement sequence as follows

sequenceU :: [IO a] -> IO [a]
sequenceU [] = return []
sequenceU (mx0:xs0) = do
    x0 <- mx0
    let front = x0:[]
    go front xs0
    return front
  where
  go back [] = return ()
  go back (mx:xs) = do
    x <- mx
    let back' = x:[]
    unsafeSetField 1 back back'
    go back' xs
{-# INLINEABLE sequenceU #-}

Now for the big question: does it work? The answer is that, yes it does! Benchmarking shows that the unsafe sequenceU is between 11% and 23% faster than Neil's sequenceIO in all cases. For small lists the standard sequence implementation is still marginally faster.

You should be aware that GHC sometimes shares values, so overwriting part of one might overwrite them all. And also that constant lists might become static values, meaning not allocated on the heap. So trying to overwrite parts of those will just crash the program.

I also wouldn't be surprised if the above code is subtly wrong. Perhaps I am missing a write barrier or doing something wrong with the generation check. So I wouldn't use it in production code if I were you.

What if you don't even know beforehand what constructor to use? The GHC runtime system has something called indirections. These are used to replace a thunk with its result after evaluation. But we can also use indirections to replace a value itself. But because of pointer tagging you can't just replace one constructor by another, because the tag would be wrong.

Instead the idea is to first allocate a special "hole" value, and then later fill that hole by overwriting it with an indirection. Note that you can only do that once, because the runtime system will follow and remove indirections when possible. So you get an API that looks like

newHole :: IO a
setHole :: a -> a -> IO a

It is also possible to implement sequence with holes. But, perhaps unsurprisingly, this turns out to be a bit slower. I'll leave the actual implementation as an exercise for the interested reader, as well as the question of what other evil you can commit with it.

I wanted to publish the functions from this post on hackage, but unfortunately I haven't yet figured out how to include C-- files in a cabal package. So instead everything is on github.

Dependent equality with the interval

2014-07-01T22:56:00Z

Here is a way to represent heterogeneous or dependent equalities, based on an interval type. In Homotopy Type Theory the interval is usually presented as a Higher Inductive Type with two constructors and a path between them. Here I will just give the two constructors, the path is implicit

data I : Set where
  i₁ : I
  i₂ : I
  -- there is usually a path, i-edge : i₁ ≡ i₂

The eliminator is

i-elim : ∀ {a} {A : I → Set a}
       → (x₁ : A i₁) → (x₂ : A i₂) → (Eq A x₁ x₂) → (i : I) → A i
i-elim x₁ x₂ eq i₁ = x₁
i-elim x₁ x₂ eq i₂ = x₂

Here the type Eq is the dependent equality, which has type

Eq : ∀ {a} (A : I → Set a) → (x₁ : A i₁) → (x₂ : A i₂) → Set a

so we take a type parametrized by an interval, and two values of that type at the two endpoints of this interval. We can also define "heterogeneous reflexivity", a generalization of the usual refl function:

refl : ∀ {a} {A : I → Set a} → (x : (i : I) → A i) → Eq A (x i₁) (x i₂)

This function can be used to extract the third part of i-elim, with the reduction

refl (i-elim x₁ x₂ eq) = eq

I believe this can be used as the basis for an observational type theory, where Eq A and refl x reduce. The above is the first case for refl, the rest is "just" tedious structural recursion such as

Eq (\i → A i × B i) x y = Eq A (proj₁ x) (proj₁ y) × Eq B (proj₂ x) (proj₂ y)
refl (\i → x i , y i) = refl x , refl y

and

Eq (\i → A i → B i) f g = {x : A i₁} → {y : A i₂} → Eq A x y → Eq B (f x) (g y)
refl (\i → \(x : A i) → f i x) = \{x} {y} xy → refl (\i → f i (i-elim x y xy i))

or we can actually use the dependent equality and be more general

Eq (\i → Σ (x₁ : A i) (B i x₁)) x y =
  Σ (x₁y₁ : Eq A (proj₁ x) (proj₁ y))
    (Eq (\i → B i (i-elim (proj₁ x) (proj₁ y) x₁y₁ i)) (proj₂ x) (proj₂ y))
Eq (\i → (x : A i) → B i x) f g =
  {x : A i₁} → {y : A i₂} → (xy : Eq A x y)
  → Eq (\i → B i (i-elim x y xy i)) (f x) (g y)

Of course there is a lot more to it, but that is not the subject of this post.

As a final remark: if you are not too touchy about typing, then refl could even be implemented with the path i-edge between i₁ and i₂

i-edge : Eq (\_ → I) i₁ i₂
i-elim x₁ x₂ eq i-edge = eq
refl foo = foo i-edge

But I'd rather not do that.

cong from refl in univalent OTT

2013-07-04T16:00:00Z

This is a follow up on last week's post. There I showed that in a univalent Observational Type Theory, you can derive subst from cong. Now I am going to go one step further.

Suppose we change the definition of paths for functions from

Path (A → B) f g ≡ ∀ x → f x ≡ g x

to

Path (A → B) f g ≡ ∀ {x y} → x ≡ y → f x ≡ g y

Then for a function f, refl f is actually the same thing as cong f! So that's one less primitive to worry about. In fact the only two path related primitives that remain are Path and refl. The rest is just in the computation rules.

Here are the changes in the agda code compared to last week:

postulate Path-→ : ∀ {a b} {A : Set a} {B : Set b} (f g : A → B)
                 → Path (A → B) f g
                 ≡ ((x y : A) → Path A x y → Path B (f x) (g y))

-- cong = refl
cong : ∀ {a b} {A : Set a} {B : Set b}
     → (f : A → B) → ∀ {x y} → Path A x y → Path B (f x) (f y)
cong f x=y = Meta.subst id (Path-→ f f) (refl _ f) _ _ x=y

-- subst is the same as last time
subst : ∀ {a b} {A : Set a} (B : A → Set b)
      → {x y : A} → (Path A x y) → B x → B y
subst B {x} {y} p with Meta.subst id (Path-Type (B x) (B y)) (cong B p)
... | lift (fw , bw , _ , _) = fw

-- and paths for dependent functions
postulate Path-Π : ∀ {a b} {A : Set a} {B : A → Set b} (f g : Π A B)
                 → Path (Π A B) f g
                 ≡ ((x y : A) → (pa : Path A x y)
                              → Path (B y) (subst B pa (f x)) (g y))

Of course this doesn't really change anything, since defining refl for function types is no easier than defining cong.

Representation

You might also notice that for all types A (except Set), the structure of Path A is essentially the same as that of A. In fact, for a (non-indexed) data type

data Foo : Set where
  foo₀ : Foo
  foo₁ : A → Foo
  foo₂ : Foo → Foo → Foo

you can mechanically derive its path type to be

data Path Foo : Foo → Foo → Set where
  refl-foo₀  : Path Foo foo₀ foo₀
  cong₁-foo₁ : ∀ {x x'} → Path A x x' → Path Foo (foo₁ x) (foo₁ x')
  cong₂-foo₂ : ∀ {x x' y y'} → Path Foo x x' → Path Foo y y'
                              → Path Foo (foo₂ x y) (foo₂ x' y')

In theory this allows for a nice implementation trick: we can take the representation of x and refl x to be the same. So for example 5 : Path Int 5 5 is a path that asserts that 5 = 5, and it is the only such path.

Originally I thought that an implementation would have to pass cong f along with every parameter f of a function type (which would suck). But in this way we don't have to, since f and cong f are the same function.

This also corresponds nicely to the idea that extra path constructors can be added in Higher Inductive Types. But I am not quite sure yet how that works out.

Food for thought

What is refl _→_?
What is refl refl? Does this even make sense?
For the representation of x : A and refl x to be the same, A and Path A x x also need to have the same representation. That seems to work for functions and inductive types, but what about Set?
Is Path an applicative functor in some sense? With refl as return and cong as ap?

Substitution from congruence in univalent OTT

2013-06-22T14:51:00Z

In this post I will show that in a univalence-style observational type theory, it is enough to take congruence as a primitive, rather than the more complicated substitution or J axioms. This post is literate Agda, so here are some boring import declarations

module subst-from-cong where

open import Level
open import Function
open import Data.Unit
open import Data.Bool
open import Data.Empty
open import Data.Product

I will be using the standard propositional equality as a meta equality,

open import Relation.Binary.PropositionalEquality as Meta using (_≡_)

while postulating a path type (equality type) and its computation rules for me to prove things about,

postulate Path : ∀ {a} → (A : Set a) → A → A → Set a
postulate refl : ∀ {a} → (A : Set a) → (x : A) → Path A x x

The idea of Observational Type Theory (OTT) is that Path is actually defined by case analysis on the structure of the argument type. For the finite types this is simple, there is a path if and only if the values are the same,

postulate Path-⊤ : Path ⊤ tt tt ≡ ⊤

postulate Path-Bool00 : Path Bool false false ≡ ⊤
postulate Path-Bool01 : Path Bool false true ≡ ⊥
postulate Path-Bool10 : Path Bool true false ≡ ⊥
postulate Path-Bool11 : Path Bool true true ≡ ⊤

A path for functions is a function to paths, which also means that we have functional extensionality.

Π : ∀ {a b} (A : Set a) (B : A → Set b) → Set (a ⊔ b)
Π A B = (x : A) → B x

postulate Path-Π : ∀ {a b} {A : Set a} {B : A → Set b} (f g : Π A B)
                 → Path (Π A B) f g ≡ ((x : A) → Path (B x) (f x) (g x))

In their original OTT paper, Altenkirch et al. defined equality for types also by structure matching. I.e. Π types are equal to Π types with equal arguments, Σ types are equal to Σ types, etc. But this is incompatible with the univalence axiom from Homotopy Type Theory. That axiom states that equivalent or isomorphic types are equal. So, what happens if we take isomorphism as our definition of equality between types?

Iso : ∀ {a} → (A B : Set a) → Set a
Iso {a} A B
  = Σ (A → B) \fw →
    Σ (B → A) \bw →
    (∀ x → Path A (bw (fw x)) x) ×
    (∀ y → Path B (fw (bw y)) y)

id-Iso : ∀ {a} → (A : Set a) → Iso A A
id-Iso A = (id , id , refl A , refl A)

postulate Path-Type : ∀ {a} (A B : Set a)
                    → Path (Set a) A B ≡ Lift {a} {suc a} (Iso A B)

Now suppose that we have a congruence, i.e. that all functions preserve paths. So from a path between x and y, we can construct a path between f x and f y for any function f.

-- we have congruence for non-dependent functions
postulate cong : ∀ {a b} {A : Set a} {B : Set b}
               → (f : A → B) → ∀ {x y} → Path A x y → Path B (f x) (f y)

Then this is enough to define substitution, since the paths for a type B x are isomorphisms, and we can apply these in the forward direction

subst : ∀ {a b} {A : Set a} (B : A → Set b) {x y : A} → (Path A x y) → B x → B y
subst B {x} {y} p with Meta.subst id (Path-Type (B x) (B y)) (cong B p)
... | lift (fw , bw , _ , _) = fw

With substitution we can now finally define what paths are for dependent Σ types. A path between pairs is a pair of paths,

postulate Path-Σ : ∀ {a b} {A : Set a} {B : A → Set b} (x y : Σ A B)
                 → Path (Σ A B) x y
                 ≡ Σ (Path A (proj₁ x) (proj₁ y))
                     (\pa → Path (B (proj₁ y)) (subst B pa (proj₂ x)) (proj₂ y))

Substitution is not the most general eliminator for paths. It is not enough to prove properties about paths. For that we need the general induction principle for paths, often called J

J : ∀ {a b} {A : Set a} {x : A} → (B : (y : A) → Path A x y → Set b)
  → {y : A} → (p : Path A x y) → B x (refl A x) → B y p

Unfortunately, I was unable to prove J from just congruence. For that I needed an additional lemma,

postulate subst-refl : ∀ {a} {A : Set a} {x y : A} → (p : Path A x y)
                     → p ≡ subst (Path A x) p (refl A x)

Since Path A is inductively defined, I believe that subst-refl should be provable by case analysis on A, but I have not yet done so. We can now implement J by using subst with a dependent pair. Note that here I have to manually apply the computation rules for Path (Σ _ _) and use the subst-refl lemma.

J {A = A} {x = x} B {y} p
  = subst (uncurry B)
      (Meta.subst id (Meta.sym $ Path-Σ (x , refl A x) (y , p)) $
      (p , Meta.subst (\q → Path (Path A x y) q p) (subst-refl p)
           (refl (Path A x y) p)))

Does it compute

An important question to ask is whether this style of OTT is actually implementable. We can certainly implement the definitions, but would they allow us to compute?

The type Path A certainly reduces, by definition. Similarly, it is not hard to implement refl. The hard part is defining what cong means for various functions, and then proving subst-refl. Somewhere in there we should put the fact that paths are transitive and symmetric, since we have not used that property so far. For what I have done up till now I could equally well have taken Iso A B = A → B.

Here are the implementations of refl,

_≡[_]≡_ : ∀ {a} {A B : Set a} → A → A ≡ B → B → Set a
a ≡[ p ]≡ b = Meta.subst id p a ≡ b

postulate
  refl-⊤     : refl ⊤ tt ≡[ Path-⊤ ]≡ tt
  refl-Bool0 : refl Bool false ≡[ Path-Bool00 ]≡ tt
  refl-Bool1 : refl Bool true  ≡[ Path-Bool11 ]≡ tt
  refl-Π     : ∀ {a b} {A : Set a} {B : A → Set b} (f : Π A B)
             → refl (Π A B) f ≡[ Path-Π f f ]≡ (\x → refl (B x) (f x))
  refl-Type  : ∀ {a} (A : Set a)
             → refl (Set a) A ≡[ Path-Type A A ]≡ lift (id-Iso A)

For refl (Σ _ _) we need yet another lemma, subst-refl₁, which is a bit of a dual to subst-refl, allowing refl in the second argument instead of the third.

postulate
  subst-refl₁ : ∀ {a b} {A : Set a} {B : A → Set b} {x : A} {y : B x}
              → y ≡ subst B (refl A x) y

  refl-Σ : ∀ {a b} {A : Set a} {B : A → Set b} (x : Σ A B)
         → refl (Σ A B) x ≡[ Path-Σ x x ]≡
          (refl A (proj₁ x) ,
           Meta.subst (\x1 → Path (B (proj₁ x)) x1 (proj₂ x))
                      (subst-refl₁ {B = B} {y = proj₂ x})
                      (refl (B (proj₁ x)) (proj₂ x)))

And here is a start of the implementation of cong,

postulate
  cong-const : ∀ {a b} {A : Set a} {B : Set b} {x x'} {y} {p : Path A x x'}
             → cong (\x → y) p ≡ refl B y
  cong-id    : ∀ {a} {A : Set a} {x x'} {p : Path A x x'}
             → cong (\x → x) p ≡ p
  cong-∘     : ∀ {a b c} {A : Set a} {x x'} {p : Path A x x'}
                 {B : Set b} {C : Set c} {f : B → C} {g : A → B}
             → cong (\x → f (g x)) p ≡ cong f (cong g p)
  -- etc.

At some point I think you will also need a dependent cong.

But this is enough postulating for one day.

The complete correctness of sorting

2013-05-23T12:43:33Z

A while ago I set out to prove the correctness of merge sort in Agda. Of course this has been done before. But most proofs you find are far from complete. All they prove is a lemma such as

is-sorted : ∀ (xs : List A) → IsSortedList (sort xs)

Maybe even restricted to lists of natural numbers. While it is nice that a sort function indeed produces a sorted output, that is only half of the story. Consider this function:

cheat-sort : List A → List A
cheat-sort _ = []

Clearly the empty list is sorted. So we are done. What is missing is the second half of correctness of sorting: that the output is a permutation of the input. You want something like:

sort : (xs : List A) → Sorted' xs
record Sorted' (xs : List A) : Set where
  field
    ys       : List A
    isSorted : IsSorted ys
    isPerm   : IsPermutation ys xs

While I was at it, I decided to add the third half of correctness: a bound on the runtime or computational complexity. In the end I was able to define:

insertion-sort : ∀ xs → (Sorted xs) in-time (length xs * length xs)
selection-sort : ∀ xs → (Sorted xs) in-time (length xs * length xs)
merge-sort : ∀ xs → (Sorted xs) in-time (length xs * ⌈log₂ length xs ⌉)

This was not as easy as I would have hoped. In this post I will not bore you with all the details; I'll just go over some of the highlights. The full code is on GitHub.

What it means to be sorted

There are roughly two ways to define sorted lists that I know of:

Parametrize the sorted list by a lower bound on the values it contains. For a cons cell the head should be at least the lower bound, and the tail should in turn be bounded below by the head. This requires the type to have a smallest element, but you can adjoin -∞ with a new datatype.
Parametrize the sorted list by a list of all values in it. For a cons cell require that the head is smaller than all the values in the tail.

Since I already need to parametrize by all values in the list to show that the sorted list contains a permutation of them, I went with the second approach:

-- A proof that x is less than all values in xs
data _≤*_ (x : A) : List A → Set where
  []  : x ≤* []
  _∷_ : ∀ {y ys} → (x ≤ y) → x ≤* ys → x ≤* (y ∷ ys)

-- Proof that a list is sorted
data IsSorted : List A → Set where
  []  : IsSorted []
  _∷_ : ∀ {x xs} → x ≤* xs → IsSorted xs → IsSorted (x ∷ xs)

What it means to be a permutation

To show that one list is a permutation of another I again used two data types. Suppose that we know that xs is a permutation of ys. Then when is x ∷ xs a permutation of some list xys? Well, we can permute xs to ys, and insert x anywhere. I used ◂ to denote this insertion,

-- x ◂ xs ≡ xys means that xys is equal to xs with x inserted somewhere
data _◂_≡_ (x : A) : List A → List A → Set a where
  here  : ∀ {xs}           → x ◂ xs ≡ (x ∷ xs)
  there : ∀ {y} {xs} {xys} → (p : x ◂ xs ≡ xys) → x ◂ (y ∷ xs) ≡ (y ∷ xys)

-- Proof that a list is a permutation of another one
data IsPermutation : List A → List A → Set a where
  []  : IsPermutation [] []
  _∷_ : ∀ {x xs ys xys}
      → (p : x ◂ ys ≡ xys)
      → (ps : IsPermutation xs ys)
      → IsPermutation (x ∷ xs) xys
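The two data types can also be read as a recipe for enumerating permutations. The following Haskell sketch (the names inserts and perms are assumptions made for illustration) lists every xys for which x ◂ xs ≡ xys holds, and builds all permutations from that, mirroring the here/there and []/_∷_ constructors.

```haskell
-- All ways to insert x somewhere into a list.
-- The first result corresponds to 'here', the mapped ones to 'there'.
inserts :: a -> [a] -> [[a]]
inserts x []       = [[x]]
inserts x (y : ys) = (x : y : ys) : map (y :) (inserts x ys)

-- All permutations: permute the tail, then insert the head anywhere.
perms :: [a] -> [[a]]
perms []       = [[]]
perms (x : xs) = concatMap (inserts x) (perms xs)
```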

Now the Sorted' data type has three components: the sorted list, a proof that it is sorted, and a proof that it is a permutation of the input. These parts are either all [], or they are all _∷_. It turns out to be much nicer to combine the parts together,

-- Sorted permutations of a list
data Sorted : List A → Set  where
  []   : Sorted []
  cons : ∀ x {xs xxs}
       → (p : x ◂ xs ≡ xxs) -- inserting x somewhere into xs gives xxs
       → (least : x ≤* xs)  -- x is the smallest element of the list
       → (rest : Sorted xs) -- and we have also sorted xs
       → Sorted xxs

Of course Sorted and Sorted' are equivalent.

As an aside, these are all the ingredients necessary for proving

sorted-unique : ∀ {xs} → (ys zs : Sorted xs)
              → sorted-to-List ys ≡ sorted-to-List zs

A monad for keeping track of the runtime

To be able to reason about the runtime, as measured in the number of comparisons performed, I decided to use a monad. The type is simply

data _in-time_ (A : Set) (n : ℕ) : Set a where
  box : A → A in-time n

The constructor box is private, and it can only be accessed through the standard monad operations,

return : ∀ {A n} → A → A in-time n

_>>=_ : ∀ {A B} {m n} → A in-time n → (A → B in-time m) → B in-time (n + m)

Then the sorting functions will be parametrized by a function that for some partial order decides between x ≤ y and y ≤ x in one step, using the monad we defined above:

module Sorting
    {A : Set} {l} {_≤_ : Rel A l}
    (isPartialOrder : IsPartialOrder _≡_ _≤_) 
    (_≤?_ : (x y : A) → (x ≤ y ⊎ y ≤ x) in-time 1)
  where ...

Note that I specify that _≤_ is a partial order, because the Agda standard library definition of a total order actually comes with a function

total : ∀ x y → (x ≤ y) ⊎ (y ≤ x)

which would defeat the whole purpose of _≤?_. In fact, the standard TotalOrders are decidable up to base equality, and if the base equality is propositional equality, then they are decidable. I.e.

total-decidable : ∀ {a r} {A : Set a} → (_≤_ : Rel A r)
                → IsTotalOrder _≡_ _≤_
                → IsDecTotalOrder _≡_ _≤_

See the source for the proof of this side theorem. It relies on a trick to show that total x y can only be different from total y x if x ≢ y, which holds for propositional equality, but not in general.
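Returning to the runtime monad: the idea of charging one step per comparison can be sketched in Haskell. Note that this sketch is an assumption-laden simplification: the count is an ordinary run-time value here, whereas in the Agda code the bound is part of the type. The names Counted, leq, insert and insertionSort are hypothetical.

```haskell
-- A result paired with the number of comparisons used to compute it.
newtype Counted a = Counted (Int, a)

runCounted :: Counted a -> (Int, a)
runCounted (Counted p) = p

instance Functor Counted where
  fmap f (Counted (n, a)) = Counted (n, f a)

instance Applicative Counted where
  pure a = Counted (0, a)                      -- return costs nothing
  Counted (m, f) <*> Counted (n, a) = Counted (m + n, f a)

instance Monad Counted where
  -- Sequencing adds up the costs, like n + m in the type of _>>=_.
  Counted (n, a) >>= k = let (m, b) = runCounted (k a) in Counted (n + m, b)

-- The only operation that costs anything: one comparison, one step.
leq :: Ord a => a -> a -> Counted Bool
leq x y = Counted (1, x <= y)

-- Insert into a sorted list, using at most (length ys) comparisons.
insert :: Ord a => a -> [a] -> Counted [a]
insert x [] = return [x]
insert x (y : ys) = do
  ok <- leq x y
  if ok then return (x : y : ys)
        else (y :) <$> insert x ys

insertionSort :: Ord a => [a] -> Counted [a]
insertionSort []       = return []
insertionSort (x : xs) = insertionSort xs >>= insert x
```

Running runCounted (insertionSort xs) gives the sorted list together with a comparison count, which can then be checked against length xs * length xs.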

Logarithms

To be able to complete the specification of merge sort, we still need to add some missing functions on natural numbers. In particular, we need a logarithm. This logarithm turns out to be surprisingly tricky to define in Agda. Why? Because the usual definition uses non-structural recursion. In Haskell you would write

-- @log n@ calculates ⌈log₂ (n+1)⌉
log 0 = 0
log n = 1 + log (n `div` 2)

But Agda is not able to see that n `div` 2 (or in Agda notation, ⌊ n /2⌋) is smaller than n. There are two approaches to circumvent this problem:

Use a different algorithm: Convert n to a binary representation, and count the number of digits.
Use well-founded recursion, manually supplying a proof that ⌊ n /2⌋ < n.

I went with the second option, because I will also be using the same shape of recursion inside merge sort itself. The standard way to use well-founded recursion is through the function <-rec, which works a bit like fix in Haskell, except that you need to pass in a proof that the argument is smaller. The code would look like this:

log = <-rec log′
  where
  log′ self 0 = 0
  log′ self (suc n) = 1 + self ⌊ suc n /2⌋ ({-proof omitted-})

But this leads to a problem as soon as you want to prove a property of logarithms. For example, you would think that log (suc n) ≡ 1 + (log ⌊ suc n /2⌋). But that is not definitionally true, since one <-rec is not like another. I found that the well-founded recursion library was in general a pain to work with, especially because it uses so many type synonyms. My solution was to use the slightly lower level accessibility relation. A value of type Acc _<′_ n allows you to do recursion with any m <′ n. Now I can use actual recursion:

log-acc : ∀ n → Acc _<′_ n → ℕ
log-acc 0 _ = 0
log-acc (suc n) (acc more) = 1 + log-acc ⌊ suc n /2⌋ (more _ {-proof omitted-})

And use the well-foundedness of ℕ to get an Acc for any number:

log : ℕ → ℕ
log n = log-acc n (<-well-founded n)

⌈log₂_⌉ : ℕ → ℕ
⌈log₂ n ⌉ = log (pred n)
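As a sanity check on these definitions, the following Haskell sketch compares the halving recursion against a direct specification of the rounded-up logarithm. The names logH, ceilLog2 and ceilLog2Spec are hypothetical, and Integer is used in place of ℕ, so the predecessor has to be clamped at zero by hand.

```haskell
-- The halving recursion from above, on non-negative Integers.
logH :: Integer -> Integer
logH 0 = 0
logH n = 1 + logH (n `div` 2)

-- Counterpart of ⌈log₂_⌉. In Agda pred 0 is 0; here we clamp explicitly.
ceilLog2 :: Integer -> Integer
ceilLog2 n = logH (max 0 (n - 1))

-- Specification: the least k such that 2^k >= n.
ceilLog2Spec :: Integer -> Integer
ceilLog2Spec n = head [k | k <- [0 ..], 2 ^ k >= n]
```

Checking that ceilLog2 and ceilLog2Spec agree on an initial range of inputs is of course only a test, not a proof.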

There is still a snag when proving properties of log or log-acc, namely that you need to prove that (more n ...) ≡ <-well-founded n. But the accessibility relation doesn't actually matter for the computation, so I decided to just postulate

postulate acc-irrelevance : ∀ {n : ℕ} → {a b : Acc _<′_ n} → a ≡ b
 -- this also follows from function extensionality

If anyone knows a better way to prove properties of functions defined with well-founded recursion, I am open to suggestions.

Vectors versus lists

While working on the proofs I had to choose: Do I use fixed length Vecs or variable length Lists? Both have their pros and cons.

On the one hand, the sorting functions with vectors look a bit nicer, because we can use n instead of length xs:

merge-sort : ∀ {n} (xs : Vec A n) → Sorted xs in-time (n * ⌈log₂ n ⌉)

Additionally, with lists we can only do recursion on the input list, whereas with vectors we can do recursion on the length of the list. The former works fine for insertion sort, where in each step you do something with the head element of the list; but it fails for selection and merge sort.

On the other hand, with vectors you sometimes can't even state the property that one vector is equal to another. For the term xs ≡ ys ++ zs to be well-typed, xs must have the type Vec A (m + n).

I went back and forth a couple of times between vectors and lists. In the end I settled for using vectors only when needed, and specifying properties in terms of lists. For example the split function for merge sort has the type

splitHalf : ∀ {n} → (xs : Vec A n)
          → ∃₂ \(ys : Vec A ⌈ n /2⌉) (zs : Vec A ⌊ n /2⌋)
               → toList ys ++ toList zs ≡ toList xs

So instead of using Vec._++_, I use List._++_. In this style 'select' from selection sort looks like

select : ∀ {n} (xs : Vec A (suc n))
       → (∃₂ \y ys → (y ◂ toList ys ≡ toList xs) × (y ≤* toList ys)) in-time n

I.e. given a vector xs with n+1 elements, return a vector ys with n elements, such that inserting y into it gives us back xs. And this item y should be the smallest one.
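For comparison, here is what the specification of splitHalf amounts to on plain Haskell lists, where the lengths are not tracked by the type and have to be checked separately. This is only a sketch; splitHalfOk is a hypothetical helper that restates the three properties from the Agda type.

```haskell
-- Split a list into a first half of length ⌈n/2⌉ and a second half of length ⌊n/2⌋.
splitHalf :: [a] -> ([a], [a])
splitHalf xs = splitAt ((length xs + 1) `div` 2) xs

-- The properties that the Agda type of splitHalf guarantees statically.
splitHalfOk :: Eq a => [a] -> Bool
splitHalfOk xs =
  let (ys, zs) = splitHalf xs
      n        = length xs
  in length ys == (n + 1) `div` 2   -- ⌈ n /2⌉
     && length zs == n `div` 2      -- ⌊ n /2⌋
     && ys ++ zs == xs
```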

Extension: expected runtime

An extension of this post would be to look at randomized sorting algorithms. In particular, quick sort with a randomly chosen pivot has expected runtime O(n * log n). At first I thought that all that would be needed is a function

expected : ∀ {P}
         → (ns : List ℕ)             -- A list of numbers
         → All (\n → P in-time n) ns -- for each n we have P in-time n
         → P in-time ⌈mean ns ⌉      -- then the expected time is the mean of ns

But that is not quite right, since if we actually knew the runtimes ns we could just pick the fastest one. With the randomized quicksort you will end up in a situation where you have two or more computations to choose from, and you know that some are faster than the others, but you don't yet know which one. That sounds a bit classical. A second idea is to return the runtimes at a later time, something like

expected : ∀ {P} {long-time}
         → (xs : List (∃ \n → P in-time n) in-time long-time)
         → P in-time ⌈mean map proj₁ xs ⌉

But this is not quite right either, since after long-time computing P (i.e. a sorting) can be done in 0 time. Rather, we need to decouple the proof about the runtime from the computation. This is not possible with the _in-time_ monad. We would need to get rid of the runtime from the type, and store it as a value instead.

I have tried redoing the proofs in this post with the monad

data Timed (A : Set) : Set a where
  _in-time_ : A → ℕ → Timed A
runtime : Timed A → ℕ

But I didn't succeed; I ended up with the baffling error message

runtime (big-lambda-term (unbox (x ≤? u)))
!=
runtime (big-lambda-term (unbox (x ≤? u)))

Another extension: lower bound on runtime

So far I have proved that you can sort a list in time n * log n. It would also be interesting to look at the well known lower bound on the runtime of sorting, and prove a theorem such as

can't-sort-in-linear-time : ¬ ∃ \k → ∀ xs → Sorted xs in-time k * length xs

Unfortunately this statement is not actually true for all types. For finite sets you actually can sort in linear time with counting sort. It also fails if we happen to have some decidable total order for that type lying around. But it might be possible to prove

can't-sort-in-linear-time
  : (no-fast-compare : ∀ x y → (x ≤ y ⊎ y ≤ x) in-time 0 → x ≡ y)
  → ¬ ∃ \k → ∀ xs → Sorted xs in-time k * length xs

But you have to be really careful with a term like no-fast-compare, because inside the runtime monad we do have values of type (x ≤ y ⊎ y ≤ x). And so you can derive ∀ x y → x ≡ y in-time 1, and therefore also ⊥ in-time 1 for non trivial types. Which certainly looks wrong to me.

I don't know a way around this problem, but it might be related to the same issue as expected runtime. I.e. the problem is that all information about the runtime is bundled together with the return value. The lower bound proof essentially asks to sort a 'random' list, and by a counting argument shows that at least a certain number of comparisons are needed to be able to produce all outputs.

Categories over pairs of types

2012-07-26T19:33:00Z

Today Dan Burton remarked that Pipe is a category-like thing, and to express it we would need "type bundling". I myself said something similar a while ago. More formally, rather than a category where the objects are Haskell types, we have a category where the objects are pairs of types.

It turns out that with a bunch of recent GHC extensions we can actually write this in Haskell.

{-# LANGUAGE PolyKinds, DataKinds, KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses, TypeFamilies #-}

For the purposes of this blog post I'll use a dummy type for Pipe; there are plenty of other blog posts that give an actually functional one. The important thing to note is that in the type of (>+>), there are two types that are composed over, input/output io and upstream/downstream result ur.

-- Ceci n'est pas une pipe
data Pipe i o u m r = Pipe { runPipe :: Either i u -> m (Either o r) }

(>+>) :: Monad m
      => Pipe io₁ io₂ ur₁ m ur₂
      -> Pipe io₂ io₃ ur₂ m ur₃
      -> Pipe io₁ io₃ ur₁ m ur₃
(>+>) (Pipe f) (Pipe g) = Pipe (f >=> g)

idP :: Monad m => Pipe i i r m r
idP = Pipe return

With the PolyKinds extension we can make a variant of Category that works for tuples of types as well as for normal types. This class looks exactly the same as the normal one:

class Category cat where
    id :: cat a a
    (.) :: cat b c -> cat a b -> cat a c

But because of PolyKinds it magically becomes more general. You can see this by comparing their kinds

λ> :kind Category
Category :: (AnyK -> AnyK -> *) -> Constraint
λ> :kind Control.Category.Category
Control.Category.Category :: (* -> * -> *) -> Constraint

With DataKinds it becomes possible to have tuples of types, which are written as '(Type1,Type2). Unfortunately we can not (yet?) pattern match on these directly in data declarations. So we need type families to unwrap them:

type family Fst (xy :: (*,*)) :: *
type family Snd (xy :: (*,*)) :: *
type instance Fst '(x,y) = x
type instance Snd '(x,y) = y

Note that the kind signatures are necessary. Without them GHC will give errors like

Couldn't match kind `BOX' against `*'

With these type functions in hand we can write

newtype WrapPipe m iu or = WrapPipe
     { unWrapPipe :: Pipe (Fst iu) (Fst or) (Snd iu) m (Snd or) }

instance Monad m => Category (WrapPipe m) where
    id = WrapPipe idP
    x . y = WrapPipe (unWrapPipe y >+> unWrapPipe x)

And that's it. We now have a category whose objects are not Haskell types, but rather pairs of Haskell types. In GHC's terms, an instance of Category (*,*) instead of Category *. The kind parameter is why we need MultiParamTypeClasses.

With this same trick we can also define Category instances for product categories and lens families. Or going the other way, you can wrap Monoids as a Category over objects of kind (). You could even go one step further and have a category for lists of functions of different types.

There is a big downside, however. And that is that the type inference engine is not able to see past the type families. You need to give an explicit type annotation on the wrapped pipe. Compare

λ> type MyWPipe = WrapPipe IO '(Int,String) '(Int,String)
λ> runPipe (unWrapPipe (id . id :: MyWPipe)) $ Right "done"
Right "done"

with

λ> type MyPipe = Pipe Int Int String IO String
λ> runPipe (unWrapPipe (id . id) :: MyPipe) $ Right "done"
<interactive>:2:10:
   Couldn't match type `Fst or0' with `Int'
   blah, blah, blah, etc.

This makes sense, since GHC doesn't know that there are no other instances of Fst and Snd. Ideally we would like to write

newtype WrapPipe m '(i,u) '(o,r) = WrapPipe { unWrapPipe :: Pipe i o u m r }
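As an illustration of the remark that Monoids can be wrapped as a Category over objects of kind (), here is a minimal sketch. It assumes a reasonably recent GHC, where Control.Category from base is already poly-kinded, so no custom Category class is needed; the name MonoidCat is hypothetical.

```haskell
{-# LANGUAGE PolyKinds, DataKinds, KindSignatures #-}
import Prelude hiding (id, (.))
import Control.Category

-- A monoid value viewed as an arrow between objects of kind ().
-- The two phantom parameters can only ever be the promoted unit '().
newtype MonoidCat m (a :: ()) (b :: ()) = MonoidCat { getMonoidCat :: m }

instance Monoid m => Category (MonoidCat m) where
  id = MonoidCat mempty                          -- identity arrow = unit element
  MonoidCat x . MonoidCat y = MonoidCat (x <> y) -- composition = monoid operation
```

Since there is only one object, no type families are needed here, and type inference has nothing to get stuck on.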

Benchmark: unpacked values in containers

2012-06-08T21:59:00Z

Inspired by a discussion on the GHC mailing list, I wondered how much performance can be gained by specializing and unboxing certain data types. In particular, I looked at Data.Map. Suppose that you have a map from ints to ints. First of all, you should be using Data.IntMap instead, but that is beside the point.

If you know that the keys and values are always strict integers, then the data type could be specialized from

data Map k a
    = Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)
    | Tip

to

data MapIntInt
    = Tip
    | Bin {-# UNPACK #-} !Size {-# UNPACK #-} !Int {-# UNPACK #-} !Int
          !(MapIntInt) !(MapIntInt)

It would be great if this could be generated automatically by the compiler. But as was pointed out, that is really hard to do, because the size of the constructors would change, depending on the type arguments. So generic functions become impossible. It would also require multiple different info tables for the garbage collector, among other problems.

So, it's probably easier to do this specialization manually. I was thinking of using Template Haskell, in combination with type families. This would allow you to write something like

deriveSpecializedUnboxedType [d|type UnboxedMapIntInt = Map !Int !Int |]

but before going there, let's first see whether this is worth the effort at all.

So, I did the specialization by hand for Map Int Int, and ran the containers benchmarks. Here is a representative part of the results,

Click for the full criterion report. The horribly hacky code is available on GitHub.

In this graph

generic = generic Map Int Int.
unboxed = Map with both key and value specialized to strict and unpacked Int.
gintmap = value generic IntMap Int
uintmap = IntMap with values specialized to unpacked Int.

As you can see, specializing and unboxing gives a modest performance improvement. There is probably also an improvement in memory usage, but this benchmark doesn't directly measure that. Switching to a better data structure, i.e. patricia tries instead of balanced trees, helps a lot more for some benchmarks, such as delete, but very little for others, such as map.

Overall, it seems like specialization can definitely be worth it; in some cases improving performance by 40%. And it never has a negative impact, at least in this benchmark. Real life might be different though, especially if there are also Maps with other types of keys and values around.

Note also that this benchmark was compiled for a 32-bit architecture. On 64-bit, pointers and hence boxed values have more overhead.

Building pipes with monad transformers

2012-06-02T23:32:00Z

In this post I show another way to implement pipes, by combining a producer and consumer monad transformer. This implementation is for educational and entertainment purposes only: you probably shouldn't try to use it in production software. To quote Donald Knuth: I have only proved it correct, not tried it. One obvious thing that is missing is finalization, but that could be added by passing along a finalizer with each call to yield, as described by Gabriel Gonzalez.

Producers

Let's start with producers. A producer can produce a stream of values of type o, and then ends with a value of type a. At each step, it performs a monad action.

data ProducerT' o m a = Done a | More o (ProducerT o m a)
newtype ProducerT o m a = ProducerT { runProducerT :: m (ProducerT' o m a) }

This monad transformer is similar to the ListT-done-right monad transformer. The difference is that the producer has a value at the end, while ListT ends with an empty list. More importantly, ListT is a monad over the list items, while ProducerT is a monad over the end value. It can't be a monad over the stream values, because then return would have to conjure a value of type a out of nowhere.

The Monad and MonadTrans instances are straightforward:

instance Monad m => Monad (ProducerT o m) where
    return = ProducerT . return . Done
    a >>= b = ProducerT $ runProducerT a >>= bind'
      where
        bind' (Done x) = runProducerT (b x)
        bind' (More o k) = return (More o (k >>= b))

instance MonadTrans (ProducerT o) where
    lift = ProducerT . liftM Done

The point of producers is that they can produce values. So, let's make a function for that

yield :: Monad m => o -> ProducerT o m ()
yield x = ProducerT $ return $ More x (return ())
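To see a producer in action, the definitions above can be assembled into a self-contained module. This sketch adds the Functor and Applicative instances that current versions of GHC require, and a hypothetical helper collect that runs a producer to completion; neither is part of the original post.

```haskell
import Control.Monad (ap, liftM)
import Data.Functor.Identity (Identity, runIdentity)

data ProducerT' o m a = Done a | More o (ProducerT o m a)
newtype ProducerT o m a = ProducerT { runProducerT :: m (ProducerT' o m a) }

-- Boilerplate instances required by modern GHC.
instance Monad m => Functor (ProducerT o m) where
  fmap = liftM
instance Monad m => Applicative (ProducerT o m) where
  pure = ProducerT . return . Done
  (<*>) = ap

instance Monad m => Monad (ProducerT o m) where
  a >>= b = ProducerT $ runProducerT a >>= bind'
    where
      bind' (Done x)   = runProducerT (b x)
      bind' (More o k) = return (More o (k >>= b))

yield :: Monad m => o -> ProducerT o m ()
yield x = ProducerT $ return $ More x (return ())

-- Hypothetical helper: run a producer, gathering the stream and the end value.
collect :: Monad m => ProducerT o m a -> m ([o], a)
collect p = runProducerT p >>= step
  where
    step (Done a)   = return ([], a)
    step (More o k) = do
      (os, a) <- collect k
      return (o : os, a)

-- A producer that yields four numbers and then ends with a String.
example :: ProducerT Int Identity String
example = do
  yield 1
  yield 2
  mapM_ yield [3, 4]
  return "end"
```

With the Identity base monad, runIdentity (collect example) evaluates to the yielded stream paired with the end value.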

Given a producer, we can try to extract the first value. This succeeds if the stream is not empty, otherwise it returns the end value. In both cases we also return the remaining producer:

headProducerT :: Monad m => ProducerT o m a -> m (Either a o, ProducerT o m a)
headProducerT = liftM step . runProducerT
  where
    step (Done x) = (Left x, return x)
    step (More o k) = (Right o, k)

This head function will form the building block for building consumers. If you look at the function's type, you might notice that it is very similar to that of a state monad. We could imagine a consumer as something that keeps track of the input producer, and repeatedly takes the head of it. So, a first idea might be

type ConsumerT' i t m = StateT (ProducerT i m t) m
await' :: Monad m => ConsumerT' i t m (Either t i)
await' = StateT headProducerT

This seems to work. We can compose a producer and a consumer very easily with evalStateT:

compose_pc :: Monad m => ProducerT i m t -> ConsumerT' i t m a -> m a
compose_pc = flip evalStateT

In terms of pipes, we have composed a pipe with no input together with a pipe that produces no output, to give a 'pipe' with neither input nor output.

Pipes

A general pipe is a computation that is both a producer and a consumer. There are two obvious ways of building one: with the consumer on the outside, or with the producer on the outside.

type Pipe_CP i a o m b = ConsumerT' i a (ProducerT o m) b
type Pipe_PC i a o m b = ProducerT o (ConsumerT' i a m) b

These types are not the same. Pipe_CP first consumes a whole bunch of input, and then produces a whole bunch of output. In particular, it is impossible to stop early. In Pipe_PC, the operations are interleaved; before each output there can be more consuming. This second formulation is therefore the one that we want.

Before doing fully general composition, let's first compose a producer with a pipe,

compose_p :: Monad m => ProducerT b m s -> Pipe_PC b s c m t -> ProducerT c m t

Remember that a value p₁ of type Pipe_PC can look something like this:

p₁ = ProducerT $ StateT $ \s₁ -> act >> return (More o₁ p₂, s₂)

As before, the upstream producer is the state for the downstream consumer. So, we can fill in the upstream producer for s₁. Once we do so, we get access to s₂, which should be filled into p₂, etc. In this way we turn the pipe from a ProducerT o (StateT .. m) a into a ProducerT o m a. So more generally, we change the base monad of a monad transformer.

class MonadTransRebase t where
    rebase :: NestTrans m n -> t m a -> t n a

instance MonadTransRebase (ProducerT o) where
    rebase f = ProducerT . runNestTrans f rebase' . runProducerT
      where
        rebase' f' (Done d) = Done d
        rebase' f' (More x k) = More x (rebase f' k)

The type NestTrans is a function from m a to n b, where the transformation inside the monadic values can use a different NestTrans. Hence the 'nested' part of the name.

newtype NestTrans m n = NestTrans
    { runNestTrans :: forall a b. (NestTrans m n -> a -> b) -> (m a -> n b) }

As said above, given the initial state, we can pass this state through the transformer. Then the new state is used for nested StateT computations.

nestTransStateT :: Monad m => s -> NestTrans (StateT s m) m
nestTransStateT s = NestTrans $ \f m ->
    liftM (\(a,s') -> f (nestTransStateT s') a) (runStateT m s)

This is all we need to define the composition:

compose_p u v = rebase (nestTransStateT u) v

Note that it is possible to write all this without the NestTrans newtype, but to do so generically requires rank 3 types (the first time that I have ever needed those). I leave that solution as an exercise to the reader.

Consumers, take 2

Now let's also try to do this the other way around, and compose a pipe with a consumer,

compose_c :: Monad m
         => Pipe_PC a r b m s -> ConsumerT' b s m t -> ConsumerT' a r m t

But immediately we hit a problem. The downstream consumer expects its state to be of type ProducerT b m s, but the upstream has type ProducerT b (ConsumerT .. m) s. This is still a producer, but over a different base monad. In fact, the upstream producer's base monad is of the form (t m), where t is another monad transformer. We can't just get rid of the ConsumerT, like we did on the downstream side, because we still need to be able to pass in the state later on.

The solution is to make the state type more general, and allow it to be ProducerT over any transformation of a given base monad. Effectively we replace the state type s m a by forall t. s (t m) a. This gives us the transformed state monad:

data TStateT s a m b = TStateT
    { runTStateT :: forall t. (MonadTrans t, Monad (t m))
                 => s (t m) a -> t m (b, s (t m) a) }

Note that s is not a state type, but a state monad transformer. The instances are straightforward, and look identical to the instances for StateT, with the exception of an extra lift in the MonadTrans instance.

instance Monad (TStateT s t m) where
    return a = TStateT $ \s -> return (a, s)
    m >>= k = TStateT $ \s -> do
        (a,s') <- runTStateT m s
        runTStateT (k a) s'

instance MonadTrans (TStateT s t) where
    lift mx = TStateT $ \s -> lift $ liftM (\x -> (x,s)) mx

The new consumer type is just a TStateT with ProducerT as the state:

type ConsumerT i = TStateT (ProducerT i)
type GPipe i o m = o (i m)
type Pipe i a o m = GPipe (ConsumerT i a) (ProducerT o) m

Awaiting looks much like before,

await :: Monad m => Pipe i t o m (Either t i)
await = lift $ TStateT headProducerT

All we need to do now to define composition is to make a NestTrans for TStateT. The function to do this is essentially the same as nestTransStateT above:

nestTransTStateT :: (Monad (t m), MonadTrans t)
                 => s (t m) a -> NestTrans (TStateT s a m) (t m)
nestTransTStateT s = NestTrans $ \f m ->
    liftM (\(a,s') -> f (nestTransTStateT s') a) (runTStateT m s)

and by magic, we get composition:

compose :: Monad m => Pipe a r b m s -> Pipe b s c m t -> Pipe a r c m t
compose = rebase . nestTransTStateT

General consumers and producers

There is nothing specific to ConsumerT or ProducerT in the composition function. All we require is that the 'consumer' on the left is a monad transformer, and that the 'producer' on the right can be rebased. This leads to the more general type of compose:

compose :: (MonadTransRebase t, MonadTrans r, Monad (r m))
        => GPipe r s m a -> GPipe (TStateT s a) t m b -> GPipe r t m b

There are some interesting choices for r, s and t here. By picking r = IdentityT, we get an upstream 'pipe' with no input, i.e. a producer. By picking t = IdentityT, we get a downstream 'pipe' with no output, i.e. a consumer.

Finally, the transformer s determines what information is passed between the two pipes. By using ProducerT o you get a stream of os followed by an a at the end. If you use ListT, there is a stream of as with no value at the end. If you use IdentityT, just a single value is passed, so you get function composition. If you use InfiniteListT you get a producer that guarantees that it gives an infinite stream of values. And I believe it should also be possible to define more complex protocols, such as "first give 10 values of type a, then an unlimited number of b, and end with a c". However, you do need a different await function for all of these.

To close things off, here are the producers and consumers based on IdentityT.

instance MonadTransRebase IdentityT where
    rebase f = IdentityT . runNestTrans f (const id) . runIdentityT

type ProducerPipe o m = GPipe IdentityT (ProducerT o) m
type ConsumerPipe i a m = GPipe (ConsumerT i a) IdentityT m
type Pipeline m = GPipe IdentityT IdentityT m

runPipeline :: Pipeline m a -> m a
runPipeline = runIdentityT . runIdentityT

What to do with the results of upstream pipes

2012-04-04T19:35:00Z

In the pipes library, the type of the composition operator is

(>+>) :: Pipe m a b r -> Pipe m b c r -> Pipe m a c r

If you look closely, then you will notice that all three pipes have result type r. How does this work? Simple: whichever pipe stops first provides the final result.

In my opinion this is wrong. The upstream pipe produces values, and the downstream pipe does something with them. The downstream pipe is the one that leads the computation, by pulling results from the upstream pipe. It is therefore always the downstream pipe that should provide the result. So, in the pipification of conduit, the proposed type for composition is instead

(>+>) :: Pipe m a b () -> Pipe m b c r -> Pipe m a c r

This makes it clear that the result of the first pipe is not used, the result of the composition always has to come from downstream. But now the result of the first pipe would be discarded completely.

Another, more general, solution is to communicate the result of the first pipe to the second one. That would give the await function in the downstream pipe the type

await :: Pipe m a b (Either r₁ a)

where r₁ is the result of the upstream pipe. Of course that r₁ type needs to come from somewhere. So Pipe would need another type argument

data Pipe m stream_in stream_out final_in final_out

giving await the type

await :: Pipe m a b x (Either x a)

Composition becomes

(>+>) :: Pipe m a b x y -> Pipe m b c y z -> Pipe m a c x z

I think this makes Pipe into a category over pairs of Haskell types. I was tempted to call this a bicategory, in analogy with bifunctor, but that term apparently means something else.

Note that this article is just about a quick idea I had. I am not saying that this is the best way to do things. In fact, I am not even sure if propagating result values in this way actually helps solve any real world problems.

My blog software

2012-03-28T21:13:00Z

In this post I will explain the software behind my blog, since several visitors have asked about it. But I will have to disappoint those of you hoping for fancy Haskell code: it is written in PHP. So no pandoc, no hakyll, no happstack and no Yesod.

Some reasons for picking PHP are:

PHP works pretty much everywhere.
PHP hosting is very cheap.
I already had much of the code lying around from other website projects.
I honestly think that PHP is a better tool for a simple hacked together website like this.

From files to blog posts

The code is inspired by blosxom, which I used before. Blog posts, as well as most other pages on the website, are stored in text files. The script blog.php calls a function Resolver::find_all_pages, which does a simple directory listing to find all the .txt and .lhs files in the blog subdirectory.

For me this makes it very easy to write a new blog post. I just fire up a text editor, and save the file under the right name. I can then view it on localhost. To publish to the rest of the internet I copy the file to the webserver.

The parser for the text files is also quite simple. First comes a header block, with lines like

title: My blog software
tags: blog, bananas
date: 2012-03-28 23:13 CEST

After that is the page body, using a markup syntax strongly inspired by MediaWiki.

-- This is a header --
Body text with ''italic'', @embedded haskell@, <a href="#">embedded html</a>

> haskell_code = 1 + 1
>   where more haskell = undefined

]> also Haskell, but ignored by the literate haskell preprocessor

To parse the markup, I use a state machine. I loop over the lines, and determine the line's type. For example, lines starting with "> " indicate Haskell code, -- (.*) -- indicates a header, etc. I then just output the appropriate html code for that line. The state machine comes in when merging multiple lines of code into one <pre> tag. I just keep a variable with the last used open tag. If the previous line uses the same open tag, then do nothing, otherwise insert the close tag for that state, and the open tag for the new state. The details are in WikiFormat.php.
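The actual implementation is PHP, in WikiFormat.php. Purely to illustrate the open-tag state machine described above, here is a rough Haskell sketch. The line types, the tag names and the functions classify and render are all assumptions made for this example, and HTML escaping is left out.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- The kinds of line this sketch distinguishes.
data LineType = Code | Header | Para deriving (Eq, Show)

-- Determine a line's type, and strip the markup that identified it.
classify :: String -> (LineType, String)
classify l
  | "> " `isPrefixOf` l = (Code, drop 2 l)
  | "-- " `isPrefixOf` l && " --" `isSuffixOf` l && length l >= 6
                        = (Header, take (length l - 6) (drop 3 l))
  | otherwise           = (Para, l)

-- Open and close tag for each line type (assumed tag names).
tags :: LineType -> (String, String)
tags Code   = ("<pre>", "</pre>")
tags Header = ("<h2>", "</h2>")
tags Para   = ("<p>", "</p>")

-- Emit output lines, remembering the last opened tag. If the next line has
-- the same type nothing is inserted; otherwise close the old tag, open the new.
render :: [String] -> [String]
render = go Nothing . map classify
  where
    go prev [] = closeTag prev
    go prev ((t, body) : rest)
      | prev == Just t = body : go prev rest
      | otherwise      = closeTag prev ++ [fst (tags t), body] ++ go (Just t) rest
    closeTag Nothing  = []
    closeTag (Just t) = [snd (tags t)]
```

Two consecutive "> " lines end up inside a single <pre> element, because the second one sees that the open tag has not changed.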

Then the source code itself. It needs to be turned into fancy syntax highlighted html. That is done with a simple hand-written lexer. Lexing is surprisingly easy if you have access to a built-in regular expression library. Just repeatedly look for the first match after the current index for a set of possible token regexes.

There are some backdoors in the lexer, to allow arbitrary html inside code blocks, so

]> !!!<span style="background:red">wrong</span>!!!

gets rendered as

wrong

This sometimes comes in handy when writing blog posts. Usually I add these backdoors as they are needed.

One issue is that all this on-the-fly parsing can be a bit slow. For that I use a cache. I just capture the entire rendered page, and save it in a file. Then before rendering, and in fact before even loading the file, I check if the cache is up to date. If it is, output the cache contents and exit.

Finally the comments, which again use a simple hand-written solution. I just store the comments for each post in a single text file. New comments are appended at the end. The comments use the same markup parser as the article bodies. The most annoying part of the comment system is actually the spam filter. I have a blacklist of words and urls that are not allowed, and a script for retroactively removing spam posts. But some spam does get through.

That's it. The code is on github, if anyone is interested.

Conduits vs. Pipes

2012-03-24T14:31:00Z

Michael Snoyman released conduit-0.3 this week. The conduit package provides three datatypes that can be chained together: Source, Conduit and Sink. If you look at the source code, you will notice that there is a lot of overlap between these datatypes. In this post I'll show how these types can be combined into a single one, which is the idea used by the pipes package.

Compare:

data Sink i m o =
    Processing (i -> Sink i m o) (SinkClose m o)
  | Done (Maybe i) o
  | SinkM (m (Sink i m o))
type SinkClose m o = m o

data Conduit i m o =
    NeedInput (i -> Conduit i m o) (ConduitClose m o)
  | HaveOutput (Conduit i m o) (m ()) o
  | Finished (Maybe i)
  | ConduitM (m (Conduit i m o)) (m ())
type ConduitClose m o = Source m o

The differences between the two types are that:

Done returns output o, whereas Finished does not.
Conduit has a HaveOutput constructor, while a Sink does not.
ConduitM has an 'early close' action of type m ().
SinkClose just gives a result, while ConduitClose can return an entire stream in the form of a Source m o

The term output is in fact used differently by the two types. It becomes clearer when we say that Sink has a result of type r. Then the result of Conduit is r = (). On the other hand, a sink doesn't produce output to downstream conduits, so its output type would be Void.

Now let's also bring in Source,

data Source m a =
    Open (Source m a) (m ()) a
  | Closed
  | SourceM (m (Source m a)) (m ())

The SourceM constructor is exactly analogous to ConduitM, and Open is analogous to HaveOutput. A Source doesn't have input, so there is no analogue to NeedInput or Processing. The Closed constructor doesn't provide remaining input or result, since a source doesn't have either. However, we could say that its input is i = (), and its result is r = ().

It then becomes possible to unify the three datatypes into:

data Pipe m i o r =
    NeedInput (i -> Pipe m i o r) (Pipe m () o r)
  | HaveOutput (Pipe m i o r) (m ()) o
  | Finished (Maybe i) r
  | PipeM (m (Pipe m i o r)) (m r)

type Source m o = Pipe m () o ()
type Conduit i m o = Pipe m i o ()
type Sink i m r = Pipe m i Void r

This is almost exactly the type provided by the various incarnations of the pipes package!

The three composition operators of conduits become a single operator on pipes. The top level "run" operation takes a Pipe m () Void r, that is, a (composition of) pipes that takes no input and has no output.

What about the instances for Source, Conduit and Sink? In the conduit package Sink is an instance of Monad and its superclasses. That is also the case for Pipe. Source and Conduit are instances of Functor, which allows you to map a function over the output. The output is no longer the last type variable of Pipe. Instead we should provide an instance of Functor2 or Bifunctor, which have a method fmap2 :: (a -> b) -> f a r -> f b r.
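To make the remark about mapping over the output concrete, here is a minimal sketch of such a function for the unified Pipe type. The name mapOutput and the helpers drain and countTo are hypothetical and not part of conduit or pipes; drain only exists to observe the result in a pure setting.

```haskell
import Data.Functor.Identity (Identity(..))

-- The unified type from above, repeated so that the sketch is self-contained.
data Pipe m i o r
    = NeedInput (i -> Pipe m i o r) (Pipe m () o r)
    | HaveOutput (Pipe m i o r) (m ()) o
    | Finished (Maybe i) r
    | PipeM (m (Pipe m i o r)) (m r)

-- Map a function over the output type; this plays the role of fmap2.
mapOutput :: Functor m => (a -> b) -> Pipe m i a r -> Pipe m i b r
mapOutput f (NeedInput k close)   = NeedInput (mapOutput f . k) (mapOutput f close)
mapOutput f (HaveOutput next c o) = HaveOutput (mapOutput f next) c (f o)
mapOutput _ (Finished left r)     = Finished left r
mapOutput f (PipeM step close)    = PipeM (fmap (mapOutput f) step) close

-- Collect all outputs of a pure pipe; a request for input is answered
-- by switching to the 'no more input' continuation.
drain :: Pipe Identity () o r -> ([o], r)
drain (NeedInput _ close)   = drain close
drain (HaveOutput next _ o) = let (os, r) = drain next in (o : os, r)
drain (Finished _ r)        = ([], r)
drain (PipeM step _)        = drain (runIdentity step)

-- A source that outputs 1..n.
countTo :: Int -> Pipe Identity () Int ()
countTo n = go 1
  where
    go k | k > n     = Finished Nothing ()
         | otherwise = HaveOutput (go (k + 1)) (return ()) k
```

With these definitions, drain (mapOutput (*10) (countTo 3)) gives ([10,20,30],()). The result type r is untouched, which is why this is a functor in the second-to-last type argument rather than an ordinary Functor instance.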

Overall, reducing the number of datatypes from 3 to 1 sounds like a pretty good deal to me. I therefore think it would be great if conduit adopted the ideas from pipes.

Dependently typed DAGs

2012-03-19T21:16:00Z

A colleague of mine recently needed to represent DAGs (directed acyclic graphs) in Coq, and asked around for ideas. Since Coq is not a nice language to program in, I decided to use Haskell instead. Something close to dependently typed programming is possible in Haskell thanks to GADTs. And other extensions will be helpful too,

{-# LANGUAGE GADTs, TypeOperators, Rank2Types #-}

My idea is to represent a DAG as a list of nodes. Nodes have a list of children, where each child is a reference to an element later in the list.

For example, the DAG

would be represented as

[Node "a" [1,2,2,4], Node "b" [3,3], Node "c" [3,4], Node "d" [], Node "e" []]

Data types

To make the above representation safe, we need to ensure two things:

Each child-reference is greater than the index of the parent.
Each child-reference refers to an actual node, so it must be smaller than the size of the list.

The first condition is easily satisfied, by making the reference relative to the current position and using natural numbers. So the representation would be

[Node "a" [0,1,1,3], Node "b" [1,1], Node "c" [0,1], Node "d" [], Node "e" []]
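The step from absolute to relative references is simple arithmetic: a child at absolute index c under a parent at index p becomes c - p - 1. As an untyped sketch, using plain pairs instead of the Node type defined below (toRelative and validRelative are hypothetical helper names):

```haskell
-- A node is a label with a list of child references.
type PlainNode = (String, [Int])

-- Convert absolute child indices to indices relative to the parent.
toRelative :: [PlainNode] -> [PlainNode]
toRelative nodes =
    [ (label, [ child - parent - 1 | child <- children ])
    | (parent, (label, children)) <- zip [0 ..] nodes ]

-- Check both safety conditions on a relative representation:
-- references are natural numbers, and they stay inside the list.
validRelative :: [PlainNode] -> Bool
validRelative nodes = and
    [ 0 <= r && r < len - pos - 1
    | (pos, (_, refs)) <- zip [0 ..] nodes, r <- refs ]
  where len = length nodes

absoluteExample :: [PlainNode]
absoluteExample =
    [("a",[1,2,2,4]), ("b",[3,3]), ("c",[3,4]), ("d",[]), ("e",[])]
```

Here validRelative is a runtime check; the point of the typed representation that follows is to make that check unnecessary.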

For the second condition we need dependent types. In particular the type Fin n of numbers smaller than n.

data Zero
data Succ n

data Fin n where
    Fin0 :: Fin (Succ n)
    FinS :: Fin n -> Fin (Succ n)

A node then holds a label of type a and a list of numbers less than n.

data Node a n where
    Node :: a -> [Fin n] -> Node a n
  deriving (Eq,Show)

For the list of nodes we will use a dependently typed vector,

data Vec f n where
    Empty :: Vec f Zero
    (:::) :: f n -> Vec f n -> Vec f (Succ n)
infixr 5 :::

A value of Vec f n is a list of the form [] or [x₀::f 0] or [x₁::f 1, x₀::f 0] or [x₂::f 2, x₁::f 1, x₀::f 0] etc., with a length equal to the parameter n. That is exactly what we need for DAGs:

type DAG a = Vec (Node a)
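To see how the typed representation relates to the plain list of relative indices, here is a small self-contained sketch that converts a DAG back to that list. The definitions of Fin, Vec and DAG are repeated from above; Node is written in ordinary syntax without the deriving clause, and finToInt, toPlain and sample are hypothetical names introduced for this illustration.

```haskell
{-# LANGUAGE GADTs #-}

data Zero
data Succ n

data Fin n where
    Fin0 :: Fin (Succ n)
    FinS :: Fin n -> Fin (Succ n)

data Node a n = Node a [Fin n]

data Vec f n where
    Empty :: Vec f Zero
    (:::) :: f n -> Vec f n -> Vec f (Succ n)
infixr 5 :::

type DAG a = Vec (Node a)

-- Forget the type level bound of a reference.
finToInt :: Fin n -> Int
finToInt Fin0     = 0
finToInt (FinS i) = 1 + finToInt i

-- Convert a typed DAG to labels with relative child indices.
toPlain :: DAG a n -> [(a, [Int])]
toPlain Empty              = []
toPlain (Node x cs ::: ns) = (x, map finToInt cs) : toPlain ns

-- A three node DAG: a -> b, a -> c, b -> c.
sample :: DAG String (Succ (Succ (Succ Zero)))
sample = Node "a" [Fin0, FinS Fin0]
     ::: Node "b" [Fin0]
     ::: Node "c" []
     ::: Empty
```

Replacing the reference in node "b" by FinS Fin0 should be rejected by the type checker, since only one node comes after "b" in the list.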

Instances

I would like to define Eq and Show instances for these datatypes. But the instance for Vec would look something like

instance (forall m. Eq (f m)) => Eq (Vec f n)

which is not valid Haskell, even with extensions. The solution is to use another class, Eq1

class Eq1 f where
    eq1 :: f n -> f n -> Bool

Now we can define

instance Eq1 f => Eq (Vec f n) where
    Empty == Empty = True
    (x ::: xs) == (y ::: ys) = eq1 x y && xs == ys

The boring instances for Node and Fin are

instance Eq a => Eq1 (Node a) where
    eq1 = (==)
instance Eq1 Fin where
    eq1 = (==)
instance Eq1 f => Eq1 (Vec f) where
    eq1 = (==)

-- ghc can't derive this
instance Eq (Fin n) where
    Fin0   == Fin0   = True
    FinS i == FinS j = i == j
    _      == _      = False

The same goes for Show

class Show1 a where
    showsPrec1 :: Int -> a n -> ShowS

-- instances omitted, see source code

Convert to tree

To show that these DAGs work, we can convert from a DAG to a tree by duplicating all nodes. The tree type is a simple rose tree, as those in Data.Tree:

data Tree a = TNode a [Tree a]  deriving Show

To be able to make a DAG into a tree, we need to know the root node. So we give toTree a DAG a n and a Fin n to indicate the root.

-- Convert a DAG to a tree, using the given node index as root
toTree :: Fin n -> DAG a n -> Tree a
toTree Fin0 (Node x cs ::: ns) = TNode x [toTree c ns | c <- cs]
toTree (FinS i) (_ ::: ns) = toTree i ns -- drop the head until we reach the root

And for convenience, a function that assumes that the first node in the list is the root.

toTree' :: DAG a (Succ n) -> Tree a
toTree' = toTree Fin0

Here is the example from above

example = Node "a" [Fin0,FinS Fin0,FinS Fin0,FinS (FinS (FinS (Fin0)))]
      ::: Node "b" [FinS Fin0,FinS Fin0]
      ::: Node "c" [Fin0,FinS Fin0]
      ::: Node "d" []
      ::: Node "e" []
      ::: Empty

λ> toTree' example
TNode "a" [TNode "b" [TNode "d" [],TNode "d" []]
          ,TNode "c" [TNode "d" [],TNode "e" []]
          ,TNode "c" [TNode "d" [],TNode "e" []]
          ,TNode "e" []]


Convert from a tree

More interesting is the conversion from a tree to a DAG, in such a way that we share identical nodes. For that we first of all need to be able to search a DAG to see if it already contains a particular node.

Let's do this a bit more generically, and define a search over any Vec f.

findVec :: (Eq1 f, Pred1 f) => f n -> Vec f n -> Maybe (Fin n)

What is that Pred1 class? And why do we need it? When you have a value of type f n, and you want to compare it to the elements of a vector, you will quickly discover that these elements have different types, f m with m < n. So, we need to either convert the f n to the f m or vice-versa.

I'll go with the former, because that means the search can stop early. If a node refers to a child Fin0, that means it points to the first node in the DAG. So there is no point in looking if it is duplicated anywhere in the vector, because other nodes can't possibly refer to earlier ones.

What the Pred1 class does is tell you: "if this item occurred one place later in the vector, what would it look like?". And if it can not occur in later places return Nothing:

class Pred1 f where
    pred1 :: f (Succ n) -> Maybe (f n)

instance Pred1 Fin where
    pred1 Fin0 = Nothing
    pred1 (FinS i) = Just i

instance Pred1 (Node a) where
    pred1 (Node x cs) = Node x `fmap` mapM pred1 cs

Now the search becomes relatively straightforward:

findVec x (y ::: ys) = case pred1 x of
    Just x' | eq1 x' y  -> Just Fin0
            | otherwise -> FinS `fmap` findVec x' ys
    Nothing -> Nothing
findVec _ _ = Nothing

The nice thing about GADTs is that it becomes almost impossible to make mistakes, because the typechecker will complain if you do.
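To try the search in isolation, the pieces can be put together in one small module. This is a sketch that repeats the definitions from the post, except that Node is written in ordinary syntax with a hand-written Eq1 instance; the leaves value is a made-up test vector.

```haskell
{-# LANGUAGE GADTs #-}

data Zero
data Succ n

data Fin n where
    Fin0 :: Fin (Succ n)
    FinS :: Fin n -> Fin (Succ n)

instance Eq (Fin n) where
    Fin0   == Fin0   = True
    FinS i == FinS j = i == j
    _      == _      = False

data Node a n = Node a [Fin n]

data Vec f n where
    Empty :: Vec f Zero
    (:::) :: f n -> Vec f n -> Vec f (Succ n)
infixr 5 :::

class Eq1 f where
    eq1 :: f n -> f n -> Bool
instance Eq a => Eq1 (Node a) where
    eq1 (Node x cs) (Node y ds) = x == y && cs == ds

class Pred1 f where
    pred1 :: f (Succ n) -> Maybe (f n)
instance Pred1 Fin where
    pred1 Fin0     = Nothing
    pred1 (FinS i) = Just i
instance Pred1 (Node a) where
    pred1 (Node x cs) = Node x `fmap` mapM pred1 cs

findVec :: (Eq1 f, Pred1 f) => f n -> Vec f n -> Maybe (Fin n)
findVec x (y ::: ys) = case pred1 x of
    Just x' | eq1 x' y  -> Just Fin0
            | otherwise -> FinS `fmap` findVec x' ys
    Nothing -> Nothing
findVec _ _ = Nothing

-- A vector with two childless nodes.
leaves :: Vec (Node String) (Succ (Succ Zero))
leaves = Node "d" [] ::: Node "e" [] ::: Empty
```

Searching for Node "e" [] in leaves finds it at FinS Fin0. Searching for Node "d" [Fin0] stops immediately with Nothing, because pred1 fails on the reference Fin0: a node that refers to the first element of the vector cannot occur inside the vector.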

Lifting boxes

When converting a Tree to a DAG, we do not know beforehand how many nodes that DAG is going to have. Therefore, we need to put the produced DAG into an existential box, that hides the parameter n.

That is fine for the end result, but it will not work when incrementally constructing a DAG. Suppose you wanted to add two nodes to a DAG. Adding the first node is fine, but then you need to ensure that the children of the second node are still there. In addition, the second node will need to be adjusted: all child references have to be incremented, to skip the first added node.

That adjusting is done with the counterpart to Pred1, the Succ1 class

class Succ1 f where
    succ1 :: f n -> f (Succ n)

instance Succ1 Fin where
    succ1 = FinS

instance Succ1 (Node a) where
    succ1 (Node x cs) = Node x (map FinS cs)

Our box will come with the ability to 'lift' any succable value into it:

data Box f n where
    Box :: (forall g. Succ1 g => g n -> g m) -> f m -> Box f n

You can think of Box f n as a value of f m where m >= n. This allows turning any g n into a g m, which can be combined with the value in the box. Before we can see Box in action, we will first need some functors to store things:

-- product functor
data (:*:) f g a = (:*:) { fst1 :: f a, snd1 :: g a }
-- functor composition
newtype (:.:) f g a = Comp { getComp :: f (g a) }

Now when adding a node we check if it is already in the DAG, and if so, return the index. If the node is not yet in the DAG, then add it. By adding the node the DAG becomes 1 larger, from a DAG n we get a DAG (Succ n). Therefore, we need one level of succ.

consNode :: Eq a => Node a n -> DAG a n -> Box (Fin :*: DAG a) n
consNode n dag = case findVec n dag of
    Just i  -> Box id    (i :*: dag)
    Nothing -> Box succ1 (Fin0 :*: (n ::: dag))

Now the ugly part: converting an entire tree.

fromTree :: Eq a => Tree a -> DAG a n -> Box (Fin :*: DAG a) n
fromTree (TNode x cs) dag₀
 = case fromForest cs dag₀ of
    Box to₁ (Comp cs₁ :*: dag₁) ->
     case consNode (Node x cs₁) dag₁ of
      Box to₂ ans -> Box (to₂ . to₁) ans

And a forest, aka. a list of trees:

fromForest :: Eq a => [Tree a] -> DAG a n -> Box (([] :.: Fin) :*: DAG a) n
fromForest [] dag = Box id $ Comp [] :*: dag
fromForest (x:xs) dag₀
   = case fromForest xs dag₀ of
      Box to₁ (Comp xs₁ :*: dag₁) ->
       case fromTree x dag₁ of
        Box to₂ (x₂ :*: dag₂) ->
         Box (to₂ . to₁) (Comp (x₂ : map to₂ xs₁) :*: dag₂)

At the top level we start with an empty DAG, and ignore the index of the root (which will always be Fin0).

fromTree' :: Eq a => Tree a -> Box (DAG a) Zero
fromTree' x
  = case fromTree x Empty of
     Box to₁ (_ :*: dag₁) ->
      Box to₁ dag₁

To understand these functions, you should ignore the Box constructors. What you are left with is

fromTree_pseudo (TNode x cs) dag
    = let (cs',dag') = fromForest_pseudo cs dag
      in consNode (Node x cs') dag'
fromForest_pseudo []     dag = ([],dag)
fromForest_pseudo (x:xs) dag
    = let (ns,dag') = fromForest_pseudo xs dag
          (n,dag'') = fromTree_pseudo x dag'
      in (n:ns,dag'')

Here is a test that shows that we are able to recover the sharing that was removed by toTree':

λ> fromTree' (toTree' example)
Node "a" [0,1,1,3] ::: Node "b" [1,1] ::: Node "c" [0,1]
                   ::: Node "d" [] ::: Node "e" [] ::: Empty
λ> fromTree' (toTree' example)
      == Box (succ1 . succ1 . succ1 . succ1 . succ1) example
True

Box is a monad

All this wrapping and unwrapping of Box is really ugly. It should also remind you of something. That something is a monad. And Box is indeed a monad, just not a normal Haskell one. Instead it is (surprise, surprise) a 'Monad1':

class Monad1 m where
    return1 :: f a -> m f a
    (>>>=) :: m f a -> (forall b. f b -> m g b) -> m g a

instance Monad1 Box where
    return1 = Box id
    Box l x >>>= f = case f x of
        Box l' y -> Box (l' . l) y

Combine this with two utility functions:

-- Lift a value y into a Box
boxLift :: Succ1 g => Box f n -> g n -> Box (f :*: g) n
boxLift (Box l x) y = Box l (x :*: l y)

-- Apply one level of succ before putting things into a Box
boxSucc :: Box f (Succ n) -> Box f n
boxSucc (Box l x) = Box (l . succ1) x

And one more Succ1 instance:

instance (Functor f, Succ1 g) => Succ1 (f :.: g) where
    succ1 (Comp x) = Comp (fmap succ1 x)

Now we can write this slightly less ugly code

fromTree_m :: Eq a => Tree a -> DAG a n -> Box (Fin :*: DAG a) n
fromTree_m (TNode x cs) dag₀
  = fromForest_m cs dag₀
       >>>= \(Comp cs₁ :*: dag₁) ->
    consNode (Node x cs₁) dag₁

fromForest_m :: Eq a => [Tree a] -> DAG a n -> Box (([] :.: Fin) :*: DAG a) n
fromForest_m [] dag
  = return1 $ Comp [] :*: dag
fromForest_m (x:xs) dag₀
  = fromForest_m xs dag₀
       >>>= \(xs₁ :*: dag₁) ->
    fromTree_m x dag₁ `boxLift` xs₁
       >>>= \(x₂ :*: dag₂ :*: Comp xs₂) ->
    return1 $ Comp (x₂ : xs₂) :*: dag₂

This might be even nicer when we add in a state monad for the DAG, but I'll leave that for (maybe) another time.

Bonus: alternative definition of Box

If you don't like existential boxes, then here is another way to define the Box monad.

data Box' f n where
    Box0 :: f n -> Box' f n
    BoxS :: Box' f (Succ n) -> Box' f n

instance Monad1 Box' where
    return1 = Box0
    Box0 x >>>= y = y x
    BoxS x >>>= y = BoxS (x >>>= y)

boxSucc' :: Box' f (Succ n) -> Box' f n
boxSucc' = BoxS

boxLift' :: Succ1 g => Box' f n -> g n -> Box' (f :*: g) n
boxLift' (Box0 x) y = Box0 (x :*: y)
boxLift' (BoxS x) y = BoxS (boxLift' x (succ1 y))

The two types are equivalent, as shown by

equiv₁ :: Box' f n -> Box f n
equiv₁ (Box0 x) = return1 x
equiv₁ (BoxS x) = boxSucc (equiv₁ x)

equiv₂ :: Box f n -> Box' f n
equiv₂ (Box l x) = runUnBox' (l (UnBox' id)) (Box0 x)

newtype UnBox' f m n = UnBox' {runUnBox' :: Box' f n -> Box' f m}
instance Succ1 (UnBox' r f) where
    succ1 f = UnBox' (runUnBox' f . BoxS)

Finding rectangles, part 3: divide and conquer

2012-02-11T22:13:00Z

In part 2 of this series, I looked at finding axis aligned rectangles in binary images. I left you hanging with a hint of a more efficient algorithm than the O(n³) one from that post. Formally, the problem we were trying to solve was:

Given a binary image, find the largest axis aligned rectangle with a 1 pixel wide border that consists entirely of foreground pixels.

Here is the same example as last time, where white pixels are the background and blue is the foreground. The rectangle with the largest area is indicated in red.

A rectangle is two brackets

The idea behind the more efficient algorithm is simple. Draw a vertical line x=x_mid through the middle of the image. If the largest rectangle in an image is large enough, then it will intersect this line. The one in the example above certainly does. So, the idea is to only look at rectangles that intersect x=x_mid. We will worry about other cases later.

Each rectangle that intersects the vertical line consists of a left bracket and a right bracket, just look at this ASCII art: [].

To find all rectangles intersecting x=x_mid, we need to find these left and right brackets, and combine them. Note that the middle column is included in both the left and right bracket, because that makes it easier to handle rectangles and brackets of width=1.

Let's focus on finding the right brackets first. For each combination of y-coordinate and height, there is at most one largest right bracket. We don't need to consider the smaller ones. So, let's define a function that finds the width of the largest right bracket for all y-coordinates and heights. The function takes as input just the right part of the image, and it will return the result in a list of lists:

rightBracketWidths :: Image -> [[Maybe Int]]

Here is a slow 'specification-style' implementation

rightBracketWidths_slow im
  = [ [ findLast (containsBracket im y h) [1..imWidth im]
      | h <- [1..imHeight im-y]
      ]
    | y <- [0..imHeight im-1]
    ]
  where findLast pred = find pred . reverse

How do we even check that a right bracket is in an image? For that we can look at right and bottom endpoints:

-- pseudo code
containsBracket im y h w
    = r !! y       !! 0     >= w   -- top border
   && r !! (y+h-1) !! 0     >= w   -- bottom border
   && b !! y       !! (w-1) >= y+h -- right border

Here I used two dimensional indexing. So r!!y!!0 stands for the right endpoint for column 0 of row y; or in other words, the number of foreground pixels at the start of that row.
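The pseudo code can be turned into a runnable check by computing the endpoints directly from the image. This is a deliberately naive sketch, under the assumption that r!!y!!x is the x-coordinate just past the run of foreground pixels starting at (x,y), and b!!y!!x is the analogous y-coordinate going down. The helpers rightEnd, bottomEnd and largestBracket, and the demo image, are hypothetical. The arguments are assumed to be in range, that is h >= 1, w >= 1 and y+h-1 < height.

```haskell
import Data.List (find)

type Image = [[Bool]]

-- r!!y!!x: first x-coordinate at or after x in row y that is not foreground.
rightEnd :: Image -> Int -> Int -> Int
rightEnd im y x = x + length (takeWhile id (drop x (im !! y)))

-- b!!y!!x: first y-coordinate at or after y in column x that is not foreground.
bottomEnd :: Image -> Int -> Int -> Int
bottomEnd im y x = y + length (takeWhile id (drop y (map (!! x) im)))

containsBracket :: Image -> Int -> Int -> Int -> Bool
containsBracket im y h w
    = rightEnd im y 0       >= w     -- top border
   && rightEnd im (y+h-1) 0 >= w     -- bottom border
   && bottomEnd im y (w-1)  >= y+h   -- right border

-- Linear search for the widest right bracket at row y with height h.
largestBracket :: Image -> Int -> Int -> Maybe Int
largestBracket im y h = find (containsBracket im y h) (reverse [1 .. width])
  where width = maximum (0 : map length im)

demo :: Image
demo = map (map (== '#'))
    [ "###"
    , "#.#"
    , "###"
    , "##." ]
```

In the demo image, the widest bracket starting at row 0 with height 3 has width 3. With height 4 only the left column remains, so the width is 1.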

This image illustrates the right endpoints of the top and bottom border in green, and the bottom endpoint of the right border in red. The bracket (in glorious pink) has to fit between these indicated endpoints.

The rightBracketWidths_slow function is, as the name suggests, slow. It does a linear search over the possible widths. With that it would take O(m²*n) to find all widths for an m by n image. That is no better than the complexity of the algorithm from last time.

Faster searching

In my previous blog post, I introduced a SearchTree type that answers just the type of findLast query that we need. In fact, this rectangle problem was why I made that SearchTree data structure in the first place.

There are three conditions in containsBracket. We will handle the one for the top border, r!!y!!0 >= w, by building a separate search tree for each y. This search tree then only contains the widths w <- [1..r!!y!!0].

That leaves the conditions on the bottom and right borders. Since we fixed y, we can directly write these conditions in terms of a SearchTree query: for the right border we need satisfy (Ge (y+h)) (b!!y!!(w-1)), and for the bottom border satisfy (Le (r!!(y+h-1)!!0)) w. As you can hopefully see, these are exactly the same as the conditions in the containsBracket function above.

We can combine the two conditions into a pair, to give (Ge (y+h), Le (r!!(y+h-1)!!0)). To build the search tree, we then need to pair b!!y!!(w-1) with w. That is just a matter of zipping two lists together:

bracketsAtY :: Int -> [Int] -> [Int] -> [Maybe Int]
bracketsAtY y bs_y rs@(r_y:_)
    = [ fmap (\(Max b_yw1, Min w) -> w)
             (findLast (Ge (y+h),Le r_yh1) searchTree)
      | (h,r_yh1) <- zip [1..] rs
      ]
  where
    searchTree = fromList [ (Max b_yw1, Min w) | (b_yw1,w) <- zip bs_y [1..r_y] ]
-- notation:
--    bs_y  = [b!!y!!0, b!!y!!1, ..]
--    rs     = [r!!y!!0, r!!(y+1)!!0, ..]
--    b_yw1 = b!!y!!(w-1)
--    r_y   = r!!y!!0
--    r_yh1 = r!!(y+h-1)!!0

We need to call bracketsAtY for each y, together with the right row of bottom endpoints, and right endpoints:

rightBracketWidths a = zipWith3 bracketsAtY [0..] b (tails (map head r))
  where
    -- as in the previous posts
    x = scanRightward (\_ x -> x + 1) (-1) a :: [[Int]]
    y = scanDownward  (\_ y -> y + 1) (-1) a :: [[Int]]
    r = scanLeftward  (\(a,x) r -> if a then r else x) (imWidth a) (zip2d a x)
    b = scanUpward    (\(a,y) b -> if a then b else y) (imHeight a) (zip2d a y)

QuickCheck will confirm that this function is the same as the slow version above:

prop_rightBracketWidths = forAll genImage $ \im ->
    rightBracketWidths im == rightBracketWidths_slow im

λ> quickCheck prop_rightBracketWidths
+++ OK, passed 100 tests.

With the efficient search trees, bracketsAtY takes O((m+n)*log n) time, and rightBracketWidths takes O(m*(m+n)*log n) time for an m by n image. For large images this is much faster than the O(m²*n) linear search.

From brackets to rectangles

If we have a way of finding right brackets, we can easily reuse that function for left brackets, by just flipping the image.

leftBracketWidths :: Image -> [[Maybe Int]]
leftBracketWidths = rightBracketWidths . flipHorizontal
  where flipHorizontal = map reverse

Once we have the left and right brackets, we can combine them into rectangles. Suppose that a left bracket has width lw, and the right bracket rw. Then the width of the rectangle they form is lw+rw-1, since both include the middle line.

combineBrackets :: Int -> [[Maybe Int]] -> [[Maybe Int]] -> [Rect]
combineBrackets x_mid lbrackets rbrackets
    = [ Rect (x_mid-lw+1) y (lw+rw-1) h
      | (y,lws,rws) <- zip3 [0..] lbrackets rbrackets
      , (h,Just lw,Just rw) <- zip3 [1..] lws rws
      ]
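A small worked example of the arithmetic may help. The Rect type comes from the earlier posts in this series; the definition below, with fields x, y, width and height in that order, is an assumption that matches how Rect is used here.

```haskell
-- Assumed field order: x, y, width, height.
data Rect = Rect Int Int Int Int deriving (Eq, Show)

combineBrackets :: Int -> [[Maybe Int]] -> [[Maybe Int]] -> [Rect]
combineBrackets x_mid lbrackets rbrackets
    = [ Rect (x_mid-lw+1) y (lw+rw-1) h
      | (y,lws,rws) <- zip3 [0..] lbrackets rbrackets
      , (h,Just lw,Just rw) <- zip3 [1..] lws rws
      ]

-- With the middle line at x = 2, a left bracket of width 2 covers
-- columns 1..2 and a right bracket of width 3 covers columns 2..4.
-- Together they form a rectangle starting at x = 1 with width 2+3-1 = 4.
worked :: [Rect]
worked = combineBrackets 2 [[Just 2, Nothing]] [[Just 3, Just 1]]
```

The value of worked is [Rect 1 0 4 1]. For height 2 there is no left bracket, so the pattern match in the list comprehension fails and no rectangle is produced.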

And finding (a superset of) all the maximal rectangles intersecting the vertical line x=x_mid can be done by cutting the image on that line, finding brackets, and combining them.

rectsIntersecting :: Int -> Image -> [Rect]
rectsIntersecting x_mid im = combineBrackets x_mid brackets_left brackets_right
  where
    im_left  = map (take (x_mid+1)) im
    im_right = map (drop x_mid) im
    brackets_left  = leftBracketWidths im_left
    brackets_right = rightBracketWidths im_right

Divide and conquer

We left out one important case: what if the largest rectangle does not intersect the mid-line? For that purpose man has invented recursion: First look for rectangles intersecting the middle, and then look for rectangles not intersecting the middle. For that we need to look in the left and right halves.

To make this asymptotically fast, we have to ensure that both the width and height decrease. Since the time complexity of rectsIntersecting includes a log n term, it is faster for wide images. So, if the image is tall, we just transpose it to make it wide instead.

The recursion pattern of vertical and horizontal middle lines can be pictured with the first level in yellow, the second in green, then magenta and red. So in the first level we find all rectangles intersecting the yellow line. Then in the second level all rectangles intersecting a green line, and so on.

Here is the code:

allRectsRecurse :: Image -> [Rect]
allRectsRecurse im
    -- empty image ==> no rectangles
    | imWidth im == 0 || imHeight im == 0
        = []
    -- height > width? ==> transpose
    | imHeight im > imWidth im
        = map transposeRect . allRectsRecurse . transpose $ im
    -- find and recurse
    | otherwise
        = rectsIntersecting x_mid im -- find
       ++ allRectsRecurse im_left  -- recurse left
       ++ map (moveRect (x_mid+1) 0) (allRectsRecurse im_right) -- recurse right
  where
    x_mid = imWidth im `div` 2
    im_left  = map (take x_mid) im -- *excluding* the middle line
    im_right = map (drop (x_mid+1)) im

where

transposeRect :: Rect -> Rect
transposeRect (Rect x y w h) = Rect y x h w

moveRect :: Int -> Int -> Rect -> Rect
moveRect dx dy (Rect x y w h) = Rect (x+dx) (y+dy) w h

Since the image is roughly halved in each recursion step, the recursion will have depth O(log n) for an n by n image. At each level, the rectsIntersecting calls will take O(n²*log n) time, for a total of O(n²*(log n)²). This is significantly faster than the O(n³) from the previous post.

For the complexity theorists: it is possible to do slightly better by using a disjoint-set (union-find) data structure instead of a search tree for finding brackets. I believe that would bring the runtime down to O(n²*log n*α(n)) where α is the inverse Ackermann function. Unfortunately such a data structure requires mutation, the correctness proofs are much harder, and the gain is quite small.

Let me end by checking that the set of maximal rectangles we find is the same as the one found by the specification from the previous post. Then by extension the largest rectangle found will also be the same.

prop_maximalRects = forAll genImage $ \im ->
      (sort . onlyTrulyMaximalRects . allRects) im
   == (sort . onlyTrulyMaximalRects . allRectsRecurse) im

λ> quickCheck prop_maximalRects
+++ OK, passed 100 tests.

Search trees without sorting

2012-02-07T22:56:00Z

Binary search trees are used quite often for storing or finding values. Like a binary search, they essentially work by sorting the items.

In this post I will describe a search tree that does not require that the items be sorted. Hence, the tree can support some interesting queries. The queries will always be correct, but they will only be fast in some cases.

Bounds

Usually, to make searching fast, each branch in a search tree stores information that helps to decide whether to go left or right. But if we want to be able to construct a tree for any possible type of query, then that is not always possible. Instead, we can still aim to eliminate large parts of the search space, by storing bounds.

Suppose we have a tree that stores integers, and we want to find the first item in the tree that is greater than or equal to some query integer. In each branch of the tree, we could store the maximum of all values in that subtree. Call it the upper bound of the subtree. If this upper bound is less than the query, then we can eliminate the entire subtree from consideration.

Now let's generalize that. The maximum value is an example of a semilattice. That is just a fancy way of saying that for a pair of values we can get some kind of bound. As a typeclass it looks like

class Semilattice a where
    meet :: a -> a -> a
    -- Laws: meet is associative, commutative and idempotent:
    --         meet a (meet b c) = meet (meet a b) c
    --         meet a b = meet b a
    --         meet a a = a

The queries we perform on the tree should of course work together with the bounds. That means that if a bound for a branch in the tree doesn't satisfy the query, then none of the values in the subtree do. In Haskell terms:

class Semilattice a => Satisfy q a | q -> a where
    satisfy :: q -> a -> Bool
    -- Law: satisfy q a || satisfy q b  ==>  satisfy q (meet a b)

Note that a semilattice always gives a partial order, and hence a satisfy function by

satisfy q a = meet q a == a

because

     satisfy q a || satisfy q b
<=>  meet q a == a || meet q b == b
==>  meet (meet q a) b == meet a b || meet a (meet q b) == meet a b
<=>  meet q (meet a b) == meet a b || meet q (meet a b) == meet a b
<=>  meet q (meet a b) == meet a b
<=>  satisfy q (meet a b)

However, I keep the distinction between the query and value type for more flexibility and for more descriptive types.

Implementation

Given the Satisfy and Semilattice typeclasses, the search tree data structure is straightforward. A search tree can be empty, a single value, or a branch. In each branch we store the bound of that subtree.

data SearchTree a
    = Empty
    | Leaf !a
    | Branch !a (SearchTree a) (SearchTree a)
    deriving (Show)

bound :: SearchTree a -> a
bound (Leaf a) = a
bound (Branch a _ _) = a
bound Empty = error "bound Empty"

If we have a SearchTree, then we can find the first element that satisfies a query, simply by searching both sides of each branch. The trick to making the search faster is to only continue as long as the bound satisfies the query:

-- Find the first element in the tree that satisfies the query
findFirst :: Satisfy q a => q -> SearchTree a -> Maybe a
findFirst q (Leaf a)       | satisfy q a = Just a
findFirst q (Branch a x y) | satisfy q a = findFirst q x `mplus` findFirst q y
findFirst _ _ = Nothing

Completely analogously, we can find the last satisfied item instead:

-- Find the last element in the tree that satisfies the query
findLast :: Satisfy q a => q -> SearchTree a -> Maybe a
findLast q (Leaf a)       | satisfy q a = Just a
findLast q (Branch a x y) | satisfy q a = findLast q y `mplus` findLast q x
findLast _ _ = Nothing

Or we can even generalize this search to any Monoid, where the above are for the First and Last monoids respectively. I will leave this as an exercise for the reader.
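One possible solution to that exercise is sketched below. The name searchWith and the primed function names are made up for this illustration; the classes and the tree type are repeated from this post, together with the Max and Ge types from the examples further down, so that the block can be loaded on its own.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
import Data.Monoid (First(..), Last(..))

class Semilattice a where
    meet :: a -> a -> a

class Semilattice a => Satisfy q a | q -> a where
    satisfy :: q -> a -> Bool

data SearchTree a
    = Empty
    | Leaf !a
    | Branch !a (SearchTree a) (SearchTree a)

-- Search the tree, combining all satisfying leaves (left to right) in a monoid.
-- Subtrees whose bound does not satisfy the query are skipped.
searchWith :: (Satisfy q a, Monoid r) => (a -> r) -> q -> SearchTree a -> r
searchWith f q (Leaf a)       | satisfy q a = f a
searchWith f q (Branch a x y) | satisfy q a
    = searchWith f q x `mappend` searchWith f q y
searchWith _ _ _ = mempty

findFirst', findLast' :: Satisfy q a => q -> SearchTree a -> Maybe a
findFirst' q = getFirst . searchWith (First . Just) q
findLast'  q = getLast  . searchWith (Last  . Just) q

-- All satisfying elements, using the list monoid.
findAll :: Satisfy q a => q -> SearchTree a -> [a]
findAll = searchWith (\a -> [a])

-- Example instances: maximum as bound, 'greater or equal' as query.
newtype Max a = Max a deriving (Eq, Show)
instance Ord a => Semilattice (Max a) where
    meet (Max a) (Max b) = Max (max a b)

newtype Ge a = Ge a
instance Ord a => Satisfy (Ge a) (Max a) where
    satisfy (Ge q) (Max a) = a >= q

-- The tree that fromList would build for [2,4,6], written out by hand.
tree :: SearchTree (Max Int)
tree = Branch (Max 6) (Leaf (Max 2)) (Branch (Max 6) (Leaf (Max 4)) (Leaf (Max 6)))
```

The First monoid ignores its right argument once the left one is a Just, so with lazy evaluation findFirst' should not search further than findFirst does.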

Constructing

The basis of each tree are branches. We will always construct branches with a smart constructor that calculates the bound as the meet of the bounds of its two arguments. That way, the stored bound is always correct.

mkBranch :: Semilattice a => SearchTree a -> SearchTree a -> SearchTree a
mkBranch Empty y = y
mkBranch x Empty = x
mkBranch x y = Branch (bound x `meet` bound y) x y

A search will always take time at least linear in the depth of the tree. So, for fast searches we need a balanced tree, where each subtree has roughly the same size. Here is arguably the most tricky part of the code, which converts a list to a balanced search tree.

-- /O(n*log n)/
-- Convert a list to a balanced search tree
fromList :: Semilattice a => [a] -> SearchTree a
fromList []  = Empty
fromList [x] = Leaf x
fromList xs  = mkBranch (fromList ys) (fromList zs)
  where
    (ys,zs) = splitAt (length xs `div` 2) xs

And that's it. I use this data structure for finding rectangles (more about that in a future post), and there I only needed to build the search structure once, and use it multiple times. So, in this post I am not going to talk about updates at all. If you wanted to do updates efficiently, then you would need to worry about updating bounds, rebalancing etc.

Example uses

Here is an example of the search tree in action. The query will be to find a value (>= q) for a given q. The bounds will be maximum values.

newtype Max a = Max { getMax :: a } deriving (Show)
instance Ord a => Semilattice (Max a) where
    meet (Max a) (Max b) = Max (max a b)

newtype Ge a = Ge a deriving (Show)
instance Ord a => Satisfy (Ge a) (Max a) where
    satisfy (Ge q) = (>= q) . getMax

First, check the satisfy law:

     satisfy (Ge q) (Max a) || satisfy (Ge q) (Max b)
<=>  a >= q || b >= q
<=>  if a >= b then a >= q else b >= q
<=>  (if a >= b then a else b) >= q
<=>  max a b >= q
<=>  satisfy (Ge q) (Max (max a b))
<=>  satisfy (Ge q) (meet (Max a) (Max b))

The search tree corresponding to fromList (map Max [1..5]). Circles are Leafs and squares are Branches.

So indeed, satisfy q a || satisfy q b ==> satisfy q (meet a b). And this bound is in fact tight, so also the other way around satisfy q (meet a b) ==> satisfy q a || satisfy q b. This will become important later. Now here are some example queries:

λ> findFirst (Ge 3) (fromList $ map Max [1,2,3,4,5])
Just (Max 3)
λ> findFirst (Ge 3) (fromList $ map Max [2,4,6])
Just (Max 4)
λ> findFirst (Ge 3) (fromList $ map Max [6,4,2])
Just (Max 6)
λ> findFirst (Ge 7) (fromList $ map Max [2,4,6])
Nothing
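The law derived above, and the claim that the bound is tight, can also be checked by brute force over a small range of integers. This is only a sanity check, not a proof; implies, lawHolds and tightHolds are hypothetical names, and the class and instance definitions are repeated so the block stands alone.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

class Semilattice a where
    meet :: a -> a -> a

class Semilattice a => Satisfy q a | q -> a where
    satisfy :: q -> a -> Bool

newtype Max a = Max { getMax :: a } deriving (Show)
instance Ord a => Semilattice (Max a) where
    meet (Max a) (Max b) = Max (max a b)

newtype Ge a = Ge a deriving (Show)
instance Ord a => Satisfy (Ge a) (Max a) where
    satisfy (Ge q) = (>= q) . getMax

implies :: Bool -> Bool -> Bool
implies p q = not p || q

-- The Satisfy law: a satisfied element implies a satisfied bound.
lawHolds :: Bool
lawHolds = and
    [ (satisfy (Ge q) (Max a) || satisfy (Ge q) (Max b))
        `implies` satisfy (Ge q) (meet (Max a) (Max b))
    | q <- range, a <- range, b <- range ]
  where range = [0 .. 5 :: Int]

-- Tightness: a satisfied bound implies that some element is satisfied.
tightHolds :: Bool
tightHolds = and
    [ satisfy (Ge q) (meet (Max a) (Max b))
        `implies` (satisfy (Ge q) (Max a) || satisfy (Ge q) (Max b))
    | q <- range, a <- range, b <- range ]
  where range = [0 .. 5 :: Int]
```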

Semilattices and queries can easily be combined into tuples. For a tree of pairs, and queries of pairs, you could use:

instance (Semilattice a, Semilattice b) => Semilattice (a,b) where
    meet (a,b) (c,d) = (meet a c, meet b d)
instance (Satisfy a b, Satisfy c d) => Satisfy (a,c) (b,d) where
    satisfy (a,c) (b,d) = satisfy a b && satisfy c d

Now we can answer not only questions like "What is the first/last/smallest element that is greater than some given query?", but also "What is the first/last/smallest element greater than a given query that also satisfies some other property?".
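Here is a sketch of such a combined query. The Min and Le types are assumed to be the mirror images of Max and Ge; they are not defined in this post. The UndecidableInstances pragma is also my assumption: as far as I know GHC needs it to accept the tuple instance together with the functional dependency. The example data is made up.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, UndecidableInstances #-}
import Control.Monad (mplus)

class Semilattice a where
    meet :: a -> a -> a
class Semilattice a => Satisfy q a | q -> a where
    satisfy :: q -> a -> Bool

data SearchTree a = Empty | Leaf !a | Branch !a (SearchTree a) (SearchTree a)

bound :: SearchTree a -> a
bound (Leaf a)       = a
bound (Branch a _ _) = a
bound Empty          = error "bound Empty"

mkBranch :: Semilattice a => SearchTree a -> SearchTree a -> SearchTree a
mkBranch Empty y = y
mkBranch x Empty = x
mkBranch x y     = Branch (bound x `meet` bound y) x y

fromList :: Semilattice a => [a] -> SearchTree a
fromList []  = Empty
fromList [x] = Leaf x
fromList xs  = mkBranch (fromList ys) (fromList zs)
  where (ys,zs) = splitAt (length xs `div` 2) xs

findFirst, findLast :: Satisfy q a => q -> SearchTree a -> Maybe a
findFirst q (Leaf a)       | satisfy q a = Just a
findFirst q (Branch a x y) | satisfy q a = findFirst q x `mplus` findFirst q y
findFirst _ _ = Nothing
findLast q (Leaf a)       | satisfy q a = Just a
findLast q (Branch a x y) | satisfy q a = findLast q y `mplus` findLast q x
findLast _ _ = Nothing

newtype Max a = Max a deriving (Eq, Show)
newtype Min a = Min a deriving (Eq, Show)   -- assumed mirror image of Max
newtype Ge a = Ge a
newtype Le a = Le a                         -- assumed mirror image of Ge

instance Ord a => Semilattice (Max a) where meet (Max a) (Max b) = Max (max a b)
instance Ord a => Semilattice (Min a) where meet (Min a) (Min b) = Min (min a b)
instance Ord a => Satisfy (Ge a) (Max a) where satisfy (Ge q) (Max a) = a >= q
instance Ord a => Satisfy (Le a) (Min a) where satisfy (Le q) (Min a) = a <= q

instance (Semilattice a, Semilattice b) => Semilattice (a,b) where
    meet (a,b) (c,d) = (meet a c, meet b d)
instance (Satisfy a b, Satisfy c d) => Satisfy (a,c) (b,d) where
    satisfy (a,c) (b,d) = satisfy a b && satisfy c d

-- Values 1,5,2,7 paired with their positions 1,2,3,4.
tree :: SearchTree (Max Int, Min Int)
tree = fromList [ (Max v, Min p) | (v,p) <- zip [1,5,2,7] [1..] ]
```

The query (Ge 2, Le 3) asks for a value of at least 2 at a position of at most 3. findFirst returns (Max 5, Min 2) and findLast returns (Max 2, Min 3). The query (Ge 6, Le 3) has no answer, even though the bounds of the root and of the right subtree satisfy it, so the search has to visit leaves before giving up.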

When is it efficient?

It's nice that we now have a search tree that always gives correct answers. But is it also efficient? As hinted in the introduction, that is not always the case. First of all, meet could give a really bad bound. For example, if meet a b = Bottom for all a /= b, and Bottom satisfies everything, then we really can do no better than a brute force search. On the other hand, suppose that meet gives 'perfect' information, like the Ge example above,

satisfy q (meet a b)  ==>  satisfy q a || satisfy q b

That is equivalent to saying that

not (satisfy q a) && not (satisfy q b) ==> not (satisfy q (meet a b))

Then for any Branch, we only have to search either the left or the right subtree. Because, if a subtree doesn't contain the value, we can see so from the bound. For a balanced tree, that means the search takes O(log n) time. Another efficient case is when the items are sorted. By that I mean that, if an item satisfies the query, then all items after it also satisfy that query. We actually need something slightly more restrictive: namely that if a query is satisfied for the meet of some items, then all items after them also satisfy the query. In terms of code:

let st = fromList (xs₁ ++ xs₂ ++ xs₃)
satisfy q (meet xs₂)  ==>  all (satisfy q) xs₃

Now suppose that we are searching a tree of the form st = mkBranch a b with findFirst q. Then there are three cases:

1. not (satisfy q (bound st)).
2. not (satisfy q (bound a)).
3. satisfy q (bound a).

In the first case the search fails, and we are done. In the second case, we only have to search b, which by induction can be done efficiently. The third case is not so clear. In fact, there are two sub cases:

3a. findFirst q a = Just someResult
3b. findFirst q a = Nothing

In case 3a we found something in the left branch. Since we are only interested in the first result, that means we are done. In case 3b, we get to use the fact that the items are sorted. Since we have satisfy q (bound a), that means that all items in b will satisfy the query. So when searching b, in all cases we take the left branch. Overall, the search time will be at most twice the depth of the tree, which is O(log n). The really cool thing is that we can combine the two conditions. If satisfy can be written as

satisfy q a == satisfy₁ q a && satisfy₂ q a

where satisfy₁ has exact bounds, and the tree is sorted for satisfy₂, then queries still take O(log n) time.
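To get a feeling for the exact-bounds case, here is a minimal sketch that counts how many nodes a search inspects. It is not the generic SearchTree from this post: it assumes a stripped-down tree over Int whose bound is the maximum, with the query fixed to (>= q). The names Tree and findFirstCount, and this simplified fromList, are made up for the illustration.

```haskell
-- Simplified stand-in for the post's SearchTree: values are Ints and the
-- bound of a subtree is its maximum (this mirrors the Max/Ge example only).
data Tree = Leaf Int | Branch Int Tree Tree

bound :: Tree -> Int
bound (Leaf x)       = x
bound (Branch b _ _) = b

mkBranch :: Tree -> Tree -> Tree
mkBranch l r = Branch (max (bound l) (bound r)) l r

-- Build a balanced tree; the input list is assumed to be non-empty.
fromList :: [Int] -> Tree
fromList []  = error "fromList: empty list"
fromList [x] = Leaf x
fromList xs  = mkBranch (fromList ls) (fromList rs)
  where (ls, rs) = splitAt (length xs `div` 2) xs

-- Find the first value >= q, and count how many nodes were inspected.
findFirstCount :: Int -> Tree -> (Maybe Int, Int)
findFirstCount q t
  | bound t < q = (Nothing, 1)          -- the bound rules out this subtree
findFirstCount _ (Leaf x) = (Just x, 1)
findFirstCount q (Branch _ l r) =
  case findFirstCount q l of
    (Just x, n)  -> (Just x, n + 1)     -- found on the left, skip the right
    (Nothing, n) -> let (res, m) = findFirstCount q r
                    in (res, n + m + 1)
```

Because the maximum is a tight bound, a failing subtree is rejected after looking at a single node. For fromList [1..1024], which has depth 10, findFirstCount 1000 therefore inspects at most 21 nodes instead of walking past a thousand leaves.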

Closing example

Finally, here is an example that makes use of efficient searching with the two conditions. I make use of the Semilattice and Satisfy instances for pairs which I defined above.

treeOfPresidents :: SearchTree (Max Int, Max String)
treeOfPresidents
   = fromList [ (Max year, Max name) | (year,name) <- usPresidents ]
 where
   usPresidents =
     [(1789,"George Washington")
     ,(1797,"John Adams")
     ,(1801,"Thomas Jefferson")
     -- etc

The tree is ordered by year of election, and the Max semilattice gives tight bounds for names. So we can efficiently search for the first US president elected after 1850 whose name starts with a letter after "P":

λ> findFirst (Ge 1850,Ge "P") treeOfPresidents
Just (Max 1869,Max "Ulysses S. Grant")

And with the following query type we can search on just one of the elements of the tuple. Note that we need the type parameter in Any because of the functional dependency in the Satisfy class.

data Any s = Any
instance Semilattice s => Satisfy (Any s) s where
    satisfy _ _ = True

λ> findFirst (Ge 1911,Any) treeOfPresidents
Just (Max 1913,Max "Woodrow Wilson")

Finding rectangles, part 2: borders

2011-10-12T07:57:00Z

In the previous post, we looked at finding axis aligned rectangles in a binary image. Today I am going to solve a variation of that problem:

Given a binary image, find the largest axis aligned rectangle with a 1 pixel wide border that consists entirely of foreground pixels.

Here is an example:
[Image: a binary image with the largest border rectangle outlined]
White pixels are the background and blue is the foreground. The rectangle with the largest area is indicated in red.

Like the previous rectangle finding problem, this one also came up in my master's thesis. The application was to take a scan of a book and find the part that is a page, cutting away clutter.
[Image: a scan of a book page]

Specification

The types we are going to need are exactly the same as in my previous post:

-- An image is a 2D list of booleans, True is the foreground
type Image = [[Bool]]

-- An axis-aligned rectangle
data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq,Ord,Show)

The difference compared to last time is the contains function, which tells whether an image contains a given rectangle. We are now looking only at the borders of rectangles, or 'border rectangles' for short.

-- does an image contain a given border rectangle?
contains :: Image -> Rect -> Bool
contains im rect = isBorder (cropRect im rect)

-- crop an image to the pixels inside a given rectangle
cropRect :: Image -> Rect -> Image
cropRect im (Rect x y w h) = map cols (rows im)
  where
    rows = take h . drop y . (++repeat [])
    cols = take w . drop x . (++repeat False)

-- is the border of an image foreground?
isBorder :: Image -> Bool
isBorder im
    = and (head im)     -- top border
   && and (last im)     -- bottom border
   && and (map head im) -- left border
   && and (map last im) -- right border

Finding the largest border rectangle can again be done by enumerating all rectangles contained in the image, and picking the largest one:

largestRect_spec :: Image -> Rect
largestRect_spec = maximalRectBy area . allRects

allRects :: Image -> [Rect]
allRects im = filter (im `contains`) rects
  where -- boring details omitted, see previous post

Just as before, this specification has runtime O(n⁶) for an n by n image, which is completely impractical.

An O(n⁴) algorithm

Unfortunately, the nice properties of maximal rectangles will not help us out this time. In particular, whenever a filled rectangle is contained in an image, then so are all smaller subrectangles. So we could 'grow' filled rectangles one row or column at a time. This is no longer true for border rectangles.

We can, however, easily improve the above O(n⁶) algorithm to an O(n⁴) one by using the line endpoints. With those we can check if an image contains a rectangle in constant time. We just need to check all four of the sides:

-- pseudo code, not actually O(n^4) without O(1) array lookup
contains_fast im (Rect x y w h)
    = r!!(x,y)     >= x+w -- top border
   && r!!(x,y+h-1) >= x+w -- bottom border
   && b!!(x,y)     >= y+h -- left border
   && b!!(x+w-1,y) >= y+h -- right border

Where r and b give the right and bottom endpoints of the horizontal and vertical lines through each pixel.
[Images: the endpoint images r and b]
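The pseudo code needs constant-time lookups to actually reach O(n⁴). One possible way to get them, sketched here under my own assumptions, is to store the endpoints in Data.Array arrays indexed by (x,y). The names Endpoints, lineEndpoints and containsFast are hypothetical, and the endpoints are computed by a lazy self-referencing array rather than by the scans used in the previous post.

```haskell
import Data.Array

type Image = [[Bool]]
data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq, Ord, Show)

type Endpoints = Array (Int, Int) Int

-- Exclusive right (r) and bottom (b) endpoints of the foreground line through
-- each pixel, indexed by (x,y). The image is assumed to be rectangular.
lineEndpoints :: Image -> (Endpoints, Endpoints)
lineEndpoints im = (r, b)
  where
    ih = length im
    iw = if null im then 0 else length (head im)
    bnds = ((0, 0), (iw - 1, ih - 1))
    fg = array bnds [ ((x, y), p) | (y, row) <- zip [0 ..] im, (x, p) <- zip [0 ..] row ]
    -- Lazy arrays may refer to themselves: each entry looks at its neighbor.
    r = array bnds [ ((x, y), rightEnd x y) | (x, y) <- range bnds ]
    b = array bnds [ ((x, y), bottomEnd x y) | (x, y) <- range bnds ]
    rightEnd x y
      | not (fg ! (x, y)) = x
      | x + 1 >= iw       = iw
      | otherwise         = r ! (x + 1, y)
    bottomEnd x y
      | not (fg ! (x, y)) = y
      | y + 1 >= ih       = ih
      | otherwise         = b ! (x, y + 1)

-- Constant-time border check; the rectangle must lie inside the image
-- and have width and height of at least 1.
containsFast :: (Endpoints, Endpoints) -> Rect -> Bool
containsFast (r, b) (Rect x y w h)
    = r ! (x, y)         >= x + w -- top border
   && r ! (x, y + h - 1) >= x + w -- bottom border
   && b ! (x, y)         >= y + h -- left border
   && b ! (x + w - 1, y) >= y + h -- right border
```

The endpoints are computed once per image, after which every candidate rectangle costs four array lookups. For a 3 by 3 ring of foreground pixels around a background center, Rect 0 0 3 3 is accepted and Rect 0 0 2 2 is rejected, because its bottom border runs through the center pixel.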

An O(n³) algorithm

As the title of this section hints, a still more efficient algorithm is possible. The trick is to only look for rectangles with a specific height h. For any given height h, we will be able to find only maximal rectangles of that height.

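For example, for h=6 we would expect to find these rectangles:
[Image: the border rectangles of height 6 in the example image]
Notice how each of these rectangles consists of three parts: a left side, a middle and a right side:
[Image: the same rectangles split into a left side, a middle and a right side]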

The left and right parts both consist of a vertical line at least h pixels high. We can find those vertical lines by looking at the top (or bottom) line endpoints. The top endpoint for pixel (x,y+h-1) should be at most y,

let h = 6 -- for example
let av = zipWith2d (<=) (drop (h-1) t) y
-- [Images: drop (h-1) t, y, and the resulting av]

Each True pixel in av corresponds to a point where there is an h pixel high vertical line. So, a potential left or right side of a rectangle.

The middle part of each rectangle has both pixel (x,y) and (x,y+h-1) set,

let ah = zipWith2d (&&) a (drop (h-1) a)
-- [Images: a, drop (h-1) a, and the resulting ah]

To find the rectangles of height h, we just need to find runs that start and end with a pixel in av, and where all pixels in between are in ah. First we find the left coordinates of the rectangles,

let leStep (av,ah,x) le
      | av = min le x  -- pixel in av ==> left part
      | ah = le        -- pixel in ah, but not av ==> continue middle part
      | otherwise = maxBound
let le = scanRightward leStep maxBound (zip2d3 av ah x)
[Image: le]

Finally we need to look for right sides. These are again given by av. For each right side, le gives the leftmost left side, and h gives the height of the rectangles:

let mkRect x y av le
      | av = [Rect le y (x-le+1) h] -- pixel in av ==> right part
      | otherwise = []
let rects = zipWith2d4 mkRect x y av le
[Image: the rectangles in rects]

Compare the resulting image to the one at the start of this section. We found the same rectangles.

Just like last time, all we need to do now is put the steps together in a function:

rectsWithHeight :: Int -> Image -> [Rect]
rectsWithHeight h a = concat . concat $ rects
  where
    x  = scanRightward (\_ x -> x + 1) (-1) a
    y  = scanDownward  (\_ y -> y + 1) (-1) a
    t  = scanDownward  (\(a,y) t -> if a then t else y+1) 0 (zip2d a y)
    ah = zipWith2d (&&) (drop (h-1) a) a
    av = zipWith2d (<=) (drop (h-1) t) y
    leStep (av,ah,x) le
      | av = min le x
      | ah = le
      | otherwise = maxBound
    le = scanRightward leStep maxBound (zip2d3 av ah x)
    mkRect x y av le
      | av = [Rect le y (x-le+1) h]
      | otherwise = []
    rects = zipWith2d4 mkRect x y av le

Of course, finding (a superset of) all maximal rectangles in an image is just a matter of calling rectsWithHeight for all possible heights.

findRects_fast :: Image -> [Rect]
findRects_fast im = concat [ rectsWithHeight h im | h <- [1..imHeight im] ]

largestRect_fast :: Image -> Rect
largestRect_fast = maximalRectBy area . findRects_fast

Let's quickly check that this function does the same as the specification,

prop_fast_spec = forAll genImage $ \a -> largestRect_spec a == largestRect_fast a

λ> quickCheck prop_fast_spec
+++ OK, passed 100 tests.

Great.

Conclusions

The runtime of rectsWithHeight is linear in the number of pixels; and it is called n times for an n by n image. Therefore the total runtime of largestRect_fast is O(n³). While this is much better than what we started with, it can still be quite slow. For example, the book page that motivated this problem is around 2000 pixels squared. Finding the largest rectangle takes on the order of 2000³ = 8*10⁹, or 8 giga-operations, which is still a pretty large number.

To make this algorithm faster in practice, I used a couple of tricks. Most importantly, if we know what function we are maximizing, say area, then we can stop as soon as we know that we can't possibly find a better rectangle. The idea is to start with h=imHeight im, and work downwards. Keep track of the area a of the largest rectangle. Then as soon as h * imWidth im < a, we can stop, because any rectangle we can find from then on will be smaller.
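A rough sketch of this early exit, under a few assumptions of mine: the function that lists the rectangles of one height is passed in as an argument (standing in for \h -> rectsWithHeight h im), the result is a Maybe instead of the noRect convention, and ties are kept at the first rectangle found rather than broken lexicographically. The name largestRectPruned is hypothetical.

```haskell
import Data.List (foldl')

data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq, Ord, Show)

area :: Rect -> Int
area rect = width rect * height rect

-- Try heights from tall to short and stop once even a full-width rectangle
-- of the current height could not beat the best area found so far.
-- The third argument stands in for (\h -> rectsWithHeight h im).
largestRectPruned :: Int -> Int -> (Int -> [Rect]) -> Maybe Rect
largestRectPruned imW imH rectsOfHeight = go imH Nothing
  where
    go h best
      | h < 1                = best
      | cannotImprove h best = best
      | otherwise            = go (h - 1) (foldl' better best (rectsOfHeight h))
    cannotImprove h = maybe False (\r -> h * imW < area r)
    better Nothing  r = Just r
    better (Just b) r
      | area r > area b = Just r
      | otherwise       = Just b
```

For a 10 by 10 image in which a 9 by 8 rectangle (area 72) is found at height 8, the loop stops at height 7, since 7 * 10 = 70 is already less than 72.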

Is this the best we can do? No. I know an algorithm for finding all maximal border rectangles in O(n²*(log n)²) time. But it is rather complicated, and this post is long enough already. So I will save it for another time. If anyone thinks they can come up with such an algorithm themselves, I would love to read about it in the comments.

Finding rectangles

2011-09-28T19:49:00Z

This post is based on a part of my master's thesis. The topic of my thesis was OCR of historical documents. A problem that came up there was the following:

Given a binary image, find the largest axis aligned rectangle that consists only of foreground pixels.

These largest rectangles can be used, for instance, to find columns in a page of text. Although in that case one would use large rectangles of background pixels.

Here is an example image:
[Image: a binary image with the largest rectangle outlined]
White pixels are background and blue is the foreground. The rectangle with the largest area is indicated in red. The images you encounter in practical application will be much larger than this example, so efficiency is going to be important.

Specification

Let's start with the types of images and rectangles

-- An image is a 2D list of booleans, True is the foreground
type Image = [[Bool]]

-- An axis-aligned rectangle
data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq,Ord,Show)

And some properties of them,

-- The size of an image
imWidth, imHeight :: Image -> Int
imHeight = length
imWidth (x:_) = length x
imWidth []    = 0

-- The area and perimeter of a rectangle
area, perimeter :: Rect -> Int
area rect = width rect * height rect
perimeter rect = 2 * width rect + 2 * height rect

I will say that an image 'contains' a rectangle if all pixels inside the rectangle are foreground pixels.

contains :: Image -> Rect -> Bool
contains im (Rect x y w h) = and pixelsInRect
  where
    pixelsInRect = concatMap cols (rows im)
    rows = take h . drop y . (++repeat [])
    cols = take w . drop x . (++repeat False)

Now the obvious, stupid, way of finding the largest rectangle is to enumerate all rectangles in the image, and pick the largest from that list:

-- List all rectangles contained in an image
allRects :: Image -> [Rect]
allRects im = filter (im `contains`) rects
  where
    rects = [Rect x y w h | x <- [0..iw], y <- [0..ih]
                          , w <- [1..iw-x], h <- [1..ih-y]]
    iw = imWidth im
    ih = imHeight im

For now, I will take 'largest rectangle' to mean one with the maximal area. I will come back to this choice soon.

largestRect_spec :: Image -> Rect
largestRect_spec = maximalRectBy area . allRects

-- Return the rectangle with maximum f,
--  using lexicographical ordering to break ties
--  return noRect if there are no rectangles in the input list.
maximalRectBy :: Ord a => (Rect -> a) -> [Rect] -> Rect
maximalRectBy f = maximumBy (comparing f `mappend` compare) . (noRect:)
  where noRect = Rect 0 0 0 0

The above code should hopefully be easy to understand. It will find the correct answer for the above example:

λ> largestRect_spec example
Rect {left = 3, top = 2, width = 4, height = 5}

Of course largestRect_spec is horribly slow. In an n by n image there are O(n⁴) rectangles to consider, and checking if one is contained in the image takes O(n²) work, for a total of O(n⁶).

What is 'largest'?

Before continuing, let's determine what it means for a rectangle to be the largest. We could compare the area of rectangles, as we did before. But it is equally valid to look for the rectangle with the largest perimeter.

Can we pick the maximum according to any arbitrary function f :: (Rect -> a)? Not all of these functions will correspond to the intuitive notion of 'largest'. For example f = negate . area will actually lead to the smallest rectangle. In general there is going to be no efficient way of finding the rectangle that maximizes f. All we could do is optimize contains, to get an O(n⁴) algorithm.

We should therefore restrict f to be monotonic. What I mean by monotonic is that f x >= f y whenever rectangle x contains rectangle y. In QuickCheck code:

prop_isMonotonic :: Ord a => (Rect -> a) -> Property
prop_isMonotonic f = property $ \x y ->  x `rectContains` y  ==>  f x >= f y

rectContains :: Rect -> Rect -> Bool
rectContains (Rect x y w h) (Rect x' y' w' h') = x <= x' && y <= y' && x+w >= x'+w' && y+h >= y'+h'

Area is a monotonic function, and so is perimeter. But you could also add weird constraints. For example, only consider rectangles that are at least 10 pixels tall, or only rectangles that contain the point (123,456).
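One way such constraints could be written down, as a sketch: return a pair whose first component says whether the constraint holds and whose second component is the area. Pairs compare lexicographically and False < True, so any rectangle that meets the constraint beats every rectangle that does not. The function names tallThenArea and containingThenArea are made up for this example.

```haskell
data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq, Ord, Show)

area :: Rect -> Int
area rect = width rect * height rect

-- Prefer rectangles at least 10 pixels tall, then larger area.
-- Pairs compare lexicographically and False < True.
tallThenArea :: Rect -> (Bool, Int)
tallThenArea rect = (height rect >= 10, area rect)

-- Prefer rectangles containing a given point, then larger area.
containingThenArea :: (Int, Int) -> Rect -> (Bool, Int)
containingThenArea (px, py) rect@(Rect x y w h) = (inside, area rect)
  where inside = x <= px && px < x + w && y <= py && py < y + h
```

Both functions are monotonic: if x contains y, then x is at least as tall as y, contains every point that y contains, and has at least the same area. Note that with this encoding a rectangle that violates the constraint is still returned when no rectangle satisfies it.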

Maximizing a monotonic function, as opposed to just any function, means that we can skip a lot of rectangles. In particular, whenever rectangle x contains rectangle y, rectangle y doesn't need to be considered. I will call rectangles in the image that are not contained in other (larger) rectangles maximal. The strategy for finding the largest rectangle is then simply to enumerate only the maximal rectangles, and pick the best of those:

largestRect_fast :: Image -> Rect
largestRect_fast = maximalRectBy area . allMaximalRects

For each maximal rectangle there is (trivially) a monotonic function that is maximal for that rectangle. So we can't do any better without taking the specific function f into account.

Machinery

To find maximal rectangles, we are first of all going to need some machinery for working with images. In particular, zipping images together,

zip2d :: [[a]] -> [[b]] -> [[(a,b)]]
zip2d = zipWith zip

zipWith2d :: (a -> b -> c) -> [[a]] -> [[b]] -> [[c]]
zipWith2d = zipWith . zipWith

zipWith2d4 :: (a -> b -> c -> d -> e) -> [[a]] -> [[b]] -> [[c]] -> [[d]] -> [[e]]
zipWith2d4 = zipWith4 . zipWith4

And accumulating/scanning over images. This scanning can be done in four directions. Each scanX function takes a function to apply, and the initial value to use just outside the image. The scans that I use here are slightly different from scanl and scanr, because the output will have the same size as the input, instead of being one element larger.

scanLeftward, scanRightward, scanUpward, scanDownward
    :: (a -> b -> b) -> b -> [[a]] -> [[b]]

scanLeftward  f z = map (init . scanr f z)
scanRightward f z = map (tail . scanl (flip f) z)
scanUpward    f z = init . scanr (\as bs -> zipWith f as bs) (repeat z)
scanDownward  f z = tail . scanl (\as bs -> zipWith f bs as) (repeat z)

Here is an example of a scan that calculates the x-coordinate of each pixel,

let x = scanRightward (\a x -> x + 1) (-1) a
[Image: x, the x-coordinate of each pixel]

And the y-coordinates are of course

let y = scanDownward (\a x -> x + 1) (-1) a
[Image: y, the y-coordinate of each pixel]
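To make the coordinate scans concrete, here is a small worked example on a 3 by 2 image. The image tiny and the names xs and ys are only for illustration; the two scan functions are copied from the definitions above.

```haskell
scanRightward, scanDownward :: (a -> b -> b) -> b -> [[a]] -> [[b]]
scanRightward f z = map (tail . scanl (flip f) z)
scanDownward  f z = tail . scanl (\as bs -> zipWith f bs as) (repeat z)

-- A 3 pixel wide, 2 pixel high image; the pixel values do not matter here.
tiny :: [[Bool]]
tiny = [ [True, False, True]
       , [True, True,  True] ]

xs, ys :: [[Int]]
xs = scanRightward (\_ x -> x + 1) (-1) tiny  -- [[0,1,2],[0,1,2]]
ys = scanDownward  (\_ y -> y + 1) (-1) tiny  -- [[0,0,0],[1,1,1]]
```

Starting from -1 just outside the image, each step adds one, so the first pixel in every row or column gets coordinate 0.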

Finding lines

If we were looking for one-dimensional images, then a 'rectangle' would just be a single line of pixels. Now each pixel is contained in at most one maximal line of foreground pixels. To find the coordinates of this line, we just need to know the left and right endpoints.

For a foreground pixel, the left endpoint of the line it is in is the same as the left endpoint of its left neighbor. On the other hand, a background pixel is not in any foreground line. So the left endpoint of all lines to the right of it will be at least x+1, where x is the x-coordinate of the background pixel. In both these cases information flows from left to right; and so the left endpoint for all pixels can be determined with a rightward scan.

Unsurprisingly, we can find the right endpoints of all foreground lines with a leftward scan. Now let's do this for all lines in the image. Notice that we need the x coordinates defined previously:

let l = scanRightward (\(a,x) l -> if a then l else x+1) 0 (zip2d a x)
[Image: l, the left endpoints]
let r = scanLeftward (\(a,x) r -> if a then r else x) (imWidth a) (zip2d a x)
[Image: r, the right endpoints]

In the images I have marked the left and right endpoints of the foreground lines in red. Note also, the values in the background pixels are not important, and you should just ignore them.
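Since the endpoint images are not reproduced here, a worked example on a single row may help. The row below and the names xs, ls and rs are illustrative; the helper functions are the ones defined in the Machinery section, and 4 is the width of this particular row.

```haskell
zip2d :: [[a]] -> [[b]] -> [[(a,b)]]
zip2d = zipWith zip

scanLeftward, scanRightward :: (a -> b -> b) -> b -> [[a]] -> [[b]]
scanLeftward  f z = map (init . scanr f z)
scanRightward f z = map (tail . scanl (flip f) z)

-- One row: two foreground pixels, a background pixel, one foreground pixel.
row :: [[Bool]]
row = [[True, True, False, True]]

xs, ls, rs :: [[Int]]
xs = scanRightward (\_ x -> x + 1) (-1) row                             -- [[0,1,2,3]]
ls = scanRightward (\(a,x) l -> if a then l else x+1) 0 (zip2d row xs)  -- [[0,0,3,3]]
rs = scanLeftward  (\(a,x) r -> if a then r else x) 4 (zip2d row xs)    -- [[2,2,2,4]]
```

The left endpoint is inclusive and the right endpoint is exclusive: the first line covers x = 0 and 1, so it has l = 0 and r = 2, and the single pixel at x = 3 has l = 3 and r = 4. The values at the background pixel (x = 2) carry no meaning.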

Vertically we can of course do the same thing, giving top and bottom endpoints:

let t = scanDownward (\(a,y) t -> if a then t else y+1) 0 (zip2d a y)
[Image: t, the top endpoints]
let b = scanUpward (\(a,y) b -> if a then b else y) (imHeight a) (zip2d a y)
[Image: b, the bottom endpoints]

However, combining these left/right/top/bottom line endpoints does not yet give rectangles containing only foreground pixels. Rather, it gives something like a cross. For example, using the endpoints for (6,4) leads to the following incorrect rectangle:
[Image: the incorrect rectangle obtained from the endpoints for (6,4)]

In fact, there are many rectangles around this point (6,4):
[Image: several rectangles around the point (6,4)]
Before looking at the area (or whatever function we are maximizing) there is no way of telling which is the best one.

If there was some way to find just a single maximal rectangle for each pixel, then we would have an O(n²) algorithm. Assuming of course that we do find all maximal rectangles.

Finding maximal rectangles

Suppose that Rect x y w h is a maximal rectangle. What does that mean? First of all, one of the points above the rectangle, (x,y-1),(x+1,y-1),..,(x+w-1,y-1), must not be a foreground pixel. Because if all these points are foreground, then the rectangle could be extended upwards, and it would not be maximal. So, suppose that (u,y-1) is a background pixel (or outside the image). Then (u,y) is the top endpoint of the vertical line that contains (u,y+h-1).

If we start from (u,v), we can recover the height of a maximal rectangle using the top endpoint image t. Just take t!!(u,v) as the top coordinate, and v+1 as the bottom. This image illustrates the idea:
[Image: a point (u,v), its top endpoint, and the resulting maximal rectangle]
Here the green point (u,v) has the red top endpoint, and it gives the height and vertical position of the yellow maximal rectangle.

Then to make this vertical line into a maximal rectangle, we just extend it horizontally as far as possible:
[Image: the vertical line extended horizontally into a maximal rectangle]

For this last step, we need to know the first background pixel that will be encountered when extending the rectangle to the left. That is the maximum value of all left endpoints in the rows t,t+1,..,b-1. This maximum can again be determined with a scan over the image:

let lt = scanDownward (\(a,l) lt -> if a then max l lt else minBound) minBound (zip2d a l)
[Image: lt]

For extending to the right the story is exactly the same, only taking the minimum right endpoint instead:

let rt = scanDownward (\(a,r) rt -> if a then min r rt else maxBound) maxBound (zip2d a r)
[Image: rt]

Now we have all the ingredients for finding maximal rectangles:

For a foreground pixel (u,v):
Take as top t!!(u,v)
Take as left lt!!(u,v)
Take as right rt!!(u,v)
Take as bottom v+1.

Every maximal rectangle can be found in this way. However, not all rectangles we get in this way are maximal. In particular, they could potentially still be extended downward. However, for finding the largest rectangle, it doesn't matter if we also see some non-maximal ones. There might also be duplicates, which again does not matter.

So now finishing up is just a matter of putting all the steps together in a function:

allMaximalRects :: Image -> [Rect]
allMaximalRects a = catMaybes . concat $ zipWith2d4 mkRect lt rt t y
  where
    x  = scanRightward (\_ x -> x + 1) (-1) a
    y  = scanDownward  (\_ y -> y + 1) (-1) a
    l  = scanRightward (\(a,x) l -> if a then l else x+1) 0 (zip2d a x)
    r  = scanLeftward  (\(a,x) r -> if a then r else x) (imWidth a) (zip2d a x)
    t  = scanDownward  (\(a,y) t -> if a then t else y+1) 0 (zip2d a y)
    lt = scanDownward  (\(a,l) lt -> if a then max l lt else minBound) minBound (zip2d a l)
    rt = scanDownward  (\(a,r) rt -> if a then min r rt else maxBound) maxBound (zip2d a r)
    mkRect l r t y
        | l /= minBound = Just $ Rect l t (r-l) (y-t+1)
        | otherwise     = Nothing

A quick QuickCheck shows that largestRect_fast finds the same answer as the slow specification:

prop_fast_spec = forAll genImage $ \a -> largestRect_spec a == largestRect_fast a

λ> quickCheck prop_fast_spec
+++ OK, passed 100 tests.

Conclusion

It is possible to find all maximal rectangles that consist entirely of foreground pixels in an n*n image in O(n²) time. That is linear in the number of pixels. Obviously it is not possible to do any better in general.

You may wonder whether this method also works in higher dimensions. And the answer to that question is no. The reason is that there can be more than O(n³) maximal cubes in a three dimensional image. In fact, there can be at least O(n^(d-1)²) maximal hypercubes in d dimensions. Just generalize this image to 3D:
[Image: a 2D example image, to be generalized to 3D]

A small rant on writing academic papers

2011-07-22T14:10:13Z

Warning: rant ahead.

This week I submitted for review the second revision of what will hopefully become my first scientific publication. Together with my supervisor I spent countless hours on this article. But does that mean that it is now the best text that I have ever written? I don't think so.

While a lot of effort did go into improving the clarity, structure, etc., there are several competing interests which make things harder:

The article should be short, which means that a lot of material had to be cut. Some things could be better explained, if only there was more space.
On the other hand, the article should be complete and self contained. If I were to write a blog post on the same topic, I would split it up into a series of posts. But each part by itself would be unpublishable, so it has to be a whole.
There are many asides, either to remark on something interesting, or sometimes just to appease the reviewers.

Pleasing the reviewers is something which I especially disliked. To be fair, a lot of comments raised by reviewers were valid, and pointed to actual shortcomings or errors in the manuscript. But some of the comments were of the form "Could you also compare with X", "Did you consider Y" and "This is related to prior work Z". As a result of trying to cover these comments, the paper becomes a Frankenstein's monster of irrelevant remarks. Where before we had:

General Point 1
Detailed Point 2
Therefore Point 3

It now becomes

Point 1
Remark saying that point 1 was previously considered by SomePaper2010.
Detailed Point 2
Contrasting approach 2 against approach 2b from OtherPaper2009.
Therefore Point 3
Aside saying that also 3b, which is irrelevant for the rest of the article.

Okay, I am exaggerating a bit here. But still, I feel that the article would be better if it didn't try to do so many things at once.

Suggestions, criticism and comments on my sanity are welcome.

Isomorphism lenses

2011-05-22T16:23:12Z

In the past I have blogged about functional references. From now on I will conform to most of the rest of the world, and call these things lenses.

Giving a presentation on these objects has forced me to think about them some more. As a result of this thinking I have a new favorite representation, at least from a theory point of view:

A lens from type a to b is a bijection between a and a pair of b and some residual r.

In pseudo-code we would write that as

type Lens a b = exists r. a <-> (b,r)

Except that Haskell has no exists keyword, so you have to use a data wrapper:

-- Isomorphisms/bijections between type @a@ and @b@
data Iso a b = Iso { fw :: a -> b, bw :: b -> a }

-- Lenses with a data wrapper, in practice you might want to unpack the Iso type
data Lens a b = forall r. Lens (Iso a (b,r))

So, why do I like this representation so much?

Intuition

I believe this representation captures the intuition of what a lens does extremely well: You have some record type a, and you want to take out a field of (a smaller) type b. When you do that you are left with some residual, which you can think of as a-b (or should that be a/b?).

I imagine this graphically as
[Image: a square record a containing a smaller circular field b]
where we have a square record a, containing a smaller circular field of type b.

Implementing the usual get, modify and set functions is now very easy, by going back and forth through the lens.

get :: Lens a b -> a -> b
get (Lens l) = fst . fw l

modify :: Lens a b -> (b -> b) -> (a -> a)
modify (Lens l) f = bw l . first f . fw l

set :: Lens a b -> b -> a -> a
set l b a = modify l (const b) a

The nice thing about the existential quantification is that the residual type r can be anything you like. In some cases it is obvious what it could be, such as the case of tuples:

myFst :: Lens (a,b) a
myFst = Lens (Iso id id) -- r = b

but we could also pick any other representation,

myCrazyFst :: Lens (a,String) a
myCrazyFst = Lens (Iso fw bw) -- r = strings starting with "Banana"
   where fw (a,b) = (a, "Banana" ++ b)
         bw (a,'B':'a':'n':'a':'n':'a':b) = (a,b)

For this to be an actual isomorphism we have to restrict the residual to only strings that start with "Banana". That is not something we can actually enforce in Haskell, but then again, we don't check that a lens is an isomorphism at all.
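As a small usage sketch, here are the definitions from this post collected into one compilable module, together with one extra lens. The lens mySnd is a hypothetical addition of mine: it picks the second component by swapping the pair, so its residual is the first component.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Control.Arrow (first)
import Data.Tuple (swap)

data Iso a b = Iso { fw :: a -> b, bw :: b -> a }
data Lens a b = forall r. Lens (Iso a (b, r))

get :: Lens a b -> a -> b
get (Lens l) = fst . fw l

modify :: Lens a b -> (b -> b) -> (a -> a)
modify (Lens l) f = bw l . first f . fw l

set :: Lens a b -> b -> a -> a
set l b = modify l (const b)

myFst :: Lens (a, b) a
myFst = Lens (Iso id id)       -- residual r = b

-- Hypothetical companion lens: swapping puts the field first, residual r = a
mySnd :: Lens (a, b) b
mySnd = Lens (Iso swap swap)
```

With these, get myFst (1,"x") gives 1, modify myFst (+1) (1,"x") gives (2,"x"), and set mySnd 'z' (1,'a') gives (1,'z'). The caller never sees which residual type a lens uses.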

Besides the simple intuition and the freedom in declaring them, there is another reason for liking these lenses.

Laws

There are two (or one, depending on how you count) obvious laws you want isomorphisms to satisfy:

fw i . bw i = bw i . fw i = id

On the other hand, there are several less obvious laws for lenses:

set l (get l a) a = a
get l (set l b a) = b
set l c (set l b a) = set l c a

And now comes the magic: with isomorphism lenses all of these laws follow from the simple laws of isomorphisms. Here are the quick and dirty proofs. One:

  set l (get l a) a
= {- expanding definitions of get and set -}
  (bw l . first (const ((fst . fw l) a)) . fw l) a
= {- let x = fw l a, rewrite -}
  bw l (first (const (fst x)) x) where x = fw l a
= {- (first (const (fst x)) x) = x -}
  bw l x where x = fw l a
= {- fill in x and rewrite -}
  (bw l . fw l) a
= {- isomorphism law -}
  a

Two:

  get l (set l b a)
= {- expanding definitions of get and set, rewrite to use composition -}
  fst . fw l . bw l . first (const b) . fw l $ a
= {- isomorphism law -}
  fst . first (const b) . fw l $ a
= {- expanding fst, first and const -}
  (\(x,y) -> x) . (\(x,y) -> (b,y)) . fw l $ a
= {- composing the two lambda terms -}
  const b . fw l $ a
= {- definition of const -}
  b

Three:

  set l c (set l b a)
= {- expanding definition of set, rewrite to use composition -}
  bw l . first (const c) . fw l . bw l . first (const b) . fw l $ a
= {- isomorphism law -}
  bw l . first (const c) . first (const b) . fw l $ a
= {- first f . first g = first (f . g) -}
  bw l . first (const c . const b) . fw l $ a
= {- const c . const b = const c -}
  bw l . first (const c) . fw l $ a
= {- definition of set -}
  set l c a

So there you have it. A simple representation of lenses that gives a nice intuition of what these things actually do. And as an added bonus, the laws for lenses follow directly from the definition. Finally, I should say that this representation is not my idea; it has been around in the literature for quite some time.
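The proofs rely on the Iso really being an isomorphism, which Haskell does not check. The following sketch tests the three laws on concrete values. The helper lawsHold and the deliberately broken lens forgetful are hypothetical names of mine; forgetful throws away the second component, so its fw and bw are not inverses.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Control.Arrow (first)

data Iso a b = Iso { fw :: a -> b, bw :: b -> a }
data Lens a b = forall r. Lens (Iso a (b, r))

get :: Lens a b -> a -> b
get (Lens l) = fst . fw l

set :: Lens a b -> b -> a -> a
set (Lens l) b = bw l . first (const b) . fw l

-- The three lens laws, checked for one choice of record and field values.
lawsHold :: (Eq a, Eq b) => Lens a b -> a -> b -> b -> Bool
lawsHold l a b c =
     set l (get l a) a == a
  && get l (set l b a) == b
  && set l c (set l b a) == set l c a

myFst :: Lens (a, b) a
myFst = Lens (Iso id id)

-- Not an isomorphism: the second component is forgotten and reset to 0.
forgetful :: Lens (a, Int) a
forgetful = Lens (Iso (\(a, _) -> (a, ())) (\(a, ()) -> (a, 0)))
```

Here lawsHold myFst (1,'x') 2 3 is True, while lawsHold forgetful (1,5) 2 3 is False: setting the field to the value it already has turns (1,5) into (1,0), which breaks the first law.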

Talk on Lenses

2011-05-19T18:15:00Z

Two days ago I gave a talk on lenses at the Radboud University (where I work on my PhD on machine learning). I put the slides online for your enjoyment, although it might be hard to follow, since it is light on explanatory text.

This talk includes information from at least three different earlier blog posts, as well as Russell O'Connor's recent paper on multiplate. There is no new information in the talk, but I do have a new favorite representation for lenses.

Moving to a new website

2011-03-19T18:31:00Z

I am moving my website from http://twan.home.fmf.nl/ to http://twanvl.nl/. The move comes with a fancy new design, as well as rewritten backend code.

More composition operators

2010-07-12T22:00:00Z

Here is an idea for some more function composition operators, beyond just (.):

(f .$ g) x = (f) . (g $ x)
(f $. g) x = (f $ x) . (g)
(f .$$ g) x y = (f) . (g $ x $ y)
(f $.$ g) x y = (f $ x) . (g $ y)
(f $$. g) x y = (f $ x $ y) . (g)
-- etc.
infixl 8 .$, $., .$$, $.$, $$. -- slightly less tight than (.)

The .$ name is supposed to suggest that an extra argument is applied on the right before the functions are composed. Notice also that the dollars and dot on the left hand side match those on the right hand side. These combinators make writing point free code easier:

concatMap = concat .$ map
sum23 = (+) . (2*) $. (3*)  -- \x y -> 2*x + 3*y
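For anyone who wants to try this, here are the two operators used in these examples with explicit type signatures. The signatures are my reading of the definitions above, and concatMap' is renamed only to avoid clashing with the Prelude.

```haskell
infixl 8 .$, $.

-- Apply one extra argument to the right function, then compose.
(.$) :: (b -> c) -> (x -> a -> b) -> x -> a -> c
(f .$ g) x = f . g x

-- Apply one extra argument to the left function, then compose.
($.) :: (x -> b -> c) -> (a -> b) -> x -> a -> c
(f $. g) x = f x . g

concatMap' :: (a -> [b]) -> [a] -> [b]
concatMap' = concat .$ map

sum23 :: Int -> Int -> Int
sum23 = (+) . (2*) $. (3*)  -- \x y -> 2*x + 3*y
```

Because (.) binds tighter than $., sum23 parses as ((+) . (2*)) $. (3*), so sum23 2 3 is 2*2 + 3*3 = 13.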

Here is another family of composition operators:

(f $. g) x = (f) (g x)    -- a.k.a. (.)
(f .$ g) x = (f x) (g)    -- a.k.a. flip
(f $.. g) x y = (f) (g x y)
(f .$. g) x y = (f x) (g y)
(f ..$ g) x y = (f x y) (g)
(f $... g) x y z = (f) (g x y z)
(f .$.. g) x y z = (f x) (g y z)
(f ..$. g) x y z = (f x y) (g z)
(f ...$ g) x y z = (f x y z) (g)
-- etc.
infixl 8 $., .$, $..,.$.,..$, $...,.$..,..$.,...$

Think of the . as the placeholder for an argument. It would be better if I could use _, but Haskell doesn't allow that. You can also think of the dots as the points from point-free style, so these operators allow for the preservation of the number of points :). With these operators the previous example becomes:

concatMap = concat $.. map
sum23 = (+) $. (2*) .$. (3*)  -- \x y -> 2*x + 3*y

I like the second family better, because they do not use (.), which makes the first family more confusing. What do you think? Would these operators be useful in practice?

Four ways to fold an array

2009-11-08T23:00:00Z

As most Haskell programmers know, there are two ways to fold a list: from the right with foldr and from the left with foldl. foldr is corecursive (productive), which is great when the output can be produced lazily. foldl (or better, its strict cousin foldl') is tail recursive, preventing stack overflows.

We can define analogous operations for other data structures like 1-dimensional arrays. Libraries like 'Data.ByteString' and 'Data.Vector' provide these. But as I will show in this post there are more fold operations than the common two.

The data type I will use in this post is simply

type Array1D a = Array Int a
-- and two utility functions for getting the lower and upper bounds
lo,hi :: Array1D a -> Int
lo = fst . bounds
hi = snd . bounds

The right fold applies a function f to the current value and the folded result of the rest of the array:

foldr_a :: (a -> b -> b) -> b -> Array1D a -> b
foldr_a f z0 ar = go (lo ar)
  where go i
         | i > hi ar = z0
         | otherwise = f (ar ! i) (go (i + 1))

The (strict) left fold uses an accumulator parameter:

-- Lazy-accumulator variant, kept only for reference; foldl'_a below is the more interesting one
foldl_a :: (b -> a -> b) -> b -> Array1D a -> b
foldl_a f z0 ar = go z0 (lo ar)
  where go z i
         | i > hi ar = z
         | otherwise = go (f z (ar ! i)) (i + 1)

foldl'_a :: (b -> a -> b) -> b -> Array1D a -> b
foldl'_a f z0 ar = go z0 (lo ar)
  where go !z i
         | i > hi ar = z
         | otherwise = go (f z (ar ! i)) (i + 1)

In each case, the recursive go function is very similar in structure to the list version; only instead of recursing for the tail of the list we recurse for index i+1. The time and space behavior is also similar. For example, if you have a large array

testArray :: Array1D Integer
testArray = listArray (1,10^6) [1..]

Then for computing something like the sum of all elements, you should use a strict left fold:

*Main> foldl'_a (+) 0 testArray
500000500000
*Main> foldr_a (+) 0 testArray
*** Exception: stack overflow

On the other hand, a right fold is the way to go when you are only interested in a part of a lazily produced result. For example when converting an array to a list:

*Main> take 10 . foldr_a (:) [] $ testArray
[1,2,3,4,5,6,7,8,9,10]
(0.02 secs, 520824 bytes)
*Main> take 10 . foldl'_a (flip (:)) [] $ testArray
[1000000,999999,999998,999997,999996,999995,999994,999993,999992,999991]
(5.89 secs, 263122464 bytes)

All of this is exactly the same as with lists.

But, if you look at foldr_a and foldl'_a, you will see that they both contain a loop doing (i + 1). So in a sense, both of these functions work from left to right!

Because arrays allow for random access, it is possible to make true right to left folds: just start at the end and do (i - 1) in each iteration.

foldl_b :: (b -> a -> b) -> b -> Array1D a -> b
foldl_b f z ar = go (hi ar)
  where go i
         | i < lo ar = z
         | otherwise = f (go (i - 1)) (ar ! i)

foldr'_b :: (a -> b -> b) -> b -> Array1D a -> b
foldr'_b f z0 ar = go z0 (hi ar)
  where go !z i
         | i < lo ar = z
         | otherwise = go (f (ar ! i) z) (i - 1)

Just look at the pretty duality there! We now have a lazy left fold and a strict right fold.

The behavior is exactly the opposite of that of the fold_a functions above:

*Main> foldl_b (+) 0 testArray
*** Exception: stack overflow
*Main> foldr'_b (+) 0 testArray
500000500000

*Main> take 10 . foldr'_b (:) [] $ testArray
[1,2,3,4,5,6,7,8,9,10]
(6.19 secs, 263055372 bytes)
*Main> take 10 . foldl_b (flip (:)) [] $ testArray
[1000000,999999,999998,999997,999996,999995,999994,999993,999992,999991]
(0.00 secs, 524836 bytes)

To summarize, four ways to fold an array are:

                                      lo to hi, i+1    hi to lo, i-1
corecursion, productive, lazy         foldr_a          foldl_b
accumulator, tail recursive, strict   foldl'_a         foldr'_b

Exercise: can you think of other ways to fold an array?

CPS based functional references

2009-07-19T22:00:00Z

I have recently come up with a new way of representing functional references.

As you might recall, functional references (also called lenses) are like a pointer into a field of some data structure. The value of this field can be extracted and modified. For example:

GHCi> get fstF (123,"hey")
123
GHCi> set fstF 456 (123,"hey")
(456,"hey")
GHCi> modify fstF (*2) (123,"hey")
(246,"hey")

where fstF is a functional reference to the first element of a pair. It has the type RefF (a,b) a, i.e. in a 'record' of type (a,b) it points to an a.

Previous representations relied on a record that contained the get and set or the get and modify functions. But there is a much nicer looking representation possible using Functors.

First of all we will need a language extension and some modules:

{-# LANGUAGE Rank2Types #-}
import Control.Applicative
import Control.Monad.Identity

Now the representation for functional references I came up with is:

type RefF a b = forall f. Functor f => (b -> f b) -> (a -> f a)

This type looks a lot like a continuation passing style function, which would be simply (b -> r) -> (a -> r), but where the result is f a instead of any r. With different functors you get different behaviors. With the constant functor we can get the field pointed to:

get :: RefF a b -> a -> b
get r = getConst . r Const

While the identity functor allows us to modify the field with a function:

modify :: RefF a b -> (b -> b) -> a -> a
modify r m = runIdentity . r (Identity . m)

set :: RefF a b -> b -> a -> a
set r b = modify r (const b)

As an example of an 'instance', here is the fstF function I used in the introduction:

fstF :: RefF (a,b) a
fstF a_to_fa (a,b) = (\a' -> (a',b)) <$> a_to_fa a

If we had tuple sections it could be written as simply

fstF x (a,b) = (,b) <$> x a

To get access to inner fields, functional references can be composed. So compose fstF fstF points to the first element inside the first element of a nested pair. One of the things that I like about the cps/functor based representation is that composition is quite beautiful and symmetric:

compose :: RefF b c -> RefF a b -> RefF a c
compose r s = s . r

idF :: RefF a a
idF = id
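As a quick check of how composition behaves, the following self-contained sketch collects the definitions above and applies a composed reference to a nested pair. It is only an illustration: sndF is a hypothetical companion to fstF that does not appear in this post, compose is written in eta-expanded form (equivalent to s . r), and Const and Identity are imported from base's Data.Functor modules rather than the modules listed earlier.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

type RefF a b = forall f. Functor f => (b -> f b) -> (a -> f a)

get :: RefF a b -> a -> b
get r = getConst . r Const

modify :: RefF a b -> (b -> b) -> a -> a
modify r m = runIdentity . r (Identity . m)

fstF :: RefF (a, b) a
fstF k (a, b) = (\a' -> (a', b)) <$> k a

-- hypothetical: the mirror image of fstF, assumed for this example
sndF :: RefF (a, b) b
sndF k (a, b) = (\b' -> (a, b')) <$> k b

-- same as `compose r s = s . r`, with the continuation made explicit
compose :: RefF b c -> RefF a b -> RefF a c
compose r s k = s (r k)

nested :: ((Int, String), Bool)
nested = ((1, "one"), True)

innerFst :: Int
innerFst = get (compose fstF fstF) nested            -- the 1 inside the inner pair

innerSnd :: String
innerSnd = get (compose sndF fstF) nested            -- the "one" inside the inner pair

bumped :: ((Int, String), Bool)
bumped = modify (compose fstF fstF) (+ 1) nested     -- only the inner Int changes
```

Note the argument order: the first argument of compose is the inner reference and the second is the outer one, mirroring ordinary function composition.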

Let me conclude with the pair operator, called (***) in Control.Arrow. Unfortunately this operator is not as easy to define.

pair :: RefF a c -> RefF b d -> RefF (a,b) (c,d)
pair r s cd_to_fcd (a,b) = some_ugly_code

In fact, the only way I know of implementing pair is by moving back and forth to a get/set representation

 where some_ugly_code =
         let fcd = cd_to_fcd (get r a, get s b)      -- :: f (c,d)
             cd_to_ab (c,d) = (set r c a, set s d b) -- :: (c,d) -> (a,b)
         in fmap cd_to_ab fcd                        -- :: f (a,b)

The problem is that we need to split one function of type (c,d) -> f (c,d) into two, c -> f c and d -> f d, because that is what the left and right arguments expect. Then later, we would need to do the reverse and combine two of these functions again.

Does anyone have a better suggestion for implementing pair?

Where do I get my non-regular types?

2009-04-24T22:00:00Z

Friday I wrote about the type

data FunList a b
    = Done b
    | More a (FunList a (a -> b))

Where did this type come from? What can you use it for?

The story starts with another way of constructing FunLists, besides pure. For contrast I will call it 'impure'.

impure :: a -> FunList a a
impure a = More a (Done id)

I claim that any FunList can be written in the form

pure b <*> impure a₁ <*> impure a₂ <*> ...

for some b and a₁, a₂, etc. In other words, impure and Applicative are all that you need. The following function converts a FunList to the above form, where impure and the Applicative instance are left as parameters:

withImpure :: Applicative f => (a -> f a) -> FunList a b -> f b
withImpure imp (Done b)   = pure b
withImpure imp (More a f) = withImpure imp f <*> imp a

If you use this with the Applicative instance from last time you will find that getAs . withImpure impure = reverse . getAs! I have written a reverse function without realizing it. Since this time we don't want to reverse the list, I am going to turn the Applicative instance around for this post:

instance Applicative (FunList a) where
    pure = Done
    c <*> Done b   = fmap ($b) c
    c <*> More a z = More a ((.) <$> c <*> z)

To support my claim above I need to prove that withImpure impure = id. This is a simple exercise in proof by induction. First off, we have that

withImpure impure (Done b) = pure b = Done b

Now assume that the theorem holds for z, i.e. withImpure impure z = z. Then

  withImpure impure (More a z)
= withImpure impure z <*> impure a
= z <*> impure a -- by induction hypothesis
= z <*> More a (Done id)
= More a ((.) <$> z <*> Done id)
= More a (fmap ($id) (fmap (.) z))
= More a (fmap (.id) z)
= More a z

By induction withImpure impure z = z for all z.

I actually came upon FunList from the other direction. I started with the higher order type

type ApplicativeFunList a b = forall f. Applicative f => (a -> f a) -> f b

An ApplicativeFunList is a function of the form \imp -> applicativeStuff. Since the applicativeStuff has to work for any applicative functor it can only use operations from that class in addition to the imp argument. Because of the Applicative laws, things like anything <*> pure x are the same as ($ x) <$> anything, so the only interesting functions of this form are

\imp -> pure b
\imp -> pure b <*> imp a₁
\imp -> pure b <*> imp a₁ <*> imp a₂
-- etc.

Which is precisely what a FunList can represent! Indeed, we can convert any FunList to an ApplicativeFunList, and back again:

toAFL :: FunList a b -> ApplicativeFunList a b
toAFL fl imp = withImpure imp fl

fromAFL :: ApplicativeFunList a b -> FunList a b
fromAFL afl = afl impure

We already know that fromAFL . toAFL = withImpure impure = id. The other way around, I claim (but do not prove yet) that toAFL . fromAFL = id. Hence, FunList and ApplicativeFunList are isomorphic!

A non-regular data type challenge

2009-04-22T22:00:00Z

While playing around with generalized functional references I encountered the following list-like data type:

data FunList a b
    = Done b
    | More a (FunList a (a -> b))

This is a non-regular data type, meaning that inside the FunList a b there is a FunList a not-b. So, what does a value of this type look like? Well, it can be

Done (x :: b), or
More a₁ (Done (x :: a -> b)), or
More a₁ (More a₂ (Done (x :: a -> a -> b))), etc.

We either have just b, or an a and a function a->b, or two as (i.e. a²) and a function a²->b, or a³ and a³->b, etc.

A FunList a b is therefore a list of as together with a function that takes exactly that number of as to give you a b. Extracting the single represented b value is easy:

getB :: FunList a b -> b
getB (Done b)   = b
getB (More a z) = getB z a

As is getting to the list of as:

getAs :: FunList a b -> [a] 
getAs (Done _)   = []
getAs (More a z) = a : getAs z

But then things quickly get much trickier. Since a FunList a b holds exactly one b, we might ask how much access we have to it. First off, FunList a is a Functor, so the b value can be changed:

instance Functor (FunList a) where
    fmap f (Done b)   = Done (f b)
    fmap f (More a z) = More a (fmap (f .) z)

The above case for More looks a bit strange, but remember that the data type is non-regular, so we recurse with a different function f. In this case instead of having type b -> c as the outer f does, we need something with type (a -> b) -> (a -> c).

The Applicative instance is even stranger. There is a flip there, where the heck did that come from?

instance Applicative (FunList a) where
    pure = Done
    Done b   <*> c = fmap b c                    -- follows from Applicative laws
    More a z <*> c = More a (flip <$> z <*> c)   -- flip??

Aside from manipulating the b value we can also do more list like things to the list of as, such as zipping:

zipFun :: FunList a b -> FunList c d -> FunList (a,c) (b,d)
zipFun (Done b)   d          = Done (b,getB d)
zipFun b          (Done d)   = Done (getB b,d)
zipFun (More a b) (More c d) = More (a,c) (applyPair <$> zipFun b d)
    where applyPair (f,g) (x,y) = (f x,g y)

Surprisingly, the applicative operator defined above can be used as a kind of append, just look at the type:

(<*>) :: FunList a (b -> c) -> FunList a b -> FunList a c

it takes two 'lists' and combines them into one. It is indeed true that getAs a ++ getAs b == getAs (a <*> b).
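The append behaviour can be checked on a small example. The sketch below repeats the definitions from this post so that it stands alone; the helper single (a one-element FunList) and the concrete values are assumptions made for the example.

```haskell
data FunList a b
    = Done b
    | More a (FunList a (a -> b))

getB :: FunList a b -> b
getB (Done b)   = b
getB (More a z) = getB z a

getAs :: FunList a b -> [a]
getAs (Done _)   = []
getAs (More a z) = a : getAs z

instance Functor (FunList a) where
    fmap f (Done b)   = Done (f b)
    fmap f (More a z) = More a (fmap (f .) z)

instance Applicative (FunList a) where
    pure = Done
    Done b   <*> c = fmap b c
    More a z <*> c = More a (flip <$> z <*> c)

-- hypothetical helper: a FunList holding one element and returning it
single :: a -> FunList a a
single a = More a (Done id)

-- holds the elements 1 and 2, and still expects one more argument
left :: FunList Int (Int -> Int)
left = (\x y z -> x + y + z) <$> single 1 <*> single 2

right :: FunList Int Int
right = single 3

-- getAs appended is getAs left ++ getAs right
appended :: FunList Int Int
appended = left <*> right
```

Here getAs appended lists the elements of left followed by those of right, while getB appended feeds all three elements to the stored function.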

This is as far as I got, so I will end this post with a couple of challenges:

Show that FunList a is a monad.
Show that FunList a is not a monad.
Write a function reverseFun :: FunList a b -> FunList a b that reverses a FunList, i.e. getAs . reverseFun == reverse . getAs.
Write an O(n) reverse function.

Knight in n, part 4: tensors

2008-12-09T23:00:00Z

Welcome to the fourth installment of the Knight in n series. In part 3 we talked about the direct product of rings, and how it helped us solve the knight moves problem. This time yet another type of product is going to help in decomposing the algorithm to allow faster parts to be put in.

The tensor product of rings

In part three I introduced the direct product on rings, which is nothing more than a pair of numbers. Confusingly this operation is also called direct sum. To illustrate this name, take the direct sum/product of Array i a with Array j b. For every index i (within the bounds of the first array) there is a value of type a, and for every index j there is a value of type b. Instead of a pair of arrays, this could also be implemented as a single array with the type Array (Either i j) (Either a b). "Either" is just Haskell's way of saying "disjoint union" or "sum type", hence "direct sum".

There is another product operation that we can perform on two rings: the tensor product. Dually to the direct sum, the tensor product of Array i a and Array j b has type Array (i,j) (a,b). The definition is very simple: the array contains all pairs where the first part comes from the first array, and the second part comes from the second array.

Slightly more generally, we can use any combining function. The general tensor product of two arrays can be implemented as:

tensorWith :: (Ix i, Ix j) => (a -> b -> c) -> Array i a -> Array j b -> Array (i,j) c
tensorWith f a b
    = array ((a_lo,b_lo),(a_hi,b_hi))
      [ ((i,j), f x y) | (i,x) <- assocs a, (j,y) <- assocs b ]
  where (a_lo,a_hi) = bounds a
        (b_lo,b_hi) = bounds b

Usually elements are multiplied:

(><) :: (Ix i, Ix j, Num a) => Array i a -> Array j a -> Array (i,j) a
(><) = tensorWith (*)

The mathematical notation for this (><) operator is ⊗. Now an example: here we take two 4-element vectors; their tensor product has 4*4=16 elements. The two vectors are "one dimensional*" objects; their tensor product is a "two dimensional" matrix.

A special case we will use often is the tensor product of an array with itself:

square x = x >< x

For example (using simple reflection of expressions which is now on hackage as Debug.SimpleReflect):

Knight4> square (listArray (0,2) [u,v,w])
listArray ((0,0),(2,2)) [u*u, u*v, u*w
                        ,v*u, v*v, v*w
                        ,w*u, w*v, w*w]
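For readers without Debug.SimpleReflect at hand, the same operation can be tried on plain numbers. This is a small self-contained sketch; the arrays rowVec and colVec and the name table are made up for the example.

```haskell
import Data.Array

tensorWith :: (Ix i, Ix j) => (a -> b -> c) -> Array i a -> Array j b -> Array (i, j) c
tensorWith f a b =
    array ((aLo, bLo), (aHi, bHi))
          [ ((i, j), f x y) | (i, x) <- assocs a, (j, y) <- assocs b ]
  where
    (aLo, aHi) = bounds a
    (bLo, bHi) = bounds b

(><) :: (Ix i, Ix j, Num a) => Array i a -> Array j a -> Array (i, j) a
(><) = tensorWith (*)

-- example inputs (assumed): a 3-element and a 2-element vector
rowVec, colVec :: Array Int Integer
rowVec = listArray (0, 2) [1, 2, 3]
colVec = listArray (0, 1) [10, 20]

-- a 3 by 2 table of all pairwise products
table :: Array (Int, Int) Integer
table = rowVec >< colVec
```

The result has bounds ((0,0),(2,1)), and the entry at (i,j) is rowVec ! i * colVec ! j, so for instance table ! (2,1) is 3 * 20.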

Interchange law

The tensor product and convolution operations satisfy the very useful interchange law:

(a >< b) * (c >< d) == (a * c) >< (b * d)

And since exponentiation is repeated convolution, also

(a >< b)^n == a^n >< b^n

For a proof sketch of this equation, compare the definitions of (><) and mulArray. Ignoring array bounds stuff, we have:

convolution:     [ ( i+j,  x*y) | (i,x) <- assocs a, (j,y) <- assocs b ]
tensor product:  [ ((i,j), x*y) | (i,x) <- assocs a, (j,y) <- assocs b ]

The only difference is in what happens to indices, with convolution the indices are added, with the tensor product a pair is formed. Now consider the interchange law. Informally, the indices of the left hand side are of the form (i_a,i_b)+(i_c,i_d), and on the right hand side (i_a+i_c,i_b+i_d). This corresponds exactly to the piecewise addition for Num (α,β).
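The index argument can also be tested mechanically. The sketch below is not the array code from this series: to sidestep the bounds bookkeeping it assumes a sparse representation, Data.Map from index to coefficient, and all names in it (convolveWith, tensor, plusPair and the sample values) are invented for the example.

```haskell
import qualified Data.Map as M

-- convolution: indices are combined with `plus`, coefficients are multiplied
convolveWith :: (Ord k, Num v) => (k -> k -> k) -> M.Map k v -> M.Map k v -> M.Map k v
convolveWith plus a b =
    M.fromListWith (+) [ (plus i j, x * y) | (i, x) <- M.toList a, (j, y) <- M.toList b ]

-- tensor product: indices are paired, coefficients are multiplied
tensor :: (Ord i, Ord j, Num v) => M.Map i v -> M.Map j v -> M.Map (i, j) v
tensor a b =
    M.fromList [ ((i, j), x * y) | (i, x) <- M.toList a, (j, y) <- M.toList b ]

-- pointwise addition of index pairs, as in the Num (α,β) instance
plusPair :: (Int, Int) -> (Int, Int) -> (Int, Int)
plusPair (i, j) (k, l) = (i + k, j + l)

a, b, c, d :: M.Map Int Integer
a = M.fromList (zip [-2 ..] [1, 1, 0, 1, 1])
b = M.fromList (zip [-2 ..] [1, -1, 0, -1, 1])
c = M.fromList (zip [0 ..] [2, 3, 4])
d = M.fromList (zip [1 ..] [5, 6])

-- both sides of the interchange law for these four inputs
lhs, rhs :: M.Map (Int, Int) Integer
lhs = convolveWith plusPair (tensor a b) (tensor c d)
rhs = tensor (convolveWith (+) a c) (convolveWith (+) b d)

interchangeHolds :: Bool
interchangeHolds = lhs == rhs
```

Both sides end up with the same set of index pairs, and each coefficient is the same sum of products, just grouped differently.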

The interchange law is often exploited to perform faster convolutions. For example, consider blurring an image by taking the convolution with a Gaussian blur kernel:

Performing this convolution requires O(n⁴) operations for an n by n image.

The two dimensional Gaussian blur kernel can be written as the tensor product of two one dimensional kernels, with a bit of algebra this gives:

So now to blur an image we can perform two convolutions, first with the horizontal kernel, and then with the vertical one:

This procedure needs only O(n³) operations.

Back to business

Blurring images is not what we are trying to do. Instead of convolution with the Gaussian blur kernel, we are interested in convolution with moveMatrix. We could try the same trick, finding an a such that moveMatrix == a >< a. Unfortunately, this is impossible.

But we can still get close, there is a way to write moveMatrix == square a + square b, well, almost. Actually, what we have is:

2 * moveMatrix
      0 2 0 2 0        1 1 0 1 1      1 -1  0 -1  1 
      2 0 0 0 2        1 1 0 1 1     -1  1  0  1 -1 
 ==   0 0 0 0 0   ==   0 0 0 0 0  -   0  0  0  0  0   ==   square a - square b
      2 0 0 0 2        1 1 0 1 1     -1  1  0  1 -1 
      0 2 0 2 0        1 1 0 1 1      1 -1  0 -1  1

where

a,b :: Array Int Integer
a = listArray (-2,2) [1,1,0,1,1]
b = listArray (-2,2) [1,-1,0,-1,1]

Now we can start with paths_conv from last time:

paths_conv n ij = (moveMatrix ^ n) `safeAt` ij

Where safeAt is a safe array indexing operator, that returns 0 for indices that are out of bounds:

safeAt ar i
    | inRange (bounds ar) i = ar ! i
    | otherwise             = 0

Now let's do some algebraic manipulation:

    paths_conv n ij
= {- by definition of paths_conv -}
    (moveMatrix ^ n) `safeAt` ij
= {- by definition of a and b -}
    ((square a - square b) `div` 2)^n `safeAt` ij -- division by 2 is pseudocode
= {- division does not depend on the index -}
    (square a - square b)^n `safeAt` ij `div` 2^n

We still cannot apply the interchange law, because the exponentiation (^n) is applied to the difference of two tensor products and not a single one. We can, however, expand this exponentiation by the formula:

(a + b)^n = sum [ multinomial [n_a,n_b] * a^n_a * b^n_b | (n_a,n_b) <- split n ]

This is just the usual binomial expansion, as in (a + b)^2 = a^2 + 2*a*b + b^2.

Applying binomial expansion to our work-in-progress gives:

    (square a - square b)^n `safeAt` ij `div` 2^n
= {- binomial expansion -}
    sum [ multinomial [n_a,n_b] * square a^n_a * (-square b)^n_b
        | (n_a,n_b) <- split n ]
    `safeAt` ij `div` 2^n
= {- (-square b)^n_b == (-1)^n_b * square b^n_b -}
    sum [ multinomial [n_a,n_b] * (-1)^n_b
        * square a^n_a * square b^n_b
        | (n_a,n_b) <- split n ]
    `safeAt` ij `div` 2^n
= {- interchange law -}
    sum [ multinomial [n_a,n_b] * (-1)^n_b
        * square (a^n_a * b^n_b)
        | (n_a,n_b) <- split n ]
    `safeAt` ij `div` 2^n
= {- move `safeAt` inwards, since addition is pointwise -}
    sum [ multinomial [n_a,n_b] * (-1)^n_b
        * square (a^n_a * b^n_b) `safeAt` ij
        | (n_a,n_b) <- split n ]
    `div` 2^n

Fast indexing

Since square something already has n² elements and the loop is performed n+1 times, this algorithm still requires O(n³) operations.

The only reason for calculating square (a^n_a * b^n_b) is because we need the element at index ij. So instead of constructing a whole array, let's just calculate that single element:

-- square x `safeAt` ij  ==  x `squareAt` ij
x `squareAt` (i,j) = x `safeAt` i * x `safeAt` j

So the inner part of the algorithm becomes:

    square (a^n_a * b^n_b) `safeAt` ij
= {- property of squareAt -}
    (a^n_a * b^n_b) `squareAt` ij

We are still not there yet. Both a^n_a and b^n_b have O(n) elements, so just calculating their convolution takes O(n²) work. But again, we need only two elements of the convolution, so we can define:

-- a * b `safeAt` i  ==  mulArrayAt a b i
mulArrayAt a b n = sum [ x * b `safeAt` (n-i) | (i,x) <- assocs a ]

And update squareAt accordingly:

mulSquareAt a b (i,j) = mulArrayAt a b i * mulArrayAt a b j

Finally we need a more efficient way to calculate all the powers of a and b. The iterate function can help us with that:

Knight4> iterate (*u) 1
[1, 1*u, 1*u*u, 1*u*u*u, 1*u*u*u*u, ...

Putting the pieces together gives a O(n²) algorithm for the knight moves problem:

paths_tensor n ij
      = sum [ multinomial [n_a,n_b] * (-1)^n_b
            * mulSquareAt (powers_of_a !! n_a) (powers_of_b !! n_b) ij
            | (n_a,n_b) <- split n
            ]
       `div` 2^n
  where powers_of_a = iterate (*a) 1
        powers_of_b = iterate (*b) 1

Note that the savings we have made do not come directly from decomposing the moveMatrix. It is just that this decomposition allows us to see that we are computing all elements of an expensive product where a single one would do.

This post brings another order of improvement. Do you think you can do better than O(n²) time and O(n²) space complexity? If so I would like to hear.

*: The number of elements is often called the dimension of a vector. Here we use the term dimension to refer to the number of indices used, also known as the (tensor) order. So a 100*100 pixel image has dimension 10000 according to the first interpretation (the number of elements), but dimension two in the second interpretation (the number of indices).

Knight in n, part 3: rings

2008-12-03T23:00:00Z

In this third installment, we will look at how to use various types as numbers, i.e. how to make them an instance of the Num type class. The solution to the knight moves problem will emerge at the end, almost as if by magic. :)

Tangent: Things as numbers

Many types can be used as if they are numbers. Haskell-wise this means they can be an instance of the Num type class. Mathematically it means that these types are rings.

Pairs as numbers

Let's start with a Num instance for pairs (α,β). In general, our only choice is to do everything pointwise. So for all operations ⊗ (i.e. (+), (-) and (*)):

(a,b) ⊗ (c,d) = (a ⊗ c, b ⊗ d)

In ring theory this is called the direct product. In Haskell we can write it as:

instance (Num α, Num β) => Num (α,β) where
    (a,b) + (c,d) = (a+c,b+d)
    (a,b) - (c,d) = (a-c,b-d)
    (a,b) * (c,d) = (a*c,b*d)
    fromInteger i = (fromInteger i, fromInteger i)
    abs     (a,b) = (abs    a, abs    b)
    signum  (a,b) = (signum a, signum b)

We could also make instances for triples, quadruples and other tuples this way, but those are not needed for the rest of the story.

Arrays as numbers

A more general kind of tuple is an array, which is somewhat like a tuple of arbitrary size. Of course, that is not quite true, since two arrays with the same type can have a different size. One way around this problem is to treat all arrays as if they are infinite, by taking values outside the bounds to be equal to 0. So

listArray (0,0) [1] == listArray (-∞,∞) [..,0,0,1,0,0,..] -- pseudocode

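That way we can still do addition pointwise: the element of a + b at index i is the sum of the elements of a and b at index i, where an index outside the bounds counts as 0.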

The accumArray function can help with the missing elements by setting them to 0 by default:

addArray a b = accumArray (+) 0 (min a_lo b_lo, max a_hi b_hi) (assocs a ++ assocs b)
  where (a_lo,a_hi) = bounds a
        (b_lo,b_hi) = bounds b

Next up is the fromInteger function. fromInteger 0 is easy; for other values there are two options:

1. fromInteger i is an infinite array of values i.
2. fromInteger i is an array with the value i at some single point.

The first choice mimics the definition for tuples, fromInteger i = (fromInteger i, fromInteger i). But for arrays this has the slight problem of requiring an infinite array. For the second alternative we need to pick the index where to put the number i. The obvious choice is to put i at 'the origin', index 0:

intArray i = listArray (0,0) [fromInteger i]

Finally, multiplication. As you have learned in school, multiplication can be seen as repeated addition. In our Haskell world that means that we expect the law a + a = fromInteger 2 * a to hold.

If we had used the first choice for fromInteger then multiplication could be done pointwise as it was for tuples. But we have made a different choice, so now fromInteger 2 is an array that contains the value 2 at index 0 (and is implicitly zero everywhere else). When calculating fromInteger 2 * a, this 2 should be multiplied with all elements of the array a.

The operation that does the right thing is convolution. It looks like this:

So for each element v at index i in the first array, we shift a copy of the second array so that its origin becomes i. This copy is multiplied by v and all these copies are added. If one of the arrays is fromInteger v (i.e. a scalar), then this corresponds to multiplying all elements in the other array by v; exactly what we wanted.

Convolution can be implemented with accumArray as:

mulArray a b
    = accumArray (+) 0 (bounds a + bounds b)
      [ (i+j, x*y) | (i,x) <- assocs a, (j,y) <- assocs b ]

Notice that we use the Num (α,β) instance for the bounds, and that this definition is nicely symmetrical.
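To see that convolution really does give the scalar multiplication we asked for, here is a sketch specialised to one-dimensional Int-indexed arrays. The name mulArray1D and the explicit bounds arithmetic are assumptions for this example (the definition above instead adds the bounds with the Num (α,β) instance); two, shift and p are made-up sample values.

```haskell
import Data.Array

-- convolution for Int-indexed arrays; the bounds are added componentwise
mulArray1D :: Num a => Array Int a -> Array Int a -> Array Int a
mulArray1D a b =
    accumArray (+) 0 (aLo + bLo, aHi + bHi)
               [ (i + j, x * y) | (i, x) <- assocs a, (j, y) <- assocs b ]
  where
    (aLo, aHi) = bounds a
    (bLo, bHi) = bounds b

-- the scalar 2: a single value at the origin
two :: Array Int Integer
two = listArray (0, 0) [2]

-- a single 1 at index 1
shift :: Array Int Integer
shift = listArray (1, 1) [1]

p :: Array Int Integer
p = listArray (0, 2) [2, 3, 4]

-- every element of p multiplied by 2, same indices
doubled :: Array Int Integer
doubled = mulArray1D two p

-- the same elements as p, moved one index to the right
shifted :: Array Int Integer
shifted = mulArray1D shift p
```

Multiplying by a value at the origin scales the array in place, while multiplying by a value at index 1 moves every element one position to the right.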

Putting it all together, we get the following instance:

instance (Ix i, Num i, Num a) => Num (Array i a) where
    fromInteger = intArray
    (+)         = addArray
    (*)         = mulArray
    negate      = fmap negate
    abs         = fmap abs
    signum      = fmap signum

In mathematical terms, what we constructed here is called a group ring. There is a group ring R[G] for any group G and ring R, which corresponds to an instance Num (Array g r) when g is a group (i.e. an instance of Num) and r is a ring (also an instance of Num).

Arrays as polynomials

Another way to interpret the above instance, is by treating arrays as polynomials over some variable x. The array array [(i,a),(j,b),(k,c),..] then represents the polynomial axⁱ+bx^j+cx^k+.... The addition and multiplication defined above now have the expected meaning, for example:

> let a = listArray (0,2) [2,3,4]  --  2 + 3x + 4x^2
> let b = listArray (1,2) [5,6]    --  5x + 6x^2
> a + b
array (0,2) [(0,2),(1,8),(2,10)]   --  2 + 8x + 10x^2
> a * b
array (1,4) [(1,10),(2,27),(3,38),(4,24)]  --  10x + 27x^2 + 38x^3 + 24x^4

We can make this even more suggestive by defining:

x = listArray (1,1) [1]

> (2 + 3*x + 4*x^2) * (5*x + 6*x^2) == 10*x + 27*x^2 + 38*x^3 + 24*x^4
True

If you are interested in this interpretation, sigfpe wrote an interesting blog post about convolutions, polynomials and power series.

It's magic!

Now, let's go back to our original problem, the moves of a chess knight.

The positions reachable in a single move can be put into a two dimensional array (i.e. a matrix).

moveMatrix :: Array (Int,Int) Integer
moveMatrix = accumArray (+) 0 ((-2,-2),(2,2)) [ (m,1) | m <- moves ]

This is the familiar move matrix, which we already saw in part 1.

Knight3> printMatrix moveMatrix
    0 1 0 1 0
    1 0 0 0 1
    0 0 0 0 0
    1 0 0 0 1
    0 1 0 1 0

Now the magic. We defined multiplication of two arrays a and b as adding copies of b for each value in a. If we use the move matrix as b, then this means we add all possible destinations of a knight making one move from each place it can reach. Repeating this n times gives us our answer. Since repeated multiplication is exponentiation:

allPaths_conv n = moveMatrix ^ n

For example, for n=2:

If we are interested in just a single point there is the array indexing operator (!) to help us:

paths_conv n ij
    | inRange (bounds m) ij = m ! ij
    | otherwise             = 0
  where m = allPaths_conv n
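The claim that repeated convolution counts knight paths can be cross-checked against a direct recursion. The sketch below is a stand-in, not the array code above: it assumes a Data.Map from positions to counts so that no Num instance for arrays is needed. The list of moves is written out in the order used in part 2, and pathsNaive is a hypothetical brute-force counter added for the comparison.

```haskell
import qualified Data.Map as M

type Board = M.Map (Int, Int) Integer

moves :: [(Int, Int)]
moves = [(2,1),(2,-1),(-2,1),(-2,-1),(1,2),(-1,2),(1,-2),(-1,-2)]

moveBoard :: Board
moveBoard = M.fromList [ (m, 1) | m <- moves ]

-- two dimensional convolution: add the positions, multiply the counts
convolve :: Board -> Board -> Board
convolve a b =
    M.fromListWith (+)
        [ ((i + k, j + l), x * y) | ((i, j), x) <- M.toList a, ((k, l), y) <- M.toList b ]

-- moveBoard convolved with itself n times, starting from the origin
allPaths :: Int -> Board
allPaths n = iterate (convolve moveBoard) (M.singleton (0, 0) 1) !! n

paths :: Int -> (Int, Int) -> Integer
paths n ij = M.findWithDefault 0 ij (allPaths n)

-- brute force: try every last move and recurse
pathsNaive :: Int -> (Int, Int) -> Integer
pathsNaive 0 (0, 0) = 1
pathsNaive 0 _      = 0
pathsNaive n (i, j) = sum [ pathsNaive (n - 1) (i - di, j - dj) | (di, dj) <- moves ]
```

For example, paths 2 (0,0) is 8, because each of the eight moves can be undone in exactly one way.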

This convolutional algorithm can count the number of paths in O(n³), and not just for a single end point, but for all end points at once! The program is also a lot simpler than the

The paths_conv algorithm is pretty good, but we can still do better. Next time I will show how the algorithm from part 3 can be improved further, and curiously, how it will start to look more like the algorithm from part 2.

Knight in n, part 2: combinatorics

2008-11-30T23:00:00Z

Previously in this series:

part 1: moves

In my previous post I introduced the 'knight moves problem': How many ways are there for a chess knight to reach cell (i,j) in exactly n moves? The recursive solution from last time is horribly inefficient for larger values of n. Today I will show some more efficient solutions.

Ignoring the order of moves

If the knight first makes a move (-1,2) and then a move (2,1) it will end up at (1,3). If it first moves (2,1) and then (-1,2) it will also end up at (1,3). So, the order in which the moves happen does not matter for the final position! We can exploit this fact to make a faster program. Instead of determining what move to make at each step, we can count how many moves we make of each type and then determine in how many different orders these moves can be performed.

Denote by n₁ the number of moves of the first type, n₁₂₃ the number of moves of type 1, 2 or 3, etc. So n₁₂₃₄ = n₁+n₂+n₃+n₄, and since there are eight different moves, n_12345678 = n. A count n_ab can be split into n_a+n_b in several ways; for now we will consider all possibilities:

split n = [ (i,n-i) | i <- [0..n] ]

So for example, split 3 = [(0,3),(1,2),(2,1),(3,0)].

By repeatedly splitting n we arrive at:

paths_split n (i,j) = sum $ do
    let n_12345678 = n
    (n₁,n_2345678) <- split n_12345678
    (n₂,n₃₄₅₆₇₈) <- split n_2345678
    (n₃,n₄₅₆₇₈) <- split n₃₄₅₆₇₈
    (n₄,n₅₆₇₈) <- split n₄₅₆₇₈
    (n₅,n₆₇₈) <- split n₅₆₇₈
    (n₆,n₇₈) <- split n₆₇₈
    (n₇,n₈) <- split n₇₈
    let counts = [n₁,n₂,n₃,n₄,n₅,n₆,n₇,n₈]
    guard $ (i,j) == destination counts
    return $ multinomial counts

Here we only keep sequences of moves that end up in (i,j), as determined by the destination function:

destination counts = (sum hs, sum vs)
    where (hs,vs) = unzip [ (n*δ_i,n*δ_j) | (n,(δ_i,δ_j)) <- zip counts moves ]

Next, we need to know how many different paths can be formed with a particular set of moves. You might remember binomial coefficients from high school, which give the number of ways to pick k items from a set of size n: n! / (k! * (n-k)!).

If we take n equal to m+k we get the number of different lists containing exactly k red balls and m green balls. Or put differently, the number of different paths containing k moves of the first type and m moves of the second type. This interpretation of binomial coefficients can be generalized to more than two types, giving multinomial coefficients. These are exactly what we need to determine the number of paths given the counts of each type of move:

multinomial xs | any (< 0) xs = 0
multinomial xs = factorial (sum xs)
               `div` product (map factorial xs)
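A few small values make the definition concrete. In this sketch factorial is a plain product over Integer rather than a memoised table, and binomial is a hypothetical helper showing the two-type special case; both are assumptions made to keep the example self-contained.

```haskell
-- simple stand-in for a memoised factorial
factorial :: Integer -> Integer
factorial n = product [1 .. n]

multinomial :: [Integer] -> Integer
multinomial xs
    | any (< 0) xs = 0
    | otherwise    = factorial (sum xs) `div` product (map factorial xs)

-- hypothetical helper: choosing k out of n is a multinomial with two types
binomial :: Integer -> Integer -> Integer
binomial n k = multinomial [k, n - k]
```

So multinomial [2,1] is 3 (the orderings aab, aba, baa), multinomial [1,1,1] is 6 (all orderings of three distinct moves), and any negative count gives 0.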

This multinomial function requires calculating a lot of factorials; to make this as fast as possible they should be stored in an 'array':

factorial :: Int -> Integer
factorial = unboundedArray $ scanl (*) 1 [1..]

Calculating paths_split only takes O(n⁷) integer operations, since each split effectively costs a factor n. While this is better than the previous result, it is still not satisfactory.

Solving the guard condition

The above function uses a "generate and test" approach: Generate all possibilities and test which ones reach the destination. It would be more efficient to generate only the possibilities that do.

Algebraic reasoning can help us here. Let's start by expanding the condition in the guard statement:

   (i,j) == destination counts
= {- by definition of destination -}
   (i,j) == (sum hs, sum vs)
     where (hs,vs) = unzip [ (n*δ_i,n*δ_j) | (n,(δ_i,δ_j)) <- zip counts moves ]
= {- expand unzip and simplify -}
   i == sum (zipWith (*) counts (map fst moves)) &&
   j == sum (zipWith (*) counts (map snd moves))
= {- by definition of moves (see previous post) -}
   i == sum (zipWith (*) counts [2,2,-2,-2,1,-1,1,-1]) &&
   j == sum (zipWith (*) counts [1,-1,1,-1,2,2,-2,-2])
= {- expanding the sum and product, remember n₁₂ = n₁+n₂, etc. -}
   i == 2*n₁₂ - 2*n₃₄ + n₅₇ - n₆₈ &&
   j == 2*n₅₆ - 2*n₇₈ + n₁₃ - n₂₄
= {- reordering -}
   n₅₇ - n₆₈ == i - 2*n₁₂ + 2*n₃₄ &&
   n₁₃ - n₂₄ == j - 2*n₅₆ + 2*n₇₈

These are equations we can work with. Take the equation involving i. We know that n₅₇ + n₆₈ = n₅₆₇₈, and that n₅₇ - n₆₈ == i - 2*n₁₂ + 2*n₃₄. From these two equations, we can solve for n₅₇ and n₆₈, without needing an expensive split:

-- | find a and b such that a+b == c, a-b == d, a,b >= 0
solve_pm c d
   | ok == 0 && a >= 0 && a <= c = return (a,c-a)
   | otherwise                   = mzero
 where (a,ok) = (c + d) `divMod` 2
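To make the behaviour of solve_pm concrete, here is a small sketch that runs it in the list monad. The type signature is an assumption (the post leaves it implicit), and examples is a name introduced only for illustration.

```haskell
import Control.Monad (MonadPlus, mzero)

-- Same definition as in the post, with an explicit (assumed) type signature.
solve_pm :: MonadPlus m => Int -> Int -> m (Int, Int)
solve_pm c d
  | ok == 0 && a >= 0 && a <= c = return (a, c - a)
  | otherwise                   = mzero
  where (a, ok) = (c + d) `divMod` 2

-- In the list monad a failed solve is simply the empty list.
examples :: [[(Int, Int)]]
examples =
  [ solve_pm 5 1     -- a+b == 5, a-b == 1   gives (3,2)
  , solve_pm 5 2     -- c+d is odd, so there is no integer solution
  , solve_pm 3 7     -- would need b < 0, so there is no solution
  , solve_pm 4 (-4)  -- a == 0 is allowed, gives (0,4)
  ]
```

Because there is at most one solution, this step does not multiply the number of alternatives the way a split does.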

This gives an O(n⁵) algorithm:

paths_pm n (i,j) = sum $ do
    let n_12345678 = n
    (n₁₂₃₄,n₅₆₇₈) <- split n_12345678
    (n₁₂,n₃₄) <- split n₁₂₃₄
    (n₅₆,n₇₈) <- split n₅₆₇₈
    (n₅₇,n₆₈) <- solve_pm n₅₆₇₈ (i - 2*n₁₂ + 2*n₃₄)
    (n₁₃,n₂₄) <- solve_pm n₁₂₃₄ (j - 2*n₅₆ + 2*n₇₈)
    (n₁,n₂) <- split n₁₂
    let n₃ = n₁₃ - n₁
    let n₄ = n₂₄ - n₂
    (n₅,n₆) <- split n₅₆
    let n₇ = n₅₇ - n₅
    let n₈ = n₆₈ - n₆
    return $ multinomial [n₁,n₂,n₃,n₄,n₅,n₆,n₇,n₈]

Multinomial laws

It turns out that we don't actually need to know n₁, n₂, etc. If you think about it, multinomial [n₁,n₂,n₃,n₄,n₅,n₆,n₇,n₈] means: "The number of different lists with n₁ red balls, n₂ green balls, etc.". To make such a list we can first pick where to put the red balls, then where to put the blue balls, then the green balls and so on.

But we could also first decide where the brightly colored balls (red and green) go and where the dark colored ones (blue) go. Now there are only two types of balls, so this is a binomial coefficient, or in terms of a multinomial, multinomial [n_rg,n_b]. Then for the positions with brightly colored balls, we need to determine which ones are which color, which can be done in multinomial [n_r,n_g] ways. In a picture:

The same argument also holds when there are eight types of balls (or moves), so

multinomial [n₁,n₂,n₃,n₄,n₅,n₆,n₇,n₈]
 == multinomial [n₁₂,n₃₄,n₅₆,n₇₈]
  * multinomial [n₁,n₂] * multinomial[n₃,n₄]
  * multinomial [n₅,n₆] * multinomial[n₇,n₈]

If you plug this into the paths_pm function, you might notice that the last part of the function is calculating the product of two independent things. One part is about n₁..n₄ and the other about n₅..n₈. Now remember that paths_pm takes the sum over all possibilities, and that products distribute over sums. This means that the two inner loops, the one splitting n₁₂ and the one splitting n₅₆, can be performed independently, giving us an O(n⁴) algorithm:

paths_O4 n (i,j) = sum $ do
    let n_12345678 = n
    (n₁₂₃₄,n₅₆₇₈) <- split n_12345678
    (n₁₂,n₃₄) <- split n₁₂₃₄
    (n₅₆,n₇₈) <- split n₅₆₇₈
    (n₅₇,n₆₈) <- solve_pm n₅₆₇₈ (i - 2*n₁₂ + 2*n₃₄)
    (n₁₃,n₂₄) <- solve_pm n₁₂₃₄ (j - 2*n₅₆ + 2*n₇₈)
    let result₁₂₃₄ = sum $ do
         (n₁,n₂) <- split n₁₂
         let n₃ = n₁₃ - n₁
         let n₄ = n₂₄ - n₂
         return $ multinomial [n₁,n₂] * multinomial[n₃,n₄]
    let result₅₆₇₈ = sum $ do
         (n₅,n₆) <- split n₅₆
         let n₇ = n₅₇ - n₅
         let n₈ = n₆₈ - n₆
         return $ multinomial [n₅,n₆] * multinomial[n₇,n₈]
    return $ multinomial [n₁₂,n₃₄,n₅₆,n₇₈] * result₁₂₃₄ * result₅₆₇₈

Here both of the result parts are of the form

sum [ multinomial [a,b] * multinomial[x-a,y-b] | (a,b) <- split n ]

which just so happens to be equivalent to just multinomial [x,y] (a proof of this statement is left as an exercise, i.e. I am too lazy to write it out). This equation immediately leads to a (much simpler) O(n³) algorithm:

paths_O3 n (i,j) = sum $ do
    let n_12345678 = n
    (n₁₂₃₄,n₅₆₇₈) <- split n_12345678
    (n₁₂,n₃₄) <- split n₁₂₃₄
    (n₅₆,n₇₈) <- split n₅₆₇₈
    (n₅₇,n₆₈) <- solve_pm n₅₆₇₈ (i - 2*n₁₂ + 2*n₃₄)
    (n₁₃,n₂₄) <- solve_pm n₁₂₃₄ (j - 2*n₅₆ + 2*n₇₈)
    return $ multinomial [n₁₂,n₃₄,n₅₆,n₇₈]
           * multinomial [n₅₇,n₆₈]
           * multinomial [n₁₃,n₂₄]
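The sum-of-products equation used in this last step is an instance of Vandermonde's identity. For readers who prefer a numeric spot check over a proof, here is a minimal sketch. It assumes that split n enumerates all pairs (a, n-a) with 0 <= a <= n, uses a plain factorial, and only checks the range 0 <= n <= x+y (which is the situation in paths_O4, where n₁₂ never exceeds n₁₃+n₂₄). The names lhs and vandermondeHolds are introduced here for illustration.

```haskell
factorial :: Int -> Integer
factorial n = product [1 .. fromIntegral n]

multinomial :: [Int] -> Integer
multinomial xs
  | any (< 0) xs = 0
  | otherwise    = factorial (sum xs) `div` product (map factorial xs)

-- Assumed shape of the integer split from the earlier post:
-- all ways to write n as a sum of two non-negative numbers.
split :: Int -> [(Int, Int)]
split n = [ (a, n - a) | a <- [0 .. n] ]

-- Left-hand side of the identity for a fixed n.
lhs :: Int -> Int -> Int -> Integer
lhs n x y = sum [ multinomial [a, b] * multinomial [x - a, y - b] | (a, b) <- split n ]

-- Check the identity for all x, y up to a bound and every n in range.
vandermondeHolds :: Int -> Bool
vandermondeHolds bound =
  and [ lhs n x y == multinomial [x, y]
      | x <- [0 .. bound], y <- [0 .. bound], n <- [0 .. x + y] ]
```

Terms where x-a or y-b would be negative contribute nothing, because multinomial returns 0 for negative counts.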

Verifying the results

After all this manipulation it is a good idea to check whether the program still does the right thing. We can either manually compare the path matrices:

check paths = and [ pathMatrix paths_rec n == pathMatrix paths n | n <- [0..3] ]

Or use QuickCheck or SmallCheck:

Knight2> smallCheck 5 (\(N n) ij -> paths_O3 n ij == paths_rec n ij)
...
Depth 5:
  Completed 726 test(s) without failure.
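If you want to reproduce such a comparison without the rest of the module, the pieces can be assembled into one self-contained sketch. Two things here are assumptions rather than the code from the posts: split is taken to enumerate all pairs (k, n-k), and factorial is a plain product instead of the unbounded array. The variable names use ASCII digits instead of subscripts, and agree is a hypothetical helper.

```haskell
moves :: [(Int, Int)]
moves = [(2,1),(2,-1),(-2,1),(-2,-1),(1,2),(-1,2),(1,-2),(-1,-2)]

-- Reference implementation from part 1.
paths_rec :: Int -> (Int, Int) -> Integer
paths_rec 0 (0,0) = 1
paths_rec 0 (_,_) = 0
paths_rec n (i,j) = sum [ paths_rec (n-1) (i+di, j+dj) | (di,dj) <- moves ]

factorial :: Int -> Integer
factorial n = product [1 .. fromIntegral n]

multinomial :: [Int] -> Integer
multinomial xs
  | any (< 0) xs = 0
  | otherwise    = factorial (sum xs) `div` product (map factorial xs)

-- Assumed definition: all ways to write n as a sum of two naturals.
split :: Int -> [(Int, Int)]
split n = [ (k, n - k) | k <- [0 .. n] ]

-- solve_pm specialised to the list monad.
solve_pm :: Int -> Int -> [(Int, Int)]
solve_pm c d
  | ok == 0 && a >= 0 && a <= c = [(a, c - a)]
  | otherwise                   = []
  where (a, ok) = (c + d) `divMod` 2

paths_O3 :: Int -> (Int, Int) -> Integer
paths_O3 n (i,j) = sum $ do
    (n1234, n5678) <- split n
    (n12, n34) <- split n1234
    (n56, n78) <- split n5678
    (n57, n68) <- solve_pm n5678 (i - 2*n12 + 2*n34)
    (n13, n24) <- solve_pm n1234 (j - 2*n56 + 2*n78)
    return $ multinomial [n12,n34,n56,n78]
           * multinomial [n57,n68]
           * multinomial [n13,n24]

-- Compare both functions on every reachable cell for up to maxN moves.
agree :: Int -> Bool
agree maxN = and [ paths_O3 n (i,j) == paths_rec n (i,j)
                 | n <- [0 .. maxN], i <- [-2*n .. 2*n], j <- [-2*n .. 2*n] ]
```

Keep maxN small, since paths_rec is exponential in n.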

Finally, to contrast with the first part of this series, here is the time it takes to calculate the number of paths in 100 steps:

Knight2> paths_O3 100 (4,4)
2422219241769802380469882122062019059350760968380804461263234408581143863208781993964800
(4.75 secs, 270708940 bytes)

The recursive algorithm would need in the order of 10⁷⁷ years to arrive at this answer.

Still, paths_O3 is not the fastest possible algorithm. Next time I will look at a completely different approach, but further improvements to the solution in this post are possible as well. As an exercise for the reader, you should try transforming paths_O3 into an O(n²) solution. Hint: there are more sums-of-products of independent values.

Knight in n, part 1: moves

2008-11-25T23:00:00Z

Consider the following problem:

A knight is placed at the origin of a chessboard that is infinite in all directions. How many ways are there for that knight to reach cell (i,j) in exactly n moves?

This knight moves problem is not hard, nor does it have any real life applications. The problem is still interesting because there are many different ways to solve it, ranging from very simple to quite complex. In this series of articles I will describe some of these solutions.

Knight's moves

In chess, a knight can move two squares horizontally and one square vertically, or two squares vertically and one square horizontally. One complete move therefore looks like the letter 'L'. The picture on the right shows all possible moves for the black knight in the center.

We can summarize all these moves in a list:

moves :: [(Int,Int)]
moves = [(2,1),(2,-1),(-2,1),(-2,-1)
        ,(1,2),(-1,2),(1,-2),(-1,-2)]

Counting the number of paths to (i,j) in n steps can now be done with a simple recursive function. The base case is that in 0 moves only cell (0,0) is reachable. In the recursion step we simply try all moves:

paths_rec :: Int -> (Int,Int) -> Integer
paths_rec 0 (0,0) = 1
paths_rec 0 (_,_) = 0
paths_rec n (i,j) = sum [ paths_rec (n-1) (i+δ_i,j+δ_j) | (δ_i,δ_j) <- moves ]

So for example

Knight1> paths_rec 4 (2,2)
54

I.e. there are 54 ways to reach cell (2,2) in 4 moves.

Unfortunately the function paths_rec is not very efficient. In fact, it is very much not efficient. At each step all 8 possible moves are considered, so the total time complexity of this function is O(8ⁿ).

Tables

Besides calculating the number of paths to a single point it can also be interesting to display the number of paths for each possible end point. We can make a list of lists containing all the path counts,

pathMatrix paths n
    = [ [ paths n (i,j) | j <- [-2*n .. 2*n] ] | i <- [-2*n .. 2*n] ]

and then display this list in a tabular format

showMatrix :: Show α => [[α]] -> String
showMatrix xss = unlines [ unwords [ show x | x <- xs ] | xs <- xss ]

printPathMatrix paths = putStr . showMatrix . pathMatrix paths

The path matrix for n=1 should be familiar: it is the same as the image of possible moves of a knight.

Knight1> printPathMatrix paths_rec 1
    0 1 0 1 0
    1 0 0 0 1
    0 0 0 0 0
    1 0 0 0 1
    0 1 0 1 0

But now we can also make larger tables:

Knight1> printPathMatrix paths_rec 2
    0 0 1 0 2 0 1 0 0
    0 2 0 2 0 2 0 2 0
    1 0 0 0 2 0 0 0 1
    0 2 0 2 0 2 0 2 0
    2 0 2 0 8 0 2 0 2
    0 2 0 2 0 2 0 2 0
    1 0 0 0 2 0 0 0 1
    0 2 0 2 0 2 0 2 0
    0 0 1 0 2 0 1 0 0

If you were to continue increasing n, the table and the numbers in it become ever larger. It is a good idea to make a 'density plot', i.e. to use colors to visualize larger numbers. For example for n=4, the path matrix can be rendered as:

Special cases

Looking at the above matrices, you might start to see some patterns emerge:

A knight cannot move more than 2n cells in any direction (horizontal or vertical).
Similarly, the knight moves no more than 3n squares in total.
A knight always moves from a white square to a black square and vice-versa. In other words, i+j+n must be even.

These observations can be used as additional cases in the recursive function to quickly eliminate large parts of the input space:

paths_case :: Int -> (Int,Int) -> Integer
paths_case 0 (0,0) = 1
paths_case 0 (_,_) = 0
paths_case n (i,j) | (n+i+j) `mod` 2 /= 0 = 0
paths_case n (i,j) | abs i + abs j > 3*n  = 0
paths_case n (i,j) | abs i > 2*n          = 0
paths_case n (i,j) | abs j > 2*n          = 0
paths_case n (i,j) = sum [ paths_case (n-1) (i+δ_i,j+δ_j) | (δ_i,δ_j) <- moves ]

A quick test shows that this can be a big improvement for the run time:

Knight1> paths_rec 8 (4,4)
124166
(92.88 secs, 4605991724 bytes)
Knight1> paths_case 8 (4,4)
124166
(17.69 secs, 807191624 bytes)

The asymptotic time complexity of paths_case is harder to analyze. It is still O(8ⁿ) in the worst case, but the complexity is now also output dependent.

That is all for now; next time we will look at smarter algorithms. For the interested reader I would suggest that you try to come up with some ideas of your own. I would love to hear how other people approach this problem.

Arrays without bounds

2008-11-13T23:00:00Z

Regular old arrays have a size; you can't just have an infinite array. On the other hand, a lazy language such as Haskell does allow infinite lists. The idea behind the UnboundedArray module is to combine the O(1) access of arrays with the unbounded size of lazy lists.

module UnboundedArray where

This data type is built on top of ordinary arrays and unsafe IO operations:

import Data.Array
import Data.IORef
import System.IO.Unsafe

To keep things simple, an unbounded array is just a function from the natural numbers to array elements:

type UnboundedArray a = Int -> a

I am just going to dump the code here instead of explaining it. The idea is to make an array and resize it when it becomes too small. If the size increases geometrically with each resize, then the amortized cost of a single access will be O(1).

-- | Create an unbounded array from an infinite list
--   Accessing element /n/ takes /O(n)/ time, but only /O(1)/ amortized time.
unboundedArray :: [a] -> UnboundedArray a
unboundedArray xs = unsafePerformIO . unsafePerformIO (unboundedArrayIO xs)

unboundedArrayIO :: [a] -> IO (Int -> IO a)
unboundedArrayIO xs = do
    theArray <- newIORef (listArray (0,0) xs)
    return $ \n -> do
        ar <- readIORef theArray
        let (0,size) = bounds ar
        if n <= size
          then return $ ar ! n
          else do let size' = max n (size * 3 `div` 2)
                  let ar' = listArray (0,size') xs
                  writeIORef theArray ar'
                  return $ ar' ! n
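To get a feeling for why geometric growth keeps the amortized cost low, the resize rule can be simulated without any arrays. The sketch below is not part of the module; it assumes the indices 0..n are accessed in increasing order and counts each resize as copying size'+1 elements. The names grow and simulate are introduced here for illustration.

```haskell
-- Upper bound chosen by the resize rule above when index n is out of bounds.
grow :: Int -> Int -> Int
grow size n = max n (size * 3 `div` 2)

-- Access indices 0..n in order, starting from bounds (0,0), and return
-- (number of resizes, total number of elements copied by all resizes).
simulate :: Int -> (Int, Int)
simulate n = go 0 0 0 [0 .. n]
  where
    go _    resizes copied []       = (resizes, copied)
    go size resizes copied (k : ks)
      | k <= size = go size resizes copied ks
      | otherwise = let size' = grow size k
                    in  go size' (resizes + 1) (copied + size' + 1) ks
```

Under these assumptions the number of resizes grows only logarithmically with n, and the total number of copied elements stays within a small constant factor of n.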

So, what are UnboundedArrays good for? A simple application is memoization, for example:

memo_Int f = unboundedArray (map f [0..])

fib = memo_Int realFib
  where realFib 0 = 1
        realFib 1 = 1
        realFib n = fib (n - 1) + fib (n - 2)

> map fib [1..20]
[1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,1597,2584,4181,6765,10946]

But since we can use an arbitrary list for initialization, the unboundedArray function can sometimes be more flexible/convenient than memo_Int.

A generic merge function

2008-08-14T22:00:00Z

When working with sorted lists you often come to the point where you want to combine two or more of them. This merge procedure forms the heart of merge sort; it works something like:

merge [1,3,4,5] [2,3,4] = [1,2,3,3,4,4,5]

This merge function is not in the Haskell standard library, and even if it were, it might not be very useful.

The problem is that when you need merge you often need a slight variation. For example, you might want to remove duplicates,

merge_union [1,3,4,5] [2,3,4] = [1,2,3,4,5]

Or find the elements common to both lists,

merge_intersection [1,3,4,5] [2,3,4] = [3,4]

Or you want the difference, the symmetric difference, or...

The solution for all these problems is to make a more general merge function. To do that we take a note from the most generic function over a single list, foldr. The generic merge function is also a right fold, but over two lists. Behold the type signature:

mergeByR :: (a -> b -> Ordering)  -- ^ cmp: Comparison function
         -> (a -> b -> c -> c)    -- ^ f_xy: Combine when a and b are equal
         -> (a -> c -> c)         -- ^ f_x:  Combine when a is less
         -> (b -> c -> c)         -- ^ f_y:  Combine when b is less
         -> c                     -- ^ z:   Base case
         -> [a] -> [b] -> c       -- ^ Argument lists and result list

Don't be scared by the size. The reason there are a lot of arguments is that for each case we use a different combining function: If the smallest element comes from the first list we use f_x, if it comes from the second list we use f_y, and when the two elements are equal, we combine them both with f_xy. As in foldr these calls to f_x/f_y/f_xy are then chained like f_x x₁ (f_x x₂ (.. z)).

The lists from the example above can be aligned as follows:

xs               =  [1,      3,   4,   5 ]
ys               =  [    2,  3,   4      ]
function to use  =  [f_x, f_y, f_xy, f_xy, f_x]
mergeByR ....    =  f_x 1 . f_y 2 . f_xy 3 3 . f_xy 4 4 . f_x 5 $ z

The function implementation is straightforward:

mergeByR cmp f_xy f_x f_y z = go
    where go []     ys     = foldr f_y z ys
          go xs     []     = foldr f_x z xs
          go (x:xs) (y:ys) = case cmp x y of
              LT -> f_x  x   (go xs (y:ys))
              EQ -> f_xy x y (go xs ys)
              GT -> f_y    y (go (x:xs) ys)

Now, let's look at some uses of this function. First of all, the usual merge sort merge function:

mergeBy cmp = mergeByR cmp (\a b c -> a:b:c) (:) (:) []
merge = mergeBy compare

Instead of adding both a and b to the resulting list when they are equal, we can instead add only one of them, or even the result of some function on them. This gives the set union operation:

unionByWith cmp f = mergeByR cmp (\a b c -> f a b:c) (:) (:) []
unionWith = unionByWith compare

If we ignore elements that occur in only one of the lists by setting f_x and f_y to const id, we get the intersection instead:

intersectionByWith cmp f = mergeByR cmp (\a b c -> f a b:c) (const id) (const id) []
intersectionWith = intersectionByWith compare
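The difference and symmetric difference mentioned earlier fit the same pattern. The following is a minimal sketch; the names differenceBy and symmetricDifferenceBy are my own choice for this example, and mergeByR is repeated so that the block stands on its own.

```haskell
mergeByR :: (a -> b -> Ordering) -> (a -> b -> c -> c)
         -> (a -> c -> c) -> (b -> c -> c) -> c -> [a] -> [b] -> c
mergeByR cmp f_xy f_x f_y z = go
    where go []     ys     = foldr f_y z ys
          go xs     []     = foldr f_x z xs
          go (x:xs) (y:ys) = case cmp x y of
              LT -> f_x  x   (go xs (y:ys))
              EQ -> f_xy x y (go xs ys)
              GT -> f_y    y (go (x:xs) ys)

-- Keep elements of the first list that have no equal partner in the second:
-- drop equal pairs, keep x, ignore y.
differenceBy :: (a -> b -> Ordering) -> [a] -> [b] -> [a]
differenceBy cmp = mergeByR cmp (\_ _ c -> c) (:) (const id) []

-- Keep elements that occur in only one of the two lists:
-- drop equal pairs, keep everything else.
symmetricDifferenceBy :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
symmetricDifferenceBy cmp = mergeByR cmp (\_ _ c -> c) (:) (:) []
```

With the lists from the introduction, differenceBy compare [1,3,4,5] [2,3,4] gives [1,5] and symmetricDifferenceBy compare [1,3,4,5] [2,3,4] gives [1,2,5]. Like the other variants, both assume sorted input.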

With these merge functions, implementing merge sort becomes simple. All that is left to do is split a list in two, and recursively sort and merge.

split :: [a] -> ([a],[a])
split (x:y:zs) = let (xs,ys) = split zs in (x:xs,y:ys)
split xs       = (xs,[])

sort []  = []
sort [x] = [x]
sort xs  = let (ys,zs) = split xs in merge (sort ys) (sort zs)

If we replace merge by unionWith we instead get a sort that combines duplicate elements.

Besides set operations, mergeByR can also be (ab)used for other things, such as

zipWith = intersectionByWith (const $ const EQ)

Or a variant of zipWith, that keeps the tail of the longer list:

zipWith' = unionByWith (const $ const EQ)

We can even implement concatenation:

(++) = mergeByR (const $ const LT) undefined (:) (:) []

Solving nonograms

2008-07-25T22:00:00Z

In this post I will show how to solve nonograms automatically using a computer. The code has been on the Haskell wiki for over a year, but I have never taken the time to explain how it works.

This post is literate Haskell (download the source here), so we need to start with some imports:

import qualified Data.Set as Set
import qualified Data.Map as Map
import Data.Set (Set)
import Data.List
import Control.Applicative

Since we will be working with sets a lot, here are some additional utility functions:

setAll :: (a -> Bool) -> Set a -> Bool
setAll pred = all pred . Set.toList
unionMap :: (Ord a, Ord b) => (a -> Set b) -> Set a -> Set b
unionMap f = Set.unions . map f . Set.toList

The puzzle

So, what is a nonogram anyway? Quoting Wikipedia:

Nonograms are picture logic puzzles in which cells in a grid have to be colored or left blank according to numbers given at the side of the grid to reveal a hidden picture. In this puzzle type, the numbers measure how many unbroken lines of filled-in squares there are in any given row or column. For example, a clue of "4 8 3" would mean there are sets of four, eight, and three filled squares, in that order, with at least one blank square between successive groups.

A solved nonogram might look like the following image:

A Haskell function to solve nonograms for us could have the following type, taking the clues for the rows and columns, and returning a grid indicating which squares are filled,

solvePuzzle :: [[Int]] -> [[Int]] -> [[Bool]]

Values and cells

For simplicity we will start with a single row. A first idea is to represent the cells in a row as booleans, type Row = [Bool]. This works fine for a finished puzzle, but consider a partially solved row.

First of all we will need a way to distinguish between blank cells (indicated by a cross) and unknown cells. Secondly, we throw away a lot of information. For instance, we know that the last filled cell will be the last cell of a group of three.

To solve the second problem we can give each position a unique label, so the first filled cell will always be, for instance 1, the second one will be 2, etc. For blank cells we can use negative numbers; the first group of blanks will be labeled -1, the second group will be -2, etc. Since the groups of blanks are of variable size, we give each one the same value. Our solved row now looks like:

In Haskell we can define the type of cell values as simply

newtype Value = Value Int
    deriving (Eq, Ord, Show)

Since negative values encode empty cells, and positive values are filled cells, we can add some utility functions:

blank (Value n) = n < 0
filled = not . blank

This still leaves the first issue, dealing with partially solved puzzles.

Partial information

When we don't know the exact value of a cell it is still possible that there is some information. For instance, we might know that the first cell will not contain the value 9, since that value is already somewhere else. One way of representing this is to keep a set of possible values:

type Cell = Set Value

An unknown cell is simply a cell containing all possible values, and the more we know about a cell, the fewer values the set will contain.

At a higher level we can still divide cells into four categories:

data CellState = Blank | Filled | Indeterminate | Error
    deriving Eq

cellState :: Cell -> CellState
cellState x
    | Set.null      x = Error         -- Something went wrong, no options remain
    | setAll blank  x = Blank         -- The cell is guaranteed to be blank
    | setAll filled x = Filled        -- The cell is guaranteed to be filled
    | otherwise       = Indeterminate

CellStates are convenient for displaying (partial) solution grids,

instance Show CellState where
    show Blank         = "."
    show Filled        = "#"
    show Indeterminate = "?"
    show Error         = "E"

For example, here is our running example again, this time rotated 90°. The CellStates are shown on the left as before; while the actual Cell set is on the right:

Solving a single row

Now it is time to solve a row.

As stated before, each filled cell gets a unique value. From a clue of the group lengths we need to construct such a unique labeling, such that labeling [4,3] == [-1,-1,2,3,4,5,-6,-6,7,8,9,-10,-10]. The exact values don't matter, as long as they are unique and have the right sign.

Constructing this labeling is simply a matter of iterating over the clues,

labeling :: [Int] -> [Value]
labeling = map Value . labeling' 1
    where labeling' n []     = [-n,-n]
          labeling' n (x:xs) = [-n,-n] ++ [n+1 .. n+x] ++ labeling' (n+x+1) xs

This labeling gives us important local information: we know what values can occur before and after a particular value. This is also the reason for including the negative (blank) values twice, since after a -1 another -1 can occur.

We can determine what comes after a value by zipping the labeling with its tail. In our example:

after     [-1,-1, 2, 3, 4, 5,-6,-6, 7, 8, 9,-10,-10]
comes  [-1,-1, 2, 3, 4, 5,-6,-6, 7, 8, 9,-10,-10]

Collecting all pairs gives the mapping:

{ -1 -> {-1,2}, 2 -> {3}, 3 -> {4}, 4 -> {5}, 5 -> {-6}, -6 -> {-6,7}, ...}

Instead of carrying a Map around we can use a function that does the lookup in that map. Of course we don't want to recalculate the map every time the function is called, so we need to be careful about sharing:

bad1 a    x =  Map.lookup x (expensiveThing a)
bad2 a    x =  Map.lookup x theMap  where theMap = expensiveThing a
good a = \x -> Map.lookup x theMap  where theMap = expensiveThing a

So for determining what comes after a value in the labeling:

mkAfter :: [Value] -> (Value -> Cell)
mkAfter vs = \v -> Map.findWithDefault Set.empty v afters
    where afters = Map.fromListWith Set.union
                 $ zip vs (map Set.singleton $ tail vs)
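To see the resulting function at work, the sketch below repeats the definitions so far and queries mkAfter for a few labels of the clue [4,3]. The name successors is introduced only for this example, and the labels are unwrapped to plain Ints to keep the result readable.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set
import Data.Set (Set)

newtype Value = Value Int
    deriving (Eq, Ord, Show)

type Cell = Set Value

labeling :: [Int] -> [Value]
labeling = map Value . labeling' 1
    where labeling' n []     = [-n,-n]
          labeling' n (x:xs) = [-n,-n] ++ [n+1 .. n+x] ++ labeling' (n+x+1) xs

mkAfter :: [Value] -> (Value -> Cell)
mkAfter vs = \v -> Map.findWithDefault Set.empty v afters
    where afters = Map.fromListWith Set.union
                 $ zip vs (map Set.singleton $ tail vs)

-- Successor sets for a few labels of the clue [4,3].
-- The map is built once, when `after` is defined, and shared by all lookups.
successors :: [(Int, [Int])]
successors = [ (n, [ m | Value m <- Set.toList (after (Value n)) ])
             | n <- [-1, 2, 5, -6, 9, -10] ]
  where after = mkAfter (labeling [4,3])
```

A blank label can be followed by itself or by the first cell of the next group, a filled label has exactly one successor, and a label that does not occur in the labeling maps to the empty set.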

Row data type

In the Row datatype we put all the information we have:

The cells in the row
What values can come before and after a value
The values at the edges

data Row = Row
    { cells         :: [Cell]
    , before, after :: Value -> Cell
    , start,  end   :: Cell
    }

Some simple Show and Eq instances:

instance Show Row where
    show row = "[" ++ concatMap show (rowStates row) ++ "]"

instance Eq Row where
    a == b  =  cells a == cells b

To construct a row we first make a labeling for the clues. Then we can determine what comes after each value, and what comes after each value in the reversed labeling (and hence comes before it in the normal order).

mkRow :: Int -> [Int] -> Row
mkRow width clue = Row
        { cells  = replicate width (Set.fromList l)
        , before = mkAfter (reverse l)
        , after  = mkAfter l
        , start  = Set.singleton $ head l
        , end    = Set.singleton $ last l
        }
    where l = labeling clue

Actually solving something

Now all the things are in place to solve our row: For each cell we can determine what values can come after it, so we can filter the next cell using this information. To be more precise, we can take the intersection of the set of values in a cell with the set of values that can occur after the previous cell. In this way we can make a forward pass through the row:

solveForward, solveBackward :: Row -> Row
solveForward row = row { cells = newCells (start row) (cells row) }
    where newCells _    []     = []
          newCells prev (x:xs) = x' : newCells x' xs
              where x' = x `Set.intersection` afterPrev
                    afterPrev = unionMap (after row) prev

Applying solveForward to the example row above, we get

solveForward

In much the same way we can do a backwards pass. Instead of duplicating the code from solveForward it is easier to reverse the row, do a forward pass and then reverse the row again:

solveBackward = reverseRow . solveForward . reverseRow

Where reverseRow reverses the cells and swaps before/after and start/end:

reverseRow :: Row -> Row
reverseRow row = Row
    { cells  = reverse (cells row)
    , before = after row,   after = before row
    , start  = end   row,   end   = start  row }

In the running example even more cells will be known after doing a backwards pass,

solveBackward

These two steps together are as far as we are going to get with a single row, so let's package them up:

solveRow :: Row -> Row
solveRow = solveBackward . solveForward

In the end we hopefully have a row that is completely solved. We can determine whether this is the case, or whether something went wrong, by looking at the CellStates of the cells:

rowStates :: Row -> [CellState]
rowStates = map cellState . cells

rowDone, rowFailed :: Row -> Bool
rowDone   = not . any (== Indeterminate) . rowStates
rowFailed = any (== Error) . rowStates

Human solution strategies

By using just one single solution strategy we can in fact emulate most of the techniques humans use. The Wikipedia page on nonograms lists several of these techniques. For instance, the simple boxes technique is illustrated with the example:

The Haskell program gives the same result:

Nonograms> solveRow $ mkRow 10 [8]
[??######??]

The reason why humans need many different techniques, while a single technique suffices for the program, is that this simple technique requires a huge amount of administration. For each cell there is a whole set of values, which would never fit into the small square grid of a puzzle.

The whole puzzle

Just a single row, or even a list of rows is not enough. In a whole nonogram there are clues for both the rows and the columns. So, let's make a data type to hold both:

data Puzzle = Puzzle { rows, columns :: [Row] }
    deriving Eq

And a function for constructing the Puzzle from a list of clues,

mkPuzzle :: [[Int]] -> [[Int]] -> Puzzle
mkPuzzle rowClues colClues = Puzzle 
    { rows    = map (mkRow (length colClues)) rowClues
    , columns = map (mkRow (length rowClues)) colClues
    }

To display a puzzle we show the rows,

instance Show Puzzle where
    show = unlines . map show . rows
    showList = showString . unlines . map show

Initially the puzzle grids are a bit boring, for example entering in GHCi

Nonograms> mkPuzzle [[1],[3],[1]] [[1],[3],[1]]
[???]
[???]
[???]

We already know how to solve a single row, so solving a whole list of rows is not much harder,

stepRows :: Puzzle -> Puzzle
stepRows puzzle = puzzle { rows = map solveRow (rows puzzle) }

Continuing in GHCi:

Nonograms> stepRows previousPuzzle
[???]
[###]
[???]

To also solve the columns we can use the same trick as with reverseRow, this time transposing the puzzle by swapping rows and columns.

transposePuzzle :: Puzzle -> Puzzle
transposePuzzle (Puzzle rows cols) = Puzzle cols rows

But this doesn't actually help anything! We still display only the rows, and what happens there is not affected by the values in the columns. Of course when a certain cell in a row is filled (its cellState is Filled), then we know that the cell in the corresponding column is also filled. We can therefore filter that cell by removing all blank values:

filterCell :: CellState -> Cell -> Cell
filterCell Blank  = Set.filter blank
filterCell Filled = Set.filter filled
filterCell _      = id

A whole row can be filtered by filtering each cell,

filterRow :: [CellState] -> Row -> Row
filterRow states row = row { cells = zipWith filterCell states (cells row) }

By transposing the list of states for each row we get a list of states for the columns. With filterRow the column cells are then filtered.

stepCombine :: Puzzle -> Puzzle
stepCombine puzzle = puzzle { columns = zipWith filterRow states (columns puzzle) }
    where states = transpose $ map rowStates $ rows puzzle

To solve the puzzle we apply stepRows and stepCombine alternatingly to the rows and to the columns. When to stop this iteration? We could stop when the puzzle is done, but not all puzzles can be solved this way. A better approach is to take the fixed point:

solveDirect :: Puzzle -> Puzzle
solveDirect = fixedPoint (step . step)
    where step = transposePuzzle . stepCombine . stepRows

The fixed point of a function f is the value x such that x == f x. Note that there are different fixed points, but the one we are interested in here is found by simply iterating x, f x, f (f x), ...

fixedPoint :: Eq a => (a -> a) -> a -> a
fixedPoint f x
    | x == fx   = x
    | otherwise = fixedPoint f fx
  where fx = f x
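As a minimal illustration of this iteration on something simpler than a puzzle, here are two toy uses on plain integers. The names halved and capped are introduced only for this example.

```haskell
fixedPoint :: Eq a => (a -> a) -> a -> a
fixedPoint f x
    | x == fx   = x
    | otherwise = fixedPoint f fx
  where fx = f x

-- Repeated halving: 100, 50, 25, 12, 6, 3, 1, 0, 0, ... stops changing at 0.
halved :: Int
halved = fixedPoint (`div` 2) 100

-- Adding 3 but never exceeding 10: 0, 3, 6, 9, 10, 10, ... stops at 10.
capped :: Int
capped = fixedPoint (\x -> min 10 (x + 3)) 0
```

Note that fixedPoint only terminates if the iteration actually reaches a value that f leaves unchanged. For the solver this holds because each step can only remove values from cells.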

The tiny 3*3 example can now be solved:

Nonograms> solveDirect previousPuzzle
[.#.]
[###]
[.#.]

But for other puzzles, such as the letter lambda from the introduction, we have no such luck:

Nonograms> solveDirect lambdaPuzzle
[??????????]
[??????????]
...

Guessing

To solve more difficult puzzles the direct reasoning approach is not enough. To still solve these puzzles we need to make a guess, and backtrack if it is wrong.

Note that there are puzzles with more than one solution.

To find all solutions, and not just the first one, we can use the list monad.

To make a guess we can pick a cell that has multiple values in its set, and for each of these values see what happens if the cell contains just that value. Since there are many cells in a puzzle there are also many cells to choose from when we need to guess. It is a good idea to pick the best one.

For picking the best alternative a pair of a value and a score can be used:

data Scored m a = Scored { best :: m a, score :: Int }

This data type is an applicative functor if we use 0 as a default score:

instance Functor m => Functor (Scored m) where
    fmap f (Scored a i) = Scored (fmap f a) i
instance Applicative m => Applicative (Scored m) where
    pure a = Scored (pure a) 0
    Scored f n <*> Scored x m = Scored (f <*> x) (n `min` m)

When there are alternatives we want to pick the best one, the one with the highest score:

instance Alternative m => Alternative (Scored m) where
    empty = Scored empty minBound
    a <|> b | score a >= score b  =  a
            | otherwise           =  b

Now given a list we can apply a function to each element, but change only the best one. This way we can find the best cell to guess and immediately restrict it to a single alternative. We can do this by simply enumerating all ways to change a single element in a list.

mapBest :: Alternative m => (a -> m a) -> [a] -> m [a]
mapBest _ []      =  pure []
mapBest f (x:xs)  =  (:xs) <$> f x         -- change x and keep the tail
                 <|> (x:) <$> mapBest f xs -- change the tail and keep x

This can also be generalized to Rows and whole Puzzles:

mapBestRow :: Alternative m => (Cell -> m Cell) -> Row -> m Row
mapBestRow f row = fmap setCells $ mapBest f $ cells row
    where setCells cells' = row { cells = cells' }

mapBestRows :: Alternative m => (Cell -> m Cell) -> Puzzle -> m Puzzle
mapBestRows f puzzle = fmap setRows $ mapBest (mapBestRow f) $ rows puzzle
    where setRows rows' = puzzle { rows = rows' }
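To see mapBest and Scored in action on something smaller than a puzzle, consider the following sketch. The function timesTen is a toy invented for this example: the score of an element is the element itself, and changing it multiplies it by ten. The definitions from above are repeated so the block is self-contained.

```haskell
import Control.Applicative

data Scored m a = Scored { best :: m a, score :: Int }

instance Functor m => Functor (Scored m) where
    fmap f (Scored a i) = Scored (fmap f a) i
instance Applicative m => Applicative (Scored m) where
    pure a = Scored (pure a) 0
    Scored f n <*> Scored x m = Scored (f <*> x) (n `min` m)
instance Alternative m => Alternative (Scored m) where
    empty = Scored empty minBound
    a <|> b | score a >= score b  =  a
            | otherwise           =  b

mapBest :: Alternative m => (a -> m a) -> [a] -> m [a]
mapBest _ []      =  pure []
mapBest f (x:xs)  =  (:xs) <$> f x         -- change x and keep the tail
                 <|> (x:) <$> mapBest f xs -- change the tail and keep x

-- Toy scoring function (an assumption made for this example only).
timesTen :: Int -> Scored [] Int
timesTen x = Scored { best = [x * 10], score = x }

-- Only the highest scoring element, 5, is changed.
example :: Scored [] [Int]
example = mapBest timesTen [1, 5, 3]
```

Here best example is [[1,50,3]] and score example is 5: of the three ways to change a single element, only the one with the highest score survives.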

What is the best cell to guess? A simple idea is to use the cell with the most alternatives, in the hope of eliminating as many of them as soon as possible. Then the score of a cell is the size of its set. The alternatives are a singleton set for each value in the cell.

guessCell :: Cell -> Scored [] Cell
guessCell cell = Scored
    { best  = map Set.singleton $ Set.toList cell
    , score = Set.size cell }

We can now make a guess by taking the best way to apply guessCell to a single cell:

guess :: Puzzle -> [Puzzle]
guess = best . mapBestRows guessCell

Putting it together

Direct solving is much faster than guess based solving. So the overall strategy is to use solveDirect, and when we get a puzzle that is not done we do a single guess, and then continue with direct solving all alternatives:

solve :: Puzzle -> [Puzzle]
solve puzzle
    | failed puzzle' = []
    | done   puzzle' = [puzzle']
    | otherwise      = concatMap solve (guess puzzle')
  where puzzle' = solveDirect puzzle

done, failed :: Puzzle -> Bool
done   puzzle = all rowDone   (rows puzzle ++ columns puzzle)
failed puzzle = any rowFailed (rows puzzle ++ columns puzzle)

Finally we can solve the lambda puzzle!

lambdaPuzzle = mkPuzzle
    [[2],[1,2],[1,1],[2],[1],[3],[3],[2,2],[2,1],[2,2,1],[2,3],[2,2]]
    [[2,1],[1,3],[2,4],[3,4],[4],[3],[3],[3],[2],[2]]

Nonograms> solve lambdaPuzzle
[.##.......]
[#.##......]
[#..#......]
[...##.....]
[....#.....]
[...###....]
[...###....]
[..##.##...]
[..##..#...]
[.##...##.#]
[.##....###]
[##.....##.]

Simple reflection of expressions

2008-01-29T23:00:00Z

This blog post is inspired by a message from Cale on #haskell yesterday. He came up with an amazing way to show how foldr and foldl work:

<Cale>      > foldr (\x y -> concat ["(f ",x," ",y,")"]) "z" (map show [1..5])
<lambdabot> "(f 1 (f 2 (f 3 (f 4 (f 5 z)))))"

While the output looks great, the call itself could be clearer, especially for beginners. Through a combination of overloading and small hacks it is possible to get the same result with a much nicer expression,

> foldr f x [1..5]
f 1 (f 2 (f 3 (f 4 (f 5 x))))

Let's get started

I will call this module SimpleReflect, since this is a poor man's form of reflection, converting code back to expressions at run time.

module SimpleReflect where

Our results will be 'expressions'. All we need to do with expressions is show them, convert them to strings.

The Show class has a function showsPrec :: Int -> a -> ShowS for converting a value of type a to a string. The ShowS type improves the performance compared to using strings; the integer is used for putting parentheses in the right places. But none of this matters for now, we will just emulate that behavior for our expression type:

newtype Expr = Expr { showExpr :: Int -> ShowS }

instance Show Expr where
    showsPrec p r = showExpr r p

Things like f and x will be variables; these are just strings. Showing strings is easy,

var :: String -> Expr
var s = Expr { showExpr = \_ -> showString s }

In fact, we can show all kinds of values, for instance numbers. So we could make a function that lifts any showable value to an expression:

lift :: Show a => a -> Expr
lift x = Expr { showExpr = \p -> showsPrec p x }

While this is almost identical to var, it is not the same, because the Show instance for String is not the same as showString. Compare:

> var "x"
x
> lift "x"
"x"

From variables to functions

In your average piece of source code multiple expressions are combined with operators. The most common operator is function application, written with just whitespace. Each Haskell operator has a precedence level, indicating how tightly that operator binds to its arguments.

In this blog post we only deal with left-associative operators, which means that the left sub-expression is printed with the same precedence level, while the right sub-expression is printed one level higher. A simple combinator for operators is then:

op :: Int -> String -> Expr -> Expr -> Expr
op prec op a b = Expr { showExpr = showFun }
 where showFun p = showParen (p > prec) $
                   showExpr a prec . showString op . showExpr b (prec + 1)
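To see the precedence argument at work, here is a minimal self-contained sketch that repeats the definitions so far and builds two nested additions by hand. The names plus, left and right are made up for this illustration only.

```haskell
newtype Expr = Expr { showExpr :: Int -> ShowS }

instance Show Expr where
    showsPrec p r = showExpr r p

var :: String -> Expr
var s = Expr { showExpr = \_ -> showString s }

-- Same combinator as above; the string argument is renamed to avoid shadowing.
op :: Int -> String -> Expr -> Expr -> Expr
op prec name a b = Expr { showExpr = showFun }
  where showFun p = showParen (p > prec) $
                    showExpr a prec . showString name . showExpr b (prec + 1)

-- Hypothetical helper for this example: addition at precedence level 6.
plus :: Expr -> Expr -> Expr
plus = op 6 " + "

-- Nested on the left: the inner sum is shown at level 6, so no parentheses.
left :: Expr
left = (var "a" `plus` var "b") `plus` var "c"

-- Nested on the right: the inner sum is shown at level 7, so it gets wrapped.
right :: Expr
right = var "a" `plus` (var "b" `plus` var "c")
```

Here show left gives a + b + c, while show right gives a + (b + c).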

We would like to be able to use variables like f as if they were functions, so this f has to have the type f :: a -> Expr, or f :: a -> b -> Expr, etc. This can be done with type classes. The class FromExpr defines what things we can use expressions for:

class FromExpr a where
    fromExpr :: Expr -> a

Obviously expressions are themselves expressions,

instance FromExpr Expr where
    fromExpr = id

Any expression can also be used as a function. As stated above function application is the operator " "; it has precedence level 10, higher than any real operator. To be as generic as possible we can lift any showable argument to an expression.

instance (Show a, FromExpr b) => FromExpr (a -> b) where
    fromExpr f a = fromExpr $ op 10 " " f (lift a)

With FromExpr in place we can make more generic variables that can be used as any function type:

fun :: FromExpr a => String -> a
fun = fromExpr . var

With all this in place Cale's foldr example can now be written as

> foldr (fun "f") (var "x") [1..5]
f 1 (f 2 (f 3 (f 4 (f 5 x))))
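Cale's message was about both foldr and foldl, so the following sketch collects the definitions so far into one piece that can be loaded as is, and applies them to both folds. The names foldrDemo and foldlDemo are made up for this example, and the list elements are fixed to Integer to keep the types unambiguous.

```haskell
newtype Expr = Expr { showExpr :: Int -> ShowS }

instance Show Expr where
    showsPrec p r = showExpr r p

var :: String -> Expr
var s = Expr { showExpr = \_ -> showString s }

lift :: Show a => a -> Expr
lift x = Expr { showExpr = \p -> showsPrec p x }

op :: Int -> String -> Expr -> Expr -> Expr
op prec name a b = Expr { showExpr = showFun }
  where showFun p = showParen (p > prec) $
                    showExpr a prec . showString name . showExpr b (prec + 1)

class FromExpr a where
    fromExpr :: Expr -> a

instance FromExpr Expr where
    fromExpr = id

instance (Show a, FromExpr b) => FromExpr (a -> b) where
    fromExpr f a = fromExpr $ op 10 " " f (lift a)

fun :: FromExpr a => String -> a
fun = fromExpr . var

-- Here fun "f" is used at type Integer -> Expr -> Expr.
foldrDemo :: Expr
foldrDemo = foldr (fun "f") (var "x") [1 .. 5 :: Integer]

-- Here fun "f" is used at type Expr -> Integer -> Expr.
foldlDemo :: Expr
foldlDemo = foldl (fun "f") (var "x") [1 .. 5 :: Integer]
```

Showing foldrDemo gives f 1 (f 2 (f 3 (f 4 (f 5 x)))), and showing foldlDemo gives f (f (f (f (f x 1) 2) 3) 4) 5, which makes the different nesting of the two folds visible.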

Lifting the alphabet

To write even shorter examples a slightly evil idea is to simply define 26 variables,

a,b,c,.. :: FromExpr a => a

There is a minor problem with this idea, which will become apparent once you try it out:

*SimpleReflect> foldr f x [1..5]

<interactive>:1:8:
    Ambiguous type variable `b' in the constraints:
      `FromExpr b' arising from a use of `x' at <interactive>:1:8
      `Show b' arising from a use of `f' at <interactive>:1:6
    Probable fix: add a type signature that fixes these type variable(s)

The compiler doesn't know what the type of x should be. It is only used as an argument to f, but that can be any Showable type. In the future we might be able to write default FromExpr Expr (see the Haskell' wiki), but until then we will have to do something else.

Since usually the names f, g, etc. are used for functions, I chose to only overload those, and make the rest simple variables:

a,b,c,d,e,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z :: Expr
[a,b,c,d,e,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z]
 = [var [x] | x <- ['a'..'e']++['i'..'z']]

f,g,h :: FromExpr a => a
f = fun "f"
g = fun "g"
h = fun "h"

With our 26 new top-level names we can finally write the original example in a natural way,

> foldr f x [1..5]
f 1 (f 2 (f 3 (f 4 (f 5 x))))

Lifting numbers (a.k.a. lots-of-instances)

All this work for just foldr and foldl seems like a bit of a waste of time. To make things a little bit more interesting we could also add support for numeric operations. Then we can write

> sum [1..5] :: Expr
0 + 1 + 2 + 3 + 4 + 5

To do this we need to define instances of Num and Enum. The first of these is not very hard, but we need Eq and later Ord instances as well.

instance Eq Expr where
    a == b = show a == show b

The Ord class has two functions of type a -> a -> a, which is where we can do something interesting:

instance Ord Expr where
    compare a b = compare (show a) (show b)
    min = fun "min"
    max = fun "max"

Now we get minimum [x,y,z] ==> min (min x y) z for free.

The Num class has some operators. The mechanism for defining those is already in place, so this class should be simple:

instance Num Expr where
    (+)    = op 6 " + "
    (-)    = op 6 " - "
    (*)    = op 7 " * "
    negate = fun "negate"
    abs    = fun "abs"
    signum = fun "signum"
    fromInteger = lift

To write [1..5] :: [Expr], Expr needs to be an instance of Enum. Here we bump into a bit of a problem, how do we enumerate expressions?

Well, I will cheat a bit, and read out the expression as an integer. This operation is the inverse of lift, let's call it unlift:

unlift :: Read a => Expr -> a
unlift expr = read (show expr)

Conversion to integers is usually done with toInteger from the Integral class, so we add an instance for that as well. We need a Real instance first:

instance Real Expr where
    toRational = toRational . toInteger

instance Integral Expr where
    toInteger = unlift
    quot = op 7 " `quot` "
    rem  = op 7 " `rem` "
    div  = op 7 " `div` "
    mod  = op 7 " `mod` "
    -- someone forgot a default :(
    quotRem a b = (quot a b, rem a b)
    divMod  a b = (div  a b, mod a b)

Finally the Enum class. As I already said, the actual enumeration can be handled by going through toInteger and fromInteger.

instance Enum Expr where
    succ   = fun "succ"
    pred   = fun "pred"
    toEnum = fun "toEnum"
    fromEnum = fromEnum . toInteger
    enumFrom       a     = map fromInteger $ enumFrom       (ti a)
    enumFromThen   a b   = map fromInteger $ enumFromThen   (ti a) (ti b)
    enumFromTo     a   c = map fromInteger $ enumFromTo     (ti a)        (ti c)
    enumFromThenTo a b c = map fromInteger $ enumFromThenTo (ti a) (ti b) (ti c)
ti = toInteger -- just to fit in the page layout of the blog

Playing a bit

None of the above was terribly complicated, just a lot of boilerplate code. What can we do with such a well-plated boiler? Here are some examples:

> sum $ map (*x) [1..5]
0 + 1 * x + 2 * x + 3 * x + 4 * x + 5 * x

> iterate (^2) x
[x, x * x, x * x * (x * x), x * x * (x * x) * (x * x * (x * x)), ...

> scanl f x [a,b,c]
[x, f x a, f (f x a) b, f (f (f x a) b) c]

> zipWith3 f [1,2..] [1,3..] [1,4..] :: [Expr]
[f 1 1 1, f 2 3 4, f 3 5 7, f 4 7 10, f 5 9 13, f 6 11 16, ...

Coming soon to a lambdabot near you.

References, Arrows and Categories

2007-11-06T23:00:00Z

Recap: functional references

Last time (okay, it was over two months ago) I talked about overloading functional references so that they can be used both as regular functions and as references. The data type of references I used was

data FRef s a = FRef
      { get :: s -> a
      , set :: a -> s -> s
      }

While I arrived at the type class,

class Ref r where
      ref :: (a -> b) -> (b -> a -> a) -> r a b
      (.) :: r b c -> r a b -> r a c

This provides 'construction' and 'composition'.

Arrows

The class parameter r has kind * -> * -> *, meaning it takes two types and 'evaluates' to a type. If you are familiar with the Haskell libraries you may know there is a similar class whose parameter also has kind * -> * -> *, called Arrow. There is an instance Arrow (->), just like the instance Ref (->) I defined.

According to the arrows webpage, "Arrows are a new abstract view of computation". This raises the question: Can we combine these ideas? Are functional references arrows?

That Arrow class looks like

class Arrow a where
      arr :: (b -> c) -> a b c
      (>>>) :: a b c -> a c d -> a b d
      -- some more stuff

If you look closely, (>>>) is just (.) with the arguments reversed. What about arr? arr should turn any function into a reference. But references need a way to transform the result back to be able to set the new value.

Clearly this is not going to work. The problem is that Arrows are not general enough!

The easy fix

What we need here is to make Ref a superclass of Arrow. All current Arrows can implement (.), since it is the same as (>>>). To implement ref an arrow just ignores the setter, then ref becomes the same as arr.

Okay, we're done.

Except that we will have the same problem again the next time someone wants to generalize Arrows. It would be much better to fix it once and for all.

Making it useful again

Now that we have kicked arr and friends out of the type class lots of types can become instances. On the other hand, the class itself has become pretty useless.

Before going to functional references there is a more general notion, invertible functions. These are discussed in relation to arrows in "There and back again: arrows for invertible programming". The only way to be sure that a function is invertible is to give its inverse. In a data type that could look like

data Invertible a b = Invertible
      { forward  :: a -> b
      , backward :: b -> a
      }
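As a small illustration of how such a pair of functions behaves, here is a sketch with a few helpers on the Invertible type. The names invert, composeInv, plus and negated are assumptions made for this example; they are not taken from the library or from the paper.

```haskell
data Invertible a b = Invertible
    { forward  :: a -> b
    , backward :: b -> a
    }

-- Hypothetical helper: swap the two directions.
invert :: Invertible a b -> Invertible b a
invert (Invertible f g) = Invertible g f

-- Hypothetical helper: composition. Going backward undoes the steps
-- in the opposite order.
composeInv :: Invertible b c -> Invertible a b -> Invertible a c
composeInv bc ab = Invertible
    { forward  = forward bc . forward ab
    , backward = backward ab . backward bc
    }

-- Adding a constant is undone by subtracting it again.
plus :: Num a => a -> Invertible a a
plus n = Invertible (+ n) (subtract n)

-- Negation is its own inverse.
negated :: Num a => Invertible a a
negated = Invertible negate negate
```

For example, backward (plus 3) 4 finds the x such that x + 3 == 4, and forward (invert (plus 3)) 4 computes the same thing.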

To put that in the Arrow/Category framework we can add a subclass InvArrow. It is similar to the Ref class, only for invertible functions instead of references.

class Category cat => InvArrow cat where
      arrInv :: (a -> b) -> (b -> a) -> cat a b

      -- We get a default implementation for id.
      -- Note that this is not valid Haskell, we would need something like class aliases.
      id = arrInv (\x -> x) (\x -> x)

What does it mean if a type/category is an instance of InvArrow? It means that that category contains all invertible functions. Read this statement carefully. An InvArrow does not mean that morphisms in the category are invertible, but that invertible functions can be turned into morphisms.

With InvArrow we already get all kinds of interesting morphisms, for example

negate :: (InvArrow cat, Num a) => cat a a
(+) :: (InvArrow cat, Num a) => a -> cat a a

> update negate (+1) 3 == 2 -- increment the negation by 1, so decrement by one
> set (3+) undefined 4 == 1 -- find a value x such that 3+x == 4

This last example is a bit ugly, because we use function references. It will look better if we use the Invertible type. The function similar to set is inverse:

-- Get the inverse of an invertible function
inverse :: InvArrow cat => Invertible a b -> cat b a
inverse i = arrInv (backward i) (forward i)

Now we can write the above example without undefined:

> inverse (+3) 4 == 1

To references and beyond

Inverses are nice, but we haven't got references yet. There is no way to define fst :: InvArrow cat => cat (a,b) a.

For that we really need the Ref class, or in this arrow framework, RefArrow:

class InvArrow cat => RefArrow cat where
      arrRef :: (a -> b) -> (b -> a -> a) -> cat a b

      -- A default implementation of @arrInv@
      arrInv f g = arrRef f (\b a -> g b)

Like with InvArrow, if a category type is an instance of RefArrow, it means that that category contains all functional references.

Finally, the least restrictive class is the regular old Arrow,

class RefArrow cat => Arrow cat where
      arr :: (a -> b) -> cat a b

      arrRef f _ = arr f

To summarize, we now have a class hierarchy that looks like

Category => InvArrow => RefArrow => Arrow

The rest of the `Arrow` class

If you look back to the definition of Arrow I gave above, you will see

      -- some more stuff

Besides lifting (arr) and composition (>>>) the standard Arrow class also defines combinators for working with tupled values.

We could put these in the new Arrow class, but they might also be useful for types which are not full arrows. Like, say, functional references.

The most flexible thing to do is to put this functionality in yet another class. For working with pairs we can define

class Category cat => CategoryPair cat where
      first  :: cat a b -> cat (a,c) (b,c)
      second :: cat a b -> cat (c,a) (c,b)
      (***)  :: cat a b -> cat c d -> cat (a,c) (b,d)
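To give a rough idea of what first could mean for something that is not a full arrow, here is a sketch for the FRef type from the recap. The name firstRef is an assumption for this example, and the definition is only one plausible reading; it is not necessarily how the library does it.

```haskell
data FRef s a = FRef
    { get :: s -> a
    , set :: a -> s -> s
    }

-- Reference to the first part of a pair, as in the earlier post.
fstRef :: FRef (x, y) x
fstRef = FRef
    { get = \(x, _) -> x
    , set = \x (_, y) -> (x, y)
    }

-- Hypothetical 'first' for references: apply the reference to the first
-- component and pass the second component through unchanged.
firstRef :: FRef a b -> FRef (a, c) (b, c)
firstRef r = FRef
    { get = \(a, c) -> (get r a, c)
    , set = \(b, c) (a, _) -> (set r b a, c)
    }
```

Getting through firstRef fstRef from ((1,2),'x') yields (1,'x'), and setting (9,'y') into that same value yields ((9,2),'y').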

There are some tricky issues to work out, but this post is already five pages long.

I am going to stop here. Pairs, sum types, fixed points, monoids and duality all will have to wait until next time.

The code

That was a long story, and I even stopped way before the end and skipped the instances.

The generalized arrow/category framework is growing into a useful library that hopefully someday can become part of the base libraries. I have decided to put the code somewhere. In this case, somewhere is

darcs get http://code.haskell.org/category

As the name suggests, this library is not just for functional references. Rather it contains the whole Category framework. The FRef type is just a bonus.

The library also contains code for deriving RefArrow functions for record fields, courtesy of omnId.

footnotes
†: I am in no way a category theory expert; Category theorists feel free to hate me for abuse of terminology and incorrect explanations. In particular, the type constructor cat is not really the category itself, just like the f in Functor f is not really a functor. But it is the closest thing we have got.

Overloading functional references

2007-09-02T22:00:00Z

Recently there have been some blog posts and mailing list messages about "functional references". In this post I will look into ways to improve upon that concept.

The above links should give you an idea of what a functional reference is, but I will explain it here in my own words. You can skip this introduction if you already know what functional references are.

What are functional references?

A functional reference is a data structure that can be used to update parts of another structure; it is a reference into that structure. We need a way to get the part, and a way to replace the part by setting it to something else. This leads to the data type:

data FRef s a = FRef
      { get :: s -> a
      , set :: a -> s -> s
      }

Now FRef s a represents a reference to an a inside an s structure.

One of the simplest possible (non-trivial) references is that to the first part of a pair:

fst :: FRef (x,y) x
fst = FRef
      { get = \(x,y) -> x
      , set = \x (_,y) -> (x,y)
      }

Having defined this, we can use it to access and modify pairs, for example

> get fst (1,2)
1
> set fst 3 (1,2)
(3,2)

You can read this as "get the first part of ..." and "set the first part to 3 in ...".

Another useful function is update,

update :: FRef s a -> (a -> a) -> (s -> s)
update ref f s = set ref (f (get ref s)) s

Update gets the value, applies a function, and sets it again. This allows us to 'map' functions over parts of data structures:

> update fst (+1) (1,2)
(2,2)

The real power of functional references lies in their composability. Like functions, we can compose two references to give a new one

compose :: FRef b c -> FRef a b -> FRef a c
compose bc ab = FRef
      { get = get bc . get ab
      , set = update ab . set bc
      }

We can now modify nested pairs:

> update (fst `compose` fst) (*2) ((3,4),5)
((6,4),5)
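The pieces so far fit together into the following self-contained sketch. To avoid clashing with the Prelude, the reference to the first component is called fstRef here, and sndRef is added as its obvious counterpart; both names are assumptions for this example.

```haskell
data FRef s a = FRef
    { get :: s -> a
    , set :: a -> s -> s
    }

fstRef :: FRef (x, y) x
fstRef = FRef
    { get = \(x, _) -> x
    , set = \x (_, y) -> (x, y)
    }

sndRef :: FRef (x, y) y
sndRef = FRef
    { get = \(_, y) -> y
    , set = \y (x, _) -> (x, y)
    }

update :: FRef s a -> (a -> a) -> (s -> s)
update r f s = set r (f (get r s)) s

compose :: FRef b c -> FRef a b -> FRef a c
compose bc ab = FRef
    { get = get bc . get ab
    , set = update ab . set bc
    }

-- Reference to the second component of the first component.
sndOfFst :: FRef ((x, y), z) y
sndOfFst = sndRef `compose` fstRef
```

With these definitions, update sndOfFst (+1) ((1,2),3) gives ((1,3),3), and setting a value and then getting it back returns the value that was set.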

Use case: records

The place where these references shine is with records. Say we have the following data type:

data Employee = Employee
      { name   :: String
      , salary :: Int
      }

It would be great if name and salary were references; then we could simply say

giveRaise = update salary (+100)

This shouldn't be too hard to automate with Data.Derive, the functions would look like

name = FRef
      { get = name_
      , set = \n e -> e { name_ = n }
      }

Or better yet, we could specify references as the default behavior in the next Haskell standard!

There is a problem, however, when we have defined the record accessors to be references. Take the normally legal code

johnsSallary = salary john

This is no longer valid, since salary is not a function. Instead we must write

johnsSallary = get salary john

Type classes to the rescue

Fortunately there is a clean solution to this problem using type classes. We can define the type class

class Ref r where
      ref :: (a -> b) -> (b -> a -> a) -> r a b

We can define an instance for functional references,

instance Ref FRef where
      ref = FRef

As well as for functions

instance Ref (->) where
      ref = const

The record accessors can now be defined as

name :: Ref r => r Employee String
name = ref name_ (\n e -> e { name_ = n })

And all is well again:

giveRaise = update salary (+100)
johnsSallary = salary john

While we are at it, we could also add the (.) operator to the class,

class Ref r where
      ref :: (a -> b) -> (b -> a -> a) -> r a b
      (.) :: r b c -> r a b -> r a c
instance Ref FRef where
      ref = FRef
      (.) = compose
instance Ref (->) where
      ref = const
      (.) = (Prelude..) -- the (.) from the prelude

now

giveRaiseToFirst = update (salary . fst) (+100)

Gives a raise to the first employee in a pair.
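Put together as one module, the overloaded references look roughly like the sketch below. The employee john, the underscore-suffixed field names and the type signatures are filled in for this example; the fst reference is written with ref so that it can be used at both instances.

```haskell
import Prelude hiding ((.), fst)
import qualified Prelude

data FRef s a = FRef
    { get :: s -> a
    , set :: a -> s -> s
    }

update :: FRef s a -> (a -> a) -> (s -> s)
update r f s = set r (f (get r s)) s

compose :: FRef b c -> FRef a b -> FRef a c
compose bc ab = FRef
    { get = \a -> get bc (get ab a)
    , set = \c -> update ab (set bc c)
    }

class Ref r where
    ref :: (a -> b) -> (b -> a -> a) -> r a b
    (.) :: r b c -> r a b -> r a c

instance Ref FRef where
    ref = FRef
    (.) = compose

instance Ref (->) where
    ref = const          -- a plain function ignores the setter
    (.) = (Prelude..)    -- the (.) from the prelude

data Employee = Employee
    { name_   :: String
    , salary_ :: Int
    } deriving (Eq, Show)

salary :: Ref r => r Employee Int
salary = ref salary_ (\n e -> e { salary_ = n })

fst :: Ref r => r (a, b) a
fst = ref (\(x, _) -> x) (\x (_, y) -> (x, y))

-- Example data, made up for this sketch.
john :: Employee
john = Employee { name_ = "John", salary_ = 1000 }

giveRaise :: Employee -> Employee
giveRaise = update salary (+ 100)          -- salary used as an FRef

johnsSalary :: Int
johnsSalary = salary john                  -- salary used as a function

giveRaiseToFirst :: (Employee, b) -> (Employee, b)
giveRaiseToFirst = update (salary . fst) (+ 100)
```

The same name salary is used once as a reference and once as an ordinary accessor; the type class picks the right instance from the context.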

Concluding remarks

Note that all this is perfectly valid Haskell 98 code, no extensions are needed. This means that it should not be hard to add such references to the language standard.

There are many more neat things you can do with functional references, but I will save that for another time.

Knuth-Morris-Pratt in Haskell

2007-04-15T22:00:00Z

A request that comes up regularly on the Haskell mailing list is for a function to determine whether one string (the needle) is a substring of another one (the haystack). While there is no such function in the Haskell standard library^†, it is easy enough to implement:

import Data.List
as `isSubstringOf` bs = any (as `isPrefixOf`) (tails bs)

Unfortunately, this function has a worst-case time complexity of O(length as * length bs). For example, if we evaluate

"aaaaaaaaaab" `isSubstringOf` replicate 100 'a'

We will first match 10 characters starting from the first position and fail just before we have matched the entire string. Then, starting from the second position, we will match 10 characters again, etc. In total we will do about 11 * 100 = O(length as * length bs) comparisons.
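The number of comparisons can be checked with a small counting sketch. The functions prefixComparisons and naiveComparisons are made up for this purpose; they mirror what isPrefixOf and tails do, under the assumption that no match is found, so that every tail is tried.

```haskell
import Data.List (tails)

-- Count the element comparisons a prefix test performs: one for every
-- matching element, plus one for the first mismatch (if there is one).
prefixComparisons :: Eq a => [a] -> [a] -> Int
prefixComparisons (x:xs) (y:ys)
    | x == y    = 1 + prefixComparisons xs ys
    | otherwise = 1
prefixComparisons _ _ = 0

-- Total comparisons of the naive search when every tail is tried,
-- which is what happens when the needle does not occur at all.
naiveComparisons :: Eq a => [a] -> [a] -> Int
naiveComparisons needle haystack =
    sum (map (prefixComparisons needle) (tails haystack))
```

For the needle of ten 'a's followed by a 'b' and the haystack replicate 100 'a' this gives 1045, slightly below the 11 * 100 estimate because the last few tails are shorter than the needle.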

There exists an algorithm called the Knuth-Morris-Pratt string searching algorithm which has a much better, O(length as + length bs), worst case behavior. Unfortunately all descriptions you find of the algorithm rely on building a table, and using random access patterns on it. Not only does this make it impossible to use simple data structures like lists, it also obfuscates the underlying idea.

The idea

The core idea of the algorithm is that we only want to process each character of both strings once. This is done by building a table from the needle, and using that table to determine what should be done after each character of the haystack. Either the entire needle has been matched at that point and we are done, or we get a new position in the table to use for the next character.

So, let's turn the above description into a Haskell datatype!

data KMP a = KMP
      { done :: Bool
      , next :: (a -> KMP a)
      }

Clearly, if we know how to make such a 'table' the matching process is straightforward. We need to apply next to each character and we want to know if any of the intermediate tables are done:

isSubstringOf2 :: Eq a => [a] -> [a] -> Bool
isSubstringOf2 as bs = match (makeTable as) bs
   where  match table []     = done table
          match table (b:bs) = done table || match (next table b) bs

This can be made shorter using functions from the Prelude:

isSubstringOf3 as bs = any done $ scanl next (makeTable as) bs

Making the table

All that is left is to make the table. Constructing it using a simple recursive function is not an option:

makeTable1 :: Eq a => [a] -> KMP a
makeTable1 []     = KMP True  undefined?
makeTable1 (x:xs) = KMP False (\c -> if c == x then makeTable1 xs else ????)

Because what do we do if we don't have a match? Let's look at an example, the calculation "abc" `isSubstringOf` "aabc" would go something like:

makeTable "abc" = table0
done table0 = False
next table0 'a' = (\c -> if c == 'a' then table1 else ????) 'a' = table1
done table1 = False
next table1 'a' = (\c -> if c == 'b' then table2 else ????) 'a'
                = ???? -- what to do now?

What we should do, is start over, but dropping the first character from the input, in this case that gives

-- start over, now for "abc" `isSubstringOf` "abc"
next table0 'a' = (\c -> if c == 'a' then table1 else ????) 'a' = table1
done table1 = False
next table1 'b' = (\c -> if c == 'b' then table2 else ????) 'b' = table2
done table2 = False
next table2 'c' = (\c -> if c == 'c' then table3 else ????) 'c' = table3
done table3 = True

The trick

At first glance it would seem that we have to reexamine parts of the haystack when we start over. But this is not the case.

If, for example the test of table35 fails, we don't have to move back 35 characters, because we already know what those characters are, namely the characters we matched to get to table35! So the table in case of a failed match is always the same, and we can compute that as well.

Let's look again at the makeTable function. If f is the table we get for a failed match, we call next f the failure function, and pass it along as a second parameter. For the first character, in case of a failed match we simply start from the beginning for the next character:

makeTable :: Eq a => [a] -> KMP a
makeTable xs = table
   where table = makeTable' xs (const table)

Notice we have tied the knot, table depends on table itself! In Haskell this is not a problem because of lazy evaluation, as long as we don't try to use what is not computed yet.

The makeTable' function is where the real work happens.

makeTable' []     failure = KMP True failure
makeTable' (x:xs) failure = KMP False test
   where  test  c = if c == x then success else failure c
          success = makeTable' xs (next (failure x))

The base case is not very interesting, although we can now use something better than undefined. That becomes useful when looking for multiple matches.

The interesting clause is for (x:xs). The next function compares a character c against x.
Is it the same? Great, move to the table for xs.
Is it different? Then look at the failure function.

Finally, to determine the table for xs, we need a new failure function, describing what would have happened if we started later and ended up at the position after x. We can ask the current failure function what would have happened in that case, next (failure x).
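The remark about multiple matches can be sketched as follows. This block repeats the definitions above so that it can be loaded on its own; the function matchEnds is an addition for this example and reports how many characters of the haystack have been consumed each time a match is completed.

```haskell
data KMP a = KMP
    { done :: Bool
    , next :: a -> KMP a
    }

makeTable :: Eq a => [a] -> KMP a
makeTable xs = table
  where table = makeTable' xs (const table)

makeTable' :: Eq a => [a] -> (a -> KMP a) -> KMP a
makeTable' []     failure = KMP True failure
makeTable' (x:xs) failure = KMP False test
  where test c  = if c == x then success else failure c
        success = makeTable' xs (next (failure x))

isSubstringOf3 :: Eq a => [a] -> [a] -> Bool
isSubstringOf3 needle haystack =
    any done (scanl next (makeTable needle) haystack)

-- Hypothetical extension: the positions at which a match ends.
-- After a complete match the failure function of the final table
-- lets the search continue, so overlapping matches are found too.
matchEnds :: Eq a => [a] -> [a] -> [Int]
matchEnds needle haystack =
    [ i | (i, t) <- zip [0 ..] (scanl next (makeTable needle) haystack), done t ]
```

For example, matchEnds "aa" "aaa" gives [2,3], because the two occurrences overlap, and matchEnds "abc" "aabc" gives [4].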

Correctness

It would be nice if we could be sure that what we have constructed is actually a substring matching algorithm. The easiest way to verify that is with a simple QuickCheck property:

prop_isSubstringOf :: [Bool] -> [Bool] -> Bool
prop_isSubstringOf as bs = (as `isSubstringOf` bs) == (as `isSubstringOf2` bs)

> Test.QuickCheck.test prop_isSubstringOf
OK, passed 100 tests.

It seems to work, that's great.

An interesting exercise would be to prove that what I have made here is equivalent to the naïve algorithm using equational reasoning. It would also be nice to compare it to the imperative Knuth-Morris-Pratt algorithm: is this actually KMP? Maybe next time.

footnotes
† Actually, this function was recently added under the, in my opinion, wrong^‡ name isInfixOf.
‡ It is wrong because while "a is a prefix of b" and "a is a suffix of b" are valid English sentences, there is as far as I know no such thing as "an infix of". Maybe "infix in", but not "of". </rant>
