Some time ago, someone asked how to encoded GADTs, *Generalized algebraic datatypes* in PureScript. That's actually a very interesting question!

*Warning*: lot of code with higher order types ahead!

Let's start from the very beginning. What are *generalised abstract datatypes*, GADTs? Ordinary algebraic data types can be parametrised by a type parameter, for example `Maybe`

or `List`

. However GADTs gives us ability to be less parametric, so that some constructors can produce only a value with some type argument.

The often used example is an expression type, and we will use one later. But let's begin with a request type to do interact with GitHub:

```
data GitHubRequest
= Repositories OwnerName
| Issues OwnerName RepoName
```

and we want to execute this request:

`exec :: GitHubRequest -> IO ???`

We hit a problem we cannot state the type. One solution is to have separate execution functions, like

```
repositories :: OwnerName -> IO [Repository]
issues :: OwnerName -> RepoName -> IO [Issue]
```

but it's not optimal. We cannot abstract over the requests. We could end up with lots of similar functions, normal, one using authentication, limiting the response.

```
repositories :: OwnerName -> IO [Repository]
repositories' :: Auth -> OwnerName -> IO [Repository]
repositories'' :: Auth -> Maybe MaxSize -> IO [Repositories]
```

The next step, still *Haskell98* is to use *phantom types*:

```
data GitHubRequest a =
= Repositories OwnerName
| Issues OwnerName RepoName
```

the only change is the additional type parameter, which isn't used anywhere. Now we can write function with type signature

`repositoriesReq :: OwnerName -> GitHubRequest [Repository]`

This is an improvement. Now we can combine `GitHubRequest`

with `Auth`

to perform authenticated requests But somehow this is still not good enough. We need to make `Request`

type opaque (e.g. use smart constructors), otherwise someone might construct a request value of the meaningless type. And the compiler won't catch that:

`Issues ownerName repoName :: GitHubRequest [Repositories]`

GADTs gives us ability to restrict the type of constructors:

```
data GitHubRequest a where
Repositories :: OwnerName -> GitHubRequest [Repository]
Issues :: OwnerName -> RepoName -> GitHubRequest [Issue]
```

Using this type, it's very straightforward to implement execution function

`execRequest :: GitHubRequest a -> IO a`

This pattern of defining "an expression language", in this case language of GitHub requests and separate interpreter, the `execRequest`

, is very common pattern in (typed) functional programming.

For the rest of the gist, we'll however concentrate on a different expression language. THis way we can have the code self contained, and hopefully help concentrate on actual type juggling. One can mechanically apply the ideas the GitHubRequest type, if you want so.

As this post is born as a literate Haskell file, first we need few language extensions and imports:

```
{-# LANGUAGE GADTs, RankNTypes, TypeOperators, ScopedTypeVariables, EmptyCase #-}
-- Leibniz equality
import Data.Eq.Type as Leibniz
import Data.Type.Equality as Eq
import Data.Functor.Product
```

## GADT

Let's start with GADTs as we do them in Haskell nowadays:

```
data GExpr a where
GZero :: GExpr Int -- 0
GSucc :: GExpr Int -> GExpr Int -- succ
GBool :: Bool -> GExpr Bool -- True / False
GEq :: GExpr a -> GExpr a -> GExpr Bool -- (==), this clause include existentials!
```

`GZero`

produces an expression of type `Int`

, and so on. In this example we also have an existential quantification in `GEq`

constructor. The type variable `a`

is not visible outside. Yet both parameters of `GEq`

must have the same index.

So how to encode this in a language without support for GADTs? The only thing we need is `RankNTypes`

(and some sort of impredicativity).

First let's talk about type equality. With GADTs we can have an actual witness of `(a ~ b)`

type equality constraint:

```
data a :~: b where
Refl :: a :~: a
```

but this is formulated as a GADT itself! Luckily we can use a different encoding of the equality, It seems a bit strange at first:

`data a := b = Refl { subst :: forall c. c a -> c a }`

Leibnizian equality states that two things are equal if you can substitute one for the other in all contexts.

The *all* is essential here.

Then we can encode GADTs as ADT by providing equality witness explicitly. For example the `GZero`

constructor is implemented under the hood as

`GZero :: Int ~ a => GExpr a`

so by dictionary translation (https://skillsmatter.com/skillscasts/8978-checking-and-translating-type-classes) we get

`LZero :: (Int := a) -> LExpr a`

Cool isn't it?

## Exists

The second complication, is that we need a way to encode existential quantification. Fortunately it's possible to encode using universal one. This is a real mind twister: The key idea is that `f a`

appears in the negative position. As actual type `a`

is existentially hidden, we need to provide functions which work with all possible `a`

to make use of the value.

```
-- | Existential quantification using the universal one.
newtype Exists f = MkExists { runExists :: forall r. (forall a. f a -> r) -> r }
```

```
-- For example existential list is isomorphic to integer
mkExists :: forall f a. f a -> Exists f
mkExists x = MkExists $ \f -> f x
existsEx :: Exists []
existsEx = mkExists ("foobar" :: [Char])
existsEx2 :: Exists [] -> Int
existsEx2 e = runExists e length
-- | This is 6
existsEx3 :: Int
existsEx3 = existsEx2 existsEx
```

## GADT as ADT

Now we have all needed ingredients to write `GExpr`

as an "ordinary" ADT.

```
data LExpr a
= LZero (Int := a)
| LSucc (Int := a) (LExpr a)
| LBool (Bool := a) Bool
| LEq (Bool := a) (Exists (LExpr :*: LExpr))
-- | Infix operators are handy!
type (:*:) = Product
-- To make work with this form a bit easier, we define smart constructors:
lzero :: LExpr Int
lzero = LZero refl
lsucc :: LExpr Int -> LExpr Int
lsucc = LSucc refl
lbool :: Bool -> LExpr Bool
lbool = LBool refl
leq :: LExpr a -> LExpr a -> LExpr Bool
leq u v = LEq refl (mkExists $ Pair u v)
```

And to proof that the forms are equivalent we can write functions transforming from one to another.

From GADT to ADT direction is easy, GHC does magic for us.

```
fromGExpr :: GExpr a -> LExpr a
fromGExpr GZero = lzero
fromGExpr (GSucc n) = lsucc (fromGExpr n)
fromGExpr (GBool b) = lbool b
fromGExpr (GEq u v) = leq (fromGExpr u) (fromGExpr v)
```

The reverse direction is more tricky as now we need to use 'subst' explicitly. Also to go subst in opposite direction we need to manually construct another proof.

```
toGExpr :: LExpr a -> GExpr a
toGExpr (LZero proof) = subst proof $ GZero
toGExpr (LSucc proof n) = subst proof $ GSucc $ subst (symm proof) $ toGExpr n -- symm :: a := b -> b := a
toGExpr (LBool proof b) = subst proof $ GBool b
toGExpr (LEq proof uv) = subst proof $ runExists uv $ \(Pair u v) ->
GEq (toGExpr u) (toGExpr v)
```

## Using the GADT as ADT

Now we have encoded our types, let's use them. Not surprisingly using GADT version is quite convenient. For example getting a Int value out:

```
gToInt :: GExpr Int -> Int
gToInt GZero = 0
gToInt (GSucc n) = succ (gToInt n)
```

GHC is smart enough to not complain about `GBool`

and `GEq`

cases, as they cannot happen, because we have `GExpr Int`

, not `GExpr Bool`

.

Let's try with a `LExpr`

version!

```
lToInt :: LExpr Int -> Int
lToInt (LZero _proof) = 0
lToInt (LSucc _proof n) = succ (lToInt n)
-- But now exhaustiveness checker complains about LBool and LEq cases?
-- Well, as we have Bool := Int "proof", we'll just use it!
lToInt (LBool proof b) = coerce proof b -- coerce :: a := b -> a -> b
lToInt l@(LEq proof _uv) = coerce proof $ lToBool $ subst (symm proof) l
```

Here we went a bit ahead, and used not yet defined `lToBool`

. Let's define it's GADT counter part first, as it's simpler.

```
gType :: GExpr a -> Either (a :~: Int) (a :~: Bool)
gType GZero {} = Left Eq.Refl
gType GSucc {} = Left Eq.Refl
gType GBool {} = Right Eq.Refl
gType GEq {} = Right Eq.Refl
gToBool :: GExpr Bool -> Bool
gToBool (GBool b) = b
gToBool (GEq u v) = case gType u of
Left Eq.Refl -> gToInt u == gToInt v
Right Eq.Refl -> gToBool u == gToBool v
-- Here we learn what existential parameter is, Int or Bool so we can use
-- the right equality!
```

Manual approach is actually not magical at all. We just extract the proof from inside the data type:

```
lType :: LExpr a -> Either (a := Int) (a := Bool)
lType (LZero proof) = Left $ symm proof
lType (LSucc proof _) = Left $ symm proof
lType (LBool proof _) = Right $ symm proof
lType (LEq proof _ ) = Right $ symm proof
-- Then the lToBool is straight-forward, though tedious, to write:
lToBool :: LExpr Bool -> Bool
lToBool e@(LZero proof) = coerce proof $ lToInt $ subst (symm proof) e
lToBool e@(LSucc proof _) = coerce proof $ lToInt $ subst (symm proof) e
lToBool (LBool proof b) = coerce proof b
lToBool (LEq _proof e) = runExists e $ \(Pair u v) -> case lType u of
Left proof -> lToInt (subst proof u) == lToInt (subst proof v)
Right proof -> lToBool (subst proof u) == lToBool (subst proof v)
```

## Interlude

We have shown how to encode GADTs with existentials using only *RankNTypes*. It's possible. However, there is non-trivial programming cost, as well, as probably non-trivial run-time performance cost as well.

The type equality constraints, and GADts do really help. We are able to express more problems in less space. Everything seems to be possible with Leibniz equality and `Exists`

, but at what cost!

## Few words about falsehood

As we have general recursion, we can cheat and construct a proof of any types being equal, for example

```
falsehood :: Int := Bool
falsehood = Leibniz.Refl proof
where
proof x = proof x
```

but this isn't very useful, obviously. Meta theory prevents us (or at least should!) from constructing terminating terms of such type.

On the other hand, with GADT equality definition, we can use ex falso quadlibet principle directly (we need `EmptyCase`

extension for this):

```
boom :: Int :~: Bool -> a
boom p = case p of {}
```

```
bazinga :: Int := Bool -> a
bazinga (Leibniz.Refl _f) = undefined -- ???
```

With built-in type-equality, the compiler knows that `Int`

ins't `Bool`

, but we give it a proof that it is. The principle ex falso quodlibet `P /\ ~P => _|_`

says that if we know "P" and not "P" then any proposition can be proven (or value of any type constructed).

In the *RankNType* framework (i.e. System F), the falsehood is encoded as "value of any type": forall a. a. With GADTs GHC gives us the proof, but here we need to construct it ourselves. And it's something we cannot do.

## Comparing expressions

One advanced exercise is heterogeonous equality. Given `GExpr a`

and `GExpr b`

we'd like to compare them. We have already seen that, we can recover the type argument via `gType`

, but direct comparison is still interesting:

In the definition we use do-notation, `Maybe`

is a monad! The trick I recently learn in Simon Peyton Jones keynote as Haskell eXchange There, by the way, similar function was defined: `typeEq :: TypeRep a -> TypeRep -> Maybe (a :~: b)`

(There is a type class with such operation: `GEq`

)

```
gequals :: GExpr a -> GExpr b -> Maybe (a :~: b)
gequals GZero GZero = Just Eq.Refl
gequals (GSucc n) (GSucc m) = do
Eq.Refl <- gequals n m
pure Eq.Refl
gequals (GBool a) (GBool b)
| a == b = Just Eq.Refl
| otherwise = Nothing
gequals (GEq x y) (GEq u v) = do
Eq.Refl <- gequals x u
Eq.Refl <- gequals y v
pure Eq.Refl
gequals _ _ = Nothing -- otherwise we give up
```

The Leibniz variant is far more complicated, as you could guessed.

```
lequals :: LExpr a -> LExpr b -> Maybe (a := b)
lequals (LZero p) (LZero q) = Just (Leibniz.trans (symm p) q)
lequals (LSucc p n) (LSucc q m) = do
_proof <- lequals n m
pure (Leibniz.trans (symm p) q)
lequals (LBool p a) (LBool q b)
| a == b = Just (Leibniz.trans (symm p) q)
| otherwise = Nothing
lequals (LEq p xy) (LEq q uv)
| existsEquals lequals' xy uv = Just (Leibniz.trans (symm p) q)
| otherwise = Nothing
lequals _ _ = Nothing
lequals' :: (LExpr :*: LExpr) a -> (LExpr :*: LExpr) b -> Maybe (a := b)
lequals' (Pair x y) (Pair u v) = do
proof <- lequals x u
_proof <- lequals y v -- it's irrelevant which proof we pick!
pure proof
```

We used a little helper to compare two existentials:

```
existsEquals
:: (forall a b. f a -> f b -> Maybe (a := b))
-> Exists f -> Exists f
-> Bool
existsEquals eq ex ey =
runExists ex $ \x ->
runExists ey $ \y ->
case eq x y of
Just _ -> True
Nothing -> False
-- >>> gequals (GEq GZero GZero) (GEq GZero GZero)
-- Just Refl
-- >>> lequals (leq lzero lzero) (leq lzero lzero)
-- Just Leibniz.Refl
```

Before the conclusion, we used an orphan instance for above:

```
instance Show (a := b) where
showsPrec _ _ = showString "Leibniz.Refl"
```

## Conclusion

There are familiar posts: Approximating GADTs in PureScript and referred paper GADTless Programming in Haskell 98 by Martin Sulzmann and Meng Wang. Both however miss existentials, which I important for the encoding of many GADts.

There aren't many reddit comments for the post. I agree, it's great that it's great you can solve problems without extending the language. On the other hand, I have no idea how inconvenient it is to work with GADTs encoded this way in the long run. It helps when the compiler (= machine) does at least some work for you!

Writing the GADTs encoding in PureScript felt a bit like writing type-level trickery in Haskell. With latter you might tray with *Agda* to convince yourself it's possible, and then encode using features available in GHC Haskell. Direct exploratory programming is quite hard, you have to know what you are doing.

Also there were concerns about writing binary functions. Hopefully the last equals example shows it's possible. Even you cannot write `unsafeCoerce`

from a `Int := Bool`

fact, you can still coerce non-matching types to each other:

```
sum :: forall a . T a -> T a -> T a
sum (B b1 p) (B b2 _) = B (b1 || b2) p
sum (I n1 p) (I n2 _) = I (n1 + n2) p
sum (I n p) (B b q) _ = I (n + b') p
where
b' = coerce (trans ...) b
```

However the `RankNTypes`

approach has limitations. SPJ mentioned the heterogeneous type equality: `data (a :: k) :~~: (a :: k')`

. I'm not sure what's the story with polymorphic kinds in PureScript, but if at some point you want to have that, you cannot, as we'd need a way to abstract over kinds:

```
data _≅′_ {k : Set} (a : k) {k′ : Set} (b : k′) : Set₁ where
subst' : (keq : (kc : Set → Set) → kc k → kc k′)
→ (c : k′ → Set) → (c (keq (λ t → t) a) → c b)
→ a ≅′ b
```

And before I stop, you might wonder: this all is in Haskell, is it actually possible in PureScript? Yes it is. Check the gist: https://gist.github.com/phadej/1e3cbf35495c2951860cd9effb7723e1.