\documentclass[../main.tex]{subfiles}

\begin{document}
\section{Encoding Artyst}%
\label{sec:encoding}

We have devised an encoding of \lang{} into System~T. The encoding has
seven phases. In general, each phase removes a specific type
constructor until only naturals and function types remain.  Sometimes
removing types requires introducing others; we will introduce lists
and C-style unions, which we will later need to remove. The seven
phases are:
\begin{enumerate}
\item changing the \roll{} operator so that all inductive components
  are collected together in a list.
\item encoding inductive types using a list-indexed heap.
\item encoding lists using eliminators.
\item introducing unions to encode sums as a tagged union.
\item encoding products as an indexed union.
\item encoding unions of System~T types.
\item removing syntactic sugar we introduced, such as the \arb{} operator that
  represents an arbitrary value of a given type.
\end{enumerate}

We will give two running examples throughout, both with regard to the
binary tree type \TODO{type syntax: \(\mu X. (\nat \to \nat) + X \times X\)},
where leaves are labelled by unary functions on naturals. In our first
example we construct a balanced binary tree of depth \(n + 1\),
filling leaves with a given function \systemtinline{f}. This will
demonstrate how we encode constructors for inductive types.
\begin{listing}[H]
\begin{systemt}
let balanced n f = primrec n with
  Zero     => roll (Leaf f)
| Suc tree => roll (Branch (tree, tree))
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

We will also use the \systemtinline{compose} function given in the
introduction. This will show how we encode destructors for inductive
types.

\subsection{Phase 1: Simplifying Roll}%
\label{subsec:simplify-roll}

Recall the typing judgement for \roll{} in \cref{M-fig:lang-ty}. The
argument has type \(\sub{A}{X/\mu X. A}\). Using substitution means that
inductive components can appear scattered throughout a vessel of this
type. Take the inductive type \(\mu X. \lgroup (1 + \nat \times X + \mu Y. (1 +
X \times Y)) \times (1 + X) \rgroup\) as an example. The inductive parameter
\(X\) appears in three separate locations, including within another
inductive type. A vessel of this shape could have any number of
inductive components.

Collecting all the inductive components into one location will make
future encoding steps much easier. We enforce this by removing the
\roll{} operator and adding the \roll*{} operator, which has the
following typing judgement.
\[
\begin{prooftree}
  \hypo{\judgement{\Gamma}{t}{\mathsf{List}~(\mu X.A)}}
  \hypo{\judgement{\Gamma}{u}{\sub{A}{X/\nat}}}
  \infer2{\judgement{\Gamma}{\roll*~t~u}{\mu X. A}}
\end{prooftree}
\]
Rather than including the inductive components within the argument of
\roll{}, we instead collect them into an external list. We fill the
vessel \(u\) with pointers to the values within \(t\). The new
operator satisfies the following equation:
\[
\dofold{\roll*~t~u}{x}{v} \coloneq \sub{v}{x/\mapkw{}~(\lambda i. \dofold{\mathsf{index}~t~i}{x}{v})~u}
\]

To transform an argument \(x\) for \roll{} into the arguments for
\roll*{}, we take the projections of the term
\[
  \mathsf{distrib}^{\nat}_{\Lambda X. A}~(
    \maptm{X}{A}~(
      \lambda y, acc. \tuple{\mathsf{snoc}~acc~y, \mathsf{length}~acc}
    )~x
  )~\mathsf{nil}
\]
The \mapkw{} replaces each inductive component \(\mu X. A\) with an
accumulator \(\mathsf{List}~(\mu X. A) \to \mathsf{List}~(\mu X. A) \times
\nat\), which extends a list with the component and records the
component's index in the list. We then ``distribute'' the accumulator
over the vessel using the \(\mathsf{distrib}\) meta-operator. We finally
give the empty list as the initial accumulator.
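
As an illustration, consider the argument
\(\mathsf{Branch}~\tuple{l, r}\) for the tree type from our running
examples. This is only a sketch: it assumes that \(\mathsf{distrib}\)
visits the components from left to right, and it writes \([l, r]\) as
shorthand for \(\mathsf{cons}~l~(\mathsf{cons}~r~\mathsf{nil})\). The
\mapkw{} step replaces \(l\) and \(r\) by accumulators. Starting from
\(\mathsf{nil}\), the first accumulator would return the pair on the
first line below, and the second accumulator, applied to the list
\([l]\), would return the pair on the second line.
\begin{align*}
\tuple{\mathsf{snoc}~\mathsf{nil}~l, \mathsf{length}~\mathsf{nil}} &= \tuple{[l], 0} \\
\tuple{\mathsf{snoc}~[l]~r, \mathsf{length}~[l]} &= \tuple{[l, r], 1}
\end{align*}
Under these assumptions, the two projections of the result are the list
\([l, r]\) and the vessel \(\mathsf{Branch}~\tuple{0, 1}\).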

Given a
well-formedness derivation \(\jdgmnt{ty}{\Psi}{A}\), a type variable \(X
\in \Psi\), a type environment \(\alpha\) and a type \(S\),
\(\mathsf{distrib}^{\alpha, S}_{\Lambda X. A}\) has type
\[
\suball{A}{\sub{\alpha}{X/S \to S \times \alpha(X)}} \to S \to S \times \suball{A}{\alpha}.
\]
\(\mathsf{distrib}\) calls each accumulator in the vessel in
sequence, threading the state of type \(S\) from one call to the
next. We give the details in \cref{M-sec:distrib}.

We require a few operations on lists to compute this transformation:
\begin{description}
  \item[\(\mathsf{nil}\)] the empty list;
  \item[\(\mathsf{cons}\)] adds an element to the head of a list;
  \item[\(\mathsf{snoc}\)] adds an element to the tail of a list;
  \item[\(\mathsf{length}\)] computes the number of elements in a
    list; and
  \item[\(\mathsf{index}\)] retrieves a value from a list by position.
\end{description}

We give the equations these operators satisfy below.
\begin{align*}
\mathsf{snoc}~\mathsf{nil}~v &= \mathsf{cons}~v~\mathsf{nil} &
\mathsf{snoc}~(\mathsf{cons}~t~u)~v &= \mathsf{cons}~t~(\mathsf{snoc}~u~v) \\
\mathsf{length}~\mathsf{nil} &= \zero &
\mathsf{length}~(\mathsf{cons}~t~u) &= \suc~(\mathsf{length}~u) \\
\mathsf{index}~(\mathsf{cons}~t~u)~\zero &= t &
\mathsf{index}~(\mathsf{cons}~t~u)~(\suc~n) &= \mathsf{index}~u~n
\end{align*}
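
For instance, writing \(a\) and \(b\) for two arbitrary elements (names
chosen only for this illustration), the equations unfold as follows:
\begin{align*}
\mathsf{snoc}~(\mathsf{cons}~a~\mathsf{nil})~b
  &= \mathsf{cons}~a~(\mathsf{snoc}~\mathsf{nil}~b)
   = \mathsf{cons}~a~(\mathsf{cons}~b~\mathsf{nil}) \\
\mathsf{length}~(\mathsf{cons}~a~(\mathsf{cons}~b~\mathsf{nil}))
  &= \suc~(\mathsf{length}~(\mathsf{cons}~b~\mathsf{nil}))
   = \suc~(\suc~\zero) \\
\mathsf{index}~(\mathsf{cons}~a~(\mathsf{cons}~b~\mathsf{nil}))~(\suc~\zero)
  &= \mathsf{index}~(\mathsf{cons}~b~\mathsf{nil})~\zero
   = b
\end{align*}
Note that the equations leave \(\mathsf{index}~\mathsf{nil}~n\)
unspecified, since the empty list has no valid positions.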

At the end of this phase, the \systemtinline{compose} example is
unchanged. The \systemtinline{balanced} example reduces to the below
code. A \systemtinline{Leaf} has no inductive components, so the list
passed to \roll*{} in the \zero{} branch is empty. The \suc{} branch
uses the same variable \systemtinline{tree} in two different positions
in the list. To keep our future examples small, we will instead use
the list \systemtinline{[tree]} with indices \systemtinline{(0, 0)}.
\begin{listing}[H]
\begin{systemt}
let balanced n f = primrec n with
  Zero     => roll2 []           (Leaf f)
| Suc tree => roll2 [tree, tree] (Branch (0, 1))
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

\subsection{Phase 2: Encoding Inductive Types}%
\label{subsec:inductive-types}

We use a modified heap encoding to encode regular types. We use a
\((\mathsf{List}~\nat)\)-indexed heap, but use naturals as
pointers. The idea is that the heap index describes the path taken
through the term to reach a particular entry, whilst the pointers
describe the next step along the path.

We chose to use a heap encoding over another encoding strategy by
elimination of the other strategies. Inductive types in \lang{} can
contain higher-order data, such as our tree of functions, which
prevents us from using G\"odel encodings. Using a local translation
makes writing the encoding easier, and as System~T does not have
polymorphism, we cannot use Church encodings. We need to be able to
write the fold operation, so we cannot use eliminator encodings. Thus
the only suitable encoding strategy is a heap encoding.

Unlike the description of the heap encoding in
\cref{M-subsec:encoding-strategies} we do not use the same type for
indices and pointers. We use \(\mathsf{List}~\nat\) as the index type,
representing a path through the term. We use the empty list to
indicate the root of the inductive value. Otherwise, the head of the
list selects which child to recurse into and its tail is the path
within this component. Instead of eagerly fixing pointers as described
earlier, we compute new paths lazily during a fold. This simplifies
the \roll*{} operation, as we do not need to fix up the entire heap.

\begin{figure}
  \begin{align*}
    \roll*~ts~x &\coloneq \tuple*{
      \suc~(\mathsf{max}~(\lambda t. t.0)~ts),
      \lambda i. \domatch*{i}{
        \mathsf{nil}. x;
        \mathsf{cons}(i, j). {(\mathsf{index}~ts~i).1~j}}}
    \\
    \dofold{t}{x}{u} &\coloneq \dolet
      {go}*{\doprimrec*{t.0}
        {\arb}
        {r}{\lambda i. \sub{u}{x/\mapkw~(\lambda n. r~(\mathsf{snoc}~i~n))~(t.1~i)}}
      }*{go~\mathsf{nil}}
  \end{align*}
  \caption{Phase 2 encoding of the \roll*{} and \foldkw{} operators.}\label{fig:phase-2-encode}
\end{figure}

More formally, we encode the type \(\mu X. A\) as \(\nat \times
(\mathsf{List}~\nat \to \sub{A}{X/\nat})\). We present the encoding of
\roll*{} and \foldkw{} in \cref{fig:phase-2-encode}. We add two new
operators for working with lists:
\begin{description}
  \item[\(\mathsf{max}\)] for calculating the maximum from a list,
    given a function converting values to naturals; and
  \item[\(\mathsf{match}\)] for pattern matching on a list as either
    \(\mathsf{nil}\) or \(\mathsf{cons}\).
\end{description}
These operators satisfy the following equations:
\begin{align*}
  \mathsf{max}~f~\mathsf{nil} &= 0 &
  \domatch{\mathsf{nil}}{\mathsf{nil}. f; \mathsf{cons}~x~y. g} &= f \\
  \mathsf{max}~f~(\mathsf{cons}~t~u) &= f~t \sqcup \mathsf{max}~f~u &
  \domatch{\mathsf{cons}~t~u}{\mathsf{nil}. f; \mathsf{cons}~x~y. g}
    &= \sub{g}{x/t, y/u}
\end{align*}

Computing the maximum value from a list is necessary to correctly
determine the recursive depth to use when folding over a value. It is
also the primary reason why infinite inductive types are
forbidden. Recall the inductive type \(\mu X. 1 + (\nat \to X)\) of
countable trees. To compute the recursive depth of such a tree, we
need to compute the maximum of an arbitrary countable sequence. Thus
we cannot encode such infinite types.

We also add \(\mathsf{match}\) on lists. We need this to determine
whether a path is addressing the root entry or one of its inductive
components.
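
To see how paths are resolved, here is a small sketch using the
encoding in \cref{fig:phase-2-encode}. The names \(l\), \(r\), \(x\)
and \(p\) are placeholders for this illustration, and \([l, r]\)
abbreviates \(\mathsf{cons}~l~(\mathsf{cons}~r~\mathsf{nil})\). Writing
\(h\) for the heap component \((\roll*~[l, r]~x).1\), the equations for
\(\mathsf{match}\) and \(\mathsf{index}\) give
\begin{align*}
h~\mathsf{nil} &= x &
h~(\mathsf{cons}~0~p) &= l.1~p &
h~(\mathsf{cons}~1~p) &= r.1~p
\end{align*}
The empty path returns the root vessel, while a non-empty path is
delegated to the heap of the selected component.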

We now return to our examples. After some beta reduction we recover
the following value for \systemtinline{balanced}:
\begin{listing}[H]
\begin{systemt}
let balanced n f = primrec n with
  Zero => (Suc (max (fun (d, h) => d) []), fun xs =>
    match xs with
      []      => Leaf f
    | x :: xs => snd (index [] x) xs)
| Suc (depth, heap) =>
    (Suc (max (fun (d, h) => d) [(depth, heap), (depth, heap)]), fun xs =>
      match xs with
        []      => Branch (0, 1)
      | x :: xs => snd (index [(depth, heap), (depth, heap)] x) xs)
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

And here is the updated value of \systemtinline{compose}:
\begin{listing}[H]
\begin{systemt}
let compose (depth, heap) =
  let go = primrec depth with
    Zero   => arb
  | Suc ih => fun index =>
    let update = fun i => ih (snoc index i) in
    let x = match heap index with
      Leaf f        => Leaf f
    | Branch (i, j) => Branch (update i, update j)
    in match x with
      Leaf f        => f
    | Branch (f, g) => fun x => f (g x)
  in go []
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

To keep our example small, we will perform a commuting conversion
within \systemtinline{compose} to reduce the two match statements into
one. After some further beta reductions, we obtain the simplified
definition:

\begin{listing}[H]
\begin{systemt}
let compose' (depth, heap) =
  let go = primrec depth with
    Zero   => arb
  | Suc ih => fun index =>
    let update = fun i => ih (snoc index i) in
    match heap index with
      Leaf f        => f
    | Branch (i, j) => fun x => update i (update j x)
  in go []
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

\subsection{Phase 3: Encoding Lists}%
\label{subsec:lists}

This phase uses an eliminator encoding for lists. Recall we have the
following operators for lists: \(\mathsf{nil}\), \(\mathsf{cons}\),
\(\mathsf{length}\), \(\mathsf{index}\), \(\mathsf{max}\),
\(\mathsf{snoc}\) and \(\mathsf{match}\). We will encode all of these
operators using only the \(\mathsf{length}\) and \(\mathsf{index}\)
eliminators.

More formally, we encode the type \(\mathsf{List}~A\) by the type
\(\nat \times (\nat \to A)\), where the first component is the length of the
list and the second is the index function. We will justify using these
eliminators by giving an encoding for each operator. Starting with the
constructors, we can encode \(\mathsf{nil}\) by the pair \(\tuple{0,
  \arb}\). The empty list has length zero, and as there are no valid
indices, we can give an arbitrary indexing function.

We encode \(\mathsf{cons}~t~u\), adding element \(t\) to the head of
the list \(u\), by
\[
\tuple{
  \suc~u.0,
  \lambda x.\mathsf{if}~x = \zero~
    \mathsf{then}~t~
    \mathsf{else}~u.1~(\mathsf{pred}~x)}
\]
The length of our new list is one larger than
the tail. To look up a value, we first test whether the index is
zero. If it is, we return the new head directly. Otherwise, we
decrement the index and look up its value in the tail. The encoding of
\(\mathsf{if}\) and equality is standard~\cite{if+equals}.

The encoding of \(\mathsf{snoc}~t~u\), adding element \(u\) to the
tail of the list \(t\), is similar:
\[
\tuple{
  \suc~t.0,
  \lambda x.\mathsf{if}~x = t.0~\mathsf{then}~u~\mathsf{else}~t.1~x
}
\]
The new list is again one item longer than the old list. When looking
up an item, we first check if the index is the last in the list. If it
is, we return the element we are adding to the tail. Otherwise, we
look up the index in the old list.
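
As a sanity check, we can unfold the encoding on a small case. This
sketch uses placeholder elements \(a\) and \(b\) and writes numerals
for iterated successors. Encoding
\(\mathsf{snoc}~(\mathsf{snoc}~\mathsf{nil}~a)~b\) from the inside out
gives
\begin{align*}
\mathsf{snoc}~\mathsf{nil}~a &\mapsto
  \tuple{1, \lambda x. \mathsf{if}~x = 0~\mathsf{then}~a~\mathsf{else}~\arb~x} \\
\mathsf{snoc}~(\mathsf{snoc}~\mathsf{nil}~a)~b &\mapsto
  \tuple{2, \lambda x. \mathsf{if}~x = 1~\mathsf{then}~b~\mathsf{else}~
    (\mathsf{if}~x = 0~\mathsf{then}~a~\mathsf{else}~\arb~x)}
\end{align*}
Applying the second index function to \(0\) and \(1\) returns \(a\) and
\(b\) respectively, and the length component is \(2\), as expected.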

We encode \(\mathsf{max}~f~t\) by primitive recursion on the length of
the list \(t\).
\[\doprimrec{t.0}{\zero}{x}{(x.1 - f~x.0) + f~x.0}\]
We compute the binary maximum by performing a truncated subtraction
followed by an addition. These both have standard
encodings~\cite{add+sub}. Note that we use the inductive hypothesis on
the left of the subtraction so that a naive partial evaluator can
reduce the maximum of a singleton list to a single value.
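
The identity behind this step is \((a - b) + b = a \sqcup b\), where
\(-\) is truncated subtraction. A quick check with concrete numbers,
chosen only for illustration:
\begin{align*}
(5 - 3) + 3 &= 2 + 3 = 5 = 5 \sqcup 3 &
(2 - 7) + 7 &= 0 + 7 = 7 = 2 \sqcup 7
\end{align*}
When \(a \geq b\) the subtraction is exact and the addition restores
\(a\); otherwise the subtraction truncates to zero and the sum is
\(b\).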

The final operator to encode is pattern matching. We achieve this by
inspecting the length of the list to match.
\begin{multline*}
\domatch{t}{
  \mathsf{nil}. f;
  \mathsf{cons}(x, y). g
} \coloneq \\
\mathsf{if}~t.0 = \zero~\mathsf{then}~f~\mathsf{else}~\sub{g}{
  x/t.1~\zero, y/\tuple{\mathsf{pred}~t.0, \lambda i.~t.1~(\suc~i)}
}
\end{multline*}
The tricky part of this definition is computing the head and tail of a
non-empty list. We retrieve the head by calling the index function
with index zero. The tail is one shorter than the initial list, and
the index function is shifted by one.
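
For example, take an encoded list \(\tuple{2, g}\), where \(g\) is a
placeholder index function. The length is not zero, so the match
takes the \(\mathsf{cons}\) branch with the substitution
\[
x \mapsto g~\zero \qquad y \mapsto \tuple{1, \lambda i.~g~(\suc~i)}
\]
The element at index zero of the tail is therefore \(g~(\suc~\zero)\),
the second element of the original list.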

We have shown that \(\mathsf{length}\) and \(\mathsf{index}\) are
sufficient to produce an eliminator encoding for lists. We cannot add
\(\mathsf{nil}\), \(\mathsf{cons}\) nor \(\mathsf{snoc}\) to the set
of eliminators, as these all construct lists. Similarly, pattern
matching ``constructs'' the tail of a non-empty list. The only other
operator we could possibly add as an eliminator is
\(\mathsf{max}\). There are two main reasons we have not done
this. Firstly, the maximum is only computed for a small number of
lists. In our running examples we compute the maximum only twice,
whereas we use lists throughout. Carrying redundant data around for an
infrequent operation is inefficient and would complicate the
encoding. Secondly, \(\mathsf{max}\) interacts poorly with pattern
matching. The only way to correctly calculate the maximum of the tail
of a list is to start from scratch. Whilst for our purposes an
overestimate is acceptable, carrying data we need to recompute is
inefficient.

After phase three, our example for \systemtinline{balanced} beta
reduces to the following:
\begin{listing}[H]
\begin{systemt}
let balanced n f = primrec n with
  Zero => (1 , fun (length, idxs) =>
    if length == 0 then Leaf f else
      snd arb (length - 1, fun i => idxs (Suc i)))
| Suc (depth, heap) => (Suc ((depth - depth) + depth), fun (length, idxs) =>
    if length == 0 then Branch (0, 1) else
      let x = idxs 0 in
      let dh =
        if     x == 0 then (depth, heap) else
        if x - 1 == 0 then (depth, heap) else
          arb
      in snd dh (length - 1, fun i => idxs (Suc i)))
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

And \systemtinline{compose'} reduces to:
\begin{listing}[H]
\begin{systemt}
let compose' (depth, heap) =
  let go = primrec depth with
    Zero   => arb
  | Suc ih => fun (length, idxs) =>
    let update = fun i => ih (Suc length, fun j =>
                   if j == length then i else idxs j) in
    match heap (length, idxs) with
      Leaf f        => f
    | Branch (i, j) => fun x => update i (update j x)
  in go (0, arb)
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

\subsection{Phase 4: Encoding Sums}%
\label{subsec:sums}

In this phase we remove sums from the language by encoding them as
tagged C-style unions, following the work of \textcite{oleg}. We
encode the type \(\sum_i A_i\) by the pair \(\nat \times \bigsqcup_i A_i\) of
a tag indicating which case we are in, and a union which can contain a
value from any case.

Unions have two operators: \(\mathsf{inj}~i~t\) and
\(\mathsf{prj}~t~i\) for injecting and projecting values at type
\(A_i\) respectively. When the two operators have the same index,
unions have the beta reduction rule
\(\mathsf{prj}~(\mathsf{inj}~i~t)~i = t\). If the two indices are
different then projection is stuck.

We encode the injection into a sum \(\tuple{i, t}\) by the pair
\(\tuple{i, \mathsf{inj}~i~t}\). We encode pattern matching
\((\casetm{t}{\tuple{i,x_i}}{t_i}{i})\) by the term \(
(\casetm{t.0}{i}{\sub{t_i}{x_i/\mathsf{prj}~t.1~i}}{i})
\) performing a pattern match over the tag to find the correct branch
to take. The pattern match on the right will be desugared into a
sequence of equality tests in phase seven.

\FIXME{these examples are hard to read}

Our two examples reduce even further. We obtain the following for
\systemtinline{balanced}:

\begin{listing}[H]
\begin{systemt}
let balanced n f = primrec n with
  Zero => (1, fun (length, idxs) =>
    if length == 0 then (0 , inj 0 f) else
      snd arb (length - 1, fun i => idxs (Suc i)))
| Suc (depth, heap) => (Suc ((depth - depth) + depth), fun (length, idxs) =>
  if length == 0 then (1, inj 1 (0, 1)) else
    let x = idxs 0 in
    let dh =
      if     x == 0 then (depth, heap) else
      if x - 1 == 0 then (depth, heap) else
        arb
    in snd dh (length - 1, fun i => idxs (Suc i)))
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

The \systemtinline{compose'} example demonstrates how pattern matching
is encoded:

\begin{listing}[H]
\begin{systemt}
let compose' (depth, heap) =
  let go = primrec depth with
    Zero   => arb
  | Suc ih => fun (length, idxs) =>
    let update = fun i => ih (Suc length, fun j =>
                   if j == length then i else idxs j) in
    let (tag, v) = heap (length, idxs) in
    match tag with
      0 => prj v 0
    | 1 => let (i, j) = prj v 1 in fun x => update i (update j x)
  in go (0, arb)
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

\subsection{Phase 5: Encoding Products}%
\label{subsec:products}

We will continue following the work of \textcite{oleg} to encode away
products. A product \(\prod_i A_i\) is encoded as a function \(\nat \to
\bigsqcup_i A_i\) from indices to values. This is similar to the
encoding for lists, with two small variations. First, we
statically know the length of a product, so we do not need to include
it within its type. Second, a product can store values of
different types whilst a list is homogeneous, so we use the
union to make the product homogeneous.

We encode tupling \(\tuple{\rangeover{t_i}{i}}\) as the case split \(\lambda
x. \casetm{x}{i}{\mathsf{inj}~i~t_i}{i}\). The projection \(t.i\) is
encoded as the application \(\mathsf{prj}~(t~i)~i\).
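
As a small sketch, take a pair of placeholder terms \(a\) and
\(b\). Writing the case split informally (the \(\mathsf{case}\)
notation below is used only for this illustration), the pair
\(\tuple{a, b}\) encodes as
\[
p \coloneq \lambda x.~\mathsf{case}~x~\mathsf{of}~
  \{0 \mapsto \mathsf{inj}~0~a;\ 1 \mapsto \mathsf{inj}~1~b\}
\]
and the projection of its second component reduces using the beta
rule for unions:
\[
\mathsf{prj}~(p~1)~1 = \mathsf{prj}~(\mathsf{inj}~1~b)~1 = b
\]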

At this phase the encodings of our example functions,
\systemtinline{balanced} and \systemtinline{compose'}, become too
cluttered to be useful. Instead we will consider the
\systemtinline{dupfirst} function, of type \((\nat \to \nat) \times \nat \to
(\nat \to \nat) \times (\nat \to \nat) \times \nat\), which takes a pair of a
function and value, and duplicates the first component of the pair.
Originally defined as \systemtinline{let dupfirst t = (t.0, t.0, t.1)},
after encoding products the function becomes
\begin{listing}[H]
\begin{systemt}
let dupfirst t = fun x => match x with
  0 => inj 0 (prj (t 0) 0)
| 1 => inj 1 (prj (t 0) 0)
| 2 => inj 2 (prj (t 1) 1)
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

\subsection{Phase 6: Encoding Unions}%
\label{subsec:unions}

At this point, the only type former we use that is not present in
System~T is the union type.\@ \textcite{oleg} gives an inductive
encoding for binary unions. We instead use an encoding for unions
derived from the argument form of types. Given we have a family of
types \(A_i\) in argument form, their union \(\bigsqcup_i A_i\) is the
concatenation \(A_1 \append A_2 \append \cdots \append A_n\). To inject
type \(A_k\) into the union, we ignore the function arguments for all
the other types. To project type \(A_k\) out of the union, we pass
\(\mathsf{arb}\) to all the other arguments.
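
As a sketch, consider the three component types of the result of
\systemtinline{dupfirst}, indexed from zero as in the listings:
\(A_0 = \nat \to \nat\), \(A_1 = \nat \to \nat\) and \(A_2 = \nat\). We
assume here that the argument form of \(\nat \to \nat\) has a single
argument of type \(\nat\) and that of \(\nat\) has none; the names
\(g\) and \(h\) are placeholders. The union is then
\(\nat \to \nat \to \nat\), and injection and projection at index
\(0\) would read
\begin{align*}
\mathsf{inj}~0~g &= \lambda x, y.~g~x &
\mathsf{prj}~h~0 &= \lambda x.~h~x~\arb
\end{align*}
Composing the two gives
\(\mathsf{prj}~(\mathsf{inj}~0~g)~0 = \lambda x.~g~x\), which agrees
with \(g\) up to eta-equivalence.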

Using this argument-form union, we remove the need to perform
induction on types, and only have to iterate over the number of types
in the union. This also simplifies the proof that our encoding of the
union satisfies the required beta reduction rule. In exchange, our
union encoding is neither idempotent nor commutative, and generally
results in larger types than \posscite{oleg} encoding.

The \systemtinline{dupfirst} example reduces to the below. If we
instead used \posscite{oleg} encoding then the returned function would
take only a single argument \systemtinline{x}.
\begin{listing}[H]
\begin{systemt}
let dupfirst t = fun x => match x with
  0 => fun x y => t 0 x
| 1 => fun x y => t 0 y
| 2 => fun x y => t 1 arb
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

\subsection{Phase 7: Desugaring}%
\label{subsec:desugar}

This final phase of encoding performs desugaring. Three operations
remain to be encoded: case splitting on a natural number, constructing
an arbitrary value of a type, and
\letkw{} expressions.

We encode case splitting on a number by a chain of equality tests. If
all the tests fail, we will return an arbitrary value. We can
construct an arbitrary value at any type by using the function that
constantly returns zero. Let expressions are given their usual
functional encoding as an abstraction applied immediately to an
argument.

The \systemtinline{dupfirst} example desugars into the following
expression:
\begin{listing}[H]
\begin{systemt}
let dupfirst t = fun x =>
  if x == 0 then fun x y => t 0 x else
  if x == 1 then fun x y => t 0 y else
  if x == 2 then fun x y => t 1 0 else
    fun x y => 0
\end{systemt}
\vspace{-\baselineskip}
\end{listing}

\end{document}