\documentclass[../main.tex]{subfiles}

\begin{document}
\section{Writing ARTyST Programs}%
\label{sec:reducer}

We have used \lang{} to write two non-trivial programs that can be
embedded in System~T. The first is a fuelled reducer for
System~T. Given \(n\) units of fuel, the reducer will perform \(n\)
steps of complete~development~\cite{cd} on a System~T term. The second
program is the encoding from \lang{} to System~T. This takes a
type-annotated \lang{} program and encodes it as a System~T term.

\subsection{Fuelled System~T Reducer}%
\label{subsec:reducer}

Given enough fuel to work with, a fuelled reducer takes a term as
input and reduces it to normal form. Our fuelled reducer operates
using term rewriting. For each unit of fuel, we perform one step of
complete development: we reduce \emph{all} \(\beta\)-redexes in the input
term simultaneously. As System~T is confluent and strongly
normalising, complete development will converge on the normal form of
the input term.
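For example, a single step of complete development takes
\((\lambda x.\, \suc~x)\,((\lambda y.\, y)~\mathsf{zero})\) to
\(\suc~\mathsf{zero}\), contracting both redexes at once, where a
one-redex-at-a-time strategy would need two steps.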

We define System~T terms as the inductive type in \cref{lst:term-ty}.
We use the keyword \verb|data| to introduce a fixed point type. Sums
are given in square brackets and products in angle brackets.  As
\lang{} is simply typed, there is no way to statically enforce that
terms are correctly typed or correctly scoped, but one can write scope
and type checkers within \lang{}.

\begin{listing}
  \begin{verbatim}
data Term
  [ Var: Nat
  ; Zero: <>
  ; Suc: Term
  ; Primrec: <Zero: Term; Suc: Term; Target: Term>
  ; Abs: Term
  ; App: <Fun: Term; Arg: Term>
  ]
  \end{verbatim}
  \caption{Inductive type representing System~T terms.}\label{lst:term-ty}
\end{listing}

An example function is \verb|rename|, which takes a term and a
renaming function of type \verb|Nat -> Nat| and updates every variable
according to that renaming. The program is in \cref{lst:rename}. We use
\verb|~| as the syntax for \roll{} and \verb|foldcase| is syntactic
sugar for performing a \foldkw{} whose body immediately pattern
matches on the induction hypothesis. Recall that a fold fills the
shape type with the motive. For example, for the \verb|App| branch,
\verb|p| has type
\verb|<Fun: (Nat -> Nat) -> Term; Arg: (Nat -> Nat) -> Term>|. As
application doesn't bind any variables, we can pass the renaming
\verb|f| to the inductive components without modification. For
abstraction and primitive recursion, which bind a variable, we have to
lift the renaming function; the \verb|lift| helper is sketched after
the listing.

\begin{listing}
  \begin{verbatim}
let rename (t : Term) : (Nat -> Nat) -> Term =
  foldcase t by
    { Var n     => \f => ~(Var (f n))
    ; Zero _    => \f => ~(Zero <>)
    ; Suc r     => \f => ~(Suc (r f))
    ; Primrec p => \f =>
      ~(Primrec <Zero: p.Zero f; Suc: p.Suc (lift f); Target: p.Target f>)
    ; Abs r     => \f => ~(Abs (r (lift f)))
    ; App p     => \f => ~(App <Fun: p.Fun f; Arg: p.Arg f>)
    }
  \end{verbatim}
  \caption{Implementation of term renaming.}\label{lst:rename}
\end{listing}
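
The \verb|lift| helper itself is not shown in \cref{lst:rename}. The
following Haskell sketch (our rendering, not \lang{} syntax) gives the
standard definition for renamings over de~Bruijn indices: the newly
bound index \verb|0| is kept fixed, and every other index is renamed
and then shifted past the binder.

\begin{verbatim}
-- Lift a renaming under one binder: index 0 refers to the new bound
-- variable and must not change; index n+1 refers to what used to be
-- index n outside the binder, so it is renamed by f and re-shifted.
lift :: (Int -> Int) -> (Int -> Int)
lift _ 0 = 0
lift f n = f (n - 1) + 1
\end{verbatim}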

The main driver in the reducer is the \verb|step| function, which
performs a complete development of a term. We give its definition in
\cref{lst:step}. By induction, all terms within the induction hypothesis
have been completely developed. Most cases pass their arguments
through unchanged: for instance, the only reductions for \(\suc~t\)
happen within \(t\). The two interesting cases are \verb|Primrec| and
\verb|App|. Here we match on the ``active'' term, performing a
reduction if possible. If the active term is either neutral
(\verb|Var|) or absurd (such as \verb|Zero| in function position), we
return the term rebuilt from the inductive arguments without
performing any further computation. Our fuelled reducer applies the
\verb|step| function to the input once for each unit of fuel it is
given.

\begin{listing}
  \begin{verbatim}
let step (t : Term) : Term =
  foldcase t by
    { Var n     => ~(Var n)
    ; Zero _    => ~(Zero <>)
    ; Suc t     => ~(Suc t)
    ; Primrec p =>
      let default = ~(Primrec p) in
      case p.Target of
        { Var _     => default
        ; Zero _    => p.Zero
        ; Suc t     =>
          sub p.Suc [ ~(Primrec <Zero: p.Zero; Suc: p.Suc; Target: t>) ]
        ; Primrec _ => default
        ; Abs _     => default
        ; App _     => default
        }
    ; Abs t     => ~(Abs t)
    ; App p     =>
      let default = ~(App p) in
      case p.Fun of
        { Var _     => default
        ; Zero _    => default
        ; Suc t     => default
        ; Primrec _ => default
        ; Abs t     => sub t [ p.Arg ]
        ; App _     => default
        }
    }
  \end{verbatim}
  \caption{Implementation of complete development.}\label{lst:step}
\end{listing}
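
The top-level fuelled loop is then a straightforward iteration of
\verb|step|. The following Haskell sketch (the name \verb|reduce| is
ours, not the paper's) shows its shape, written generically so that it
is independent of the term representation; in \lang{} itself the same
loop is a primitive recursion on the fuel.

\begin{verbatim}
-- Spend one unit of fuel per complete-development step; when the fuel
-- runs out, return whatever term has been reached so far.
reduce :: Int -> (a -> a) -> a -> a
reduce n step t
  | n <= 0    = t
  | otherwise = reduce (n - 1) step (step t)
\end{verbatim}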

Our reducer only applies \(\beta\)-reduction. It would also be possible
to perform \(\eta\)-contraction, since computing a strengthened term
requires only structural induction.

\subsection{Encoding Algorithm}%
\label{subsec:encoding}

In \cref{M-sec:encoding} we described an algorithm to embed \lang{}
within System~T. We have written this algorithm in \lang{} itself,
simultaneously mechanising the conversion and indirectly proving it is
terminating. Our input is a type-annotated \lang{} term, and we output
an unannotated System~T term as in \cref{lst:term-ty}. Each phase is
accompanied by its own inductive types representing that phase's types
and terms. To compute the encoding, any term formers that are not in
System~T must be annotated with their types. For simplicity, our
finite sums and products are over unlabelled lists of types instead of
finite sets.
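
To make this representation concrete, the following Haskell sketch
gives a hypothetical rendering of such an annotated term type; the
constructor names are illustrative and are not the encoder's actual
data type. Sum and product formers range over unlabelled lists of
types, and the formers with no System~T counterpart carry the
annotations the encoding needs.

\begin{verbatim}
-- Illustrative only: the real encoder defines one such pair of
-- inductive types per phase.
data Ty = NatTy
        | Arrow Ty Ty
        | Sum  [Ty]          -- finite sum over an unlabelled list of types
        | Prod [Ty]          -- finite product over an unlabelled list of types

data Tm = Var Int
        | Lam Tm
        | App Tm Tm
        | Inj  Int [Ty] Tm   -- injection into the i-th summand, annotated
        | Case Ty  Tm  [Tm]  -- case split, annotated with its result type
        | Tup  [Ty] [Tm]     -- tuple, annotated with its component types
        | Proj Int [Ty] Tm   -- projection of the i-th component, annotated
\end{verbatim}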

Folds in \lang{} are destructive---the inductive components used to
generate the induction hypothesis are inaccessible. This has some
interesting interactions with encoding the \(\mathsf{distrib}\) term
in phase one, and implementing the \mapkw{} pseudo-operator in phase
two. Both of these functions are of the form \(\text{Type} \to \text{Env}
\to \text{Term}\). Our term representations require type annotations,
but any subtypes are inaccessible, so we program \(\mathsf{distrib}\)
and \(\mathsf{map}\) to return a pair of a term and its type.

\begin{listing}
  \begin{verbatim}
; Arrow p =>
  let myTy = ~(Arrow <Dom: p.Dom.Ty []; Cod: p.Cod.Ty []>) in
  < Ty:   \env => myTy
  ; Term: \env, k, s => ~(Abs ~(Abs ~(Tup
    [ <Ty: s;    Term: ~(Var 0)>
    ; <Ty: myTy; Term: ~(Var 1)>])))
  >
  \end{verbatim}
  \caption{The branch of \(\mathsf{distrib}\) corresponding to
    function types.}\label{lst:distrib-arrow}
\end{listing}

As a concrete example, consider the branch of \(\mathsf{distrib}\)
over a function type, shown in \cref{lst:distrib-arrow}. The motive
type is
\verb|<Ty : List Type -> Type; Term: List Type -> Nat -> Type>|. The
\verb|Ty| component is the result of type substitution and the
\verb|Term| component is the result of \(\mathsf{distrib}\).  Ideally,
we would only need the \verb|Term| component of the pair. However, the
type annotations needed for the tuple force us to also compute the
types.

From a language design standpoint, we could have made \foldkw{}
non-destructive. The induction hypothesis would then have the form
\(\sub{A}{X/(\mu X. A) \times B}\) for shape \(A\) and motive \(B\). This is
exactly as expressive as the chosen form, whose induction hypothesis is
\(\sub{A}{X/B}\): a non-destructive fold is recoverable from a destructive
one by folding with the motive \((\mu X. A) \times B\) and rebuilding the
subterm in the first component. We chose the simpler motive as many
folds do not need the non-destructive argument.
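
This equivalence is the familiar relationship between paramorphisms and
catamorphisms from the recursion-schemes literature. The following
Haskell sketch (our own illustration, not \lang{} code) shows the
construction: the non-destructive fold \verb|para| is defined from the
ordinary fold \verb|cata| by tupling the motive with a rebuilt copy of
the subterm and projecting at the end.

\begin{verbatim}
newtype Fix f = In (f (Fix f))

-- Ordinary (destructive) fold: subterms are replaced by the motive.
cata :: Functor f => (f b -> b) -> Fix f -> b
cata alg (In x) = alg (fmap (cata alg) x)

-- Non-destructive fold: the hypothesis also carries the original
-- subterm, obtained by folding with the paired motive (Fix f, b).
para :: Functor f => (f (Fix f, b) -> b) -> Fix f -> b
para alg = snd . cata (\x -> (In (fmap fst x), alg x))
\end{verbatim}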

\subsection{Composing the Two Programs}

Our encoding algorithm introduces many administrative
\(\beta\)-redexes. We manually reduced many of them in the examples at the
end of each phase in \cref{M-sec:encoding}. The final desugaring in
phase seven also introduced many redexes when eliminating \letkw{}
expressions. Thus the ``efficiency'' of an encoded program can be
improved by running it through our fuelled reducer as a simple
partial-evaluation pass.

Even with a partial reduction pass, our encoder can still have some
performance issues. Neither \lang{} nor System~T has \emph{commuting
conversions}: equations describing how two case splits, folds, or
primitive recursions interact. This means that two contextually
equivalent terms need not be equal in the equational theory. An example
of this is the pair of \systemtinline{compose} and
\systemtinline{compose'} functions from encoding phase two. Because it
performs fewer pattern matches, \systemtinline{compose'} would have a
smaller encoding than \systemtinline{compose}, despite containing the
same information.

There are no equations we can add to resolve this for \foldkw{} or
\primreckw{}. \TODO{how do I mitigate this?}

\end{document}