8 files changed, 572 insertions, 0 deletions
diff --git a/sec/compiler.ltx b/sec/compiler.ltx
new file mode 100644
index 0000000..dde2856
--- /dev/null
+++ b/sec/compiler.ltx
@@ -0,0 +1,18 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{Idris 2 Compiler}%
+\label{sec:compiler}
+
+\TODO{
+  \begin{itemize}
+    \item QTT basics
+    \item intrinsically well-scoped AST
+    \item bidirectional judgement
+    \item well-typed data types
+    \item type checking function
+    \item irrelevant pattern matching
+  \end{itemize}
+}
+
+\end{document}
diff --git a/sec/embedding.ltx b/sec/embedding.ltx
new file mode 100644
index 0000000..722eff1
--- /dev/null
+++ b/sec/embedding.ltx
@@ -0,0 +1,65 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{Embedding in System T}%
+\label{sec:embedding}
+
+A major motivation for \lang{} is the ability to write terms with regular types
+and embed them in System~T. We explain how the embedding works, which encodes
+products, sums and recursive types as System~T types.
+
+Every pair of System~T types has a union (in the sense of C). The union has
+injections into it and projections out of it. The specification of the union
+type, \(A \sqcup B\) for types is:
+\[
+  \begin{array}{cc}
+    \mathsf{inl} : A \to A \sqcup B &
+    \mathsf{inr} : B \to A \sqcup B \\
+    \mathsf{prl} : A \sqcup B \to A &
+    \mathsf{prr} : A \sqcup B \to B \\
+    \mathsf{prl}~(\mathsf{inl}~t) = t &
+    \mathsf{prr}~(\mathsf{inr}~t) = t
+  \end{array}
+\]
+\Cref{fig:union-impl} shows \posscite{Squid:online/Kiselyov22} implementation of
+unions. We can extend this to n-ary unions \(\bigsqcup_i^n A_i\) in the usual
+way. We pick \(\nat\) to represent the empty union arbitrarily.
+
+With unions, it becomes easy to encode products.
+\textcite{Squid:online/Kiselyov22} uses the functional encoding \(\prod_i^n A_i
+\coloneq \nat \to \bigsqcup_i^n A_i\), representing the product as a
+heterogeneous vector. The \(k\)th component of the product is accessed by
+applying the product to \(k\). Tupling does a case analysis of the argument,
+returning the corresponding value.
+
+\textcite{Squid:online/Kiselyov22} also describes how to encode sums using
+products and unions. A sum is encoded as a tagged union \(\sum_i^n A_i \coloneq
+\nat \times \bigsqcup_i^n A_i\), where the tag indicates which case of the sum
+is being stored. Case analysis first inspects the tag to decide which branch to
+take, downcasts the value to the correct type, and then performs the branch
+action. Note this encoding is not consistent: the empty type is necessarily
+inhabited, as all System~T types are. This inconsistency is necessary for the
+encoding of recursive types.
+
+Recursive types are the most difficult to encode.
+\textcite{DBLP:books/sp/LongleyN15} describe one strategy: represent values as a
+heap. We have the encoding \(\mu X. A \coloneq \nat \times \nat \times (\nat \to
+\sub{A}{X/\nat})\). The first component is the ``depth'' of the heap: an upper
+bound on the number of pointers to follow on any arbitrary path. The second
+component represents the root index of the heap---the address where the data
+structure starts. The final component is the heap itself. Each location stores
+data associated with its value. Any recursion is replaced with a pointer to a
+different location in the heap.
+
+\TODO{example}
+
+Folding is easier to compute than rolling. The precise encoding is given below.
+We compute the fold across the heap in parallel, iteratively increasing the
+depth of paths we can safely use. The initial result is arbitrary: we don't have
+any data to operate on.
+
+\begin{multline*}
+\TODO{fold}
+\end{multline*}
+
+\end{document}
diff --git a/sec/intro.ltx b/sec/intro.ltx
new file mode 100644
index 0000000..00a6d5b
--- /dev/null
+++ b/sec/intro.ltx
@@ -0,0 +1,103 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{Introduction}%
+\label{sec:intro}
+
+System~T is an extension of the simply-typed lambda calculus adding natural
+numbers and primitive recursion. This minimality makes it an excellent calculus
+to model, and yet the calculus is expressive enough to perform a wide array of
+useful calculations. This includes most arithmetic, \TODO{middle thing} and even
+rewriting of lambda terms.
+
+The restricted type system of System~T makes expressing complex data difficult.
+Often structured data has to be encoded as naturals or first-order functions.
+Encoding data within a natural number often makes structural recursion
+exceedingly complex. Whilst using first-order functions can make recursion
+easier, it is often the case that passing different arguments can return
+different shapes of data.
+
+Consider the following System~T pseudocode:
+\begin{listing}[H]
+  \begin{systemt}
+  let sum (tree : nat -> nat -> nat -> nat -> nat) : nat =
+    let depth = tree 0 0 0 0 in
+    let root  = tree 1 0 0 0 in
+    let heap  = tree 2 in
+    let go : nat -> nat = primrec depth with
+      Z      => fun i => i
+    | S(acc) => fun i =>
+      let tag = heap i 0 0 in
+      let v   = heap i 1 in
+      if tag == 0 then
+        v 0
+      else
+        acc (v 0) + acc (v 1)
+  \end{systemt}
+  \vspace{-\baselineskip}
+\end{listing}
+From the name and type, one can deduce this performs a sum over some tree
+structure. With some more thought, you may be able to work out that
+\systemtinline{tree} is a binary tree. This information is not at all available
+in its type. It is also unclear why \systemtinline{tree 0 0 0 0} should return
+the depth of the tree and \systemtinline{tree 1 0 0 0} the root.
+
+In other functional programming languages, the same sum function would look
+something like the following OCaml code:
+\begin{listing}[H]
+  \begin{minted}{ocaml}
+  let sum tree = match tree with
+    Leaf(v)             -> v
+  | Branch(left, right) -> sum left + sum right
+  \end{minted}
+  \vspace{-\baselineskip}
+\end{listing}
+In this example it is immediately obvious what the function does. The code is
+first order reducing its complexity and the richer type system makes it easier
+to identify the role of different variables. These additional features of OCaml
+and other such languages improve the closeness of mapping and
+role-expressiveness of the code whilst reducing the error-proneness (\TODO{check
+  this}). This is great for human programmers but this comes at a modelling
+cost. Unlike System~T, most functional languages allow non-termination. For
+example, OCaml would have no issue with us writing
+\mintinline{ocaml}{sum tree + sum right} by mistake.
+
+In this work we present \lang{}, a new intermediate language that compiles into
+System~T. \lang{} extends the simply-typed lambda calculus with the regular
+types~\cite{Squid:unpublished/McBride01}: finite products, finite sums and some
+inductive types.\@ \lang{} has most of the expressive types we are used to from
+functional languages---we can write code similar to the second example---whilst
+still being simple enough to represent within System~T.
+
+The type encodings used to compile \lang{} into System~T are
+well-known~\cites{Squid:online/Kiselyov22}{DBLP:books/sp/LongleyN15}. Encoding
+products and sums relies on C-style union types, which exist due to System~T
+having only naturals and function types. Recursive types are represented using a
+heap with an explicit depth, informally described by
+\textcite{DBLP:books/sp/LongleyN15}. We have written this compilation algorithm
+within \lang{} itself.
+
+To aid in the development of this program, we also constructed a type checker
+for \lang{} in Idris~2~\cite{idris}. The typechecker uses intrisically
+well-scoped syntax. The syntax is represented by an indexed data type
+\texttt{Term}, where the index describes the variables that are in scope. By
+using quantities~\cite{quants} and auto-implicit search, the compiler code is
+written in a named style, yet compiles down to using deBruijn indices. Using
+names helps to reduce errors in the code, and the compilation to deBruijn
+indices means we don't pay any runtime cost.
+
+%% \subsubsection*{Contributions}%
+%% \label{sec:contributions}
+%% \begin{itemize}
+%% \item Design of new intermediate language \lang{}, featuring iso-recursive
+%%   regular types
+%% \item Algorithm for embedding \lang{} into to System~T, implementable within
+%%   \lang{}
+%% \item Case study implementation of a fuelled self-reducer for System~T,
+%%   demonstrating improvements in several cognitive dimensions~\cite{cog-dim} for
+%%   \lang{} compared to working in System~T directly
+%% \item Compiler implemented in Idris 2 with intrinsically well-scoped syntax and
+%%   runtime-erased typechecking proofs.
+%% \end{itemize}
+
+\end{document}
diff --git a/sec/lang.ltx b/sec/lang.ltx
new file mode 100644
index 0000000..c7851e2
--- /dev/null
+++ b/sec/lang.ltx
@@ -0,0 +1,226 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{Core Language}%
+\label{sec:lang}
+
+In this section, we describe \lang{} a new calculus with higher-order functions
+and regular types~\cite{Squid:unpublished/McBride01}. We start by describing the
+types within the language: products, sums, recursive types and functions. We
+then describe the syntax which is standard for n-ary products and sums. We also
+define a \mapkw{} operation over arbitrary (covariant) functors.\@ \mapkw{} is
+necessary to define the equational theory of \foldkw{}, the eliminator for
+recursive types. The equational theory has \(\beta\) equalities, but no \(\eta\)
+equalities, a consequence of the embedding into System~T.
+
+Types for \lang{}, in \cref{fig:lang-wf}, are given relative to a context of
+type variables. The judgement \(\jdgmnt{ty}{\Theta}{A}\) states that \(A\) is a
+well-formed type in with variables from context \(\Theta\). Products and sums
+are well-formed when each of the n-ary components are well-formed. The recursive
+type \(\mu X. A\) is formed from a type \(A\) in a context extended by \(X\).
+The other type formation rules ensure that \(X\) is used strictly positively,
+and thus that recursive types are well-formed. In particular the rule forming
+function types forbids using any type variables on the left of the arrow. This
+forbids the use of type variables in negative positions. The function type
+formation rule also forbids variables on the right.  This is to forbid types
+such as \(\mu X. 1 + \nat \to X\) of countable trees, as such a type cannot be
+represented by System~T~\cite{proof}. Well-formed types also respect
+substitution:
+\begin{proposition}[Type Substitution]
+Given \(\jdgmnt{ty}{\Theta}{A}\) and \(\jdgmnt{ty}{\Psi}{B_i}\) where \(i\)
+ranges from \(0\) to \(\lvert \Theta \rvert\), we have
+\(\jdgmnt{ty}{\Psi}{\FIXME{\sub{A}{X_i/B_i}}}\).
+\end{proposition}
+
+\begin{figure}
+\[
+\begin{array}{ccccc}
+  \begin{prooftree}
+    \hypo{
+      X \in \Theta
+      \vphantom{\jdgmnt{ty}{\Theta}{A}} %% Spooky formatting phantom
+    }
+    \infer1{\jdgmnt{ty}{\Theta}{X}}
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{\rangeover{\jdgmnt{ty}{\Theta}{A_i}}{i<n}}
+    \infer1{\jdgmnt{ty}{\Theta}{\prod_i^n A_i}}
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{\rangeover{\jdgmnt{ty}{\Theta}{A_i}}{i<n}}
+    \infer1{\jdgmnt{ty}{\Theta}{\sum_i^n A_i}}
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{
+      \jdgmnt{ty}{\Theta, X}{A}
+      \vphantom{\rangeover{\jdgmnt{ty}{\Theta}{A}}{i<n}} %% Spooky formatting phantom
+    }
+    \infer1{
+      \jdgmnt{ty}{\Theta}{\mu X.A}
+      \vphantom{\jdgmnt{ty}{\Theta}{\sum_i^n A_i}} %% Spooky formatting phantom
+    }
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{\jdgmnt{ty}{}{A}}
+    \hypo{\jdgmnt{ty}{}{B}}
+    \infer2{\jdgmnt{ty}{\Theta}{A \to B}}
+  \end{prooftree}
+\end{array}
+\]
+\caption{Well-formedness of \lang{} types}%
+\label{fig:lang-wf}
+\end{figure}
+
+We present terms in \cref{fig:lang-type}. All type variables must be bound
+within recursive types, so there is no type context. The typing rules for
+variables, functions, products and sums are standard. Values of a recursive type
+\(\mu X. A\) are constructed using \emph{rolling}. A value of type
+\(\sub{A}{X/\mu X. A}\) is rolled into one of type \(\mu X. A\). Recursive types
+are eliminated by \emph{folding}. To fold a target of type \(\mu X. A\) into
+type \(B\), it is necessary to describe how to transform \(\sub{A}{X/B}\) into
+\(B\). By only allowing eliminations of this form, we ensure that all recursion
+is well-founded.
+
+\begin{figure}
+\[
+\begin{array}{cc}
+  \begin{prooftree}
+    \hypo{x : A \in \Gamma}
+    \infer1{\judgement{\Gamma}{x}{A}}
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{\judgement{\Gamma}{x}{A}}
+    \infer1{\judgement{\Gamma}{(x : A)}{A}}
+  \end{prooftree}
+  \\\\
+  \begin{prooftree}
+    \hypo{\judgement{\Gamma, x : A}{t}{B}}
+    \infer1{\judgement{\Gamma}{\lambda x.~t}{A \to B}}
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{\judgement{\Gamma}{f}{A \to B}}
+    \hypo{\judgement{\Gamma}{t}{A}}
+    \infer2{\judgement{\Gamma}{f~t}{B}}
+  \end{prooftree}
+  \\\\
+  \begin{prooftree}
+    \hypo{\rangeover{\judgement{\Gamma}{t_i}{A_i}}{i<n}}
+    \infer1{\judgement{\Gamma}{\tuple*{\rangeover{t_i}{i<n}}}{\prod_i^n A_i}}
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{\judgement{\Gamma}{e}{\prod_i^n A_i}}
+    \infer1{\judgement{\Gamma}{e.k}{A_k}}
+  \end{prooftree}
+  \\\\
+  \begin{prooftree}
+    \hypo{\judgement{\Gamma}{t}{A_k}}
+    \infer1{\judgement{\Gamma}{\tuple{k, t}}{\sum_i^n A_i}}
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{\judgement{\Gamma}{e}{\sum_i^n A_i}}
+    \hypo{\rangeover{\judgement{\Gamma, x_i : A_i}{t_i}{B}}{i<n}}
+    \infer2{\judgement{\Gamma}{\casetm{e}{x_i}{t_i}{i}{n}}{B}}
+  \end{prooftree}
+  \\\\
+  \begin{prooftree}
+    \hypo{\judgement{\Gamma}{t}{\sub{A}{X/\mu X.A}}}
+    \infer1{\judgement{\Gamma}{\mathsf{roll}~t}{A}}
+  \end{prooftree}
+  &
+  \begin{prooftree}
+    \hypo{\judgement{\Gamma}{e}{\mu X.A}}
+    \hypo{\judgement{\Gamma, x : \sub{A}{X/B}}{t}{B}}
+    \infer2{\judgement{\Gamma}{\foldtm{e}{x}{t}}{B}}
+  \end{prooftree}
+\end{array}
+\]
+\caption{Typing judgements for \lang{}}%
+\label{fig:lang-type}
+\end{figure}
+
+Using these typing rules, we can derive a map operation for any \emph{functor},
+a type with a single free type variable:
+\[
+\begin{prooftree}
+  \hypo{\jdgmnt{ty}{X}{F}}
+  \hypo{\jdgmnt{ty}{}{B}}
+  \hypo{\jdgmnt{ty}{}{C}}
+  \infer3{
+    \judgement
+      {\Gamma}
+      {\maptm{X}{F}^{A, B}}
+      {(A \to B) \to \sub{F}{X/A} \to \sub{F}{X/B}}}
+\end{prooftree}
+\]
+The definition of \mapkw, \cref{fig:lang-map}, proceeds by \TODO{induction}. We
+will elide the superscripts. The hardest case to understand is recursive types
+\(\mu Y. A\). Given a value of type \(\mu Y. \sub{G}{X/B}\), we need to
+construct a value of type \(\mu Y. \sub{G}{X/C}\). By folding over the given
+value, we need to map \(\sub{G}{X/B, Y/\mu Y. \sub{G}{X/C}}\) into \(\mu Y.
+\sub{G}{X/C}\). We can construct such a value by rolling, thus we need to
+produce a value of type \(\sub{G}{X/C, Y/\mu Y.\sub{G}{X/C}}\). Note that our
+given value from the fold and the result we want to roll both agree on the type
+of \(Y\) within \(G\), and only disagree on the type assigned to \(X\). Thus we
+use map recursively.
+
+\begin{figure}
+\begin{align*}
+  \maptm{X}{X}^{A,B}~f~t &= f~t \\
+  \maptm{X}{Y}^{A,B}~f~t &= t \qquad\qquad (Y \neq X) \\
+  \maptm{X}{A \to B}^{C,D}~f~t &= t \\
+  \maptm{X}{\prod_i^n A_i}^{B,C}~f~t &=
+    \tuple*{\rangeover{\maptm{X}{A_i}^{B,C}~f~t.i}{i<n}} \\
+  \maptm{X}{\sum_i^n A_i}^{B,C}~f~t &=
+    \casetm{t}{x_i}{\tuple{i, \maptm{X}{A_i}^{B,C}~f~x_i}}{i}{n} \\
+  \maptm{X}{\mu Y.A}^{B,C}~f~t &=
+    \foldtm{t}{x}{\mathsf{roll}~(\maptm{X}{\sub{A}{Y/\mu Y.\sub{A}{X/C}}}^{B,C}~f~x)}
+\end{align*}
+\caption{Definition of \(\mathsf{map}\)}%
+\label{fig:lang-map}
+\end{figure}
+
+\Cref{fig:lang-eq} shows the equational theory for \lang{}. We have the standard
+\(\beta\) equalities for functions, products and sums. We do not have \(\eta\)
+equalities for products or sums as these are no satisfied by the System~T
+embedding. We also don't have \(\eta\) equalities for functions as our
+equational theory for System~T does not have them either. The \(\beta\) law for
+recursive types depends on the map operation.
+
+\begin{figure}
+\begin{align*}
+  (t : A) &= t \\
+  (\lambda x.~t)~u &= \sub{t}{x/u} \\
+  \tuple*{\rangeover{t_i}{i<n}}.k &= t_k \\
+  \casetm{\tuple{k, t}}{x_i}{u_i}{i}{n} &= \sub{u_k}{x_k/t} \\
+  \foldtm{\mathsf{roll}~t}{x}{u} &=
+    \sub{u}{x/\maptm{X}{A}~(\lambda y.~\foldtm{y}{x}{u})~t}
+\end{align*}
+\caption{Equational theory for \lang{}}%
+\label{fig:lang-eq}
+\end{figure}
+
+\begin{example}
+  We can use folding and \(\mapkw\) do derive an \emph{unrolling} operation
+  \begin{gather*}
+    \mathsf{unroll}_{X. A}~t \coloneq \foldtm{t}{x}{\maptm{X}{A}~\mathsf{roll}~x} \\
+  \shortintertext{with typing rule}
+    \begin{prooftree}
+      \hypo{\judgement{\Gamma}{t}{\mu X. A}}
+      \infer1{\judgement{\Gamma}{\mathsf{unroll}~t}{\sub{A}{X/\mu X. A}}}
+    \end{prooftree} \\
+  \shortintertext{From the equational theory, we find}
+    \mathsf{unroll}~(\mathsf{roll}~t) =
+      \maptm{X}{A}~\mathsf{roll}~(\maptm{X}{A}~\mathsf{unroll}~t)
+  \end{gather*}
+  which in general doesn't simplify further.
+\end{example}
+
+\end{document}
diff --git a/sec/lexer.py b/sec/lexer.py
new file mode 120000
index 0000000..aef549f
--- /dev/null
+++ b/sec/lexer.py
@@ -0,0 +1 @@
+../lexer.py
+\ No newline at end of file
diff --git a/sec/reducer.ltx b/sec/reducer.ltx
new file mode 100644
index 0000000..8cd19f0
--- /dev/null
+++ b/sec/reducer.ltx
@@ -0,0 +1,17 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{Fuelled Self-Reducer for System T}%
+\label{sec:reducer}
+
+\TODO{
+  \begin{itemize}
+    \item fuelled self-reducer
+    \item high-level strategy
+    \item example: case distinction
+    \item qualitative comparison
+    \item quantitative comparison (dLoC)
+  \end{itemize}
+}
+
+\end{document}
diff --git a/sec/related.ltx b/sec/related.ltx
new file mode 100644
index 0000000..caeed41
--- /dev/null
+++ b/sec/related.ltx
@@ -0,0 +1,12 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{Related Work}%
+\label{sec:related}
+
+\TODO{
+  \begin{itemize}
+    \item Hagino 1987
+  \end{itemize}
+}
+\end{document}
diff --git a/sec/systemt.ltx b/sec/systemt.ltx
new file mode 100644
index 0000000..12b1ed2
--- /dev/null
+++ b/sec/systemt.ltx
@@ -0,0 +1,130 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{System T}%
+\label{sec:systemt}
+
+G\"odel's System~T extends the simply-typed lambda calculus by adding natural
+numbers and the primitiver recursor. The type formers are \(\nat\) for natural
+numbers and arrows \(A \to B\) for functions.
+
+\Cref{fig:systemt-type} shows the typing rules for System~T terms. On top of the
+usual abstraction and application rules, we can construct unary natural numbers
+using the operators \(\zero\) and \(\suc\). Natural numbers are eliminated using
+the primitive recursor \primreckw. Given a base case, update function and target
+\(n\), the primitive recursor applies the update function to the target \(n\)
+times. We will abuse notation and treat the \(\suc\) and \primreckw operators as
+functions.
+
+\begin{figure}
+  \[
+  \begin{array}{ccc}
+    \begin{prooftree}
+      \hypo{
+        x : A \in \Gamma
+        \vphantom{\judgement[T]{}{}{}} %% Spooky formatting phantom
+      }
+      \infer1{\judgement[T]{\Gamma}{x}{A}}
+    \end{prooftree}
+    &
+    \begin{prooftree}
+      \hypo{\judgement[T]{\Gamma, x \colon A}{t}{B}}
+      \infer1{\judgement[T]{\Gamma}{\lambda x.~t}{A \to B}}
+    \end{prooftree}
+    &
+    \begin{prooftree}
+      \hypo{\judgement[T]{\Gamma}{f}{A \to B}}
+      \hypo{\judgement[T]{\Gamma}{t}{A}}
+      \infer2{\judgement[T]{\Gamma}{f~t}{B}}
+    \end{prooftree}
+    \\
+    \\
+    \begin{prooftree}
+      \hypo{\vphantom{\judgement[T]{}{}{}}} %% Spooky formatting phantom
+      \infer1{\judgement[T]{\Gamma}{\zero}{\nat}}
+    \end{prooftree}
+    &
+    \begin{prooftree}
+      \hypo{\judgement[T]{\Gamma}{t}{\nat}}
+      \infer1{\judgement[T]{\Gamma}{\suc~t}{\nat}}
+    \end{prooftree}
+    &
+    \begin{prooftree}
+      \hypo{\judgement[T]{\Gamma}{z}{A}}
+      \hypo{\judgement[T]{\Gamma}{s}{A \to A}}
+      \hypo{\judgement[T]{\Gamma}{t}{\nat}}
+      \infer3{\judgement[T]{\Gamma}{\primrec{z}{s}{t}}{A}}
+    \end{prooftree}
+  \end{array}
+  \]
+  \caption{Typing judgements for System~T}%
+  \label{fig:systemt-type}
+\end{figure}
+
+The equational theory is defined in \cref{fig:systemt-eq}. We have the standard
+\(\beta\)-reduction for functions. When we apply the primitve recursor to target
+zero, we get the base case. When the recursor is applied to target \(\suc~t\),
+we apply the update function to the result of applying \primreckw{} recursively.
+An alternative theory would instead apply the update function directly to the
+base case instead of to the overall result of the recursion.  Whilst this has
+little impact for System~T (and operationally corresponds to tail recursion) the
+alternative rule does not generalise to the more complex recursive types found
+in \lang{}.
+
+\begin{figure}
+  \begin{align*}
+    (\lambda x.~t)~u &= \sub{t}{x/u} &
+    \primrec{z}{s}{\zero} &= z &
+    \primrec{z}{s}{(\suc~t)} &= s~(\primrec{z}{x}{s}{t})
+  \end{align*}
+  \caption{Equational theory for System~T. \TODO{add eta for unions}}%
+  \label{fig:systemt-eq}
+\end{figure}
+
+The primitive recursor is named as such because it generalises the first-order
+primitive recursive functions. For example, we can define addition as
+\[
+k + n = \primrec{n}{\suc}{k}.
+\]
+Despite its name, the primitive recursor can compute more than the primitive
+recursive functions. By exploiting the higher-order nature of System~T, one can
+define the Ackermann-P\'eter function~\cite{semantic-domain} as
+\[
+\lambda k, m.~\primrec{\suc}{(\lambda f.~\primrec{(\lambda n.~n)}{(\lambda g, n.~f~(g~n))})}{k}~m.
+\]
+
+Any countable set can be encoded by natural numbers. For inductively defined
+sets, such as lists or abstract syntax trees for programs, one can use a G\"odel
+encoding to represent values as natural numbers. This depends on a pairing
+function \(\pairingkw : \nat \to \nat \to \nat\) that is a (curried)
+bijection \(\nat \times \nat \to
+\nat\)~\cites{DiscoverEd:book/oc/Cantor52}{szudzik}. As some examples, we can
+encode lists of natural numbers as
+\begin{align*}
+  \encode{\epsilon} &\coloneq \pairing{0}{0} &
+  \encode{n, l} &\coloneq \pairing{1}{(\pairing{n}{\encode{l}})} \\
+  \shortintertext{and System~T terms (using deBruijn indices) as}
+  \encode{x_i}               &\coloneq \pairing{0}{i} &
+  \encode{\zero}             &\coloneq \pairing{3}{0} \\
+  \encode{\lambda~t}         &\coloneq \pairing{1}{\encode{t}} &
+  \encode{\suc~t}            &\coloneq \pairing{4}{\encode{t}} \\
+  \encode{t~u}               &\coloneq \pairing{2}{(\pairing{\encode{t}}{\encode{u}})} &
+  \encode{\primrec{z}{s}{t}} &\coloneq
+    \pairing{5}{(\pairing{\encode{z}}{(\pairing{\encode{s}}{\encode{t}})})}.
+\end{align*}
+In general, each constructor has a unique tag and this is paired with an
+encoding for each subterm. We can analyse an encoded term by reverting the
+pairing function and inspecting the tag. For an invalid tag (for our encoding of
+lists this is anything greater than 2) we can return an arbitrary value.
+Performing recursion on G\"odel encodings is difficult, particularly for values
+with more than one recursive argument.
+
+An alternative encoding strategy uses functions to encode data. One example we
+have already given is our encoding of binary trees, from the introduction.
+Another functional encoding is to represent lists as functions \(\nat \to
+\nat\). Giving the function \zero{} will return the length of the list, and the
+values are 1-indexed. This strategy doesn't require a pairing function, but
+values no longer have a unique representation. For example, the functions
+\(\lambda n.~n + n\) and \(\lambda n. \zero\) both represent the empty list.
+
+\end{document}