summaryrefslogtreecommitdiff
path: root/sec/intro.ltx
diff options
context:
space:
mode:
authorChloe Brown <chloe.brown.00@outlook.com>2025-03-25 16:52:43 +0000
committerChloe Brown <chloe.brown.00@outlook.com>2025-03-25 16:52:43 +0000
commita2afd4b08dc2b7eada2f95ee95457457a3331344 (patch)
tree671b8530d7e8934efad4d91f7575ae01833c6bfe /sec/intro.ltx
Before the big rewrite
Diffstat (limited to 'sec/intro.ltx')
-rw-r--r--sec/intro.ltx103
1 files changed, 103 insertions, 0 deletions
diff --git a/sec/intro.ltx b/sec/intro.ltx
new file mode 100644
index 0000000..00a6d5b
--- /dev/null
+++ b/sec/intro.ltx
@@ -0,0 +1,103 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{Introduction}%
+\label{sec:intro}
+
+System~T is an extension of the simply-typed lambda calculus adding natural
+numbers and primitive recursion. This minimality makes it an excellent calculus
+to model, and yet the calculus is expressive enough to perform a wide array of
+useful calculations. This includes most arithmetic, \TODO{middle thing} and even
+rewriting of lambda terms.
+
+The restricted type system of System~T makes expressing complex data difficult.
+Often structured data has to be encoded as naturals or first-order functions.
+Encoding data within a natural number often makes structural recursion
+exceedingly complex. Whilst using first-order functions can make recursion
+easier, it is often the case that passing different arguments can return
+different shapes of data.
+
+Consider the following System~T pseudocode:
+\begin{listing}[H]
+ \begin{systemt}
+ let sum (tree : nat -> nat -> nat -> nat -> nat) : nat =
+ let depth = tree 0 0 0 0 in
+ let root = tree 1 0 0 0 in
+ let heap = tree 2 in
+ let go : nat -> nat = primrec depth with
+ Z => fun i => i
+ | S(acc) => fun i =>
+ let tag = heap i 0 0 in
+ let v = heap i 1 in
+ if tag == 0 then
+ v 0
+ else
+ acc (v 0) + acc (v 1)
+ \end{systemt}
+ \vspace{-\baselineskip}
+\end{listing}
+From the name and type, one can deduce this performs a sum over some tree
+structure. With some more thought, you may be able to work out that
+\systemtinline{tree} is a binary tree. This information is not at all available
+in its type. It is also unclear why \systemtinline{tree 0 0 0 0} should return
+the depth of the tree and \systemtinline{tree 1 0 0 0} the root.
+
+In other functional programming languages, the same sum function would look
+something like the following OCaml code:
+\begin{listing}[H]
+ \begin{minted}{ocaml}
+ let sum tree = match tree with
+ Leaf(v) -> v
+ | Branch(left, right) -> sum left + sum right
+ \end{minted}
+ \vspace{-\baselineskip}
+\end{listing}
+In this example it is immediately obvious what the function does. The code is
+first order reducing its complexity and the richer type system makes it easier
+to identify the role of different variables. These additional features of OCaml
+and other such languages improve the closeness of mapping and
+role-expressiveness of the code whilst reducing the error-proneness (\TODO{check
+ this}). This is great for human programmers but this comes at a modelling
+cost. Unlike System~T, most functional languages allow non-termination. For
+example, OCaml would have no issue with us writing
+\mintinline{ocaml}{sum tree + sum right} by mistake.
+
+In this work we present \lang{}, a new intermediate language that compiles into
+System~T. \lang{} extends the simply-typed lambda calculus with the regular
+types~\cite{Squid:unpublished/McBride01}: finite products, finite sums and some
+inductive types.\@ \lang{} has most of the expressive types we are used to from
+functional languages---we can write code similar to the second example---whilst
+still being simple enough to represent within System~T.
+
+The type encodings used to compile \lang{} into System~T are
+well-known~\cites{Squid:online/Kiselyov22}{DBLP:books/sp/LongleyN15}. Encoding
+products and sums relies on C-style union types, which exist due to System~T
+having only naturals and function types. Recursive types are represented using a
+heap with an explicit depth, informally described by
+\textcite{DBLP:books/sp/LongleyN15}. We have written this compilation algorithm
+within \lang{} itself.
+
+To aid in the development of this program, we also constructed a type checker
+for \lang{} in Idris~2~\cite{idris}. The typechecker uses intrisically
+well-scoped syntax. The syntax is represented by an indexed data type
+\texttt{Term}, where the index describes the variables that are in scope. By
+using quantities~\cite{quants} and auto-implicit search, the compiler code is
+written in a named style, yet compiles down to using deBruijn indices. Using
+names helps to reduce errors in the code, and the compilation to deBruijn
+indices means we don't pay any runtime cost.
+
+%% \subsubsection*{Contributions}%
+%% \label{sec:contributions}
+%% \begin{itemize}
+%% \item Design of new intermediate language \lang{}, featuring iso-recursive
+%% regular types
+%% \item Algorithm for embedding \lang{} into to System~T, implementable within
+%% \lang{}
+%% \item Case study implementation of a fuelled self-reducer for System~T,
+%% demonstrating improvements in several cognitive dimensions~\cite{cog-dim} for
+%% \lang{} compared to working in System~T directly
+%% \item Compiler implemented in Idris 2 with intrinsically well-scoped syntax and
+%% runtime-erased typechecking proofs.
+%% \end{itemize}
+
+\end{document}