Before the big rewrite

author: Chloe Brown <chloe.brown.00@outlook.com> 2025-03-25 16:52:43 +0000
committer: Chloe Brown <chloe.brown.00@outlook.com> 2025-03-25 16:52:43 +0000
commit: a2afd4b08dc2b7eada2f95ee95457457a3331344 (patch)
tree: 671b8530d7e8934efad4d91f7575ae01833c6bfe /sec/embedding.ltx
1 files changed, 65 insertions, 0 deletions
diff --git a/sec/embedding.ltx b/sec/embedding.ltx
new file mode 100644
index 0000000..722eff1
--- /dev/null
+++ b/sec/embedding.ltx
@@ -0,0 +1,65 @@
+\documentclass[../main.tex]{subfiles}
+
+\begin{document}
+\section{Embedding in System T}%
+\label{sec:embedding}
+
+A major motivation for \lang{} is the ability to write terms with regular types
+and embed them in System~T. We explain how the embedding works, which encodes
+products, sums and recursive types as System~T types.
+
+Every pair of System~T types has a union (in the sense of C). The union has
+injections into it and projections out of it. The specification of the union
+type, \(A \sqcup B\) for types is:
+\[
+  \begin{array}{cc}
+    \mathsf{inl} : A \to A \sqcup B &
+    \mathsf{inr} : B \to A \sqcup B \\
+    \mathsf{prl} : A \sqcup B \to A &
+    \mathsf{prr} : A \sqcup B \to B \\
+    \mathsf{prl}~(\mathsf{inl}~t) = t &
+    \mathsf{prr}~(\mathsf{inr}~t) = t
+  \end{array}
+\]
+\Cref{fig:union-impl} shows \posscite{Squid:online/Kiselyov22} implementation of
+unions. We can extend this to n-ary unions \(\bigsqcup_i^n A_i\) in the usual
+way. We pick \(\nat\) to represent the empty union arbitrarily.
+
+With unions, it becomes easy to encode products.
+\textcite{Squid:online/Kiselyov22} uses the functional encoding \(\prod_i^n A_i
+\coloneq \nat \to \bigsqcup_i^n A_i\), representing the product as a
+heterogeneous vector. The \(k\)th component of the product is accessed by
+applying the product to \(k\). Tupling does a case analysis of the argument,
+returning the corresponding value.
+
+\textcite{Squid:online/Kiselyov22} also describes how to encode sums using
+products and unions. A sum is encoded as a tagged union \(\sum_i^n A_i \coloneq
+\nat \times \bigsqcup_i^n A_i\), where the tag indicates which case of the sum
+is being stored. Case analysis first inspects the tag to decide which branch to
+take, downcasts the value to the correct type, and then performs the branch
+action. Note this encoding is not consistent: the empty type is necessarily
+inhabited, as all System~T types are. This inconsistency is necessary for the
+encoding of recursive types.
+
+Recursive types are the most difficult to encode.
+\textcite{DBLP:books/sp/LongleyN15} describe one strategy: represent values as a
+heap. We have the encoding \(\mu X. A \coloneq \nat \times \nat \times (\nat \to
+\sub{A}{X/\nat})\). The first component is the ``depth'' of the heap: an upper
+bound on the number of pointers to follow on any arbitrary path. The second
+component represents the root index of the heap---the address where the data
+structure starts. The final component is the heap itself. Each location stores
+data associated with its value. Any recursion is replaced with a pointer to a
+different location in the heap.
+
+\TODO{example}
+
+Folding is easier to compute than rolling. The precise encoding is given below.
+We compute the fold across the heap in parallel, iteratively increasing the
+depth of paths we can safely use. The initial result is arbitrary: we don't have
+any data to operate on.
+
+\begin{multline*}
+\TODO{fold}
+\end{multline*}
+
+\end{document}
author	Chloe Brown <chloe.brown.00@outlook.com>	2025-03-25 16:52:43 +0000
committer	Chloe Brown <chloe.brown.00@outlook.com>	2025-03-25 16:52:43 +0000
commit	a2afd4b08dc2b7eada2f95ee95457457a3331344 (patch)
tree	671b8530d7e8934efad4d91f7575ae01833c6bfe /sec/embedding.ltx