diff options
author | Chloe Brown <chloe.brown.00@outlook.com> | 2025-03-25 16:52:43 +0000 |
---|---|---|
committer | Chloe Brown <chloe.brown.00@outlook.com> | 2025-03-25 16:52:43 +0000 |
commit | a2afd4b08dc2b7eada2f95ee95457457a3331344 (patch) | |
tree | 671b8530d7e8934efad4d91f7575ae01833c6bfe /sec/embedding.ltx |
Before the big rewrite
Diffstat (limited to 'sec/embedding.ltx')
-rw-r--r-- | sec/embedding.ltx | 65 |
1 files changed, 65 insertions, 0 deletions
diff --git a/sec/embedding.ltx b/sec/embedding.ltx new file mode 100644 index 0000000..722eff1 --- /dev/null +++ b/sec/embedding.ltx @@ -0,0 +1,65 @@ +\documentclass[../main.tex]{subfiles} + +\begin{document} +\section{Embedding in System T}% +\label{sec:embedding} + +A major motivation for \lang{} is the ability to write terms with regular types +and embed them in System~T. We explain how the embedding works, which encodes +products, sums and recursive types as System~T types. + +Every pair of System~T types has a union (in the sense of C). The union has +injections into it and projections out of it. The specification of the union +type, \(A \sqcup B\) for types is: +\[ + \begin{array}{cc} + \mathsf{inl} : A \to A \sqcup B & + \mathsf{inr} : B \to A \sqcup B \\ + \mathsf{prl} : A \sqcup B \to A & + \mathsf{prr} : A \sqcup B \to B \\ + \mathsf{prl}~(\mathsf{inl}~t) = t & + \mathsf{prr}~(\mathsf{inr}~t) = t + \end{array} +\] +\Cref{fig:union-impl} shows \posscite{Squid:online/Kiselyov22} implementation of +unions. We can extend this to n-ary unions \(\bigsqcup_i^n A_i\) in the usual +way. We pick \(\nat\) to represent the empty union arbitrarily. + +With unions, it becomes easy to encode products. +\textcite{Squid:online/Kiselyov22} uses the functional encoding \(\prod_i^n A_i +\coloneq \nat \to \bigsqcup_i^n A_i\), representing the product as a +heterogeneous vector. The \(k\)th component of the product is accessed by +applying the product to \(k\). Tupling does a case analysis of the argument, +returning the corresponding value. + +\textcite{Squid:online/Kiselyov22} also describes how to encode sums using +products and unions. A sum is encoded as a tagged union \(\sum_i^n A_i \coloneq +\nat \times \bigsqcup_i^n A_i\), where the tag indicates which case of the sum +is being stored. Case analysis first inspects the tag to decide which branch to +take, downcasts the value to the correct type, and then performs the branch +action. Note this encoding is not consistent: the empty type is necessarily +inhabited, as all System~T types are. This inconsistency is necessary for the +encoding of recursive types. + +Recursive types are the most difficult to encode. +\textcite{DBLP:books/sp/LongleyN15} describe one strategy: represent values as a +heap. We have the encoding \(\mu X. A \coloneq \nat \times \nat \times (\nat \to +\sub{A}{X/\nat})\). The first component is the ``depth'' of the heap: an upper +bound on the number of pointers to follow on any arbitrary path. The second +component represents the root index of the heap---the address where the data +structure starts. The final component is the heap itself. Each location stores +data associated with its value. Any recursion is replaced with a pointer to a +different location in the heap. + +\TODO{example} + +Folding is easier to compute than rolling. The precise encoding is given below. +We compute the fold across the heap in parallel, iteratively increasing the +depth of paths we can safely use. The initial result is arbitrary: we don't have +any data to operate on. + +\begin{multline*} +\TODO{fold} +\end{multline*} + +\end{document} |