From 4a9fa7002b3606a5e58781139263981e9697de77 Mon Sep 17 00:00:00 2001 From: Greg Brown Date: Tue, 26 Apr 2022 18:54:43 +0100 Subject: Background draft 1. --- thesis.org | 115 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 114 insertions(+), 1 deletion(-) (limited to 'thesis.org') diff --git a/thesis.org b/thesis.org index adaeca1..cd11ea6 100644 --- a/thesis.org +++ b/thesis.org @@ -15,7 +15,9 @@ #+latex_class: article #+latex_class_options: [twoside,a4paper] #+latex_header: \usepackage[hyperref=true,url=true,backend=biber,natbib=true]{biblatex} -#+latex_header: \usepackage{savetrees} +#+latex_header: \usepackage[autostyle,english=british]{csquotes} +#+latex_header: \usepackage[moderate]{savetrees} +#+latex_header: \usepackage[a4paper]{geometry} #+latex_compiler: pdflatex #+begin_abstract @@ -124,8 +126,117 @@ Structure of the paper: can be taken. * Background +The Armv8.1-M pseudocode specification language is used to describe the +operation of instructions in the Armv8-M instruction set [cite:@arm/DDI0553B.s § +E1.1]. If the semantics of this pseudocode can be formalised, then it is +possible to formulate proofs about the action of all Armv8-M instructions, and +hence construct proofs about the algorithms written with them. The language is +rather simple, being pseudocode used as a descriptive aid, but has some +interesting design choices atypical of regular imperative programming languages. + +As the pseudocode is an imperative language, one useful proof system for it is +Hoare logic. Hoare logic is a proof system driven by the syntax of a program, +with most of the hard work of proofs being in the form of choosing suitable +program invariants and solving simple logical implications +[cite:@10.1145/363235.363259]. As the logic is syntax driven, proofs using Hoare +logic are less impacted than other proof systems by the large number of loops +used in Armv8-M instruction descriptions. Further, solving simple logical +implications is a task well-suited to Agda and other proof assistants, making +proofs even simpler to construct. + ** Armv8.1-M Pseudocode Specification Language +The Armv8.1-M pseudocode specification language is a strongly-typed imperative +programming language [cite:@arm/DDI0553B.s § E1.2.1]. It has a first-order type +system, a small set of operators and basic control flow, as you would find in +most imperative languages. There are some interesting design choices atypical of +regular imperative languages to better fulfil the requirements of being a +descriptive aid over an executable language. + +Something common to nearly all imperative languages is the presence of a +primitive type for Booleans. Other typical type constructors are tuples, +structs, enumerations and fixed-length arrays. The first interesting type used +by the pseudocode is mathematical integers as a primitive type. Most imperative +languages use fixed-width integers for primitive types, with exact integers +available through some library. This is because the performance benefits of +using fixed-width integers in code far outweigh the risk of overflow. However, +as the pseudocode is a descriptive aid, with no intention of being executed, it +can use exact mathematical integers and eliminate overflow errors without any +performance cost [cite:@arm/DDI0553B.s § E1.3.4]. + +Another such type present in the pseudocode is mathematical real numbers. As +most real numbers are impossible to record using finite storage, any executable +programming language must make compromises to the precision of real numbers. +Because the pseudocode does not concern itself with being executable, it is free +to use real numbers and have exact precision in real-number arithmetic +[cite:@arm/DDI0553B.s § E1.2.4]. + +The final primitive type used by the pseudocode is the bitstring; a fixed-length +sequence of 0s and 1s. Some readers may wonder what the difference is between +this type and arrays of Booleans. The justification given by +[cite/t:@arm/DDI0553B.s § E1.2.2] is more philosophical than practical: +"bitstrings are the only concrete data type in pseudocode". In some places, +bitstrings can be used instead of integers in arithmetic operations. + +Most of the operators used by the pseudocode are unsurprising. For instance, +Booleans have the standard set of short-circuiting operations; integers and +reals have addition, subtraction and multiplication; reals have division; and +integers have integer division (division rounding to \(-\infty\)) and modulus +(the remainder of division). + +By far the two most interesting operations in the pseudocode are bitstring +concatenation and slicing. Bitstring concatenation is much like appending two +arrays together, or regular string concatenation. Bitstring slicing is a more +nuanced process. Slicing a bitstring by a single index is no different from a +regular array access. If instead a bitstring is sliced by a range of integers, +the result is the concatenation of each single-bit access. Finally, when +integers are sliced instead of bitstring, the pseudocode "treats an integer as +equivalent to a sufficiently long \textelp{} bitstring" [cite:@arm/DDI0553B.s § +E1.3.3]. + +The final interesting difference between the pseudocode and most imperative +languages is the variety of top-level items. The pseudocode has three forms of +items: procedures, functions and array-like functions. Procedures and functions +behave like procedures and functions of other imperative languages. The +arguments to them are passed by value, and the only difference between the two +is that procedures do not return values whilst functions do +[cite:@arm/DDI0553B.s § E1.4.2]. + +Array-like functions act as getters and setters for machine state. Every +array-like function has a reader form, and most have a writer form. This +distinction exists because "reading from and writing to an array element require +different functions", [cite:@arm/DDI0553B.s § E1.4.2], likely due to the nature +of some machine registers being read-only instead of read-writeable. The writer +form acts as one of the targets of assignment expressions, along with variables +and the result of bitstring concatenation and slicing [cite:@arm/DDI0553B.s § +E1.3.5]. + ** Hoare Logic +Hoare logic is a proof system for programs written in imperative programming +languages. At its core, the logic describes how to build partial correctness +triples, which describe how program statements affect assertions about machine +state. The bulk of a Hoare logic derivation is dependent only on the syntax of +the program the proof targets. + +A partial correctness triple is a relation between a precondition \(P\), a +program statement \(s\) and a postcondition \(Q\). If \((P , s , Q)\) is a +partial correctness triple, then whenever \(P\) holds for some machine state, +then when executing \(s\), \(Q\) holds for the state after it terminates +[cite:@10.1145/363235.363259]. Those last three words, "after it terminates", +are what leads the relation being a /partial/ correctness triple. If all +statements terminate, which we will see later, then this relation is called a +correctness triple. + +Along with the syntactic rules for derivations, Hoare logic typically also +features a number of adaptation rules. The most-widely known of these is the +rule of consequence, which can strengthen the precondition and weaken the +postcondition. This requires an additional logic for assertions. Typically, this +is first-order or higher-order logic +[cite:@10.1007/s00165-019-00501-3;@10.1007/s001650050057]. + +One vital feature of Hoare logic with regards to specification is auxiliary +variables. These are variables that cannot be used by programs, hence remain +constant between the precondition and postcondition +[cite:@10.1007/s001650050057]. * Implementation ** Formal Definition of MSL @@ -141,3 +252,5 @@ Structure of the paper: ** Future Work #+print_bibliography: + +# LocalWords: Hoare ISAs Jasmin CompCert structs bitstring bitstrings getters -- cgit v1.2.3