#+options: ':t *:t -:t ::t <:t H:3 \n:nil ^:t arch:headline author:t
#+options: broken-links:nil c:nil creator:nil d:(not "LOGBOOK") date:t e:t
#+options: email:nil f:t inline:t num:t p:nil pri:nil prop:nil stat:t tags:t
#+options: tasks:t tex:t timestamp:t title:t toc:nil todo:t |:t
#+title: Semantics of an embedded vector architecture for formal verification of software
#+date: \today
#+author: Greg Brown
#+email: greg.brown@cl.cam.ac.uk
#+language: en-GB
#+select_tags: export
#+exclude_tags: noexport
#+creator: Emacs 27.2 (Org mode 9.6)
#+cite_export: biblatex
#+bibliography: ./thesis.bib
#+latex_class: article
#+latex_class_options: [twoside,a4paper]
#+latex_header: \usepackage[hyperref=true,url=true,backend=biber,natbib=true]{biblatex}
#+latex_header: \usepackage[autostyle,english=british]{csquotes}
#+latex_header: \usepackage[moderate]{savetrees}
#+latex_header: \usepackage[a4paper]{geometry}
#+latex_compiler: pdflatex

#+begin_abstract
A good implementation of an algorithm should be both correct and fast. To maximise performance, some algorithms are written in hand-tuned assembly. This can introduce subtle bugs that invalidate correctness or other safety properties. Whilst tools exist to help formally verify such algorithms, none are designed to target the recent M-profile Vector Extension for the Armv8.1-M architecture. This work describes a denotational and a Hoare logic semantics for the language used to specify the instructions, and attempts to use them to formally verify the correctness of hand-written assembly for cryptographic applications.
#+end_abstract

# Flip this all around. What am I doing? What is the stuff? Why is it hard? (Why
# am I smart?)

* Introduction

# Merge these two paras
In nearly all cases, the best implementation of an algorithm for a particular purpose is both correct and fast. If the implementation is not correct, then it does not implement the algorithm. If the implementation is slow, then resources are wasted executing it that could be spent on other useful work. Ensuring that implementations are both performant and correct is typically difficult, and one case that proves particularly tricky is writing assembly code.

# dubious claim
Much modern hand-written assembly is for cryptographic algorithms, where it is vital that a compiler does not introduce timing attacks or other side channels. Hand-written assembly can also take advantage of microarchitectural differences to gain a slight performance benefit. Another domain that frequently deals with assembly code is the output of compilers. Optimising compilers take the semantic operations described by high-level code and attempt to produce optimal machine code to perform those actions.

In both of these cases, either humans or machines shuffle instructions around in an attempt to eke out the best performance from the hardware. The resulting code warps the original algorithm in ways that make it difficult to determine whether it still computes the same result. The correctness of the assembly code therefore comes into question.

It is only possible to prove that assembly code is correct if there is a model describing how it should behave. A suitable mathematical model of this behaviour allows formal proofs to be constructed certifying that the assembly code behaves correctly. Due to the size and complexity of different instruction set architectures (ISAs), formal verification of software in this manner makes use of a number of software tools.
Formal verification software is well-developed for the x86-64 and Arm A-profile ISAs. For instance, Jasmin [cite:@10.1145/3133956.3134078] is a programming language which verifies that compilation to x86-64 preserves a set of user-defined safety properties. Another similar tool is CompCert [cite:@hal/01238879], a verified C compiler which ensures that program semantics are preserved during compilation. Unfortunately, little work has been done to formally verify software written for the Arm M-profile architecture. This architecture is designed for low-power and low-latency microcontrollers, which operate in resource-constrained environments.

The goal of this project is to formally verify an assembly function for the Arm M-profile architecture. The specific algorithm [cite:@10.46586/tches.v2022.i1.482-505] is a hand-written, highly-optimised assembly implementation of the number-theoretic transform (NTT), an important procedure in post-quantum cryptography.

This work has made progress on two fronts. Most effort has been spent developing an embedding and semantic model of the Armv8.1-M pseudocode specification language in the Agda programming language [cite:@10.1007/978-3-642-03359-9_6]. All instructions in the Armv8.1-M architecture are given a "precise description" [cite:@arm/DDI0553B.s § \(\texttt{I}_\texttt{RTDZ}\)] using the pseudocode, so an embedding within Agda makes it possible to use Agda's powerful type system to construct formal proofs of correctness. Progress has also been made on the formal algebraic verification of Barrett reduction, a subroutine of the NTT. Whilst a paper proof of the correctness of Barrett reduction exists, a proof within Agda is necessary to prove the correctness of the given implementation of the NTT.

# Focus on contributions
The structure of the paper is as follows:
- [[Armv8.1-M Pseudocode Specification Language]] describes the Armv8.1-M pseudocode specification language. This is an imperative programming language used by the Armv8.1-M manual to describe the operation of the instructions.
- A brief recap of Hoare logic is given in [[Hoare Logic]]. This is the backbone of one of the formal verification techniques used in this work.
- [[Formal Definition of MSL]] contains a description of MSL, a language similar to the Armv8.1-M pseudocode embedded within the Agda programming language.
- The denotational semantics and a Hoare logic semantics of MSL are detailed in [[Semantics of MSL]]. Because Agda is a dependently-typed language, both are given as Agda code.
- [[Application to Proofs]] describes the experience of using the Hoare logic and denotational semantics of MSL to prove the correctness of some simple routines, given in [[Proofs using Hoare Semantics]] and [[Proofs using Denotational Semantics]] respectively.
- Formal verification of Barrett reduction, an important subroutine of the NTT, is given in [[Proofs of Barrett Reduction]]. In particular, it proves that Barrett reduction performs a modulo reduction and gives bounds on the size of the result.
- Finally, [[Future Work]] describes the steps necessary to complete the formal verification of the NTT, as well as some other directions in which this work can be taken.

* Background

The Armv8.1-M pseudocode specification language is used to describe the operation of instructions in the Armv8-M instruction set [cite:@arm/DDI0553B.s § E1.1].
If the semantics of this pseudocode can be formalised, then it is possible to formulate proofs about the action of all Armv8-M instructions, and hence to construct proofs about the algorithms written with them. The language is rather simple, being pseudocode used as a descriptive aid, but it has some interesting design choices atypical of regular imperative programming languages.

As the pseudocode is an imperative language, one useful proof system for it is Hoare logic. Hoare logic is a proof system driven by the syntax of a program, with most of the hard work of a proof lying in choosing suitable program invariants and solving simple logical implications [cite:@10.1145/363235.363259]. Because the logic is syntax-driven, proofs using Hoare logic are less affected than other proof systems by the large number of loops used in Armv8-M instruction descriptions. Further, solving simple logical implications is a task well-suited to Agda and other proof assistants, making proofs even simpler to construct.

** Armv8.1-M Pseudocode Specification Language

The Armv8.1-M pseudocode specification language is a strongly-typed imperative programming language [cite:@arm/DDI0553B.s § E1.2.1]. It has a first-order type system, a small set of operators and basic control flow, as found in most imperative languages. It also makes some design choices atypical of regular imperative languages, which better suit its role as a descriptive aid rather than an executable language.

As in nearly all imperative languages, Booleans are a primitive type. Other typical type constructors, such as tuples, structs, enumerations and fixed-length arrays, are also present.

The first interesting type used by the pseudocode is the mathematical integer, provided as a primitive type. Most imperative languages use fixed-width integers for their primitive types, with exact integers available through some library, because the performance benefits of fixed-width integers far outweigh the risk of overflow. However, as the pseudocode is a descriptive aid with no intention of being executed, it can use exact mathematical integers and eliminate overflow errors without any performance cost [cite:@arm/DDI0553B.s § E1.3.4].

Another such type present in the pseudocode is the mathematical real number. As most real numbers cannot be represented in finite storage, any executable programming language must compromise on the precision of real numbers. Because the pseudocode does not concern itself with being executable, it is free to use real numbers and retain exact precision in real-number arithmetic [cite:@arm/DDI0553B.s § E1.2.4].

The final primitive type used by the pseudocode is the bitstring: a fixed-length sequence of 0s and 1s. Some readers may wonder what the difference is between this type and an array of Booleans. The justification given by [cite/t:@arm/DDI0553B.s § E1.2.2] is more philosophical than practical: "bitstrings are the only concrete data type in pseudocode". In some places, bitstrings can be used instead of integers in arithmetic operations.

Most of the operators used by the pseudocode are unsurprising. For instance, Booleans have the standard set of short-circuiting operations; integers and reals have addition, subtraction and multiplication; reals have division; and integers have integer division (division rounding towards \(-\infty\)) and modulus (the remainder of that division). By far the two most interesting operations in the pseudocode are bitstring concatenation and slicing.

Bitstring concatenation is much like appending two arrays together, or regular string concatenation. Bitstring slicing is a more nuanced process. Slicing a bitstring by a single index is no different from a regular array access. If instead a bitstring is sliced by a range of integers, the result is the concatenation of each single-bit access. Finally, when an integer is sliced instead of a bitstring, the pseudocode "treats an integer as equivalent to a sufficiently long \textelp{} bitstring" [cite:@arm/DDI0553B.s § E1.3.3].
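To make this behaviour concrete, the following is a minimal executable model of bitstring slicing and concatenation, written in Python rather than in the pseudocode or the Agda development described later. The function names are illustrative, and reading "sufficiently long bitstring" as the two's-complement expansion of the integer is an assumption made for this sketch, not a definition quoted from the manual.

#+begin_src python
# Bitstrings are modelled as strings of '0'/'1' characters, written
# most-significant bit first. Lengths are not tracked.

def bit(value: int, i: int) -> str:
    """Single-bit slice of value at index i. An integer is treated as a
    sufficiently long bitstring; here that is assumed to mean its
    two's-complement expansion, so negative values carry an unbounded
    run of leading ones."""
    return str((value >> i) & 1)

def slice_bits(value: int, hi: int, lo: int) -> str:
    """Slice by the range hi..lo: the concatenation of the single-bit
    slices at indices hi, hi-1, ..., lo."""
    return "".join(bit(value, i) for i in range(hi, lo - 1, -1))

def concat(x: str, y: str) -> str:
    """Bitstring concatenation, with y forming the least-significant end."""
    return x + y

# Slicing 0b10110010 into two nibbles and concatenating them again.
assert slice_bits(0b10110010, 7, 4) == "1011"
assert slice_bits(0b10110010, 3, 0) == "0010"
assert concat("1011", "0010") == "10110010"

# A negative integer sliced as a "sufficiently long" bitstring: -1 is
# ...1111 in two's complement, so every slice of it is all ones.
assert slice_bits(-1, 3, 0) == "1111"

# The pseudocode's integer division and modulus round towards minus
# infinity, which coincides with Python's // and % operators.
assert -7 // 2 == -4 and -7 % 2 == 1
#+end_src

The model captures only the behaviour described above; in particular, the fixed lengths that the pseudocode attaches to bitstrings are not represented.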
The final interesting difference between the pseudocode and most imperative languages is the variety of top-level items. The pseudocode has three forms of item: procedures, functions and array-like functions. Procedures and functions behave like the procedures and functions of other imperative languages. Their arguments are passed by value, and the only difference between the two is that procedures do not return values whilst functions do [cite:@arm/DDI0553B.s § E1.4.2].

Array-like functions act as getters and setters for machine state. Every array-like function has a reader form, and most have a writer form. This distinction exists because "reading from and writing to an array element require different functions" [cite:@arm/DDI0553B.s § E1.4.2], likely because some machine registers are read-only rather than read-writeable. The writer form acts as one of the targets of assignment expressions, along with variables and the results of bitstring concatenation and slicing [cite:@arm/DDI0553B.s § E1.3.5].

** Hoare Logic

Hoare logic is a proof system for programs written in imperative programming languages. At its core, the logic describes how to build partial correctness triples, which describe how program statements affect assertions about machine state. The bulk of a Hoare logic derivation depends only on the syntax of the program the proof targets.

A partial correctness triple is a relation between a precondition \(P\), a program statement \(s\) and a postcondition \(Q\). If \((P, s, Q)\) is a partial correctness triple, then whenever \(P\) holds for some machine state and \(s\) is executed from that state, \(Q\) holds for the state after it terminates [cite:@10.1145/363235.363259]. Those last three words, "after it terminates", are what make this a /partial/ correctness triple. If all statements terminate, as we will see later, then the relation is simply called a correctness triple.

Along with the syntactic rules for derivations, Hoare logic typically also features a number of adaptation rules. The most widely known of these is the rule of consequence, which can strengthen the precondition and weaken the postcondition. This requires an additional logic for assertions; typically, this is first-order or higher-order logic [cite:@10.1007/s00165-019-00501-3;@10.1007/s001650050057].

One vital feature of Hoare logic with regard to specification is auxiliary variables. These are variables that cannot be used by programs, and hence remain constant between the precondition and postcondition [cite:@10.1007/s001650050057].

* Implementation

** Formal Definition of MSL

** Semantics of MSL

* Application to Proofs

** General Observations

** Proofs using Hoare Semantics

** Proofs using Denotational Semantics

** Proofs of Barrett Reduction

* Conclusions

** Future Work

#+print_bibliography:

# LocalWords: Hoare ISAs Jasmin CompCert structs bitstring bitstrings getters