thesis.org


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138

#+options: ':t *:t -:t ::t <:t H:3 \n:nil ^:t arch:headline author:t
#+options: broken-links:nil c:nil creator:nil d:(not "LOGBOOK") date:t e:t
#+options: email:nil f:t inline:t num:t p:nil pri:nil prop:nil stat:t tags:t
#+options: tasks:t tex:t timestamp:t title:t toc:nil todo:t |:t
#+title: Semantics of an embedded vector architecture for formal verification of software
#+date: \today
#+author: Greg Brown
#+email: greg.brown@cl.cam.ac.uk
#+language: en-GB
#+select_tags: export
#+exclude_tags: noexport
#+creator: Emacs 27.2 (Org mode 9.6)
#+cite_export: biblatex
#+bibliography: ./thesis.bib
#+latex_class: article
#+latex_class_options: [twoside,a4paper]
#+latex_header: \usepackage[hyperref=true,url=true,backend=biber,natbib=true]{biblatex}
#+latex_header: \usepackage{savetrees}
#+latex_compiler: pdflatex

#+begin_abstract
All good implementations of any algorithm should be correct and fast. To
maximise performance some algorithms are written in hand-tuned assembly. This
can introduce subtle bugs that invalidate correctness or other safety
properties. Whilst tools exists to help formally verify these algorithms, none
are designed to target the recent M-profile Vector Extension for the Armv8.1-M
architecture. This work describes a denotational and Hoare logic semantics for
the language used to specify the instructions, and attempts to use them to
formally verify the correctness of hand-written assembly for cryptographic
applications.
#+end_abstract

* Introduction

In nearly all cases, the best implementation of an algorithm for a particular
purpose is both correct and fast. If the implementation is not correct, then it
does not implement the algorithm. If the implementation is slow, then resources are
wasted executing this algorithm over performing other useful work. Ensuring
implementations are both performant and correct is typically difficult.

One case that proves particularly tricky is writing assembly code. Most modern
handwritten assembly is for cryptographic algorithms, where it is vital that
compilers do not introduce timing attacks or other side channels. Hand written
assembly can also take advantage of microarchitectural differences to gain a
slight performance benefit.

Another domain that frequently deals with assembly code is the output of
compilers. Optimising compilers take the semantic operations described by
high-level code and attempt to produce the optimal machine code to perform those
actions.

In both of these cases, either humans or machines shuffle around instructions in
an attempt to eek out the best performance from hardware. The resulting code
warps the original algorithms in ways that make it difficult to determine
whether the actions performed actually have the same result. The correctness of
the assembly code comes into question.

It is only possible to prove the assembly code is correct if there is a model
describing how it should behave. Suitable mathematical models of the behaviour
allow for the construction of formal proofs certifying the assembly code has the
correct behaviour. Due to the size and complexity of different instruction set
architectures (ISAs), formal verification of software in this manner makes use
of a number of software tools.

Formal verification software is well-developed for the x86-64 and Arm A-profile
ISAs. For instance, Jasmin [cite:@10.1145/3133956.3134078] is a programming
language which verifies that compiling to x86-64 preserves a set of user-defined
safety properties. Another similar tool is CompCert [cite:@hal/01238879] which
is a verified C compiler, ensuring that the program semantics are preserved
during compilation.

Unfortunately, little work has been done to formally verify software written for
the Arm M-profile architecture. This architecture is designed for low-power and
low-latency microcontrollers, which operate in resource-constrained
environments.

The goal of this project is to formally verify an assembly function for the Arm
M-profile architecture. The specific algorithm
[cite:@10.46586/tches.v2022.i1.482-505] is a hand-written highly-optimised
assembly-code implementation of the number-theoretic transform, which is an
important procedure for post-quantum cryptography.

This work has made progress on two fronts. Most work has been spent developing
an embedding and semantic model of the Armv8.1-M pseudocode specification
language for the Agda programming language [cite:@10.1007/978-3-642-03359-9_6].
All instructions in the Armv8.1-M architecture are given a "precise description"
[cite:@arm/DDI0553B.s § \(\texttt{I}_\texttt{RTDZ}\)] using the pseudocode, and
by developing an embedding within Agda, it is possible to use its powerful type
system to construct formal proofs of correctness.

Progress has also been made on the formal algebraic verification of Barrett
reduction, a subroutine of the NTT.  Whilst there exists a paper proof of the
correctness of Barrett reduction, a proof of Barrett reduction within Agda is
necessary to be able to prove the correctness of the given implementation of the
NTT.

Structure of the paper:
- [[Armv8.1-M Pseudocode Specification Language]] describes the Armv8.1-M pseudocode
  specification language. This is an imperative programming language used by the
  Armv8.1-M manual to describe the operation of the instructions.
- A simple recap of Hoare logic is given in [[Hoare Logic]]. This is the backbone of
  one of the formal verification techniques used in this work.
- [[Formal Definition of MSL]] contains a description of MSL. This is a language
  similar to the Armv8.1-M pseudocode embedded within the Agda programming
  language.
- The denotational semantics and a Hoare logic semantics of MSL are detailed in
  [[Semantics of MSL]]. Due to Agda's nature of being a dependently-typed language,
  these are both given using Agda code.
- [[Application to Proofs]] describes the experience of using the Hoare logic and
  denotational semantics of MSL to prove the correctness of some simple
  routines, given in [[Proofs using Hoare Semantics]] and [[Proofs using Denotational
  Semantics]] respectively.
- Formal verification of Barrett reduction, an important subroutine in the NTT,
  is given in [[Proofs of Barrett Reduction]]. In particular, it proves that Barrett
  reduction performs a modulo reduction and gives bounds on the size of the
  result.
- Finally, [[Future Work]] describes the steps necessary to complete the formal
  verification of the NTT, as well as listing some other directions this work
  can be taken.

* Background
** Armv8.1-M Pseudocode Specification Language
** Hoare Logic

* Implementation
** Formal Definition of MSL
** Semantics of MSL

* Application to Proofs
** General Observations
** Proofs using Hoare Semantics
** Proofs using Denotational Semantics
** Proofs of Barrett Reduction

* Conclusions
** Future Work

#+print_bibliography: