# Yaegi internals
[Yaegi] is an interpreter of the Go language written in Go. The project
was started at Traefik Labs to provide a simple and practical
embedded plugin engine for the traefik reverse proxy. Today, more than
200 plugins contributed by the community are listed in the public
catalog at [plugins.traefik.io]. The use of yaegi also extends to other
domains, for example [databases], [observability], [container security]
and many
[others](https://github.com/traefik/yaegi/network/dependents?package_id=UGFja2FnZS0yMjc1NTQ3MjIy).
Yaegi is lean and mean: it delivers, in a single package with no
external dependency, a complete Go interpreter compliant with the [Go
specification]. Lean, but also mean: its code is dense, complex, not
always idiomatic, and sometimes hard to understand.
This document is here to address that. In the following, after getting
an overview, we look under the hood, explore the internals and discuss
the design. Our aim is to provide the essential insights and to clarify
the architecture and the code organization. But first, the overview.
## Overview of architecture
Let's see what happens inside yaegi when one executes the following
line:
```go
interp.Eval(`print("hello", 2+3)`)
```
Figure 1 below shows the main steps of evaluation:

1. The *scanner* (provided by the [go/scanner] package) transforms a
stream of characters (the source code) into a stream of tokens,
through a [lexical analysis] step.
2. The *parser* (provided by the [go/parser] package) transforms the
stream of tokens into an [abstract syntax tree] or AST, through a
[syntax analysis] step.
3. The *analyser* (implemented in the [yaegi/interp] package) checks
and creates type, constant, variable and function
symbols. It also computes the [control-flow graphs] and memory
allocations for symbols, through [semantic analysis] steps. All this
metadata is obtained from and stored in the nodes of the AST, making
it an annotated AST.
4. The *generator* (implemented in the [yaegi/interp] package) reads the
annotated AST and produces code instructions to be executed, through a
[code generation] step.
5. The *executor* (implemented in the [yaegi/interp] package) runs the
code instructions in the context of the interpreter.
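Steps 1 and 2 can be reproduced directly with the standard library. The sketch below (a standalone illustration, not yaegi code) parses our `print("hello", 2+3)` example and lists the AST node types the parser produces:

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// nodeTypes parses a Go expression and returns the concrete AST node
// types met during a depth first traversal.
func nodeTypes(src string) []string {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		panic(err)
	}
	var kinds []string
	ast.Inspect(expr, func(n ast.Node) bool {
		if n != nil {
			kinds = append(kinds, fmt.Sprintf("%T", n))
		}
		return true
	})
	return kinds
}

func main() {
	// The root is a *ast.CallExpr; below it come the identifier, the
	// string literal and the binary expression with its operands.
	for _, k := range nodeTypes(`print("hello", 2+3)`) {
		fmt.Println(k)
	}
}
```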
The interpreter is designed as a simple compiler, except that the code
is generated in memory instead of object files, and an executor
module runs this specific instruction format.
We won't go into more detail on the scanner and the parser, both provided
by the standard library, and will instead examine the analyser directly.
## Semantic analysis
The analyser performs the semantic analysis of the program to interpret.
This is done in several steps, all consisting of reading from and
writing to the AST, so we first examine the details and dynamics of our
AST representation.
### AST dynamics
Hereafter stands the most important data structure of any compiler,
interpreter or other language tool, and the function to use it
(extracted from
[here](https://github.com/traefik/yaegi/blob/8de3add6faf471a807182c7b8198fe863debc9d8/interp/interp.go#L284-L296)).
```go
// node defines a node of a (abstract syntax) tree.
type node struct {
	// Node children
	child []*node

	// Node metadata
	...
}

// Walk traverses AST n in depth first order, invoking the in function
// at node entry and the out function at node exit.
func (n *node) Walk(in func(n *node) bool, out func(n *node)) {
	if in != nil && !in(n) {
		return
	}
	for _, child := range n.child {
		child.Walk(in, out)
	}
	if out != nil {
		out(n)
	}
}
```
The above code is deceptively simple. As in many complex systems, an
important part of the meaning is carried by the relationships
between the elements and the patterns they form. It is easier to
understand by displaying the corresponding graph and considering the
system as a whole. We can do that with a simple example:
```go
a := 3
if a > 2 {
	print("ok")
}
print("bye")
```
The corresponding AST is:

This is the raw AST, with no annotations, as obtained from the parser.
Each node carries an index number (for labelling purposes only) and the
node type, computed by the parser from the set of Go grammar rules (e.g.
"stmt" for a list of [statements], "call" for a call [expression], ...).
We also recognize the source tokens as literal values in the leaf positions.
Walking the tree consists of visiting the nodes starting from the root
(node 1), in their numbering order (here from 1 to 15): depth first (the
children before the siblings) and from left to right. At each node, the
callback `in` is invoked at entry (pre-processing) and the callback `out`
at exit (post-processing).
When the `in` callback executes, only the information computed in the
sub-trees to the left of the node is available, in addition to the
pre-processing information computed in the node's ancestors. The `in`
callback returns a boolean. If the result is false, the node's sub-tree is
skipped, allowing processing to be short-circuited, for example to avoid
diving into function bodies and to process only function signatures.
When the `out` callback executes, the results computed on the whole
descendant sub-trees are available, which is useful for example to
compute the size of a composite object defined across nested
structures. Without post-processing, multiple tree walks would be
necessary to achieve the same result.
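This pattern can be illustrated with a self-contained sketch (the `node` here is a pared-down stand-in for yaegi's, with only a kind and children):

```go
package main

import "fmt"

// node is a minimal uniform AST node: a kind and its children.
type node struct {
	kind  string
	child []*node
}

// walk mirrors the traversal shown earlier: in at entry, out at exit.
func (n *node) walk(in func(*node) bool, out func(*node)) {
	if in != nil && !in(n) {
		return
	}
	for _, c := range n.child {
		c.walk(in, out)
	}
	if out != nil {
		out(n)
	}
}

func main() {
	// A tiny tree: a file containing a definition and a call.
	tree := &node{kind: "file", child: []*node{
		{kind: "define", child: []*node{{kind: "ident"}, {kind: "number"}}},
		{kind: "call", child: []*node{{kind: "ident"}, {kind: "string"}}},
	}}
	// Print the pre-processing and post-processing order of every node.
	tree.walk(
		func(n *node) bool { fmt.Println("enter", n.kind); return true },
		func(n *node) { fmt.Println("exit ", n.kind) },
	)
}
```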
A semantic analysis step is therefore simply a tree walk with the right
callbacks. In the case of our interpreter, we have two tree walks to
perform: the globals and types analysis in [interp/gta.go] and the
control-flow graphs analysis in [interp/cfg.go]. In both files, notice
the call to `root.Walk`.
Note: we chose to represent the AST as a uniform node structure,
as opposed to the [ast.Node] interface in the Go standard library,
which is implemented by specialized types for each node kind. The main
reason is that the tree walk function [ast.Inspect] only permits a
pre-processing callback, not the post-processing one required by
several compiling steps. It also seemed simpler at the time to start
with a uniform structure, and we ended up sticking with it.
### Globals and types analysis
Our first operation on the AST is to check and register all the
components of the program declared at global level. This is a partial
analysis, concerned only with declarations, not function
implementations.
This step is necessary because in Go, at global level, symbols can be
used before being declared (as opposed to Go function bodies, or to C in
general, where use before declaration is simply forbidden).
Allowing out-of-order symbols is what permits code to be scattered
arbitrarily amongst several files in a package without further constraints.
It is indeed an important feature, as it lets the programmer organize her
code as she wants.
This step, implemented in [interp/gta.go], consists of a tree
walk with only a pre-processing callback (no `out` function is passed).
There are two particularities:
The first is the multiple-pass iterative walk. In a first global
pass, instead of failing with an error whenever an incomplete definition
is met, the reference to the failing sub-tree is kept in a list of nodes
to be retried, and the walk finishes going over the whole tree. Then
all the problematic sub-trees are iteratively retried until all the
nodes have been defined, or as long as there is progress. If
two subsequent iterations lead to the exact same state, it is a hint
that no progress is being made and that the loop would run forever,
at which point yaegi just stops with an error.
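The retry loop can be sketched in a few lines. In this standalone illustration, a hypothetical `decl` list with name dependencies stands in for yaegi's unresolved sub-trees:

```go
package main

import (
	"errors"
	"fmt"
)

// decl is an illustrative stand-in for an unresolved declaration
// sub-tree: a symbol name and the names it depends on.
type decl struct {
	name string
	deps []string
}

// resolveAll repeatedly retries pending declarations until all are
// defined, or until an iteration makes no progress (which would
// otherwise loop forever).
func resolveAll(pending []decl) ([]string, error) {
	defined := map[string]bool{}
	var order []string
	for len(pending) > 0 {
		var retry []decl
		for _, d := range pending {
			ok := true
			for _, dep := range d.deps {
				if !defined[dep] {
					ok = false
					break
				}
			}
			if ok {
				defined[d.name] = true
				order = append(order, d.name)
			} else {
				retry = append(retry, d)
			}
		}
		// Same state twice in a row: stop with an error, as yaegi does.
		if len(retry) == len(pending) {
			return nil, errors.New("undefined symbols remain")
		}
		pending = retry
	}
	return order, nil
}

func main() {
	// "path" uses "prefix" before it is declared; a second pass resolves it.
	order, err := resolveAll([]decl{
		{name: "path", deps: []string{"prefix"}},
		{name: "prefix"},
	})
	fmt.Println(order, err)
}
```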
The second particularity is that, despite this being a partial analysis
step, a full interpretation can still be necessary on an expression
sub-tree if it serves to implement a global type definition, for
example when an array size is computed by an expression, as in the
following valid Go declarations:
```go
const (
	prefix = "/usr"
	path   = prefix + "/local/bin"
)
var a [len(prefix+path) + 2]int
```
A paradox is that the compiler needs an interpreter to perform the type
analysis! Indeed, in the example above, `[20]int` (because
`len(prefix+path) + 2 = 20`) is a specific type in itself, distinct from
e.g. `[14]int`. This means that, even though we are only at the type
analysis phase, we must already be able to compute the `len(prefix+path)
+ 2` expression. In the C language this is one of the roles of the
[pre-processor], so the compiler itself does not need to be
able to do it. Here in Go, the specification forces the compiler
implementor to provide, and use early on, the mechanics involved above,
usually called the constant folding optimisation. It is therefore
implemented both within the standard gc compiler and within yaegi. The same
kind of approach is taken to its extreme in the [Zig language] with
its [comptime] keyword.
### Control-flow graphs
After GTA, all the global symbols are properly defined, no matter
their declaration order. We can now proceed with the full code
analysis, performed by a single tree walk in
[interp/cfg.go].
Both pre-processing and post-processing callbacks are provided to the
walk function. Although it runs in a single pass, multiple kinds
of data processing are executed:
- Type checking and creation. Started in GTA, it is now also completed
in all function bodies.
- Analysis of variable scoping: scope levels are opened in
pre-processing and closed in post-processing, as the nesting of scopes
reflects the AST structure.
- Precise computing of object sizes and locations.
- Identification and ordering of actions.
The last point is critical for code generation. It consists of producing
the control-flow graphs. CFGs are usually expressed as an
intermediate representation (IR), which is really a
simplified, machine-independent instruction set, as in [GCC GIMPLE],
the [LLVM IR] or the [SSA] form in the Go compiler. In yaegi, no IR is
produced; only AST annotations are used.
Let's use our previous example to explain:

In the AST, the nodes relevant to the CFG are the *action* nodes (in
blue), that is, the nodes referring to an arithmetic or logic
operation, a function call or a memory operation (assigning a variable,
accessing an array entry, ...).
Building the CFG consists of identifying the action nodes and then finding
their successors (stored in the node fields `tnext` and `fnext`). An
action node has one successor in the general case (shown with a green
arrow), or two if the action is associated with a conditional branch
(green arrow if the test is true, red arrow otherwise).
The rules to determine the successor of an action node derive from
the properties of its neighbours (ancestors, siblings and descendants).
For example, in the `if` sub-tree (nodes 5 to 12), the first action to
execute is the condition test, that is, the first action in the
condition sub-tree, here node 6. This action has two
alternative successors: one to execute if the test is true, the other
if not. The *true* successor is the first action in the second
child sub-tree of the `if` node, describing the *true* branch (this
sub-tree's root is node 9, and its first action node 10). As there is no
`if` *false* branch in our example, the next action after the whole `if`
sub-tree is the first action in the `if` sibling sub-tree, here node
13. This node is therefore the *false* successor, the first action
to execute when the `if` condition fails. Finally, node 13 is also
the successor of the *true* branch, node 10. The corresponding
implementation is located in a [block of 16 lines] in the
post-processing CFG callback. Note that the same code also performs dead
branch elimination and condition validity checking. At this stage, in
terms of control flow, our AST example can now be seen as a simpler
representation, such as the following.

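The successor wiring for an `if` without an *else* branch can be sketched as follows (the `wireIf` helper and its single-action branches are illustrative simplifications, not yaegi's actual code):

```go
package main

import "fmt"

// node carries tnext/fnext successor fields as described above.
type node struct {
	kind         string
	child        []*node
	tnext, fnext *node // successors when a test is true / false
}

// wireIf links the condition to the true branch and to the statement
// following the whole if (its false successor), per the rules above.
func wireIf(ifNode, next *node) {
	cond, body := ifNode.child[0], ifNode.child[1]
	cond.tnext = body // condition true: run the body
	cond.fnext = next // condition false: skip to the if's successor
	body.tnext = next // after the body, continue with the successor
}

func main() {
	cond := &node{kind: "greater"}
	body := &node{kind: `call print("ok")`}
	ifn := &node{kind: "if", child: []*node{cond, body}}
	next := &node{kind: `call print("bye")`}
	wireIf(ifn, next)
	fmt.Println(cond.tnext.kind, cond.fnext.kind, body.tnext.kind)
}
```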
In our example, the action nodes composing the CFG can perform the
following kinds of operations:
- defining variables in memory and assigning values to them
- performing arithmetic or logical operations
- conditional branching
- function calling
Adding the capacity to jump to a *backward* location (where the
destination node index is lower than the source's, an arrow from right
to left), thus allowing *loops*, makes the action set
[Turing complete], implementing a universal computing machine.

The universality here lies in the cyclic nature of the
control-flow graph (note that `if` statement graphs, although
they appear cyclic, are not, because the conditional branches are
mutually exclusive).
This is not just theoretical. For example, forbidding backward jumps was
crucial in the design of the Linux [eBPF verifier], in order to let
user-provided (therefore untrusted) snippets execute in a privileged
kernel environment with a guarantee of no infinite loops.
## Code generation and execution
The compiler implemented in yaegi targets the Go runtime itself, not a
particular hardware architecture. For each action node in the CFG, a
corresponding closure is generated. The main benefits are:
- Portability: the generated code runs on any platform where Go is
supported.
- Interoperability: the objects produced by the interpreter are directly
usable by the host program in the form of reflect values.
- Memory management, in particular the garbage collector, is provided
by the runtime, and also applies to the values created by the
interpreter.
- Runtime type safety, slices, maps, channels and goroutines
are also provided by the runtime.
The action templates are located in [interp/run.go] and [interp/op.go].
Generating closures makes it possible to optimize all the cases where a
constant is used (an operation involving a constant and a variable is
cheaper and faster than the same operation involving two variables). It
also permits hard-coding the control-flow graph, that is, pre-defining
the next instruction to execute and avoiding useless branch tests.
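The closure scheme can be illustrated with a self-contained sketch of our running example, where each instruction performs its action and returns its hard-coded successor (the names and the simplified `int` memory are illustrative; yaegi operates on reflect values):

```go
package main

import "fmt"

// frame is a simplified memory: yaegi uses slices of reflect.Value.
type frame struct {
	data []int
}

// instr is a closure instruction returning the next instruction to run.
type instr func(*frame) instr

// run executes instructions until one returns nil: no branch tests,
// each instruction already knows its successor.
func run(f *frame, start instr) {
	for pc := start; pc != nil; pc = pc(f) {
	}
}

func main() {
	f := &frame{data: make([]int, 1)}
	var assign, test, printOk, printBye instr
	// a := 3
	assign = func(f *frame) instr { f.data[0] = 3; return test }
	// if a > 2: the branch hard-codes both successors (tnext / fnext)
	test = func(f *frame) instr {
		if f.data[0] > 2 {
			return printOk // tnext
		}
		return printBye // fnext
	}
	printOk = func(f *frame) instr { fmt.Println("ok"); return printBye }
	printBye = func(f *frame) instr { fmt.Println("bye"); return nil }
	run(f, assign)
}
```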
The pseudo-architecture targeted by the interpreter is in effect a
virtual [stack machine] where memory is represented as slices of Go
reflect values, as shown in the following figure, and where the
instructions are represented directly by the set of action nodes (the
CFG) in the AST. These atomic instructions, also called *builtins*, are
slightly higher level than a real hardware instruction set, because they
operate directly on Go interfaces (more precisely, their reflect
representation), hiding a lot of the low-level processing and subtleties
provided by the Go runtime.

The memory management performed by the interpreter consists of creating a
global frame at each new session (the top of the stack), populated with all
global values (constants, types, variables and functions). At each new
interpreted function call, a new frame is pushed on the stack,
containing the values for all the return values, input parameters and
local variables of the function.
## Conclusion
We have described the general architecture of a Go interpreter that
reuses the existing Go scanner and parser. We have focused on the
semantic analysis, based on AST annotations, up to the control-flow
graph and code generation. This design leads to a consistent and
concise compiler, suitable for an embedded interpreter. We have also
provided a succinct overview of the virtual stack machine built on top
of the Go runtime, leveraging the reflection layer provided by the Go
standard library.
We can now evolve this design to address different target architectures,
for example a more efficient virtual machine, which is already in the works.
Some parts of yaegi have not been detailed yet and will be addressed in
a future article:
- Integration with pre-compiled packages
- Go Generics
- Recursive types
- Interfaces and methods
- Virtualization and sandboxing
- REPL and interactive use
P.S. Thanks to [@lejatorn](https://twitter.com/@lejatorn) for his feedback
and suggestions on this post.
[Yaegi]: https://github.com/traefik/yaegi
[plugins.traefik.io]: https://plugins.traefik.io
[databases]: https://github.com/xo/xo
[observability]: https://github.com/slok/sloth
[container security]: https://github.com/cyberark/kubesploit
[Go specification]: https://go.dev/ref/spec
[go/scanner]: https://pkg.go.dev/go/scanner
[go/parser]: https://pkg.go.dev/go/parser
[abstract syntax tree]: https://en.wikipedia.org/wiki/Abstract_syntax_tree
[lexical analysis]: https://en.wikipedia.org/wiki/Lexical_analysis
[syntax analysis]: https://en.wikipedia.org/wiki/Syntax_analysis
[semantic analysis]: https://en.wikipedia.org/wiki/Semantic_analysis_(compilers)
[control-flow graphs]: https://en.wikipedia.org/wiki/Control-flow_graph
[yaegi/interp]: https://pkg.go.dev/github.com/traefik/yaegi/interp
[code generation]: https://en.wikipedia.org/wiki/Code_generation_%28compiler%29
[ast.Node]: https://pkg.go.dev/go/ast#Node
[ast.Inspect]: https://pkg.go.dev/go/ast#Inspect
[statements]: https://go.dev/ref/spec#Statements
[expression]: https://go.dev/ref/spec#Expressions
[Turing complete]: https://en.wikipedia.org/wiki/Turing_completeness
[eBPF verifier]: https://www.kernel.org/doc/html/latest/bpf/verifier.html
[interp/gta.go]: https://github.com/traefik/yaegi/blob/master/interp/gta.go
[interp/cfg.go]: https://github.com/traefik/yaegi/blob/master/interp/cfg.go
[interp/run.go]: https://github.com/traefik/yaegi/blob/master/interp/run.go
[interp/op.go]: https://github.com/traefik/yaegi/blob/master/interp/op.go
[pre-processor]: https://gcc.gnu.org/onlinedocs/cpp/
[GCC GIMPLE]: https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html
[LLVM IR]: https://llvm.org/docs/LangRef.html
[SSA]: https://github.com/golang/go/blob/bf48163e8f2b604f3b9e83951e331cd11edd8495/src/cmd/compile/internal/ssa/README.md
[block of 16 lines]: https://github.com/traefik/yaegi/blob/8de3add6faf471a807182c7b8198fe863debc9d8/interp/cfg.go#L1608-L1624
[Zig language]: https://ziglang.org
[comptime]: https://ziglang.org/documentation/master/#comptime
[stack machine]: https://en.wikipedia.org/wiki/Stack_machine