parscan - experimental interpreter

Age	Commit message (Collapse)	Author
2023-10-12	move to a direct byte code compiler (#8)	Marc Vertes
	* chore: refactor to keep only the new parser and bytecode vm * scanner: remove Token.value field * scanner: remove scanner.kind field * chore: move language specification in lang package This avoid a cyclic dependency in scanner_test which can now use the golang/GoSpec language specification for Go. * clean code * scanner: export scanner fields Also parser now generate function calls, including externals. * chore: fix lint issues * parser: handle strings * wip * parser: implement support for 'if, else, else if' statements Resolving labels in the compiler still in progress. * parser: support if statements, improve compiler * improve handling of functions * improve support of local variables * scanner: trim leading and trailing spaces * fixes to make fibonacci work * parser: improve README, fix function parameters parsing
2023-08-24	codegen: add Interpreter struct	Marc Vertes
	This makes the code easier to use.
2023-08-24	Update README.md	Marc Vertes
	fix diagram
2023-07-20	vm1: introduce a fast and simple bytecode stack machine (#1)	Marc Vertes
	`vm1` is bytecode based stack machine. The purpose of `vm1` is to provide a simple, fast, embeddable and portable Go execution environment. The bytecode consists of a dozen of instructions, each taking 0 or 1 immediate argument (non-immediate arguments are only passed through the stack). Only a few operators for a few types are implemented. I expect to have 1 instruction per operator per numerical type, all with the same pattern, which would be generated from a template. Estimate is around 20 operators and 10 numerical types, so around 200 instructions in final. Structurally, the vm implements logical and arithmetic operators, condional jumps for `if`, `for` and `switch` control flow, and function call, return and frame management. the memory state of the vm is a slice of Go interfaces (`[]any`). The whole vm is coded in a single function of 80 lines with no dependencies. The size will grow as we add missing instructions, but the code complexity will remain the same. the vm1 package is totally standalone and could be used for any purpose outside of parscan and/or gno. ## Performances The benchmark is performed on a fibonacci program as below: ```go= func fib(n int) int { if n < 2 { return n } return fib(n-2) + fib(n-1) } ``` Note: I generated the bytecode by hand, waiting for the code generator package (next week). I measured the execution time of `fib(35)` on my Apple M1 with: - a static executable generated by the Go compiler (native Go) - a byte code interpreted by vm1 - a byte code interpreted by vm1 where memory is `[]int` instead of `[]any` - the source interpreted by vm0 - using gno and yaegi interpreters \| \| go native \| vm1 \| vm1-int \| yaegi \| vm0 \| gno \| \| ------- \| --------- \| ---- \| ------- \| ----- \| --- \| --- \| \| fib(35) \| 0.1s \| 1.4s \| 1.0s \| 9.2s \| 18s \| 23s \| Notes: - both gno and vm0 are an AST based stack machine - yaegi is a register machine with no bytecode but closures - vm1 is a bytecode stack machine - vm1-int is identical to vm1 except that memory is `[]int` instead of `[]any` (just for benchmark) - go-native is statically compiled native machine code ### Additional notes on vm1 Only one instruction allows to execute code outside of the VM: `CallX`. It can be easily restricted. The performance of vm1 is achieved by simplicity and locality. All the execution of the virtual machine takes place inside a single `for` loop and a single `switch` on the opcode. The code of each instruction is trivial. There is no function called, except for optional `CallX`, and invoking the Go slice builtins (`append`, `len`), which are inlined by the Go compiler. The memory state of vm1, as in vm0, is defined as a `[]any` (slice of interfaces), allowing arbitrary objects, and insuring type consistency (each item of the `mem` slice is a tuple `{typePtr, DataPtr}`). To measure the overhead of maintaining type consistency in the runtime, I defined the memory as `[]int` instead of `[]any`, `fib(35)` executed in 1s instead of 1.4s. In terms of memory management, vm1, as vm0, benefits from the way slices are implemented in Go, minimizing the impact of the garbage collector upon growing and shrinking the stack, as nested and/or recursive function calls require. In this scheme, the bottom of the stack is naturally devoted to storage of global data. ### Breaking performance limits ? Vm1 is approximately 10 to 15 times slower than a Go compiled binary, but is simple, fully dynamic, portable, and self-contained. Given the simplicity of the code, I think it is close to the max of what could be obtained by remaining portable and safe on top of the Go runtime. To break this performance barrier, and get even closer to the performance of a Go native binary, a possible approach is to translate the bytecode into target processor instructions (x86, arm64, ...), then execute them directly, as done for example by some WASM jit interpreters. This would induce either a dependence on librairies inside or outside Go (the fastest existing being outside), or, add dependencies both on each supported CPU architecture (to encode instructions) and each supported OS (to execute a memory-mapped segment). See also the PR about LLVM jit as proposed in [gno#774](https://github.com/gnolang/gno/pull/774). ## Next steps A bytecode generator is still necessary to generate the input bytecode from the AST. I would like to propose that as the next step. hint: it will involve a single tree-walk of the AST.