From 8080afb5a9e38116646d69155365fb6bd09ea40c Mon Sep 17 00:00:00 2001
From: Marc Vertes
Date: Thu, 24 Aug 2023 10:59:39 +0200
Subject: fix: parser must be initialized before use

---
 scanner/README.md | 43 +++++++++++++++++++++++++++++++++++++++++++
 scanner/readme.md | 43 -------------------------------------------
 2 files changed, 43 insertions(+), 43 deletions(-)
 create mode 100644 scanner/README.md
 delete mode 100644 scanner/readme.md

(limited to 'scanner')

diff --git a/scanner/README.md b/scanner/README.md
new file mode 100644
index 0000000..c131a9f
--- /dev/null
+++ b/scanner/README.md
@@ -0,0 +1,43 @@
+# Scanner
+
+A scanner takes a string in input and returns an array of tokens.
+
+Tokens can be of the following kinds:
+- identifier
+- number
+- operator
+- separator
+- string
+- block
+
+Resolving nested blocks in the scanner is making the parser simple
+and generic, without having to resort to parse tables.
+
+The lexical rules are provided by a language specification at language
+level which includes the following:
+
+- a set of composable properties (1 per bit, on an integer) for each
+  character in the ASCII range (where all separator, operators and
+  reserved keywords must be defined).
+- for each block or string, the specification of starting and ending
+  delimiter.
+
+## Development status
+
+A successful test must be provided to check the status.
+
+- [x] numbers starting with a digit
+- [ ] numbers starting otherwise
+- [x] unescaped strings (including multiline)
+- [x] escaped string (including multiline)
+- [x] separators (in UTF-8 range)
+- [x] single line string (\n not allowed)
+- [x] identifiers (in UTF-8 range)
+- [x] operators, concatenated or not
+- [x] single character block/string delimiters
+- [x] arbitrarly nested blocks and strings
+- [x] multiple characters block/string delimiters
+- [x] blocks delimited by operator characters
+- [ ] blocks delimited by identifiers
+- [x] blocks with delimiter inclusion/exclusion rules
+- [ ] blocks delimited by indentation level (python, yaml, ...)
diff --git a/scanner/readme.md b/scanner/readme.md
deleted file mode 100644
index c131a9f..0000000
--- a/scanner/readme.md
+++ /dev/null
@@ -1,43 +0,0 @@
-# Scanner
-
-A scanner takes a string in input and returns an array of tokens.
-
-Tokens can be of the following kinds:
-- identifier
-- number
-- operator
-- separator
-- string
-- block
-
-Resolving nested blocks in the scanner is making the parser simple
-and generic, without having to resort to parse tables.
-
-The lexical rules are provided by a language specification at language
-level which includes the following:
-
-- a set of composable properties (1 per bit, on an integer) for each
-  character in the ASCII range (where all separator, operators and
-  reserved keywords must be defined).
-- for each block or string, the specification of starting and ending
-  delimiter.
-
-## Development status
-
-A successful test must be provided to check the status.
-
-- [x] numbers starting with a digit
-- [ ] numbers starting otherwise
-- [x] unescaped strings (including multiline)
-- [x] escaped string (including multiline)
-- [x] separators (in UTF-8 range)
-- [x] single line string (\n not allowed)
-- [x] identifiers (in UTF-8 range)
-- [x] operators, concatenated or not
-- [x] single character block/string delimiters
-- [x] arbitrarly nested blocks and strings
-- [x] multiple characters block/string delimiters
-- [x] blocks delimited by operator characters
-- [ ] blocks delimited by identifiers
-- [x] blocks with delimiter inclusion/exclusion rules
-- [ ] blocks delimited by indentation level (python, yaml, ...)
-- 
cgit v1.2.3
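
The README in this patch describes lexical rules as composable per-character properties (one bit each, on an integer) plus start and end delimiters for blocks. The Go sketch below illustrates that idea only. It is not the scanner package's actual API: `Spec`, `NewSpec`, `Scan`, the `Char*` constants, and the chosen character sets are all hypothetical names and assumptions made for illustration, and strings and multi-character delimiters are left out.

```go
package main

import "fmt"

// Hypothetical character properties: one bit each, composable with |.
const (
	CharLetter     uint = 1 << iota // may start or continue an identifier
	CharDigit                       // may start a number or continue an identifier
	CharOp                          // operator character
	CharSep                         // separator character
	CharBlockStart                  // opens a block
	CharBlockEnd                    // closes a block
)

// Spec is an assumed language specification: properties for each ASCII
// character and the closing delimiter matching each opening one.
type Spec struct {
	Props [128]uint
	End   map[byte]byte
}

// NewSpec builds a small example specification (assumed character sets).
func NewSpec() *Spec {
	s := &Spec{End: map[byte]byte{'(': ')', '{': '}', '[': ']'}}
	for c := byte('a'); c <= 'z'; c++ {
		s.Props[c] |= CharLetter
		s.Props[c-'a'+'A'] |= CharLetter
	}
	for c := byte('0'); c <= '9'; c++ {
		s.Props[c] |= CharDigit
	}
	for _, c := range []byte("+-*/=<>") {
		s.Props[c] |= CharOp
	}
	for _, c := range []byte(" \t\n") {
		s.Props[c] |= CharSep
	}
	for open, end := range s.End {
		s.Props[open] |= CharBlockStart
		s.Props[end] |= CharBlockEnd
	}
	return s
}

// props returns the property bits of c, or 0 outside the ASCII range.
func (s *Spec) props(c byte) uint {
	if c < 128 {
		return s.Props[c]
	}
	return 0
}

// blockEnd returns the index just past the block opening at start,
// resolving nested delimiters with a stack of expected closers.
func (s *Spec) blockEnd(src string, start int) (int, error) {
	var stack []byte
	for i := start; i < len(src); i++ {
		c := src[i]
		switch {
		case s.props(c)&CharBlockStart != 0:
			stack = append(stack, s.End[c])
		case s.props(c)&CharBlockEnd != 0:
			if c != stack[len(stack)-1] {
				return 0, fmt.Errorf("mismatched delimiter %q at %d", c, i)
			}
			stack = stack[:len(stack)-1]
			if len(stack) == 0 {
				return i + 1, nil
			}
		}
	}
	return 0, fmt.Errorf("unterminated block starting at %d", start)
}

// Scan splits src into identifiers, numbers, operator runs and whole
// blocks (nested content included). Separators are skipped.
func (s *Spec) Scan(src string) ([]string, error) {
	var toks []string
	for i := 0; i < len(src); {
		c := src[i]
		if s.props(c)&CharSep != 0 {
			i++
			continue
		}
		if s.props(c)&CharBlockStart != 0 {
			end, err := s.blockEnd(src, i)
			if err != nil {
				return nil, err
			}
			toks = append(toks, src[i:end])
			i = end
			continue
		}
		mask := s.props(c) & (CharLetter | CharDigit | CharOp)
		if mask == 0 {
			return nil, fmt.Errorf("unexpected character %q at %d", c, i)
		}
		if mask&CharLetter != 0 {
			mask |= CharDigit // identifiers may contain digits
		}
		j := i + 1
		for j < len(src) && s.props(src[j])&mask != 0 {
			j++
		}
		toks = append(toks, src[i:j])
		i = j
	}
	return toks, nil
}

func main() {
	toks, err := NewSpec().Scan("x1 += f(a, {b}) 42")
	fmt.Printf("%q %v\n", toks, err)
}
```

In this sketch the nested block `(a, {b})` comes back as a single token, which is the property the README credits with keeping the parser simple: a parser built on top could rescan a block's content recursively instead of tracking delimiters itself.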