A programming language
It is fair to assume that programming with so few primitives will make it difficult for software tools to infer meaning from a program's source files. But we can still convey many details about the intent of various data structures by encoding information in the few tools at our disposal.
I. Structural Editing
Let's consider the following Uxn program and its associated symbols file, created with Drifblim:
hello.rom
a001 1560 0001 0040 0005 9480 1817 2194
20ff f722 6c48 656c 6c6f 2057 6f72 6c64
21

hello.rom.sym
0018 write
0100 ( -> )
0100 on-reset
0107 ( str* -: )
0107 <print>
010a ( -- )
010a <print>/while
010a ( send )
010f <print>/
010e ( loop )
0115 hello-txt
It is possible to recreate a textual source file by walking through the program data and drawing from the symbols file, first the labels, second the sublabels, lastly the comments, to create the following valid program:
|0018 @write |0100 @on-reset ( -> ) ;hello-txt <print> BRK @<print> ( str* -: ) !& &while ( -- ) ( send ) LDAk .write DEO ( loop ) INC2 & LDAk ?&while POP2 JMP2r @hello-txt 48 65 6c 6c 6f 20 57 6f 72 6c 64
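As a rough illustration of the first step of that walk, here is a minimal Python sketch that turns a single opcode byte into its mnemonic. It is not the Drifblim tooling itself: the names `BASE`, `IMMEDIATE` and `opcode_name` are hypothetical, and the table assumes the usual Uxn encoding (low five bits for the operation, then the short, return and keep mode bits).

```python
# Base operations, indexed by the low five bits of the opcode byte.
BASE = ["LIT", "INC", "POP", "NIP", "SWP", "ROT", "DUP", "OVR",
        "EQU", "NEQ", "GTH", "LTH", "JMP", "JCN", "JSR", "STH",
        "LDZ", "STZ", "LDR", "STR", "LDA", "STA", "DEI", "DEO",
        "ADD", "SUB", "MUL", "DIV", "AND", "ORA", "EOR", "SFT"]

# Bytes that do not follow the mode-bit pattern.
IMMEDIATE = {0x00: "BRK", 0x20: "JCI", 0x40: "JMI", 0x60: "JSI"}


def opcode_name(byte):
    """Return the mnemonic for one opcode byte (operands are not consumed)."""
    if byte in IMMEDIATE:
        return IMMEDIATE[byte]
    name = BASE[byte & 0x1f]
    if byte & 0x20:
        name += "2"  # short mode
    if byte & 0x80 and byte & 0x1f:
        name += "k"  # keep mode; LIT always carries this bit, so it is not written
    if byte & 0x40:
        name += "r"  # return mode
    return name
```

A complete reassembler would also have to consume the operands of literals and immediate jumps, and replace addresses with names from the symbols file; this sketch only covers the mnemonics.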
While the recreated text stream can be reassembled into the program from which it originates, it first needs to be reformatted to be readable as a programming artifact.
|0018 @write |0100 @on-reset ( -> ) ;hello-txt <print> BRK @<print> ( str* -: ) !& &while ( -- ) ( send ) LDAk .write DEO ( loop ) INC2 & LDAk ?&while POP2 JMP2r @hello-txt 48 65 6c 6c 6f 20 57 6f 72 6c 64
By breaking on absolute padding, label tokens, non-defining comments, and tabbing the content of sublabels, we can improve readability further and can already return to something like what the original source file might have looked like:
|0018 @write

|0100

@on-reset ( -> )
	;hello-txt <print> BRK

@<print> ( str* -: )
	!&
	&while ( -- )
		( send ) LDAk .write DEO
		( loop ) INC2 & LDAk ?&while POP2 JMP2r

@hello-txt
	"Hello 20 "World! 00
Finally, we can ensure that lines terminate on emitting opcodes (STZ/STR/STA/DEO) and some immediate opcodes (JCI, JMI). To make explicit that some routines are emitting, and therefore line-terminating, I chose to use the <label> format.
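The line-termination rule can be sketched in a few lines of Python. This is only an approximation of the idea, not the actual formatter: `ends_line` and `break_lines` are hypothetical names, and I assume that immediate jumps are written with the `?` and `!` runes and that tokens inside comments never end a line.

```python
import re

# Opcodes that write to memory or to a device.
EMITTING = {"STZ", "STR", "STA", "DEO"}


def ends_line(token):
    """True if a line should terminate right after this token."""
    if len(token) > 1 and token[0] in "?!":
        return True  # JCI and JMI, written ?label and !label
    if token.startswith("<") and token.endswith(">"):
        return True  # call to a routine marked as emitting
    match = re.fullmatch(r"([A-Z]{3})[2kr]*", token)
    return bool(match and match.group(1) in EMITTING)


def break_lines(source):
    """Split a token stream into lines, ignoring tokens inside comments."""
    lines, current, depth = [], [], 0
    for token in source.split():
        current.append(token)
        if token == "(":
            depth += 1
        elif token == ")":
            depth = max(0, depth - 1)
        elif depth == 0 and ends_line(token):
            lines.append(" ".join(current))
            current = []
    if current:
        lines.append(" ".join(current))
    return lines
```

Given the body of the while loop from the program above, the sketch breaks after `DEO` and after `?&while`, leaving `POP2 JMP2r` on a line of its own.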
Labels marking the start of binary information use prefixes that tell the reassembler how to handle the content. For example, txt for ASCII characters, icn for 1-bit graphics and chr for 2-bit graphics.
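For the txt case, here is a small Python sketch of how raw bytes might be turned back into string tokens, in the way the bytes under hello-txt become "Hello 20 "World! 00 in the listing above. The function name `format_txt` is hypothetical, and the rule of grouping printable, non-space ASCII into runs is my assumption based on that listing.

```python
def format_txt(data):
    """Render bytes as Uxntal string tokens, falling back to hex for the rest."""
    tokens, run = [], ""
    for byte in data:
        if 0x21 <= byte <= 0x7e:
            run += chr(byte)  # printable, non-space ASCII joins the current run
        else:
            if run:
                tokens.append('"' + run)
                run = ""
            tokens.append(f"{byte:02x}")  # spaces, terminators and others stay as hex
    if run:
        tokens.append('"' + run)
    return " ".join(tokens)
```

The icn and chr cases would need their own handlers, which are not sketched here.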
II. Source Validation
Lacking data types, there is not much to go on for static validation, but there is still space to explore here. Type inference in Uxntal is done by checking the stack effect declarations of words against the sum of stack changes predicted to occur based on the arity of each token in their bodies.
@add ( a* b* -: c* )
	DUP2 ADD2 JMP2r

Warning: Imbalance in @add of +2
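To make the arithmetic concrete, here is a minimal Python sketch of such a check for straight-line bodies. It is not the actual stack-checker: the names `ARITY`, `declared_sizes`, `token_effect` and `imbalance` are hypothetical, only a handful of opcodes are covered, branches are not followed, and the return stack is ignored.

```python
import re

# (values taken, values given) for a few opcodes, before modes are applied.
ARITY = {
    "INC": (1, 1), "POP": (1, 0), "NIP": (2, 1), "SWP": (2, 2),
    "ROT": (3, 3), "DUP": (1, 2), "OVR": (2, 3),
    "ADD": (2, 1), "SUB": (2, 1), "MUL": (2, 1), "DIV": (2, 1),
}


def declared_sizes(signature):
    """Byte counts before and after '-:'; names ending in * are shorts."""
    inner = signature.strip().strip("()")
    before, after = inner.split("-:")

    def size(names):
        return sum(2 if name.endswith("*") else 1 for name in names.split())

    return size(before), size(after)


def token_effect(token):
    """Net change of the working stack, in bytes, caused by one token."""
    if token.startswith("#"):
        return (len(token) - 1) // 2  # #12 pushes a byte, #1234 a short
    match = re.fullmatch(r"([A-Z]{3})([2kr]*)", token)
    if not match or match.group(1) not in ARITY:
        raise ValueError(f"unsupported token: {token}")
    name, modes = match.groups()
    if "r" in modes:
        return 0  # return-stack operations are not tracked in this sketch
    taken, given = ARITY[name]
    width = 2 if "2" in modes else 1
    if "k" in modes:
        taken = 0  # keep mode leaves the inputs on the stack
    return (given - taken) * width


def imbalance(signature, body):
    """Difference between the predicted and the declared output size."""
    depth, declared_out = declared_sizes(signature)
    for token in body.split():
        if token == "JMP2r":
            break  # the routine returns here
        depth += token_effect(token)
    return depth - declared_out
```

Calling `imbalance("( a* b* -: c* )", "DUP2 ADD2 JMP2r")` gives 2, which corresponds to the +2 reported for @add.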
Words that do not pass the stack-checker generate a warning, so this essentially defines a very basic and permissive type system that nevertheless catches some invalid programs and enables compiler optimizations.
@routine ( a b -: c )
	EQUk ?&sub-routine
	POP2 #0a JMP2r
	&sub-routine ( a b -: c* )
		POP2 #000b JMP2r

Warning: Imbalance in @routine of +1
III. Peephole Optimization
Routines are sequences of combinators that ingest values from the stacks. Some permutations of these combinators are obviously redundant, and reducing these extraneous transformations can be done on source files, for example:
#12 #34 SWP POP -> NIP
Tail-call optimization happens where a jump to a subroutine is followed by a subroutine return; the pair can be replaced by a single jump.
@routine ( a b -: c )
	SWP ;function JSR2 JMP2r -> JMP2
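Both rewrites so far, the redundant SWP POP and the tail call, can be expressed as a token-level peephole pass. The Python sketch below is only an illustration under that assumption: `RULES` and `peephole` are hypothetical names, and the rule table contains nothing beyond the two examples given here.

```python
# Each rule maps a run of adjacent tokens to its shorter replacement.
RULES = [
    (("SWP", "POP"), ("NIP",)),
    (("JSR2", "JMP2r"), ("JMP2",)),
]


def peephole(tokens):
    """Rewrite adjacent token runs that match a rule, until none apply."""
    out = []
    for token in tokens:
        out.append(token)
        changed = True
        while changed:
            changed = False
            for pattern, replacement in RULES:
                size = len(pattern)
                if tuple(out[-size:]) == pattern:
                    out[-size:] = replacement
                    changed = True
    return out
```

Because only adjacent tokens are matched, a label sitting between JSR2 and JMP2r prevents the rewrite, which seems desirable since that return could then be a jump target.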
Going one step further, routines that would otherwise end in a tail call could even be relocated before their tail's location in memory, doing away with the ending jump altogether. Incidentally, we leave a comment marker here to indicate to the stack-effect checker that the routine's tail will fall through.
@routine ( a b -: c )
	SWP ;function JMP2 -> ( >> )

@function ( b a -: c )
	DIV JMP2r
These are just a handful of examples; there are still many more things to explore.
The previous notes contain experiments done with self-hosted tools, from assembly and formatting to validation: each tool is written in the language that it assembles, formats or validates. Each program follows the same pattern of ingesting a file path, emitting errors and warnings through the Console/error port, and emitting the same file path through Console/write on success, making them composable.