uxntal devlog

A programming language

It is fair to assume that programming with so few primitives will make it difficult for software tools to infer meaning from a program's source files. But we can still convey a lot of detail about the intent of various data structures by encoding information in the few tools at our disposal.

I. Structural Editing

Let's consider the following uxn program, and associated symbols file, created with Drifblim:

hello.rom

a001 1560 0001 0040 0005 9480 1817 2194
20ff f722 6c48 656c 6c6f 2057 6f72 6c64
21

hello.rom.sym

labels               comments
0018 write           0100 ( -> )
0100 on-reset        0107 ( str* -: )
0107 <print>         010a ( -- )
010a <print>/while   010a ( send )
010f <print>/        010e ( loop )
0115 hello-txt

It is possible to recreate a textual source file by walking through the program data and drawing from the symbols file, placing first the labels, then the sublabels, and lastly the comments, to create the following valid program:

|0018 @write |0100 @on-reset ( -> ) ;hello-txt <print> BRK @<print> ( str* -: ) !& &while ( -- ) ( send ) LDAk .write DEO ( loop ) INC2 & LDAk ?&while POP2 JMP2r @hello-txt 48 65 6c 6c 6f 20 57 6f 72 6c 64 21
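As a rough sketch of that walk, and not the actual tool (which is written in Uxntal), the following Python decodes hello.rom against its labels. The names ROM, SYMBOLS, mnemonic and disassemble are made up for this illustration. Comments are left out, and treating a byte literal that matches a zero-page symbol as a .label reference is an assumption of mine, as is switching to raw bytes after a label ending in -txt.

```python
BASE = ("LIT INC POP NIP SWP ROT DUP OVR EQU NEQ GTH LTH JMP JCN JSR STH "
        "LDZ STZ LDR STR LDA STA DEI DEO ADD SUB MUL DIV AND ORA EOR SFT").split()
JUMPS = {0x20: "?", 0x40: "!", 0x60: ""}  # JCI, JMI, JSI written as runes

# The bytes of hello.rom and the label column of hello.rom.sym, copied from above.
ROM = bytes.fromhex("a001 1560 0001 0040 0005 9480 1817 2194"
                    "20ff f722 6c48 656c 6c6f 2057 6f72 6c64 21")
SYMBOLS = {0x0018: "write", 0x0100: "on-reset", 0x0107: "<print>",
           0x010a: "<print>/while", 0x010f: "<print>/", 0x0115: "hello-txt"}


def mnemonic(op):
    """Name of a non-immediate opcode, with its 2/k/r mode suffixes."""
    if op == 0x00:
        return "BRK"
    short = "2" if op & 0x20 else ""
    keep = "k" if op & 0x80 and op & 0x1f else ""  # LIT always carries the keep bit
    ret = "r" if op & 0x40 else ""
    return BASE[op & 0x1f] + short + keep + ret


def disassemble(rom, symbols, origin=0x0100):
    tokens, scope, pc, raw = [], "", 0, False

    def ref(addr):
        # Name a jump target, shortening sublabels of the current scope to &name.
        name = symbols.get(addr)
        if name is None:
            return f"{addr:04x}"  # placeholder only, not valid Uxntal
        if name.startswith(scope + "/"):
            return "&" + name[len(scope) + 1:]
        return name

    for addr in sorted(a for a in symbols if a < origin):
        tokens += [f"|{addr:04x}", "@" + symbols[addr]]
    tokens.append(f"|{origin:04x}")

    while pc < len(rom):
        addr, op = origin + pc, rom[pc]
        name = symbols.get(addr)
        if name is not None and "/" in name:
            tokens.append("&" + name.split("/", 1)[1])
        elif name is not None:
            scope, raw = name, name.endswith("-txt")
            tokens.append("@" + name)
        if raw:  # data: emit raw bytes
            tokens.append(f"{op:02x}")
            pc += 1
        elif op in JUMPS:  # immediate jumps carry a relative 16-bit operand
            rel = int.from_bytes(rom[pc + 1:pc + 3], "big")
            tokens.append(JUMPS[op] + ref((addr + 3 + rel) & 0xffff))
            pc += 3
        elif op == 0xa0:  # LIT2: absolute label reference if the value is known
            value = int.from_bytes(rom[pc + 1:pc + 3], "big")
            tokens.append(";" + symbols[value] if value in symbols else f"#{value:04x}")
            pc += 3
        elif op == 0x80:  # LIT: zero-page label reference if the value is known
            value = rom[pc + 1]
            tokens.append("." + symbols[value] if value in symbols else f"#{value:02x}")
            pc += 2
        elif op in (0xc0, 0xe0):  # LITr / LIT2r: keep the operand as raw bytes
            size = 2 if op & 0x20 else 1
            tokens.append(mnemonic(op))
            tokens += [f"{b:02x}" for b in rom[pc + 1:pc + 1 + size]]
            pc += 1 + size
        else:
            tokens.append(mnemonic(op))
            pc += 1
    return " ".join(tokens)


print(disassemble(ROM, SYMBOLS))
```

Run on the rom above, this should print the same token stream minus the comments, which would be spliced in by address from the second column of the symbols file.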

While this text stream can be reassembled into the program from which it originates, it will first need to be reformatted to be readable as a programming artifact.

|0018 
@write 
|0100 
@on-reset ( -> )
	;hello-txt <print> BRK
@<print> ( str* -: )
	!& &while ( -- )
		( send ) LDAk .write DEO
		( loop ) INC2 & LDAk ?&while POP2 JMP2r
@hello-txt 
	48 65 6c 6c 6f 20 57 6f 72 6c 64 21
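The layout above can be approximated with a small pass over the token stream. This is only a Python sketch under my own assumptions, not the self-hosted formatter: tokenize and reformat are hypothetical names, a comment directly after a label is taken to be its defining comment, and an unnamed sublabel (&) is left inline.

```python
def tokenize(src):
    """Split on whitespace, keeping each ( comment ) together as one token."""
    tokens, comment, depth = [], [], 0
    for word in src.split():
        if word.startswith("("):
            depth += 1
        if depth == 0:
            tokens.append(word)
            continue
        comment.append(word)
        if word == ")":
            depth -= 1
            if depth == 0:
                tokens.append(" ".join(comment))
                comment = []
    return tokens


def reformat(src):
    lines, line, depth, label = [], [], 0, None

    def flush(new_depth):
        # Emit the pending line at the current depth, then switch depth.
        nonlocal line, depth
        if line:
            lines.append("\t" * depth + " ".join(line))
        line, depth = [], new_depth

    for tok in tokenize(src):
        if tok[0] in "|@":               # padding and labels start at column zero
            flush(0)
            line.append(tok)
            label = tok[0]
        elif tok[0] == "(":
            if label in ("@", "&"):      # defining comment: closes the label line
                line.append(tok)
                flush(1 if label == "@" else 2)
            else:                        # any other comment starts a new line
                flush(depth)
                line.append(tok)
            label = None
        else:
            if label in ("@", "|"):      # label without a comment: body goes below
                flush(1 if label == "@" else 0)
            line.append(tok)
            label = "&" if tok[0] == "&" and len(tok) > 1 else None
    flush(0)
    return "\n".join(lines)
```

Feeding it the single-line stream from earlier should give back the eleven lines shown above, without the trailing spaces.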

By breaking on absolute padding, label tokens and non-defining comments, and by tabbing the content of sublabels, we can improve readability further and already return to something like what the original source file might have looked like:

|0018 

@write

|0100 

@on-reset ( -> )
	;hello-txt <print>
	BRK

@<print> ( str* -: )
	!&
	&while ( -- )
		( send ) LDAk .write DEO
		( loop ) INC2 & LDAk ?&while
	POP2 JMP2r

@hello-txt
	"Hello 20 "World! 00

Finally, we can ensure that lines terminate on emitting opcodes (STZ/STR/STA/DEO) and some immediate opcodes (JCI, JMI). To make explicit that some routines are emitting, and therefore line-terminating, I chose to use the <label> format.
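A minimal Python sketch of that line-termination rule, assuming the tokens of one routine body are already split and that the ? and ! runes stand for JCI and JMI; ends_line and split_lines are names invented here, not part of any existing tool.

```python
import re

# STZ/STR/STA/DEO in any combination of the 2, k and r modes.
EMITTING = re.compile(r"(STZ|STR|STA|DEO)[2kr]*")


def ends_line(tok):
    """True for emitting opcodes, immediate jumps and <emitting> routine calls."""
    if EMITTING.fullmatch(tok):
        return True
    if tok[0] in "?!":
        return True
    return tok.startswith("<") and tok.endswith(">")


def split_lines(tokens):
    lines, line = [], []
    for tok in tokens:
        line.append(tok)
        if ends_line(tok):
            lines.append(" ".join(line))
            line = []
    if line:
        lines.append(" ".join(line))
    return lines
```

For instance, split_lines(";hello-txt <print> BRK".split()) breaks after <print> and leaves BRK on its own line, as in the listing above.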

Labels marking the start of binary information carry a suffix, as in hello-txt above, that tells the reassembler how to handle the content. For example, txt for ASCII characters, icn for 1-bit graphics and chr for 2-bit graphics.
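To illustrate, here is a hedged Python sketch of how a reassembler might pick a renderer from the label, assuming the marker is whatever follows the last dash in the name. Only txt is handled; everything else falls back to raw hex bytes. render_txt, render_raw and render are hypothetical helpers.

```python
def render_raw(data):
    return " ".join(f"{byte:02x}" for byte in data)


def render_txt(data):
    """Group printable, non-space characters into "word tokens, the rest as hex."""
    tokens, run = [], ""
    for byte in data:
        if 0x21 <= byte <= 0x7e:
            run += chr(byte)
            continue
        if run:
            tokens.append('"' + run)
            run = ""
        tokens.append(f"{byte:02x}")
    if run:
        tokens.append('"' + run)
    return " ".join(tokens)


RENDERERS = {"txt": render_txt}  # icn and chr would get their own entries


def render(label, data):
    marker = label.rsplit("-", 1)[-1]
    return RENDERERS.get(marker, render_raw)(data)
```

With these definitions, render("hello-txt", b"Hello World!\x00") gives the "Hello 20 "World! 00 line from the last listing.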

II. Source Validation

Lacking data types, there is not much to go on as far as static validation goes, but there is still space to explore here. Type inference in Uxntal is done by checking the stack effect declarations of words against the sum of stack changes predicted to occur based on the arity of each token in their bodies.

@add ( a* b* -: c* ) Warning: Imbalance in @add of +2
	DUP2 ADD2 JMP2r

Words that do not pass the stack-checker generate a warning, so this essentially defines a very basic and permissive type system that nevertheless catches some invalid programs and enables compiler optimizations.

@routine ( a b -: c ) Warning: Imbalance in @routine of +1
	EQUk ?&sub-routine
	POP2 #0a JMP2r
	&sub-routine ( a b -: c* )
		POP2 #000b JMP2r
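The arithmetic behind those two warnings can be sketched in Python. This is a simplification under stated assumptions, not the real checker: only a few opcodes are tabulated, anything in return mode is counted as leaving the working stack untouched, branches are not followed (each path is passed in as a flat token list), and the names EFFECTS, token_effect, declared and imbalance are invented for the example.

```python
# Bytes (popped, pushed) on the working stack, for an item size of n bytes.
EFFECTS = {
    "BRK": lambda n: (0, 0),
    "INC": lambda n: (n, n),
    "POP": lambda n: (n, 0),
    "NIP": lambda n: (2 * n, n),
    "SWP": lambda n: (2 * n, 2 * n),
    "ROT": lambda n: (3 * n, 3 * n),
    "DUP": lambda n: (n, 2 * n),
    "OVR": lambda n: (2 * n, 3 * n),
    "JMP": lambda n: (n, 0),
}
for name in ("ADD", "SUB", "MUL", "DIV", "AND", "ORA", "EOR"):
    EFFECTS[name] = lambda n: (2 * n, n)
for name in ("EQU", "NEQ", "GTH", "LTH"):
    EFFECTS[name] = lambda n: (2 * n, 1)  # comparisons always push one byte


def token_effect(tok):
    """Net change, in bytes, that one token makes to the working stack."""
    if tok[0] == "#":
        return (len(tok) - 1) // 2   # #12 pushes one byte, #1234 pushes two
    if tok[0] == ";":
        return 2                     # absolute address literal
    if tok[0] == ".":
        return 1                     # zero-page address literal
    if tok[0] == "?":
        return -1                    # conditional jump consumes its flag
    if tok[0] in "!&":
        return 0
    name, modes = tok[:3], tok[3:]
    if "r" in modes:
        return 0                     # return-stack operation (simplification)
    popped, pushed = EFFECTS[name](2 if "2" in modes else 1)
    return pushed - (0 if "k" in modes else popped)


def declared(effect):
    """Net change promised by a comment such as ( a* b* -: c* )."""
    words = effect.strip("() ").split()
    split = next(i for i, w in enumerate(words) if w in ("--", "-:", "->"))
    size = lambda w: 2 if w.endswith("*") else 1
    return sum(map(size, words[split + 1:])) - sum(map(size, words[:split]))


def imbalance(effect, body):
    return sum(token_effect(tok) for tok in body.split()) - declared(effect)
```

With these definitions, imbalance("( a* b* -: c* )", "DUP2 ADD2 JMP2r") comes out at 2. For @routine, the fall-through path balances, while the path through &sub-routine, checked against the routine's own ( a b -: c ), is off by 1.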

III. Peephole Optimization

Routines are sequences of combinators that ingest values from the stacks. Some permutations of these combinators are obviously redundant, and reducing these extraneous transformations can be done on source files, for example:

#12 #34 SWP POP -> NIP

Tail-call optimization happens where a jump to a subroutine is followed by a subroutine return; the pair can be replaced by a single jump.

@routine ( a b -: c )
	SWP ;function JSR2 JMP2r -> JMP2
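Both rewrites fit a small rule table applied to the tail of the output as tokens are read. The Python below is a sketch of that idea with a hypothetical RULES list and peephole function. The short-mode SWP2 POP2 rule is my own addition by analogy.

```python
# (pattern, replacement) pairs over adjacent tokens.
RULES = [
    (("SWP", "POP"), ("NIP",)),
    (("SWP2", "POP2"), ("NIP2",)),
    (("JSR2", "JMP2r"), ("JMP2",)),  # tail call: call then return becomes a jump
]


def peephole(tokens):
    out = []
    for tok in tokens:
        out.append(tok)
        changed = True
        while changed:  # a rewrite may expose another match at the tail
            changed = False
            for pattern, replacement in RULES:
                if tuple(out[-len(pattern):]) == pattern:
                    out[-len(pattern):] = replacement
                    changed = True
    return out
```

Because a label is itself a token, a pattern never matches across one, so a jump target in the middle of a pair keeps both opcodes.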

Going one step further, routines that would otherwise terminate in a tail-call optimization could even be relocated before their tail's location in memory and do away with the ending jump altogether. Incidentally, we leave a comment marker here to indicate to the stack-effect checker that the routine's tail will fall through.

@routine ( a b -: c )
	SWP ;function JMP2 -> ( >> )

@function ( b a -: c )
	DIV JMP2r
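A hedged Python sketch of the last step only: it assumes the routine already sits directly before its tail, and does not attempt the relocation itself. fall_through is a made-up name, and the token list is expected to keep each comment as a single token.

```python
def fall_through(tokens):
    """Replace ';name JMP2' with '( >> )' when '@name' is the very next token."""
    out, i = [], 0
    while i < len(tokens):
        is_tail = (
            i + 2 < len(tokens)
            and tokens[i].startswith(";")
            and tokens[i + 1] == "JMP2"
            and tokens[i + 2] == "@" + tokens[i][1:]
        )
        if is_tail:
            out.append("( >> )")  # marker for the stack-effect checker
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


program = ["@routine", "( a b -: c )", "SWP", ";function", "JMP2",
           "@function", "( b a -: c )", "DIV", "JMP2r"]
print(" ".join(fall_through(program)))
```

A jump to any label other than the one that immediately follows is left untouched.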

These are just a handful of examples,
there are still many more things to explore.

The previous notes contain experiments done with self-hosted tools, from assembly and formatting to validation: each tool is written in the language that it assembles, formats or validates. Each program follows the same pattern of ingesting a file path, emitting errors and warnings through the Console/error port, and emitting the same file path through Console/write on success, making them composable.
