Sidorov S.A. (MSU), Shumakov M.N. (NIISI RAN)
_____________________________________________

               DSSP AND FORTH. COMPARE ANALYSIS
               ---------------------------------

                          1. HISTORY

      DSSP (Dialog System for Structured Programming) was
created in the laboratory of Brousentsov N.P. at the Computer
science department of the Moscow State University in 1980,
contemporary version - in 1985, 32-bit version - in 1989.

      Genealogy:           Setun-70 ----- Forth
                                      |
                                     DSSP

      Setun-70 is a ternary minicomputer with a two-stack
architecture. From it DSSP inherited control constructions and the
basic idea: programming without a semantic gap.
      From Forth DSSP inherited the idea of vocabulary and the method
of notation of procedures between ":" and ";" . The ideologist of
DSSP, d-r. Brousentsov was educated as mathematician. He has
theoretical works in the fields of logic and computer architecture.
And he had a big experience in the computer industry as an architect
and constructor of ternary computers Setun and Setun-70.
      In 1980 he was not young men (55) and worked as a theorist and
leader of a project. The code of DSSP is created mainly by students
and postgraduate students of Moscow university. After 1991 the
majority of them works in other fields, DSSP-activities decreased.
But DSSP is alive. In the Moscow university and in NIISI RAN new
32-bit versions and a new shell were created, some applications are
in practice use. DSSP is open and ready. Probably, it is the first
English paper about DSSP.

     2. BUILDING OF DSSP LANGUAGE

           2.1 OPENNESS

Our startpoint is the definition:
    -------------------------------------------------
    | Openness is absence of gaps between languages |  (1)
    -------------------------------------------------
We consider pairs:
    machine language - program language;
    program language - system language ;
    program language - language of application field
     and make assumption that a programming system without gaps
between these languages is some optimum in a system "human+computer".


      Firstly, we consider the pair: machine language - program
language. We should remark that an usual assembler does not coincide
with its machine language. An assembler program has pseudocommands,
macros and other terms, that are absent in a machine language. Even a
single assembler command, for example:
     MOVE.L (R1)+,(R2+BASE)
consists of letters, comas and other terms, that are far from the
internal machine world. The only object, that can have similar sense
for human and for machine is the word. Let us try to build the
language from demand:

     "one word of text - one word of machine code".     (2)

     It is obvious, that Forth is very close to such a language.
Any one word of text
       : EXAMPLE ONE TWO THREE ;
corresponds to one word of code: address of procedure. But there are
exclusions: control operators and literals.

                   2.2.   CONTROL OPERATORS
2.2.1. BRANCHES
     "IF" in Forth has form:
       [n]  IF  y1 y2 ... yN  ELSE n1 n2 ... nM THEN
The direct interpretation of this phrase is not effective as
the interpreter must go through both branches. It means, that we
must install in code invisible "GO TO" or transform text to:
    : P1 y1 y2 ... yN ;
    : P2 n1 n2 ... nM ;
    [n] IF P1 ELSE P2 THEN
Thus, if we want to have the text and the code word to word
compatible, we must use only one word length phrase inside branches.
But in the phrase IF P1 THEN P2 ELSE there are three predefined words
(IF, THEN, ELSE) and only two user defined (P1 P2). It can be made
shorter.
      Secondly, Forth has only one branch operator. It leads to a
short programming system but long phrases in user programs. In DSSP
another way was chosen: a lot of branch operators and short phrases in
user programs. In the result we have:

  DSSP               Forth
  [n] IF+ A       [n] 0> IF A THEN
  [n] IF0 A       [n] 0= IF A THEN
  [n] IF- A       [n] 0< IF A THEN
  [n] BR+ A B     [n] 0> IF A ELSE B THEN
  [n] BR- A B     [n] 0< NEG IF A ELSE B THEN
  [n] BR0 A B     [n] 0= IF A ELSE B THEN
  [n] BRS X Y Z
  [n] BR c1 p1  c2 p2 ... cN pN else pN+1

      Two last operators in Forth are too long. BRS (BRanch on Sign)
is the three-directional switch (as in FORTRAN):
     [n] BRS MINUS ZERO PLUS
"BR" is similar to switch of Pascal or C. Using of "BR" is
clear from this example:
 : TRANSCRIPTION [n] BR #A ."ei"  #B ."bi"  #C ."ci" ....  #Z ."zet"
   ELSE ."???" ;
( here #A - constant, equal to ASCII-code of letter A).

      So, we must write a new procedure (and a new name) for any new
branch of program. It seems annoying, but only at first sight. As any
branch of program has a name, break point setting is very convenient.
Moreover, as any branch is called by one word, we can replace this
word by other one during debugging.
      In early 80's a special experimental version of DSSP with
brackets inside control constructions was created. After some time
of exploitation practical programmers said, that the variant
without brackets (i.e. without semantic gap) is better.
      The list of DSSP compare operation is short: "=", "<", ">".
Such operations as "0>" "0=" "0<" "<>" are absent.

2.2.2. CYCLES
      Construction DO .... LOOP is not suitable for us as it has
invisible variable of cycle I. This variable can be used only on
level between DO and LOOP. For example
  0 10  DO I . LOOP
is right, but
     : A I . ;
  0 10 DO A LOOP
is wrong. It contradicts to our principle of openness. In the
result we have only two simple cycle operators:

     DSSP            Forth
  [ ] RP A        []  BEGIN  A 0 UNTIL
  [n] DO A        [n,0] DO A LOOP

    To leave the cycle  4 break operators can be used:

   DSSP   Forth
    EX    LEAVE
    EX-   O< IF LEAVE THEN
    EX0   0= IF LEAVE THEN
    EX+   0> IF LEAVE THEN

      To leave N nested cycles the word EXT can be used:
                  [N] EXT.
      But it is a dangerous word and a mechanism of
situations in this case used usually (see 4.3.).

        Example:
1. The calculation of sum of numbers:
   1 2 3 4  DEEP 1- DO +  [10]  .
(DEEP  is the deepness of stack)

                           2.3. DATA
2.3.1. VARIABLES, NUMBERS, CONSTANTS

      A Literal (numbers inside procedures) of Forth occupies one
word of text, but two word of code:
           address_of_primitive
           value_of_number
      But a constant of Forth occupies only one word of code. To save
the principle "one word of text - one word of procedures code" DSSP
in both cases uses variant, similar to the constant of Forth. As one
number can be used in many places, the special vocabulary of
constants is supported.
      In Forth the execution of a constant give a value, but the
execution of variable give an address. We want to have one language
for all, therefore, the execution a variable must give a value too.
It means, that we must use the prefix notation for other operations
with variables.
     Example:
       DSSP                           Forth
       VAR X                          VARIABLE X
         X                              X @
     7 ! X  [ X=7 ]                   7 X !
       ' X  [address]                   X

      To make it a special table, and a special set of assembler
procedures is used for any type of variable. Any element of the table
is the address of the procedure. "X" calls the first procedure, "! X"
calls the second procedure and so on. It is the full list of
operations with scalar variables:

  DSSP       Forth    HLL
   ' X        X       t=addr(X)
     X        X @     t=X
   ! X        X !     X=t
  !0 X      0 X !     X=0
  !1 X      1 X !     X=1
 !1+ X      1 X !+    X=X+1
 !1- X      1 X !-    X=X-1
 !+  X        X !+    X=X+t
 !-  X        X !-    X=X-t

There are 5 base types of data, 2 unsigned: BYTE, WORD
and 3 signed: SBYTE SWORD and LONG. To support them about 40
assembler procedures are used.

      So, DSSP is more complex in realisation, but more
convenient to human operations with data. A usual "A=B" in DSSP
is " B ! A " versus " B @ A ! " in Forth. We think that the
speed of DSSP data operations is less or equal to that of
Forth in case of emulation (DSSP operations have longer
algorithm) and greater or equal in case of hardware processors
(the code of DSSP is shorter than the code of Forth).

2.3.2. ARRAYS
      In usual high level language (HLL) the dimensionality of array
is defined by the number of indexes in brackets. For example
    ARRAY M (5,7,9)
is a definition of 3-dimension array. In the Polish inverse notation
there are not brackets, therefore we must define the
dimensionality by number:
    5 7 9 3 ARRAY M
    The language of data execution is obtained by usual translation
from HLL to an inverse notation:
    A(2,5)=X  ----->   X  5 2  ! A
      The last phrase is very natural to a human. It step by step
describes actions of a human, who writes "X" in a two-dimensional
table.
      DSSP uses for arrays the same list of operations as for scalar
variables, plus a natural operation "write the same value in all
elements of array:
                            X !!! M
    As one-dimensional arrays are used more often than multidimensional,
a special word VCTR (vector) was introduced. Examples:

      100  BYTE VCTR  V       [  V[0:100]   ]
     4 6 2 LONG ARRAY A       [  A[0:4,0:6] ]

     DSSP       Forth            HLL
     !T           !
     @            @
     I   V      V I 2* + @       t=V(I)
     I ' V      V I 2* +         t=addr(V(I))
   3 I ! V    3 V I + !          V(I)=3
   I J   A      A J 9 * I + @    t=A(I,J)
 7 I J ! A    7 A J 9 * I + !    A(I,J)=7
   X !!! A   [ store X in all elements A(I,J) ]
     I   V      V I 2* + @       t=V(I)

2.3.3. FIXED VARIABLES
    A variables, whose values must be saved with the current
state of system (command SAVE) must be marked by the word "FIX":
   :: FIX VAR LENGTH   80 ! LENGTH
   BYTE FIX VECTOR V

              2.4.  OTHER OPERATIONS

      Many simple operations of Forth and DSSP coincide exactly. But
not all of them have the same names. There are two reasons for it: 1)
authors of DSSP thought, that in this case "the shorter - the better";
2) they used the terms of Setun-70. This table contains some examples
to compare Forth and DSSP. The sign :-( marks, where we do not know
any arguments in favour of DSSP.

 DSSP  Forth              DSSP    Forth
  +    +                  C@     DUP @
  D    DROP               @B     C@
  C    DUP                EXEC   EXECUTE
 .     DUP .              CR     CR
 .D    .                  TOS    TYPE
 DD   2DROP               SP     SPACE
SHL    2*     :-(         B8          OCTAL
  2+   2+                 ."Hello"    ."Hello"
NEG    NEG                ."Hello"    .(Hello)
ABS    ABS                VALUE       CONSTANT              :-(
LOAD   LOAD               CONSTANT    VECTOR OF CONSTANTS   :-(

          2.5   INPUT/OUTPUT.

      Input/Output has a strong dependence on the environment.
Therefore, we must think about the portability. A number of assembler
primitives must be as small as possible.
     For terminal only two low level operations is enough:
   TRB  [Terminal Input  Byte]
   TOB  [Terminal Output Byte]
     Other operation can be written in DSSP

2.5.1. TERMINAL OUTPUT
      [Terminal Output String:]
      : TOS [Al] DO TOSS D [] ;
        : TOSS [a] C @B TOB 1+ [a+1] ;
  [ In Forth:  C - DUP, D - DROP,  @B - C@ ]
      The output of an integer number is described by two
parameters: the length of output in chars and the base of
calculus. Thus, the general form is:

        Number   N_of_chars  Base_of_calculus  Operator

     As the base of calculus changes relatively seldom,
it is better to define it as the variable:
              VAR BASE@
Then the form for number output is:

        Number L TON   [Terminal Output Number]

      Obviously, all variants of text number output can be performed
by four words: TOB, TOS, TON and BASE@. But, for ease of handling,
some additional operations were added as standard:

[n]  .          Output all positions of number
."Hello world"  Message output
SP  CR          SPace, Carriage Return
B2 B8 B10 B16   BINARY  OCTAL  DECIMAL  HEX

    Many applications use next extensions:
: .D [n] . D [] ;
: .B10 [n] BASE@  E2 B10 .  ! BASE@ [n] ;

2.5.2. TERMINAL INPUT

      The main base input operation is TRB - Terminal Read
Byte. Other input operations are symmetrical to output
ones:
    TIB  Terminal Input Byte with echo
    TIN  Terminal Input Number
    TIS  Terminal Input String
   The text of TIB is simple:
    : TIB TRB C TOB [b] ;

                    3. PROGRAMMING SYSTEM

             3.1. INTERPRETATION  AND  COMPILATION

Three main gaps contradict to the openness of the programming
system:
   1) a gap between the program language and the system language;
   2) a time gap between editing and debugging;
   3) a gap between parts of program;

The "language of programming system" is more confusing
thing than a program language. Usually it is some strange mixture of
commands of a text editor, compiler, debugger, file system etc.

Let the command line interpreter of system works exactly as the
internal interpreter, i.e. let it interpret, word by word, the text
of the input flow. It is easy for DSSP, as any word of the text
corresponds to one and the only one word of the code. All what is
needed is a vocabulary with names and addresses of words plus a
simple program to handle this vocabulary.

The minimal time from editing to debugging is reached,
when a word (data or procedure) is compiling once after the input. It
means, what such words as "VAR", ":" etc. must be no
descriptors for some external compiler, they must be
a special compiler themselves.

I.e. "VAR", ":", ARRAY and others are compiling words, words
"LONG" "BYTE" and other are description words.

Consequences: 1) A programmer can create new compiling
words; 2) Compiling words can be executed inside a procedure, a
special word "TEXEC" (call of external text interpreter) was
introduced for it. Example: creation of a temporary array:

    : PROC1 "B10 10000 LONG VCTR VL  0 !!! VL " TEXEC ;

Now we can say, that some minimal programming system has
been built. But words in this system must be defined "from down to
top". To clear this limitation we must introduce a table of
undefined names and some procedure to support it. In result the
system becomes more complex and big. In 1970's it was difficult,
but in 1980's it was not, and it was made.  Thus, DSSP supports
"Top to Down"  programming.

              3.2.  SUBVOCABULARIES

The text of a usual word is a too small unit of the compilation
to store it as one file, the text of a full vocabulary is a too big
unit. Therefore, the structurisation of the vocabulary into groups of
words is needed. Moreover, in some cases the need appears in more
detail management of such groups. For example to close a part of
vocabulary from unwanted use or link.

That is why the vocabulary of DSSP is divided into named
subvocabularies. Usually, the name of subvocabulary begins with "$"
and the text of one subvocabulary is saved in one file, but it is
only a tradition. The size of text files of early stand-alone
versions of DSSP was limited by 4K bytes, today it is limited by 64K
bytes, but the usual size is less than 20 K.
       Next words used to handle  subvocabularies

   ?$          Show list of vocabularies
   FORGET  $V   [ Analogue to Forget of Forth ]
   GROW    $V
   PROGRAM $V  [ FORGET $V + GROW  $V]
   SHUT    $V  [ Close access and link to words of $V ]
   USE     $V  [ Open  access  and link to words of $V ]
   ONLY    $V  [ Use only one subvocabulary $V ]
   CANCEL      [ Cancel last "ONLY" ]
   CLEAR   $V  [ Delete all names of $V, exclude marked by :: ]

Obviously, many names are used only inside one or
several vocabularies, others are used only in debugging. The
operation "CLEAR" deletes such names from the subvocabulary, to
diminish the size of used memory and the number of
coincident names.

So, the vocabulary of DSSP is more complex, than the
vocabulary of Forth. Obviously, it is more suitable to develop
large program, particularly when several programmers are working
at the same time. For example, let programmer A make
subvocabulary $A, programmer B make $B etc., and let in
$A uses words from $B, $B uses words from $A.
Our programmers can use the next lists of vocabularies:

    Computer A:     Computer B:
    $A              $B
    $B              $A
    $OTHERS         $OTHERS

Both A and B are free in compiling and debugging. There is only
one limitation: the attempt to execute undefined word XXX leads to
the message "Unknown word XXX" and restart of DSSP.


           4. SOME  ARCHITECTURAL  FEATURES

                 4.1. ACTIONS

Any word of DSSP, for example X, can be executed in
procedure directly, by address and by name:

    1)   X
    2)   '' X  EXEC   [ '' - (two apostrophes) - get address of executor]
    3)  "X" TEXEC     [ call outernal text interpreter ]

Obviously, the executive address can be stored as value of variable
or array, and can be used to switch a flow of program. A special type
of data was introduced:

                 ACT  VAR  X

Actions are used to switch the output flow. As all terminal output is
based on the word TOB (Terminal Output Byte), TOB was defined as:

   :: FIX ACT VAR TOB   '' TOB' ! TOB

where TOB' is the "unconditional TOB".

    To change the direction of output flow is enough:
   '' LPB ! TOB   - direct output to printer.
   '' D   ! TOB   - no output.


      4.2.  LOCAL VALUES INSTEAD OF LOCAL VARIABLES

It is impossible to have a mechanism of local variables
in DSSP, as there are no place inside a procedure to describe
variables. But instead of it the mechanism of local values of
variables can be introduced. Using of local values is clear
from the example:

    : P1 S( BASE@ X Y ) B8  11 ! X  22 ! Y   P2 P3 ;

The values of variables , whose names of which are placed
between "S(" and ")", store in the additional stack and restore after
leaving the level of procedure P1 . While P2 and P3 are executed
these variables have local values.

              4.3.  SITUATIONS

DSSP is a language, a machine and a programming
system. The usual machine has interrupt system, therefore the
DSSP language must have words to describe it. As DSSP has
a vocabulary, interrupts can be handled by name, but not by
address or number. As DSSP is a "machine in development" there
are base interrupts and user defined interrupts. As DSSP is a
threaded code stack machine, "to call a interrupt" means "to
store the former state of interpreter and begin interpret with
some new word on first level of hierarchy". "To return from
interrupt" means "to run the interpreter in the former state".

The interrupt can be described by two names: the name of
interrupt and the name of reaction:

                       TRAP Q1 REACT1

TRAP is a compiling word, it create new word Q1, which run
new interpretation process with word REACT1 in first level.
In runtime the reaction can be set by phrase:

                  ON Q1 REACT2

which acts only in the current branch of program tree.

Also, we have got usual mechanism of interrupts. It is good for
hardware DSSP machine, but for DSSP emulators we have got only a
control method, similar to ACT VAR. But we can use situations to
resolve the well-known problem of structured languages: the absence
of GOTO leads in some cases to very long return from hierarchy of
calls. The word EON (Exit ON) was introduced for it. Example:

    : P  EON Q1 REACT2  P1 P2 P3 ;
       : P2 ... Q1 ... ;

After exciting of Q1 in word P2, the program leaves
P2 and P and after it executes word REACT2.

Situations are used widely in DSSP, especially in
input-output. Among predefined traps are: NOF (NO File), EOF
(End Of File), DERR (Disk ERRor), NOIND (index out of
boundaries), ZAPROS etc.

The ZAPROS (Russian word for ASK) is excited, if
text interpreter has not got text after execution of some input
operation.  ZAPROS is described as

      TRAP ZAPROS ." ?"   [type " ?" when exciting].

    It is the example of memory look procedure:
    : M [] ON ZAPROS ." Address? " TIN . @ . D CR [] ;

Now, if we write  " M " the system asks: " Address? ",
but if we write "M 100", the system execute M without asking.

        5.  EDITING, DEBUGGING, EXECUTING

The last 10 years DSSP uses built-in common purpose screen
editor, in MSDOS size of text is limited by 64K bytes. The only
specific for DSSP function is PF (Perform text).

In debugging the main principle "one word of text - one word of
code" is in use. A procedure can be recompiled from binary form to
text form. Any word of procedure can be changed by another word,
particularly - by breakpoint. You can call a debugger of OS or load a
DSSP-disassembler to see the text of primitives. I.e. DSSP is the
open system. May be, the most open in the world.

After debugging the state of system can be saved by command:

       SAVE FILE.EXT  phrase

where "phrase" - a set of words to execute after run FILE.EXT.

             6. AS IT WORKS

      Let us write a small program. Let "U:" marks the user text,
"S:" marks the system text.

 S: *
 U: KE E [Clear editor buffer, Edit]

    The next text is created in editor:

  ---------------------------------------------------------
  PROGRAM $LM [ Look memory from a1 to a2, where a1 and a2 -
              numbers in stack]
  : LM [a1,a2] CR C2 - 1+  DO LM1  D [] ;
   : LM1 [a] C . @ . D 1+ CR  [a+1] ;
  : HELP ." a1 a2 LM - dump memory from a1 to a2 "  ;

   UNDEF  [show undefined words]
    ---------------------------------------------------------
    U: ^E           [ leave editor ]
    S: *
    U: OE LM  [ save text in file LM.DSP ]
    S: *
    U: PF   [Perform the text in editor's buffer. Other variant: LOAD LM ]

     Now begin the debugging:

    U:   \L LM1    [recompile the word LM1:
    S:  : LM1
    S:    C         .            @B        .         D
    S:    1+         ;

    U: 1 \P LM1    [Set breakpoint at word 1 of procedure LM1 ]
    U: 100 103 LM  [RUN LM]
    S: Break at C
    U: ..          [ Show stack of operands ]
    S: 1 Pos       [
    0100 ]
    U: \G          [ Go ]
    S: 0100 00BE
      Break at C
    U: ..         [ Show stack of operands ]
    S: 1 Pos      [
       0101 ]     [ The most typical error is the change of the stack
                 depth in cycle. We see that stack is right ]
    U: \C  \G     [Clear breakpoint, Go ]
    S: 0101 003F
       0102 0000
       0103 0089
       *
    U: SAVE lm.exe  HELP

The last command create the executive file, which types a help
message at run.

                          CONCLUSION

DSSP was not invented. It was found. That is why DSSP has not
versions, but only extensions. Forth is created by practice.
DSSP is created by theory. But they are similar and it is a
fact of great importance.


REFERENCES
Sorry, we do not know papers about DSSP in English.


1