Brief Notes on Forth

Background

Forth is a stack-based concatenative language. A complete Forth implementation is very compact, and can include language extensions that implement a block-based file system with an editor, and a simple operating system with software-based virtual memory.

Historically, Forth has been used for control applications in astronomy, medicine and film-making. Adobe's PostScript page-description language can be considered a version of Forth with types.

The core language of Forth is very compact. Only a few "words" need be implemented directly in machine code. Other words are defined in terms of each other and these core words. In effect the Forth interpreter uses the stack to allow parameter-free procedure definitions that "thread" together the underlying fragments of machine code using the constructs of the language. (This technique is usually called "threaded code"; it is not to be confused with "threads" in the modern sense of concurrent threads of execution in a common context.)

The job of a Forth programmer is to define new words. Ideally each word does something relatively simple, and is easy to debug and understand. It can take some practice to understand the syntax and the effect of operations on the stack.

Basics

Forth is a fully fledged programming environment with some interesting properties. For CE305, we are simply using Forth as a target language for an expression analyser and a small compiler. For these tasks, we need only a small fraction of the language's potential. Because Forth can be run interactively, which helps when debugging the generated program code, it is a more convenient target than the JVM (used by Java) or the CLR (used by C# and other .NET languages).

The Stack

Most operations in Forth work using values on the stack. Operations typically consume values from the stack, and add new values to it.

Numbers by themselves can be seen by Forth as program statements instructing the interpreter to push the corresponding value onto the stack. For example, the Forth statement 35 simply pushes the value 35 onto the stack.

The command .s can be used to inspect the stack non-destructively, which is useful for debugging purposes. The command clearstack does what it says.
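
As a small illustration of the points above, the following lines can be typed into an interactive gForth session. The particular numbers are arbitrary, and the text after each backslash is a comment.

```forth
\ Each number is pushed onto the stack as it is read.
35 7 12
.s            \ shows the stack without changing it: <3> 35 7 12
clearstack    \ empties the stack
.s            \ now shows <0>
```

The number in angle brackets printed by .s is the current depth of the stack.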

Stack operations

Binary operations such as addition and subtraction take the top two values as inputs, and push the output value onto the stack. The word . prints the value on the top of the stack, and removes it. You can inspect the top of the stack non-destructively by first duplicating it using dup, so dup . prints the value at the top of the stack and leaves the stack unchanged.
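
A short interactive sketch of these words, using arbitrary values:

```forth
2 3 + .       \ prints 5
10 4 - .      \ prints 6: the second-from-top value minus the top value
7 dup .       \ prints 7 (the copy); the original 7 is still on the stack
.s            \ shows <1> 7
clearstack
```

Note the operand order for subtraction: the value pushed first is the one subtracted from.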

There are many stack manipulation words in Forth to make it easier to express complex stack operations, and move values around in the top few positions. Some may be native, others are defined.
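
The notes do not list these words, so the selection below is an assumption about which ones are most likely to be useful: swap, over, rot and drop are all standard Forth words. Each line starts from an empty stack and shows the result with .s.

```forth
1 2 swap   .s clearstack    \ <2> 2 1     exchange the top two values
1 2 over   .s clearstack    \ <3> 1 2 1   copy the second value to the top
1 2 3 rot  .s clearstack    \ <3> 2 3 1   bring the third value to the top
1 2 drop   .s clearstack    \ <1> 1       discard the top value
```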

Integers are normally signed values one cell wide: 32 bits on a 32-bit system, and 64 bits on a 64-bit system.

Floating point numbers

gForth natively supports floating point numbers. You are not required to use floating point numbers in the CE305 assignments, but you may wish to extend your expression analyser to handle them.

Forth maintains a second stack for floating point operations. Typically, operations on the floating point stack have the same name as the integer operations, but are prefixed with an f. Floating point literals are identified by the presence of an exponent (1e0), which may be empty if zero (1e). A number written with a decimal point but no exponent, such as 2.0, is not read as a floating point literal. So 2e0 2e0 f+ f. performs 2 + 2 using floating point arithmetic. Operations also include square root (fsqrt), sine (fsin), cosine (fcos), and natural logarithm (fln), as well as stack manipulation words (such as fdup and fswap) and printing (f.). The command f.s can be used to inspect the floating-point stack non-destructively.
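
A minimal interactive sketch of floating point arithmetic in gForth. The values are arbitrary; fdrop (discard the top of the floating point stack) is a standard word that is not mentioned above.

```forth
2e0 2e0 f+ f.     \ prints 4.
9e fsqrt f.       \ prints 3.
1e 2e fswap       \ the floating point stack now holds 2 with 1 on top
f.s               \ inspect the floating point stack without changing it
fdrop fdrop       \ discard both values
```

The integer stack is not touched by any of these words, so .s would show it unchanged throughout.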

Note that you cannot combine floating point values directly with integers. Also, as with all implementations of floating point arithmetic, beware that floating point addition is not associative, and that equality tests may fail unexpectedly due to rounding errors. There are operations for moving values between the integer and floating point stacks (f>d, d>f), but they may not behave in the way you expect: they work with double-cell integers, which occupy two positions on the integer stack, rather than with ordinary single-cell integers.
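
The sketch below shows one way to mix integers and floating point values. It assumes gForth, which also provides s>f and f>s for converting single-cell integers; these two words are not mentioned in the notes above. The standard word s>d widens a single-cell integer to a double-cell one so that d>f can be applied.

```forth
3 s>f 0.5e f* f.      \ 3 is moved to the floating point stack; prints 1.5
9e fsqrt f>s 1 + .    \ 3.0 is moved to the integer stack; prints 4
3 s>d d>f f.          \ widen 3 to a double-cell integer first; prints 3.
```

Using d>f on a single-cell integer without s>d would consume two values from the integer stack, which is the usual source of surprising results.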

Strings

Forth supports strings. You are not required to use them in the CE305 assignments, but you may wish to explore their use in the small compiler project.
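
For those who do want to explore strings, here is a minimal sketch using standard words that are not otherwise covered in these notes. The word s" places the address and length of a string on the stack, type prints such a string, and ." prints a literal string from inside a definition. The word name greet is hypothetical.

```forth
s" Hello from Forth" type cr     \ s" leaves ( address length ); type prints it

: greet ( -- )
  ." Hello again" cr ;           \ ." is used inside a definition

greet
```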

Running Forth

Forth can be run interactively from the command line. The GNU implementation of Forth (gForth) is available on the lab machines under both MS-Windows and Fedora Linux.

If you have a file yoursource.fs that contains a Forth program, you can run it from within an interactive Forth session using include yoursource.fs.

You can also run a Forth program from the command line using gforth yoursource.fs. In this latter case you might want to end your program with bye so that you exit the interpreter.
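
As an illustration, a file yoursource.fs might contain something like the following. The expression evaluated is an arbitrary example, not one taken from the assignments.

```forth
\ yoursource.fs: evaluate 3 * (4 + 5) and print the result
4 5 + 3 * .     \ prints 27
cr
bye             \ exit the interpreter once the program has finished
```

Without the final bye, gforth yoursource.fs prints the result and then stays in the interactive session.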

Global Variables

In general, global variables are avoided in programs written directly using Forth. It is considered better to use the stack whenever possible (which is usually most of the time). There is a distinct notion of local variable, which we will not cover here.

Defining variables

Variables can be declared using VARIABLE your-var. This assigns an address for storing a value.

Retrieving variable values

Entering the name of the variable pushes that address onto the stack. The word @ will retrieve the value stored at the address given by the top of stack value, and push it onto the stack.

Storing variable values

The word ! treats the top of the stack as an address, and stores the next value on the stack at that address. BEWARE: This is potentially dangerous, because nothing checks that the address is one you intended to write to! The word +! adds the second-from-top value to the value stored at the address at the top of the stack. So your-var @ retrieves the value of your-var, 5 your-var ! stores the value 5 in your-var, and 10 your-var +! adds 10 to the value stored in your-var.

Checking variable values

You can check the value of a variable using ?. So your-var ? is equivalent to your-var @ followed by the printing word . (a single full stop).
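
Putting the variable words together, a short interactive sketch might look like this. The variable name counter is hypothetical.

```forth
VARIABLE counter      \ reserve storage named counter
5 counter !           \ store 5 in it
counter @ .           \ fetch and print: prints 5
10 counter +!         \ add 10 to the stored value
counter ?             \ prints 15
```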

Notes

Forth variables are normally one cell wide (32 bits on a 32-bit system, 64 bits on a 64-bit system), falling on word boundaries appropriate for the underlying architecture. Other commands are available for handling larger and smaller sequences of bits. If a memory location is used to store floating point values then the commands f@ and f! should be used in place of @ and !.
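
A minimal sketch of storing a floating point value. It assumes the standard word fvariable, which is not mentioned above and which reserves enough space for one floating point number; the name ratio is hypothetical.

```forth
fvariable ratio       \ reserve storage for one floating point value
1.5e ratio f!         \ store 1.5
ratio f@ 2e f* f.     \ fetch it and double it: prints 3.
```

The address of ratio is still placed on the ordinary integer stack; only the value moves to and from the floating point stack.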

Constants

Forth constants can be defined using value CONSTANT your-constant, so 128 CONSTANT MAXVALUE defines MAXVALUE to be 128. The word MAXVALUE will now instruct the interpreter to push 128 onto the stack.
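
For example, continuing with the MAXVALUE definition from the text:

```forth
128 CONSTANT MAXVALUE
MAXVALUE .            \ prints 128
MAXVALUE 2 / .        \ prints 64
```

Unlike a variable, a constant pushes its value directly, so no @ is needed.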

Defining words

For CE305, it should not be necessary to define new Forth words, but here is how it is done. Essentially you start a definition with : (a colon) and end a definition with ; (a semicolon). The first word after the colon is what is being defined, everything else is the content of the definition. The following defines squared as an operation that squares the value at the top of the stack.

: squared 
  dup * ;

The word dup pushes a copy of the top of stack onto the stack, and * multiplies the two values at the top of the stack, which will be the original value and a copy of it.

It is conventional to use comments to describe the impact of a definition on the top of the stack using a specific short-hand notation. In this case, the integer value n at the top of stack is replaced by n^2. The appropriate comment is ( n -- n^2 ), making the definition

: squared ( n -- n^2 )
  dup * ;
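
Once defined, squared can be used like any other word, including inside further definitions. The word fourth-power below is a hypothetical example.

```forth
: squared ( n -- n^2 )
  dup * ;

7 squared .           \ prints 49

: fourth-power ( n -- n^4 )
  squared squared ;

3 fourth-power .      \ prints 81
```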

Exercises

Using gForth interactively, work out what you need to type to evaluate the following expressions, and print the result.

  1. 2 + 2
  2. 2*5 + 2
  3. 2*(5 + 2)
  4. 3^2
  5. 10/2
  6. 2(3^2 + 10/2)

Now produce a file that contains the program that evaluates one of these expressions, and try to have the gForth interpreter execute the program stored in that file. Files containing Forth source typically use the extension .fs, which stands for "F"orth "S"tream.

Control structures

Conditionals

A conventional conditional of the form if a == 3 then BODY is expressed in Forth as a @ 3 = if BODY then. gForth also accepts endif in place of then, giving a @ 3 = if BODY endif.

A conventional conditional of the form if a == 3 then BODY1 else BODY2 is expressed in Forth as a @ 3 = if BODY1 else BODY2 endif.
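
In gForth, if, else and endif can only be used inside a word definition, so trying them out interactively means wrapping them in a word. The following is a minimal sketch; the word name check-a and the printed messages are hypothetical.

```forth
VARIABLE a

: check-a ( -- )
  a @ 3 = if
    ." a is three"
  else
    ." a is not three"
  endif cr ;

3 a ! check-a         \ prints: a is three
7 a ! check-a         \ prints: a is not three
```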

Loops

While loops of the form while a > 3 BODY can be implemented in Forth using begin a @ 3 > while BODY repeat.
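
As with conditionals, begin, while and repeat must appear inside a word definition in gForth. The sketch below counts a down until the condition a > 3 fails; the word name count-down is hypothetical.

```forth
VARIABLE a

: count-down ( -- )
  begin a @ 3 > while
    a @ .             \ BODY: print the current value
    -1 a +!           \ and decrease a by one
  repeat cr ;

6 a ! count-down      \ prints 6 5 4
```

The condition is re-evaluated before every pass, so the loop body does not run at all if a is already 3 or less.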

Author: CSEE, University of Essex

Date: Spring Term 2010/11