Variants on a Simple Program Statement

Begin with a program statement in some high–level language.

Z = X + Y

In the MARIE assembly language, this would be written as follows.

Load         X
Add          Y
Store        Z

The hexadecimal representation of the MARIE machine language might be as follows.

10A2
30BC
202D

How do we get to this hexadecimal representation?  There are a few items to discuss before we really answer this one.
1.     What is an accumulator and what is an accumulator–based machine?
2.     What is assembly language and what is the function of the assembler?

What is an Accumulator?

The textbook definition of an accumulator is completely accurate.  That definition might be expanded a bit.

In a classic accumulator–based architecture, the accumulator, denoted “AC” or “ACC”, is the one
register that holds temporary results from the computation.  It accumulates the results, as in the above
example in which it is used to accumulate the sum.
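As a rough illustration only, the role of the accumulator can be mimicked in a few lines of Python.  The dictionary named memory, the variable ac, and the values 4 and 8 are assumptions made for this sketch; they are not part of the MARIE itself.

```python
# A toy model of a single-accumulator machine (illustrative only).
memory = {"X": 4, "Y": 8, "Z": 0}   # hypothetical named memory cells
ac = 0                              # the one register: the accumulator

ac = memory["X"]        # Load  X : copy X into the accumulator
ac = ac + memory["Y"]   # Add   Y : the sum accumulates in the AC
memory["Z"] = ac        # Store Z : copy the accumulator out to Z

print(memory["Z"])      # 12
```

Every arithmetic result passes through the one register; memory is touched only to fetch an operand or to save the result.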

The single accumulator architecture is an artifact of the days in which all registers were very expensive to
build and took some trouble to maintain.  Architectures with fewer general purpose registers (such as
those with only one) were more reliable.

For a good model of an accumulator, think of the single line display on a pocket calculator.  The older
ones had only one line of digits.  The newer “full screen” units still have a line designated to hold the
latest results.  Think of each of these displays as an output unit copying the contents of a single accumulator.

Early architectures added another register, called “MQ”, to allow for multiplication and division.  The
multiplication of two 16–bit numbers gives a 32–bit result, which is too wide for a single 16–bit accumulator.
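The arithmetic behind the need for a second register can be sketched as follows.  The function name multiply_16 is hypothetical, and which half of the product lands in the accumulator and which in the MQ varies from machine to machine, so the two halves are simply called high and low here.

```python
def multiply_16(a, b):
    """Multiply two unsigned 16-bit values; return (high word, low word)."""
    product = (a & 0xFFFF) * (b & 0xFFFF)   # may need up to 32 bits
    high = (product >> 16) & 0xFFFF         # upper 16 bits of the product
    low = product & 0xFFFF                  # lower 16 bits of the product
    return high, low

high, low = multiply_16(0xFFFF, 0xFFFF)
print(f"{high:04X} {low:04X}")              # FFFE 0001
```

The largest possible product, 0xFFFF times 0xFFFF, is 0xFFFE0001, so two 16-bit registers are needed to hold it.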

Later architectures (except INTEL with the AX, BX, etc.) moved to a number of general purpose registers,
often denoted by number: R0, R1, etc.

What Is a Stored Program Computer?

This is often called a “von Neumann architecture”, after John von Neumann, who was the principal author
of a paper defining the concept of a stored program computer.

The basic idea is simple:
1.     Programs and data are stored in primary computer memory.  Obviously the
programs (and possibly data) are copied into memory from some secondary
storage (think “fixed disk”), but that is not part of the model.

2.     Each instruction is represented in memory as a binary number.  It is fetched
from memory by the “fetch unit” and executed by the “execute unit”.

3.     It is likely that some data are written to a secondary storage or print device
during the execution of the program.

There are a few corollaries to the above.
a)     Each memory location must be individually accessible and associated with
a unique identifier, such as an address.

b)     Each instruction must be stored at a unique identifiable memory location.

c)     Each “variable” must be associated with a unique identifiable memory location.
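A minimal sketch of the stored program idea, written in Python, may help.  Memory is a single indexed list that holds both instructions and data, and a loop fetches a word, splits it into an opcode and an address, and executes it.  The 4-bit opcode and 12-bit address split, and the opcode values (1 for Load, 2 for Store, 3 for Add, 7 for Halt), are those of the MARIE as used later in these notes; the function name run and the memory contents (a program for Z = X + Y with X = 4 and Y = 8) are assumptions for illustration.

```python
# Instructions and data share one memory; each word has its own address.
memory = [0x1004, 0x3005, 0x2006, 0x7000,   # 000-003: the program
          0x0004, 0x0008, 0x0000]           # 004-006: X, Y, Z

def run(memory):
    """Fetch, decode, and execute until a Halt; return the final memory."""
    memory = list(memory)          # work on a copy
    ac, pc = 0, 0                  # accumulator and program counter
    while True:
        word = memory[pc]          # fetch the instruction word
        pc += 1
        opcode, address = word >> 12, word & 0xFFF   # decode
        if opcode == 0x1:          # Load
            ac = memory[address]
        elif opcode == 0x3:        # Add
            ac = (ac + memory[address]) & 0xFFFF
        elif opcode == 0x2:        # Store
            memory[address] = ac
        elif opcode == 0x7:        # Halt
            return memory
        else:
            raise ValueError(f"opcode {opcode:X} not handled in this sketch")

print(run(memory)[0x006])          # 12
```

Nothing in the memory list distinguishes an instruction from a datum; only the program counter determines which words are treated as instructions.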

Consider the sample program from the first slide.

Load         X
Add          Y
Store        Z

Each of these instructions must be placed at a unique address that is easily identifiable.  In very early
machines and very primitive machines (such as the MARIE), the first instruction of the program is
placed at a fixed address, often called “START”.

After placing one instruction, the assembler must compute the address of the next instruction.  For some
architectures, such as the INTEL Pentium™ series, this can be complex, as instructions come in a wide
variety of lengths: 1, 2, 4 bytes, etc.

The MARIE is simple.  All instructions have the same length: one word.  If an instruction is at location
N, the next is at location (N + 1).  Suppose start = 0x000 (hexadecimal).*

000  Load         X
001  Add          Y
002  Store        Z

* REMEMBER: All addresses are 12–bit binary numbers; so three hexadecimal digits.
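Because every MARIE instruction occupies exactly one word, the address computation is just a counter.  The following Python sketch shows this for the three instructions that implement Z = X + Y; the function name assign_addresses is hypothetical.

```python
def assign_addresses(lines, start=0x000):
    """Pair each one-word source line with its address, counting up from start."""
    return [(start + offset, line) for offset, line in enumerate(lines)]

program = ["Load X", "Add Y", "Store Z"]
for address, line in assign_addresses(program):
    print(f"{address:03X}  {line}")
```

On a machine with variable-length instructions, the increment would have to be the length of each instruction rather than a constant 1.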

The assembler must allocate memory locations for each “variable” used in the computation.  In more
complex architectures, this needs to account for the number of bytes allocated for the variable: 2 or 4
bytes for an integer, 4 or 8 bytes for a real, etc.

Again, the simplicity of the MARIE architecture is helpful.  Only integers are used as data items.
Each is exactly 16 bits long; one word per integer.

Let’s expand the program slightly, so that its assembly will make sense.  We have:

000            Load       X
001            Add        Y
002            Store      Z
003            Halt
004     X,   Dec        4
005     Y,   Dec        8
006     Z,   Dec        0
END

Note the label notation, as in “X,”.  It is a symbol followed by a comma.

Assembler, Pass 1

Now we consider a two–pass assembler, which is the “standard variety”.

Pass One:  This identifies the three symbols X, Y, and Z.  It does so by scanning the labels at the
beginning of the lines, where it finds “X,”, “Y,”, and “Z,”.

In a more sophisticated assembler, the declarations for these three labels would identify the type of
the variable: integer, real number, fixed length string, etc.

In the MARIE, each variable is a 16–bit integer.

At the end of Pass 1, the assembler has noted the addresses to be assigned to each “variable” and
enrolled the names in its symbol table, which is used as a part of Pass 2 to associate each label
with its address.

Here is the symbol table generated after Pass 1 for this program.

Label     Address
X         0x004
Y         0x005
Z         0x006
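Pass 1 can be sketched in Python as a single scan with a location counter.  This is an illustration under simplifying assumptions, not the actual MARIE assembler: the name pass_one is hypothetical, every line is taken to occupy one word, a label is recognized only by its trailing comma, and the program is assumed to start at 0x000.

```python
SOURCE = """\
        Load   X
        Add    Y
        Store  Z
        Halt
X,      Dec    4
Y,      Dec    8
Z,      Dec    0
END
"""

def pass_one(source, start=0x000):
    """Scan the source once and return the symbol table {label: address}."""
    symbols = {}
    location = start
    for line in source.splitlines():
        fields = line.split()
        if not fields:
            continue                    # skip blank lines
        if fields[0].upper() == "END":
            break                       # END marks the end of the assembly unit
        if fields[0].endswith(","):     # a label is a symbol followed by a comma
            symbols[fields[0].rstrip(",")] = location
        location += 1                   # every MARIE line occupies one word
    return symbols

for label, address in pass_one(SOURCE).items():
    print(f"{label}  0x{address:03X}")
```

The scan never looks at the operands; it only counts words and records where each label falls.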

Assembler, Pass 2

The output of pass 1 of the assembler may be imagined as follows.  Each item has been assigned a
location and all of the symbols used (think “variables”) have been identified.

000             Load      X
001             Add       Y
002             Store     Z
003             Halt
004      X,   Dec       4
005      Y,   Dec       8
006      Z,   Dec       0
END

Recall the MARIE Instruction Format, shown in Figure 4.10 of the textbook.

Bits 15 – 12     OpCode (4 bits)
Bits 11 – 0      Address (12 bits, represented as 3 hexadecimal digits)

Note that the opcode, being the code that represents the instruction, is a binary number.  It is often
represented in hexadecimal, but almost never in decimal.

Octal notation is occasionally used for the opcodes, but we shall avoid the practice.
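The instruction format amounts to a shift and a mask.  The following Python sketch packs a 4-bit opcode and a 12-bit address into one 16-bit word and splits a word back apart; the function names encode and decode are hypothetical.

```python
def encode(opcode, address=0):
    """Pack a 4-bit opcode and a 12-bit address into one 16-bit word."""
    if not 0 <= opcode <= 0xF:
        raise ValueError("opcode must fit in 4 bits")
    if not 0 <= address <= 0xFFF:
        raise ValueError("address must fit in 12 bits")
    return (opcode << 12) | address

def decode(word):
    """Split a 16-bit word into (opcode, address)."""
    return (word >> 12) & 0xF, word & 0xFFF

print(f"{encode(0x3, 0x005):04X}")   # 3005
print(decode(0x2006))                # (2, 6)
```

Because the opcode occupies exactly one hexadecimal digit and the address exactly three, the hexadecimal form of an instruction can be read off by eye.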

Tracing Pass 2

Let’s trace the second pass of the assembler.  This uses the symbol table to associate an address with
each of the “variables” mentioned in the assembly instructions.

000  Load        X     The opcode for “Load” is binary 0001, or 0x1.
The address for X is 0x004.  The instruction is 0x1004.

001  Add         Y     The opcode for “Add” is binary 0011, or 0x3.
The address for Y is 0x005.  The instruction is 0x3005.

002  Store        Z      The opcode for “Store” is binary 0010, or 0x2.
The address for Z is 0x006.  The instruction is 0x2006.

003  Halt                 The opcode for “Halt” is binary 0111, or 0x7.
There is no operand, so no address part.  The address part of
such an instruction is usually set to 000, so the instruction is 0x7000.

NOTE:    With “zero operand” instructions, such as HALT, only the first hexadecimal digit is significant.
The last three hexadecimal digits can be any value, as they are ignored by the microarchitecture (the
circuits that make the computer “go”).  Each of the following is a valid HALT instruction:

0x7000, 0x7123, 0x7777, 0x7ABC, 0x7FFF, etc.

The entire assembled program can be represented as an indexed array.

Location     Contents
0x000        0x1004
0x001        0x3005
0x002        0x2006
0x003        0x7000
0x004        0x0004
0x005        0x0008
0x006        0x0000
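Both passes can be combined into a short Python sketch that reproduces this memory image.  It is an illustration only, not the real MARIE assembler: the name assemble is hypothetical, only the four opcodes used in this program and the Dec directive are handled, and the program is assumed to start at 0x000.

```python
OPCODES = {"Load": 0x1, "Store": 0x2, "Add": 0x3, "Halt": 0x7}

SOURCE = ["Load X", "Add Y", "Store Z", "Halt",
          "X, Dec 4", "Y, Dec 8", "Z, Dec 0", "END"]

def assemble(source, start=0x000):
    """Return {address: word} for a tiny subset of MARIE assembly."""
    # Pass 1: assign a location to each line and build the symbol table.
    symbols, statements, location = {}, [], start
    for line in source:
        fields = line.split()
        if fields[0].upper() == "END":
            break
        if fields[0].endswith(","):            # strip and record the label
            symbols[fields[0].rstrip(",")] = location
            fields = fields[1:]
        statements.append((location, fields))
        location += 1
    # Pass 2: translate each statement, looking up addresses by label.
    image = {}
    for location, fields in statements:
        if fields[0] == "Dec":                 # pseudo-op: store the value itself
            image[location] = int(fields[1]) & 0xFFFF
        else:
            address = symbols[fields[1]] if len(fields) > 1 else 0
            image[location] = (OPCODES[fields[0]] << 12) | address
    return image

for location, word in assemble(SOURCE).items():
    print(f"0x{location:03X}  0x{word:04X}")
```

The forward references to X, Y, and Z are the reason for two passes: when "Load X" is first seen, the address of X is not yet known.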

Note that each “Dec” (“Decimal”) directive caused the memory word allocated to its variable to be set
to the stated value: 4 for X, 8 for Y, and 0 for Z.  In particular, “Dec 0” sets the word to zero.

Technically “Dec 0” is a pseudo–op, in that it is a directive to the assembler to do something and is
not an executable instruction.  Two other useful pseudo–ops are

END        declares the end of the assembly unit.  Otherwise the assembler gets lost.

          Load      X
          Add       Y
          Store     Z
          Halt
X,        Dec       4
Y,        Dec       8
Z,        Dec       0
END

This is the program as it would be input.

Comment:      One key difference between most assembly languages and high–level languages that are
compiled is that the latter do not require explicit declaration of memory locations, as was done above.
A compiler just requires a type definition, from which it will automatically generate the storage assignments.

The SkipCond Instruction
Disassembly

Consider the following “core dump” of a MARIE assembly language program.

All numbers are represented in hexadecimal.

Note that this program uses an advanced instruction (Clear) to clear the accumulator.

If the execution begins at address 000, what does the program do?

000                 A000

001                 2009

002                 5000

003                 200A

004                 400B

005                 8800

006                 7000

007                 2009

008                 7000

009                 0000

00A                 0000

00B                 0030

Disassembling the Program (Page 1)

To disassemble the hexadecimal code, we must identify and determine the
effect of each machine language instruction.  Let’s do this one instruction at a time.

Look at table 4.7 on page 172 of the textbook to get the definitions.

000          A000

This is an instruction to clear the accumulator.                     Clear

001          2009

This instruction stores the accumulator into an address.
For lack of anything better, I am calling this W009              Store W009

002          5000

Input        Place the input data into the accumulator              Input

003          200A

This stores the contents of the accumulator                           Store W00A

004          400B

This subtracts a value from the accumulator                         Subt W00B

Disassembling the Program (Page 2)

005          8800

This is Skipcond 800.  Skip the next instruction if the AC > 0.        Skipcond 800

006          7000

Halt                                                                                                 Halt

007          2009

Store the accumulator contents into this address                           Store W009

008          7000

Halt.                As there are no branches around this, it is the               Halt
last instruction to be executed.  The next three
words must hold data for the program.

009          0000                                                                                Decimal 0

00A         0000                                                                                Decimal 0

00B         0030                                                                                Decimal 48
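The hand disassembly above can be mechanized.  The Python sketch below is illustrative only: the name disassemble is hypothetical, the table covers just the opcodes that occur in this dump, and the split between code (words 000 to 008) and data (words 009 to 00B) is supplied by hand, since nothing in the dump itself marks it.

```python
MNEMONICS = {0x2: "Store", 0x4: "Subt", 0x5: "Input",
             0x7: "Halt", 0x8: "Skipcond", 0xA: "Clear"}
NO_OPERAND = {"Input", "Halt", "Clear"}

DUMP = [0xA000, 0x2009, 0x5000, 0x200A, 0x400B, 0x8800,
        0x7000, 0x2009, 0x7000, 0x0000, 0x0000, 0x0030]
CODE_WORDS = 9      # assumption: the first nine words are instructions

def disassemble(dump, code_words):
    """Return one line of pseudo-assembly per word of the dump."""
    lines = []
    for location, word in enumerate(dump):
        if location >= code_words:
            text = f"Dec {word}"                     # treat as data
        else:
            name = MNEMONICS[word >> 12]             # top hexadecimal digit
            operand = word & 0xFFF                   # low three digits
            if name in NO_OPERAND:
                text = name
            elif name == "Skipcond":
                text = f"Skipcond {operand:03X}"     # a condition, not an address
            else:
                text = f"{name} W{operand:03X}"      # invented label, as above
        lines.append(f"{location:03X}  {text}")
    return lines

print("\n".join(disassemble(DUMP, CODE_WORDS)))
```

A word such as 0000 could equally be read as an instruction with opcode 0; deciding that it is data requires following the flow of control, as was done by hand.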

Disassembling the Program (Page 3)

We now list the pseudo–assembly language form of the program.  At this
point, we do not have any good names for the variables.

000                      Clear                           // Clear the accumulator

001                      Store            W009      // Store the zero in this location

002                      Input                            // Read from the input device.  Call this N.

003                      Store            W00A     // Store the raw input in this location.

004                      Subt             W00B      // Subtract decimal 48 from the input.

005                      Skipcond     800          // Skip if (N – 48) > 0, or N > 48.

006                      Halt                             // Halt with 0 in W009 if N ≤ 48.

007                      Store            W009      // Store (N – 48) in W009

008                      Halt                             // Halt

009     W009,       Dec              0

00A    W00A,      Dec              0

00B    W00B,      Dec              48            // The ASCII value for the character ‘0’.

This reads in the ASCII value of a digit and stores its numeric value in W009.
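The conclusion can be checked by running the dump in a small Python simulation.  This is a sketch under stated assumptions: the name run is hypothetical, only the opcodes present in this dump are handled, Skipcond is implemented only for the 800 case described above (skip if AC > 0), and ordinary Python integers stand in for 16-bit words.

```python
DUMP = [0xA000, 0x2009, 0x5000, 0x200A, 0x400B, 0x8800,
        0x7000, 0x2009, 0x7000, 0x0000, 0x0000, 0x0030]

def run(memory, inputs):
    """Execute from address 000 until Halt; return the final memory."""
    memory, inputs = list(memory), iter(inputs)
    ac, pc = 0, 0
    while True:
        word = memory[pc]
        pc += 1
        opcode, address = word >> 12, word & 0xFFF
        if opcode == 0xA:                    # Clear
            ac = 0
        elif opcode == 0x2:                  # Store
            memory[address] = ac
        elif opcode == 0x5:                  # Input
            ac = next(inputs)
        elif opcode == 0x4:                  # Subt
            ac = ac - memory[address]
        elif opcode == 0x8 and address == 0x800:   # Skipcond 800
            if ac > 0:
                pc += 1                      # skip the next instruction
        elif opcode == 0x7:                  # Halt
            return memory
        else:
            raise ValueError(f"word {word:04X} not handled in this sketch")

final = run(DUMP, [ord("7")])                # the character '7' is ASCII 55
print(final[0x009], final[0x00A])            # 7 55
```

With the input '0' (ASCII 48) the difference is zero, the skip is not taken, and the program halts at 006 with the initial 0 still in W009, which is again the correct numeric value.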

Sample Program #2

000                                     Input                       // Get a number into the AC

001                                   Store X                     // Store it into location X

002                                    Add X                      // Double it: AC now has two times the value

003                                  Store X2                    // Store the doubled value.

004                                   Add X2                     // Now we have four times the value in AC

005                                  Store X4                    // Store X times 4

006                                   Add X4                     // Now we have eight times the value

007                                  Store X8                    // Just for debugging

008                                   Add X2                     // Now we have ten times the value

009                                   Output                      // Show the answer

00A                                     Halt

00B        X,                          Dec      0

00C        X2,                        Dec      0

00D       X4,                        Dec      0

00E        X8,                        Dec      0

END
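The arithmetic of Sample Program #2 can be mirrored step by step in Python.  The function name times_ten is hypothetical; the local names follow the labels in the program.  The MARIE has no multiply instruction, so the factor of ten is built from additions only.

```python
def times_ten(x):
    """Compute 10 * x using only additions, as the MARIE program does."""
    x2 = x + x        # Add X   : two times the value
    x4 = x2 + x2      # Add X2  : four times the value
    x8 = x4 + x4      # Add X4  : eight times the value
    return x8 + x2    # Add X2  : eight plus two makes ten times the value

print(times_ten(7))   # 70
```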

Sample Program #3

000                                     Input                       // Get a number into the AC

001                                      JnS      By5          // Call a subroutine

002                                     Store     X5            // Store the value

003                                   Output                     // Display the value

004                                      Halt                        // And stop.

005        X5,                        Dec      0

006        By5,                      Hex      0               // Stores the return address

007                                     Store     X               // Store the AC

008                                      Add      X               //  AC now has X · 2.

009                                     Store     X2            //  Store the doubled value

00A                                     Add      X2            // AC now has X · 4.

00B                                      Add      X               // AC now has X · 5.