Variants on a Simple Program Statement

Begin with a program statement in some high–level language.

Z = X + Y

In the MARIE assembly language, this would be written as follows.

Load        X
Add         Y
Store        Z

The hexadecimal representation of the MARIE machine language might be as follows.


How do we get to this hexadecimal representation?  There are a few items to discuss before we really answer this one.
        1.     What is an accumulator and what is an accumulator–based machine?
        2.     What is assembly language and what is the function of the assembler?

What is an Accumulator?

The textbook definition of an accumulator is completely accurate.  That definition might be expanded a bit.

In a classic accumulator–based architecture, the accumulator, denoted “AC” or “ACC”, is the one
register that holds temporary results from the computation.  It accumulates the results, as in the above
example in which it is used to accumulate the sum.

The single accumulator architecture is an artifact of the days in which all registers were very expensive to
build and took some trouble to maintain.  Architectures with fewer general purpose registers (such as
those with only one) were more reliable.

For a good model of an accumulator, think of the single line display on a pocket calculator.  The older
ones had only one line of digits.  The newer “full screen” units still have a line designated to hold the
latest results.  Think of each of these displays as an output unit copying the contents of a single accumulator.

Early architectures added another register, called “MQ” to allow for multiplication and division.  The
multiplication of two 16–bit numbers gives a 32–bit result.
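
As a small illustration of why a second register is needed, the following Python sketch multiplies two 16-bit values and splits the 32-bit product into two 16-bit halves. Which half lands in the accumulator and which in MQ differs between machines; the assignment shown here is an assumption made only for illustration, and the function name is hypothetical.

```python
def multiply_16(a, b):
    """Multiply two unsigned 16-bit values; return (high word, low word)."""
    assert 0 <= a <= 0xFFFF and 0 <= b <= 0xFFFF
    product = a * b                    # may need up to 32 bits
    high = (product >> 16) & 0xFFFF    # assumed here to land in AC
    low = product & 0xFFFF             # assumed here to land in MQ
    return high, low


high, low = multiply_16(0xFFFF, 0xFFFF)
print(f"{high:04X} {low:04X}")         # the two halves of 0xFFFE0001
```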

Later architectures (except INTEL with the AX, BX, etc.) moved to a number of general purpose registers,
often denoted by number: R0, R1, etc.

What Is a Stored Program Computer?

This is often called a “von Neumann architecture”, after John von Neumann, who was the principal author
of a paper defining the concept of a stored program computer.

The basic idea is simple:
        1.     Programs and data are stored in primary computer memory.  Obviously the
                programs (and possibly data) are copied into memory from some secondary
                storage (think “fixed disk”), but that is not part of the model.

        2.     Each instruction is represented in memory as a binary number.  It is fetched
                from memory by the “fetch unit” and executed by the “execute unit”.

        3.     It is likely that some data are written to a secondary storage or print device
                during the execution of the program.

There are a few corollaries to the above.
        a)     Each memory location must be individually accessible and associated with
                a unique identifier, such as an address.

        b)     Each instruction must be stored at a unique identifiable memory location.

        c)     Each “variable” must be associated with a unique identifiable memory location.

What Does the Assembler/Loader Do?

Consider the sample program from the first slide.

Load        X
Add         Y
Store        Z

Each of these instructions must be placed at a unique address that is easily identifiable.  In very early
machines and very primitive machines (such as the MARIE), the first instruction of the program is
placed at a fixed address, often called “START”.

After placing one instruction, the assembler must compute the address of the next instruction.  For some
architectures, such as the INTEL Pentium™ series, this can be complex, as instructions come in a wide
variety of lengths: 1, 2, 4 bytes, etc.

The MARIE is simple.  All instructions have the same length: one word.  If an instruction is at location
N, the next is at location (N + 1).  Suppose start = 0x000 (hexadecimal).*

000  Load        X
001  Add         Y
002  Store        Z

* REMEMBER: All addresses are 12–bit binary numbers; so three hexadecimal digits.

More on the Assembler/Loader

The assembler must allocate memory locations for each “variable” used in the computation.  In more
complex architectures, this needs to account for the number of bytes allocated for the variable: 2 or 4
bytes for an integer, 4 or 8 bytes for a real, etc.

Again, the simplicity of the MARIE architecture is helpful.  Only integers are used as data items.
Each is exactly 16 bits long; one word per integer.

Let’s expand the program slightly, so that its assembly will make sense.  We have:

                                                        000            Load       X
                                                        001            Add        Y
                                                        002            Store      Z
                                                        003            Halt
                                                        004     X,   Dec        4
                                                        005     Y,   Dec        8
                                                        006     Z,   Dec        0

Note the label notation, as in “X,”.  It is a symbol followed by a comma.

Assembler, Pass 1

Now we consider a two pass assembler, which is the “standard variety”.

Pass One:  This identifies the three symbols X, Y, and Z.  It does so by scanning the labels at the
beginning of the lines, where it finds “X,”, “Y,”, and “Z,”.

In a more sophisticated assembler, the declarations for these three labels would identify the type of
the variable: integer, real number, fixed length string, etc.

In the MARIE, each variable is a 16–bit integer.

At the end of Pass 1, the assembler has noted the addresses to be assigned to each “variable” and
enrolled the names in its symbol table, which is used as a part of Pass 2 to associate each label
with its unique address.
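
Pass 1 as described above can be sketched in a few lines of Python. This is a minimal sketch, not the actual MARIE assembler: it assumes one word per source line, a start address of 0x000, and the label convention noted earlier (a symbol followed by a comma). The names SOURCE and pass_one are hypothetical.

```python
SOURCE = [
    "Load X",
    "Add Y",
    "Store Z",
    "Halt",
    "X, Dec 4",
    "Y, Dec 8",
    "Z, Dec 0",
]


def pass_one(lines, start=0x000):
    """Build the symbol table {label: address}; every line occupies one word."""
    symbols = {}
    for offset, line in enumerate(lines):
        first = line.split()[0]
        if first.endswith(","):            # a label is a symbol followed by a comma
            symbols[first[:-1]] = start + offset
    return symbols


symbol_table = pass_one(SOURCE)
for name, address in symbol_table.items():
    print(f"{name}  {address:03X}")
```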

Here is the symbol table generated after Pass 1 for this program.

        Symbol          Address
          X               004
          Y               005
          Z               006

Assembler, Pass 2

The output of pass 1 of the assembler may be imagined as follows.  Each item has been assigned a
location and all of the symbols used (think “variables”) have been identified.

                                                000             Load     X
                                                001             Add       Y
                                                002             Store     Z
                                                003             Halt
                                                004      X,   Dec       4
                                                005      Y,   Dec       8
                                                006      Z,   Dec       0

Recall the MARIE Instruction Format, shown in Figure 4.10 of the textbook.

        Bits 15–12          Bits 11–0
        Opcode (4 bits)     Address (12 bits, represented as 3 hexadecimal digits)

Note that the opcode, being the code that represents the instruction, is a binary number.  It is often
represented in hexadecimal, but almost never in decimal.

Octal notation is occasionally used for the opcodes, but we shall avoid the practice.
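
The two fields of the instruction format can be pulled apart with a shift and a mask. The Python sketch below assumes the layout described above (a 4-bit opcode in the high bits and a 12-bit address in the low bits); the function name is hypothetical.

```python
def split_word(word):
    """Split a 16-bit MARIE word into (opcode, address)."""
    opcode = (word >> 12) & 0xF     # top 4 bits: one hexadecimal digit
    address = word & 0xFFF          # low 12 bits: three hexadecimal digits
    return opcode, address


opcode, address = split_word(0x1004)
print(f"opcode = {opcode:X}, address = {address:03X}")
```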

Tracing Pass 2

Let’s trace the second pass of the assembler.  This uses the symbol table to associate an address with
each of the “variables” mentioned in the assembly instructions.

000  Load        X     The opcode for “Load” is binary 0001, or 0x1 (hexadecimal 1)
                                The address for X is 0x004.  The instruction is 0x1004.

001  Add         Y     The opcode for “Add” is binary 0011, or 0x3.
                                The address for Y is 0x005.  The instruction is 0x3005.

002  Store        Z      The opcode for “Store” is binary 0010, or 0x2.
                                The address for Z is 0x006.  The instruction is 0x2006.

003  Halt                 The opcode for “Halt” is binary 0111, or 0x7.
                                There is no operand, so no address part.  The address part of
                                such an instruction is usually set to 000, so the instruction is 0x7000.

NOTE:    With “zero operand” instructions, such as HALT, only the first hexadecimal digit is significant. 
The last three hexadecimal digits can be any value, as they are ignored by the microarchitecture (the
circuits that make the computer “go”).  Each of the following is a valid HALT instruction:

        0x7000, 0x7123, 0x7777, 0x7ABC, 0x7FFF, etc.
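
The trace above amounts to placing the opcode in the top four bits and the operand's address in the low twelve. The following Python sketch assumes the symbol table produced by Pass 1 and fills the address field of Halt with 000; the names OPCODES, SYMBOLS, and encode are hypothetical.

```python
OPCODES = {"Load": 0x1, "Store": 0x2, "Add": 0x3, "Halt": 0x7}
SYMBOLS = {"X": 0x004, "Y": 0x005, "Z": 0x006}    # as found by Pass 1


def encode(mnemonic, operand=None):
    """Combine a 4-bit opcode with a 12-bit address into one 16-bit word."""
    address = SYMBOLS[operand] if operand is not None else 0x000
    return (OPCODES[mnemonic] << 12) | address


program = [("Load", "X"), ("Add", "Y"), ("Store", "Z"), ("Halt", None)]
words = [encode(mnemonic, operand) for mnemonic, operand in program]
print([f"{word:04X}" for word in words])
```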

The Hexadecimal Program

The entire assembled program can be represented as an indexed array.

        Address         Contents
          000             1004
          001             3005
          002             2006
          003             7000
          004             0004
          005             0008
          006             0000

Note that each “Dec” line caused the memory allocated to the corresponding variable to be set to the
stated value: 4 for X, 8 for Y, and 0 for Z.
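
To see the stored program idea at work, the seven assembled words can be run through a very small fetch–execute loop. This Python sketch is a hedged illustration rather than a full MARIE simulator: it handles only the four opcodes used here, and the function name run is hypothetical.

```python
def run(memory):
    """Execute Load/Store/Add/Halt from address 000; return the final memory."""
    memory = list(memory)               # work on a copy
    pc, ac = 0x000, 0
    while True:
        word = memory[pc]               # fetch
        pc += 1
        opcode, address = word >> 12, word & 0xFFF
        if opcode == 0x1:               # Load
            ac = memory[address]
        elif opcode == 0x2:             # Store
            memory[address] = ac
        elif opcode == 0x3:             # Add, kept to 16 bits
            ac = (ac + memory[address]) & 0xFFFF
        elif opcode == 0x7:             # Halt
            return memory
        else:
            raise ValueError(f"opcode {opcode:X} not handled in this sketch")


final = run([0x1004, 0x3005, 0x2006, 0x7000, 0x0004, 0x0008, 0x0000])
print(f"Z = {final[0x006]:04X}")        # Z holds X + Y
```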

Technically, “Dec” is a pseudo–op, in that it is a directive to the assembler to do something and is
not an executable instruction.  Two other useful pseudo–ops are

        Hex         declares a hexadecimal number

        END        declares the end of the assembly unit.  Otherwise the assembler gets lost.
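
The effect of the Dec and Hex pseudo–ops can be sketched as a conversion from text to one 16-bit word. The Python below assumes that negative decimal values are stored in 16-bit two's complement; the function name is hypothetical.

```python
def directive_word(directive, text):
    """Return the 16-bit word that a Dec or Hex directive places in memory."""
    base = {"Dec": 10, "Hex": 16}[directive]
    return int(text, base) & 0xFFFF     # mask to one 16-bit word


print(f"{directive_word('Dec', '8'):04X}")
print(f"{directive_word('Hex', '30'):04X}")
```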

The Complete Program (Without Addresses)


                                                            Load     X
                                                            Add      Y
                                                            Store    Z
                                                            Halt
                                                X,          Dec      4
                                                Y,          Dec      8
                                                Z,          Dec      0

This is the program as it would be input.


Comment:      One key difference between most assembly languages and high–level languages that are
compiled is that the latter do not require explicit declaration of memory locations, as was done above. 
A compiler just requires a type definition, from which it will automatically generate the storage assignments.

The SkipCond Instruction

Consider the following “core dump” of a MARIE assembly language program.

All numbers are represented in hexadecimal.

Note that this program uses an advanced instruction (Clear) to clear the accumulator.

If execution begins at address 000, what does the program do?

                                         Address          Contents

                                             000                 A000

                                             001                 2009

                                             002                 5000

                                             003                 200A

                                             004                 400B

                                             005                 8800

                                             006                 7000

                                             007                 2009

                                             008                 7000

                                             009                 0000

                                             00A                 0000

                                             00B                 0030

Disassembling the Program (Page 1)

To disassemble the hexadecimal code, we must identify and determine the
effect of each machine language instruction.  Let’s do this one instruction at a time.

Look at table 4.7 on page 172 of the textbook to get the definitions.

        000          A000

        This is an instruction to clear the accumulator.                     Clear

        001          2009

        This instruction stores the accumulator into an address.
        For lack of anything better, I am calling this W009              Store W009

        002          5000

        Input        Place the input data into the accumulator              Input

        003          200A

        This stores the contents of the accumulator                           Store W00A

        004          400B

        This subtracts a value from the accumulator                         Subt W00B

Disassembling the Program (Page 2)

        005          8800

        This is Skipcond 800.  Skip the next instruction if the AC > 0.        Skipcond 800

        006          7000

        Halt                                                                                                 Halt

        007          2009

        Store the accumulator contents into this address                           Store W009

        008          7000

        Halt.                As there are no branches around this, it is the               Halt
                        last instruction to be executed.  The next three
                        words must hold data for the program.

        009          0000                                                                                Decimal 0

        00A         0000                                                                                Decimal 0

        00B         0030                                                                                Decimal 48
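
The instruction-by-instruction lookup above can be mechanized. The Python sketch below uses the standard MARIE opcode assignments and the “W” naming convention adopted in this discussion; only the nine instruction words are passed in, since a disassembler cannot by itself tell data words from instructions. The names MNEMONICS, NO_OPERAND, and disassemble are hypothetical.

```python
MNEMONICS = {
    0x0: "JnS", 0x1: "Load", 0x2: "Store", 0x3: "Add", 0x4: "Subt",
    0x5: "Input", 0x6: "Output", 0x7: "Halt", 0x8: "Skipcond",
    0x9: "Jump", 0xA: "Clear",
}
NO_OPERAND = {"Input", "Output", "Halt", "Clear"}


def disassemble(word):
    """Turn one 16-bit instruction word into pseudo-assembly text."""
    opcode, operand = word >> 12, word & 0xFFF
    name = MNEMONICS[opcode]
    if name in NO_OPERAND:
        return name                          # address field is ignored
    if name == "Skipcond":
        return f"Skipcond {operand:03X}"     # operand is a condition code
    return f"{name} W{operand:03X}"          # operand is an address


dump = [0xA000, 0x2009, 0x5000, 0x200A, 0x400B, 0x8800, 0x7000, 0x2009, 0x7000]
for address, word in enumerate(dump):
    print(f"{address:03X}  {word:04X}  {disassemble(word)}")
```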


Disassembling the Program (Page 3)

We now list the pseudo–assembly language form of the program.  At this
point, we do not have any good names for the variables.

        000                      Clear                           // Clear the accumulator

        001                      Store            W009      // Store the zero in this location

        002                      Input                            // Read from the input device.  Call this N.

        003                      Store            W00A     // Store the raw input in this location.

        004                      Subt             W00B      // Subtract decimal 48 from the input.

        005                      Skipcond     800          // Skip if (N – 48) > 0, or N > 48.

        006                      Halt                             // Halt with 0 in W009 if N ≤ 48.

        007                      Store            W009      // Store (N – 48) in W009

        008                      Halt                             // Halt

        009     W009,       Dec              0

        00A    W00A,      Dec              0

        00B    W00B,      Dec              48            // The ASCII value for the character ‘0’.

This reads in the ASCII value of a digit and stores its numeric value in W009.
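
The conclusion can be checked by executing the core dump. The following Python sketch is a minimal simulator covering only the opcodes this program uses, with Skipcond handled only for condition 800 (skip if AC > 0) and the AC treated as a 16-bit two's complement value. The function name and the way input is supplied are assumptions for illustration.

```python
def run_digit_program(input_value):
    """Run the dumped program from address 000; return the final word at 009."""
    memory = [0xA000, 0x2009, 0x5000, 0x200A, 0x400B, 0x8800,
              0x7000, 0x2009, 0x7000, 0x0000, 0x0000, 0x0030]
    pc, ac = 0x000, 0
    while True:
        word = memory[pc]
        pc += 1
        opcode, operand = word >> 12, word & 0xFFF
        if opcode == 0xA:                    # Clear
            ac = 0
        elif opcode == 0x2:                  # Store
            memory[operand] = ac
        elif opcode == 0x5:                  # Input
            ac = input_value & 0xFFFF
        elif opcode == 0x4:                  # Subt
            ac = (ac - memory[operand]) & 0xFFFF
        elif opcode == 0x8 and operand == 0x800:
            signed = ac - 0x10000 if ac & 0x8000 else ac
            if signed > 0:                   # skip the next instruction
                pc += 1
        elif opcode == 0x7:                  # Halt
            return memory[0x009]
        else:
            raise ValueError(f"word {word:04X} not handled in this sketch")


print(run_digit_program(ord("5")))           # ASCII 53 in, numeric value out
```

Inputs at or below 48 take the first Halt and leave 0 in W009.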

Sample Program #2

            Address                            Instruction                  Comment

                000                                     Input                       // Get a number into the AC

                001                                   Store X                     // Store it into location X

                002                                    Add X                      // Add it back to itself, doubling it.

                003                                  Store X2                    // Store the doubled value.

                004                                   Add X2                     // Now we have four times the value in AC

                005                                  Store X4                    // Store X times 4

                006                                   Add X4                     // Now we have eight times the value

                007                                  Store X8                    // Just for debugging

                008                                   Add X2                     // Now we have ten times the value

                009                                   Output                      // Show the answer

                00A                                     Halt

                00B        X,                          Dec      0

                00C        X2,                        Dec      0

                00D       X4,                        Dec      0

                00E        X8,                        Dec      0
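
The arithmetic in Sample Program #2 can be followed step by step in Python. This sketch mirrors each Store and Add with an ordinary variable; the function name is hypothetical, and the final mask assumes the 16-bit word size.

```python
def times_ten(n):
    """Mirror Sample Program #2: build 10 * n using additions only."""
    ac = n              # Input
    x = ac              # Store X
    ac = ac + x         # Add X    -> 2n
    x2 = ac             # Store X2
    ac = ac + x2        # Add X2   -> 4n
    x4 = ac             # Store X4
    ac = ac + x4        # Add X4   -> 8n
    x8 = ac             # Store X8 (kept only for debugging, as in the program)
    ac = ac + x2        # Add X2   -> 10n
    return ac & 0xFFFF  # Output


print(times_ten(7))
```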


Sample Program #3

            Address                            Instruction                  Comment

                000                                     Input                       // Get a number into the AC

                001                                      JnS      By5          // Call a subroutine

                002                                     Store     X5            // Store the value

                003                                   Output                     // Display the value

                004                                      Halt                        // And stop.

                005        X5,                        Dec      0

                006        By5,                      Hex      0               // Stores the return address

                007                                     Store     X               // Store the AC

                008                                      Add      X               //  AC now has X · 2.

                009                                     Store     X2            //  Store the doubled value

                00A                                     Add      X2            // AC now has X · 4.

                00B                                      Add      X               // AC now has X · 5.

                00C                                    JumpI    By5          // Indirect jump to return.

                00D       X,                          Dec      0

                00E        X2,                        Dec      0
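
The subroutine linkage in Sample Program #3 can be traced with a small Python simulation. The machine words below are a hand assembly of the listing using the standard MARIE opcodes (JnS = 0, Store = 2, Add = 3, Input = 5, Output = 6, Halt = 7, JumpI = C); they are not taken from the source, and the function name is hypothetical.

```python
def run_sample3(input_value):
    """Run Sample Program #3; return (list of outputs, word stored at By5)."""
    memory = [
        0x5000,  # 000  Input
        0x0006,  # 001  JnS   By5
        0x2005,  # 002  Store X5
        0x6000,  # 003  Output
        0x7000,  # 004  Halt
        0x0000,  # 005  X5
        0x0000,  # 006  By5 (receives the return address)
        0x200D,  # 007  Store X
        0x300D,  # 008  Add   X
        0x200E,  # 009  Store X2
        0x300E,  # 00A  Add   X2
        0x300D,  # 00B  Add   X
        0xC006,  # 00C  JumpI By5
        0x0000,  # 00D  X
        0x0000,  # 00E  X2
    ]
    pc, ac, outputs = 0x000, 0, []
    while True:
        word = memory[pc]
        pc += 1
        opcode, operand = word >> 12, word & 0xFFF
        if opcode == 0x0:                    # JnS: save return address, enter body
            memory[operand] = pc
            pc = operand + 1
        elif opcode == 0x2:                  # Store
            memory[operand] = ac
        elif opcode == 0x3:                  # Add
            ac = (ac + memory[operand]) & 0xFFFF
        elif opcode == 0x5:                  # Input
            ac = input_value & 0xFFFF
        elif opcode == 0x6:                  # Output
            outputs.append(ac)
        elif opcode == 0x7:                  # Halt
            return outputs, memory[0x006]
        elif opcode == 0xC:                  # JumpI: jump to the address held at operand
            pc = memory[operand] & 0xFFF
        else:
            raise ValueError(f"word {word:04X} not handled in this sketch")


outputs, return_address = run_sample3(6)
print(outputs, f"{return_address:03X}")
```

After the call, By5 holds 002, the address of the instruction following the JnS, which is where JumpI resumes execution.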