A miniPascal compiler for the E-machine
by Frances Wren Goosey
A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in
Computer Science
Montana State University
© Copyright by Frances Wren Goosey (1993)
Abstract:
This thesis is the third phase in the development of a program animation system called DYNALAB
(DYNAmic LABoratory). DYNALAB is an interactive software system that demonstrates
programming and computer science concepts at an introductory level. The first DYNALAB
development phase was the design of a virtual computer—the E-machine (Education Machine). The
E-machine was designed by Samuel D. Patton and is presented in his Master’s thesis, The E-machine:
Supporting the Teaching of Program Execution Dynamics. In order to facilitate the support of program
animation activities, the E-machine has many unique features, notably the ability to execute in reverse.
The second phase in the development of DYNALAB was the design and implementation of an
E-machine emulator, which is presented in Michael L. Birch’s Master’s thesis, An Emulator for the
E-machine. This thesis presents the design and implementation of a compiler for the E-machine. The
compiler’s source language is miniPascal, which is a subset of ISO Standard Pascal.
The miniPascal compiler was developed using the Unix lex and yacc compiler development tools. It
has successfully generated object files ready for execution on the E-machine. This thesis focuses on the
compilation aspects that are unique to the E-machine architecture and the planned animation
environment. 
A miniPASCAL COMPILER FOR THE
E-MACHINE
by
Frances Wren Goosey
A thesis submitted in partial fulfillment 
of the requirements for the degree
of
Master of Science 
in
Computer Science
Montana State University
Bozeman, Montana
April 1993
APPROVAL
of a thesis submitted by 
Frances Wren Goosey
This thesis has been read by each member of the thesis committee and has 
been found to be satisfactory regarding content, English usage, format, citations, 
bibliographic style, and consistency, and is ready for submission to the College of 
Graduate Studies.
Date Chairperson, Graduate Committee
Approved for the Major Department
Approved for the College of Graduate Studies
Date Graduate Dean
STATEMENT OF PERMISSION TO USE
In presenting this thesis in partial fulfillment of the requirements for a master’s 
degree at Montana State University, I agree that the Library shall make it available 
to borrowers under rules of the Library.
If I have indicated my intention to copyright this thesis by including a copyright 
notice page, copying is allowable only for scholarly purposes, consistent with “fair 
use” as prescribed in the U.S. Copyright Law. Requests for permission for extended 
quotation from or reproduction of this thesis in whole or in parts may be granted 
only by the copyright holder.
ACKNOWLEDGMENTS
This thesis is part of a larger software development project, called 
DYNALAB. The DYNALAB project evolved from an earlier pilot project called 
DYNAMOD [Ross 91], a program animation system that has been used extensively
at Montana State University in introductory Pascal programming classes. 
DYNAMOD was originally developed by Cheng Ng [Ng 82-1, Ng 82-2] and later 
extended and ported to various computing environments by a number of students, 
including Lih-nah Meng, Jim McInerny, Larry Morris, and Dean Gehnert. Experience
with DYNAMOD proved the worth of program animation as a tool for teaching 
and learning programming and computer science concepts. It also provided extensive
insight into the facilities needed in a fully functional program animation system 
and the inspiration for the subsequent DYNALAB project and this thesis.
Many people have contributed to the DYNALAB project. Samuel Patton and 
Michael Birch laid the groundwork for this thesis by designing and implementing the 
underlying virtual machine for DYNALAB in their Masters’ theses. As this thesis is 
being completed, Craig Pratt is developing the animator portion of DYNALAB, and 
Robin Winslett and David Poole are implementing new compilers for the project.
I would like to take this opportunity to thank my graduate committee members, 
Dr. Rockford Ross, Dr. Gary Harkin, and Dr. Year Back Yoo, and the rest of 
the faculty members from the Department of Computer Science for their help and 
guidance during my graduate program. I would also like to thank my thesis advisor, 
Dr. Ross, and DYNALAB team members, David Poole, Craig Pratt, Robin Winslett, 
and Michael Woodring, for their help and suggestions for my thesis.
The original DYNAMOD project was supported by the National Science Foundation,
grant number SPE-8320677. Work on this thesis was also supported in part 
by a grant from the National Science Foundation, grant number USE-9150298.
Contents
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
1. INTRODUCTION
    The DYNALAB System
    Preview
2. THE E-MACHINE
    E-machine Design Considerations
    E-machine Architecture
    E-machine Emulator
    E-machine Object File Sections
        The CODESECTION
        The PACKETSECTION
        The VARIABLESECTION
        The LABELSECTION
        The SOURCESECTION
        The STATSCOPESECTION
        The STRINGSECTION
3. E-MACHINE COMPILATION CONSIDERATIONS
    Program Animation Units and E-code Packets
        Identifying Program Animation Units
        Translating Program Animation Units into E-code Packets
    Generation of the Static Scope Table
    Translating Enumerated Type Variables
    Identifying Critical and Noncritical E-code Instructions
4. THE DESIGN OF THE miniPASCAL COMPILER
    The miniPascal Language
    Overview of the miniPascal Compiler
    Error Detection and Recovery
    Optimization
    The Compiler Modules
        The Main Module
        The Parser Module
            Calls to the Scanner
            Interface to the Symbol Table
            Initiating Semantic Actions
            Providing for Dynamic Scoping
            Translating Animation Units into Packets
            The Lookahead Problem in Animation Unit Translation
            The Semicolon Problem in Animation Unit Translation
            Adjusting an Animation Unit’s Ending Delimiter
            Adjusting an Animation Unit’s Beginning Delimiter
            Adjusting the Starting Memory Address of a Packet
            Adjusting the Ending Memory Address of a Packet
            Fragmented Animation Units
            To Highlight or Not
        The Scanner Module
        The Code Driver Module
        The Semantic Analysis Module
        The PACKET Module
        The SOURCE Module
        The LABEL Module
        The VARIABLE Module
        The STRING Module
        The Error Module
        The Memory Allocation Module
        The Assembly Code Module
        The CODE Module
        The Symbol Table Module
        The STATSCOPE Module
            Generating a Static Scope Block
            The ProcNum Field
            Writing the STATSCOPESECTION
            Example of STATSCOPESECTION Generation
5. CONCLUSIONS AND FUTURE ENHANCEMENTS
    Conclusions
    Future Enhancements
REFERENCES
APPENDICES
    APPENDIX A: THE E-MACHINE INSTRUCTION SET
    APPENDIX B: THE E-MACHINE ADDRESSING MODES
    APPENDIX C: A miniPASCAL COMPILATION EXAMPLE
List of Tables
Table
1. Packet Table Resulting from Compilation of Program Samp1
2. Static Scope Table Resulting from Compilation of Program Samp1
3. Packet Table Resulting from Compilation of Program Increment1
4. Static Scope Table Resulting from Compilation of Program Ftr1
5. Scope Owner Table for Program Samp2
6. Scope Block for Function B in Procedure A in Program Samp2
7. Scope Block for Procedure A in Program Samp2
8. Scope Block for Procedure B in Program Samp2
9. Scope Block for Program Scope in Program Samp2
10. Scope Block for “Bootstrap” Scope in Program Samp2
11. Final Static Scope Table for Program Samp2
12. The E-code LABELSECTION for Program Samp3
13. The E-code VARIABLESECTION for Program Samp3
14. The E-code PACKETSECTION for Program Samp3
15. The E-code STATSCOPESECTION for Program Samp3
List of Figures
Figure
1. The E-machine
2. Source Code for Program Samp1
3. Animation Units Identified in Program Samp1
4. E-code Instructions Resulting from Compilation of Program Samp1
5. Animation Display After Execution of X := I;
6. E-code Instructions Translating N := K + I * J
7. Schematic Diagram of the miniPascal Compiler
8. Code Fragment Illustrating the Semicolon Problem
9. Source Code for Program Increment1
10. E-code Translation of Program Increment1
11. Source Code for Program Increment2
12. E-code Translation of Program Increment2
13. Source Code for a CASE Statement
14. Source Code for Program Payroll1
15. Animation Display After Execution of Program Payroll1
16. String Space’s Relationship with Variable Registers and Data Memory
17. Source Code for Program Payroll2
18. Animation Display After Execution of Program Payroll2
19. The Symbol Table Hash Implementation
20. The Symbol Table Structures
21. The miniPascal Identifier Types
22. The miniPascal Identifier Classes
23. Source Code for Program Ftr1
24. Animation Display After Final Recursive Call of Function Fact
25. Procedure Count Array and Dynamic Scope Stack
26. Source Code for Program Samp2
27. The E-code SOURCESECTION for Program Samp3
28. The E-code STRINGSECTION for Program Samp3
29. The E-code CODESECTION for Program Samp3
30. Animation Display After Constant Declarations in Program Samp3
31. Animation Display Before Calling Procedure InitD in Program Samp3
32. Animation Display at End of Procedure InitD in Program Samp3
CHAPTER 1
INTRODUCTION
The DYNALAB System
This thesis represents the third phase of the ongoing DYNALAB software development
project. DYNALAB is an acronym for DYNAmic LABoratory, and its 
purpose is to support formal computer science laboratories at the introductory undergraduate
level. Students will use DYNALAB to experiment with and explore 
programs and fundamental concepts of computer science. The current objectives of 
DYNALAB include:
o providing students with facilities for studying the dynamics of programming 
language constructs—such as iteration, selection, recursion, parameter passing 
mechanisms, and so forth—in an animated and interactive fashion;
o providing students with capabilities to validate or empirically determine the 
run time complexities of algorithms interactively in the experimental setting 
of a laboratory;
o extending to instructors the capability of incorporating animation into lectures 
on programming and algorithm analysis.
In order to meet these immediate objectives, the DYNALAB project was divided
into four phases. The first phase was the design of a virtual computer, called 
the Education Machine, or E-machine, that would support the animation activities
envisioned for DYNALAB. The two primary technical problems to overcome in the 
design of the E-machine were the incorporation of features for reverse execution and 
provisions for coordination with a program animator. Reverse execution was engineered
into the E-machine to allow students and instructors to repeatedly animate 
sections of a program that were unclear, without requiring that the entire program 
be restarted. Also, since the purpose of DYNALAB is to allow user interaction with 
animated programs, the E-machine had to be designed to be driven by an animator 
system that controls the execution of programs and displays pertinent information 
dynamically in animated fashion on a video screen. This first phase was completed 
by Samuel Patton in his Master’s thesis, The E-machine: Supporting the Teaching 
of Program Execution Dynamics [Patton 89].
The second phase of the DYNALAB project was the implementation of an emulator
for the E-machine. This was accomplished by Michael Birch in his Master’s 
thesis, An Emulator for the E-machine [Birch 90]. As the emulator was implemented,
Birch also included some modifications and extensions to the E-machine.
The third phase of the DYNALAB project, and the subject of this thesis, is 
the design and implementation of a Pascal compiler for the E-machine. The source 
language for the compiler is a subset of ISO Standard Pascal, called miniPascal, and 
the object language is E-code, the machine language of the E-machine. During compiler
development, the E-machine and its emulator were again modified somewhat 
as practical considerations uncovered new design issues.
The fourth phase of the DYNALAB project, currently in progress, is the design 
and implementation of a program animator that will drive the E-machine and display 
miniPascal programs in dynamic, animated fashion under control of the user. Once 
the animator is complete, the first functional version of DYNALAB will be ready 
for use in introductory computer science laboratory and lecture courses by students 
and instructors alike.
The DYNALAB project will not end at this point. Compilers for other programming
languages, such as C, Ada, and Juno (a pseudolanguage used purely for 
teaching [Winslett 93]), are in the initial stages of development. Algorithm animation
(as opposed to program animation; see, for example, [Brown 88-1, Brown 88-2]) 
is also a planned extension to DYNALAB. In fact, the DYNALAB project will likely 
never be finished, as new ideas and pedagogical conveniences are incorporated as 
they become apparent.
Preview
The thesis consists of five chapters and three appendices. Chapter 1 presents 
an overview of the thesis. Since a thorough understanding of the target computer’s 
architecture and instruction set is required for compiler development, a summary 
of the E-machine and its emulator is given in chapter 2. Much of the information 
in chapter 2 is taken from the Patton and Birch theses. During the compiler development
process, it became apparent that several additional E-machine features 
and modifications were necessary or desirable. These changes have been made and 
are so noted in chapter 2. For a more detailed explanation of the E-machine and its 
emulator, the reader is referred to the above-mentioned theses.
Chapter 3 describes the special considerations that E-machine compilers must 
address in order to function within the DYNALAB animation environment. 
Chapter 4 contains a description of the miniPascal compiler. The Pascal subset 
comprising the miniPascal language is presented, followed by an overview of the 
compiler design. Chapter 4 focuses on the solutions to the compilation
considerations unique to the DYNALAB animation environment. The current 
status of the miniPascal compiler is given in chapter 5, which also includes
suggestions for future enhancements.
Since there are many E-code examples used throughout the thesis, appendices 
A and B are included for completeness. Appendix A describes the E-machine instruction
set and appendix B describes the E-machine addressing modes. Both of 
these appendices are adapted from chapter 2 of Birch’s thesis. Appendix C presents 
a complete miniPascal compilation example.
CHAPTER 2
THE E-MACHINE
This chapter is included to provide a description of the E-machine and is adapted 
from chapter 5 of Patton’s thesis [Patton 89] and chapters 1, 2, and 3 of Birch’s thesis 
[Birch 90]. This chapter is a summary and update of information from those two 
theses (much of the material is taken verbatim). New E-machine features that have 
been added as a result of this thesis are noted by a leading asterisk (*).
The E-machine is a virtual computer with its own machine language, called 
E-code. The E-code instructions are described in appendix A; these instructions may 
reference various E-machine addressing modes, which are described in appendix B. 
The E-machine’s task is to execute E-code translations of high level language programs.
The miniPascal language is the first language to be translated into E-code. 
The real purpose of the E-machine is to support the DYNALAB program animation 
system, as described more fully in [Ross 91], [Birch 90], [Ross 93] and in Patton’s 
thesis [Patton 89], where it was called a “dynamic display system”.
E-machine Design Considerations
The fact that the E-machine’s sole purpose is to support program animation 
was central to its design. The E-machine operates as follows. After the E-machine
is loaded with a compiled E-code translation of a high level language program, it 
awaits a call from a driver program (the animator). A call from the animator causes 
a group of E-code instructions, called a packet, to be executed by the E-machine. A 
packet contains the E-code translation of a single high level language construct, or 
animation unit, that is to be highlighted by the animator. An animation unit could 
be a complete high level language assignment statement, for example 
A := X + 2*Y;
which is to be highlighted as a result of a single call from the animator; the corresponding
packet would be the E-code instructions that translate this assignment 
statement. Another animation unit could be just the conditional part of an if statement;
in this case the corresponding packet would be just the E-code instructions 
translating the conditional expression. It is the compiler writer’s responsibility to 
identify the animation units in the source program so that corresponding E-code 
packets can be generated. After the E-machine executes a packet, control is returned
to the animator, which then performs the necessary animation activities 
before repeating the process by again calling the E-machine to execute the packet 
corresponding to the next animation unit. Chapter 3 describes this process in more 
detail.
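The calling pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Packet`, `ToyEMachine`, and `animate` are hypothetical, the instruction strings are opaque placeholders rather than real E-code, and packets are executed in straight-line order, whereas the real E-machine selects the next packet according to program control flow.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    unit_text: str      # source text of the animation unit (hypothetical field)
    instructions: list  # placeholder strings standing in for E-code instructions


class ToyEMachine:
    """Hypothetical stand-in for the E-machine: runs one packet per call."""

    def __init__(self, packets):
        self.packets = packets
        self.next_packet = 0
        self.executed = []  # every placeholder instruction run so far

    def has_more(self):
        return self.next_packet < len(self.packets)

    def execute_packet(self):
        # Execute all instructions of the next packet, then return control.
        packet = self.packets[self.next_packet]
        self.executed.extend(packet.instructions)
        self.next_packet += 1
        return packet


def animate(machine):
    """Driver loop: call the machine, then 'highlight' the unit just executed."""
    highlighted = []
    while machine.has_more():
        packet = machine.execute_packet()
        highlighted.append(packet.unit_text)  # stands in for screen updates
    return highlighted


machine = ToyEMachine([
    Packet("A := X + 2*Y;", ["i1", "i2", "i3"]),  # a whole assignment statement
    Packet("X > 0", ["i4"]),                      # only the condition of an if
])
units = animate(machine)
```

The point of the sketch is the division of labor: the machine runs a whole packet per call, and the driver does its display work only between calls.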
Since the E-machine’s purpose is to enable program execution dynamics of high 
level programming languages to be displayed easily by a program animator, it had 
to incorporate the following:
o structures for easy implementation of high level programming language 
constructs;
o a simple method for implementing functions, procedures, and parameters;
o the ability to execute either forward or in reverse.
The driving force in the design of the E-machine was the requirement for reverse 
execution. The approach taken by the E-machine to accomplish reverse execution
is to save the minimal amount of information necessary to recover just the previous 
E-machine state from the current state in a given reversal step. The E-machine can 
then be restored to an arbitrary prior state by doing the reversal one state at a time 
until the desired prior state is obtained. This one-step-at-a-time reversal means that 
it is necessary only to store successive differences between the previous state and 
the current state, instead of storing the entire state of the E-machine for each step 
of execution.
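As a rough illustration of storing successive differences rather than whole states, the following Python sketch keeps one undo record per forward step and reaches an arbitrary earlier state by reversing one step at a time. The class `ReversibleState` and its methods are hypothetical and are not taken from the E-machine design; the state here is a simple dictionary rather than the E-machine's registers and stacks.

```python
_MISSING = object()  # marks a location that did not exist before a step


class ReversibleState:
    """Toy state whose steps can be undone from a log of differences."""

    def __init__(self, initial):
        self.values = dict(initial)
        self.undo_log = []  # one (name, previous value) record per forward step

    def step(self, name, new_value):
        # Save only what this step is about to destroy, not the whole state.
        self.undo_log.append((name, self.values.get(name, _MISSING)))
        self.values[name] = new_value

    def reverse_step(self):
        # Recover the immediately preceding state from the latest record.
        name, old = self.undo_log.pop()
        if old is _MISSING:
            del self.values[name]
        else:
            self.values[name] = old

    def reverse_to(self, steps_remaining):
        # An arbitrary prior state is reached one reversal at a time.
        while len(self.undo_log) > steps_remaining:
            self.reverse_step()


state = ReversibleState({"A": 0})
state.step("A", 5)
state.step("B", 1)
state.step("A", 7)
state.reverse_step()   # undoes only the last step
after_one = dict(state.values)
state.reverse_to(0)    # back to the initial state
```

Each record costs space proportional to the change, not to the size of the state, which is the economy the text describes.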
One other aspect of program animation substantially influenced the design of 
the reversing mechanism of the E-machine. Since the animator is meant to animate 
high level language programs, the E-machine actually has to be able to effect reversal
only through high level language animation units in one reversal step, not each 
low level E-machine instruction in the packet that is the translation of an animation 
unit. This observation led to further efficiencies in the design of the E-machine and 
the incorporation of two classes of E-machine code instructions, critical and noncritical.
An E-machine instruction within a packet is classified as critical if it destroys 
information essential to reversing through the corresponding high level language animation
unit; it is classified as noncritical otherwise. For example, in translating the 
animation unit corresponding to an arithmetic assignment statement, a number of 
intermediate values are likely to be generated in the corresponding E-code packet. 
These intermediate values are needed in computing the value on the right-hand side 
of the assignment statement before this value can be assigned to the variable on 
the left-hand side. However, the only value that needs to be restored during reverse
execution as far as the animation unit is concerned is the original value of 
the variable on the left-hand side. The intermediate values computed by various 
E-code instructions are of no consequence. Hence, E-code instructions generating 
intermediate values can be classified as noncritical and their effects ignored during 
reverse execution. It is the compiler writer’s responsibility to produce the correct
E-code (involving critical and noncritical instructions) for reverse execution. However,
it should also be noted that the E-machine has the flexibility to accurately 
execute E-code in reverse, instruction by instruction (rather than a packet at a 
time), by simply designating each E-code instruction as critical.
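The distinction between critical and noncritical instructions can be made concrete with a small Python sketch of the assignment A := X + 2*Y. Everything here is an assumption made for illustration: instructions are modeled as (target, compute, critical) tuples, intermediate results live in named temporaries `t1` and `t2` instead of on the E-machine's evaluation stack, and `run_packet` and `reverse_packet` are hypothetical helpers.

```python
def run_packet(memory, packet, save_stack):
    """Execute a packet forward, saving old values only for critical steps."""
    saved = []
    for target, compute, critical in packet:
        if critical:
            # A critical instruction destroys a value the reversal will need.
            saved.append((target, memory[target]))
        memory[target] = compute(memory)
    save_stack.append(saved)  # one group of saved values per packet


def reverse_packet(memory, save_stack):
    """Undo the most recent packet as a single reversal step."""
    for target, old in reversed(save_stack.pop()):
        memory[target] = old


# Packet for A := X + 2*Y; only the final store into A is critical.
packet = [
    ("t1", lambda m: 2 * m["Y"], False),        # intermediate value
    ("t2", lambda m: m["X"] + m["t1"], False),  # intermediate value
    ("A", lambda m: m["t2"], True),             # overwrites the old value of A
]

memory = {"X": 3, "Y": 4, "A": 0}
save_stack = []
run_packet(memory, packet, save_stack)
value_after_forward = memory["A"]
saved_after_forward = list(save_stack)
reverse_packet(memory, save_stack)
```

Marking every tuple as critical would save a value at every instruction, which corresponds to the instruction-by-instruction reversal mentioned at the end of the paragraph above.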
E-machine Architecture
Figure 1 shows the logical structure of the E-machine. A stack-based architecture 
was chosen for the E-machine; however, a number of components that are not found 
in real stack-based computers were included.
Program memory contains the E-code program currently being executed by the 
E-machine. Program memory is loaded with the instruction stream found in the 
CODESECTION of the E-machine object code file, which is described later in this 
chapter. The program counter contains the address in program memory of the next 
E-code instruction to be executed. The previous program counter, needed for reverse 
execution, contains the address in program memory of the most recently executed 
E-code instruction.
Packet memory contains information about the translated E-code packets and 
their corresponding source language animation units. Packet memory, which is 
loaded with the information found in the PACKETSECTION of the E-machine object
code file, essentially effects the “packetization” of the E-code program found 
in program memory. Packet information includes the starting and ending line and 
column numbers of the original source program animation unit (e.g., an entire assignment
statement, or just the conditional expression in an if statement) whose 
translation is the packet of E-code instructions about to be executed. Other packet 
information includes the starting and ending program memory addresses for the 
E-code packet, which are used internally to determine when execution of the packet
is complete. The packet register contains the packet memory address of the packet 
information corresponding to either the next packet to be executed, or the packet 
that is currently being executed.
[Figure 1: The E-machine. Diagram not reproduced. Components labeled in the
diagram include the label registers and label stacks, the evaluation stack register
and evaluation stack, the variable registers and variable stacks, the index register,
the address register, the dynamic scope stack register and dynamic scope stack,
static scope memory, the return address stack register and return address stack,
the save dynamic scope stack register and save dynamic scope stack, the save stack
register and save stack, the program counter, the previous program counter, the
packet register, packet memory, and source memory.]
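A packet memory entry of the kind just described might be modeled as follows. This is a sketch under assumptions, not the PACKETSECTION layout: the field names, the 1-based inclusive line and column convention, and the completion test (the program counter leaving the packet's address range) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketInfo:
    start_line: int  # 1-based source line where the animation unit begins
    start_col: int   # 1-based column of its first character
    end_line: int    # 1-based source line where the unit ends
    end_col: int     # 1-based column of its last character (inclusive)
    start_addr: int  # program memory address of the packet's first instruction
    end_addr: int    # program memory address of the packet's last instruction


def packet_complete(info, program_counter):
    """Assumed rule: the packet is done once the counter leaves its range."""
    return not (info.start_addr <= program_counter <= info.end_addr)


def highlighted_text(source_lines, info):
    """Return the source text an animator would highlight for this packet."""
    lines = source_lines[info.start_line - 1 : info.end_line]
    if len(lines) == 1:
        return lines[0][info.start_col - 1 : info.end_col]
    first = lines[0][info.start_col - 1 :]
    last = lines[-1][: info.end_col]
    return "\n".join([first, *lines[1:-1], last])


source = ["if X > 0 then", "  A := X + 2*Y;"]
condition = PacketInfo(1, 4, 1, 8, start_addr=10, end_addr=14)
assignment = PacketInfo(2, 3, 2, 15, start_addr=15, end_addr=22)
condition_text = highlighted_text(source, condition)
assignment_text = highlighted_text(source, assignment)
```

The two example entries mirror the cases named in the text: one packet for only the conditional expression of an if statement and one for an entire assignment statement.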
The variable registers are an unbounded number of registers that are assigned 
to source program variables, constants, and parameters during compilation of a 
source program into E-code. Each identifier name representing memory in the source 
program will be assigned its own unique variable register in the E-machine. For 
example, in a miniPascal program, a variable named Result might be declared in 
the current program scope and another variable, also named Result, might be
declared in another enclosing procedure scope. The compiler will assign a unique 
variable register to each of these two variables. Once a variable is assigned a variable 
register, the register remains associated with the variable for the duration of the 
program’s compilation and subsequent execution, regardless of whether the variable 
is currently active or not.
The information held in a variable register consists of the corresponding 
variable's size (e.g., number of bytes) as well as a pointer to a corresponding variable 
stack. Each variable stack entry, in turn, holds a pointer into data memory, where 
the actual variable values are stored. The variable stacks are necessary because a 
particular variable may have multiple associated instances due to being declared in 
recursive procedures or functions. In such instances, the top of a particular variable's 
variable stack points to the value of the current instance of the associated variable 
in data memory; the second stack element points to the value of the previous 
instantiation of the variable, and so on. The E-machine's data memory represents the 
usual random access memory found on real computers. The E-machine, however, 
uses data memory only to hold data values (it does not hold any of the program 
instructions).
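
The relationship between a variable register, its variable stack, and data memory can be sketched in C as follows. This is only an illustration, not the emulator's actual code: the type and function names (VarRegister, instantiate, uninstantiate, current), the fixed array bounds, and the bump allocator are all assumptions made for the example, and instances are assumed to be released in the reverse order of their creation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INSTANCES 16   /* hypothetical limit; the E-machine's stacks are unbounded */
#define DATA_WORDS    64   /* hypothetical size of data memory, counted in ints */

/* A variable register: the variable's size plus a stack of data memory addresses. */
typedef struct {
    size_t size;                  /* size of one instance, here counted in ints */
    size_t stack[MAX_INSTANCES];  /* data memory addresses of the live instances */
    int    depth;                 /* number of live instances */
} VarRegister;

int    data_memory[DATA_WORDS];
size_t next_free = 0;             /* simple bump allocator, enough for the sketch */

/* Create a new instance of the variable, e.g. on entry to a recursive call. */
void instantiate(VarRegister *v)
{
    assert(v->depth < MAX_INSTANCES && next_free + v->size <= DATA_WORDS);
    v->stack[v->depth++] = next_free;
    next_free += v->size;
}

/* Discard the current instance; the previous one becomes current again.
   Assumes instances are released in reverse order of creation. */
void uninstantiate(VarRegister *v)
{
    assert(v->depth > 0);
    v->depth--;
    next_free -= v->size;
}

/* The current instance is always the one on top of the variable stack. */
int *current(VarRegister *v)
{
    assert(v->depth > 0);
    return &data_memory[v->stack[v->depth - 1]];
}
```

With one register for a variable Result, two nested calls to instantiate give two separate data memory cells; after uninstantiate, current again yields the cell of the outer invocation with its value intact.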
*The string space component of the E-machine's architecture was added as a 
result of the miniPascal compiler development. The string space contains the values 
of all string literals and enumerated constant names encountered during the 
compilation of a miniPascal program. The string space is loaded with the information 
contained in the STRINGSECTION of the E-machine object file. Currently, this 
string space is used only by the animator when displaying string constant and 
enumerated constant values. A more detailed discussion of the interaction of the string 
space and variable registers is found in chapter 4.
The label registers are another unique component of the E-machine required for 
reverse execution. There are an unbounded number of these registers, and they are 
used to keep track of labeled E-code instructions. Each E-code label instruction 
is assigned a unique label register at compile time. The information held in a label 
register consists of the program memory address of the corresponding E-code label 
instruction as well as a pointer to a label stack. A label stack essentially maintains 
a history of previous instructions that caused a branch to the label represented by 
the label register in question. During reverse execution, the top of the label stack 
allows for correct determination of the instruction that previously caused the branch 
to the label instruction.
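
A minimal C sketch of a label register and its label stack is given below. The names (LabelRegister, branch_forward, branch_reverse) and the fixed history bound are assumptions made for illustration; the sketch also ignores the case in which a label is reached by falling through from the preceding instruction rather than by a branch.

```c
#include <assert.h>

#define MAX_HISTORY 32   /* hypothetical bound; the E-machine's stacks are unbounded */

/* A label register: where the label instruction lives, plus its branch history. */
typedef struct {
    int address;                /* program memory address of the label instruction */
    int history[MAX_HISTORY];   /* addresses of the instructions that branched here */
    int depth;
} LabelRegister;

/* Forward execution of a branch located at address pc:
   record where the branch came from, then continue at the label. */
int branch_forward(LabelRegister *label, int pc)
{
    assert(label->depth < MAX_HISTORY);
    label->history[label->depth++] = pc;
    return label->address;      /* new program counter */
}

/* Reverse execution arriving at the label:
   the top of the label stack names the branch that brought us here. */
int branch_reverse(LabelRegister *label)
{
    assert(label->depth > 0);
    return label->history[--label->depth];
}
```

If two different branch instructions jump to the same label in turn, reversing twice visits them in the opposite order, which is the property reverse execution relies on.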
The index register is found in real computers and serves the same purpose in 
the E-machine. In many circumstances, the data in a variable is accessed directly 
through the appropriate variable register. However, in the translation of a high level 
language data structure, such as an array or record, the address of the beginning 
of the structure is in a variable register; to access an individual data value in the 
structure, an offset—stored in the index register—is used. When necessary, the 
compiler can therefore utilize the index register so that the E-machine can access 
the proper memory location via one of the indexed addressing modes.
The address register is provided to allow access to memory areas that are not 
accessible through variable registers. For example, a pointer in Pascal is a variable 
that contains a data address. Data at that address can be accessed using the address 
register via the appropriate E-machine addressing mode. The address register can 
be used in place of variable registers for any of the addressing modes.
As in many real computers, the results of all arithmetic and logical operations 
are maintained on the evaluation stack; the evaluation stack register keeps track of 
the top of this stack. For example, in an arithmetic operation, the operands are 
pushed onto the evaluation stack and the appropriate operation is performed on 
them. The operands are consumed by the operation and the result is pushed onto 
the top of the stack. An assignment is performed by popping the top value of the 
evaluation stack and placing it into the proper location in data memory.
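
The following C sketch illustrates this stack discipline for an assignment such as N := K + I*J. It is a simplified model with hypothetical names (EvalStack, es_push, es_pop, and so on) and a fixed stack bound; it does not reproduce the E-code that the miniPascal compiler emits for the same statement.

```c
#include <assert.h>

#define STACK_MAX 64   /* hypothetical bound for the sketch */

typedef struct {
    int values[STACK_MAX];
    int top;            /* plays the role of the evaluation stack register */
} EvalStack;

void es_push(EvalStack *s, int v) { assert(s->top < STACK_MAX); s->values[s->top++] = v; }
int  es_pop(EvalStack *s)         { assert(s->top > 0); return s->values[--s->top]; }

/* Binary operations consume two operands and push one result. */
void es_mult(EvalStack *s) { int b = es_pop(s); int a = es_pop(s); es_push(s, a * b); }
void es_add(EvalStack *s)  { int b = es_pop(s); int a = es_pop(s); es_push(s, a + b); }

/* Evaluate N := K + I*J and return the value that would be stored in N. */
int assign_n(int i, int j, int k)
{
    EvalStack s = { {0}, 0 };
    int n;
    es_push(&s, k);
    es_push(&s, i);
    es_push(&s, j);
    es_mult(&s);        /* stack now holds K, I*J */
    es_add(&s);         /* stack now holds K + I*J */
    n = es_pop(&s);     /* the assignment pops the result into N */
    assert(s.top == 0); /* operands and result have all been consumed */
    return n;
}
```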
The return address stack (or call stack) is the E-machine's mechanism for 
implementing procedure and function calls. When a subroutine call is made, the program 
counter plus one is pushed onto the return address stack. Then, when the E-machine 
executes a return from subroutine instruction, all it has to do is load the program 
counter with the top of the return address stack. A pointer to the top of the return 
address stack is kept in the return address stack register.
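
The call and return behavior just described can be sketched in a few lines of C. The names (ReturnStack, do_call, do_return) and the fixed depth are assumptions for the example, the return is assumed to pop the entry it loads, and the bookkeeping needed for reverse execution is left out.

```c
#include <assert.h>

#define CALL_DEPTH_MAX 64   /* hypothetical bound for the sketch */

typedef struct {
    int addresses[CALL_DEPTH_MAX];
    int top;                 /* plays the role of the return address stack register */
} ReturnStack;

/* call: push the address following the call instruction, then jump to the target. */
int do_call(ReturnStack *r, int pc, int target)
{
    assert(r->top < CALL_DEPTH_MAX);
    r->addresses[r->top++] = pc + 1;
    return target;           /* new program counter */
}

/* return: load the program counter from the top of the return address stack. */
int do_return(ReturnStack *r)
{
    assert(r->top > 0);
    return r->addresses[--r->top];
}
```

For instance, a call instruction at address 26 that targets a procedure at address 8 leaves 27 on the stack, and the matching return resumes execution at address 27.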
The save stack contains information necessary for reverse execution. Whenever 
some critical information (as determined by the execution of a critical instruction) is 
about to be destroyed, the required information is pushed onto the save stack. This 
ensures that when backing up, the instruction that most recently destroyed some 
critical information can be reversed by retrieving that critical information from the 
save stack. The save stack registers point to the top and bottom of the save stack.
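
One way to picture the save stack is the C sketch below, in which a critical store saves the value it is about to overwrite and a reverse step restores it. The names (SaveStack, critical_store, reverse_store), the entry layout, and the fixed bound are hypothetical; the emulator's actual save stack format is not described here.

```c
#include <assert.h>

#define SAVE_MAX 64   /* hypothetical bound; the real save stack is unbounded */

/* What is saved for one critical store: the location and its old contents. */
typedef struct {
    int *location;
    int  old_value;
} SaveEntry;

typedef struct {
    SaveEntry entries[SAVE_MAX];
    int top;
} SaveStack;

/* Forward execution of a critical store: save the value about to be destroyed. */
void critical_store(SaveStack *s, int *location, int new_value)
{
    assert(s->top < SAVE_MAX);
    s->entries[s->top].location  = location;
    s->entries[s->top].old_value = *location;
    s->top++;
    *location = new_value;
}

/* Reverse execution of the most recent critical store: restore the saved value. */
void reverse_store(SaveStack *s)
{
    assert(s->top > 0);
    s->top--;
    *s->entries[s->top].location = s->entries[s->top].old_value;
}
```

A noncritical instruction would simply skip the save, which is why marking instructions noncritical where possible reduces the amount of history the machine has to keep.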
*The dynamic scope stack was added to the original E-machine architecture 
as a result of the miniPascal compiler development. The original E-machine 
did not provide a way for the animator to determine (for display) the currently
active program scopes. The animator must be able to display variable values 
associated with the execution of a packet both from within the current 
invocation of a procedure (or function) and from within the calling 
scope(s). That is, the animator must have the ability to illustrate 
a program's run time stack during execution. The Static Scope Table, 
which is loaded into static scope memory from the E-machine object file's 
STATSCOPESECTION, provides the animator with the information relevant to 
the static nature of a program (e.g., information pertaining to variable names local 
to a given procedure). However, the specific calling sequence resulting in a particular 
invocation of a procedure (or function) was not available.
The dynamic scope stack provides the dynamic chain as found in the run time 
stack activation records generated by most conventional compilers. Even though 
the E-machine’s return address stack could be used to hold this information, a 
separate dynamic scope stack was added to the E-machine architecture in order 
to minimize the impact on the existing E-machine and its emulator. At any given 
point during program execution, the dynamic scope stack entries reflect the currently 
active scopes. Each dynamic scope stack entry—corresponding to a program name, 
a procedure name, or a function name—contains the index of the Static Scope 
Table entry describing that name (i.e., a static scope name). Once these indices are 
available, the animator can then use the Static Scope Table information to determine 
the variables whose values must be displayed following the execution of a packet. 
The animator needs access to the entire dynamic scope stack in order to display all 
pertinent data memory information following the execution of any given packet. A 
more detailed discussion of this process is found in chapter 4. The dynamic scope 
stack register points to the top of the dynamic scope stack.
*In order to handle reverse execution, a save dynamic scope stack was added 
to the E-machine architecture. This stack records the history of procedures and/or
functions that have been called and subsequently returned from. The save dynamic 
scope stack register points to the top of this stack.
Finally, source memory holds an array of records, each of which is a copy of a 
line of source code for the compiled program. Source memory is loaded from the 
E-machine object file’s SOURCESECTION at run time and is referenced only by 
the animator for display purposes.
E-machine Emulator
The E-machine emulator was designed and written by Michael Birch and is 
described in his thesis [Birch 90]. The emulator's design essentially follows the design 
of the E-machine presented in the previous sections of this chapter. The emulator 
was written in ANSI Standard C for portability and has been compiled in both 
Turbo C 2.0 and Borland C++ 3.1 by the current author. Within the complete 
DYNALAB environment, the emulator will act as a slave to the program animator, 
executing a packet of E-code instructions upon each call. The current author has 
written a simple DOS animator to drive the emulator in order to test compiled 
miniPascal programs. This animator/emulator has successfully run compiled 
miniPascal programs on several IBM PC compatible computers including 286, 386, and 
486 architectures.
E-machine Object File Sections
The E-machine emulator defines the object file format that must be generated 
by a compiler. As a result of the miniPascal compiler development, several changes 
were made to the original E-machine object file definition and are denoted with a 
leading asterisk (*) in the following discussion. A single E-code object file ready 
for execution on the E-machine consists of seven sections, which may occur in any 
order. Each section is preceded by an object file record containing the section's name 
followed by a record that contains a count of the number of records in that particular 
section. Each of these seven sections (whose names are shown in capital letters) holds 
information which is loaded into a corresponding E-machine component at run time 
as follows:
o the CODESECTION, which is loaded into program memory;
o the PACKETSECTION, which is loaded into packet memory;
o the VARIABLESECTION, which is loaded into the size information associated 
with the variable registers;
o the LABELSECTION, which is loaded into the label program address 
information associated with the label registers;
o the SOURCESECTION, which is loaded into source memory;
o the STATSCOPESECTION, which is loaded into static scope memory;
o the STRINGSECTION, which is loaded into the string space.
The file sections are described below.
The CODESECTION
The CODESECTION contains the translated program, that is, the E-code instruction 
stream. Even though the instruction stream can be thought of as a stream of pseudo 
assembly language instructions, the instructions are actually contained in an array 
of C structures, and are loaded from the CODESECTION into the E-machine's 
program memory at run time. Each E-code instruction structure contains the following 
information:
o an operation code (e.g., push or pop);
o the instruction mode (critical or noncritical);
o the data type of the operand (e.g., I indicates INTEGER); 
o either a numeric data value or an addressing mode.
*The PACKETSECTION
The PACKETSECTION consists of packet structures describing source program 
animation units and their translated E-code packets. These structures are loaded 
into the E-machine’s packet memory at run time. Each packet structure contains 
the following information:
o the packet’s starting and ending E-code instruction addresses in program 
memory;
o the starting and ending line and column numbers in the original source file of 
the program animation unit corresponding to the packet;
o *an index into the current scope block of the Static Scope Table (discussed in 
chapter 3);
o *the program memory address at which the packet may be “fragmented” 
(discussed in chapter 4);
o *a flag indicating whether or not the animator should display information 
when the packet is executed (discussed in chapter 4).
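
As an illustration of the packet structure, the C sketch below declares a struct with one member per item in the list above and a helper that finds the packet containing a given program memory address. The member names, the use of int for every field, and the helper find_packet are assumptions made for the example; the emulator's real declarations may differ.

```c
#include <assert.h>
#include <stddef.h>

/* One packet structure; member names are hypothetical. */
typedef struct {
    int start_addr, end_addr;   /* E-code instruction addresses in program memory */
    int start_line, start_col;  /* position of the animation unit in the source */
    int end_line,   end_col;
    int scope_index;            /* index into the current scope block */
    int frag_addr;              /* address at which the packet may be fragmented */
    int display_packet;         /* nonzero if the animator should display information */
} Packet;

/* Return the index of the packet whose E-code range contains pc, or -1 if none. */
int find_packet(const Packet *table, size_t count, int pc)
{
    size_t i;
    assert(table != NULL);
    for (i = 0; i < count; i++)
        if (pc >= table[i].start_addr && pc <= table[i].end_addr)
            return (int)i;
    return -1;
}
```

Given such a table, the animator can map the packet register's current packet to the source lines and columns it should highlight, since each packet structure carries both the E-code range and the source range.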
The VARIABLESECTION
The VARIABLESECTION consists of structures describing the variable registers 
used by the compiled program. A variable register structure consists of a single field 
that contains the size of the data represented by the register. For example, on a 
DOS machine where the addressable unit is a byte, a variable representing a 32-bit 
integer would have a size of 4. This information is used to initialize size information 
held in the E-machine’s variable registers.
The LABELSECTION
The LABELSECTION consists of label structures describing the label numbers 
generated by the compiled program. A label structure consists of a single field that 
contains the program address at which the corresponding label is defined. This 
information is used to initialize the label program address information held in the 
E-machine’s label registers.
The SOURCESECTION
The SOURCESECTION contains a copy of the source program being executed. 
Each record in this section corresponds to a line of original source code, and is loaded 
into the E-machine’s source memory at run time. Source memory is referenced only 
by the animator for display purposes. The animator references source memory 
via packet memory information that describes correlations between the currently 
executing E-code packet and the corresponding source program animation unit. 
The animator references the packet structure fields that hold starting and ending 
line and column numbers in source memory to determine the animation unit to 
highlight.
*The STATSCOPESECTION
The STATSCOPESECTION was originally named the SYMBOLSECTION in 
Birch’s thesis. It contains a complex structure—the Static Scope Table (called the 
symbol table in Birch’s thesis)—which is used by the animator to determine the 
variable values that should be displayed upon execution of a packet. The name 
was changed to Static Scope Table in order to avoid confusion with the compiler’s 
symbol table. The STATSCOPESECTION records are loaded into the E-machine’s 
static scope memory at run time.
A number of additions and changes were made to the Static Scope Table’s 
structure during miniPascal compiler development. These changes deal primarily with 
making information available so that the animator can display both the dynamic 
and static information that are appropriate at various stages of program execution. 
The Static Scope Table is logically divided into “scope blocks,” each of which 
describes identifiers declared within a single static scope of the source program. A 
more complete discussion of this section is found in chapters 3 and 4. Each Static 
Scope Table entry contains the following information:
o the name of the identifier being described (e.g., a variable name or a procedure 
name);
o upper and lower bounds (for array variables);
o *the index of the Static Scope Table entry containing the next array index 
bounds (for multidimensional arrays);
o the offset value (for record fields);
o an enumerated value indicating the data type (e.g., INTEGER, RECORD, or 
STRING);
o *the record size (for arrays of records);
o a pointer to this entry’s parent Static Scope Entry;
o a pointer to the child of this entry (e.g., if this static scope entry describes a 
procedure, this field would hold the index of the first entry in the static scope 
block describing the variables declared local to the procedure);
o a variable register number (for variable names);
o *a number statically assigned to procedure and function entries; this number 
is used in determining the dynamic scoping level at execution time.
*The STRINGSECTION
The STRINGSECTION, which contains the values of string literals and 
enumerated constant names, was added as a result of miniPascal compiler development. The 
contents of the STRINGSECTION are loaded into the E-machine’s string space at 
run time. The string space allows the animator to have dynamic access to the names 
of an enumerated type as well as the internal numeric values corresponding to the 
names. The animator can also retrieve the values of string constants from the string 
space.
CHAPTER 3
E-MACHINE COMPILATION
CONSIDERATIONS
Many of the compilation concerns confronting E-machine compiler writers are 
the same as those faced by writers of compilers for conventional machines. There 
are, however, several unique factors that must be addressed when compiling for the 
E-machine’s animation environment, including:
o identification and translation of program animation units into E-code packets; 
o generation of the Static Scope Table;
o providing access to names associated with enumerated type variables; 
o identifying critical and noncritical E-code instructions.
Program Animation Units and E-code Packets
As briefly described in chapter 2, the animation of a high level language program 
is accomplished by dividing its source code into program “chunks” called 
animation units. The compiler is responsible for isolating a source program’s animation 
units. Each animation unit, in turn, must be translated into a group—or packet—of 
E-code instructions along with corresponding descriptions of the animation unit and 
its translated E-code packet via a packet structure.
When a high level language program is animated, the animator begins execution 
by displaying the first several lines of the source code and highlighting the first 
animation unit in the program. The animator then awaits a response from the 
user. When the user responds, the animator calls the E-machine to execute the 
currently highlighted animation unit of the program. Actually, what the E-machine 
executes is the packet of instructions corresponding to the animation unit. When 
the E-machine has completed execution of the instructions contained in the packet, 
control is returned to the animator. The animator then performs various animation 
tasks (e.g., displaying pertinent data memory values) and then again awaits a user 
response before repeating this process by highlighting the next animation unit and so 
forth. Thus, two of the challenging tasks facing the compiler designer are identifying 
animation units and properly translating them into E-code packets for successful 
animation. The following two sections present an example program to illustrate 
how the miniPascal compiler accomplishes these two tasks. Although this example 
program posed no particular problems for the compiler, a number of subtle problems 
relative to identifying and translating program animation units were encountered 
during the compiler’s development. These problems and their solutions are discussed 
in detail in chapter 4.
Identifying Program Animation Units
The compiler identifies individual animation units as it is parsing the high 
level language source code. Consider the miniPascal program in figure 2 (the 
numbers on the left correspond to line numbers in the source program file). For this 
program, the miniPascal compiler identifies the nineteen animation units shown 
in figure 3 (the numbers on the left correspond to each animation unit’s 
associated packet structure, as discussed in the next section). These animation units 
will be successively highlighted (in the original source program of figure 2) by the 
animator as it performs the animation of the program. It should be noted that 
the determination of animation units is arbitrary and can vary from one compiler 
to another based on subjective aesthetics of program animation. As can be seen 
from this example, an animation unit can correspond to “chunks” of source code 
representing a single keyword, an entire program statement, the conditional part of 
an if statement, and so forth.

 0  Program Sampl;
 1
 2    VAR
 3      I,J,K:INTEGER;
 4      N:INTEGER;
 5
 6    Procedure Init(VAR X,Y:INTEGER);
 7      BEGIN
 8        X := 1;
 9        Y := 2;
10        END;
11
12    BEGIN
13      Init(I,J);
14      IF I < 10
15        THEN K := 100
16        ELSE K := 0;
17      N := K + I*J
18      END.
Figure 2: Source Code for Program Sampl

 0  Program Sampl;
 1  VAR
 2  I,J,K:INTEGER;
 3  N:INTEGER;
 4  Procedure Init
 5  (VAR X,Y:INTEGER);
 6  BEGIN
 7  X := 1;
 8  Y := 2;
 9  END;
10  BEGIN
11  Init(I,J);
12  IF I < 10
13  THEN
14  K := 100
15  ELSE
16  K := 0
17  N := K + I*J
18  END.
Figure 3: Animation Units Identified in Program Sampl

Translating Program Animation Units into E-code Packets
Once the compiler has identified an animation unit, it must then translate this 
unit into a corresponding packet of E-code instructions along with an associated 
descriptive packet structure. Thus, compilation of the example given in figure 2 
would result in the generation of nineteen E-code packets and nineteen 
corresponding packet structures. Figure 4 shows the pseudo assembly language representation 
of the E-code instructions generated for the miniPascal program shown in figure 2. 
The numbers shown on the left in figure 4 correspond to program memory addresses 
(instruction numbers). The individual packets, corresponding to the animation units 
of figure 3, are shown separated by blank lines in figure 4.
Table 1 shows the array of packet structures, called the Packet Table, 
describing the individual packets resulting from the translation of the program of 
figure 2. The PacketNumber field (column) is included for clarity; it is not part of 
the Packet Table. The first two fields in the Packet Table (StartAddr and EndAddr) 
give the starting and ending addresses in program memory of the E-code packet. 
The next four fields (StartLine, StartCol, EndLine, and EndCol) demark the 
physical location of the packet’s corresponding program animation unit in the source 
program array. The ScopeIndex field in the Packet Table is discussed in the next 
section of this chapter. The final two fields (FragAddr and DisplayPacket) provide 
additional information necessary for animating an animation unit and are discussed 
in chapter 4.

 0  pushd C12
 1  nop

 2  nop

 3  inst c,V0
 4  inst c,V1
 5  inst c,V2

 6  inst c,V3
 7  br 0

 8  label L1
 9  pushd C9

10  link V5
11  link V4

12  nop

13  push I,C1
14  pop c,I,V5

15  push I,C2
16  pop c,I,V4

17  nop
18  unlink V4
19  unlink V5
20  popd
21  return

22  label 0
23  nop

24  pusha V1
25  pusha V2
26  call 1

27  label 2
28  label 3
29  inst c,V6
30  push I,V2
31  push I,C10
32  less c,I
33  pop c,B,V6
34  push B,V6
35  brf c,4

36  nop

37  push I,C100
38  pop c,I,V0
39  br 5

40  label 4
41  nop

42  push I,C0
43  pop c,I,V0

44  label L5
45  inst c,V7
46  push I,V2
47  push I,V1
48  mult c,I
49  pop c,I,V7
50  inst c,V8
51  push I,V0
52  push I,V7
53  add c,I
54  pop c,I,V8
55  push I,V8
56  pop c,I,V3

57  nop
58  uninst c,V8
59  uninst c,V7
60  uninst c,V6
61  uninst c,V3
62  uninst c,V0
63  uninst c,V1
64  uninst c,V2
65  popd
Figure 4: E-code Instructions Resulting from Compilation of Program Sampl

Packet  Start  End   Start  Start  End   End  Scope  Frag  Display
Number  Addr   Addr  Line   Col    Line  Col  Index  Addr  Packet
   0      0      1     0      0      0    14    0     -1    TRUE
   1      2      2     2      2      2     4    0     -1    TRUE
   2      3      5     3      4      3    17    3     -1    TRUE
   3      6      7     4      4      4    13    4     -1    TRUE
   4      8      9     6      2      6    15    0     -1    TRUE
   5     10     11     6     17      6    33    2     -1    TRUE
   6     12     12     7      4      7     8    2     -1    TRUE
   7     13     14     8      6      8    12    2     -1    TRUE
   8     15     16     9      6      9    12    2     -1    TRUE
   9     17     21    10      6     10     9    5     -1    TRUE
  10     22     23    12      2     12     6    5     -1    TRUE
  11     24     26    13      4     13    13    5     -1    TRUE
  12     27     35    14      4     14    12    5     -1    TRUE
  13     36     36    15      6     15     9    5     -1    TRUE
  14     37     39    15     11     15    18    5     -1    TRUE
  15     40     41    16      6     16     9    5     -1    TRUE
  16     42     43    16     11     16    17    5     -1    TRUE
  17     44     56    17      4     17    15    5     -1    TRUE
  18     57     65    18      4     18     7    5     -1    TRUE
Table 1: Packet Table Resulting from Compilation of Program Sampl

Generation of the Static Scope Table
The compiler writer must also provide information describing all of the data 
memory variables that the animator must display. This information is provided in 
the Static Scope Table, a linear array which is, in turn, logically divided into 
numerous scope blocks. Each scope block describes the identifiers (e.g., variable names 
and procedure names) declared in a single static scope in a program. Even though 
this information is obtained from the compiler’s symbol table, the generation of the
Static Scope Table is not a straightforward task due to scope nesting characteristics 
of many high level languages, such as miniPascal.
Table 2 shows the Static Scope Table that is generated as a result of compiling the 
miniPascal program given in figure 2. The Entry (entry number) column, or field, 
is included for clarity—it is not part of the Static Scope Table. This Static Scope 
Table consists of three scope blocks—a block describing the identifiers declared 
within the scope of procedure Init (entries 0-3), a block describing the identifiers 
declared within the scope of program Sampl (entries 4-10), and a “bootstrap” block 
describing the main program entry (entries 11-13).
Entry  IdName  UprBnd  LwrBnd  NxtIdx  Offset  Type       RecSiz  Parent  Child  VarReg  ProcNum
Scope block describing procedure Init
  0            -       -       -       -       HEADER     -       4       -      -       -
  1    X       -       -       -       -       INTEGER    -       -       -      5       -
  2    Y       -       -       -       -       INTEGER    -       -       -      4       -
  3            -       -       -       -       END        -       -       -      -       -
Scope block describing program Sampl
  4            -       -       -       -       HEADER     -       11      -      -       -
  5    I       -       -       -       -       INTEGER    -       -       -      2       -
  6    J       -       -       -       -       INTEGER    -       -       -      1       -
  7    K       -       -       -       -       INTEGER    -       -       -      0       -
  8    N       -       -       -       -       INTEGER    -       -       -      3       -
  9    Init    -       -       -       -       PROCEDURE  -       -       0      -       1
 10            -       -       -       -       END        -       -       -      -       -
Bootstrap scope block
 11            -       -       -       -       HEADER     -       -       -      -       -
 12    Sampl   -       -       -       -       PROGRAM    -       -       4      -       0
 13            -       -       -       -       END        -       -       -      -       -
Table 2: Static Scope Table Resulting from Compilation of Program Sampl
The bootstrap block contains three entries: the HEADER and END entries that 
delimit the scope block and a PROGRAM entry containing information about the 
program itself. There are two fields of interest in the PROGRAM entry; these are the
child pointer field (Child) and the procedure number field (ProcNum). The Child 
field contains the index of the first entry of the scope block describing the identifiers 
declared in the program. The ProcNum field contains a compiler-generated number 
that is used in conjunction with dynamic scoping; this field is discussed in chapter 4.
The entries in the scope block describing the identifiers declared in the 
program scope consist of the HEADER and END delimiter entries as well as entries 
describing each of the scope’s identifiers. The Parent field of the HEADER 
entry in this scope block contains the index of the first entry of the bootstrap scope 
block. This scope block’s PROCEDURE entry, which describes procedure Init, uses the 
Child field, which contains the index of the first entry of the scope block describing 
the identifiers declared in procedure Init. The ProcNum field is also used in the 
PROCEDURE entry; it contains a compiler-generated number to be used in 
conjunction with dynamic scoping.
The entries in the scope block describing the identifiers declared in procedure Init 
consist of the HEADER and END delimiter entries as well as entries describing each 
identifier declared in the scope, in this case the procedure’s parameters. The Parent 
field of the HEADER entry of this scope block contains the index of the first entry 
of the scope block containing the procedure’s declaration.
There must also be some way to relate a high level language program’s dynamic 
nature to the static information found in the Static Scope Table. That is, the 
animator must be able to determine all of the active scopes at any given point during 
execution of the program. The animator can then display the data memory values 
pertinent to the most current scope as well as the data memory values associated 
with the scopes in the calling sequence leading to the most current scope.
The animator retrieves dynamic scoping information from the E-machine’s 
dynamic scope stack. For instance, suppose that the animator has just highlighted 
the animation unit 
X := 1;
in procedure Init. After receiving a response from the user, the animator then 
calls the E-machine to execute the E-code packet corresponding to this animation 
unit. When the E-machine returns control to the animator, the animator must then 
determine the relevant data memory values to be displayed following any changes 
that resulted from execution of the packet. This task is accomplished by querying 
the E-machine’s dynamic scope stack, which contains a history of the active scopes. 
In this example, the dynamic scope stack currently consists of two entries, each 
containing an index into the Static Scope Table. The top entry contains the value 9 
and the bottom entry contains the value 12. These values indicate to the animator 
that procedure Init (Static Scope Table entry number 9) is the most current active 
scope and that program Sampl (entry number 12) is the calling scope. By using the 
child pointers associated with these two Static Scope Table entries, the animator can 
now determine the appropriate data memory values to be displayed. Figure 5 shows 
a possible animation resulting from the execution of this animation unit. The arrow 
(==>) pointing to the instruction Y := 2; indicates where animation proceeds.
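
The lookup described above, from a dynamic scope stack entry through the Child field to the variables of a scope block, can be sketched in C as follows. The sketch uses a reduced form of the Static Scope Table of table 2, keeping only the name, type, and Child fields; the type and function names (ScopeEntry, list_variables) are hypothetical and the code is not taken from the animator.

```c
#include <assert.h>
#include <string.h>

typedef enum { T_HEADER, T_END, T_INTEGER, T_PROCEDURE, T_PROGRAM } EntryType;

/* A reduced Static Scope Table entry: only the fields this sketch needs. */
typedef struct {
    const char *name;
    EntryType   type;
    int         child;   /* index of the first entry of the child scope block, or -1 */
} ScopeEntry;

/* Entries 0-13 of table 2. */
const ScopeEntry scope_table[] = {
    { "",      T_HEADER,    -1 }, { "X", T_INTEGER, -1 },
    { "Y",     T_INTEGER,   -1 }, { "",  T_END,     -1 },
    { "",      T_HEADER,    -1 }, { "I", T_INTEGER, -1 },
    { "J",     T_INTEGER,   -1 }, { "K", T_INTEGER, -1 },
    { "N",     T_INTEGER,   -1 }, { "Init", T_PROCEDURE, 0 },
    { "",      T_END,       -1 },
    { "",      T_HEADER,    -1 }, { "Sampl", T_PROGRAM, 4 },
    { "",      T_END,       -1 },
};

/* Write the names of the variables declared in the scope owned by `entry`
   into `out`, separated by spaces, and return how many were found. */
int list_variables(int entry, char *out, size_t out_size)
{
    int i = scope_table[entry].child;   /* HEADER entry of the scope block */
    int count = 0;
    assert(i >= 0 && scope_table[i].type == T_HEADER && out_size > 0);
    out[0] = '\0';
    for (i = i + 1; scope_table[i].type != T_END; i++) {
        if (scope_table[i].type != T_INTEGER)
            continue;                    /* skip procedure entries and the like */
        assert(strlen(out) + strlen(scope_table[i].name) + 2 <= out_size);
        if (count > 0)
            strcat(out, " ");
        strcat(out, scope_table[i].name);
        count++;
    }
    return count;
}
```

With the dynamic scope stack holding 12 (bottom) and 9 (top), calling list_variables for entry 9 yields the variables of procedure Init, and calling it for entry 12 yields those of program Sampl, which is the information needed to fill the two display windows.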
      Program Sampl;                        Program Sampl
                                              I = 1
      VAR                                     J is undefined
        I,J,K:INTEGER;                        K is undefined
        N:INTEGER;                            N is undefined

      Procedure Init(VAR X,Y:INTEGER);      Procedure Init
        BEGIN                                 X = 1
          X := 1;                             Y is undefined
==>       Y := 2;
          END;

      BEGIN
        Init(I,J);
        IF I < 10
          THEN K := 100
          ELSE K := 0;
        N := K + I*J
        END.
Figure 5: Animation Display After Execution of X := 1;

The ScopeIndex field of the packet structure can now be explained. Suppose 
that the E-machine has completed execution of the packet corresponding to the 
animation unit
I,J,K:INTEGER;
and has returned control to the animator. The animator, via a query of the dynamic 
scope stack, now determines that only the values of the variables contained in the 
outer program scope should be displayed. The variables listed in the block describing 
this scope’s variables are I, J, K, and N. However, at this point in the program’s 
execution, variable N has not yet been declared, and thus should not be displayed. 
The ScopeIndex field of the packet structure associated with the above animation 
unit contains the value 3. This value indicates to the animator that it should only 
display data memory values for entries numbered 0, 1, 2, and 3 in the window 
associated with the most current active scope block. Hence, the animator will 
display the values of the variables I, J, and K (0 stands for the HEADER entry). In 
this case, all of these variables would have the value “undefined,” as they have only 
just been declared and have not yet had values assigned to them.
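
The effect of the ScopeIndex cutoff can be sketched in C as below. The array sampl_block lists the entries of program Sampl's scope block by position within the block (position 0 being the HEADER entry), and visible_names is a hypothetical helper that returns only the entries a packet's ScopeIndex allows; neither name comes from the animator itself.

```c
#include <assert.h>

/* Program Sampl's scope block by position; position 0 is the HEADER entry. */
const char *const sampl_block[] = {
    "(HEADER)", "I", "J", "K", "N", "Init", "(END)"
};

/* Collect the entries of the most current scope block that may be displayed
   for a packet with the given ScopeIndex: positions 1 through scope_index.
   Returns the number of names stored in `out`. */
int visible_names(const char *const *block, int scope_index, const char **out)
{
    int n = 0;
    int pos;
    assert(block != 0 && out != 0);
    for (pos = 1; pos <= scope_index; pos++)   /* position 0 is the HEADER */
        out[n++] = block[pos];
    return n;
}
```

For the packet describing I,J,K:INTEGER; the ScopeIndex is 3, so the helper returns I, J, and K and omits N; for the following packet, whose ScopeIndex is 4, N is included as well.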
Translating Enumerated Type Variables
Ordinarily, only the internal numeric value of an enumerated type variable is 
required in translated object code. It is desirable, however, for program animation 
purposes to have the animator display the enumerated constant name rather than 
just the internal numeric value of a variable of an enumerated type. Thus, when 
translating an enumerated type variable, the compiler must provide a way for the
animator to relate the variable’s internal numeric value to its corresponding constant 
name. This task was accomplished by the addition of the string space to the 
E-machine’s architecture. The string space holds the enumerated constant names 
(as well as string literals) defined in a miniPascal program. The method that the 
miniPascal compiler uses to relate an enumerated type variable’s internal numeric 
value to the appropriate name in the string space is discussed in chapter 4.
Identifying Critical and Noncritical E-code Instructions
The final major E-machine compilation concern is that of identifying the E-code 
instructions that would destroy information that is needed (i.e., critical) for 
successful reverse execution. Since the immediate concern for the miniPascal compiler was 
to produce a usable compiler, the current version of the compiler treats all E-code 
instructions as critical. For example, the animation unit 
N := K + I*J;
in figure 2 corresponds to the packet of E-code instructions numbered 44 through 
56 in figure 4. All of these instructions are marked critical via the “c” operand. 
Only instruction number 56 is actually critical, however, as only it results in critical 
information being destroyed. That is, the old value of N is being destroyed by 
popping a new value into it in instruction 56; for reverse execution, this old value 
of N must be saved. Thus, the packet of E-code instructions corresponding to this 
animation unit could be generated as shown in figure 6, where the operand “n” 
indicates a noncritical instruction.
44  label L5
45  inst n,V7
46  push I,V2
47  push I,V1
48  mult n,I
49  pop n,I,V7
50  inst n,V8
51  push I,V0
52  push I,V7
53  add n,I
54  pop n,I,V8
55  push I,V8
56  pop c,I,V3
Figure 6: E-code Instructions Translating N := K + I*J
CHAPTER 4
THE DESIGN OF THE miniPASCAL
COMPILER
The miniPascal compiler is a one-pass compiler written in ANSI Standard C and 
developed with Borland C++ 3.1 on an IBM PC compatible computer. E-machine 
object files (E-code files) generated by the miniPascal compiler have been tested 
using a simple DOS animator driving the E-machine emulator. Even though the 
capabilities of this animator are quite limited, a significant number of miniPascal 
programs have been compiled, executed, and animated successfully.
The miniPascal Language
The miniPascal language is a subset (with a few noted extensions) of ISO Stan­
dard Pascal as defined in the book Pascal User Manual and Report by Jensen and 
Wirth [Jensen 91]. The following Pascal features are supported by miniPascal: 
• constant, type, and variable declarations;
• procedure and function declarations;
• simple types including integer, real, character, boolean, enumerated types, and
subrange types;
• structured types:
— single and multidimensional arrays,
— strings, including arrays of strings,
— fixed-part records including records whose fields are arrays, records, strings,
or enumerated types (arrays of records are also supported);
• boolean expressions, unary expressions, and infix expressions;
• assignment statements;
• procedure and function calls;
• control statements:
— the if-then and if-then-else statements,
— the while loop,
— the repeat loop,
— the for loop,
— the case statement (with the extension of an others clause).
The following Pascal features are not currently supported in miniPascal:
• records with variant parts;
• the with statement;
• pointers;
• sets;
• labels;
• the goto statement;
• external files;
• the forward directive;
• predeclared functions and procedures;
• procedure or function names as parameters;
• conformant-array parameters.
Overview of the miniPascal Compiler
The miniPascal compiler was developed using the lex and yacc compiler devel­
opment tools [Mason 90]. Lex is a scanner generator written by M.E. Lesk and E. 
Schmidt of Bell Laboratories [Lesk 75] and yacc is a parser generator written by 
S.C. Johnson, also of Bell Laboratories [Johnson 75]. Lex reads a specification file 
of regular expressions identifying the tokens in a language and generates a C mod­
ule containing a scanner for those tokens. Yacc reads a specification file containing 
a context-free grammar (and associated semantic actions) for a language and produces
a C module containing an LALR(1) parser for the language. The basic lex
and yacc specifications for ISO Standard Pascal were obtained from the ftp network 
site primost.cs.wisc.edu. The semantic stack definition and semantic actions were 
then added to these specifications.
Both lex and yacc are standard utilities available on Unix machines. Even though 
there are versions of these utilities available for DOS machines, the lex and yacc 
specifications for miniPascal have been run exclusively on a Unix machine, with the 
resulting C modules being downloaded to a DOS machine. These C modules were 
then compiled and linked with numerous other C modules containing the semantic 
analysis and code generation routines.
The compiler consists of a total of sixteen modules. Figure 7 is a schematic 
diagram showing the interactions among the various modules—the directions of the 
arrows indicate calls to a module. Three of the sixteen modules are omitted from 
the figure for the sake of clarity. These are the Error Message module, the Memory 
Allocation module, and the module that produces a text file containing the pseudo 
assembly language instructions translating the source program (used for compiler 
debugging purposes). A brief description of the compiler operation is given below.
[Figure 7 diagram: boxes for the source file and for the Main, Parser, Scanner, Semantic Analysis, Symbol Table, Code Driver, PACKET, LABEL, VARIABLE, CODE, STRING, SOURCE, and STATSCOPE modules, connected by arrows indicating calls.]
Figure 7: Schematic Diagram of the miniPascal Compiler
After the Main module opens appropriate files, it calls the Parser module, which 
drives the compilation process by requesting tokens from the Scanner module and by 
calling various semantic analysis and code generation routines, notably the Semantic 
Analysis module and the Code Driver module. As can be seen in figure 7, the Symbol 
Table module plays a central role during semantic analysis and code generation.
Seven of the modules are dedicated to producing the E-code object file. These 
modules are:
• the PACKET module, which produces the PACKETSECTION;
• the LABEL module, which produces the LABELSECTION;
• the VARIABLE module, which produces the VARIABLESECTION;
• the CODE module, which produces the CODESECTION;
• the STRING module, which produces the STRINGSECTION;
• the SOURCE module, which produces the SOURCESECTION;
• the STATSCOPE module, which produces the STATSCOPESECTION.
When compilation is complete, control is returned to the Main module, which 
then calls routines in each of these seven E-code production modules in order to 
generate the final E-code file (these calls are not indicated in figure 7). If the 
compiler encounters an error during compilation, a call is made to the Error module 
(omitted from figure 7), which prints an error message and then calls a routine in 
the Main module for immediate termination of compilation.
Error Detection and Recovery
When the compiler detects an error in a miniPascal source file, an appropriate 
message is printed and the compilation is halted. The initial users of this compiler
will be instructors preparing laboratory exercises, not students developing
programs. Thus, minimal error reporting with no recovery was considered to be 
sufficient.
Optimization
There are no provisions for optimization in the compiler. There is no real need 
for optimization in the animation environment, and many optimizations would alter 
the E-code/source language relationship too severely for animation to be successful.
The Compiler Modules
The remainder of this chapter describes the individual compiler modules in 
more detail. The discussion is focused on the role each module plays in the 
generation of the seven sections of the E-code file, giving particular attention to 
those sections that presented problems unique to this compiler. The E-code’s
CODESECTION is essentially the equivalent of the intermediate code files gen­
erated by many compilers; the problems encountered in generating this section were 
the same as would be found in the development of any compiler. The four sections, 
VARIABLESECTION, LABELSECTION, SOURCESECTION, and 
STRINGSECTION, are unique to the E-machine; they, however, posed no 
particular problems and are generated in a straightforward manner. The 
PACKETSECTION, also unique to the E-machine, did present some problems, 
which are discussed below in the Parser Module description. The problems pre­
sented by the STATSCOPESECTION are discussed in the STATSCOPE Mod­
ule description. Another E-code generation problem occurred due to the desire 
to have the animator display an enumerated type variable’s constant name as 
well as its internal numeric value. The solution to this problem was to add the 
STRINGSECTION to the E-machine object file as discussed in the STRING Mod­
ule description.
The Main Module
When the miniPascal compiler is invoked, control passes to the main routine 
in the Main module. The Main module consists of the main routine, routines that 
handle the opening and closing of files, and a routine to handle abnormal end of 
compilation. The Main module opens the miniPascal source file, whose name is
obtained from a command line argument when the compiler is invoked. The Main 
module then creates three files to hold the output from the compilation: the E- 
code (object) file, a file to hold the pseudo assembly language instruction stream 
(for compiler debugging), and a temporary file to hold output from the module 
producing the CODESECTION of the E-code file. Next, the Main module calls 
the yacc-generated yyparse routine in the Parser module to begin the compilation. 
When yyparse returns successfully to the Main module, the compilation is complete, 
and the Main module then calls routines in the code producing modules to write 
the various E-code sections to the E-code file. Finally, the files are closed and the 
compiler exits normally. If a return marking an unsuccessful compilation is made to 
the Main module, the miniPascal source file is closed, the output files are deleted,
and an abnormal exit is indicated.
The Parser Module
As indicated above, the yacc-generated Parser module is responsible for driving 
the compilation process. Yacc produces an LALR(1) parser by processing a specification
file containing a context-free grammar that generates the source language.
Calls to semantic routines, written in C, are interspersed among the grammar pro­
duction rules given in the miniPascal yacc specification. The yacc-generated parser 
maintains a parser-controlled semantic stack, whose records hold information corre­
sponding to each token and non-terminal found in the grammar productions. The 
parser has access to the information in the semantic stack records via pseudo vari­
ables used in the semantic actions. The yacc specification provides a union structure 
to define the different types of semantic stack records necessary to describe the var­
ious semantic information required for each symbol in the grammar. In the case 
of the miniPascal specification, this union structure consists primarily of pointers 
to dynamically allocated structures containing information needed to produce the
E-code for the animation of a miniPascal program. The yacc-generated Parser mod­
ule consists of one very large routine, yyparse. Two small user-supplied supporting 
routines are also included in this module.
Calls to the Scanner. As in conventional compilers, the Parser module requests
the next token from the Scanner module by calling the lex-generated yylex
routine. The Parser module has access to the value of a token through the external
variable yytext, whose value is produced in the Scanner module. Since the miniPascal
language is not case-sensitive, the Scanner converts all letters in an identifier
name token to lower case before returning the token (and its value) to the Parser 
module. The numeric values of integer and real literal tokens are available to the 
Parser module via the external variable yylval, also produced in the Scanner module.
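The case conversion described above can be sketched in C as follows. This is only an illustration: the routine name LowerIdentifier is hypothetical, and the thesis does not say how or where in the Scanner the conversion is coded.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Convert every letter of an identifier's text to lower case, in place.
   The routine name is hypothetical. */
void LowerIdentifier(char *text)
{
    assert(text != NULL);
    for (; *text != '\0'; text++) {
        /* The cast avoids undefined behaviour for negative char values. */
        *text = (char)tolower((unsigned char)*text);
    }
}
```

In a lex specification, a routine of this kind would typically be applied to the token text in the action for the identifier rule, before the token is returned to the parser.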
Interface to the Symbol Table. The Parser module interfaces directly with
the Symbol Table module to enter and retrieve identifier names. The Parser module 
enters and retrieves an identifier’s symbol table attributes by calling routines in the 
Semantic Analysis module.
Initiating Semantic Actions. Many of the semantic actions initiated by the
Parser module are accomplished by calls to routines in the Code Driver module. 
These routines perform further semantic analysis (via calls to routines in the Se­
mantic Analysis module) and then generate code (via calls to the code production 
routines). For example, when the Parser module recognizes an assignment produc­
tion, it calls the GenAssign routine in the Code Driver module. The parameters 
passed to GenAssign are pointers to the semantic stack structures corresponding to 
the symbols involved in the assignment production rule of the grammar. GenAssign 
can then determine whether the assignment is valid, determine what value (if any) to 
load into the index register, and generate appropriate E-code by calling routines in
the code production modules. There are also situations in which the Parser module 
itself can cause E-code generation directly by calling code production routines.
Providing for Dynamic Scoping. The Parser module provides dynamic
scoping information to the E-machine by generating code to manipulate the E- 
machine’s dynamic scope stack. (Static scoping information is contained in the 
Static Scope Table and is discussed in detail in the STATSCOPE module descrip­
tion.) When the Parser module encounters the beginning of a program, function, 
or procedure scope, it calls the GenInstr routine in the CODE module to generate 
the pushd instruction. At run time, the pushd instruction causes an entry to be 
pushed onto the E-machine’s dynamic scope stack. This entry contains the index of 
the scope’s declaration (e.g., procedure name description) in the Static Scope Table; 
this index must be passed as an operand in the pushd instruction. (Recall from the 
discussion of the dynamic scope stack in chapter 2 that this is necessary in order that 
identifiers in calling scopes be accessible at run time.) At this point in the parse, 
however, this index value is not known because the scope’s declaration is “owned” 
by the containing scope, whose Static Scope Table entries will not be generated
until that entire scope has been parsed. This means that the Parser must asso­
ciate a dummy index value with the pushd instruction, and the instruction must be 
“patched” when the actual value becomes available. When the Parser module encounters
the end of a scope, it generates the popd instruction and then calls the
GenStatScopeBlock routine in the STATSCOPE module to generate the Static
Scope Table entries for the scope. When the Parser module finally encounters the
end of a containing scope, the STATSCOPE module can calculate the index of
any nested procedure or function scope declarations and patch their corresponding
pushd instructions via a call to the CODE module.
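The patching of pushd instructions is a form of backpatching, which can be sketched in C as shown below. The sketch is illustrative only: the instruction layout, the DUMMY_INDEX value, and the routine names emit_pushd_placeholder and patch_pushd are assumptions, not the actual CODE module interface.

```c
#include <assert.h>

#define MAX_INSTR   256     /* hypothetical capacity of the code buffer */
#define DUMMY_INDEX (-1)    /* placeholder operand; the value is an assumption */

/* A simplified instruction: opcode name plus one integer operand. */
typedef struct { const char *opcode; int operand; } Instr;

static Instr code[MAX_INSTR];
static int   next_instr = 0;

/* Emit "pushd" with a dummy operand and return its instruction number
   so that the operand can be patched later. */
int emit_pushd_placeholder(void)
{
    assert(next_instr < MAX_INSTR);
    code[next_instr].opcode  = "pushd";
    code[next_instr].operand = DUMMY_INDEX;
    return next_instr++;
}

/* Called once the containing scope has been parsed and the Static Scope
   Table index of the nested scope's declaration is known. */
void patch_pushd(int instr_num, int scope_decl_index)
{
    assert(instr_num >= 0 && instr_num < next_instr);
    assert(code[instr_num].operand == DUMMY_INDEX);
    code[instr_num].operand = scope_decl_index;
}
```

The caller must remember the instruction number returned by emit_pushd_placeholder until the index is available; in the compiler described here that bookkeeping is attributed to the STATSCOPE module.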
Translating Animation Units into Packets. The Parser module controls
the identification and subsequent translation of a miniPascal program’s animation 
units. This translation involves the generation of a packet of E-code instructions 
(via calls to the CODE and Code Driver modules) as well as the construction of 
an associated packet structure describing the animation unit. The Parser calls the 
following routines in the PACKET module to construct a packet structure: 
• StartPacket;
• EndPacket;
• AdjustStartPacket;
• AdjustEndPacket;
• AddPktFragInstr.
A packet structure’s delimiter values (the source file line and column number
boundaries of the animation unit that the packet translates, as well as
the starting and ending program memory addresses of the E-code packet itself) are
determined by the Parser module, which then passes these values to the appropriate
PACKET module routine. The Parser module has access to the source file line and
column number values via external variables that are initialized when the Scanner 
module recognizes a token; the Parser has access to the current program memory 
address (instruction number) via an external variable maintained by the CODE 
module as E-code instructions are generated. When the Parser module recognizes a 
token that marks the beginning of an animation unit, it calls the StartPacket rou­
tine, passing as parameters the source file line and column numbers corresponding to 
the first character of this token as well as (in the general case) the number of the next 
E-code instruction to be generated. The PACKET module maintains an internal 
variable, PktNum, containing the number of the packet structure currently under 
construction. This variable, which serves as the index into the PACKET module’s
array of packet structures, is incremented in the StartPacket routine. Subsequent 
calls to any of the other PACKET module routines listed above refer to the pre­
viously “started” packet structure. Thus, for a given animation unit, StartPacket 
is called only once, while the remaining routines (including EndPacket) may be 
called any number of times while the animation unit’s packet structure is being 
constructed.
In the simplest case, upon recognizing a token that marks the beginning of an
animation unit, the Parser module first calls the StartPacket routine and then gener­
ates any corresponding E-code instructions. As other tokens within the animation 
unit are recognized, the Parser continues to generate E-code instructions. When 
the Parser recognizes the token marking the end of the animation unit, it calls the 
EndPacket routine, passing as parameters the source file line and column numbers 
corresponding to the last character of this token as well as the number of the current 
E-code instruction. For example, the BEGIN keyword is considered to be a complete
packet. Thus, upon recognizing the BEGIN token, the Parser first calls the StartPacket
routine, passing the source file line and column numbers associated with the
letter B as well as the number of the next E-code instruction to be generated. Next,
the Parser generates the E-code nop instruction (via a call to the CODE module).
Finally, the Parser calls the EndPacket routine, passing the source file line and column
numbers associated with the letter N as well as the current E-code instruction
number (corresponding to the nop instruction). There are many cases in which 
an animation unit’s delimiters can be determined in such a straightforward man­
ner; there were, however, a number of subtle animation unit translation problems 
encountered during compiler development.
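The simple case can be sketched in C as follows. StartPacket, EndPacket, and PktNum are named in the text; the parameter lists, the Packet field names, the Packets array, the MAX_PACKETS limit, and the initial PktNum value of -1 are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_PACKETS 128          /* hypothetical capacity */

typedef struct {
    int StartAddr, EndAddr;      /* first and last E-code instruction numbers    */
    int StartLine, StartCol;     /* source position of the unit's first character */
    int EndLine, EndCol;         /* source position of the unit's last character  */
    int FragAddr;                /* -1 when no fragment address has been recorded */
} Packet;

static Packet Packets[MAX_PACKETS];
static int    PktNum = -1;       /* index of the packet under construction */

/* Begin a new packet structure; called once per animation unit. */
void StartPacket(int line, int col, int instr)
{
    PktNum++;
    assert(PktNum < MAX_PACKETS);
    Packets[PktNum].StartLine = line;
    Packets[PktNum].StartCol  = col;
    Packets[PktNum].StartAddr = instr;
    Packets[PktNum].FragAddr  = -1;
}

/* Record the ending delimiters of the current packet structure.
   It may be called repeatedly; the last call overwrites earlier ones. */
void EndPacket(int line, int col, int instr)
{
    assert(PktNum >= 0);
    Packets[PktNum].EndLine = line;
    Packets[PktNum].EndCol  = col;
    Packets[PktNum].EndAddr = instr;
}
```

For a BEGIN keyword on line 5 in columns 4 through 8 that translates to a single nop at instruction 7, the Parser would call StartPacket(5, 4, 7), generate the nop, and then call EndPacket(5, 8, 7).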
The Lookahead Problem in Animation Unit Translation. One of these
problems occurs when the Parser module is assigning source file line and column
numbers to an animation unit. For those tokens that delimit an animation unit, 
the parser calls either StartPacket or EndPacket, passing the token’s line and col­
umn numbers and the appropriate instruction number. The values in the external 
variables containing the line and column numbers, however, are incorrect when the 
Parser module must examine the lookahead token in order to determine which production
to reduce. (Recall that yacc produces an LALR(1) parser that sometimes
requires a one-symbol lookahead for proper parsing actions.) In these cases, the 
current line and column numbers reflect the location of the lookahead token instead 
of the token delineating the animation unit. This problem was solved by identifying
the tokens that were involved in these situations and replacing them with
non-terminals on the right-hand sides of productions. Each new non-terminal be­
comes the left-hand side of a unit production whose right-hand side is just the token 
the non-terminal replaced. During reduction of one of these new unit productions, 
the token’s line and column numbers can be captured and placed in the semantic 
record belonging to the token because no lookahead is required to reduce these unit 
productions. Yacc places this record on the semantic stack, which allows access by 
routines processing the stack. Thus, when a production that has one of the new 
non-terminals on its right-hand side is reduced, the correct values can be retrieved 
from the semantic stack and passed to the PACKET module routines.
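The effect of capturing a token's position during the reduction of a unit production can be imitated in plain C. The sketch below does not use yacc; scan_token stands in for the Scanner updating its external position variables, and reduce_unit_production stands in for the semantic action of a unit production. All names and the PosRecord layout are hypothetical.

```c
#include <assert.h>

/* Position variables updated by the scanner on every token (names hypothetical). */
static int tok_line, tok_col;

/* Semantic record attached to the new non-terminal (layout hypothetical). */
typedef struct { int line, col; } PosRecord;

/* Stand-in for the scanner recognizing a token at the given position. */
void scan_token(int line, int col)
{
    assert(line >= 0 && col >= 0);
    tok_line = line;
    tok_col  = col;
}

/* Stand-in for the action of a unit production such as
       new_nonterminal : TOKEN ;
   The reduction happens before any lookahead token is read, so the
   position variables still describe TOKEN and can be copied safely. */
PosRecord reduce_unit_production(void)
{
    PosRecord rec;
    rec.line = tok_line;
    rec.col  = tok_col;
    return rec;
}
```

If the scanner then reads a lookahead token, the position variables change, but the copy held in the semantic record still gives the location of the delimiting token when the enclosing production is reduced.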
The Semicolon Problem in Animation Unit Translation. Another problem
was easily solved by calling the EndPacket routine more than once for the same
animation unit. This situation can be illustrated by Pascal’s use of semicolons to 
separate, rather than terminate, statements (recall that the semicolon is not part of 
a Pascal statement [Jensen 91]). For example, in the Pascal code fragment shown 
in figure 8, the semicolon at the end of J := 2; is unnecessary. It really separates 
J := 2 from a null statement, and the null statement precedes the END statement.
However, the statement (including the semicolon)
J := 2;
is considered to be an animation unit. The yacc production associated with assignment
statements, however, does not immediately associate the semicolon with the
statement (e.g., J := 2). Rather, an enclosing production eventually accomplishes 
this association. Thus, in this case, the Parser first issues a call to EndPacket, 
passing the source file line and column numbers associated with the token “2”, as
well as the number of the final instruction within the E-code translation of the an­
imation unit. Then, when the Parser reduces the enclosing production recognizing 
the semicolon, the Parser again calls the EndPacket routine, passing the source file 
line and column numbers associated with the token “;”, as well as the current instruction
number (this number will not have changed since no code is generated for
the semicolon). Hence, the animation unit is “ended” correctly. As noted for this 
case, since the statement precedes an END statement, the semicolon is optional. If 
the semicolon is omitted, the Parser performs correctly by calling the EndPacket 
routine only once because the enclosing production in this case has no semicolon.
BEGIN 
I := 1;
J := 2; 
END;
Figure 8: Code Fragment Illustrating the Semicolon Problem
Adjusting an Animation Unit’s Ending Delimiter. There are instances
when an animation unit’s ending delimiters must be adjusted after the Parser has
already ended construction of its associated packet structure. For example, if there 
are procedure declarations following a scope’s variable declarations, the Parser must 
generate an instruction to branch around the code translating the nested procedures 
in order to achieve the correct program flow. During program execution, the an­
imator will need to highlight the animation units corresponding to the variable 
declarations in this routine, then skip the procedure declarations and continue by 
highlighting the body of this routine, demonstrating execution flow. In such cases, 
the Parser has already (correctly) ended the construction of the current packet 
structure—describing the animation unit consisting of the last variable declaration 
in the scope—prior to reaching the first procedure declaration. The branch in­
struction number, however, must now be included in the current packet structure 
as the ending program memory address of the corresponding E-code instruction 
packet to ensure proper animation around the procedure declarations. In this sit­
uation, the packet structure cannot simply be “ended” again, because the current 
source line and column numbers now reflect the location of the beginning of the 
animation unit corresponding to the subsequent procedure declaration. Hence, the 
AdjustEndPacket routine was designed to alter the ending program memory ad­
dress associated with the current packet structure. For the example given above, 
the Parser calls the AdjustEndPacket routine, passing as a parameter the E-code 
instruction number associated with the branch instruction. The Parser then contin­
ues by calling the StartPacket routine to begin construction of the packet structure 
describing the procedure declaration animation unit.
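A minimal C sketch of this adjustment is given below. EndPacket and AdjustEndPacket are named in the text, but their parameter lists, the PacketEnd structure, and the Current variable (standing in for the packet structure under construction) are assumptions.

```c
#include <assert.h>

typedef struct {
    int EndLine, EndCol;   /* source position of the unit's last character */
    int EndAddr;           /* last E-code instruction of the packet        */
    int ended;             /* nonzero once EndPacket has been called       */
} PacketEnd;

static PacketEnd Current;  /* stand-in for the packet under construction */

/* Record all three ending delimiters of the current packet. */
void EndPacket(int line, int col, int instr)
{
    Current.EndLine = line;
    Current.EndCol  = col;
    Current.EndAddr = instr;
    Current.ended   = 1;
}

/* Move only the ending program memory address; the source line and
   column delimiters recorded by EndPacket are left untouched. */
void AdjustEndPacket(int instr)
{
    assert(Current.ended);
    assert(instr >= Current.EndAddr);
    Current.EndAddr = instr;
}
```

For a variable declaration ending at line 2, column 13, whose translation ends at instruction 3 and is followed by a branch at instruction 4, the calls would be EndPacket(2, 13, 3) followed by AdjustEndPacket(4).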
Adjusting an Animation Unit’s Beginning Delimiter. The AdjustStartPacket
routine was provided to support the situation in which an animation
unit is nested within another animation unit. This situation occurs when there is 
a function call within another miniPascal statement. For example, consider the
following statement:
Result := Min(x,y) + 2;
Upon recognizing the assignment production, the Parser issues a call to the 
StartPacket routine to begin construction of the packet structure describing the 
animation unit consisting of the entire statement. For animation purposes, however,
it is desirable to highlight the function call, Min(x,y), separately in order to
illustrate the fact that the function Min must be called before the assignment statement
can be completed (i.e., Min(x,y) should be treated as a separate animation
unit). Thus, when the Parser recognizes the function call, it issues a call to the 
AdjustStartPacket routine, passing the source line and column numbers associated 
with the beginning of the function call. The AdjustStartPacket routine returns 
(via parameters) the previous source line and column numbers associated with the 
original StartPacket call for the current packet structure. The Parser then contin­
ues to control the construction of the current packet structure, which now describes 
only the Min(x,y) portion of the statement, generating an E-code instruction packet
translating Min(x,y). When the Parser completes the processing of the function call
production, it calls the EndPacket routine to end the current packet structure and 
then calls the StartPacket routine to start construction of the next packet structure. 
The line and column numbers passed to this call to StartPacket are the previous 
source line and column numbers returned by the AdjustStartPacket routine; the 
instruction number of the next E-code instruction is also passed to StartPacket. 
Thus, the source code associated with the now current packet structure is the entire 
assignment statement; the E-code packet translating this assignment statement does 
not include the instructions translating the function call.
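The exchange of delimiters performed by AdjustStartPacket can be sketched in C as follows. The routine name comes from the text, which also states that the previous values are returned via parameters; the exact parameter list, the PacketStart structure, and the Current variable are assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int StartLine, StartCol;   /* source position of the unit's first character */
} PacketStart;

static PacketStart Current;    /* stand-in for the packet under construction */

/* Replace the starting source delimiters of the current packet and hand
   the previous ones back to the caller through the pointer parameters. */
void AdjustStartPacket(int line, int col, int *prev_line, int *prev_col)
{
    assert(prev_line != NULL && prev_col != NULL);
    *prev_line = Current.StartLine;
    *prev_col  = Current.StartCol;
    Current.StartLine = line;
    Current.StartCol  = col;
}
```

If the statement Result := Min(x,y) + 2; begins at column 2 of some line, the function call begins at column 12; the Parser would pass column 12 to AdjustStartPacket and later pass the returned column 2 to StartPacket when it starts the packet for the whole assignment.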
Adjusting the Starting Memory Address of a Packet. There is a case
when it is necessary to retain the value of the current E-code instruction number so
that it can be used as the starting program memory address of the packet translating 
the next animation unit. Normally, a packet’s starting program memory address is 
the next instruction number to be generated. However, when an E-code label instruction
is the current instruction, there are situations when this instruction must
become the first instruction in the next E-code packet. Unfortunately, since it is 
not an enclosing production that needs the memory address of the label instruc­
tion, the semantic stack cannot be used to store this value. This problem is solved 
by storing the current instruction number (i.e., the number of the E-code label
instruction) in a global variable, SaveStartInstrNum, which is accessible in the ap­
propriate production. Thus, whenever the Parser recognizes a token that marks 
the beginning of an animation unit, it first queries SaveStartInstrNum for a valid 
instruction number. If the instruction number is valid (i.e., its value is not -1), the
Parser passes this instruction number value to the StartPacket routine and then sets
SaveStartInstrNum’s value to -1. If the value of SaveStartInstrNum is -1, the Parser
passes the number of the next E-code instruction to be generated to StartPacket. 
It should be noted that use of global variables in the yacc parser must be done very 
carefully to ensure that the parser does not alter a variable’s value before it is used.
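The query of SaveStartInstrNum can be sketched in C as follows. The variable name and the use of -1 as the "no saved value" marker come from the text; the helper PacketStartAddr is hypothetical and simply returns the address that would be passed to StartPacket.

```c
#include <assert.h>

/* -1 means that no label instruction number has been saved. */
static int SaveStartInstrNum = -1;

/* Choose the starting program memory address for a new packet:
   the saved label instruction if one is pending, otherwise the number
   of the next instruction to be generated. The helper is hypothetical. */
int PacketStartAddr(int next_instr)
{
    int start = next_instr;
    assert(next_instr >= 0);
    if (SaveStartInstrNum != -1) {
        start = SaveStartInstrNum;
        SaveStartInstrNum = -1;    /* consume the saved value exactly once */
    }
    return start;
}
```

Resetting the variable in the same place where it is read keeps a stale value from being applied to a later, unrelated packet, which is the kind of error the caution about global variables refers to.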
Adjusting the Ending Memory Address of a Packet. Similarly, there is
also a case when it is necessary to retain the value of the current E-code instruction 
number so that it can be used as the ending program memory address of the packet 
currently under construction. This situation arises when the Parser generates an 
E-code br (branch) or call instruction immediately preceding the generation of a
label instruction. Due to the nature of the parse, however, the Parser has not yet
determined that it is time to end construction of the animation unit corresponding 
to the current packet. This decision is made when the next production is processed. 
This next production is not an enclosing production, and thus cannot retrieve the
necessary information from the semantic stack. Here again, a global variable is used. 
The Parser queries this variable, SaveEndInstrNum, before calling the EndPacket 
routine. (This is the same situation in which the succeeding label instruction must
become the starting program address for the next animation unit.)
Fragmented Animation Units. Another problem in translating animation
units occurs when an animation unit becomes “fragmented”. Fragments result when 
parsing either a single conditional statement or a single procedure/function call
that occurs within another conditional statement alone, not within a compound 
(BEGIN/END) statement. This situation is best explained by an example. Con­
sider the miniPascal program in figure 9 (the numbers on the left correspond to 
line numbers in the source file). This example illustrates a single procedure call
statement that occurs within a while loop (line number 11). Figure 10 shows the 
pseudo assembly language representation of the E-code instructions translating program
Increment1. The numbers shown on the left correspond to program memory
addresses (instruction numbers). The individual packets are separated by a blank 
line in the figure.
First, assume that the animator has two options pertaining to when an animation 
unit should be highlighted. One of these options is to highlight an animation unit, 
await a response from the user before executing the corresponding E-code packet 
(so that the user can contemplate what will happen when the animation unit is 
executed), and then rehighlight the same animation unit upon completion of its 
execution (so that the user can ponder where execution will proceed next). This 
is the scenario that has been used in previous examples. However, as the user 
progresses in his understanding of program flow, it would also be desirable to give the 
animator a second option. This option would allow the animator, upon completion 
of the execution of an animation unit, to immediately highlight the next animation
0 PROGRAM Increment1;
1 VAR
2 i: INTEGER;
3
4 PROCEDURE IncrI;
5 BEGIN
6 i:=i+1
7 END;
8
9 BEGIN
10 i:=1;
11 WHILE i < 5 DO IncrI;
12 END.
Figure 9: Source Code for Program Increment1
0 pushd C7
1 nop

2 nop

3 inst c,V0
4 br 0

5 label 1
6 pushd C4

7 nop

8 inst c,V1
9 push I,V0
10 push I,C1
11 add c,I
12 pop c,I,V1
13 push I,V1
14 pop c,I,V0

15 nop
16 uninst c,V1
17 popd
18 return

19 label 0
20 nop

21 push I,C1
22 pop c,I,V0

23 label 2
24 inst c,V2
25 push I,V0
26 push I,C5
27 less c,I
28 pop c,B,V2
29 push B,V2
30 brf c,3

31 nop

32 call 1
33 label 5
34 br 2

35 label 3
36 nop
37 uninst c,V2
38 uninst c,V0
39 popd
Figure 10: E-code Translation of Program Increment1
unit to be executed. The following discussion assumes that the animator is running 
under this second option.
Suppose that the animator has just highlighted the animation unit END; of procedure
IncrI. Upon receiving a response from the user, the animator calls the E-machine
to execute the corresponding E-code packet (consisting of the instructions
numbered 15 through 18). The E-machine returns to the animator when it com­
pletes execution of the packet. The animator now queries the E-machine’s packet 
register in order to determine the next animation unit to be highlighted. Although 
it is not evident from figure 10, the RETURN instruction (number 18) causes control 
to pass to the LABEL instruction (number 33) following the CALL instruction that 
caused control to pass to procedure IncrI. (This is accomplished via the E-machine’s
query of its return address stack, as discussed in chapter 2.) Instruction number 33 
is within the E-code packet translating the animation unit consisting of the call to 
procedure IncrI (instruction numbers 32 through 34). Thus, the E-machine’s packet 
register contains the address of the packet structure describing the animation unit 
consisting of the call to procedure IncrI. This animation unit, however, was highlighted
prior to the call to the procedure and should not be rehighlighted. The
animation unit that should be highlighted is WHILE i < 5.
This fragmentation problem was solved by adding a new field to the packet 
structure definition. Table 3 shows the Packet Table containing the packet structures
resulting from the compilation of program Increment1. This new field,
named FragAddr in table 3, holds the first program memory address at which this 
fragmentation can occur. (More than one such LABEL instruction within the E-code 
packet can cause this problem to occur multiple times due to multiple nesting pos­
sibilities.) When the Parser module determines that this situation is possible, it 
calls the AddPktFragInstr routine to initialize the FragAddr field. The animator 
must now query the E-machine’s program counter as well as its packet register when
determining the next animation unit to be displayed. If the program counter value 
is greater than or equal to the FragAddr value in the packet structure corresponding 
to the packet register, the animator does not change its current display (i.e., it con­
tinues to highlight the animation unit it is on, not the animation unit described by 
the packet structure to which the return was made). Of course, the animator must 
still call the E-machine to complete execution of the fragmented E-code packet even 
though there is no change in what the animator highlights. Once execution of the 
fragmented packet is completed, the next animation unit is highlighted, in this case 
WHILE i < 5.
Packet  Start  End   Start  Start  End   End  Scope  Frag  Display
Number  Addr   Addr  Line   Col    Line  Col  Index  Addr  Packet
   0      0      1     0      0      0    18    0     -1    TRUE
   1      2      2     1      2      1     4    0     -1    TRUE
   2      3      4     2      4      2    13    1     -1    TRUE
   3      5      6     4      2      4    22    0     -1    TRUE
   4      7      7     5      4      5     8    0     -1    TRUE
   5      8     14     6      5      6    10    0     -1    TRUE
   6     15     18     7      5      7     8    2     -1    TRUE
   7     19     20     9      2      9     6    2     -1    TRUE
   8     21     22    10      3     10     7    2     -1    TRUE
   9     23     30    11      3     11    13    2     -1    TRUE
  10     31     31    11     15     11    16    2     -1    TRUE
  11     32     34    11     18     11    28    2     33    TRUE
  12     35     39    12      3     12     6    2     -1    TRUE
Table 3: Packet Table Resulting from Compilation of Program Increment1
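The FragAddr test that the animator applies after each return from the E-machine can be sketched in C as follows. This is only an illustration under stated assumptions: the Packet type below is a hypothetical subset of the packet structure, and the names NO_FRAG and should_change_display are invented for this example; they are not taken from the animator or the compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_FRAG (-1)   /* FragAddr value shown in table 3 when no fragmentation can occur */

/* Hypothetical subset of the packet structure: only the fields used here. */
typedef struct {
    int  StartAddr;      /* first program memory address of the packet */
    int  EndAddr;        /* last program memory address of the packet  */
    int  FragAddr;       /* first address at which fragmentation can occur, or NO_FRAG */
    bool DisplayPacket;
} Packet;

/* Returns true when the animator should move its highlight to the unit
 * described by packets[packet_reg]; returns false when control has
 * re-entered an already highlighted packet at or beyond its FragAddr. */
bool should_change_display(const Packet *packets, size_t packet_reg,
                           int program_counter)
{
    const Packet *p = &packets[packet_reg];

    /* The program counter is expected to lie inside the packet. */
    assert(program_counter >= p->StartAddr && program_counter <= p->EndAddr);

    if (p->FragAddr != NO_FRAG && program_counter >= p->FragAddr)
        return false;    /* fragment of a unit that was highlighted earlier */
    return true;
}
```

With the values of packet 11 in table 3 (addresses 32 through 34, FragAddr 33), a program counter of 33 after the RETURN leaves the display unchanged, whereas entering the packet at address 32 highlights it.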
It should be noted that the miniPascal code
WHILE i < 0 DO BEGIN Incr1(i) END;
does not produce a fragmentation problem, because the call to Incr1 is contained
in a compound statement (a BEGIN/END pair). Figure 11 contains the source
code illustrating this situation. Figure 12 shows the pseudo assembly language
 0 PROGRAM Increment2;
 1 VAR
 2 i:INTEGER;
 3
 4 PROCEDURE Incr1;
 5 BEGIN
 6 i:=i+1
 7 END;
 8
 9 BEGIN
10 i:=1;
11 WHILE i < 5 DO BEGIN Incr1 END;
12 END.
Figure 11: Source Code for Program Increment2
 0 pushd C7
 1 nop
 2 nop

 3 inst c,V0
 4 br 0

 5 label 1
 6 pushd C4

 7 nop

 8 inst c,V1
 9 push I,V0
10 push I,C1
11 add c,I
12 pop c,I,V1
13 push I,V1
14 pop c,I,V0

15 nop
16 uninst c,V1
17 popd
18 return

19 label 0
20 nop
21 push I,C1
22 pop c,I,V0
23 label 2
24 inst c,V2
25 push I,V0
26 push I,C5
27 less c,I
28 pop c,B,V2
29 push B,V2
30 brf c,3

31 nop

32 nop

33 call 1

34 label 5
35 nop
36 br 2

37 label 3
38 nop
39 uninst c,V2
40 uninst c,V0
41 popd
Figure 12: E-code Translation of Program Increment2
representation of the E-code instructions translating program Increment2. As can 
be seen in figure 12, in this case the LABEL instruction following the CALL instruc­
tion is not within the E-code packet that contains the CALL instruction. This LABEL 
instruction is the first instruction in the E-code packet translating the END state­
ment associated with the while loop. Thus, upon completion of the execution of the 
E-code packet translating the procedure’s END statement, the animator would (cor­
rectly) highlight the animation unit containing the END statement of the while loop. 
The fact that the LABEL instruction is physically adjacent to an instruction involved 
in the translation of the next animation unit (in this case, the while loop’s END 
statement) allows the Parser to “adjust” the E-code packet boundaries by querying 
the SaveStartInstrNum and SaveEndInstrNum variables as previously discussed.
The sample program found in appendix C illustrates another situation in which 
a packet becomes fragmented.
To Highlight or Not. A final field, named DisplayPacket in table 3, was
added to the packet structure to indicate whether or not the animator should dis­
play (highlight) anything at all before the corresponding E-code packet is executed. 
There are two miniPascal situations when the animator should not change its display 
before execution of a packet:
• before execution of a packet associated with the return from a function call;
• before execution of a packet containing instructions that determine the case
label to which a branch should be made based on the case selector value.
The following two examples illustrate these situations. First, consider the
miniPascal statement,
Result := Min(x,y) + 2;
In this example, the animator will eventually highlight the animation unit Min(x,y) 
and then await a response from the user before executing the corresponding
E-code packet. When execution of Min is complete and control returns to the calling
procedure, a dummy packet containing the E-code to pop the value returned 
by Min into a temporary variable is executed. Since Min(x,y) has already been 
highlighted and its corresponding E-code packet has been executed, there is no 
corresponding source code (i.e., animation unit) associated with this dummy packet. 
This situation is similar to the fragmentation problem discussed above. In this case, 
however, execution of the entire dummy packet should not result in any animation 
(highlighting) of the source program. Thus, the DisplayPacket field in the dummy 
packet’s associated packet structure is set to FALSE. The animator would continue 
to highlight Min(x,y) until execution of the dummy packet is complete, and then 
highlight the animation unit containing the statement Result := Min(x,y) + 2; 
in order to illustrate the result of executing this assignment statement.
Now, consider the code in figure 13. In this example, upon completion of the parse
of the entire case statement (up to the END statement), the Parser module calls
the GenCaseSearch routine in the Code Driver module. This routine generates 
a packet of E-code that enables control to pass to the proper case label at run 
time. Here again, this is a “dummy” packet in that there is no animation unit 
associated directly with it. For the case statement in figure 13, the animator will first 
highlight the animation unit, CASE i OF, and then await a response from the user. 
Upon completion of execution of the corresponding E-code packet, the animator will 
subsequently call the E-machine to execute the dummy packet, without changing its 
display (i.e., CASE i OF continues to be highlighted). Then, since the value of the 
case selector is 2, the animator highlights the animation unit containing the case 
label 2: and again awaits a response from the user.
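The effect of the DisplayPacket field on the animator can be sketched as follows. This is a hedged illustration only: the PacketInfo type and the highlight_before function are hypothetical names introduced for this example, and the three-packet sequence mirrors the CASE example above rather than any actual compiler output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced packet description used only for this sketch. */
typedef struct {
    const char *unit;        /* source text of the animation unit; NULL for a dummy packet */
    bool        DisplayPacket;
} PacketInfo;

/* Decide which packet's animation unit is highlighted before packet `next`
 * is executed. A packet whose DisplayPacket field is FALSE leaves the
 * currently highlighted unit in place. */
size_t highlight_before(const PacketInfo *pk, size_t count,
                        size_t highlighted, size_t next)
{
    assert(highlighted < count && next < count);
    return pk[next].DisplayPacket ? next : highlighted;
}
```

For the sequence "CASE i OF" (TRUE), case-search dummy packet (FALSE), "2:" (TRUE), the highlight stays on CASE i OF while the dummy packet runs and moves only when the packet for the case label is reached.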
The Scanner Module
The Scanner module performs the scanning (or lexical analysis) function for 
the compiler. This module consists of the lex-generated yylex routine and two
i:=2;
CASE i OF
  1: j:=100;
  2: j:=200;
  3: j:=300;
  OTHERS: j:=0
END;
Figure 13: Source Code for a CASE Statement
user-supplied routines that handle miniPascal comments and quoted strings, respectively.
The yylex routine is called by the Parser when the next token is required.
The other two routines are called internally (from yylex in the Scanner module).
Lex produces a scanner by processing a specification file containing rules that 
consist of regular expressions. These regular expressions define the tokens in a 
language, in this case miniPascal. Actions, written in C, are interspersed among the
rules; these actions carry out the two scanner tasks performed
upon each call to yylex. The Scanner module’s first task is to recognize and return
miniPascal tokens (and their values) to the parser. Its second task is to enter the
original miniPascal source code into the E-code’s SOURCESECTION via calls to the
GenSource routine in the SOURCE module. The Scanner module is also responsible
for ensuring that the miniPascal compiler is not case-sensitive. Thus, when the
Scanner module recognizes an identifier token, it first enters the name of the identifier
into the SOURCESECTION, and then converts the name (in the yytext variable)
to all lower case characters before returning to the Parser.
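The order of the two steps (record the original spelling, then fold to lower case) matters, since the animator later needs the identifier as the programmer wrote it. A minimal sketch, assuming a plain character buffer stands in for both yytext and the source code array; the function name record_then_fold and its parameters are hypothetical and do not reflect the actual GenSource interface:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the identifier's original spelling into `original` (a stand-in for
 * entering it into the SOURCESECTION), then convert `text` to lower case
 * in place, as is done with yytext before returning to the parser. */
void record_then_fold(char *text, char *original, size_t cap)
{
    assert(text != NULL && original != NULL);
    assert(strlen(text) < cap);          /* room for the terminating null */

    strcpy(original, text);              /* 1. keep the spelling for animation */

    for (char *p = text; *p != '\0'; p++)    /* 2. fold for symbol table use */
        *p = (char)tolower((unsigned char)*p);
}
```

After the call, DayCode and daycode reach the parser as the same name, while the recorded source text still reads DayCode.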
The Code Driver Module
The Code Driver module drives the E-code translation of the source program. 
This module is a large one, consisting of thirty-two routines. Many of these routines 
are called by the Parser module when the parse reaches a point where code should 
be generated. The remaining routines in this module are called internally (from 
within the Code Driver module). The Code Driver module interfaces directly with 
the Semantic Analysis and Symbol Table modules to perform semantic analysis, and 
with the CODE, LABEL, and VARIABLE modules to perform code generation.
The Semantic Analysis Module
The Semantic Analysis module performs semantic analysis during compilation. 
This module is a large module, consisting of fifty-eight routines. These routines are 
called by the Parser module, the Code Driver module, and the STATSCOPE module
when semantic checking must be done. The Semantic Analysis routines may
also be called internally (from within the Semantic Analysis module). These rou­
tines perform tasks relevant to both the initialization and the retrieval of semantic 
information. For example, the Parser module calls the SetProcAttributes routine 
upon encountering a procedure declaration. This routine is dedicated to associating 
with the procedure name its (compiler generated) starting program memory address 
as well as any formal parameter attributes. Later, when the Parser encounters a 
call to the procedure, it calls the GetProcAttributes routine to retrieve this informa­
tion in order to associate the correct program memory address with the generated 
E-code call instruction and to verify the actual parameter list. The Semantic
Analysis module interfaces directly with the Symbol Table module to enter and re­
trieve symbol table attribute information. The Semantic Analysis module also inter­
faces directly with the STRING module by calling the EnterString routine to enter
the values of string literals and enumerated constant names into the string space 
array.
The PACKET Module
The PACKET module produces the E-code packet descriptors during compila­
tion. This module contains routines that initialize a statically allocated array of 
structures containing the packet descriptions. With the exception of the WritePKT 
routine, the PACKET module routines are called by the Parser module during 
the parsing process. The WritePKT routine, called by the Main module at the 
end of compilation, writes the packet structure array elements to records in the 
PACKETSECTION of the E-code file.
The SOURCE Module
The SOURCE module produces the source code array (for animation purposes). 
This module contains the GenSource routine, which initializes a statically allocated 
array containing the source code of the miniPascal program being compiled. Each 
element in the source code array corresponds to a single line in the miniPascal 
source program. The GenSource routine is called by the Scanner module during 
the scanning process. The WriteSOURCE routine, called by the Main module at 
the end of compilation, writes the source code array elements to records in the
SOURCESECTION of the E-code file.
The LABEL Module
The LABEL module produces the table that maps E-code label numbers to 
their corresponding E-code instruction numbers (i.e., E-code label instructions).
This module contains the GenLabRegTable routine, which initializes a statically 
allocated array whose elements contain the instruction number of corresponding
E-code (label) instructions. The GenLabRegTable routine is called by both the
Parser and the Code Driver modules during the compilation process. The WriteLAB 
routine, called by the Main module at the end of compilation, writes the label array 
elements to records in the LABELSECTION of the E-code file.
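A statically allocated label table of the kind described above might look like the following sketch. The capacity, the undefined marker, and the function names record_label and label_instr are assumptions made for illustration; the actual GenLabRegTable interface is not shown in this chapter.

```c
#include <assert.h>

#define MAX_LABELS      128     /* hypothetical capacity of the static table */
#define LABEL_UNDEFINED (-1)    /* hypothetical marker: label not yet generated */

/* Element n holds the instruction number of the instruction "label n". */
static int label_table[MAX_LABELS];
static int label_table_initialized = 0;

static void ensure_initialized(void)
{
    if (!label_table_initialized) {
        for (int i = 0; i < MAX_LABELS; i++)
            label_table[i] = LABEL_UNDEFINED;
        label_table_initialized = 1;
    }
}

/* Record that label `label_num` was emitted as instruction `instr_num`. */
void record_label(int label_num, int instr_num)
{
    assert(label_num >= 0 && label_num < MAX_LABELS);
    ensure_initialized();
    label_table[label_num] = instr_num;
}

/* Look up the instruction number associated with a label. */
int label_instr(int label_num)
{
    assert(label_num >= 0 && label_num < MAX_LABELS);
    ensure_initialized();
    return label_table[label_num];
}
```

Using figure 12 as data, label 1 maps to instruction 5 and label 5 to instruction 34; a label number that never appears in the listing, such as 4, stays undefined.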
The VARIABLE Module
The VARIABLE module produces the table that maps E-code variable register 
numbers to their corresponding data memory sizes. This module contains the
GenVarRegTable routine, which initializes a statically allocated array whose elements
contain the size of the data memory reserved for corresponding variable register 
numbers. The GenVarRegTable routine is called by both the Parser and the Code 
Driver modules during the compilation process. The WriteVAR routine, called by 
the Main module at the end of compilation, writes the variable register array elements
to records in the VARIABLESECTION of the E-code file.
The STRING Module
The STRING module generates the string array found in the E-code 
STRINGSECTION. The miniPascal compiler’s implementation of enumerated types 
precipitated the need for a new E-machine component (the string space), and hence 
the need for a corresponding section in the E-code file. This new section is called the 
STRINGSECTION. Ordinarily, only the internal numeric value of an enumerated 
type variable is required in translated object code for real computers and computing
environments. It is desirable, however, for a program animation system to have the 
animator display the enumerated constant name rather than (or in addition to) the 
internal numeric value of a variable of an enumerated type. The STRINGSECTION 
consists of a statically allocated character array containing all of the enumerated 
constant names defined in a miniPascal program, as well as the values of any string
literals declared in the source program (which may also need to be displayed by 
the animator). When the Semantic Analysis module encounters the definition of a 
string literal or an enumerated constant name, it calls the EnterString routine in 
the STRING module. The WriteSTRINGS routine, called by the Main module at 
the end of compilation, writes the string character array to the STRINGSECTION 
of the E-code file.
When a miniPascal program is animated, the STRINGSECTION portion of the 
E-code file is loaded into the E-machine’s string space. The string space is then 
accessed by the animator for displaying string constants and enumerated variable 
values. For example, upon completion of execution of the program in figure 14, the 
animator can display the enumerated type variable values as shown in figure 15.
Figure 16 illustrates the relationship of the E-machine’s string space with the 
variable registers and data memory. This illustration assumes that a variable register
associated with an enumerated type variable represents 32 bits of data memory.
The 16 high-order bits of this data memory location contain the dynamically de­
termined internal numeric value of the enumerated constant associated with this 
variable; the 16 low-order bits contain an index into the string space where the as­
sociated enumerated constant name can be found. As can be seen in figure 16, the 
index into the string space is always that of the first constant name of the enumer­
ated type. This is due to the fact that the compiler can statically generate code to 
increment or decrement the numeric value of an enumerated type variable (e.g., for 
an enumerated type control variable in a for loop). The compiler cannot, however, 
statically determine in advance the absolute string space index of the enumerated 
constant name associated with an enumerated type variable at any given time. In­
stead, the animator has the capability to retrieve the variable’s numeric value and 
the starting string space index. The animator can then step sequentially through 
the string space until the name corresponding to the numeric value is found; the
Program Payroll1;
TYPE
  DAYS = (MON,TUES,WED,THURS,FRI);
  FREQUENCY = (WEEK,MONTH);
VAR
  OffDay,PayDay:DAYS;
  PayFreq:FREQUENCY;
BEGIN
  OffDay:=WED;
  PayDay:=FRI;
  PayFreq:=WEEK;
END.
Figure 14: Source Code for Program Payroll1
Program Payroll1;                         Program Payroll1
                                            OffDay  = 2   /* WED */
TYPE                                        PayDay  = 4   /* FRI */
  DAYS = (MON,TUES,WED,THURS,FRI);          PayFreq = 0   /* WEEK */
  FREQUENCY = (WEEK,MONTH);
VAR
  OffDay,PayDay:DAYS;
  PayFreq:FREQUENCY;
BEGIN
  OffDay:=WED;
  PayDay:=FRI;
  PayFreq:=WEEK;
END.
Figure 15: Animation Display After Execution of Program Payroll1
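[Figure: four columns labeled Variable Registers, Variable Stacks, Data Memory, and
String Space. The variable registers shown are PayDay, OffDay, and PayFreq; the data
memory addresses shown are 0, 4, and 8; the string space holds, in order, the
null-terminated names MON, TUES, WED, THURS, FRI, WEEK, and MONTH.]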
Figure 16: String Space’s Relationship with Variable Registers and Data Memory
names are null-terminated, thus allowing such a search. (A similar situation will
exist when the predeclared Pascal functions, pred and succ, are eventually imple­
mented.)
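The search through the string space can be sketched as follows, using the 32-bit layout assumed in figure 16 (numeric value in the 16 high-order bits, index of the type's first constant name in the 16 low-order bits). The function names pack_enum, enum_value, and enum_name are hypothetical; this is an illustration of the technique, not the animator's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build the 32-bit data memory word: value in the high half, string space
 * index of the enumerated type's first constant name in the low half. */
uint32_t pack_enum(uint16_t value, uint16_t first_index)
{
    return ((uint32_t)value << 16) | first_index;
}

/* Extract the internal numeric value from the word. */
uint16_t enum_value(uint32_t word)
{
    return (uint16_t)(word >> 16);
}

/* Step sequentially through the null-terminated names, starting at the
 * first constant name of the type, until the name whose position matches
 * the numeric value is reached. */
const char *enum_name(const char *string_space, size_t space_len, uint32_t word)
{
    uint16_t remaining = enum_value(word);
    size_t   pos       = word & 0xFFFFu;

    while (remaining > 0) {
        assert(pos < space_len);
        pos += strlen(string_space + pos) + 1;   /* skip one name and its null */
        remaining--;
    }
    assert(pos < space_len);
    return string_space + pos;
}
```

For program Payroll1, the names of type DAYS start at index 0 and those of FREQUENCY at index 23, so a word holding value 2 and index 0 resolves to WED.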
The animator also accesses the string space when displaying enumerated type 
array indices. Thus, upon completion of the execution of the program shown in 
figure 17, the animator can display DayCode’s value as shown in figure 18. In 
this case, the animator retrieves the values of the enumerated type indices through 
information stored in the Static Scope Table. In this example, the Static Scope 
Table entry for the variable DayCode contains the following information: 
• Identifier Name: DayCode
• Upper array bound: 19
• Lower array bound: 0
• Entry type: INTEGERENUMI
• Variable Reg: 0
Type INTEGERENUMI means that the variable DayCode is an array with integer 
elements and an enumerated index type. This indicates to the animator that the 
array bounds are indices into the string space rather than absolute numbers.
The Error Module
The Error module produces an error message whenever an error is encountered 
during compilation. This module consists of routines to report the following types 
of errors:
• scan errors;
• parse errors;
• internal errors;
• parse warnings;
• lack of support messages.
Program Payroll2;
TYPE
  DAYS = (MON,TUES,WED,THURS,FRI);
  DAYLIST = ARRAY [MON..FRI] OF INTEGER;
VAR
  DayCode:DAYLIST;
  Day:DAYS;
BEGIN
  FOR Day := MON TO FRI DO
    DayCode[Day] := 0;
END.
Figure 17: Source Code for Program Payroll2
Program Payroll2;                           Program Payroll2
                                              DayCode[MON]   = 0
TYPE                                          DayCode[TUES]  = 0
  DAYS=(MON,TUES,WED,THURS,FRI);              DayCode[WED]   = 0
  DAYLIST=ARRAY [MON..FRI] OF INTEGER;        DayCode[THURS] = 0
VAR                                           DayCode[FRI]   = 0
  DayCode:DAYLIST;
  Day:DAYS;
BEGIN
  FOR Day := MON TO FRI DO
    DayCode[Day] := 0;
END.
Figure 18: Animation Display After Execution of Program Payroll2
Each of these routines prints an appropriate message, and, with the exception of the 
parse warning routine, then calls the AbnormalEnd routine in the Main module. The 
Error module routines are called by various other modules during the compilation 
process.
The Memory Allocation Module
The Memory Allocation module is responsible for allocating and freeing memory 
for the various data structures required during the compilation process. This module 
consists of allocate and free routines associated with each data structure defined in 
the compiler. The Memory Allocation module routines are called by many of the
other modules during the compilation process.
The Assembly Code Module
The Assembly Code module produces a text file containing the pseudo assembly 
language translation of a source program. The WrtAsmFile routine in this module 
writes a copy of the CODESECTION instructions to a text file in pseudo assembly 
language format. This module is not required for compilation since it does not gen­
erate any of the E-code file sections; it does, however, provide an excellent debugging 
tool for compiler development. The routines in this module are (optionally) called 
by the CODE module as the instructions are generated.
The CODE Module
The CODE module produces the array of C structures containing the E-code 
instructions. The CODE module contains the GenInstr routine, which writes a
single E-code instruction to a temporary file, and (optionally) calls the WrtAsmFile 
routine in the Assembly Code module (to output an equivalent pseudo assembly code 
instruction). The GenInstr routine is called by the Parser and Code Driver modules.
The CODE module also contains the PatchInstr routine, which maintains an array 
of structures that associate a “patch value” with a program memory address. There 
are two situations when a code patch is necessary:
1. References to label values before they are known during the generation of case 
statement code.
2. The association of the index of the Static Scope Table entry for a proce­
dure/function with the appropriate pushd instruction (see the Parser module 
section previously in this chapter).
When compilation is complete, the Main module closes and reopens the tem­
porary CODESECTION file and then calls the WriteCODE routine. This routine 
reads the temporary file and writes the records to the CODESECTION of the E-code 
file, incorporating any patches into the proper instructions.
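The patching scheme can be sketched as follows. For brevity the sketch patches an in-memory instruction array instead of re-reading a temporary file, and the Instr layout, the capacity, and the names record_patch and apply_patches are all assumptions made for illustration rather than the PatchInstr and WriteCODE interfaces themselves.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PATCHES 64          /* hypothetical capacity of the patch array */

/* Hypothetical instruction layout: one opcode and one operand. */
typedef struct {
    int opcode;
    int operand;
} Instr;

/* A patch associates a value with a program memory address. */
typedef struct {
    int address;
    int value;
} Patch;

static Patch  patches[MAX_PATCHES];
static size_t num_patches = 0;

/* Remember that the operand at `address` must later become `value`. */
void record_patch(int address, int value)
{
    assert(num_patches < MAX_PATCHES);
    patches[num_patches].address = address;
    patches[num_patches].value   = value;
    num_patches++;
}

/* Applied once code generation is complete: fold every recorded patch
 * into the instruction it refers to. */
void apply_patches(Instr *code, size_t code_len)
{
    for (size_t i = 0; i < num_patches; i++) {
        assert(patches[i].address >= 0 &&
               (size_t)patches[i].address < code_len);
        code[patches[i].address].operand = patches[i].value;
    }
}
```

Deferring the substitution in this way lets the code generator emit an instruction before the label value or Static Scope Table index it needs is known.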
The Symbol Table Module
The Symbol Table module manages the compiler’s symbol table. The Symbol 
Table module routines are responsible for opening and closing static scopes as well 
as entering and retrieving identifier names and their attributes. The Symbol Table 
routines are called by the Parser, the Semantic Analysis, the Code Driver, and the 
STATSCOPE modules.
Each identifier name in the symbol table has a static scope level and, possi­
bly, a record number, associated with it. This allows the same identifier name to 
be used in more than one scope, including scopes nested within each other. The 
current scope level is contained in a global variable in the Symbol Table Interface 
module. When the Parser module determines the beginning of a scope, it calls the 
OpenScope routine, which simply increments the scope level. When the Parser mod­
ule determines an end of scope, it calls the CloseScope routine, which deletes all 
symbol table entries for that scope and then decrements the scope level. It should
be noted that, at the beginning of compilation, the Parser module (via the Seman­
tic Analysis module) enters the predefined Pascal types—integer, real, boolean, and 
char—and the predefined Pascal constants—true, false, and maxint—into the sym­
bol table. These predefined identifiers are associated with the outermost scope of 
the program.
The symbol table itself is implemented as a single hash table using chaining to 
resolve collisions. The chained buckets are stacked such that the identifiers declared
in the most recent scope are at the top of the stack. This ensures that the proper 
identifier is retrieved by an outward search of the buckets associated with the same 
identifier name. The hashing algorithm used is “hashpjw” from P.J. Weinberger’s 
C compiler, as presented in Compilers: Principles, Techniques, and Tools by Aho, 
Sethi, and Ullman [Aho 86].
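The combination of hashpjw, chaining, and scope-stacked buckets can be sketched as follows. The hash function follows the form published in [Aho 86] (written here with a fixed 32-bit type); the table size of 211, the reduced Symbol type, and the function names open_scope, enter, lookup, and close_scope are assumptions for illustration and are not the Symbol Table module's actual declarations.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 211               /* assumed prime table size */

/* Reduced Symbol structure: only the fields needed for this sketch. */
typedef struct Symbol {
    const char    *IdName;
    int            ScopeLevel;
    struct Symbol *NextBucket;       /* next entry in the collision chain */
} Symbol;

static Symbol *table[TABLE_SIZE];
static int     scope_level = 0;

/* hashpjw: shift in four bits per character and fold the top four bits
 * back into the low-order bits whenever they become nonzero. */
unsigned hashpjw(const char *s)
{
    uint32_t h = 0;
    for (; *s != '\0'; s++) {
        h = (h << 4) + (unsigned char)*s;
        uint32_t g = h & 0xf0000000u;
        if (g != 0) {
            h ^= g >> 24;
            h ^= g;
        }
    }
    return h % TABLE_SIZE;
}

void open_scope(void) { scope_level++; }

/* New entries go on the front of the chain, so the most recently opened
 * scope is always found first. */
Symbol *enter(const char *name)
{
    Symbol *sym = malloc(sizeof *sym);
    assert(sym != NULL);
    unsigned slot = hashpjw(name);
    sym->IdName     = name;
    sym->ScopeLevel = scope_level;
    sym->NextBucket = table[slot];
    table[slot]     = sym;
    return sym;
}

/* Outward search: the first match on the chain is the innermost one. */
Symbol *lookup(const char *name)
{
    for (Symbol *s = table[hashpjw(name)]; s != NULL; s = s->NextBucket)
        if (strcmp(s->IdName, name) == 0)
            return s;
    return NULL;
}

/* Delete every entry of the current scope, then decrement the level.
 * Entries of the current scope sit at the front of each chain. */
void close_scope(void)
{
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        while (table[i] != NULL && table[i]->ScopeLevel == scope_level) {
            Symbol *dead = table[i];
            table[i] = dead->NextBucket;
            free(dead);
        }
    }
    scope_level--;
}
```

Because insertion is always at the head of a chain, closing a scope never has to search past the first entry belonging to an enclosing scope.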
The basic structure of the symbol table, as well as the design of the Symbol Table
module, is based on the symbol table design presented in Crafting A Compiler
by Charles N. Fischer and Richard J. LeBlanc, Jr. [Fischer 88]. There are also sev­
eral symbol table features adapted from the symbol table design given in Compiler 
Design in C by Allen I. Holub [Holub 90]. Figure 19 illustrates the symbol table 
implementation.
A symbol table entry for an identifier consists of several structures which are 
chained together. The various symbol table structures are shown in figure 20. The 
identifier’s primary structure, the Symbol Structure, is initialized as soon as the 
Parser module encounters the identifier’s declaration. Later, when the parse has 
progressed to the point where the identifier’s attributes become known, the re­
maining “descriptor” structures are initialized. Once its attributes are known, an 
identifier’s symbol table entry will consist of at least a Symbol Structure, a Type 
Descriptor Structure, and a Class Descriptor Structure. The various symbol table 
structures are described below.
[Figure: a hash table with entries numbered 0 through 15.]
Figure 19: The Symbol Table Hash Implementation
A Symbol structure consists of the following fields:
• IdName (identifier name). This field contains a pointer to the dynamically 
allocated memory location containing the identifier name.
• ScopeLevel. This field contains the static scope level of the identifier.
• RecordNum. If the identifier being described is a field name, this Symbol 
Structure field contains the compiler-assigned number of the record containing 
the identifier name. If the identifier is not a field name, this number is 0.
• LineNum. This field contains the identifier’s source code line number. Since 
all of the symbol table names are stored as lower case, this number is later 
used by the STATSCOPE module to retrieve the identifier’s original name (for 
animation purposes) from the source code array, which is maintained in the 
SOURCE module.
• ColNum. This field contains the identifier’s starting column number in the
source code and is likewise used by the STATSCOPE module to retrieve the
identifier’s original name.
• IdType. This field contains the pointer to the identifier’s Type Descriptor
record.
• IdClass. This field contains a pointer to the identifier’s Class Descriptor record.
• NextInList. If the identifier is declared within a list of identifiers, this field
contains the pointer to the Symbol Structure of the previous identifier in the
list. The Parser module can then traverse this chain to associate common
attributes with the identifiers in the list. This chain is also used (in reverse
order) when assigning internal numeric values to enumerated constant names.
• NextInScope. This field contains a pointer to the Symbol Structure of the
previous identifier in the current static scope level. This chain is traversed by
the STATSCOPE module to retrieve the identifiers declared within a given
static scope (as discussed in the next section).
• NextBucket. This field contains the pointer to the previous Symbol Structure
in the hash table entry’s collision chain. This chain is used internally in the
Symbol Table module.
A Type Descriptor structure consists of the following fields:
• UseCount. Since numerous other structures may point to the same Type
Descriptor structure, the UseCount field is utilized to prevent the Memory
Allocation module from freeing the structure while it is still in use.
• Size. This field contains the identifier’s size (in terms of the host computer’s
smallest addressable memory size, normally bytes).
• Packed. This field indicates whether or not a structured type identifier is
packed. The miniPascal compiler recognizes the packed attribute only if the
identifier is a string variable.
• TypeName. This field contains an enumerated constant value indicating the
identifier’s type (e.g., INTEGERTYPE, ENUMTYPE, or ARRAYTYPE).
• SelectType. This field contains a pointer to another descriptor based on the
value of the TypeName field. For example, if the TypeName is ARRAYTYPE,
SelectType would point to an Array Descriptor record, which would contain
further attribute information pertaining to the identifier.
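The role of the UseCount field can be illustrated with a small reference-counting sketch. The reduced TypeDescriptor type and the function names type_alloc, type_share, and type_free are hypothetical; they only show one plausible way in which a use count can keep a shared descriptor alive until its last user releases it.

```c
#include <assert.h>
#include <stdlib.h>

/* Reduced Type Descriptor: only the fields needed for this sketch. */
typedef struct TypeDescriptor {
    int UseCount;     /* number of structures currently pointing here */
    int Size;
} TypeDescriptor;

/* Allocate a descriptor with a single user. */
TypeDescriptor *type_alloc(int size)
{
    TypeDescriptor *t = malloc(sizeof *t);
    assert(t != NULL);
    t->UseCount = 1;
    t->Size     = size;
    return t;
}

/* Another structure starts pointing at the same descriptor. */
TypeDescriptor *type_share(TypeDescriptor *t)
{
    assert(t != NULL && t->UseCount > 0);
    t->UseCount++;
    return t;
}

/* One user lets go. The memory is released only when the last user does;
 * returns 1 if the descriptor was actually freed, 0 otherwise. */
int type_free(TypeDescriptor *t)
{
    assert(t != NULL && t->UseCount > 0);
    if (--t->UseCount > 0)
        return 0;
    free(t);
    return 1;
}
```

Two variables declared with the same user-defined type can thus share one descriptor; closing the scope of one of them leaves the descriptor intact for the other.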
Symbol Structure:
  IdName | ScopeLevel | RecordNum | LineNum | ColNum | IdType | IdClass |
  NextInList | NextInScope | NextBucket
Type Descriptor Structure:
  UseCount | Size | Packed | TypeName | SelectType
Class Descriptor Structure:
  ClassName | SelectClass
Array Descriptor Structure:
  UseCount | ElemntType | FirstIndex | ConstPart
Index Descriptor Structure:
  UseCount | LowerBound | UpperBound | EnumDesc | IsChar | NextIndex
Enumeration Descriptor Structure:
  UseCount | FirstConst | BaseType | MaxVal
Subrange Descriptor Structure:
  UseCount | BaseType | LowerBound | UpperBound
Record Descriptor Structure:
  UseCount | RecordNum | Size | NumFields | FirstField
Parameter Descriptor Structure:
  UseCount | Mode | ParamType | NextParam
Figure 20: The Symbol Table Structures
A Class Descriptor structure consists of the following fields:
• ClassName. This field contains an enumerated constant value indicating
the identifier’s class (e.g., VARIABLECLASS, CONSTANTCLASS,
or PROCEDURECLASS).
• SelectClass. This field is composed of a union structure containing information
based on the value of the ClassName field. For example, if the ClassName
is VARIABLECLASS, one of the fields in this structure would contain the
variable register number associated with the variable.
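The way the three basic structures hang together can be sketched with C declarations. Only a subset of the fields is shown; the union member names (Variable, Constant, Procedure, VarReg, Value, StartAddr), the shortened enumerations, and the helper variable_register are assumptions made for this illustration and are not the compiler's actual declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Subsets of the enumerations in figures 21 and 22. */
typedef enum { INTEGERTYPE, ENUMTYPE, ARRAYTYPE } TypeKind;
typedef enum { VARIABLECLASS, CONSTANTCLASS, PROCEDURECLASS } ClassKind;

typedef struct TypeDescriptor {
    int      UseCount;
    int      Size;
    int      Packed;
    TypeKind TypeName;
    void    *SelectType;   /* e.g., an Array Descriptor when TypeName is ARRAYTYPE */
} TypeDescriptor;

typedef struct ClassDescriptor {
    ClassKind ClassName;
    union {                                    /* interpreted according to ClassName */
        struct { int  VarReg;    } Variable;   /* hypothetical member names */
        struct { long Value;     } Constant;
        struct { int  StartAddr; } Procedure;
    } SelectClass;
} ClassDescriptor;

typedef struct Symbol {
    const char      *IdName;
    int              ScopeLevel;
    TypeDescriptor  *IdType;
    ClassDescriptor *IdClass;
} Symbol;

/* Return the variable register number of a variable, or -1 if the
 * identifier's attributes are unknown or it is not a variable. */
int variable_register(const Symbol *sym)
{
    assert(sym != NULL);
    if (sym->IdClass == NULL || sym->IdClass->ClassName != VARIABLECLASS)
        return -1;
    return sym->IdClass->SelectClass.Variable.VarReg;
}
```

The ClassName field acts as the tag of the union: code must consult it before reading any SelectClass member, as variable_register does.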
The remaining symbol table structures are used in describing specific type or 
class attributes pertaining to a given identifier. These symbol table structures are 
discussed below. Figure 21 shows the enumerated names of the various miniPascal 
identifier types; figure 22 shows the enumerated names of the various miniPascal 
identifier classes. (Identifier types POINTERTYPE, SETTYPE, and FILETYPE 
listed in figure 21 are not currently implemented.)
/* Identifier types */
typedef enum {
    INTEGERTYPE, REALTYPE, BOOLEANTYPE, CHARTYPE, ENUMTYPE, SUBRANGETYPE,
    POINTERTYPE, SETTYPE, ARRAYTYPE, RECORDTYPE, FILETYPE, EXISTINGTYPE,
    STRINGTYPE
} IdTypes;
Figure 21: The miniPascal Identifier Types
/* Identifier classes */
typedef enum {
    VARIABLECLASS, FIELDCLASS, TYPENAMECLASS, CONSTANTCLASS,
    PROCEDURECLASS, FUNCTIONCLASS, INTLITCLASS, REALLITCLASS,
    CHARLITCLASS, STRINGLITCLASS, PROGRAMCLASS, TEMPCLASS, PARAMETERCLASS
} IdClasses;
Figure 22: The miniPascal Identifier Classes
In some cases, no additional structures are needed to describe an identifier’s 
attributes. For example, the three basic symbol structures (defined above) suffice 
in describing a variable (i.e., identifier class is VARIABLECLASS) of type integer 
(i.e., identifier type is INTEGERTYPE).
An Array Descriptor structure is used when an identifier’s TypeName field (in its 
Type Descriptor structure) has the value ARRAYTYPE. The SelectType field holds 
a pointer to the corresponding Array Descriptor structure. An Array Descriptor 
structure consists of the following fields:
• UseCount. Since numerous other structures may point to the same Array
Descriptor structure, the UseCount field is utilized to prevent the Memory
Allocation module from freeing the structure while it is still in use.
• ElemntType. This field holds a pointer to the Type Descriptor structure
defining the array’s element type (e.g., the predefined integer type or some
previously declared user-defined type).
• FirstIndex. This field holds the pointer to the Index Descriptor structure
describing the array’s (first) index. The Index Descriptor structure is defined
below.
• ConstPart. This field holds the “constant part” used in the algorithm that
calculates the linear address for a (multidimensional) array reference. This
algorithm was taken from [Fischer 88].
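The idea behind a precomputed constant part can be sketched for a row-major array. This is a generic illustration of the technique, not a transcription of the algorithm in [Fischer 88]: whether the compiler's ConstPart includes the element size, and its sign convention, are assumptions here, as are the Index type and the function names const_part and element_offset.

```c
#include <assert.h>
#include <stddef.h>

/* Bounds of one array dimension (cf. the Index Descriptor structure). */
typedef struct {
    long LowerBound;
    long UpperBound;
} Index;

/* Compile-time part: the contribution of the lower bounds, which does not
 * depend on the subscripts of any particular array reference. */
long const_part(const Index *idx, size_t dims, long elem_size)
{
    long c = 0;
    for (size_t k = 0; k < dims; k++) {
        long extent = idx[k].UpperBound - idx[k].LowerBound + 1;
        c = c * extent + idx[k].LowerBound;
    }
    return c * elem_size;
}

/* Run-time part: combine the subscripts by Horner's rule, scale by the
 * element size, and subtract the constant part to obtain the offset of
 * the element from the start of the array. */
long element_offset(const Index *idx, size_t dims, long elem_size,
                    const long *subs, long cpart)
{
    long v = 0;
    for (size_t k = 0; k < dims; k++) {
        assert(subs[k] >= idx[k].LowerBound && subs[k] <= idx[k].UpperBound);
        long extent = idx[k].UpperBound - idx[k].LowerBound + 1;
        v = v * extent + subs[k];
    }
    return v * elem_size - cpart;
}
```

For an ARRAY [1..3, 0..4] with 4-byte elements, the constant part is (1*5 + 0)*4 = 20, and element [2,3] lies at offset (2*5 + 3)*4 - 20 = 32, which agrees with the direct computation ((2-1)*5 + (3-0))*4.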
An Index Descriptor structure is referenced by the FirstIndex field in a corresponding
Array Descriptor structure, as discussed above. It may also be referenced
by the NextIndex field of another Index Descriptor structure. An Index Descriptor
structure consists of the following fields:
• UseCount. Since numerous other structures may point to the same Index
Descriptor structure, the UseCount field is utilized to prevent the Memory
Allocation module from freeing the structure while it is still in use.
• LowerBound. This field holds the (numeric) lower bound of an array index.
• UpperBound. This field holds the (numeric) upper bound of an array index.
• EnumDesc. If the array index values are an enumerated type, this field holds
a pointer to the Enumeration Descriptor structure describing the enumerated
type. The Enumeration Descriptor structure is discussed below.
• IsChar. This field’s value is TRUE if the array index values are CHARTYPE;
otherwise this field’s value is FALSE.
• NextIndex. This field holds a pointer to the Index Descriptor structure
describing the next array index (if any).
An Enumeration Descriptor structure is used when an identifier’s TypeName 
field (in its Type Descriptor structure) has the value ENUMTYPE. The SelectType 
field holds a pointer to the corresponding Enumeration Descriptor structure. An 
Enumeration Descriptor structure consists of the following fields:
o UseCount. Since numerous other structures may point to the same Enumera­
tion Descriptor structure, the UseCount field is utilized to prevent the Memory 
Allocation module from freeing the structure while it is still in use.
o FirstConst. This field holds the pointer to the Symbol structure describing 
the first constant in the enumeration.
o BaseType. This field holds a pointer to the Type Descriptor structure that 
describes the first constant in the enumeration.
o MaxVal. This field holds the maximum numeric value associated with the 
enumeration (i.e., the number of constants in the enumeration minus 1).
A Subrange Descriptor structure is used when an identifier’s TypeName 
field (in its Type Descriptor structure) has the value SUBRANGETYPE. The 
SelectType field holds the pointer to the corresponding Subrange Descriptor struc­
ture. A Subrange Descriptor structure consists of the following fields:
o UseCount. Since numerous other structures may point to the same Subrange 
Descriptor structure, the UseCount field is utilized to prevent the Memory 
Allocation module from freeing the structure while it is still in use.
o BaseType. This field holds a pointer to the Type Descriptor structure de­
scribing the base type of the subrange (e.g., a pointer to the Type Descriptor 
structure describing the predefined integer or char types).
o LowerBound. This field holds the (numeric) lower bound of the subrange.
o UpperBound. This field holds the (numeric) upper bound of the subrange.
A Record Descriptor structure is used when an identifier’s TypeName field (in 
its Type Descriptor structure) has the value RECORDTYPE. The SelectType field 
holds a pointer to the corresponding Record Descriptor structure. A Record De­
scriptor structure consists of the following fields:
o UseCount. Since numerous other structures may point to the same Record 
Descriptor structure, the UseCount field is utilized to prevent the Memory 
Allocation module from freeing the structure while it is still in use.
o RecordNum. This field holds the (compiler-generated) number associated with 
the record.
o Size. This field holds the record’s size (normally, in bytes).
o NumFields. This field holds the number of fields in the record.
o FirstField. This field holds a pointer to the Symbol structure associated with 
the record’s first field.
A Parameter Descriptor structure is used when an identifier’s ClassName 
field (in its Class Descriptor structure) has the value PROCEDURECLASS or 
FUNCTIONCLASS. A Parameter Descriptor structure describes the formal param­
eters associated with a procedure (or function). The union structure corresponding 
to the SelectClass field holds a pointer to the Parameter Descriptor structure de­
scribing the first parameter in the formal parameter list. A Parameter Descriptor 
structure consists of the following fields:
o UseCount. Since numerous other structures may point to the same Parameter 
Descriptor structure, the UseCount field is utilized to prevent the Memory 
Allocation module from freeing the structure while it is still in use.
o Mode. This field holds the parameter’s mode (i.e., VALUE or REFERENCE).
o ParamType. This field holds a pointer to the Type Descriptor structure that 
describes the parameter.
o NextParam. This field holds a pointer to the Parameter Descriptor structure 
of the next parameter in the formal parameter list.
The STATSCOPE Module
The STATSCOPE module contains the routines that build the Static Scope 
Table. The animator uses the Static Scope Table in conjunction with the dynamic 
scope stack to determine the data memory values that should be displayed at a given 
point during a program’s execution. The Static Scope Table was called the Symbol 
Table in Birch’s thesis [Birch 90]; the name was changed here to avoid confusion 
with the compiler’s symbol table. This table is a linear array of structures (or 
entries) which are in turn divided into numerous scope blocks. The scope blocks 
are chained together via parent/child pointers as discussed later. A scope block is 
used to describe the program identifiers associated with a single static scope. For 
example, a scope block would describe all of the local variable names and locally 
declared functions and/or procedures within a given procedure.
Generating a Static Scope Block. The Parser module calls the 
STATSCOPE module’s GenStatScopeBlock routine whenever the end of a static 
scope is encountered during parsing (i.e., at the end of a procedure, function, or 
program). The parsing of an inner scope is always completed before the containing 
scope is completely parsed (a result of Pascal syntax).
The GenStatScopeBlock routine drives the generation of the static scope block 
in the Static Scope Table for the scope in question from information in the symbol 
table for the current scope. (Recall that the symbol table entries for this scope will 
be deleted at this point of the parse, so this information must be saved in the Static 
Scope Table for animation purposes.) This routine, via calls to other STATSCOPE 
module routines, performs the following tasks:
o dynamic allocation of a static scope block. The number of static scope entries 
(i.e., the size of the static scope block) is passed as a parameter to 
GenStatScopeBlock;
o entry of the static scope’s owner name into the Scope Owner Table. The Scope 
Owner Table contains the information necessary to tie all of the static scope 
blocks together at the end of compilation. The static scope’s owner name is 
passed as a parameter to GenStatScopeBlock;
o initialization of the descriptive information contained in the static scope block 
entries.
The names and descriptive attributes of the identifiers declared within a scope 
are retrieved by traversing the symbol table’s NextInScope chain; the head of the 
appropriate “scope chain” is passed as a parameter to GenStatScopeBlock.
A Static Scope Table entry describing a simple variable identifier includes the 
variable’s type attribute (e.g., INTEGER) and its variable register number attribute. 
For the more complicated array variable entry, additional fields are utilized to de­
scribe the array bounds. If the array’s index values are simple integers or characters, 
the lower and upper bound values are entered directly into the corresponding fields. 
For arrays whose index values are enumerated type values, the appropriate indices 
into the STRINGSECTION array are computed and entered into the static scope 
entry’s array bound fields (see the STRING module section previously in this chap­
ter). For multidimensional arrays, additional scope blocks are used to describe the 
other index bounds. These additional index scope blocks are chained via the NxtIdx 
field. A corresponding entry is placed in the Scope Owner Table indicating that the 
array “owns” the index scope block (via the Array Descriptor field). Record variable 
entries also use an additional scope block to describe the fields within the record. 
The child pointer is used to associate a record name with its defining scope block. 
Here again, an entry is placed in the Scope Owner Table indicating that the record 
“owns” the scope block (via the Record Descriptor field).
The ProcNum Field. The Static Scope Table’s ProcNum field can now 
be explained. As each program, procedure, and function name identifier is 
encountered during compilation, it is assigned a unique “procedure number.” The 
identifier names are referred to as static scope names in the following discussion. 
The procedure number is produced by a counter variable in the Semantic Analysis 
module. Since the program name is the first static scope name encountered, the 
procedure number assigned to a miniPascal program name 
is always 0. The next static scope name declaration encountered in the program 
would be assigned the procedure number 1, and so on. A static scope name’s procedure 
number is stored as one of its symbol table attributes. Thus, when the 
GenStatScopeBlock routine encounters a static scope name while traversing a con­
taining scope’s NextInScope chain, one of the attributes it retrieves is the corre­
sponding procedure number. This number is then placed in the ProcNum field of 
the Static Scope Table entry describing the static scope name.
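The numbering scheme can be sketched as a counter that is advanced each time a static scope name is declared. The class and attribute names below are hypothetical, and the declaration order used in the example is assumed from the nesting of program Samp2, which is presented later in this chapter.

```python
# Hypothetical sketch of procedure number assignment during semantic analysis.

class ProcNumCounter:
    def __init__(self):
        self.next_num = 0  # the program name is seen first, so it receives 0

    def assign(self, symbol_attrs):
        """Store the next procedure number among a symbol's attributes."""
        symbol_attrs["ProcNum"] = self.next_num
        self.next_num += 1
        return symbol_attrs["ProcNum"]


counter = ProcNumCounter()
# Declaration order assumed for Samp2: program, procedure A,
# function B nested in A, then procedure B.
declared = ["Samp2", "A", "B (function in A)", "B (procedure)"]
numbers = {name: counter.assign({}) for name in declared}
```

Under these assumptions the two scopes named B receive different procedure numbers, which is what allows the animator to tell them apart.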
The animator uses the ProcNum field in conjunction with the dynamic scope 
stack when determining the dynamics of program execution. The use of this field is 
best explained by an example. The program shown in figure 23 contains a recursively 
called function (function Fact). That Fact is recursive implies that for any given 
call to function Fact, the animator must be able to determine the “depth” of the 
pertinent data memory values associated with the variables declared in function 
Fact, as well as the depths of any variables in the calling (program) scope. These 
values are retrieved by querying the appropriate variable stacks, as discussed in 
chapter 2. Thus, upon the final recursive call to function Fact, the animator should 
be able to display data memory values as shown in figure 24.
The ProcNum field is used in the following manner when determining the depths 
of the variables declared in a program, procedure or function scope. After the 
E-machine has been loaded with the E-code translation of a source program, the 
animator queries the E-machine to determine the total number of static “procedure” 
scopes that are described in the Static Scope Table. The Static Scope Table for the 
example in figure 23 is shown in table 4. The animator then dynamically allocates
Program Ftrl;
VAR
   n, nfact: INTEGER;

Function Fact(n: INTEGER): INTEGER;
BEGIN
   IF n = 0
      THEN Fact := 1
      ELSE Fact := n * Fact(n-1)
END;

BEGIN
   n := 3;
   nfact := Fact(n);
END.
Figure 23: Source Code for Program Ftrl
(Left panel: the source code of program Ftrl, as in figure 23. Right panel: the 
data memory display, reproduced below.)
Program Ftrl
   n = 3
   nfact is undefined
Function Fact
   n = 3
   Fact is undefined
Function Fact
   n = 2
   Fact is undefined
Function Fact
   n = 1
   Fact is undefined
Function Fact
   n = 0
   Fact = 1
Figure 24: Animation Display After Final Recursive Call of Function Fact
a procedure count array that contains an entry corresponding to each of these scopes. 
Thus, for the program shown in figure 23, this array has two entries. Entry 0 corresponds 
to the program scope and entry 1 corresponds to function Fact. During 
program animation, the animator sets the values of the procedure count array entries 
to reflect the current number of active calls to the corresponding procedure or 
function. (This means that the animator reinitializes the values in the procedure 
count array every time control is passed to the animator.) At the same time, the 
E-machine’s dynamic scope stack contains a history of active scopes, with the Static 
Scope Table entry number of the most current scope being the value at the top of 
this stack.
Entry  Id Name  Upr Bnd  Lwr Bnd  Nxt Idx  Offset  Type       Rec Siz  Parent  Child  Var Reg  Proc Num
Scope block describing function Fact
0               -        -        -        -       HEADER     -        4       -      -        -
1      n        -        -        -        -       INTEGER    -        -       -      2        -
2      Fact     -        -        -        -       INTEGER    -        -       -      3        -
3               -        -        -        -       END        -        -       -      -        -
Scope block describing program Ftrl
4               -        -        -        -       HEADER     -        9       -      -        -
5      n        -        -        -        -       INTEGER    -        -       -      1        -
6      nfact    -        -        -        -       INTEGER    -        -       -      0        -
7      Fact     -        -        -        -       FUNCTION   -        -       0      -        1
8               -        -        -        -       END        -        -       -      -        -
Bootstrap scope block
9               -        -        -        -       HEADER     -        -       -      -        -
10     Ftrl     -        -        -        -       PROGRAM    -        -       4      -        0
11              -        -        -        -       END        -        -       -      -        -
Table 4: Static Scope Table Resulting from Compilation of Program Ftrl
Now, consider the animation of the current example. Suppose the program has 
executed to the point that it is in the third recursive call to function Fact. When the 
animator begins displaying data memory variables after the execution of the packet 
translating the animation unit Fact:=1, the procedure count array and the dynamic 
scope stack are in the state shown in figure 25. The values in the procedure count 
array indicate that the program Ftrl has one active “call” and that function Fact has 
four active calls. In this example, the animator begins its retrieval of data memory 
values by examining the value at the bottom of the dynamic scope stack. The bottom 
stack value is 10, which means that the animator now examines entry 10 in 
the Static Scope Table. This entry is a PROGRAM entry describing Ftrl. The 
ProcNum field in the PROGRAM entry has the value 0. Next, the animator will 
examine entry 0 in the procedure count array to determine the depth of the variables 
to be displayed for this invocation of the program scope. Since the program scope 
cannot be called recursively, this value will always be 1. Thus, when the animator 
retrieves the values of the variables described in the program’s child scope block, it 
will instruct the E-machine to retrieve the data memory values associated with the 
top of the appropriate variable stacks. After these values have been displayed, the 
animator decrements the value in entry 0 of the procedure count array.
Procedure Count Array
   Entry 0 (Program Ftrl):   1
   Entry 1 (Function Fact):  4

Dynamic Scope Stack
   Entry 0 (bottom):  10
   Entry 1:            7
   Entry 2:            7
   Entry 3:            7
   Entry 4 (top):      7
Figure 25: Procedure Count Array and Dynamic Scope Stack
Next, the animator examines the value in entry 1 in the dynamic scope stack. 
This value is 7, corresponding to entry 7 in the Static Scope Table. This 
entry, whose ProcNum field has the value 1, describes function Fact. The animator 
then examines entry 1 in the procedure count array. The current value in this 
entry is 4, indicating that the animator should instruct the E-machine to retrieve 
data memory values associated with the fourth level of the appropriate variable 
stacks when displaying variable values described in the function’s child scope block. 
These values reflect the function’s variable values resulting from its initial call from 
the program scope. The animator then decrements the value in entry 1 of the 
procedure count array so that the next iteration will result in displaying the values 
associated with the first recursive call to function Fact. The animator continues this 
process until the dynamic scope stack is exhausted, resulting in the display shown 
in figure 24.
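The walk over the dynamic scope stack described above can be sketched as follows. This is an illustration of the described procedure, not the animator's code: the function names and the dictionary mapping Static Scope Table entries to ProcNum values are hypothetical, while the example state is the one shown in figure 25.

```python
# Hypothetical sketch of how the animator derives variable-stack depths.

def count_active_calls(scope_stack, proc_num_of_entry, num_scopes):
    """Rebuild the procedure count array from the dynamic scope stack."""
    counts = [0] * num_scopes
    for entry in scope_stack:
        counts[proc_num_of_entry[entry]] += 1
    return counts


def display_depths(scope_stack, proc_num_of_entry, active_calls):
    """Walk the stack from bottom to top.

    Returns (scope table entry, depth) pairs in display order, where depth 1
    means the top of the variable stacks and larger depths are older calls.
    """
    counts = list(active_calls)       # working copy of the procedure count array
    plan = []
    for entry in scope_stack:         # index 0 is the bottom of the stack
        proc_num = proc_num_of_entry[entry]
        plan.append((entry, counts[proc_num]))
        counts[proc_num] -= 1         # the next activation is one level shallower
    return plan


# State from figure 25: entry 10 describes program Ftrl (ProcNum 0),
# entry 7 describes function Fact (ProcNum 1).
proc_nums = {10: 0, 7: 1}
stack = [10, 7, 7, 7, 7]
calls = count_active_calls(stack, proc_nums, 2)
plan = display_depths(stack, proc_nums, calls)
```

For this state the plan visits Ftrl at depth 1 and then Fact at depths 4, 3, 2, and 1, so the initial call is displayed first and the final recursive call last, matching the order of figure 24.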
Writing the STATSCOPESECTION. When parsing is completed, the Main 
module calls the WriteSTATSCOPE routine. This routine first traverses the Scope 
Owner Table in reverse order to initialize the program, procedure, and function 
parent/child pointers that will appear in the final linear array containing the complete 
Static Scope Table. Since the nesting characteristics of miniPascal allow the same 
name to be given to more than one procedure or function, the reverse order traversal 
ensures that the proper child is found. The final entry in the Scope Owner Table 
describes a “bootstrap” scope block, which will become the final scope block in the 
completed Static Scope Table. The Scope Owner Table contains the information 
needed to initialize the child pointer in the bootstrap scope block; this child pointer 
is the computed index of the first entry in the scope block describing the local vari­
ables, procedures, and functions belonging to the program. Also, the parent pointer 
in the program scope block can now be initialized with the computed index of the
bootstrap scope block. Similarly, each function and procedure can have its child 
and parent pointers initialized.
Finally, the WriteSTATSCOPE routine traverses the Scope Owner Table 
in forward order to sequentially write the various scope blocks to the 
STATSCOPESECTION of the E-code file.
Example of STATSCOPESECTION Generation. Program Samp2, shown 
in figure 26, is used to illustrate the generation of the STATSCOPESECTION. This 
program contains two procedures, named A and B, which are at the same static 
scope level. Procedure A contains a nested function, also named B. Table 5 is the 
Scope Owner Table for this program. The Scope Owner Table holds the following 
information:
o Owner Name. This field contains the corresponding static scope’s owner’s 
name (for program, procedure, and function scopes);
o Pointer to Scope Block. This field contains the memory address of the cor­
responding scope’s dynamically allocated scope block. The numbers used in 
this example are for illustrative purposes only;
o Scope Table Index. This field contains the computed (final) index of the first 
entry of the static scope block describing the corresponding static scope;
o Number of Scope Entries. This field contains the number of identifiers (e.g., 
variable names and function names) declared in the corresponding static scope;
o Array Descriptor (indicates the “owner” of additional scope blocks containing 
index descriptions for multidimensional array variables). This field contains 
the memory address of a dynamically allocated symbol table array descriptor 
structure, and thus allows array variables sharing the same user defined type 
to share an index scope block;
o Record Descriptor (indicates the “owner” of the additional scope containing 
record field descriptions). This field contains the memory address of a dy­
namically allocated symbol table record descriptor structure, and thus allows 
record variables that share the same user defined type also to share a field 
description scope block.
Program Samp2;
TYPE
   LIST = ARRAY [1..4] OF INTEGER;
VAR
   X, Y: INTEGER;
   List1: LIST;

Procedure A(VAR X, Y: INTEGER);
   Function B(I: INTEGER): INTEGER;
   BEGIN { Function B }
   END; { Function B }
BEGIN { Procedure A }
END; { Procedure A }

Procedure B(I, J: INTEGER);
VAR
   List2: LIST;
BEGIN { Procedure B }
END; { Procedure B }

BEGIN { Program Samp2 }
END. { Program Samp2 }
Figure 26: Source Code for Program Samp2
Owner Name  Pointer to Scope Block  Scope Table Index  Number of Scope Entries  Array Descriptor  Record Descriptor
B           1000                    0                  4                        -                 -
A           3002                    4                  5                        -                 -
B           5001                    9                  5                        -                 -
Samp2       6240                    14                 7                        -                 -
Bootstrap   7000                    21                 3                        -                 -
Table 5: Scope Owner Table for Program Samp2
Tables 6 through 10 show the five scope blocks generated by the STATSCOPE 
module during compilation. Table 11 shows the completed Static Scope Table as it 
would be written to the STATSCOPESECTION at the end of compilation.
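The Scope Table Index column of table 5 can be reproduced as a running total of the block sizes in Scope Owner Table order. The sketch below assumes that the entry counts listed in table 5 include each block's HEADER and END entries, which is consistent with the values shown; the function name is hypothetical.

```python
# Hypothetical sketch: final index of each scope block's first entry.

def scope_table_indices(entry_counts):
    """Return the starting index of every block and the total table length."""
    indices = []
    next_index = 0
    for count in entry_counts:
        indices.append(next_index)
        next_index += count
    return indices, next_index


# Owner names and entry counts as listed in table 5.
owners = [("B", 4), ("A", 5), ("B", 5), ("Samp2", 7), ("Bootstrap", 3)]
indices, total = scope_table_indices([count for _, count in owners])
```

The resulting indices are 0, 4, 9, 14, and 21, and the total of 24 entries agrees with the entry numbers 0 through 23 in table 11.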
Entry  Id Name  Upr Bnd  Lwr Bnd  Nxt Idx  Offset  Type       Rec Siz  Parent  Child  Var Reg  Proc Num
0               -        -        -        -       HEADER     -        -       -      -        -
1      I        -        -        -        -       INTEGER    -        -       -      5        -
2      B        -        -        -        -       INTEGER    -        -       -      6        -
3               -        -        -        -       END        -        -       -      -        -
Table 6: Scope Block for Function B in Procedure A in Program Samp2
Entry  Id Name  Upr Bnd  Lwr Bnd  Nxt Idx  Offset  Type       Rec Siz  Parent  Child  Var Reg  Proc Num
0               -        -        -        -       HEADER     -        -       -      -        -
1      X        -        -        -        -       INTEGER    -        -       -      4        -
2      Y        -        -        -        -       INTEGER    -        -       -      3        -
3      B        -        -        -        -       FUNCTION   -        -       -      -        2
4               -        -        -        -       END        -        -       -      -        -
Table 7: Scope Block for Procedure A in Program Samp2
Entry  Id Name  Upr Bnd  Lwr Bnd  Nxt Idx  Offset  Type       Rec Siz  Parent  Child  Var Reg  Proc Num
0               -        -        -        -       HEADER     -        -       -      -        -
1      I        -        -        -        -       INTEGER    -        -       -      8        -
2      J        -        -        -        -       INTEGER    -        -       -      7        -
3      List2    4        1        -        -       INTEGER    -        -       -      9        -
4               -        -        -        -       END        -        -       -      -        -
Table 8: Scope Block for Procedure B in Program Samp2
Entry  Id Name  Upr Bnd  Lwr Bnd  Nxt Idx  Offset  Type       Rec Siz  Parent  Child  Var Reg  Proc Num
0               -        -        -        -       HEADER     -        -       -      -        -
1      X        -        -        -        -       INTEGER    -        -       -      1        -
2      Y        -        -        -        -       INTEGER    -        -       -      0        -
3      List1    4        1        -        -       INTEGER    -        -       -      2        -
4      A        -        -        -        -       PROCEDURE  -        -       -      -        1
5      B        -        -        -        -       PROCEDURE  -        -       -      -        3
6               -        -        -        -       END        -        -       -      -        -
Table 9: Scope Block for Program Scope in Program Samp2
Entry  Id Name  Upr Bnd  Lwr Bnd  Nxt Idx  Offset  Type       Rec Siz  Parent  Child  Var Reg  Proc Num
0               -        -        -        -       HEADER     -        -       -      -        -
1      Samp2    -        -        -        -       PROGRAM    -        -       -      -        0
2               -        -        -        -       END        -        -       -      -        -
Table 10: Scope Block for “Bootstrap” Scope in Program Samp2
Entry  Id Name  Upr Bnd  Lwr Bnd  Nxt Idx  Offset  Type       Rec Siz  Parent  Child  Var Reg  Proc Num
Scope block describing function B in procedure A
0               -        -        -        -       HEADER     -        4       -      -        -
1      I        -        -        -        -       INTEGER    -        -       -      5        -
2      B        -        -        -        -       INTEGER    -        -       -      6        -
3               -        -        -        -       END        -        -       -      -        -
Scope block describing procedure A
4               -        -        -        -       HEADER     -        14      -      -        -
5      X        -        -        -        -       INTEGER    -        -       -      4        -
6      Y        -        -        -        -       INTEGER    -        -       -      3        -
7      B        -        -        -        -       FUNCTION   -        -       0      -        2
8               -        -        -        -       END        -        -       -      -        -
Scope block describing procedure B
9               -        -        -        -       HEADER     -        14      -      -        -
10     I        -        -        -        -       INTEGER    -        -       -      8        -
11     J        -        -        -        -       INTEGER    -        -       -      7        -
12     List2    4        1        -        -       INTEGER    -        -       -      9        -
13              -        -        -        -       END        -        -       -      -        -
Scope block describing program Samp2
14              -        -        -        -       HEADER     -        21      -      -        -
15     X        -        -        -        -       INTEGER    -        -       -      1        -
16     Y        -        -        -        -       INTEGER    -        -       -      0        -
17     List1    4        1        -        -       INTEGER    -        -       -      2        -
18     A        -        -        -        -       PROCEDURE  -        -       4      -        1
19     B        -        -        -        -       PROCEDURE  -        -       9      -        3
20              -        -        -        -       END        -        -       -      -        -
Bootstrap scope block
21              -        -        -        -       HEADER     -        -       -      -        -
22     Samp2    -        -        -        -       PROGRAM    -        -       14     -        0
23              -        -        -        -       END        -        -       -      -        -
Table 11: Final Static Scope Table for Program Samp2
CHAPTER 5
CONCLUSIONS AND FUTURE 
ENHANCEMENTS
Conclusions
The first compiler for the E-machine has been designed and implemented. The 
compiler’s source language, called miniPascal, is a subset of ISO Standard Pascal. 
The miniPascal compiler is a one-pass compiler written in ANSI Standard C and 
was developed using the Unix development tools, lex and yacc [Mason 90], [Lesk 75], 
[Johnson 75]. The compiler’s scanner module, produced by running lex on a Unix 
machine, and its parser module, produced by running yacc on a Unix machine, were 
subsequently downloaded to a DOS machine. These two modules, compiled and 
linked with numerous semantic analysis and code generation modules, comprise the 
miniPascal compiler. A number of miniPascal programs compiled into E-machine 
object files have been successfully animated using a simple DOS animator to drive 
the E-machine.
Future Enhancements
Since miniPascal is a subset of Pascal, future versions of miniPascal will include 
additional Pascal features. A next logical feature to be implemented is pointers, 
which are particularly important to animate because they are often a difficult concept for 
students to grasp. Other desirable features to be implemented in the future include: 
records with variant parts, the with statement, sets, and predeclared functions and 
procedures. It would be particularly useful to implement the predeclared procedure, 
read. The availability of the read procedure would greatly facilitate the initialization 
of data (e.g., arrays) in programs demonstrating concepts such as sorting and matrix 
multiplication.
One feature that is not completely implemented in the current version is the 
method of displaying the value returned by a function call. Currently, the code 
generated by the compiler allows the animator to display a function value only when 
displaying the variable values in a window associated with the called function itself. 
The function name, however, is actually declared in the calling scope, and hence 
its value is available in this scope. It would be desirable to have the function value 
also displayed in the calling scope’s data memory window. A problem occurs when 
a function is called multiple times from the same scope, either by calls in several 
different statements or by multiple calls within the same statement. The question 
here is whether to display only the most recent value returned by the function, or to 
display all previous function values as well. Once this design decision is made, the 
compiler will require modification to produce code to support the display method.
The compiler should also be enhanced to identify the E-code instructions that are 
considered critical. Currently, the compiler simply designates all E-code instructions 
as critical, thus hampering the efficiency of the E-machine.
Another compiler enhancement is improvement of error handling. Currently, 
only minimal error reporting is supported by the compiler, and there is no attempt 
at error recovery. This minimal support is considered sufficient for the present 
DYNALAB system since the miniPascal programs will be prepared by expert pro­
grammers. Later, however, the DYNALAB system may be used by students prepar­
ing their own programs for animation. Thus, error handling must be enhanced to 
provide a more “friendly” environment for the miniPascal programmers.
Finally, since the DYNALAB system is intended to be an evolutionary system, 
the miniPascal compiler will continue to evolve in order to support new animation 
features. For example, the animator may provide visualization of expression evaluation 
in order to demonstrate precedence rules in a language. The animator may also 
display “TRUE” or “FALSE” as conditional expressions are evaluated. It may be 
desirable to have the programmer indicate groups of source code lines that should 
appear in the same source code animation window in order to clearly illustrate some 
concept. All of these animation features require modifications to the compiler in 
order to generate the supporting code.
Thus, even though the miniPascal compiler is a usable first compiler for the 
E-machine, its evolution is expected to continue. The compiler is also expected to 
serve as a pattern for developers of future E-machine compilers.
REFERENCES
[Aho 86] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, 
and Tools. Addison-Wesley, Reading, Massachusetts. 1986.
[Birch 90] M. L. Birch. An Emulator for the E-machine. Master’s thesis. Computer 
Science Department, Montana State University. June 1990.
[Brown 88-1] M. Brown. Algorithm Animation. The MIT Press, Cambridge, 
Massachusetts. 1988.
[Brown 88-2] M. Brown. “Exploring Algorithms Using Balsa-II”, Computer, 
Volume 21, Number 5. May 1988.
[Fischer 88] C. N. Fischer and R. J. LeBlanc, Jr. Crafting a Compiler. 
Benjamin/Cummings Publishing Company, Menlo Park, California. 1988.
[Holub 90] A. I. Holub. Compiler Design in C. Prentice Hall, Englewood Cliffs, 
New Jersey. 1990.
[Jensen 91] K. Jensen and N. Wirth. Pascal: User Manual and Report. 
Springer-Verlag, New York, New York. 1991.
[Johnson 75] S. C. Johnson. “Yacc: Yet Another Compiler-Compiler”, Computer 
Science Technical Report Number 32. Bell Laboratories, Murray Hill, 
New Jersey. July 1975.
[Lesk 75] M. E. Lesk and E. Schmidt. “Lex - A Lexical Analyzer Generator”, 
Computer Science Technical Report Number 39. Bell Laboratories, 
Murray Hill, New Jersey. October 1975.
[Mason 90] T. Mason and D. Brown. lex & yacc. O’Reilly and Associates, 
Sebastopol, California. 1990.
[Ng 82-1] C. Ng. Ling User’s Guide. Unpublished Master’s project. Computer 
Science Department, Washington State University. 1982.
[Ng 82-2] C. Ng. Ling Programmer’s Guide. Unpublished Master’s project. 
Computer Science Department, Washington State University. 1982.
[Patton 89] S. D. Patton. The E-machine: Supporting the Teaching of Program 
Execution Dynamics. Master’s thesis. Computer Science Department, 
Montana State University. June 1989.
[Ross 91] R. J. Ross. “Experience with the DYNAMOD Program Animator”, 
Proceedings of the Twenty-second Symposium on Computer Science 
Education, SIGCSE Bulletin, 23(1):35-42. 1991.
[Ross 93] R. J. Ross. “Visualizing Computer Science”, Invited chapter to appear 
in the AACE monograph, Scientific Visualization in Mathematics and 
Science Education. 1993.
[Winslett 93] R. Winslett. Juno. Master’s thesis in progress. Computer Science 
Department, Montana State University.
APPENDICES
APPENDIX A
THE E-MACHINE INSTRUCTION SET
This appendix, which is adapted from chapter 2 of Birch’s thesis, lists all of the 
instructions in the instruction set of the E-machine. A pseudo assembly language 
format is used to describe the instructions; however, the instruction stream itself 
is actually an array of structures loaded from the CODESECTION portion of the 
E-machine object file at run time. The object file is described in detail in chapters 2 
and 4 of this thesis.
Each instruction is composed of four fields (or arguments): 
o an opcode mnemonic (e.g., push, pop, add); 
o a flag marking the instruction critical or noncritical (CFLAG); 
o a field denoting the data type to be used in the instruction (TYPE);
o a field containing either a number (#) or an addressing mode (ADDR).
Addressing modes and their formats are described in appendix B.
The mnemonic field is separated from the others by one or more spaces, and the 
remaining fields are separated by commas. The CFLAG field must be either c or n 
to designate whether the instruction is to be treated as critical (c) or noncritical (n). 
The TYPE field holds a single capital letter, I, R, B, C, or A, referring to the data 
types integer, real, boolean, character, or address, respectively. The # refers to a 
constant specifying the number of an E-code label, a constant numeric value, or an 
E-machine variable register number. If the ADDR argument is used for the fourth 
field, it refers to any of the addressing modes described in appendix B.
In the following description of the instruction set, the effects of executing an 
instruction both forward and in reverse are given. The actions taken in each case 
will be different, depending on whether the instruction has been designated critical 
or noncritical. Some instructions have no critical/noncritical flag, because their 
execution (either forward or in reverse) would be the same in either case. Reversing 
through a noncritical instruction sometimes requires that something be pushed onto 
the evaluation stack to keep the stack of the proper size; in such cases an arbitrary 
value, called DUMMY, is used.
add CFLAG, TYPE
Adds the top two values on the evaluation stack and places the result onto the evaluation 
stack.
Forward-Critical: Pops the top two values of the evaluation stack, pushes them onto the 
save stack, and then pushes their sum onto the evaluation stack.
Forward-Noncritical: Pops the top two values of the evaluation stack and pushes their 
sum onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards the value. Pops 
the top two elements of the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
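The four cases of the add instruction can be sketched with two Python lists standing in for the evaluation stack and the save stack. This is an illustration only, not the emulator's code: the class and method names are hypothetical, the value chosen for DUMMY is arbitrary, and the order in which the two operands are pushed onto the save stack is an assumption, since the description above does not specify it.

```python
# Hypothetical sketch of forward and reverse execution of "add".

class Machine:
    DUMMY = 0  # arbitrary placeholder value (assumed)

    def __init__(self):
        self.eval_stack = []
        self.save_stack = []

    def add_forward(self, critical):
        top = self.eval_stack.pop()
        second = self.eval_stack.pop()
        if critical:
            # Assumed order: top operand saved first, then the second one.
            self.save_stack.append(top)
            self.save_stack.append(second)
        self.eval_stack.append(second + top)

    def add_reverse(self, critical):
        if critical:
            self.eval_stack.pop()            # discard the sum
            second = self.save_stack.pop()   # restore operands in original order
            top = self.save_stack.pop()
            self.eval_stack.append(second)
            self.eval_stack.append(top)
        else:
            # Only the stack size is restored; the operand values are lost.
            self.eval_stack.append(self.DUMMY)
```

Reversing a critical add restores both operands exactly, whereas reversing a noncritical add restores only the height of the evaluation stack.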
alloc CFLAG, #
Allocates a block of memory of # size.
Forward: Attempts to allocate # computer words of storage. If successful, the address of 
the first word of data memory that was allocated is pushed onto the evaluation stack. 
Otherwise, a NULL address is pushed onto the evaluation stack.
Reverse: Pops the top value off the evaluation stack, which should be a data address, and 
frees # words of data memory starting at that address.
and CFLAG, TYPE
Bitwise and’s the top two values of the evaluation stack and places the result onto the 
evaluation stack.
Forward-Critical: Pops the top two values of the evaluation stack, pushes the two values 
onto the save stack, and then pushes the bottom value bitwise and’ed with the top 
value onto the evaluation stack.
Forward-Noncritical: Pops the top two values of the evaluation stack and pushes the 
bottom value bitwise and’ed with the top value onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards it. Pops the top 
two values of the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
br #
Unconditionally branches to label #.
Forward: Loads the program counter with the address of the label # instruction. 
Reverse: No operation.
brt, brf CFLAG, #
Conditionally branches depending on whether the top of the evaluation stack is TRUE or 
FALSE.
Forward-Critical: Pops the top value off the evaluation stack and pushes it onto the save 
stack. If the value satisfies the conditional on the branch (TRUE for brt, FALSE for 
brf), the program counter is loaded with the address of the label # instruction.
Forward-Noncritical: Pops the top value off the evaluation stack. If the value agrees with 
the conditional branch (TRUE for brt, FALSE for brf), the program counter is loaded 
with the address of the label # instruction.
Reverse-Critical: Pops the top value of the save stack and pushes it onto the evaluation 
stack.
Reverse-Noncritical: Arbitrarily pushes DUMMY onto the evaluation stack.
call #
Branches to label #, saving the program address which follows the call instruction so that
execution will continue there upon execution of a return instruction.
Forward: Pushes the current program counter onto the return address stack, then loads
the address of the label # instruction into the program counter.
Reverse: Pops the top value from the return address stack.
cast CFLAG, TYPE, TYPE
Changes the top value of the evaluation stack from the first TYPE to the second.
Forward-Critical: Pops the top value of the evaluation stack and pushes it onto the save
stack, then transforms the value from the first TYPE to the second. The result is
pushed onto the evaluation stack.
Forward-Noncritical: Pops the top value of the evaluation stack, then transforms the value
from the first TYPE to the second. The result is pushed onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack. Then pops the top value of
the save stack and pushes it onto the evaluation stack.
Reverse-Noncritical: Nothing happens.
div CFLAG, TYPE
Divides the second value from the top of the evaluation stack by the first and places the
result onto the evaluation stack.
Forward-Critical: Pops the top two values of the evaluation stack, pushes the two values
onto the save stack, and pushes the bottom value divided by the top value onto the
evaluation stack.
Forward-Noncritical: Pops the top two values of the evaluation stack and pushes the
bottom value divided by the top value onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards it. Pops the top
two values of the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
eql, neql, less, leql, gtr, geql CFLAG, TYPE
If the second value from the top of the evaluation stack compares favorably with the first,
then TRUE is pushed onto the evaluation stack. Otherwise FALSE is pushed onto the
evaluation stack.
Forward-Critical: Pops the top two values off the evaluation stack, pushes the two values
onto the save stack, and compares the bottom value with the top. If the result of the
comparison matches the comparison operation performed, a boolean TRUE is pushed
onto the evaluation stack; otherwise, a boolean FALSE is pushed onto the evaluation
stack.
Forward-Noncritical: Pops the top two values off the evaluation stack and compares the
bottom value with the top value. If the result matches the comparison operation
performed, a boolean TRUE is pushed onto the evaluation stack; otherwise, a boolean
FALSE is pushed onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards it, then pops the
top two values off the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
inst CFLAG, #
Creates an instance of the variable register #.
Forward-Critical: Allocates enough data memory for the variable represented by the
variable register #. The address of the allocated memory is then pushed onto the variable
register’s stack.
Forward-Noncritical: Allocates enough data memory for the variable represented by the
variable register #. The size of the variable is stored in the variable register. The
address of the allocated memory is then pushed onto the variable register’s stack.
Reverse-Critical: The data memory occupied by the variable register is freed and the top
value is popped off the variable register’s stack.
Reverse-Noncritical: Frees the space taken up by the variable in data memory and pops
the top value off the variable register’s stack.
label #
Marks the location to which a branch may be made.
Forward: Pushes the previous program counter onto the stack pointed to by label register
#.
Reverse: Pops the top value of the stack pointed to by label register # and places it in
the program counter.
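The br and label entries can be read together: the reverse of br is a no-operation because the label instruction it branched to has already recorded where control came from. The Python sketch below illustrates that reading. It is not the emulator's implementation; the Machine class, the prev_pc field, and the assumption that the program counter advances by one after a label are all hypothetical.

```python
class Machine:
    """Hypothetical model of the program counter and the label stacks."""

    def __init__(self, label_addresses):
        self.pc = 0
        self.prev_pc = 0                         # assumed: address of the previously executed instruction
        self.label_addresses = label_addresses   # label number -> program address
        self.label_stacks = {n: [] for n in label_addresses}


def br_forward(m, label):
    m.prev_pc = m.pc
    m.pc = m.label_addresses[label]


def label_forward(m, label):
    # Record where control arrived from, then fall through.
    m.label_stacks[label].append(m.prev_pc)
    m.prev_pc = m.pc
    m.pc += 1                                    # assumed instruction size of 1


def label_reverse(m, label):
    # Restore the program counter to the instruction that led here.
    m.pc = m.label_stacks[label].pop()
```

After label_reverse the program counter points back at the branch, whose own reverse action has nothing left to undo.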
link #
Associates one variable register with the value of another.
Forward: Pops the top value of the evaluation stack and pushes it onto the variable stack
pointed to by variable register #.
Reverse: Pops the top value of the variable stack pointed to by variable register # and
pushes it onto the evaluation stack.
loadar CFLAG, ADDR
Places the address ADDR in the address register.
Forward-Critical: The contents of the address register are pushed onto the save stack.
Then the address computed for the addressing mode is placed in the address register.
Important note: it is the address that is computed by the addressing mode that is
used, not the contents of that address.
Forward-Noncritical: The address computed for the addressing mode is placed in the
address register. The same note as for Forward-Critical applies here.
Reverse-Critical: The address on top of the save stack is popped off and placed in the
address register.
Reverse-Noncritical: Nothing happens.
loadir CFLAG, #
Places the # into the index register.
Forward-Critical: The contents of the index register are pushed onto the save stack. Then
# is placed in the index register.
Forward-Noncritical: # is placed in the index register.
Reverse-Critical: The value on top of the save stack is popped off and placed in the index
register.
Reverse-Noncritical: Nothing happens.
mod CFLAG, TYPE
Finds the remainder of the division of the second value from the top of the evaluation stack
by the first and places the result onto the evaluation stack.
Forward-Critical: Pops the top two values of the evaluation stack, pushes the two values
onto the save stack, and then pushes the bottom value modulo the top value onto the
evaluation stack.
Forward-Noncritical: Pops the top two values of the evaluation stack and pushes the bottom
value modulo the top value onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards it. Pops the top
two values of the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
mult CFLAG, TYPE
Multiplies the top two values on the evaluation stack and places the result onto the
evaluation stack.
Forward-Critical: Pops the top two values of the evaluation stack, pushes the two values
onto the save stack, and then pushes their product onto the evaluation stack.
Forward-Noncritical: Pops the top two values of the evaluation stack and pushes their
product onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards it. Pops the top
two values of the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
neg TYPE
Negates the top value on the evaluation stack.
Forward: Pops the top of the evaluation stack and pushes the negation of that value onto
the evaluation stack.
Reverse: Pops the top of the evaluation stack and pushes the negation of that value onto
the evaluation stack.
nop
This instruction is the standard no-operation instruction. It can be used to create packets
for high level program text for which no E-machine instructions are generated but which
nonetheless need to be highlighted for animation purposes. An example of this is the begin
keyword in Pascal. In illustrating the flow of control during program animation, a begin
keyword may need to be highlighted (and thus have its own underlying E-machine packet
of instructions). The nop instruction can be used in these cases.
not CFLAG, TYPE
Bitwise complements the top value of the evaluation stack.
Forward: Pops the top of the evaluation stack and pushes the bitwise not of that value
onto the evaluation stack.
Reverse: Pops the top of the evaluation stack and pushes the bitwise not of that value
onto the evaluation stack.
or CFLAG, TYPE
Bitwise or’s the top two values of the evaluation stack and places the result onto the
evaluation stack.
Forward-Critical: Pops the top two values of the evaluation stack, pushes the two values
onto the save stack, and then pushes the bottom value bitwise or’ed with the top value
onto the evaluation stack.
Forward-Noncritical: Pops the top two values of the evaluation stack and pushes the
bottom value bitwise or’ed with the top value onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards it. Pops the top
two values of the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
pop CFLAG, TYPE, ADDR
Pops the top value of the evaluation stack and places it in ADDR.
Forward-Critical: Pushes the value in ADDR onto the save stack and then pops the top
value of the evaluation stack and stores it in ADDR.
Forward-Noncritical: Pops the top value of the evaluation stack and stores it in ADDR.
Reverse-Critical: Pushes the value in ADDR onto the evaluation stack and then pops the
top value of the save stack and places it in ADDR.
Reverse-Noncritical: Pushes the value in ADDR onto the evaluation stack.
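Because pop overwrites data memory, its critical form is what allows an assignment to be undone. The following Python sketch illustrates the four cases listed above. Modeling data memory as a dictionary keyed by an already-resolved address, and passing the stacks as plain lists, are assumptions made only for this example.

```python
def pop_forward(memory, eval_stack, save_stack, addr, critical):
    if critical:
        # Preserve the value about to be overwritten.
        save_stack.append(memory[addr])
    memory[addr] = eval_stack.pop()


def pop_reverse(memory, eval_stack, save_stack, addr, critical):
    # The stored value goes back onto the evaluation stack in both cases.
    eval_stack.append(memory[addr])
    if critical:
        # Only the critical form can restore the old contents of ADDR.
        memory[addr] = save_stack.pop()
```

In the noncritical case the old contents of ADDR are lost, so reverse execution restores the evaluation stack but not the previous value of the variable.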
popar CFLAG
Pops the address on top of the evaluation stack and places it in the address register.
Forward-Critical: The contents of the address register are pushed onto the save stack. The
address on top of the evaluation stack is popped and placed in the address register.
Forward-Noncritical: The address on top of the evaluation stack is popped off and placed
in the address register.
Reverse-Critical: The contents of the address register are pushed onto the evaluation stack.
Then the address on top of the save stack is popped off and placed in the address
register.
Reverse-Noncritical: The contents of the address register are pushed onto the evaluation
stack.
popd
Pops the top value from the dynamic scope stack.
Forward: Pops the top value from the dynamic scope stack and pushes it onto the save
dynamic scope stack.
Reverse: Pops the top value from the save dynamic scope stack and pushes it onto the
dynamic scope stack.
popir CFLAG
Pops the integer on top of the evaluation stack and places it in the index register.
Forward-Critical: The contents of the index register are pushed onto the save stack. Then
the integer on top of the evaluation stack is popped off and placed in the index register.
Forward-Noncritical: The integer on top of the evaluation stack is popped off and placed
in the index register.
Reverse-Critical: The contents of the index register are pushed onto the evaluation stack.
Then the integer on top of the save stack is popped off and placed in the index register.
Reverse-Noncritical: The contents of the index register are pushed onto the evaluation
stack.
push TYPE, ADDR
Pushes the value in ADDR onto the evaluation stack.
Forward: Pushes the value in ADDR onto the evaluation stack.
Reverse: Pops the top value of the evaluation stack and stores it in ADDR.
pusha ADDR
Pushes the calculated address of ADDR onto the evaluation stack. This instruction is
intended to be used for pushing the addresses of parameters passed by reference.
Forward: Pushes the calculated address of ADDR onto the evaluation stack.
Reverse: Pops and discards the address on top of the evaluation stack.
pushd #
Pushes the # onto the dynamic scope stack (where # is the index of a program, procedure,
or function entry in the Static Scope Table).
Forward: Pushes # onto the dynamic scope stack.
Reverse: Pops the top value from the dynamic scope stack.
read CFLAG, TYPE
Reads a value from the user.
Forward: A user interface function is called to get input from the user. The input is
converted from a string to the appropriate type and pushed onto the evaluation stack.
Reverse: The top value is popped off the evaluation stack.
return
Returns to the appropriate program address following a call instruction.
Forward: Pops the top value of the return address stack and loads it into the program
counter.
Reverse: Pushes the previous program counter onto the return address stack.
shl CFLAG, TYPE, #
Shifts the value on top of the evaluation stack # bits to the left, filling on the right with 0’s.
Forward-Critical: Pops the top value of the evaluation stack, pushes it onto the save stack,
then shifts it # bits to the left and pushes the result back onto the evaluation stack.
Forward-Noncritical: Pops the top value of the evaluation stack, shifts it left # bits, then
pushes the result back onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack. Then pops the top value of
the save stack and pushes it onto the evaluation stack.
Reverse-Noncritical: Nothing happens.
shr CFLAG, TYPE, #
Shifts the value on top of the evaluation stack # bits to the right, filling on the left with 0’s.
Forward-Critical: Pops the top value of the evaluation stack, pushes it onto the save stack,
then shifts it # bits to the right and pushes the result back onto the evaluation stack.
Forward-Noncritical: Pops the top value of the evaluation stack, shifts it right # bits,
then pushes the result back onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack. Then pops the top value of
the save stack and pushes it onto the evaluation stack.
Reverse-Noncritical: Nothing happens.
sub CFLAG, TYPE
Subtracts the value on the top of the evaluation stack from the second value from the top
and places the result onto the evaluation stack.
Forward-Critical: Pops the top two values of the evaluation stack, pushes the two values
onto the save stack, and then pushes the bottom value minus the top value onto the
evaluation stack.
Forward-Noncritical: Pops the top two values of the evaluation stack and pushes the
bottom value minus the top value onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards it. Pops the top
two values of the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
unalloc CFLAG, #
Deallocates a block of memory of # size beginning at the data address atop the evaluation
stack.
Forward-Critical: Pops the top value off the evaluation stack, which should be a data
address, copies # words of data memory starting at that address to the save stack,
then frees the data memory.
Forward-Noncritical: Pops the top value off the evaluation stack, which should be a data
address, and frees # words of data memory starting at that address.
Reverse-Critical: Pops the top value off the save stack, which should be a data address,
pushes it onto the evaluation stack and allocates # words of data memory starting at
that location. # words are then moved from the save stack to this data memory.
Reverse-Noncritical: Allocates # words of data memory and pushes the address of the
first word of allocated memory onto the evaluation stack.
uninst CFLAG, #
Disposes of an instance of variable register #.
Forward-Critical: Frees the memory occupied by the variable, then pops the top data
memory address off the variable register’s stack and pushes it onto the save stack.
Forward-Noncritical: Frees the memory occupied by the variable, then pops the top address
off the variable register’s stack.
Reverse-Critical: Pops the address off the save stack and pushes it onto the variable
register’s stack, then reallocates enough data memory for the variable # starting at
that address.
Reverse-Noncritical: Reallocates enough data memory for the variable # and pushes the
address of the data memory allocated onto the variable register’s stack.
unlink #
Disassociates a variable register from another.
Forward: Pops the top value of the variable stack pointed to by variable register # and
pushes it onto the save stack.
Reverse: Pops the top value of the save stack and pushes it onto the variable stack pointed
to by variable register #.
write CFLAG, TYPE
Displays a value for the user.
Forward-Critical: The top of the evaluation stack is popped and the value pushed onto the
save stack. This value is then converted into a string and passed to a user interface
function which takes appropriate action to display the value.
Forward-Noncritical: The top of the evaluation stack is popped and is converted into a
string and passed to a user interface function to be displayed.
Reverse-Critical: The value on top of the save stack is popped and pushed onto the
evaluation stack. Then a user interface function is called to handle undisplaying of
the last value displayed.
Reverse-Noncritical: DUMMY is pushed onto the evaluation stack and then a user
interface function is called to handle undisplaying of the last value displayed.
xor CFLAG, TYPE
Bitwise exclusive-or’s the top two values of the evaluation stack and places the result onto
the evaluation stack.
Forward-Critical: Pops the top two values of the evaluation stack, pushes the two values
onto the save stack, and then pushes the bottom value bitwise exclusive-or’ed with
the top value onto the evaluation stack.
Forward-Noncritical: Pops the top two values of the evaluation stack and pushes the
bottom value bitwise exclusive-or’ed with the top value onto the evaluation stack.
Reverse-Critical: Pops the top value of the evaluation stack and discards it. Pops the top
two values of the save stack and pushes them onto the evaluation stack.
Reverse-Noncritical: Pushes DUMMY onto the evaluation stack.
APPENDIX B
THE E-MACHINE ADDRESSING MODES
This appendix, which is adapted from chapter 2 of Birch’s thesis, describes the 
various addressing modes allowed in E-machine instructions. Quite a few modes 
are defined in order to accommodate standard high level language data structures 
more conveniently. Note that each addressing mode refers to either the data at 
the computed address or the computed address itself, depending on the instruction. 
That is, for those instructions that need a data value, such as push, the data value 
at the address computed from the addressing mode is used. For instructions that 
need an address, such as pop, the address that was computed from the addressing 
mode is used.
For each addressing mode listed below, an example of its intended use is given. 
Each example is given in pseudo assembly language form for clarity; it is important 
to remember that no assembler (and hence no assembly language) has yet been 
developed for the E-machine. However, the pseudo assembly language examples 
should be easily understood.
constant mode - C#
This mode is often called the immediate mode in other architectures; # is itself the integer,
real, boolean, character, or address constant operand required in the instruction.
Example:
    A := 1.5;
could be translated into:
    push  R,C1.5     ; push 1.5
    pop   c,R,V1     ; assign to A
variable mode - V#:
variable register # --> top of variable stack --> data memory
This mode accesses the data memory location given in the top element of the variable stack
that is pointed to by variable register #. This mode is intended to address source program
variables that are of one of the basic E-machine types.
Example:
    B := 1;
could be translated into:
    push  I,C1       ; push 1
    pop   c,I,V3     ; assign to B
variable indirect - (V#):
variable register # --> top of variable stack --> data memory --> data memory
This mode accesses the data in data memory whose location is stored at another data
memory location, which is pointed to by the top of the variable stack pointed to by variable
register #. This mode is intended for accessing the contents of high level language pointer
variables. It would be particularly useful for handling parameters in C which are passed as
pointers with the intention of passing by reference.
Example:
    int foo(C)
    int *C
    {
        *C = 1;
    }
could be translated into:
    label   c,5        ; procedure entry
    inst    c,V3       ; create new instance of C
    pop     c,A,V3     ; assign argument passed to *c
    push    I,C1       ; push 1
    pop     c,I,(V3)   ; assign to *c
    uninst  c,V3       ; destroy instance of C
    return             ; return from call
variable offset mode - V#{offset}:
variable register # --> top of variable stack + IR --> data memory
This mode accesses the data pointed to by the top of the variable register # stack plus
a byte offset which was previously loaded into the index register. This mode is useful for
accessing fields in a structured data type such as a Pascal record or C struct.
Example:
    A := D.Field2
could be translated into:
    push   I,2         ; D is at offset of 2 in structure
    popir  c           ; put offset into index register
    push   R,V4{IR}    ; push D.Field2
    pop    c,R,V1      ; assign to A
address indirect - (A):
address register --> data memory
This mode provides access to data located at the data address in the address register. The
address register must be loaded with a data memory address which points to data memory.
This mode is useful for multiple indirection.
Example:
    c = *(*g);
could be translated into:
    loadar  c,V7       ; load addr reg with addr of g
    loadar  c,(A)      ; load addr reg with addr of *g
    push    I,(A)      ; push *(*g)
    pop     c,I,V3     ; assign to c
address offset mode - A{offset}:
address register + IR --> data memory
This mode provides access to structured data through the address register. The index
register is added to the address register to provide an address to the data to be accessed.
This mode is useful for indirection with structured data, such as pointers to records in
Pascal.
Example:
    I := H^.Data
could be translated into:
    push   A,V8        ; push H^ (address value of H)
    popar  c           ; load ar with H^
    push   I,C2        ; Data has offset of 2 in record
    popir  c           ; load ir with offset
    push   I,A{IR}     ; push H^.Data
    pop    c,I,V9      ; assign to I
variable indexed mode - V#[index]:
variable register # --> top of variable stack + IR * data size --> data memory
This address mode uses the top of the variable register # stack as a base address and adds
the index register, which must be previously loaded, multiplied by the number of bytes
occupied by the data type, which is a basic E-machine data type. The resulting address
points to the data item. This mode is useful for accessing an array whose elements are of a
basic E-machine data type.
Example:
    B := L[3];
could be translated into:
    push   n,I,3       ; put index of 3 into
    popir  c           ; the index register
    push   I,V12[IR]   ; push L[3]
    pop    c,I,V2      ; assign to B
address indexed mode - A[index]:
address register + IR * data size --> data memory
This mode provides the same function as variable indexed mode, except instead of a variable
register providing the base address, the address register is loaded with the base address.
This mode could be used for accessing elements of an array which is pointed to by a variable.
Example:
    B := ST[4];
could be translated into:
    push   A,V19       ; put address of array into
    popar  c           ; address register
    push   I,4         ; put index of 4 into
    popir  c           ; the index register
    push   I,A[IR]     ; push ST[4]
    pop    c,I,V2      ; assign to B
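The address arithmetic behind the non-constant addressing modes can be summarized in a short Python sketch that follows the arrow diagrams given for each mode. It is an illustration only: the Machine class, the mode strings, and the effective_address function are hypothetical names, and data memory is modeled as a dictionary from address to value.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical register and memory model used only for this sketch."""
    variable_stacks: dict                        # variable register number -> stack of data addresses
    memory: dict = field(default_factory=dict)   # data address -> stored value
    ar: int = 0                                  # address register
    ir: int = 0                                  # index register


def effective_address(m, mode, reg=None, data_size=1):
    """Return the data memory address that an addressing mode computes."""
    if mode == "V#":          # variable mode
        return m.variable_stacks[reg][-1]
    if mode == "(V#)":        # variable indirect: follow the stored pointer
        return m.memory[m.variable_stacks[reg][-1]]
    if mode == "V#{IR}":      # variable offset: base plus byte offset
        return m.variable_stacks[reg][-1] + m.ir
    if mode == "(A)":         # address indirect
        return m.ar
    if mode == "A{IR}":       # address offset
        return m.ar + m.ir
    if mode == "V#[IR]":      # variable indexed: index scaled by element size
        return m.variable_stacks[reg][-1] + m.ir * data_size
    if mode == "A[IR]":       # address indexed
        return m.ar + m.ir * data_size
    raise ValueError("unknown addressing mode: " + mode)
```

An instruction that needs a data value, such as push, would then read memory at the returned address, while an instruction that needs an address, such as pop, would use the returned address directly.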
APPENDIX C
A miniPASCAL COMPILATION EXAMPLE
This appendix provides an example showing the complete results of the compi­
lation of a miniPascal program. The compilation was produced on a DOS machine. 
The example program, shown in figure 27, is referred to as program Samp3 throughout
the remainder of this appendix. The numbers on the left refer to source program 
line numbers. The program, as shown in figure 27 (with the exception of the line 
numbers), is written to the SOURCESECTION portion of the E-machine object 
file (or E-code file). Program Samp3 contains several features that were not illus­
trated previously. These features include constant and type declarations, a record 
definition, a two dimensional array, and an array of records. The record definition, 
DRec, consists of two fields, one of which is a two-dimensional array of the pre­
viously defined Matrx type. An array of these records (DBase) is then declared, 
with an instance of such an array (Data) being declared in the variable declaration 
section of the main program. Another variable—also named Data—is declared in 
the formal parameter list of procedure InitD. In this case, Data is declared as only 
a single record of type DRec.
Program Samp3 also contains a situation in which a packet becomes fragmented 
(see chapter 4 for a discussion of the packet fragmentation problem). The frag­
mentation occurs in procedure InitD, which contains a nested for loop in which the 
inner for loop is a single statement within another conditionally executed statement.
The particular packet fragmentation situation found in program Samp3 is discussed 
later in this appendix.
Table 12 shows the array containing the program memory addresses corresponding
to program Samp3’s generated E-code label instructions. The column holding 
the label numbers (or label register numbers) is included for clarity—only the array 
of program memory addresses is actually written to the LABELSECTION portion 
of the E-code file.
Table 13 shows the array containing the data memory sizes reserved for program
Samp3’s variable registers. The columns holding the variable names and the 
variable register numbers are included for clarity—only the array of data memory 
sizes is actually written to the VARIABLESECTION portion of the E-code file. 
The variable registers whose corresponding names are blank are temporary regis­
ters needed to hold intermediate values. In this implementation, the data memory 
sizes are in terms of bytes; hence, the corresponding data memory size for a 32-bit 
integer (e.g., J) is 4. As can be seen in table 13, the full size of the array of records 
(variable Data represented by variable register number 3) is reserved for the array. 
The full size of a single record (40 bytes) is reserved for the record Data (variable 
register number 6) found in procedure InitD. Variable register number 18, repre­
senting a 2-byte temporary variable, holds the result of the i f  comparison found in 
function Fact.
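The sizes quoted above can be checked with a short calculation. The Python sketch below is illustrative only: the 4-byte INTEGER size is stated in the text, whereas the 4-byte REAL size is an assumption inferred from the 40-byte record size in table 13. The function names are hypothetical.

```python
INTEGER_SIZE = 4  # stated in the text: a 32-bit integer occupies 4 bytes
REAL_SIZE = 4     # assumption: inferred from the 40-byte DRec size in table 13


def array_size(bounds, element_size):
    """Size in bytes of an array, given (lower, upper) bounds for each index."""
    count = 1
    for lower, upper in bounds:
        count *= upper - lower + 1
    return count * element_size


def record_size(field_sizes):
    """Size in bytes of a record whose fields are laid out contiguously."""
    return sum(field_sizes)


matrx_size = array_size([(1, 3), (1, 3)], REAL_SIZE)  # Matrx: 3 x 3 REALs
drec_size = record_size([matrx_size, INTEGER_SIZE])   # DRec: fields A and B
dbase_size = array_size([(1, 2)], drec_size)          # DBase: two DRec records
```

Under these assumptions the sketch yields 36, 40, and 80 bytes, which agrees with the sizes of variable registers 6 and 3 in table 13 and with the offset of field B (36) in table 15.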
Figure 28 shows the contents of program Samp3’s string space array. In this
example, the string literal ’Sample Program’, associated with the string constant
Name, is entered into the string array. The string array is subsequently written to
the STRINGSECTION portion of the E-code file.
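A minimal Python sketch of how such a string space could be filled is given below. It assumes, based only on the layout in figure 28, that index 0 holds a null character and that each literal is followed by a terminating null; the function name add_string is hypothetical.

```python
def add_string(string_space, literal):
    """Append a literal plus a terminating null; return the index of its first character."""
    start = len(string_space)
    string_space.extend(literal)  # one list entry per character
    string_space.append("\0")     # assumed terminator, as in figure 28
    return start


string_space = ["\0"]  # assumption: index 0 is reserved and holds a null character
name_index = add_string(string_space, "Sample Program")
```

Under these assumptions the literal begins at index 1, which matches the constant pushed by instruction number 10 (push I,C1) in figure 29.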
Table 14 shows the Packet Table generated for program Samp3. The column 
holding the packet number is included for clarity—the remaining fields (columns) 
are written to the PACKETSECTION of the E-code file. As can be seen in table 14,
packet number 24 is a fragmented packet. This fragmentation situation is discussed 
later in this appendix. There are also two packets (numbers 36 and 43) whose 
execution should not result in changing the animation display. These two packets 
correspond to a return from a function call; this situation was discussed in the Parser 
module section of chapter 4.
Table 15 shows the Static Scope Table for program Samp3. The column holding
the entry number is included for clarity; the remaining fields (columns) are written
to the STATSCOPESECTION of the E-code file. Two previously unillustrated
types of scope blocks are found in table 15. These are a record description scope
block (entries 6-9) and an array index description scope block (entries 10-12).
As can be seen in table 15, two identifiers (entry 1 in procedure InitD’s scope
block and entry 21 in program Samp3’s scope block) both refer to the same child scope
block, which is the record scope block describing a record of type DRec (entries 6-9).
The compiler is able to determine that this record description scope block needs
to be present only once (and possibly referenced multiple times) by querying the
Scope Owner Table’s Record Descriptor field, as discussed in the STATSCOPE
module section of chapter 4.
Examine entry number 7 in table 15. This entry describes field A of a record 
of type DRec; field A is an array of type Matrx. The bounds of A’s first index are 
included in entry number 7. The NxtIdx field of this entry holds the index of the 
first entry of the scope block describing A’s second index (entries 10-12). Entry
numbers 7 and 8, which describe the fields named A and B, also utilize the Offset 
field to denote the fields’ offsets from the beginning of the record. Finally, note that 
the RecSiz field is utilized for a variable representing an array of records (e.g., entry 
number 21 describing the variable Data in program Samp3). This value is required 
by the animator for proper calculation of offsets when displaying values associated 
with arrays of records.
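The offset arithmetic implied by the RecSiz, Offset, and bound fields can be sketched in Python as follows. This is an illustration, not the animator's code: the function names are hypothetical, the 4-byte REAL element size is an assumption inferred from table 13, and the row-major formula assumes lower bounds of 1, as in program Samp3.

```python
def record_element_offset(index, lower_bound, rec_siz):
    """Byte offset of one record within an array of records of rec_siz bytes."""
    return (index - lower_bound) * rec_siz


def matrix_element_offset(i, j, cols, element_size, field_offset):
    """Byte offset of A[i, j] within a record, for 1-based row-major indices.

    Mirrors instructions 66-94 of figure 29: the linear index is
    i * cols + j - (cols + 1), scaled by the element size, plus the
    field's Offset entry from the Static Scope Table.
    """
    linear_index = i * cols + j - (cols + 1)
    return linear_index * element_size + field_offset


# Data[2].B: second record (RecSiz 40) plus the Offset of field B (36).
offset_of_data2_b = record_element_offset(2, 1, 40) + 36

# Data[2].A[3,3]: second record plus the last element of the 3 x 3 matrix.
offset_of_data2_a33 = (record_element_offset(2, 1, 40)
                       + matrix_element_offset(3, 3, 3, 4, 0))
```

With the values from table 15, Data[2].B lies 76 bytes and Data[2].A[3,3] lies 72 bytes from the start of variable Data, both within the 80 bytes reserved for variable register 3.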
Figure 29 shows the pseudo assembly language representation of the E-code instructions
generated for program Samp3. Figure 29 is formatted to enumerate the
program’s animation units, with translated E-code packets printed directly beneath
corresponding animation units. Here again, the reader is reminded that the pseudo
assembly language format is used for clarity; it is an array of C structures representing
the E-code instruction stream that is actually written to the CODESECTION
of the E-code file.
Figure 29 illustrates several situations that need to be discussed. First, examine 
E-code instruction numbers 3-11. Each name declared in the constant declaration 
section is assigned a variable register number. The constants’ values are then stored 
in their corresponding variable registers. Thus, the compiler treats constants as 
though they were variable names in order to allow the animator access to their values
at run time. Figure 30 shows a possible animation snapshot after the constant
declarations have been executed (i.e., at this point the keyword TYPE will be highlighted,
indicating that it is the next animation unit to be executed). It should be
noted that as each of the subsequent type declarations is executed, the animator
simply highlights the corresponding animation unit in sequence; there will be no
corresponding data memory values added to the right-hand side of the display until 
variable names are actually declared. The nop instructions (numbers 12-15) serve as 
“dummy” instructions to allow the animator to highlight the appropriate animation 
unit.
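The treatment of an integer constant as a variable can be sketched as follows. The Python function below merely produces the textual pseudo assembly shown in figure 29 for instruction numbers 3-5; it is a hypothetical illustration and does not reproduce the C structures that the compiler actually writes.

```python
def emit_int_constant(var_reg, value):
    """Return pseudo assembly that stores an integer constant in a variable register."""
    return [
        f"inst c,V{var_reg}",   # create an instance of the constant's variable
        f"push I,C{value}",     # push the constant's value
        f"pop c,I,V{var_reg}",  # store the value in data memory
    ]


rows_code = emit_int_constant(0, 3)  # Rows = 3; uses variable register 0
cols_code = emit_int_constant(1, 3)  # Cols = 3; uses variable register 1
```

Each constant thus costs three instructions and one variable register, in exchange for which its value is visible to the animator at run time.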
As mentioned above, program Samp3 contains a situation in which a packet is 
fragmented. This packet, number 24, is the E-code translation of the animation 
unit
Data.A[I,J] := I + 101.33 * MultF;
This animation unit is part of a single for statement, thus illustrating the fragmentation
of a packet resulting from a single for statement nested within a
conditionally executed statement, in this case another for statement. The fragmentation
problem is manifested as follows. As the inner for loop is executed, the
animator sequentially highlights the four animation units composing the inner for 
statement (i.e., the animation units translated by E-code packets numbered 21-24). 
The animator repeats this process upon each iteration of the inner loop. When the 
inner loop index eventually reaches its upper limit, the branch to label 7 (shown in 
instruction number 64) is taken. The E-code instruction defining label 7 (instruction
number 117) is contained in packet number 24, which translates the above-mentioned
animation unit. At this point, however, this animation unit should not
be highlighted (since the instructions translating the assignment statement represented
by the animation unit will not now be executed). The FragAddr field for
packet number 24 in the Packet Table (shown in table 14) holds the value 117, indicating
that packet number 24 is considered fragmented whenever control branches
into the packet at (or beyond) instruction number 117. The animator queries the
E-machine’s program counter and packet register to determine which animation unit
(if any) should be highlighted prior to the E-machine’s execution of the corresponding
packet. Thus, the animator, upon querying the E-machine’s program counter
(currently 117) and packet register (currently 24), determines that packet number 24
is fragmented at its current point of entry, instruction number 117. The animator
must now retain its previous display while the E-machine executes instruction numbers
117-118. When the branch to label 2 (shown in instruction number 118) is
accomplished, the E-machine returns control to the animator, which again queries
the E-machine’s program counter (currently 33) and packet register (currently 19).
Since packet number 19 is not fragmented, the animator now highlights the animation
unit corresponding to this packet,
I := 1 TO Rows
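The highlighting decision described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the animator's implementation: the Packet type and the function should_highlight are hypothetical, the three packet entries are copied from table 14, and the handling of the Display Packet field follows the earlier remark that packets 36 and 43 should not change the display.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    start_addr: int
    end_addr: int
    frag_addr: int  # -1 when the packet has no fragmentation point
    display: bool   # the Display Packet field of table 14


# A few entries from table 14, keyed by packet number.
PACKETS = {
    19: Packet(32, 46, -1, True),
    24: Packet(66, 118, 117, True),
    36: Packet(166, 168, -1, False),
}


def should_highlight(packet_number, program_counter):
    """Decide whether the animation unit for this packet should be highlighted."""
    packet = PACKETS[packet_number]
    if not packet.display:
        return False  # e.g., a return from a function call
    if packet.frag_addr != -1 and program_counter >= packet.frag_addr:
        return False  # control entered the packet at or beyond its FragAddr
    return True
```

For example, should_highlight(24, 66) is true on normal entry to packet 24, should_highlight(24, 117) is false when the inner loop exits through label 7, and should_highlight(19, 33) is true once control reaches label 2.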
Finally, figures 31 and 32 show two possible animation screens occurring during
the animation of program Samp3. Figure 31 shows an animation display that could
occur immediately before procedure InitD is called from the main program (i.e., the
animator is highlighting the animation unit InitD(Data[Num],3); while awaiting
a response from the user). The dotted lines shown in the source program window
indicate omitted source lines. Figure 32 shows an animation display that could
occur immediately before a return is issued from procedure InitD (i.e., the animator
is highlighting the animation unit END; in procedure InitD). Here again, the dotted
lines indicate omitted source lines.
 0  Program Samp3;
 1  CONST
 2   Rows = 3;
 3   Cols = 3;
 4   Name = 'Sample Program';
 5  TYPE
 6   Matrx = ARRAY [1..Rows,1..Cols] of REAL;
 7   DRec = RECORD
 8        A:Matrx;
 9        B:INTEGER;
10     END; { DRec }
11   DBase = ARRAY [1..2] OF DRec;
12  VAR
13   Data:DBase;
14   Num,nFact:INTEGER;
15
16   Procedure InitD(VAR Data:DRec;
17                   MultF:INTEGER);
18     VAR
19      I,J:INTEGER;
20     BEGIN { Procedure InitD }
21      FOR I := 1 TO Rows DO
22       FOR J := 1 TO Cols DO
23        Data.A[I,J] := I + 101.33 * MultF;
24      Data.B := MultF;
25      END; { Procedure InitD }
26
27   Function Fact(n:INTEGER):INTEGER;
28     BEGIN { Function Fact }
29      IF n = 0
30        THEN Fact := 1
31        ELSE Fact := n * Fact(n-1)
32      END; { Function Fact }
33
34   BEGIN { Program Samp3 }
35    Num := 2;
36    InitD(Data[Num],3);
37    nFact := Fact(Data[Num].B);
38    END. { Program Samp3 }
Figure 27: The E-code SOURCESECTION for Program Samp3
Label Register  Label Program
Number          Address
L0              187
L1               21
L2               33
L3               42
L4              119
L5               51
L6               60
L7              117
L8              139
L9              145
L10             157
L11             179
L12             166
L13             208
L14             233
Table 12: The E-code LABELSECTION for Program Samp3
Variable  Variable  Variable
Name      Register  Size
          Number
Rows             0         4
Cols             1         4
Name             2         4
Data             3        80
nFact            4         4
Num              5         4
Data             6        40
MultF            7         4
J                8         4
I                9         4
                10         4
                11         4
                12         4
                13         4
                14         4
                15         4
n               16         4
Fact            17         4
                18         2
                19         4
                20         4
                21         4
                22         4
                23         4
                24         4
                25         4
                26         4
Table 13: The E-code VARIABLESECTION for Program Samp3
Index  String Space
    0  \0
    1  S
    2  a
    3  m
    4  p
    5  l
    6  e
    7  (blank)
    8  P
    9  r
   10  o
   11  g
   12  r
   13  a
   14  m
   15  \0
Figure 28: The E-code STRINGSECTION for Program Samp3
Packet   Start    End  Start  Start    End    End  Scope   Frag  Display
Number    Addr   Addr   Line    Col   Line    Col  Index   Addr  Packet
     0       0      1      0      0      0     13      0     -1  TRUE
     1       2      2      1      0      1      4      0     -1  TRUE
     2       3      5      2      1      2      9      1     -1  TRUE
     3       6      8      3      1      3      9      2     -1  TRUE
     4       9     11      4      1      4     24      3     -1  TRUE
     5      12     12      5      0      5      3      3     -1  TRUE
     6      13     13      6      1      6     40      3     -1  TRUE
     7      14     14      7      1     10      6      3     -1  TRUE
     8      15     15     11      1     11     29      3     -1  TRUE
     9      16     16     12      0     12      2      3     -1  TRUE
    10      17     17     13      1     13     11      4     -1  TRUE
    11      18     20     14      1     14     18      6     -1  TRUE
    12      21     22     16      1     16     15      0     -1  TRUE
    13      23     23     16     17     16     30      1     -1  TRUE
    14      24     25     17     17     17     31      2     -1  TRUE
    15      26     26     18      3     18      5      2     -1  TRUE
    16      27     28     19      4     19     15      4     -1  TRUE
    17      29     29     20      3     20      7      4     -1  TRUE
    18      30     31     21      4     21     13      4     -1  TRUE
    19      32     46     21      8     21     21      4     -1  TRUE
    20      47     47     21     23     21     24      4     -1  TRUE
    21      48     49     22      5     22     14      4     -1  TRUE
    22      50     64     22      9     22     22      4     -1  TRUE
    23      65     65     22     24     22     25      4     -1  TRUE
    24      66    118     23      6     23     39      4    117  TRUE
    25     119    130     24      4     24     19      4     -1  TRUE
    26     131    138     25      4     25      7      7     -1  TRUE
    27     139    140     27      1     27     13      0     -1  TRUE
    28     141    142     27     15     27     24      1     -1  TRUE
    29     143    143     27     25     27     33      2     -1  TRUE
    30     144    144     28      3     28      7      2     -1  TRUE
    31     145    152     29      4     29     11      2     -1  TRUE
    32     153    153     30      6     30      9      2     -1  TRUE
    33     154    156     30     11     30     19      2     -1  TRUE
    34     157    158     31      6     31      9      2     -1  TRUE
    35     159    165     31     23     31     31      2     -1  TRUE
    36     166    168      -      -      -      -      2     -1  FALSE
    37     169    178     31     11     31     31      2     -1  TRUE
    38     179    186     32      4     32      7      8     -1  TRUE
    39     187    188     34      1     34      5      8     -1  TRUE
    40     189    190     35      2     35     10      8     -1  TRUE
    41     191    207     36      2     36     20      8     -1  TRUE
    42     208    232     37     11     37     27      8     -1  TRUE
    43     233    235      -      -      -      -      8     -1  FALSE
    44     236    237     37      2     37     28      8     -1  TRUE
    45     238    250     38      2     38      5      8     -1  TRUE
Table 14: The E-code PACKETSECTION for Program Samp3
Entry  Id     Upr  Lwr  Nxt  Offset  Type         Rec  Parent  Child  Var  Proc
       Name   Bnd  Bnd  Idx                       Siz                 Reg   Num
Scope block describing procedure InitD
    0           -    -    -       -  HEADER         -      17      -    -     -
    1  Data     -    -    -       -  RECORD         -       -      6    6     -
    2  MultF    -    -    -       -  INTEGER        -       -      -    7     -
    3  I        -    -    -       -  INTEGER        -       -      -    9     -
    4  J        -    -    -       -  INTEGER        -       -      -    8     -
    5           -    -    -       -  END            -       -      -    -     -
Scope block describing record of type DRec
    6           -    -    -       -  HEADER         -       -      -    -     -
    7  A        3    1   10       0  REAL           -       -      -    -     -
    8  B        -    -    -      36  INTEGER        -       -      -    -     -
    9           -    -    -       -  END            -       -      -    -     -
Scope block describing second index of array of type Matrx
   10           -    -    -       -  HEADER         -       -      -    -     -
   11           3    1    -       -  -              -       -      -    -     -
   12           -    -    -       -  END            -       -      -    -     -
Scope block describing function Fact
   13           -    -    -       -  HEADER         -      17      -    -     -
   14  n        -    -    -       -  INTEGER        -       -      -   16     -
   15  Fact     -    -    -       -  INTEGER        -       -      -   17     -
   16           -    -    -       -  END            -       -      -    -     -
Scope block describing program Samp3
   17           -    -    -       -  HEADER         -      27      -    -     -
   18  Rows     -    -    -       -  INTCONST       -       -      -    0     -
   19  Cols     -    -    -       -  INTCONST       -       -      -    1     -
   20  Name     -    -    -       -  STRINGCONST    -       -      -    2     -
   21  Data     2    1    -       -  RECORD        40       -      6    3     -
   22  Num      -    -    -       -  INTEGER        -       -      -    5     -
   23  nFact    -    -    -       -  INTEGER        -       -      -    4     -
   24  InitD    -    -    -       -  PROCEDURE      -       -      0    -     1
   25  Fact     -    -    -       -  FUNCTION       -       -     13    -     2
   26           -    -    -       -  END            -       -      -    -     -
Bootstrap scope block
   27           -    -    -       -  HEADER         -       -      -    -     -
   28  Samp3    -    -    -       -  PROGRAM        -       -     17    -     0
   29           -    -    -       -  END            -       -      -    -     -
Table 15: The E-code STATSCOPESECTION for Program Samp3
Pkt   Animation Unit
Num   Instr  E-code
      Num    Instruction
  0   Program Samp3;
        0    pushd  C28        ; Push program's Static Scope Table
        1    nop               ; index onto dynamic scope stack
  1   CONST
        2    nop
  2   Rows = 3;
        3    inst   c,V0       ; Create instance of Rows
        4    push   I,C3       ; Store value of Rows in data memory
        5    pop    c,I,V0
  3   Cols = 3;
        6    inst   c,V1       ; Create instance of Cols
        7    push   I,C3       ; Store value of Cols in data memory
        8    pop    c,I,V1
  4   Name = 'Sample Program';
        9    inst   c,V2       ; Create instance of Name
       10    push   I,C1       ; Store Name's string space index in
       11    pop    c,C,V2     ; corresponding variable register
  5   TYPE
       12    nop
  6   Matrx = ARRAY [1..Rows,1..Cols] OF REAL;
       13    nop
  7   DRec = RECORD
           A:Matrx;
           B:INTEGER;
        END;
       14    nop
  8   DBase = ARRAY [1..2] OF DRec;
       15    nop
  9   VAR
       16    nop
 10   Data:DBase;
       17    inst   c,V3       ; Create instance of Data
Figure 29: The E-code CODESECTION for Program Samp3
 11   Num,nFact:INTEGER;
       18    inst   c,V4       ; Create instance of nFact
       19    inst   c,V5       ; Create instance of Num
       20    br     0          ; Branch to beginning of main program
 12   Procedure InitD
       21    label  1          ; Enter Procedure InitD
       22    pushd  C24        ; Push procedure's Static Scope Table
                               ; index onto dynamic scope stack
 13   (VAR Data:DRec;
       23    link   V6         ; Link Data to actual param
 14   MultF:INTEGER);
       24    inst   c,V7       ; Create instance of MultF
       25    pop    c,I,V7     ; Put actual param into MultF
 15   VAR
       26    nop
 16   I,J:INTEGER;
       27    inst   c,V8       ; Create instance of J
       28    inst   c,V9       ; Create instance of I
 17   BEGIN
       29    nop
 18   FOR I := 1
       30    push   I,C1       ; Initialize I with value of 1
       31    pop    c,I,V9
 19   I := 1 TO Rows
       32    br     3          ; Branch around MAXINT test and
                               ; increment of I on first pass
                               ; through the loop
       33    label  2          ; Test label of outer FOR loop
       34    push   I,V9
       35    push   I,C32767
       36    eql    c,I        ; Test that I has not exceeded MAXINT
       37    brt    c,4        ; If so, branch out of loop
       38    push   I,V9
       39    push   I,C1
       40    add    c,I        ; Increment I
       41    pop    c,I,V9
       42    label  3
       43    push   I,V9
       44    push   I,C3
       45    gtr    c,I        ; Test for I reaching upper loop limit
       46    brt    c,4        ; If so, branch out of loop
Figure 29 (continued)
 20   DO
       47    nop
 21   FOR J := 1
       48    push   I,C1       ; Initialize J with value of 1
       49    pop    c,I,V8
 22   J := 1 TO Cols
       50    br     6          ; Branch around MAXINT test and
                               ; increment of J on first pass
                               ; through the loop
       51    label  5          ; Test label of inner FOR loop
       52    push   I,V8
       53    push   I,C32767
       54    eql    c,I        ; Test that J has not exceeded MAXINT
       55    brt    c,7        ; If so, branch out of loop
       56    push   I,V8
       57    push   I,C1
       58    add    c,I        ; Increment J
       59    pop    c,I,V8
       60    label  6
       61    push   I,V8
       62    push   I,C3
       63    gtr    c,I        ; Test for J reaching upper loop limit
       64    brt    c,7        ; If so, branch out of loop
 23   DO
       65    nop
 24   Data.A[I,J] := I + 101.33 * MultF;
       66    inst   c,V10      ; Create instance of temporary
       67    push   I,V9       ; variable (V10) and store value of
       68    pop    c,I,V10    ; first index (I) in V10
       69    inst   c,V11      ; Create instance of temporary
       70    push   I,V8       ; variable (V11) and store value of
       71    pop    c,I,V11    ; second index (J) in V11
       72    inst   c,V12      ; Create instance of temporary
       73    push   I,C0       ; variable (V12) and calculate the
       74    pop    c,I,V12    ; final (linear) array index value
       75    push   I,V10      ; based on the values of the two
       76    push   I,C3       ; indices, I and J
       77    mult   c,I
       78    push   I,V11
       79    add    c,I
       80    pop    c,I,V10
       81    push   I,V10
       82    push   I,C4
       83    sub    c,I
       84    pop    c,I,V10
(Packet number 24 continued on next page)
Figure 29 (continued)
(Continuation of packet number 24)
       85    push   I,V10
       86    push   I,C4
       87    mult   c,I
       88    push   I,V12
       89    add    c,I        ; Store calculated value of final
       90    pop    c,I,V12    ; index in V12
       91    push   I,V12      ; Convert index value in V12 to
       92    push   I,C0       ; offset value
       93    add    c,I
       94    pop    c,I,V12
       95    inst   c,V13      ; Create instance of temporary
       96    push   R,C101.33  ; variable (V13) to hold result
       97    push   I,V7       ; of 101.33 * MultF
       98    cast   c,I,R      ; Cast MultF to REAL
       99    mult   c,R        ; 101.33 * MultF
      100    pop    c,R,V13    ; Store multiplication result in V13
      101    inst   c,V14      ; Create instance of temporary
      102    push   I,V9       ; variable (V14) to hold result
      103    cast   c,I,R      ; I + V13 and cast I to REAL
      104    push   R,V13
      105    add    c,R        ; I + V13
      106    pop    c,R,V14    ; Store addition result in V14
      107    push   R,V14
      108    push   I,V12
      109    popir  c          ; Put offset value in index reg
      110    pop    c,R,V6{IR} ; Put V14's value in Data.A[I,J]
      111    uninst c,V14      ; Delete instances of temporary
      112    uninst c,V13      ; variables created within the
      113    uninst c,V12      ; inner FOR loop
      114    uninst c,V11
      115    uninst c,V10
      116    br     5          ; Branch to test of inner FOR loop
      117    label  7          ; Branch out label of inner FOR loop
      118    br     2          ; Branch to test of outer FOR loop
 25   Data.B := MultF;
      119    label  4          ; Branch out label of outer FOR loop
      120    inst   c,V15      ; Create instance of a temporary
      121    push   I,C0       ; variable (V15) to hold offset of
      122    pop    c,I,V15    ; field B
      123    push   I,V15      ; Calculate offset of field B
      124    push   I,C36
      125    add    c,I
      126    pop    c,I,V15    ; Store offset of field B in V15
      127    push   I,V7       ; Put MultF on evaluation stack
      128    push   I,V15      ; Put offset of field B on eval stack
      129    popir  c          ; Put offset of field B in index reg
      130    pop    c,I,V6{IR} ; Put MultF in Data.B
Figure 29 (continued)
 26   END;
      131    nop
      132    uninst c,V15      ; Delete instance of temporary
                               ; variable
      133    uninst c,V8       ; Delete instance of J
      134    uninst c,V9       ; Delete instance of I
      135    uninst c,V7       ; Delete instance of MultF
      136    unlink c,V6       ; Unlink Data
      137    popd              ; Pop procedure's Static Scope Table
                               ; from the dynamic scope stack
      138    return            ; Return to calling scope
 27   Function Fact
      139    label  L8         ; Enter Function Fact
      140    pushd  C25        ; Push function's Static Scope Table
                               ; index onto dynamic scope stack
 28   (n:INTEGER)
      141    inst   c,V16      ; Create instance of n
      142    pop    c,I,V16    ; Put actual param into n
 29   :INTEGER;
      143    inst   c,V17      ; Create instance of Fact (function's
                               ; return value)
 30   BEGIN
      144    nop
 31   IF n = 0
      145    label  L9
      146    inst   c,V18      ; Create instance of temporary
      147    push   I,V16      ; variable (V18) to hold comparison
      148    push   I,C0       ; result
      149    eql    c,I        ; Check for n = 0
      150    pop    c,B,V18    ; Put comparison result in V18
      151    push   B,V18
      152    brf    c,10       ; If n not = 0, branch to ELSE
 32   THEN
      153    nop
 33   Fact := 1
      154    push   I,C1       ; Put 1 in Fact
      155    pop    c,I,V17
      156    br     11         ; Branch around ELSE
 34   ELSE
      157    label  10         ; ELSE label
      158    nop
Figure 29 (continued)
 35   Fact(n-1)
      159    inst   c,V19      ; Create instance of temporary
      160    push   I,V16      ; variable (V19) to hold n-1
      161    push   I,C1
      162    sub    c,I        ; Subtract 1 from n
      163    pop    c,I,V19    ; Put n-1 in V19
      164    push   I,V19      ; Push n-1 onto evaluation stack
      165    call   8          ; Call Fact
 36
      166    label  12         ; Return from Fact
      167    inst   c,V20      ; Create instance of temporary
      168    pop    c,I,V20    ; variable (V20) to hold function
                               ; value
 37   Fact := n * Fact(n-1)
      169    inst   c,V21      ; Create instance of temporary
      170    push   I,V16      ; variable (V21) to hold n * Fact(n-1)
      171    push   I,V20
      172    mult   c,I
      173    pop    c,I,V21    ; Put multiplication result in V21
      174    push   I,V21
      175    pop    c,I,V17    ; Put function value in Fact
      176    uninst c,V21      ; Delete instances of temporary
      177    uninst c,V20      ; variables created in ELSE clause
      178    uninst c,V19
 38   END;
      179    label  11         ; Branch out label for ELSE
      180    nop
      181    push   I,V17      ; Put function value on eval stack
      182    uninst c,V18      ; Delete instance of temp variable
      183    uninst c,V17      ; Delete instance of Fact's result var
      184    uninst c,V16      ; Delete instance of n
      185    popd              ; Pop function's Static Scope Table
                               ; index from the dynamic scope stack
      186    return            ; Return to calling scope
 39   BEGIN
      187    label  0          ; Start label for main program
      188    nop
 40   Num := 2;
      189    push   I,C2       ; Put value of 2 in Num
      190    pop    c,I,V5
Figure 29 (continued)
 41   InitD(Data[Num],3);
      191    inst   c,V22      ; Create instance of temporary
      192    push   I,V5       ; variable (V22) and store value
      193    pop    c,I,V22    ; of index (Num) in V22
      194    push   I,V22      ; Calculate final (linear) array
      195    push   I,C1       ; index
      196    sub    c,I
      197    pop    c,I,V22    ; Put final array index in V22
      198    push   I,C3       ; Put 3 on evaluation stack
      199    inst   c,V23      ; Create instance of temporary
      200    push   I,V22      ; variable (V23) to hold offset of
      201    push   I,C40      ; Data
      202    mult   c,I        ; Calculate Data's offset and
      203    pop    c,I,V23    ; put it in V23
      204    push   I,V23
      205    popir  c          ; Put Data's offset in index reg
      206    pusha  V3{IR}     ; Put address of Data[Num] on
                               ; eval stack
      207    call   1          ; Call InitD
 42   Fact(Data[Num].B)
      208    label  13         ; Return from InitD
      209    inst   c,V24      ; Create instance of temporary
      210    push   I,V5       ; variable (V24) and store value
      211    pop    c,I,V24    ; of index (Num) in V24
      212    inst   c,V25      ; Create instance of temporary
      213    push   I,C0       ; variable to hold calculated
      214    pop    c,I,V25    ; offset of Data[Num].B
      215    push   I,V24
      216    push   I,C1
      217    sub    c,I
      218    pop    c,I,V24
      219    push   I,V24
      220    push   I,C40
      221    mult   c,I
      222    push   I,V25
      223    add    c,I
      224    pop    c,I,V25
      225    push   I,V25
      226    push   I,C36
      227    add    c,I
      228    pop    c,I,V25
      229    push   I,V25      ; Put offset of Data[Num].B in
      230    popir  c          ; index reg
      231    push   I,V3{IR}   ; Put Data[Num].B on eval stack
      232    call   8          ; Call Fact
Figure 29 (continued)
 43
      233    label  14         ; Return from Fact
      234    inst   c,V26      ; Create instance of temp variable
      235    pop    c,I,V26    ; (V26) to hold function value
 44   nFact := Fact(Data[Num].B);
      236    push   I,V26
      237    pop    c,I,V4     ; Put value of Fact in nFact
 45   END.
      238    nop
      239    uninst c,V26      ; Delete instances of temporary
      240    uninst c,V25      ; variables
      241    uninst c,V24
      242    uninst c,V23
      243    uninst c,V22
      244    uninst c,V4       ; Delete instance of nFact
      245    uninst c,V5       ; Delete instance of Num
      246    uninst c,V3       ; Delete instance of Data
      247    uninst c,V2       ; Delete instance of Name
      248    uninst c,V1       ; Delete instance of Cols
      249    uninst c,V0       ; Delete instance of Rows
      250    popd              ; Pop program's Static Scope Table
                               ; index from the dynamic scope stack
Figure 29 (continued)
Program Samp3;                              Program Samp3
CONST                                       Rows = 3
 Rows = 3;                                  Cols = 3
 Cols = 3;                                  Name = 'Sample Program'
 Name = 'Sample Program';
TYPE
 Matrx = ARRAY [1..Rows,1..Cols] OF REAL
 DRec = RECORD
      A:Matrx;
      B:INTEGER;
   END; { DRec }
 DBase = ARRAY [1..2] OF DRec;
VAR
 Data:DBase;
 Num,nFact:INTEGER;
Figure 30: Animation Display After Constant Declarations in Program Samp3
Program Samp3;                              Program Samp3
CONST                                       Rows = 3
 Rows = 3;                                  Cols = 3
 Cols = 3;                                  Name = 'Sample Program'
 Name = 'Sample Program';                   Data[1].A
TYPE                                        undef undef undef
 Matrx = ARRAY [1..Rows,1..Cols] OF REAL    undef undef undef
 DRec = RECORD                              undef undef undef
      A:Matrx;                              Data[1].B is undefined
      B:INTEGER;                            Data[2].A
   END; { DRec }                            undef undef undef
 DBase = ARRAY [1..2] OF DRec;              undef undef undef
VAR                                         undef undef undef
 Data:DBase;                                Data[2].B is undefined
 Num,nFact:INTEGER;                         Num = 2
 . . .                                      nFact is undefined
 BEGIN { Program Samp3 }
  Num := 2;
  InitD(Data[Num],3);
Figure 31: Animation Display Before Calling Procedure InitD in Program Samp3
 . . .                                      Program Samp3
 Procedure InitD(VAR Data:DRec;             Rows = 3
                 MultF:INTEGER);            Cols = 3
   VAR                                      Name = 'Sample Program'
    I,J:INTEGER;                            Data[1].A
   BEGIN { Procedure InitD }                undef undef undef
    FOR I := 1 TO Rows DO                   undef undef undef
     FOR J := 1 TO Cols DO                  undef undef undef
      Data.A[I,J] := I + 101.33 * MultF;    Data[1].B is undefined
    Data.B := MultF;                        Data[2].A
    END; { Procedure InitD }                304.99 304.99 304.99
 . . .                                      305.99 305.99 305.99
 BEGIN { Program Samp3 }                    306.99 306.99 306.99
  Num := 2;                                 Data[2].B = 3
  InitD(Data[Num],3);                       Num = 2
  nFact := Fact(Data[Num].B);               nFact is undefined
  END. { Program Samp3 }                    Procedure InitD
                                            Data.A
                                            304.99 304.99 304.99
                                            305.99 305.99 305.99
                                            306.99 306.99 306.99
                                            Data.B = 3
                                            MultF = 3
                                            I = 4
                                            J = 4
Figure 32: Animation Display at End of Procedure InitD in Program Samp3