2021/3/21 COMP 2012 Assignment 2: Assembly Emulator
COMP 2012 Object-Oriented Programming and Data
Structures
Assignment 2 Assembly Emulator
Introduction
In this assignment, you will be practicing the following concepts:
Inheritance
Polymorphism
Pointers and References
Abstract Base Class
Update (Mar 20 0030):
Since some students have issues with the library archive
version of the assignment, we will switch to using the source-only version from now
on. Please check the Download section for the latest skeleton code.
For those of you who have already started the assignment, you can download the new
skeleton code, and copy-paste the AST.h and InstructionAST.cpp directly into the new
skeleton code.
The recording of the briefing session can be found here, and the accompanying slides
can be found here.
We value academic integrity very highly. Please read the Honor Code section on our course
webpage to make sure you understand what is considered as plagiarism and what the
penalties are. The following are some of the highlights:
Do NOT try your "luck" - we use sophisticated plagiarism detection software to find
cheaters. We also review code for potential cases manually.
The penalty (for BOTH the copier and the copiee) is not just getting a zero in your
assignment. Please read the Honor Code thoroughly.
Serious offenders will fail the course immediately, and there may be additional
disciplinary actions from the department and university, up to and including expulsion.
Page maintained by MAK, Ching Hang (David)
Email: [email protected].hk
Last Modified: 03/21/2021 18:49:56
Glossary
This section will mainly cover the key terminology used in the rest of this description.
Assembly Language
Assembly language generally refers to low-level programming languages whose instructions
can be mapped one-to-one to the instructions understood by your processor. Unlike C++
(which needs to be compiled), assembly corresponds directly to what your processor
executes; in fact, C++ is compiled into assembly.
Instruction Set Architecture
An Instruction Set Architecture (or ISA for short) is a definition of the internals of a
processor. This usually includes the list of instructions the architecture needs to support,
how memory should be addressed, and how many and what kind of registers must be
present.
Currently, most desktop and laptop computers use x86-64 (developed by Intel and
augmented by AMD), whereas most mobile devices use AArch64 (developed by ARM).
Those of you who have taken COMP2611 will also know MIPS.
Opcode
Opcodes (operation codes) are names for instructions which describe what the operation
does. You can think of an opcode as a function name in C++.
Operand
Operands are the parts of an instruction which specify the input/output of the instruction. You
can think of each instruction as a function call in C++; the operands will then be the
function's formal parameters (arguments).
Register
Registers are small chunks of very fast memory residing within the processor; instructions
usually operate on registers.
Stack
A program stack is a chunk of memory which stores variables within a function.
Abstract Syntax Tree (AST)
An abstract syntax tree is a tree which represents the syntactic structure of an expression.
For example, the instruction mov r11, sp will generate the following AST:
MovInstruction
|-RegisterOperand r11
|-RegisterOperand sp/r13
Since mov is an instruction which has two operands, under MovInstruction there are two
RegisterOperands.
The following table shows corresponding concepts between our flavor of assembly and C++.
Assembly | C++
Opcode | Function Name
Instruction | Function Call
Operands | Formal Parameters (Arguments)
Register | A 32-bit memory residing within the processor
Background: Emulators & Processors
In this assignment, you will be implementing a small part of an ARM emulator. Emulation is
usually used when you want your machine to "pretend" to be another machine. For example,
mobile application developers use emulators to run a version of Android or iOS on their local
machine for development.
Since an emulator is effectively a "software processor", we will need to replicate some parts
of the processor in our emulator.
Note: You do not need to implement everything stated below; it is just an overview for
you to understand the parts that have been implemented for you (and that you will
need to use).
Assembly?
As you may know, your compiler compiles C++ source code into assembly code. Since the
flavor of assembly is dependent on the processor it is executed on, different compilers on
different platforms running different processors will generate different assembly code.
To demonstrate this, consider the following C++ snippet.
    int getInt() { return 0; }
    int main() {
        return getInt() + 1;
    }
This will be compiled to the following ARMv7-A assembly (the ISA we will be emulating).
Note that @ starts a comment, similar to // in C++:
    getInt():
        mov r0, #0      @ Set 0 as the return value
        bx lr           @ Return from getInt()
    main:
        push {lr}       @ Store the link register into the stack;
                        @ lr will be overwritten in the next instruction
        bl getInt()     @ Call getInt; return value will be in r0
        add r0, r0, #1  @ Add 1 to the value returned by getInt()
        pop {lr}        @ Load the link register from the stack
                        @ This is to ensure we "return" to the correct location
        bx lr           @ Return from main()
As you can see, a lot of operations are actually similar between C++ and assembly, although
assembly is significantly more verbose. There are a lot more things you need to do (for
example, "spilling" registers to the stack when jumping between functions), and many
"one-liners" in C++ are now done with multiple instructions.
Don't worry if you don't understand what each instruction means or does in the above
example. More detailed explanations will be given in the later parts of this description.
Instructions
The most important functionality of a processor is to "perform some work"; these actions
are usually represented in processors as instructions. Instructions are single specific actions
which a processor is able to perform.
In assembly programs, instructions are usually represented as an opcode and operands. An
opcode is a name for an instruction, whereas operands are "things" that an instruction
operates on.
For example, consider the following instruction:
add r1, r0, #1
add r1, r0, #1 represents an instruction.
add is the opcode.
r1, r0, #1 are all operands to the instruction.
Translating this to C++, this would be equivalent to:
void add(Register& op1, const Register& op2, unsigned op3);
add(r1, r0, 1);
In other words, using C++ terminology:
Opcodes are function names.
Instructions are function calls.
Operands are formal parameters (arguments).
In the above example, you will see that the first two operands are prefixed with r, whereas
the third operand is prefixed with #. These denote different types of operands.
r-prefixed operands represent registers (see below).
#-prefixed operands represent immediates, i.e. constant values.
Operands enclosed in square brackets ([]) represent indirect operands, i.e. a
dereferenced memory location.
For example, the operand [r0, #4] refers to the memory location pointed to by
register r0, offset by 4 bytes (i.e. *((char*) r0 + 4) in C++).
Operands enclosed in curly brackets ({}) represent register lists, i.e. a non-empty list
of registers.
For example, the operand {r11, sp} represents a register list containing the registers
r11 and sp.
In our emulator, operands are implemented as classes in OperandAST.h.
Note: Operand classes and their implementation are provided to you; You do not need
to do anything.
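As a rough illustration of the operand notation above, the following sketch classifies an operand by its textual shape. The enum OperandKind and the functions isNumberedRegister and classifyOperand are hypothetical names made up for this example; they are not part of OperandAST.h, and the provided classes may be organised quite differently.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical enum for this sketch; OperandAST.h may model operands differently.
enum class OperandKind { Register, Immediate, Indirect, RegisterList, Unknown };

// Returns true for names shaped like r0, r11, ...: an 'r' followed only by digits.
bool isNumberedRegister(const std::string& text) {
    if (text.size() < 2 || text[0] != 'r') return false;
    for (std::size_t i = 1; i < text.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) return false;
    }
    return true;
}

// Classifies an operand purely by how it is written in the assembly source.
OperandKind classifyOperand(const std::string& text) {
    if (text.empty()) return OperandKind::Unknown;
    if (text.front() == '#') return OperandKind::Immediate;     // e.g. #1
    if (text.front() == '[') return OperandKind::Indirect;      // e.g. [r0, #4]
    if (text.front() == '{') return OperandKind::RegisterList;  // e.g. {r11, sp}
    // sp, lr and pc are aliases for r13, r14 and r15.
    if (text == "sp" || text == "lr" || text == "pc") return OperandKind::Register;
    return isNumberedRegister(text) ? OperandKind::Register : OperandKind::Unknown;
}
```

For instance, classifyOperand("[r0, #4]") yields OperandKind::Indirect, while classifyOperand("sp") yields OperandKind::Register.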
Registers
While in C++ we have learned to see variables as bytes which have memory addresses,
accessing the main memory from the processor is actually slow (approximately 10^-7
seconds). While this may sound fast, a modern processor clocks at up to 5GHz, meaning
that the same time used to access main memory can also be used to execute up to 500
instructions!
Enter the register. Registers are super-fast memory that resides on the processor itself, and
usually take 1 clock cycle (i.e. 10^-9 seconds) to access. This is why most instructions tend
to use registers as their operands, as they are super-fast and therefore the processor can do
more work instead of waiting for data to come around.
Most instruction sets these days specify how many registers must be supported and what
size each register should be. Our emulator is loosely based on ARMv7-A, so we will have 16
registers, each 32 bits wide.
Of note in these 16 registers:
r0 is usually used to store the return value of a function.
r13 (also known as sp) is the stack pointer; see the section below.
r14 (also known as lr) is the link register. The link register stores which instruction the
processor should jump to when this function returns.
r15 (also known as pc) is the program counter. The program counter stores the next
instruction the processor should execute.
In our emulator, you can obtain a register using u32& Processor::getRegister(u8 reg),
where reg is the register index you want to access.
Note: The implementation for the registers is provided to you.
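To make the idea of "obtaining a register by reference" concrete, here is a minimal sketch of a register file with 16 registers of 32 bits each. ToyProcessor and movThenIncrement are hypothetical stand-ins written for this illustration; the provided Processor class has more state and may be implemented differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the provided Processor class.
class ToyProcessor {
public:
    // Indices of the specially named registers.
    static constexpr std::uint8_t kSp = 13, kLr = 14, kPc = 15;

    // Returns a reference, so callers can both read and write the register.
    std::uint32_t& getRegister(std::uint8_t reg) {
        assert(reg < kNumRegisters);
        return mRegisters[reg];
    }

private:
    static constexpr std::size_t kNumRegisters = 16;
    std::array<std::uint32_t, kNumRegisters> mRegisters{};  // all registers start at 0
};

// Mimics "mov r1, r0" followed by "add r1, r1, #1" on the toy processor.
std::uint32_t movThenIncrement(ToyProcessor& p, std::uint32_t initial) {
    p.getRegister(0) = initial;           // put a value into r0
    p.getRegister(1) = p.getRegister(0);  // mov r1, r0
    p.getRegister(1) += 1;                // add r1, r1, #1
    return p.getRegister(1);
}
```

Because getRegister returns a reference, writing to a register is simply an assignment through the returned reference; no separate setter is needed.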
Stack Memory
In C++ we always define variables in our function to help us implement our programs, and we
know that our variables are generally allocated on the stack. What is the stack then?
A stack data structure (as you may recall from COMP2011) is a data structure which enables
LIFO (Last In, First Out) behavior. This same principle can be applied to variables within
functions.
In assembly code, we generally use the stack as a "backing store" when there are not
enough registers to store the variables in our program.
Important: When the stack grows, the memory address decreases; When the stack
shrinks, the memory address increases. In other words, the "last-in" variables are
stored in lower (smaller) addresses, whereas "first-in" variables are stored in higher
(bigger) addresses.
In our emulator, you can access the stack memory using void MemStack::store(u32 value,
u32 addr) and u32 MemStack::load(u32 addr) to store into and load from the stack
memory respectively.
Note: The implementation for Stack Memory is provided to you; You do not need to do
anything.
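The downward-growing behavior described above can be sketched as follows. ToyStack, pushWord and popWord are hypothetical names for this example only; ToyStack merely imitates the store/load interface of MemStack, and the starting address used in the usage note is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using u32 = std::uint32_t;

// Hypothetical stand-in for MemStack: one 32-bit word per (lowest) address.
class ToyStack {
public:
    void store(u32 value, u32 addr) { mWords[addr] = value; }
    u32 load(u32 addr) const {
        const auto it = mWords.find(addr);
        return it == mWords.end() ? 0u : it->second;  // unwritten memory reads as 0 here
    }

private:
    std::map<u32, u32> mWords;
};

// The stack grows towards lower addresses: move sp down by 4 bytes, then store.
void pushWord(ToyStack& stack, u32& sp, u32 value) {
    sp -= 4;
    stack.store(value, sp);
}

// The stack shrinks towards higher addresses: load at sp, then move sp up by 4 bytes.
u32 popWord(const ToyStack& stack, u32& sp) {
    const u32 value = stack.load(sp);
    sp += 4;
    return value;
}
```

Starting from an assumed stack pointer of 0x80000000, pushing two words leaves sp at 0x7ffffff8, and popping returns the words in reverse order (last in, first out).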
Abstract Syntax Tree
An abstract syntax tree (AST) is a tree used to display the syntactical structure of a program.
The benefit of using ASTs is that the meaning of each instruction (and the entire program)
becomes unambiguous, making it easy to write what each instruction does.
In our emulator, all AST nodes are derived from the base class AST (which you will need to
write). All derived AST classes are suffixed with AST.
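To illustrate how a tree of polymorphic nodes produces output like the mov r11, sp example, here is a small sketch. ToyNode, ToyRegister, ToyMov and renderMovR11Sp are hypothetical names invented for this illustration; they deliberately do not follow the interface required for AST in this assignment.

```cpp
#include <cassert>
#include <string>

// Hypothetical abstract base class; NOT the AST class required by the assignment.
class ToyNode {
public:
    virtual ~ToyNode() {}
    // Returns the textual form of this node and everything below it.
    virtual std::string toText() const = 0;
};

class ToyRegister final : public ToyNode {
public:
    explicit ToyRegister(const std::string& reg) : mReg(reg) {}
    std::string toText() const override { return "RegisterOperand " + mReg + "\n"; }

private:
    std::string mReg;
};

class ToyMov final : public ToyNode {
public:
    // Takes ownership of both operands and deletes them on destruction.
    ToyMov(ToyNode* dst, ToyNode* src) : mDst(dst), mSrc(src) {}
    ~ToyMov() override { delete mDst; delete mSrc; }
    ToyMov(const ToyMov&) = delete;             // copying would cause a double delete
    ToyMov& operator=(const ToyMov&) = delete;

    // The parent decides the layout; each child only describes itself.
    std::string toText() const override {
        return "MovInstruction\n|-" + mDst->toText() + "|-" + mSrc->toText();
    }

private:
    ToyNode* mDst;
    ToyNode* mSrc;
};

// Builds the tree for "mov r11, sp" and renders it.
std::string renderMovR11Sp() {
    const ToyMov mov(new ToyRegister("r11"), new ToyRegister("sp/r13"));
    return mov.toText();
}
```

The instruction node never needs to know the concrete type of its operands; it only calls the virtual function through the base class pointer.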
Library Archive
Since an emulator is very complex, we have already written some parts of the emulator for
you. These parts are distributed as a part of an archive file (.a).
An archive file is generally used when distributing libraries (i.e. a bundle of C++ functions for
external use).
Assignment Details
Since an emulator is very complex, we have already written some parts of the emulator for
you. These parts are distributed as a part of an archive file (.a). There are 15 classes you will
need to implement in 1 cpp file (around 250 lines), and 1 header file you will need to write.
Generated documentation for the header files can be found here.
Custom Integer Data Types
Unlike software, hardware often has more specific data type size requirements; for
example, the Instruction Set we are emulating explicitly states that registers must be 32 bits
wide. However, as mentioned in COMP2011, int and other data types in C++ only have a
minimum size requirement (e.g. int must be at least 16 bits). Therefore, we have provided
some custom data types which are guaranteed to have fixed sizes across all platforms.
These data types are defined as {s/u}{bits}, where s means it is a signed integer, u means
it is an unsigned integer, and bits is the number of bits of the data type (8/16/32/64 are
defined). Therefore, for example, u32 represents unsigned 32-bit integer, and s16 represents
signed 16-bit integer.
These data types are defined in Util.h.
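One plausible way to define such fixed-size types is with the aliases from the standard header cstdint, as sketched below. These definitions are an assumption made for illustration; check Util.h for the definitions actually provided. The helper bitWidth is a hypothetical name.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Assumed aliases for illustration; Util.h may define these differently.
using s8 = std::int8_t;
using s16 = std::int16_t;
using s32 = std::int32_t;
using s64 = std::int64_t;
using u8 = std::uint8_t;
using u16 = std::uint16_t;
using u32 = std::uint32_t;
using u64 = std::uint64_t;

// Returns the width of T in bits.
template <typename T>
constexpr unsigned bitWidth() {
    return static_cast<unsigned>(sizeof(T) * CHAR_BIT);
}

// Unlike int, these widths are identical on every platform that provides the types.
static_assert(bitWidth<u32>() == 32u, "a register-sized value must be exactly 32 bits");
static_assert(bitWidth<s16>() == 16u, "s16 must be exactly 16 bits");
```

The static_assert lines are checked at compile time, so a platform on which the widths differ would fail to build rather than misbehave at runtime.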
Unsigned Overflow Behavior
While you have learned the difference between signed and unsigned integers in COMP2011,
there are other intricacies regarding these data types (which are relevant to this
assignment).
In the C++ standard, signed integer overflow is undefined behavior, meaning that there are
no guarantees what will happen. A notorious example is:
    bool checkOverflow(int i) {
        return i > (i + 1);
    }
One may assume that if i is the largest number representable by int, i + 1 will overflow and
must wrap around (i.e. become the smallest number representable by int), and thus that this
works to check whether an overflow may occur. However, since this is undefined behavior, a
compiler implementation may optimize it like this:
    bool checkOverflow(int i) {
        return false;
    }
From a mathematical point of view, this makes a lot of sense: how can adding one to a value
produce something smaller than the value itself? In fact, recent versions of GCC and Clang
do perform this kind of optimization.
This brings us to a "special property" of unsigned arithmetic operations: overflows will
always wrap around. In other words, it is guaranteed by the standard that adding one to the
largest unsigned number of the data type will always wrap around to 0, and subtracting 1
from 0 will always wrap around to the largest unsigned number.
In addition, given a (two's-complement) signed and an unsigned number with the same bit
representation, performing arithmetic operations on either number will return the same
result; in other words:
    int lhs = 1;
    int rhs = -2;
    int sum = lhs + rhs;
    // For demonstration purposes only; this is not technically correct.
    int uSum = static_cast<int>(static_cast<unsigned>(lhs) + static_cast<unsigned>(rhs));
    sum == uSum;
These properties may be useful when implementing addition and subtraction operations.
Source Files to Implement
Within the 15 classes, there are generally 4 types of functions to implement.
Constructor: Implement the constructor of the class.
Destructor: Implement the destructor of the class.
Accessors: Implement the accessors of the class; they may return a pointer or a
reference depending on whether the value is optional or not.
eval(Processor& processor): Implement the evaluation function of the class; this
executes what is specified by the instruction on the Processor instance passed as a
reference parameter.
Note that if the dummy implementation is not provided to you in InstructionAST.cpp, the
function has been implemented for you in the library archive.
Each instruction is briefly described below; refer to the header files for
more information.
nop (NopInstructionAST): Does nothing.
mov (MovInstructionAST): Copies the value of a register or constant into another
register.
mDst: The destination register to copy to.
mSrc: The source register to copy from. (Only applicable to
RegMovInstructionAST).
mImm: The 32-bit value to copy from. (Only applicable to ImmMovInstructionAST).
For example:
mov r1, r0 copies the value of register r0 into register r1.
mov r1, #1 copies the value 1 into register r1.
str (StrInstructionAST): Stores the value of a register onto the stack.
mSrc: The register whose content needs to be stored.
mDst: The memory location to store the register content to.
For example:
str r0, [r1] stores the value of register r0 into the memory address pointed to
by r1.
str r0, [r1, #8] stores the value of register r0 into the memory address
pointed to by r1, additionally offsetting the memory address by 8 bytes.
ldr (LdrInstructionAST): Loads a value from the stack into a register.
mDst: The register to load to.
mSrc: The memory location to load the register content from.
For example:
ldr r0, [r1] loads into the register r0 by reading from the memory address
pointed to by r1.
ldr r0, [r1, #8] loads into the register r0 by reading from the memory address
pointed to by r1, additionally offsetting the memory address by 8 bytes.
push (PushInstructionAST): Stores the values of the specified registers onto the stack,
and shifts the stack pointer.
mRegList: The list of registers whose content needs to be stored into the stack.
For example:
push {r11, sp} decrements the stack pointer by 8 bytes (2 × 32-bit registers),
then stores the values of register r11 and register sp into the memory pointed to
by sp, with r11 stored at a lower (i.e. smaller) memory address than sp.
pop (PopInstructionAST): Loads the values from the stack into the specified registers,
and shifts the stack pointer.
mRegList: The list of registers whose content needs to be loaded from the stack.
For example:
pop {r11, sp} loads the values of register r11 and register sp by reading from
the memory address pointed to by sp, with r11 read from the lower (i.e. smaller)
memory address, then increments the stack pointer by 8 bytes (2 × 32-bit
registers).
add (AddInstructionAST): Adds the value of the two operands and stores into a
register.
mDst: The destination register to store the result to.
mSrc1: The register of the first operand.
mSrc2: The register of the second operand. (Only applicable to
RegAddInstructionAST).
mImm: A 32-bit value as the second operand. (Only applicable to
ImmAddInstructionAST).
For example:
add r1, r1, r0 adds the value of registers r1 and r0, then stores the result in
register r1.
add r1, r1, #1 adds the value of register r1 with the constant value 1, then
stores the result in register r1.
sub (SubInstructionAST): Subtracts the value of the second operand from the first
operand and stores into a register.
mDst: The destination register to store the result to.
mSrc1: The register of the first operand.
mSrc2: The register of the second operand. (Only applicable to
RegSubInstructionAST).
mImm: A 32-bit value as the second operand. (Only applicable to
ImmSubInstructionAST).
For example:
sub r1, r1, r0 subtracts the value of register r0 from register r1, then stores
the result in register r1.
sub r1, r1, #1 subtracts the constant value 1 from the register r1, then stores
the result in register r1.
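Because registers hold unsigned 32-bit values, the add and sub descriptions above can lean on the wraparound guarantee from the Unsigned Overflow Behavior section. The sketch below shows the arithmetic only; addWrapping, subWrapping and addViaUnsigned are hypothetical helper names, and the u32/s32 aliases are assumed to match those in Util.h.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;  // assumed to match the alias in Util.h
using s32 = std::int32_t;   // assumed to match the alias in Util.h

// Unsigned arithmetic wraps modulo 2^32, so overflow here is well defined.
u32 addWrapping(u32 lhs, u32 rhs) { return lhs + rhs; }
u32 subWrapping(u32 lhs, u32 rhs) { return lhs - rhs; }

// Adds two signed values by going through their unsigned bit patterns.
// Converting the result back to s32 is implementation-defined before C++20,
// but yields the two's-complement value on mainstream compilers.
s32 addViaUnsigned(s32 lhs, s32 rhs) {
    return static_cast<s32>(static_cast<u32>(lhs) + static_cast<u32>(rhs));
}
```

For example, addWrapping(0xffffffff, 1) gives 0 and subWrapping(0, 1) gives 0xffffffff, without ever performing a signed overflow.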
Header Files
You will also need to implement the header file for the abstract base class of AST and
ExprAST. AST is the abstract base class for all AST nodes, and ExprAST is the abstract base
class for AST nodes which represent an expression.
For AST, the class must satisfy the following requirements:
There is a virtual destructor with an empty function body.
There is a const pure virtual function getTokenName which returns const char*.
There is a const pure virtual function print which accepts a parameter indent of
type u32 and returns void.
For ExprAST, the class must satisfy the following requirements:
The class should inherit from AST.
There is a virtual destructor with an empty function body.
There is a const pure virtual function getTokenName which returns const char*.
There is a const virtual function print which accepts a parameter indent of type u32
and returns void.
Note that you do not need to implement ExprAST::print(u32); This has been implemented
for you as part of the library archive.
Other Implemented Classes
You can refer to the generated documentation for a description of the classes already
implemented for you.
Useful Implemented Functions
To aid you in your implementation, we have written the rest of the emulator for you. Below is
a list of possibly useful functions that you may need to use.
u32& Processor::getRegister(u8): Obtains a reference to the specified register.
MemStack& Processor::getStack(): Obtains a reference to the stack memory.
All public member functions in OperandAST.h
Useful Debugging Functions
To aid you in debugging your code, we have written some helper functions to output
additional information.
Under most circumstances, the most useful function for debugging is Processor::dump,
which outputs the current state of the processor, including the instruction to be executed, the
current register contents, and the current stack contents. A typical output will look
something like this:
Current Instruction:
MovInstruction
|-RegisterOperand r11
|-RegisterOperand sp/r13
Registers:
r0 0x00000000 [dec:0]
r1 0x00000000 [dec:0]
r2 0x00000000 [dec:0]
r3 0x00000000 [dec:0]
r4 0x00000000 [dec:0]
r5 0x00000000 [dec:0]
r6 0x00000000 [dec:0]
r7 0x00000000 [dec:0]
r8 0x00000000 [dec:0]
r9 0x00000000 [dec:0]
r10 0x00000000 [dec:0]
r11 0x00000000 [dec:0]
r12 0x00000000 [dec:0]
sp/r13 0x7ffffff8 [offset:8]
lr/r14 0xffffffff [label:
pc/r15 0x00050002 [label:main instr:0x0002]
Stack Memory Contents:
[0:0x7fffffff~0x7ffffffc] 0xffffffff
[1:0x7ffffffb~0x7ffffff8] 0x00000000
Current Instruction: Displays the AST of the instruction to be executed.
In this example, the AST shows the equivalent of mov r11, sp.
Registers: Displays the current (hexadecimal) values of each register in the processor.
The [dec:0] field shows the decimal representation of the register values.
sp: The [offset:8] field shows the location of the stack pointer relative to the
top of the stack.
In this example, our stack pointer is currently 8 bytes below the top of the stack.
lr: The [label:
offset of the label when "returning" from this function.
In this example, returning from this function will cause the program to jump to a
similar to returning from main.
pc: The [label:main instr:0x0002] fields show the name and instruction offset
of the label which will be executed next. Note that label indices start from 0.
In this example, the next instruction to be executed is the instruction at index 2
(instr:0x0002) of the main label.
Stack Memory Contents: Displays the current (hexadecimal) values present in the
stack. Note that this will only show values up to the current location of the stack
pointer.
The first number in the square brackets (0) shows the index of the value relative
to the top of the stack.
The hexadecimal range in the square brackets (0x7fffffff~0x7ffffffc) shows
the memory range which the value occupies.
The hexadecimal outside the square brackets shows the current value of the
memory range.
In our example, our stack would then look something like this:
      Top of Stack
    ----------------
    |              | 0x7fffffff
    |  0xffffffff  |
    |              | 0x7ffffffc
    ----------------
    |              | 0x7ffffffb
    |  0x00000000  |
    |              | 0x7ffffff8
    ---------------- <-- stack pointer (0x7ffffff8)
You may also set the first parameter of the constructor (Processor::Processor(bool)) to
true to enable outputting the processor state on every instruction.
There are also functions to only print the state of the memory stack (MemStack::dump) and
the AST of the current instruction (AST::print) or program (Program::printAST).
Testing Your Program
Since you are not expected to write assembly for this course, we have provided a small
number of assembly programs for you to test your program. These programs are in
main.cpp of the skeleton code.
In addition, you may also compile your own assembly programs using Compiler Explorer,
which is a tool to inspect the assembly code generated by compilers. You can type a C++
program on the left-hand side pane, and it will generate assembly on the right-hand side
pane.
Note that there are several limitations.
Only a very small subset of instructions are supported by our emulator. If you see the
message Unknown instruction in the program, that means that the specified
instruction is not supported by our emulator.
Most of the functions in the C++ standard library are not supported.
Allocating to the heap (i.e. new and delete) is not supported.
As such, we suggest writing programs which (1) do not need to include any headers, (2) only
perform stack allocation, and (3) do not have any struct or class. Branch instructions have
been implemented for you, so conditional statements and loops are also supported.
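A program following these suggestions might look like the sketch below. The function names sumUpTo and larger are made up for this example, and whether every instruction generated for it is supported by the emulator depends on the compiler and flags chosen in Compiler Explorer, so treat it as a starting point only.

```cpp
// No #include lines, no heap allocation, and no struct or class.

// Sums the integers 1..n using a loop and only local (stack) variables.
int sumUpTo(int n) {
    int total = 0;
    for (int i = 1; i <= n; i = i + 1) {
        total = total + i;
    }
    return total;
}

// Chooses between two values with a conditional statement.
int larger(int a, int b) {
    if (a > b) {
        return a;
    }
    return b;
}
```

To turn this into a full program, add an int main() that returns, say, sumUpTo(4); the result should then end up in r0 when main returns.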
This assignment uses additional C++ features and/or paradigms which are not covered in
class.
Pointer Ownership
In class, you will have learned that in most cases, a class which allocates memory from the
heap should be responsible for freeing it. This is generally referred to as "pointer ownership",
where the responsibility of managing the pointer (including freeing it) lies within the class.
However, sometimes the ownership of pointers may be transferred to other classes or
variables. Take the following example:
    int* createInt(int i) { return new int{i}; }

    int main() {
        int* ptr = createInt(2);
        int ret = *ptr;
        delete ptr;
        return ret;
    }
You will notice that if main does not delete ptr (line 6 of the example), we will have created
a memory leak. However, it is impossible for createInt to free the pointer either, because
there is no mechanism for doing so!
In this case, we say that the ownership of the heap-allocated memory (referenced by the
variable ptr) is transferred to the caller of createInt, meaning that it is now the
responsibility of the main function to call delete on ptr.
In our assignment, all constructor parameters will have their ownership transferred to your
class, meaning that you will need to call delete on these pointers on destruction.
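The following sketch shows what "taking ownership of a constructor parameter" can look like in general. Payload, Holder and liveAfterScope are hypothetical names for this illustration and are unrelated to the classes in the skeleton code; the live-object counter exists only to make a leak observable.

```cpp
#include <cassert>

// Counts live Payload objects so that a leak would be observable in this sketch.
struct Payload {
    static int liveCount;
    int value;
    explicit Payload(int v) : value(v) { ++liveCount; }
    ~Payload() { --liveCount; }
};
int Payload::liveCount = 0;

// Hypothetical owner: ownership of the constructor argument moves into the object.
class Holder {
public:
    explicit Holder(Payload* payload) : mPayload(payload) {}
    ~Holder() { delete mPayload; }              // the owner frees what it was given
    Holder(const Holder&) = delete;             // copying would cause a double delete
    Holder& operator=(const Holder&) = delete;

    const Payload& getPayload() const { return *mPayload; }

private:
    Payload* mPayload;
};

// Returns how many Payload objects are still alive after a Holder leaves scope.
int liveAfterScope() {
    {
        Holder holder(new Payload(2));  // the caller does not delete this pointer
        assert(holder.getPayload().value == 2);
        assert(Payload::liveCount == 1);
    }  // holder is destroyed here and deletes the Payload it owns
    return Payload::liveCount;
}
```

The caller of the constructor never calls delete itself; the destructor of the owning class is the single place where the memory is released.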
Pointer-to-Implementation
Throughout the provided classes, you will see a common coding pattern similar to this:
    class SomeClass {
    public:
        // public member functions...
    private:
        struct Impl;
        Impl* mImpl;
    };
This paradigm is called "pointer to implementation" (or pImpl for short). The benefit of using
this paradigm is that since the real implementation (including data members) is hidden in a
cpp file, you do not need to change the class definition in the header file when adding,
changing, or removing members, meaning that there is no need to recompile files which
depend on the class definition.
You can read more about this here.
Anonymous Namespaces
Anonymous namespaces have the same effect as declaring a global variable static: they
restrict the variable to internal linkage, i.e. the variable is only visible within the same
C++ source file.
For more details, please refer to the self-study notes.
Raw String Literals
In the provided main.cpp and AssemblyProgram.cpp, you will see some usages of R"()".
This syntax (introduced in C++11) is called a "raw string literal", and it is used when you want
to treat the entire string as-is. As an example:
    const char* stringLiteral = "abc\ndef";
    // Can be written as:
    const char* rawStringLiteral = R"(abc
    def)";
This is very convenient when you want to copy-and-paste a multiline string without adding
\n yourself.
Function Pointer
In AssemblyProgram.cpp, you will see a function with the following prototype:
    bool testProgram(
        const std::string& input,
        void(* setup)(Processor&),
        u32 expected,
        u32(* actualExpr)(const Processor&),
        const std::string& cmpValueName
    );
The syntax for the second and fourth parameters will look foreign to most of you. To explain
what they are, let us revisit lambdas from the beginning of the semester...
Back then, the instructors used auto to let the compiler deduce the type of lambdas. This is
because the type of a lambda is what is known as a closure type. Closure types are
generated internally by the compiler, and each lambda you define will have a unique
unnamed closure type. Since this closure type is unnamed, there is no way of declaring the
type of a lambda variable, and therefore the common practice is to use auto to let the
compiler determine the type.
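The uniqueness of closure types can be checked directly, as in the short sketch below. The function name closureDemo is hypothetical; std::is_same and decltype are standard C++11 facilities.

```cpp
#include <cassert>
#include <type_traits>

// Shows that two lambdas with identical bodies still have different types.
int closureDemo() {
    auto addOne = [](int x) { return x + 1; };
    auto alsoAddOne = [](int x) { return x + 1; };

    // Each lambda expression gets its own unnamed closure type, so the only
    // practical way to name a variable of that type is auto (or decltype).
    static_assert(!std::is_same<decltype(addOne), decltype(alsoAddOne)>::value,
                  "each lambda expression has a distinct closure type");

    return addOne(alsoAddOne(40));
}
```

Calling closureDemo() returns 42, and the static_assert confirms at compile time that the two lambdas do not share a type.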
Function pointers, as their name suggests, are pointers to functions. Before lambdas were
introduced in C++11, function pointers were used to select which function to call at runtime
(without the need for classes and inheritance) as well as to implement higher-order
functions. The following example demonstrates runtime selection of functions:
    double operateOnDouble(double (*fn)(double), double d) {
        return fn(d);
    }

    double timesTwo(double d) {
        return d * 2;
    }

    int main() {
        auto square = [](double d) { return d * d; };
        operateOnDouble(&timesTwo, 3);  // 6
        operateOnDouble(+square, 3);    // 9
    }
The function pointer parameter can be seen in the parameter double (*fn)(double). The
name of the parameter is fn, and it points to a function of type double(double), i.e. a
function which has one parameter (double) and returns a double. By explicitly
specifying the type of function we want in operateOnDouble, we can make sure that callers
pass in a function (or lambda) with the correct number of parameters and return type.
Some of you may notice that we added + to the lambda variable before passing it as the
function pointer argument. This is a way to convert a lambda into a function pointer.
However, note that only lambdas with no captures can be converted to a function pointer.
Note that there is a class in the standard library which can store all kinds of lambdas and
function pointers (std::function), but this will not be covered in this description or this
course.
Download
Skeleton code: skeleton.zip
Note that you should only change and submit AST.h and InstructionAST.cpp. You do not
need to understand the implementation of the provided C++ source files.
This is a Makefile project. Just do what you did in lab 1 to put that in your VS Code / Eclipse /
other IDEs. In your terminal, running make will create a single executable pa2.exe. See the
Sample Output and Grading Scheme section for more information about the test cases.
Note that all header files are provided in the include directory. You will not need to know
what all classes or functions do; they are just provided for your reference.
Generated documentation for the header files can be found here.
Sample Output and Grading Scheme
We have provided a testing program for each instruction you need to implement, as well as
four full programs to test your instructions in the context of a real program. The prototype for
these programs can be found in AssemblyProgram.h.
To use the testing programs, launch the executable with a single argument naming the
instruction you want to test, for example ./pa2.exe nop. If the test case passes, the output
will be PASSED; otherwise, some information about the expected/actual values will be
printed.
To use the full programs, simply replace the right-hand operand of const std::string
input = in main.cpp with the program you want to test.
Note that on ZINC, we will use Valgrind to help you catch runtime issues for visible test
cases, and we will use sanitizers (similar to PA1) for hidden test cases after grading. Please
be reminded to check the Valgrind section to make sure that your program has no memory
leaks, uninitialized variables, undefined behavior etc.
After the submission deadline, we will use our own set of test cases for testing, so be sure to
design your own test cases to test your program. Be reminded to remove any debugging
message that you might have added before submitting your code.
Your program must produce the correct output and terminate with zero memory leaks. Please
note that the sample outputs are only for you to test your program against. The actual test
cases are hidden and will only be used by us for grading. Therefore, simply printing out the
expected outputs using cout does NOT work. We will grade your work by checking the outputs
of your program when run on our HIDDEN test cases.
Submission and Deadline
Your solution to this assignment must be based on the given skeleton code provided in the
Download section.
Your task is to complete the following:
All missing implementations in InstructionAST.cpp.
The class definitions of AST and ExprAST in AST.h.
These are the only 2 files you are supposed to modify/create, zip, and then submit to
ZINC.
Comments in the corresponding header files list out the detailed requirements. The
description above supplements only the missing information. You need to read both the
source/header files and the webpage description carefully to get the whole picture.
Read the FAQ page for some common clarifications. Check it again a day before the
deadline to make sure you don't miss any clarification, even if you have already submitted
your work by then.
If you need clarification of the requirements, please feel free to post your question on Piazza
with the pa2 tag. However, to avoid cluttering the forum with repeated/trivial questions,
please do read all the source code comments, webpage description, sample output,
and the latest FAQ (refresh this page regularly) carefully before you post your
questions. You should also check whether your question has already been asked before posting.
Non-trivial, non-repeated questions will be prioritized. Please note that, for fairness, we
will not debug any student's assignment.
Deadline: 23:59:00 on Apr 1, 2021 (Thursday)
Please zip two files only: InstructionAST.cpp and AST.h. Zip only these 2 files, NOT a folder
containing them. Submit the zip file to ZINC.
Notes:
You may submit your file multiple times, but only the latest version will be graded.
Submit early to avoid any last-minute problem. Only ZINC submissions will be
accepted.
The ZINC server will be very busy on the last day, especially in the last few hours, so
expect the grading report to take a while to arrive. However, as long as your submission
is successful, we will grade your latest submission with all test cases after the
deadline.
Read the late submission policy here.
Compilation Requirement
It is required that your submissions can be compiled and run successfully in our online
autograder ZINC. If we cannot even compile your work, it won't be graded. Therefore, for
parts that you cannot finish, just put in dummy implementation so that your whole program
can be compiled for ZINC to grade the other parts that you have done. Empty
implementations can look like this:
int SomeClass::SomeFunctionThatIsTooDifficultToImplementForMe()
{
    return 0;
}
void SomeClass::AnotherFunctionThatIsTooDifficultToImplementForMeOhN()
{
}
Late submission policy
There will be a penalty of -1 point (out of a maximum 100 points) for every minute you are
late. For instance, since the deadline of the assignment is 23:59:00 on Apr 1st, if you submit
your solution at 1:00:00 on Apr 2nd, there will be a penalty of -61 points for your assignment.
However, the lowest grade you may get from an assignment is zero: any negative score after
the deduction due to late penalty (and any other penalties) will be reset to zero.
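As a rough illustration of the arithmetic in the late submission policy, the sketch below deducts one point per minute late and clamps the result at zero. The function name scoreAfterLatePenalty is made up for this example, and how partial minutes are rounded is not stated in the policy, so the function simply takes a whole number of minutes.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of the skeleton): applies the late penalty of
// one point per minute and resets any negative result to zero.
int scoreAfterLatePenalty(int rawScore, int minutesLate) {
    assert(minutesLate >= 0);  // being early carries no bonus in this sketch
    return std::max(0, rawScore - minutesLate);
}
```

With the example from the policy, a submission worth 100 points that arrives 61 minutes late would score 39, while one worth 50 points would be reset to 0.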
FAQ
Frequently Asked Questions
03-21 Q: What are the expected values of the provided programs in AssemblyProgram.h?
A: Assuming you did not change the implementation of those programs:
progReturns: 1
progTwoNumMax: 2
progIterativeAdd: 45
progRecursiveAdd: 55
03-21 Q: Why is the final output of my assembly program so big when I expect it to be
negative?
A: That is because the provided main.cpp prints the final value of register r0 as an
unsigned number. Use this to convert that number into a signed number and check whether it
matches your expectation.
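If you prefer to do the conversion locally, the following is a minimal sketch, assuming registers are 32 bits wide (consistent with the values 2147483647 and 4294967295 used elsewhere in this FAQ). The names toSigned and toSignedExamples are invented for this example and do not exist in the skeleton.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Reinterprets a 32-bit unsigned value as a two's-complement signed value.
// Written with plain arithmetic so that it avoids any out-of-range conversion.
std::int32_t toSigned(std::uint32_t raw) {
    if (raw <= 0x7FFFFFFFu) {
        return static_cast<std::int32_t>(raw);  // already non-negative
    }
    // Upper half of the unsigned range maps onto the negative numbers.
    return static_cast<std::int32_t>(raw - 0x80000000u) +
           std::numeric_limits<std::int32_t>::min();
}

// Usage example: the large value printed by main.cpp is really a small negative.
void toSignedExamples() {
    assert(toSigned(4294967295u) == -1);
    assert(toSigned(45u) == 45);
}
```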
03-21 Q: Should we treat immediate operands as signed or unsigned?
A: Read the Unsigned Overflow section of this description again. That section should tell you
which representation to use and why.
Maintained by COMP 2012 Teaching Team © 2021 HKUST Computer Science and Engineering
03-21 Q: The provided program progTwoNumMax doesn't seem to work correctly when one
number is positive and one number is negative.
A: This is a bug in the provided implementation. However, no test case (visible or hidden)
checks for this, as this program is already provided for you.
03-21 Q: Why can't I use programs that contain labels with commas (e.g. max(int,
int):)?
A: This is a bug in the provided implementation. As a workaround, you can change all
commas to underscores (_).
03-21 Q: What does the Stack Pointer (sp) register actually store?
A: The Stack Pointer register actually stores the memory location of the bottom (i.e. end) of
the stack. By moving the pointer downwards/upwards you are growing/shrinking the stack
respectively. This is why when you execute the push instruction, you need to move the Stack
Pointer downwards as you need to make space in the stack to store the contents of the
registers into the stack space.
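The downward-growing stack described in this answer can be sketched with a toy model. ToyMachine below is a made-up stand-in, not the skeleton's Processor or memory classes; it indexes the stack in whole 32-bit words, whereas the unit and register ordering used by the real push instruction are defined by the skeleton headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: sp is the index of the lowest occupied slot, so an empty
// stack has sp equal to the number of slots.
struct ToyMachine {
    std::vector<std::uint32_t> stack;
    std::size_t sp;

    explicit ToyMachine(std::size_t words) : stack(words, 0), sp(words) {}

    void push(std::uint32_t value) {
        assert(sp > 0 && "toy stack overflow");
        --sp;               // grow the stack: move sp downwards first
        stack[sp] = value;  // then store into the newly reserved slot
    }

    std::uint32_t pop() {
        assert(sp < stack.size() && "toy stack underflow");
        const std::uint32_t value = stack[sp];
        ++sp;               // shrink the stack: move sp back upwards
        return value;
    }
};
```

Pushing two values onto a four-word ToyMachine leaves sp at 2, and popping both returns them in reverse order and restores sp to 4.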
03-21 Q: Why can't I get the value of the registers using RegisterOperandExprAST and
RegisterListOperandExprAST?
A: This is because every OperandExprAST only stores the name of its operand, not an actual
reference to the register itself. What you want to do is use the provided functions to
get the register index, and pass that index to a function like Processor::getRegister to
actually get the value stored in the register.
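The two-step lookup in this answer (operand gives an index, processor gives the value) might look like the sketch below. Everything here is a stand-in written for illustration: ToyRegisterOperand, ToyProcessor, getIndex, the 16-register count, and this getRegister signature are assumptions, so check the provided headers for the real names and signatures.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a register operand: it knows only WHICH register it names.
struct ToyRegisterOperand {
    std::size_t index;
    std::size_t getIndex() const { return index; }
};

// Stand-in for the processor: it owns the actual register contents.
struct ToyProcessor {
    std::array<std::uint32_t, 16> registers{};  // register count is assumed

    std::uint32_t& getRegister(std::size_t i) {
        assert(i < registers.size());
        return registers[i];
    }
};

// Step 1: ask the operand for its index. Step 2: ask the processor for the value.
std::uint32_t readOperand(ToyProcessor& cpu, const ToyRegisterOperand& op) {
    return cpu.getRegister(op.getIndex());
}
```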
03-21 Q: When I use mov (or other immediate operand instructions) with a value larger than
2147483647, my program crashes with terminate called after throwing an instance
of 'std::out_of_range'!
A: This is a bug in the provided implementation. In order to test larger values than
2147483647, you need to convert your number into the signed representation by using a site
like this, and use that value in the immediate instead. For example, if you want to use the
immediate value #4294967295, you convert it into signed representation (which is #-1) and
use that instead. Note that you should still make sure that the unsigned wraparound
behavior (as described earlier in this description) is adhered to.
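To see why #-1 and #4294967295 end up as the same register contents, here is a minimal sketch, assuming 32-bit int and 32-bit registers. std::stoi is one standard function that throws std::out_of_range for input above the int range; whether the skeleton parses immediates with it is an assumption, and the helper names below are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Parses the immediate text (without the leading '#') as a signed int and
// stores it as an unsigned 32-bit value; the conversion wraps modulo 2^32.
std::uint32_t immediateToRegisterValue(const std::string& text) {
    const int parsed = std::stoi(text);  // throws std::out_of_range if too large
    return static_cast<std::uint32_t>(parsed);
}

// Reports whether the text can be parsed as a signed int at all.
bool fitsInSignedInt(const std::string& text) {
    try {
        static_cast<void>(std::stoi(text));
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Usage example mirroring the workaround in the answer above.
void immediateExamples() {
    assert(!fitsInSignedInt("4294967295"));                    // too large for int
    assert(immediateToRegisterValue("-1") == 4294967295u);     // same bit pattern
}
```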
03-18 Q: Is it normal that I get warning: unused parameter 'processor' for
NopInstructionAST::eval?
A: Yes. The compiler is warning because you are not using the parameter in the function
body. To suppress this warning, you can cast the parameter to void, i.e.
static_cast<void>(processor) or (void) processor.
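A small self-contained sketch of the cast-to-void idiom follows. ToyProcessor, evalNop, and nopExample are placeholders for illustration, not the skeleton's Processor or NopInstructionAST::eval.

```cpp
#include <cassert>

// Placeholder processor with a single piece of state.
struct ToyProcessor {
    unsigned pc = 0;
};

// A no-op instruction has nothing to do, so its parameter would be unused.
void evalNop(ToyProcessor& processor) {
    static_cast<void>(processor);  // marks the parameter as intentionally unused
}

// Usage example: evaluating the no-op leaves the processor untouched.
void nopExample() {
    ToyProcessor cpu;
    evalNop(cpu);
    assert(cpu.pc == 0);
}
```

The cast has no runtime effect; it only tells the compiler that ignoring the parameter is deliberate.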
Q: My code doesn't work / there is an error, here is the code, can you help me fix it?
A: As the assignment is a major course assessment, to be fair, you are supposed to work on
it on your own and we should not finish the tasks for you. We might provide some very
general hints to you, but we shall not fix the problem or debug for you.
Q: Can I add extra helper functions?
A: You may do so in the files that you are allowed to modify and submit.
Q: Can I use global variables or static variables such as static int x?
A: No.
Q: Can I use auto?
A: No.