Execution

Once the source code has been compiled into an easily executable form, such as bytecode, it has to be executed. Like the Compilation section, this section is not binding, and its details depend on the form into which the source code is compiled. It describes how the standard implementation performs execution.

The Interpreter

The interpreter is the part of the MiniD virtual machine (MDVM) which actually reads the bytecode and acts accordingly. It uses 64-bit RISC-like instructions. There are three instruction formats: R, I, and J.

R-format instructions look like this:

16 bits | 16 bits | 16 bits | 16 bits
Opcode  | rd      | rs      | rt

The opcode field identifies the operation. rd is the destination; rs and rt are the two sources. The format of rd, rs, and rt is explained below.

I-format instructions preserve the opcode and rd fields, but combine the rs and rt fields:

16 bits | 16 bits | 32 bits
Opcode  | rd      | Unsigned Immediate

J-format instructions are almost identical to I-format, except their immediate value is signed. J-format instructions are used mostly for jumps, which need to be able to jump forward or backward.
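The three formats can be illustrated with a small decoding sketch in Python. It is not taken from the standard implementation. It assumes that the fields are packed in the order shown above, starting from the most significant bits, and that the J-format immediate is stored in two's complement; neither detail is stated here. The names decode_r, decode_i, and decode_j are hypothetical.

```python
# Assumed layout: fields run from the most significant bits downward,
# in the same left-to-right order as the format tables.
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def decode_r(word):
    """Split a 64-bit word into (opcode, rd, rs, rt)."""
    return (
        (word >> 48) & MASK16,
        (word >> 32) & MASK16,
        (word >> 16) & MASK16,
        word & MASK16,
    )


def decode_i(word):
    """Split a 64-bit word into (opcode, rd, unsigned 32-bit immediate)."""
    return ((word >> 48) & MASK16, (word >> 32) & MASK16, word & MASK32)


def decode_j(word):
    """Like decode_i, but read the immediate as signed (assumed two's complement)."""
    opcode, rd, imm = decode_i(word)
    if imm & 0x80000000:
        imm -= 1 << 32
    return opcode, rd, imm
```

The same low 32 bits therefore decode to different values depending on the format: an I-format immediate of 0xFFFFFFFE is a large positive number, while the J-format reading is -2, which is what allows a backward jump.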

Tagged Fields (rd, rs, rt)

rd, rs, and rt all share the same format. Each is 16 bits wide: the upper two bits are a location tag, and the lower 14 bits are an index whose meaning depends on that tag.

There are four possible locations where a value may reside in MiniD: a local variable in the current function, a constant in the constant table of this function, an upvalue (local in an enclosing function), or a global. These four locations are indicated by the upper two bits of rd, rs, and rt. The index is interpreted differently based on the tag:

  • Local: the index is interpreted as an index into the local stack frame. Local 0 is always the 'this' pointer. When disassembled, this type of operand is denoted as 'rn', where n is the index.
  • Constant: the index is interpreted as an index into the constant table for this function. When disassembled, this type of operand is denoted as 'cn', where n is the index.
  • Upvalue: the index is interpreted as an index into the upvalue table for this function. When disassembled, this type of operand is denoted as 'un', where n is the index.
  • Global: the index is interpreted as an index into the constant table for this function. That constant holds a string name of the global. When disassembled, this type of operand is denoted as 'gn', where n is the index of the name constant.

Note that rd, being the destination, can never be a 'cn' operand.
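As an illustration, the following Python sketch splits a 16-bit field into its tag and index and produces the disassembled form described above. The numeric tag values (0 for local, 1 for constant, 2 for upvalue, 3 for global) are an assumption made for this example; the text only says that the upper two bits select one of four locations. The function names are hypothetical as well.

```python
# Hypothetical tag numbering; only the existence of four tags is given.
TAG_PREFIX = {0: "r", 1: "c", 2: "u", 3: "g"}


def split_operand(field):
    """Return (tag, index) for a 16-bit rd/rs/rt field."""
    tag = (field >> 14) & 0x3   # upper two bits
    index = field & 0x3FFF      # lower 14 bits
    return tag, index


def disassemble_operand(field, is_dest=False):
    """Format an operand as 'rn', 'cn', 'un', or 'gn'."""
    tag, index = split_operand(field)
    prefix = TAG_PREFIX[tag]
    if is_dest and prefix == "c":
        # rd is a destination, so it can never name a constant.
        raise ValueError("rd can never be a constant operand")
    return f"{prefix}{index}"
```

With 14 index bits, each location can address at most 16384 entries per function, and under this numbering a field of 0 disassembles to r0, the 'this' pointer.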

Paired Instructions

Some operations need more data than fits into a single instruction. In other cases, two instructions never occur apart and can be executed as a single operation. Either way, the first instruction of a pair never occurs without the second. The following pairs exist:

  • cmp - jxx: A cmp instruction is always followed by some jump instruction (one of jlt, jle, or je). These pairs are executed as a single operation to do a conditional jump.
  • swcmp - jxx: swcmp is "switch comparison", and is a special kind of comparison used in switch statements. Just like a regular cmp, it's followed by some jump instruction.
  • is - jxx: The is instruction is used to implement 'is' expressions, and is just like cmp.
  • istrue - jxx: istrue is used to implement conditional expressions based on some non-comparison expression (like "if(x){}"). Works the same as cmp.
  • foreach - je: The foreach instruction advances iteration in a foreach loop, and will repeat the loop if there are more values. The paired je specifies the instruction offset to the beginning of the loop.
  • method - [t]call: This pair of instructions implements method calls without custom contexts (i.e. "x.f()"). The second instruction can be either call or tcall. tcall performs a tailcall, call performs a normal call.
  • precall - [t]call: This pair of instructions implements non-method calls and method calls with custom contexts ("f()", "f(with x)", and "x.f(with y)"). Again, the second instruction can be call for a normal call or tcall for a tailcall.
  • closure [- {mov}]: Unlike the other pairs, this one has a variable length. closure creates a new function closure from a nested function. It's followed by 0 or more mov instructions, one for each upvalue of the nested function. Note that these are not actually moves: they are just there to embed data in the instruction stream.
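To show what executing a pair as a single operation can look like, here is a minimal Python sketch of a dispatch loop that handles cmp - jxx. It is only an illustration under several assumptions not stated in this section: instructions are modeled as tuples rather than 64-bit words, the jump offset is taken relative to the instruction after the jump, each jump tests a three-way comparison result in the obvious way, and mov is an ordinary register copy. All names are hypothetical.

```python
def three_way(a, b):
    """Return -1, 0, or 1 for a < b, a == b, a > b."""
    return (a > b) - (a < b)


# Assumed meaning of each jump when paired with cmp.
JUMP_TESTS = {
    "jlt": lambda c: c < 0,
    "jle": lambda c: c <= 0,
    "je": lambda c: c == 0,
}


def run(code, regs):
    """Execute a list of (name, *operands) tuples; return the pcs dispatched."""
    pc = 0
    trace = []
    while pc < len(code):
        trace.append(pc)
        name, *ops = code[pc]
        if name == "cmp":
            # Read the paired jump together with the cmp: one fused step.
            jump_name, offset = code[pc + 1]
            result = three_way(regs[ops[0]], regs[ops[1]])
            pc += 2  # step past both halves of the pair
            if JUMP_TESTS[jump_name](result):
                pc += offset  # assumed: relative to the next instruction
        elif name == "mov":
            regs[ops[0]] = regs[ops[1]]
            pc += 1
        else:
            raise ValueError(f"unknown instruction {name}")
    return trace
```

The jump is never dispatched on its own: the loop reads it as part of the preceding cmp and then steps over both, so the position of the jump never appears in the returned trace. The other pairs could be handled the same way, with the first instruction's handler consuming whatever follows it.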