Execution

Once the source code has been compiled into an easily executed form, such as bytecode, it has to be executed. Like the Compilation section, this section is not binding, and its details depend on the form into which the source code is compiled. It describes how the standard implementation accomplishes execution.

This page is more for me than for other implementors; sometimes I forget my rationale for adding certain opcodes or for implementing them a certain way.

The Interpreter

The interpreter is the part of the MiniD virtual machine (MDVM) which actually reads the bytecode and acts accordingly. It uses 64-bit RISC-like instructions. There are three instruction formats: R, I, and J.

R-format instructions look like this:

16 bits | 16 bits | 16 bits | 16 bits
Opcode  | rd      | rs      | rt

The opcode field selects the operation. rd is the destination; rs and rt are the two sources. The format of rd, rs, and rt is explained below.

I-format instructions preserve the opcode and rd fields, but combine the rs and rt fields:

16 bits | 16 bits | 32 bits
Opcode  | rd      | Unsigned Immediate

J-format instructions are almost identical to I-format, except their immediate value is signed. J-format instructions are used mostly for jumps, which need to be able to jump forward or backward.
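
The three formats can be unpacked with a few shifts and masks. The following is a minimal sketch in Python, not the standard implementation's code. It assumes that the fields are packed in the left-to-right order shown above (opcode in the top 16 bits) and that the J-format immediate is two's complement; neither is stated on this page. The names decode_r, decode_i, decode_j, and encode are hypothetical.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def decode_r(word):
    """R-format: split a 64-bit word into (opcode, rd, rs, rt), 16 bits each."""
    return ((word >> 48) & MASK16, (word >> 32) & MASK16,
            (word >> 16) & MASK16, word & MASK16)

def decode_i(word):
    """I-format: (opcode, rd, unsigned 32-bit immediate)."""
    return ((word >> 48) & MASK16, (word >> 32) & MASK16, word & MASK32)

def decode_j(word):
    """J-format: like I-format, but the immediate is read as signed."""
    opcode, rd, imm = decode_i(word)
    if imm & 0x80000000:      # sign bit set: assumed two's complement
        imm -= 1 << 32
    return opcode, rd, imm

def encode(opcode, rd, low32):
    """Helper for experimenting: pack an I- or J-format word."""
    return (opcode << 48) | (rd << 32) | (low32 & MASK32)
```

The same 64 bits decode differently depending on the format the opcode calls for, which is why a backward jump offset only makes sense when read as J-format.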

Tagged Fields (rd, rs, rt)

rd, rs, and rt all share a similar format. They are all 16 bits, but the upper two bits are used as a location tag and the lower 14 bits as an index of some sort.

There are four possible locations where a value may reside in MiniD: a local variable in the current function, a constant in the constant table of this function, an upvalue (local in an enclosing function), or a global. These four locations are indicated by the upper two bits of rd, rs, and rt. The index is interpreted differently based on the tag:

  • Local: the index is interpreted as an index into the local stack frame. Local 0 is always the 'this' pointer. When disassembled, this type of operand is denoted as 'rn', where n is the index.
  • Constant: the index is interpreted as an index into the constant table for this function. When disassembled, this type of operand is denoted as 'cn', where n is the index.
  • Upvalue: the index is interpreted as an index into the upvalue table for this function. When disassembled, this type of operand is denoted as 'un', where n is the index.
  • Global: the index is interpreted as an index into the constant table for this function. That constant holds a string name of the global. When disassembled, this type of operand is denoted as 'gn', where n is the index of the name constant.

Note that rd, being the destination, can never be a 'cn' operand.
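
A minimal sketch of splitting a tagged field and producing its disassembled name, in Python. The numeric tag values (0 for local, 1 for constant, 2 for upvalue, 3 for global) are an assumption made for illustration; this page only says that the upper two bits select one of the four locations. The function names are hypothetical.

```python
# Assumed tag numbering, for illustration only.
TAG_PREFIX = {0: "r", 1: "c", 2: "u", 3: "g"}

def split_operand(field):
    """Return (tag, index) for a 16-bit rd/rs/rt field."""
    tag = (field >> 14) & 0x3     # upper two bits
    index = field & 0x3FFF        # lower 14 bits
    return tag, index

def disassemble_operand(field, is_dest=False):
    """Render a field as 'rn', 'cn', 'un', or 'gn'."""
    tag, index = split_operand(field)
    prefix = TAG_PREFIX[tag]
    if is_dest and prefix == "c":
        # rd is the destination, so it can never name a constant.
        raise ValueError("rd can never be a constant operand")
    return f"{prefix}{index}"
```

With 14 index bits, each location can address at most 16384 entries per function.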

Opcodes

Each entry below is listed as EnumName (disassembled name).

Arithmetic and Bitwise

Add (add)
Sub (sub)
Mul (mul)
Div (div)
Mod (mod)
The five binary arithmetic operations. Perform the operation on RS and RT and place the result in RD.
Neg (neg)
Arithmetic negation. Place the negation of RS in RD.
AddEq (addeq)
SubEq (subeq)
MulEq (muleq)
DivEq (diveq)
ModEq (modeq)
Compound-assignment forms of the five binary arithmetic operations.
Inc (inc)
Dec (dec)
Increment and decrement.
And (and)
Or (or)
Xor (xor)
Shl (shl)
Shr (shr)
UShr (ushr)
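The binary bitwise operations. Perform the operation on RS and RT and place the result in RD. UShr is the unsigned shift right.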
Com (com)
Bitwise complement. Place the complement of RS in RD.
AndEq (andeq)
OrEq (oreq)
XorEq (xoreq)
ShlEq (shleq)
ShrEq (shreq)
UShrEq (ushreq)
Compound-assignment forms of the binary bitwise operations.

Data Transfer

Move (mov)
Move the value in RS into RD.
MoveLocal (movl)
Local-to-local move.
LoadConst (lc)
Load const at index RS into local at index RD.
LoadBool (lb)
Load bool with value of RS into RD.
LoadNull (lnull)
Load null into RD.
LoadNulls (lnulls)
Set IMM stack slots starting at index RD to null.
NewGlobal (newg)
Attempt to create new global with constant string at index RT and assign it the value RS.

Logical and Control Flow

Import (import)
Import module with name RS and place its namespace in RD.
Not (not)
Set RD to the inversion of the truth value of RS.
Cmp (cmp)
Compare RS and RT. Always followed by Jxx.
Je, Jne (je, jne)
Jle, Jgt (jle, jgt)
Jlt, Jge (jlt, jge)
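Conditional jumps. Each follows a comparison instruction and jumps, or not, based on that comparison's result.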
Equals (equals)
Compare RS and RT for equality. Always followed by J[n]e.
Cmp3 (cmp3)
Set RD to the 3-way comparison value of RS and RT.
SwitchCmp (swcmp)
Perform switch-comparison (half-breed of 'is' and '==') on RS and RT. Always followed by Je.
Is (is)
Perform identity comparison of RS and RT. Always followed by J[n]e.
IsTrue (istrue)
See if RS is true. Always followed by J[n]e.
Jmp, Nop (jmp, nop)
Unconditional jump, or nop. I don't know whether the compiler still emits nop.
Switch (switch)
Switch on the value in RS in switch table at index RT. If a value matches, jump to its offset; otherwise, if there is a default, jump to it; otherwise error.
Close (close)
Close all upvalues pointing to stack slots >= index in RD.
For (for)
Set up a numeric for loop. Slot at RD is the low limit, and is reused as the hidden iteration index. RD + 1 is high, RD + 2 is step. After setting up the loop vars, jump forward by IMM to corresponding ForLoop instruction.
ForLoop (forloop)
Update the for loop. Hidden iteration index is at RD, limit at RD + 1, step at RD + 2, visible iteration index at RD + 3. If the loop is to continue, jump backwards by IMM.
Foreach (foreach)
Set up a foreach loop. Container is at RD, RD + 1, RD + 2. After setting up container, jump forward by IMM to corresponding ForeachLoop instruction.
ForeachLoop (foreachloop)
Update the foreach loop. Always followed by Je which holds the jump target. Container is at RD, RD + 1, RD + 2. Public indices begin at RD + 3, and there are IMM of them. If the loop is to continue, jump to the jump target specified in the following Je.
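
The For and ForLoop entries above split the loop into a setup step and an update step that share four stack slots. The following Python sketch simulates that slot layout. Several details are assumptions and are not stated on this page: that the high limit is exclusive, that a zero step is rejected during setup, and that ForLoop copies the hidden index to the visible one before advancing it. The function name is hypothetical.

```python
def run_numeric_for(lo, hi, step, body):
    """Simulate For/ForLoop over slots RD .. RD + 3."""
    # Slot 0: hidden index (starts as the low limit), 1: limit,
    # 2: step, 3: visible index seen by the loop body.
    slots = [lo, hi, step, None]

    # For: set up the loop variables (zero-step check is an assumption).
    if slots[2] == 0:
        raise ValueError("step may not be zero")
    # For then jumps forward to ForLoop, so the test below runs
    # before the body executes for the first time.

    while True:
        # ForLoop: decide whether the loop continues.
        idx, limit, st = slots[0], slots[1], slots[2]
        keep_going = idx < limit if st > 0 else idx > limit
        if not keep_going:
            break                  # fall through past the loop
        slots[3] = idx             # publish the visible index
        slots[0] = idx + st        # advance the hidden index
        body(slots[3])             # "jump backwards by IMM" into the body
```

Keeping a hidden index separate from the visible one presumably lets the body assign to its loop variable without disturbing the iteration.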

Exception Handling

PushCatch (pushcatch)
Push a catch frame onto the EH stack. Executed at the beginning of the try of a try-catch.
PushFinally (pushfinal)
Push a finally frame onto the EH stack. Executed at the beginning of the try of a try-finally.
PopCatch (popcatch)
Pop an EH frame. Executed at the end of the try of a try-catch.
PopFinally (popfinally)
Pop an EH frame that corresponds to a finally. Executed at the end of the try of a try-finally.
EndFinal (endfinal)
Executed at the end of a finally block. If there is an exception in-flight, continue the throw. If there is an unwind in progress, continue unwinding. Otherwise, nop.
Throw (throw)
Throw the object in RS.

Function Calling

Method (method)
Call the method named RT on the object RS with base slot RD. Always followed by a [Tail]Call instruction which specifies the number of params and returns. The base slot will hold the method, base + 1 will hold 'this', and the slots from base + 2 onward already hold the params.
MethodNC (methodnc)
Same as above, but do not use RS as 'this'.
SuperMethod (smethod)
Similar to above two, but looks up method in proto (if available) instead of in RS. Still uses RS as 'this' though.
Call (call)
Call function. RD is base reg - function is at RD, 'this' is at RD + 1, params follow. RS is number of params + 1. If RS is 0, use all values to current end-of-stack (multivalue params). RT is number of expected return values + 1. If RT is 0, get all possible return values.
Tailcall (tcall)
Similar to above, but performs a tail call instead. The number of returns isn't really important.
SaveRets (saverets)
Save return values onto the return value stack. Return values start at index RD. IMM is number of values + 1. If IMM is 0, use all values to current end-of-stack (multivalue return). This is separated from Ret since an EH unwind may need to be performed between SaveRets and Ret. Always followed, either immediately or one instruction after, by a Ret.
Ret (ret)
Returns. Unwinds all EH frames for this function. Return values are then adjusted for the calling function.
Unwind (unwind)
Can be generated by a break, continue, or return which occurs within the try of a try-finally. Causes EH frames to be unwound, calling any intervening finally blocks as it goes. The number of EH frames to unwind is specified by UIMM.
Vararg (varg)
Gets UIMM - 1 varargs and places them in slots starting at RD. If UIMM == 0, get all varargs.
VargLen (varglen)
Places the number of varargs passed to this function in RD.
VargIndex (vargidx)
Places the RS'th vararg into RD.
VargIndexAssign (vargidxa)
Places the value in RT into the RS'th vararg.
VargSlice (vargslice)
Slice varargs, yielding a multivalue. Low and high slice indices are at RD and RD + 1. UIMM is number of values desired + 1. If UIMM == 0, get all values in the slice.
Yield (yield)
Yield from this thread. Similar to Call. Values to yield start at RD. Yield RS - 1 values (or all to end of stack if RS == 0). Expect RT - 1 returns (or all possible if RT == 0).
CheckParams (checkparams)
Check parameter types against type constraint masks for this function. Appears after any default parameter conditional assignments. Only emitted if type constraints were enabled and at least one param had a non-'all' constraint.
CheckObjParam (checkobjparm)
Check that a given parameter derives from a given class. Parameter is RS. Class is RT. RD is the index of a temporary local which is assigned 'true' if RS is not an instance or RS derives from RT, and 'false' otherwise. Always followed immediately by a cmp-je which tests that flag against true and jumps past the corresponding ObjParamFail. Hm, why couldn't this just take a Je itself?

This generates 'true' if RS is not an instance because by this point, it has already been determined to be a legal parameter by the bitmask check.

ObjParamFail (objparamfail)
If a parameter does not match any of the class types compared against in the CheckObjParam instructions, this instruction throws an informative error. The index of the parameter that failed is in RS.
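
Several entries above (Call, SaveRets, Vararg, VargSlice, Yield) share one convention: a count field holds the count plus one, and 0 means "everything available". A minimal Python sketch of that convention follows. The function names are hypothetical, and the stack is modelled as a plain list for illustration.

```python
def decode_count(field, available):
    """Field 0 means 'all available'; otherwise the count is field - 1."""
    return available if field == 0 else field - 1

def encode_count(n=None):
    """Inverse of decode_count; None stands for a multivalue (take all)."""
    return 0 if n is None else n + 1

def take_values(stack, start, field):
    """Values an instruction refers to, beginning at slot `start`."""
    available = len(stack) - start   # everything up to end-of-stack
    count = decode_count(field, available)
    return stack[start:start + count]
```

Biasing the count by one frees the value 0 to act as the multivalue marker while still allowing an explicit count of zero (encoded as 1).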

Array and List Operations

Length (len)
Place the length of the value in RS in RD.
LengthAssign (lena)
Assign RS to the length of the value in RD.
Append (append)
Append the value held in RS to the array held in RD. This is not generated by ~=, but by array comprehensions, and differs in that it does not 'unpack' sub-arrays when appending (i.e. if you have [] and append [1, 2], the result is [[1, 2]]).
SetArray (setarray)
Generated by array literals. Sets a block of locals starting at RD + 1 and extending for RS - 1 slots (all to end of stack if RS == 0) into the array at RD. RT holds the index of the block to set. The way array literals work is that there is a "block size" (default of 30). Every blockSize values, they are all put into the array at once. The "block index" (RT) is how many multiples of the block size into the array the values should go. That is, if the block size is 30, and block index is 3, then the values should go into the array starting at index 90.
Cat (cat)
Concatenate RT - 1 values (RT == 0 means to end of stack) starting at RS and place the result in RD.
CatEq (cateq)
Append RT - 1 values (RT == 0 means to end of stack) starting at RS to the object in RD.
Index (idx)
Index the object in RS with the value in RT and place the result in RD.
IndexAssign (idxa)
Index the object in RD with the value in RS and assign it the value in RT.
Field (field)
This doesn't really belong in this section. Get the field with name RT from the object in RS and place the result in RD.
FieldAssign (fielda)
Set the field in object RD with name RS to the value RT.
Slice (slice)
Slice the object at RS with indices [RS + 1 .. RS + 2] and place the result in RD.
SliceAssign (slicea)
Slice-assign the object at RD with indices [RD + 1 .. RD + 2] with the value in RS.
NotIn (notin)
Place the result of 'RS !in RT' in RD.
In (in)
Place the result of 'RS in RT' in RD.
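
The block arithmetic described under SetArray above can be sketched as follows, in Python. The block size of 30 is the default given in that entry. Growing the array when the values run past its current length is an assumption, made to model a literal that ends in a multivalue. The function name is hypothetical.

```python
def set_array_block(array, values, block_index, block_size=30):
    """Store `values` into `array` at offset block_index * block_size."""
    start = block_index * block_size
    end = start + len(values)
    if end > len(array):
        # Assumed behaviour: grow to fit a trailing multivalue.
        array.extend([None] * (end - len(array)))
    array[start:end] = values
    return start
```

With the default block size, block index 3 places its first value at index 90, matching the example in the SetArray entry.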

Value Creation

NewArray (newarr)
Create a new array of length UIMM and place it in RD. The length specified here may be different from the final length of the array if it ends in a multivalue.
NewTable (newtab)
Create a new table and place it in RD.
Closure (closure)
Create a closure object from the nested function at index RS. If RT == 0, the closure's environment will be the same as the current function's environment. If RT != 0, use the namespace in RT as the environment. The new closure is placed in RD.

If the closure has upvalues, it will be followed by as many Move instructions as it has upvalues. They are not real Moves, but are just there to embed data in the instruction stream. Each Move instruction uses RD == 0 to indicate that the upvalue exists in the current function (in which case the Move's RS indicates which local), and RD == 1 to indicate that the upvalue exists further up (in which case the Move's UIMM indicates the index into the current function's upvalue table).

Class (class)
Create a class with name RS, deriving from RT, and place it in RD.
Coroutine (coroutine)
Create a thread object from the function in RS and place it in RD.
Namespace (namespace)
Create a namespace with name RS, parent RT, and place it in RD.
NamespaceNP (namespacenp)
Create namespace with name RS, using the current function's environment as the parent, and place it in RD. "NP" stands for "no parent", i.e. it was not declared with a parent.
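
The pseudo-Move instructions that follow a Closure (see the Closure entry above) are data, not code. A minimal Python sketch of reading them follows. Each pseudo-Move is modelled as an (rd, index) pair, where index stands for the Move's RS when rd == 0 and for its UIMM when rd == 1; this pair representation and the function name are hypothetical.

```python
def read_upvalue_descriptors(pseudo_moves):
    """Turn the pseudo-Moves after a Closure into capture descriptors."""
    descriptors = []
    for rd, index in pseudo_moves:
        if rd == 0:
            # Upvalue is a local of the current function.
            descriptors.append(("local", index))
        elif rd == 1:
            # Upvalue lives further up: index into the current
            # function's own upvalue table.
            descriptors.append(("enclosing", index))
        else:
            raise ValueError(f"unexpected rd {rd} in upvalue pseudo-Move")
    return descriptors
```

The interpreter must consume these instructions as part of executing Closure and skip past them, since executing them as real Moves would be meaningless.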

Class Stuff

As (as)
If RS is not an instance, place "false" in RD. Otherwise, place a boolean in RD indicating whether or not RS derives from the class in RT.
SuperOf (superof)
Get the super of RS and place it in RD. RS can be an instance, a class, or a namespace.
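
As and CheckObjParam (in the Function Calling section) run the same derivation test but disagree on non-instances: As yields false, while CheckObjParam yields true. The Python sketch below contrasts the two. The MDClass and MDInstance types are hypothetical stand-ins for the VM's own objects, and treating a class as deriving from itself is an assumption.

```python
class MDClass:
    def __init__(self, name, base=None):
        self.name = name
        self.base = base

    def derives_from(self, other):
        """Walk the base chain; a class counts as deriving from itself."""
        cls = self
        while cls is not None:
            if cls is other:
                return True
            cls = cls.base
        return False

class MDInstance:
    def __init__(self, cls):
        self.cls = cls

def op_as(rs, rt):
    """As: false for non-instances, otherwise the derivation test."""
    if not isinstance(rs, MDInstance):
        return False
    return rs.cls.derives_from(rt)

def check_obj_param_flag(rs, rt):
    """CheckObjParam: true for non-instances, which the earlier
    bitmask check has already accepted as legal parameters."""
    if not isinstance(rs, MDInstance):
        return True
    return rs.cls.derives_from(rt)
```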