Note: This website is archived. For up-to-date information about D projects and development, please visit wiki.dlang.org.

High-level Assembler and Code Generation

(pragma) The idea of runtime code generation has been of particular interest to me since I began work on DDL. The trick is that in order to make good code, you need to mesh it with the runtime environment. This means being able to link to arbitrary symbols at runtime as well as generate opcodes, which is why DDL is so important for making such an idea feasible.

The idea is simple: implement a minimal instruction set by which D-compatible structures and functions can be composed, compiled and linked at runtime. Building directly on top of DDL, we can easily create a rich capability stack for this purpose and more:

  • RuntimeCompiler - the actual runtime compiler, which supports our runtime instruction set and produces an in-memory image and a set of fixups
      • CodeNode - some construct to hold a piece of the parse-tree or token list as generated elsewhere
      • Fixup - something to bind symbols to spots in the code that need patching after linking
  • HLALibrary - custom DDL Library type that invokes the compiler, resolves fixups and is linker friendly
  • HLABuilder - the 'friendly assembler' interface that eases the creation of CodeNodes
      • HLABuilderTarget - basically a consumer of the parse-tree as it is generated by the builder. HLALibrary is one such target.
  • DBuilder - extends HLABuilder and makes complex D constructs, such as classes, easier to generate.

What follows is a rough outline and specification for this stack.

Runtime Compiler

The runtime compiler doesn't have to be as big and scary as it sounds. Ultimately, its job will be to compose a memory image, fixups and ExportSymbols that directly represent the CodeNodes that are handed to it. The structure of the CodeNodes should roughly resemble a parse tree, with some nesting depending on what is needed. The data in such a tree should be kept as platform-independent as possible, so that the compiler itself is the only piece in this capability stack that needs porting.

On its own, the output will be largely useless, except in very controlled circumstances where no external references or fixups are generated. In every other case, a link pass of some kind will be needed to make the generated image useful. This is where the higher levels of the stack come into play.

It should be noted that since the runtime compiler, by definition, creates code that has yet to be linked, the output can easily be serialized to a file for later use. This would make it useful in developing a compiler back-end, for example.

HLALibrary

The HLA Library will wrap the compiler and provide a DDL-level interface, so that runtime linking becomes virtually transparent. This solves the problem of resolving the fixups and external references generated by the compiler pass, and also provides an interface for querying runtime symbols. It is at this level that generated code becomes a first-class citizen within a properly equipped D program.
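To make the fixup-resolution step a little more concrete, here is a minimal sketch in present-day D (not the older dialect of the concept code further down). Everything in it is an assumption made for illustration: the `Fixup` layout, the `applyFixups` helper, the plain associative array standing in for the DDL symbol lookup, and the choice of 32-bit little-endian absolute addresses. None of it is part of DDL.

```d
import std.stdio;

// Hypothetical fixup record: where to patch, and which symbol supplies the address.
struct Fixup
{
    size_t offset;   // byte offset of the hole in the image
    string symbol;   // name of the symbol whose address fills the hole
}

// Writes a 32-bit little-endian address into each fixup site.
// Returns the names of the symbols that could not be resolved.
string[] applyFixups(ubyte[] image, Fixup[] fixups, uint[string] symbols)
{
    string[] unresolved;
    foreach (fix; fixups)
    {
        auto addr = fix.symbol in symbols;   // null when the symbol is unknown
        if (addr is null)
        {
            unresolved ~= fix.symbol;
            continue;
        }
        foreach (i; 0 .. 4)
            image[fix.offset + i] = cast(ubyte)((*addr >> (8 * i)) & 0xFF);
    }
    return unresolved;
}

void main()
{
    // x86 "mov eax, imm32; ret" with a 4-byte hole for the immediate.
    ubyte[] image = [0xB8, 0, 0, 0, 0, 0xC3];
    auto fixups = [Fixup(1, "someSymbol")];
    uint[string] symbols = ["someSymbol": 0x1234_5678];

    auto unresolved = applyFixups(image, fixups, symbols);
    writeln(image, " unresolved: ", unresolved);
}
```

In the real stack the symbol table would presumably be the set of ExportSymbols known to the linker, and a non-empty list of unresolved names would mean the library cannot be used yet.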

HLABuilder

I'm drawing inspiration from Java's StringBuilder class here - the HLA Builder's job is to make a tough job easy. More to the point, it takes the onerous task of composing a parse-tree at runtime (a big pile of structs and/or objects) and turns it into a series of method calls that plug the needed parse tree nodes into an HLABuilderTarget (or parse-tree container) for later compilation.

It's worth noting that the methods exposed at this level should stay language-neutral wherever possible. The only specific constructs provided for are scalar types, scalar operations and platform capabilities (new/delete, volatile, locks, D and C callspecs, and synchronization). This will allow for alternate language implementations including, but not limited to, some kind of reduced cross-platform assembler syntax based directly on the CodeNode implementation.
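The builder/target split described above could be sketched as follows, again in present-day D. The names `Node`, `BuilderTarget`, `NodeList` and `Builder` are hypothetical stand-ins for CodeNode, HLABuilderTarget and HLABuilder, and only three node kinds are shown; this is an illustration of the shape of the idea, not a proposed API.

```d
import std.stdio;

enum NodeKind { label, push, pop }

// Hypothetical stand-in for a CodeNode.
struct Node
{
    NodeKind kind;
    string name;   // used by labels only
    uint width;    // operand width in bytes, 0 when unused
}

// Anything that consumes nodes as the builder produces them.
interface BuilderTarget
{
    size_t accept(Node node);   // returns the index assigned to the node
}

// Simplest possible target: just remembers the nodes in order.
class NodeList : BuilderTarget
{
    Node[] nodes;

    size_t accept(Node node)
    {
        nodes ~= node;
        return nodes.length - 1;
    }
}

// Turns method calls into nodes and hands them to the target.
class Builder
{
    private BuilderTarget target;

    this(BuilderTarget target) { this.target = target; }

    size_t label(string name) { return target.accept(Node(NodeKind.label, name, 0)); }
    size_t push(uint width)   { return target.accept(Node(NodeKind.push, null, width)); }
    size_t pop(uint width)    { return target.accept(Node(NodeKind.pop, null, width)); }
}

void main()
{
    auto list = new NodeList;
    auto b = new Builder(list);
    b.label("entry");
    b.push(4);
    b.pop(4);
    writeln(list.nodes.length, " nodes recorded");
}
```

Returning an index rather than a pointer is one possible answer to the note in the concept code about handles being integers; a pointer-based handle would work just as well for this sketch.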

DBuilder

This layer provides an extension to the HLABuilder that further simplifies the task of generating a parse-tree. The pragmatism of the HLABuilder is put aside in favor of a D-language specific set of methods that put the generation of classes, structs, unions, delegates, etc. within easy grasp.

The DBuilder may need several helper classes to get to this goal. Also, it may be best rendered as a mirror image of, or an extension to, a formal reflection interface.

Early Concept Code

Here's a very rough sketch of where I started, and where some of this can go. The code neither compiles nor is complete in any way, shape or form. It exists merely to illustrate a work in progress.

enum TokenType: uint{
	Label,
	Forward,
	External,
	
	DReturn,
	DCall,
	DFrame,
	
	CReturn,
	CCall,
	CFrame,
	
	Push,
	Pop,
	
	Data,
	
	Variable,
	BinaryOp,
	UnaryOp,
	
	Mov,Swap,Test,Branch
}

enum VariableType: uint{
	Local,
	Varargs,
	Temporary,
	Argument,
	Constant
}

enum BinaryOp{
	Add,Sub,Mul,Div,Mod,
	Shl,Shr,Rol,Ror,
	And,Or,Xor
}

enum UnaryOp{
	Not
}

enum Condition{
	Equals,	NotEquals,
	Carry, NotCarry,
	GreaterThan, GreaterThanEqualTo,
	LessThan, LessThanEqualTo
}

struct Varspec{
	uint width;
	union value{
		/* all primitive types */
	};
}

struct Opspec{
	uint width;
	bool signed;
	BaseHLAToken*[3] tokens;
}

// represents a discrete portion of a HLA image
struct BaseHLAToken{
	uint type;
	union data{
		char[] name;
		void[] data;
		Opspec opspec;
		BaseHLAToken* token;
		BaseHLAToken[] tokens;
	};
	HLAContext homeContext; // where we came from
}

/**
	Lowest level interface - the actual compiler
	 - modeled around a list of BaseHLATokens and a symbol table (DDLLibrary)
	 - internally uses an HLACompiler implementation
**/
abstract class HLALibrary: DDLLibrary{
	// looks up local and global symbols (via DDL)
	ExportSymbol getSymbol(HLATokenPtr symbol);
	
	// add tokens to the compiler's token store
	uint addToken(BaseHLAToken token);
	uint addTokens(BaseHLAToken[] tokens...);
	uint setTokens(BaseHLAToken[] tokens...);
	
	void compile();
}

// add typesafety for fewer errors
typedef BaseHLAToken HLAToken;
typedef BaseHLAToken HLAVariable;

//NOTE: these *could* be redone as integers instead of pointers
alias HLAToken* HLATokenPtr;
alias HLAVariable* HLAVariablePtr;

/**
	Friendlier interface for adding code to an HLALibrary.
**/
abstract class HLABuilder{
	/** 
		Moves the internal context pointer.
	*/
	HLATokenPtr label(char[] label); // label
	HLATokenPtr forward(char[] label); // forward reference
	HLATokenPtr external(char[] label); // external reference
	HLATokenPtr publicSymbol(char[] label); // exported symbol
		
	HLATokenPtr dreturn(HLAVariablePtr value=null);
	HLATokenPtr dcall(HLATokenPtr func,HLATokenPtr dest,HLAVariablePtr[] arguments...);
	HLATokenPtr dframe();
	
	HLATokenPtr creturn();
	HLATokenPtr ccall();
	HLATokenPtr cframe();
	
	HLATokenPtr push(HLATokenPtr value);
	HLATokenPtr pop(HLATokenPtr dest);
	
	HLATokenPtr data(ubyte[] data);
	
	HLAVariablePtr var(uint width,uint value=0);
	HLAVariablePtr tempvar(uint width,uint value=0);
	HLAVariablePtr arg(uint width);
	HLAVariablePtr constant(uint width,uint value);
	HLAVariablePtr varargs(HLAVariablePtr[] arguments...);
	
	HLATokenPtr binaryOp(uint op,uint width,bool signed,HLAVariablePtr left,HLAVariablePtr right,HLAVariablePtr dest=null);
	HLATokenPtr unaryOp(uint op,uint width,bool signed,HLAVariablePtr left,HLAVariablePtr dest=null);

	HLATokenPtr mov(uint width,HLAVariablePtr dest,HLAVariablePtr src);
	HLATokenPtr swap(uint width,HLAVariablePtr left,HLAVariablePtr right);
	HLATokenPtr test(uint width,HLAVariablePtr left,HLAVariablePtr right);
	HLATokenPtr branch(uint condition,HLATokenPtr ifTrue,HLATokenPtr ifFalse=null);
	
	//additional aliases here for operations and more
}

/**
void foo(int a,int b){
	writefln("hello world: %d",a+b);
}
**/
void example(DDLLinker linker){
	// function handle - we'll call this later
	void function(int a,int b) testFunc;
	
	// create a new context and generate some code!
	auto lib = new HLALibrary();
	with(new HLABuilder(lib)){
		publicSymbol("_D3fooFiiZv");
		dframe;
			// function arguments are laid out
			auto a = arg(4);
			auto b = arg(4);
			
			auto temp1 = tempvar(4);
			binaryOp(BinaryOp.Add,4,false,a,b,temp1);
			
			// NOTE: the "hello world: %d" format string is not passed in this sketch
			auto temp2 = varargs(temp1);
			auto fn_writefln = external("_D3std5stdio8writeflnFZv");
			dcall(fn_writefln,null,temp2);
		dreturn;
	}
	lib.compile();
	linker.link(lib);
	testFunc = cast(void function(int,int))lib.getSymbol("_D3fooFiiZv").address;
}

/** 
	class Foobar{
		int x,y;
	}
**/
void example2(DDLLinker linker){
	// create a new context and generate some code!
	auto lib = new HLALibrary();
	with(new HLABuilder(lib)){
		// externs and forward references
		auto objectCtor = external("__ctor3std6ObjectFz3std6Object");
		auto newHandle = external("__ctor3std6ObjectFz3std6Object"); // FIXME: duplicates objectCtor; should name the allocator symbol
		auto memcpyHandle = external("memcpy");
				
		// useful numbers
		auto thisSizeof = constant(4,16);
		
		auto _classinfo = publicSymbol("__class6Foobar");
			data(4,thisSizeof);
			//TODO: typeinfo base classes, etc
			
		auto _vtbl = publicSymbol("__vtblD6Foobar");
			data(4,_classinfo);
		
		auto _init = publicSymbol("__init__6Foobar");
			data(4,_vtbl);
			data(4,0); // "monitor"
			data(4,0); // x
			data(4,0); // y
			
		auto _defaultCtor = publicSymbol("__ctor6FoobarFz6Foobar");
		dframe;
			// allocate and init
			auto _this = tempvar(4);
			dcall(newHandle,_this,thisSizeof);
			dcall(memcpyHandle,null,_this,_init,thisSizeof); // memcpy(dest,src,n): copy the init image into the new object
			
			// super
			dcall(objectCtor,_this);
			
			// no code here
		dreturn(_this);
	}
	lib.compile();
	linker.link(lib);
}
	

/+
	Compilation process:
	
	First Pass:
		- generate local xref for label names
		- generate local xref for public names
		- anchor locals to targets via lookahead
		- anchor publics to targets via lookahead
		
	Second Pass:
		- render tokens to code and/or data
		- resolve forward references to locals via xref
		- resolve forward references to publics via xref
		- generate fixups for local forward references
		- generate fixups for local references
		- generate fixups and strong ExportSymbols for publics
		- generate fixups and extern ExportSymbols for external references
	
	Post-Link Fixup process:
		- resolve fixup targets with addresses in referenced ExportSymbols		
+/
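
The first and second passes outlined in the comment above can be illustrated with a small two-pass assembler for local labels only. This is a sketch in present-day D under loud assumptions: the `Token` layout, the `tokenSize` and `assemble` helpers, and the made-up encoding (a jump is a placeholder opcode byte `0xFF` followed by a 4-byte little-endian offset) are all hypothetical. External references, which would produce fixups instead of resolved offsets, are left out.

```d
import std.stdio;

enum Kind { label, data, jump }

// Hypothetical token: a label definition, raw data, or a jump to a label.
struct Token
{
    Kind kind;
    string name;     // label name, or the jump's target label
    ubyte[] bytes;   // payload for data tokens
}

// Number of bytes a token occupies in the image (assumed encoding).
size_t tokenSize(Token t)
{
    final switch (t.kind)
    {
        case Kind.label: return 0;
        case Kind.data:  return t.bytes.length;
        case Kind.jump:  return 5;   // 1 opcode byte + 4 offset bytes
    }
}

ubyte[] assemble(Token[] tokens)
{
    // First pass: build the local xref of label name -> image offset.
    size_t[string] xref;
    size_t offset = 0;
    foreach (t; tokens)
    {
        if (t.kind == Kind.label)
            xref[t.name] = offset;
        offset += tokenSize(t);
    }

    // Second pass: render tokens, resolving label references via the xref.
    ubyte[] image;
    foreach (t; tokens)
    {
        final switch (t.kind)
        {
            case Kind.label:
                break;
            case Kind.data:
                image ~= t.bytes;
                break;
            case Kind.jump:
                // Looking up an undefined label is a runtime error in this sketch.
                auto target = cast(uint) xref[t.name];
                image ~= cast(ubyte) 0xFF;   // placeholder opcode
                foreach (i; 0 .. 4)
                    image ~= cast(ubyte)((target >> (8 * i)) & 0xFF);
                break;
        }
    }
    return image;
}

void main()
{
    // The jump refers to a label that is only defined later (a forward reference).
    auto tokens = [
        Token(Kind.jump, "end"),
        Token(Kind.data, null, [1, 2]),
        Token(Kind.label, "end"),
    ];
    writeln(assemble(tokens));
}
```

Because the first pass records every label before any code is rendered, forward references need no special handling in the second pass; only symbols that live outside the library would have to be deferred to the post-link fixup step.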