Functions

Functions are how you split your program up into reusable subprograms. MiniD's functions are fairly typical, but they have a few features which make them interesting:

  • Functions are first-class closures. I'll explain what this means, if you don't know already, a bit further down.
  • All MiniD functions have an implicit 'this' parameter, also called the "context", similar to methods in languages like D and Java. Again, this is explained further down.
  • Functions can return multiple values.

So let's get started.

Function Declarations

You've already seen how to declare functions just in the course of reading this guide, but here I'll describe all the possible forms of declarations and what all the bits mean.

function add(x, y)
{
    return x + y
}

This is a typical-looking function. It's identical to how you'd declare a function in ECMAScript or Squirrel, and looks pretty similar to function declarations in most C-style languages. You have the 'function' keyword, followed by the name, then any parameter names in parentheses, and finally the body. Unlike static languages such as D and Java, you do not indicate the return type or, for that matter, whether the function returns any values at all.

MiniD provides some syntactic sugar to make many function declarations much shorter. If your function's body is a single statement, the braces enclosing it are not necessary. So the above function can be written equivalently:

function add(x, y)
    return x + y

But there is an even shorter way to write this function. If your function's body just consists of a single return statement, you can write it like so:

function add(x, y) = x + y

That is, you put an equals sign after the parameters, followed by the value to return. This form restricts you to returning exactly one value.

The last thing to note about function declarations is that they may be optionally preceded by the 'local' or 'global' keywords, like so:

local function add(x, y) = x + y

This is actually a bit of syntactic sugar. In fact, all declarations, function declarations included, are just sugar. They all boil down to variable declarations, where the variable is assigned a function, class, or namespace object. The above is identical to:

local add
add = function add(x, y) = x + y

(The reason the creation and assignment are performed in two steps will be explained in the part on closures.)

If you don't put 'local' or 'global' on a declaration, it has a default "location." When you declare things at module scope, they default to global; when they are declared inside any function, they default to local. Putting 'local' or 'global' in front of the declaration just overrides the default.
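To make the default rule concrete, here's a small sketch. All the function names are made up for illustration, and I'm assuming that a 'global' declaration is accepted inside a function body, as the override rule above implies.

```minid
// At module scope, a plain declaration defaults to global.
function atModuleScope() = 1

function outer()
{
    // Inside a function, a plain declaration defaults to local,
    // so 'helper' is only visible within 'outer'.
    function helper() = 2

    // 'global' overrides the default (assumed to be legal here).
    global function exported() = 3

    return helper()
}

// 'local' overrides the module-scope default.
local function privateToModule() = 4
```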

Parameters

A function may take any number of parameters. The parameters you list, however, are more of a suggestion to the interpreter than anything else. Consider the following function:

function foo(x, y)
    writefln("x = {}, y = {}", x, y)

foo takes two parameters and prints out their values. If you were to call "foo(3, 4)", it would print "x = 3, y = 4". But it's also completely legal to call it as "foo(3)", or "foo(3, 4, 5)", or even just "foo()". In the case that a function is passed fewer arguments than it expects, the remaining parameters are set to null. In the case that a function is passed more arguments than it expects, the extra arguments are discarded. Here are some results of calling foo with varying numbers of arguments:

foo()        // prints "x = null, y = null"
foo(3)       // prints "x = 3, y = null"
foo(3, 4)    // prints "x = 3, y = 4"
foo(3, 4, 5) // prints "x = 3, y = 4"; 5 is just discarded

The behavior of setting missing parameters to null can be used to implement default values for parameters. MiniD provides some sugar for this as well. Consider the following function that uses a default value of 4 for its parameter.

function fork(x)
{
    if(x is null)
        x = 4
    ...
}

One way to make this shorter is to use the conditional assignment operator, or "?=".

function fork(x)
{
    x ?= 4 // equivalent to "if(x is null) x = 4"
    ...
}

But an even shorter way of writing it is to just put the default value in the parameter list, similar to other C-style languages.

function fork(x = 4)
{
    ...
}

When you put default parameters on a function's parameter list, it is the same as putting conditional assignments right at the top of the function body. The default values are evaluated left-to-right in that case.
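Here's a sketch of what that equivalence implies when one default depends on another. The names 'span' and 'spanExpanded' are hypothetical, and I'm assuming a default expression is allowed to refer to an earlier parameter, which should follow if defaults really are just conditional assignments run in order.

```minid
// Sugared form: 'hi' defaults to a value computed from 'lo'.
function span(lo = 0, hi = lo + 10)
    writefln("lo = {}, hi = {}", lo, hi)

// Hand-expanded form, following the rule described above.
function spanExpanded(lo, hi)
{
    lo ?= 0
    hi ?= lo + 10
    writefln("lo = {}, hi = {}", lo, hi)
}

span()        // should print "lo = 0, hi = 10"
span(5)       // should print "lo = 5, hi = 15"
span(5, 7)    // should print "lo = 5, hi = 7"
span(null, 3) // should print "lo = 0, hi = 3"
```

The last call shows a side effect of the desugaring: since the check is just "is this null?", passing null explicitly triggers the default as well.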

The 'this' Parameter

Every function in MiniD takes a hidden 'this' parameter. 'this' is mainly for use with methods of user-defined classes. For free functions, though, 'this' doesn't have much, if any, meaning. We can still demonstrate it, however. It'll become much more useful later on.

function knife(x)
    writefln("this = {}, x = {}", this, x)

This function just prints out its 'this' and 'x' parameters. If we just call it like "knife(3)", it will print out "this = null, x = 3". By default, when you call a free function directly (i.e. not indexed out of an object or namespace or the like), the 'this' parameter is just set to null.

You can override the 'this' parameter by using the 'with' keyword in a function call. If we call the function as "knife(with 5, 3)", it will print out "this = 5, x = 3". This is not very useful with free functions, but can be used to do some cool things with methods.
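Putting the calls side by side may help. The first two are the ones described above; the third assumes that 'with' can be used without any ordinary arguments after it.

```minid
function knife(x)
    writefln("this = {}, x = {}", this, x)

knife(3)            // prints "this = null, x = 3"
knife(with 5, 3)    // prints "this = 5, x = 3"
knife(with "spork") // should print "this = spork, x = null"
```

Note that the value after 'with' is not counted as an ordinary argument, so in the last call x is still filled with null.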

Variadic Arguments

Normally, functions discard extra arguments. But if you use a variadic function, the function will receive any arguments that are passed after the "normal" arguments in a special variable-sized list of values.

Here's a variadic function:

function voop(x, vararg) // the 'vararg' keyword at the end of the params makes it variadic
{
    writef("x = {} varargs = [", x)

    if(#vararg > 0)
    {
        write(vararg[0])

        for(i: 1 .. #vararg)
            write(", ", vararg[i])
    }

    writeln("]")
}

OK, I've thrown a lot at you. 'vararg' is a keyword, and placing it at the end of the parameter list makes the function variadic. 'vararg' within a function is used as an expression that "kind of sort of" works like an array. It's important to know that it isn't an array, though. You can get its length, index it, assign values into it, and slice it, but it's not actually an object. 'vararg' by itself evaluates to the list of extra parameters as a multivalue, which is explained further down.

Now that you know that 'vararg' sort of acts like an array, the workings of this function should be pretty obvious. It prints out its "normal" parameter, 'x', then any varargs. Here's a listing of what happens when you call this function.

voop()        // prints x = null varargs = []
voop(3)       // prints x = 3 varargs = []
voop(3, 4)    // prints x = 3 varargs = [4]
voop(3, 4, 5) // prints x = 3 varargs = [4, 5]

Just like a normal function, when you pass it fewer arguments than it expects, the remaining normal parameters are set to null. But unlike a normal function, when you give it more arguments than it has parameters, the extra arguments are collected into the varargs instead of being discarded.
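As a second, smaller sketch of the same idea, here's a function with no normal parameters at all that just adds up whatever it's given. The name 'sum' is made up for this example; it uses only the length and indexing operations described above.

```minid
function sum(vararg)
{
    local total = 0

    // #vararg is the number of extra arguments; the loop body
    // never runs when no arguments were passed.
    for(i: 0 .. #vararg)
        total += vararg[i]

    return total
}

writeln(sum())        // prints "0"
writeln(sum(1, 2, 3)) // prints "6"
```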

That's almost everything about variadic arguments. The rest will be explained below, in the section on return values and multivalues.

Parameter Type Constraints

Many features of MiniD were designed around the premise that people make mistakes when programming, in dynamic languages in particular. MiniD's compiler can't whine at you about much more than the most basic and obvious lexical and syntactic errors. Most semantic error checking can be performed only at runtime. Many other dynamic languages don't make much of an attempt to help you spot bugs. Some (like PHP and Perl) honestly don't care about you at all and will blindly convert between types left and right in an attempt to be "convenient."

MiniD cares about you!

Parameter type constraints are a way to help you catch bugs earlier, as well as a way to write self-documenting code. They allow you to define the set of allowable types for a given parameter. If you declare a function with parameter type constraints, and call it with parameters that violate those constraints, a descriptive exception will be thrown before the function even gets a chance to execute. This, combined with a stack trace, can make it much faster to spot bugs in your program. Without type constraints, a function called with unexpected parameters may throw a confusing error, or it could cause corruption in the internal state of some data structure which would not be seen until a much later time, making it extremely difficult to pinpoint the original cause.

The same effect can be achieved in other dynamic languages, such as Lua, but the solutions are usually cumbersome and incur a fair amount of overhead. Because of this, many programmers simply skip the checks, whatever their benefit, because they are large, ugly, and degrade performance to an unacceptable degree. For example, consider the following Lua function, which expects a number for the first parameter, a string or number for the second parameter, and an optional table as the third parameter.

function foo(a, b, c)
    assert(type(a) == "number", "Parameter a expected to be 'number', not '" .. type(a) .. "'")
    assert(type(b) == "number" or type(b) == "string", "Parameter b expected to be 'number' or 'string', not '" .. type(b) .. "'")
    c = c or {}
    assert(type(c) == "table", "Parameter c expected to be 'table' or 'nil', not '" .. type(c) .. "'")

    ...
end

We have incurred a lot of overhead in doing this. We have the overhead of a function call to type, an identity comparison of two strings, another call to type, a string concatenation, and a call to assert just to check the first parameter. Obviously, this could (and should) be abstracted out into a generic typechecking function, but that doesn't really solve the performance or syntax issues.

And it looks awful!

Let's consider the equivalent MiniD function:

function foo(a: int, b: int|string, c: table = {})
{
    ...
}

Yes, that's it. This does the same thing as the Lua function above. This function does still check its arguments each time it's called, but it's a much more efficient process. In the reference implementation, all the parameter type checking for this function is performed in three bitshifts and three bitwise AND operations. It's easy to use, efficient, and extremely beneficial. And if you are really worried about the performance of your code, you can flip a switch in the compiler and it will omit the parameter checking code. It doesn't get much better than that.

Now that you know why parameter type constraints are useful, we can see how to declare functions that use them.

As the above function shows, you write any type constraints after a colon after the parameter you're constraining. Parameter a only accepts ints. You can have a parameter accept multiple types by separating them with a pipe character, like with parameter b. Valid type names are "null", "bool", "int", "float", "char", "string", "table", "array", "function", "class", "instance", "namespace", "thread", "nativeobj", and "weakref". Most of these type names are not keywords. You are free to use these names elsewhere in your code; they just take on special meaning when used in parameter type constraints.

Parameter c is kind of special. When you put a default value on a parameter with a type constraint, it is implicitly given "null" as one of the allowable types. Parameter c could be equivalently declared "c: table|null = {}".
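A short sketch of how that plays out in practice, using a made-up function 'greet'. The behavior shown follows from the two rules above (null is implicitly allowed, and a null parameter then receives the default); the failing call is left commented out so the rest can run.

```minid
function greet(name: string = "world")
    writefln("hello, {}", name)

greet()         // prints "hello, world"
greet(null)     // prints "hello, world"; null is allowed, then replaced by the default
greet("MiniD")  // prints "hello, MiniD"
// greet(5)     // would throw: int is not an allowed type for 'name'
```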

There are two other type constraints which cannot be combined with any other constraint. The first is "any", which is the default: the parameter will accept any type. The other is "!null", which means that the parameter may be any type other than null. "!null" is useful for enforcing that a function be called with a minimum number of parameters. Just put "!null" on the last required parameter, and calling the function with any fewer will throw an error:

function bar(a, b, c: !null)
	writefln("a = {}, b = {}, c = {}", a, b, c)
	
bar(1, 2, 3) // works
bar(1, 2) // fails

Parameter type constraints also work for user-defined types. If you have a parameter that can accept an instance, you can also specify the base class or classes that the instance should derive from. Consider the following function:

class A {}
class B {}

function spoon(x: A) {}

spoon(A()) // OK
spoon(B()) // error, instance of B is not allowed

Notice that you can just use the name of the class as the type to be allowed for the parameter. The "full" form of declaring this parameter would be "x: instance A", but that's a bit cumbersome to type. The full form would, however, be required if, for some reason, you had a class named something like "int" or "char", which would be interpreted as the basic types.

That's almost all there is to say about parameter type constraints. There's a slightly more complete description of them in the spec on functions, if you're interested.

Returning Values and Multivalues

As was mentioned at the beginning of this section, functions in MiniD can return multiple values. Why? Besides sometimes being a very natural thing to do (such as returning the beginning and end indices of a sequence), multiple return values also neatly circumvent most of the need for reference parameters.

Returning multiple values is very easy. Just.. return multiple values!

function brak()
    return 1, 2, 3

local x, y, z = brak()
writefln("{}, {}, {}", x, y, z) // prints "1, 2, 3"

Returning the values is easy enough; you just put a comma-separated list of values to return.

Here you can also see that we're declaring three variables. Unlike in some C-style languages, this does not mean "leave x and y uninitialized and initialize z to the result of 'brak()'"; instead, it means "initialize x, y, and z to the results from 'brak()'." This is called multiple assignment and can occur anywhere, not just when declaring variables. We could have written

local x, y, z
x, y, z = brak()

just as well.

Returning multiple values and multiple assignment are just two instances of a concept that MiniD calls multivalues. A multivalue can be thought of as a sort of tuple of values. Multivalues are not objects; they are just a mechanism.

The return values from a function are one instance of multivalues. This means that all function call expressions yield a multivalue. The other things that can return multivalues are the 'vararg' expression, the sliced 'vararg' expression, and the 'yield()' expression. We'll deal with the 'yield()' expression later, when talking about coroutines, but for now we can talk about the other three: function calls, 'vararg', and sliced 'vararg'. They all work the same way, though; everything we say here applies to all kinds of multivalues.

We already saw one example of using multivalues: as the source of a multiple assignment. What happens if the number of values on the right-hand side of a multiple assignment doesn't match the number of destinations? Actually, the behavior is exactly the same as with function parameters. If there are more values than destinations, extra values are discarded. If there are fewer values than destinations, the extra destinations are set to null.

local x, y, z, w
x, y = brak() // x is 1, y is 2, 3 is discarded
x, y, z = brak() // matches
x, y, z, w = brak() // x is 1, y is 2, z is 3, w is null

Again, keep in mind that this works for any type of multivalue on the right-hand side. As another example, let's use variadic arguments:

function grape(vararg)
{
	local x, y, z
	x, y, z = vararg // x is 1, y is 2, z is null
	x, y, z = vararg[0 .. 1] // x is 1, y is null, z is null
}

grape(1, 2)

vararg returns all the variadic arguments as a multivalue, so in this example, it evaluates to "1, 2". When we assign it into "x, y, z", it works as expected. Here I've also introduced sliced vararg, which returns a subset of the variadic arguments as a multivalue. Here it's returning a 1-element multivalue that consists of the first variadic argument.

Where else can we use multivalues? One place is at the end of lists of items. This happens in three places: in the list of arguments to a function, in array literals, and in return statements. The following shows the first two:

function foo()
    return 1, 2, 3

writeln(foo()) // prints "123"
local a = [foo()]
writeln(a) // prints "[1, 2, 3]"

When you write "writeln(foo())", it's effectively the same as writing "writeln(1, 2, 3)", since all the return values of "foo()" are used as the parameter list. Similarly, writing "[foo()]" is the same as writing "[1, 2, 3]". This behavior only happens, though, when a multivalue is used as the last item in such a list. If another value comes after it, it will be turned into exactly one value.

writeln(foo(), 4) // prints "14", not "1234"
local a2 = [foo(), 4]
writeln(a2) // prints "[1, 4]", not "[1, 2, 3, 4]"

Sometimes you'll want to force a multivalue to become exactly one value where it would otherwise be treated as a multivalue. In order to do that, just enclose the multivalue in parentheses:

writeln((foo())) // prints "1"
local a3 = [(foo())]
writeln(a3) // prints "[1]"

These cases also apply to returning values from functions:

function bar1()
	return foo()
	
function bar2()
	return foo(), 4
	
function bar3()
	return (foo())

writeln(bar1()) // prints "123"
writeln(bar2()) // prints "14"
writeln(bar3()) // prints "1"
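These rules combine nicely when you want to pass arguments straight through to another function. The following is a sketch with made-up names ('traced' and 'add'); it relies only on 'vararg' being expanded in full when it is the last item in an argument list, and on a call being expanded in full when it is the last item in a return statement.

```minid
function traced(f, vararg)
{
    writeln("calling with ", #vararg, " argument(s)")

    // 'vararg' is the last (and only) argument, so every extra
    // argument is forwarded; every result of f is returned in turn.
    return f(vararg)
}

function add(x, y) = x + y

writeln(traced(add, 3, 4)) // prints "calling with 2 argument(s)", then "7"
```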

Multivalues and arrays are kind of similar. Arrays, being objects, are a bit more flexible, at the cost of allocating memory. It's very easy to convert from a multivalue to an array; as we've seen, you just do "[foo()]", and all the values are captured in the array. Going the other way, from array to multivalue, is often useful. In that case, all you have to do is use the expand() method of arrays.

writeln(a.expand()) // prints "123"

The expand() method simply returns all the items of the array in order as a multivalue. Note that performing expand() on a really big array, while legal, would probably not be good for performance or memory.
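Here's a small round trip as a sketch: a multivalue is captured into an array, the array is changed, and the result is turned back into a multivalue. The function name 'withExtra' is hypothetical.

```minid
function withExtra(extra, vararg)
{
    local a = [vararg] // multivalue -> array
    a ~= extra         // append to the array, which a multivalue can't do
    return a.expand()  // array -> multivalue
}

writeln(withExtra(4, 1, 2, 3)) // prints "1234"
```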

Closures

As mentioned at the beginning of this section, MiniD's functions are first-class closures. They are first-class in that you can manipulate them like any other value - you can put them in arrays or tables or namespaces, you can return them, you can pass them as parameters, whatever. The name a function has is not special in any way; it's just a variable that happens to hold a function (and like any other variable, it can be reassigned). What's a closure, then?

A closure is, simply put, a function along with any data that it needs to execute properly. That doesn't make much sense without an example, so here's one.

function outer()
{
	local x = 5
	
	function inner()
		writeln("x is ", x)

	inner()
}

outer() // prints "x is 5"

Here we've defined a function within another function. This is perfectly legal in MiniD. Also notice something interesting - the nested function "inner" is accessing a local variable that was declared in "outer." When you do this, in the context of "inner," 'x' is what is called an upvalue. This is just a short way of saying "outer function's local variable."

What's interesting about upvalues is that while the function that owns them (the outer function) is still executing, they refer directly to those local variables. But when the owning function exits, the upvalues will still be valid. That is, inner will still be able to access x even if outer returns. But wait - how can inner still exist when outer returns? Simple: because functions are first-class values!

function outer()
{
	local x = 5

	function inner()
	{
		x++ // modifying the upvalue
		writeln("x is ", x) // prints "x is 6"
	}

	writeln("before inner, x is ", x) // prints "before inner, x is 5"
	inner()
	writeln("after inner, x is ", x) // prints "after inner, x is 6"
	return inner // there it goes..
}

local f = outer() // getting a function as a return value
f() // prints "x is 7"

This shows two things: one, that when inner is called while outer is still executing, it really does modify the local variables (the outer function sees the modifications inner makes). And two, when inner is returned, it still accesses x, even though outer has exited! When the owning function exits, the values of its local variables are saved away so that any nested functions that use them will still be able to use them.

The function that is returned from outer is a closure. Now it makes sense when I say "a function along with any data it needs to execute." The inner function needs x to execute, so the closure includes it.
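Upvalues also suggest why a 'local function' declaration expands into a separate variable declaration followed by an assignment, as shown back in the part on declarations. This is my reading rather than something spelled out here: because the variable exists before the function is created, the function body can refer to its own name as an upvalue and call itself. A sketch, with made-up names:

```minid
function outer()
{
    // Expands to "local fact" followed by "fact = function fact(n: int) ...",
    // so 'fact' is already in scope inside its own body.
    local function fact(n: int)
    {
        if(n <= 1)
            return 1

        return n * fact(n - 1) // 'fact' here is an upvalue
    }

    return fact(5)
}

writeln(outer()) // prints "120"
```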

This naturally leads to a few questions: can functions share upvalues? Are upvalues only at a function level? Can you have multiple closures of the same function with different data?

The answer to the first question is yes. For example:

function thing()
{
	local count = 0
	
	function inc() count++
	function dec() count--
	function print() writeln("Count is now ", count)
	
	return inc, dec, print
}

local i, d, p = thing()

i()
i()
p() // prints "Count is now 2"
d()
p() // prints "Count is now 1"

'thing' defines three nested functions: two that modify its 'count' variable, and a third that prints it out. We get these functions and call them, showing that the three functions really do modify the same 'count' variable. You might notice an interesting parallel between these closures and object-oriented programming. There's some hidden state that some methods operate upon. In fact, this is how some functional languages, like Lisp, can accomplish OO. But I digress.

The answer to the second question - are upvalues only at a function level? - is no. Upvalues are based on the scope in which a local variable was declared. When any scope is left, any local variables which are used as upvalues by nested functions are saved away. In loops, the scope is considered to be exited upon each iteration of the loop. The following examples make these behaviors clear:

local f
local x = 0

// anonymous scope
{
	local y = 1
	f = function() writefln("x = {}, y = {}", x, y)
} // scope ends, and y is saved away

f() // prints "x = 0, y = 1"
x = 4
f() // prints "x = 4, y = 1"

// -----------------------

local a = []

for(i: 1 .. 6)
	a ~= function() = i // i is given a new value upon each iteration of this loop
	
foreach(func; a)
	writeln(func()) // prints the values 1 through 5

The first example shows that even when the scope that contains y is left, f can still access it. The second shows that variables declared within a loop (including loop indices like in for loops) are saved upon each iteration.
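For contrast, here's a sketch of what happens when the closures capture a variable declared outside the loop. The names 'fs' and 'shared' are made up. Since 'shared' belongs to the enclosing scope, it is not saved away on each iteration, and every closure ends up referring to the same variable.

```minid
local fs = []
local shared = 0

for(i: 0 .. 3)
{
    shared = i
    fs ~= function() = shared // captures the one 'shared', not a per-iteration copy
}

foreach(f; fs)
	writeln(f()) // prints "2" three times, the last value assigned to 'shared'
```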

The answer to the final question - can you have multiple closures of the same function with different data? - is an emphatic yes. This is often one of the most useful capabilities of closures. Consider the following example:

function makeCounter(from: int = 0)
{
	return function()
	{
		local ret = from
		from++
		return ret
	}
}

local c1 = makeCounter() // counts from 0
local c2 = makeCounter(10) // counts from 10

writeln(c1()) // prints 0
writeln(c1()) // prints 1
writeln(c2()) // prints 10
writeln(c2()) // prints 11

This 'makeCounter' function returns a function which, upon successive calls, returns successive integers, starting from the value that you pass into 'makeCounter'. We have two closures here, one which starts from 0 and one which starts from 10. They both have the same function, but different data.

Closures are getting kind of close to object-oriented programming. But they can be awkward to use for general-purpose OO tasks. MiniD provides you with a simpler, more familiar way of creating your own objects and types, through classes. And that's the next section!