Types

MiniD is a dynamically-typed language. This means that variables and data structures are not constrained to holding any one type of value. MiniD is also a fairly strongly-typed language. Some languages allow for such nonsense as concatenating strings with numbers and objects. In theory it's nice to have for simple output and debugging messages. In practice it's a pointless hole in the type system to do something that can be done more efficiently and flexibly in other ways. So MiniD eschews a bit of convenience for type safety, because even though weak and dynamic typing are nice in a lot of ways, strong typing and elements of static typing can really help with the robustness of programs.

There are fifteen types in MiniD; five of them are value types and the other ten are reference types. Here is a table that summarizes them:

Type        Class       Description
null        value       Absence of a useful value
bool        value       Truth (true/false) value
int         value       Integral number
float       value       Floating-point number
char        value       UTF-32 codepoint
string      reference   Sequence of UTF-32 codepoints
table       reference   Hash; map; associative array
array       reference   Integer-indexed list of values
function    reference   Piece of executable code along with any environment
class       reference   User-defined type
instance    reference   Instance of a class object
namespace   reference   Hash mapping from strings to values
thread      reference   Cooperatively-scheduled thread of execution
nativeobj   reference   Implementation-defined host application object
weakref     reference   Weak reference to any reference type

The split between value and reference types has more implications for the implementation of the language than it does for the use of values of these types. In the reference implementation, value types are not allocated on the heap, and they don't really need to be, whereas reference types are. Strings are kind of a strange midpoint: because they're immutable (their data can't be changed), they act more like value types, but they are still allocated on the heap since they can be of variable size.

If you want to get the type of a value, you can use the global typeof function. An example of use in MDCL:

>>> typeof(5)
 => "int"
>>> typeof("hello!")
 => "string"
>>>
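
The same function works for every type in the table above. A few more calls are sketched here; the results follow from the type names listed there, and the literals are assumed to use the usual MiniD syntax:

```
>>> typeof(null)
 => "null"
>>> typeof(true)
 => "bool"
>>> typeof(1.5)
 => "float"
>>> typeof([1, 2, 3])
 => "array"
>>> typeof({})
 => "table"
>>> typeof(typeof) // typeof is itself a global function
 => "function"
>>>
```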

Here are more in-depth explanations of the individual types.

null

This type represents the absence of a useful value, but it doesn't mean it's useless. null is used as a default value for uninitialized variables and parameters. It also has a special relationship with tables as explained later. There is only one value of type null, and that is null.

bool

This is a very easy-to-understand type. It only has two values, true and false. Conditional statements in MiniD will accept a value of any type, however. For that purpose, the boolean value false, null, the integer 0, and the float 0.0 are considered false, and every other value of any type is considered true.
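
A quick way to see these rules is to put non-boolean values straight into an if statement. This is only a sketch: the messages are arbitrary, and the one-line if/else form is assumed to be accepted by MDCL as written.

```
>>> if(0) writefln("considered true"); else writefln("considered false")
considered false
>>> if(null) writefln("considered true"); else writefln("considered false")
considered false
>>> if("") writefln("considered true"); else writefln("considered false")
considered true
>>> if([]) writefln("considered true"); else writefln("considered false")
considered true
>>>
```

Note that an empty string and an empty array both count as true, since only false, null, 0, and 0.0 are treated as false.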

int

This is a signed 64-bit integral type. Implementations may choose to use different sizes for the int type, but they are considered nonstandard.

float

This is an IEEE 754 double-precision floating-point number. Implementations may choose to use different precisions for the float type, but they are considered nonstandard.

There are two places in the type system of MiniD where implicit conversions occur, and one of them is when floats and ints are mixed. If an arithmetic operation is performed and one operand is an integer while the other is a float, the integer will be promoted to a float, the operation will be performed, and the result will be a float.
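
The promotion can be observed with typeof. In this short sketch only the resulting types are shown, not the numeric results, so it does not depend on how MDCL formats floats:

```
>>> typeof(1 + 2)   // int with int stays int
 => "int"
>>> typeof(1 + 2.5) // the int is promoted, so the result is a float
 => "float"
>>> typeof(2.5 * 2) // operand order doesn't matter
 => "float"
>>>
```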

char

This is a single UTF-32 codepoint. Note that because Unicode is a huge pain in the butt, this may not represent an entire character, if you take combining marks into account.

Characters are the other place where implicit conversions between types occur. Characters can be concatenated together, and the result will be a string. Characters can also be concatenated with strings.
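
A small illustration of this conversion, assuming character literals are written in single quotes and that ~ is the concatenation operator for characters and strings, as it is for arrays:

```
>>> typeof('a')
 => "char"
>>> 'a' ~ 'b'         // two chars concatenate into a string
 => "ab"
>>> typeof('a' ~ 'b')
 => "string"
>>> "hell" ~ 'o'      // a char can also be joined onto a string
 => "hello"
>>>
```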

string

Strings are seen by the language as a sequence of UTF-32 codepoints. This neatly hides the icky issues of multibyte encodings -- namely, indexing and slicing inside multibyte encodings, and the length of the string not matching the number of codepoints. In reality, the implementation is free to choose how strings are internally represented, as long as it provides the proper interface to the language. For example, the reference implementation uses UTF-8 internally. Other implementations might dynamically change the encoding as necessary, or use nonstandard encodings.

Strings are immutable, and as such, behave a bit like value types. When you concatenate two strings, or append one string to another, a new string is created. When you use a string manipulation function, a new string is created. This might sound like an unreasonable setup but in practice there are two things that make it more reasonable. One, strings are interned, which means that two strings with the same data are guaranteed to be the same object. So even if your program makes a million references to the string "x", there will still only be one string object with the data "x". Two, you're not actually going to be doing complex string manipulations very often, and if you need to, the standard library provides a mutable string buffer object.
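
Both properties can be sketched in MDCL. The names s1 and s2 are made up for this example, and the identity operator is is assumed to compare object identity:

```
>>> global s1 = "abc"
>>> global s2 = s1 ~ "def" // creates a new string; s1 is untouched
>>> s1
 => "abc"
>>> s2
 => "abcdef"
>>> ("ab" ~ "c") is s1     // same data, so the same interned object
 => true
>>>
```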

table

Tables are associative arrays which can map from any type except null to any type except null. Their key and value types can be completely mixed. You can also define functions inside tables and call them as if they were methods, in order to create lightweight objects. Full object-oriented functionality, however, is provided by the class and instance types.
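
A minimal sketch of the "lightweight object" idea, assuming that a function stored in a table receives the table as this when it is called with method syntax. The names counter and inc are made up for illustration.

```
>>> global counter = { count = 0 }
>>> counter.inc = function() { this.count += 1 }
>>> counter.inc() // called like a method, so 'this' is counter
>>> counter.inc()
>>> counter.count
 => 2
>>>
```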

Tables cannot use null as the key type because it doesn't fit in with the protocol MiniD uses for iterating over items in objects with the foreach loop. If you try to use null as a key you will get an error. However, it's not an error to use null as a value. Rather, tables and null have a special relationship.

If you try to access a key from a table which does not exist, instead of getting an error, it will simply return null. Furthermore, if you assign the value null to a key which does exist, the key will be removed. This has the net effect of more or less storing null values in tables for free, simply because they're not stored.

Similarly to Lua, if you access a field out of a table like "t.x", it is the same as getting the key with the string value "x", that is, something like "t["x"]". However, tables are the only type for which this is true. For all other types, field access and indexing are two separate operations with different purposes.

Here are some simple examples of table usage in MDCL:

>>> global t = {}
>>> t["hello"] = "goodbye"
>>> t["hello"]
 => "goodbye"
>>> t.hello // notice how we're accessing it as a field
 => "goodbye"
>>> #t
 => 1
>>> t[5] = 10
>>> t[5]
 => 10
>>> #t
 => 2 // length of t is now 2
>>> t[5] = null // but we set this element to null...
>>> t[5]
 => null
>>> #t
 => 1 // ...and the length is back to 1 
>>>

You can also iterate over the key-value pairs using a foreach loop.

>>> t[5] = 10
>>> foreach(k, v; t)
...     writefln("{}: {}", k, v)
5: 10
hello: goodbye
>>>

array

Arrays are lists of values of arbitrary (and possibly mixed) types which can be resized, modified, concatenated, appended to, and sliced. They have a length, and you can access values in the array using an integer index in the range [0, length). Alternatively, if you use a negative integer as the index, it will access the values in reverse order (that is, from the end of the array). In that case, the valid range is [-length, 0). -1 is the index of the last element, -2 the second-to-last and so forth.
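
For instance, with a three-element array (nums is just an illustrative name), the valid indices run from -3 through 2:

```
>>> global nums = [10, 20, 30]
>>> #nums
 => 3
>>> nums[0]  // first element
 => 10
>>> nums[-1] // last element
 => 30
>>> nums[-3] // same element as nums[0]
 => 10
>>>
```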

Arrays are similar to strings in that they are reference types that contain sequences of items, but they differ in that they can be modified, and this brings up a few points. First, you can have multiple array objects which all reference the same data, either in its entirety or in pieces. With strings, there is always exactly one object for a given sequence of characters, and if you slice a piece off a string you get a new string. Second, if two array objects reference the same data (their slices overlap in some way), when you modify the data through one array object, it will be reflected in the other; strings can't be modified so this just can't happen. This is by design, as "lightweight" slices of arrays are in many ways more useful than creating a new array object for each slice. For example, you can sort a slice of an array in-place using lightweight slices. If you do need a slice that's separate from the original array, simply slice and then duplicate the array object with its dup method.
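
The sharing behavior might look like this in MDCL. This is a sketch that assumes the D-style slice syntax a[lo .. hi] with an exclusive upper bound; whole, part, and separate are made-up names.

```
>>> global whole = [1, 2, 3, 4, 5]
>>> global part = whole[1 .. 3] // lightweight slice: elements at indices 1 and 2
>>> part
 => [2, 3]
>>> part[0] = 99                // write through the slice...
>>> whole                       // ...and the original array changes too
 => [1, 99, 3, 4, 5]
>>> global separate = whole[1 .. 3].dup() // independent copy of the slice
>>> separate[0] = 0
>>> whole                       // unaffected this time
 => [1, 99, 3, 4, 5]
>>>
```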

Arrays support the concatenation and append operators (~ and ~=). Concatenation always creates a new array object that doesn't reference the original data of either operand. Appending will attempt to resize the array object in-place, so the array object may or may not refer to the same data. There is something to keep in mind when using the append operator, however. If you append an array to another array, it will "unpack" the right-hand array and add its elements to the destination. That is,

>>> global a = [1, 2, 3]
>>> a ~= [4, 5, 6]
>>> a
 => [1, 2, 3, 4, 5, 6]
>>>

Sometimes you don't want this to happen, like if you're trying to build up a multidimensional array. In that case, you can use the array's append method instead.

>>> a = [[1, 2], [3, 4]]
>>> a.append([5, 6])
>>> a
 => [[1, 2], [3, 4], [5, 6]]
>>>

The append method will always just tack on the given value (or values) as new elements on the end of the array.
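
Because append accepts several values at once, a multidimensional array can be extended by more than one row in a single call. A short sketch, with rows as a made-up name:

```
>>> global rows = [[1, 2]]
>>> rows.append([3, 4], [5, 6]) // each argument becomes one new element
>>> rows
 => [[1, 2], [3, 4], [5, 6]]
>>>
```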

Arrays have a multitude of methods which you can use to do list processing. See the array library reference for more.

function

Functions are... functions. They are a piece of code which can accept arguments and return zero or more values. They are first-class values in MiniD, which means you can pass them as parameters, return them from other functions, and generally treat them just like any other value.

A function object in MiniD is actually not just a function, but a function along with some extra variables that it might need in order to execute. Thus a function object is a closure as found in many functional languages. You can actually have many function objects which are the same "function" but which have different extra variables, which may mean that they do different things even though they have the same code. This will be explained more in-depth in a later chapter.
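
As a small preview, here is a sketch of two function objects that share the same code but carry different extra variables. The names makeCounter, c1, and c2 are invented for this example, and the declaration is written on one line for MDCL.

```
>>> global function makeCounter() { local n = 0; return function() { n += 1; return n } }
>>> global c1 = makeCounter()
>>> global c2 = makeCounter()
>>> c1()
 => 1
>>> c1() // c1 remembers its own n between calls
 => 2
>>> c2() // c2 has a separate n, so it starts from the beginning
 => 1
>>>
```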

class and instance

Classes are user-defined types, and are how MiniD accomplishes object-oriented programming. MiniD's class model is more dynamic than those found in languages like D, Java, and C++, but retains the same basic organization. It's a very simple model, but it's also very flexible. Classes and instances will be covered much more in-depth in later chapters.

namespace

Start a fresh instance of MDCL, and try to get the value of a variable named 'x'.

$ mdcl
MiniD Command-Line interpreter 2.0 beta
Use the "exit()" function to end.
>>> x
Error: stdin(1): Attempting to get nonexistent global 'x'

>>>

Well that didn't work out. That's because 'x' hasn't been declared yet. So if you declare it, you can then use it.

>>> global x = 5
>>> x
 => 5
>>>

So, where is 'x' stored then? It's a global variable, like we declared it. But it's actually stored in a namespace. The global namespace, to be exact.

A namespace is an associative array, similar to a table, but with a few key differences:

  • Tables can map from any type (except null) to any type (except null). Namespaces can only have strings for keys, but can map to any type including null.
  • Accessing a key that doesn't exist in a table simply returns null. Accessing a key that doesn't exist in a namespace will throw an exception.
  • Namespaces also have names and something called a "parent" namespace; tables have no analogue.

Now it makes sense that we got that error when trying to access 'x' the first time. We tried to access a key in the global namespace that didn't exist.

If you want to get a hold of the global namespace, there is a global (stored in the global namespace, no less) called '_G' that refers to it.

>>> x
 => 5
>>> _G.x
 => 5
>>>

In this case, 'x' and '_G.x' refer to the exact same thing.
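
A short follow-up session, sketched on the assumption that a namespace supports field assignment in the same way it supports field access. It shows that _G really is a namespace, that writing through it changes the global, and that a table (tbl is a made-up name) behaves differently for a missing key:

```
>>> typeof(_G)
 => "namespace"
>>> _G.x = 6 // assign through the namespace...
>>> x        // ...and the global sees the change
 => 6
>>> global tbl = {}
>>> tbl.y    // a missing table key yields null instead of an error
 => null
>>>
```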

When you take modules into account, things get a little more complicated. All you need to know now is that each module is given its own namespace, and when you declare a global variable in a module, that variable is inserted as a key-value pair in that module's namespace rather than in '_G'.

thread

Also known as a 'coroutine', the thread type represents a cooperatively-scheduled thread of execution with its own call stack and local variables. We'll come back to threads much later.

nativeobj

Native objects ('nativeobj' for short) are sort of vaguely-defined. Their purpose is to allow MiniD to hold an opaque reference to a host application object. What this means is heavily implementation-dependent. In the reference D implementation, it's necessary to keep track of all native objects that are referenced by MiniD so that the host application's garbage collector doesn't accidentally collect them. In a C implementation, native objects might just be simple void pointers. In any case, native objects have no operations defined on them other than identity, and a native object is only identical to itself. MiniD can't do any more with a native object than what the host allows for it to do.

weakref

Weak references are a way to refer to an object of a reference type without keeping it alive. What does that mean? Normally, as long as you hold a reference to an object, it will not be collected; an object is collected only once it is no longer referenced from anywhere. Sometimes you don't want this: you want a reference that remains valid only for as long as something else keeps the object alive. This is what a weak reference is. When the object referred to by a weak reference is collected, the weak reference object still exists, but dereferencing it yields null. Here's an example:

>>> global a = [1, 2, 3, 4, 5] // create an array
>>> global wa = weakref(a)     // get a weak reference to it
>>> deref(wa)                  // what is pointed to by wa?
 => [1, 2, 3, 4, 5]            // the array.
>>> collectGarbage()           // take out the trash.
 => 8963
>>> deref(wa)                  // dereferencing wa still gets the array, since a still points to it
 => [1, 2, 3, 4, 5]
>>> a = null                   // getting rid of the only "strong reference"!
>>> collectGarbage()           // let's try this again
 => 2204
>>> deref(wa)                  // and what is it?
 => null                       // it's null!
>>>

Weak references are a fairly advanced concept and some examples of actual use will be presented much later.

That's All

For this section. Up next are expressions!