Virtual File System

Tango houses a virtualized file system, enabling a variety of true file systems to be treated as a single cohesive entity. Portions of a hard-drive can be combined with one or more zip-files, for example, which can subsequently be combined with others such as FTP or WebDav systems. To a client application, such a filesystem behaves as though it were a single entity, exposing a rich set of operations in a common manner across all of the virtualized systems.

The API is sufficiently rich to also serve as the primary gateway to each of the supported concrete implementations directly (sans virtualization). For instance, using this package to interact directly with the physical file system on your local hard-drive is both efficient and practical. The same is true for manipulating compound and/or compressed file-systems such as zip files and their relatives.

The VFS is modeled by tango.io.vfs.model.Vfs and has various implementations described individually within later sections. Each concrete implementation follows the model, so we can describe the overall system by focusing on the model itself. The API exposed revolves around two principal elements: folders and files, along with sets of each.

Folders

A folder represents a container within the VFS, housing zero or more file and/or sub-folder instances. Folders can be traversed, inspected, flattened, filtered, and manipulated in various ways. Like a traditional directory, a VFS folder can expose its immediate files and sub-folders. In the Tango VFS each folder is represented by a VfsFolder instance. For now we'll ignore how these instances are obtained, and concentrate on how to apply them instead. Each folder has both a short name and a long name, where the short one is a simple identifier used to identify and address each, and the long name is dependent upon the concrete implementation. For example, a folder exposed by a concrete file-system would expose the folder-path via the long name:

import tango.io.Stdout;
import tango.io.vfs.model.Vfs;

VfsFolder folder;
Stdout.formatln ("short name '{}', long name '{}'", folder.name, folder.toString);

You can traverse the immediate child folders, where each child is also a VfsFolder instance. This can be used to expose an underlying hierarchy, where each child in turn may be traversed in a similar manner (we'll drop the imports from further examples):

VfsFolder folder;

foreach (child; folder)
         Stdout (child.name).newline;

To reach a named subordinate folder you can select it in this manner (the identifiers within quotes are simple folder names):

VfsFolder root;

auto child = root.folder("child");
auto grandchild = child.folder("grand");

A shortcut to reach the grandchild noted above would be as follows:

VfsFolder root;

auto grandchild = root.folder("child/grand");

Note that the '/' separator is used for both Windows and Linux environments. When using folder() in this manner, you will need to either open() or create() the folder before proceeding further. That is, to create the noted grandchild (assuming it did not exist already) try the following:

VfsFolder root;

auto grandchild = root.folder("child/grand").create;
Stdout.formatln ("short name '{}', long name '{}'", grandchild.name, grandchild.toString);
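
When it is not known in advance whether a sub-folder exists, the choice between open() and create() can be made at run time. The following is a minimal sketch rather than a definitive recipe. It assumes that the entry returned by folder() offers an exists() test analogous to the one on VfsFile, and it obtains its root from the FileFolder driver described later; the exists() call, the "/tmp/vfs-demo" path, and the module names in the imports are all assumptions made for illustration:

```d
import tango.io.Stdout;
import tango.io.vfs.model.Vfs;
import tango.io.vfs.FileFolder;

void main ()
{
        // hypothetical root; the second argument asks FileFolder to create the path
        VfsFolder root = new FileFolder ("/tmp/vfs-demo", true);

        // folder() only names the entry; nothing is opened or created yet
        auto entry = root.folder ("child/grand");

        // assumption: the entry exposes exists(), much as VfsFile does
        VfsFolder grandchild;
        if (entry.exists)
            grandchild = entry.open;
        else
            grandchild = entry.create;

        Stdout.formatln ("short name '{}', long name '{}'", grandchild.name, grandchild.toString);
}
```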

Information about a folder is obtained by asking for it:

VfsFolder folder;

auto info = folder.self;
Stdout.formatln ("file count: {}", info.files);
Stdout.formatln ("folders count: {}", info.folders);
Stdout.formatln ("content size (of files): {}", info.bytes);

The above reflects information pertaining specifically to the contents of one folder. To obtain similar data for the subtree represented by a folder, use this instead:

auto info = folder.tree;
Stdout.formatln ("file count: {}", info.files);
Stdout.formatln ("folders count: {}", info.folders);
Stdout.formatln ("content size (of files): {}", info.bytes);

The difference here is subtle yet powerful. In this case we've gathered up the number of files, sub-folders, and the content-size for potentially a large group. There's more behind that info reference than just numbers, though. In fact, both self() and tree() hand you a flattened set of folders to manipulate. There is no hierarchy within a flattened set, so you can now happily extract and operate with a subset instead. One way to generate a subset is to filter the folder names. In this example we select all folders from the tree with names that begin with the letters "dev", and display information about that subset:

auto info = folder.tree.subset("dev*");
Stdout.formatln ("file count: {}", info.files);
Stdout.formatln ("folders count: {}", info.folders);
Stdout.formatln ("content size (of files): {}", info.bytes);

Selection of folders (and files) by name in this manner is supported by a widely used group of meta-characters, including '*', and '?' characters along with "[]" notation for representing alternate characters within the pattern (see tango.io.util.PathUtil for details). You can continue to select or slice your way through these sets of folders. An optimal way to do so is to retain the original 'tree' results, which will avoid returning to the underlying medium for (typically) redundant data:

auto set = folder.tree;
auto dev = set.subset("dev*");
auto user = set.subset("*user*");
auto install = set.subset("*.install");
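
The other meta-characters mentioned above can be used with subset() in the same way. This sketch assumes the usual glob semantics, where '?' matches exactly one character and "[]" lists the alternatives for a single position; the folder path and the patterns themselves are hypothetical:

```d
import tango.io.Stdout;
import tango.io.vfs.model.Vfs;
import tango.io.vfs.FileFolder;

void main ()
{
        // hypothetical location
        VfsFolder folder = new FileFolder ("/dev/d/software/tango");

        // traverse the medium once, then filter the retained set
        auto set = folder.tree;

        // '?' stands for a single character: "v1" and "v2" match, "v10" does not
        foreach (match; set.subset ("v?"))
                 Stdout (match.name).newline;

        // "[abc]" accepts one of the listed characters in that position
        foreach (match; set.subset ("[abc]*"))
                 Stdout (match.name).newline;
}
```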

You can traverse folder sets in the same manner as you do with folders, although each set reflects a flat list of folders instead of representing a hierarchical segment:

auto set = folder.tree;
foreach (folder; set)
         Stdout (folder.name).newline;

The same type of traversal can be performed upon each folder subset, since they are true sets themselves:

foreach (folder; set.subset("dev*"))
         Stdout (folder.name).newline;

If you execute similar code using self() instead of tree(), you'll find the set contains just the one entry representing that specific folder:

foreach (folder; folder.self)
         Stdout (folder.name).newline;

Thus, the only difference between self() and tree() is the number of folders potentially contained within a starting set. Other features of a folder include testing to see if it is writable or not, and the facility to remove the entire folder sub-tree (along with all contained data):

// is this folder mutable?
bool writable = folder.isWritable;

// purge all content within the folder tree, while retaining this folder itself
folder.clear;

To wrap up this section we illustrate how to display a summary of content for each folder within a set:

VfsFolder root;

foreach (folder; root.tree)
        {
        auto info = folder.self;
        Stdout.formatln ("folder '{}' has {} folders and {} files containing {} bytes", 
                          folder.name, info.folders, info.files, info.bytes);
        }

Files

Obtaining a VfsFolder instance gives you access to the files within. You can access a specific file directly, or you can select a set of files using a mechanism similar to folder selection. First let's select a specific file:

VfsFolder folder;
auto file = folder.file ("myfile.txt");

This provides us with a VfsFile reference for the named path, relative to the host folder. You can specify sub-folders within the path like so:

VfsFolder folder;
auto file = folder.file ("somefolder/myfile.txt");

With a VfsFile in hand you can access a variety of attributes. For example:

auto file = folder.file ("myfile.txt");
Stdout.formatln ("file '{}' [{}] contains {} bytes", file.name, file.toString, file.size);

Checking to see if the file exists, and creating it where it does not (using 'create' upon an existing file should truncate the content):

if (file.exists is false)
    file.create;

Removing an existing file:

file.remove;

When you have two VfsFile references, you can move or copy them:

VfsFile   source;
VfsFolder folder;

folder.file ("myfile").copy(source);

Moving the file instead will additionally remove the source. Gaining access to file content is handled by exposing a pair of streams representing input and output. This example copies file content to the console:

VfsFile file;

auto input = file.input;
Stdout.stream.copy (input);
input.close;

Mutating file content is handled in a similar manner via the stream API. Here we explicitly copy a stream (perhaps from a socket or elsewhere) to a file:

VfsFile     file;
InputStream input;

file.output.copy(input).flush.close;

You can, of course, apply any of the stream wrappers to shape output content. Here we apply some formatted output:

import tango.io.stream.FormatStream;

VfsFile   file;
VfsFolder folder;

auto output = new FormatOutput(file.output);
output.formatln ("folder '{}' contains {} files", folder.name, folder.self.files);
output.flush.close;

Note that the exposed stream should always be closed in order to avoid leaking system resources (file handles and so on). In all cases, where something goes awry or is considered to be an illegal operation, a VfsException will be thrown.
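
One way to guarantee that a stream is closed, even when an operation fails part-way through, is D's scope(exit) statement combined with a handler for VfsException. This is a sketch under assumptions: VfsException is imported here from tango.core.Exception, and both the folder path and the file name are hypothetical:

```d
import tango.io.Stdout;
import tango.core.Exception;      // assumed home of VfsException
import tango.io.vfs.model.Vfs;
import tango.io.vfs.FileFolder;

void main ()
{
        // hypothetical location
        VfsFolder folder = new FileFolder ("/tmp/vfs-demo", true);

        try {
            auto input = folder.file ("myfile.txt").input;

            // runs when this scope is left, whether normally or via an exception
            scope (exit)
                   input.close;

            Stdout.stream.copy (input);
            }
        catch (VfsException e)
              {
              Stdout.formatln ("vfs failure: {}", e.toString);
              }
}
```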

What about searching for a specific file, or selecting a set of them? This is handled via catalog(), a method of the folder sets returned by self() and tree(), which supports an optional name pattern for filtering purposes. To list all files within a folder tree, try this:

foreach (file; folder.tree.catalog)
         Stdout.formatln ("'{}' contains {} bytes", file.name, file.size);

To do something similar but for text (".txt") files only, try this:

foreach (file; folder.tree.catalog ("*.txt"))
         Stdout.formatln ("'{}' contains {} bytes", file.name, file.size);

The same pattern-matching mechanism we discussed in the folder section is applied here. To combine both folder and file filtering, how about searching for all text files within folders related to documentation (for example):

foreach (file; folder.tree.subset("doc*").catalog("*.txt"))
         Stdout.formatln ("'{}' contains {} bytes", file.name, file.size);

To search within a specific folder only, use the self() method instead:

foreach (file; folder.self.catalog ("*.txt"))
         Stdout.formatln ("'{}' contains {} bytes", file.name, file.size);

Files can thus be selected, as a set of zero or more entries, from a flattened set of folders. It's a simple mechanism with quite a bit of flexibility. We'll wrap up this section by introducing a custom filter. To select only those files that are, say, less than 1KB in length you could do this:

foreach (file; folder.tree.catalog ((VfsInfo info) {return info.bytes < 1024;}))
         Stdout.formatln ("'{}' contains {} bytes", file.name, file.size);

We used an anonymous delegate in the above example, but you can apply any delegate matching this signature:

bool delegate (VfsInfo info);

If the delegate returns true, the file in question will be added to the set. Otherwise it will be excluded. A similar custom delegate can be applied to folder filtering as an argument to the tree() method.
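
A filter need not be an anonymous delegate. Taking the address of a method on an object also yields a delegate, which is a convenient way to carry state such as a configurable size limit. The SizeFilter class below is a hypothetical helper written for illustration, not part of Tango, and the folder path is likewise an assumption:

```d
import tango.io.Stdout;
import tango.io.vfs.model.Vfs;
import tango.io.vfs.FileFolder;

// hypothetical helper: accepts files strictly smaller than a byte limit
class SizeFilter
{
        private ulong limit;

        this (ulong limit) {this.limit = limit;}

        // same shape as: bool delegate (VfsInfo info)
        bool accept (VfsInfo info) {return info.bytes < limit;}
}

void main ()
{
        // hypothetical location
        VfsFolder folder = new FileFolder ("/dev/d/software/tango");
        auto small = new SizeFilter (1024);

        // &small.accept binds the method to its object, forming a delegate
        foreach (file; folder.tree.catalog (&small.accept))
                 Stdout.formatln ("'{}' contains {} bytes", file.name, file.size);
}
```

Binding a method in this way keeps the limit alive for as long as the object is, which avoids relying on a local variable that may have gone out of scope by the time the filter is invoked.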

Drivers

The model described above is implemented through various concrete implementations, known as VFS drivers. These are developed independently, and can be used in concert or as a standalone facility. Given that all drivers adhere to the described model, we discuss only the specific additions relevant to each.

FileFolder

Maps to a (or the) file-system on your computer, enabling the lookup, traversal, and manipulation of file and folder contents therein. When creating an instance, you should provide a path to the relevant physical directory to be used. For example, to map a FileFolder to a generic installation of Tango and access a file within, you might do the following:

auto tango = new FileFolder ("/dev/d/software/tango");
auto file = tango.file ("io/vfs/FileFolder.d");
Stdout.formatln ("'{}' resides at [{}]", file.name, file.toString);

Notice how the file path is specified relative to the containing folder. The example would emit:

'FileFolder.d' resides at [/dev/d/software/tango/io/vfs/FileFolder.d]

FileFolder checks the path provided in order to ensure it is valid so, for example, if the path does not currently exist then you can tell FileFolder to create it for you via a boolean second argument:

auto tango = new FileFolder ("/dev/d/software/tango", true);

FileFolder is usually as efficient as any dedicated file-system package. One useful thing to remember is to avoid multiple traversals where they are not necessary (as mentioned in the folder section). For instance, try to avoid doing something like this:

auto tango = new FileFolder ("/dev/d/software/tango");
auto files = tango.tree.files;
auto bytes = tango.tree.bytes;

You see the concern there? The tree is captured twice instead of retaining it for reuse. That represents a second traversal of the file system which, comparatively speaking, operates at around the speed of cold molasses. On the other hand, where the structure of a file-system changes rapidly, it can make sense to update any internal representations on a regular basis.
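
Where the structure is stable, a version that retains the tree performs only one traversal. This is a minimal sketch using the same hypothetical path as above; the module names in the imports are assumptions:

```d
import tango.io.Stdout;
import tango.io.vfs.FileFolder;

void main ()
{
        auto tango = new FileFolder ("/dev/d/software/tango");

        // a single traversal; both figures are read from the retained set
        auto tree  = tango.tree;
        auto files = tree.files;
        auto bytes = tree.bytes;

        Stdout.formatln ("{} files containing {} bytes", files, bytes);
}
```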

VirtualFolder

This driver manages other drivers, and can arrange them into a virtual tree. For example, I can add a number of other drivers as children of a VirtualFolder, and treat them as a combined entity. Suppose I wish to coalesce two portions of a file system with the content from an ftp site:

auto tango = new FileFolder ("/dev/d/software/tango");
auto other = new FileFolder ("/other");
auto ftp   = new FtpFolder  ("ftp://www.dsource.org/downloads");

auto root = new VirtualFolder ("root");
root.mount(tango).mount(other).mount(ftp);

foreach (folder; root.tree)
         Stdout.formatln ("folder '{}' has {} files", folder.name, folder.self.files);

The act of mounting these folders into a virtual 'parent' makes them behave just like sub-folders belonging to a parent folder within a file system. All actions on a virtual folder behave as the model dictates, but across a set of drivers instead.

You can also mount virtual folders within a virtual folder, in order to create hierarchies as required. In such cases, the name provided is used as a path segment. For example:

auto tango = new FileFolder ("/dev/d/software/tango");
auto other = new FileFolder ("/other");

auto sub = new VirtualFolder ("sub");
sub.mount(tango, "code").mount(other);

auto root = new VirtualFolder ("root");
root.mount(sub);

auto file = root.file("sub/code/io/vfs/VirtualFolder.d");

Notice that we added an optional name when mounting the tango instance above, called "code". Subsequently using a "sub/code/" prefix causes the virtual folder to select "/dev/d/software/tango" as the folder to locate the file within, because "code" is the name given to that specific FileFolder instance within the namespace of the "sub" child. If you don't add that optional name during the mount() call, it will instead default to the rightmost segment of the provided path. In the above example, reaching into the folder 'other' would require a "sub/other" path instead.
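
The default-name case can be sketched as follows, reusing the same setup. It assumes that a folder reached through a virtual path is opened with open() just like any other named folder, and that the drivers live in the tango.io.vfs modules imported here:

```d
import tango.io.Stdout;
import tango.io.vfs.model.Vfs;
import tango.io.vfs.FileFolder;
import tango.io.vfs.VirtualFolder;

void main ()
{
        auto tango = new FileFolder ("/dev/d/software/tango");
        auto other = new FileFolder ("/other");

        auto sub = new VirtualFolder ("sub");
        sub.mount(tango, "code")   // explicit name: reached as "sub/code"
           .mount(other);          // no name given: reached as "sub/other"

        auto root = new VirtualFolder ("root");
        root.mount(sub);

        // open the folder mounted under its default name and list its children
        auto folder = root.folder("sub/other").open;
        foreach (child; folder)
                 Stdout (child.name).newline;
}
```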

Mounting other folders is the primary role of a VirtualFolder. However, it also supports symlinks. Following on from the prior examples, let's add and use a symlink to a folder:

root.map (root.folder("sub/code"), "symlink");
auto file = root.file ("symlink/io/vfs/VirtualFolder.d");

In this case, we've used the name "symlink" to map to a folder called "sub/code", which in turn maps to our original tango folder and enables us to access a file in the normal manner. You can alias files in a similar fashion:

root.map (root.file("sub/code/io/vfs/VirtualFolder.d"), "thatFile");
auto file = root.file ("thatFile");

LinkedFolder

This folder is derived from VirtualFolder, and behaves in the same way except for a twist upon file-lookup behavior. Instead of mapping a file name to a specific folder within the hierarchy, LinkedFolder sweeps its configured child folders for the file, and does so in the order in which those folders were mounted.

This allows you to (for example) set up a folder to contain configuration files, and support user-provided customizations housed within a second folder. If that second folder is mounted first, any file requests will initially be made there before looking in any other folder. File-level overrides, if you like. Where a lookup fails, LinkedFolder continues along the list of mounted folders until it either locates the file or fails entirely:

auto tango = new FileFolder ("/dev/d/software/tango");
auto other = new FileFolder ("/other");
auto ftp   = new FtpFolder  ("ftp://www.dsource.org/downloads");

auto links = new LinkedFolder ("links");
links.mount(tango).mount(other).mount(ftp);

auto file = links.file ("myfile.txt");

In the above example, "myfile.txt" is located by looking first in tango, then in other, and finally (if not already found) in the downloads section of the FtpFolder. The only way in which these linked folders deviate from a virtual folder is in the behavior of the file() method. You can still locate and traverse folders and files in the normal fashion using self, tree, catalog, and so on.
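
The configuration-override arrangement described earlier might be sketched like this. The two directory paths and the file name are hypothetical, as are the module names in the imports; the point is only that mount order determines lookup order:

```d
import tango.io.Stdout;
import tango.io.vfs.FileFolder;
import tango.io.vfs.LinkedFolder;

void main ()
{
        // hypothetical locations for user overrides and shipped defaults
        auto user     = new FileFolder ("/home/me/myapp-overrides");
        auto defaults = new FileFolder ("/usr/share/myapp-defaults");

        // mounted first, so the user folder is consulted before the defaults
        auto config = new LinkedFolder ("config");
        config.mount(user).mount(defaults);

        // names the user's copy where one exists, otherwise the default
        auto file = config.file ("settings.conf");
        Stdout.formatln ("using '{}' from [{}]", file.name, file.toString);
}
```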

WebDavFolder

ZipFolder

FtpFolder