The Developer's Library for D

Ticket #271 (closed defect: fixed)

Opened 17 years ago

Last modified 17 years ago

FileScan does not recurse

Reported by: demise
Assigned to: kris
Priority: minor
Milestone: 0.96 Beta 2
Component: IO
Version: 0.95.1
Keywords:
Cc:

Description

From the documentation: "Recursively scan files and directories, adding filtered files to an output structure as we go." "The following example lists all files with suffix ".d" located via the current directory, along with the folders containing them:"

        auto scan = new FileScan;

        scan (new FilePath ("."), ".d");

        Stdout.formatln ("{0} Folders", scan.folders.length);
        foreach (file; scan.folders)
                 Stdout.formatln ("{0}", file);

        Stdout.formatln ("\n{0} Files", scan.files.length);
        foreach (file; scan.files)
                 Stdout.formatln ("{0}", file);

However, this does not recurse. If I do "mkdir foo;touch foo/{1,2,3}.d;./my_app", it does not find them. Am I supposed to call toList() for every directory?

Change History

02/09/07 15:04:09 changed by demise

I changed the directory name so that it's foo.d now. I also changed the first argument, but it didn't help. This time FileScan found foo.d, but it treats everything other than . as a file, so it does not recurse.

Debugging tango/tango/io/FileScan.d revealed that file.getPath.isDir returns false even if the file path is a directory.

I'm running dmd 1.005 and gcc 4.1.1 on Gentoo. Any ideas?

02/09/07 19:03:24 changed by kris

  • status changed from new to assigned.
  • milestone set to 0.96 Beta 2.

This appears to be an issue with how DT_DIR is being handled in FileProxy.toList(). I've checked in a temporary fix that would be worth trying out. Can you give it a whirl, please? (In SVN head)

02/10/07 01:50:32 changed by kris

  • status changed from assigned to closed.
  • resolution set to fixed.

02/22/07 13:27:16 changed by demise

  • status changed from closed to reopened.
  • resolution deleted.

I installed the SVN head today and tested Changeset 1645. It still does not work. Here's the test environment:

$ mkdir 1.d
$ touch 1.d/{foo,bar}.d
$ bud test.d -op

The FileScan example gives:

$ ./test.d

1 Folders
.

2 Files
./1.d
./test.d

I also added some lines to display the contents of the structs returned by readdir (d_ino, d_off, d_reclen, d_type, d_name):

3986
2
16
0
.

2
1154432
16
0
..

54045
2281600
16
0
1.d

40312
112486784
16
0
test

43452
121544064
24
0
test.d

I also tested this on a large directory. It seems that d_type is always zero, and I have no idea why. My other test machine had glibc 2.3.6 and this one has version 2.5, if that helps in any way. Neither was able to find any directories.

02/22/07 22:15:58 changed by JJR

I've tested example/filescan.d

Ubuntu Linux 5.10, dmd 1.0007, rebuild 0.11, tango revision 1740

It appears to be functioning correctly:

tango@ubuntu:~/dmd/projects/tango/example$ ls -l
total 296
-rw-r--r--  1 tango tango  19027 2007-02-22 12:58 build-all.bat
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 concurrency
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 conduits
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 console
-rw-r--r--  1 tango tango    677 2007-02-22 12:58 dsss.conf
-rwxr-xr-x  1 tango tango 226080 2007-02-22 13:46 filescan
-rw-r--r--  1 tango tango   2759 2007-02-22 12:58 jake-all.bat
-rw-r--r--  1 tango tango   1733 2007-02-22 12:58 linux.mak
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 locks
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 logging
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 manual
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 networking
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 system
drwxr-xr-x  3 tango tango   4096 2007-02-22 12:58 text
tango@ubuntu:~/dmd/projects/tango/example$ ./filescan
Scanning '.'

10 Folders
./console
./system
./text
./logging
./networking
./conduits
./concurrency
./locks
./manual
.

38 Files
./console/stdout.d
./console/hello.d
./system/process.d
./system/argparser.d
./system/normpath.d
./system/localtime.d
./text/formatindex.d
./text/localetime.d
./text/formatspec.d
./text/token.d
./text/formatalign.d
./logging/chainsaw.d
./logging/logging.d
./networking/socketserver.d
./networking/httpget.d
./networking/sockethello.d
./networking/selector.d
./networking/homepage.d
./conduits/filebubbler.d
./conduits/FileBucket.d
./conduits/filepathname.d
./conduits/filescanregex.d
./conduits/lineio.d
./conduits/fileops.d
./conduits/composite.d
./conduits/filescan.d
./conduits/filecat.d
./conduits/randomio.d
./conduits/filecopy.d
./conduits/mmap.d
./conduits/unifile.d
./concurrency/fiber_test.d
./locks/semaphore.d
./locks/mutex.d
./locks/readwritemutex.d
./locks/barrier.d
./locks/condition.d
./manual/chapterStorage.d

592 entries inspected
tango@ubuntu:~/dmd/projects/tango/example$

02/23/07 01:54:19 changed by demise

I tested this on my laptop running Arch Linux with exactly the same binary. It also worked well.

Too bad I'm doing all my development work on these Gentoo boxes. It seems the functionality is broken only on them. I even tested the equivalent functions in Phobos, and they don't work on these machines either. I don't understand. It seems these operating systems are not following the POSIX standard. What compiler or package flag could have caused that? I guess common *nix programs like 'find' do it some other way, since they work well.

02/27/07 01:49:35 changed by kris

Can you check the value of DT_DIR on that platform, please?

02/27/07 02:00:08 changed by kris

  • status changed from reopened to closed.
  • resolution set to worksforme.

02/27/07 13:48:46 changed by demise

  • status changed from closed to reopened.
  • resolution deleted.

Kris, I switched from Gentoo to Kubuntu Feisty now. I thought it would fix all the problems. Here /usr/src/linux/include/linux/fs.h shows

#define DT_UNKNOWN      0
#define DT_FIFO         1
#define DT_CHR          2
#define DT_DIR          4
...

The example code still does not work :( -- I now have three different machines with different distros, kernels and libc versions. Where should I look next? This bug really stops all my development work with D.

Some other fellow also had problems with this: http://uclibc.org/lists/uclibc/2004-May/008845.html - I'm not sure if these are related in any way.

02/27/07 14:48:00 changed by demise

This C program shows the d_type values:

#include <sys/types.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Print the d_type that readdir() reports for each entry in ".". */
int main(void) {
  struct dirent *theEnts;
  DIR *theDir;
  theDir = opendir(".");
  if (theDir == NULL) {   /* opendir can fail, e.g. on missing permissions */
    perror("opendir");
    return 1;
  }

  while((theEnts = readdir(theDir)) != NULL){
    printf("name: %s, type: ", theEnts->d_name);
    switch(theEnts->d_type){
    case DT_UNKNOWN: printf("UNKNOWN"); break;
    case DT_FIFO: printf("FIFO"); break;
    case DT_CHR: printf("CHR"); break;
    case DT_DIR: printf("DIR"); break;
    case DT_BLK: printf("BLK"); break;
    case DT_REG: printf("REG"); break;
    case DT_LNK: printf("LNK"); break;
    case DT_SOCK: printf("SOCK"); break;
    case DT_WHT: printf("WHT"); break;
    default: printf("?"); break;   /* a value outside the DT_* set */
    }
    printf("\n");
  }
  closedir(theDir);
  return 0;
}

Here it shows

name: ., type: UNKNOWN
name: .., type: UNKNOWN
name: test, type: UNKNOWN
name: test.c, type: UNKNOWN

So it seems this is not an error in Tango. I tested this on several machines:

Arch Linux (libc 2.5-4, kernel 2.6.18): works
DSL Linux (libc 2.3.2, kernel 2.4.26): fails
Gentoo "stable" (libc 2.3.6-r4, kernel 2.6.19-gentoo-r1): fails
Gentoo "testing" (libc 2.5, kernel 2.6.20-vanilla): fails
Kubuntu 6.10 (libc 2.4-1ubuntu12, kernel 2.6.17.10-generic): works
Nordisk Knoppix (libc 2.3.2, kernel 2.4.22): fails
Ubuntu 6.10 server (libc 2.4-1ubuntu12.3, kernel 2.6.17.11): works
Ubuntu 7.04 (libc 2.5-0ubuntu11, kernel 2.6.20-8-generic): fails

That makes me wonder if this is the correct way to do it on Linux? I mean, of course it's POSIX compliant, but if the Linuxes don't support it, how am I supposed to make my software work anywhere?

02/27/07 15:19:51 changed by larsivi

FWIW, I tested this program on my Kubuntu 6.10, and it worked there too.

Could you track down some authority on this subject and find out what would be the correct solution? Please :)

02/27/07 16:53:13 changed by JJR

It appears that d_type in the dirent structure is a BSD extension that is not available on all Linux distributions (and is certainly not POSIX). Some distributions have chosen to be more BSD-like in operation, so maybe that explains why they implement this optional extension (Arch Linux, early Ubuntus).

Nonetheless, this proves that d_type is invalid for use here, although we might like to keep it around somewhere in case a BSD gdc version of Tango ever uses it (or Darwin, for that matter: any word on how it works on Mac OS X?).

The proper way to do this on Linux appears to be through struct stat and the stat system call. I'll be looking into this.

Nonetheless, good catch. At least it has been proven that DT_DIR is invalid for Linux.
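
For readers hitting the same problem: the Linux readdir(3) manual page says that d_type may be DT_UNKNOWN when the filesystem does not report entry types, and that applications must handle that value. Below is a minimal C sketch of that handling, not the change that went into Tango. The helper name entry_is_dir is hypothetical. It trusts d_type when it is set and falls back to lstat otherwise.

```c
#define _DEFAULT_SOURCE   /* exposes d_type, DT_* and lstat on glibc */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper (not part of Tango): returns 1 if `ent`, read from
 * the directory `dir_path`, is a directory, 0 if it is not, and -1 if the
 * type could not be determined. */
int entry_is_dir(const char *dir_path, const struct dirent *ent)
{
    /* Fast path: the filesystem filled in d_type. */
    if (ent->d_type != DT_UNKNOWN)
        return ent->d_type == DT_DIR;

    /* Slow path: d_type is DT_UNKNOWN, so ask lstat() about the full path. */
    char full[PATH_MAX];
    int n = snprintf(full, sizeof full, "%s/%s", dir_path, ent->d_name);
    if (n < 0 || (size_t)n >= sizeof full)
        return -1;                      /* path does not fit in the buffer */

    struct stat st;
    if (lstat(full, &st) != 0)
        return -1;                      /* entry vanished or is unreadable */
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

Because lstat does not follow symbolic links, a link that points at a directory is reported as "not a directory" on both paths (DT_LNK on the fast path, S_ISLNK on the slow one), so the two branches agree.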

02/27/07 17:18:02 changed by JJR

Ok, details:

1) We must use the stat or more probably the lstat system call. lstat is used to prevent recursing through symbolic links.

2) We use the S_ISDIR(st->st_mode) macro to determine if the entry is a directory. st is a pointer to a stat_t structure as defined in Linux sys/stat.h. Sean Kelly seems to have provided all the necessary functionality in stc/posix/sys/stat.d. :)

Hopefully we can get this fixed soon.
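
The two points above can be sketched in plain C as follows. This is an illustration of the lstat plus S_ISDIR approach only, under the assumption that a scan counts every directory it visits and every non-directory whose name ends in a given suffix. It is not the code that was committed to Tango, and scan_counts, has_suffix and scan_tree are hypothetical names.

```c
#define _DEFAULT_SOURCE   /* exposes lstat and PATH_MAX on glibc */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical result type: counts only, to keep the sketch short. */
struct scan_counts {
    unsigned long folders;  /* directories visited, including the root */
    unsigned long files;    /* non-directories whose name ends in the suffix */
};

/* Returns 1 if `name` ends with `suffix`, 0 otherwise. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* Recursively walks `path`, adding to `out`.
 * Returns 0 on success, -1 if `path` itself could not be opened. */
int scan_tree(const char *path, const char *suffix, struct scan_counts *out)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;
    out->folders++;

    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        /* Skip the self and parent entries to avoid infinite recursion. */
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;

        char full[PATH_MAX];
        int n = snprintf(full, sizeof full, "%s/%s", path, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof full)
            continue;                   /* skip paths that do not fit */

        /* lstat, not stat: symbolic links are never followed. */
        struct stat st;
        if (lstat(full, &st) != 0)
            continue;

        if (S_ISDIR(st.st_mode))
            scan_tree(full, suffix, out);   /* unreadable subdirs are skipped */
        else if (has_suffix(ent->d_name, suffix))
            out->files++;
    }
    closedir(dir);
    return 0;
}
```

A caller would zero-initialise a struct scan_counts and call scan_tree(".", ".d", &counts). Since the decision never looks at d_type, a directory named 1.d is descended into rather than counted as a file, which is the case that failed earlier in this ticket.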

02/28/07 14:09:41 changed by demise

The new version works for me now. Thank you for fixing it. I won't close this yet since I'm not sure if it's 100% ready.

02/28/07 17:31:46 changed by JJR

Kudos to Gregor Richards for getting this fixed.

03/02/07 07:13:49 changed by kris

  • status changed from reopened to closed.
  • resolution set to fixed.