Download Reference Manual
The Developer's Library for D
About Wiki Forums Source Search Contact

Ticket #587 (closed enhancement: fixed)

Opened 1 year ago

Last modified 1 year ago

Allow tango.io.FileScan to proceed past errors and scan non-recursively

Reported by: flithm Assigned to: kris
Priority: normal Milestone: 0.99.1 RC4
Component: IO Version: 0.99 RC3 Xammy
Keywords: Cc: sean, larsivi

Description

Currently io.FileScan? dies when it encounters any errors (including things that aren't errors at all (such as pipes and fifos), and trivial errors such malformed soft-links, permission denied, etc.

For my application I need the ability to produce a listing even in the face of encountering errors.

I also desire the ability to generate some scans recursively and others non-recursively.

I have modified FileScan? to provide the functionality I desire. This works and does what I need it to do, however it may not be exactly how you guys want it to work. Either way I thought I should attach the patch (and original) file in case it is of interest.

Thanks,

Tim.

Attachments

FileScan-OptionalDeathOnErrors.patch (6.2 kB) - added by flithm on 08/22/07 09:06:09.
FileScan.patch (6.2 kB) - added by flithm on 08/26/07 15:06:37.
Allow blank filter to scan all file names
FileScan.d (8.9 kB) - added by flithm on 08/26/07 15:07:14.
Complete FileScan? with patch applied

Change History

08/22/07 09:06:09 changed by flithm

  • attachment FileScan-OptionalDeathOnErrors.patch added.

08/22/07 09:14:15 changed by flithm

I should note that this doesn't change the behavior of FileScan? in existing programs. In order to use the new functionality you need to pass in some parameters to sweep.

08/26/07 15:06:37 changed by flithm

  • attachment FileScan.patch added.

Allow blank filter to scan all file names

08/26/07 15:07:14 changed by flithm

  • attachment FileScan.d added.

Complete FileScan? with patch applied

(follow-up: ↓ 3 ) 08/26/07 16:13:32 changed by kris

  • status changed from new to assigned.

Thanks much for digging into this, though I'd like to adjust the changes somewhat:

  • For unfiltered sweeps, the filter itself should really be modified rather than depending upon 'match' being empty (and non-null). However, using null as an argument won't compile. For the sake of simplicity, this unfiltered option should probably read: sweep (char[] path); with no other arguments.
  • to control recursion, there are already two options. (1) use the non-recursive FilePath?.toList(), or (2) provide a filter which returns false when handed a subdirectory. If possible, I'd rather not add yet another way of doing that?
  • controlling errors is a good idea, but there's perhaps a problem with the suggested implementation in that it would only ever show one error per directory? One question is this: shouldn't the error handling be pushed down into FilePath?.toList() instead ?

(in reply to: ↑ 2 ; follow-up: ↓ 4 ) 08/26/07 16:38:08 changed by flithm

Hmm... good points!

- For unfiltered sweeps, the filter itself should really be modified rather than depending upon 'match' being empty (and non-null). However, using null as an argument won't compile. For the sake of simplicity, this unfiltered option should probably read: sweep (char[] path); with no other arguments.

Agreed!

- to control recursion, there are already two options. (1) use the non-recursive FilePath?.toList(), or (2) provide a filter which returns false when handed a subdirectory. If possible, I'd rather not add yet another way of doing that?

Yeah I was pretty iffy on the best way to do this. I think either of the options you suggest sound good. The only thing I'd add is if by (2) you mean provide a custom filter to the function, I think that's too much effort on the users part. It should really just be a parameter they can pass in. But if it's the best way to go then maybe a couple of pre-defined filters could be made available... like FilterRecursive?, FilterNonRecursive?, etc (that would both take a parameter for the scan filter)?

- controlling errors is a good idea, but there's perhaps a problem with the suggested implementation in that it would only ever show one error per directory? One question is this: shouldn't the error handling be pushed down into FilePath?.toList() instead ?

There are situations where it thinks a file is a directory (ie it doesn't currently handle malformed symlinks, pipes, fifos, etc), so it tries to recurse into to these "directories" and fails, which _does_ result in multiple errors for a single directory. Try scanning /tmp (which often has pipes and fifos and sockets and all that in it). Most of the time though it's only possible for one error per dir (because it can't be scanned).

Regardless of how it's implemented in my thinking you want to know a file is there, and you want to know if it couldn't be scanned, but you also want the scan to keep going.

I had another thought on how this could be implemented. An error function could be passed in that allows you to intercept an error as it happens. If you return "true" the scan will keep going, otherwise not. Perhaps this would make sense if the error handling were pushed down to FilePath?.toList()? It could take the error handler and call it when it encounters errors.

(in reply to: ↑ 3 ; follow-up: ↓ 5 ) 08/26/07 16:48:11 changed by kris

Replying to flithm:

Hmm... good points!

- For unfiltered sweeps, the filter itself should really be modified rather than depending upon 'match' being empty (and non-null). However, using null as an argument won't compile. For the sake of simplicity, this unfiltered option should probably read: sweep (char[] path); with no other arguments.

Agreed!

Checked in :)

- to control recursion, there are already two options. (1) use the non-recursive FilePath?.toList(), or (2) provide a filter which returns false when handed a subdirectory. If possible, I'd rather not add yet another way of doing that?

Yeah I was pretty iffy on the best way to do this. I think either of the options you suggest sound good. The only thing I'd add is if by (2) you mean provide a custom filter to the function, I think that's too much effort on the users part. It should really just be a parameter they can pass in. But if it's the best way to go then maybe a couple of pre-defined filters could be made available... like FilterRecursive?, FilterNonRecursive?, etc (that would both take a parameter for the scan filter)?

That sounds like it's worth investigating further

- controlling errors is a good idea, but there's perhaps a problem with the suggested implementation in that it would only ever show one error per directory? One question is this: shouldn't the error handling be pushed down into FilePath?.toList() instead ?

There are situations where it thinks a file is a directory (ie it doesn't currently handle malformed symlinks, pipes, fifos, etc), so it tries to recurse into to these "directories" and fails, which _does_ result in multiple errors for a single directory. Try scanning /tmp (which often has pipes and fifos and sockets and all that in it). Most of the time though it's only possible for one error per dir (because it can't be scanned).

OK, I see

Regardless of how it's implemented in my thinking you want to know a file is there, and you want to know if it couldn't be scanned, but you also want the scan to keep going. I had another thought on how this could be implemented. An error function could be passed in that allows you to intercept an error as it happens. If you return "true" the scan will keep going, otherwise not. Perhaps this would make sense if the error handling were pushed down to FilePath?.toList()? It could take the error handler and call it when it encounters errors.

Yeah, that's where I was going also. To ensure the simple version of sweep(char[] path) is still available we'd perhaps have to move this option to an attribute setter, rather than provide it as an argument to sweep() directly ...

(in reply to: ↑ 4 ; follow-up: ↓ 6 ) 08/26/07 18:29:09 changed by flithm

I had another thought on how this could be implemented. An error function could be passed in that allows you to intercept an error as it happens. If you return "true" the scan will keep going, otherwise not. Perhaps this would make sense if the error handling were pushed down to FilePath?.toList()? It could take the error handler and call it when it encounters errors.

Yeah, that's where I was going also. To ensure the simple version of sweep(char[] path) is still available we'd perhaps have to move this option to an attribute setter, rather than provide it as an argument to sweep() directly ...

Whatever works best and fits into the API nicely!

Personally I think I'd provide an optional argument to sweep that defaults to null. (Ie if no function passed in, don't report errors). Or there could be another sweep function that, if passed an error callback, sets the property, and makes a call to the simple version of sweep.

But, that's just me. As long as the functionality gets in there in some way I'll be happy!

(in reply to: ↑ 5 ; follow-up: ↓ 7 ) 08/26/07 18:31:23 changed by kris

Replying to flithm:

Whatever works best and fits into the API nicely! Personally I think I'd provide an optional argument to sweep that defaults to null. (Ie if no function passed in, don't report errors). Or there could be another sweep function that, if passed an error callback, sets the property, and makes a call to the simple version of sweep. But, that's just me. As long as the functionality gets in there in some way I'll be happy!

I would if the compiler would allow it :)

i.e. a null default for the other two sweep() signatures will collide?

(in reply to: ↑ 6 ) 08/26/07 18:36:22 changed by flithm

Replying to kris:

Replying to flithm:

Whatever works best and fits into the API nicely! Personally I think I'd provide an optional argument to sweep that defaults to null. (Ie if no function passed in, don't report errors). Or there could be another sweep function that, if passed an error callback, sets the property, and makes a call to the simple version of sweep. But, that's just me. As long as the functionality gets in there in some way I'll be happy!

I would if the compiler would allow it :) i.e. a null default for the other two sweep() signatures will collide?

Oh right I didn't think about that. How about instead of null provide a default error function that just throws an exception?

08/28/07 02:12:00 changed by kris

  • status changed from assigned to closed.
  • resolution set to fixed.
  • milestone changed from 1.0 to 0.99.1 RC4.

OK, I've made some adjustments to FileScan and I think it does what you wanted? However, I'm not content with two aspects:

  • the additional argument 'recurse' adds a level of complexity to the functions that probably shouldn't be there. Yes, they are defaulted for pedestrian usage, but I'm not crazy about such things :p
  • the list of errors is maintained silently. The user has to check scan.errors.length to see if errors occurred, so it's not the most intuitive API I've seen. I could return a bool, but then that breaks some existing code. I could configure an error handler, but that adds further complexity to what ought to be a no-brainer. Other suggestions?

Closing for now. Re-open as required.

08/29/07 12:35:05 changed by flithm

Hmm... took me a bit to review this and it looks good to me. I have the same concerns as you, but I'm unmotivated to provide a proper fix. This seems "good enough" for now, but definitely it would be good to redesign this at some point.

08/29/07 14:13:42 changed by kris

  • cc set to sean, larsivi.