
Ticket #748 (assigned enhancement)

Opened 11 months ago

Last modified 3 months ago

Replacement/Enhancement for tango.util.ArgParser

Reported by: darrylb Assigned to: kris (accepted)
Priority: major Milestone: 0.99.8
Component: Core Functionality Version: 0.99.3 Triller
Keywords: Cc:

Description

Currently, ArgParser is fairly simplistic. Because command line syntax is traditionally fairly uniform, it could be extended or replaced to provide additional functionality.

As example, I present http://www.dsource.org/projects/tango.scrapple/browser/trunk/tango/scrapple/util/Arguments.d

This is a command line argument parsing module that was ported from C for use in D. It could use some polish and 'tangofication', but otherwise I think it is a good example of how argument parsing could be made simpler for the user. A possible future extension of this could be to generate help text automatically.

Attachments

arguments_example.d (2.9 kB) - added by darrylb on 11/21/07 11:55:03.
Arguments.d (18.7 kB) - added by darrylb on 12/17/07 17:09:14.
Updated Arguments module (adds .parse method)
Arguments.2.d (27.2 kB) - added by larsivi on 12/18/07 15:21:17.
current revision for Tango
Arguments_Docs.txt (6.5 kB) - added by darrylb on 03/01/08 18:42:56.
Some preliminary tutorial / docs
CmdParser.d (2.1 kB) - added by darrylb on 03/24/08 22:41:59.
ArgumentsExample.d (2.2 kB) - added by darrylb on 03/24/08 22:59:31.
Arguments version of http://www.dsource.org/projects/tango/wiki/ArgParserExample
Arguments.3.d (35.8 kB) - added by darrylb on 04/07/08 14:02:33.
Arguments, -opIn, +contains, sed "s/\t/ /g"

Change History

11/15/07 18:37:48 changed by darrylb

  • type changed from defect to enhancement.

11/16/07 03:30:46 changed by larsivi

Darryl; Could you write up an example that shows off the same as in the TutArgParser tutorial?

11/21/07 11:54:21 changed by darrylb

Ok, I wrote the example found at: http://www.dsource.org/projects/tango/wiki/ArgParserExample With this Arguments module.

I had to add a .parse function to Arguments, which isn't in what I submitted to scrapple but is fairly trivial.

11/21/07 11:55:03 changed by darrylb

  • attachment arguments_example.d added.

11/24/07 14:58:04 changed by darrylb

Compared to ArgParser:

Arguments and ArgParser differ substantially: Arguments provides a storage and reference structure for parsed arguments, whereas ArgParser parses the arguments but passes the storage and referencing of them to an external delegate. This makes for some increased overhead when using ArgParser versus Arguments. For example, consider the trivial case of seeing if some 'x' was set on the command line.

With ArgParser:

void main(char[][] cmdlArgs)
{
    bool xSet = false;
    ArgParser parser = new ArgParser();
    parser.bind("-", "x", delegate void() { xSet = true; });
    parser.parse(cmdlArgs);
    if (xSet)
    {
        // do cool stuff
    }
}

Whereas, with Arguments:

void main(char[][] cmdlArgs)
{
    Arguments args = new Arguments(cmdlArgs);
    if ("x" in args)
    {
        // do cool stuff
    }
}

This becomes more apparent if you have aliases, as is common in command line parameters. If 'x' could be set with -x, --setX, and --pleaseSetX, then ArgParser would need a bind and subsequent delegate for each, whereas with Arguments this is made almost trivial via the aliases parameter to the ctor:

void main(char[][] cmdlArgs)
{
    char[][][] aliases;
    aliases ~= ["x", "setX", "pleaseSetX"];
    Arguments args = new Arguments(cmdlArgs, aliases);
    if ("x" in args)
    {
        // do cool stuff
    }
}

11/24/07 15:08:56 changed by darrylb

Arguments also provides functionality that I don't think is possible in ArgParser, that of implicit arguments. Implicit arguments are arguments that are passed on the command line that are implicitly set to some name. For example, you might have:

myProgram --files blah.txt blah2.txt

That you would also like to call as:

myProgram blah.txt blah2.txt

Arguments provides this via the implicitArguments parameter to the ctor. Implicit arguments are described in detail in the comments of Arguments so I won't repeat their use here.
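To make the idea concrete, here is a minimal sketch of implicit arguments. It is written in present-day D (Phobos only) and does not use the Arguments API: parseImplicit, its signature, and the returned map are all hypothetical. It assumes a single implicit name that receives every token lacking a leading '-'.

```d
import std.stdio;

/// Hypothetical helper, not the Arguments API: every token without a
/// leading '-' is filed under the single implicit name.
string[][string] parseImplicit(string[] tokens, string implicitName)
{
    string[][string] args;
    string[] none; // empty parameter list for options seen without parameters

    foreach (tok; tokens)
    {
        if (tok.length > 1 && tok[0] == '-')
        {
            // Strip the leading dashes to get the option name.
            size_t start = 0;
            while (start < tok.length && tok[start] == '-')
                ++start;
            string name = tok[start .. $];
            if (name !in args)
                args[name] = none;
        }
        else
        {
            // A bare token: attribute it to the implicit argument.
            args[implicitName] ~= tok;
        }
    }
    return args;
}

void main()
{
    auto named = parseImplicit(["--files", "blah.txt", "blah2.txt"], "files");
    auto bare = parseImplicit(["blah.txt", "blah2.txt"], "files");

    // Both invocations end up with the same list under "files".
    assert(named["files"] == bare["files"]);
    writeln(named["files"]);
}
```

Under this assumption the two command lines above are indistinguishable to the program, which is the convenience being described.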

Arguments also provides for standardized validation of arguments, which can be handled in one of three ways:

The first covers most of what you want to check with arguments, that is, that the argument is passed, and that it has a value. This is simply: .addValidation("argName", bool mustExist, bool mustHaveParameter);

For more complex validations, you can also pass in a validation delegate. These delegates can either validate single argument parameters, or validate all passed parameters at once. These validation delegates are also described completely in the comments so I won't dwell on their actual use here, but compared to ArgParser, validation is much simpler as it's not contained within an external function (the ArgParser delegate for the particular arg). As well, the validators are alias and implicit argument aware, so there's no need for duplicate code. You can of course also share a validator amongst several arguments, as I do a lot of the time (a validator that checks if the passed file actually exists, etc).

The validation function itself (.validate) throws a well-defined exception that one can use to discover what the issue was, and output or do whatever else as appropriate. These exceptions are also described in the comments, but compared to ArgParser, Arguments provides a standardized and easy to use way both to set validations for arguments and to evaluate the results of said validation.
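The simplest of the three validation forms (must exist, must have a parameter) could be sketched roughly as follows. This is present-day D with Phobos only, not the Arguments module: Rule, ValidationException, and validate are hypothetical names, and the real exception classes are the ones documented in the module's comments.

```d
import std.stdio;

/// Hypothetical exception type carrying the offending argument name.
class ValidationException : Exception
{
    string argument;

    this(string argument, string msg)
    {
        super(msg);
        this.argument = argument;
    }
}

/// Hypothetical rule record mirroring (name, mustExist, mustHaveParameter).
struct Rule
{
    string name;
    bool mustExist;
    bool mustHaveParameter;
}

/// Checks every rule against the parsed arguments; throws on the first failure.
void validate(string[][string] args, Rule[] rules)
{
    foreach (rule; rules)
    {
        auto found = rule.name in args;
        if (found is null)
        {
            if (rule.mustExist)
                throw new ValidationException(rule.name, "missing argument");
            continue;
        }
        if (rule.mustHaveParameter && (*found).length == 0)
            throw new ValidationException(rule.name, "argument needs a parameter");
    }
}

void main()
{
    string[][string] args;
    string[] none;
    args["files"] = ["a.txt"];
    args["x"] = none; // present on the command line, but without a parameter

    try
    {
        validate(args, [Rule("files", true, true), Rule("x", false, true)]);
    }
    catch (ValidationException e)
    {
        writeln(e.argument, ": ", e.msg);
    }
}
```

Keeping the rules as data, separate from any per-argument handler, is what lets one rule set be shared across aliases.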

11/24/07 15:12:14 changed by darrylb

The above examples in code blocks for easier reading... (I wish you could edit your own comments)

ArgParser:

void main(char[][] cmdlArgs)
{
    bool xSet = false;
    ArgParser parser = new ArgParser();
    parser.bind("-", "x", delegate void() { xSet = true; });
    parser.parse(cmdlArgs);
    if (xSet)
    {
        // do cool stuff
    }
}

Whereas, with Arguments:

void main(char[][] cmdlArgs)
{
    Arguments args = new Arguments(cmdlArgs);
    if ("x" in args)
    {
        // do cool stuff
    }
}

12/17/07 14:32:27 changed by larsivi

  • owner changed from sean to larsivi.
  • status changed from new to assigned.

This is mine :P

12/17/07 17:09:14 changed by darrylb

  • attachment Arguments.d added.

Updated Arguments module (adds .parse method)

12/18/07 15:21:17 changed by larsivi

  • attachment Arguments.2.d added.

current revision for Tango

12/18/07 19:12:58 changed by Nietsnie

<larsivi> darrylb: in the example, there are no parameters registered with the "response" tag if doing ./arguments -r foobar
<larsivi> darrylb: further, ./arguments -c foo bar -r foobar crash with ArrayOutOfBoundsException in "if (currentArgument[0] == '-')" inside parse

The first example doesn't work because of how short arguments are handled.

Basically, it finds -r as an argument, and then it gets an argument without a - delimiter. So it checks implicit arguments first, and finds the file one, so you end up with an args array like this:

args["r"] = [];
args["files"] = [ "foobar" ];


You can instead do ./arguments -r:foobar or ./arguments -r=foobar to get the desired functionality

The second example would look like this:

args["c"] = [];
args["files"] = ["foo", "bar"];
args["r"] = [ "foobar" ];

The exception is thrown because the arguments_example.d doesn't .dup the line it's adding to the arguments array for the "parse" on line 82 (it should be arguments ~= line.dup;) Changing that fixed the issue.
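The kind of aliasing that a missing .dup causes can be illustrated in isolation. The sketch below is present-day D and does not reproduce arguments_example.d (its reader is not shown in this ticket); collect and its fixed 16-byte buffer are hypothetical stand-ins for a line reader that reuses one buffer.

```d
import std.stdio;

/// Hypothetical stand-in for a reader that reuses one buffer per line.
/// Each word must fit into the 16-byte buffer.
char[][] collect(string[] words, bool copyEachLine)
{
    char[] buffer = new char[16];
    char[][] lines;

    foreach (word; words)
    {
        // The "reader" overwrites the start of its buffer with the new line.
        buffer[0 .. word.length] = word[];
        char[] line = buffer[0 .. word.length];

        // Without .dup every stored entry is a slice of the same buffer.
        lines ~= copyEachLine ? line.dup : line;
    }
    return lines;
}

void main()
{
    auto aliased = collect(["foo", "bar"], false);
    auto copied = collect(["foo", "bar"], true);

    assert(aliased[0] == "bar"); // first entry was overwritten by the second line
    assert(copied[0] == "foo");  // independent copy survives
    writeln(aliased, " vs ", copied);
}
```

Stored slices that still point into a reused buffer change (or become stale) when the next line arrives, which is consistent with the fix of appending line.dup.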

12/19/07 03:59:59 changed by larsivi

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [3047]) Tentative replacement for ArgParser, ArgParser is deprecated and will be removed (to tango.scrapple) prior to 1.0. closes #748.

12/19/07 14:51:33 changed by darrylb

The current implicit arguments scheme has some hurdles to overcome:

Background: The implicit argument concept is intended to support traditional command line use such as:

ls -al fileone.txt filetwo.txt

Which we should expect to work the same as:

ls fileone.txt filetwo.txt -al

In this case, we could have args["files"], and could have both "a" and "l" in args, and run our program appropriately.

Some issues:

a) By themselves, the implicit arguments don't know how many parameters they should consume, and so default to consuming one parameter each until the last declared implicit argument, which consumes the rest. This may or may not be the behavior the programmer intends.

b) The implicit arguments may consume the parameter intended for a previous argument. For example, given the command line:

myProg --action delete fileone.txt filetwo.txt

And assuming a single implicit argument called 'files', and a declared argument called 'action', we would probably expect to have:

args["action"] = ["delete"];
args["files"] = ["fileone.txt", "filetwo.txt"];

However, what we will have is:

args["action"] = null;
args["files"] = ["delete", "fileone.txt", "filetwo.txt"];

Note that there is no distinct 'correct' way to parse this type of situation, because the program could just as easily be called via:

myProg --delete fileone.txt filetwo.txt

In which case, we'd not want the 'delete' argument to be assigned the 'fileone.txt' parameter.

Possible resolutions:

a) Only allow a single implicit argument. Essentially, this would be 'an array of non-declared parameters', and the programmer can decide what they mean. This would eliminate any possible confusion of 'which implicit parameters went where?'.

b) The only solution I can see to this is to force the programmer to declare more information about the arguments and the parameters they are expecting. Because the situation is ambiguous as to possible interpretations, the only resolution is to remove the ambiguity by having the programmer tell us what they expect. Essentially, this could be an addition to the ctor. This is also exactly the type of thing that would be required in order to provide automatic help text generation, so it would serve a dual purpose (removing misinterpretation, and providing a path to future extended functionality).
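As a rough illustration of how declared parameter counts remove the ambiguity in issue b), consider the sketch below. It is present-day D with Phobos only, and everything in it is hypothetical: parseDeclared, the paramCount table, and the choice of a single implicit list are assumptions, not what Arguments does.

```d
import std.stdio;

/// Hypothetical parser: each "--name" consumes exactly the number of
/// parameters declared for it in paramCount (zero if undeclared); all
/// other bare tokens go to the single implicit list.
string[][string] parseDeclared(string[] tokens, size_t[string] paramCount,
                               string implicitName)
{
    string[][string] args;
    string[] none;
    size_t i = 0;

    while (i < tokens.length)
    {
        string tok = tokens[i++];
        if (tok.length > 2 && tok[0 .. 2] == "--")
        {
            string name = tok[2 .. $];
            args[name] = none;

            // Look up how many parameters this option was declared to take.
            size_t want = 0;
            if (auto declared = name in paramCount)
                want = *declared;

            while (want > 0 && i < tokens.length)
            {
                args[name] ~= tokens[i++];
                --want;
            }
        }
        else
        {
            args[implicitName] ~= tok;
        }
    }
    return args;
}

void main()
{
    size_t[string] counts;
    counts["action"] = 1; // "--action" takes one parameter, "--delete" takes none

    auto a = parseDeclared(["--action", "delete", "fileone.txt", "filetwo.txt"],
                           counts, "files");
    auto b = parseDeclared(["--delete", "fileone.txt", "filetwo.txt"],
                           counts, "files");

    assert(a["action"] == ["delete"]);
    assert(a["files"] == ["fileone.txt", "filetwo.txt"]);
    assert(b["delete"].length == 0);
    assert(b["files"] == ["fileone.txt", "filetwo.txt"]);
    writeln(a, "\n", b);
}
```

The same declaration table is the information a help-text generator would need, which is the dual purpose mentioned above.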

I can add in this type of thing if it seems like a good idea, however for now I leave it up to discussion.

12/19/07 16:55:27 changed by larsivi

  • status changed from closed to reopened.
  • resolution deleted.

(In [3055]) Revert changes 3047 through 3049 from 0.99.4 tag resulting in no ArgParser / Arguments changes. They are still present in trunk though.

12/19/07 17:27:46 changed by larsivi

  • milestone set to 0.99.5.

After trying out Arguments to some degree, I believe there are at least a few things that need to be fixed.

More obvious mechanics for the implicit parameters seem to be necessary, although it may be a documentation problem. Darryl's comments above detail the issue, and in its defence, you don't have to use them. One of the surprises to me was that they weren't just the unhandled parameters found at the end; in particular, I wouldn't expect them to "catch" my parameter to the response option in the example further up.

Being able to bind delegates for handling passed parameters seems to be much more useful than it was made out to be in the Arguments argumentation. In particular, the if (arg in args) approach seems to have problems in more dynamic settings (undefined order of passed parameters, or a change in the parameter list during parsing / handling, for instance when encountering a response file).

Note that passing an ordinal to the delegate, as is possible with ArgParser, won't be immediately available with Arguments without adding more data to the internal structure. Where ArgParser called the handler during parsing, Arguments stores the parameters in an unordered list. Whether ordinals are actually a needed feature, I'm undecided about at the moment.

(follow-up: ↓ 14 ) 12/28/07 09:34:22 changed by ShadowIce

Note that tango.net.cluster.tina.CmdParser.d (and ClusterServer) uses ArgParser. This causes an error with build-tango.bat (0.99.4) on my system because CmdParser can't be compiled.

(in reply to: ↑ 13 ) 12/28/07 09:38:11 changed by ShadowIce

Replying to ShadowIce:

Note that tango.net.cluster.tina.CmdParser.d (and ClusterServer) uses ArgParser. This causes an error with build-tango.bat (0.99.4) on my system because CmdParser can't be compiled.

Oops, that's already fixed. But for any future deprecations of ArgParser, don't forget CmdParser. ;)

12/28/07 10:12:11 changed by larsivi

ShadowIce; Thanks, I know ;)

02/25/08 17:25:30 changed by larsivi

  • milestone changed from 0.99.5 to 0.99.6.

03/01/08 18:42:56 changed by darrylb

  • attachment Arguments_Docs.txt added.

Some preliminary tutorial / docs

03/18/08 06:42:52 changed by larsivi

Ok, I have done some slight tests and I like it. However, your CmdParser is bogus as there is no args instance, but rather 'this'. So it compiles if 'args' in parse is changed to 'this'. I suppose that will work fine, and given that CmdParser is a subclass of Arguments, those semantics are ok. It does however look a bit peculiar to use in and opIndex on 'this', so maybe you could check whether this is sane?

Also, your Test class isn't in Tango yet, so that cannot be part of the unittest. The Stdout import must be removed too. Apart from that, I'd say it is ready for inclusion. There are probably some doc lines that need to be broken though, but that can be looked at after docs are generated the first time. We also need an updated example for the example/ folder.

03/18/08 12:47:56 changed by kris

I, too, have concerns over this. Looking at the CmdParser update, I had some real difficulty discerning what was what - the parameters() call, for example, is about as opaque as one could hope for, and the indexing[0] on the resultant processing is equally opaque. It's fine if you already grok how the thing operates, but (IMO) crosses the line in terms of easy comprehension.

You might consider revising what the default settings are for each configured 'option', or consider adding some methods to bring some clarity to the configuration. For example, adding a single() method or equivalent to indicate both parameters(1,1) and required() might be a good place to start. I don't know what to suggest regarding the resultant [0] indexing required for single parameter options, but suspect something is necessary.
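One way the suggested single() convenience could look is sketched below. This is present-day D and purely illustrative: the Option class, its fields, and the method bodies are hypothetical, with only the names parameters, required, and single taken from the comment above.

```d
import std.stdio;

/// Hypothetical option descriptor with chainable configuration methods.
class Option
{
    string name;
    size_t minParams = 0;
    size_t maxParams = 0;
    bool isRequired = false;

    this(string name)
    {
        this.name = name;
    }

    /// Sets the allowed range of parameter counts.
    Option parameters(size_t min, size_t max)
    {
        minParams = min;
        maxParams = max;
        return this;
    }

    /// Marks the option as mandatory.
    Option required()
    {
        isRequired = true;
        return this;
    }

    /// Shorthand for "exactly one parameter, and the option must be given".
    Option single()
    {
        return parameters(1, 1).required();
    }
}

void main()
{
    auto verbose = new Option("files");
    verbose.parameters(1, 1).required();

    auto terse = new Option("files");
    terse.single();

    // Both spellings configure the option identically.
    assert(verbose.minParams == terse.minParams);
    assert(verbose.maxParams == terse.maxParams);
    assert(verbose.isRequired == terse.isRequired);
    writeln(terse.name, ": ", terse.minParams, "..", terse.maxParams);
}
```

A named shorthand like this documents intent at the call site, which addresses the opacity concern without changing the underlying settings.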

03/24/08 22:38:14 changed by darrylb

Made some changes to try to address the issues brought up here and on irc:

-made a subclass to access the char[][] parameter array. So, for example, you can do:

char[][] givenFiles = args.parameters["files"];

Thanks to schveiguy for that idea. I think that will work out well.

-args["name"] now just returns the 0 index for any discovered parameters for that argument. (ie: the following are the same:)

char[] blah = args["blah"];
char[] blah = args.parameters["blah"][0];

-The single index define is now interpreted as defining both the min and max required number of parameters for that argument. So, the following are the same:

.define("a").parameters(1,1);
.define("a").parameters(1);

-Changed the name of .requires to .prerequisite, to avoid confusion with .required.

-Made separate exception classes for each parsing issue, so that the exception can contain more information on the parse issue.

-Commented out the Test unittests, and added a basic debug(UnitTest) unittest with asserts.
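The relationship between indexing and the parameters array described in the list above can be sketched like this. It is present-day D, not the attached module: the Args struct is hypothetical, and returning null for a missing or parameterless argument is an assumption made only for this example.

```d
import std.stdio;

/// Hypothetical container: the full lists live in `parameters`,
/// while indexing the container yields only the first parameter.
struct Args
{
    string[][string] parameters;

    string opIndex(string name)
    {
        auto found = name in parameters;
        if (found is null || (*found).length == 0)
            return null; // assumption: nothing to return for this name
        return (*found)[0];
    }
}

void main()
{
    Args args;
    args.parameters["blah"] = ["first", "second"];

    // The shortcut and the explicit [0] lookup agree.
    assert(args["blah"] == args.parameters["blah"][0]);
    writeln(args["blah"]);
}
```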

03/24/08 22:41:59 changed by darrylb

  • attachment CmdParser.d added.

03/24/08 22:49:06 changed by darrylb

Altered the CmdParser to reflect that there is no Arguments object called 'args'...

Using 'this' for opIn and opIndex doesn't bother me there, I think it makes sense seeing as CmdParser is extending Arguments.

However, if one wasn't too concerned with pre-definition of arguments, and didn't mind including Log otherwise, CmdParser could be removed entirely from the tina instances which use it, if wanted. (It's now quite a light class without the extra functions needed when using ArgParser).

03/24/08 22:59:31 changed by darrylb

  • attachment ArgumentsExample.d added.

03/24/08 23:00:45 changed by darrylb

Larsivi, I already added the altered ArgParserExample, unless I'm missing something :)

Well, updated it for this revision anyway.

04/07/08 14:02:33 changed by darrylb

  • attachment Arguments.3.d added.

Arguments, -opIn, +contains, sed "s/\t/ /g"

04/07/08 17:09:19 changed by larsivi

(In [3415]) Close to final version of Arguments, the ArgParser replacement. refs #748, thanks a bunch darrylb!

04/27/08 05:08:48 changed by larsivi

  • milestone changed from 0.99.6 to 0.99.7.

05/07/08 13:45:56 changed by larsivi

  • owner changed from larsivi to kris.
  • status changed from reopened to new.

Kris wants to review it prior to finalizing.

05/28/08 01:28:29 changed by kris

  • status changed from new to assigned.

06/12/08 10:17:28 changed by kris

  • milestone changed from 0.99.7 to 0.99.8.

ran out of time :(

07/05/08 07:55:44 changed by larsivi

There's yet another discussion in the NG with high discussion value. However, it exposed this use case, which I'm not able to implement.

ssh -v foo@somehost ls -al

where ls -al is the optional command to be executed at the host. This part should not be parsed, but passed wholesale or in pieces to the function executing the command.

I think the best solution would be to have all these as implicit at the end, and available as ["ls", "-al"] or very optionally but probably not workable for the executor, ["ls", "a", "l"]. What happens now is that the -al part seems to be just ignored, the implicit list is empty after "ls". I think the host part is required in ssh.

07/05/08 09:52:41 changed by larsivi

Also there was a mention of -- in the command line causing whatever follows to be ignored.

In general, nothing should be silently ignored without the user having an option to do something about it. In particular, I think that the delegate for handling unknown/unhandled arguments/parameters needs to be implemented before this is ready.
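A minimal sketch of that principle, in present-day D with Phobos only: tokens after a literal -- are not dropped but handed, untouched, to a caller-supplied delegate. The function parseUntilDoubleDash and both delegate parameters are hypothetical, and the command line used is a variant of the ssh example with an explicit -- added for illustration.

```d
import std.stdio;

/// Hypothetical front end: tokens before "--" go to onToken one by one;
/// everything after "--" is passed wholesale to onUnparsed.
void parseUntilDoubleDash(string[] tokens,
                          void delegate(string token) onToken,
                          void delegate(string[] rest) onUnparsed)
{
    foreach (i, tok; tokens)
    {
        if (tok == "--")
        {
            // Nothing is silently ignored: the caller decides what to do.
            onUnparsed(tokens[i + 1 .. $]);
            return;
        }
        onToken(tok);
    }
}

void main()
{
    string[] seen;
    string[] command;

    parseUntilDoubleDash(["-v", "foo@somehost", "--", "ls", "-al"],
                         (string t) { seen ~= t; },
                         (string[] rest) { command = rest; });

    assert(seen == ["-v", "foo@somehost"]);
    assert(command == ["ls", "-al"]); // "-al" arrives intact, not split or lost
    writeln(seen, " | ", command);
}
```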