The Developer's Library for D

Quotes!(char) doesn't escape well or work correctly with delimiters.


Posted: 12/04/09 01:58:40 Modified: 12/04/09 02:12:45

When using delimiters with Quotes, the token ends right after the closing quote, turning [1,"2",3] into [1,2,,3]. Additionally, Quotes does not understand escape sequences.

This is unacceptable for standard uses such as CSV files, mixed-quote input, or even simple quoted, delimited data.

So I took a look at the Quotes class and made one that can strip quotes, understands escape sequences, and knows not to define empty elements after a quote and before a comma.

The class:

class QuotedDeliminators(T) : Iterator!(T){
	T[] delim;
	T quote_type; // 0 while outside quotes, otherwise the opening quote character

	int escape; // escape counter
	bool strip_quotes;

	this (T[] delim, InputStream stream = null, bool strip_quotes = true){
		super (stream);
		this.delim = delim;

		escape = 0;
		quote_type = 0;
		// The parameter shadows the member, so it has to be qualified with this.
		this.strip_quotes = strip_quotes;
	}

	/***********************************************************************
		Scan for the next delimiter that lies outside a quoted run.
	***********************************************************************/

	protected size_t scan (void[] data){
		auto content = (cast(T*) data.ptr) [0 .. data.length / T.sizeof];
		bool quoted = false;

		foreach (int i, T c; content){
			if(quote_type == 0){
				if (c is '"' || c is '\''){
					quote_type = c;
					quoted = true; // For strip_quotes
				}else if (has (delim, c)){
					if(quoted && strip_quotes){
						/**
						 * We need to strip the last, which
						 * is not supported.
						 *
						 * This is ripped and modified
						 * from set, since set is final.
						 *
						 * Assumes the element starts and ends
						 * with a quote character. The delim
						 * member is left untouched so the full
						 * delimiter set stays in effect.
						 */
						slice = content [1 .. i-1];
						return found (i);
					}else{
						return found (set (content.ptr, 0, i));
					}
				}
			}else{
				if(c is '\\'){
					escape++;
				}else{
					// A quote only closes after an even number of backslashes.
					if(c is quote_type && escape % 2 == 0){
						quote_type = 0;
					}
					escape = 0;
				}
			}
		}

		return notFound;
	}
}
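
For anyone who wants to try the scanning rule without Tango, here is a minimal standalone sketch of the same idea in plain D2. It is not the class above and not Tango API: `splitQuoted` is a hypothetical helper name, it works on a whole string rather than a stream, and it always keeps the quotes (like passing false for strip_quotes).

```d
// Standalone sketch (D2, no Tango): split on a delimiter, ignoring
// delimiters that appear inside single- or double-quoted runs.
// splitQuoted is a hypothetical helper, not part of any library.
string[] splitQuoted(string input, char delim = ',')
{
    string[] fields;
    char quote = 0;     // 0 while outside quotes, else the opening quote char
    int escapes = 0;    // consecutive backslashes seen just before this char
    size_t start = 0;   // index where the current field begins

    foreach (i, char c; input)
    {
        if (quote == 0)
        {
            if (c == '"' || c == '\'')
                quote = c;
            else if (c == delim)
            {
                fields ~= input[start .. i];
                start = i + 1;
            }
        }
        else if (c == '\\')
            ++escapes;
        else
        {
            // A quote closes only after an even number of backslashes.
            if (c == quote && escapes % 2 == 0)
                quote = 0;
            escapes = 0;
        }
    }
    fields ~= input[start .. $];    // trailing field (may be empty)
    return fields;
}
```

With this sketch, splitQuoted(`1,"2",3`) gives the three elements 1, "2" and 3, with no empty element after the closing quote.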

The test case:

Stdout("Works well enough:").newline;
auto qd = new QuotedDeliminators!(char)(",",b,false);
b.setContent("0,,,\"3\",\"\",5,\",6\",\"7,\",8,\"9,\\\\\\\",\",10,',11\",',12");
foreach(i,f;qd){
	Stdout(i,f).newline;
}
Stdout("Broken:").newline;
auto q = new Quotes!(char)(",",b);
b.setContent("0,,,\"3\",\"\",5,\",6\",\"7,\",8,\"9,\\\\\\\",\",10,',11\",',12");
foreach(i,f;q){
	Stdout(i,f).newline;
}

Output:

Works well enough:
0, 0
1, 
2, 
3, "3"
4, ""
5, 5
6, ",6"
7, "7,"
8, 8
9, "9,\\\","
10, 10
11, ',11",'
12, 12
Broken:
0, 0
1, 
2, 
3, 3
4, 
5, 
6, 
7, 5
8, ,6
9, 
10, 7,
11, 
12, 8
13, 9,\\\
14, 
15, ,10,',11
16, 
17, ',12

Posted: 12/09/09 21:46:53

That's useful ... wanna create a ticket for this?

thanks :)

Posted: 12/14/09 06:23:20

Sure thing.