root/trunk/docsrc/regular-expression.dd

Revision 2040, 4.4 kB (checked in by walter, 2 years ago)

Ddoc

$(D_S Regular Expressions,

    $(P Regular expressions are a powerful tool for
    pattern matching on strings of text. They
    are built in to the core of languages like Perl,
    Ruby, and JavaScript. Perl and Ruby are particularly
    renowned for adroitly handling regular expressions.
    So why aren't they part of the D core language?
    Read on and see how they're done in D compared with Ruby.
    )

    $(P This article explains how to use regular expressions
    in D. It doesn't explain regular expressions themselves;
    after all, people have written entire books on that topic.
    D's specific implementation of regular expressions
    is entirely contained in the Phobos library module
    $(LINK2 phobos/std_regexp.html, std.regexp).
    For a more advanced treatment of using regular expressions
    in conjunction with template metaprogramming, see
    $(LINK2 templates-revisited.html, Templates Revisited).
    )

    $(P In Ruby a regular expression can be created
    as a special literal:
    )

$(RUBY
r = /pattern/
s = /p[1-5]\s*/
)

    $(P D doesn't have special literals for them, but they can
    be created from strings:)

---
RegExp r = RegExp("pattern");
RegExp s = RegExp(r"p[1-5]\s*");
---

    $(P If the $(I pattern) contains backslash characters,
    use a wysiwyg string literal, which has the $(SINGLEQUOTE r) prefix,
    so that each backslash is passed through to the pattern unchanged.
    $(I r) and $(I s) are of type $(B RegExp), but
    we can use type inference to declare and initialize them:
    )

---
auto r = RegExp("pattern");
auto s = RegExp(r"p[1-5]\s*");
---
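
    $(P As a minimal sketch of what the $(SINGLEQUOTE r) prefix changes,
    the following compares a wysiwyg literal with the equivalent ordinary
    literal, in which the backslash has to be doubled. The variable names
    are illustrative only, and no regular expression functions are involved:)

```d
import std.stdio;

void main()
{
    // Wysiwyg literal: the backslash is kept exactly as typed.
    string wysiwyg = r"p[1-5]\s*";

    // Ordinary literal: the backslash must be escaped to survive.
    string escaped = "p[1-5]\\s*";

    // Both spell the same nine-character pattern.
    assert(wysiwyg == escaped);
    assert(wysiwyg.length == 9);

    writefln("%s", wysiwyg);   // writes p[1-5]\s*
}
```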

    $(P To check for a match of a string $(I s) with a regular expression
    in Ruby, use the =~ operator, which returns the index of the
    first match:)

$(RUBY
s = "abcabcabab"
s =~ /b/   # match, returns 1
s =~ /f/   # no match, returns nil
)

    $(P In D this looks like:
    )

---
auto s = "abcabcabab";
std.regexp.find(s, "b");    /* match, returns 1 */
std.regexp.find(s, "f");    /* no match, returns -1 */
---

    $(P This parallels std.string.find, which searches for
    substring matches rather than regular expression matches.)
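
    $(P The difference between the two functions shows up as soon as the
    search string contains a regular expression metacharacter. The sketch
    below assumes the same legacy Phobos release as the rest of this
    article, one that still provides both std.string.find and std.regexp.
    The pattern $(TT a.c) is a hypothetical example chosen for illustration:)

```d
import std.stdio;
import std.string;
import std.regexp;

void main()
{
    auto s = "abcabcabab";

    // Substring search: "a.c" is taken literally and never occurs in s.
    writefln("%s", std.string.find(s, "a.c"));   // writes -1

    // Regular expression search: "." matches any character,
    // so the leading "abc" matches at index 0.
    writefln("%s", std.regexp.find(s, "a.c"));   // writes 0
}
```

    $(P Both modules define a function named $(TT find), so the calls are
    fully qualified to make clear which one is meant.)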

    $(P The Ruby =~ operator sets some implicitly defined variables
    based on the result:)

$(RUBY
s = "abcdef"
if s =~ /c/
    "#{$`}[#{$&}]#{$'}"   # generates string ab[c]def
end
)

    $(P In D, the function std.regexp.search() returns a RegExp object
    describing the match, whose members provide the text before the
    match, the match itself, and the text after it:
    )

---
auto m = std.regexp.search("abcdef", "c");
if (m)
    writefln("%s[%s]%s", m.pre, m.match(0), m.post);
---

    $(P Or even more concisely as:
    )

---
if (auto m = std.regexp.search("abcdef", "c"))
    writefln("%s[%s]%s", m.pre, m.match(0), m.post); // writes ab[c]def
---

<h2>Search and Replace</h2>

    $(P Search and replace gets more interesting. To replace
    occurrences of "a" with "ZZ" in Ruby, first only the first
    occurrence, then all of them:
    )

$(RUBY
s = "Strap a rocket engine on a chicken."
s.sub(/a/, "ZZ")  # result: StrZZp a rocket engine on a chicken.
s.gsub(/a/, "ZZ") # result: StrZZp ZZ rocket engine on ZZ chicken.
)

    $(P In D:)

---
auto s = "Strap a rocket engine on a chicken.";
sub(s, "a", "ZZ");        // result: StrZZp a rocket engine on a chicken.
sub(s, "a", "ZZ", "g");   // result: StrZZp ZZ rocket engine on ZZ chicken.
---

    $(P The replacement string can reference the matches using
    the $&amp;, $$, $', $`, $0 .. $99 notation:)

---
sub(s, "[ar]", "[$&]", "g"); // result: St[r][a]p [a] [r]ocket engine on [a] chicken.
---
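
    $(P The other references work the same way. The following is a small
    sketch, assuming the same std.regexp module used throughout this
    article. The input strings and the angle-bracket formatting are
    hypothetical, picked only to make each piece visible:)

```d
import std.stdio;
import std.regexp;

void main()
{
    // $` is the text before the match, $& is the match itself,
    // and $' is the text after the match.
    writefln("%s", sub("abcdef", "c", "<$`|$&|$'>"));
    // writes ab<ab|c|def>def

    // $1 and $2 refer to the first and second parenthesized groups.
    writefln("%s", sub("John Smith", r"(\w+) (\w+)", "$2, $1"));
    // writes Smith, John
}
```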

    $(P Or the replacement string can be provided by a delegate:)

---
sub(s, "[ar]",
   (RegExp m) { return toupper(m.match(0)); },
   "g");    // result: StRAp A Rocket engine on A chicken.
---

($(TT toupper()) comes from $(LINK2 phobos/std_string.html, std.string).)

<h2>Looping</h2>

    $(P It's possible to iterate over all matches within
    a string:)

---
import std.stdio;
import std.regexp;

void main()
{
    foreach (m; RegExp("ab").search("abcabcabab"))
    {
        writefln("%s[%s]%s", m.pre, m.match(0), m.post);
    }
}
// Prints:
// [ab]cabcabab
// abc[ab]cabab
// abcabc[ab]ab
// abcabcab[ab]
---

<h2>Conclusion</h2>

    $(P D regular expression handling is as powerful as Ruby's, but
    its syntax isn't as concise. D leaves out two Ruby conveniences,
    each for a reason:)

    $(UL

    $(LI Regular expression literal syntax - supporting it would
    make it impossible to perform lexical analysis without also
    doing syntactic or semantic analysis.)

    $(LI Implicit naming of match variables - this causes problems
    with name collisions, and just doesn't
    fit with the rest of the way D works.)

    )
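
    $(P The first point can be illustrated with a short sketch. The
    variable names $(TT a), $(TT b), and $(TT g) are hypothetical, chosen so
    that the expression resembles a regular expression literal with a
    flag, as in JavaScript's $(TT /b/g). In D the lexer never has to decide,
    because a lone slash is always the division operator:)

```d
import std.stdio;

void main()
{
    int a = 100, b = 5, g = 2;

    // With regular expression literals in the language, the lexer could
    // not tell from the characters alone whether "/ b / g" is a literal
    // or arithmetic. In D it is always two divisions: (a / b) / g.
    int x = a / b / g;
    assert(x == 10);

    writefln("%s", x);   // writes 10
}
```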

    $(P But it is just as powerful.
    )
)
Macros:
    TITLE=Regular Expressions
    WIKI=RegularExpression
    RUBY=$(CCODE $0)