Ddoc

$(SPEC_S Lexical,

In D, the lexical analysis is independent of the syntax parsing and the
semantic analysis. The lexical analyzer splits the source text up into
tokens. The lexical grammar describes what those tokens are. The D
lexical grammar is designed to be suitable for high speed scanning: it
has a minimum of special case rules, there is only one phase of
translation, and it is easy to write a correct scanner for it.
The tokens are readily recognizable by those familiar with C and
C++.

<h3>Phases of Compilation</h3>

The process of compiling is divided into multiple phases. Each phase
has no dependence on subsequent phases. For example, the scanner is
not perturbed by the semantic analyzer. This separation of the passes
makes language tools like syntax-directed editors relatively easy to
produce.
It is also possible to compress D source by storing it in
$(SINGLEQUOTE tokenized) form.

$(OL
    $(LI $(B source character set)$(BR)

    The source file is checked to determine its character set,
    and the appropriate scanner is loaded. ASCII and UTF
    formats are accepted.
    )

    $(LI $(B script line) $(BR)

    If the first line starts with $(GREEN #!) then the first line
    is ignored.
    )

    $(LI $(B lexical analysis)$(BR)

    The source file is divided up into a sequence of tokens.
    $(LINK2 #specialtokens, Special tokens) are replaced with other tokens.
    $(LINK2 #specialtokenseq, Special token sequences)
    are processed and removed.
    )

    $(LI $(B syntax analysis)$(BR)

    The sequence of tokens is parsed to form syntax trees.
    )

    $(LI $(B semantic analysis)$(BR)

    The syntax trees are traversed to declare variables, load symbol tables, assign
    types, and in general determine the meaning of the program.
    )

    $(LI $(B optimization)$(BR)

    Optimization is an optional pass that tries to rewrite the program
    in a semantically equivalent, but faster executing, version.
    )

    $(LI $(B code generation)$(BR)

    Instructions are selected from the target architecture to implement
    the semantics of the program. The typical result will be
    an object file, suitable for input to a linker.
    )
)


<h3>Source Text</h3>

D source text can be in one of the following formats:

$(UL
    $(LI ASCII)
    $(LI UTF-8)
    $(LI UTF-16BE)
    $(LI UTF-16LE)
    $(LI UTF-32BE)
    $(LI UTF-32LE)
)

UTF-8 is a superset of traditional 7-bit ASCII.
One of the
following UTF BOMs (Byte Order Marks) can be present at the beginning
of the source text:
<p>

$(TABLE2 UTF Byte Order Marks,
    $(TR
    $(TH Format)
    $(TH BOM)
    )
    $(TR
    $(TD UTF-8)
    $(TD EF BB BF)
    )
    $(TR
    $(TD UTF-16BE)
    $(TD FE FF)
    )
    $(TR
    $(TD UTF-16LE)
    $(TD FF FE)
    )
    $(TR
    $(TD UTF-32BE)
    $(TD 00 00 FE FF)
    )
    $(TR
    $(TD UTF-32LE)
    $(TD FF FE 00 00)
    )
    $(TR
    $(TD ASCII)
    $(TD no BOM)
    )
)

$(P If the source file does not start with a BOM, then the first
character must be less than or equal to U+0000007F.)

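As a rough illustration of the BOM table, the following sketch classifies a byte buffer by its leading bytes. The names SourceFormat and detectFormat are hypothetical and are not taken from any compiler. The one design point it relies on is the order of the checks: the UTF-32LE mark begins with the same two bytes as the UTF-16LE mark, so the longer mark is tested first.

```d
// Hypothetical names: SourceFormat and detectFormat are illustrative only.
enum SourceFormat { noBom, utf8, utf16be, utf16le, utf32be, utf32le }

SourceFormat detectFormat(const(ubyte)[] src)
{
    // True if src begins with the given byte sequence.
    bool startsWith(const(ubyte)[] bom)
    {
        return src.length >= bom.length && src[0 .. bom.length] == bom;
    }

    static immutable ubyte[] bomUtf8    = [0xEF, 0xBB, 0xBF];
    static immutable ubyte[] bomUtf16be = [0xFE, 0xFF];
    static immutable ubyte[] bomUtf16le = [0xFF, 0xFE];
    static immutable ubyte[] bomUtf32be = [0x00, 0x00, 0xFE, 0xFF];
    static immutable ubyte[] bomUtf32le = [0xFF, 0xFE, 0x00, 0x00];

    // Test the four-byte marks before the two-byte marks they overlap with.
    if (startsWith(bomUtf32le)) return SourceFormat.utf32le;
    if (startsWith(bomUtf32be)) return SourceFormat.utf32be;
    if (startsWith(bomUtf8))    return SourceFormat.utf8;
    if (startsWith(bomUtf16be)) return SourceFormat.utf16be;
    if (startsWith(bomUtf16le)) return SourceFormat.utf16le;
    return SourceFormat.noBom;
}
```

A buffer reported as noBom would then fall under the rule that its first character must not exceed U+0000007F.
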
$(P There are no digraphs or trigraphs in D.)

$(P The source text is decoded from its source representation
into Unicode $(I Character)s.
The $(I Character)s are further divided into:

$(LINK2 #whitespace, white space),
$(LINK2 #endofline, end of lines),
$(LINK2 #comment, comments),
$(LINK2 #specialtokenseq, special token sequences),
$(LINK2 #tokens, tokens),
all followed by $(LINK2 #eof, end of file).
)

$(P The source text is split into tokens using the maximal munch
technique, i.e., the
lexical analyzer tries to make the longest token it can. For example
<code>>></code> is a right shift token,
not two greater than tokens. An exception to this rule is that a ..
embedded inside what looks like two floating point literals, as in
1..2, is interpreted as if the .. were separated by a space from the
first integer.
)

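The maximal munch rule and its .. exception can be observed directly in ordinary D code. The sketch below is illustrative only; the function names are hypothetical and exist just to give each case a checkable result.

```d
// `>>` is lexed as one right shift token, not as two `>` tokens.
int shiftRight()
{
    return 16 >> 2;
}

// `x+++y` is lexed as `x++ + y`: the longest possible token, `++`, is taken first.
int plusPlusPlus()
{
    int x = 1, y = 2;
    return x+++y;
}

// The exception: `1..3` is lexed as `1`, `..`, `3`,
// not as the floating point literals `1.` and `.3`.
int[] slice()
{
    int[] a = [10, 20, 30, 40];
    return a[1..3];
}
```
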
<h3>$(LNAME2 eof, End of File)</h3>

$(GRAMMAR
$(I EndOfFile):
    $(I physical end of the file)
    \u0000
    \u001A
)

The source text is terminated by whichever comes first.

<h3>$(LNAME2 endofline, End of Line)</h3>

$(GRAMMAR
$(I EndOfLine):
    \u000D
    \u000A
    \u000D \u000A
    $(I EndOfFile)
)

There is no backslash line splicing, nor are there any limits
on the length of a line.

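To make the EndOfFile and EndOfLine alternatives concrete, here is a minimal sketch of a line splitter that follows them. The helper name splitSourceLines is hypothetical, and the function works on already decoded text for simplicity; a real scanner would not normally build an array of lines.

```d
// Hypothetical helper: splits source text at each EndOfLine and stops at the
// first EndOfFile character, whichever kind comes first.
string[] splitSourceLines(string src)
{
    string[] lines;
    size_t start = 0;
    size_t i = 0;
    while (i < src.length)
    {
        immutable c = src[i];
        if (c == '\0' || c == '\x1A')   // \u0000 or \u001A ends the source text
            break;
        if (c == '\r' || c == '\n')
        {
            lines ~= src[start .. i];
            // \u000D \u000A together count as a single EndOfLine
            if (c == '\r' && i + 1 < src.length && src[i + 1] == '\n')
                ++i;
            start = i + 1;
        }
        ++i;
    }
    // The physical end of the file also terminates the last line.
    if (start < i)
        lines ~= src[start .. i];
    return lines;
}
```

Because there is no backslash line splicing, a trailing backslash needs no special handling here.
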
<h3>$(LNAME2 whitespace, White Space)</h3>

$(GRAMMAR
$(I WhiteSpace):
    $(I Space)
    $(I Space) $(I WhiteSpace)

$(I Space):
    \u0020
    \u0009
    \u000B
    \u000C
)


<h3>$(LNAME2 comment, Comments)</h3>

$(GRAMMAR
$(I Comment):
    $(B /*) $(I Characters) $(B */)
    $(B //) $(I Characters) $(I EndOfLine)
    $(I NestingBlockComment)

$(I Characters):
    $(I Character)
    $(I Character) $(I Characters)

$(I NestingBlockComment):
    $(B /+) $(I NestingBlockCommentCharacters) $(B +/)

$(I NestingBlockCommentCharacters):
    $(I NestingBlockCommentCharacter)
    $(I NestingBlockCommentCharacter) $(I NestingBlockCommentCharacters)

$(I NestingBlockCommentCharacter):
    $(I Character)
    $(I NestingBlockComment)
)

D has three kinds of comments:
$(OL
    $(LI Block comments can span multiple lines, but do not nest.)
    $(LI Line comments terminate at the end of the line.)
    $(LI Nesting comments can span multiple lines and can nest.)
)

$(P
The contents of strings and comments are not tokenized. Consequently,
comment openings occurring within a string do not begin a comment, and
string delimiters within a comment do not affect the recognition of
comment closings and nested "/+" comment openings. With the exception
of "/+" occurring within a "/+" comment, comment openings within a
comment are ignored.
)

-------------
a = /+ // +/ 1;       // parses as if 'a = 1;'
a = /+ "+/" +/ 1";    // parses as if 'a = " +/ 1";'
a = /+ /* +/ */ 3;    // parses as if 'a = */ 3;'
-------------

Comments cannot be used as token concatenators, for example,
<code>abc/**/def</code> is two tokens, $(TT abc) and $(TT def),
not one $(TT abcdef) token.

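The three comment kinds can be compared side by side in a small compilable sketch. The function names are hypothetical; the point is only that each declaration survives the comment next to it.

```d
int commentKinds()
{
    int a = 1; /* a block comment: this /* does not open a second level */
    /+ a nesting comment
       /+ the inner level closes here: +/
       so this line is still inside the outer comment
    +/
    int b = 2; // a line comment ends at the end of the line
    return a + b;
}

// A comment opening inside a string literal is just part of the string.
string notAComment()
{
    return "/* still text */";
}
```
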
<h3>$(LNAME2 tokens, Tokens)</h3>

$(GRAMMAR
$(I Token):
    $(LINK2 #identifier, $(I Identifier))
    $(LINK2 #StringLiteral, $(I StringLiteral))
    $(LINK2 #characterliteral, $(I CharacterLiteral))
    $(LINK2 #integerliteral, $(I IntegerLiteral))
    $(LINK2 #floatliteral, $(I FloatLiteral))
    $(LINK2 #keyword, $(I Keyword))
    $(B /)
    $(B /=)
    $(B .)
    $(B ..)
    $(B ...)
    $(B &)
    $(B &=)
    $(B &&)
    $(B |)
    $(B |=)
    $(B ||)
    $(B -)
    $(B -=)
    $(B --)
    $(B +)
    $(B +=)
    $(B ++)
    $(B <)
    $(B <=)
    $(B <<)
    $(B <<=)
    $(B <>)
    $(B <>=)
    $(B >)
    $(B >=)
    $(B >>=)
    $(B >>>=)
    $(B >>)
    $(B >>>)
    $(B !)
    $(B !=)
    $(B !<>)
    $(B !<>=)
    $(B !<)
    $(B !<=)
    $(B !>)
    $(B !>=)
    $(B $(LPAREN))
    $(B $(RPAREN))
    $(B [)
    $(B ])
    $(B {)
    $(B })
    $(B ?)
    $(B ,)
    $(B ;)
    $(B :)
    $(B $)
    $(B =)
    $(B ==)
    $(B *)
    $(B *=)
    $(B %)
    $(B %=)
    $(B ^)
    $(B ^=)
    $(B ~)
    $(B ~=)
    $(V2 $(B @))
)

<h3>$(LNAME2 identifier, Identifiers)</h3>

$(GRAMMAR
$(I Identifier):
    $(I IdentifierStart)
    $(I IdentifierStart) $(I IdentifierChars)

$(I IdentifierChars):
    $(I IdentifierChar)
    $(I IdentifierChar) $(I IdentifierChars)

$(I IdentifierStart):
    $(B _)
    $(I Letter)
    $(I UniversalAlpha)

$(I IdentifierChar):
    $(I IdentifierStart)
    $(B 0)
    $(I NonZeroDigit)
)


Identifiers start with a letter, $(B _), or universal alpha,
and are followed by any number
of letters, $(B _), digits, or universal alphas.
Universal alphas are as defined in ISO/IEC 9899:1999(E) Appendix D.
(This is the C99 Standard.)
Identifiers can be arbitrarily long, and are case sensitive.
Identifiers starting with $(B __) (two underscores) are reserved.

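The identifier grammar translates almost line for line into a checking function. The sketch below is restricted to the ASCII subset: it ignores universal alphas and does not exclude keywords, so it is an approximation, not a complete implementation. The helper names isAsciiIdentifier and isReservedName are hypothetical.

```d
import std.ascii : isAlpha, isDigit;

// Hypothetical helper covering only the ASCII part of the Identifier grammar.
bool isAsciiIdentifier(string s)
{
    if (s.length == 0)
        return false;
    // IdentifierStart: `_` or a letter (universal alphas are not handled here)
    if (!(s[0] == '_' || isAlpha(s[0])))
        return false;
    // IdentifierChar: an IdentifierStart or a digit
    foreach (c; s[1 .. $])
    {
        if (!(c == '_' || isAlpha(c) || isDigit(c)))
            return false;
    }
    return true;
}

// Identifiers beginning with two underscores are reserved.
bool isReservedName(string s)
{
    return s.length >= 2 && s[0 .. 2] == "__";
}
```
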
<h3>$(LNAME2 StringLiteral, String Literals)</h3>

$(GRAMMAR
$(I StringLiteral):
    $(I WysiwygString)
    $(I AlternateWysiwygString)
    $(I DoubleQuotedString)
$(V1
    $(I EscapeSequence))
    $(I HexString)
$(V2
    $(I DelimitedString)
    $(I TokenString))

$(I WysiwygString):
    $(B r") $(I WysiwygCharacters) $(B ") $(I Postfix<sub>opt</sub>)

$(I AlternateWysiwygString):
    $(B `) $(I WysiwygCharacters) $(B `) $(I Postfix<sub>opt</sub>)

$(I WysiwygCharacters):
    $(I WysiwygCharacter)
    $(I WysiwygCharacter) $(I WysiwygCharacters)

$(I WysiwygCharacter):
    $(I Character)
    $(I EndOfLine)

$(I DoubleQuotedString):
    $(B ") $(I DoubleQuotedCharacters) $(B ") $(I Postfix<sub>opt</sub>)

$(I DoubleQuotedCharacters):
    $(I DoubleQuotedCharacter)
    $(I DoubleQuotedCharacter) $(I DoubleQuotedCharacters)

$(I DoubleQuotedCharacter):
    $(I Character)
    $(I EscapeSequence)
    $(I EndOfLine)

$(LNAME2 EscapeSequence, $(I EscapeSequence)):
    $(B \')
    $(B \")
    $(B \?)
    $(B \\)
    $(B \a)
    $(B \b)
    $(B \f)
    $(B \n)
    $(B \r)
    $(B \t)
    $(B \v)
    $(B \) $(I EndOfFile)
    $(B \x) $(I HexDigit) $(I HexDigit)
    $(B \) $(I OctalDigit)
    $(B \) $(I OctalDigit) $(I OctalDigit)
    $(B \) $(I OctalDigit) $(I OctalDigit) $(I OctalDigit)
    $(B \u) $(I HexDigit) $(I HexDigit) $(I HexDigit) $(I HexDigit)
    $(B \U) $(I HexDigit) $(I HexDigit) $(I HexDigit) $(I HexDigit) $(I HexDigit) $(I HexDigit) $(I HexDigit) $(I HexDigit)
    $(B \&) $(LINK2 entity.html, $(I NamedCharacterEntity)) $(B ;)

$(I HexString):
    $(B x") $(I HexStringChars) $(B ") $(I Postfix<sub>opt</sub>)

$(I HexStringChars):
    $(I HexStringChar)
    $(I HexStringChar) $(I HexStringChars)

$(I HexStringChar):
    $(I HexDigit)
    $(I WhiteSpace)
    $(I EndOfLine)

$(I Postfix):
    $(B c)
    $(B w)
    $(B d)

$(V2
$(I DelimitedString):
    $(B q") $(I Delimiter) $(I WysiwygCharacters) $(I MatchingDelimiter) $(B ")

$(I TokenString):
    $(B q{) $(I Tokens) $(B })
)
)

$(P
A string literal is either a double quoted string, a wysiwyg quoted
string, an escape sequence,
$(V2 a delimited string, a token string,)
or a hex string.
)

<h4>Wysiwyg Strings</h4>

$(P
Wysiwyg quoted strings are enclosed by r" and ".
All characters between
the r" and " are part of the string except for $(I EndOfLine) which is
regarded as a single \n character.
There are no escape sequences inside r" ":
)

---------------
r"hello"
r"c:\root\foo.exe"
r"ab\n"         // string is 4 characters, 'a', 'b', '\', 'n'
---------------

$(P
An alternate form of wysiwyg string is enclosed by backquotes,
the ` character. The ` character is not available on some keyboards
and the font rendering of it is sometimes indistinguishable from
the regular ' character. Since, however, the ` is rarely used,
it is useful to delineate strings with " in them.
)

---------------
`hello`
`c:\root\foo.exe`
`ab\n`          // string is 4 characters, 'a', 'b', '\', 'n'
---------------

<h4>Double Quoted Strings</h4>

Double quoted strings are enclosed by "". Escape sequences can be
embedded into them with the typical \ notation.
$(I EndOfLine) is regarded as a single \n character.

---------------
"hello"
"c:\\root\\foo.exe"
"ab\n"          // string is 3 characters, 'a', 'b', and a linefeed
"ab
"               // string is 3 characters, 'a', 'b', and a linefeed
---------------

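The difference between wysiwyg and double quoted strings can be checked at compile time. This is a small sketch using only literals from the examples in this section, plus one backquoted string containing double quotes as an added illustration.

```d
// Wysiwyg forms keep the backslash and the 'n' as two separate characters.
static assert(r"ab\n".length == 4);
static assert(`ab\n`.length == 4);

// In a double quoted string, \n is one escape sequence: a single linefeed.
static assert("ab\n".length == 3);

// The same path written both ways.
static assert(r"c:\root\foo.exe" == "c:\\root\\foo.exe");

// Backquotes are convenient when the text itself contains double quotes.
static assert(`say "hi"` == "say \"hi\"");
```
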
$(V1
<h4>Escape Strings</h4>

$(P Escape strings start with a \ and form an escape character sequence.
Adjacent escape strings are concatenated:
)

<pre>
\n           the linefeed character
\t           the tab character
\"           the double quote character
\012         octal
\x1A         hex
\u1234       wchar character
\U00101234   dchar character
\&reg;       ® dchar character
\r\n         carriage return, line feed
</pre>

$(P Undefined escape sequences are errors.
Although string literals are defined to be composed of
UTF characters, the octal and hex escape sequences allow
the insertion of arbitrary binary data.
\u and \U escape sequences can only be used to insert
valid UTF characters.
)
)

<h4>Hex Strings</h4>

$(P Hex strings allow string literals to be created using hex data.
The hex data need not form valid UTF characters.
)

--------------
x"0A"                 // same as "\x0A"
x"00 FBCD 32FD 0A"    // same as "\x00\xFB\xCD\x32\xFD\x0A"
--------------

Whitespace and newlines are ignored, so the hex data can be
easily formatted.
The number of hex characters must be a multiple of 2.
<p>

Adjacent strings are concatenated with the ~ operator, or by simple
juxtaposition:

--------------
"hello " ~ "world" ~ \n // forms the string 'h','e','l','l','o',' ',
                        // 'w','o','r','l','d',linefeed
--------------

The following are all equivalent:

-----------------
"ab" "c"
r"ab" r"c"
r"a" "bc"
"a" ~ "b" ~ "c"
\x61"bc"
-----------------

The optional $(I Postfix) character gives a specific type
to the string, rather than it being inferred from the context.
This is useful when the type cannot be unambiguously inferred,
such as when overloading based on string type. The types corresponding
to the postfix characters are:
<p>

$(TABLE2 String Literal Postfix Characters,
    $(TR
    $(TH Postfix)
    $(TH Type)
    )
    $(TR
    $(TD $(B c))
    $(TD char[ ])
    )
    $(TR
    $(TD $(B w))
    $(TD wchar[ ])
    )
    $(TR
    $(TD $(B d))
    $(TD dchar[ ])
    )
)

---
"hello"c          // char[]
"hello"w          // wchar[]
"hello"d          // dchar[]
---

$(P String literals are read only. Writes to string literals
cannot always be detected, but cause undefined behavior.)

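The overloading case mentioned above can be sketched as follows. The overload set named kind is hypothetical, and the parameter types use const so that the sketch compiles with a D2 compiler, where literal elements are immutable; that choice is an assumption of this example and not something the table states.

```d
// Hypothetical overload set, one overload per string element type.
string kind(const(char)[] s)  { return "char"; }
string kind(const(wchar)[] s) { return "wchar"; }
string kind(const(dchar)[] s) { return "dchar"; }

// The postfix fixes the element type, and with it the element size in bytes.
static assert(typeof("hello"c[0]).sizeof == 1);
static assert(typeof("hello"w[0]).sizeof == 2);
static assert(typeof("hello"d[0]).sizeof == 4);

// The ~ operator joins string literals into one string.
static assert("hello " ~ "world" == "hello world");
```

Calling kind("hello"w) selects the wchar overload because the postfix leaves no other candidate that accepts the argument.
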
$(V2
<h4>Delimited Strings</h4>

$(P Delimited strings use various forms of delimiters.
The delimiter, whether a character or identifier,
must immediately follow the " without any intervening whitespace.
The terminating delimiter must immediately precede the closing "
without any intervening whitespace.
A $(I nesting delimiter) nests, and is one of the
following characters:
)

$(TABLE2 Nesting Delimiters,
    $(TR
    $(TH Delimiter)
    $(TH Matching Delimiter)
    )
    $(TR
    $(TD [)
    $(TD ])
    )
    $(TR
    $(TD $(LPAREN))
    $(TD $(RPAREN))
    )
    $(TR
    $(TD <)
    $(TD >)
    )
    $(TR
    $(TD {)
    $(TD })
    )
)

---
q"(foo(xxx))"   // "foo(xxx)"
q"[foo{]"       // "foo{"
---

$(P If the delimiter is an identifier, the identifier must
be immediately followed by a newline, and the matching
delimiter is the same identifier starting at the beginning
of the line:
)
---
writefln(q"EOS
This
is a multi-line
heredoc string
EOS"
);
---
$(P The newline following the opening identifier is not part
of the string, but the last newline before the closing
identifier is part of the string.
)

$(P Otherwise, the matching delimiter is the same as
the delimiter character:)

---
q"/foo]/"       // "foo]"
q"/abc/def/"    // error
---

<h4>Token Strings</h4>

$(P Token strings open with the characters $(B q{) and close with
the token $(B }). In between must be valid D tokens.
The $(B {) and $(B }) tokens nest.
The string is formed of all the characters between the opening
and closing of the token string, including comments.
)

---
q{foo}              // "foo"
q{/*}*/ }           // "/*}*/ "
q{ foo(q{hello}); } // " foo(q{hello}); "
q{ @ }              // error, @ is not a valid D token
q{ __TIME__ }       // " __TIME__ ", i.e. it is not replaced with the time
q{ __EOF__ }        // error, as __EOF__ is not a token, it's end of file
---

)

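The delimited and token string rules can be verified with compile-time assertions. This sketch assumes a D2 compiler, since both forms are D2 features, and the constant name heredoc is hypothetical.

```d
// Nesting delimiters: the inner pair of parentheses is part of the string.
static assert(q"(foo(xxx))" == "foo(xxx)");

// A non-nesting delimiter character is matched by the same character.
static assert(q"/foo]/" == "foo]");

// Identifier delimiter: the newline after the opening EOS is dropped,
// the newline before the closing EOS is kept.
enum heredoc = q"EOS
This
is a multi-line
heredoc string
EOS";
static assert(heredoc == "This\nis a multi-line\nheredoc string\n");

// Token strings nest on braces and keep the text between them verbatim.
static assert(q{ foo(q{hello}); } == " foo(q{hello}); ");
```
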
<h3>$(LNAME2 characterliteral, Character Literals)</h3>

$(GRAMMAR
$(I CharacterLiteral):
    $(B ') $(I SingleQuotedCharacter) $(B ')

$(I SingleQuotedCharacter):
    $(I Character)
    $(I EscapeSequence)
)

Character literals are a single character or escape sequence
enclosed by single quotes, ' '.

<h3>$(LNAME2 integerliteral, Integer Literals)</h3>

$(GRAMMAR
$(GNAME IntegerLiteral):
    $(I Integer)
    $(I Integer) $(I IntegerSuffix)

$(I Integer):
    $(I Decimal)
    $(I Binary)
    $(I Octal)
    $(I Hexadecimal)

$(I IntegerSuffix):
    $(B L)
    $(B u)
    $(B U)
    $(B Lu)
    $(B LU)
    $(B uL)
    $(B UL)

$(GNAME Decimal):
    $(B 0)
    $(I NonZeroDigit)
    $(I NonZeroDigit) $(I DecimalDigits)

$(I Binary):
    $(B 0b) $(I BinaryDigits)
    $(B 0B) $(I BinaryDigits)

$(I Octal):
    $(B 0) $(I OctalDigits)

$(I Hexadecimal):
    $(B 0x) $(I HexDigits)
    $(B 0X) $(I HexDigits)

$(I NonZeroDigit):
    $(B 1)
    $(B 2)
    $(B 3)
    $(B 4)
    $(B 5)
    $(B 6)
    $(B 7)
    $(B 8)
    $(B 9)

$(GNAME DecimalDigits):
    $(I DecimalDigit)
    $(I DecimalDigit) $(I DecimalDigits)

$(GNAME DecimalDigit):
    $(B 0)
    $(I NonZeroDigit)
    $(B _)

$(I BinaryDigits):
    $(I BinaryDigit)
    $(I BinaryDigit) $(I BinaryDigits)

$(I BinaryDigit):
    $(B 0)
    $(B 1)
    $(B _)

$(I OctalDigits):
    $(I OctalDigit)
    $(I OctalDigit) $(I OctalDigits)

$(I OctalDigit):
    $(B 0)
    $(B 1)
    $(B 2)
    $(B 3)
    $(B 4)
    $(B 5)
    $(B 6)
    $(B 7)
    $(B _)

$(I HexDigits):
    $(I HexDigit)
    $(I HexDigit) $(I HexDigits)

$(I HexDigit):
    $(I DecimalDigit)
    $(B a)
    $(B b)
    $(B c)
    $(B d)
    $(B e)
    $(B f)
    $(B A)
    $(B B)
    $(B C)
    $(B D)
    $(B E)
    $(B F)
    $(B _)
)

Integers can be specified in decimal, binary, octal, or hexadecimal.
<p>
Decimal integers are a sequence of decimal digits.
<p>
$(LNAME2 binary-literals, Binary integers) are a sequence of binary digits preceded
by a $(SINGLEQUOTE 0b).
<p>
Octal integers are a sequence of octal digits preceded by a $(SINGLEQUOTE 0).
<p>
Hexadecimal integers are a sequence of hexadecimal digits preceded
by a $(SINGLEQUOTE 0x).
<p>
Integers can have embedded $(SINGLEQUOTE _) characters, which are ignored.
The embedded $(SINGLEQUOTE _) are useful for formatting long literals, such
as using them as a thousands separator:

-------------
123_456         // 123456
1_2_3_4_5_6_    // 123456
-------------

Integers can be immediately followed by one $(SINGLEQUOTE L) or one
$(SINGLEQUOTE u) or both.
<p>
The type of the integer is resolved as follows:
<p>

$(TABLE2 Decimal Literal Types,
    $(TR
    $(TH Decimal Literal)
    $(TH Type)
    )
    $(TR
    $(TD 0 .. 2_147_483_647)
    $(TD int)
    )
    $(TR
    $(TD 2_147_483_648 .. 9_223_372_036_854_775_807)
    $(TD long)
    )
    $(TR
    $(TH Decimal Literal, L Suffix)
    $(TH Type)
    )
    $(TR
    $(TD 0L .. 9_223_372_036_854_775_807L)
    $(TD long)
    )
    $(TR
    $(TH Decimal Literal, U Suffix)
    $(TH Type)
    )
    $(TR
    $(TD 0U .. 4_294_967_295U)
    $(TD uint)
    )
    $(TR
    $(TD 4_294_967_296U .. 18_446_744_073_709_551_615U)
    $(TD ulong)
    )
    $(TR
    $(TH Decimal Literal, UL Suffix)
    $(TH Type)
    )
    $(TR
    $(TD 0UL .. 18_446_744_073_709_551_615UL)
    $(TD ulong)
    )

    $(TR
    $(TH Non-Decimal Literal)
    $(TH Type)
    )
    $(TR
    $(TD 0x0 .. 0x7FFF_FFFF)
    $(TD int)
    )
    $(TR
    $(TD 0x8000_0000 .. 0xFFFF_FFFF)
    $(TD uint)
    )
    $(TR
    $(TD 0x1_0000_0000 .. 0x7FFF_FFFF_FFFF_FFFF)
    $(TD long)
    )
    $(TR
    $(TD 0x8000_0000_0000_0000 .. 0xFFFF_FFFF_FFFF_FFFF)
    $(TD ulong)
    )
    $(TR
    $(TH Non-Decimal Literal, L Suffix)
    $(TH Type)
    )
    $(TR
    $(TD 0x0L .. 0x7FFF_FFFF_FFFF_FFFFL)
    $(TD long)
    )
    $(TR
    $(TD 0x8000_0000_0000_0000L .. 0xFFFF_FFFF_FFFF_FFFFL)
    $(TD ulong)
    )
    $(TR
    $(TH Non-Decimal Literal, U Suffix)
    $(TH Type)
    )
    $(TR
    $(TD 0x0U .. 0xFFFF_FFFFU)
    $(TD uint)
    )
    $(TR
    $(TD 0x1_0000_0000U .. 0xFFFF_FFFF_FFFF_FFFFU)
    $(TD ulong)
    )
    $(TR
    $(TH Non-Decimal Literal, UL Suffix)
    $(TH Type)
    )
    $(TR
    $(TD 0x0UL .. 0xFFFF_FFFF_FFFF_FFFFUL)
    $(TD ulong)
    )

)


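A few rows of the table can be spot-checked with compile-time assertions. This is only a sample of boundary values, written as a sketch for a D2 compiler; it does not exercise every row.

```d
// Unsuffixed decimal literals: int up to 2_147_483_647, long above that.
static assert(is(typeof(2_147_483_647) == int));
static assert(is(typeof(2_147_483_648) == long));

// Unsuffixed non-decimal literals may also resolve to the unsigned types.
static assert(is(typeof(0x7FFF_FFFF) == int));
static assert(is(typeof(0x8000_0000) == uint));
static assert(is(typeof(0x1_0000_0000) == long));
static assert(is(typeof(0x8000_0000_0000_0000) == ulong));

// Suffixes select long and unsigned variants.
static assert(is(typeof(1L) == long));
static assert(is(typeof(1U) == uint));
static assert(is(typeof(4_294_967_296U) == ulong));
static assert(is(typeof(1UL) == ulong));

// Embedded underscores are ignored; binary and hex prefixes set the base.
static assert(123_456 == 123456);
static assert(0b1010 == 10);
static assert(0xFF == 255);
```
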
| 903 |
<h3>$(LNAME2 floatliteral, Floating Literals)</h3> |
|---|
| 904 |
|
|---|
| 905 |
$(GRAMMAR |
|---|
| 906 |
$(GNAME FloatLiteral): |
|---|
| 907 |
$(I Float) |
|---|
| 908 |
$(I Float) $(I Suffix) |
|---|
| 909 |
$(I Integer) $(I ImaginarySuffix) |
|---|
| 910 |
$(I Integer) $(I FloatSuffix) $(I ImaginarySuffix) |
|---|
| 911 |
$(I Integer) $(I RealSuffix) $(I ImaginarySuffix) |
|---|
| 912 |
|
|---|
| 913 |
$(I Float): |
|---|
| 914 |
$(I DecimalFloat) |
|---|
| 915 |
$(I HexFloat) |
|---|
| 916 |
|
|---|
| 917 |
$(I DecimalFloat): |
|---|
| 918 |
$(GLINK LeadingDecimal) $(B .) |
|---|
| 919 |
$(GLINK LeadingDecimal) $(B .) $(I DecimalDigits) |
|---|
| 920 |
$(I DecimalDigits) $(B .) $(I DecimalDigits) $(I DecimalExponent) |
|---|
| 921 |
$(B .) $(I Decimal) |
|---|
| 922 |
$(B .) $(I Decimal) $(I DecimalExponent) |
|---|
| 923 |
$(GLINK LeadingDecimal) $(I DecimalExponent) |
|---|
| 924 |
|
|---|
| 925 |
$(I DecimalExponent): |
|---|
| 926 |
$(B e) $(I DecimalDigits) |
|---|
| 927 |
$(B E) $(I DecimalDigits) |
|---|
| 928 |
$(B e+) $(I DecimalDigits) |
|---|
| 929 |
$(B E+) $(I DecimalDigits) |
|---|
| 930 |
$(B e-) $(I DecimalDigits) |
|---|
| 931 |
$(B E-) $(I DecimalDigits) |
|---|
| 932 |
|
|---|
| 933 |
$(I HexFloat): |
|---|
| 934 |
$(I HexPrefix) $(I HexDigits) $(B .) $(I HexDigits) $(I HexExponent) |
|---|
| 935 |
$(I HexPrefix) $(B .) $(I HexDigits) $(I HexExponent) |
|---|
| 936 |
$(I HexPrefix) $(I HexDigits) $(I HexExponent) |
|---|
| 937 |
|
|---|
| 938 |
$(I HexPrefix): |
|---|
| 939 |
$(B 0x) |
|---|
| 940 |
$(B 0X) |
|---|
| 941 |
|
|---|
| 942 |
$(I HexExponent): |
|---|
| 943 |
$(B p) $(I DecimalDigits) |
|---|
| 944 |
$(B P) $(I DecimalDigits) |
|---|
| 945 |
$(B p+) $(I DecimalDigits) |
|---|
| 946 |
$(B P+) $(I DecimalDigits) |
|---|
| 947 |
$(B p-) $(I DecimalDigits) |
|---|
| 948 |
$(B P-) $(I DecimalDigits) |
|---|
| 949 |
|
|---|
| 950 |
$(I Suffix): |
|---|
| 951 |
$(I FloatSuffix) |
|---|
| 952 |
$(I RealSuffix) |
|---|
| 953 |
$(I ImaginarySuffix) |
|---|
| 954 |
$(I FloatSuffix) $(I ImaginarySuffix) |
|---|
| 955 |
$(I RealSuffix) $(I ImaginarySuffix) |
|---|
| 956 |
|
|---|
| 957 |
$(I FloatSuffix): |
|---|
| 958 |
$(B f) |
|---|
| 959 |
$(B F) |
|---|
| 960 |
|
|---|
| 961 |
$(I RealSuffix): |
|---|
| 962 |
$(B L) |
|---|
| 963 |
|
|---|
| 964 |
$(I ImaginarySuffix): |
|---|
| 965 |
$(B i) |
|---|
| 966 |
|
|---|
| 967 |
$(GNAME LeadingDecimal): |
|---|
| 968 |
$(GLINK Decimal) |
|---|
| 969 |
$(B 0) $(GLINK DecimalDigits) |
|---|
| 970 |
) |
|---|
| 971 |
|
|---|
| 972 |
Floats can be in decimal or hexadecimal format, |
|---|
| 973 |
as in standard C. |
|---|
| 974 |
<p> |
|---|
| 975 |
|
|---|
| 976 |
Hexadecimal floats are preceded by $(B 0x) or $(B 0X), and the |
|---|
| 977 |
exponent is a $(B p) |
|---|
| 978 |
or $(B P) followed by a decimal number giving the power of 2 |
|---|
| 979 |
by which the hexadecimal significand is scaled. |
|---|
| 980 |
<p> |
|---|
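As an illustration of the hexadecimal format, the sketch below compares a few hexadecimal floats with their decimal equivalents. It assumes a current D compiler; the chosen values are all exactly representable, so the comparisons do not depend on rounding.

```d
// The digits before the exponent are hexadecimal; the exponent is a
// decimal power of 2.
static assert(0x1p4 == 16.0);    // 1 * 2^4
static assert(0x1p-1 == 0.5);    // 1 * 2^-1
static assert(0x1.8p1 == 3.0);   // 1.5 * 2^1
static assert(0xA.8p0 == 10.5);  // 10.5 * 2^0

// Limits of double written as hexadecimal floats.
static assert(0x1p-52 == double.epsilon);
static assert(0x1.FFFFFFFFFFFFFp1023 == double.max);

void main() {}
```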
| 981 |
|
|---|
| 982 |
Floating literals can have embedded $(SINGLEQUOTE _) characters, which are ignored. |
|---|
| 983 |
The embedded $(SINGLEQUOTE _) characters are useful for formatting long literals to |
|---|
| 984 |
make them more readable, such |
|---|
| 985 |
as using them as a thousands separator: |
|---|
| 986 |
|
|---|
| 987 |
--------- |
|---|
| 988 |
123_456.567_8 // 123456.5678 |
|---|
| 989 |
1_2_3_4_5_6_._5_6_7_8 // 123456.5678 |
|---|
| 990 |
1_2_3_4_5_6_._5e-6_ // 123456.5e-6 |
|---|
| 991 |
--------- |
|---|
| 992 |
|
|---|
| 993 |
Floating literals with no suffix are of type double. |
|---|
| 994 |
Floats can be followed by one $(B f), $(B F), |
|---|
| 995 |
or $(B L) suffix. |
|---|
| 996 |
The $(B f) or $(B F) suffix means it is a |
|---|
| 997 |
float, and $(B L) means it is a real. |
|---|
| 998 |
<p> |
|---|
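The effect of the suffixes on the literal's type can be checked at compile time. This is a small sketch, assuming a current D compiler; it leaves out the imaginary suffix.

```d
// No suffix: double.
static assert(is(typeof(1.5) == double));
static assert(is(typeof(0x1p-52) == double));

// f or F suffix: float.
static assert(is(typeof(1.5f) == float));
static assert(is(typeof(1.5F) == float));

// L suffix: real.
static assert(is(typeof(1.5L) == real));

// Embedded underscores do not change the value.
static assert(123_456.567_8 == 123456.5678);

void main() {}
```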
| 999 |
|
|---|
| 1000 |
If a floating literal is followed by $(B i), then it is an |
|---|
| 1001 |
imaginary type: $(I ifloat), $(I idouble), or $(I ireal), according to its other suffix. |
|---|
| 1002 |
<p> |
|---|
| 1003 |
|
|---|
| 1004 |
Examples: |
|---|
| 1005 |
|
|---|
| 1006 |
--------- |
|---|
| 1007 |
0x1.FFFFFFFFFFFFFp1023 // double.max |
|---|
| 1008 |
0x1p-52 // double.epsilon |
|---|
| 1009 |
1.175494351e-38F // float.min |
|---|
| 1010 |
6.3i // idouble 6.3 |
|---|
| 1011 |
6.3fi // ifloat 6.3 |
|---|
| 1012 |
6.3Li // ireal 6.3 |
|---|
| 1013 |
--------- |
|---|
| 1014 |
|
|---|
| 1015 |
It is an error if the literal exceeds the range of the type. |
|---|
| 1016 |
It is not an error if the literal is rounded to fit into |
|---|
| 1017 |
the significant digits of the type. |
|---|
| 1018 |
<p> |
|---|
| 1019 |
|
|---|
| 1020 |
Complex literals are not tokens, but are assembled from |
|---|
| 1021 |
real and imaginary expressions during semantic analysis: |
|---|
| 1022 |
|
|---|
| 1023 |
--------- |
|---|
| 1024 |
4.5 + 6.2i // complex number |
|---|
| 1025 |
--------- |
|---|
| 1026 |
|
|---|
| 1027 |
<h3>$(LNAME2 keyword, Keywords)</h3> |
|---|
| 1028 |
|
|---|
| 1029 |
Keywords are reserved identifiers. |
|---|
| 1030 |
|
|---|
| 1031 |
$(GRAMMAR |
|---|
| 1032 |
$(I Keyword): |
|---|
| 1033 |
$(B abstract) |
|---|
| 1034 |
$(B alias) |
|---|
| 1035 |
$(B align) |
|---|
| 1036 |
$(B asm) |
|---|
| 1037 |
$(B assert) |
|---|
| 1038 |
$(B auto) |
|---|
| 1039 |
|
|---|
| 1040 |
$(B body) |
|---|
| 1041 |
$(B bool) |
|---|
| 1042 |
$(B break) |
|---|
| 1043 |
$(B byte) |
|---|
| 1044 |
|
|---|
| 1045 |
$(B case) |
|---|
| 1046 |
$(B cast) |
|---|
| 1047 |
$(B catch) |
|---|
| 1048 |
$(B cdouble) |
|---|
| 1049 |
$(B cent) |
|---|
| 1050 |
$(B cfloat) |
|---|
| 1051 |
$(B char) |
|---|
| 1052 |
$(B class) |
|---|
| 1053 |
$(B const) |
|---|
| 1054 |
$(B continue) |
|---|
| 1055 |
$(B creal) |
|---|
| 1056 |
|
|---|
| 1057 |
$(B dchar) |
|---|
| 1058 |
$(B debug) |
|---|
| 1059 |
$(B default) |
|---|
| 1060 |
$(B delegate) |
|---|
| 1061 |
$(B delete) |
|---|
| 1062 |
$(B deprecated) |
|---|
| 1063 |
$(B do) |
|---|
| 1064 |
$(B double) |
|---|
| 1065 |
|
|---|
| 1066 |
$(B else) |
|---|
| 1067 |
$(B enum) |
|---|
| 1068 |
$(B export) |
|---|
| 1069 |
$(B extern) |
|---|
| 1070 |
|
|---|
| 1071 |
$(B false) |
|---|
| 1072 |
$(B final) |
|---|
| 1073 |
$(B finally) |
|---|
| 1074 |
$(B float) |
|---|
| 1075 |
$(B for) |
|---|
| 1076 |
$(B foreach) |
|---|
| 1077 |
$(B foreach_reverse) |
|---|
| 1078 |
$(B function) |
|---|
| 1079 |
|
|---|
| 1080 |
$(B goto) |
|---|
| 1081 |
|
|---|
| 1082 |
$(B idouble) |
|---|
| 1083 |
$(B if) |
|---|
| 1084 |
$(B ifloat) |
|---|
| 1085 |
$(V2 |
|---|
| 1086 |
$(B immutable) |
|---|
| 1087 |
) $(B import) |
|---|
| 1088 |
$(B in) |
|---|
| 1089 |
$(B inout) |
|---|
| 1090 |
$(B int) |
|---|
| 1091 |
$(B interface) |
|---|
| 1092 |
$(B invariant) |
|---|
| 1093 |
$(B ireal) |
|---|
| 1094 |
$(B is) |
|---|
| 1095 |
|
|---|
| 1096 |
$(B lazy) |
|---|
| 1097 |
$(B long) |
|---|
| 1098 |
|
|---|
| 1099 |
$(B macro) |
|---|
| 1100 |
$(B mixin) |
|---|
| 1101 |
$(B module) |
|---|
| 1102 |
|
|---|
| 1103 |
$(B new) |
|---|
| 1104 |
$(V2 |
|---|
| 1105 |
$(B nothrow) |
|---|
| 1106 |
) $(B null) |
|---|
| 1107 |
|
|---|
| 1108 |
$(B out) |
|---|
| 1109 |
$(B override) |
|---|
| 1110 |
|
|---|
| 1111 |
$(B package) |
|---|
| 1112 |
$(B pragma) |
|---|
| 1113 |
$(B private) |
|---|
| 1114 |
$(B protected) |
|---|
| 1115 |
$(B public) |
|---|
| 1116 |
$(V2 |
|---|
| 1117 |
$(B pure) |
|---|
| 1118 |
) |
|---|
| 1119 |
$(B real) |
|---|
| 1120 |
$(B ref) |
|---|
| 1121 |
$(B return) |
|---|
| 1122 |
|
|---|
| 1123 |
$(B scope) |
|---|
| 1124 |
$(V2 |
|---|
| 1125 |
$(B shared) |
|---|
| 1126 |
) $(B short) |
|---|
| 1127 |
$(B static) |
|---|
| 1128 |
$(B struct) |
|---|
| 1129 |
$(B super) |
|---|
| 1130 |
$(B switch) |
|---|
| 1131 |
$(B synchronized) |
|---|
| 1132 |
|
|---|
| 1133 |
$(B template) |
|---|
| 1134 |
$(B this) |
|---|
| 1135 |
$(B throw) |
|---|
| 1136 |
$(B true) |
|---|
| 1137 |
$(B try) |
|---|
| 1138 |
$(B typedef) |
|---|
| 1139 |
$(B typeid) |
|---|
| 1140 |
$(B typeof) |
|---|
| 1141 |
|
|---|
| 1142 |
$(B ubyte) |
|---|
| 1143 |
$(B ucent) |
|---|
| 1144 |
$(B uint) |
|---|
| 1145 |
$(B ulong) |
|---|
| 1146 |
$(B union) |
|---|
| 1147 |
$(B unittest) |
|---|
| 1148 |
$(B ushort) |
|---|
| 1149 |
|
|---|
| 1150 |
$(B version) |
|---|
| 1151 |
$(B void) |
|---|
| 1152 |
$(B volatile) |
|---|
| 1153 |
|
|---|
| 1154 |
$(B wchar) |
|---|
| 1155 |
$(B while) |
|---|
| 1156 |
$(B with) |
|---|
| 1157 |
$(V2 |
|---|
| 1158 |
$(B __FILE__) |
|---|
| 1159 |
$(B __LINE__) |
|---|
| 1160 |
$(B __gshared) |
|---|
| 1161 |
$(B __thread) |
|---|
| 1162 |
$(B __traits)) |
|---|
| 1163 |
) |
|---|
| 1164 |
|
|---|
| 1165 |
<h3>$(LNAME2 specialtokens, Special Tokens)</h3> |
|---|
| 1166 |
|
|---|
| 1167 |
$(P |
|---|
| 1168 |
These tokens are replaced with other tokens according to the following |
|---|
| 1169 |
table: |
|---|
| 1170 |
) |
|---|
| 1171 |
|
|---|
| 1172 |
$(TABLE2 Special Tokens, |
|---|
| 1173 |
$(TR |
|---|
| 1174 |
$(TH Special Token) |
|---|
| 1175 |
$(TH Replaced with...) |
|---|
| 1176 |
) |
|---|
| 1177 |
$(V1 |
|---|
| 1178 |
$(TR |
|---|
| 1179 |
$(TD $(B __FILE__)) |
|---|
| 1180 |
$(TD string literal containing source file name) |
|---|
| 1181 |
) |
|---|
| 1182 |
$(TR |
|---|
| 1183 |
$(TD $(B __LINE__)) |
|---|
| 1184 |
$(TD integer literal of the current source line number) |
|---|
| 1185 |
) |
|---|
| 1186 |
) |
|---|
| 1187 |
$(TR |
|---|
| 1188 |
$(TD $(B __DATE__)) |
|---|
| 1189 |
$(TD string literal of the date of compilation "$(I mmm dd yyyy)") |
|---|
| 1190 |
) |
|---|
| 1191 |
$(V2 |
|---|
| 1192 |
$(TR |
|---|
| 1193 |
$(TD $(B __EOF__)) |
|---|
| 1194 |
$(TD sets the scanner to the end of the file) |
|---|
| 1195 |
) |
|---|
| 1196 |
) |
|---|
| 1197 |
$(TR |
|---|
| 1198 |
$(TD $(B __TIME__)) |
|---|
| 1199 |
$(TD string literal of the time of compilation "$(I hh:mm:ss)") |
|---|
| 1200 |
) |
|---|
| 1201 |
$(TR |
|---|
| 1202 |
$(TD $(B __TIMESTAMP__)) |
|---|
| 1203 |
$(TD string literal of the date and time of compilation "$(I www mmm dd hh:mm:ss yyyy)") |
|---|
| 1204 |
) |
|---|
| 1205 |
$(TR |
|---|
| 1206 |
$(TD $(B __VENDOR__)) |
|---|
| 1207 |
$(TD Compiler vendor string, such as "Digital Mars D") |
|---|
| 1208 |
) |
|---|
| 1209 |
$(TR |
|---|
| 1210 |
$(TD $(B __VERSION__)) |
|---|
| 1211 |
$(TD Compiler version as an integer, such as 2001) |
|---|
| 1212 |
) |
|---|
| 1213 |
) |
|---|
| 1214 |
|
|---|
| 1215 |
<h3>$(LNAME2 specialtokenseq, Special Token Sequences)</h3> |
|---|
| 1216 |
|
|---|
| 1217 |
Special token sequences are processed by the lexical analyzer, may |
|---|
| 1218 |
appear between any other tokens, and do not affect the syntax |
|---|
| 1219 |
parsing. |
|---|
| 1220 |
<p> |
|---|
| 1221 |
|
|---|
| 1222 |
There is currently only one special token sequence, $(TT #line). |
|---|
| 1223 |
|
|---|
| 1224 |
$(GRAMMAR |
|---|
| 1225 |
$(I SpecialTokenSequence): |
|---|
| 1226 |
$(B # line) $(I Integer) $(I EndOfLine) |
|---|
| 1227 |
$(B # line) $(I Integer) $(I Filespec) $(I EndOfLine) |
|---|
| 1228 |
|
|---|
| 1229 |
$(I Filespec): |
|---|
| 1230 |
$(B ") $(I Characters) $(B ") |
|---|
| 1231 |
) |
|---|
| 1232 |
|
|---|
| 1233 |
This sets the source line number to $(I Integer), |
|---|
| 1234 |
and optionally the source file name to $(I Filespec), |
|---|
| 1235 |
beginning with the next line of source text. |
|---|
| 1236 |
The source file name and line number are used for printing error messages |
|---|
| 1237 |
and for mapping generated code back to the source for the symbolic |
|---|
| 1238 |
debugging output. |
|---|
| 1239 |
<p> |
|---|
| 1240 |
|
|---|
| 1241 |
For example: |
|---|
| 1242 |
|
|---|
| 1243 |
----------------- |
|---|
| 1244 |
int #line 6 "foo\bar" |
|---|
| 1245 |
x; // this is now line 6 of file foo\bar |
|---|
| 1246 |
----------------- |
|---|
| 1247 |
|
|---|
| 1248 |
Note that the backslash character is not treated specially inside |
|---|
| 1249 |
$(I Filespec) strings. |
|---|
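The effect of $(TT #line) on the line number can be observed through $(TT __LINE__). The sketch below assumes a current D compiler and the Phobos module $(TT std.stdio); the file name $(TT generated.d) is a made-up name used only for illustration.

```d
import std.stdio;

void main()
{
#line 100 "generated.d"
    writeln(__FILE__, ":", __LINE__); // this statement is treated as line 100
    static assert(__LINE__ == 101);   // numbering continues from there
}
```

Error messages and debug information for the statements after the $(TT #line) use the new line numbers in the same way.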
| 1250 |
|
|---|
| 1251 |
) |
|---|
| 1252 |
|
|---|
| 1253 |
Macros: |
|---|
| 1254 |
TITLE=Lexical |
|---|
| 1255 |
WIKI=Lex |
|---|