| 1 |
Ddoc |
|---|
| 2 |
|
|---|
| 3 |
$(SPEC_S Garbage Collection, |
|---|
| 4 |
|
|---|
| 5 |
$(P D is a fully garbage collected language. That means that it is never |
|---|
| 6 |
necessary |
|---|
| 7 |
to free memory. Just allocate as needed, and the garbage collector will |
|---|
| 8 |
periodically return all unused memory to the pool of available memory. |
|---|
| 9 |
) |
|---|
| 10 |
|
|---|
| 11 |
$(P C and C++ programmers accustomed to explicitly managing memory |
|---|
| 12 |
allocation and |
|---|
| 13 |
deallocation will likely be skeptical of the benefits and efficacy of |
|---|
| 14 |
garbage collection. Experience both with new projects written with |
|---|
| 15 |
garbage collection in mind, and converting existing projects to garbage |
|---|
| 16 |
collection shows that: |
|---|
| 17 |
) |
|---|
| 18 |
|
|---|
| 19 |
$(UL |
|---|
| 20 |
|
|---|
| 21 |
$(LI Garbage collected programs are faster. This is counterintuitive, |
|---|
| 22 |
but the reasons are: |
|---|
| 23 |
|
|---|
| 24 |
$(UL |
|---|
| 25 |
$(LI Reference counting is a common solution to solve explicit |
|---|
| 26 |
memory allocation problems. The code to implement the increment and |
|---|
| 27 |
decrement operations whenever assignments are made is one source |
|---|
| 28 |
of slowdown. Hiding it behind smart pointer classes doesn't help |
|---|
| 29 |
the speed. (Reference counting methods are not a general solution |
|---|
| 30 |
anyway, as circular references never get deleted.) |
|---|
| 31 |
) |
|---|
| 32 |
|
|---|
| 33 |
$(LI Destructors are used to deallocate resources acquired by an object. |
|---|
| 34 |
For most classes, this resource is allocated memory. |
|---|
| 35 |
With garbage collection, most destructors then become empty and |
|---|
| 36 |
can be discarded entirely. |
|---|
| 37 |
) |
|---|
| 38 |
|
|---|
| 39 |
$(LI All those destructors freeing memory can become significant when |
|---|
| 40 |
objects are allocated on the stack. For each one, some mechanism must |
|---|
| 41 |
be established so that if an exception happens, the destructors all |
|---|
| 42 |
get called in each frame to release any memory they hold. If the |
|---|
| 43 |
destructors become irrelevant, then there's no need to set up special |
|---|
| 44 |
stack frames to handle exceptions, and the code runs faster. |
|---|
| 45 |
) |
|---|
| 46 |
|
|---|
| 47 |
$(LI All the code necessary to manage memory can add up to quite a |
|---|
| 48 |
bit. The larger a program is, the less in the cache it is, |
|---|
| 49 |
the more paging it does, and the slower |
|---|
| 50 |
it runs. |
|---|
| 51 |
) |
|---|
| 52 |
|
|---|
| 53 |
$(LI Garbage collection kicks in only when memory gets tight. When |
|---|
| 54 |
memory is not tight, the program runs at full speed and does not |
|---|
| 55 |
spend any time freeing memory. |
|---|
| 56 |
) |
|---|
| 57 |
|
|---|
| 58 |
$(LI Modern garbage collectors are far more advanced now than the |
|---|
| 59 |
older, slower ones. Generational, copying collectors eliminate much |
|---|
| 60 |
of the inefficiency of early mark and sweep algorithms. |
|---|
| 61 |
) |
|---|
| 62 |
|
|---|
| 63 |
$(LI Modern garbage collectors do heap compaction. Heap compaction |
|---|
| 64 |
tends to reduce the number of pages actively referenced by a program, |
|---|
| 65 |
which means that memory accesses are more likely to be cache hits |
|---|
| 66 |
and less swapping. |
|---|
| 67 |
) |
|---|
| 68 |
|
|---|
| 69 |
$(LI Garbage collected programs do not suffer from gradual deterioration |
|---|
| 70 |
due to an accumulation of memory leaks. |
|---|
| 71 |
) |
|---|
| 72 |
) |
|---|
| 73 |
) |
|---|
| 74 |
|
|---|
| 75 |
$(LI Garbage collectors reclaim unused memory, therefore they do not suffer |
|---|
| 76 |
from "memory leaks" which can cause long running applications to gradually |
|---|
| 77 |
consume more and more memory until they bring down the system. GC programs |
|---|
| 78 |
have longer term stability. |
|---|
| 79 |
) |
|---|
| 80 |
|
|---|
| 81 |
$(LI Garbage collected programs have fewer hard-to-find pointer bugs. This |
|---|
| 82 |
is because there are no dangling references to freed memory. There is no |
|---|
| 83 |
code to explicitly manage memory, hence no bugs in such code. |
|---|
| 84 |
) |
|---|
| 85 |
|
|---|
| 86 |
$(LI Garbage collected programs are faster to develop and debug, because |
|---|
| 87 |
there's no need for developing, debugging, testing, or maintaining the |
|---|
| 88 |
explicit deallocation code. |
|---|
| 89 |
) |
|---|
| 90 |
|
|---|
| 91 |
$(LI Garbage collected programs can be significantly smaller, because |
|---|
| 92 |
there is no code to manage deallocation, and there is no need for exception |
|---|
| 93 |
handlers to deallocate memory. |
|---|
| 94 |
) |
|---|
| 95 |
) |
|---|
| 96 |
|
|---|
| 97 |
$(P Garbage collection is not a panacea. There are some downsides: |
|---|
| 98 |
) |
|---|
| 99 |
|
|---|
| 100 |
$(UL |
|---|
| 101 |
|
|---|
| 102 |
$(LI It is not predictable when a collection gets run, so the program |
|---|
| 103 |
can arbitrarily pause. |
|---|
| 104 |
) |
|---|
| 105 |
|
|---|
| 106 |
$(LI The time it takes for a collection to run is not bounded. While in |
|---|
| 107 |
practice it is very quick, this cannot be guaranteed. |
|---|
| 108 |
) |
|---|
| 109 |
|
|---|
| 110 |
$(LI All threads other than the collector thread must be halted while |
|---|
| 111 |
the collection is in progress. |
|---|
| 112 |
) |
|---|
| 113 |
|
|---|
| 114 |
$(LI Garbage collectors can keep around some memory that an explicit |
|---|
| 115 |
deallocator |
|---|
| 116 |
would not. In practice, this is not much of an issue since explicit |
|---|
| 117 |
deallocators usually have memory leaks causing them to eventually use |
|---|
| 118 |
far more memory, and because explicit deallocators do not normally |
|---|
| 119 |
return deallocated memory to the operating system anyway, instead just |
|---|
| 120 |
returning it to its own internal pool. |
|---|
| 121 |
) |
|---|
| 122 |
|
|---|
| 123 |
$(LI Garbage collection should be implemented as a basic operating |
|---|
| 124 |
system |
|---|
| 125 |
kernel service. But since they are not, garbage collecting programs must |
|---|
| 126 |
carry around with them the garbage collection implementation. While this |
|---|
| 127 |
can be a shared DLL, it is still there. |
|---|
| 128 |
) |
|---|
| 129 |
) |
|---|
| 130 |
|
|---|
| 131 |
$(P These constraints are addressed by techniques outlined |
|---|
| 132 |
in <a href="memory.html">Memory Management</a>. |
|---|
| 133 |
) |
|---|
| 134 |
|
|---|
| 135 |
<h2>How Garbage Collection Works</h2> |
|---|
| 136 |
|
|---|
| 137 |
$(P The GC works by:) |
|---|
| 138 |
|
|---|
| 139 |
$(OL |
|---|
| 140 |
$(LI Looking for all the pointer $(SINGLEQUOTE roots) into GC allocated memory.) |
|---|
| 141 |
|
|---|
| 142 |
$(LI Recursively scanning all allocated memory pointed to by |
|---|
| 143 |
roots looking for more pointers into GC allocated memory.) |
|---|
| 144 |
|
|---|
| 145 |
$(LI Freeing all GC allocated memory that has no active pointers |
|---|
| 146 |
to it.) |
|---|
| 147 |
|
|---|
| 148 |
$(LI Possibly compacting the remaining used memory by copying the |
|---|
| 149 |
allocated objects (called a copying collector).) |
|---|
| 150 |
) |
|---|
| 151 |
|
|---|
| 152 |
<h2>Interfacing Garbage Collected Objects With Foreign Code</h2> |
|---|
| 153 |
|
|---|
| 154 |
$(P The garbage collector looks for roots in:) |
|---|
| 155 |
$(OL |
|---|
| 156 |
$(LI its static data segment) |
|---|
| 157 |
$(LI the stacks and register contents of each thread) |
|---|
| 158 |
$(LI any roots added by std.gc.addRoot() or std.gc.addRange()) |
|---|
| 159 |
) |
|---|
| 160 |
|
|---|
| 161 |
$(P If the only root of an object |
|---|
| 162 |
is held outside of this, then the collecter will miss it and free the |
|---|
| 163 |
memory. |
|---|
| 164 |
) |
|---|
| 165 |
|
|---|
| 166 |
$(P To avoid this from happening,) |
|---|
| 167 |
|
|---|
| 168 |
$(UL |
|---|
| 169 |
$(LI Maintain a root to the object in an area the collector does scan |
|---|
| 170 |
for roots.) |
|---|
| 171 |
|
|---|
| 172 |
$(LI Add a root to the object using std.gc.addRoot() or std.gc.addRange().) |
|---|
| 173 |
|
|---|
| 174 |
$(LI Reallocate and copy the object using the foreign code's storage |
|---|
| 175 |
allocator |
|---|
| 176 |
or using the C runtime library's malloc/free. |
|---|
| 177 |
) |
|---|
| 178 |
) |
|---|
| 179 |
|
|---|
| 180 |
<h2>Pointers and the Garbage Collector</h2> |
|---|
| 181 |
|
|---|
| 182 |
$(P Pointers in D can be broadly divided into two categories: those that |
|---|
| 183 |
point to garbage collected memory, and those that do not. Examples |
|---|
| 184 |
of the latter are pointers created by calls to C's malloc(), pointers |
|---|
| 185 |
received from C library routines, pointers to static data, |
|---|
| 186 |
pointers to objects on the stack, etc. For those pointers, anything |
|---|
| 187 |
that is legal in C can be done with them. |
|---|
| 188 |
) |
|---|
| 189 |
|
|---|
| 190 |
$(P For garbage collected pointers and references, however, there are |
|---|
| 191 |
some |
|---|
| 192 |
restrictions. These restrictions are minor, but they are intended |
|---|
| 193 |
to enable the maximum flexibility in garbage collector design. |
|---|
| 194 |
) |
|---|
| 195 |
|
|---|
| 196 |
$(P Undefined behavior:) |
|---|
| 197 |
|
|---|
| 198 |
$(UL |
|---|
| 199 |
|
|---|
| 200 |
$(LI Do not xor pointers with other values, like the |
|---|
| 201 |
xor pointer linked list trick used in C. |
|---|
| 202 |
) |
|---|
| 203 |
|
|---|
| 204 |
$(LI Do not use the xor trick to swap two pointer values. |
|---|
| 205 |
) |
|---|
| 206 |
|
|---|
| 207 |
$(LI Do not store pointers into non-pointer variables using casts and |
|---|
| 208 |
other tricks. |
|---|
| 209 |
|
|---|
| 210 |
------ |
|---|
| 211 |
void* p; |
|---|
| 212 |
... |
|---|
| 213 |
int x = cast(int)p; // error: undefined behavior |
|---|
| 214 |
------ |
|---|
| 215 |
|
|---|
| 216 |
The garbage collector does not scan non-pointer types for roots. |
|---|
| 217 |
) |
|---|
| 218 |
|
|---|
| 219 |
$(LI Do not take advantage of alignment of pointers to store bit flags |
|---|
| 220 |
in the low order bits: |
|---|
| 221 |
|
|---|
| 222 |
------ |
|---|
| 223 |
p = cast(void*)(cast(int)p | 1); // error: undefined behavior |
|---|
| 224 |
------ |
|---|
| 225 |
) |
|---|
| 226 |
|
|---|
| 227 |
$(LI Do not store into pointers values that may point into the |
|---|
| 228 |
garbage collected heap: |
|---|
| 229 |
|
|---|
| 230 |
------ |
|---|
| 231 |
p = cast(void*)12345678; // error: undefined behavior |
|---|
| 232 |
------ |
|---|
| 233 |
|
|---|
| 234 |
A copying garbage collector may change this value. |
|---|
| 235 |
) |
|---|
| 236 |
|
|---|
| 237 |
$(LI Do not store magic values into pointers, other than $(TT null). |
|---|
| 238 |
) |
|---|
| 239 |
|
|---|
| 240 |
$(LI Do not write pointer values out to disk and read them back in |
|---|
| 241 |
again. |
|---|
| 242 |
) |
|---|
| 243 |
|
|---|
| 244 |
$(LI Do not use pointer values to compute a hash function. A copying |
|---|
| 245 |
garbage collector can arbitrarily move objects around in memory, |
|---|
| 246 |
thus invalidating |
|---|
| 247 |
the computed hash value. |
|---|
| 248 |
) |
|---|
| 249 |
|
|---|
| 250 |
$(LI Do not depend on the ordering of pointers: |
|---|
| 251 |
|
|---|
| 252 |
------ |
|---|
| 253 |
if (p1 < p2) // error: undefined behavior |
|---|
| 254 |
... |
|---|
| 255 |
------ |
|---|
| 256 |
since, again, the garbage collector can move objects around in |
|---|
| 257 |
memory. |
|---|
| 258 |
) |
|---|
| 259 |
|
|---|
| 260 |
$(LI Do not add or subtract an offset to a pointer such that the result |
|---|
| 261 |
points outside of the bounds of the garbage collected object originally |
|---|
| 262 |
allocated. |
|---|
| 263 |
|
|---|
| 264 |
------ |
|---|
| 265 |
char* p = new char[10]; |
|---|
| 266 |
char* q = p + 6; // ok |
|---|
| 267 |
q = p + 11; // error: undefined behavior |
|---|
| 268 |
q = p - 1; // error: undefined behavior |
|---|
| 269 |
------ |
|---|
| 270 |
) |
|---|
| 271 |
|
|---|
| 272 |
$(LI Do not misalign pointers if those pointers may |
|---|
| 273 |
point into the gc heap, such as: |
|---|
| 274 |
|
|---|
| 275 |
------ |
|---|
| 276 |
align (1) struct Foo |
|---|
| 277 |
{ byte b; |
|---|
| 278 |
char* p; // misaligned pointer |
|---|
| 279 |
} |
|---|
| 280 |
------ |
|---|
| 281 |
|
|---|
| 282 |
Misaligned pointers may be used if the underlying hardware |
|---|
| 283 |
supports them $(B and) the pointer is never used to point |
|---|
| 284 |
into the gc heap. |
|---|
| 285 |
) |
|---|
| 286 |
|
|---|
| 287 |
$(LI Do not use byte-by-byte memory copies to copy pointer values. |
|---|
| 288 |
This may result in intermediate conditions where there is |
|---|
| 289 |
not a valid pointer, and if the gc pauses the thread in such a |
|---|
| 290 |
condition, it can corrupt memory. |
|---|
| 291 |
Most implementations of $(TT memcpy()) will work since the |
|---|
| 292 |
internal implementation of it does the copy in aligned chunks |
|---|
| 293 |
greater than or equal to a pointer size, but since this kind of |
|---|
| 294 |
implementation is not guaranteed by the C standard, use |
|---|
| 295 |
$(TT memcpy()) only with extreme caution. |
|---|
| 296 |
) |
|---|
| 297 |
|
|---|
| 298 |
$(LI Do not have pointers in a struct instance that point back |
|---|
| 299 |
to the same instance. The trouble with this is if the instance |
|---|
| 300 |
gets moved in memory, the pointer will point back to where it |
|---|
| 301 |
came from, with likely disastrous results. |
|---|
| 302 |
) |
|---|
| 303 |
|
|---|
| 304 |
) |
|---|
| 305 |
|
|---|
| 306 |
$(P Things that are reliable and can be done:) |
|---|
| 307 |
|
|---|
| 308 |
$(UL |
|---|
| 309 |
|
|---|
| 310 |
$(LI Use a union to share storage with a pointer: |
|---|
| 311 |
|
|---|
| 312 |
------ |
|---|
| 313 |
union U { void* ptr; int value } |
|---|
| 314 |
------ |
|---|
| 315 |
) |
|---|
| 316 |
|
|---|
| 317 |
$(LI A pointer to the start of a garbage collected object need not |
|---|
| 318 |
be maintained if a pointer to the interior of the object exists. |
|---|
| 319 |
|
|---|
| 320 |
------ |
|---|
| 321 |
char[] p = new char[10]; |
|---|
| 322 |
char[] q = p[3..6]; |
|---|
| 323 |
// q is enough to hold on to the object, don't need to keep |
|---|
| 324 |
// p as well. |
|---|
| 325 |
------ |
|---|
| 326 |
) |
|---|
| 327 |
) |
|---|
| 328 |
|
|---|
| 329 |
$(P One can avoid using pointers anyway for most tasks. D provides |
|---|
| 330 |
features |
|---|
| 331 |
rendering most explicit pointer uses obsolete, such as reference |
|---|
| 332 |
objects, |
|---|
| 333 |
dynamic arrays, and garbage collection. Pointers |
|---|
| 334 |
are provided in order to interface successfully with C APIs and for |
|---|
| 335 |
some low level work. |
|---|
| 336 |
) |
|---|
| 337 |
|
|---|
| 338 |
<h2>Working with the Garbage Collector</h2> |
|---|
| 339 |
|
|---|
| 340 |
$(P Garbage collection doesn't solve every memory deallocation problem. |
|---|
| 341 |
For |
|---|
| 342 |
example, if a root to a large data structure is kept, the garbage |
|---|
| 343 |
collector cannot reclaim it, even if it is never referred to again. To |
|---|
| 344 |
eliminate this problem, it is good practice to set a reference or |
|---|
| 345 |
pointer to an object to null when no longer needed. |
|---|
| 346 |
) |
|---|
| 347 |
|
|---|
| 348 |
$(P This advice applies only to static references or references embedded |
|---|
| 349 |
inside other objects. There is not much point for such stored on the |
|---|
| 350 |
stack to be nulled, since the collector doesn't scan for roots past the |
|---|
| 351 |
top of the stack, and because new stack frames are initialized anyway. |
|---|
| 352 |
) |
|---|
| 353 |
|
|---|
| 354 |
<h2>Object Pinning and a Moving Garbage Collector</h2> |
|---|
| 355 |
|
|---|
| 356 |
$(P Although D does not currently use a moving garbage collector, by following |
|---|
| 357 |
the rules listed above one can be implemented. No special action is required |
|---|
| 358 |
to pin objects. A moving collector will only move objects for which there |
|---|
| 359 |
are no ambiguous references, and for which it can update those references. |
|---|
| 360 |
All other objects will be automatically pinned. |
|---|
| 361 |
) |
|---|
| 362 |
|
|---|
| 363 |
<h2>D Operations That Involve the Garbage Collector</h2> |
|---|
| 364 |
|
|---|
| 365 |
$(P Some sections of code may need to avoid using the garbage collector. |
|---|
| 366 |
The following constructs may allocate memory using the garbage collector: |
|---|
| 367 |
) |
|---|
| 368 |
|
|---|
| 369 |
$(UL |
|---|
| 370 |
$(LI $(GLINK2 expression, NewExpression)) |
|---|
| 371 |
$(LI Array appending) |
|---|
| 372 |
$(LI Array concatenation) |
|---|
| 373 |
$(LI Array literals (except when used to initialize static data)) |
|---|
| 374 |
$(LI Associative array literals) |
|---|
| 375 |
$(LI Any insertion, removal, or lookups in an associative array) |
|---|
| 376 |
$(LI Extracting keys or values from an associative array) |
|---|
| 377 |
$(V2 |
|---|
| 378 |
$(LI Taking the address of (i.e. making a delegate) a nested function that |
|---|
| 379 |
accesses variables in an outer scope) |
|---|
| 380 |
$(LI A function literal that access variables in an outer scope) |
|---|
| 381 |
) |
|---|
| 382 |
$(LI An $(GLINK2 expression, AssertExpression) that fails its condition) |
|---|
| 383 |
) |
|---|
| 384 |
|
|---|
| 385 |
<h2>References</h2> |
|---|
| 386 |
|
|---|
| 387 |
$(UL |
|---|
| 388 |
$(LI $(LINK2 http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29, Wikipedia)) |
|---|
| 389 |
$(LI $(LINK2 http://www.iecc.com/gclist/GC-faq.html, GC FAQ)) |
|---|
| 390 |
$(LI $(LINK2 ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps, Uniprocessor Garbage Collector Techniques)) |
|---|
| 391 |
$(LI $(LINK2 http://www.amazon.com/exec/obidos/ASIN/0471941484/classicempire, |
|---|
| 392 |
Garbage Collection : Algorithms for Automatic Dynamic Memory Management)) |
|---|
| 393 |
) |
|---|
| 394 |
|
|---|
| 395 |
) |
|---|
| 396 |
|
|---|
| 397 |
Macros: |
|---|
| 398 |
TITLE=Garbage Collection |
|---|
| 399 |
WIKI=Garbage |
|---|