root/trunk/docsrc/memory.dd

Revision 2040, 15.3 kB (checked in by walter, 2 years ago)

typography

  • Property svn:eol-style set to native
Line 
1 Ddoc
2
3 $(D_S Memory Management,
4
5     $(P Most non-trivial programs needs to allocate and free memory.
6     Memory management techniques become more and more important as
7     programs increase in complexity, size, and performance.
8     D offers many options for managing memory.
9     )
10
11     $(P The three primary methods for allocating memory in D are:)
12
13     $(OL
14     $(LI Static data, allocated in the default data segment.)
15     $(LI Stack data, allocated on the CPU program stack.)
16     $(LI $(LINK2 garbage.html, Garbage collected data),
17     allocated dynamically on the
18     garbage collection heap.)
19     )
20
21     $(P The techniques for using them, as well
22     as some advanced alternatives are:
23     )
24
25     $(UL
26     $(LI $(LINK2 #copy-on-write, Strings (and Array) Copy-on-Write))
27     $(LI <a href="#realtime">Real Time</a>)
28     $(LI <a href="#smoothoperation">Smooth Operation</a>)
29     $(LI <a href="#freelists">Free Lists</a>)
30     $(LI <a href="#referencecounting">Reference Counting</a>)
31     $(LI <a href="#newdelete">Explicit Class Instance Allocation</a>)
32     $(LI <a href="#markrelease">Mark/Release</a>)
33     $(LI <a href="#raii">RAII (Resource Acquisition Is Initialization)</a>)
34     $(LI <a href="#stackclass">Allocating Class Instances On The Stack</a>)
35     $(LI <a href="#uninitializedarrays">Allocating Uninitialized Arrays On The Stack</a>)
36     $(LI $(LINK2 #isr, Interrupt Service Routines))
37     )
38
39 <h2>$(LNAME2 copy-on-write, Strings (and Array) Copy-on-Write)</h2>
40
41     $(P Consider the case of passing an array to a function, possibly
42     modifying the contents of the array, and returning the modified
43     array. Since arrays are passed by reference, not by value,
44     a crucial issue is who owns the contents of the array?
45     For example, a function to convert an array of characters to
46     upper case:
47     )
48
49 ------
50 char[] toupper(char[] s)
51 {
52     int i;
53
54     for (i = 0; i < s.length; i++)
55     {
56     char c = s[i];
57     if ('a' <= c && c <= 'z')
58         s[i] = c - (cast(char)'a' - 'A');
59     }
60     return s;
61 }
62 ------
63
64     $(P Note that the caller's version of s[] is also modified. This may
65     be not at all what was intended, or worse, s[] may be a slice
66     into a read-only section of memory.
67     )
68
69     $(P If a copy of s[] was always made by toupper(), then that will
70     unnecessarily consume time and memory for strings that are already
71     all upper case.
72     )
73
74     $(P The solution is to implement copy-on-write, which means that a copy
75     is made only if the string needs to be modified. Some string
76     processing languages do do this as the default behavior, but there
77     is a huge cost to it. The string "abcdeF" will wind up being copied 5
78     times by the function. To get the maximum efficiency using the protocol,
79     it'll have to be done explicitly in the code. Here's toupper()
80     rewritten to implement copy-on-write in an efficient manner:
81     )
82
83 ------
84 char[] toupper(char[] s)
85 {
86     int changed;
87     int i;
88
89     changed = 0;
90     for (i = 0; i < s.length; i++)
91     {
92     char c = s[i];
93     if ('a' <= c && c <= 'z')
94     {
95         if (!changed)
96         {   char[] r = new char[s.length];
97         r[] = s;
98         s = r;
99         changed = 1;
100         }
101         s[i] = c - (cast(char)'a' - 'A');
102     }
103     }
104     return s;
105 }
106 ------
107
108     $(P Copy-on-write is the protocol implemented by array processing
109     functions in the D Phobos runtime library.
110     )
111
112 <h2><a name="realtime">Real Time</a></h2>
113
114     $(P Real time programming means that a program must be able to
115     guarantee a maximum latency, or time to complete an operation.
116     With most memory allocation schemes, including malloc/free and
117     garbage collection, the latency is theoretically not bound.
118     The most reliable way to guarantee latency is to preallocate
119     all data that will be needed by the time critical portion.
120     If no calls to allocate memory are done, the GC will not run
121     and so will not cause the maximum latency to be exceeded.
122     )
123
124 <h2><a name="smoothoperation">Smooth Operation</a></h2>
125
126     $(P Related to real time programming is the need for a program to
127     operate smoothly, without arbitrary pauses while the garbage
128     collector stops everything to run a collection.
129     An example of such a program would be an interactive shooter
130     type game. Having the game play pause erratically, while not
131     fatal to the program, can be annoying to the user.
132     )
133
134     $(P There are several techniques to eliminate or mitigate the effect:)
135
136 $(UL
137     $(LI Preallocate all data needed before the part of the code
138     that needs to be smooth is run.)
139
140     $(LI Manually run a GC collection cycle at points in program
141     execution where it is already paused. An example of such a place
142     would be where the program has just displayed a prompt for user
143     input and the user has not responded yet.
144     This reduces the odds that a collection cycle will be needed
145     during the smooth code.)
146
147     $(LI Call std.gc.disable() before the smooth code is run, and
148     std.gc.enable() afterwards. This will cause the GC to favor allocating
149     more memory instead of running a collection pass.)
150 )
151
152 <h2><a name="freelists">Free Lists</a></h2>
153
154     $(P Free lists are a great way to accelerate access to a frequently
155     allocated and discarded type. The idea is simple - instead of
156     deallocating an object when done with it, put it on a free list.
157     When allocating, pull one off the free list first.
158     )
159 ------
160 class Foo
161 {
162     static Foo freelist;        // start of free list
163
164     static Foo allocate()
165     {   Foo f;
166
167     if (freelist)
168     {   f = freelist;
169         freelist = f.next;
170     }
171     else
172         f = new Foo();
173     return f;
174     }
175
176     static void deallocate(Foo f)
177     {
178     f.next = freelist;
179     freelist = f;
180     }
181
182     Foo next;       // for use by FooFreeList
183     ...
184 }
185
186 void test()
187 {
188     Foo f = Foo.allocate();
189     ...
190     Foo.deallocate(f);
191 }
192 ------
193
194     Such free list approaches can be very high performance.
195
196     $(UL
197     $(LI If used by multiple threads, the allocate() and
198     deallocate() functions need to be synchronized.)
199
200     $(LI The Foo constructor is not re-run by allocate() when
201     allocating from the free list, so the allocator may need
202     to reinitialize some of the members.)
203
204     $(LI It is not necessary to practice RAII with this, since
205     if any objects are not passed to deallocate() when done, because
206     of a thrown exception, they'll eventually get picked up by
207     the GC anyway.)
208     )
209
210 <h2><a name="referencecounting">Reference Counting</a></h2>
211
212     $(P The idea behind reference counting is to include a count
213     field in the object. Increment it for each additional reference
214     to it, and decrement it whenever a reference to it ceases.
215     When the count hits 0, the object can be deleted.
216     )
217
218     $(P D doesn't provide any automated support for reference counting,
219     it will have to be done explicitly.
220     )
221
222     $(P <a href="windows.html#com">Win32 COM programming</a>
223     uses the members AddRef() and Release()
224     to maintain the reference counts.
225     )
226
227 <h2><a name="newdelete">Explicit Class Instance Allocation</a></h2>
228
229     $(P D provides a means of creating custom allocators and deallocators
230     for class instances. Normally, these would be allocated on the
231     garbage collected heap, and deallocated when the collector decides
232     to run. For specialized purposes, this can be handled by
233     creating $(I NewDeclaration)s and $(I DeleteDeclaration)s.
234     For example, to allocate using the C runtime library's
235     $(TT malloc) and $(TT free):
236     )
237
238 ------
239 import std.c.stdlib;
240 import std.outofmemory;
241 import std.gc;
242
243 class Foo
244 {
245     new(size_t sz)
246     {
247     void* p;
248
249     p = std.c.stdlib.malloc(sz);
250     if (!p)
251         throw new OutOfMemoryException();
252     std.gc.addRange(p, p + sz);
253     return p;
254     }
255
256     delete(void* p)
257     {
258     if (p)
259     {   std.gc.removeRange(p);
260         std.c.stdlib.free(p);
261     }
262     }
263 }
264 ------
265
266     $(P The critical features of new() are:)
267
268     $(UL
269     $(LI new() does not have a return type specified,
270     but it is defined to be void*. new() must return
271     a void*.)
272
273     $(LI If new() cannot allocate memory, it must
274     not return null, but must throw an exception.)
275
276     $(LI The pointer returned from new() must be to memory
277     aligned to the default alignment. This is 8 on win32
278     systems.)
279
280     $(LI The $(I size) parameter is needed in case the
281     allocator is called from a class derived from Foo and is
282     a larger size than Foo.)
283
284     $(LI A null is not returned if storage cannot be allocated.
285     Instead, an exception is thrown. Which exception gets thrown
286     is up to the programmer, in this case, OutOfMemory() is.)
287
288     $(LI When scanning memory for root pointers into the garbage
289     collected heap, the static data segment and the stack are
290     scanned automatically. The C heap is not. Therefore, if Foo
291     or any class derived from Foo using the allocator contains
292     any references to data allocated by the garbage collector, the
293     GC needs to be notified. This is done with the std.gc.addRange()
294     method.)
295
296     $(LI No initialization of the memory is necessary, as code
297     is automatically inserted after the call to new() to set the
298     class instance members to their defaults and then the constructor
299     (if any) is run.)
300     )
301
302     The critical features of delete() are:
303
304     $(UL
305     $(LI The destructor (if any) has already been called on the
306     argument p, so the data it points to should be assumed to
307     be garbage.)
308
309     $(LI The pointer p may be null.)
310
311     $(LI If the GC was notified with std.gc.addRange(), a corresponding
312     call to std.gc.removeRange() must happen in the deallocator.)
313
314     $(LI If there is a delete(), there should be a corresponding new().)
315     )
316
317     $(P If memory is allocated using class specific allocators and deallocators,
318     careful coding practices must be followed to avoid memory leaks
319     and dangling references. In the presence of exceptions, it is
320     particularly important to practice RAII to prevent memory leaks.
321     )
322
323     $(P Custom allocators and deallocators can be done for structs
324     and unions, too.)
325
326 <h2><a name="markrelease">Mark/Release</a></h2>
327
328     $(P Mark/Release is equivalent to a stack method of allocating and
329     freeing memory. A $(SINGLEQUOTE stack) is created in memory. Objects are allocated
330     by simply moving a pointer down the stack. Various points are
331     $(SINGLEQUOTE marked), and then whole sections of memory are released
332     simply by resetting the stack pointer back to a marked point.
333     )
334
335 ------
336 import std.c.stdlib;
337 import std.outofmemory;
338
339 class Foo
340 {
341     static void[] buffer;
342     static int bufindex;
343     static const int bufsize = 100;
344
345     static this()
346     {   void *p;
347
348     p = malloc(bufsize);
349     if (!p)
350         throw new OutOfMemoryException;
351     std.gc.addRange(p, p + bufsize);
352     buffer = p[0 .. bufsize];
353     }
354
355     static ~this()
356     {
357     if (buffer.length)
358     {
359         std.gc.removeRange(buffer);
360         free(buffer);
361         buffer = null;
362     }
363     }
364
365     new(size_t sz)
366     {   void *p;
367
368     p = &buffer[bufindex];
369     bufindex += sz;
370     if (bufindex > buffer.length)
371         throw new OutOfMemory;
372     return p;
373     }
374
375     delete(void* p)
376     {
377     assert(0);
378     }
379
380     static int mark()
381     {
382     return bufindex;
383     }
384
385     static void release(int i)
386     {
387     bufindex = i;
388     }
389 }
390
391 void test()
392 {
393     int m = Foo.mark();
394     Foo f1 = new Foo;       // allocate
395     Foo f2 = new Foo;       // allocate
396     ...
397     Foo.release(m);     // deallocate f1 and f2
398 }
399 ------
400
401     $(P The allocation of buffer[] itself is added as
402     a region to the GC, so there is no need for a separate
403     call inside Foo.new() to do it.)
404
405 <h2><a name="raii">RAII (Resource Acquisition Is Initialization)</a></h2>
406
407     $(P RAII techniques can be useful in avoiding memory leaks
408     when using explicit allocators and deallocators.
409     Adding the <a href="attribute.html#scope">scope attribute</a>
410     to such classes can help.
411     )
412
413 <h2>$(LNAME2 stackclass, Allocating Class Instances On The Stack)</h2>
414
415     $(P Class instances are normally allocated on the garbage
416     collected heap. However, if they:)
417
418     $(UL
419     $(LI are allocated as local symbols in a function)
420     $(LI are allocated using $(B new))
421     $(LI use $(B new) with no arguments (constructor arguments are allowed))
422     $(LI have the $(B scope) storage class)
423     )
424
425     $(P then they are allocated on the stack. This is more efficient
426     than doing an allocate/free cycle on the instance. But be
427     careful that any reference to the object does not survive
428     the return of the function.)
429
430 ---
431 class C { ... }
432
433 scope c = new C();  // c is allocated on the stack
434 scope c2 = new C(5);    // allocated on stack
435 scope c3 = new(5) C();  // allocated by a custom allocator
436 ---
437
438     $(P If the class has a destructor, then that destructor is
439     guaranteed to be run when the class object goes out of scope,
440     even if the scope is exited via an exception.)
441
442
443 <h2><a name="uninitializedarrays">Allocating Uninitialized Arrays On The Stack</a></h2>
444
445     $(P Arrays are always initialized in D. So, the following declaration:)
446
447 ------
448 void foo()
449 {   byte[1024] buffer;
450
451     fillBuffer(buffer);
452     ...
453 }
454 ------
455
456     $(P will not be as fast as it might be since the buffer[] contents
457     are always initialized. If careful profiling of the program shows
458     that this initialization is a speed problem, it can be eliminated using
459     a $(I VoidInitializer):
460     )
461
462 ------
463 void foo()
464 {   byte[1024] buffer = $(B void);
465
466     fillBuffer(buffer);
467     ...
468 }
469 ------
470
471     $(P Uninitialized data on the stack comes with some caveats that need
472     to be carefully evaluated before using:
473     )
474
475     $(UL
476
477     $(LI The uninitialized data that is on the stack will get scanned by the
478     garbage collector looking for any references to allocated memory. Since
479     the uninitialized data consists of old D stack frames, it is highly
480     likely that some of that garbage will look like references into the GC
481     heap, and the GC memory will not get freed. This problem really does
482     happen, and can be pretty frustrating to track down.)
483
484     $(LI It's possible for a function to pass out of it a reference to data
485     on that function's stack frame. By then allocating a new stack frame
486     over the old data, and not initializing, the reference to the old data
487     may still appear to be valid. The program will then behave erratically.
488     Initializing all data on the stack frame will greatly increase the
489     probability of forcing that bug into the open in a repeatable manner.)
490
491     $(LI Uninitialized data can be a source of bugs and trouble, even when
492     used correctly. One design goal of D is to improve reliability and
493     portability by eliminating sources of undefined behavior, and
494     uninitialized data is one huge source of undefined, unportable, erratic
495     and unpredictable behavior. Hence this idiom should only be used after
496     other opportunities for speed optimization are exhausted and if
497     benchmarking shows that it really does speed up the overall execution.)
498
499     )
500
501 <h2><a name="isr">Interrupt Service Routines</a></h2>
502
503     $(P When the garbage collector does a collection pass, it must
504     pause all running threads in order to scan their stacks and register
505     contents for references to GC allocated objects.
506     If an ISR (Interrupt Service Routine) thread is paused,
507     this can break the program.
508     )
509
510     $(P Therefore, the ISR thread should not be paused.
511     Threads created with the $(LINK2 phobos/std_thread.html, std.thread)
512     functions will be paused. But threads created with C's
513     $(TT _beginthread()) or equivalent won't be, the GC
514     won't know they exist.
515     )
516
517     $(P For this to work successfully:)
518
519     $(UL
520
521     $(LI The ISR thread cannot allocate any memory using the GC.
522     This means that the global $(TT new) cannot be used.
523     Nor can dynamic arrays be resized, nor can any elements be
524     added to associative arrays. Any use of the D runtime library
525     should be examined for any possibility of allocating GC memory -
526     or better yet, the ISR should not call any D runtime library
527     functions at all.)
528
529     $(LI The ISR cannot hold the sole reference to any GC allocated
530     memory, otherwise the GC may free the memory while the ISR
531     is still using it. The solution is to have one of the paused
532     threads hold a reference to it too, or store a reference to
533     it in global data.)
534
535     )
536 )
537
538 Macros:
539     TITLE=Memory Management
540     WIKI=Memory
Note: See TracBrowser for help on using the browser.