root/trunk/tango/io/Buffer.d

Revision 3838, 54.6 kB (checked in by kris, 4 months ago)

fixes #1188 :: "Token is too large to fit within buffer" while using GrowBuffer

Added an expand() method to take care of this. Kudos to fvbommel

  • Property svn:mime-type set to text/x-dsrc
  • Property svn:eol-style set to native
/*******************************************************************************

        copyright:      Copyright (c) 2004 Kris Bell. All rights reserved

        license:        BSD style: $(LICENSE)

        version:        Mar 2004: Initial release
                        Dec 2006: Outback release

        authors:        Kris

*******************************************************************************/

module tango.io.Buffer;

private import  tango.core.Exception;

public  import  tango.io.model.IBuffer,
                tango.io.model.IConduit;

/******************************************************************************

******************************************************************************/

extern (C)
{
        protected void * memcpy (void *dst, void *src, uint);
}

/*******************************************************************************

        Buffer is a central concept in Tango I/O. Each buffer acts
        as a queue (line) where items are removed from the front
        and new items are added to the back. Buffers are modeled
        by tango.io.model.IBuffer, and a concrete implementation
        is provided by this class.

        Buffer can be read from and written to directly, though
        various data-converters and filters are often leveraged
        to apply structure to what might otherwise be simple raw
        data.

        Buffers may also be tokenized by applying an Iterator.
        This can be handy when one is dealing with text input,
        and/or the content suits a more fluid format than most
        typical converters support. Iterator tokens are mapped
        directly onto buffer content (sliced), making them quite
        efficient in practice. Like other types of buffer client,
        multiple iterators can be mapped onto one common buffer
        and access will be serialized.

        Buffers are sometimes memory-only, in which case there
        is nothing left to do when a client has consumed all the
        content. Other buffers are themselves bound to an external
        device called a conduit. When this is the case, a consumer
        will eventually cause a buffer to reload via its associated
        conduit and previous buffer content will be lost.

        A similar approach is applied to clients which populate a
        buffer, whereby the content of a full buffer will be flushed
        to a bound conduit before continuing. Another variation is
        that of a memory-mapped buffer, whereby the buffer content
        is mapped directly to virtual memory exposed via the OS. This
        can be used to address large files as an array of content.

        Direct buffer manipulation typically involves appending,
        as in the following example:
        ---
        // create a small buffer
        auto buf = new Buffer (256);

        auto foo = "to write some D";

        // append some text directly to it
        buf.append ("now is the time for all good men ").append(foo);
        ---

        Alternatively, one might use a formatter to append to the buffer:
        ---
        auto output = new FormatOutput (new Buffer(256));
        output.format ("now is the time for {} good men {}", 3, foo);
        ---

        A slice() method will return all valid content within a buffer.
        GrowBuffer can be used instead, where one wishes to append beyond
        a specified limit.

        A common usage of a buffer is in conjunction with a conduit,
        such as FileConduit. Each conduit exposes a preferred-size for
        its associated buffers, utilized during buffer construction:
        ---
        auto file = new FileConduit ("file.name");
        auto buf = new Buffer (file);
        ---

        However, this is typically hidden by higher level constructors
        such as those exposed via the stream wrappers. For example:
        ---
        auto input = new DataInput (new FileInput("file.name"));
        ---

        There is indeed a buffer between the resultant stream and the
        file source, but explicit buffer construction is unnecessary in
        common cases.

        An Iterator is constructed in a similar manner, where you provide
        it an input stream to operate upon. There's a variety of iterators
        available in the tango.text.stream package, and they are templated
        for each of utf8, utf16, and utf32. This example uses a line iterator
        to sweep a text file:
        ---
        auto lines = new LineInput (new FileInput("file.name"));
        foreach (line; lines)
                 Cout(line).newline;
        ---

        Buffers are useful for many purposes within Tango, but there
        are times when it may be more appropriate to sidestep them. For
        such cases, all conduit derivatives (such as FileConduit) support
        direct array-based IO via a pair of read() and write() methods.

*******************************************************************************/

class Buffer : IBuffer
{
        protected OutputStream  sink;                   // optional data sink
        protected InputStream   source;                 // optional data source
        protected void[]        data;                   // the raw data buffer
        protected uint          index;                  // current read position
        protected uint          extent;                 // limit of valid content
        protected uint          dimension;              // maximum extent of content
        protected bool          canCompress = true;     // compress iterator content?


        protected static char[] overflow  = "output buffer is full";
        protected static char[] underflow = "input buffer is empty";
        protected static char[] eofRead   = "end-of-flow whilst reading";
        protected static char[] eofWrite  = "end-of-flow whilst writing";

        /***********************************************************************

                Ensure the buffer remains valid between method calls

        ***********************************************************************/

        invariant
        {
                assert (index <= extent);
                assert (extent <= dimension);
        }

        /***********************************************************************

                Construct a buffer

                Params:
                conduit = the conduit to buffer

                Remarks:
                Construct a Buffer upon the provided conduit. A relevant
                buffer size is supplied via the provided conduit.

        ***********************************************************************/

        this (IConduit conduit)
        {
                assert (conduit);

                this (conduit.bufferSize);
                setConduit (conduit);
        }

        /***********************************************************************

                Construct a buffer

                Params:
                stream = an input stream
                capacity = desired buffer capacity

                Remarks:
                Construct a Buffer upon the provided input stream.

        ***********************************************************************/

        this (InputStream stream, uint capacity)
        {
                this (capacity);
                input = stream;
        }

        /***********************************************************************

                Construct a buffer

                Params:
                stream = an output stream
                capacity = desired buffer capacity

                Remarks:
                Construct a Buffer upon the provided output stream.

        ***********************************************************************/

        this (OutputStream stream, uint capacity)
        {
                this (capacity);
                output = stream;
        }

        /***********************************************************************

                Construct a buffer

                Params:
                capacity = the number of bytes to make available

                Remarks:
                Construct a Buffer with the specified number of bytes.

        ***********************************************************************/

        this (uint capacity = 0)
        {
                setContent (new ubyte[capacity], 0);
        }

        /***********************************************************************

                Construct a buffer

                Params:
                data = the backing array to buffer within

                Remarks:
                Prime a buffer with an application-supplied array. All content
                is considered valid for reading, and thus there is no writable
                space initially available.

        ***********************************************************************/

        this (void[] data)
        {
                setContent (data, data.length);
        }

        /***********************************************************************

                Construct a buffer

                Params:
                data =          the backing array to buffer within
                readable =      the number of bytes initially made
                                readable

                Remarks:
                Prime buffer with an application-supplied array, and
                indicate how much readable data is already there. A
                write operation will begin writing immediately after
                the existing readable content.

                This is commonly used to attach a Buffer instance to
                a local array.

        ***********************************************************************/

        this (void[] data, uint readable)
        {
                setContent (data, readable);
        }
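
        // Illustrative sketch, not part of the original source: attach a
        // Buffer to a caller-owned array with nothing readable yet, then
        // append to it. This assumes that append() copies into the backing
        // array in place, as the remarks above suggest. The name 'store'
        // is hypothetical.
        unittest
        {
                auto store = new char[64];
                auto buf = new Buffer (store, 0);

                // written content lands in the caller's array
                buf.append ("abc");
                assert (buf.readable == 3);
                assert (store [0 .. 3] == "abc");
        }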

        /***********************************************************************

                Attempt to share an upstream Buffer, and create an instance
                where there is not one available.

                Params:
                stream = an input stream
                size = a hint of the desired buffer size. Defaults to the
                conduit-defined size

                Remarks:
                If an upstream Buffer instance is visible, it will be shared.
                Otherwise, a new instance is created based upon the bufferSize
                exposed by the stream endpoint (conduit).

        ***********************************************************************/

        static IBuffer share (InputStream stream, uint size=uint.max)
        {
                auto b = cast(Buffered) stream;
                if (b)
                    return b.buffer;

                if (size is uint.max)
                    size = stream.conduit.bufferSize;

                return new Buffer (stream, size);
        }

        /***********************************************************************

                Attempt to share an upstream Buffer, and create an instance
                where there is not one available.

                Params:
                stream = an output stream
                size = a hint of the desired buffer size. Defaults to the
                conduit-defined size

                Remarks:
                If an upstream Buffer instance is visible, it will be shared.
                Otherwise, a new instance is created based upon the bufferSize
                exposed by the stream endpoint (conduit).

        ***********************************************************************/

        static IBuffer share (OutputStream stream, uint size=uint.max)
        {
                auto b = cast(Buffered) stream;
                if (b)
                    return b.buffer;

                if (size is uint.max)
                    size = stream.conduit.bufferSize;

                return new Buffer (stream, size);
        }

        /***********************************************************************

                Reset the buffer content

                Params:
                data =  the backing array to buffer within. All content
                        is considered valid

                Returns:
                the buffer instance

                Remarks:
                Set the backing array with all content readable. Writing
                to this will either flush it to an associated conduit, or
                raise an Eof condition. Use clear() to reset the content
                (make it all writable).

        ***********************************************************************/

        IBuffer setContent (void[] data)
        {
                return setContent (data, data.length);
        }

        /***********************************************************************

                Reset the buffer content

                Params:
                data =          the backing array to buffer within
                readable =      the number of bytes within data considered
                                valid

                Returns:
                the buffer instance

                Remarks:
                Set the backing array with some content readable. Writing
                to this will either flush it to an associated conduit, or
                raise an Eof condition. Use clear() to reset the content
                (make it all writable).

        ***********************************************************************/

        IBuffer setContent (void[] data, uint readable)
        {
                this.data = data;
                this.extent = readable;
                this.dimension = data.length;

                // reset to start of input
                this.index = 0;

                return this;
        }

        /***********************************************************************

                Access buffer content

                Params:
                size =  number of bytes to access
                eat =   whether to consume the content or not

                Returns:
                the corresponding buffer slice when successful, or
                null if there's not enough data available (Eof; Eob).

                Remarks:
                Read a slice of data from the buffer, loading from the
                conduit as necessary. The specified number of bytes is
                sliced from the buffer, and marked as having been read
                when the 'eat' parameter is set true. When 'eat' is set
                false, the read position is not adjusted.

                Note that the slice cannot be larger than the size of
                the buffer ~ use method fill(void[]) instead where you
                simply want the content copied, or use conduit.read()
                to extract directly from an attached conduit. Also note
                that if you need to retain the slice, then it should be
                .dup'd before the buffer is compressed or repopulated.

                Examples:
                ---
                // create a buffer with some content
                auto buffer = new Buffer ("hello world");

                // consume everything unread
                auto slice = buffer.slice (buffer.readable);
                ---

        ***********************************************************************/

        void[] slice (uint size, bool eat = true)
        {
                if (size > readable)
                   {
                   if (source is null)
                       error (underflow);

                   // make some space? This will try to leave as much content
                   // in the buffer as possible, such that entire records may
                   // be aliased directly from within.
                   if (size > writable)
                      {
                      if (size > dimension)
                          error (underflow);
                      if (canCompress)
                          compress ();
                      }

                   // populate tail of buffer with new content
                   do {
                      if (fill(source) is IConduit.Eof)
                          error (eofRead);
                      } while (size > readable);
                   }

                auto i = index;
                if (eat)
                    index += size;
                return data [i .. i + size];
        }
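
        // Illustrative sketch, not part of the original source: the 'eat'
        // flag decides whether slice() advances the read position. Passing
        // false gives a peek, while the default consumes the content. The
        // variable names are hypothetical.
        unittest
        {
                auto buf = new Buffer ("hello world");

                // peek at the first word without consuming it
                auto peek = cast(char[]) buf.slice (5, false);
                assert (peek == "hello");
                assert (buf.readable == 11);

                // the same request with eat = true moves the read position
                auto word = cast(char[]) buf.slice (5);
                assert (word == "hello");
                assert (buf.readable == 6);
        }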

        /**********************************************************************

                Fill the provided buffer. Returns the number of bytes
                actually read, which will be less than dst.length when
                Eof has been reached and IConduit.Eof thereafter

        **********************************************************************/

        uint fill (void[] dst)
        {
                uint len = 0;

                while (len < dst.length)
                      {
                      uint i = read (dst [len .. $]);
                      if (i is IConduit.Eof)
                          return (len > 0) ? len : IConduit.Eof;
                      len += i;
                      }
                return len;
        }
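
        // Illustrative sketch, not part of the original source: a short
        // read followed by Eof. This assumes that read(void[]), which is
        // defined elsewhere in this class, returns IConduit.Eof once a
        // buffer with no attached source has been drained.
        unittest
        {
                auto buf = new Buffer ("hello");
                auto dst = new char[8];

                // only five bytes are available for an eight byte request
                assert (buf.fill (dst) == 5);
                assert (dst [0 .. 5] == "hello");

                // nothing remains, so Eof is reported thereafter
                assert (buf.fill (dst) == IConduit.Eof);
        }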

        /***********************************************************************

                Copy buffer content into the provided dst

                Params:
                dst = destination of the content
                bytes = size of dst

                Returns:
                A reference to the populated content

                Remarks:
                Fill the provided array with content. We try to satisfy
                the request from the buffer content, and read directly
                from an attached conduit where more is required.

        ***********************************************************************/

        void[] readExact (void* dst, uint bytes)
        {
                auto tmp = dst [0 .. bytes];
                if (fill (tmp) != bytes)
                    error (eofRead);

                return tmp;
        }

        /***********************************************************************

                Append content

                Params:
                src = the content to _append

                Returns a chaining reference if all content was written.
                Throws an IOException indicating eof or eob if not.

                Remarks:
                Append an array to this buffer, and flush to the
                conduit as necessary. This is often used in lieu of
                a Writer.

        ***********************************************************************/

        IBuffer append (void[] src)
        {
                return append (src.ptr, src.length);
        }

        /***********************************************************************

                Append content

                Params:
                src = the content to _append
                length = the number of bytes in src

                Returns a chaining reference if all content was written.
                Throws an IOException indicating eof or eob if not.

                Remarks:
                Append an array to this buffer, and flush to the
                conduit as necessary. This is often used in lieu of
                a Writer.

        ***********************************************************************/

        IBuffer append (void* src, uint length)
        {
                if (length > writable)
                    // can we write externally?
                    if (sink)
                       {
                       flush ();

                       // check for pathological case
                       if (length > dimension)
                          {
                          do {
                             auto written = sink.write (src [0 .. length]);
                             if (written is IConduit.Eof)
                                 error (eofWrite);
                             src += written, length -= written;
                             } while (length > dimension);
                          }
                       }
                    else
                       error (overflow);

                copy (src, length);
                return this;
        }

        /***********************************************************************

                Append content

                Params:
                other = a buffer with content available

                Returns:
                Returns a chaining reference if all content was written.
                Throws an IOException indicating eof or eob if not.

                Remarks:
                Append another buffer to this one, and flush to the
                conduit as necessary. This is often used in lieu of
                a Writer.

        ***********************************************************************/

        IBuffer append (IBuffer other)
        {
                return append (other.slice);
        }

        /***********************************************************************

                Consume content from a producer

                Params:
                x = the content to consume. This is consumed verbatim,
                    and in raw binary format ~ no implicit conversions
                    are performed.

                Remarks:
                This is often used in lieu of a Writer, and enables simple
                classes, such as FilePath and Uri, to emit content directly
                into a buffer (thus avoiding potential heap activity)

                Examples:
                ---
                auto path = new FilePath (somepath);

                path.produce (&buffer.consume);
                ---

        ***********************************************************************/

        void consume (void[] x)
        {
                append (x);
        }

        /***********************************************************************

                Retrieve the valid content

                Returns:
                a void[] slice of the buffer

                Remarks:
                Return a void[] slice of the buffer, from the current position
                up to the limit of valid content. The content remains in the
                buffer for future extraction.

        ***********************************************************************/

        void[] slice ()
        {
                return  data [index .. extent];
        }

        /***********************************************************************

                Move the current read location

                Params:
                size = the number of bytes to move

                Returns:
                Returns true if successful, false otherwise.

                Remarks:
                Skip ahead by the specified number of bytes, streaming from
                the associated conduit as necessary.

                Can also reverse the read position by 'size' bytes, when size
                is negative. This may be used to support lookahead operations.
                Note that a negative size will fail where there is not sufficient
                content available in the buffer (can't _skip beyond the beginning).

        ***********************************************************************/

        bool skip (int size)
        {
                if (size < 0)
                   {
                   size = -size;
                   if (index >= size)
                      {
                      index -= size;
                      return true;
                      }
                   return false;
                   }
                return slice(size) !is null;
        }
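
        // Illustrative sketch, not part of the original source: skipping
        // forward and backward within a memory-only buffer. Only requests
        // that stay within the buffered content are exercised here.
        unittest
        {
                auto buf = new Buffer ("abcdef");

                // move forward past "abcd", then back up over "cd"
                assert (buf.skip (4));
                assert (buf.skip (-2));
                assert (cast(char[]) buf.slice() == "cdef");

                // a rewind beyond the start of the buffer is refused
                assert (! buf.skip (-10));
        }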

        /***********************************************************************

                Iterator support

                Params:
                scan = the delegate to invoke with the current content

                Returns:
                Returns true if a token was isolated, false otherwise.

                Remarks:
                Upon success, the delegate should return the byte-based
                index of the consumed pattern (tail end of it). Failure
                to match a pattern should be indicated by returning an
                IConduit.Eof

                Each pattern is expected to be stripped of the delimiter.
                An end-of-file condition causes trailing content to be
                placed into the token. Requests made beyond Eof result
                in empty matches (length is zero).

                Note that additional iterator and/or reader instances
                will operate in lockstep when bound to a common buffer.

        ***********************************************************************/

        bool next (uint delegate (void[]) scan)
        {
                while (read(scan) is IConduit.Eof)
                       // not found - are we streaming?
                       if (source)
                          {
                          // is there consumed content ahead of the partial token?
                          if (position && canCompress)
                              // yes - move partial token to start of buffer
                              compress;
                          else
                             // no more space in the buffer?
                             if (writable is 0 && expand(0) is 0)
                                 error ("Token is too large to fit within buffer");

                          // read another chunk of data
                          if (fill(source) is IConduit.Eof)
                              return false;
                          }
                       else
                          return false;

                return true;
        }
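
        // Illustrative sketch, not part of the original source: a scanner
        // delegate that isolates comma-delimited tokens. This assumes that
        // read(uint delegate(void[])), defined elsewhere in this class,
        // passes the unread content to the delegate and advances the read
        // position by the returned count. The names 'scan' and 'token' are
        // hypothetical.
        unittest
        {
                auto buf = new Buffer ("one,two");
                char[] token;

                uint scan (void[] raw)
                {
                        auto text = cast(char[]) raw;
                        for (uint i = 0; i < text.length; ++i)
                             if (text[i] == ',')
                                {
                                // strip the delimiter, consume through it
                                token = text [0 .. i];
                                return i + 1;
                                }
                        return IConduit.Eof;
                }

                assert (buf.next (&scan));
                assert (token == "one");

                // no comma remains and there is no source to stream from
                assert (! buf.next (&scan));
        }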

        /***********************************************************************

                Configure the compression strategy for iterators

                Remarks:
                Iterators will tend to compress the buffered content in
                order to maximize space for new data. You can disable this
                behaviour by setting this boolean to false

        ***********************************************************************/

        final bool compress (bool yes)
        {
                auto ret = canCompress;
                canCompress = yes;
                return ret;
        }
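
        // Illustrative sketch, not part of the original source: the setter
        // hands back the previous setting, so a caller can restore it later.
        unittest
        {
                auto buf = new Buffer (16);

                // compression is enabled by default
                assert (buf.compress (false));

                // the prior call disabled it, so false comes back here
                assert (! buf.compress (true));
        }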

        /***********************************************************************

                Available content

                Remarks:
                Return count of _readable bytes remaining in buffer. This is