Ticket #1914 (closed defect: fixed)

Opened 10 years ago

Last modified 10 years ago

tango.text.convert.Utf and UnicodeBom.decode: "ate" poorly documented

Reported by: Deewiant
Assigned to: kris
Priority: major
Milestone: 1.0
Component: Tango
Version: 0.99.9 Kai
Keywords:
Cc:

Description

The functions in tango.text.convert.Utf appear to assume that they've been given a valid output buffer if they're given a non-null "ate" parameter.

For example, the following causes a segmentation fault:

import tango.text.convert.Utf;
void main() {
	uint ate;
	dchar[] utf32 = "foo";
	char[] utf8 = toString(utf32, cast(char[])null, &ate);
	assert (ate == 3);
	assert (utf8.length == 3);
}

Change History

05/03/10 15:19:32 changed by Deewiant

  • summary changed from tango.text.convert.Utf: crashes on null output but non-null "ate" to tango.text.convert.Utf and UnicodeBom.decode: "ate" poorly documented.

Okay, scratch that: I see now how it's supposed to work, and am changing the ticket to complain about the documentation.

If "ate" is given, the buffer is intentionally never resized as it's assumed to be a streaming operation: the way to convert the whole input would be something like calling toString(utf32[ate..$], buffer, &ate) until ate reaches utf32.length. Right?
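
A minimal sketch of that streaming loop, under a few assumptions that the documentation does not confirm: that "ate" reports how many input code points were consumed from the slice passed in (so it has to be accumulated, rather than compared against utf32.length directly), that the returned array covers only the filled part of the buffer, and that the buffer is large enough for each call to make progress. convertAll is a hypothetical helper, not a Tango function.

```d
import tango.text.convert.Utf;

// Hypothetical helper (not part of Tango): converts all of `input`
// by repeatedly filling a fixed-size buffer and appending each chunk.
char[] convertAll (dchar[] input)
{
        char[64] buffer;        // scratch space; assumed never resized when "ate" is given
        char[]   result;
        uint     consumed = 0;

        while (consumed < input.length)
              {
              uint ate;

              // Assumption: "ate" counts code points eaten from the slice passed in
              char[] chunk = toString (input[consumed .. $], buffer, &ate);

              if (ate == 0)
                  break;        // no progress; guards against looping forever

              result ~= chunk;  // copy out before the buffer is reused
              consumed += ate;
              }
        return result;
}
```

The caller owns the buffer here, which seems to be the point of the streaming mode: nothing is heap-allocated by toString itself, and only the appends to result allocate.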

The documentation doesn't state anything about this. "ate" is undocumented in Utf (but not in UnicodeBom.decode), and all that is said about the output buffer is that it'll be heap-allocated if it's too small. The UnicodeBom documentation, on the other hand, says nothing about the output buffer, although it mentions "ate", with no hint of how it changes the behaviour.

Originally I understood "ate" as being a tool for finding invalid UTF, with the idea that if it doesn't match the input length, there's an invalid code unit at input[ate]. Turns out I was badly mistaken!

05/09/10 22:34:03 changed by kris

  • status changed from new to closed.
  • resolution set to fixed.

(In [5439]) fixes #1914 :: tango.text.convert.Utf and UnicodeBom.decode: "ate" poorly documented

thanks to Deewiant