About strings/ints

 
bobef



Joined: 05 Jun 2005
Posts: 269

Posted: Tue Aug 21, 2007 7:09 am    Post subject: About strings/ints

Hi,

I've been looking at MiniD these days. It seems very nice, with good integration with D, but one thing keeps me wondering. What is the reason for using utf32 and not utf8 like D? So many conversions seem slow. Also, why not double/long, but int/float? This seems like an awful restriction.
JarrettBillingsley



Joined: 20 Jun 2006
Posts: 457
Location: Pennsylvania!

Posted: Tue Aug 21, 2007 5:21 pm    Post subject: About strings/ints

Quote:
What is the reason for using utf32 and not utf8 like D? So many conversions seem slow.


The convention in D is, and has been, to use char[] as the string type, which has mostly been perpetuated by a largely western audience and Phobos' lack of support for anything but char[]. But thanks to Tango's equally capable handling of all three UTF encodings that D supports, there's no overriding reason to use one over the other, except for space.

That being said, the only requirement for MiniD's strings is that they appear to the language as if they were an immutable sequence of UTF-32 codepoints. This was chosen mostly to avoid having to deal with ugly multibyte character issues (indexing, slicing, etc.) from within script code. The internal representation can be just about anything, as long as it provides that illusion to the script code. I've been considering using Chris Miller's dstring struct, which automatically chooses which encoding to use in order to save space.
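
As an illustration of the indexing issue described above, here is a minimal sketch in plain D. It is not taken from MiniD's source, and the variable names are made up for the example. It compares the same five-character text stored as UTF-8 and as UTF-32:

```d
void main()
{
    // "na\u00EFve" is the word "naive" with a diaeresis on the i (U+00EF).
    auto s8  = "na\u00EFve";   // UTF-8 code units
    auto s32 = "na\u00EFve"d;  // UTF-32 code units, one per code point

    // U+00EF needs two bytes in UTF-8, so the lengths differ.
    assert(s8.length  == 6);
    assert(s32.length == 5);

    // Indexing the UTF-32 string yields the whole character...
    assert(s32[2] == 0x00EF);

    // ...while indexing the UTF-8 string yields only one byte of it.
    assert(s8[2] == 0xC3);
    assert(s8[3] == 0xAF);
}
```

With UTF-32, index i always refers to the i-th code point, which is the view of strings that script code is meant to see.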

(Lastly, since this is D1 without constness, no matter what encoding is used, the string data is still duplicated to preserve immutability of string objects.)

Quote:
why not double/long, but int/float? This seems like an awful restriction.


floats in MiniD are double, though. The spec page on types says "A float is the same as a D double: a double-precision IEEE 754 floating-point number." You can also re-alias mdfloat in minid.utils to whatever you'd like for your particular project; to float if you'd like to save a bit of space in the MDValue struct, double or real if you need lots of precision.

It uses 32-bit ints because I don't have a 64-bit machine to test long on. I know that's a poor excuse because you can test long on a 32-bit machine as well. Of course, I could probably do a "version(X86_64) alias long mdint; else alias int mdint;" much like the mdfloat alias, but all things aside, using 'long' as the integer type shouldn't cause any problems.
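
Spelled out as a compilable file, the conditional alias quoted above might look like the sketch below. This is only an illustration: mdint is the name floated in this post and is not confirmed to exist in MiniD's source, and the checks in main are added for the example.

```d
// Hypothetical sketch: pick the script integer type per target.
version (X86_64)
    alias long mdint;   // 64-bit integers on x86-64 builds
else
    alias int mdint;    // 32-bit integers everywhere else

void main()
{
    // The alias always resolves to one of the two candidate types.
    static assert(is(mdint == long) || is(mdint == int));

    // long is a 64-bit type on every D target, including 32-bit ones,
    // so it can be exercised without 64-bit hardware.
    long big = int.max;
    big += 1;
    assert(big == 2_147_483_648L);
}
```

Keeping the choice behind a single alias means the rest of the interpreter only refers to mdint, so switching the underlying type is a one-line change.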
bobef



Joined: 05 Jun 2005
Posts: 269

Posted: Wed Aug 22, 2007 12:02 am    Post subject: About strings/ints

Quote:
It uses 32-bit ints because I don't have a 64-bit machine to test long on


What is there to test? Just replace int with long and it will work. Since it holds more data than int, it won't break anything :) Just adjust MiniD to accept longer numbers ;)

And about the strings, what troubles me is that in utf32 each character takes 4 bytes of memory instead of 1, which obviously eats more memory and is slower.
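
To put numbers on the memory difference, here is a minimal sketch in plain D. The sample strings and variable names are illustrative only. The 4-to-1 ratio holds for ASCII text; characters outside ASCII already take more than one byte in UTF-8, so the gap narrows there.

```d
void main()
{
    // ASCII text: 1 byte per character in UTF-8, 4 bytes in UTF-32.
    auto ascii8  = "hello";
    auto ascii32 = "hello"d;
    assert(ascii8.length  * char.sizeof  == 5);
    assert(ascii32.length * dchar.sizeof == 20);

    // Six Cyrillic letters (U+043F U+0440 U+0438 U+0432 U+0435 U+0442):
    // 2 bytes each in UTF-8, still 4 bytes each in UTF-32.
    auto cyr8  = "\u043F\u0440\u0438\u0432\u0435\u0442";
    auto cyr32 = "\u043F\u0440\u0438\u0432\u0435\u0442"d;
    assert(cyr8.length  * char.sizeof  == 12);
    assert(cyr32.length * dchar.sizeof == 24);
}
```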
All times are GMT - 6 Hours