Localisation in SecureSqueak will be implemented by add-on packages. As little localisation as possible will occur in the kernel, because localisation code in the kernel cannot be updated.
Issues
- Character sort order often differs by country and alphabet.
- String sort order often differs by country and alphabet.
- Number formatting can differ between and within countries:
  - There are multiple ways of representing negative numbers; scientists and mathematicians use a minus sign, accountants use parentheses, and there are even more formats.
  - Different countries use commas, dots and other symbols as decimal points and thousands separators.
  - Number>>asWords returns an English description.
  - There are multiple ways of notating other bases such as binary, hexadecimal, etc.
- Dates, times, durations, periods, etc. have dozens of ways of being formatted and handled.
- Keyboard layouts differ; character events need to come into Subcanvas as Unicode (or a similar encoding).
- If fonts are going to be supported by Subcanvas, there are many complications:
  - right-to-left text,
  - ligatures which are required by some cultures,
  - ligatures which are made by context,
  - missing characters in the font.
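To make the number-formatting issue concrete, here is a minimal sketch in Python (used purely for illustration; nothing here is SecureSqueak code). The function name `format_number` and its parameters are hypothetical. It shows how one value can legitimately be rendered in several ways depending on the separators and the negative-number style in use.

```python
def format_number(value, decimal_sep=".", group_sep=",", negative_style="minus"):
    """Format a number with two decimals (hypothetical helper, illustration only)."""
    # Start from Python's built-in grouping, which always uses ',' and '.'.
    text = f"{abs(value):,.2f}"
    # Swap the separators through a placeholder so they do not clobber each other.
    text = text.replace(",", "\0").replace(".", decimal_sep).replace("\0", group_sep)
    if value < 0:
        if negative_style == "parentheses":
            # Accounting style: negative amounts are wrapped in parentheses.
            return f"({text})"
        return f"-{text}"
    return text


# The same value under three different conventions.
print(format_number(-1234567.891))
print(format_number(-1234567.891, decimal_sep=",", group_sep="."))
print(format_number(-1234567.891, negative_style="parentheses"))
```

Even this small helper needs three parameters, and it ignores digit grouping sizes, currency symbols and non-Latin digits, which is part of why this logic is a poor fit for a kernel that cannot be updated.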
Solutions (or ideas)
Ideally, any mechanism that is complex and subject to localisation would be remotely loadable code.
- Date, time, duration can be in an externally loaded package. The kernel only needs to provide a millisecond value since the epoch.
- Canvas targets can implement their own keyboard layouts by processing raw keyboard events. Currently the keyboard events from the VM do (handily) provide Unicode characters. These should perhaps not return a character instance though...
- Character and String classes should (if possible) be in externally loaded packages.
- If Character and String remain in the kernel, Character and String ordering is done strictly by numerical Unicode value.
- Number formatting in >>asString returns a Smalltalk number format.
- Dates et al are removed from SecureSqueak and are to be provided in an external package. The SecureSqueak kernel provides a method somewhere to return the number of milli/nanoseconds since 1970.
- Number>>asWords should be removed. Replace with external NumberFormatter>>asWords: aNumber.
- As many English strings as possible are to be removed from the kernel. Error and informative messages should be returned as a code, an Exception type, or similar. If this becomes unwieldy, then some localisation of the kernel could be investigated.
- Ideally, the locale is determined per user rather than by VM. This means the user's information should be made available to applications and is not part of the SecureSqueak kernel.
- Optionally, the locale is determined by the operating system (??? maybe?)
- Fonts are managed by external packages; SecureSqueak has no fonts in the kernel.
- Geographic location and locale are separate: travellers, ex-pats and con-langers have different locales than the people around them.
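The split between a language-neutral kernel and loadable localisation packages could look roughly like the following Python sketch. Every name in it (`kernel_millis_since_epoch`, `KernelError`, `MESSAGES`, `describe`) is an assumption made for illustration, not an existing SecureSqueak interface. The "kernel" half only exposes a raw time value and error codes; the "package" half turns a code into text for a given locale.

```python
import time

# --- "Kernel" side: no English, no locale knowledge (hypothetical names) ---

def kernel_millis_since_epoch():
    """Return milliseconds since 1970-01-01 UTC as a plain integer."""
    return time.time_ns() // 1_000_000


class KernelError(Exception):
    """Carries a language-neutral code and arguments instead of an English message."""

    def __init__(self, code, *arguments):
        super().__init__(code, *arguments)
        self.code = code
        self.arguments = arguments


# --- "External package" side: loadable, replaceable, locale-aware ---

# Message templates keyed by locale, then by error code (sample data only).
MESSAGES = {
    "en": {"index_out_of_bounds": "Index {0} is out of bounds (size {1})."},
    "de": {"index_out_of_bounds": "Index {0} liegt außerhalb der Grenzen (Größe {1})."},
}


def describe(error, locale):
    """Look up a localised template for the error; fall back to the bare code."""
    template = MESSAGES.get(locale, {}).get(error.code)
    if template is None:
        return error.code
    return template.format(*error.arguments)


error = KernelError("index_out_of_bounds", 5, 3)
print(describe(error, "en"))
print(describe(error, "fr"))  # no French catalogue loaded, so the code is shown
```

Because the locale is an argument to `describe` rather than global state, the same error can be rendered differently for two users of the same image, in line with the per-user locale idea above.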
In summary:
- No English.
- No fonts.
- >>asString returns Smalltalk formatted numbers, characters, etc.
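As a rough illustration of what "Smalltalk formatted" means for integers, the Python sketch below prints a non-negative integer either in plain decimal or in Smalltalk radix notation such as `16r1F`. The function name `smalltalk_integer_string` is hypothetical, and negative numbers are deliberately left out because their radix form is not specified on this page.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def smalltalk_integer_string(value, base=10):
    """Render a non-negative integer as a Smalltalk-style literal (illustration only)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if value < 0:
        raise ValueError("negative numbers are outside the scope of this sketch")
    if value == 0:
        digits = "0"
    else:
        chars = []
        while value:
            # Peel off the least significant digit in the requested base.
            value, remainder = divmod(value, base)
            chars.append(DIGITS[remainder])
        digits = "".join(reversed(chars))
    # Base 10 is printed bare; other bases use the <base>r<digits> form.
    return digits if base == 10 else f"{base}r{digits}"


print(smalltalk_integer_string(31, 16))
print(smalltalk_integer_string(10, 2))
print(smalltalk_integer_string(255))
```

The output is locale-independent by construction: there are no separators and no words, so nothing in it would need translating.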
Links
http://en.wikipedia.org/wiki/Internationalization_and_localization
http://www-01.ibm.com/software/globalization/index.jsp
http://msdn.microsoft.com/en-us/goglobal/bb688110.aspx
http://www.unicode.org/versions/Unicode5.0.0/ch05.pdf