Improving Smalltalk

Last edited by Michael van der Gulik 1 yr ago

 

Smalltalk is my favourite programming language. It does have some things which annoy me though.

 

Syntax

 

  • Comments use double quotes. Double quotes mean "this is a string" to me. I propose using bars, hashes or asterisks for comment delimiters. Single quotes could delimit Strings and double quotes could delimit Symbols.
  • Often we quote bits of Smalltalk in English-text documents such as this one. In this case, it would be nice to be able to use "reverse" comment delimiters (e.g. "^ 'hello world'").
  • Using a quote (') as a String delimiter is annoying because quotes can occur in strings quite often ('I''m an ''example''').
  • Symbols begin with hashes, but then require quotes if they contain characters requiring delimiters.
  • | variable declarations | also seem a bit odd to me. Instead, the variable declarations should be part of method metadata (see Metadata below).
  • There's no easy way to do a switch/case statement. Sometimes they are genuinely useful.
  • Comments should be able to use Text with formatting (bold, italic, colours, headings, etc). Perhaps even images and hyperlinks should be available?
  • You should be able to use Unicode anywhere in a method - for method names, variables, etc.
  • I actually like Squeak's way of defining Arrays: {'this'. 'is'. #an. 'Array'}.
  • Along with Squeak's way of defining arrays, I think the compiler should be able to pick out dictionaries as well: {1->2} would automatically become a dictionary rather than an array.
  • The other array syntax makes no sense to me: #(1 2 3).
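
For reference, a minimal sketch of how the literals discussed above are written in Squeak today. These are workspace expressions and the values are only illustrations. It also shows the difference between the two array syntaxes: #( ) is a literal built by the compiler and can only contain other literals, while { } evaluates each expression at run time.

```smalltalk
"This is a comment."
'I''m an ''example'''.                  "String; embedded quotes are doubled"
#example.                               "Symbol"
#'needs quotes'.                        "Symbol containing a space"
#(1 2 3).                               "literal Array, built by the compiler"
{1 + 1. 'two' size. #three}.            "brace Array, elements evaluated at run time"
Dictionary newFrom: {1 -> 2. 3 -> 4}.   "Dictionary built from Associations at run time"
```

There is no literal syntax for a Dictionary, which is why the last line has to send a message to build one.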

 

Just imagine that your code was going to be professionally typeset. For example, you'd put multi-line strings and comments in boxes, or perhaps in a side panel.

 

I'm not sure how a case statement would be implemented. Maybe it could be a dictionary? The compiler would need to process this, however.

 

acceptPacket: byteArray

    byteArray first caseOf: < {

        1 -> [ do something ].

        2 -> [ do something else]

    } asDictionary >.

 

Here, the case statement syntactically looks like an array (or dictionary) with a number of Associations. It uses an angled bracket syntax described below which runs the code in the angled brackets at compilation time and stores the result in a method literal. The implementation of Object>>caseOf: would compare the object to each Association key, and execute the block found.
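
The lookup that such a caseOf: would perform can be sketched in today's Smalltalk with an ordinary Dictionary of blocks. This is a workspace sketch only: the names cases and packetType are made up for illustration, and the dictionary is built at run time rather than stored as a method literal.

```smalltalk
| cases packetType |
cases := Dictionary new.
cases at: 1 put: [ 'do something' ].
cases at: 2 put: [ 'do something else' ].

packetType := 2.   "stands in for 'byteArray first'"

"Find the block for the key and run it; fall back to a default block."
(cases at: packetType ifAbsent: [ [ 'no matching case' ] ]) value
    "answers 'do something else'"
```

The ifAbsent: block answers another block so that the final value send works the same way whether or not a case matched.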

 

Would compiler optimisations be worthwhile here?
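
For comparison, Squeak already provides Object>>caseOf: and Object>>caseOf:otherwise:, which take a brace array of block -> block associations. As far as I know, its compiler inlines the send when the argument is written as a brace array, so no Array is built at run time. A minimal workspace sketch, with packetType as a made-up variable:

```smalltalk
| packetType |
packetType := 2.
packetType
    caseOf: {
        [1] -> [ 'do something' ].
        [2] -> [ 'do something else' ] }
    otherwise: [ 'unknown packet type' ]
```

Note that the keys are blocks here rather than plain values, which differs from the dictionary-style syntax proposed above. Other dialects may not include these selectors.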

 

Comments could also just be String literals in the method, which would be found to be never used and discarded by the compiler:

someMethod

    'I return a number.'.

    ^ 5.

 

Semantics

 

Metadata

 

Metadata is used to describe methods, classes and so forth. It could be used to:

  • Describe the variables available in a namespace, class, method or block.
  • Describe the types of variables.
  • Describe the security of a namespace or method (public, protected or private).
  • Do comments.
  • If executable, initialize "static" variables.

 

For example:

 

someMethod

    <desc: 'I am a method that does something.'>

    <private> "i.e. a private method"

    <vars: {'i', 'j'}>

    i := 10.

    ^ i.

 

(alternate version:)

 

someMethod

    <method desc: 'I am a method that does something.'>

    <method private> "i.e. a private method"

    <method vars: {'i', 'j'}>

    i := 10.

    ^ i.

 

 

Where code in "<" and ">" would be somehow executed during compilation to add metadata to the method. <var: x> might declare "x" to be a variable. Comments could be then implemented using <c: 'hello world'>.

 

Tags could be added as <todo: 'I don''t know what I''m doing here.'>. Assertions could ...maybe?... be added as <assert: [x=4]>.
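
Squeak and Pharo already accept something close to this: a pragma such as <todo: 'text'> at the top of a method is stored with the CompiledMethod and can be read back. As far as I know, pragma arguments must be literals and, apart from special cases such as primitive:, nothing is executed at compile time. A minimal sketch, assuming a hypothetical class Example; the pragma names desc: and todo: are made up, and newer Pharo versions may prefer "selector" over "keyword".

```smalltalk
"A method in the hypothetical class Example:"
someMethod
    <desc: 'I am a method that does something.'>
    <todo: 'Decide what to return.'>
    | i |
    i := 10.
    ^ i

"Workspace: read the metadata back from the CompiledMethod."
(Example >> #someMethod) pragmas do: [ :each |
    Transcript
        show: each keyword;
        show: ' ';
        show: each arguments printString;
        cr ]
```

This covers descriptions and searchable tags, but not variable declarations or assertions, which would still need compiler support.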

 

Complex compile-time literals.

 

Perhaps this is how compile-time literals could be added? In the following examples, the code in < and > would be evaluated at compile time and the result would become a method literal:

 

a := <1+36i>.

b := <{#an. 'array'. 'as'. 'A' asLowercase. 'sorted'. 'collection'} asSortedCollection>.

 

I'm not sure how you would elegantly make this work. The metadata of a method would somehow execute on the method itself (rather than on an instance of that method) which would be a CompiledMethod. The instance variables of the CompiledMethod would be available to code in angled brackets. Methods such as >>private and >>desc: would be implemented on CompiledMethod? Or perhaps the object is actually a temporary object defined by the compiler?
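
Until something like this exists, a common workaround is to build the object once at run time and cache it, for example in a class variable. A minimal sketch, assuming a hypothetical class variable SortedWords declared on the class. Some dialects (GNU Smalltalk and Dolphin, as far as I know) offer a ##( ... ) form that evaluates an expression at compile time instead.

```smalltalk
sortedWords
    "Answer a shared SortedCollection, built on first use.
     SortedWords is a hypothetical class variable."
    ^ SortedWords ifNil: [
        SortedWords := #('an' 'array' 'as' 'a' 'sorted' 'collection') asSortedCollection ]
```

The cached object is still mutable, so every caller has to promise not to modify it.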

 

With a syntax that could allow compile-time execution and making fancy method literals, more things are possible:

 

oneToSix

    " Return an array of numbers from one to six "

    ^ < 1,, 2,, 3,, 4,, 5,, 6 >.

    " Object>>,, makes arrays. "

 

exampleDictionary

    " Return an example dictionary. "

    ^ < 1->'one',, 2->'two',, 3->'three' >.

    " This uses Smalltalk syntax, but is starting to get ugly. "

 

One problem is that method literals need to be read-only. This could be solved by invoking >>asReadOnly on method literals at some stage.

 

Angled brackets are a really bad idea because < and > are already valid binary message selectors; some other syntax would be needed.

 

Macros

 

...what if... the metadata blocks had access to the code being compiled? Say, for example, that a preprocessor ran over the code, picking out the angled bracket blocks and evaluating them. As it did so, it would output the method source to a stream which the metadata blocks could also access via a variable, e.g. "code":

 

macroTrickery

     < | myMacro | myMacro := [ code nextPutAll: '''hello, world!''' ] >

     Transcript show: < myMacro value >; cr.

 

Which would be preprocessed to:

 

macroTrickery

    Transcript show: 'hello, world!'; cr.

 

and then compiled.
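
The "output the source to a stream, then compile" half of this needs no new syntax: method source can already be written to a stream and handed to the compiler. A minimal sketch, assuming a hypothetical class Example and an arbitrary category name; it writes the preprocessed method by hand rather than scanning for angled brackets.

```smalltalk
| source |
source := String streamContents: [ :code |
    code nextPutAll: 'macroTrickery'; cr.
    code tab; nextPutAll: 'Transcript show: '.
    code nextPutAll: '''hello, world!'''.   "the text the macro block would emit"
    code nextPutAll: '; cr' ].

"Install the generated method on the hypothetical class, then run it."
Example compile: source classified: 'examples'.
Example new macroTrickery.
```

The missing piece is the scanner that finds the bracketed blocks and evaluates them with the stream in scope.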

 

Apparently this is similar to Lisp macros.

 

This is also similar to JSP pages.

 

What would the scope of a macro be? Obviously, it would not be defined in the same method in which it is used. Perhaps it is relevant only to one package or namespace? A package would make sense; a package is compiled as a unit.

 

On the other hand, macros are stupid and encourage the programmer to write spaghetti code.

 

Passing parameters.

 

Is it possible to pass runtime information to one of these "compile-time blocks"?

 

a := < :param | param doSomething > value: something.

 

??? or is this pointless or impossible? Thinking...

 

Alternatively, using metadata is a bad idea. Standard Smalltalk syntax should be used, and the Compiler should be responsible for optimisations.

 

I.e. Method metadata could be done like this:

 

someMethod: var

    Method private. "Method>>private makes sure that 'thisContext sender' is self. The compiler or VM could optimise this."

    var implements: StringType. "Throw an exception if var does not implement String's interface"

    var caseOf:

        'begin' -> [self begin ],,

        'middle' -> [self middle],,

        'end' -> [self end].

        " The compiler should be able to recognise that this dictionary is never modified and make it a method literal or something. "

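
Two of these checks can be approximated in Squeak today without new syntax. A minimal sketch: the selector asUppercase merely stands in for "String's interface", and the error messages are made up.

```smalltalk
someMethod: var
    "Private: refuse callers other than the receiver itself."
    thisContext sender receiver == self
        ifFalse: [ ^ self error: 'someMethod: is private' ].

    "Crude interface check: var must understand at least one String message."
    (var respondsTo: #asUppercase)
        ifFalse: [ ^ self error: 'var does not look like a String' ].

    ^ var asUppercase
```

Both checks run on every call, which is exactly the cost the compiler or VM would be expected to optimise away.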
 

Code is a [formatted] Object.

 

Code should be more than just plain text. As well as features in IDEs such as folding, syntax highlighting and so forth, I would like to see:

  • Formatted comments: italic, bold, headings, structure and so forth.
  • Hyperlinks in comments and code, perhaps linking to other documentation, bug reports, diagrams, hand-written scanned-in notes, etc.
  • Embedded objects, such as diagrams, images and audio snippets (?).
  • Anchors for links to the code.

 

The use of diagrams, e.g. UML, in code would be very nice.

 

Methods and Classes should have a templated format for their documentation, so that the Browser could implement something more powerful than JavaDoc.

 

Code should have searchable tags, such as TODO.

 

Version control should be built into the browser. This is already supported with the "versions" button, but a lot more functionality could be added to it.

 

Versions should have an "author" rather than just initials, and that author should link to the actual author's details such as full name, location, hair colour, email address etc.

 

Perhaps be able to fancy-format things like mathematical formulae, matrices, etc?

 

Blocks of code could be in boxes?

 

Unicode in Smalltalk

 

Unicode has a bunch of interesting symbols. Below, I'm not including the symbols because it's hard work.

 

Constants which would be useful anywhere in code:

  • Infinity
  • pi
  • e
  • Empty set (couldn't be a global variable; would need to be made by the compiler or something)

 

Method names:

 

 

  • Boolean operators: and, or, not. The not symbol would look a bit silly at the end of each expression. Also implies (right-arrow).
  • Set operators: element-of, sub-set of, union, etc.
  • Sum (capital sigma).
  • Superscripts
  • "..." symbol for 1 ... 5 do:
  • Multiplication symbol, division symbol.
  • Not-equals, not greater than, etc.
  • Not-almost-equal-to for floats.
  • Concatenation operator of some sort - the union operator.
  • Fractions rendered vertically (!!??)
  • Square roots rendered right? Or maybe this is going too far...?
  • +/- sign to return variables with error margins.

 

Variable names:

  • Script letters, e.g. i, j in mathematical script italics.
  • Sub-scripts.
  • Greek letters.
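
For reference, several of the constants and operators listed above already have plain-ASCII spellings in Squeak. A minimal workspace sketch of those equivalents; the Unicode symbols would simply be nicer names for the same messages.

```smalltalk
Float pi.                                           "pi"
Float e.                                            "e"
Float infinity.                                     "infinity"
Set new.                                            "an empty set, created at run time"

(1 to: 5) do: [ :n | Transcript show: n printString; cr ].   "1 ... 5 do:"
(1 to: 5) inject: 0 into: [ :sum :n | sum + n ].    "sum"
#(1 2 3) includes: 2.                               "element-of"
3 ~= 4.                                             "not-equals"
(0.1 + 0.2) closeTo: 0.3.                           "almost-equal-to for floats"
2 sqrt.                                             "square root"
#(1 2) , #(3 4).                                    "concatenation"
```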
