Syntax matters

Something that I haven’t really explained well is why I believe Smile code is shorter, better, and simpler than nearly everything else out there. I’ve talked about it at a high level, and I’ve shown “Hello, World” and simple test programs, which are nice, but none of that really gives you the feel of coding in Smile.

So here’s an example of some unit tests written using a simple unit-test suite that I’ll be including out-of-the-box with the Smile distro. Writing unit tests is a little closer to the day-to-day activities of Real Programmers, and it helps you to see a little better what code in a language feels like:

#include "testing"

tests for the arithmetic unit {

    it should add small numbers {
        x = 5 + 7
        assert x == 12
    }

    check the boundary conditions {
        x = 5 + 7
        assert x != 0 (adding positive numbers should never produce zero)
    }
}

Those are dumb unit tests, to be sure. But they show some really interesting properties of the language. They’re really readable: you can easily see what’s going on. The scope of the test suite is obvious, each test has a readable name, and the assertions are obvious and simple. There are no extra symbols, no extra punctuation, no weird keywords, nothing like a crazy blob of (()=>{ ... })() markings just to make some unit tests work. You could reasonably argue that this is the simplest possible form of unit tests: just a test-suite name, test names, curly braces to show the relationship between them, and code to perform the actual test operations. The output reads like this:

Tests for the arithmetic unit:
  - it should add small numbers: OK (10 msec)
  - check the boundary conditions: FAILED

Total: 2 tests    Pass: 1 test   Fail: 1 test

But the big, obvious question is: how can that code possibly work? Does Smile have built-in unit-test support? Is “testing” a magic library, integrated with the compiler, or written in C or some kind of crazy unportable voodoo?

No. There’s nothing built-in. The “testing” module that this imports is only about fifty lines of code, all written in Smile itself. The critical part of its guts is only about ten lines long.

Here’s the core of it:

#syntax STMT: [ tests for [TEST-NAME+ suite-names] { TEST-DECL* tests } ] =>
    `[Tests.run-suite [List.join @(suite-names)] [@(tests)]]
#syntax TEST-NAME: [[NAME name]] => name
#syntax TEST-NAME: [[STRING name]] => name
#syntax TEST-DECL: [[TEST-NAME+ test-names] { STMT* stmts }] =>
    `[[List.join @(test-names)] [$fn [] @(stmts)]]

Okay, so… that’s pretty much gibberish, unless you’re well-versed in Smile’s syntax rules. But you don’t have to understand it to be able to use it! And that gibberish does a lot in just a few lines of code:

It adds a new “tests for” statement, which accepts any arbitrary names or strings up to an opening curly brace, and then a body of custom “test” code inside the curly braces.
The “test” code consists of a series of custom test declarations, which start with names or strings up to an opening curly brace, and then a body of ordinary code inside those curly braces.
In macro-like fashion, each test is transformed into an unnamed (anonymous) function, with the test code inside it. That function is paired with its name into a two-piece list of data, like this: [name function]
In the same macro-like fashion, all of the tests are combined into one big list, which is passed to a method named “Tests.run-suite”, along with the name of the test suite.

If you’re reasonably well-versed in JavaScript and I asked you “write a function named runTestSuite() that accepts a name and an array of functions to run,” you could probably write that in just a few minutes. That’s not really the interesting part of this exercise; in most languages, it’s trivial to write a foreach loop that runs other functions. The guts of the trivial run-suite function look something like this (truncated to remove a bunch of boring timing and custom-error-message code):

run-suite: |name tests| {
    print-line "Tests for {name}:"
    tests each |test| {
        print "  - {test:0}: "

        try result = [test:1]
        catch ex result = false

        if result then print-line "OK"
        else print-line "FAILED"
    }
}
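
For comparison, here is roughly the same loop as a Python sketch. It is not a translation of the real module: the names are hypothetical, the timing and custom-error-message code is left out just as above, and it assumes a test counts as passing when it returns without raising, since Python’s assert raises rather than returning a result.

```python
def run_suite(name, tests):
    """Run (label, function) pairs and return the report as a list of lines."""
    lines = [f"Tests for {name}:"]
    passed = 0
    for label, test in tests:
        try:
            test()
            ok = True
        except Exception:  # a failed assert raises AssertionError
            ok = False
        passed += ok
        lines.append(f"  - {label}: {'OK' if ok else 'FAILED'}")
    failed = len(tests) - passed
    lines.append(f"Total: {len(tests)}  Pass: {passed}  Fail: {failed}")
    return lines

def adds_small_numbers():
    assert 5 + 7 == 12

def deliberately_broken():
    assert 5 + 7 == 0

report = run_suite("the arithmetic unit", [
    ("it should add small numbers", adds_small_numbers),
    ("a deliberately broken test", deliberately_broken),
])
print("\n".join(report))
```

Note that each test function is called exactly once and its outcome is stored before the report line is written.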

But writing your unit tests as an array of functions is ugly. Smile believes syntax matters. Keeping code simple matters. A consumer of a test library doesn’t want to think about arrays and functions and declarations and whatever other stuff is necessary just to make it go. A consumer of a test library just wants to write tests.

So Smile’s custom syntax jumps in, acting like macros, and transforms the really pretty test code into the nice clean data structures a test suite can execute. Could you write the original data structures? Sure. Do you want to? Heck no.

The syntax is itself an abstraction. It’s not just sugar on code; it’s an abstraction that lets you avoid thinking about programming and think instead about testing. Testing is a domain-specific activity, so it gets a domain-specific micro-language, written in Smile and embedded within Smile, to make testing as simple as possible.

And that’s really the whole point. All of programming is really domain-specific: When you write an if statement, you’re really using a custom domain-specific micro-language for writing conditional behavior in a program. The if statement has a custom syntax, and a custom behavior, and doesn’t work like an arithmetic expression or function call — both of which themselves are really domain-specific languages for performing arithmetic or invoking functions.

When you view the universe as a pile of interacting micro-DSLs, the notion of being able to customize your language on the fly by adding another DSL seems not only easy but the obvious solution to a whole raft of problems. Why embed SQL? Why not just write SQL? Or why call some .CrossProduct() method on two vectors when you can just write “×” instead? Everything in programming is either creating a new language or using somebody else’s language; Smile just makes creating and consuming those DSLs as simple as possible.

So how does it work? The special #syntax directive in Smile directly mutates the parsing engine. Smile’s parser is split in half: There’s a recursive-descent parser for common constructs like arithmetic, and there’s a customizable, LL(1) table-driven parser for user-defined constructs. When you write

#syntax STMT: [ tests for [TEST-NAME+ suite-names] { TEST-DECL* tests } ] =>
    `[Tests.run-suite [List.join @(suite-names)] [@(tests)]]

you’re directly extending the parse table for the STMT (statement) production. It’s an extended BNF grammar, written inline in your program. On the right side of the => arrow, there’s a substitution rule, a Lisp-style templated form that describes how to rewrite the syntax form as something executable. (It’s not a precise match to Lisp, but Lisp programmers should be able to mostly read the right side of that arrow. And the custom-syntax parser supports a few common extensions like + and * and ? to describe repetitions and optional constructs.)

(For the record, I’d have loved to implement it as an LR(1) system, as LR(k) supports a lot more constructs than LL(k) does, but the LR family is hard to compute tables for on the fly, and it’s even harder to implement using lexically-scoped copy-on-write data structures. Perhaps some future version of Smile will support LR(1), but for now, the custom parser is in the LL family instead, with a few special extensions that support repetition more easily than LL(k)’s native right-recursion does.)

You can create your own productions on the fly, just by using your own nonterminals. To keep things sane and avoid syntax conflicts, Smile forces you to namespace any nonterminals that aren’t built-in, which is why all of the unit-test nonterminals start with “TEST-”. Your productions can directly reference built-in productions like EXPR and STMT, which is why the whole transformation can be written in four short rules: It’s a micro-language that adds a few keywords of its own, a little syntax of its own, and that builds on top of everything else to provide the rest of the functionality.
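
To make the idea of a parser you can extend at run time a bit more concrete, here is a toy Python sketch. It is emphatically not Smile’s implementation: RULES, add_rule, tokenize, can_start, and parse are all hypothetical names, the only built-in nonterminal is NAME, and test bodies are reduced to a run of bare names because the toy has no real STMT parser. What it does show is a rule table that grows as rules are registered, one token of lookahead to pick a rule, and + and * repetition.

```python
import re

RULES = {}  # nonterminal -> list of (pattern, action); hypothetical layout

def add_rule(nonterminal, pattern, action):
    """Extend the grammar at run time, loosely like a #syntax line."""
    RULES.setdefault(nonterminal, []).append((pattern, action))

def tokenize(text):
    # Words (hyphens allowed), numbers, or any single punctuation mark.
    return re.findall(r"[A-Za-z][\w-]*|\d+|\S", text)

def can_start(item, tokens, pos):
    """One token of lookahead: could `item` begin at tokens[pos]?"""
    if pos >= len(tokens):
        return False
    if isinstance(item, str):              # literal keyword or punctuation
        return tokens[pos] == item
    sub = item[0]
    if sub == "NAME":                      # the only built-in in this toy
        return tokens[pos][0].isalpha()
    return any(can_start(p[0], tokens, pos) for p, _ in RULES.get(sub, []))

def parse(nonterminal, tokens, pos=0):
    """Return (value, next_pos) for `nonterminal`, or raise SyntaxError."""
    if nonterminal == "NAME":
        if can_start(("NAME",), tokens, pos):
            return tokens[pos], pos + 1
        raise SyntaxError(f"expected a name at token {pos}")
    for pattern, action in RULES.get(nonterminal, []):
        if not can_start(pattern[0], tokens, pos):
            continue
        env = {}
        for item in pattern:
            if isinstance(item, str):
                if not can_start(item, tokens, pos):
                    raise SyntaxError(f"expected {item!r} at token {pos}")
                pos += 1
                continue
            sub, repeat, var = item
            values = []
            # "" means exactly one; "+" and "*" keep going while lookahead fits.
            while can_start(item, tokens, pos) and (repeat or not values):
                value, pos = parse(sub, tokens, pos)
                values.append(value)
            if not values and repeat != "*":
                raise SyntaxError(f"expected {sub} at token {pos}")
            env[var] = values if repeat else values[0]
        return action(**env), pos
    raise SyntaxError(f"no {nonterminal} rule matches at token {pos}")

add_rule("TEST-NAME", [("NAME", "", "name")], lambda name: name)
add_rule("TEST-DECL",
         [("TEST-NAME", "+", "test_names"), "{", ("NAME", "*", "body"), "}"],
         lambda test_names, body: (" ".join(test_names), body))
add_rule("STMT",
         ["tests", "for", ("TEST-NAME", "+", "suite_names"),
          "{", ("TEST-DECL", "*", "tests"), "}"],
         lambda suite_names, tests: ("run-suite", " ".join(suite_names), tests))

source = "tests for the arithmetic unit { it adds { a b } it checks { } }"
tree, _ = parse("STMT", tokenize(source))
print(tree)
```

The three add_rule calls play the part of the #syntax lines: before they run, the toy parser knows nothing about “tests for”, and afterwards it does. A real implementation would also have to deal with scoping, conflicts between rules, and genuine statement bodies, none of which this sketch attempts.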

And this all works. It’s been working for a couple years, actually, with the native Smile recursive-descent syntax-translator existing long before the custom-syntax translator.

So I find it funny that people like to say “syntax doesn’t matter” in a language. Sure, replacing {...} with BEGIN...END or vice-versa doesn’t really matter, but that utterly discounts the true power of syntax transforms in a language. Because in Smile, the abstractions provided by the user’s custom syntax transforms matter more than almost anything else, and they can readily transform hundreds of lines of ugly, inscrutable code into nearly English-like one-liners.

And that’s what coding in Smile is really like: You’re not trying to write code to solve a problem. Anybody can do that, in any language. In Smile, you’re trying to write pretty code that you can read and maintain, and that is a challenge with some meat on its bones — but it’s a challenge that Smile makes nearly as easy as it’s possible to get.
