Thursday, January 19, 2017

Thank You For Admitting on the Internet...the Stupid Things You've Done (3D Printer Edition)

I'd like to keep this blog post a short and sweet, I mean long and winding, thank you to everyone who's ever admitted somewhere on the internet that they've done something stupid. One such person saved me a lot of hassle and a lot of money on my Printrbot Simple Metal.

The exact issue I had? Mysteriously, it would stop two to three layers into my prints. Are you having this issue? Please don't read my rambling; skip to the bottom for the solution that worked for me. Oh, and then come back and read the rambling, because I worked hard on it.


I just moved to San Francisco from Portland to work for Google a few months ago. In the process, I got full custody of the 3D printer that my brother and I had bought together. Woo hoo!

A few months in, I started really getting the itch to print something again. I know, months. I'm not really a hardcore user. That will be obvious by the end of the post. I had a few things going against me, though: I had recently ruined my Fedora install (which had my slicer, modeler, favorite settings...) in a botched attempt to install 32 bit libs, and an even more botched attempt to fix the failure. Even if I were to get that working again, I had other, more pressing things to do with more important data on that disk. I didn't have a lot of time, because it was a new job. And even where to put the printer was up in the air. No, not literally. It ended up on a desk.

Side note: the best part of unpacking our movers' good work, once we got settled in the right place in San Francisco, was that everything was labeled by room & general contents: books, dishes, clothes. No, that's not the best part. The best part is that the 3D printer was in a box labeled "complicated machine." Which it certainly is. They did their best to pack it well, but the strand of filament feeding into the printer broke, likely leaving all kinds of twists in the spool during hundreds of miles of bumpy transport. I had just fixed a twisted spool of PLA, and it's the simple things that discourage you. Still, better than when the UPS people delivered it to us with the words "THIS SIDE UP" in big letters and completely upside down.

So I officially missed the Christmas season for printing presents and decorations, but alas, there's always next year.

So two days ago I found the cables, installed the modeler and slicer anew and designed my model. I went to print, and a few minor hitches later got underway on breaking the fast that is my 3D printing hobby. Gotta love breakfast.

But two layers in, it just stopped. Shit.

I of course first simply tried again. Still shit.

USB connection? Nope -- tried it from the microSD card and it had the same issue.

At this point I reasoned, it could be an issue related to the gcode. Either corrupt gcode from the slicer, or a hardware failure that happens for a particular instruction. Except, no, because the same exact print would fail at different times.

I mean, I'm a smart person usually, and I did a bunch of smart things.

I double-checked that no wires were loose coming into the board. I think that was really smart of me. I did indeed later find that as one of the official instructions for what to do when you have this problem. I smiled that I had beaten that advice to the punch.

But still, no luck. I had already, at this point, dialed up the cost of a replacement board, and a replacement power supply. $80+ was not what I wanted to see. But sometimes that's just how 3D printers roll.

So just before calling it quits, and ordering a part, I decided to read through a long long long long long (like, longer than this blogpost telling one stupid dumb story long), long forum thread about people having this issue. And by the grace of God, there was an idiot. On the thread. An idiot who...admitted it.

Blah blah blah blah, "and oh wow do I feel dumb" were his words. Basically, he had two different power supplies under his desk. One of them was the right one for his printer. That is not the one he plugged in.

With less power available, the printer can only keep the hotend hot and the motors spinning at high speed for so long. Start a print, and you're ticking down until the power supply is overwhelmed, its voltage drops, and then suddenly...the board crashes. And then recovers, because the power requirements immediately go down. Tries to print again.

At this point it is absolutely crucial that you reread this already really long blogpost about one tiny thing, from the top. You don't want to? Really? Ok, I'll give you a spoiler; I mentioned something about unboxing the cables...

Turns out my Logitech GT25 or something racing wheel has a DC power supply with the same connector as my Printrbot Simple Metal. So that's what I had grabbed instead of the right one. Much like my equally dumb friend on the internet.


Did my dumb friend on the internet have to post that he was dumb? Did he have to post his solution at all? Couldn't he have worded it in the highly-ignorable-but-face-saving manner of "just check the right cable is plugged in, you jerks!"? Yes. And unlike most people on the internet, that's not what he did.

He did something dumb that day and he admitted it.

I did something dumb and I'm admitting it.

The next time you do something dumb, you should post it on the internet, too. You might just save someone $80+, along with the time, stress, and hair loss that come with it. We're all stupid, and it helps to admit it.

Tuesday, June 21, 2016

Else If Is Not Special (Except in Python)

Prepare yourselves, for I am about to make a code formatting argument, which is almost certainly a waste of time. But this formatting argument is rooted in a truth of how programming languages work, and as a programming language designer, I just feel too damn compelled to speak up.

Elif Hell


OK OK, so you want to know why else if is not special, probably about 101x more than you want to know how I happen to horizontally and vertically align my code using spaces, tabs, or god forbid a mixture of both. I don't blame you. But forgive me for delaying the good parts for just a moment, and let me plunge you into the deep dark depths of code formatting hell. Hell is fun too, right?

This is a simple hell, but imagine, if you will, that else if had not yet been invented.

Even in the real world today, using lots of else ifs is an antipattern. But here in elif hell, lots of else ifs is a cause of tears, stress wrinkles, and death.

if (statusCode == 200) {
  // ...
} else {
  if (statusCode == 201) {
    // ...
  } else {
    if (statusCode == 202) {
      // ...
    } else {
      if (statusCode == 203) {
        // ...
      } else {
        // Nesting Continues Forever In Elif Hell
      }
    }
  }
}

In this hell, anything more than three else ifs can get you fired, unless you resort to needlessly complicating the runtime in order to not require else if. Imagine someone writing a library that does for elif hell what promises did for callback hell. In this case, the runtime is not the problem, and a runtime solution would be crazy.

But before you start praying to the elif gods for salvation, you can take in a breath of fresh air, because in most languages, elif hell could not exist.

Defining If And Else


The standard definition of if/else looks like this:

ifstmt -> 'if' '(' expr ')' stmt_or_block
       | 'if' '(' expr ')' stmt_or_block 'else' stmt_or_block


Now, I wouldn't be able to call myself a language designer if I didn't tell you that this has an ambiguity which will make parser generators rear their ugly heads and tell you it has a shift/reduce conflict. The problem comes from allowing statements (not just blocks) after if conditions, which lets the parser see if(...) if(...) if(...) if(...) else ... In this case, the parse rules above could in theory match the else against any of the four if statements that had been opened. This is called the "dangling else" problem.
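
To make the ambiguity concrete, here is a tiny Java-flavored illustration (the names are hypothetical); the grammar alone would let the else attach to either if:

class DanglingElse {
    static void classify(boolean a, boolean b) {
        // Grammatically, the else could belong to if (a) or to if (b). Java,
        // like the languages discussed next, attaches it to the nearest
        // unmatched if -- the inner one.
        if (a)
            if (b)
                System.out.println("a and b");
            else
                System.out.println("a but not b"); // belongs to if (b), not if (a)
    }
}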

In practice, choosing to match the else to the last opened if statement (shift over reduce) works like a charm, which is how c, c++, java, php, javascript, etc. all work. This approach has been working for decades. Others demand blocks, as if that were the only way to catch the infamous "goto fail" bug, a request I both sympathize with and intend to slightly complicate later.

So long as else may be followed by a stmt_or_block, elif hell cannot exist. The following code is, by these simple rules, a natural phenomenon which parses down according to how I commented it.

if (statusCode == 200) {
  // within an if
} else if (statusCode == 201) {
  // within an if within an else after an if
} else if (statusCode == 202) {
  // within an if within an else after an if within an else after an if
} else if (statusCode == 203) {
  // within an if within an else after an if within an else after an if within
  // an else after an if
} else {
  // within an else after an if within an else after an if within an else after
  // an if within an else after an if
}

Although it gets complicated in the final parse tree, code which is parsed recursively need not be formatted in such a way. This is absolutely how else ifs should be formatted, not nested forever and ever, and this style springs right out of the ground based on the most intuitive definitions of else/if, no help needed.

There is nothing special here. Usually. (I'm looking at you, python, bash, ruby...)

An Unnamed Hell We Live In Today


There is a lesson to be learned here, from a language designer's perspective, though I do not know who to give credit for seeing through the stigma and finding it. All I know is I first saw it in action while browsing through the code of gnome-terminal.

Although else if seems like a first-class language concept, it absolutely is not. It is just a combination of two language constructs that serve a purpose together. And let me tell you, else and if are not the only constructs with useful combinations.

Would you enjoy a language that had a loop where feature? An if then loop feature? A loop case feature, or a loop loop feature? Would you be interested in someone like me designing one?

Programming is all about combining constructs, and these constructs are ripe to be combined. They are literally...figuratively begging you to combine them. And the general programming community has sworn itself into some kind of hell we don't yet have a name for, where we write code recursively with nesting where we need not.

With that grandeur, I present code formatted in a way that you will probably hate.

// loop where
for (item in items)
if (item.isHub()) {
  // ...
}

// if then loop
if (items.containsHub())
for (item in items) {
  // ...
}

// loop case
for (item in items)
switch(...) {
   case ...
   // ...
}

// loop loop
for (item in items)
for (tokens in item.getTokens()) {
  // ...
}

These statements are no different from an if else, and are waiting eagerly to be used.

While you may not find them palatable at first, I implore you to remember how you first felt about camel case, or snake case, or curly braces, or whitespace significance, or spaces around your operators. There must be some type of formatting that you remember hating at first, which has grown on you since then. I urge you to give this formatting style half the chance you gave your last preference change.

Among the reasons why this formatting style really is worth considering is that the lack of nesting not only looks simple, it calls out the simplicity of how the two control statements are 100% linked. That is, 100% of the code within the inner control structure makes up 100% of the code within the outer control structure. Once delimiters are added and indentation is used, that should signify that this kind of linked relationship does not hold, and that the code requires more careful reading.

I do not expect this formatting style to win, because the fact is that it's already out there, and not spreading, not catching on. And who am I to call that a loss for us programmers?

Well, I would ask in turn, who are you to not give it a chance?

Obligatory Why-Did-You-Rip-On-Python-And-Ruby Addendum


How Python and Ruby etc were designed is really not the goal of this blog post, but it is worth calling out and exploring the unexpected consequences of their design choices. After all, every language design choice is a tradeoff, and it is hardly insulting either language to explore that.

As I said earlier, else if actually is special when else cannot be followed by a statement instead of a block. Python and Ruby have specific parse rules designed to handle else ifs to prevent elif hell, and that's part of why they don't use the keywords else if but rather elif and elsif. For the same reasons, in Python and Ruby, you cannot write if for, or for if, etc., without using unnecessary nesting, or without waiting for each combination to be added to the language by hand.

In Python's case this trick simply cannot work, and perhaps that is OK, as whitespace significance does carry some inarguable benefits. In Ruby's case, it is more an expression of how programmers prefer keywords to symbols, so much so that curly brace languages have failed to make braces required due to "verbosity," while keyword based languages have managed to convince everyone to type three alphabetic characters to end everything in the pursuit of readability.

What gets in the way is how these languages try to do us a favor. It does seem, to me and to many people, that allowing if statements and else statements without blocks is an unnecessary risk. You risk running into the goto fail bug, and in fact, I myself never omit curly braces except when combining constructs as listed above.
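
For anyone who hasn't seen it, the shape of that bug class looks roughly like this; a hypothetical Java rendition, since the real bug was a duplicated goto fail; line in C:

class GotoFailShape {
    static void handle(int statusCode) {
        // Without braces, only the first statement is guarded. The second
        // line's indentation lies: it runs on every call, which is the same
        // trap the duplicated goto fail; line fell into.
        if (statusCode >= 400)
            System.err.println("request failed");
            System.err.println("giving up"); // always executes; braces would have caught it
    }
}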

In any case, we need not throw the baby out with the bathwater. Perhaps if the languages which already support this formatting style begin to use it and champion its benefits, then our next round of languages and our current generation of linters can be made to accept control flow combinations but not lone statements otherwise, and maybe we will all be slightly better off for it.

In either case, for most languages, else if is not special, and perhaps the trend of treating it as such is a step backwards.

Sunday, November 29, 2015

Going Open with "Encase"

I've been pondering a concept for a new language for a while. Partly it's interesting to me because I want to get more diverse compiler experience than just Wake's set of problems, and partly because it's a program I know I would use.

I'm going to do something that I didn't do with Wake, that I think in hindsight is actually the best way to start a language -- I'm going to make it open and public from day one.

Wake has now been officially "alpha released," though there are still mountains to do with it. Just the same, I have seen interest, and I feel like I have made a positive impact on programmers who even just ask the question, "what problem is Wake trying to solve?"

With that new experience (along with some advice that it's never too early to publish an idea for an open source project), I wanted to share what little I have to say on it so far, and invite you all to critique it -- or, if it ever makes it into other developers' hands in a serious way, this will serve as a cool journal of how Encase came to be.

A publicly commentable specification draft of Encase can be found here. It is still a work in progress.

What problem is Encase trying to solve?

I love the command line interface, and one of the reasons why is the magnificent composability of bash. Naturally I have a sort of terminal-formatting fetish, where I'm constantly trying to make my programs look as cool as possible.

But the two concepts are actually very separate. On the one hand, you have awk, cut, and xargs, which cooperate as beautifully as a centaur hugging a unicorn, and on the other hand you have top, less, and vim, which are a set of distant vacation spots. They look great by themselves but you can't be in both Sicily and Venice at the same time. These programs are built in ncurses, a complex GUI framework for the terminal.

I believe that ncurses was not *nix-y enough, and that it is the choice of making ncurses a library that separates the top from the less. I believe that if we could make ncurses into a command in just the right way, you could awk your less and maybe even less your top.

My approach with Encase will be to adapt ncurses into a language. You can write a program that simply spits out a mix of content and Encase, and pipe it into any program you choose, usually that program being an interpreter that renders it for you live as your program runs.

Many immediate uses come to mind.

top # view the encase code raw
top | encase # view it how humans are meant to see it
top | strip-encase > computerFriendlyLog # save JUST the actual content to a file
top | encase-record > topReplay.ncs # "record" a specific run of a program
top | custom_top_reskin.sh | encase # completely re-skin an Encase app??!

The last one specifically may be a stretch; but the great thing about ncurses as a language is it would suddenly just be strings. You could trivially, say, replace all instances of 'red' with 'blue', all borders with one twice as thick, etc etc.

As an added benefit, you would be able to run curses in any language without any required re-learning: python, bash, c, c++, php, ruby, javascript. No new APIs to learn, no special porting or required compilation flags (looking at you, PHP).

And last but not least, if done well it would make a sleek CLI program interface a breeze, no longer requiring you to look up the right ANSI escapes, and in a much more tailored way than a library (ncurses) could ever provide.

What inspired the (perhaps temporary) name?

At first I was going to go with Curse as a blatant ncurses reference, but, I thought it was too negative and too similar to ye olde curses. After all it will be a very different beast in the end.

I then thought about using "jinx" as a synonym of curse, but wanted a stronger hat tilt to ncurses than that. I almost chose "njinxes" but it was very similar to the web server nginx.

Playing still more off of ncurses, I found myself saying "ncase" when trying to play off of its phonetics. Combined with the layout engine inside it that "encases" your content, I decided that "encase" hit the marks of inspiration, coolness, and brevity.

Some Examples

While literally every idea I'm about to present is subject to change, and/or may never see the light of day, it wouldn't be very "open" of me to tell you the goals (and even describe the name!) without showing what I had in mind.

First of all, one goal is to make it very easy to ignore most (if not all) Encase code if you want to. For this reason, I've decided to make every line of Encase begin with a standard sequence. The rest is output.

$$ this is some Encase code
this is raw output

I chose $ because it is (hopefully somewhat uniformly) rare, and also does not require an escape. I almost used backslash for its rarity, but didn't want it to be heavily escaped in other languages (ie, println("\\ your Encase code")). I chose to use multiple $s so that the chances of ever needing to worry about escaping your output is really really low.

Beyond that, it should have imports, and it should be turing complete (so that your turing complete custom UI components can be shared across languages).

$$ import alert
$$ n + 1 times alert "hello" // turing completeness

Here we assume someone has written a plugin in Encase which creates momentary popup alerts. We can import it (provided it's installed on the right path...) and then use it. Notice that we're using a variable (n), doing some math (n + 1), and looping (n + 1 times ...).

I have a feeling the times construct will work very nicely in creating formatted output. I included it in Wake as part of the standard library and liked it; it's a great example of the type of constructs one would find in a language tailored to string formatting.

Writers of plugins (such as this alert plugin shown above) would be able to leverage a box layout engine to create the effect of windows, panes, buttons, and more. They would (ideally) all be easily describable in Encase itself even if the layout engine they used was a native layer.

$$ format window title
$$  open border
$$      open horizbox
$$          height 2
$$          open centerAlign
$$              name titleSection
$$          end
$$      end 
$$      open horizbox
$$          open scrollbox
$$              name contentSection
$$          end
$$      end 
$$  end 
$$  render content
$$      use titleSection
$$      clear
$$      print title
$$      width times print "-" 
$$      use contentSection
$$      clear
$$      print content
$$  end 
$$ end

Here we would create a format named window, which would require the title as a parameter. The whole box is inside a border format (which would be equally simple and just draw a box around whatever inner content it gets). Inside that border, we create a horizontal box containing the title and its dividing line, then below that a scrollbox. We name those sections for use later.

When we say open border, we can print to it and put other formats inside it. The same thing goes for our window. This is why we say render content: we are going to describe how to render or rerender a window whenever our content or sizing has changed. content in this case is a variable, but it's quite nicely readable when content is the chosen name.

The rendering engine would buffer this in memory, and at a fixed frame rate it would diff the previous state with subsequent states, performing the fewest and most efficient updates possible to keep performance high.
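
As a rough sketch of that diffing idea (written in Java purely for illustration, since no Encase interpreter exists yet), the renderer could hold the previously drawn frame and rewrite only the rows that changed, using the standard ANSI cursor-position and erase-line sequences:

import java.util.List;

class FrameDiffRenderer {
    private List<String> previous = List.of();

    // Called at a fixed frame rate with the freshly laid-out screen contents.
    void render(List<String> next) {
        for (int row = 0; row < next.size(); row++) {
            String oldLine = row < previous.size() ? previous.get(row) : null;
            String newLine = next.get(row);
            if (!newLine.equals(oldLine)) {
                // Move to row+1 (ANSI rows are 1-based), column 1, rewrite the
                // line, then clear whatever is left over from the old frame.
                System.out.printf("\u001B[%d;1H%s\u001B[K", row + 1, newLine);
            }
        }
        // Rows that disappeared entirely are omitted here for brevity.
        previous = next;
        System.out.flush();
    }
}

Colors, scroll regions, and partial-line updates would complicate this, but the core loop stays the same: lay out, diff, repaint only what moved.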

In terms of the layout engine itself, there are many papers on some layout engines and their time complexity. I have yet to choose which I'll use but I'm sure a good one exists.

This leaves us in a nice place to design a layout, name its components, and then, as our app runs, switch panes and change their contents. We should be able to add and remove components on the fly to create very dynamic interfaces.

Problems I'm Still Thinking On

This section will be very disorganized! Muddle through as long as it holds your interest.

Unlike most GUIs, I am deliberately running the visual code in a separate process from the application. This makes conventional two-way binding and/or event handler concepts more or less impossible. For a while I thought my biggest problem would be managing scroll states, but I realized that I can rely on the mouse (where supported by the terminal) to handle scrolling. Still, this leaves a burden on the app developer to performantly track the state of the visuals and update them as needed, and to capture keystrokes and form navigation themselves. If you wanted to add click-to-select-text-box functionality, you'd be out of luck.

I think this might be elegantly solvable by changing top | encase to encase top, and letting the encase interpreter call top so it can watch all I/O. However, I'm not sure how exactly this would all work out. Perhaps a fifo streams events into the inner process? Still an unsolved issue. Remember too that the goal is to make programs composable; how could such a fifo be made simple to use in your own way?

I'm also not entirely sure what to make of multithreading. And I don't mean multithreading in terms of efficiently rendering the layouts on a multicore processor; I mean a threaded program where each thread spits out Encase code that works perfectly on its own, but doesn't work perfectly when the lines get all jumbled up. On the one hand, multithreading support would be difficult to add seamlessly (obviously the interpreter would have to be threadsafe, and have multiple states per thread all interacting at once...). On the other hand, it might be clunky. Every thread would need to have its own prefix, and going into no-code mode for printing content would be ambiguous. If Encase ran all lines of code on one thread, anyone doing parallelized output would need to manually synchronize their updates into sensible batches of code -- not exactly a minor quirk.

I'm also kind of curious about late binding variables. Notice in the window example, we have a title that is passed in when the window is created. What if you want to change the title? We would probably add in some functions specific to windows that could be called whenever you are inside a window box. But it would be cooler if there was a slick way to make it automatically rerender the title of the window whenever the original variable it came from was changed.

I'm also thinking pretty hard about what types of mistakes would be common in Encase code as it's written, and about anything I can do to avoid confusion (forgetting to end a window, forgetting a rerender function, using formats in the render function instead of the format body and getting lowered performance, etc.) and make those mistakes harder to make. There will be no type system to catch this stuff for the users!


What I'm Excited About

Mostly I'm excited to try to add JIT aspects to the compiler. It may be a challenge for me to get it right cross platform and everything, but I think this is a great example of a language that could use a JIT (all that rerendering from scratch in a necessarily dynamic language!), and I would have a blast even optimizing a few things for a specific architecture.

I'm also excited that it seems like it would be very easy to whip up clones of some classic unix utilities like nano, top, and/or some bigger fish like a simple vim clone. It sounds like a cool project...maybe even make a Roguelike or two!

Either way (whether I make it and it's as good as I hope, or not) it has been fun to think about, and I invite you to comment with your thoughts/suggestions.

Sunday, February 1, 2015

The Incredible Rewards Of Creating a Programming Language

Many of us enjoy seeing programming languages pop up that we might one day use, especially with cool ones like rust (memory safe without GC) and avail (extremely literate programming).

But I would argue that many of us programmers, whether we like these trending languages or not, should start putting time into programming language infrastructure. That infrastructure is what enables us to do what we do every day, and the work is incredibly rewarding.

For the past year I have been creating the programming language Wake, and it is indescribable how cool a side-project it has been.

Language creation forces you to think like both a programmer and a computer at the same time. You will learn things about your favorite programming languages whether you create something on your own, or contribute to a powerhouse like java, or help a new kid on the block like rust, or join the Wake team.

I'm not the only person to think that making compilers and compiler tools can turn you from a great programmer to a best programmer. Here's my story of discovery and accomplishment in creating Wake.

The First Wake Program

When I first started making Wake, I doodled away on a text file, making little syntaxes and commenting them like a madman writing on his padded walls.

I wanted Wake to be a testable, statically typed, compiled language, ensuring any and all code you write will have testable seams via built-in dependency injection. This turned out to be a bit of a challenge for creating the first running Wake program, since I had to make a parser, a code generator, a semantic analyzer, and a dependency injection library all at once before I could so much as print "hello world."

I do not have a formal education, so I turned to Coursera to learn the basics of parsing, type checking, and codegen before getting too far along. I highly recommend it to you too, for getting started.

And at long last, I was able to write, compile, and run the first ever Wake program. It felt like I had created life.

every Main is:
    needs Printer;
    main() {
        Printer.printLine("Hello World!");
    }

This actually ran on my computer. And just seeing the two most overused words in all of computer science, it actually felt like Wake was opening its eyes for the first time, and letting the world know it was here.

Playing With Optional Names Instead Of Optional Types

One of the biggest experiments in Wake is making variable names universally optional in lieu of type information.

Rather than relying on semantic-restricting and/or fallible type inference to keep programs clean, I set to work on a simpler idea: that types are often a great name for variables, which both the programmer and compiler can use to easily and effectively understand most common code.

As Wake became more complete I began writing unit tests in Wake that tested Wake, and found myself constantly stopping to admire my work instead of finishing it. Reading these tests let it sink in that I had created a whole little language of my own. Just by skimming these little tests it was clear that they could only be Wake. And even cooler? They all passed.

Notice how not a single variable in this code has a name. They all only have types.

import Asserts;

@TestClass
every AssignmentsTest is:

    @Test
    testPlusEquals(Asserts) {
        var Num = 5;
        Num += 0;
        Asserts.that(Num)Equals(5);
        Num += 1;
        Asserts.that(Num)Equals(6);
        Num += 5;
        Asserts.that(Num)Equals(11);
    }

This is probably the most controversial idea in Wake. It gets the most positive feedback of any feature when shown in person, and the most negative feedback of any feature when shown on the internet. Don't worry, variable names are allowed absolutely everywhere, they're just, surprisingly, often not that useful.

It's been a staple of Wake syntax, and while I've run into a few syntactic hitches in emulating languages that keep the two as separate as oil and water, I know to ignore the haters because it works, exactly as well as I always dreamed.

This idea also led to the best happy-accident in Wake's design. Months and months later, I realized one day that thanks to this simple idea, I could give Wake the tightest loop syntax ever. In disbelief I began writing yet more working code that reads like a book. See how the compiler automatically changes the array variable WatchedArgument[] to its singular counterpart WatchedArgument for me.

        foreach(WatchedArgument[]) {
            if(!Text[].contains(WatchedArgument.getWatchedType())) {
                Text[].push(WatchedArgument.getWatchedType());
            }
        }

and it runs. I still, to this day, get an incredibly warm feeling when I review code like this.

A Big Boy Language

Since running loops with an automatic variable name felt so cool, I had to keep going. At this point, I'm just plain hooked.

But not every feature needs to be a brand new idea to feel that mix of pride and power. In fact, that feeling is no more intense anywhere than when I look back to making generics, lambdas, and type inference.

At this point, simply parsing the generics felt like little kids' stuff. But writing the code that understands them, and validates them, with micro transformations based on context...and thinking endlessly about the subtle effects of generic programming...that process is completely unique.

Generics in Wake look much like those in Java or C#. Excuse this example for being a bit long; it's an excerpt from Wake's mocking framework wockito. Specifically, this code tracks what return values have been stubbed for a method on a mock, which is a perfect place for generic programming.

every When{T} is:

    with T[] returns = [];
    with Num = 0;

    When{T} -- thenReturn(T) {
        returns.push(T);
        return this;
    }

    T -- get() {
        var T? = returns[?Num];
        if T exists {
            Num += 1;
            return T;
        } else {
            Num = 0.orMaxOf(Num - 1); // go back one, stopping at 0
            return returns[Num]; // will throw when we stopped at 0
        }
    }

I had honestly never expected Wake to get this far. There's still work to be done on generics; issues which I've tackled as the standard library development demands it. But the fact that this code compiles and runs still blows my mind. Some day I will post about all the subtle quirks of generics I've realized, simply by being forced to implement them. I'd wonder how Java handled these corner cases, and write up a little program to test it. And every time, Java simply didn't. It's the type of understanding that type-system geeks brag about, and writing compilers forces you to discover these things if you don't know them already.

Interfacing With Javascript

We're still working on the best ways to interface Wake with javascript and other languages, and having closures simply couldn't (and shouldn't!) be avoided.

Closures were yet another far off, likely-never-to-be-implemented sort of idea, that led me down a crazy path of type inference research and even refactoring the typechecker into a different design pattern, but as of a week ago I readied everything to cut the first release of Wake that included closure support.

These are the first anonymous functions written into the wake-on-wake test suite, including type inference in all of its glory...and yet I immediately had a problem.

    @Test
    ~[test list.filter]~(Asserts) {
        var Text[] = ["hey", "hey", "man"];
        Asserts.that(Text[].filter({ t -> return true; }).getSize())Equals(3);
        Asserts.that(Text[].filter({ t -> return false; }).getSize())Equals(0);

        var filtered Text[] = Text[].filter({ t -> return t != "hey"; });
        Asserts.that(filtered.getSize())Equals(2);
        Asserts.that(filtered[0])Equals("hello");
        Asserts.that(filtered[1])Equals("man");
    }

I could make List{T}.filter(), and .any(), and .sort(), but I could not make List{T}.map(), or .foldr(), or .foldl(), without creating yet another feature in a category of its own.

Generic Methods

Basking in my pride over turning out closures with type inference in such a short amount of time, I decided to try my hand at generic methods before releasing. I simply didn't want to have stupid caveats in my release notes.

It would seem at first that generic methods are very similar to generic classes, and to some degree this is true. After all, List{T}.map() could be written with generic classes only without too much hassle.

every ListMapper{T, R} is:

    R[] -- map(T[], R--fn(T) mapper) {
        var R[] = [];
        foreach(T[]) {
            R[].push(mapper(T));
        }
        return R[];
    }

But this solution requires injecting a ListMapper of the right type into your classes, which is simply not good enough for my dream language.

To make the leap from generic classes to generic functions required writing three new recursive algorithms in the type checker, and some tweaking to work with type inference. Here's how it looks now:

every List{T} is:

    {R} R[] -- map(R -- fn(T) mapper) {
        //...
    }
    //...

And calling it:

Text[].map({ t -> return t.getSize(); });

In our type signature, we have to save R differently than T, because one is essentially a wildcard where the other is just a fancily-tracked Object. The distinction means that within map() you can recurse into map() with any value for R that you want, but T must be the same. Then from outside of map(), T is actually bound to something, for instance, Num in Num[]. And to make matters worse, T can be bound to another generic type or argument.

warning: things are about to get like way detailed

So here's what we actually do:
  • save R as a 'generic argument'
  • typecheck the method as if all Rs are Objects, but only compatible with other Rs
  • upon use of the generic method, typecheck each argument so we know what type was actually passed in (i.e. Num)
  • recurse over each argument's signature type, and the passed-in-type, alongside each other
  • check the inside of nested types, such as lists, lambdas, and other generics
  • bail out upon any structural differences between the signature and actual types (i.e. passing Num[] into R?)
  • whenever we find a generic type, save it by name (i.e. R) to the actual value passed in (i.e. Num)
  • never allow two definitions of these generic types within one structural traversal (i.e. passing anything into Map{R, R}). Instead, assert the redefinition is the same as the previous definition.
  • copy the signature, and rewrite all Rs with the type that we found
  • go on to the next argument
Those are the basics of what is required. As a final detail, by waiting to check arguments with lambda declarations until the last step, we can figure out that

Text[].foldl({ t, acc -> return acc + t.getSize(); }, 0);

has 0, a Num, passed into the accumulator, of type R. Therefore we take the type of the first argument (R--fn(T,R)) combined with our generic bindings (T=Text and R=Num), to infer the argument types accordingly.
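
For a feel of what that traversal could look like, here is a toy sketch in Java (the real thing lives in Wake's C++ type checker; this Type model and these names are invented for illustration):

import java.util.Map;

// A declared type is either a concrete name with nested parameters
// (e.g. List{Num} is "List" with one param) or a generic placeholder like R.
final class Type {
    final String name;
    final boolean generic;
    final Type[] params;
    Type(String name, boolean generic, Type... params) {
        this.name = name; this.generic = generic; this.params = params;
    }
}

final class GenericBinder {

    // Walk the declared signature type and the actual argument type side by
    // side, recording what each generic name stands for. Returns false on any
    // structural mismatch or on a conflicting re-binding.
    static boolean bind(Type declared, Type actual, Map<String, Type> bindings) {
        if (declared.generic) {
            Type seen = bindings.get(declared.name);
            if (seen == null) { bindings.put(declared.name, actual); return true; }
            return sameShape(seen, actual); // a re-definition must match the first one
        }
        if (!declared.name.equals(actual.name)
                || declared.params.length != actual.params.length) return false;
        for (int i = 0; i < declared.params.length; i++) {
            if (!bind(declared.params[i], actual.params[i], bindings)) return false;
        }
        return true;
    }

    static boolean sameShape(Type a, Type b) {
        if (!a.name.equals(b.name) || a.params.length != b.params.length) return false;
        for (int i = 0; i < a.params.length; i++) {
            if (!sameShape(a.params[i], b.params[i])) return false;
        }
        return true;
    }
}

Once the bindings are filled in (say T=Text and R=Num), the final step from the list above is just copying the signature and rewriting every R, which is what gives the foldl lambda its argument types.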

I apologize for overly detailing an algorithm you've probably never cared about before, but I hope I've given you a new sense of appreciation for the types of challenges that come up when creating a compiler, and just how fun those problems can be to solve.

So Where Is Wake Now?

Wake just released v0.1.0, and there is so much more we want it to be able to do.

A lot of the dream features are in, so we're focusing now on the standard library, solving bugs, and javascript integration.

There is still a huge amount of flexibility in where we take it, and some great challenges left to solve. Integrating Wake smoothly with javascript and more is no small challenge, and we warmly invite you all to join us in having a blast designing the interop layer.

There is always a desire for more feedback, and some awesome pending features like currying and recipes and type tags. We have a roadmap describing some of it here: http://wakelang.com/roadmap.html

Of course, if Wake isn't the perfect project for you, then make something on your own, or hack on the Rust compiler, or write some static analysis tools. The realm of possible projects is massive.

So Go Hack On A Compiler!


You'll learn more than you imagined, experience phenomenal pride in your work, and thoroughly strengthen the way you read code.

Friday, January 9, 2015

The Visitor Pattern vs Object Composition

Over the last week or so, I've been working through a large refactoring of the wake compiler. And in all the hours I've spent rewriting the parse tree's data structure, I have yet to choose between two design patterns. It's possible that there is already a third that unifies the two, and if not, then I'd like to share this problem as a little hole in computer science.

The design problem is as follows: say you want to traverse a binary tree.

We'd probably do something like:

class BinaryTree {
    Object item;
    BinaryTree left;
    BinaryTree right;
}

From here you have two choices. Either create your algorithms as methods on the tree:

    public Object findByPath(int binaryPath) {
        if (binaryPath == 0) return item;
        else if ((binaryPath & 1) != 0) {
            return left.findByPath(binaryPath >> 1);
        } else {
            return right.findByPath(binaryPath >> 1);
        }
    }

or you could create a class for each algorithm that operates on the tree.

class BinaryPathFinder {

    public Object findByPath(BinaryTree root, int binaryPath) {
        if (binaryPath == 0) return root.item;
        else if ((binaryPath & 1) != 0) {
            return findByPath(root.left, binaryPath >> 1);
        } else {
            return findByPath(root.right, binaryPath >> 1);
        }
    }
}

This could at times be the better approach -- a binary tree has many types of algorithms that could be performed on it, so this separates the algorithms from the data structure and allows users of your API to perform operations on your data that you didn't foresee.

But our tree isn't very typesafe right now. We'd be able to, for instance, set only a left node, or a right node, and/or set both nodes and a value at once. Maybe that's expected, but maybe not. Solving this typesafety is where we create the key differences between the Visitor pattern and plain ol' object composition.

So, we can easily make it typesafe by going with the first approach:

interface BinaryTreeNode {
    public Object findByPath(int binaryPath);
}

class BinaryTreeLeaf implements BinaryTreeNode {
    Object item;
    public Object findByPath(int binaryPath) {
        return item;
    }
}

class BinaryTreeFork implements BinaryTreeNode {
    BinaryTreeNode left;
    BinaryTreeNode right;
    public Object findByPath(int binaryPath) {
        // ...
    }
}

But the problem is that we cannot decouple this data structure from its algorithm without actually worsening our typesafety with reverse casts and instanceof checks, turning BinaryTreeNodes into BinaryTreeForks and BinaryTreeLeafs. In fact, without our algorithm's method 'findByPath(int)', our BinaryTreeNode interface would be 100% empty and 100% useless.

The Visitor pattern is a simple but clever solution to this problem.

interface BinaryTreeNode {
    public Object visit(BinaryTreeVisitor visitor);
}

interface BinaryTreeVisitor {
    public Object visit(BinaryTreeLeaf leaf);
    public Object visit(BinaryTreeFork fork);
}

class BinaryTreeLeaf implements BinaryTreeNode {
    Object item;
    public Object visit(BinaryTreeVisitor visitor) {
        return visitor.visit(this);
    }
}

class BinaryTreeFork implements BinaryTreeNode {
     BinaryTreeNode left;
     BinaryTreeNode right;
     public Object visit(BinaryTreeVisitor visitor) {
         return visitor.visit(this);
     }
}

class BinaryTreeFindByPath implements BinaryTreeVisitor {
    public BinaryTreeFindByPath(int binaryPath) {
        ...
    }

    public Object visit(BinaryTreeLeaf leaf) {
        ...
    }

    public Object visit(BinaryTreeFork fork) {
        ...
    }
}

Note: this may technically be a variation from the visitor pattern. The visitor pattern is often used to attach visitors to the tree in one step and then operate on it separately, while this does both in one step.

The visitor is more complicated here, because it needs to track the state of binaryPath, but the important thing is that it's typesafe and decoupled. Especially assuming a stateless algorithm, this could once again be the better choice.

But there's an advantage to the non-visitor solution that gets lost once we make that change. Object composition allows us to create invisible little nifty wrapper classes around BinaryTree. For instance, we could write a quick little caching layer.

import java.util.HashMap;
import java.util.Map;

class BinaryTreeCache implements BinaryTreeNode {

    private final BinaryTreeNode wrapped;
    private final Map<Integer, Object> cache = new HashMap<>();

    public BinaryTreeCache(BinaryTreeNode wrapped) { this.wrapped = wrapped; }

    public Object findByPath(int path) {
        if (cache.containsKey(path)) return cache.get(path);
        Object result = wrapped.findByPath(path);
        cache.put(path, result);
        return result;
    }
}

This allows any single node (leaf or fork) to be cached, without caching all of them. You could use inheritance here, but you'd have to choose whether to extend BinaryTreeFork or BinaryTreeLeaf. With a composition based approach, you can wrap a leaf, a branch, or one of your other wrapper classes, all seamlessly.

Notice that this behaviour would not work with the visitor pattern. Without extra code for detecting a change in the visitor, you would return an outdated, possibly nonsensical result. This is because we have no longer fully decoupled behaviour from the tree, and it may step on what is supposedly the visitor's job. It's supposed to be the visitor's job to handle these behaviour changes, but the visitor cannot easily add caching over individual nodes in nearly as elegant a way.

You could definitely try to mix the two patterns naively, and for some things it would work. For instance, you could make a BinaryTreeNode that lazy-loads its contents from a file, which the visitor would be able to traverse without issue. But any tweaks to the behavior of your algorithm would not be guaranteed to play nicely with the visitor (timeouts, threading, probabilistics, distributing across servers).

Essentially, the object composition approach gives you the ability to separate how to configure an algorithm from the actual moment it runs. If you want a cache at every three layers and a distributed answer for every five, you can put all those if-statements into code that builds or transforms the tree, instead of putting those if-statements inside the code that operates on the tree.
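
Here is a rough sketch of that configuration step, reusing the classes from above. The every-third-layer rule is arbitrary, and note that the instanceof check lives in the code that builds the tree, which is exactly where I'm arguing those if-statements belong:

class CachingTreeTransformer {

    // Rebuild the tree, wrapping every third layer in the BinaryTreeCache
    // wrapper from earlier. The caching policy lives here, in configuration
    // code, not inside findByPath itself.
    BinaryTreeNode transform(BinaryTreeNode node, int depth) {
        BinaryTreeNode rebuilt = node;
        if (node instanceof BinaryTreeFork) {
            BinaryTreeFork fork = (BinaryTreeFork) node;
            BinaryTreeFork copy = new BinaryTreeFork();
            copy.left = transform(fork.left, depth + 1);
            copy.right = transform(fork.right, depth + 1);
            rebuilt = copy;
        }
        return depth % 3 == 0 ? new BinaryTreeCache(rebuilt) : rebuilt;
    }
}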

Putting my rambling about the greatness of object composition aside, the advantages here specifically include:
  • Decoupling edge-case code from main use-cases
  • Separating configuration step from operations step
  • Code-driven configuration for maximum flexibility
  • Infinite layerability of edge-case wrappers
  • Easy to extend behaviour without breaking SRP
  • Small classes
  • Possibility of tree transformers that transform your configuration


By giving the visitor the responsibility of behaviour instead of the nodes, we need to be able to compose the visitor in order to compose behavior in these neat ways. But if we wrap or extend the visitor class, we end up changing the way that every leaf or branch is used. I can't see any easy ways around this.

So the question is, could the visitor pattern be upgraded to allow object composition to change the behaviour of the algorithm on a node by node basis? I'd welcome the solution, and it would live happily in the belly of the wake compiler to stand the test of time.

Wednesday, December 24, 2014

Why did we need this to be a framework?

While this post will be about Spring, the java framework, I highly urge you to read this as more than a criticism of Spring. In fact, I think Spring + JPA + an RDBMS is currently an unrivaled web development stack.

But I want to question what Spring does, or at least used to do, for developers. And ultimately I want to question why developers won't do these things for themselves.

Dependency Injection

For the time being, let's put aside @Autowired. We're often told that Dependency Injection is a good thing, and the prevalence of Spring seems to support that. We're also often told that Dependency Injection is a pattern and not a framework, and Spring seems to fly right in the face of that.

Let's take a look at Spring's claim of decoupling, with regard to its xml-based dependency injection.

<bean id="mySpringComponent" 
    class="com.wakelang.HowDoIUseOutsideOfSpring"
    p:injected-ref="myOtherSpringComponent" />

<bean id="myOtherSpringComponent"
    class="com.wakelang.AlsoHowDoIUseThisOutsideOfSpring"
    p:injectedNumber="10" />

Of course this code is decoupled from Spring! I hope I'm not shocking anyone when I present the following code:

AlsoHowDoIUseThisOutsideOfSpring myOtherSpringComponent
    = new AlsoHowDoIUseThisOutsideOfSpring();
myOtherSpringComponent.setInjectedNumber(10);

HowDoIUseOutsideOfSpring mySpringComponent = new HowDoIUseOutsideOfSpring();
mySpringComponent.setInjected(myOtherSpringComponent);

Give these some more realistic classnames, and our example would be both clearer and terser than the xml, and it would have something called "type safety" that's usually a big hit among java devs. Did I mention there's none of the startup cost of xml parsing/analysis plus reflection?

Clearly, with this many drawbacks to the xml approach, it never would've gotten off of the drawing board. So let's look at some other ways Spring does things for you that you couldn't have done more easily yourself. For instance, the point of DI is to not call new on our services. Spring solved that for us, right?

It's a Design Pattern, Jim, But Not As We Know It

The only real important part of Dependency Injection is that the code which wires your dependencies is not your business logic. If the code that pulls your app's weight around reaches into global state, calls a static method, or uses a new, then you're not going to be able to swap out or reuse bite-sized chunks of business logic without also rewriting and/or complicating your business logic.

So spring lets us have a main method that looks like:

ApplicationContext ctx =
    new ClassPathXmlApplicationContext("/blah/blah.xml");

This main method can be changed without complicating your business logic, and the xml config can be changed without complicating your business logic. Seems pretty good. Otherwise we have a main method that looks like:

AlsoHowDoIUseThisOutsideOfSpring myOtherSpringComponent
    = new AlsoHowDoIUseThisOutsideOfSpring();
myOtherSpringComponent.setInjectedNumber(10);

HowDoIUseOutsideOfSpring mySpringComponent = new HowDoIUseOutsideOfSpring();
mySpringComponent.setInjected(myOtherSpringComponent);

Oh no! This is the exact same code example we had before! This seems to suggest our verbose xml solution is even clunkier than our verbose xml led us to believe.

I will note that we now have established an advantage of Spring xml config, which is that the xml can be changed by a user without compiling a fork of Main and reassembling your jar. Which is nice. Why would I try to take that from everyone?

Type Safety: A Love Story

Type safety is a spectrum. We've all done things with Objects that can't be done to any old object, suppressed casting warnings, and held perfectly typeable data in string form because we were in a situation where we simply weren't worried about it, for better or worse. Meanwhile, languages keep coming out that improve our type systems, with things like null-safe type systems, memory-safe type systems, and automatic extraction of impurity out of functions (though I can't seem to find the language which pioneered that one).

We don't need to abandon typesafety wholesale to allow truly dynamic configuration.

Class<?> clazz = Class.forName(properties.getProperty("LoggerFactory"));
Constructor<?> constructor = clazz.getConstructor(Properties.class);
LoggerFactory loggerFactory = (LoggerFactory) constructor.newInstance(properties);

Now you can create any main class you want, to read any properties file you want, to dynamically construct any factories you want. Now you can simply include an implementation of LoggerFactory on the classpath and it can enjoy the benefits of type safety in the way it constructs the Logger, even though it can do so in absolutely any way it wants.

Alternatively, if you are lower on the I-care-about-typesafety spectrum and want to save yourself the classpath hassle and/or the compile step, you can implement an XMLLoggerFactory which uses the properties or the classpath to find an xml file that would probably look more or less like the spring files I was ripping on earlier. Nevertheless, you get to mix typesafe config with dynamic config.

And as for aspects such as transactions and discovering controller routes, I believe the approach all along should have been object-graph transformers, which traverse every object property recursively and replace them with wrapped instances implementing the new behaviours.

aspects.applyTo(mySpringComponent, myOtherSpringComonent);
webmvc.applyTo(mySpringComponent); 
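
As a hedged sketch of what an applyTo like that could do (this is not Spring's implementation; it walks only one level of fields and wraps only interface-typed properties, for brevity):

import java.lang.reflect.Field;
import java.lang.reflect.Proxy;

class Aspects {

    // Replace each interface-typed field on the bean with a dynamic proxy
    // that adds behaviour (here, a trivial logging "aspect") around every call.
    static void applyTo(Object bean) throws IllegalAccessException {
        for (Field field : bean.getClass().getDeclaredFields()) {
            if (!field.getType().isInterface()) continue;
            field.setAccessible(true);
            Object target = field.get(bean);
            if (target == null) continue;
            Object wrapped = Proxy.newProxyInstance(
                    field.getType().getClassLoader(),
                    new Class<?>[] { field.getType() },
                    (proxy, method, args) -> {
                        System.out.println("entering " + method.getName());
                        return method.invoke(target, args);
                    });
            field.set(bean, wrapped);
        }
    }
}

A real version would recurse through the whole object graph and need a policy for what to wrap, but the point stands: it is reflection and wrapping, not magic.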

In short, we don't need no stinking framework. With a little bit of diligence and some isolated libraries, we can do what Spring does, in an arguably superior way. This qualifies dependency injection and aspect-oriented programming as design patterns. So why does it take a framework for us to adopt them?

Passing in Straw-Man Arguments

I hope I can more or less invalidate my own point here by pointing out that modern Spring has so many features which my approach wholly fails to address, such as autowiring private properties, circular constructor references, and enforcing more-or-less declarative configs. But without these things, the original Spring probably would've been better off as just a thin library with a standardized xml structure for loading factories by classname.

Spring's newest annotation-based config is leaps and bounds better than the XML solutions and the one that I offered as well. As are many other dependency injection frameworks out there: Guice and Dagger and others. All of these frameworks are about simplifying the code we write in the implementation of these design patterns, and should do little else.

Challenge Yourself as a Programmer

After all this, I still wonder why the early days of Spring ever inspired anyone. Or, more accurately, why so few people were moved by the design pattern and it took a framework which does mostly nothing in order to convince the masses.

Ultimately, I would say it's that we don't challenge ourselves enough as programmers. Our threshold for appreciating new ideas and others' code is so low that we require completely new implementations of simple ideas in order to accept their power.

Challenge yourself to improve all of the simple things first. Sometimes good code can look like boilerplate code at first. Challenge yourself to discover and explore why others' suggestions have more merit than you might've first thought.

Not all advancements in programming are a crazy new framework, build tool, library, algorithm, or language. Challenge yourself to move forward in the simple ways too.

Thursday, November 13, 2014

Annotations, Pretty errors now built into Wake

It becomes tougher and tougher to keep making quick releases of Wake. In addition to compiling the feature, I now have a slew of libraries written in Wake that might need to be updated, and/or recompiled in the proper order.

This was the case with the most recent release of Wake, which involved the creation of Annotations. Notably, they are now used by wUnit to find what's a @TestClass and what's a @TestMethod. From here on, they can be used to create javascript interoperability, as I covered in my last blog post.

This annotation data is stored in a binary file that just tells the compiler how a class can be used, which I've called a table file (and it has the .table extension). It is basically the same as a .hi file, for haskellers out there. While many features and bugfixes for the Wake compiler do not affect these files, such as the new error reporting, other features do.

But changing the table files is not a compiler-only change. The libraries that you use in your wake projects must be rebuilt with a compatible version, and some libraries themselves, such as the wake reflection library, must be updated to read the latest stuff. It's a blast to spend time writing in Wake instead of C++, but it risks lowering productivity as we near V1.

That's why, behind the scenes, this release features a new build server for Wake. When features come out that are not binary compatible with the last version of the compiler, it is now a simple task to rebuild all of the core libraries in order with the latest compiler, and save those libs in a bundle for our users to download and use in their projects.

Infrastructure like this is exactly what we need to keep productivity high as features keep coming out and we keep building new Wake libraries to maintain.

We are still looking for helping hands, and still chugging onward. See you again for the next major feature addition!