Sunday, November 18, 2007

Erlang: Initial Thoughts

As an experiment, I recently tried porting some code from a Java project at work to Erlang, to see if that might be a direction we'd want to head in. The code being converted is very functional (e.g. almost all data structures are immutable) and easily parallelizable, so it seemed like a good fit.

I was hoping to quickly find some excuse for why Erlang wouldn't be suitable for the kind of work we're doing. The little voice in my head has been nagging me for a while that "Java is the wrong language for this project" and I was hoping to shut it up, at least for a while, so that I could get back to work.

But so far it's been pretty clear sailing. Erlang is famous for its concurrency features, but the thing that surprised me was how concise a language it is. Of the handful of classes I've converted, the Erlang versions were pretty consistently about 25% the size of the Java versions (and I'm sure they'll get even smaller once I know Erlang better).

The reasons for the big difference seem to be:

Higher-order Functions

A lot of our code involves iterating through lists, performing some action on each element, and aggregating the results. For example, in Java, if you want to go through a list of items and add up all of their "quantity" fields, you might do something like:


public int totalQuantity(List<Item> items) {
    int totalQuantity = 0;
    for (Item item : items) {
        totalQuantity += item.getQuantity();
    }
    return totalQuantity;
}

Whereas in Erlang you can use higher-order functions ("map" in this case) to do this in one line:

total_quantity(Items) -> lists:sum(lists:map(fun(Item) -> quantity(Item) end, Items)).

Erlang also supports list comprehensions, which can make things a bit more readable:

total_quantity(Items) -> lists:sum([quantity(Item) || Item <- Items]).

This is pretty standard functional programming stuff - no real surprises. (Java will hopefully improve on this front with the introduction of closures in JDK7.)
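To make the one-liners above something you can compile and try in the shell, here is a minimal sketch. It assumes each item is a {Name, Quantity} tuple; the module name quantity_demo and the quantity/1 helper are hypothetical and not taken from the original project. The fold variant is an extra illustration of the same idea, not something the code above used.

```erlang
-module(quantity_demo).
-export([quantity/1, total_map/1, total_comprehension/1, total_fold/1]).

%% Hypothetical accessor: assumes an item is a {Name, Quantity} tuple.
quantity({_Name, Quantity}) -> Quantity.

%% Higher-order version: map each item to its quantity, then sum.
total_map(Items) ->
    lists:sum(lists:map(fun quantity/1, Items)).

%% List comprehension version of the same computation.
total_comprehension(Items) ->
    lists:sum([quantity(Item) || Item <- Items]).

%% Single pass with lists:foldl/3, avoiding the intermediate list.
total_fold(Items) ->
    lists:foldl(fun(Item, Acc) -> Acc + quantity(Item) end, 0, Items).
```

With Items = [{"Carrots", 3}, {"Peas", 4}], all three functions return 7.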

Pattern Matching

I was more surprised by how handy Erlang's pattern matching facilities were. For example, using a single "=" operator, you can do multiple things like check that a status code equals "ok", assert that a list contains at least two items, and extract the first item from that list:

{ok, [H,_|_]} = foo().

If you've only ever used languages where "=" means assignment, this statement may look a bit strange. I won't get into how it all works here, since it's a big topic, but I will say that once you get the hang of it, it's hard to go back.

From a language perspective, Erlang's pervasive use of pattern matching is probably what differentiates it most from some of the more mainstream programming languages. Patterns are used everywhere, including function parameters, case expressions, and when receiving messages from other processes.
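As a rough illustration of those three places, here is a small sketch. The module name pattern_demo, the function names, and the message shapes ({reply, Value}, stop) are all made up for the example and do not come from any real project.

```erlang
-module(pattern_demo).
-export([first_of_two/1, describe/1, wait_for_reply/0]).

%% Patterns in function heads: the first clause only matches an
%% {ok, List} tuple whose list has at least two elements.
first_of_two({ok, [H, _ | _]}) -> {found, H};
first_of_two({ok, _ShortList}) -> too_short;
first_of_two({error, Reason})  -> {failed, Reason}.

%% Patterns in a case expression: clauses are tried top to bottom.
describe(Result) ->
    case Result of
        {ok, []}         -> empty;
        {ok, [_]}        -> one_item;
        {ok, [_, _ | _]} -> many_items;
        _Other           -> unknown
    end.

%% Patterns when receiving messages: the first message in the mailbox
%% that matches a clause is taken; give up after one second.
wait_for_reply() ->
    receive
        {reply, Value} -> {ok, Value};
        stop           -> stopped
    after 1000 ->
        timeout
    end.
```

For example, pattern_demo:first_of_two({ok, [a, b, c]}) returns {found, a}, while pattern_demo:first_of_two({ok, [a]}) falls through to the second clause and returns too_short.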

Tuples

Being able to create new data structures on the fly, without having to define a new class, is a big time saver. For example, to create a value representing an item and its quantity, you can just do:

Item = {"Carrots", 3}.

These types of structures are a pain to deal with in Java, since you need to create an entirely new class. It's easy to blame static typing for this, but the .NET folks can do something similar, thanks to anonymous types.
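Tuples and pattern matching work well together: the same {Name, Quantity} shape used to build the value can be used to take it apart again. The following is only a sketch; the module name tuple_demo and both function names are hypothetical.

```erlang
-module(tuple_demo).
-export([name/1, add_quantity/2]).

%% Destructure the tuple in the function head to read one field.
name({Name, _Quantity}) -> Name.

%% Tuples are immutable, so "updating" means building a new tuple.
add_quantity({Name, Quantity}, Extra) ->
    {Name, Quantity + Extra}.
```

Given Item = {"Carrots", 3}, calling tuple_demo:add_quantity(Item, 2) returns {"Carrots", 5} and leaves Item itself unchanged. The built-in element(2, Item) is another way to read the quantity without a helper function.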

Summary

So, I've yet to hit any major roadblocks. That's not to say that Erlang is without its quirks - the error messages it spits out can be quite cryptic, and the way it deals with strings seems clumsy. But so far no showstoppers.