Wednesday, June 11, 2008

Erlang R12B-3 Released

There's a new release of Erlang/OTP out today, which you can download from the usual location. I've also imported it into the erlang-otp repository at GitHub.

According to the readme, this release contains an "experimental" regular expression module called re. The module wraps a lower-level PCRE library and is "many times faster than the pure Erlang implementation". It also looks like it works equally well with both binary- and list-based strings. So I guess the race is on to see who can get their WideFinder implementation up and running first...
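As a rough illustration of the list/binary point, here is a minimal sketch using re:run/3 as it exists in current OTP releases. The experimental interface shipped in R12B-3 may have differed, and the module name re_demo, the pattern, and the sample request line are made up for this example.

```erlang
-module(re_demo).
-export([run/0]).

%% Runs the same pattern against a list-based string and a binary,
%% asking for the first capture group back in the matching type.
run() ->
    Pattern = "GET (\\S+)",
    {match, [FromList]} =
        re:run("GET /index.html HTTP/1.1", Pattern,
               [{capture, all_but_first, list}]),
    {match, [FromBin]} =
        re:run(<<"GET /index.html HTTP/1.1">>, Pattern,
               [{capture, all_but_first, binary}]),
    {FromList, FromBin}.
```

Calling re_demo:run() should give back the path twice, once as a list and once as a binary.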

Sunday, June 8, 2008

Updated Erlang/OTP Repository at GitHub

I ended up deleting and recreating the erlang-otp repository at GitHub today, in case anyone's having problems accessing it. My original import script had a bug that prevented files from being deleted correctly between versions, which meant the files in the repository didn't exactly match the files in the source tarballs. It should be fixed now, but if you were using the old repo you'll probably need to do a clean git clone to get things working again.

Sunday, June 1, 2008

Message Ordering in Erlang

I've noticed that when people first learn Erlang (myself included), they tend to assume that messages sent between processes must be handled in the same order they're received. Turns out this isn't the case - you can pluck a message out from anywhere in a process's inbox, assuming you know a pattern that uniquely matches it. This is called a "selective receive".

For example, let's say we fire off a couple of messages to a process registered as myserver:

myserver ! {name, "World"},
myserver ! {greeting, "Hello"}
Even though we're sending the greeting message after the name message, it's possible for the receiver to process the greeting first, by matching on the greeting atom in the first position of the tuple:
Greeting = receive {greeting,G} -> G end,
Name = receive {name,N} -> N end
Here the first line removes the second message (the greeting) from the inbox, and the second line removes the first message (the name).
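If you want to try this without setting up a registered process, here is a minimal self-contained sketch. It sends both messages to the calling process itself instead of to myserver; the module name selective_demo is made up for the example.

```erlang
-module(selective_demo).
-export([run/0]).

%% Sends two messages to the calling process, then receives them
%% in the opposite order using selective receive.
run() ->
    Self = self(),
    Self ! {name, "World"},
    Self ! {greeting, "Hello"},
    %% Skips over {name, _}, which stays queued, and takes the greeting.
    Greeting = receive {greeting, G} -> G end,
    %% The name message is still in the inbox, so this matches right away.
    Name = receive {name, N} -> N end,
    {Greeting, Name}.
```

Calling selective_demo:run() returns {"Hello","World"}, even though the name was sent first.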

This may not seem like a big deal, but there are a lot of cases where this behavior comes in extremely handy. Below are a couple of examples: synchronous messaging and parallel processing of lists.

Synchronous Messaging

Layering synchronous messaging (i.e. your basic remote request/response) on top of the asynchronous stuff built into Erlang seems like it should be easy - you just send a message to another process, and wait until you get a message back. And a lot of the time that's all there is to it.

However, it's possible that your client process has lots of messages coming in from different places, in which case there's no guarantee that the first message you receive after sending a request will be the actual response to your request. You'll have to do a selective receive to filter out the irrelevant messages.

An easy way to do this is to include a unique token with each request, and have the server include it in its response. That way the client can just pattern match on the token.

For example, say we have a time server that sends the current time back to any process that requests it (plus the token):
time_server() ->
    receive
        {get_time, Pid, Token} ->
            Pid ! {time, time(), Token}
    end,
    time_server().
We can then get the time from the server using the following two functions. The first is responsible for sending the get_time message to the time server. It uses make_ref to generate the token:
request_time(TimeServer) ->
    Token = make_ref(),
    TimeServer ! {get_time, self(), Token},
    Token.
The second is responsible for taking that token and using it to selectively retrieve the response from the inbox:
receive_time(Token) ->
    receive {time, Time, Token} -> Time end.
That's pretty much it. We just need to put the two together:
get_time(TimeServer) ->
    Token = request_time(TimeServer),
    receive_time(Token).
We now have a function that will send a request to another process and block until it receives a response, without affecting any of the other messages in the inbox.
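To check the "without affecting any of the other messages" part, here is a small sketch that puts the pieces into one module and queues an unrelated message before making the call. The module name sync_demo, the run/0 wrapper, and the unrelated atom are made up for this example; the request and receive steps are folded into get_time/1 to keep it short.

```erlang
-module(sync_demo).
-export([run/0]).

%% Same server loop as above: answer each request, echoing the token.
time_server() ->
    receive
        {get_time, Pid, Token} ->
            Pid ! {time, time(), Token}
    end,
    time_server().

%% Send a request, then block until the reply carrying our token arrives.
get_time(TimeServer) ->
    Token = make_ref(),
    TimeServer ! {get_time, self(), Token},
    receive {time, Time, Token} -> Time end.

run() ->
    Server = spawn(fun time_server/0),
    %% Queue a message that has nothing to do with the request.
    self() ! unrelated,
    Time = get_time(Server),
    %% Check (without blocking) that the unrelated message is still queued.
    StillQueued = receive unrelated -> true after 0 -> false end,
    exit(Server, kill),
    {Time, StillQueued}.
```

sync_demo:run() should return the current {Hour, Minute, Second} tuple paired with true, showing that the unrelated message was left alone.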

Parallel Processing Of Lists

Now let's say we want to make calls to a bunch of different servers. And we'd like those servers to execute the calls in parallel because, well, that's just how we do things in Erlang.

So continuing our previous example, assume we have a list of time_server processes. We want to send our get_time message to all of them, and we want to block until we receive all of the responses. We'd also like the responses to be returned as a list whose order corresponds with the original list, so that we know which time value goes with which time_server. (We can't just use the order the responses are received, since that's basically random.)

A very concise way to implement all this in Erlang is to use two list comprehensions: one to send the requests out and generate the list of tokens, and another to take the list of tokens and selectively receive the responses. For example:
get_times(TimeServers) ->
    Tokens = [ request_time(TimeServer) || TimeServer <- TimeServers ],
    [ receive_time(Token) || Token <- Tokens ].
That's all it takes to execute those calls in parallel, and to put the results in the right order. Not bad for two lines of code.
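Time values all look alike, so the ordering is hard to see with real time servers. Here is a sketch of the same two-comprehension pattern using made-up servers that each reply with a label after a different delay. Everything in it (the module name parallel_demo, labelled_server/2, the labels and the delays) is hypothetical and only there to make the ordering visible.

```erlang
-module(parallel_demo).
-export([run/0]).

%% Hypothetical server: waits DelayMs, then replies with its own label.
labelled_server(Label, DelayMs) ->
    receive
        {get, Pid, Token} ->
            timer:sleep(DelayMs),
            Pid ! {reply, Label, Token}
    end,
    labelled_server(Label, DelayMs).

%% Fire off a request and hand back the token that identifies it.
request(Server) ->
    Token = make_ref(),
    Server ! {get, self(), Token},
    Token.

%% Block until the reply carrying this particular token shows up.
collect(Token) ->
    receive {reply, Value, Token} -> Value end.

run() ->
    %% The replies will arrive in the order b, c, a.
    Servers = [spawn(fun() -> labelled_server(Label, Delay) end)
               || {Label, Delay} <- [{a, 300}, {b, 100}, {c, 200}]],
    Tokens = [request(Server) || Server <- Servers],
    Results = [collect(Token) || Token <- Tokens],
    lists:foreach(fun(Server) -> exit(Server, kill) end, Servers),
    Results.
```

parallel_demo:run() returns [a,b,c], matching the order of the server list and not the order in which the replies arrived. The whole call takes about as long as the slowest server, since the three delays overlap.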

And More...

We could take this example a lot further: maybe we want to ignore time servers that don't respond within a certain amount of time; or automatically restart them when they crash; or process the responses as they come in instead of collecting them in a list; and so on.

All these scenarios are easy to implement using selective receives (and other language features like linked processes). The interprocess communication built into Erlang seems simple, but it's been carefully designed to give you a lot of flexibility when designing your system.
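As one example, here is a minimal sketch of the timeout case, using the after clause of receive. The module name timeout_demo, the two-argument receive_time/2, and the return values {ok, Time} and timeout are choices made for this sketch, not something from the code above.

```erlang
-module(timeout_demo).
-export([receive_time/2, run/0]).

%% Like receive_time/1, but gives up after TimeoutMs milliseconds.
receive_time(Token, TimeoutMs) ->
    receive
        {time, Time, Token} -> {ok, Time}
    after TimeoutMs ->
        timeout
    end.

run() ->
    %% Nobody ever replies to this token, so we time out after 100 ms.
    Silent = make_ref(),
    First = receive_time(Silent, 100),
    %% Fake a reply for a second token to show the successful path.
    Answered = make_ref(),
    self() ! {time, {12, 0, 0}, Answered},
    Second = receive_time(Answered, 100),
    {First, Second}.
```

timeout_demo:run() returns {timeout,{ok,{12,0,0}}}. One thing to watch out for: if a server replies after the timeout has fired, its message stays in the inbox until something removes it.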

Friday, May 9, 2008

"Hello, World" Revisited - Automatic Reloading

My last post contained some code for a minimal "Hello, World" webapp in Erlang. However, that code wasn't very Erlangy - if you wanted to make a change to the application, you had to kill the shell and restart it. That's clearly not going to impress the "nine nines uptime" crowd. Plus it's annoying to develop that way.

So here's some bonus code that will cause the server to compile and reload changes on the fly, without having to restart the server. It probably won't get you to nine nines, but it's a start. Just append the following to the original "hello_world.erl" file:


reload(SessionID, Env, Input) ->
    case make:all([load]) of
        up_to_date ->
            hello_world:service(SessionID, Env, Input);
        _ ->
            mod_esi:deliver(SessionID, [
                "Content-Type: text/plain\r\n\r\n",
                "compilation error"
            ])
    end.

The reload function acts as a wrapper around our original service function, calling make:all (which will compile and reload any out of date code) before forwarding the request.

You'll also need to stick an export at the top of the file, after the module declaration:

-export([reload/3]).

Last but not least, you'll need to create a file called "Emakefile", which tells Erlang what to compile during the call to make:all. In our case, the file is pretty simple:

{'*', []}.

Now start up the erl shell and run the same steps as before:

Eshell V5.6 (abort with ^G)
1> c(hello_world).
{ok,hello_world}
2> inets:start().
ok
3> hello_world:start().
{ok,<0.51.0>}

But this time, use the following URL to test the app, which will hit our new reload function:

http://localhost:8081/erl/hello_world:reload

You should now be able to make changes to hello_world.erl (e.g. change the message from "Hello, world" to something else), hit refresh in your browser, and immediately see the changes.

What's really happening here?

When I first started learning Erlang, I'd assumed that whenever you reloaded modules (via make:all or code:load_file or whatever) that those changes would go into effect immediately, similar to the way the load command works in Ruby. However, that's not quite how things work - just because changes to a module have been loaded doesn't necessarily mean they will be executed. This is because Erlang gives you very fine grained control over when code changes take effect.

The key to all this is the ':' operator. On the surface, it just looks like a namespace separator, used to disambiguate between functions with the same name in different modules. For example, within our hello_world module, you would think that the following two lines of code would be identical, and that the hello_world: in the first line would be redundant:

hello_world:service(SessionID,Env,Input).
service(SessionID,Env,Input).

However, they're not quite the same. The call in the first line actually checks to see if a newer version of the hello_world module has been loaded, and if so, dispatches to that version. The call in the second line always dispatches to the same version as is currently running, even if a newer version has been loaded. If a new version of the module has not been loaded, their behavior is the same.

The upshot of this is that you can have two versions of your code running at the same time (Erlang doesn't let you have more than two, however). This is similar to the way some web servers have a "graceful" restart option that allows currently executing requests to continue using the old configuration, while new requests use a different configuration. Except in Erlang you can pick the exact function call in your application where these upgrades happen. Very cool.

You can test this behavior by leaving out the hello_world: prefix in the fourth line of the reload function. You'll notice that your changes are no longer picked up immediately, but are picked up on the next invocation (since the initial call to hello_world:reload by the HTTP server will trigger an update).
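The same idea applies to long-running processes. Here is a sketch of a server loop that stays on its current version for ordinary messages and only switches to newly loaded code when it is told to. The module name upgrade_loop and the bump, upgrade and stop messages are made up for this example.

```erlang
-module(upgrade_loop).
-export([start/0, loop/1]).

start() ->
    spawn(fun() -> loop(0) end).

%% loop/1 has to be exported so the fully qualified call below can reach it.
loop(Count) ->
    receive
        {bump, From} ->
            From ! {count, Count + 1},
            %% Local call: keeps running the version we are already in.
            loop(Count + 1);
        upgrade ->
            %% Fully qualified call: continues in the newest loaded version,
            %% carrying the current state along.
            ?MODULE:loop(Count);
        stop ->
            ok
    end.
```

With this in place you can edit and reload the module while the process is running, and nothing changes for that process until you send it the upgrade message. If no newer version has been loaded, the upgrade message has no visible effect.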

Sunday, May 4, 2008

"Hello World" Webapp in Erlang

When I first started using Erlang it took a fair bit of trial and error to create a server that generated dynamic HTTP content. The main problem is the lack of good tutorials out there for doing these kinds of things. The documentation that comes with Erlang is good for reference, but is not that helpful if you're just learning the language.

In particular, there didn't seem to be a basic "Hello World" example for building web applications. So here's my attempt to fix that. Below is some code that starts up an HTTP server, and dynamically generates a simple "Hello, World" page:


-module(hello_world).
-export([start/0,service/3]).

start() ->
    inets:start(httpd, [
        {modules, [
            mod_alias,
            mod_auth,
            mod_esi,
            mod_actions,
            mod_cgi,
            mod_dir,
            mod_get,
            mod_head,
            mod_log,
            mod_disk_log
        ]},
        {port,8081},
        {server_name,"hello_world"},
        {server_root,"log"},
        {document_root,"www"},
        {erl_script_alias, {"/erl", [hello_world]}},
        {error_log, "error.log"},
        {security_log, "security.log"},
        {transfer_log, "transfer.log"},
        {mime_types,[
            {"html","text/html"},
            {"css","text/css"},
            {"js","application/x-javascript"}
        ]}
    ]).

service(SessionID, _Env, _Input) ->
    mod_esi:deliver(SessionID, [
        "Content-Type: text/html\r\n\r\n",
        "<html><body>Hello, World!</body></html>"
    ]).

To run it, save the code to a file called hello_world.erl, and create two subdirectories next to it called "www" and "log" (these subdirectories can be empty, but they need to be there for the server to start). Then fire up erl and run the following three commands:

Eshell V5.6 (abort with ^G)
1> c(hello_world).
{ok,hello_world}
2> inets:start().
ok
3> hello_world:start().
{ok,<0.51.0>}

You should now be able to browse to the following URL and see your message (if for some reason it doesn't work for you please let me know):

http://localhost:8081/erl/hello_world:service

For more info, here are the reference docs for the relevant Erlang modules:

Friday, April 25, 2008

Erlang/OTP Source at GitHub

I wrote a little script the other day to download all of the Erlang/OTP source releases that were available at erlang.org, and stick them in a single git repository. I've uploaded it to GitHub, if anyone's interested:

http://github.com/mfoemmel/erlang-otp/tree/master

I found out after the fact that archaelus had done something similar, and has a git repository hosted here:

http://git.erlang.geek.nz/?p=erlang-otp.git;a=summary

The main difference between the two is that the one at GitHub includes releases going a lot further back (R6B-0 vs R11B-5) - which is good if you're curious about how Erlang has evolved over time, but also means the repository is that much bigger when it comes time to do a clone (Erlang includes a bunch of binary files in their "source" releases, which don't seem to compress very well). Archaelus also includes a few 3rd party patchsets in his repository, which may be of interest.

The nice thing about GitHub, however, is that it makes it really easy for anyone to branch a project and make changes, and then make those changes available to everyone else (who can then merge them back into their own branches, and so on). Maybe this could help open up the Erlang development process a bit?

Thursday, April 10, 2008

Erlang R12B-2 Released

Looks like the second service pack for Erlang/OTP 12B has been released.

One nice change is that the percept application no longer depends on libgd. This should make compiling for Mac OS X a bit easier, since Leopard doesn't ship with GD by default.

They also appear to have fixed some of the other OS X related build issues that cropped up in the first service pack.