Monday, January 8, 2007

JRake, Part 4: Multithreading

First off, I'd like to thank everyone for all the feedback I've received so far regarding JRake. It's been extremely helpful - please keep it coming.

One of the things I should probably clarify is that JRake, in its current form, is not really a standalone tool that people should be considering for real world use. It's a series of spikes, meant to kick JRuby's tires and to test out some ideas I've had for how to build a better build tool. I decided to blog about it as I was pleasantly surprised by how well it all worked, and thought others might be interested in seeing the types of things you can now do with JRuby (it's come a long way).

My hope is that JRake eventually will evolve into a production-quality tool, either as a standalone codebase or part of another project, but we're not there yet. In the mean time, all the features I mention here should be considered "experimental".

With that in mind, let's check out the next hack:

Parallel JRake

In the previous posts, we tackled performance by eliminating the number JVM startups during the build. In the end, we got that number down to zero, so we can't really follow that line of attack much further.

The next logical step (assuming we don't want to start tearing apart the actual javac code) is to start executing tasks in parallel, where possible. This won't help folks running on single-processor/single-core machines, and in practice won't really help folks who just need to recompile one or two files; but there's a potential big win for those doing clean builds on multi-processor/multi-core machines.

The idea is not new, and most existing build tools support at least some sort of parallel processing. For example, ant has a parallel task that can be used to execute subtasks in parallel, and (standard) rake has a multitask task. These are examples of explicit parallelism, as the user must declare exactly which tasks are to be run in parallel.

The alternative is implicit parallelism, which is supported by tools like make (via its -j option, explained here). In this case, the tool uses the dependencies specified in the build script to automatically infer which steps can be run in parallel. For example, consider a simple client/server application with the following build targets:

Here we have a "test" target, which depends on both the "client" and "server" targets, which in turn depend on the "common" target. There is no direct dependency between the "client" and "server" targets, however. A build tool that supports implicit parallelism will note this, and build those targets in parallel if it can. This can lead to big improvements in build times, depending on the complexity of the build script (more targets equals more chances for parallelism) and the type of computer its running on (the more processors the better).

Of the two, I believe implicit parallelism is the way to go. Multicore machines are becoming the standard, and the number of developers who will be able to take advantage of parallel builds is going to increase quite a bit over the next few years, so having that be the default makes sense. Plus it will be easier to bake it in from the start, than to try and tack it on later.

Implementation

To get it working, I ended up bypassing the normal Rake methods for determining execution order, and created my own TaskCoorindator class. It uses a pool of worker threads to handle task execution, while the main thread does the work of figuring out which tasks to submit to the pool. The worker threads use a blocking queue to communicate the results of each task execution back to the main thread.

Here's the TaskCoordinator code (for whatever subset of you happen to be familiar with both JRuby/Rake and the java.util.concurrent package):

class TaskCoordinator

def initialize(pool_size=2)
@thread_pool = java.util.concurrent.Executors.newFixedThreadPool(pool_size)
end

def execute(root)
@result_queue = java.util.concurrent.LinkedBlockingQueue.new
@leafs = Set.new # tasks with no prereqs
@children = {} # maps task -> tasks it depends on
@parents = {} # maps task -> tasks that depend on it

analyze_dependencies(root)

@leafs.each do |leaf|
execute_task(leaf)
end

# Keep pulling task results from the queue, until we get to the root target
# Remove non-root targets from the dependency graph
# (i.e. delete the relevent members in the children/parent hash maps)
# If this results in any tasks with no children (i.e. no more prereqs to execute)
# then submit that task to the thread pool for execution.
while (result = @result_queue.take).task != root
@parents[result.task].each do |posttask|
list = @children[posttask]
list.delete(result.task)
if list.empty?
@children.delete(posttask)
execute_task(posttask)
end
end

@parents.delete(result.task)
end
end

def shutdown
@thread_pool.shutdown
end

private

# Stores the relationship between the tasks involved in this
# build, using the "children" and "parents" hash tables
def analyze_dependencies(task, analyzed=Set.new)
return if analyzed.member?(task)

analyzed << task

prereqs = task.prerequisite_tasks
if prereqs.empty?
@leafs << task
else
prereqs.each do |child|
(@children[task] ||= Set.new) << child
(@parents[child] ||= Set.new) << task
analyze_dependencies(child,analyzed)
end
end
end

# Wraps the task in a TaskRunner and passes it to the
# thread pool for execution
def execute_task(task)
@thread_pool.execute(TaskRunner.new(task, @result_queue))
end
end

TaskResult = Struct.new(:task, :exception)

class TaskRunner < java.lang.Runnable
attr_reader :task

def initialize(task, queue)
super()

@task = task
@queue = queue
@exception = nil
end

def run
begin
@task.execute
rescue Exception => exception
puts exception
@exception = exception
ensure
@queue.add(TaskResult.new(@task, @exception))
end
end
end

Sample Output

I set up a simple build script using the client/server dependency graph shown above. The following output demonstrates how the build script runs when we only use one thread:

[pool-1-thread-1] common - begin
[pool-1-thread-1] common - end
[pool-1-thread-1] client - begin
[pool-1-thread-1] client - end
[pool-1-thread-1] server - begin
[pool-1-thread-1] server - end
[pool-1-thread-1] test - begin
[pool-1-thread-1] test - end

real 0m3.230s
user 0m0.007s
sys 0m0.013s
Just as we would expect: the tasks are executed serially and there is no overlap.

Below is the output when running with two threads:

[pool-1-thread-2] common - begin
[pool-1-thread-2] common - end
[pool-1-thread-1] client - begin
[pool-1-thread-2] server - begin
[pool-1-thread-2] server - end
[pool-1-thread-1] client - end
[pool-1-thread-2] test - begin
[pool-1-thread-2] test - end

real 0m2.542s
user 0m0.007s
sys 0m0.012s

As you can see, the "client" and "server" tasks overlap, and the total build time is reduced.

Danger

Unfortunately, Parallel JRake has the same problem as all multithreaded code: nondeterminism. With the exception of explicitly declared dependencies between tasks, users can't control the order in which things get executed.

Using the example above, this means that sometimes "client" will get compiled first, sometimes "server" will get compiled first, and sometimes they will overlap. In theory, we shouldn't care. However, if there is actually is some sort of dependency between the two tasks, but this hasn't been explicitly declared in the rakefile, the build might succeed one time and fail the next.

There are couple ways that I can think of to lessen the impact of this, though it can't be completely eliminated. One way is to use Raven style tasks, where the values for properties like "output directory" and "classpath" are automatically supplied by the tool itself (by analyzing the tasks dependencies). That way, the only way to get things to compile is to declare your dependencies correctly.

On top of this, it might make sense to introduce some nondeterminism even when running in single-threaded mode. For example, we could intentionally randomize whether we execute the "client" task or the "server" task first. That won't prevent bugs from occurring in the build script, but should help us catch them faster.

Source Code

Once again, I've made the complete project available via subversion, if anyone would like to play with it:

svn://svn.foemmel.com/blog/jrake/parallel

To run the server, go to the project root directory and run:

bin/jraked

You can then build specific targets like this (you'll need to have curl installed):

bin/jrake unit
bin/jrake clean

Alternatively, you can run builds from your browser, by visiting a URL like this:

Other Changes

I've also made a few other tweaks to JRake since my last post, in case anyone is interested:

  • You can now send an HTTP request to the JRake server to build any target, using the URL syntax: http://localhost:3030/targetname. The build log will be sent back to the client as text/plain so you can just use curl or wget to run builds. This eliminates the need for the JRake Shell.

  • The autoreload servlet will now only perform a reload when the use hits "refresh" in their browser. A "hard refresh" (ctrl-F5) is required in IE, and it doesn't work at all with Safari. So it still needs some work...

No comments: