If you’ve not heard about node.js, do watch the video from a talk given by Ryan Dahl (project lead) at JSConfEU 2009.
The thing I found most interesting is how he handled the shortcomings of libraries like eventmachine. eventmachine is great for building highly scalable single-threaded network servers using async network I/O. But while writing your network servers, you have to be careful not to introduce blocking calls in your event callbacks. Since your server is single-threaded, if you block inside an event callback, you are not handling any other incoming requests for as long as the thread is blocked. Why would you need to make blocking calls? Your MySQL client library will typically do blocking I/O. POSIX filesystem calls are blocking. There are many other such cases, and each one reduces the usefulness of eventmachine. So in node.js, Ryan has decided to wrap these blocking APIs and expose them in JavaScript in a non-blocking manner: a batteries-included arsenal of non-blocking APIs in one package, if you will. Take a look at the node.js API docs to get an idea of the kinds of things that are supported.
Underneath, node.js offloads all the blocking calls to a thread pool; when a call finishes, the result is fetched by reading from a pipe that the worker threads write to. Incidentally, pipes are select()-able, which means these completion events can be handled within the main event loop of node.js in exactly the same manner as the other network socket events. I quickly implemented a barebones proof-of-concept of this idea in Python:
Note - if you’re reading this via an RSS reader, open this in a browser to see the gist code embed above.
The code is only a proof of concept and does very little error handling.
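For readers who want something to run, here is a rough sketch of the thread-pool-plus-pipe idea described above. It is not the gist itself. The names BlockingCallPool, submit, dispatch_ready, run_loop and slow_echo are made up for illustration, and it assumes a POSIX system where select() works on pipes.

```python
import os
import queue
import select
import threading
import time


class BlockingCallPool:
    """Runs blocking callables on worker threads and signals completion via a pipe."""

    def __init__(self, num_workers=4):
        self._tasks = queue.Queue()
        self._results = queue.Queue()
        # Workers write one byte per finished task; the event loop select()s
        # on the read end, just as it would on a network socket.
        self.read_fd, self._write_fd = os.pipe()
        for _ in range(num_workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, func, callback, *args):
        """Queue func(*args) to run off the main thread; callback gets (result, error)."""
        self._tasks.put((func, args, callback))

    def _worker(self):
        while True:
            func, args, callback = self._tasks.get()
            try:
                outcome = (callback, func(*args), None)
            except Exception as exc:
                outcome = (callback, None, exc)
            # Publish the result first, then wake the event loop with one byte.
            self._results.put(outcome)
            os.write(self._write_fd, b"x")

    def dispatch_ready(self):
        """Call from the event loop once read_fd is readable; runs the callbacks."""
        finished = len(os.read(self.read_fd, 4096))
        for _ in range(finished):
            callback, result, error = self._results.get()
            callback(result, error)
        return finished


def run_loop(pool, pending, timeout=5.0):
    """Tiny event loop: a real server would also pass its sockets to select()."""
    while pending > 0:
        readable, _, _ = select.select([pool.read_fd], [], [], timeout)
        if not readable:
            raise TimeoutError("no completion events arrived")
        pending -= pool.dispatch_ready()


def slow_echo(delay):
    """Stand-in for a blocking call such as a database query or a file read."""
    time.sleep(delay)
    return delay


def demo():
    pool = BlockingCallPool(num_workers=2)
    seen = []
    for delay in (0.2, 0.1):
        pool.submit(slow_echo, lambda result, error: seen.append(result), delay)
    run_loop(pool, pending=2)
    return seen


if __name__ == "__main__":
    print(demo())
```

The callbacks run on the main thread inside dispatch_ready, so they never race with each other. Only the blocking work itself happens on the pool threads.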
I was playing with Riak yesterday and ran into this error on my MacBook running Snow Leopard:
{
{badmatch,{error,{{badmatch,{error,
{file_error,"./dets-store/1392993748081016843912887106182707253109560705024",emfile}}},
[ {riak_vnode,init,1},
{gen_server2,init_it,6},
{proc_lib,init_p_do_apply,3} ]}}
},
[{riak_vnode_master,get_vnode,2},
{riak_vnode_master,handle_cast,2},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]
}
When using the dets storage backend, Riak seems to create and open a dets database file for each vnode (partition) in your ring. When there’s only one node in your cluster, I’m guessing all the vnodes are owned by that node, which results in a whole bunch of files being opened (under the dets-store folder you configured). The emfile in the error means the process ran out of file descriptors. On Snow Leopard, I had to run ulimit -n 8192 to raise the limit on the number of fds a process can open. You probably won’t notice this normally - I had increased the number of partitions in the config file and hence ran into this problem.
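The same per-process limit can be inspected and raised from inside a program. As an illustration only (Riak itself is started from a shell, where ulimit is the right tool), here is a small Python sketch using the standard resource module. The helper name raise_fd_limit is made up, and the 8192 default is just the value I used above.

```python
import resource


def raise_fd_limit(target=8192):
    """Raise this process's soft open-file limit, roughly what `ulimit -n` does."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may only raise its soft limit up to the hard limit.
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if soft != resource.RLIM_INFINITY and soft < wanted:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    # Return the soft limit that is actually in effect now.
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft fd limit is now", raise_fd_limit())
```

The new limit applies to the current process and any children it starts afterwards, not to processes that are already running.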
I’ll be speaking about Hadoop at FOSS.MY 2009 in KL. They have an interesting bunch of speakers this year, including Brian Aker, David Axmark and RMS. This will be my first time at FOSS.MY!
If you’re attending, do drop by my talk on Sunday afternoon :)