Node.JS Threading Model

By Jordi Gomez jordi.gomez@skyscanner.net
Senior Software Engineer at Skyscanner www.skyscanner.net
Made with Reveal.js http://lab.hakim.se/reveal-js/

I am not a full-time Node.JS developer

but I am passionate about researching new stuff

Node.JS is

  • Open-source
  • A cross-platform runtime environment for JavaScript
  • For server-side web applications
  • Based on Google V8 JavaScript engine
  • Which is written in C++
  • Hosted by the Node.js Foundation
  • Which is a collaborative project at the Linux Foundation

Source: Wikipedia https://en.wikipedia.org/wiki/Node.js

Made of

[Figure: Stack]

Source: http://apps-masters.com/database/amazing-features-node-js-top-5-server-side-scripts/
Source: https://github.com/nodejs/node-v0.x-archive/tree/master/deps

NPM - Package Manager for JavaScript

  • Installs JavaScript packages
  • The default package manager for Node.JS
  • Simple to use:
    $ npm search express
    $ npm install express
    $ npm uninstall express

Code looks like

  • Install some packages:
    $ npm install connect serve-static
  • server.js:

    var connect = require('connect');
    var serveStatic = require('serve-static');
    // Serve the files under this script's directory.
    var server = connect().use(serveStatic(__dirname));
    server.listen(8080);

  • Run a static file server:
    $ node server.js

Source: http://stackoverflow.com/questions/6084360/using-node-js-as-a-simple-web-server

Browse it

[Figure: Browser]

Something a little more complex


var http = require("http"),
    url = require("url"),
    path = require("path"),
    fs = require("fs"),
    memcached = require('memcached');   

var error = function(response, status, err) {
  response.writeHead(status, {"Content-Type": "text/plain"});
  response.write(err + "\n");
  response.end();
};

var success = function(response, data) {
  response.writeHead(200);
  response.write(data, "binary");
  response.end();
};

var cache = new memcached('localhost:11211');
                        

http.createServer(function(request, response) {
  var uri = url.parse(request.url).pathname,
      filename = path.join(process.cwd(), uri);

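  // 1. Network I/O: look the file up in memcached.
  cache.get(filename, function(err, data) {
    // A cache error is treated as a miss: fall through to the disk.
    if (!err && data !== undefined) return success(response, data);

    // 2. Disk I/O: read the file. fs.exists is deprecated, so a missing
    //    file is detected from the ENOENT error code instead.
    fs.readFile(filename, "binary", function(err, file) {
      if (err && err.code === "ENOENT") return error(response, 404, "404 Not Found");
      if (err) return error(response, 500, err);

      // 3. Network I/O: cache the contents for 10 seconds, then respond.
      cache.set(filename, file, 10, function (err) {
        success(response, file);
      });
    });
  });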
}).listen(8080);
                        

That's basically I/O

  • We access memcached: network
  • We read the file: disk
  • We access memcached again: network
  • The same goes for database access or external services
  • These are slow operations: we wait for the OS to signal completion

The cost of I/O

Source: http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/

Threading models

  • Single thread (Python)
  • Multiple threads (Java)
  • Multiple processes (Apache+PHP)
  • Single thread, event-driven (Node.JS, Python+asyncio)

The event loop

  • Single thread but ...
  • ... event-driven programming
  • Based on libuv (low-level library)
  • The kernel has no async support for some operations
  • Background threads in libuv do the waiting ...
  • ... so everything runs in parallel EXCEPT our code

[Figure: Event loop]

  • Every call that involves I/O requires a callback
  • Usually, the call returns immediately and control goes back to the event loop
  • When the I/O completes, the callback is pushed to the event loop
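
A minimal sketch of the callback flow described above, assuming only Node's built-in fs module. The `order` array is a hypothetical helper that records when each step runs; the current working directory is used only because it is a path that normally exists.

```javascript
const fs = require('fs');

// Hypothetical helper: records the order in which steps run.
const order = [];

order.push('before stat');

// The call returns immediately; the callback is queued once the I/O completes.
fs.stat(process.cwd(), function (err, stats) {
  order.push(err ? 'callback (error)' : 'callback');
  console.log(order.join(' -> '));
});

// Runs before the callback: our code keeps the single thread
// until it returns control to the event loop.
order.push('after stat');
```

The callback entry is always logged last, even though fs.stat was called before 'after stat' was recorded.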

Advantages

[Figure: Event loop]
  • Our code runs on one thread: no locks, no data races between callbacks
  • Just callbacks
  • Easy parallelism for I/O
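
One way to picture "easy parallelism for I/O" is to start several operations back to back and collect the results in callbacks. This is a sketch only: `statAll` is a hypothetical helper, not part of Node, and the two paths are arbitrary examples.

```javascript
const fs = require('fs');
const os = require('os');

// Hypothetical helper: start one fs.stat per path without waiting for the
// previous one, then call done once all of them have reported back.
function statAll(paths, done) {
  const results = new Array(paths.length);
  let pending = paths.length;
  if (pending === 0) return process.nextTick(done, results);

  paths.forEach(function (p, i) {
    fs.stat(p, function (err, stats) {
      // Keep the error in place of the result so one failure does not hide the rest.
      results[i] = err ? err : stats;
      pending -= 1;
      if (pending === 0) done(results);
    });
  });
}

statAll([process.cwd(), os.tmpdir()], function (results) {
  results.forEach(function (r) {
    console.log(r instanceof Error ? 'error: ' + r.code : 'directory: ' + r.isDirectory());
  });
});
```

No locks are needed around `pending` or `results`, because only one callback runs at a time.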

Disadvantages

  • CPU-intensive work blocks the process
  • Memory leaks
    
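    // (the function name is illustrative)
    function listenOnce(base, cb) {
        // Closure used in the callback.
        var obj = new LeakObject();
        var once = function(e) {
            cb(e.type, obj);
            base.removeListener('change', once);
        };
        base.on("change", once);
        // obj is freed only after 'change' fires and the listener is removed;
        // until then the closure keeps it alive.
    }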
                                    
  • Debugging is not easy
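
The first disadvantage can be made visible with a timer. The sketch below assumes nothing beyond core JavaScript timers; `busyWait` and `measureTimerDelay` are hypothetical helpers written for this illustration.

```javascript
// Hypothetical helper: spin the CPU for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* burn CPU */ }
}

// Hypothetical helper: ask for a 10 ms timer, then block the thread for
// `blockMs` and report how late the timer actually fired.
function measureTimerDelay(blockMs, cb) {
  const scheduledAt = Date.now();
  setTimeout(function () {
    cb(Date.now() - scheduledAt);
  }, 10);
  // The timer cannot fire until this returns and the event loop regains control.
  busyWait(blockMs);
}

measureTimerDelay(200, function (delay) {
  console.log('timer fired after ' + delay + ' ms (asked for 10 ms)');
});
```

The reported delay is at least the blocking time, which is what every pending request would experience in a server doing the same work.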

libuv

[Figure: Callback queue]
  • Async network I/O: an event is placed in the poll queue when the OS reports activity
  • Async filesystem I/O: normal blocking system calls run on a separate thread from the thread pool

Source: http://libuv.readthedocs.org/en/latest/design.html

libuv: the I/O loop

[Figure: Callback queue]
  • The loop is alive while there are referenced handles, active requests or closing handles
  • The entire callback queue is processed in each event loop iteration
  • setImmediate: runs in the check phase, right after I/O polling in the same iteration
  • process.nextTick: runs as soon as the current operation finishes, before the loop moves on (beware of recursive starvation!)
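
A small sketch of how the three scheduling calls interleave. Everything is scheduled from inside an I/O callback so that the starting phase of the loop is known; `scheduleAll` is a hypothetical helper, and fs.stat on the working directory is used only as a cheap I/O operation.

```javascript
const fs = require('fs');

// Hypothetical helper: schedule one callback of each kind from inside an
// I/O callback and report the order in which they ran.
function scheduleAll(done) {
  const order = [];
  fs.stat(process.cwd(), function () {
    setTimeout(function () {
      order.push('setTimeout');
      done(order); // the timer runs last, so the list is complete here
    }, 0);
    setImmediate(function () { order.push('setImmediate'); });
    process.nextTick(function () { order.push('nextTick'); });
    order.push('sync');
  });
}

scheduleAll(function (order) {
  console.log(order.join(' -> '));
});
```

When scheduled from an I/O callback like this, the order is sync, nextTick, setImmediate, setTimeout. Scheduled from the main module instead, the relative order of setImmediate and setTimeout(0) is not guaranteed.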

Multiple processes

  • One Node.JS server process per core
  • Cluster module (should scale linearly)
    
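    var cluster = require('cluster'),
        http = require('http'),
        numCPUs = require('os').cpus().length;

    if (cluster.isMaster) {
      // Fork one worker per core.
      for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
      }
    } else {
      // Workers share port 8000.
      http.Server(function(req, res) { ... }).listen(8000);
    }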
                                
  • Or other cores can do the CPU-intensive work (maybe Web Workers)
  • With socket.io, whose handshake spans multiple connections, use the sticky-session module
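
As a sketch of letting another core do the CPU-intensive work, the example below hands a computation to a second Node process using the built-in child_process module. This is one possible approach, not the one the slide prescribes: `fibInChildProcess` and the Fibonacci task are hypothetical stand-ins for real work.

```javascript
const { execFile } = require('child_process');

// Hypothetical CPU-heavy task, kept as source text so a child process can run it.
const FIB_SOURCE = 'function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }';

// Hypothetical helper: compute fib(n) in a separate Node process.
function fibInChildProcess(n, cb) {
  const script = FIB_SOURCE + ' console.log(fib(' + Number(n) + '));';
  // process.execPath is the node binary running this script; -e evaluates the text.
  execFile(process.execPath, ['-e', script], function (err, stdout) {
    if (err) return cb(err);
    cb(null, Number(stdout.trim()));
  });
}

fibInChildProcess(30, function (err, value) {
  if (err) throw err;
  console.log('fib(30) =', value);
});

// The parent's event loop stays free while the child computes.
setImmediate(function () {
  console.log('parent is still responsive');
});
```

Starting a process per task is expensive, so a real server would more likely keep a pool of long-lived workers and pass messages to them.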

Source: http://stackoverflow.com/questions/2387724/node-js-on-multi-core-machines

Let's pick a fight

With PHP, 1K concurrent, 100K completed

Source: http://www.hostingadvice.com/blog/comparing-node-js-vs-php-performance/

With Java (Paypal)

With Java again

Source: https://www.paypal-engineering.com/2013/11/22/node-js-at-paypal/

Some conclusions

  • Ideal for I/O-bound work (network or disk)
  • Not suited for CPU-bound tasks
  • Callback paradigm (libuv background threads do the waiting)
  • Sometimes not very intuitive

Interesting References

  • "Nodejs in flames": http://techblog.netflix.com/2014/11/nodejs-in-flames.html
  • "600k concurrent websocket connections": http://www.jayway.com/2015/04/13/600k-concurrent-websocket-connections-on-aws-using-node-js/
  • "Understanding process.nextTick": http://howtonode.org/understanding-process-next-tick
  • "C++ bindings with Node.JS": https://pravinchavan.wordpress.com/2013/11/08/c-binding-with-node-js/
  • "Architecture of NodeJS": http://mcgill-csus.github.io/student_projects/Submission2.pdf
  • http://chimera.labs.oreilly.com/books/1234000001808/ch03.html#using_multiple_processors

Interesting Stack Overflow References

  • http://stackoverflow.com/questions/2353818/how-do-i-get-started-with-node-js
  • http://stackoverflow.com/questions/1884724/what-is-node-js
  • http://stackoverflow.com/questions/14795145/how-the-single-threaded-non-blocking-io-model-works-in-node-js
  • http://stackoverflow.com/questions/6084360/using-node-js-as-a-simple-web-server
  • http://stackoverflow.com/questions/22847406/what-steps-does-node-js-takes-to-execute-a-program
  • http://stackoverflow.com/questions/23893872/how-to-properly-remove-event-listeners-in-node-js-eventemitter
  • http://stackoverflow.com/questions/25568613/node-js-event-loop

Thanks

Skyscanner

  • Travel smarter with Skyscanner.
  • Leading global travel search site offering an unbiased, comprehensive and free flight search service as well as online comparisons for hotels and car hire.
  • Founded: 2003
  • 700 global employees, 50 nationalities
  • 9 global offices
  • 40m+ unique monthly visitors
  • 35m+ app downloads