Node.js Runtime Internals
V8 Engine and libuv Event Loop
Node.js consists of two core components: the V8 engine handles JavaScript compilation and execution, while the libuv library handles cross-platform asynchronous I/O. They are connected through Node.js’s C++ binding layer.
The event loop is the heart of Node.js, dividing operations into 6 phases, each maintaining a callback queue:
flowchart TD
A[Event Loop] --> B["timers<br/>Execute setTimeout/setInterval callbacks"]
B --> C["pending callbacks<br/>System-level callbacks (TCP errors, etc.)"]
C --> D["idle, prepare<br/>libuv internal use"]
D --> E["poll<br/>Retrieve new I/O events, execute I/O callbacks"]
E --> F["check<br/>Execute setImmediate callbacks"]
F --> G["close callbacks<br/>Close event callbacks (socket.on('close'))"]
G --> B
H["nextTickQueue<br/>process.nextTick"] -.->|Between each phase| I["microTaskQueue<br/>Promise.then"]
Six Phases In Detail
| Phase | Execution Content | Typical API |
|---|---|---|
| timers | Expired timer callbacks | setTimeout, setInterval |
| pending callbacks | System operation callbacks | TCP connection error callbacks |
| idle, prepare | libuv internal | — |
| poll | I/O events and callbacks | File read, network request completion callbacks |
| check | Immediately executed callbacks | setImmediate |
| close callbacks | Close events | 'close' event handlers (e.g. after socket.destroy()) |
Key details:
- process.nextTick is not part of any event loop phase; its queue is drained between phases (and, since Node.js 11, after every individual callback) and takes priority over the Promise microtask queue
- setImmediate executes in the check phase, right after poll, while setTimeout(0) executes in the timers phase; their relative order is non-deterministic outside an I/O callback, whereas inside an I/O callback setImmediate always runs first
- If the poll queue is empty: proceed to the check phase if a setImmediate is pending; wait for the nearest timer if timers are scheduled; otherwise block and wait for I/O
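The ordering rules above can be checked with a short script. This is a minimal sketch, assuming a CommonJS script on Node.js 11 or later; the order array and the use of fs.stat as a stand-in I/O operation are illustrative choices, not something the text prescribes.

```javascript
const fs = require('fs');

const order = [];

// fs.stat is only a convenient way to obtain a callback that runs in the poll phase.
fs.stat('.', () => {
  setTimeout(() => {
    order.push('setTimeout');
    console.log(order.join(' -> '));
  }, 0);
  setImmediate(() => order.push('setImmediate'));
  Promise.resolve().then(() => order.push('promise'));
  process.nextTick(() => order.push('nextTick'));
});
// prints: nextTick -> promise -> setImmediate -> setTimeout
```

Scheduling the same four callbacks at the top level of the main module keeps nextTick and the promise first, but the order of setImmediate and setTimeout can then vary between runs.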
Module System
CommonJS Loading Mechanism
// Export
module.exports = { add, subtract };
// Or
exports.add = function(a, b) { return a + b; };
// Import
const math = require('./math');
CommonJS loading is synchronous. When require() executes, it:
- Resolves the module path to an absolute path
- Checks the cache (require.cache); returns module.exports if already loaded
- If not cached, creates a module object, registers it in the cache, and executes the module code
- Returns module.exports
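The resolve, cache-check, execute, return sequence above can be observed directly. The sketch below is illustrative only: it writes a throwaway module into a temporary directory (the file name math.js and the cjs-demo- prefix are made up for the example) and inspects require.cache.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway module on disk so the sketch is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cjs-demo-'));
const file = path.join(dir, 'math.js');
fs.writeFileSync(file, 'exports.add = (a, b) => a + b;');

const first = require(file);             // cache miss: module code is executed
const second = require(file);            // cache hit: same exports object, no re-execution
const resolved = require.resolve(file);  // absolute path used as the cache key

const sameObject = first === second;
const isCached = require.cache[resolved].exports === first;

delete require.cache[resolved];          // evicting the entry forces a fresh load
const reloaded = require(file) !== first;

console.log({ sameObject, isCached, reloaded }); // all three are true
```

Deleting cache entries is mainly useful in tests and hot-reload tooling; application code normally relies on the cache to keep modules as singletons.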
This means CommonJS modules support conditional loading and dynamic loading:
if (process.env.NODE_ENV === 'production') {
const monitoring = require('./monitoring'); // Conditional loading
}
ESM (ECMAScript Modules)
// Export
export function add(a, b) { return a + b; }
export default class Calculator {}
// Import
import Calculator, { add } from './math.js';
Key differences between ESM and CommonJS:
| Feature | CommonJS | ESM |
|---|---|---|
| Loading | Synchronous, resolved at runtime | Statically analyzed at parse time, loaded asynchronously |
| Export value | Copy of the value | Live binding (reference) |
| Top-level this | module.exports | undefined |
| Circular dependency | Snapshot of executed portion | Reference, may be uninitialized |
| Tree-shaking | Not supported | Supported |
// CJS: Value copy
// counter.js
let count = 0;
module.exports = { count, increment: () => ++count };
// main.js
const { count, increment } = require('./counter');
console.log(count); // 0
increment();
console.log(count); // 0 — count is a copy, won't update
// ESM: Live binding
// counter.mjs
export let count = 0;
export function increment() { ++count; }
// main.mjs
import { count, increment } from './counter.mjs';
console.log(count); // 0
increment();
console.log(count); // 1 — count is a live binding, updated in real-time
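When a CommonJS module does need to expose a changing value, one common workaround is a getter on the exported object. The sketch below keeps everything in a single file for brevity; the counter object stands in for what counter.js would assign to module.exports, and the name copied is purely illustrative.

```javascript
// Stand-in for counter.js, written inline so the sketch runs as one file.
let count = 0;
const counter = {
  get count() { return count; },   // evaluated on every property access
  increment: () => ++count,
};

const { count: copied } = counter; // destructuring still takes a one-time copy
counter.increment();

console.log(copied);        // 0
console.log(counter.count); // 1
```

The getter only helps callers that read the property through the object; a destructured variable remains a snapshot, which is the practical difference from an ESM live binding.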
Stream Processing
Streams are Node.js’s core abstraction for handling large data, transmitting data in chunks rather than loading everything into memory at once:
flowchart LR
A[Readable<br/>Stream] -->|pipe| B[Transform<br/>Stream]
B -->|pipe| C[Writable<br/>Stream]
D[Duplex<br/>Stream<br/>Independent read/write] --> E[Example: TCP Socket]
Four stream types:
- Readable: Readable stream (fs.createReadStream, HTTP request body)
- Writable: Writable stream (fs.createWriteStream, HTTP response body)
- Duplex: Duplex stream with independent read/write (TCP Socket)
- Transform: Transform stream that modifies data passing through (zlib.createGzip)
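To make the Transform type concrete, here is a minimal sketch of a custom transform placed between a readable source and a consumer. The UpperCase class and the collect helper are hypothetical names introduced for this example.

```javascript
const { Readable, Transform } = require('stream');

// Hypothetical Transform that upper-cases every chunk passing through it.
class UpperCase extends Transform {
  _transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  }
}

// Collects everything a readable stream emits into a single string.
async function collect(stream) {
  let out = '';
  for await (const chunk of stream) out += chunk.toString();
  return out;
}

const result = collect(Readable.from(['hello ', 'world']).pipe(new UpperCase()));
result.then((text) => console.log(text)); // HELLO WORLD
```

Because a Transform is both writable and readable, it can sit in the middle of a pipe chain in the same way zlib.createGzip does.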
Backpressure Mechanism
When the producer is faster than the consumer, unwritten data accumulates in memory. Backpressure uses feedback signals (write() returning false, followed later by a 'drain' event) to pause the producer until the consumer catches up:
const fs = require('fs');
const zlib = require('zlib');
// ❌ Not handling backpressure (readable and writable stand for any pair of streams)
readable.on('data', (chunk) => {
  writable.write(chunk); // Return value ignored; the internal buffer can grow without bound
});
// ✅ pipe automatically handles backpressure
fs.createReadStream('bigfile.txt')
.pipe(zlib.createGzip())
.pipe(fs.createWriteStream('bigfile.txt.gz'));
// ✅ Manual backpressure handling
const source = fs.createReadStream('bigfile.txt');
const dest = fs.createWriteStream('copy.txt');
source.on('data', (chunk) => {
  const canContinue = dest.write(chunk);
  if (!canContinue) {
    source.pause(); // Write buffer full, pause reading
  }
});
dest.on('drain', () => {
  source.resume(); // Buffer drained, resume reading
});
source.on('end', () => dest.end()); // Finish the writable once the source is exhausted
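One limitation of plain pipe chains is that an error in one stream does not automatically tear down the others. A possible alternative is pipeline from stream/promises (available in Node.js 15 and later), which handles backpressure the same way and also propagates errors and cleans up. The gzipFile function name below is an assumption made for the example.

```javascript
const fs = require('fs');
const zlib = require('zlib');
const { pipeline } = require('stream/promises');

// Compress src into dest; rejects if any stage fails and destroys all streams.
async function gzipFile(src, dest) {
  await pipeline(
    fs.createReadStream(src),
    zlib.createGzip(),
    fs.createWriteStream(dest)
  );
}
```

A call such as gzipFile('bigfile.txt', 'bigfile.txt.gz').catch(console.error) would then replace the chained pipe version while surfacing read, compression, and write errors in one place.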
Cluster and Worker Threads
Cluster: Multiple Processes
Cluster leverages multi-core CPUs by forking one worker process per core; the primary process (called master before Node.js 16) listens on the port and distributes incoming connections to the worker processes via IPC:
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;
if (cluster.isPrimary) {
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
cluster.on('exit', (worker) => {
console.log(`Worker ${worker.process.pid} died`);
cluster.fork(); // Auto-restart
});
} else {
http.createServer((req, res) => {
res.writeHead(200);
res.end('Hello from worker ' + process.pid);
}).listen(3000);
}
Worker Threads: Multiple Threads
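Worker Threads run inside a single process, each with its own V8 isolate and event loop; memory can be shared explicitly through SharedArrayBuffer, which makes them suitable for CPU-intensive tasks: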
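// SharedArrayBuffer is a global, not an export of worker_threads
const { Worker, isMainThread, parentPort } = require('worker_threads');
// Stand-in for real CPU-bound work (hypothetical helper, not part of Node.js)
const heavyComputation = (data) => data.length;
if (isMainThread) {
  const worker = new Worker(__filename);
  worker.on('message', (result) => {
    console.log('Result:', result);
    worker.terminate(); // Otherwise the worker's message listener keeps the process alive
  });
  worker.on('error', (err) => console.error('Worker failed:', err));
  worker.postMessage({ data: 'compute this' });
} else {
  parentPort.on('message', (msg) => {
    const result = heavyComputation(msg.data);
    parentPort.postMessage(result);
  });
}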
| Feature | Cluster | Worker Threads |
|---|---|---|
| Isolation | Process-level isolation | Thread-level; separate heaps, memory shared only via SharedArrayBuffer |
| Communication | IPC (serialization) | MessagePort / SharedArrayBuffer |
| Use case | HTTP service multi-core scaling | CPU-intensive computation |
| Memory overhead | Each process has an independent V8 instance | One V8 isolate per thread within a single process; lighter than a process |
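The SharedArrayBuffer entry in the table can be illustrated with two workers incrementing one shared counter. This is a sketch under a few assumptions: the worker body is passed as a string with the eval option so that the example fits in a single file, and the names workerSource, runWorker, and finished are made up here.

```javascript
const { Worker } = require('worker_threads');

// Four bytes of memory visible to the main thread and every worker.
const shared = new SharedArrayBuffer(4);
const counter = new Int32Array(shared);

// Worker body as a string; each worker adds 1 to the shared counter 1000 times.
const workerSource = `
  const { workerData, parentPort } = require('worker_threads');
  const view = new Int32Array(workerData);
  for (let i = 0; i < 1000; i++) Atomics.add(view, 0, 1);
  parentPort.postMessage('done');
`;

function runWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: shared });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

const finished = Promise.all([runWorker(), runWorker()])
  .then(() => Atomics.load(counter, 0));
finished.then((total) => console.log('total:', total)); // total: 2000
```

Atomics.add is used because a plain view[0]++ from two threads is a data race and could lose updates; ordinary objects sent through postMessage are copied rather than shared.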
Memory Management
V8’s garbage collection is based on generational collection:
- Young Generation: Stores short-lived objects; uses Scavenge (semi-space copy) algorithm; frequent but fast GC
- Old Generation: Stores long-lived objects; uses Mark-Sweep / Mark-Compact; less frequent but longer pause times
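The two generations can be inspected at runtime through the built-in v8 module. The sketch below assumes the heap spaces are reported under the names new_space and old_space, which is how current V8 versions label them; generationStats is a hypothetical helper name.

```javascript
const v8 = require('v8');

// Report size and usage (in MB) for the young and old generation heap spaces.
function generationStats() {
  const toMB = (bytes) => Number((bytes / 1024 / 1024).toFixed(2));
  return v8.getHeapSpaceStatistics()
    .filter((s) => s.space_name === 'new_space' || s.space_name === 'old_space')
    .map((s) => ({
      space: s.space_name,
      sizeMB: toMB(s.space_size),
      usedMB: toMB(s.space_used_size),
    }));
}

console.table(generationStats());
```

In a long-running service, an old_space usage that keeps climbing across samples is usually the first hint of a leak, while new_space tends to stay small and fluctuate.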
Common memory leak scenarios:
// 1. Closure references
function createLeak() {
const bigData = new Array(1000000);
return function() {
    return bigData.length; // bigData stays reachable for as long as the returned function does
};
}
// 2. Global variables
function handler(req, res) {
  leakedData = req.body; // Forgot var/let/const: creates an implicit global in sloppy mode
}
// 3. Event listeners not removed
emitter.on('event', callback);
// Forgot emitter.removeListener('event', callback);
// 4. Unbounded cache growth
const cache = new Map();
app.get('/data/:id', (req, res) => {
if (!cache.has(req.params.id)) {
cache.set(req.params.id, fetchData(req.params.id)); // Only grows, never shrinks
}
res.json(cache.get(req.params.id));
});
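For the unbounded cache in scenario 4, one possible mitigation is to cap the number of entries and evict the least recently used one. The BoundedCache class below is a minimal sketch built on Map's insertion order, not a library API; production code would more likely use an established LRU package or add a TTL.

```javascript
// Minimal LRU cache relying on Map preserving insertion order.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark the entry as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // The first key in iteration order is the least recently used one.
      this.map.delete(this.map.keys().next().value);
    }
  }

  get size() {
    return this.map.size;
  }
}

const cache = new BoundedCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // 'a' becomes the most recently used entry
cache.set('c', 3); // exceeds the limit, so 'b' is evicted
console.log(cache.get('b'), cache.size); // undefined 2
```

Swapping the plain Map in the route handler for a structure like this keeps memory use proportional to maxEntries rather than to the number of distinct ids ever requested.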
Debugging tools: process.memoryUsage() to view heap memory usage; node --inspect with Chrome DevTools to compare heap snapshots; node --max-old-space-size=4096 to raise the old generation limit to 4096 MB.
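As a quick illustration of the first tool, the sketch below samples process.memoryUsage() before and after a deliberate allocation. The toMB helper and the retained array are illustrative names, and the exact numbers vary from run to run.

```javascript
// Convert a byte count to megabytes with one decimal place.
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1);

const before = process.memoryUsage();
// Allocate a million small objects and keep them reachable on purpose.
const retained = new Array(1e6).fill(0).map((_, i) => ({ i }));
const after = process.memoryUsage();

console.log(`heapUsed: ${toMB(before.heapUsed)} MB -> ${toMB(after.heapUsed)} MB`);
console.log(`heapTotal: ${toMB(after.heapTotal)} MB, rss: ${toMB(after.rss)} MB`);
console.log(`objects retained: ${retained.length}`);
```

Logging these values periodically in a running service gives a cheap trend line; a heap snapshot comparison is then the next step for finding which objects are being retained.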