
Back when I first wrote my unzip implementation in pure JS using Web Workers (code here), JavaScript runtimes were a very new thing (NodeJS had been released less than a year before). Ok, I had played with C++ bindings to the V8 JS engine for a hobby video game engine I had been writing, but that was it for me when it came to "JavaScript outside of the browser".

Well over a decade later, JavaScript/TypeScript runtimes are all the rage in this continuously fractious software world. Even so, it hadn't really ever occurred to me that the unzip/unrar/untar implementations in BitJS might be useful in NodeJS or other runtimes (Deno, Bun) until someone opened a bug.

Anyway, the way unzip/unrar worked was pretty straightforward: the host code passes bytes into the unzip/unrar implementation via a postMessage() call, and the implementation does some bits and bobs in a Web Worker (aka not on the UI thread), crawling through the bytes of the archive and emitting interesting events that the host code listens for (like "here's a file I extracted").

sequenceDiagram
    participant Host
    participant Worker
    box Worker JavaScript Context
        participant WorkerGlobalScope
        participant unrar.js
    end
    Host->>Worker: postMessage<br/>(rar bytes)
    Worker-->>WorkerGlobalScope:
    WorkerGlobalScope->>unrar.js: onmessage<br/>(rar bytes)
    Note right of unrar.js: unrar<br/>the thing
    unrar.js->>WorkerGlobalScope: postMessage<br/>(an extracted file)
    WorkerGlobalScope-->>Worker:
    Worker->>Host: onmessage<br/>(an extracted file)
    unrar.js->>WorkerGlobalScope: postMessage<br/>(2nd extracted file)
    WorkerGlobalScope-->>Worker:
    Worker->>Host: onmessage<br/>(2nd extracted file)

Unfortunately, Node still has not adopted Web Workers (though it eventually may); it even has its own, different thing called Worker Threads, which is confusing. Anyway, it left me wondering how I should approach supporting Node... until I learned about MessageChannel / MessagePort, which are now supported nearly universally (as of Node 15).

So in the end, it continues to be pretty simple. The MessageChannel becomes the abstraction through which messages are passed: the host code owns one MessagePort, the unzip implementation owns the other, and the implementation no longer assumes it lives in a Web Worker (oh, and thanks, dynamic imports!).

sequenceDiagram
    participant Host Code
    participant Port1
    box Any JavaScript Context (could be a Web Worker)
        participant Port2
        participant unrar.js
    end
    Host Code->>Port1: postMessage(rar bytes)
    Port1-->>Port2: (MessageChannel)
    Port2->>unrar.js: onmessage(rar bytes)
    Note right of unrar.js: unrar<br/>the thing
    unrar.js->>Port2: postMessage(an extracted file)
    Port2-->>Port1: (MessageChannel)
    Port1->>Host Code: onmessage(an extracted file)
    unrar.js->>Port2: postMessage(2nd extracted file)
    Port2-->>Port1: (MessageChannel)
    Port1->>Host Code: onmessage(2nd extracted file)
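To make the port-to-port flow a bit more concrete, here is a minimal sketch of the pattern. It is not the actual BitJS code: `connectFakeUnarchiver`, `extractAll`, and the `{ type: 'extract' | 'finish' }` message shapes are all made up for illustration, and the "extraction" just chops the input into 4-byte chunks. It assumes a runtime where `MessageChannel` is a global (a browser, or Node 15+).

```javascript
// Hypothetical stand-in for an unarchiver. It owns one MessagePort and has no
// idea whether it is running in a Web Worker or on the main thread.
function connectFakeUnarchiver(port) {
  port.onmessage = (evt) => {
    const bytes = evt.data.bytes;
    // Pretend every 4 bytes is one "file" and emit an event for each of them.
    for (let i = 0; i < bytes.length; i += 4) {
      port.postMessage({ type: 'extract', file: bytes.slice(i, i + 4) });
    }
    port.postMessage({ type: 'finish' });
  };
}

// Host side: create the channel, keep port1, hand port2 to the implementation.
function extractAll(bytes) {
  return new Promise((resolve) => {
    const { port1, port2 } = new MessageChannel();
    const files = [];
    port1.onmessage = (evt) => {
      if (evt.data.type === 'extract') {
        files.push(evt.data.file);
      } else if (evt.data.type === 'finish') {
        // Closing one port closes the channel, so Node can exit cleanly.
        port1.close();
        resolve(files);
      }
    };
    connectFakeUnarchiver(port2);
    port1.postMessage({ bytes });
  });
}
```

Calling `extractAll(new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8, 9]))` resolves with three chunks. The nice part is that the fake unarchiver only ever touches its port, so the host is free to decide where that port ends up living.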

This allows environments that support Web Workers to keep their Web Worker implementation, while the NodeJS version runs the implementation in its main thread. If someone wants to make it more performant using Node's Worker Threads, send pull requests!
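For anyone tempted by that pull request, one possible shape for it, assuming Node's `worker_threads` module: create the MessageChannel on the host, then transfer one port into a worker thread via `workerData` and `transferList`. Everything here (`workerSource`, `extractInWorkerThread`, the message shapes) is hypothetical and only meant as a starting point, not as how BitJS does or should do it.

```javascript
// Worker source as a string so this sketch fits in one file (run with eval: true).
// The worker only talks through the MessagePort it was handed in workerData.
const workerSource = `
  const { workerData } = require('node:worker_threads');
  const port = workerData.port;
  port.onmessage = (evt) => {
    // Hypothetical "extraction": just report how many bytes arrived.
    port.postMessage({ type: 'extract', size: evt.data.bytes.length });
    port.postMessage({ type: 'finish' });
  };
`;

async function extractInWorkerThread(bytes) {
  // Dynamic import works from both CommonJS and ES modules.
  const { Worker } = await import('node:worker_threads');
  const { port1, port2 } = new MessageChannel();
  const worker = new Worker(workerSource, {
    eval: true,
    workerData: { port: port2 },
    transferList: [port2], // ports passed in workerData must also be listed here
  });
  return new Promise((resolve, reject) => {
    const events = [];
    worker.on('error', reject);
    port1.onmessage = (evt) => {
      events.push(evt.data);
      if (evt.data.type === 'finish') {
        // Closing our end closes the worker's end too, so the thread can exit.
        port1.close();
        resolve(events);
      }
    };
    port1.postMessage({ bytes });
  });
}
```

The host-side code barely changes compared to the main-thread case: it still owns a port and listens for events, which is rather the point of putting MessageChannel in the middle.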

It seems like all JS libraries that do intensive computations (like training ML models or mining teh bitcoinz) and then emit a series of events should probably think of MessageChannel as the means of communication with the host software going forward, so that the implementation can be ported to more environments. What? WebAssembly? ... oh shhhh!

This little weekend hack also let me write some decent automated unit tests for BitJS decompression, so hurrah for that too!

§1357 · December 27, 2023 · Uncategorized
