In a previous post, we described a framework for distributing user-created software objects as a service. The framework is therefore only as powerful as the objects it hosts. As with Lego, the range of end products you can build depends on the variety and capability of the pieces.
In a 3D game engine, we often need to process a huge number of triangles, e.g. to apply an affine transformation. Similarly, in AI, passing a huge number of data points through a processing pipeline is central to pattern recognition.
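To make the triangle case concrete, here is a minimal scalar sketch of an affine transformation applied in place to packed vertices. The function name `applyAffine` and the 3x4 row-major matrix layout are assumptions made for illustration, not part of the framework described in the earlier post.

```typescript
// Hypothetical helper: applies p' = A * p + t to packed xyz vertices, in place.
// `m` is a 3x4 row-major matrix [A | t], i.e. 12 numbers (an assumed layout).
function applyAffine(vertices: Float32Array, m: ArrayLike<number>): void {
  if (m.length !== 12) throw new Error("expected a 3x4 matrix (12 numbers)");
  if (vertices.length % 3 !== 0) throw new Error("expected packed xyz triples");
  for (let i = 0; i < vertices.length; i += 3) {
    // Read the whole point first so the writes below do not clobber inputs.
    const x = vertices[i];
    const y = vertices[i + 1];
    const z = vertices[i + 2];
    vertices[i] = m[0] * x + m[1] * y + m[2] * z + m[3];
    vertices[i + 1] = m[4] * x + m[5] * y + m[6] * z + m[7];
    vertices[i + 2] = m[8] * x + m[9] * y + m[10] * z + m[11];
  }
}

// One triangle, scaled by 2 and then shifted by +1 along x.
const triangle = new Float32Array([0, 0, 0, 1, 0, 0, 0, 1, 0]);
applyAffine(triangle, [2, 0, 0, 1, 0, 2, 0, 0, 0, 0, 2, 0]);
// triangle now holds (1,0,0), (3,0,0), (1,2,0)
```

Every vertex is independent of the others, which is why this kind of loop is a natural candidate for both SIMD and multi-threading.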
Fortunately, WebAssembly has recently gained support for SIMD and multi-threading, which promise a large speedup depending on the number of cores your system has and the intrinsics it supports. For example, on a quad-core system with the Intel intrinsic set, a boost on the order of 4x4 = 16 times is possible in principle: 4 cores, each processing 4 values per SIMD instruction.
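As a rough sketch of where the two factors come from, the work can be split into one chunk per thread, with each chunk sized as a multiple of the SIMD width. The helper `splitWork` and the `Chunk` type are hypothetical names, and the lane count of 4 assumes four 32-bit floats per 128-bit WebAssembly SIMD vector.

```typescript
interface Chunk {
  start: number; // first item index, inclusive
  end: number; // last item index, exclusive
}

// Hypothetical helper: split `count` items into one chunk per worker.
// Chunk boundaries fall on multiples of `lanes`, so only the final
// non-empty chunk can end with a partial SIMD group.
function splitWork(count: number, workers: number, lanes: number): Chunk[] {
  if (workers < 1 || lanes < 1) throw new Error("workers and lanes must be >= 1");
  const groups = Math.ceil(count / lanes); // SIMD groups of `lanes` items
  const perWorker = Math.ceil(groups / workers); // groups per worker, rounded up
  const chunks: Chunk[] = [];
  for (let w = 0; w < workers; w++) {
    const start = Math.min(w * perWorker * lanes, count);
    const end = Math.min(start + perWorker * lanes, count);
    chunks.push({ start, end });
  }
  return chunks;
}

// 1000 items on 4 threads with 4-wide SIMD:
// [0, 252), [252, 504), [504, 756), [756, 1000)
const chunks = splitWork(1000, 4, 4);
```

Each thread then handles roughly a quarter of the items, four at a time. The factor of 16 is an upper bound that ignores memory bandwidth and synchronization cost.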
However, an unexpected bottleneck emerges. In theory, since memory is shared, the task is done as soon as the threads finish writing to memory. But for a timing-critical task, a callback is often required to signal completion. It turns out that the callback to the main thread takes an extraordinarily long time to fire, especially when the main thread is busy. Note that a browser engine has no such thing as an interrupt: a message to the main thread is handled only after the task it is currently running returns.
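One possible workaround, shown here only as a sketch and not as the approach we have settled on, is to let workers record completion in shared memory and have the main thread check that counter at a convenient point instead of waiting for a message callback. The names `createSignal`, `markDone`, and `allDone` are hypothetical. In a real page the buffer would be posted to each worker, and `SharedArrayBuffer` requires the page to be cross-origin isolated.

```typescript
// Slot 0 of the shared Int32Array counts workers that have finished.
const DONE_COUNT = 0;

// Allocate a one-slot counter backed by shared memory.
function createSignal(): Int32Array {
  return new Int32Array(new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT));
}

// Called by a worker after its last write to shared memory.
function markDone(signal: Int32Array): void {
  Atomics.add(signal, DONE_COUNT, 1);
}

// Called by the main thread, e.g. between steps of its own busy task.
// This only reads memory, so it does not depend on the event loop.
function allDone(signal: Int32Array, workers: number): boolean {
  return Atomics.load(signal, DONE_COUNT) >= workers;
}

// Single-threaded simulation of 4 workers reporting completion.
const signal = createSignal();
const before = allDone(signal, 4); // false: nobody has finished yet
for (let w = 0; w < 4; w++) markDone(signal);
const after = allDone(signal, 4); // true: all four have reported
```

The trade-off is that the main thread must poll, since nothing can interrupt it; the check is cheap, but it only helps if the busy task can be broken into steps with a check in between.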
As of now, we are still investigating the best way to integrate with a busy main thread. An alternative is to move all busy tasks off the main thread so that it handles UI and thread messages exclusively.