Wasm Builders

Philippe Charrière

An essay on the bi-directional exchange of strings between the Wasm module (with TinyGo) and Node.js (with WASI support)

Disclaimers:

  • I'm more comfortable with JavaScript and Java
  • I didn't use pointers (before)
  • Bit masking is totally new for me (and hard to understand)

Despite all of this, when I learned that WASI support had already landed in Node.js (thanks to this blog post: Getting started with NodeJS and the WebAssembly System Interface), I decided to try to explain how to exchange strings between Wasm modules and hosts when using WASI.

I will write the Wasm module with TinyGo, and use Node.js as the WASI host.
I was able to write this blog post thanks to the three people and the project below:

(Many) Thanks to:

Only four data types

With WebAssembly, data exchanges between the host and a Wasm module function can only be done with four data types:

  • i32: 32 Bit Integer
  • i64: 64 Bit Integer
  • f32: 32 Bit Floating Point Number
  • f64: 64 Bit Floating Point Number

Suffice it to say that passing a string as a parameter or returning a string is not trivial.

So today, my main focus is to use the simplicity of JavaScript to explain the mechanics of passing and returning strings.
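To make this concrete, here is a minimal sketch (not taken from the examples in this post) of what "exchanging a string" boils down to: bytes stored somewhere in linear memory, plus two numbers, an offset and a length. It uses a standalone WebAssembly.Memory as a stand-in for a module's exported memory, and the offset 16 is an arbitrary value chosen for the demo.

```javascript
// A standalone linear memory (1 page = 64 KiB), standing in for
// the memory that a real Wasm module would export.
const memory = new WebAssembly.Memory({ initial: 1 });

// Encode the string to UTF-8 bytes and copy them at an arbitrary offset.
const offset = 16; // hypothetical address, chosen for the demo
const bytes = new TextEncoder().encode("hello world");
new Uint8Array(memory.buffer, offset, bytes.length).set(bytes);

// Only these two numbers (offset and byte length) fit into i32 values.
const byteLength = bytes.length;

// Reading the string back needs nothing more than (offset, byteLength).
const decoded = new TextDecoder("utf-8").decode(
  new Uint8Array(memory.buffer, offset, byteLength)
);
console.log(offset, byteLength, decoded);
```

The rest of the post does the same thing, except that the bytes are written (or read) by the Wasm module itself.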

Returning a string from Wasm - Getting it with JavaScript

Returning a string from a Wasm function

This is the source code of the Wasm module

📝 hello.go

package main

func main() { }

//export hello
func hello() *byte { // 1️⃣
  return &(([]byte)("hello world")[0]) // 2️⃣
}

  • 1️⃣: the //export hello comment above the function exports the hello function (and makes it callable from the host)
  • 2️⃣: the function returns a pointer to the first byte of an array of bytes that contains the "hello world" string (the string's length equals 11)

To compile the Wasm module, use this command:

tinygo build -o hello.wasm -target wasi ./hello.go # 1️⃣

  • 1️⃣: it will produce a file named hello.wasm

Calling the hello function from JavaScript (Node.js)

This is the JavaScript code to call the function and retrieve the string:

📝 index.js

"use strict";
const fs = require("fs");
const { WASI } = require("wasi");
const wasi = new WASI();
const importObject = { wasi_snapshot_preview1: wasi.wasiImport };

(async () => {
  const wasm = await WebAssembly.compile(
    fs.readFileSync("./hello.wasm") // 1️⃣
  );
  const instance =
    await WebAssembly.instantiate(wasm, importObject); // 2️⃣

  wasi.start(instance);

  const helloStringPosition = instance.exports.hello(); // 3️⃣

  const memory = instance.exports.memory; // 4️⃣

  const extractedBuffer =
    new Uint8Array(memory.buffer, helloStringPosition, 11); // 5️⃣
    // 11 == length of "hello world"

  const str =
    new TextDecoder("utf8").decode(extractedBuffer); // 6️⃣

  console.log(str); // "hello world"

})();

  • 1️⃣: load and compile the Wasm module
  • 2️⃣: create an instance of the module
  • 3️⃣: call the exported hello function; helloStringPosition is the pointer it returns, that is, the offset of the string's first byte in the module's memory
  • 4️⃣: get the memory exported by the module
  • 5️⃣: memory.buffer is an ArrayBuffer. We need to create a typed array object (Uint8Array) to read its content. To extract the appropriate content, we need the start position (helloStringPosition) and the length of the string (11).
  • 6️⃣: then we decode the extracted buffer to get the whole string "hello world".

To run the script, use this command:

node --experimental-wasi-unstable-preview1 index.js # 1️⃣

  • 1️⃣: it will display hello world

For a better understanding

You can display the content of the extracted buffer thanks to console.log:

console.log(extractedBuffer)

And this is the content of the buffer:

Uint8Array(11) [
  104, 101, 108, 108,
  111,  32, 119, 111,
  114, 108, 100
]

Each item of the array is the character code of one letter of the string "hello world" (the string is pure ASCII, so one byte equals one character). Add these lines to your source code:

extractedBuffer.forEach(
  item => 
    console.log(item,":",String.fromCharCode(item))
)

For each item of the array you'll get the corresponding letter

104 : h
101 : e
108 : l
108 : l
111 : o
32 :  
119 : w
111 : o
114 : r
108 : l
100 : d

The annoying thing is that you have to know the length of the returned string. A solution could be to scan the buffer and stop the extraction when you reach a 0 byte that means "end of the string", but Go strings are not zero-terminated (the module would have to append the 0 itself), and scanning byte by byte is not very efficient.
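For illustration only, the "scan until a 0 byte" idea could look like the sketch below. It assumes the module wrote a 0 byte right after the string, which the hello function above does not do; readCString and demoMemory are hypothetical names, and the standalone WebAssembly.Memory stands in for the module's exported memory.

```javascript
// Read bytes starting at ptr until a 0 byte (or the end of memory) is found.
function readCString(memory, ptr) {
  const bytes = new Uint8Array(memory.buffer);
  let end = ptr;
  while (end < bytes.length && bytes[end] !== 0) {
    end++;
  }
  // subarray creates a view on the same memory, without copying.
  return new TextDecoder("utf-8").decode(bytes.subarray(ptr, end));
}

// Demo: write "hello world" followed by a terminating 0 at offset 32.
const demoMemory = new WebAssembly.Memory({ initial: 1 });
const text = new TextEncoder().encode("hello world");
new Uint8Array(demoMemory.buffer, 32, text.length + 1).set([...text, 0]);

console.log(readCString(demoMemory, 32));
```

The loop visits every byte of the string before decoding can start, which is why returning the size along with the pointer is more attractive.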

In a perfect world, the function would return both the pointer position and the size of the string, but in WebAssembly 1.0 a function can return only one value.

However, while reading the code of an example from the wazero project, I understood that it is possible to "pack" two values into a single one:

// _greeting is a WebAssembly export that accepts a string pointer 
// (linear memory offset) and returns a pointer/size pair 
// packed into a uint64.
//
// Note: This uses a uint64 instead of two result values
// for compatibility with WebAssembly 1.0.
//export greeting
func _greeting(ptr, size uint32) (ptrSize uint64) {
    name := ptrToString(ptr, size)
    g := greeting(name)
    ptr, size = stringToPtr(g)
    return (uint64(ptr) << uint64(32)) | uint64(size)
}

extract from: https://github.com/tetratelabs/wazero/blob/main/examples/allocation/tinygo/testdata/greet.go#L50

This "magic" line, (uint64(ptr) << uint64(32)) | uint64(size), packs the pointer position and the size of the string into a single return value: the pointer goes into the upper 32 bits of the uint64 and the size into the lower 32 bits.
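The same packing can be reproduced with JavaScript BigInt values to see what happens to the numbers. This is only a worked example: the pointer value 65536 and the size 11 are made up for the demo.

```javascript
// Hypothetical values: an 11-byte string stored at offset 65536.
const ptr = 65536n;
const size = 11n;

// Same expression as the Go line, written with BigInt:
// shift the pointer 32 bits to the left, then OR the size into the low bits.
const packed = (ptr << 32n) | size;

console.log(packed);               // 281474976710667n
console.log(packed >> 32n);        // 65536n (the pointer, upper 32 bits)
console.log(packed & 0xffffffffn); // 11n (the size, lower 32 bits)
```

Shifting by 32 bits multiplies the pointer by 2^32, so the lower 32 bits are free to hold the size without the two values overlapping.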

These were my first baby steps in the discovery of bit masking and bit shifting: 🤔 how do I use it with Node.js? It was hard for my little 🧠; luckily, Axel Rauschmayer helped me a lot. So, in the next section, I will try to explain another way to return a string from TinyGo and to read it from JavaScript.

AGAIN: Returning a string from Wasm - Getting it with JavaScript

Returning a string pointer and a size from a Wasm function

I changed the source code of the Wasm module, taking inspiration from the wazero example:

📝 hello.go

package main

import (
  "unsafe"
)

func main() { }

//export hello
func hello() uint64 { // ptrAndSize

  message := "hello world"
  buf := []byte(message)
  bufPtr := &buf[0]
  unsafePtr := uintptr(unsafe.Pointer(bufPtr))

  ptr := uint32(unsafePtr) // 1️⃣
  size := uint32(len(buf)) // 2️⃣

  ret := (uint64(ptr) << uint64(32)) | uint64(size) // 3️⃣

  return ret
}

  • 1️⃣: I "transform" the pointer position to a uint32.
  • 2️⃣: I "transform" the string's length to a uint32.
  • 3️⃣: I "pack" a pointer/size pair into a uint64.

Calling the hello function from JavaScript (Node.js)

This is the JavaScript code to call the function and retrieve the string without knowing the size of the return string:

📝 index.js

"use strict";
const fs = require("fs");
const { WASI } = require("wasi");
const wasi = new WASI();
const importObject = { wasi_snapshot_preview1: wasi.wasiImport };

(async () => {
  const wasm = await WebAssembly.compile(
    fs.readFileSync("./hello.wasm")
  );
  const instance = 
    await WebAssembly.instantiate(wasm, importObject);

  wasi.start(instance);

  // call hello
  // get a kind of pair of value
  const helloPointerSize = instance.exports.hello(); // 1️⃣

  const memory = instance.exports.memory;

  const completeBufferFromMemory =
    new Uint8Array(memory.buffer); // 2️⃣

  const MASK = (2n**32n)-1n; // 3️⃣

  // extract the values of the pair
  const ptrPosition = Number(helloPointerSize >> BigInt(32)); // 4️⃣
  const stringSize = Number(helloPointerSize & MASK); // 5️⃣

  const extractedBuffer = completeBufferFromMemory.slice(
    ptrPosition, ptrPosition+stringSize
  ); // 6️⃣

  const str =
    new TextDecoder("utf8").decode(extractedBuffer); // 7️⃣

  console.log(str); // 8️⃣

})();

  • 1️⃣: Call the hello function and retrieve a bigint (you can add this line to check the type of the return value: console.log(helloPointerSize, typeof helloPointerSize);).
  • 2️⃣: Create a Uint8Array from the memory buffer.
  • 3️⃣: Create a mask that will help us extract the values of the "pair" (I explain it below (*)).
  • 4️⃣: Extract the pointer position from the upper 32 bits (with the right shift operator >>).
  • 5️⃣: Extract the string size (thanks to the mask).
  • 6️⃣: Extract the piece of the buffer that contains the string thanks to the pointer position and the string's length.
  • 7️⃣: Decode the extracted buffer.
  • 8️⃣: Print "hello world".
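The unpacking steps above can be gathered into two small helpers. This is a sketch with hypothetical names (unpackPtrSize, readPackedString), not code from the wazero example. One extra precaution is included: JavaScript receives an i64 result as a signed BigInt, so a pointer at or above 2^31 would make the packed value negative; BigInt.asUintN(64, ...) reinterprets it as unsigned before shifting.

```javascript
// Split a packed (pointer << 32 | size) value into two plain numbers.
function unpackPtrSize(packed) {
  // i64 results reach JavaScript as signed BigInts: normalize to unsigned.
  const unsigned = BigInt.asUintN(64, packed);
  return {
    ptr: Number(unsigned >> 32n),         // upper 32 bits
    size: Number(unsigned & 0xffffffffn), // lower 32 bits
  };
}

// Decode the UTF-8 string designated by a packed pointer/size pair.
function readPackedString(memory, packed) {
  const { ptr, size } = unpackPtrSize(packed);
  return new TextDecoder("utf-8").decode(
    new Uint8Array(memory.buffer, ptr, size)
  );
}

// Demo with a standalone memory standing in for instance.exports.memory.
const mem = new WebAssembly.Memory({ initial: 1 });
new Uint8Array(mem.buffer, 1024, 5).set(new TextEncoder().encode("hello"));
console.log(readPackedString(mem, (1024n << 32n) | 5n));
```

With a real instance, the call would be readPackedString(instance.exports.memory, instance.exports.hello()).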

To run the script, use this command:

> node --experimental-wasi-unstable-preview1 index.js
hello world

(*): why the mask?

The mask extracts the lowest 32 bits of the "pair value": a bitwise AND with the mask clears the upper 32 bits, where the pointer is stored, and keeps only the size. The mask is "created" like this:

> BigInt(0b11111111111111111111111111111111) // 1️⃣
4294967295n
> (2n ** 32n) - 1n // 2️⃣
4294967295n

  • 1️⃣: The literal contains 32 bits, all set to 1.
  • 2️⃣: BigInt(0b11111111111111111111111111111111) is equivalent to (2n ** 32n) - 1n.
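The mechanics may be easier to follow on a scaled-down example: an 8-bit "pair" made of two 4-bit halves. The values below are made up for the demo; the last lines check that, on a 64-bit pair, the 32-bit mask gives the same result as the built-in BigInt.asUintN(32, ...).

```javascript
// An 8-bit "pair": high half 1010, low half 0011.
const pair = 0b10100011n;

// A 4-bit mask: (2^4) - 1 = 0b1111.
const mask = (2n ** 4n) - 1n;

const high = pair >> 4n;  // the low half is shifted out: 0b1010
const low = pair & mask;  // the high half is cleared: 0b0011

console.log(high.toString(2), low.toString(2)); // prints: 1010 11

// Same idea with 32-bit halves, as in the post.
const MASK32 = (2n ** 32n) - 1n;
const sample = (65536n << 32n) | 11n; // hypothetical packed value
console.log((sample & MASK32) === BigInt.asUintN(32, sample)); // prints: true
```

A bit of the mask set to 1 means "keep this bit", and every bit position above the mask is implicitly 0, which means "drop this bit".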

🖐 The explanation from Axel Rauschmayer (https://twitter.com/rauschma) is clear in my mind, but it is harder to explain simply, as I'm not yet perfectly fluent with "bit acrobatics". Don't hesitate to send me any ideas.

Now, we know how to retrieve a string from a "Wasm function", but what about passing a string as a parameter to a Wasm function?

Pass a string as a parameter to the Wasm function

Pass a string pointer and a size to a Wasm function

For this use case, the host (Node.js) has to copy the string parameter to the memory of the Wasm module before being able to call the hello function. To allow this "copy", we need to add a memory allocation function to the Wasm module.

So, we will change the source code of my "Wasm hello function" and add a new alloc function:

📝 hello.go

package main

import (
  "unsafe"
  "strings"
)

func main() { }

//export alloc
func alloc(size uint32) *byte { // 1️⃣
  buf := make([]byte, size)
  return &buf[0]
}

//export hello
func hello(subject *uint32, length int) uint64 { // 2️⃣

  var subjectStr strings.Builder // 3️⃣
  pointer := uintptr(unsafe.Pointer(subject))
  for i := 0; i < length; i++ { // 4️⃣
    // read one byte at a time (reading an int32 here would
    // touch 3 bytes beyond the current one)
    s := *(*byte)(unsafe.Pointer(pointer + uintptr(i)))
    subjectStr.WriteByte(s)
  }

  output := subjectStr.String() // 5️⃣

  message := "👋 hello " + output // 6️⃣
  buf := []byte(message)
  bufPtr := &buf[0]
  unsafePtr := uintptr(unsafe.Pointer(bufPtr))

  ptr := uint32(unsafePtr)
  size := uint32(len(buf))

  ret := (uint64(ptr) << uint64(32)) | uint64(size)

  return ret // 7️⃣
}

  • 1️⃣: Allocate a region in the Wasm module's memory and return a pointer to it to the host.
  • 2️⃣: We will pass a pointer to the string in memory and the length of the string (in bytes).
  • 3️⃣: Initialize a string builder.
  • 4️⃣: Iterate through the memory region, byte by byte, to feed the string builder.
  • 5️⃣: Generate a string from the string builder.
  • 6️⃣: Append the string parameter to the message.
  • 7️⃣: If the parameter is "Bob Morane", then ret holds the pointer and the size of "👋 hello Bob Morane", packed into a uint64.

Calling the hello function from JavaScript (Node.js)

This is the JavaScript code to call the function with a string parameter (in fact a string pointer and a string length) and retrieve the string result:

📝 index.js

"use strict";
const fs = require("fs");
const { WASI } = require("wasi");
const wasi = new WASI();
const importObject = { wasi_snapshot_preview1: wasi.wasiImport };

(async () => {
  const wasm = await WebAssembly.compile(
    fs.readFileSync("./hello.wasm")
  );
  const instance = 
    await WebAssembly.instantiate(wasm, importObject);

  wasi.start(instance);

  // 🖐 Prepare the string parameter
  const stringParameter = "Bob Morane";
  const bytes =
    new TextEncoder().encode(stringParameter); // 1️⃣ (always UTF-8)

  // © Completely inspired by:
  // https://radu-matei.com/blog/practical-guide-to-wasm-memory/
  // Copy the contents of the string into the module's memory
  const ptr = instance.exports.alloc(bytes.length); // 2️⃣
  const mem = new Uint8Array( // 3️⃣
    instance.exports.memory.buffer, ptr, bytes.length
  );
  mem.set(bytes); // 4️⃣

  let helloPointerSize =
    instance.exports.hello(ptr, bytes.length); // 5️⃣

  let memory = instance.exports.memory;

  const completeBufferFromMemory =
    new Uint8Array(memory.buffer);

  const MASK = (2n**32n)-1n;
  let stringPtrPosition = Number(helloPointerSize >> BigInt(32));
  let stringSize = Number(helloPointerSize & MASK);

  const extractedBuffer = completeBufferFromMemory.slice(
    stringPtrPosition, stringPtrPosition+stringSize
  );

  const str = new TextDecoder("utf8").decode(extractedBuffer);
  console.log(`📝: ${str}`); // 6️⃣

})();

  • 1️⃣: Transform the input string into its UTF-8 representation.
  • 2️⃣: Allocate space in the module's memory. The alloc function (from the Wasm module) returns an offset in the module's memory to the start of the block.
  • 3️⃣: Create a Uint8Array view of the memory buffer, starting at ptr and of the proper size.
  • 4️⃣: Copy the content of bytes into the memory buffer.
  • 5️⃣: Call the module's hello function and get the packed pointer and size of the result string that the module wrote into its memory.
  • 6️⃣: Print "👋 hello Bob Morane".
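Steps 1️⃣ to 4️⃣ can be wrapped in a small helper. The sketch below uses hypothetical names (writeString, createFakeInstance); so that it runs without a .wasm file, it is exercised against a fake instance whose alloc is a naive bump allocator written in JavaScript. With the real module, you would pass the instance created by WebAssembly.instantiate instead.

```javascript
// Encode text as UTF-8, ask the module for space, and copy the bytes in.
function writeString(instance, text) {
  const bytes = new TextEncoder().encode(text);
  const ptr = instance.exports.alloc(bytes.length);
  // Create the view after calling alloc: if the memory grows during the
  // call, ArrayBuffers obtained before the call are detached.
  new Uint8Array(instance.exports.memory.buffer, ptr, bytes.length).set(bytes);
  return { ptr, length: bytes.length };
}

// Hypothetical stand-in for a real instance: a memory and a bump allocator.
function createFakeInstance() {
  const memory = new WebAssembly.Memory({ initial: 1 });
  let next = 8; // first free offset, arbitrary
  return {
    exports: {
      memory,
      alloc(size) {
        const start = next;
        next += size;
        return start;
      },
    },
  };
}

const fake = createFakeInstance();
const { ptr: namePtr, length: nameLength } = writeString(fake, "Bob Morane");
const roundTrip = new TextDecoder("utf-8").decode(
  new Uint8Array(fake.exports.memory.buffer, namePtr, nameLength)
);
console.log(namePtr, nameLength, roundTrip);
```

The returned pair is exactly what the hello function expects: instance.exports.hello(ptr, length).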

To run the script, use this command:

> node --experimental-wasi-unstable-preview1 index.js
📝: 👋 hello Bob Morane

That's all for today. I learned a lot while writing this blog post. I hope it will help you, and do not hesitate to suggest improvements. 👋

Photo by Steve Johnson on Unsplash
