Disclaimers:
- I'm more comfortable with JavaScript and Java
- I had never used pointers before
- Bit masking is totally new for me (and hard to understand)
Despite all of this, when I learned that WASI support had already landed in Node.js (thanks to this blog post: Getting started with NodeJS and the WebAssembly System Interface), I decided to try to explain how to exchange strings between Wasm modules and hosts when using WASI.
I will write the Wasm module with TinyGo, and use Node.js as the WASI host.
I was able to write this blog post thanks to three people and one project. Many thanks to:
- Axel Rauschmayer for his patience, his knowledge and his pedagogy when explaining bit masking to me.
- Radu Matei for his two brilliant blog posts Getting started with NodeJS and the WebAssembly System Interface and A practical guide to WebAssembly memory
- Takeshi Yoneda for his awesome project Wazero and especially this example: tinygo/testdata/greet.go
- WasmEdge Project for its very instructive examples: Pass complex parameters to Wasm functions
Only four data types
With WebAssembly, data exchanges between the host and a Wasm module function can only be done with four data types:
- i32: 32 Bit Integer
- i64: 64 Bit Integer
- f32: 32 Bit Floating Point Number
- f64: 64 Bit Floating Point Number
Suffice to say that passing a string as a parameter or returning a string is not necessarily trivial.
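To make this limitation concrete, here is a minimal sketch. The bytes below are a hand-assembled Wasm module of my own (an assumption for illustration, not produced by TinyGo) that exports an add function taking two i32 values. Passing a string does not fail, but the string never reaches the module: JavaScript coerces it to a number first.

```javascript
// Hand-assembled Wasm module (illustrative assumption, not generated by TinyGo):
// (func (export "add") (param i32 i32) (result i32) local.get 0 local.get 1 i32.add)
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // function body
]);

const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));
const add = addInstance.exports.add;

const sumOfNumbers = add(40, 2);       // numbers go through as i32 values
const sumWithString = add("hello", 2); // "hello" is coerced to NaN, then to 0

console.log(sumOfNumbers);  // 42
console.log(sumWithString); // 2
```

The module only ever sees numbers, which is why the rest of this post passes memory positions and lengths instead of the strings themselves.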
So today, my main focus is to use the simplicity of JavaScript to explain the mechanics of passing and returning strings.
Returning a string from Wasm - Getting it with JavaScript
Returning a string from a Wasm function
This is the source code of the Wasm module
📝 hello.go
package main
func main() { }
//export hello 1️⃣
func hello() *byte {
    return &(([]byte)("hello world")[0]) 2️⃣
}
- 1️⃣: this annotation exports the hello function (and makes it "importable" from the host)
- 2️⃣: the function returns a pointer to the start of an array of bytes that contains the "hello world" string (the string's length equals 11)
To compile the Wasm module, use this command:
tinygo build -o hello.wasm -target wasi ./hello.go 1️⃣
- 1️⃣: it will produce a file named hello.wasm
Calling the hello function from JavaScript (Node.js)
This is the JavaScript code to call the function and retrieve the string:
📝 index.js
"use strict";
const fs = require("fs");
const { WASI } = require("wasi");
const wasi = new WASI();
const importObject = { wasi_snapshot_preview1: wasi.wasiImport };
(async () => {
  const wasm = await WebAssembly.compile(
    fs.readFileSync("./hello.wasm") 1️⃣
  );
  const instance =
    await WebAssembly.instantiate(wasm, importObject); 2️⃣
  wasi.start(instance);
  const helloStringPosition = instance.exports.hello(); 3️⃣
  const memory = instance.exports.memory; 4️⃣
  const extractedBuffer =
    new Uint8Array(memory.buffer, helloStringPosition, 11); 5️⃣
  // 11 == length of "hello world"
  const str =
    new TextDecoder("utf8").decode(extractedBuffer); 6️⃣
  console.log(str); // "hello world"
})();
- 1️⃣: load the Wasm module
- 2️⃣: create an instance of the module
- 3️⃣: call the exported hello function; helloStringPosition is the position (offset) of the string in the module's memory
- 4️⃣: get the memory exported by the module
- 5️⃣: memory.buffer is an ArrayBuffer. We need to create a typed array object (Uint8Array) to manipulate its content. To extract the appropriate content, we need the start position (helloStringPosition) and the length of the string (11).
- 6️⃣: then we decode the extracted buffer to get the whole string "hello world".
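The extraction steps can also be tried without any Wasm module. The sketch below is only an illustration: it assumes a standalone WebAssembly.Memory as a stand-in for instance.exports.memory, and an arbitrary offset (16) where we pretend the module wrote the string.

```javascript
// Stand-in for instance.exports.memory: one 64 KiB page (assumption for the demo)
const demoMemory = new WebAssembly.Memory({ initial: 1 });

// Pretend the module wrote "hello world" at offset 16 (arbitrary value)
const demoOffset = 16;
const demoBytes = new TextEncoder().encode("hello world");
new Uint8Array(demoMemory.buffer).set(demoBytes, demoOffset);

// Host side: a view on the 11 bytes starting at the offset, then decode them as UTF-8
const demoView = new Uint8Array(demoMemory.buffer, demoOffset, demoBytes.length);
const demoString = new TextDecoder("utf-8").decode(demoView);

console.log(demoView.length); // 11
console.log(demoString);      // hello world
```

The Uint8Array created on the host side is a view, not a copy: it reads directly from the memory buffer.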
To run the script, use this command:
node --experimental-wasi-unstable-preview1 index.js 1️⃣
- 1️⃣: it will display hello world
For a better understanding
You can display the content of the extracted buffer with console.log:
console.log(extractedBuffer)
And this is the content of the buffer:
Uint8Array(11) [
104, 101, 108, 108,
111, 32, 119, 111,
114, 108, 100
]
Each item of the array is the byte (here, the ASCII code) of one letter of the string "hello world". Add these lines to your source code:
extractedBuffer.forEach(
item =>
console.log(item,":",String.fromCharCode(item))
)
For each item of the array you'll get the corresponding letter
104 : h
101 : e
108 : l
108 : l
111 : o
32 :
119 : w
111 : o
114 : r
108 : l
100 : d
The annoying thing is that you have to know the length of the returned string. A solution could be to scan the buffer and stop the extraction at the first 0 byte, treating it as "end of the string", but this is inefficient, and nothing guarantees that a Go byte slice is followed by a 0 byte.
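For illustration only, here is what the "scan until 0" approach could look like. The helper name readUntilZero is hypothetical, and the sketch uses a standalone, zero-filled WebAssembly.Memory instead of a real module, which is exactly why a 0 byte happens to follow the string here.

```javascript
// Hypothetical helper: decode bytes from `offset` up to (not including) the first 0 byte
function readUntilZero(memory, offset) {
  const bytes = new Uint8Array(memory.buffer);
  let end = bytes.indexOf(0, offset);
  if (end === -1) end = bytes.length; // no 0 byte found: read to the end of memory
  return new TextDecoder("utf-8").decode(bytes.subarray(offset, end));
}

// Stand-in memory (assumption for the demo): fresh Wasm memory is filled with zeros
const scanMemory = new WebAssembly.Memory({ initial: 1 });
new Uint8Array(scanMemory.buffer).set(new TextEncoder().encode("hello world"), 32);

const scanned = readUntilZero(scanMemory, 32);
console.log(scanned); // hello world
```

The scan has to look at every byte of the string, and it only gives the right answer when a 0 byte really follows the data.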
In a perfect world, the function would return both the pointer position and the size of the string, but in WebAssembly 1.0 a function can return only one value.
But it was by reading the code of an example from the Wazero project that I understood that it was possible to "put" two values in a single one:
// _greeting is a WebAssembly export that accepts a string pointer
// (linear memory offset) and returns a pointer/size pair
// packed into a uint64.
//
// Note: This uses a uint64 instead of two result values
// for compatibility with WebAssembly 1.0.
//export greeting
func _greeting(ptr, size uint32) (ptrSize uint64) {
    name := ptrToString(ptr, size)
    g := greeting(name)
    ptr, size = stringToPtr(g)
    return (uint64(ptr) << uint64(32)) | uint64(size)
}
extract from: https://github.com/tetratelabs/wazero/blob/main/examples/allocation/tinygo/testdata/greet.go#L50
This "magic" line, (uint64(ptr) << uint64(32)) | uint64(size), packs the pointer position and the size of the string into a single return value: the pointer goes into the upper 32 bits and the size into the lower 32 bits.
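The same arithmetic can be reproduced in JavaScript with BigInt, which helps to see what ends up in the 64-bit value. This is a sketch: the helper names packPtrSize and unpackPtrSize are mine, and the values 65536 and 11 are arbitrary examples.

```javascript
// Pack two 32-bit values into one 64-bit BigInt, mirroring the Go expression above
function packPtrSize(ptr, size) {
  return (BigInt(ptr) << 32n) | BigInt(size);
}

// Unpack: the high 32 bits hold the pointer, the low 32 bits hold the size
function unpackPtrSize(packed) {
  const ptr = Number(packed >> 32n);
  const size = Number(packed & 0xffffffffn);
  return { ptr, size };
}

const packedExample = packPtrSize(65536, 11); // arbitrary example values
console.log(packedExample);                // 281474976710667n
console.log(unpackPtrSize(packedExample)); // { ptr: 65536, size: 11 }
```

Shifting left by 32 bits multiplies the pointer by 2^32, so 65536 becomes 2^48, and the OR simply drops the size into the 32 low bits that the shift left empty.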
These were my first baby steps in the discovery of bit masking and bit shifting: 🤔 how do I use it with Node.js? It was hard for my little 🧠, but luckily Axel Rauschmayer helped me a lot. So, in the next section, I will try to explain another way to return a string from TinyGo and to read it from JavaScript.
AGAIN: Returning a string from Wasm - Getting it with JavaScript
Returning a string pointer and a size from a Wasm function
I changed the source code of the Wasm module, inspired by the Wazero example:
📝 hello.go
package main
import (
    "unsafe"
)
func main() { }
//export hello
func hello() uint64 { // ptrAndSize
    message := "hello world"
    buf := []byte(message)
    bufPtr := &buf[0]
    unsafePtr := uintptr(unsafe.Pointer(bufPtr))
    ptr := uint32(unsafePtr) 1️⃣
    size := uint32(len(buf)) 2️⃣
    ret := (uint64(ptr) << uint64(32)) | uint64(size) 3️⃣
    return ret
}
- 1️⃣: I "transform" the pointer position to a uint32.
- 2️⃣: I "transform" the string's length to a uint32.
- 3️⃣: I "pack" a pointer/size pair into a uint64.
Calling the hello function from JavaScript (Node.js)
This is the JavaScript code to call the function and retrieve the string without knowing the size of the return string:
📝 index.js
"use strict";
const fs = require("fs");
const { WASI } = require("wasi");
const wasi = new WASI();
const importObject = { wasi_snapshot_preview1: wasi.wasiImport };
(async () => {
  const wasm = await WebAssembly.compile(
    fs.readFileSync("./hello.wasm")
  );
  const instance =
    await WebAssembly.instantiate(wasm, importObject);
  wasi.start(instance);
  // call hello
  // get a kind of pair of values
  const helloPointerSize = instance.exports.hello(); 1️⃣
  const memory = instance.exports.memory;
  const completeBufferFromMemory =
    new Uint8Array(memory.buffer); 2️⃣
  const MASK = (2n**32n)-1n; 3️⃣
  // extract the values of the pair
  const ptrPosition = Number(helloPointerSize >> BigInt(32)); 4️⃣
  const stringSize = Number(helloPointerSize & MASK); 5️⃣
  const extractedBuffer = completeBufferFromMemory.slice(
    ptrPosition, ptrPosition+stringSize
  ); 6️⃣
  const str =
    new TextDecoder("utf8").decode(extractedBuffer); 7️⃣
  console.log(str) 8️⃣
})();
- 1️⃣: Call the hello function and retrieve a bigint (you can add this line to check the type of the return value: console.log(helloPointerSize, typeof helloPointerSize);).
- 2️⃣: Create a Uint8Array from the memory buffer.
- 3️⃣: Create a mask (that will help us extract the values of the "pair"; I will write some explanations later (*)).
- 4️⃣: Extract the pointer position (with the right shift operator >>).
- 5️⃣: Extract the string size (thanks to the mask).
- 6️⃣: Extract the piece of the buffer that contains the string thanks to the pointer position and the string's length.
- 7️⃣: Decode the extracted buffer.
- 8️⃣: Print hello world.
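These steps can be wrapped in a small helper. This is a sketch under a few assumptions: readPackedString is a hypothetical name, the memory is a standalone stand-in for instance.exports.memory, and the call to BigInt.asUintN is my addition, because i64 results reach JavaScript as signed BigInt values and a pointer above 2^31 would otherwise come out negative.

```javascript
// Hypothetical helper: decode the string described by a packed pointer/size BigInt
function readPackedString(memory, packed) {
  const unsigned = BigInt.asUintN(64, packed); // reinterpret the signed i64 as unsigned
  const ptr = Number(unsigned >> 32n);         // high 32 bits: pointer position
  const size = Number(unsigned & 0xffffffffn); // low 32 bits: string size
  const view = new Uint8Array(memory.buffer, ptr, size);
  return new TextDecoder("utf-8").decode(view);
}

// Stand-in for the module (assumption): "hello world" written at offset 1024
const helperMemory = new WebAssembly.Memory({ initial: 1 });
const helperBytes = new TextEncoder().encode("hello world");
new Uint8Array(helperMemory.buffer).set(helperBytes, 1024);
const helperPacked = (1024n << 32n) | BigInt(helperBytes.length);

const helperResult = readPackedString(helperMemory, helperPacked);
console.log(helperResult); // hello world
```

With the real module, the call would look like readPackedString(instance.exports.memory, instance.exports.hello()).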
To run the script, use this command:
> node --experimental-wasi-unstable-preview1 index.js
hello world
(*): why the mask?
The mask extracts the lowest 32 bits of the "pair value": a bitwise AND with 32 one-bits keeps the lowest 32 bits (the size) and clears everything above them (the pointer). The mask is "created" like this:
> BigInt(0b11111111111111111111111111111111) 1️⃣
4294967295n
> (2n ** 32n) - 1n 2️⃣
4294967295n
- 1️⃣: You can count 32 bits with the value 1.
- 2️⃣: BigInt(0b11111111111111111111111111111111) is equivalent to (2n ** 32n) - 1n.
🖐 The explanation from Axel Rauschmayer (https://twitter.com/rauschma) is clear in my mind, but it's harder to explain simply, as I'm not yet perfectly fluent with "bit acrobatics". Don't hesitate to send me any ideas.
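One way to see what the mask does is to shrink the problem. The sketch below packs two 8-bit values into 16 bits instead of two 32-bit values into 64 bits (the values are arbitrary and chosen only for the demo), so that the bits fit on one line.

```javascript
// Small-scale version: pack two 8-bit values into 16 bits (demo values)
const high = 0b10100101n; // plays the role of the pointer
const low = 0b00001011n;  // plays the role of the size (11)
const pair = (high << 8n) | low;

const SMALL_MASK = (2n ** 8n) - 1n; // eight 1 bits: 0b11111111

const lowBits = pair & SMALL_MASK; // AND with 1 keeps a bit, AND with 0 clears it
const highBits = pair >> 8n;       // the shift drops the eight low bits

console.log(pair.toString(2).padStart(16, "0"));    // 1010010100001011
console.log(lowBits.toString(2).padStart(8, "0"));  // 00001011
console.log(highBits.toString(2).padStart(8, "0")); // 10100101
```

The 64-bit case works the same way, with 32 in place of 8: the mask has 32 one-bits, so only the size survives the AND.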
Now, we know how to retrieve a string from a "Wasm function", but what about passing a string as a parameter to a Wasm function?
Pass a string as a parameter to the Wasm function
Pass a string pointer and a size to a Wasm function
For this use case, the host (Node.js) has to copy the string parameter into the memory of the Wasm module before it can call the hello function. To allow this "copy", we need to add a memory allocation function to the Wasm module.
So, we will change the source code of my "Wasm hello function" and add a new function, alloc:
📝 hello.go
package main
import (
    "unsafe"
    "strings"
)
func main() { }
//export alloc
func alloc(size uint32) *byte { 1️⃣
    buf := make([]byte, size)
    return &buf[0]
}
//export hello
func hello(subject *uint32, length int) uint64 { 2️⃣
    var subjectStr strings.Builder 3️⃣
    pointer := uintptr(unsafe.Pointer(subject))
    for i := 0; i < length; i++ { 4️⃣
        // read a single byte at each offset
        s := *(*byte)(unsafe.Pointer(pointer + uintptr(i)))
        subjectStr.WriteByte(s)
    }
    output := subjectStr.String() 5️⃣
    message := "👋 hello " + output 6️⃣
    buf := []byte(message)
    bufPtr := &buf[0]
    unsafePtr := uintptr(unsafe.Pointer(bufPtr))
    ptr := uint32(unsafePtr)
    size := uint32(len(buf))
    ret := (uint64(ptr) << uint64(32)) | uint64(size)
    return ret 7️⃣
}
- 1️⃣: Allocate a region in the Wasm memory and return its pointer to the host.
- 2️⃣: We will pass a memory pointer to the string and the length of the string.
- 3️⃣: Initialize a string builder.
- 4️⃣: Iterate through the memory region to feed the string builder.
- 5️⃣: Generate a string from the string builder
- 6️⃣: Add the string parameter to message
- 7️⃣: if the parameter is "Bob Morane", then ret packs the pointer to, and the size of, the string "👋 hello Bob Morane".
Calling the hello function from JavaScript (Node.js)
This is the JavaScript code to call the function with a string parameter (in fact a string pointer and a string length) and retrieve the string result:
📝 index.js
"use strict";
const fs = require("fs");
const { WASI } = require("wasi");
const wasi = new WASI();
const importObject = { wasi_snapshot_preview1: wasi.wasiImport };
(async () => {
  const wasm = await WebAssembly.compile(
    fs.readFileSync("./hello.wasm")
  );
  const instance =
    await WebAssembly.instantiate(wasm, importObject);
  wasi.start(instance);
  // 🖐 Prepare the string parameter
  const stringParameter = "Bob Morane";
  const bytes =
    new TextEncoder().encode(stringParameter); 1️⃣
  // © Completely inspired by:
  // https://radu-matei.com/blog/practical-guide-to-wasm-memory/
  // Copy the contents of the string into the module's memory
  const ptr = instance.exports.alloc(bytes.length); 2️⃣
  const mem = new Uint8Array( 3️⃣
    instance.exports.memory.buffer, ptr, bytes.length
  );
  mem.set(bytes); 4️⃣
  let helloPointerSize =
    instance.exports.hello(ptr, bytes.length); 5️⃣
  let memory = instance.exports.memory;
  const completeBufferFromMemory =
    new Uint8Array(memory.buffer);
  const MASK = (2n**32n)-1n;
  let stringPtrPosition = Number(helloPointerSize >> BigInt(32));
  let stringSize = Number(helloPointerSize & MASK);
  const extractedBuffer = completeBufferFromMemory.slice(
    stringPtrPosition, stringPtrPosition+stringSize
  );
  const str = new TextDecoder("utf8").decode(extractedBuffer);
  console.log(`📝: ${str}`); 6️⃣
})();
- 1️⃣: Transform the input string into its UTF-8 representation
- 2️⃣: Allocate a block in the module's memory. The alloc function (from the Wasm module) returns an offset in the module's memory to the start of the block.
- 3️⃣: Create a Uint8Array view of the proper size on the memory buffer, starting at ptr
- 4️⃣: Copy the content of bytes into the memory buffer
- 5️⃣: Call the module's hello function and get the packed pointer/size pair that locates the result string in the module's memory.
- 6️⃣: Print 📝: 👋 hello Bob Morane.
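The "allocate, then copy" part can be isolated in a helper. This is a sketch with stand-ins so that it runs without a Wasm module: writeString is a hypothetical name, and fakeAlloc is a naive bump allocator written for the demo, not the alloc function exported by the TinyGo module.

```javascript
// Hypothetical helper: copy a JavaScript string into the module's memory.
// `alloc` is expected to behave like the exported alloc function (size -> offset).
function writeString(memory, alloc, text) {
  const bytes = new TextEncoder().encode(text);
  const ptr = alloc(bytes.length);
  // Create the view only after alloc: growing a memory detaches older ArrayBuffers
  new Uint8Array(memory.buffer, ptr, bytes.length).set(bytes);
  return { ptr, length: bytes.length };
}

// Stand-ins so the sketch runs without a Wasm module (assumptions for the demo)
const fakeMemory = new WebAssembly.Memory({ initial: 1 });
let nextFree = 8;
const fakeAlloc = (size) => {
  const ptr = nextFree;
  nextFree += size;
  return ptr;
};

const written = writeString(fakeMemory, fakeAlloc, "Bob Morane");
const readBack = new TextDecoder("utf-8").decode(
  new Uint8Array(fakeMemory.buffer, written.ptr, written.length)
);
console.log(written);  // { ptr: 8, length: 10 }
console.log(readBack); // Bob Morane
```

With the real module, the call would look like writeString(instance.exports.memory, instance.exports.alloc, "Bob Morane"), and the returned ptr and length are the two arguments to pass to hello.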
To run the script, use this command:
> node --experimental-wasi-unstable-preview1 index.js
📝: 👋 hello Bob Morane
That's all for today. I learned a lot while writing this blog post. I hope this will help you, and do not hesitate to suggest improvements. 👋
Photo by Steve Johnson on Unsplash