Notes on working with C and WebAssembly

If you would like to make extremely lean software for the web, C & WebAssembly is one option. Truthfully the only really practical way of doing that has been to use Emscripten, emulating a lot of what you expect from a normal C environment in Javascript at the cost of a fair bit of overhead.

If you have an "every byte is precious" attitude like me (and like to get your hands dirty!) you can certainly do it yourself as well. The available information online is rather sparse, so this page mostly consists of things I've discovered while toying around with just that.

I was able to make a WebGL2 demo in 6.2KiB (uncompressed, not counting textures) without too much effort, so expect to get results around that magnitude for tasks of similar complexity.

Building

To generate WebAssembly binaries, you need LLVM, clang, and lld. Stable – 5.0, at the time of writing – won't cut it, so build it from source. Sadly LTO doesn't work, so we need to fiddle around with the IR.

First, let's turn each of our C source files into LLVM IR bitcode. This can be done like so:

clang -cc1 -Ofast -emit-llvm-bc -triple=wasm32-unknown-unknown-wasm -std=c11 -fvisibility hidden src/*.c

Combine all the bitcode into one and optimize it again:

llvm-link -o wasm.bc src/*.bc
opt -O3 wasm.bc -o wasm.bc

Next step is actually compiling it:

llc -O3 -filetype=obj wasm.bc -o wasm.o

Linking can now be done with lld.

wasm-ld --no-entry wasm.o -o binary.wasm --strip-all -allow-undefined-file wasm.syms --import-memory

Note that if you are running a version later than this commit, you'll need to add "--export-dynamic" as well to prevent wasm-ld from stripping absolutely everything.

Running

Our binary is now ready for use. Let's whip up some JS to do so:

let imports = {};
let memory = null;
let exports = null;

let request = await fetch( 'binary.wasm' );
let binary = await request.arrayBuffer();

imports['memory'] = new WebAssembly['Memory']( {'initial':32} );
memory = new Uint8Array( imports['memory']['buffer'] );
let program = await WebAssembly['instantiate']( binary, { "env":imports } );

let instance = program['instance'];
exports = instance['exports'];

Calling Javascript from C

All functions you want to be able to call from the WebAssembly module should be placed into imports, like so:

imports['print_num'] = function( n ){
    console.log( "Your number is " + n  );
    return 123;
};

Second, create a declaration for it in a C header somewhere. The only types you can use are i32, f32, and f64; I'll go over how to pass buffers and strings shortly.

typedef signed int i32;

[...]

i32 print_num( i32 n );

Finally, add the name of the function to wasm.syms so that the linker won't complain. You can then call the function to your heart's content.

Calling C from Javascript

Add __attribute__((visibility("default"))) to the function you want to call, so it won't get stripped out:

#define export __attribute__( ( visibility( "default" ) ) )

[...]

export i32 some_func( i32 n )
{
    return n+1;
}

Then you can immediately go ahead and call it from exports. Easy!

exports['some_func']( 1 );

Passing memory and strings

This is a little trickier than just returning an integer. From C, you pass a pointer to the memory, as well as the length:

void console_log( i32 str, i32 len );

[...]

const char * string = "Hello World!";
console_log( (i32) string, (i32) strlen( string ) ); /* on wasm32 a pointer is just a 32-bit offset into memory */

With the help of the memory object, you can get the bytes in question from Javascript. The TextDecoder interface can then convert it to a string if so desired.

let utf8decoder = new TextDecoder( "utf-8" );

[...]

function console_log( str, len ){
    let arr = memory.subarray( str, str+len );
    console.log( utf8decoder.decode( arr ) );
}

Passing data the other way requires you to allocate space in C, and then blit the data into the memory object from Javascript.
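
A minimal sketch of the C side, assuming a fixed-size static buffer is enough. The names input_buffer, input_buffer_ptr, input_buffer_size, and count_spaces are made up for illustration; the export macro and the i32 typedef are the ones used elsewhere on this page, repeated here so the snippet stands alone.

```c
#define export __attribute__( ( visibility( "default" ) ) )

typedef signed int i32;

/* Scratch space that Javascript is allowed to write into (size is arbitrary). */
static unsigned char input_buffer[256];

/* On wasm32 the returned pointer arrives in Javascript as a plain
   integer offset into the memory object. */
export unsigned char * input_buffer_ptr( void )
{
    return input_buffer;
}

/* Lets Javascript check how many bytes it may copy. */
export i32 input_buffer_size( void )
{
    return (i32) sizeof( input_buffer );
}

/* Hypothetical consumer: counts the spaces in the first len bytes
   that Javascript copied into the buffer. */
export i32 count_spaces( i32 len )
{
    i32 count = 0;

    /* Clamp the length so a bad argument can't read out of bounds. */
    if( len < 0 )
        len = 0;
    if( len > (i32) sizeof( input_buffer ) )
        len = (i32) sizeof( input_buffer );

    for( i32 i = 0; i < len; i++ )
    {
        if( input_buffer[i] == ' ' )
            count++;
    }
    return count;
}
```

From Javascript you would then do something along the lines of memory.set( new TextEncoder().encode( "a b c" ), exports['input_buffer_ptr']() ), followed by exports['count_spaces']( 5 ), after checking the byte length against exports['input_buffer_size']().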

Passing objects

You can't. To operate on e.g. WebGL objects, you'll need to store them JS-side and return an integer reference to them.

Here's an example of how you might do it:

let gl_id_freelist = [];
let gl_id_map = [ null ];

[...]

function webgl_id_new( obj ){
    if( gl_id_freelist.length == 0 )
    {
        gl_id_map.push( obj );
        return gl_id_map.length - 1;
    }
    else
    {
        let id = gl_id_freelist.shift();
        gl_id_map[id] = obj;
        return id;
    }
}

function webgl_id_remove( id ){
    delete gl_id_map[id];
    gl_id_freelist.push( id );
}

[...]

imports["glCreateShader"] = function( type ){
    let shader = gl.createShader( type );
    let shader_id = webgl_id_new( shader );
    return shader_id;
}

imports["glDeleteShader"] = function( shader_id ){
    let shader = gl_id_map[shader_id];
    gl.deleteShader( shader );
    webgl_id_remove( shader_id );
}

The standard C library

Naturally none of it is available, so we'll have to implement what we want ourselves. That includes malloc and friends: memory has to be managed by hand, for example by allocating everything statically or by writing a small allocator.

In the future there will probably be implementations of malloc written as separate wasm modules that you could link into your own, but no such luck yet.
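
One common way to cope without malloc is a bump allocator over a static arena. The sketch below is only an illustration of that technique, not something taken from the demo: bump_alloc, bump_reset, ARENA_SIZE, and ARENA_ALIGN are made-up names, the 64 KiB size is arbitrary, and individual allocations can never be freed, only the whole arena at once.

```c
#include <stddef.h> /* size_t and NULL; assumed to come from the compiler, not from a C library */

#define ARENA_SIZE  ( 64 * 1024 ) /* arbitrary; the same size as one WebAssembly page */
#define ARENA_ALIGN 8             /* every allocation is rounded up to this many bytes */

static _Alignas( ARENA_ALIGN ) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Hands out the next free chunk of the arena, or NULL if it doesn't fit. */
void * bump_alloc( size_t size )
{
    /* Round the request up to a multiple of ARENA_ALIGN. */
    size_t aligned = ( size + ( ARENA_ALIGN - 1 ) ) & ~(size_t)( ARENA_ALIGN - 1 );

    /* The first check catches wrap-around in the rounding above,
       the second checks the space that is left. */
    if( aligned < size || aligned > ARENA_SIZE - arena_used )
        return NULL;

    void * ptr = &arena[arena_used];
    arena_used += aligned;
    return ptr;
}

/* Frees everything at once, e.g. at the end of a frame. */
void bump_reset( void )
{
    arena_used = 0;
}
```

This fits workloads where allocations share a lifetime (per frame, per level, per request). Anything that needs to free individual blocks calls for a proper allocator instead.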

Builtins

Fortunately clang has a few builtin functions, mostly for string handling, bit twiddling, and trigonometry. To use them you can simply call __builtin_cosf() or the like; you won't need a header or anything. There's no guarantee, however, that LLVM won't just insert a call to the non-existent C library function anyway, so it's probably not something you should rely on outside of simple demos and such.
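
As a small illustration, the bit twiddling builtins are a fairly safe bet, since WebAssembly has instructions for counting leading zeros, trailing zeros, and set bits. A sketch, where the helper names (log2_floor, is_power_of_two, lowest_set_bit) and the u32 typedef are made up for this example:

```c
typedef signed int   i32;
typedef unsigned int u32; /* not defined elsewhere on this page */

/* Floor of log2( x ), or -1 for x == 0 (where __builtin_clz is undefined). */
i32 log2_floor( u32 x )
{
    if( x == 0 )
        return -1;
    return 31 - __builtin_clz( x );
}

/* True if exactly one bit is set. */
i32 is_power_of_two( u32 x )
{
    return __builtin_popcount( x ) == 1;
}

/* Index of the lowest set bit, or -1 for x == 0 (where __builtin_ctz is undefined). */
i32 lowest_set_bit( u32 x )
{
    if( x == 0 )
        return -1;
    return __builtin_ctz( x );
}
```

The string and math builtins are the riskier ones: depending on the arguments and the optimization level, the compiler may fall back to calling the library function of the same name, which then shows up as an undefined symbol at link time.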

Further optimizations with Binaryen

The Binaryen toolchain includes wasm-opt, a tool that reads WebAssembly, optimizes it, and then spits it out again. It shrinks my program by 10% or thereabouts, but your mileage may vary.

wasm-opt -Oz binary.wasm -o binary_opt.wasm

If you have large buffers allocated statically, LLVM insists on including them verbatim in the binary. wasm-opt will strip them out, so significant gains might be had in that case.

Demo

You can find the demo I did here. The source is available on Github as well.