Nathaniel McCallum presented "WASI Networking: Towards a World Wide WebAssembly" at Wasm Day / KubeCon + CloudNativeCon 2022.
My name is Nathaniel McCallum, CTO of Profian. We're going to be talking about WASI networking and, in particular, what drove us to propose the addition of sock_accept to WASI networking, what the most recent developments are in this space, and the unique ways in which the Enarx project is using them.
(What’s WASI?) I think a lot of people know the answer to this question. But just in case there's somebody in the room who doesn't, who has seen this acronym all over the place in the talks today and is still confused: WASI is actually a pretty simple concept. We have our WebAssembly code that's running, and we have some native code underneath it. We just need an interface between the two so they can talk to each other. We can always do this with custom APIs, but custom APIs aren't great for building communities, and they aren't great for scaling code. So we want something standardized that can give an excellent experience on every language platform out of the box.
Let’s look at the next slide and get a little bit of history. WASI actually started under a different name, CloudABI, which, along with a few other inspirations, started in 2016. After we released the WebAssembly MVP in 2017, people started to think more systematically about this. So CloudABI developed into what we today call WASI, which is a subgroup in the W3C that is actually working on this standard. At that point, CloudABI was deprecated, and everybody is basically trying to use WASI today with some degree of success.
Some of the more recent developments are the ones I mentioned. We're trying to drive the effort on modularization, which is really important, because there are a lot of environments that could use WebAssembly but may not be able to expose all of the interfaces that could be available under WASI. So we want to divide up the WASI specification into multiple modules, so that platforms can support only the APIs that they are able to support. And specifically, we added the accept call to snapshot one this last year. We're going to talk a little bit more about that in a moment. So the WASI snapshot is insufficient, right?
This is what we're all basically trying to run on today. And it has a lot of niceties to it. So it does contain a bunch of interfaces, things like clocks, file systems, networking, arguments, etc. However, it's really not modular as I was talking about before, you sort of have to buy into the whole thing. And one counter example, for example, of where this doesn't work is actually in the Enarx project. So although we are working on file system support today, as of today, in our latest release, there is no file system support at all. So if you attempt to call any of the file system APIs, you'll simply get an error. That will magically disappear in a future release. And we will have transparently encrypted file systems. So we look for things to get better in that regard. But we still need to divide up WASI into multiple different modules, so that we can advertise support for different feature sets. And for a long time, modularization has been blocked on interface types. And more lately on streams, which is currently under active development, thanks to many of the people in this room. And we're looking for this glorious future, which will arrive any day now.
So under WASI snapshot zero, we had a variety of interfaces. But if we just look at the networking calls, we really only had three calls directly having to do with sockets: being able to receive packets, being able to send packets, and being able to close a stream. It's a remarkably simple API, right? And the question is, what can we do with this? Now, there are actually two other interfaces that sneak in under the guise of networking. The first is poll_oneoff, which allows you to receive a notification when there is I/O ready to be performed on a given file descriptor. And, of course, there's the nonblock flag as well, which allows you to set a file descriptor to non-blocking mode. That is what allows you to get this poll event, and then the call won't block if there's not enough data to read, and so forth. So this basically summarizes what snapshot zero was in terms of networking.
You should immediately notice that there is no way to create a new socket here. The runtime could create a set of sockets and hand them over to the WebAssembly application, and you could operate on those sockets. You could read and write on those streams, you could close them, you could wait for I/O. But that's really all you could do. So we really need the ability to create new sockets. But the problem is, on what capability do we do this? For those of you who are not familiar with capabilities, we need to give a brief introduction to capability-based security. Each WASI call has a capability context. If you look at the two calls there, we have an open call and an openat call. Now, WASI has no open call, because open is global: it operates on a global context, whereas openat operates on a directory context. So you can say: within this directory, open this path.
And this actually goes all the way back to the 1960s, when we first started getting memory controllers in our hardware, and so we invented this concept of processes, where a process can have a separate address space. So every time you call fork, you basically get a different address space for the process. And the end result of having a different process base means that one application can't muck with another application in memory. And that's a really great feature. The problem is nobody ever thought to extend this notion of privacy, or multiple views. Today, we call them namespaces, in the Linux kernel, to all of the other resources that are available on the system. So for example, while you've got a private view of memory, you've got a global view of the file system. So everybody saw all the same files on the disk. And the only access control you had was whether you have permissions to access that file. It was the same thing with networking. It typically would have a set of networking interfaces on a Linux system. And if you had access to one of them, you had access to all of them, or plus or minus. And so basically, capabilities provide us a low cost alternative to OS namespacing. And so there's been a significant amount of effort. Of course, basically, all containers are built on top of operating system namespacing, where you can actually create different namespaces for network interfaces and file systems, and give people separate views of those. But capability based security basically says we're not going to have any APIs that don't have a context to them. And so what this means is that we can always create private views of the resources on the system, because every API receives a context. And so what we want basically is we want a system where there are no global resources. And the runtime can always indicate which resources a particular WebAssembly executable has. 
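To make the idea of a capability context concrete, here is a minimal sketch in Rust. It is not how WASI or any particular runtime implements openat: `DirCap`, `resolve`, `create_at`, and `open_at` are hypothetical names invented for this illustration, and the check is done on path strings only, so it does not defend against symlinks the way a real directory file descriptor would. The point is only that every operation takes a context, and nothing can be opened without one.

```rust
use std::fs::{self, File};
use std::io::{self, Read, Write};
use std::path::{Component, Path, PathBuf};

/// Hypothetical capability: a handle that stands for exactly one directory.
/// Holding a `DirCap` is the only way to open files in this sketch.
struct DirCap {
    root: PathBuf,
}

impl DirCap {
    fn new(root: impl Into<PathBuf>) -> Self {
        DirCap { root: root.into() }
    }

    /// Resolve `rel` inside the directory. Absolute paths and `..` are
    /// refused, so the caller cannot reach outside the context it was given.
    /// (Illustrative only: symlinks are not handled here.)
    fn resolve(&self, rel: &str) -> io::Result<PathBuf> {
        for part in Path::new(rel).components() {
            match part {
                Component::Normal(_) | Component::CurDir => {}
                _ => {
                    return Err(io::Error::new(
                        io::ErrorKind::PermissionDenied,
                        "path escapes the directory capability",
                    ))
                }
            }
        }
        Ok(self.root.join(rel))
    }

    /// Analogue of `openat(dir, path)` for reading.
    fn open_at(&self, rel: &str) -> io::Result<File> {
        File::open(self.resolve(rel)?)
    }

    /// Analogue of `openat(dir, path, O_CREAT)` for writing.
    fn create_at(&self, rel: &str) -> io::Result<File> {
        File::create(self.resolve(rel)?)
    }
}

fn main() -> io::Result<()> {
    // The "runtime" decides which directory the "application" may see.
    let root = std::env::temp_dir().join("dircap_demo");
    fs::create_dir_all(&root)?;
    let dir = DirCap::new(&root);

    dir.create_at("hello.txt")?.write_all(b"hi")?;
    let mut contents = String::new();
    dir.open_at("hello.txt")?.read_to_string(&mut contents)?;
    println!("read back: {contents}");

    // Neither of these names a path inside the context, so both are refused.
    assert!(dir.open_at("../etc/passwd").is_err());
    assert!(dir.open_at("/etc/passwd").is_err());
    Ok(())
}
```

The design choice to notice is that there is no free-standing open function at all: an application that was never handed a `DirCap` has no way to name a file.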
So this poses particular challenges for the global APIs that we all know and love, particularly in the networking world, where you're typically used to having a global view of network IP addresses and so forth, and you operate in this global context. Well, when we're trying to do capability-based security in WASI, that's not exactly a great fit, and there's a tension there. But it's not just Berkeley sockets, it's also file systems. We've solved this pretty efficiently by using openat. But even today, if you compare the Rust standard library with openat: openat takes a directory file descriptor, and Rust doesn't expose in its standard library any primitive for operating on open directories. So even though the underlying operating system does provide openat, the Rust standard library provides no way to actually access it. So the fundamental situation we found ourselves in with snapshot zero was that there was no way to create new sockets. The runtime could create sockets ahead of time and hand them to the application, you could read and write on them, and you could close them. But that was it. If you wanted to do more, you were out of luck.
Fortunately, we were able to improve on this in snapshot one. In snapshot one, we got most of the same stuff. But we at Profian felt pretty constrained by not being able to accept any incoming connections, and so we proposed the addition of sock_accept as an API. I think it's great: everyone was pretty enthusiastic about this, and we were able to move really quickly. Profian sponsored the addition of the entire networking stack to the Rust standard library, and we also provided patches to WASI libc. There are a number of others here, Microsoft for example, who have done this in .NET. So it's great to see a lot of people taking this up. Basically, what it means is that we now have the ability to accept incoming connections. And the reason we can do this is that when you pre-create a listening socket in the Berkeley sockets API, that listening socket already provides a context, so we are not violating capability-based security here. We are simply using the listening socket as that context, and so it was pretty easy to add this.
As I mentioned, this has been implemented in a variety of places. This has been a lot of work. Profian has done some of this work, but others have done it as well, so thanks to everyone who's contributed. In main right now, in WASI libc, there's sock_accept support. This means anybody who is consuming WASI libc as their interface to WASI automatically gets sock_accept as part of it. That would include a bunch of the dynamic languages like Python and Ruby and so forth. In the Rust world, we added support for networking to the standard library, and this is available in nightly. We also added support for poll_oneoff to mio, which it couldn't previously support, and so now mio actually supports non-blocking I/O on WASI. We currently have somebody working on getting tokio up and running.
So we would really like to see the entirety of the tokio framework. And we're also evaluating async std. If anyone in this room is interested in collaborating with these, we would love to have your collaboration. This is work that really benefits everybody. So we'd love to make a good showing of it.
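As a rough illustration of what accepting on a pre-created listening socket looks like from Rust, here is a minimal sketch using only the standard library. One substitution should be stated plainly: so that it runs natively, this sketch creates its own listener with `TcpListener::bind`, which is the step a WASI snapshot one application cannot do; under WASI the listening socket would instead be handed to the application by the runtime. The helper names `serve_one` and `echo_roundtrip` are made up for this example.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Accept one connection on an already-listening socket and echo back the
/// first chunk of data (up to 64 bytes). The listener is the only context
/// this function needs: it never names an address or creates a socket.
fn serve_one(listener: &TcpListener) -> std::io::Result<Vec<u8>> {
    let (mut conn, _peer) = listener.accept()?;
    let mut buf = [0u8; 64];
    let n = conn.read(&mut buf)?;
    conn.write_all(&buf[..n])?;
    Ok(buf[..n].to_vec())
}

/// Native stand-in for the runtime: create the listener, then let the
/// "application" code (`serve_one`) accept on it while a client connects.
fn echo_roundtrip(msg: &[u8]) -> std::io::Result<Vec<u8>> {
    // Assumption for the sketch: bind to an ephemeral loopback port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || serve_one(&listener));

    let mut client = TcpStream::connect(addr)?;
    client.write_all(msg)?;
    let mut reply = vec![0u8; msg.len()];
    client.read_exact(&mut reply)?;

    server.join().expect("server thread panicked")?;
    Ok(reply)
}

fn main() -> std::io::Result<()> {
    let reply = echo_roundtrip(b"hello")?;
    println!("echoed {} bytes", reply.len());
    Ok(())
}
```

Note that `serve_one` only ever touches the listener it was given, which is the property that lets accept fit the capability model described above.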
So as we look beyond WASI snapshot one, however, we still have a variety of things that have to happen in order for us to make forward progress. Fortunately, we have pretty mature interface types at this point, the tooling is rapidly maturing in this area, we are also starting to get the streams definition to be somewhat mature. I'm hoping that this will accelerate in the coming days as people show more and more interest in it. I think it's pretty clear, and there were at least four talks, I think, that mentioned that their biggest pain point in WebAssembly today was networking. So it seems to me that there's a pretty broad consensus that this is something we need to pay attention to. We really need to target three different scenarios. And right now, there's a lot of work being done in the last one. But I want to talk about what these three are. And I want to talk about the subtle differences between them. And why I think we need to actually adopt all of them.
So, the subtle differences between them. The first one is blocking. Blocking is the old Berkeley sockets that we've known and loved since time immemorial: if you create a socket and do a connect, you're going to wait until that connection completes before the function returns. The same goes for reading, writing, and so forth.
Non-blocking was a mode that was added to this, where you could set the non-blocking flag on the socket. Then, if there was no I/O available to be performed and you did a read, for example, the function would return immediately with an EAGAIN error, saying that you need to call this function again when there's actually I/O available. This is combined with a polling function of some kind; in WASI, this is poll_oneoff. You can call poll_oneoff, and poll_oneoff will block regardless of the state of the non-blocking flag. When poll_oneoff returns, it gives you an indication that there is I/O ready to be performed. Then you can call the non-blocking read, and instead of receiving that EAGAIN error, you will receive some of the data that was available on that connection. We might call this a notification mode; I got this term from Dan Gohman.
Async is different, but it's very subtly different. Async is where we indicate to the kernel or the runtime that we want to perform some I/O, and that function immediately returns. Then we can call another function later to block, and it returns only when the I/O is complete. So the distinction between non-blocking and async is that non-blocking provides you a notification that I/O is available, and then you perform a non-blocking read, whereas in async you give an indication that you want to do a read, and then you call a function that blocks until all of the data is available. So: notification versus completion. Thank you, Dan Gohman, for that great phrase.
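The non-blocking behavior described above can be observed with the Rust standard library, where the EAGAIN condition surfaces as `ErrorKind::WouldBlock`. This is a minimal native sketch, not WASI code: the function name `nonblocking_demo` is made up, and because the standard library has no portable polling call, a short sleep-and-retry loop stands in for the readiness notification that poll_oneoff would provide.

```rust
use std::io::{ErrorKind, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// Returns whether the first read reported "would block", plus the bytes
/// that were eventually read once the peer had sent something.
fn nonblocking_demo() -> std::io::Result<(bool, Vec<u8>)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut client = TcpStream::connect(listener.local_addr()?)?;
    let (mut server_side, _peer) = listener.accept()?;

    // Equivalent of setting the nonblock flag on the descriptor.
    server_side.set_nonblocking(true)?;

    let mut buf = [0u8; 16];

    // Nothing has been sent yet, so a non-blocking read returns immediately
    // with WouldBlock (EAGAIN) instead of waiting.
    let saw_would_block = match server_side.read(&mut buf) {
        Err(e) if e.kind() == ErrorKind::WouldBlock => true,
        _ => false,
    };

    client.write_all(b"ping")?;

    // Stand-in for a readiness notification: retry until the data arrives.
    // A real program would block in a poll call instead of sleeping.
    let mut received = Vec::new();
    while received.len() < 4 {
        match server_side.read(&mut buf) {
            Ok(0) => break, // peer closed the connection
            Ok(n) => received.extend_from_slice(&buf[..n]),
            Err(e) if e.kind() == ErrorKind::WouldBlock => {
                thread::sleep(Duration::from_millis(5));
            }
            Err(e) => return Err(e),
        }
    }
    Ok((saw_would_block, received))
}

fn main() -> std::io::Result<()> {
    let (saw_would_block, received) = nonblocking_demo()?;
    println!("first read would block: {saw_would_block}");
    println!("then received {} bytes", received.len());
    Ok(())
}
```

In the completion-style (async) model, by contrast, the application would hand the buffer over up front and later wait for the whole operation to finish, rather than being told when to try the read.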
And so we still also need to port existing tooling at Bindgen. And I know Dan is working on that furiously. We also have a new networking proposal, and it is fairly reminiscent of what we know from traditional Berkeley sockets. But that may actually pose some problems, and you'll see why when we get to the Enarx demo in a moment. One of the things it does is expose all of the lower-level protocols. So one of the questions is: do we actually want to expose all of those lower-level protocols? Or do we really just want to say, I have this named thing, maybe it's an outgoing connection, maybe it's an incoming connection, and I want to perform operations on it, while all of the details of what that thing actually is are hidden by the runtime? You'll see why this is important in a moment. As a spoiler alert, we do transparent TLS in Enarx. So when you create sockets, you're automatically getting a TLS socket. It's not TCP; we don't allow the use of plain TCP at all. So a proposal that exposes the underlying protocols does pose some challenges for transparent TLS.
If we're just going to wrap the bare Berkeley sockets API and expose all the underlying protocols, it also means that we are going to have difficulties with multi-layer policy. For example, say you're in a world where you're not doing transparent TLS like Enarx is, and you want to do TCP operations, but you also want to do TLS operations. Well, how do you control the policy over which is allowed to which hosts? It becomes a fairly complex problem to figure out what the actual interactions are between those things. And the reason for that is that TLS is a species of TCP. So now, on every packet, you have to analyze: okay, if TLS isn't allowed, is this packet that I'm receiving on TCP actually a TLS packet? And if it is, then I have to evaluate my policy.
So now we're sort of forcing everyone into deep packet inspection, which is probably not a place where we want to be. So we really need some good thinking about this. And really, this is just an invitation to participate. I know there's a lot of people in this room that really care about WASI, that care about networking. So this is a really good opportunity to contribute to this discussion, and help us to create a design that looks really good. By the way, all my credit goes to the author of the proposal, it's a very thorough proposal. So I'm not knocking him at all. And it's really just a matter of what we can come up with that fits the needs of the community the best.
So we actually have a demo today. I want to demo, essentially, what you can create in a sock_accept-enabled world. Everything you're going to see is running today on the most recent release of Enarx, 0.5, which came out last week. We're going to show an application called Cryptle. Cryptle is a clone of everyone's favorite game, Wordle, except it is done in an encrypted environment. First, we're going to show it running on Enarx, just so you can get a feel for what the application does. Then we're going to attack Cryptle on wasmtime. And I'm not singling out wasmtime here as the bad guy. Okay, wasmtime is fantastic. We use wasmtime internally. What I am trying to show by using wasmtime here is that we're going to take the exact same WebAssembly binary that we ran in wasmtime, and we're going to deploy that binary using Enarx, and we're going to get a bunch of other protections for free. So we're going to show an attack on Cryptle using wasmtime, then we're going to do an attack retrospective, where we analyze why the attack worked and what we could do to stop it. And then we're going to try the same attack on Enarx. I need to pause here for a moment, because a huge thanks needs to go out to Harald Hoyer, Richard Zack, Roman, and Nick. You guys put in a tremendous amount of work on this demo, and I'm just really pleased to work with you all. So thank you very much. By the way, Harald was supposed to be giving this talk today, but his wife is expecting, so if you know Harald, send him congratulations.
All right, hopefully this video is gonna come up here. Go Go Gadget internet. This is why you record the video, so you don't have problems. And then, of course, you have problems with the video. Oh, there we go. Okay. So we have this game, Cryptle. Cryptle is basically a multiplayer Wordle demo, and you can see some words on the left. One of the things that's different about Cryptle compared to the normal Wordle game is that in normal Wordle, the word to be guessed and the word list are all actually in the client. It's not done on the server. So anyone who is good at inspecting things in the browser console can figure out what the word is. But we wanted to do something more secure: we want the word to actually be chosen on the server side. More than that, we wanted to allow multiple players to guess, and we wanted them to see when they actually guessed other players' words. So this is not a super competitive game. It's just a game for a little bit of fun.
So we have three players here, and they're all playing the Cryptle game. You can see, oh, we got words, we got three letters there. And now we're gonna guess "world" and see that we actually guessed one of the other players' words, and so it showed up in a special color. And finally, we're going to play as the third player here, and we're going to do the same thing, just guessing letters. While this is playing, I'm going to make a brief PR announcement: we released 0.5 last week. We now have support for running Enarx in unencrypted mode on both macOS and your favorite Raspberry Pi. This is in preparation, by the way, for Arm Realms, which has been publicly announced. So stay tuned for news in that regard. So basically, we've seen our application here, and we've guessed another word. And that's it, more or less. We can see who the winners were based on this.
What you just saw, by the way, was running in Enarx, on the latest release. Now we're gonna skip ahead and show the application running on wasmtime. The text is probably a little small; hopefully you can see it. We're going to do a cargo build of this Rust crate, which is just the Cryptle crate. There'll be a URL for the demo later if you'd like to see it. So we have run it in wasmtime, and it's now listening on a socket. Now, we are an attacker who has managed to gain root access on the server, and we're trying desperately to get this most prized Wordle word. So what we want to do is scan the memory of the application for any of the words that are in the dictionary, because we want to find out what the word is, basically bypassing the guessing rules. Here, by the way, you should understand that the guessing rules in Wordle are really just the access controls of your application. And by accessing this host, as you're going to see here in red, we found words that are in the word list. As we scan the memory of the application, we pick up, I think, three words in this particular instance. Yeah, there's "youth." And there's one more.
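The scanning step of the attack is conceptually very simple. The sketch below is not the tool used in the demo: it assumes the attacker has already obtained a dump of the process memory as a byte buffer, and it searches that buffer for every word in a dictionary. The function name `scan_for_words`, the sample buffer, and the three-word dictionary are all invented for illustration.

```rust
/// Return `(offset, word)` for every occurrence of a dictionary word in
/// `memory`. This is a plain byte-for-byte search with no knowledge of the
/// application: plaintext secrets in memory are enough for it to succeed.
fn scan_for_words<'a>(memory: &[u8], dictionary: &[&'a str]) -> Vec<(usize, &'a str)> {
    let mut hits = Vec::new();
    for &word in dictionary {
        let needle = word.as_bytes();
        if needle.is_empty() || needle.len() > memory.len() {
            continue;
        }
        // Slide a window of the word's length across the whole dump.
        for (offset, window) in memory.windows(needle.len()).enumerate() {
            if window == needle {
                hits.push((offset, word));
            }
        }
    }
    hits
}

fn main() {
    // Made-up stand-in for a memory dump with two words embedded in it.
    let memory = b"\x00\x17youth\x00\xffjunk\x00world\x00";
    let dictionary = ["world", "youth", "crane"];

    for (offset, word) in scan_for_words(memory, &dictionary) {
        println!("found {word:?} at offset {offset}");
    }
}
```

If the memory pages are encrypted, the same search runs over ciphertext, so the dictionary words simply never appear in the buffer.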
And so although wasmtime has performed spectacularly, we are performing an attack that is out of scope for the security model of wasmtime. So again, wasmtime is not to blame. And if we were running this in Enarx in debug mode, you'd see exactly the same thing: you'd be able to access the memory and bypass the rules. So the question is, why did this attack succeed? The fundamental issue is that we have three different forms of workload isolation. Type one is protecting one workload from another. Type two is protecting the host from a malicious workload. Both of those we can actually do pretty well today, right? There are lots of companies doing this at scale, so this is not a problem. The problem is that, until confidential computing, we haven't really had any protection for the third type of isolation, which is protecting a particular workload from the host. Currently, the host has access to read all the memory of the application, and can tamper with that application while it's running, and so forth.
And this is fine, right? Basically, as long as you trust your CSP, and all of their sysadmins, and all of the hardware, software, and firmware stack. Fortunately, it's not millions of lines of code. Oh, wait, yeah, it is. And you have to trust that they're safe from compromise, because they may not be doing anything directly, they may just not have been able to secure something, and safe from a supply chain attack on the actual operating system, both now and in the future. Right. And that's, of course, if your CFO, your board, your auditor, and your regulator all agree with you as well. So this is a pretty long list of criteria to meet in order to be able to trust it. And this is something we just sort of accept in the industry today. We accept it because we aren't aware that there's another way to operate, and that's because the hardware simply hasn't been available. But not all clouds are good. So the question is what makes Enarx different.
And how Enarx is different is that we use confidential computing. Confidential computing is a new set of hardware technologies that have come out from all of your favorite CPU manufacturers. For example, Intel and AMD, and Arm has also announced Armv9 Realms. Basically, this allows you to create an application or a virtual machine within which the memory pages are actually encrypted. So while the application is running, even though the host can scan the memory of a normal application, if you've set up this special confidential application correctly, then the host won't be able to read it or tamper with it. So we use trusted execution environments (TEEs), which are based on CPU hardware, we encrypt the workloads, and we provide two properties that are really important. The Enarx project will not be implemented on a TEE platform if it does not provide these two properties. We want integrity, and we want confidentiality. In other words, no peeking, no tweaking.
So basically, you start off with your workload here, and you want to put the workload on the host somehow. But the problem is, how do you actually know that the workload that you are attempting to deploy to that host is in fact the workload that gets deployed? We should be thinking of this as a certain kind of supply chain attack. We tend to think of supply chain attacks as everything north of me. I grew up in upstate New York, and if you ask any New Yorker where upstate New York is, they will reply: well, it's what's north of me. Right? So if you live in New York City, upstate New York is anything north of New York City. If you live in Albany, well, then upstate New York is anything north of Albany. The same thing applies here. What is downstream from you is also a supply chain attack. So what we want to do is create this TEE, and we want to create a measurement of the application, or in this case, the Enarx runtime.
And then we want to offload that measurement signed by the hardware to an attestation service, and the attestation service must not be in your cloud, because your cloud provider can't prove to you that they set up the environment correctly, you need an independent source of trust. So we offload the measurements to another attestation service, and the attestation service proves to you cryptographically that the environment that was set up has those two properties: confidentiality and integrity. But what we actually want to do is something more than that, because we actually want to create an empty Keep. There are several systems out there today that try to do something like this, but they deploy the application immediately into an untrusted system. And what if the algorithms of that application, right, what if it's a risk model, and you're an insurer, or what if it's an AI model, or what if it's any of these types of code that need to be protected, of which there's quite a few today. And so what we want to do is we want to bring up an empty Keep that we call this an undifferentiated Keep, it contains only the Enarx runtime, and that's what we measure. And then the attestation service validates this for us, and provides a certificate identifying the workload that gets deployed in that Keep, once the Keep has the certificate, it can then fetch an application from Drawbridge. You can think of Drawbridge as something like an attestation-aware Docker Hub, where it contains the software or the software that you're going to be deploying. And it will only release that software if you perform a successful attestation to the Steward. And so we can show this same exact demo on Enarx. Now one of the things that's not immediately obvious here is that when we ran on wasmtime, our sockets were unencrypted. 
Well, when we run this time on Enarx, we're going to do exactly the same thing and deploy exactly the same binary, but we're going to do the attestation, and we're going to get a certificate that identifies the workload. Then we can do transparent TLS on everything that's involved. Coming soon will be a transparently encrypted file system as well, so anything that you persist to the disk is always encrypted. The point is, once data or code enters the system, it never leaves unencrypted, unless you do something seriously, seriously wrong. We can't make it impossible to do the wrong thing, but we can make it hard.
So this is an example of the wasmtime. We're going to kill that which was previously running. And we're just going to do the same thing. We're going to start it using Enarx instead of wasmtime. And we have this configuration file that identifies the environment and which Steward to contact for the certificate. This file is going to come from the Drawbridge in the future. But what we're going to do right now is we're going to do an upload to Drawbridge of the files we're going to upload the wasm file and we're going to upload the enarx.toml. And once those are both in the Drawbridge, they are now ready to be deployed. So the upload is completed. And now we're just going to do an Enarx deploy, giving the URL of the particular application. Normally we have a shorter slug here that's more similar to the Docker style. But we're specifying a full URL here because the drawbridge is running locally. And unencrypted. The support for unencrypted drawbridge will go away, it's just currently there to support this latest release. So it's taking a moment here to actually start, it's a little bit longer to bring up because we have to bring up the hardware environment, we have to do a bunch of cryptography, we have to contact the Steward and do our attestation, receive our certificate, set up all of our sockets. But once that's done, there we go. Now we've switched to the scanning page, the application is running, we are going to do the same scanning attack that we saw before. This time, we're going to scan for Enarx instead of wasmtime. And we're going to do the same memdump we had just with a different PID. This time, it's the PID for Enarx. And you notice that we've not found any words. And the reason for this is because all of the memory is encrypted. On different hardware platforms, this was on Intel SGX. We can also do this on AMD SEV-SNP, which is the latest Milan generation. 
And if you do this on SNP, you'll actually get a denial because the hardware actually denies access to the memory that's inside of the encrypted VM. So if you'd like to find more about the Enarx project, as I said, we just released 0.5, last week. We have releases coming every four weeks now. And it's a train. So if you want to contribute, please come hop on the train with us. You can go to the website enarx.dev, we have a blog, we have GitHub, we have chat as well. We're a friendly bunch of people. So come along and help us build a better future.
BTW, Profian is hiring! So if you are like a ninja wizard, and doing all sorts of devops stuff, and you like performance, and low level hardware stuff, or cryptography, if any of that intersection interests you, we are doing really really cool work, and we have a great team of people, so come check us out!
There’s one more thing. Today we are announcing the Cryptle Hack Challenge. Basically we want to see what your 1337 skillz are. We want to know if you can actually hack the Enarx runtime. Now I do have to give a little bit of caveat here that, although Enarx is very close to production, this is still a pre-production release. But we want you to help us find issues before attackers do. So we want to see if you can prove it. There is going to be a cake, and by cake we mean prizes. This will include some hardware and will include some cash. If you have an attack, you submit this attack to us, we’ll run this attack on the server, and we are going to livestream the whole thing, so we’ll see if your attack succeeds or fails. And if it succeeds, you win a prize. And all of the winners will be announced at Black Hat. We’ll be doing this in two phases. The first phase will be announced at Black Hat. So come along and show us your stuff!
Cross-posted at: https://blog.profian.com/wasi-networking/