Matt Butcher

Serverless AI Inferencing with Python and Wasm

In a recent article entitled "Serverless AI Inferencing with Python and Wasm", Tim McCallum walks through the process of writing a new AI application with Spin, Python, and LLaMa2-Chat.

The basic process is as follows:

  • Create a new app with spin new http-py
  • Add ai_models = ["llama2-chat"] to the spin.toml file
  • Write some code (which I'll show below)
  • Build it with spin build
  • Deploy it with spin deploy

Tim's write-up is super simple and easy to follow. You can be writing your first AI app in no time.

Here's the code Tim added to the scaffolded app.py file:

from spin_http import Response
from spin_llm import llm_infer
import json
import re

PROMPT = """<<SYS>>
You are a bot that generates sentiment analysis responses. Respond with a single positive, negative, or neutral.
Follow the pattern of the following examples:

User: Hi, my name is Bob
Bot: neutral

User: I am so happy today
Bot: positive

User: I am so sad today
Bot: negative

User: """

def handle_request(request):
    request_body = json.loads(request.body)
    sentence = request_body["sentence"].strip()
    result = llm_infer("llama2-chat", PROMPT + sentence)
    # Strip the "\nBot: " label from the model output, leaving only the sentiment word
    response_body = json.dumps({"sentence": re.sub(r"\nBot: ", "", result.text)})
    return Response(
        200, {"content-type": "application/json"}, bytes(response_body, "utf-8")
    )
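To see what the handler does to the data without deploying anything, here is a minimal sketch of its parsing and cleanup steps as plain functions. The helper names (extract_sentence, clean_completion, build_response_body) are hypothetical and not part of the Spin SDK, and the sample completion assumes the model answers in the "\nBot: positive" shape that the few-shot prompt encourages.

```python
import json
import re


def extract_sentence(body: bytes) -> str:
    """Parse the JSON request body and return the trimmed sentence."""
    return json.loads(body)["sentence"].strip()


def clean_completion(text: str) -> str:
    """Remove the "\\nBot: " label, mirroring the re.sub call in the handler."""
    return re.sub(r"\nBot: ", "", text)


def build_response_body(completion: str) -> bytes:
    """Wrap the cleaned completion in the JSON shape the handler returns."""
    return json.dumps({"sentence": clean_completion(completion)}).encode("utf-8")
```

Because these helpers use only the standard library, they can be run in a local Python session before the same logic is dropped into handle_request.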

That's it! The spin deploy command will give you a URL to test. You can use curl to send JSON requests to it:

$ curl -X POST --data '{"sentence":"Everything is awesome!"}' https://sentiment-analysis-abc-xyz.fermyon.app/

{
    "sentence": "positive"
}
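If you would rather call the endpoint from Python than from curl, the same request can be sketched with the standard library. The URL below is the placeholder from the curl example and has to be replaced with the one spin deploy prints. The function names are hypothetical, and the Content-Type header is an addition that the handler shown above does not depend on.

```python
import json
import urllib.request

# Placeholder URL from the curl example; replace with the URL from `spin deploy`.
APP_URL = "https://sentiment-analysis-abc-xyz.fermyon.app/"


def build_request(sentence: str, url: str = APP_URL) -> urllib.request.Request:
    """Build a POST request carrying {"sentence": ...} as a JSON body."""
    payload = json.dumps({"sentence": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(sentence: str, url: str = APP_URL) -> str:
    """Send the sentence to the deployed app and return the "sentence" field."""
    with urllib.request.urlopen(build_request(sentence, url)) as resp:
        return json.loads(resp.read())["sentence"]
```

Calling classify("Everything is awesome!") against a deployed app sends the same JSON payload as the curl command and returns the value of the "sentence" field from the response.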

If Spin is new for you, you might prefer to start with the Spin quickstart guide. Spin is open source, and you can check out the code for it on GitHub.
