In a recent article entitled "Serverless AI Inferencing with Python and Wasm", Tim McCallum walks through the process of writing a new AI application with Spin, Python, and LLaMa2-Chat.
The basic process is as follows:
- Create a new app with `spin new http-py`
- Add `ai_models = ["llama2-chat"]` to the `spin.toml` file
- Write some code (which I'll show below)
- Build it with `spin build`
- Deploy it with `spin deploy`
Tim's write-up is super simple and easy to follow. You can be writing your first AI app in no time.
Here's the code Tim added to the scaffolded `app.py` file:
from spin_http import Response
from spin_llm import llm_infer
import json
import re
PROMPT = """<<SYS>>
You are a bot that generates sentiment analysis responses. Respond with a single positive, negative, or neutral.
<</SYS>>
[INST]
Follow the pattern of the following examples:
User: Hi, my name is Bob
Bot: neutral
User: I am so happy today
Bot: positive
User: I am so sad today
Bot: negative
[/INST]
User: """
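def handle_request(request):
    # Parse the JSON body and pull out the sentence to classify.
    request_body = json.loads(request.body)
    sentence = request_body["sentence"].strip()
    # Run inference on the few-shot prompt followed by the user's sentence.
    result = llm_infer("llama2-chat", PROMPT + sentence)
    # Strip any "\nBot: " prefix from the model output so only the label remains.
    response_body = json.dumps({"sentence": re.sub(r"\nBot: ", "", result.text)})
    return Response(
        200, {"content-type": "application/json"}, bytes(response_body, "utf-8")
    )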
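The post-processing step can be tried locally, without Spin or a model. The sketch below is not from Tim's article: `FakeResult` and `fake_llm_infer` are hypothetical stand-ins for `spin_llm.llm_infer`, which is only available inside the Spin runtime, and the canned reply `"\nBot: positive"` is an assumption about the shape of the model output, inferred from the regular expression in the handler.

```python
import json
import re
from dataclasses import dataclass

# Placeholder for the full PROMPT shown above, which also ends with "User: ".
PROMPT = "User: "


@dataclass
class FakeResult:
    """Hypothetical stand-in for the object returned by spin_llm.llm_infer."""
    text: str


def fake_llm_infer(model: str, prompt: str) -> FakeResult:
    # Assumption: the model continues the pattern with a newline and "Bot: <label>".
    return FakeResult(text="\nBot: positive")


def build_response_body(sentence: str, infer=fake_llm_infer) -> str:
    """Mirror the handler: build the prompt, infer, strip the "Bot: " prefix."""
    result = infer("llama2-chat", PROMPT + sentence.strip())
    label = re.sub(r"\nBot: ", "", result.text)
    return json.dumps({"sentence": label})


if __name__ == "__main__":
    print(build_response_body("Everything is awesome!"))
```

Passing the inference function in as a parameter keeps the string handling testable on its own; inside Spin, the real `llm_infer` would take the place of the stub.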
That's it! The `spin deploy` command will give you a URL to test. You can use `curl` to send JSON requests to it:
$ curl -X POST --data '{"sentence":"Everything is awesome!"}' https://sentiment-analysis-abc-xyz.fermyon.app/
{
  "sentence": "positive"
}
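If you'd rather call the endpoint from Python than from `curl`, here is a minimal sketch using only the standard library. The helper names `build_request` and `classify` are my own, not part of Spin, and the URL is the placeholder from the example above, so swap in the one `spin deploy` prints for you.

```python
import json
import urllib.request

# Placeholder URL from the curl example; replace it with your own deployment's URL.
APP_URL = "https://sentiment-analysis-abc-xyz.fermyon.app/"


def build_request(sentence: str, url: str = APP_URL) -> urllib.request.Request:
    """Build a POST request carrying {"sentence": ...} as a JSON body."""
    payload = json.dumps({"sentence": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(sentence: str, url: str = APP_URL) -> str:
    """Send the sentence to the deployed app and return the label it reports."""
    with urllib.request.urlopen(build_request(sentence, url)) as response:
        return json.loads(response.read())["sentence"]
```

Against a deployed app, `classify("Everything is awesome!")` should return whatever label the model produces for that sentence.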
If Spin is new for you, you might prefer to start with the Spin quickstart guide. Spin is open source, and you can check out the code for it on GitHub.