My First Time Building with SST was a Disaster

2/2024 | Aman Azad

Classic programmer mistake, SST is actually really, really good.

Mistakes between the chair and the keyboard

The SST team had announced a new console to view the infrastructure scaffolded for your SST app. When you run pnpm sst dev, the CLI spits out a URL that links you directly to the console. In dev mode this is great, and honestly kind of magic, watching infra get built and rolled back as you edit your SST stacks.

SST CLI updating infra on the fly

Framework defined infrastructure, but without the Vercel black box.

This is all great in theory, especially given my goal of learning how to deploy a NextJS app directly via AWS primitives (via CDK). The issue I ran into: I could see the infrastructure being deployed, but the SST console was stuck in a loading state. Did some searching around their support Discord, and much love to FlexpathDXP and Jay for being helpful in trying to add context.

SST Discord help

I had originally thought the issue was with the new console, since changing the URL structure to the old format showed the infra just fine.

The short of it: I'm not sure exactly how I got into this state with the old console, but here's my best guess at what happened. I had originally created my first SST console account while my AWS CLI was auth'd with a previous AWS access key (lol). While troubleshooting, I ended up creating a new SST console account with the correct AWS access key.

At this point, my local dev environment was still on the old AWS access key, so although the infra was being deployed, the SST console was stuck in a loading state. I re-auth'd my AWS CLI with the new access key, and the SST console started working. I now have an orphaned OpenNext project on my old AWS account, I guess.

SST works! And wow it is a very well built piece of software...

SST Discord success

Moving on from RDBMS in a serverless world

Friendship ended with postgres

I love tables; SQL was my homie. Postgres, foreign keys, love all that. But in a serverless context, it's really just the wrong tool.

Furthering my journey into SST land with AskDeen, I made a few key realizations. Although Postgres + pgvector gets me RAG out of the box with stored SQL procedures, an RDBMS is not an ideal choice for a serverless-first app.

Although you can run an RDBMS using AWS Aurora, for my purposes I'd need to maintain at minimum two auto-scaling clusters. The queries would also be slower, and much more expensive than a NoSQL solution.

I could also have gone with an off-the-shelf solution like Supabase, but for the amount of data I'd need to host I'd be looking at paying $25/mo for the rest of my life to run this app. A flat fee rather than usage-based pricing is not what I signed up for when I decided to go serverless-first.

Aurora and Supabase pricing

So the most natural data store for serverless apps is NoSQL, and AWS's DynamoDB comes built in as a construct with SST. Coming from an RDBMS world there's a bit of a learning curve in understanding how to model data in a single table. Luckily this isn't too hard; there are lots of resources to figure it out. There were some recommendations in the SST docs to go with something like ElectroDB, but as a first-time NoSQL'er, my personal approach to learning new things is to go as close to building with primitives as I can.
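As a sketch of what that construct looks like, assuming SST v2's Table construct (the stack and field names here are my own, not AskDeen's actual config), a single-table DynamoDB definition might be:

```typescript
import { StackContext, Table } from "sst/constructs";

// Hypothetical stack: one DynamoDB table with generic pk/sk fields
// for single-table design. Names are illustrative.
export function StorageStack({ stack }: StackContext) {
  const table = new Table(stack, "AppTable", {
    fields: {
      pk: "string", // partition key, e.g. "USER#<id>"
      sk: "string", // sort key, e.g. "CHAT#<id>"
    },
    primaryIndex: { partitionKey: "pk", sortKey: "sk" },
  });
  return { table };
}
```

The generic pk/sk naming is the usual single-table convention: the key format, not the column name, carries the entity's meaning.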

One last note on NoSQL and single-table design: this isn't how data is modeled by default in an RDBMS, but lots of folks have taken this approach. There's some talk about how NoSQL requires lots of up-front work to understand your data relationships. I disagree; I found the process of modeling primary and sort keys very natural to build up in an evolutionary way. Also, many teams have built traditional RDBMS solutions in a very similar single-table way; Pieter Levels and folks on Reddit have both mentioned this approach.
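To make that key modeling concrete, here's a minimal sketch of the idea; the entity names and key formats are hypothetical, not AskDeen's actual schema:

```typescript
// Single-table design: every entity lives in one table, and the
// partition/sort key formats encode the relationships.
type ItemKeys = { pk: string; sk: string };

// A user item: the partition is the user, the sort key marks the
// item type.
function userKey(userId: string): ItemKeys {
  return { pk: `USER#${userId}`, sk: "PROFILE" };
}

// A chat owned by a user: same partition, chat-specific sort key,
// so a single Query on pk returns the user *and* all their chats.
function chatKey(userId: string, chatId: string): ItemKeys {
  return { pk: `USER#${userId}`, sk: `CHAT#${chatId}` };
}

// "All chats for a user" then becomes a key-condition query:
//   pk = USER#<id> AND begins_with(sk, "CHAT#")
const chatPrefix = "CHAT#";
```

Evolving the model means adding new sort-key formats, which is why it never felt like big-design-up-front to me.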

@levelsio tweet about single table design in RDBMS

We have achieved Muslim AI RAG

Got my infrastructure infrastructuring, and my data store data storing. With the undifferentiated plumbing out of the way, I was able to make some great progress on the actual novel bits of AskDeen. One of my main objectives was getting RAG (retrieval-augmented generation) working over the Quran and hadith.

Great news: I was able to get this working in a UI using Vercel's AI SDK + AI Chatbot NextJS starter!

A few things went into this:

  • Pinecone for a serverless vector DB
  • OpenAI embedding API for prompt and Quran embeddings
  • Vercel's AI SDK to achieve a streaming chatbot + UI react hooks

The above choices came about in a few ways. The easiest to explain is OpenAI; I listed the reasons I chose their models instead of hosting my own in the original AskDeen post. Nothing's changed here: good APIs, good docs, lots of community support. At this point, I see no need to host a fine-tuned model for this particular project.

Why Pinecone?

I had originally wanted this project to be 100% SST, 100% AWS primitives. To that end, I gave AWS's OpenSearch a try. What a nightmare of an experience.

You've got to do a few things to get vector search working with AWS's implementation of OpenSearch. They gave a talk at re:Invent discussing a "Zero-ETL Integration" with DynamoDB. Lies.

  • You have to configure a serverless OpenSearch collection
  • You have to configure a pipeline to ingest data from DynamoDB
  • You have to correctly set up IAM roles for all the services above
  • Once data starts flowing, you have to properly index the correct embedding columns
  • You have to learn how to query the OpenSearch API

Maybe I would've gotten it working at that last step of learning OpenSearch queries, but I had already spent nearly 10 hours hacking away at this, so in the meantime I reached out to my friends for some advice. While I was plugging away, a friend of mine mentioned Pinecone.

After hitting a roadblock with the OpenSearch queries, I gave Pinecone a try. Within 2 hours I had 10 ayat uploaded into their serverless vector DB, and was able to input a prompt, generate an embedding with OpenAI, and get back a cosine-similarity kNN search result. Alhamdulillah.

One last note on OpenSearch: my AWS bill was basically $0.00 until I started trying to set up this DynamoDB pipeline. The unbounded access to AWS and my general lack of understanding here made me very hesitant to keep going down this path with OpenSearch. Just from attempting to set up the pipeline, I've got a $3 charge on my account for this month. Lmao.
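For intuition on what that cosine-similarity kNN query is doing under the hood, here's a self-contained sketch over plain arrays. Pinecone does this at scale over an index rather than a brute-force scan, but the math is the same:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Higher means the
// vectors point in more similar directions.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// k-nearest-neighbor search: score every stored embedding against
// the query embedding and keep the top k matches.
function knn(
  query: number[],
  records: { id: string; embedding: number[] }[],
  k: number,
): { id: string; score: number }[] {
  return records
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

In the real app the query vector is the OpenAI embedding of the user's prompt, and the records are the embedded ayat.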

Getting the Quran into Pinecone

// Inside the per-ayat ingestion loop: `record` is the current parquet
// row, `openai` and `pinecone` are initialized clients, and
// `ayat`/`absoluteAyat`/`quranArabic`/`progressBar` come from the
// surrounding script.
const embeddingsResponse = await openai.embeddings.create({
  model: "text-embedding-3-small",
  dimensions: 1536,
  input: record.Text,
  encoding_format: "float",
});
const embeddingValue = embeddingsResponse?.data?.[0]?.embedding ?? [];

if (!embeddingValue.length) {
  console.log("No embedding value", absoluteAyat, record.Text);
  continue;
}

const quranMetadata: TableQuranMetadata = {
  ayat,
  absoluteAyat,
  surah: Number(record.Surah),
  englishText: record.Text,
  arabicText: quranArabic,
};
console.log(
  "quranMetadata",
  JSON.stringify(
    { ...quranMetadata, embedding: embeddingValue.length },
    null,
    2,
  ),
);

// Await the upsert so failures surface before advancing the bar.
await upsertQuranAyat(quranMetadata, embeddingValue, pinecone);
progressBar.update(absoluteAyat);

Data ingestion into Pinecone once I had the DB set up was very simple.

Since I'm running this in a monorepo, it was pretty straightforward to set up some core getters/setters for the Pinecone DB. I have the entirety of the Quran loaded in my repo as a single .parquet file, along with a TypeScript object containing all the surah names and ayat counts. I was able to use the parquetjs library to step through the entirety of the Quran, call OpenAI to generate the embedding for the particular ayat, associate all the required metadata for that embedding, and upload the final data payload into Pinecone.

A single script took probably about 20 minutes to get through all 6,236 ayat!
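The script's outer loop looks roughly like this, assuming parquetjs's ParquetReader cursor API; the file path, record shape, and the locateAyat helper (which uses the surah ayat-count table mentioned above, shown here with only the first three surahs) are illustrative:

```typescript
// A few entries from the surah ayat-count table (the full table has
// all 114 surahs); these three counts are real Quran values.
const AYAT_COUNTS: Record<number, number> = { 1: 7, 2: 286, 3: 200 };

// Map an absolute ayat number (1-based) to { surah, ayat }.
function locateAyat(absolute: number): { surah: number; ayat: number } {
  let remaining = absolute;
  for (const [surah, count] of Object.entries(AYAT_COUNTS)) {
    if (remaining <= count) return { surah: Number(surah), ayat: remaining };
    remaining -= count;
  }
  throw new Error("absolute ayat out of range for this partial table");
}

type QuranRecord = { Surah: string; Text: string };

async function ingestQuran(path: string): Promise<number> {
  // parquetjs has no bundled types; loaded lazily so this module can
  // be imported without the dependency installed.
  // @ts-ignore
  const parquet = await import("parquetjs");
  const reader = await parquet.ParquetReader.openFile(path);
  const cursor = reader.getCursor();

  let absoluteAyat = 0;
  let record: QuranRecord | null;
  // cursor.next() resolves to null once every row has been read.
  while ((record = (await cursor.next()) as QuranRecord | null)) {
    absoluteAyat += 1;
    // ...generate the embedding and upsert into Pinecone, as in the
    // snippet above...
  }
  await reader.close();
  return absoluteAyat; // 6,236 for the full Quran
}
```

Since each iteration awaits an OpenAI call and a Pinecone upsert, roughly 20 minutes for 6,236 sequential round trips works out to about 200ms per ayat.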

Why Vercel's AI SDK? I thought we were talking OpenNext?

I could have built a chatbot from scratch, but why bother when 80% of the work is already done! I think it pays a lot of dividends to clear away the undifferentiated aspects of a project to get to the real value: building out the novel bits (RAG, in this case). Vercel's AI SDK works pretty well out of the box, is actively being developed, and is built using all the latest NextJS features and best practices.

There was a bit of work to strip out some of the edge runtime functions they had, since for my use case I'd actually want serverless calls running closer to my Pinecone and DynamoDB instances. A bit more work was needed to port any use of Vercel .env variables over to SST's Config construct (backed by AWS SSM Parameter Store). Lastly, their template makes heavy use of the Vercel KV data store, which I need to replace with DynamoDB.
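The Config port looks roughly like this, assuming SST v2's Config.Secret and NextjsSite constructs; the secret and stack names are illustrative:

```typescript
// In the SST stacks file: declare a secret and bind it to the site
// so route handlers can read it, instead of using Vercel .env vars.
import { Config, NextjsSite, StackContext } from "sst/constructs";

export function Web({ stack }: StackContext) {
  const OPENAI_API_KEY = new Config.Secret(stack, "OPENAI_API_KEY");
  const site = new NextjsSite(stack, "Site", {
    bind: [OPENAI_API_KEY],
  });
  stack.addOutputs({ url: site.url });
}

// In a route handler, instead of process.env.OPENAI_API_KEY:
//   import { Config } from "sst/node/config";
//   const apiKey = Config.OPENAI_API_KEY;
```

The secret value itself gets set out of band with the sst secrets CLI, so nothing sensitive lives in the repo.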

With a bit of setup, I was able to get to a very functional, responsive, streaming chatbot UI.

Next steps: fine-tune the system prompts, layer in DynamoDB, add more auth providers, hadith, and some paywall integration with Stripe!