Module 10: Deploying AI Apps
Module Goal
Get any AI app you build live on the internet, securely and cheaply. Cover the four free and cheap platforms beginners actually need.
Estimated Duration
3 to 4 hours.
Skills Learned
- Deploying Streamlit, Gradio, FastAPI, Next.js AI apps
- Managing API keys and secrets in production
- Setting up monitoring and basic logs
- Cost guards and rate limiting
- Choosing a platform for the right reasons
Real-world Importance
An undeployed project is invisible to recruiters, users, and your future self. Owning deployment is the difference between "I learned AI" and "I shipped AI".
Lessons in this module
- The 4 platforms beginners should know
- Streamlit Cloud and Hugging Face Spaces in depth
- Vercel for Next.js AI apps
- Railway for FastAPI backends
- Secrets, logging, monitoring, cost guards
Lesson 10.1: The 4 platforms beginners should know
Hook / Why This Matters
Pick once, ship five projects. This lesson is the platform decision tree.
Beginner Analogy
Different airports for different routes. You do not pick the same one for a domestic flight and a cargo shipment.
Concept Explanation
The 4-platform map:
| App type | Best free or low-cost platform | Why |
|---|---|---|
| Streamlit demo | Streamlit Cloud | one-click |
| Gradio demo | Hugging Face Spaces | native |
| Next.js (frontend + serverless) | Vercel | best DX |
| FastAPI backend | Railway or Fly.io | persistent server, env vars, simple |
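The map above can also be written as a small lookup, which is handy if you keep the decision next to your project templates. This is only an illustrative sketch: the name recommend_platform and the app-type keys are made up for this example and are not part of any platform's tooling.

```python
# Illustrative lookup of the 4-platform map (keys and function name are hypothetical).
PLATFORM_MAP = {
    "streamlit": "Streamlit Cloud",
    "gradio": "Hugging Face Spaces",
    "nextjs": "Vercel",
    "fastapi": "Railway or Fly.io",
}


def recommend_platform(app_type: str) -> str:
    """Return the suggested host for an app type, or raise for unknown types."""
    # Normalize "Next.js", " FastAPI " and similar spellings to the map's keys.
    key = app_type.strip().lower().replace(".", "").replace(" ", "")
    if key not in PLATFORM_MAP:
        raise ValueError(f"No recommendation for app type: {app_type!r}")
    return PLATFORM_MAP[key]


if __name__ == "__main__":
    for kind in ("Streamlit", "Gradio", "Next.js", "FastAPI"):
        print(f"{kind} -> {recommend_platform(kind)}")
```

Raising on unknown types is deliberate: an app that does not fit the four rows deserves a conscious decision rather than a silent default.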
Technical Breakdown
Beyond free tiers, when usage grows:
- Streamlit Cloud paid (more memory, custom domain)
- Hugging Face Spaces hardware tiers (GPU)
- Vercel Pro (more functions, edge)
- Railway Hobby (longer-running services)
Visual Learning Suggestion
A decision tree: "What kind of app?" -> Streamlit/Gradio/Next.js/FastAPI -> recommended platform.
Interactive Element
Sign up for all four. Install the two CLIs you will use most (vercel, railway) on your machine.
Hands-on Lab
Make a "hello" deploy on each platform. 30 minutes total. You will reuse these skills forever.
Mini Exercise
When would you choose Hugging Face Spaces over Streamlit Cloud?
Common Mistakes
- Trying to deploy a Next.js app on Streamlit Cloud
- Putting a heavy backend on Vercel (use Railway/Fly)
- Skipping the CLI install (faster iteration once you have it)
Debugging Tips
If you cannot decide, default to Streamlit Cloud for prototypes and Vercel + Railway for "real" apps.
Knowledge Check Questions
- Which platform for a FastAPI backend?
- Which for a Gradio demo?
- Which for Next.js?
Quiz Questions
- A Streamlit chatbot first deploy should go to: a) Streamlit Cloud b) AWS EC2 c) Vercel d) GCP VM Answer: a
Challenge Task
Make a "hello world" on all four platforms. Bookmark each dashboard.
Real-world Use Cases
- Portfolio demos
- Internal team tools
- Public AI utilities
Industry Insight
Junior AI engineers should master deploying 4 stacks (Streamlit, Gradio, Next.js, FastAPI) on these 4 platforms. That single matrix is a moat.
Interview Questions
- Where would you deploy a quick chatbot demo? Why?
- What is the tradeoff between Streamlit Cloud and HF Spaces?
- When would you graduate off these and onto AWS?
Summary
Four platforms. Four app types. Memorize the matrix.
Lesson 10.2: Streamlit Cloud and Hugging Face Spaces in depth
Hook / Why This Matters
The two free platforms beginners use most. Master the secrets, the dependency files, the constraints.
Beginner Analogy
These are the bicycles of AI deployment. Free, fast, slightly limited, perfect to learn on.
Concept Explanation
Streamlit Cloud:
- Connect GitHub
- Pick repo, branch, file
- Add secrets (OPENAI_API_KEY etc.) in the dashboard
- Auto-deploys on every push
- Free tier: 1 GB RAM, 1 GB storage, sleeps after inactivity
Hugging Face Spaces:
- Connect or create a Space
- Pick SDK: Streamlit, Gradio, Static, Docker
- Add secrets in the Settings tab
- Free tier: 2 vCPU, 16 GB RAM (CPU only); sleeps after inactivity; paid GPU hardware listed at $0.60+/hr, so check current pricing
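Once a secret is set in either dashboard, the app still has to read it safely. The sketch below assumes the secret reaches the process as an environment variable (Spaces exposes secrets this way; on Streamlit Cloud, st.secrets is the primary interface). The helper name require_secret is hypothetical.

```python
import os


def require_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Missing secret {name!r}. Add it in the platform's secrets settings, "
            "not in the code or the repository."
        )
    return value


# Typical use at app startup (never print or log the key itself):
# api_key = require_secret("OPENAI_API_KEY")
```

Failing at startup with a clear message tends to be easier to debug than an authentication error that surfaces on the first user request.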
Technical Breakdown
For both, your repo needs:
- requirements.txt (or pyproject.toml for Streamlit)
- An entry file (app.py or streamlit_app.py)
- A .gitignore excluding .env, chroma_db/, data/
For Spaces, also a README.md with frontmatter:
---
title: My PDF Chatbot
sdk: streamlit
sdk_version: 1.40.0
app_file: app.py
---
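The repo requirements above are easy to forget right before a deploy, so a tiny pre-flight check can help. This is a sketch under assumptions: the function name check_repo is hypothetical, the .gitignore test is a plain line match rather than real gitignore pattern matching, and the frontmatter test only looks for a leading --- line.

```python
from pathlib import Path


def check_repo(root: str, entry_file: str = "app.py") -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    base = Path(root)
    problems = []

    # A dependency file is needed so the platform can install packages.
    if not (base / "requirements.txt").exists() and not (base / "pyproject.toml").exists():
        problems.append("no requirements.txt or pyproject.toml")

    if not (base / entry_file).exists():
        problems.append(f"entry file {entry_file} not found")

    # Secrets and local data folders should never reach the repository.
    gitignore = base / ".gitignore"
    lines = gitignore.read_text().splitlines() if gitignore.exists() else []
    ignored = {line.strip() for line in lines}
    for pattern in (".env", "chroma_db/", "data/"):
        if pattern not in ignored:
            problems.append(f".gitignore does not exclude {pattern}")

    # Only relevant for Spaces: the README should open with a frontmatter block.
    readme = base / "README.md"
    if readme.exists() and not readme.read_text().startswith("---"):
        problems.append("README.md has no frontmatter block (needed for Spaces)")

    return problems
```

Running check_repo(".") before pushing and printing the returned list gives a quick checklist of what is still missing.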
Visual Learning Suggestion
Side-by-side screenshots of both deployment dashboards with annotated secret-management areas.
Interactive Element
Deploy your Module 6 chatbot to Streamlit Cloud. Deploy your Module 9 PDF chatbot to Hugging Face Spaces.
Hands-on Lab
Both deploys, both URLs in your README.
Mini Exercise
When is Hugging Face Spaces preferable for a heavy model?
Common Mistakes
- Hardcoding secrets (free tier scanners catch them quickly)
- Missing requirements.txt
- Forgetting that ChromaDB persistence is ephemeral on free tiers (use external storage or accept it)
Debugging Tips
If your free-tier deploy keeps "sleeping" mid-demo, switch to a different platform or pay for an always-on tier when demoing to recruiters.
Knowledge Check Questions
- Where do you set secrets on Streamlit Cloud?
- Why use HF Spaces over Streamlit for some models?
- What is the role of the README frontmatter on Spaces?
Quiz Questions
- To deploy a Gradio app for free with GPU access on demand, choose: a) Streamlit Cloud b) Hugging Face Spaces c) Vercel d) Railway Answer: b
Challenge Task
Add an "Embeddings on device" mode using sentence-transformers and deploy it to HF Spaces.
Real-world Use Cases
- AI demos
- Open-source community tools
- Hobby projects
Industry Insight
Many open-source AI projects in 2026 maintain both a Streamlit Cloud and an HF Spaces deploy for redundancy and reach.
Interview Questions
- Walk me through deploying a Streamlit app.
- How do you handle secrets on these platforms?
- What are their limitations?
Summary
Both are free, GitHub-integrated, beginner-friendly. Master both.
Lesson 10.3: Vercel for Next.js AI apps
Hook / Why This Matters
When you graduate from Streamlit to a real frontend, Vercel is where Next.js apps go.
Beginner Analogy
If Streamlit is a bicycle, Next.js + Vercel is a car. More setup, far more power, the bar for production work.
Concept Explanation
Vercel hosts Next.js apps as serverless functions + static assets, deploys on git push, manages secrets, and offers analytics.
For AI calls, you can use:
- Next.js Route Handlers / API routes for backend calls
- Vercel AI SDK for streaming chat UIs
- Edge runtime for low-latency streaming
Technical Breakdown
A minimal Next.js chat endpoint:
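// app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";

export async function POST(req: Request) {
  // The chat UI posts the running conversation as { messages: [...] }.
  const { messages } = await req.json();
  // Start streaming tokens from the model as they are generated.
  const result = streamText({ model: openai("gpt-4o-mini"), messages });
  // Helper name as in AI SDK 4; check the docs for your installed version.
  return result.toDataStreamResponse();
}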
Deploy with the vercel deploy command. Set OPENAI_API_KEY in the Vercel dashboard, under the project's environment variable settings.
Visual Learning Suggestion
A typical Next.js + Vercel deploy flow diagram: git push -> Vercel build -> serverless deploy -> public URL.
Interactive Element
Clone a Vercel AI Chatbot starter (vercel.com/templates). Deploy. Customize the system prompt.
Hands-on Lab
Deploy a Next.js AI chat app. Add a system prompt. Push to GitHub. Share the URL.
Mini Exercise
When does serverless become a poor fit for your AI app?
Common Mistakes
- Heavy long-running jobs in serverless (timeout issues)
- Overlooking the edge runtime for low-latency streaming (the default Node.js runtime streams too, so this is an optimization, not a requirement)
- Storing large vector DBs in /tmp (use external services)
Debugging Tips
If functions time out, move long jobs to Railway or a queue. Vercel functions are designed for short tasks.
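Moving a long job off a short-lived function usually means the request only enqueues work and returns a job id, while a separate long-running worker does the slow part. The sketch below shows that shape with the standard library only. It is a toy under stated assumptions: the in-process queue.Queue and jobs dict stand in for a real queue and datastore, and the names submit_job, get_status, and slow_task are hypothetical.

```python
import queue
import threading
import time
import uuid

jobs = {}                   # job_id -> {"status": ..., "result": ...}
work_queue = queue.Queue()  # stand-in for a real queue service


def slow_task(text: str) -> str:
    """Stand-in for a long job such as ingesting and embedding a PDF."""
    time.sleep(0.1)
    return text.upper()


def submit_job(text: str) -> str:
    """What the short-lived request handler does: enqueue and return at once."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "queued", "result": None}
    work_queue.put((job_id, text))
    return job_id


def get_status(job_id: str) -> dict:
    """What a polling endpoint would return to the frontend."""
    return jobs.get(job_id, {"status": "unknown", "result": None})


def worker() -> None:
    """What the long-running service does: pull jobs and process them."""
    while True:
        job_id, text = work_queue.get()
        jobs[job_id]["status"] = "running"
        try:
            jobs[job_id].update(status="done", result=slow_task(text))
        except Exception as exc:
            jobs[job_id].update(status="error", result=str(exc))
        finally:
            work_queue.task_done()


# One background worker; a real deployment would run this as its own service.
threading.Thread(target=worker, daemon=True).start()
```

The frontend calls the equivalent of submit_job, then polls get_status until the status is done or error, so no single request has to outlive the function timeout.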
Knowledge Check Questions
- What is the Vercel AI SDK?
- Why use edge runtime for chat?
- Why move heavy jobs off Vercel?
Quiz Questions
- The most natural deploy target for a Next.js AI chatbot is: a) Streamlit Cloud b) Hugging Face Spaces c) Vercel d) Railway Answer: c
Challenge Task
Add a "models" dropdown and route to different providers via a single API route.
Real-world Use Cases
- Production-facing AI features
- Public marketing demos
- SaaS apps
Industry Insight
Vercel + Next.js is now the de facto stack for shipped, polished AI products built by small teams.
Interview Questions
- Why Next.js for AI front-ends?
- What is the Vercel AI SDK?
- How do you handle long-running AI jobs in Vercel?
Summary
Vercel + Next.js is the bar for production. Learn it once and your AI portfolio looks senior.
Lesson 10.4: Railway for FastAPI backends
Hook / Why This Matters
When your AI app needs a long-running backend (RAG ingest, custom embedding pipelines), Railway is one of the cheapest and simplest ways to host it.
Beginner Analogy
Vercel is a courier for parcels. Railway is a warehouse where the worker actually lives.
Concept Explanation
Railway runs containerized services with a generous free trial and pay-as-you-grow pricing. Connect a GitHub repo, set env vars, deploy. It runs persistently (no cold starts unless you want them).
Use it for:
- FastAPI backends
- Persistent vector DB hosts
- Background workers (Celery, RQ)
- Scheduled ingest jobs
Technical Breakdown
A minimal FastAPI:
# main.py
from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
# OpenAI() reads OPENAI_API_KEY from the environment; set it in Railway's variables.
client = OpenAI()


class Q(BaseModel):
    question: str


@app.post("/chat")
def chat(q: Q):
    # Forward the question as a single user message and return the reply text.
    r = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": q.question}],
    )
    return {"answer": r.choices[0].message.content}
requirements.txt:
fastapi
uvicorn
openai
Procfile (Railway):
web: uvicorn main:app --host 0.0.0.0 --port $PORT
Deploy: connect GitHub repo to Railway, set OPENAI_API_KEY, click deploy.
Visual Learning Suggestion
Architecture: Next.js on Vercel -> FastAPI on Railway -> OpenAI/Anthropic/Gemini.
Interactive Element
Deploy the FastAPI above. Hit /chat from your terminal with curl.
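If you prefer Python to curl for the smoke test, the same request can be sketched with the standard library. The URL in the example comment is a placeholder, not a real deployment; substitute the public URL Railway assigns to your service. The helper names build_chat_request and ask are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url: str, question: str) -> urllib.request.Request:
    """Build a POST /chat request whose JSON body has a single 'question' field."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, question: str, timeout: float = 30.0) -> str:
    """Send the request and return the 'answer' field from the JSON response."""
    request = build_chat_request(base_url, question)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["answer"]


# Example (placeholder URL):
# print(ask("https://your-service.example.com", "What is RAG?"))
```

Keeping request construction separate from sending makes it possible to inspect the payload without touching the network.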
Hands-on Lab
Build a public /summarize endpoint that takes text and returns a 3-sentence summary. Deploy. Test.
Mini Exercise
When does FastAPI on Railway beat Next.js API routes on Vercel?
Common Mistakes
- Forgetting the $PORT env var
- Not pinning Python and library versions
- Leaving spend limits open
Debugging Tips
If deploys fail, check the build log for missing system libs. Sometimes you need apt packages declared in a nixpacks.toml.
Knowledge Check Questions
- When would you use Railway over Vercel?
- What is uvicorn?
- Why pin versions?
Quiz Questions
- A long-running Python job is best deployed to: a) Vercel b) Streamlit Cloud c) Railway or Fly.io d) Hugging Face Spaces free CPU Answer: c
Challenge Task
Add a background ingest endpoint (/ingest) that downloads a PDF from a URL, chunks it, embeds the chunks, and stores them in a Postgres + pgvector database hosted on Railway.
Real-world Use Cases
- Production FastAPI backends
- Scheduled crawlers
- Worker services for AI pipelines
Industry Insight
In 2026, the dominant indie stack is Next.js on Vercel + FastAPI on Railway + Supabase for DB + ChromaDB or pgvector for retrieval. Master this and you can ship anything.
Interview Questions
- Compare Vercel and Railway.
- How do you deploy FastAPI?
- How do you split a Next.js frontend from a Python backend?
Summary
Railway is your "real backend" host. Cheap, persistent, container-friendly. Pair it with Vercel for full-stack AI apps.
Lesson 10.5: Secrets, logging, monitoring, cost guards
Hook / Why This Matters
Production AI without observability is gambling. This lesson is the safety net.
Beginner Analogy
A pilot who never reads the instruments will eventually crash. Logs and dashboards are the instruments.
Concept Explanation
Five non-negotiable production habits:
- Secrets in env vars or secret manager. Never in code.
- Structured logs for every LLM call: timestamp, user_id, prompt_hash, model, tokens, latency, status.
- Cost dashboard: provider dashboard + your own.
- Hard spending caps in provider dashboards.
- Rate limit per user/IP to prevent abuse.
For Python, log with the standard logging module and ship the output to Logtail, Datadog, or Grafana Cloud (free tiers exist).
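Log shippers generally work best with one JSON object per line. A minimal sketch of that with the standard logging module follows; the class name JsonFormatter and the helper get_ai_logger are hypothetical, and the field names ts, level, and logger are assumptions rather than a required schema.

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "level": record.levelname,
            "logger": record.name,
        }
        # If the caller logged a dict, merge its fields; otherwise keep the text.
        if isinstance(record.msg, dict):
            payload.update(record.msg)
        else:
            payload["message"] = record.getMessage()
        return json.dumps(payload, default=str)


def get_ai_logger() -> logging.Logger:
    """Return the 'ai' logger with a JSON handler attached exactly once."""
    logger = logging.getLogger("ai")
    if not logger.handlers:
        handler = logging.StreamHandler()  # writes to stderr
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

With this in place, a call such as get_ai_logger().info({"user_id": "u1", "tokens": 42}) emits a single JSON line that includes the timestamp and level alongside the logged fields.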
Technical Breakdown
Wrap LLM calls:
import hashlib
import json
import logging
import time

from openai import OpenAI

log = logging.getLogger("ai")
client = OpenAI()  # reads OPENAI_API_KEY from the environment


def call_llm(messages, user_id):
    started = time.time()
    # Hash the prompt so calls can be correlated without storing user text.
    h = hashlib.sha256(json.dumps(messages, sort_keys=True).encode()).hexdigest()[:10]
    try:
        r = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
        # Logging a dict keeps the fields structured for a JSON log handler.
        log.info({
            "user_id": user_id, "prompt_hash": h, "model": "gpt-4o-mini",
            "tokens": r.usage.total_tokens, "latency_ms": int((time.time() - started) * 1000),
            "status": "ok",
        })
        return r
    except Exception as e:
        log.error({"user_id": user_id, "prompt_hash": h, "status": "error", "err": str(e)})
        raise
Add a cost guard with a simple per-user counter in Redis that caps each user's daily token spend.
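The per-user daily cap can be sketched without Redis to show the logic. In the example below an in-memory dict stands in for Redis, so counts are lost on restart and are not shared between processes; the class name DailyTokenBudget and its methods are hypothetical.

```python
import datetime as dt
from collections import defaultdict
from typing import Optional


class DailyTokenBudget:
    """Track tokens per user per UTC day and refuse calls once over the cap."""

    def __init__(self, daily_limit: int):
        self.daily_limit = daily_limit
        self._used = defaultdict(int)  # (user_id, date) -> tokens used that day

    def _key(self, user_id: str, today: Optional[dt.date] = None):
        # A new date produces a new key, which is what resets the budget daily.
        return (user_id, today or dt.datetime.now(dt.timezone.utc).date())

    def remaining(self, user_id: str, today: Optional[dt.date] = None) -> int:
        return max(0, self.daily_limit - self._used[self._key(user_id, today)])

    def allow(self, user_id: str, today: Optional[dt.date] = None) -> bool:
        """Check this before calling the LLM."""
        return self.remaining(user_id, today) > 0

    def record(self, user_id: str, tokens: int, today: Optional[dt.date] = None) -> None:
        """Call this after the LLM responds, with the tokens the call used."""
        self._used[self._key(user_id, today)] += tokens
```

With Redis, the same idea is commonly an INCRBY on a key that combines the user id and the date, with an expiry so old counters clean themselves up. The today parameter exists only so the behavior can be tested with fixed dates.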
Visual Learning Suggestion
A "production layer cake" diagram: app -> logger -> dashboard -> alerting. Each layer labeled.
Interactive Element
Set a $5 monthly cap on OpenAI right now. Take a screenshot for accountability.
Hands-on Lab
Add structured logging to your PDF chatbot. Print logs locally as JSON.
Mini Exercise
Why hash the prompt instead of logging the raw prompt?
Common Mistakes
- Logging full prompts (privacy and storage cost)
- No spend caps (free trial gets drained in hours)
- No per-user rate limits (one user can rack up your bill)
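A per-user rate limit does not need much code. The sketch below is a sliding-window limiter kept in memory, which is an assumption that only holds for a single process; a multi-instance deploy would need shared storage. The class name SlidingWindowLimiter is hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_calls per window_s seconds for each key (user or IP)."""

    def __init__(self, max_calls: int, window_s: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock                # injectable so it can be tested
        self._calls = defaultdict(deque)  # key -> timestamps of recent calls

    def allow(self, key: str) -> bool:
        now = self.clock()
        recent = self._calls[key]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False  # over the limit: reject without recording the call
        recent.append(now)
        return True
```

A request handler would call allow(user_id) first and return an HTTP 429 response when it is False, before any tokens are spent.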
Debugging Tips
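If a feature seems "stuck", check the logs for repeated errors first. For transient failures (timeouts, rate limits), add a bounded retry with backoff around the failing call; for anything else, fix the underlying cause rather than retrying.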
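A retry is safest when it is bounded and limited to transient errors. The sketch below shows one way to do that; the function name call_with_retry is hypothetical, and the default exception types are built-in stand-ins, since the classes worth retrying depend on the client library you use.

```python
import random
import time


def call_with_retry(fn, *, attempts=3, base_delay=0.5,
                    retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call fn(); on a listed transient error, wait and retry, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # Exponential backoff (0.5s, 1s, 2s, ... by default) plus random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

Errors outside retry_on, such as a ValueError caused by a bug, propagate immediately, so the retry cannot hide real defects or multiply the cost of a request that will never succeed.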
Knowledge Check Questions
- Why hash prompts rather than store them?
- Why cap spend per user?
- What tool would you use for log aggregation on a free tier?
Quiz Questions
- The most overlooked production safety control is: a) Hard spending caps b) Bigger model c) More retries d) JSON mode Answer: a
Challenge Task
Build a tiny "/health" endpoint that reports last hour's: requests, error count, average latency, total tokens.
Real-world Use Cases
- Production AI apps
- SaaS demos at scale
- Internal AI tools
Industry Insight
The single biggest "junior to mid-level" promotion driver in 2026 AI roles is "you set up our cost dashboard and prevented the next outage". That is observability.
Interview Questions
- What logs do you ship for LLM calls?
- How do you prevent runaway cost?
- How do you debug an LLM app in production?
Summary
Secrets, logs, dashboards, caps, rate limits. Five habits. Always.
Module 10 Recap
You can deploy any of the four common AI app types to the right platform, with secrets, logs, and cost guards in place. You ship production-ready code now.
SEO Notes
- Primary keyword: "deploy AI app for beginners"
- Long-tail targets: "Streamlit Cloud deploy", "Hugging Face Spaces tutorial", "Vercel AI chatbot", "FastAPI Railway"
- Internal links: Module 6 and 9 (apps to deploy), Module 11 (safety overlay)