Tech Stack and Tools
The 2026 tech stack used throughout the AI and LLMs for Beginners course: APIs, libraries, vector DBs, deployment platforms, free tiers, and cost-aware setup.
The full, opinionated 2026 stack for this course. Use it as your "what to install and why" reference.
TL;DR
Python 3.11+ | Streamlit | OpenAI / Gemini / Anthropic APIs
ChromaDB or pgvector | tiktoken | python-dotenv
LangChain (light) | Hugging Face (optional) | FastAPI (light intro)
GitHub | VS Code or Cursor | Colab or Replit for fast experiments
Streamlit Cloud | Hugging Face Spaces | Vercel | Railway
Languages and runtimes
- Python 3.11+ is the default. Easy install via python.org or pyenv.
- Node 20+ if you take the Next.js stretch. Install via nvm.
Environments
- Local: VS Code (free) or Cursor (AI-first IDE).
- Cloud notebooks: Google Colab (free with limits), Kaggle Notebooks, Replit.
- Containers: Docker Desktop or Podman, optional.
Core Python libraries
| Library | Purpose | Install |
|---|---|---|
| openai | OpenAI API client | pip install openai |
| anthropic | Claude API client | pip install anthropic |
| google-genai | Gemini API client | pip install google-genai |
| python-dotenv | load .env keys | pip install python-dotenv |
| tiktoken | count tokens | pip install tiktoken |
| streamlit | UI framework | pip install streamlit |
| chromadb | vector DB | pip install chromadb |
| pypdf | PDF parsing | pip install pypdf |
| pymupdf | better PDF parsing | pip install pymupdf |
| numpy | vector math | pip install numpy |
| tenacity | retries with backoff | pip install tenacity |
| pydantic | data validation | pip install pydantic |
| fastapi | backend API (light intro) | pip install fastapi uvicorn |
| langchain / langchain_openai | optional orchestration | pip install langchain langchain-openai |
| sentence-transformers | self-hosted embeddings | pip install sentence-transformers |
| rank-bm25 | keyword search | pip install rank-bm25 |
LLM providers and free tiers
Free-tier numbers change frequently. Always check the provider dashboard for the current quota; the snapshot below reflects publicly documented limits at time of writing.
| Provider | Free tier signal | Card required | When to pick |
|---|---|---|---|
| Google AI Studio (Gemini) | Free tier on Flash and Flash-Lite models, roughly 5 to 15 requests/min with a 250K-token/min ceiling shared across models | No | Default free choice for this course |
| Groq | ~30 requests/min, ~6K tokens/min, ~14.4K requests/day on most hosted models (Llama 3.x, Gemma) | No | Fastest free inference, open weights |
| Hugging Face Inference Providers | Free credit + serverless inference on many open models | No | Open-model experiments |
| OpenAI | Trial credit on new accounts | Yes | Tutorials and SDK familiarity |
| Anthropic Claude | Trial credit on new accounts | Yes | Long-context document analysis |
| Together AI | $5 minimum top-up to access platform; startup credits available | Yes | Open-source model hosting at scale |
| Cohere | Generous free tier on multilingual embeddings | Yes | Multilingual RAG |
Tip: at the start of every project, set a hard spending limit in any provider dashboard that takes a card. For this course, the Google AI Studio + Groq combo gets most learners to the capstone without spending anything.
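Staying under a requests-per-minute quota like the ones in the table is easier with a small client-side throttle. The sketch below is one possible approach, not any provider's official client. `MinIntervalThrottle` and its parameters are hypothetical names chosen for illustration, and it simply spaces calls evenly (no bursts), which is stricter than most real quotas.

```python
import time


class MinIntervalThrottle:
    """Space calls at least 60 / requests_per_minute seconds apart."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        self.min_interval = 60.0 / requests_per_minute
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = max(self._clock(), self._next_allowed)
        self._next_allowed = now + self.min_interval
        return delay
```

Typical use: create one `MinIntervalThrottle(30)` for a ~30 requests/min quota and call `throttle.wait()` right before each API request. Confirm the real limit on the provider dashboard first, since the numbers above are only a snapshot.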
Embeddings
| Model | Provider | Free path | Strengths |
|---|---|---|---|
| text-embedding-3-small | OpenAI | Paid only | Cheap, balanced, the default |
| text-embedding-3-large | OpenAI | Paid only | Higher quality, larger dim |
| voyage-3 | Voyage AI | Free tier, generous | Often top of MTEB |
| embed-multilingual-v3 | Cohere | Free tier | Multilingual strength |
| bge-base-en-v1.5 | BAAI (open) | Free, self-host via sentence-transformers | Strong English baseline |
| nomic-embed-text-v1.5 | Nomic (open) | Free, self-host | Long-context embeddings |
| mxbai-embed-large | Mixedbread (open) | Free, self-host | Solid quality, small model |
Going fully free: install sentence-transformers, load BAAI/bge-base-en-v1.5, and call model.encode(...). It runs on CPU on any laptop. No API key, no cost.
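Here is a minimal sketch of that free embedding path, plus the similarity math a vector DB does for you. It assumes `sentence-transformers` is installed and that the model weights can be downloaded on first use. The helper names `embed`, `cosine_similarity`, and `top_k` are illustrative, not part of any library, and `k=5` just mirrors this course's default top-K.

```python
import math


def embed(texts, model_name="BAAI/bge-base-en-v1.5"):
    """Return one embedding (a list of floats) per input text.

    The import lives inside the function so the pure-Python helpers below
    still work before sentence-transformers is installed.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)  # downloads weights on first use
    return model.encode(texts, normalize_embeddings=True).tolist()


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=5):
    """Return the indices of the k most similar document vectors, best first."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]
```

In practice you would call `embed([...])` once on your chunks, once on the question, and pass the results to `top_k`. The brute-force loop is fine for a few thousand chunks; beyond that, a vector DB is the better tool.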
Vector databases
| DB | Free path | Best for |
|---|---|---|
| ChromaDB | Free, embedded or self-host (Apache 2.0) | Prototypes, this course's PDF chatbot |
| FAISS | Free, in-process library | Research, fast in-memory similarity |
| pgvector | Free, Postgres extension; works on Supabase free tier (500 MB) | Teams already on Postgres |
| Qdrant | Free, self-host; managed cloud has a free tier | Production, hybrid search |
| Weaviate | Free, self-host; managed cloud has a free tier | Production, native hybrid |
| Milvus | Free, self-host | Massive scale |
| Pinecone | Paid (limited free starter exists, often gated) | Easiest managed scaling |
Run-it-yourself stack (no API key)
For learners who want to skip API providers entirely:
- Ollama (ollama.com) lets you download and run Llama 3.x, Mistral, Gemma 3, Phi, Qwen, and DeepSeek-R1 locally. Works on macOS, Linux, and Windows. 7B to 8B models run comfortably on 16 GB of RAM. Zero per-token cost, zero data leaving your machine, fully offline after the initial download.
- LM Studio is a GUI alternative to Ollama, helpful if you prefer a click-through interface to evaluate models.
- llama.cpp is the lower-level engine both projects build on; advanced learners use it directly for quantization control.
A typical free-stack project: Ollama (Llama 3.1 8B) plus sentence-transformers for embeddings plus ChromaDB for retrieval plus Streamlit for the UI. Every component is open source, runs on a laptop, and costs nothing per query.
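To make the local route concrete, here is a sketch of calling a running Ollama server from Python using only the standard library. It assumes Ollama is serving on its default local port (11434) and that you have already pulled the model. The `llama3.1:8b` tag and the helper names are illustrative choices, so swap in whatever model you actually downloaded.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_generate_request(prompt, model="llama3.1:8b"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.1:8b", timeout=120):
    """Send the prompt to the local Ollama server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

Calling `generate("Explain embeddings in one sentence.")` raises a connection error if the server is not running, which is a quick way to check your setup.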
Deployment platforms
| Platform | App types | Free tier reality |
|---|---|---|
| Streamlit Cloud | Streamlit | Free, sleeps when idle, ~1 GB RAM |
| Hugging Face Spaces | Streamlit, Gradio, Docker, Static | Free CPU; paid GPU on demand |
| Vercel | Next.js, Edge Functions | Generous free tier on the Hobby plan |
| Railway | FastAPI, workers, DBs | $5 free trial credit on signup, then usage-based |
| Fly.io | Containers, long-running | Free quota on small machines, pay beyond |
| Render | Web services, workers | Free tier exists but web services sleep aggressively |
| Cloudflare Workers / Pages | Edge functions, static sites | 100K requests/day free |
Observability and monitoring
| Tool | Purpose |
|---|---|
| logging (stdlib) | Local logs |
| Logtail / Better Stack | Hosted log aggregation (free tier) |
| Helicone | LLM-specific observability proxy |
| LangSmith | LangChain-native tracing |
| Phoenix (Arize) | Open-source LLM tracing |
Eval frameworks
| Tool | Strength |
|---|---|
| Promptfoo | Side-by-side prompt eval |
| Ragas | RAG-specific metrics |
| TruLens | Production LLM evals |
| OpenAI Evals | Open-source eval harness |
| LangSmith Evals | LangChain-native |
Setup commands
Create a project from zero:
mkdir my-ai-app && cd my-ai-app
python -m venv .venv
source .venv/bin/activate # Mac/Linux
.venv\Scripts\activate # Windows
pip install streamlit openai python-dotenv tiktoken chromadb pypdf tenacity pydantic
echo ".env" > .gitignore
echo ".venv/" >> .gitignore
echo "chroma_db/" >> .gitignore
echo "data/" >> .gitignore
git init
.env:
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=sk-ant-...
Verify:
python -c "from dotenv import load_dotenv; load_dotenv(); from openai import OpenAI; print(OpenAI().models.list().data[0].id)"
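If the check fails, the usual cause is a key that never made it into the environment. The sketch below is a slightly more informative check you could save as a small script. It assumes the three key names from the .env example above; `missing_keys` and `EXPECTED_KEYS` are hypothetical helpers, not part of python-dotenv.

```python
import os

try:
    from dotenv import load_dotenv  # provided by python-dotenv
except ImportError:  # keep the check usable even before installing it
    load_dotenv = None

EXPECTED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(env=None, expected=EXPECTED_KEYS):
    """Return the expected key names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in expected if not env.get(name, "").strip()]


if __name__ == "__main__":
    if load_dotenv is not None:
        load_dotenv()  # copies values from .env into os.environ
    absent = missing_keys()
    print("All keys set" if not absent else f"Missing: {', '.join(absent)}")
```

You only need the keys for the providers you actually use, so trim `EXPECTED_KEYS` to match your route.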
Cost-aware defaults
- Default model for prototypes: gpt-4o-mini (or gemini-2.5-flash on the free tier)
- Default embedding model: text-embedding-3-small (or BAAI/bge-base-en-v1.5 self-hosted)
- Default chunk size: 500 tokens with an 80-token overlap
- Default top-K: 5
- Default max_tokens for chat: 600 to 800
- Default temperature: 0.4 for factual, 0.8 for creative
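The chunk-size default is easiest to understand as a sliding window. The sketch below shows one simple way to implement it, assuming the text is already tokenized; any list works, whether token IDs from tiktoken or plain words. `chunk_tokens` is an illustrative helper, not a library function.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=80):
    """Split a token list into windows of `chunk_size` that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks
```

With the defaults, a 1,000-token document yields three chunks of 500, 500, and 160 tokens, and each chunk starts with the last 80 tokens of the previous one. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.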
The fully free path
You can finish this entire course without spending a rupee. Three valid free routes, all interchangeable per module:
Route A: Free hosted APIs (no credit card)
- LLM: Google AI Studio (Gemini Flash family). Free tier on ai.google.dev. Recently tightened to about 5 to 15 requests per minute with a shared 250K tokens-per-minute cap, which is fine for learning. No credit card required.
- LLM (open weights, fastest): Groq at console.groq.com. Around 30 requests/min, 6K tokens/min, 14K requests/day on hosted Llama and Gemma models. No credit card required.
- Embeddings: Cohere's free tier on embed-multilingual-v3 (1,000 requests/min on the trial key), or self-host BAAI/bge-base-en-v1.5 via sentence-transformers.
- Vector DB: ChromaDB locally. Zero cost, persistent on disk.
- Deploy: Streamlit Cloud (auto-deploys from GitHub) or Hugging Face Spaces.
Route B: 100% local (no internet after install)
- LLM: Ollama at ollama.com. Run Llama 3.1 8B, Mistral 7B, Gemma 3, Phi-4, Qwen, or DeepSeek-R1 on your laptop. 16 GB RAM handles 7 to 8B models comfortably.
- Embeddings: sentence-transformers with BAAI/bge-base-en-v1.5, runs on CPU.
- Vector DB: ChromaDB local mode.
- Deploy: localhost during development, then containerize via Docker if needed.
Route C: Hybrid (free hosted LLM + local everything else)
- LLM: Groq for speed, Gemini for long context, Hugging Face Inference Providers for variety. All free.
- Embeddings: Self-host bge-base-en-v1.5.
- Vector DB: ChromaDB local, or pgvector on Supabase's free tier (500 MB) if you want it cloud-hosted.
- Deploy: Streamlit Cloud or Vercel free tier.
The PDF chatbot capstone (Module 9) runs end-to-end on any of these three routes. Free-tier rate limits change frequently, so always confirm the live numbers on the provider dashboard before relying on them in production.
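Because the routes are interchangeable and free quotas can run out mid-session, it can help to wrap providers behind one function that falls back to the next option. This is a hedged sketch of that idea rather than a pattern the course prescribes: `ask_with_fallback` and `AllProvidersFailed` are hypothetical names, and each `call` is any function you write that takes a prompt and returns text.

```python
class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain raised an exception."""


def ask_with_fallback(prompt, providers):
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # in real code, catch the SDK's specific error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

For example, `ask_with_fallback(question, [("groq", ask_groq), ("gemini", ask_gemini), ("ollama", ask_ollama)])` would try the fastest hosted option first and end on the local model, where the three `ask_*` functions are wrappers you define yourself.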
What we deliberately avoid in this course
- Heavy ML training (no PyTorch deep dives at beginner level)
- Custom transformer implementations
- Manual fine-tuning runs (better in the intermediate course)
- AWS / GCP / Azure deep provisioning (overkill for beginners)
- Kubernetes (not for first apps)
Optional power-ups
- Cursor and Windsurf: AI-pair-programming IDEs with free tiers.
- Continue.dev: open-source coding assistant that connects to any model (Ollama, OpenAI, Groq).
- Aider: free CLI pair programmer that works with Ollama, Gemini, Groq, OpenAI.
- Supabase: Postgres + auth + storage + pgvector on a free tier (500 MB DB, 1 GB storage).
- n8n (self-hosted): free workflow automation for AI agents.