Module 14 of 16 | 7 min read

Tech Stack and Tools

The 2026 tech stack used throughout the AI and LLMs for Beginners course: APIs, libraries, vector DBs, deployment platforms, free tiers, and cost-aware setup.

The full, opinionated 2026 stack for this course. Use it as your "what to install and why" reference.

TL;DR

Python 3.11+ | Streamlit | OpenAI / Gemini / Anthropic APIs
ChromaDB or pgvector | tiktoken | python-dotenv
LangChain (light) | Hugging Face (optional) | FastAPI (light intro)
GitHub | VS Code or Cursor | Colab or Replit for fast experiments
Streamlit Cloud | Hugging Face Spaces | Vercel | Railway

Languages and runtimes

  • Python 3.11+ is the default. Easy install via python.org or pyenv.
  • Node 20+ if you take the Next.js stretch. Install via nvm.

Environments

  • Local: VS Code (free) or Cursor (AI-first IDE).
  • Cloud notebooks: Google Colab (free with limits), Kaggle Notebooks, Replit.
  • Containers: Docker Desktop or Podman, optional.

Core Python libraries

Library | Purpose | Install
openai | OpenAI API client | pip install openai
anthropic | Claude API client | pip install anthropic
google-genai | Gemini API client | pip install google-genai
python-dotenv | load .env keys | pip install python-dotenv
tiktoken | count tokens | pip install tiktoken
streamlit | UI framework | pip install streamlit
chromadb | vector DB | pip install chromadb
pypdf | PDF parsing | pip install pypdf
pymupdf | better PDF parsing | pip install pymupdf
numpy | vector math | pip install numpy
tenacity | retries with backoff | pip install tenacity
pydantic | data validation | pip install pydantic
fastapi | backend API (light intro) | pip install fastapi uvicorn
langchain / langchain_openai | optional orchestration | pip install langchain langchain-openai
sentence-transformers | self-hosted embeddings | pip install sentence-transformers
rank-bm25 | keyword search | pip install rank-bm25
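
The tenacity entry covers retries with backoff. To show the idea without assuming that library's interface, here is a minimal stdlib-only sketch. The names retry_with_backoff, base_delay, and the injectable sleep are hypothetical, chosen for illustration; they are not tenacity's API.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (plus jitter) and retry.

    Re-raises the last exception once max_attempts calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Sketch only: real code should catch just the retryable errors
            # (rate limits, timeouts), not every exception.
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            # Exponential backoff with a little jitter so clients do not
            # all retry at the same instant.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

Wrap any provider call in a zero-argument function or lambda and pass it in. The sleep parameter exists only so the waiting can be replaced in tests.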

LLM providers and free tiers

Free-tier numbers change frequently. Always check the provider dashboard for the current quota; the snapshot below reflects publicly documented limits at time of writing.

Provider | Free tier signal | Card required | When to pick
Google AI Studio (Gemini) | Free tier on Flash and Flash-Lite models, roughly 5 to 15 requests/min with a 250K-token/min ceiling shared across models | No | Default free choice for this course
Groq | ~30 requests/min, ~6K tokens/min, ~14.4K requests/day on most hosted models (Llama 3.x, Gemma) | No | Fastest free inference, open weights
Hugging Face Inference Providers | Free credit + serverless inference on many open models | No | Open-model experiments
OpenAI | Trial credit on new accounts | Yes | Tutorials and SDK familiarity
Anthropic Claude | Trial credit on new accounts | Yes | Long-context document analysis
Together AI | $5 minimum top-up to access platform; startup credits available | Yes | Open-source model hosting at scale
Cohere | Generous free tier on multilingual embeddings | Yes | Multilingual RAG

Tip: at the start of every project, set a hard spending limit in any provider dashboard that takes a card. For this course, the Google AI Studio + Groq combo gets most learners to the capstone without spending anything.
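
Free tiers are capped in requests per minute, so it can help to throttle on the client side instead of waiting for the provider to reject calls. The sketch below is one possible rolling-window limiter using only the standard library. The class name RateLimiter and its parameters are hypothetical, and it limits request count only, not tokens per minute.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any rolling window of `period` seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            # Window is full: sleep until the oldest call expires.
            self._sleep(self.period - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
```

For a quota of about 30 requests per minute, create limiter = RateLimiter(max_calls=30) once and call limiter.wait() before each API request. Picking a value slightly below the published quota leaves some headroom.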

Embeddings

Model | Provider | Free path | Strengths
text-embedding-3-small | OpenAI | Paid only | Cheap, balanced, the default
text-embedding-3-large | OpenAI | Paid only | Higher quality, larger dim
voyage-3 | Voyage AI | Free tier, generous | Often top of MTEB
embed-multilingual-v3 | Cohere | Free tier | Multilingual strength
bge-base-en-v1.5 | BAAI (open) | Free, self-host via sentence-transformers | Strong English baseline
nomic-embed-text-v1.5 | Nomic (open) | Free, self-host | Long-context embeddings
mxbai-embed-large | Mixedbread (open) | Free, self-host | Solid quality, small model

Going fully free: install sentence-transformers, load BAAI/bge-base-en-v1.5, and call model.encode(...). Runs on CPU on any laptop. No API key, no cost.
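
As a rough sketch of that flow: the embed helper below wraps the load-and-encode steps, and cosine_similarity compares two resulting vectors. Both function names are hypothetical. The sentence-transformers import is kept inside embed so the rest of the file works without the package installed, and the first call needs internet access to download the model weights.

```python
import math


def embed(texts, model_name="BAAI/bge-base-en-v1.5"):
    """Return one embedding (a list of floats) per input string.

    Requires `pip install sentence-transformers`; imported lazily here.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # normalize_embeddings=True returns unit-length vectors.
    return model.encode(texts, normalize_embeddings=True).tolist()


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Typical use would be vecs = embed(["first text", "second text"]) followed by cosine_similarity(vecs[0], vecs[1]); scores closer to 1.0 indicate more similar texts.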

Vector databases

DB | Free path | Best for
ChromaDB | Free, embedded or self-host (Apache 2.0) | Prototypes, this course's PDF chatbot
FAISS | Free, in-process library | Research, fast in-memory similarity
pgvector | Free, Postgres extension; works on Supabase free tier (500 MB) | Teams already on Postgres
Qdrant | Free, self-host; managed cloud has a free tier | Production, hybrid search
Weaviate | Free, self-host; managed cloud has a free tier | Production, native hybrid
Milvus | Free, self-host | Massive scale
Pinecone | Paid (limited free starter exists, often gated) | Easiest managed scaling

Run-it-yourself stack (no API key)

For learners who want to skip API providers entirely:

  • Ollama (ollama.com) lets you download and run Llama 3.x, Mistral, Gemma 3, Phi, Qwen, and DeepSeek-R1 locally. Works on macOS, Linux, and Windows. 7B to 8B models run comfortably on 16 GB of RAM. Zero per-token cost, zero data leaving your machine, fully offline after the initial download.
  • LM Studio is a GUI alternative to Ollama, helpful if you prefer a click-through interface to evaluate models.
  • llama.cpp is the lower-level engine both projects build on; advanced learners use it directly for quantization control.

A typical free-stack project: Ollama (Llama 3.1 8B) plus sentence-transformers for embeddings plus ChromaDB for retrieval plus Streamlit for the UI. Every component is open source, runs on a laptop, and costs nothing per query.
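
To make the LLM piece of that stack concrete, here is a minimal sketch that talks to a locally running Ollama server over its HTTP API using only the standard library. It assumes Ollama is listening on its default port 11434 and that the model tag llama3.1:8b has already been pulled; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3.1:8b"):
    """Build the POST request for a single, non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.1:8b", timeout=120):
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Calling generate("Explain RAG in one sentence.") returns the model's reply as a string once the server is running. Nothing is sent until generate is called, so the file can be imported safely on a machine without Ollama.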

Deployment platforms

Platform | App types | Free tier reality
Streamlit Cloud | Streamlit | Free, sleeps when idle, ~1 GB RAM
Hugging Face Spaces | Streamlit, Gradio, Docker, Static | Free CPU; paid GPU on demand
Vercel | Next.js, Edge Functions | Generous free tier on the Hobby plan
Railway | FastAPI, workers, DBs | $5 free trial credit on signup, then usage-based
Fly.io | Containers, long-running | Free quota on small machines, pay beyond
Render | Web services, workers | Free tier exists but web services sleep aggressively
Cloudflare Workers / Pages | Edge functions, static sites | 100K requests/day free

Observability and monitoring

Tool | Purpose
logging (stdlib) | Local logs
Logtail / Better Stack | Hosted log aggregation (free tier)
Helicone | LLM-specific observability proxy
LangSmith | LangChain-native tracing
Phoenix (Arize) | Open-source LLM tracing

Eval frameworks

Tool | Strength
Promptfoo | Side-by-side prompt eval
Ragas | RAG-specific metrics
TruLens | Production LLM evals
OpenAI Evals | Open-source eval harness
LangSmith Evals | LangChain-native

Setup commands

Create a project from zero:

mkdir my-ai-app && cd my-ai-app
python -m venv .venv
source .venv/bin/activate          # Mac/Linux
# On Windows, run this instead of the line above: .venv\Scripts\activate

pip install streamlit openai python-dotenv tiktoken chromadb pypdf tenacity pydantic

echo ".env" > .gitignore
echo ".venv/" >> .gitignore
echo "chroma_db/" >> .gitignore
echo "data/" >> .gitignore

git init

.env:

OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=sk-ant-...

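Verify (the OpenAI client reads keys from the process environment, so load .env first):

python -c "from dotenv import load_dotenv; load_dotenv('.env'); from openai import OpenAI; print(OpenAI().models.list().data[0].id)"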
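
If you would rather check keys before making any network call, a small helper like the following can report what is missing. This is a sketch: REQUIRED_KEYS mirrors the .env example and is a hypothetical list you should trim to the providers you actually use, and load_env_file and missing_keys are illustrative names.

```python
import os

# Hypothetical list: keep only the keys for the providers you actually use.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY", "ANTHROPIC_API_KEY")


def load_env_file(path=".env"):
    """Load variables from a .env file if python-dotenv is installed.

    Returns False when the package is missing or no variables were loaded.
    """
    try:
        from dotenv import load_dotenv
    except ImportError:
        return False
    return load_dotenv(path)


def missing_keys(required=REQUIRED_KEYS, env=None):
    """Return the required keys that are absent or blank in the environment."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

Call load_env_file() once at startup, then stop with a clear message if missing_keys() returns anything. Failing early this way tends to be easier to debug than an authentication error from deep inside a request.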

Cost-aware defaults

  • Default model for prototypes: gpt-4o-mini (or gemini-2.5-flash on the free tier)
  • Default embedding model: text-embedding-3-small (or BAAI/bge-base-en-v1.5 self-hosted)
  • Default chunk size: 500 tokens with an 80-token overlap between consecutive chunks
  • Default top-K: 5
  • Default max_tokens for chat: 600 to 800
  • Default temperature: 0.4 for factual, 0.8 for creative
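
The chunk-size default can be sketched as a sliding window over a token list: each chunk holds up to 500 tokens and shares its first 80 with the end of the previous chunk. The function name chunk_tokens is hypothetical, and the input is assumed to be a list of token IDs (for example from tiktoken's encode) or of words.

```python
def chunk_tokens(tokens, size=500, overlap=80):
    """Split tokens into chunks of up to `size` items, each overlapping the
    previous chunk by `overlap` items."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this chunk already reaches the end of the input
    return chunks
```

With these defaults the window advances 420 tokens at a time, so a 1,000-token document yields three chunks. The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, provided it is shorter than the overlap.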

The fully free path

You can finish this entire course without spending a rupee. Three valid free routes, all interchangeable per module:

Route A: Free hosted APIs (no credit card)

  • LLM: Google AI Studio (Gemini Flash family). Free tier on ai.google.dev. At time of writing the limit is about 5 to 15 requests per minute with a shared 250K tokens-per-minute cap, which is fine for learning. No credit card required.
  • LLM (open weights, fastest): Groq at console.groq.com. Around 30 requests/min, 6K tokens/min, and 14.4K requests/day on hosted Llama and Gemma models. No credit card required.
  • Embeddings: Cohere's free tier on embed-multilingual-v3 (1,000 requests/min on the trial key), or self-host BAAI/bge-base-en-v1.5 via sentence-transformers.
  • Vector DB: ChromaDB locally. Zero cost, persistent on disk.
  • Deploy: Streamlit Cloud (auto-deploys from GitHub) or Hugging Face Spaces.

Route B: 100% local (no internet after install)

  • LLM: Ollama at ollama.com. Run Llama 3.1 8B, Mistral 7B, Gemma 3, Phi-4, Qwen, or DeepSeek-R1 on your laptop. 16 GB RAM handles 7 to 8B models comfortably.
  • Embeddings: sentence-transformers with BAAI/bge-base-en-v1.5, runs on CPU.
  • Vector DB: ChromaDB local mode.
  • Deploy: localhost during development, then containerize via Docker if needed.

Route C: Hybrid (free hosted LLM + local everything else)

  • LLM: Groq for speed, Gemini for long context, Hugging Face Inference Providers for variety. All free.
  • Embeddings: Self-host bge-base-en-v1.5.
  • Vector DB: ChromaDB local, or pgvector on Supabase's free tier (500 MB) if you want it cloud-hosted.
  • Deploy: Streamlit Cloud or Vercel free tier.

The PDF chatbot capstone (Module 9) runs end-to-end on any of these three routes. Free-tier rate limits change frequently, so always confirm the live numbers on the provider dashboard before relying on them in production.
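
One way to keep the routes interchangeable is to isolate the provider details in a small settings table and pass them to an OpenAI-compatible client. The sketch below is an assumption-heavy illustration: the base URLs, model IDs, and the GROQ_API_KEY variable name reflect commonly documented values for Groq and Ollama, but confirm them against each provider's docs. ROUTES and client_settings are hypothetical names.

```python
import os

# Assumed values: both services advertise OpenAI-compatible endpoints, but
# verify the base URLs and model IDs in the provider docs before relying on them.
ROUTES = {
    "hosted": {  # Routes A and C: Groq free tier
        "base_url": "https://api.groq.com/openai/v1",
        "api_key_env": "GROQ_API_KEY",
        "model": "llama-3.1-8b-instant",
    },
    "local": {  # Route B: Ollama on this machine
        "base_url": "http://localhost:11434/v1",
        "api_key_env": None,  # no real key needed for a local server
        "model": "llama3.1:8b",
    },
}


def client_settings(route, env=None):
    """Return (client keyword arguments, model name) for the chosen route."""
    env = os.environ if env is None else env
    cfg = ROUTES[route]
    key_name = cfg["api_key_env"]
    if key_name is None:
        api_key = "not-needed"  # placeholder so the client has a non-empty key
    else:
        api_key = env.get(key_name)
        if not api_key:
            raise KeyError(f"{key_name} is not set")
    return {"base_url": cfg["base_url"], "api_key": api_key}, cfg["model"]
```

With the openai package installed, kwargs, model = client_settings("local") followed by OpenAI(**kwargs) gives a client whose chat calls go to the selected backend, so switching routes becomes a one-word change.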

What we deliberately avoid in this course

  • Heavy ML training (no PyTorch deep dives at beginner level)
  • Custom transformer implementations
  • Manual fine-tuning runs (better in the intermediate course)
  • AWS / GCP / Azure deep provisioning (overkill for beginners)
  • Kubernetes (not for first apps)

Optional power-ups

  • Cursor and Windsurf: AI-pair-programming IDEs with free tiers.
  • Continue.dev: open-source coding assistant that connects to any model (Ollama, OpenAI, Groq).
  • Aider: free CLI pair programmer that works with Ollama, Gemini, Groq, OpenAI.
  • Supabase: Postgres + auth + storage + pgvector on a free tier (500 MB DB, 1 GB storage).
  • n8n (self-hosted): free workflow automation for AI agents.
