GeekHub Learn
Module 9 of 16, 15 min read, 8 sub-lessons

Module 9: Building a PDF Chatbot (RAG Project)

Build a real PDF chatbot end-to-end with Python, ChromaDB, OpenAI, and Streamlit. Upload PDFs, chunk, embed, retrieve, generate, and ship to the public internet.
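
The chunk, embed, retrieve, generate flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the course's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector store such as ChromaDB, and every function name here (`chunk_text`, `build_index`, `retrieve`, `build_prompt`) is hypothetical. The generate step is left as a prompt string that would be sent to an LLM.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows."""
    step = size - overlap
    stop = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, stop, step)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(pages):
    """pages is a list of (page_number, text); keep the page for citations."""
    index = []
    for page_number, text in pages:
        for chunk in chunk_text(text):
            index.append({"page": page_number, "text": chunk, "vector": embed(chunk)})
    return index


def retrieve(index, question, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda e: cosine(q, e["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question, hits):
    """Assemble the context, tagged with page numbers, for the generate step."""
    context = "\n".join(f"[p.{h['page']}] {h['text']}" for h in hits)
    return (
        "Answer using only the context below and cite page numbers.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Made-up sample pages, used only to exercise the sketch.
pages = [
    (1, "Refunds are issued within 30 days of purchase."),
    (2, "Shipping takes five business days within Europe."),
]
index = build_index(pages)
hits = retrieve(index, "How long do refunds take?", k=1)
prompt = build_prompt("How long do refunds take?", hits)
```

Storing the page number next to each chunk is what makes citations possible later: whatever the retriever returns already carries the source location the answer can point to.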

What this module gives you

Ship the flagship project of this course: a working "chat with any PDF" app that uses real RAG, cites its sources, and lives at a public URL. By the end, you should be able to rebuild it from memory in a job interview.

Skills you will pick up

  • Parsing and chunking PDFs
  • Building a persistent vector index
  • End-to-end RAG with citations
  • Streamlit UI for upload + chat
  • Evaluating and improving RAG quality
  • Free deployment with secrets
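
For the evaluation skill in the list above, one simple starting point is a retrieval hit rate: for a small set of questions with a known source page, check how often that page appears in the top k results. The sketch below assumes a hypothetical `retrieve_pages(question)` callable that returns page numbers in ranked order; the stub retriever and its canned results are invented purely to make the example runnable.

```python
def hit_rate_at_k(eval_set, retrieve_pages, k=3):
    """Fraction of questions whose expected page is in the top-k results."""
    if not eval_set:
        return 0.0
    hits = 0
    for question, expected_page in eval_set:
        if expected_page in retrieve_pages(question)[:k]:
            hits += 1
    return hits / len(eval_set)


# Hypothetical stub standing in for a real retriever; results are made up.
def fake_retriever(question):
    canned = {
        "refund window?": [1, 4, 2],
        "shipping time?": [3, 5, 2],
        "warranty length?": [6, 1, 4],
    }
    return canned[question]


# Each entry pairs a question with the page that should be retrieved.
eval_set = [
    ("refund window?", 1),
    ("shipping time?", 2),
    ("warranty length?", 7),
]
score = hit_rate_at_k(eval_set, fake_retriever, k=3)
```

Tracking this number before and after a change to chunk size or k gives a rough signal of whether retrieval improved, independent of how the generated answer reads.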

Why it matters in production

PDF chatbots are among the most commonly requested AI features in 2026: legal teams, students, founders, and support orgs all want them. Being able to build one to a professional standard is a hireable skill on its own.

Lessons in this module

  1. Lesson 9.1: Project setup and architecture review (2 min)
  2. Lesson 9.2: Parsing PDFs into clean text (2 min)
  3. Lesson 9.3: Chunking strategy that actually works (2 min)
  4. Lesson 9.4: Building the vector index (2 min)
  5. Lesson 9.5: The retrieve-and-answer flow with citations (2 min)
  6. Lesson 9.6: Streamlit UI: upload, chat, citations (2 min)
  7. Lesson 9.7: Evaluation: knowing when RAG is "good enough" (2 min)
  8. Lesson 9.8: Deployment and stretch goals (2 min)

Recap

You shipped a real RAG-powered PDF chatbot. Citations work. The deployment is live. Your GitHub now has a complete, deployed RAG project to show employers.

Ready to start?

Open Lesson 9.1: Project setup and architecture review
