GeekHub Learn
Lesson 9.5 · 5 of 8 in this module · 2 min read · Module 9: Building a PDF Chatbot (RAG Project)

The retrieve-and-answer flow with citations

This is the lesson where the chatbot actually answers. The trick is forcing citations so users trust the output.

Think of a research assistant who not only answers your question but also tells you which book and page the answer came from.

Flow per query:

  1. Embed the user question.
  2. Query Chroma for top-K (e.g., 5) chunks.
  3. Build a prompt with system rules + retrieved chunks + the user question.
  4. Tell the model to cite (filename + page) for each claim.
  5. Stream the answer.
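
# Assumed to exist from earlier lessons in this module:
#   embed(text)     returns the embedding vector for a string
#   col             the Chroma collection that holds the PDF chunks
#   openai_client   an initialized OpenAI client

SYSTEM = """You are a PDF chatbot. Use ONLY the provided sources to answer.
If the answer is not in the sources, say "I could not find that in the documents."
After each fact, cite the source like [filename p.X].
"""

def retrieve(question, k=5):
    """Return the top-k (chunk_text, metadata) pairs for the question."""
    q_emb = embed(question)
    res = col.query(query_embeddings=[q_emb], n_results=k)
    # Chroma returns one result list per query embedding; we sent exactly one.
    docs = res["documents"][0]
    metas = res["metadatas"][0]
    return list(zip(docs, metas))

def answer(question):
    """Yield the answer piece by piece as the model streams it."""
    pairs = retrieve(question, k=5)
    # Tag every chunk with its filename and page so the model has something to cite.
    context = "\n\n".join(
        f"[Source: {m['source']} p.{m['page']}]\n{d}" for d, m in pairs
    )
    messages = [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": f"Sources:\n{context}\n\nQuestion: {question}"},
    ]
    stream = openai_client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, stream=True
    )
    for chunk in stream:
        # Some stream chunks carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta
    # answer() is a generator, so a plain for loop never sees a return value.
    # Callers that need the full text can use "".join(answer(question)).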
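
To see what step 3 of the flow produces, it can help to assemble the prompt from hand-written chunks before wiring in Chroma or the model. The sketch below is only an illustration: `build_context` is a hypothetical helper that mirrors the join inside `answer`, and the file names, page numbers, and chunk text are made up.

```python
def build_context(pairs):
    """Tag each chunk with its source, the same way `answer` does."""
    return "\n\n".join(
        f"[Source: {m['source']} p.{m['page']}]\n{d}" for d, m in pairs
    )

# Made-up (chunk_text, metadata) pairs standing in for the output of `retrieve`.
sample_pairs = [
    ("Refunds are issued within 14 days.", {"source": "policy.pdf", "page": 3}),
    ("Shipping takes 2 to 5 business days.", {"source": "faq.pdf", "page": 1}),
]

context = build_context(sample_pairs)
user_message = f"Sources:\n{context}\n\nQuestion: How long do refunds take?"
print(user_message)
```

Printing the message before sending it is a cheap way to confirm that every chunk carries a filename and page the model can cite.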

Visualize it

A 5-step flow with annotations: embed question, top-K chunks, prompt assembly, LLM call, stream out.

Try it now

Run the function on a small index. Ask 3 questions. Verify the citations are real.
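
One way to start checking that citations are real is to confirm that every `[filename p.X]` tag in the answer points at a chunk that was actually retrieved. This is a minimal sketch, assuming the model follows the citation format from the system prompt; `unknown_citations` and the sample data are hypothetical.

```python
import re

# Matches tags like [policy.pdf p.3], the format requested in SYSTEM.
CITATION = re.compile(r"\[([^\[\]]+?)\s+p\.(\d+)\]")

def unknown_citations(answer_text, pairs):
    """Return cited (filename, page) tuples that are not among the retrieved chunks."""
    retrieved = {(m["source"], int(m["page"])) for _, m in pairs}
    cited = {(name, int(page)) for name, page in CITATION.findall(answer_text)}
    return sorted(cited - retrieved)

# Made-up retrieval result and answer text for illustration.
pairs = [("Refunds are issued within 14 days.", {"source": "policy.pdf", "page": 3})]
text = "Refunds take 14 days [policy.pdf p.3]. Returns are free [policy.pdf p.9]."
print(unknown_citations(text, pairs))  # [('policy.pdf', 9)]
```

An empty list only means the cited pages were in the context. It does not yet prove that the page supports the claim.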

Hands-on lab

Implement retrieve and answer in rag.py. Try 5 questions. Note false citations if any.

Try it now

Why does the system prompt explicitly tell the model to use only the provided sources, and to say so when the answer is not in them?

Common mistakes

  • No "I do not know" instruction (without one, the model tends to invent an answer)
  • Forgetting to include the citation format in the system prompt
  • Top-K too high (noisy context, cost spikes) or too low (misses answer)

Debugging tip

If citations look wrong, they probably are. Add a post-step that verifies the cited page actually contains the claim's keywords. This catches the worst hallucinations.
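
The post-step described here could look something like the sketch below. It is a rough keyword-overlap heuristic, not a proven verifier: for each citation it takes the text leading up to the tag and measures what fraction of its longer words appear on the cited page. The name `weak_citations`, the four-character word cutoff, and the 0.5 threshold are all assumptions to tune on your own documents.

```python
import re

# Matches tags like [policy.pdf p.3], the format requested in SYSTEM.
CITATION = re.compile(r"\[([^\[\]]+?)\s+p\.(\d+)\]")

def keywords(text):
    """Lowercased tokens of 4+ characters; a crude stand-in for 'keywords'."""
    return set(re.findall(r"[a-z0-9]{4,}", text.lower()))

def weak_citations(answer_text, pairs, min_overlap=0.5):
    """Return (claim, filename, page) for citations the cited page barely supports."""
    # Collect all retrieved text per (filename, page).
    pages = {}
    for chunk, meta in pairs:
        key = (meta["source"], int(meta["page"]))
        pages[key] = pages.get(key, "") + " " + chunk

    flagged = []
    last_end = 0
    for match in CITATION.finditer(answer_text):
        # The claim is the text between the previous citation and this one.
        claim = answer_text[last_end:match.start()].strip(" .\n")
        last_end = match.end()
        claim_words = keywords(claim)
        if not claim_words:
            continue  # back-to-back citations: nothing new to check
        key = (match.group(1), int(match.group(2)))
        page_words = keywords(pages.get(key, ""))
        overlap = len(claim_words & page_words) / len(claim_words)
        if overlap < min_overlap:
            flagged.append((claim, key[0], key[1]))
    return flagged

# Made-up retrieval result and answer text for illustration.
pairs = [("Refunds are issued within 14 days of the return request.",
          {"source": "policy.pdf", "page": 3})]
text = ("Refunds are issued within 14 days [policy.pdf p.3]. "
        "Shipping is always free worldwide [policy.pdf p.3].")
print(weak_citations(text, pairs))
```

Here the second claim is flagged because none of its keywords appear on the cited page. Paraphrased but correct claims can also fall below the threshold, so treat flagged items as candidates for review.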

Challenge

Add a "show sources" expander in the UI that displays the actual retrieved chunks for each answer.

Where this shows up

  • Document QA with audit trails
  • Compliance bots
  • Internal knowledge assistants

From the field

In enterprise settings in 2026, citations are non-negotiable. Buyers reject any RAG product that cannot show its sources.

Recap

Retrieve, augment, generate, cite. The four moves of a trustworthy RAG answer.


Quick recall

Why include sources in the prompt with explicit tags?

Quiz time

  1. To reduce hallucinations in RAG, the most reliable move is to
