Lesson 11.4: Privacy and data handling | GeekHub Learn

Sensitive user data plus careless API calls = your company in the news. This lesson helps keep you out of it.

Picture a diary being read aloud in a cafe. Privacy in AI means making sure the diary is read only where it should be, and only by the people who should read it.

Best practices:

Know your provider's data retention policy (OpenAI states that it does not train on API data by default, but check the current terms).
Minimize: only send the fields needed.
Redact PII before LLM calls when possible.
Avoid logging raw prompts that contain user PII.
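Check data residency requirements: confirm in which region your data is processed and stored.
Use private deployments for highly regulated data (a self-hosted Llama model, or a managed service such as Azure OpenAI whose terms exclude training on your data).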
User consent and transparency: tell users you use AI, how, and on what.
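
The "minimize" practice above can be sketched as an allow-list filter applied before a prompt is built. This is a minimal illustration, not a complete solution: the field names and the `minimize` helper are hypothetical, and a real allow-list depends on what your prompt actually needs.

```python
# Hypothetical field names; adjust the allow-list to what your prompt needs.
ALLOWED_FIELDS = {"ticket_subject", "ticket_body", "product"}

def minimize(record, allowed=ALLOWED_FIELDS):
    """Return a copy of record containing only allow-listed fields."""
    return {key: value for key, value in record.items() if key in allowed}

ticket = {
    "ticket_subject": "Login fails",
    "ticket_body": "I get a 500 error after entering my password.",
    "product": "web-app",
    "customer_email": "ana@example.com",   # not needed to answer the ticket
    "billing_address": "1 Main St",        # not needed to answer the ticket
}

# Only the allow-listed fields go into the prompt.
payload = minimize(ticket)
```

An allow-list is usually safer than a block-list here: a new sensitive column added to the record later is excluded by default instead of leaking by default.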

PII redaction snippet:

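import re

EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
# Matches bare 10-digit numbers only; formatted or international numbers need a broader pattern.
PHONE_RE = re.compile(r"\b\d{10}\b")

def redact(text):
    """Replace emails and bare 10-digit phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)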

Then call the LLM with redact(user_text) instead of the raw text. Restore the original values in the model's response only if the user actually needs to see them.
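
If you do need to map values back, one possible approach is to use numbered placeholders and keep the mapping in memory on your side, never sending it to the provider. The sketch below covers emails only; `redact_with_map` and `restore` are hypothetical helper names introduced for illustration.

```python
import re

EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

def redact_with_map(text):
    """Replace each distinct email with a numbered placeholder.

    Returns the redacted text and a placeholder -> original mapping.
    """
    mapping = {}  # placeholder -> original value
    seen = {}     # original value -> placeholder

    def _swap(match):
        value = match.group(0)
        if value not in seen:
            placeholder = f"[EMAIL_{len(seen) + 1}]"
            seen[value] = placeholder
            mapping[placeholder] = value
        return seen[value]

    return EMAIL_RE.sub(_swap, text), mapping

def restore(text, mapping):
    """Put the original values back into a model response."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text

original = "Reply to ana@example.com and cc bo@example.org."
safe, mapping = redact_with_map(original)
```

The mapping itself is sensitive data, so it should live only as long as the request and should not be written to logs.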

Visualize it

A "data flow" map: user -> redactor -> LLM provider -> response -> log (redacted prompts only, never raw text). Each step is labeled with what is and is not stored.
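
The same flow can be sketched in code. This is an illustration under stated assumptions: `call_llm` is a hypothetical stand-in for a real provider call, `handle_message` is an invented name, and the redactor repeats the simple email and 10-digit phone patterns from this lesson.

```python
import logging
import re

logger = logging.getLogger("chatbot")

def redact(text):
    """Replace emails and bare 10-digit phone numbers with placeholders."""
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b", "[EMAIL]", text)
    return re.sub(r"\b\d{10}\b", "[PHONE]", text)

def call_llm(prompt):
    # Hypothetical placeholder for a real provider call; returns a canned reply.
    return f"(model reply to a {len(prompt)}-character prompt)"

def handle_message(user_text):
    safe_prompt = redact(user_text)        # user -> redactor
    reply = call_llm(safe_prompt)          # redactor -> LLM provider
    logger.info("prompt=%s", safe_prompt)  # log stores the redacted prompt only
    return reply                           # response -> user
```

The raw `user_text` is never passed to the logger or the provider, so a leaked log file or provider-side retention exposes placeholders rather than identifiers.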

Try it now

Take your PDF chatbot. Trace the lifecycle of an uploaded PDF: where it is stored, who can see it, and how long it is kept. Document what you find.

Hands-on lab

Add a redactor to your chatbot's input. Log only redacted prompts.

Try it now

When would you choose a self-hosted Llama over OpenAI?

Common mistakes

Sending entire emails or DBs to providers without minimization
Logging raw prompts that include PII
Failing to disclose AI use to users (regulatory and trust risk)

Debugging tip

Read your provider's "data usage" policy quarterly. Terms change.

Challenge

Add a "privacy mode" toggle that strips identifiers before any LLM call.

Where this shows up

HR assistants
Healthcare-adjacent AI
Customer support
B2B SaaS with enterprise customers

From the field

In 2026 procurement, a 2-page "AI data handling" doc is the difference between winning and losing enterprise deals.

Recap

Minimize, redact, disclose, audit. Privacy is a feature, not a chore.

Quick recall

Why minimize what you send?

Quiz time

1. The cheapest privacy improvement most apps can make today is