
Define the boundaries of AI memory. Learn how Claude uses the 'Context Window' to reason across files and history, and why managing this space is the single most important lever for both cost and accuracy.


Module 9: Context Management

Lesson 1: What Context Means in LLM Systems

In the world of AI architecture, Context is Currency. Unlike a human who can remember thousands of pages of experience across a lifetime, an LLM's "Working Memory" is limited to its Context Window. Everything Claude knows about your current task must fit inside this window.

In this lesson, we define the "Context Window" and learn the relationship between Context Volume and Model Intelligence.


1. The "Working Memory" Analogy

Think of Claude as a brilliant professor who has a very small desk.

  • The professor can read anything, but they can only keep a few books open on their desk at one time.
  • If they need to read a new book, they must close an old one.

The "Desk" is the Context Window. The "Books" are your prompt, your conversation history, and your files.
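The desk analogy maps naturally onto a bounded buffer: when a new book arrives and the desk is full, the oldest book gets closed. The `Desk` class below is a toy illustration of that eviction behavior, not an Anthropic API.

```python
from collections import deque

class Desk:
    """Toy model of the professor's desk: a fixed number of open slots."""

    def __init__(self, capacity: int):
        # deque with maxlen drops the oldest item automatically when full
        self.open_books = deque(maxlen=capacity)

    def open_book(self, title: str):
        if len(self.open_books) == self.open_books.maxlen:
            print(f"Closing: {self.open_books[0]}")  # must close one to open another
        self.open_books.append(title)

desk = Desk(capacity=2)
desk.open_book("System Prompt")
desk.open_book("Chapter 1")
desk.open_book("Chapter 2")   # prints "Closing: System Prompt"
print(list(desk.open_books))  # → ['Chapter 1', 'Chapter 2']
```

Real context windows are measured in tokens rather than whole "books," but the constraint is the same: adding new material eventually forces something else out.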


2. Tokens: The Unit of Context

We don't measure context in "Words"; we measure it in Tokens.

  • 1,000 Tokens ≈ 750 Words.
  • Claude 3.5 Sonnet has a context window of 200,000 Tokens.
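The 750-words rule of thumb gives a quick back-of-envelope estimator. Note this is only a heuristic; exact counts come from the model's own tokenizer, and code, URLs, and non-English text typically tokenize less efficiently.

```python
def estimate_tokens(text: str) -> int:
    # Heuristic from this lesson: 1,000 tokens ≈ 750 words,
    # i.e. roughly 4/3 tokens per word. Real tokenizers differ.
    words = len(text.split())
    return round(words * 1000 / 750)

print(estimate_tokens("Context is the working memory of the model."))  # 8 words → 11
```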

Why Tokens Matter:

  1. Cost: You are billed for every token you send.
  2. Speed: More tokens = higher latency.
  3. Accuracy: Counter-intuitively, more tokens often result in LOWER accuracy.
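The cost point can be made concrete with a one-line budget check. The rate below is a hypothetical placeholder, not an actual Anthropic price:

```python
def prompt_cost_usd(input_tokens: int, price_per_million: float) -> float:
    # You are billed for every token you send, so cost scales
    # linearly with context size.
    return input_tokens * price_per_million / 1_000_000

# Assuming a hypothetical rate of $3.00 per million input tokens:
print(prompt_cost_usd(200_000, 3.00))  # a full 200k window → 0.6 ($0.60 per request)
```

Filling the window on every request multiplies that figure by your request volume, which is why trimming context pays off quickly.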

3. The "Lost in the Middle" Phenomenon

Researchers have found that LLMs are very good at remembering the Beginning and the End of a prompt, but they can "Forget" or ignore details placed in the middle of a massive context block.

Architect's Move: SNR (Signal-to-Noise Ratio). Your job is not to give Claude the most data; it's to give Claude the most relevant data.
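One practical response to "lost in the middle" is to filter context for relevance before sending it, so less material sits in the middle at all. The sketch below scores chunks by naive keyword overlap; a production system would use embeddings or a retriever, but the principle (send signal, drop noise) is the same.

```python
def relevance(chunk: str, query: str) -> int:
    # Naive signal measure: count query words that appear in the chunk.
    query_words = set(query.lower().split())
    return sum(1 for word in chunk.lower().split() if word in query_words)

def build_context(chunks: list[str], query: str, top_k: int = 2) -> str:
    # Keep only the top_k most relevant chunks: raise signal, cut noise.
    ranked = sorted(chunks, key=lambda c: relevance(c, query), reverse=True)
    return "\n\n".join(ranked[:top_k])

chunks = [
    "Deployment logs from last Tuesday (3,000 lines).",
    "The billing module retries failed charges three times.",
    "Office lunch menu for the week.",
]
print(build_context(chunks, query="why do failed charges retry"))
```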


4. Visualizing the Context Window

```mermaid
graph TD
    subgraph Window["The Context Window (200k Tokens)"]
        A[System Prompt]
        B[Conversation History]
        C[Injected Files]
        D[Tool Results]
    end
    A & B & C & D --> E[Total Context]
    E --> F{Model Intelligence}
    F -.-> G["Intelligence peaks when context is relevant"]
```
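The same breakdown can be tracked programmatically: sum each component and check the total fits the window before sending a request. The per-component counts below are illustrative numbers, not measurements.

```python
CONTEXT_WINDOW = 200_000  # Claude 3.5 Sonnet's window, in tokens

# Illustrative token budget for one request:
budget = {
    "system_prompt": 1_200,
    "conversation_history": 45_000,
    "injected_files": 120_000,
    "tool_results": 18_000,
}

total = sum(budget.values())
print(f"Total: {total:,} / {CONTEXT_WINDOW:,} tokens "
      f"({total / CONTEXT_WINDOW:.0%} of the window)")
assert total <= CONTEXT_WINDOW, "Context overflow: trim history or files"
```

Running this prints `Total: 184,200 / 200,000 tokens (92% of the window)`; the assertion gives you an early failure instead of a truncated or rejected API call.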

5. Summary

Context is the temporary "Working Memory" of the model. It is limited, expensive, and fragile. Effective management involves maximizing the Signal (information the model needs) and minimizing the Noise (filler text, irrelevant logs, or repetitive history).
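One common tactic for minimizing noise from repetitive history is to keep the system prompt pinned while evicting the oldest conversation turns. The sketch below assumes a simple role/content message format (loosely mirroring chat-API conventions) and stubs token counting with the word-based estimate from earlier.

```python
def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    # Always keep the system prompt; drop the oldest user/assistant
    # turns until the estimated total fits the budget.
    est = lambda m: len(m["content"].split()) * 4 // 3  # crude tokens-per-word guess
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    while turns and sum(map(est, system + turns)) > max_tokens:
        turns.pop(0)  # evict the oldest turn first
    return system + turns

history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Tell me about tokens " * 50},
    {"role": "user", "content": "What is the context window?"},
]
trimmed = trim_history(history, max_tokens=60)
print([m["content"][:30] for m in trimmed])
```

Eviction is the bluntest tool available; later lessons on context management cover more surgical options such as summarizing old turns instead of deleting them.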

In the next lesson, we look at the physical limits of this memory: Context Window Limitations.


Interactive Quiz

  1. What is a "Token" and how does it relate to words?
  2. Explain the "Lost in the Middle" phenomenon.
  3. Why does increasing the context volume often decrease the model's accuracy?
  4. Scenario: You want to provide a 500-page PDF to Claude. Should you paste the whole thing in or use a different strategy? Why?
