
The Agent's Diary: Persistent Memory
Build agents that grow over time. Learn how to implement persistent memory that survives between sessions and how to distinguish between facts and preferences.
Ephemeral vs Persistent Memory
In Module 2.2, we discussed the basics of State. Now we go deeper into long-term, persistent memory. Most agents today have "Amnesia": once a session ends, they forget everything. To build a truly personal agent, it must have a Long-Term Memory (LTM) that persists across days, months, and thousands of distinct threads.
In this lesson, we will explore how to architect a "Personal Cache" for your agents.
1. Defining the Three Tiers of Memory
To be effective, an agent must manage information at three different "speeds":
- Short-Term (Conversation): The current chat window. (Deleted when the thread is closed).
- Medium-Term (Working State): Variables like `is_logged_in` or `current_project_id`. (Held in the checkpointer).
- Long-Term (The Identity): "The user prefers Python," "The user's daughter is named Sarah," "The user works at Google." (Stored in a Vector or Graph DB).
2. The Implementation: Memory-as-a-Tool
We don't send the "Entire History" of a user to the LLM. That would be millions of tokens. Instead, we give the agent two tools:
- `save_to_memory(fact: str)`: "The user just told me they are vegan. I should remember this."
- `retrieve_memory(query: str)`: "Does the user have any food allergies?"
The Flow
- The agent calls `save_to_memory`.
- The backend embeds the text and saves it to a User-Specific Vector Index.
- Future sessions query this index before every response.
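The flow above can be sketched with an in-memory stand-in for the vector index (naive keyword overlap instead of embeddings; all names here are illustrative, not a real API):

```python
# user_id -> list of saved facts (a real system would store embeddings).
memory_index: dict[str, list[str]] = {}

def save_to_memory(user_id: str, fact: str) -> None:
    """Tool the agent calls when it decides a fact is worth keeping."""
    memory_index.setdefault(user_id, []).append(fact)

def retrieve_memory(user_id: str, query: str) -> list[str]:
    """Naive retrieval: return facts that share a word with the query."""
    words = set(query.lower().split())
    return [f for f in memory_index.get(user_id, [])
            if words & set(f.lower().split())]

save_to_memory("user_123", "The user is vegan")
print(retrieve_memory("user_123", "any vegan food preferences?"))
# -> ['The user is vegan']
```

In production, the keyword match would be replaced by an embedding similarity search against a per-user index, but the tool surface the agent sees stays the same.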
3. Fact vs. Context vs. Preference
Not everything should be "Remembered" forever.
- Facts (Static): "The user lives in NYC." -> Save to LTM.
- Context (Transient): "The user is currently at the airport." -> Save to Short-Term Memory.
- Preferences (Operational): "The user likes short, direct answers." -> Save to System Prompt / Profile.
4. The "Summary Memory" Strategy
As we saw in Module 3.3, we can summarize history. But what do we do with the summary? Production Pattern: Every time a thread finishes, a "Reflector Agent" reads the whole chat and extracts Entity Updates.
- Update: "Increment 'Coffee_Count' for User_123."
- Update: "Change 'Project_Status' from 'Active' to 'Done'."
This transforms raw text into Actionable Data for the next session.
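Applying those Entity Updates might look like this (the update schema and in-memory profile store are assumptions for illustration):

```python
# Per-user structured profiles, updated by the Reflector Agent's output.
profiles = {"User_123": {"Coffee_Count": 4, "Project_Status": "Active"}}

def apply_update(user_id: str, update: dict) -> None:
    """Apply one structured Entity Update to a user's profile."""
    profile = profiles[user_id]
    if update["op"] == "increment":
        profile[update["field"]] += 1
    elif update["op"] == "set":
        profile[update["field"]] = update["value"]

apply_update("User_123", {"op": "increment", "field": "Coffee_Count"})
apply_update("User_123", {"op": "set", "field": "Project_Status",
                          "value": "Done"})
print(profiles["User_123"])
# -> {'Coffee_Count': 5, 'Project_Status': 'Done'}
```

Because the reflector emits structured operations rather than prose, the next session can read the profile directly instead of re-parsing old transcripts.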
5. Privacy: The "Stakes" of the Mirror
Long-term memory is the highest privacy risk in AI. If your database is hacked, the hacker doesn't just get a password; they get a Digital Mirror of the user's life.
Security Rules for LTM
- Encryption at Rest: The vector database must be encrypted with a user-specific key.
- Explicit Consent: The agent should say "Should I remember that for next time?" before calling the `save` tool.
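A consent gate can be a thin wrapper around the save tool (a minimal sketch; `ask_user` stands in for whatever confirmation UI the product provides):

```python
def maybe_remember(fact: str, ask_user, save_to_memory) -> bool:
    """Persist `fact` only if the user explicitly confirms."""
    if ask_user(f"Should I remember that for next time? ({fact})"):
        save_to_memory(fact)
        return True
    return False

saved = []
# User says yes -> the fact is persisted.
assert maybe_remember("The user is vegan", lambda q: True, saved.append)
# User says no -> nothing is written.
assert not maybe_remember("One-off note", lambda q: False, saved.append)
assert saved == ["The user is vegan"]
```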
6. Implementation Strategy: Memory in LangGraph
We use a Post-Processing Node that runs after the conversation is over.
```python
async def memory_reflector_node(state):
    # Only run this at the END of a session.
    # Assumes `llm` and `vector_db` clients are configured elsewhere.
    summary = await llm.ainvoke(
        f"Extract key personal facts from this chat: {state['messages']}"
    )
    await vector_db.add(summary, metadata={"user_id": state["user_id"]})
    return state
```
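You can exercise this node end-to-end without live services by stubbing its collaborators (the node is repeated here with illustrative `StubLLM` / `StubVectorDB` stand-ins so the sketch is self-contained):

```python
import asyncio

class StubLLM:
    async def ainvoke(self, prompt: str) -> str:
        # A real LLM would summarize; the stub returns a canned extraction.
        return "User is vegan; works at Google."

class StubVectorDB:
    def __init__(self):
        self.records = []
    async def add(self, text: str, metadata: dict) -> None:
        self.records.append((text, metadata))

llm, vector_db = StubLLM(), StubVectorDB()

async def memory_reflector_node(state):
    # Only run this at the END of a session.
    summary = await llm.ainvoke(
        f"Extract key personal facts from this chat: {state['messages']}"
    )
    await vector_db.add(summary, metadata={"user_id": state["user_id"]})
    return state

state = {"messages": ["I'm vegan", "I work at Google"],
         "user_id": "user_123"}
asyncio.run(memory_reflector_node(state))
print(vector_db.records[0][1])
# -> {'user_id': 'user_123'}
```

Swapping the stubs for real clients leaves the node body unchanged, which is what makes the post-processing pattern easy to test.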
Summary and Mental Model
Think of Long-Term Memory like The Notes a Doctor Takes.
- They don't remember every single word you said in the check-up (Short-term).
- But they look at your chart (Long-term) to see your history before you walk in the door.
An agent without a chart is just a stranger.
Exercise: Memory Design
- Categorization: Which memory (Short, Medium, or Long) would you use for these 3 items:
- User's Birthday.
- The fact that the user is currently "In a hurry."
- The partial code written in the last message.
- Refinement: Write a prompt for the "Memory Reflector" that prevents it from saving PII like Credit Card numbers into the long-term vector store.
- UX: Describe how you would show a user "What the agent knows about them."
- (Hint: A "Personal Profile" page where they can edit or delete memories.)

Ready for complex relationships? Next lesson: Graph Databases for Complex Relationships.