
Designing the Research Graph
Map out the brains of your assistant. Learn how to architect a multi-agent graph with planning nodes, retrieval nodes, and quality-control loops using LangGraph.
A professional AI application is not a single script; it's a Conversation between Nodes. In this lesson, we will design the architecture for your Research Assistant using the LangGraph philosophy.
By the end of this lesson, you will have a visual map of how your agent "thinks" and "works."
1. The Multi-Agent Blueprint
For our Capstone, we will use a Three-Node System:
- The Planner: Receives the user question and breaks it into 3 "Sub-Questions."
- The Researcher: Takes each sub-question and decides whether to use the Local Vector DB or Web Search.
- The Editor: Collects all findings, removes duplicates, and generates the final report.
graph TD
A[User Input] --> B[Planner Node]
B -- Sub-Questions --> C[Researcher Node]
C -- "Use Tool" --> D[Tool: Vector DB]
C -- "Use Tool" --> E[Tool: Web Search]
D & E --> F[Researcher: Analyze Observations]
F -- "Is research complete?" --> G{Decision}
G -- No --> C
G -- Yes --> H[Editor Node]
H --> I[Final Markdown Report]
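To make the flow in the diagram concrete, here is a minimal plain-Python sketch of the same loop. It does not use LangGraph yet (that comes in the next lesson). Every function body is a stub, and the names `run_graph`, `research_complete`, and `max_iterations` are assumptions made for illustration only.

```python
from typing import Any, Dict

State = Dict[str, Any]


def planner(state: State) -> State:
    """Stub Planner: split the topic into three sub-questions."""
    topic = state["topic"]
    state["sub_questions"] = [
        f"What is {topic}?",
        f"How is {topic} used today?",
        f"What are the open problems in {topic}?",
    ]
    return state


def researcher(state: State) -> State:
    """Stub Researcher: record one placeholder finding per sub-question."""
    for question in state["sub_questions"]:
        state["raw_findings"].append(f"finding for: {question}")
    state["iteration_count"] += 1
    return state


def research_complete(state: State) -> bool:
    """Stub decision: done once every sub-question has at least one finding."""
    return len(state["raw_findings"]) >= len(state["sub_questions"])


def editor(state: State) -> State:
    """Stub Editor: drop duplicates (keeping order) and join the findings."""
    unique = list(dict.fromkeys(state["raw_findings"]))
    state["final_report"] = "\n".join(f"- {item}" for item in unique)
    return state


def run_graph(topic: str, max_iterations: int = 3) -> State:
    """Walk the diagram: Planner, then Researcher until done, then Editor."""
    state: State = {
        "topic": topic,
        "sub_questions": [],
        "raw_findings": [],
        "final_report": "",
        "iteration_count": 0,
    }
    state = planner(state)
    while True:
        state = researcher(state)
        # The "Decision" diamond: loop back unless research is complete.
        # max_iterations is an assumed safety cap so the loop always ends.
        if research_complete(state) or state["iteration_count"] >= max_iterations:
            break
    return editor(state)


if __name__ == "__main__":
    print(run_graph("vector databases")["final_report"])
```

The point of the sketch is the shape of the control flow: one pass through the Planner, a loop around the Researcher, and a single exit into the Editor.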
2. Defining the State Object
Because our agent is "Stateful," we need a shared dictionary that all nodes can access.
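from typing import List, TypedDict


class ResearchState(TypedDict):
    topic: str                # the user's original question
    sub_questions: List[str]  # produced by the Planner
    raw_findings: List[str]   # appended to by the Researcher
    final_report: str         # written by the Editor
    iteration_count: int      # loop counter for the research cycle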
- Every time the Researcher finds a fact, it appends it to raw_findings.
- The Editor reads the entire raw_findings list to write the report.
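The "append" behaviour is worth seeing in code. In the sketch below, a node returns only the keys it changed, and a small helper merges that update into the shared state. The helper `merge_update`, the `APPEND_KEYS` set, and `researcher_update` are hypothetical names used for illustration. LangGraph can express the same idea with a reducer annotation on the state field, so treat this as a plain-Python stand-in rather than the library's API.

```python
from typing import Any, Dict, List, TypedDict


class ResearchState(TypedDict):
    topic: str
    sub_questions: List[str]
    raw_findings: List[str]
    final_report: str
    iteration_count: int


# Assumption: only raw_findings accumulates; every other key is overwritten.
APPEND_KEYS = {"raw_findings"}


def merge_update(state: ResearchState, update: Dict[str, Any]) -> ResearchState:
    """Return a new state with the node's partial update applied."""
    merged = dict(state)
    for key, value in update.items():
        if key in APPEND_KEYS:
            merged[key] = list(merged[key]) + list(value)  # append, never replace
        else:
            merged[key] = value
    return merged  # type: ignore[return-value]


def researcher_update(question: str) -> Dict[str, Any]:
    """Stub Researcher: return just the new fact, not the whole state."""
    return {"raw_findings": [f"placeholder fact about: {question}"]}


state: ResearchState = {
    "topic": "vector databases",
    "sub_questions": ["q1", "q2"],
    "raw_findings": [],
    "final_report": "",
    "iteration_count": 0,
}
for q in state["sub_questions"]:
    state = merge_update(state, researcher_update(q))
```

Keeping findings append-only means a later research pass can never silently erase what an earlier pass discovered.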
3. The "Conditional Edge" (The Self-Healer)
One of the most powerful features of your architecture is the Conditional Edge.
- After the Researcher finishes, we don't just move to the Editor.
- We move to a Quality Gate node. This node uses a cheap LLM to verify: "Is there enough information to answer the user's question?"
- If the answer is No, it sends the Researcher back out for more data!
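A conditional edge boils down to a function that looks at the state and returns the name of the next node. Here is a minimal sketch of that routing decision. The "cheap LLM" check is replaced by a placeholder rule, and `MAX_ITERATIONS`, `has_enough_information`, and the node names `"researcher"` and `"editor"` are assumptions for illustration. In LangGraph, a function of this shape is the kind of thing you would hand to `add_conditional_edges`.

```python
from typing import Any, Dict

# Assumed safety cap so the loop cannot run (and spend tokens) forever.
MAX_ITERATIONS = 3


def has_enough_information(state: Dict[str, Any]) -> bool:
    """Placeholder for the cheap-LLM check.

    Here: at least one finding per sub-question counts as "enough".
    """
    return len(state["raw_findings"]) >= len(state["sub_questions"])


def route_after_quality_gate(state: Dict[str, Any]) -> str:
    """Return the name of the next node to run."""
    if has_enough_information(state):
        return "editor"
    if state["iteration_count"] >= MAX_ITERATIONS:
        # Give up gracefully: write the best report we can with what we have.
        return "editor"
    return "researcher"
```

The iteration cap is a design choice worth keeping even once a real LLM does the checking: a quality gate that always says "No" would otherwise loop indefinitely.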
4. Engineering the RAG Pipeline
For the Vector DB part of your architecture:
- Model: Use text-embedding-3-small for the vectors.
- DB: Use ChromaDB.
- Chunking: Use RecursiveCharacterTextSplitter with a 1,000-character size and 150-character overlap.
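To see what "1,000-character size and 150-character overlap" means in practice, here is a simplified fixed-window chunker. It is not RecursiveCharacterTextSplitter, which tries to split on natural boundaries first. `chunk_text` is a hypothetical stand-in that only demonstrates the size and overlap arithmetic.

```python
from typing import List


def chunk_text(text: str, size: int = 1000, overlap: int = 150) -> List[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # each new chunk starts 850 characters after the last
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2500))
    for i, chunk in enumerate(chunk_text(sample)):
        print(i, len(chunk))
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk, which helps retrieval find it.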
Summary of the Design
- Modularity: Each agent has one specific job.
- Verification: The system can loop back if it fails its own quality check.
- Structured Data: The system builds up a "Knowledge Base" in the ResearchState before writing a single word of the report.
In the next lesson, we will turn this architectural diagram into Functional Python Code.
Exercise: Identify the Bottleneck
Look at the diagram again.
- Which node is the most "Expensive" in terms of tokens?
- Which node is the most likely to "Hallucinate"?
- How would you add a "Human-in-the-Loop" step to this diagram?
Answer Logic:
- The Editor. It has to process the entire history of research to write the report.
- The Researcher. It receives raw, messy data from the web and might misinterpret it.
- HITL: You would place an "Interrupt" between the Planner and the Researcher. The human reviews the "Sub-Questions" and approves them before the bot spends money on search tokens!
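The Human-in-the-Loop answer can be sketched as an approval gate that sits between the Planner and the Researcher. LangGraph has its own interrupt mechanism for this. The code below is only a plain-Python stand-in, and `human_review` and the `reviewer` callback are hypothetical names.

```python
from typing import Any, Callable, Dict, List

Reviewer = Callable[[List[str]], List[str]]


def human_review(state: Dict[str, Any], reviewer: Reviewer) -> Dict[str, Any]:
    """Pause point: the reviewer returns the approved (possibly edited) sub-questions."""
    approved = reviewer(list(state["sub_questions"]))
    if not approved:
        # Nothing approved: stop before any search tokens are spent.
        raise RuntimeError("No sub-questions approved; research halted.")
    return {**state, "sub_questions": approved}


if __name__ == "__main__":
    planned = {"topic": "vector databases", "sub_questions": ["q1", "q2", "q3"]}
    # Example reviewer: a human who approves only the first two questions.
    print(human_review(planned, lambda questions: questions[:2])["sub_questions"])
```

Passing the reviewer in as a callback keeps the gate testable: in production it could prompt a person, while in tests it can be a simple function.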