The ThreadState schema extends LangChain’s AgentState with domain-specific fields for sandbox management, file tracking, and user interactions. The state is persisted by LangGraph’s checkpointing system, and custom reducers control how updates to selected fields are merged.
ThreadState Schema
Defined in backend/src/agents/thread_state.py:
State Fields
sandbox
Type: SandboxState | None
Purpose: Tracks the active sandbox environment for isolated tool execution.
Structure:
- Set by SandboxMiddleware in before_agent hook
- Persisted across turns within same thread (sandbox not released)
- Used by sandbox tools (bash, read_file, write_file, str_replace, ls)
thread_data
Type: ThreadDataState | None
Purpose: Provides path mappings between virtual (agent-visible) and physical (host) paths.
Structure:
- Agent sees: /mnt/user-data/workspace, /mnt/user-data/uploads, /mnt/user-data/outputs
- Physical: backend/.deer-flow/threads/{thread_id}/user-data/{workspace,uploads,outputs}
- Set by ThreadDataMiddleware in before_agent hook
- With lazy_init=True (default): Paths computed but directories created on-demand
- With lazy_init=False: Directories eagerly created in middleware
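The mapping and the two lazy_init modes described above can be sketched as follows. This is a minimal illustration, not the middleware's actual code: compute_thread_paths, VIRTUAL_ROOT, SUBDIRS, and the base_dir parameter are hypothetical names introduced here, and base_dir stands in for the backend/.deer-flow directory.

```python
from pathlib import Path

# Assumed constants, taken from the virtual paths listed above.
VIRTUAL_ROOT = "/mnt/user-data"
SUBDIRS = ("workspace", "uploads", "outputs")


def compute_thread_paths(base_dir: str, thread_id: str, lazy_init: bool = True) -> dict[str, str]:
    """Map each agent-visible path to its physical path for one thread.

    Hypothetical helper: the real middleware may structure this differently.
    """
    user_data = Path(base_dir) / "threads" / thread_id / "user-data"
    mapping = {}
    for name in SUBDIRS:
        physical = user_data / name
        if not lazy_init:
            # Eager mode: create the directory immediately.
            physical.mkdir(parents=True, exist_ok=True)
        # Lazy mode only computes the path; the directory is created later,
        # when something first needs it.
        mapping[f"{VIRTUAL_ROOT}/{name}"] = str(physical)
    return mapping
```

With lazy_init=True the function touches nothing on disk, which keeps thread setup cheap for conversations that never use the filesystem.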
title
Type: str | None
Purpose: Human-readable thread title for UI display.
Lifecycle:
- Set by TitleMiddleware after first complete user-assistant exchange
- Generated via lightweight LLM based on first user message and assistant response
- Persisted by LangGraph checkpointer
Implementation (backend/src/agents/middlewares/title_middleware.py:46-81):
Configuration (config.yaml):
artifacts
Type: Annotated[list[str], merge_artifacts]
Purpose: Tracks files presented to user via present_files tool.
Custom Reducer (backend/src/agents/thread_state.py:21-28):
- Maintains insertion order (first occurrence preserved)
- Automatically deduplicates paths
- Survives across turns (cumulative)
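A reducer with the three properties listed above could look like the sketch below. It is written from the description only; the actual function at backend/src/agents/thread_state.py:21-28 may differ in detail.

```python
from typing import Optional


def merge_artifacts(existing: Optional[list[str]], new: Optional[list[str]]) -> list[str]:
    """Append new paths to existing ones, dropping duplicates in first-seen order."""
    if existing is None:
        existing = []
    if new is None:
        new = []
    # dict preserves insertion order, so fromkeys() removes duplicates while
    # keeping the first occurrence of each path.
    return list(dict.fromkeys(existing + new))
```

Because the reducer receives the previous value on every update, a tool only has to return the paths it just presented, and the cumulative list is rebuilt by the merge.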
todos
Type: list | None
Purpose: Stores task list when TodoListMiddleware is enabled (is_plan_mode=True).
Structure:
- pending - Not yet started
- in_progress - Currently working (one at a time, or multiple if parallel)
- completed - Finished successfully
Middleware: TodoListMiddleware (LangChain built-in) with custom prompts
Tool: write_todos (injected by middleware)
uploaded_files
Type: list[dict] | None
Purpose: Tracks newly uploaded files for current turn.
Structure:
- Set by UploadsMiddleware in before_agent hook
- Only includes files NOT already shown in previous messages (deduplication)
- Cleared on next turn (not cumulative)
Implementation (backend/src/agents/middlewares/uploads_middleware.py:110-136):
viewed_images
Type: Annotated[dict[str, ViewedImageData], merge_viewed_images]
Purpose: Tracks images loaded via view_image tool for vision model analysis.
Structure:
Custom Reducer (backend/src/agents/thread_state.py:31-45):
- Normal updates: Merge dictionaries (new keys added, existing keys updated)
- Empty dict {}: Clears all viewed images (reset)
- Used by ViewImageMiddleware to inject images before LLM call
- Agent calls view_image tool → Tool returns base64 data in ToolMessage
- Tool also updates state: {"viewed_images": {path: {base64, mime_type}}}
- ViewImageMiddleware detects completed view_image tool calls in before_model
- Middleware injects HumanMessage with multimodal content (text + images)
- LLM analyzes images automatically
- Middleware clears state: {"viewed_images": {}} after processing
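The merge and reset behaviour of the reducer described above can be sketched like this. It is an illustration built from the description, not a copy of backend/src/agents/thread_state.py:31-45; the ViewedImageData fields are taken from the state update shown in the flow (base64, mime_type).

```python
from typing import Optional, TypedDict


class ViewedImageData(TypedDict):
    base64: str
    mime_type: str


def merge_viewed_images(
    existing: Optional[dict[str, ViewedImageData]],
    new: Optional[dict[str, ViewedImageData]],
) -> dict[str, ViewedImageData]:
    """Merge image dicts; an explicit empty dict acts as a reset signal."""
    if existing is None:
        return new or {}
    if new is None:
        # No update for this field: keep what is already stored.
        return existing
    if len(new) == 0:
        # Empty dict means "clear everything", not "nothing to add".
        return {}
    # Normal update: new keys are added, existing keys are overwritten.
    return {**existing, **new}
```

The reset case is what lets the middleware drop large base64 payloads from state once the images have been injected into a message.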
State Management Patterns
1. Middleware State Updates
Middlewares return state updates as dictionaries:
- Fields without custom reducers: Replace existing value
- Fields with custom reducers: Call reducer function
- messages: Uses LangGraph’s add_messages reducer (appends to the list; a message whose ID already exists replaces the earlier one)
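To make the replace-versus-reduce rule concrete, the sketch below applies an update dictionary to a state dictionary by looking up reducers in Annotated metadata. It only illustrates the merging rule; apply_update and DemoState are hypothetical names, and this is not how LangGraph implements the merge internally.

```python
from typing import Annotated, Any, Optional, TypedDict, get_type_hints


def merge_artifacts(existing, new):
    """Stand-in reducer: deduplicate while keeping first-seen order."""
    return list(dict.fromkeys((existing or []) + (new or [])))


class DemoState(TypedDict, total=False):
    title: Optional[str]                              # no reducer: value is replaced
    artifacts: Annotated[list[str], merge_artifacts]  # reducer: values are merged


def apply_update(schema: type, state: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Return a new state with `update` applied according to the schema's reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] exposes its extra arguments via __metadata__.
        metadata = getattr(hints.get(key), "__metadata__", ())
        reducer = next((m for m in metadata if callable(m)), None)
        merged[key] = reducer(state.get(key), value) if reducer else value
    return merged
```

For example, applying {"title": "New", "artifacts": ["b.md"]} to a state whose artifacts are ["a.md"] replaces the title and extends the artifact list to ["a.md", "b.md"].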
2. Tool State Updates
Tools can update state by returning dictionaries:
3. State Persistence
State persisted via LangGraph checkpointer:
- All ThreadState fields
- Full message history
- Checkpoints created after each agent step
4. Thread Isolation
Each thread maintains independent state:
- Separate directories: backend/.deer-flow/threads/{thread_id}/
- Separate sandboxes (if using Docker provider)
- Separate checkpoint history
Custom Reducer Implementation
When to Use Custom Reducers
- Deduplication - Remove duplicates while merging (like merge_artifacts)
- Merging Dicts - Intelligently merge nested structures (like merge_viewed_images)
- Reset Semantics - Support clearing values (empty dict resets viewed_images)
- Aggregation - Accumulate values with custom logic
Creating Custom Reducers
- Takes two arguments: existing (current state) and new (update)
- Both arguments can be None
- Returns merged value of same type
- Pure function (no side effects)
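A reducer that follows these four rules for the aggregation case might look like the sketch below. The field name tool_call_counts, the reducer merge_counts, and ExampleState are hypothetical and are not part of ThreadState.

```python
from typing import Annotated, Optional, TypedDict


def merge_counts(
    existing: Optional[dict[str, int]],
    new: Optional[dict[str, int]],
) -> dict[str, int]:
    """Sum per-key counters from an update into the stored counters."""
    # Copy first so neither argument is mutated (the reducer stays pure).
    merged = dict(existing or {})
    for key, amount in (new or {}).items():
        merged[key] = merged.get(key, 0) + amount
    return merged


class ExampleState(TypedDict, total=False):
    # The reducer is attached to the field through Annotated metadata.
    tool_call_counts: Annotated[dict[str, int], merge_counts]
```

Attaching the reducer in the type annotation keeps the merge policy next to the field definition, so every middleware or tool that writes the field gets the same behaviour.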
Testing Reducers
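Because reducers are pure functions, they can be tested without a graph or a checkpointer. The sketch below uses plain assert-based test functions of the kind pytest collects; the merge_artifacts defined here is a stand-in written for the example, so in a real test it would be imported from the project instead.

```python
def merge_artifacts(existing, new):
    """Stand-in reducer: deduplicate while keeping first-seen order."""
    return list(dict.fromkeys((existing or []) + (new or [])))


def test_deduplicates_and_keeps_order():
    assert merge_artifacts(["a.md", "b.md"], ["b.md", "c.md"]) == ["a.md", "b.md", "c.md"]


def test_handles_none_on_either_side():
    assert merge_artifacts(None, ["a.md"]) == ["a.md"]
    assert merge_artifacts(["a.md"], None) == ["a.md"]
    assert merge_artifacts(None, None) == []


def test_does_not_mutate_inputs():
    existing, new = ["a.md"], ["b.md"]
    merge_artifacts(existing, new)
    assert existing == ["a.md"] and new == ["b.md"]
```

The three cases mirror the reducer contract: correct merging, tolerance of None on either side, and no side effects on the inputs.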
State Debugging
Inspecting Current State
State Size Monitoring
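One simple way to see which fields dominate state size is to measure each field's JSON-encoded length. The helper below is a rough sketch: state_field_sizes is a hypothetical name, and the checkpointer uses its own serializer, so the numbers are estimates rather than exact stored sizes.

```python
import json
from typing import Any


def state_field_sizes(state: dict[str, Any]) -> dict[str, int]:
    """Return the approximate serialized size in bytes of each state field."""
    sizes = {}
    for key, value in state.items():
        # default=str keeps the estimate working for values that json cannot
        # encode natively (for example message objects).
        sizes[key] = len(json.dumps(value, default=str).encode("utf-8"))
    return sizes
```

Sorting the result by size, for example with sorted(sizes.items(), key=lambda kv: kv[1], reverse=True), puts the largest fields first.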
Performance Considerations
Memory Usage
- messages: Grows unbounded without summarization (use SummarizationMiddleware)
- viewed_images: Stores base64 data (can be large, clear after processing)
- artifacts: Small (just file paths)
Database Size
- Each checkpoint persists full state to database
- With PostgresCheckpointer: One row per checkpoint
- Recommend periodic cleanup of old threads
State Transfer
- State serialized/deserialized on every agent step
- Keep state schema simple (avoid deeply nested structures)
- Use NotRequired for optional fields (reduces serialization overhead)
See Also
- Middleware Chain - How middlewares manipulate state
- Context Engineering - Managing message history size
- Model Factory - Runtime configuration via state