Architecture Overview
┌────────────────────────────────────────────────────────────┐
│ Load Balancer / CDN │
│ (Cloudflare, AWS ALB) │
└──────────────────────────┬─────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────┐
│ Nginx / API Gateway │
│ (SSL termination, rate limiting) │
└───────┬──────────────────┬─────────────────┬───────────────┘
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Frontend │ │ Gateway │ │ LangGraph │
│ (Replicas) │ │ (Replicas) │ │ (Replicas) │
└──────────────┘ └──────┬───────┘ └──────┬───────┘
│ │
▼ ▼
┌──────────────────────────────────┐
│ Shared Storage (NFS/EFS) │
│ - Thread data │
│ - Skills │
│ - Artifacts │
└──────────────────────────────────┘
│
▼
┌──────────────────────────────────┐
│ Sandbox Provisioner │
│ (Kubernetes-based sandboxes) │
└──────────────────────────────────┘
Pre-deployment Checklist
Infrastructure
- Kubernetes cluster (EKS, GKE, AKS, or self-hosted)
- Container registry (ECR, GCR, Docker Hub)
- Load balancer (ALB, GCE LB, Nginx Ingress)
- TLS certificates (Let’s Encrypt, AWS ACM)
- Shared storage (NFS, EFS, Cloud Filestore)
- Database for checkpoints (PostgreSQL, Redis)
- Monitoring stack (Prometheus, Grafana)
- Logging aggregation (ELK, CloudWatch, Loki)
- Backup solution (Velero, cloud snapshots)
Security
- API keys stored in secrets manager (Vault, AWS Secrets Manager)
- Network policies configured
- RBAC policies defined
- Security scanning enabled (Snyk, Trivy)
- SSL/TLS certificates deployed
- Rate limiting configured
- DDoS protection enabled
- Audit logging enabled
Configuration
- config.yaml prepared for production
- Environment-specific settings configured
- Resource limits defined
- Auto-scaling policies configured
- Health check endpoints tested
- Backup retention policies defined
Production Configuration
config.yaml
Production-ready configuration:
models:
- name: gpt-4
display_name: GPT-4
use: langchain_openai:ChatOpenAI
model: gpt-4
api_key: $OPENAI_API_KEY # From secrets manager
max_tokens: 4096
temperature: 0.7
timeout: 120 # Increased timeout
max_retries: 3 # Retry failed requests
tool_groups:
- name: web
- name: file:read
- name: file:write
- name: bash
tools:
- name: web_search
group: web
use: src.community.tavily.tools:web_search_tool
max_results: 5
timeout: 30
- name: web_fetch
group: web
use: src.community.jina_ai.tools:web_fetch_tool
timeout: 30
# Use Kubernetes-based sandbox with provisioner
sandbox:
use: src.community.aio_sandbox:AioSandboxProvider
provisioner_url: http://provisioner:8002
# Resource limits enforced by provisioner pod config
subagents:
timeout_seconds: 1800 # 30 minutes for complex tasks
agents:
general-purpose:
timeout_seconds: 3600 # 1 hour for research
bash:
timeout_seconds: 600 # 10 minutes for commands
skills:
container_path: /mnt/skills
title:
enabled: true
max_words: 6
max_chars: 60
model_name: null
# Aggressive summarization for long conversations
summarization:
enabled: true
model_name: null
trigger:
- type: tokens
value: 50000 # Trigger earlier in production
- type: fraction
value: 0.7 # When 70% of context used
keep:
type: messages
value: 20 # Keep more recent messages
trim_tokens_to_summarize: 50000
# Memory persistence
memory:
enabled: true
storage_path: /mnt/shared/memory.json
debounce_seconds: 60
model_name: null
max_facts: 200
fact_confidence_threshold: 0.8 # Higher threshold
injection_enabled: true
max_injection_tokens: 3000
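The two summarization triggers above can be read with OR semantics: summarize when either the absolute token count or the context fraction is hit. A minimal Python sketch (the function name and OR semantics are illustrative, not the project's actual API):

```python
def should_summarize(used_tokens: int, context_window: int,
                     token_trigger: int = 50000,
                     fraction_trigger: float = 0.7) -> bool:
    """Fire summarization when either configured trigger is hit:
    an absolute token count, or a fraction of the context window."""
    return (used_tokens >= token_trigger
            or used_tokens / context_window >= fraction_trigger)

# With a 128k-token model the absolute 50k trigger fires first;
# with a 64k-token model the 70% fraction (44.8k tokens) fires earlier.
```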
Environment Variables
Store sensitive data in secrets:
# API Keys (from secrets manager)
OPENAI_API_KEY=secret-ref:openai-key
ANTHROPIC_API_KEY=secret-ref:anthropic-key
TAVILY_API_KEY=secret-ref:tavily-key
# Application settings
NODE_ENV=production
CI=true
# Logging
LOG_LEVEL=INFO
LOG_FORMAT=json
# Monitoring
ENABLE_METRICS=true
METRICS_PORT=9090
# Tracing (optional)
OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4317
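One way to resolve the `secret-ref:` indirection at startup is a small pass over the environment that swaps references for fetched values. The sketch below uses a plain dict as a stand-in for a real secrets-manager client; all names here are illustrative:

```python
SECRET_PREFIX = "secret-ref:"

def resolve_env(env, lookup):
    """Replace secret-ref: values with the secret fetched via lookup();
    everything else passes through untouched."""
    resolved = {}
    for key, value in env.items():
        if value.startswith(SECRET_PREFIX):
            resolved[key] = lookup(value[len(SECRET_PREFIX):])
        else:
            resolved[key] = value
    return resolved

# Dict-backed stub standing in for e.g. an AWS Secrets Manager client
store = {"openai-key": "sk-example"}
env = resolve_env(
    {"OPENAI_API_KEY": "secret-ref:openai-key", "LOG_LEVEL": "INFO"},
    store.__getitem__,
)
```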
Container Images
Build Production Images
Frontend Production Dockerfile
Create frontend/Dockerfile.prod:
# Build stage
FROM node:22-alpine AS builder
WORKDIR /app
# Install pnpm
RUN corepack enable && corepack install -g pnpm@10.26.2
# Copy package files
COPY frontend/package.json frontend/pnpm-lock.yaml ./
# Install dependencies
RUN pnpm install --frozen-lockfile --prod=false
# Copy source
COPY frontend/ .
# Build
RUN pnpm run build
# Production stage
FROM node:22-alpine
WORKDIR /app
# Install pnpm
RUN corepack enable && corepack install -g pnpm@10.26.2
# Copy built assets and dependencies
COPY --from=builder /app/.next ./.next
COPY --from=builder /app/public ./public
COPY --from=builder /app/package.json ./package.json
COPY --from=builder /app/node_modules ./node_modules
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
adduser -S nextjs -u 1001 && \
chown -R nextjs:nodejs /app
USER nextjs
EXPOSE 3000
ENV NODE_ENV=production
ENV PORT=3000
CMD ["pnpm", "start"]
Backend Production Dockerfile
Create backend/Dockerfile.prod:
FROM python:3.12-slim
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# Install uv
RUN curl -LsSf https://astral.sh/uv/install.sh | sh
ENV PATH="/root/.local/bin:$PATH"
WORKDIR /app
# Copy backend
COPY backend ./backend
# Install production dependencies only
RUN --mount=type=cache,target=/root/.cache/uv \
sh -c "cd backend && uv sync --no-dev"
# Create non-root user
RUN useradd -m -u 1001 deerflow && \
chown -R deerflow:deerflow /app
USER deerflow
EXPOSE 8001 2024
ENV PYTHONUNBUFFERED=1
# Multi-worker uvicorn for production
CMD ["sh", "-c", "cd backend && uv run uvicorn src.gateway.app:app --host 0.0.0.0 --port 8001 --workers 4 --log-level info"]
Build and Push
# Build images
docker build -t your-registry/deer-flow-frontend:v2.0 -f frontend/Dockerfile.prod .
docker build -t your-registry/deer-flow-backend:v2.0 -f backend/Dockerfile.prod .
# Push to registry
docker push your-registry/deer-flow-frontend:v2.0
docker push your-registry/deer-flow-backend:v2.0
# Tag as latest
docker tag your-registry/deer-flow-frontend:v2.0 your-registry/deer-flow-frontend:latest
docker tag your-registry/deer-flow-backend:v2.0 your-registry/deer-flow-backend:latest
docker push your-registry/deer-flow-frontend:latest
docker push your-registry/deer-flow-backend:latest
Kubernetes Deployment
Namespace
apiVersion: v1
kind: Namespace
metadata:
name: deer-flow-prod
labels:
name: deer-flow-prod
environment: production
Frontend Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: frontend
namespace: deer-flow-prod
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
selector:
matchLabels:
app: deer-flow-frontend
template:
metadata:
labels:
app: deer-flow-frontend
spec:
containers:
- name: frontend
image: your-registry/deer-flow-frontend:v2.0
ports:
- containerPort: 3000
env:
- name: NODE_ENV
value: "production"
- name: NEXT_PUBLIC_API_URL
value: "https://api.deerflow.example.com"
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 1000m
memory: 1Gi
livenessProbe:
httpGet:
path: /
port: 3000
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /
port: 3000
initialDelaySeconds: 5
periodSeconds: 5
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- deer-flow-frontend
topologyKey: kubernetes.io/hostname
Gateway Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: gateway
namespace: deer-flow-prod
spec:
replicas: 3
selector:
matchLabels:
app: deer-flow-gateway
template:
metadata:
labels:
app: deer-flow-gateway
spec:
containers:
- name: gateway
image: your-registry/deer-flow-backend:v2.0
command: ["sh", "-c"]
args:
- |
cd backend &&
uv run uvicorn src.gateway.app:app \
--host 0.0.0.0 --port 8001 \
--workers 4 \
--log-level info
ports:
- containerPort: 8001
env:
- name: OPENAI_API_KEY
valueFrom:
secretKeyRef:
name: deer-flow-secrets
key: openai-api-key
volumeMounts:
- name: config
mountPath: /app/config.yaml
subPath: config.yaml
- name: shared-storage
mountPath: /mnt/shared
resources:
requests:
cpu: 1000m
memory: 1Gi
limits:
cpu: 2000m
memory: 2Gi
livenessProbe:
httpGet:
path: /health
port: 8001
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /health
port: 8001
initialDelaySeconds: 10
periodSeconds: 5
volumes:
- name: config
configMap:
name: deer-flow-config
- name: shared-storage
persistentVolumeClaim:
claimName: deer-flow-shared-pvc
LangGraph Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: langgraph
namespace: deer-flow-prod
spec:
replicas: 3
selector:
matchLabels:
app: deer-flow-langgraph
template:
metadata:
labels:
app: deer-flow-langgraph
spec:
containers:
- name: langgraph
image: your-registry/deer-flow-backend:v2.0
command: ["sh", "-c"]
args:
- |
cd backend &&
uv run langgraph start \
--host 0.0.0.0 --port 2024
ports:
- containerPort: 2024
env:
- name: OPENAI_API_KEY
valueFrom:
secretKeyRef:
name: deer-flow-secrets
key: openai-api-key
volumeMounts:
- name: config
mountPath: /app/config.yaml
subPath: config.yaml
- name: shared-storage
mountPath: /mnt/shared
- name: threads
mountPath: /app/backend/.deer-flow
resources:
requests:
cpu: 2000m
memory: 2Gi
limits:
cpu: 4000m
memory: 4Gi
livenessProbe:
httpGet:
path: /
port: 2024
initialDelaySeconds: 30
periodSeconds: 10
volumes:
- name: config
configMap:
name: deer-flow-config
- name: shared-storage
persistentVolumeClaim:
claimName: deer-flow-shared-pvc
- name: threads
persistentVolumeClaim:
claimName: deer-flow-threads-pvc
Services
---
apiVersion: v1
kind: Service
metadata:
name: frontend
namespace: deer-flow-prod
spec:
type: ClusterIP
ports:
- port: 3000
targetPort: 3000
selector:
app: deer-flow-frontend
---
apiVersion: v1
kind: Service
metadata:
name: gateway
namespace: deer-flow-prod
spec:
type: ClusterIP
ports:
- port: 8001
targetPort: 8001
selector:
app: deer-flow-gateway
---
apiVersion: v1
kind: Service
metadata:
name: langgraph
namespace: deer-flow-prod
spec:
type: ClusterIP
ports:
- port: 2024
targetPort: 2024
selector:
app: deer-flow-langgraph
Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: deer-flow-ingress
namespace: deer-flow-prod
annotations:
kubernetes.io/ingress.class: nginx
cert-manager.io/cluster-issuer: letsencrypt-prod
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: 100m
nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
spec:
tls:
- hosts:
- deerflow.example.com
- api.deerflow.example.com
secretName: deer-flow-tls
rules:
- host: deerflow.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: frontend
port:
number: 3000
- host: api.deerflow.example.com
http:
paths:
- path: /api/langgraph
pathType: Prefix
backend:
service:
name: langgraph
port:
number: 2024
- path: /api
pathType: Prefix
backend:
service:
name: gateway
port:
number: 8001
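The listing order of the two `api.deerflow.example.com` paths does not matter: the NGINX ingress controller resolves overlapping Prefix paths by longest match, so `/api/langgraph` wins over `/api`. A sketch of that matching rule (the `route` helper is illustrative):

```python
def route(path, rules):
    """Longest-prefix match, as the NGINX ingress controller resolves
    overlapping Prefix paths (so /api/langgraph beats /api)."""
    matches = [p for p in rules
               if path == p or path.startswith(p.rstrip("/") + "/")]
    return rules[max(matches, key=len)] if matches else None

rules = {"/api": "gateway:8001", "/api/langgraph": "langgraph:2024"}
```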
Scaling
Horizontal Pod Autoscaler
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: frontend-hpa
namespace: deer-flow-prod
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: frontend
minReplicas: 3
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: gateway-hpa
namespace: deer-flow-prod
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: gateway
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 75
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: langgraph-hpa
namespace: deer-flow-prod
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: langgraph
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 80
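The HPA controller computes desiredReplicas = ceil(currentReplicas × currentMetric / target), clamped to the configured bounds. A sketch of that rule applied to the gateway HPA above:

```python
import math

def desired_replicas(current, current_util, target_util, lo, hi):
    """Kubernetes HPA scaling rule: ceil(current * metric / target),
    clamped to [minReplicas, maxReplicas]."""
    return max(lo, min(hi, math.ceil(current * current_util / target_util)))

# gateway-hpa: 3 replicas at 150% average CPU against a 75% target -> 6
```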
Cluster Autoscaler
Enable cluster autoscaler for node scaling:
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-autoscaler
namespace: kube-system
data:
min-nodes: "3"
max-nodes: "20"
scale-down-delay-after-add: "10m"
scale-down-unneeded-time: "10m"
Monitoring
Prometheus ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: deer-flow-metrics
namespace: deer-flow-prod
spec:
selector:
matchLabels:
app: deer-flow-gateway
endpoints:
- port: metrics
interval: 30s
path: /metrics
Grafana Dashboard
Key metrics to monitor:
- Request rate: Requests per second
- Response time: P50, P95, P99 latencies
- Error rate: 4xx/5xx responses
- Active threads: Number of agent threads
- Sandbox usage: Active sandbox pods
- Resource utilization: CPU, memory, disk
- Queue depth: Pending agent tasks
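The P50/P95/P99 latencies above come from Prometheus histogram buckets, but the quantity reduces to a sorted-sample quantile. A minimal nearest-rank sketch (illustrative, not how Prometheus interpolates buckets internally):

```python
import math

def percentile(samples, q):
    """Nearest-rank percentile over raw samples (0 < q <= 100).
    Prometheus derives the same quantity from histogram buckets."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

latencies = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.8, 1.2, 4.9]
```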
Alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: deer-flow-alerts
namespace: deer-flow-prod
spec:
groups:
- name: deer-flow
interval: 30s
rules:
- alert: HighErrorRate
expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
for: 5m
annotations:
summary: High error rate detected
- alert: HighLatency
expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 5
for: 5m
annotations:
summary: High latency detected (P95 > 5s)
- alert: PodCrashLooping
expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
for: 5m
annotations:
summary: Pod is crash looping
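The `for: 5m` clause means the expression must hold continuously before an alert fires; a single breached evaluation only moves the alert to pending. A sketch approximating that pending/firing logic at the configured 30s evaluation interval (state names follow Prometheus; the counting is simplified):

```python
def alert_state(breaches, interval_s=30, for_s=300):
    """Return 'inactive', 'pending', or 'firing' after a series of rule
    evaluations (True = expression breached), approximating Prometheus's
    `for:` clause: fire only once the breach has held continuously."""
    consecutive = 0
    for breached in breaches:
        consecutive = consecutive + 1 if breached else 0
    if consecutive == 0:
        return "inactive"
    return "firing" if consecutive * interval_s >= for_s else "pending"
```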
Security
Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deer-flow-network-policy
namespace: deer-flow-prod
spec:
podSelector:
matchLabels:
app: deer-flow-gateway
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector:
matchLabels:
app: deer-flow-frontend
- podSelector:
matchLabels:
app: deer-flow-langgraph
ports:
- protocol: TCP
port: 8001
egress:
- to:
- podSelector:
matchLabels:
app: deer-flow-provisioner
ports:
- protocol: TCP
port: 8002
- to:
- namespaceSelector: {}
ports:
- protocol: TCP
port: 443 # HTTPS for external APIs
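The policy above is an allow-list: gateway ingress is open only to the frontend and LangGraph pods on port 8001, and everything else is dropped. That decision, as a sketch (labels and ports taken from the manifest):

```python
ALLOWED_INGRESS = {
    ("deer-flow-frontend", 8001),
    ("deer-flow-langgraph", 8001),
}

def gateway_admits(source_app, port):
    """Decision the NetworkPolicy above encodes for gateway ingress:
    anything not on the allow-list is denied."""
    return (source_app, port) in ALLOWED_INGRESS
```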
Pod Security Policy
Note: PodSecurityPolicy was removed in Kubernetes 1.25. On newer clusters, enforce the equivalent constraints with Pod Security Admission or a policy engine such as Kyverno; the manifest below applies only to clusters still on the policy/v1beta1 API.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: deer-flow-psp
spec:
privileged: false
allowPrivilegeEscalation: false
requiredDropCapabilities:
- ALL
runAsUser:
rule: MustRunAsNonRoot
seLinux:
rule: RunAsAny
fsGroup:
rule: RunAsAny
volumes:
- configMap
- secret
- persistentVolumeClaim
- emptyDir
Secrets Management
Use the External Secrets Operator:
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
name: aws-secrets-manager
namespace: deer-flow-prod
spec:
provider:
aws:
service: SecretsManager
region: us-east-1
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: deer-flow-secrets
namespace: deer-flow-prod
spec:
refreshInterval: 1h
secretStoreRef:
name: aws-secrets-manager
kind: SecretStore
target:
name: deer-flow-secrets
creationPolicy: Owner
data:
- secretKey: openai-api-key
remoteRef:
key: prod/deerflow/openai-api-key
- secretKey: anthropic-api-key
remoteRef:
key: prod/deerflow/anthropic-api-key
Backup and Disaster Recovery
Velero Backup
# Install Velero
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.9.0 \
--bucket deer-flow-backups \
--backup-location-config region=us-east-1 \
--snapshot-location-config region=us-east-1
# Create backup schedule
velero schedule create deer-flow-daily \
--schedule="0 2 * * *" \
--include-namespaces deer-flow-prod \
--ttl 720h0m0s
# Manual backup
velero backup create deer-flow-backup-$(date +%Y%m%d) \
--include-namespaces deer-flow-prod
Thread Data Backup
#!/bin/bash
# backup-threads.sh
BACKUP_DIR="/backups/threads"
THREADS_DIR="/mnt/shared/threads"
RETENTION_DAYS=30
# Create timestamped backup
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
tar -czf "${BACKUP_DIR}/threads-${TIMESTAMP}.tar.gz" "${THREADS_DIR}"
# Upload to S3
aws s3 cp "${BACKUP_DIR}/threads-${TIMESTAMP}.tar.gz" \
s3://deer-flow-backups/threads/
# Clean old backups
find "${BACKUP_DIR}" -name "threads-*.tar.gz" -mtime +${RETENTION_DAYS} -delete
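Where the backup job runs in a container without cron or find, the same pruning step can be done in Python. A sketch mirroring `find -mtime +N -delete` (paths and retention match the script above):

```python
import time
from pathlib import Path

def prune_backups(backup_dir, retention_days=30, now=None):
    """Delete threads-*.tar.gz archives older than retention_days and
    return the names removed (mirrors `find -mtime +N -delete`)."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    removed = []
    for archive in sorted(Path(backup_dir).glob("threads-*.tar.gz")):
        if archive.stat().st_mtime < cutoff:
            archive.unlink()
            removed.append(archive.name)
    return removed
```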
Restore Procedure
# Restore from Velero
velero restore create --from-backup deer-flow-backup-20260304
# Restore thread data
aws s3 cp s3://deer-flow-backups/threads/threads-20260304-020000.tar.gz .
tar -xzf threads-20260304-020000.tar.gz -C /mnt/shared/
Performance Optimization
Caching
- Redis for session state:
env:
  - name: REDIS_URL
    value: redis://redis:6379/0
- CDN for static assets:
  - Cloudflare
  - AWS CloudFront
  - Fastly
- Response caching:
# In gateway app
from redis import asyncio as aioredis
from fastapi_cache import FastAPICache
from fastapi_cache.backends.redis import RedisBackend

@app.on_event("startup")
async def startup():
    redis = aioredis.from_url("redis://redis:6379")
    FastAPICache.init(RedisBackend(redis), prefix="deer-flow-cache")
Database Optimization
For checkpoint storage:
# PostgreSQL for LangGraph checkpoints
apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: deer-flow-prod
spec:
ports:
- port: 5432
selector:
app: postgres
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres
namespace: deer-flow-prod
spec:
serviceName: postgres
replicas: 1
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:16-alpine
env:
- name: POSTGRES_DB
value: deerflow
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgres-secret
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
volumeMounts:
- name: postgres-storage
mountPath: /var/lib/postgresql/data
resources:
requests:
cpu: 1000m
memory: 2Gi
limits:
cpu: 2000m
memory: 4Gi
volumeClaimTemplates:
- metadata:
name: postgres-storage
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 100Gi
Cost Optimization
Resource Right-sizing
Monitor and adjust resource requests/limits:
# Analyze resource usage
kubectl top pods -n deer-flow-prod
# View recommendations
kubectl describe vpa -n deer-flow-prod
Spot Instances
Use spot instances for sandbox pods:
spec:
nodeSelector:
eks.amazonaws.com/capacityType: SPOT # EKS; on GKE use cloud.google.com/gke-spot: "true"
tolerations:
- key: spot
operator: Equal
value: "true"
effect: NoSchedule
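Scheduling onto spot nodes requires the pod's toleration to match the node's taint on key, value, and effect. A sketch of that matching rule for the Equal operator used above (the Exists operator is omitted):

```python
def tolerates(toleration, taint):
    """Equal-operator toleration matches a taint when key, value and
    effect all agree (Kubernetes also supports Exists, omitted here)."""
    return (toleration.get("operator", "Equal") == "Equal"
            and toleration["key"] == taint["key"]
            and toleration["value"] == taint["value"]
            and toleration["effect"] == taint["effect"])

spot = {"key": "spot", "operator": "Equal", "value": "true",
        "effect": "NoSchedule"}
```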
Storage Tiering
# Hot data on SSD
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: deer-flow-hot-storage
spec:
storageClassName: fast-ssd
accessModes:
- ReadWriteMany
resources:
requests:
storage: 100Gi
# Cold data on HDD
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: deer-flow-cold-storage
spec:
storageClassName: standard-hdd
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Ti
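A tiering setup also needs a rule for which claim a given artifact lands on. A sketch assuming age-based placement (the 7-day threshold is illustrative, not something the project defines):

```python
def storage_class(last_access_days, hot_threshold_days=7.0):
    """Place recently used artifacts on the fast-ssd claim and the
    rest on standard-hdd, matching the two PVCs above."""
    return ("fast-ssd" if last_access_days <= hot_threshold_days
            else "standard-hdd")
```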
Troubleshooting
Check Pod Status
kubectl get pods -n deer-flow-prod
kubectl describe pod <pod-name> -n deer-flow-prod
kubectl logs <pod-name> -n deer-flow-prod --tail=100 -f
Check Resource Usage
kubectl top nodes
kubectl top pods -n deer-flow-prod
Debug Network Issues
# Test service connectivity
kubectl run -it --rm debug --image=nicolaka/netshoot --restart=Never -- bash
curl http://gateway:8001/health
# Check network policies
kubectl get networkpolicies -n deer-flow-prod
Performance Profiling
# Enable profiling in gateway
kubectl exec -it gateway-xxx -n deer-flow-prod -- bash
cd backend
uv run python -m cProfile -o profile.stats src/gateway/app.py
Next Steps
Docker Deployment
Learn about Docker-based deployment
Kubernetes Deployment
Deploy on Kubernetes with provisioner
See Also
- Configuration Guide - Production configuration
- Security Best Practices - Secure your deployment
- Monitoring Guide - Set up monitoring
- Backup Strategy - Backup and recovery