File Uploads - DeerFlow

Overview

DeerFlow supports file uploads with automatic document conversion and per-thread isolation. Users can upload files during a conversation, and the agent can then read and process them.

Features

Multi-file Upload

Upload multiple files simultaneously

Auto Conversion

Automatic PDF and Office document to Markdown conversion

Thread Isolation

Files stored in thread-specific directories

Agent Awareness

Agent automatically sees uploaded files

Supported File Formats

These formats are automatically converted to Markdown:

PDF: .pdf
PowerPoint: .ppt, .pptx
Excel: .xls, .xlsx
Word: .doc, .docx

Other file types are stored as-is and can be accessed by the agent.
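
As a rough illustration of the rule above, the following sketch decides from the extension alone whether an upload would be converted. The names `CONVERTIBLE_EXTENSIONS` and `is_convertible` are hypothetical and not part of DeerFlow's API, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions listed above as converted to Markdown.
CONVERTIBLE_EXTENSIONS = {
    ".pdf", ".ppt", ".pptx", ".xls", ".xlsx", ".doc", ".docx",
}


def is_convertible(filename: str) -> bool:
    """Return True if the file's final extension is in the convertible set."""
    # Assumption: extensions are compared case-insensitively.
    return Path(filename).suffix.lower() in CONVERTIBLE_EXTENSIONS
```

Only the final suffix is considered, so a name such as `archive.tar.gz` is treated as `.gz` and stored as-is.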

API Endpoints

Upload Files

POST /api/threads/{thread_id}/uploads
Content-Type: multipart/form-data

Request:

files: One or more files

Response:

{
  "success": true,
  "files": [
    {
      "filename": "document.pdf",
      "size": 1234567,
      "path": ".deer-flow/threads/{thread_id}/user-data/uploads/document.pdf",
      "virtual_path": "/mnt/user-data/uploads/document.pdf",
      "artifact_url": "/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.pdf",
      "markdown_file": "document.md",
      "markdown_path": ".deer-flow/threads/{thread_id}/user-data/uploads/document.md",
      "markdown_virtual_path": "/mnt/user-data/uploads/document.md",
      "markdown_artifact_url": "/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.md"
    }
  ],
  "message": "Successfully uploaded 1 file(s)"
}
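
A client often wants a single sandbox path per uploaded file to pass on to the agent. The sketch below prefers the converted Markdown path and falls back to the original. It assumes the `markdown_*` fields are missing or empty when no Markdown file was produced, which the response example does not confirm; `preferred_agent_path` and `agent_paths` are hypothetical helper names.

```python
from typing import Any


def preferred_agent_path(entry: dict[str, Any]) -> str:
    """Prefer the converted Markdown path, else fall back to the original."""
    # Assumption: markdown_virtual_path is absent or empty without a conversion.
    return entry.get("markdown_virtual_path") or entry["virtual_path"]


def agent_paths(upload_response: dict[str, Any]) -> list[str]:
    """Collect one sandbox path per file in an upload response."""
    return [preferred_agent_path(e) for e in upload_response.get("files", [])]
```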

List Files

GET /api/threads/{thread_id}/uploads/list

Response:

{
  "files": [
    {
      "filename": "document.pdf",
      "size": 1234567,
      "path": ".deer-flow/threads/{thread_id}/user-data/uploads/document.pdf",
      "virtual_path": "/mnt/user-data/uploads/document.pdf",
      "artifact_url": "/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.pdf",
      "extension": ".pdf",
      "modified": 1705997600.0
    }
  ],
  "count": 1
}
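
The `modified` value in the example looks like a Unix timestamp in seconds; that reading is an assumption. Under it, one entry of the list response could be formatted for display as follows, where `describe_file` is a hypothetical helper.

```python
from datetime import datetime, timezone


def describe_file(entry: dict) -> str:
    """Format one entry of the list response as a single line."""
    # Assumption: "modified" is seconds since the Unix epoch.
    modified = datetime.fromtimestamp(entry["modified"], tz=timezone.utc)
    return (
        f"{entry['filename']} "
        f"({entry['size']} bytes, modified {modified.isoformat()})"
    )
```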

Delete File

DELETE /api/threads/{thread_id}/uploads/{filename}

Response:

{
  "success": true,
  "message": "Deleted document.pdf"
}
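
Because the filename is part of the URL path, names containing spaces or other reserved characters need percent-encoding before the request is sent. A minimal sketch, with `delete_url` as a hypothetical helper:

```python
from urllib.parse import quote


def delete_url(base_url: str, thread_id: str, filename: str) -> str:
    """Build the DELETE URL with each path segment percent-encoded."""
    # safe="" also encodes "/", so a filename cannot add extra path segments.
    return (
        f"{base_url}/api/threads/{quote(thread_id, safe='')}"
        f"/uploads/{quote(filename, safe='')}"
    )
```

The resulting string can be passed directly to `requests.delete`.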

Using the Upload API

curl

# Upload single file
curl -X POST http://localhost:2026/api/threads/my-thread/uploads \
  -F "files=@/path/to/document.pdf"

# Upload multiple files
curl -X POST http://localhost:2026/api/threads/my-thread/uploads \
  -F "files=@document.pdf" \
  -F "files=@presentation.pptx" \
  -F "files=@spreadsheet.xlsx"

# List uploaded files
curl http://localhost:2026/api/threads/my-thread/uploads/list

# Delete file
curl -X DELETE http://localhost:2026/api/threads/my-thread/uploads/document.pdf

Python

import requests

thread_id = "my-thread"
base_url = "http://localhost:2026"

# Upload files
with open("document.pdf", "rb") as f1, \
     open("presentation.pptx", "rb") as f2:
    files = [
        ("files", f1),
        ("files", f2),
    ]
    response = requests.post(
        f"{base_url}/api/threads/{thread_id}/uploads",
        files=files
    )
    print(response.json())

# List files
response = requests.get(
    f"{base_url}/api/threads/{thread_id}/uploads/list"
)
print(response.json())

# Delete file
response = requests.delete(
    f"{base_url}/api/threads/{thread_id}/uploads/document.pdf"
)
print(response.json())

JavaScript/TypeScript

const threadId = "my-thread";
const baseUrl = "http://localhost:2026";

// Upload files
async function uploadFiles(files: File[]) {
  const formData = new FormData();
  files.forEach(file => {
    formData.append('files', file);
  });

  const response = await fetch(
    `${baseUrl}/api/threads/${threadId}/uploads`,
    {
      method: 'POST',
      body: formData,
    }
  );

  return response.json();
}

// List files
async function listFiles() {
  const response = await fetch(
    `${baseUrl}/api/threads/${threadId}/uploads/list`
  );
  return response.json();
}

// Delete file
async function deleteFile(filename: string) {
  const response = await fetch(
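    // Encode the filename so spaces and reserved characters survive in the URL path
    `${baseUrl}/api/threads/${threadId}/uploads/${encodeURIComponent(filename)}`,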
    { method: 'DELETE' }
  );
  return response.json();
}

Path Mapping

Each uploaded file can be referenced through three path representations:

Physical Path

Actual location on the filesystem:

backend/.deer-flow/threads/{thread_id}/user-data/uploads/document.pdf

Virtual Path (Agent)

Path used by the agent in sandbox:

/mnt/user-data/uploads/document.pdf

The agent reads files using this path:

read_file("/mnt/user-data/uploads/document.pdf")

Artifact URL (Frontend)

HTTP URL for frontend access:

/api/threads/{thread_id}/artifacts/mnt/user-data/uploads/document.pdf
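
The three representations follow a regular pattern, so they can be derived from a thread ID and a filename. The sketch below only mirrors the path templates shown in this section; `UploadPaths` and `map_upload_paths` are hypothetical names, not DeerFlow functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadPaths:
    physical: str      # location on the host filesystem
    virtual: str       # path the agent uses inside the sandbox
    artifact_url: str  # HTTP path the frontend uses


def map_upload_paths(thread_id: str, filename: str) -> UploadPaths:
    """Derive the three path representations for one uploaded file."""
    virtual = f"/mnt/user-data/uploads/{filename}"
    return UploadPaths(
        physical=f"backend/.deer-flow/threads/{thread_id}/user-data/uploads/{filename}",
        virtual=virtual,
        # The artifact URL embeds the virtual path after ".../artifacts".
        artifact_url=f"/api/threads/{thread_id}/artifacts{virtual}",
    )
```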

File Storage Structure

backend/.deer-flow/threads/
└── {thread_id}/
    └── user-data/
        └── uploads/
            ├── document.pdf          # Original file
            ├── document.md           # Converted Markdown
            ├── presentation.pptx
            ├── presentation.md
            ├── spreadsheet.xlsx
            └── spreadsheet.md

Each thread has its own isolated upload directory. Files cannot be accessed across threads.
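
One common way to enforce this kind of isolation is to resolve the requested path and reject anything that ends up outside the thread's upload directory. The following is a sketch of that idea, not DeerFlow's actual check; `resolve_in_thread` is a hypothetical name, `base_dir` stands for the `.deer-flow` directory, and `thread_id` is assumed to be trusted already.

```python
from pathlib import Path


def resolve_in_thread(base_dir: Path, thread_id: str, filename: str) -> Path:
    """Resolve a filename inside one thread's upload directory or raise."""
    uploads = (base_dir / "threads" / thread_id / "user-data" / "uploads").resolve()
    candidate = (uploads / filename).resolve()
    # After resolution, ".." segments are collapsed, so containment is reliable.
    if not candidate.is_relative_to(uploads):  # requires Python 3.9+
        raise PermissionError(f"{filename!r} escapes the upload directory")
    return candidate
```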

Agent Integration

Automatic File Listing

The UploadsMiddleware automatically injects uploaded files into every agent request:

<uploaded_files>
The following files have been uploaded and are available for use:

- document.pdf (1.2 MB)
  Path: /mnt/user-data/uploads/document.pdf

- document.md (45.3 KB)
  Path: /mnt/user-data/uploads/document.md

You can read these files using the `read_file` tool with the paths shown above.
</uploaded_files>
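
To make the shape of this block concrete, here is a sketch that renders the same text from a list of file entries. It is not the middleware's real implementation: `format_size` and `render_uploaded_files` are hypothetical names, and the use of 1024-based units is an assumption chosen to be consistent with the sizes shown above.

```python
def format_size(num_bytes: int) -> str:
    """Render a byte count with one decimal place (1024-based units assumed)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GB"


def render_uploaded_files(files: list[dict]) -> str:
    """Build the <uploaded_files> block from filename/size/virtual_path entries."""
    lines = [
        "<uploaded_files>",
        "The following files have been uploaded and are available for use:",
        "",
    ]
    for f in files:
        lines.append(f"- {f['filename']} ({format_size(f['size'])})")
        lines.append(f"  Path: {f['virtual_path']}")
        lines.append("")
    lines.append(
        "You can read these files using the `read_file` tool with the paths shown above."
    )
    lines.append("</uploaded_files>")
    return "\n".join(lines)
```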

Reading Uploaded Files

The agent can read files using the read_file tool:

# Read original PDF (if binary reading is supported)
read_file("/mnt/user-data/uploads/document.pdf")

# Read converted Markdown (recommended)
read_file("/mnt/user-data/uploads/document.md")

Reading the Markdown version (.md) is recommended as it provides text content the agent can process.

Document Conversion

DeerFlow uses markitdown to convert documents:

Conversion Process

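1. Upload: The file is uploaded via a POST request.
2. Storage: The original file is saved to the thread's uploads directory.
3. Detection: The file extension is checked against the supported formats.
4. Conversion: The document is converted to Markdown using markitdown.
5. Save Markdown: The converted text is saved next to the original, with the extension replaced by .md (for example, document.pdf becomes document.md).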

Handling Conversion Failures

If conversion fails:

Original file is still saved
Error logged but not returned to user
Agent can still access original file
No Markdown file created

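import logging
from markitdown import MarkItDown

logger = logging.getLogger(__name__)

try:
    # convert() returns a result object; its text_content is the Markdown
    markdown_content = MarkItDown().convert(str(file_path)).text_content
    save_markdown(markdown_content)  # illustrative helper, not a markitdown API
except Exception as e:
    logger.error(f"Conversion failed: {e}")
    # Original file still accessible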
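
The conversion steps and the failure behaviour described in this section can be combined into one function. The sketch below is illustrative rather than DeerFlow's actual code: `store_upload` and `CONVERTIBLE` are hypothetical names, and the converter is passed in as a plain callable so the flow can be exercised without markitdown installed.

```python
import logging
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger(__name__)

CONVERTIBLE = {".pdf", ".ppt", ".pptx", ".xls", ".xlsx", ".doc", ".docx"}


def store_upload(
    uploads_dir: Path,
    filename: str,
    data: bytes,
    convert: Callable[[Path], str],
) -> tuple[Path, Optional[Path]]:
    """Save the original, then try to write a Markdown sibling next to it."""
    uploads_dir.mkdir(parents=True, exist_ok=True)
    original = uploads_dir / filename
    original.write_bytes(data)  # Storage: the original is always kept

    if original.suffix.lower() not in CONVERTIBLE:  # Detection
        return original, None

    try:
        markdown = convert(original)  # Conversion
    except Exception as exc:
        # Failure is logged only; the caller still gets the original path.
        logger.error("Conversion failed for %s: %s", filename, exc)
        return original, None

    md_path = original.with_suffix(".md")  # Save Markdown
    md_path.write_text(markdown, encoding="utf-8")
    return original, md_path
```

With markitdown, the `convert` argument could be something like `lambda p: MarkItDown().convert(str(p)).text_content`.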

Frontend Integration

Implement file upload in your UI:

components/FileUpload.tsx

import { useState } from 'react';

export function FileUpload({ threadId }: { threadId: string }) {
  const [uploading, setUploading] = useState(false);

  const handleUpload = async (event: React.ChangeEvent<HTMLInputElement>) => {
    const files = event.target.files;
    if (!files) return;

    setUploading(true);
    const formData = new FormData();
    
    Array.from(files).forEach(file => {
      formData.append('files', file);
    });

    try {
      const response = await fetch(
        `/api/threads/${threadId}/uploads`,
        {
          method: 'POST',
          body: formData,
        }
      );

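      // fetch resolves on HTTP errors too, so check the status explicitly
      if (!response.ok) {
        throw new Error(`Upload failed with status ${response.status}`);
      }

      const result = await response.json();
      console.log('Uploaded:', result);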
      
      // Refresh file list or notify user
    } catch (error) {
      console.error('Upload failed:', error);
    } finally {
      setUploading(false);
    }
  };

  return (
    <div>
      <input
        type="file"
        multiple
        onChange={handleUpload}
        disabled={uploading}
      />
      {uploading && <span>Uploading...</span>}
    </div>
  );
}

Limits and Restrictions

File Size Limit

Default: 100 MB per file. Configure in nginx:

client_max_body_size 100M;

File Name Security

Path traversal prevented (no ../ in filenames)
Special characters sanitized
Filenames normalized
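
The exact sanitization rules are not spelled out here, so the following is only a sketch of one conservative approach, assuming an allow-list of word characters, dots, hyphens, and spaces. `sanitize_filename` is a hypothetical name, and DeerFlow's real rules may differ.

```python
import re
import unicodedata


def sanitize_filename(name: str) -> str:
    """Reduce an untrusted filename to a single safe path component."""
    # Normalize Unicode so visually identical names compare equal.
    name = unicodedata.normalize("NFC", name)
    # Keep only the last component, treating both / and \ as separators.
    name = re.split(r"[\\/]", name)[-1]
    # Replace anything outside the allow-list with an underscore.
    name = re.sub(r"[^\w.\- ]", "_", name).strip()
    if name in {"", ".", ".."}:
        raise ValueError("invalid filename")
    return name
```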

Thread Isolation

Each thread has separate upload directory
Cross-thread access blocked
Files deleted when thread is deleted

Supported Conversions

PDF: ✅ Text extraction, images as placeholders
Office (docx, xlsx, pptx): ✅ Full text extraction
Images: ❌ No automatic OCR (consider adding)
Archives (zip): ❌ No automatic extraction

Implementation Details

Components

Upload Router

src/gateway/routers/uploads.py
Handles HTTP endpoints

Uploads Middleware

src/agents/middlewares/uploads_middleware.py
Injects file list into agent

Artifacts Router

src/gateway/routers/artifacts.py
Serves files to frontend

Dependencies

pyproject.toml

[project]
dependencies = [
    "markitdown>=0.0.1a2",       # Document conversion
    "python-multipart>=0.0.20",  # File upload handling
]

Troubleshooting

Upload fails with 413 error

The file exceeds the size limit. Increase the limit in the nginx config:

docker/nginx/nginx.conf

client_max_body_size 200M;  # Increase to 200MB

Restart nginx:

make stop
make dev

Conversion fails silently

Check Gateway logs:

tail -f logs/gateway.log | grep markitdown

Verify markitdown is installed:

cd backend
uv run python -c "import markitdown"

Agent can't see uploaded files

Verify:

Files uploaded successfully (check response)
UploadsMiddleware is registered in agent
Thread ID matches between upload and agent

Files exist in filesystem:

ls backend/.deer-flow/threads/{thread_id}/user-data/uploads/

Files not accessible in sandbox

For non-local sandbox:

Ensure sandbox is running
Check mount configuration
Verify thread_id matches
Check sandbox logs

Best Practices

Validate Files Client-Side

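const MAX_SIZE = 100 * 1024 * 1024; // 100 MB, matching the default limit

// Returns true when the file is small enough to upload
function validateFile(file: File): boolean {
  if (file.size > MAX_SIZE) {
    alert('File too large');
    return false;
  }
  return true;
}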

Show Upload Progress

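// Assumes threadId, formData, and updateProgress are defined by the caller
const xhr = new XMLHttpRequest();
xhr.upload.addEventListener('progress', (e) => {
  if (!e.lengthComputable) return; // total size unknown, skip the update
  const percent = (e.loaded / e.total) * 100;
  updateProgress(percent);
});
xhr.open('POST', `/api/threads/${threadId}/uploads`);
xhr.send(formData);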

Display File List

const files = await fetch(`/api/threads/${threadId}/uploads/list`)
  .then(r => r.json());

displayFiles(files.files);

Handle Errors Gracefully

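try {
  const response = await fetch(`/api/threads/${threadId}/uploads`, {
    method: 'POST',
    body: formData,
  });
  // fetch resolves on HTTP errors, so check the status explicitly
  if (response.status === 413) {
    alert('File too large');
  } else if (!response.ok) {
    alert('Upload failed');
  }
} catch (error) {
  // Network failure: the request never completed
  alert('Upload failed');
}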

Next Steps

Custom Tools

Create tools that process uploaded files

Creating Skills

Build skills that work with documents

API Reference

Complete upload API documentation

Configuration

Configure file handling settings
