August 7, 2025: Added GPT-5 and updated the Deep Research Bench evaluation with GPT-5 results.
Clone the repository and activate a virtual environment:

```bash
git clone https://github.com/langchain-ai/open_deep_research.git
cd open_deep_research
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```

Install dependencies:

```bash
uv sync
# or
uv pip install -r pyproject.toml
```
Set up your `.env` file to customize the environment variables (for model selection, search tools, and other configuration settings):

```bash
cp .env.example .env
```
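After editing `.env`, a quick sanity check can confirm your keys are actually loaded before you start the server. The key names below (OpenAI for the default models, Tavily for the default search API) are assumptions; see `.env.example` for the authoritative list for your configuration.

```python
# Minimal sketch: verify the API keys for the assumed default configuration.
# OPENAI_API_KEY / TAVILY_API_KEY are assumptions -- adjust to match the
# providers you actually enable in your .env.
import os

for key in ("OPENAI_API_KEY", "TAVILY_API_KEY"):
    if not os.environ.get(key):
        print(f"Missing {key}: add it to your .env before starting the server")
```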
Launch the LangGraph server locally:

```bash
# Install dependencies and start the LangGraph server
uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.11 langgraph dev --allow-blocking
```
This will open the LangGraph Studio UI in your browser.
- 🚀 API: http://127.0.0.1:2024
- 🎨 Studio UI: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
- 📚 API Docs: http://127.0.0.1:2024/docs
Enter a research topic in the `messages` input field and click `Submit`. Select a different configuration in the "Manage Assistants" tab.

You can configure the model used at each stage:

- **Summarization model** (default: `openai:gpt-4.1-mini`): Summarizes search API results
- **Research model** (default: `openai:gpt-4.1`): Powers the search agent
- **Compression model** (default: `openai:gpt-4.1`): Compresses research findings
- **Final report model** (default: `openai:gpt-4.1`): Writes the final report

Note: the selected model will need to support structured outputs and tool calling.
Note: For OpenRouter, follow this guide; for local models via Ollama, see the setup instructions.
See the `search_api` and `mcp_config` fields in the `configuration.py` file for more details on configuring search tools and MCP servers. These can also be set via the LangGraph Studio UI.
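As an illustration, here is a minimal sketch of invoking the researcher programmatically against the local server with the LangGraph Python SDK (`pip install langgraph-sdk`). The assistant ID and the configurable keys (`research_model`, `search_api`) are assumptions based on the defaults above; verify the real names in `langgraph.json` and `configuration.py`.

```python
# Sketch: run the deep researcher against the local LangGraph server
# started in the quickstart (http://127.0.0.1:2024).
from langgraph_sdk import get_sync_client

client = get_sync_client(url="http://127.0.0.1:2024")

# Stream a one-off (stateless) run, overriding two assumed config fields.
for chunk in client.runs.stream(
    None,  # no thread: stateless run
    "Deep Researcher",  # assistant ID from langgraph.json (assumption)
    input={"messages": [{"role": "user", "content": "Research open-source deep research agents."}]},
    config={"configurable": {"research_model": "openai:gpt-4.1", "search_api": "tavily"}},
    stream_mode="updates",
):
    print(chunk.event)
```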
Warning: Running across the 100 examples can cost ~$20-$100 depending on the model selection.

```bash
# Run comprehensive evaluation on LangSmith datasets
python tests/run_evaluate.py
```
This logs results to LangSmith under the project name `YOUR_EXPERIMENT_NAME`. Once this is done, extract the results to a JSONL file that can be submitted to the Deep Research Bench:

```bash
python tests/extract_langsmith_data.py --project-name "YOUR_EXPERIMENT_NAME" --model-name "your-model-name" --dataset-name "deep_research_bench"
```

This creates `tests/expt_results/deep_research_bench_model-name.jsonl` with the required format. Move the generated JSONL file to a local clone of the Deep Research Bench repository and follow their Quick Start guide for evaluation submission.
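Before moving the file, a quick sanity check can catch a truncated extraction. This sketch only counts records and makes no assumption about the fields Deep Research Bench expects:

```python
# Sketch: count records in the extracted JSONL before submission.
import json

path = "tests/expt_results/deep_research_bench_model-name.jsonl"
with open(path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]
print(f"{len(records)} records extracted")  # expect one per evaluated example
```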
Example evaluation results:

| Name | Commit | Summarization | Research | Compression | Total Cost | Total Tokens | RACE Score | Experiment |
|---|---|---|---|---|---|---|---|---|
| GPT-5 | ca3951d | openai:gpt-4.1-mini | openai:gpt-5 | openai:gpt-4.1 | | 204,640,896 | 0.4943 | Link |
| Defaults | 6532a41 | openai:gpt-4.1-mini | openai:gpt-4.1 | openai:gpt-4.1 | $45.98 | 58,015,332 | 0.4309 | Link |
| Claude Sonnet 4 | f877ea9 | openai:gpt-4.1-mini | anthropic:claude-sonnet-4-20250514 | openai:gpt-4.1 | $187.09 | 138,917,050 | 0.4401 | Link |
| Deep Research Bench Submission | c0a160b | openai:gpt-4.1-nano | openai:gpt-4.1 | openai:gpt-4.1 | $87.83 | 207,005,549 | 0.4344 | Link |
Open Agent Platform (OAP) is a UI from which non-technical users can build and configure their own agents. OAP makes it easy for users to configure the Deep Researcher with the MCP tools and search APIs best suited to their needs and the problems they want to solve.
You can also deploy your own instance of OAP, and make your own custom agents (like Deep Researcher) available on it to your users.
The `src/legacy/` folder contains two earlier implementations that provide alternative approaches to automated research. They are less performant than the current implementation, but remain useful for understanding different approaches to deep research.

**Workflow implementation** (`legacy/graph.py`) — a toy sketch of this pattern follows the list:

- **Plan-and-Execute**: Structured workflow with human-in-the-loop planning
- **Sequential Processing**: Creates sections one by one with reflection
- **Interactive Control**: Allows feedback and approval of report plans
- **Quality Focused**: Emphasizes accuracy through iterative refinement
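The following toy sketch illustrates the plan-and-execute pattern in plain Python: a plan is proposed, a human approves it, then sections are written one at a time with a reflection pass. It illustrates the pattern only, not the actual `legacy/graph.py` code; all function bodies are stubs.

```python
# Toy illustration of plan-and-execute with human-in-the-loop approval.
def propose_plan(topic: str) -> list[str]:
    # In the real workflow an LLM drafts the report plan; stubbed here.
    return [f"{topic}: background", f"{topic}: current state", f"{topic}: outlook"]

def write_section(heading: str) -> str:
    return f"## {heading}\n(draft content)"

def reflect(section: str) -> str:
    # A reflection pass would critique and revise the draft; no-op stub.
    return section

topic = "Open-source research agents"
plan = propose_plan(topic)
print("Proposed plan:", plan)
if input("Approve plan? [y/n] ").lower() == "y":
    # Sequential: each section is written and refined one at a time.
    report = "\n\n".join(reflect(write_section(h)) for h in plan)
    print(report)
```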
**Multi-agent implementation** (`legacy/multi_agent.py`) — sketched after the list below:

- **Supervisor-Researcher Architecture**: Coordinated multi-agent system
- **Parallel Processing**: Multiple researchers work simultaneously
- **Speed Optimized**: Faster report generation through concurrency
- **MCP Support**: Extensive Model Context Protocol integration
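And a similarly hedged sketch of the supervisor-researcher pattern: a supervisor fans subtopics out to concurrent workers and gathers the findings. Again, this illustrates the idea rather than reproducing `legacy/multi_agent.py`.

```python
# Toy illustration of the supervisor-researcher pattern with asyncio.
# Concurrency is the point: researchers run in parallel, the supervisor gathers.
import asyncio

async def researcher(subtopic: str) -> str:
    await asyncio.sleep(0.1)  # stands in for search + LLM calls
    return f"findings on {subtopic}"

async def supervisor(topic: str) -> str:
    subtopics = [f"{topic}: history", f"{topic}: tooling", f"{topic}: benchmarks"]
    findings = await asyncio.gather(*(researcher(s) for s in subtopics))
    return "\n".join(findings)

print(asyncio.run(supervisor("deep research agents")))
```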