Deep Research Tool

The Deep Research Tool is an automated research assistant that uses iterative deep-search techniques to gather insights from the web. It combines large language models, SERP (Search Engine Results Page) queries, and a recursive research methodology to compile detailed reports. The tool offers both a command-line (console) interface and a web-based (Streamlit) interface.

Features

  • Iterative Deep Research:

    • Generates follow-up questions from the initial user query.
    • Creates tailored SERP queries based on ongoing research learnings.
    • Recursively drills down into topics by controlling research breadth and depth.
  • Report Generation:

    • Processes SERP results to extract key learnings and URLs.
    • Compiles all gathered information into a detailed final report.
  • Multiple Interfaces:

    • Console Application: Interactively guides the user via terminal inputs.
    • Streamlit Web App: Provides an easy-to-use UI with live progress updates and a downloadable final report.
  • Configurable Parameters:

    • Control search breadth (the number of query branches per level) and depth (the number of recursive search levels); for example, breadth 4 and depth 2 starts four query branches, each of which can spawn one further round of narrower searches.
    • Configure environment variables for API keys and endpoints.

Project Structure

.
├── src
│   ├── deep_research.py         # Core research logic, utility functions, and asynchronous deep research routine.
│   ├── console_app.py           # Console (CLI) application for running deep research.
│   └── streamlit_app.py         # Streamlit web application to run deep research with a GUI.
├── .env                       # Environment configuration (API keys, endpoints, etc.).
├── requirements.txt          # Python dependencies.
└── README.md                 # This file.

Prerequisites

  • Python 3.11+ is required.
  • Install the required packages using the provided requirements.txt.

Installation

  1. Clone the Repository:

    git clone https://github.com/mohocp/deep-research-python.git
    cd deep-research-python
  2. Create and Activate a Virtual Environment (Optional but Recommended):

    python -m venv venv
    source venv/bin/activate  # On Windows use: venv\Scripts\activate
  3. Install Dependencies:

    pip install -r requirements.txt

Configuration

Before running the application, set up your environment variables. You can do this by editing the provided .env file. The following keys must be configured:

  • LLM (Large Language Model) Settings:

    • LLM_KEY — API key for the language model (e.g., OpenAI).
    • LLM_MODEL — The model name (default: gpt-4o).
    • LLM_ENDPOINT — The endpoint URL for your LLM API (default: https://api.openai.com/v1).
  • Firecrawl (SERP Search) Settings:

    • FIRECRAWL_KEY — Your API key for Firecrawl.
    • FIRECRAWL_BASE_URL — Base URL for the Firecrawl API.
  • Other Parameters:

    • CONTEXT_SIZE — Maximum allowed context size (default: 128000).
    • MAX_OUTPUT_TOKENS — Maximum tokens for LLM responses (default: 8000).
    • BREADTH — Default breadth (number of query branches) for research (default: 4).
    • DEPTH — Default depth (recursion levels) for research (default: 2).

Note: If you are using a self-hosted Firecrawl instance or different LLM settings, update the .env file accordingly.
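
A minimal example .env using the keys above. The values shown are placeholders; substitute your own keys, and adjust the endpoints if you self-host Firecrawl or use a different LLM provider:

    # LLM settings
    LLM_KEY=sk-...
    LLM_MODEL=gpt-4o
    LLM_ENDPOINT=https://api.openai.com/v1

    # Firecrawl settings
    FIRECRAWL_KEY=fc-...
    FIRECRAWL_BASE_URL=https://api.firecrawl.dev

    # Research parameters
    CONTEXT_SIZE=128000
    MAX_OUTPUT_TOKENS=8000
    BREADTH=4
    DEPTH=2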

Usage

You can run either the console app or the Streamlit app.

1. Running the Console Application

The console app provides an interactive command-line interface:

python src/console_app.py
  • Step 1: Enter your research query.
  • Step 2: Answer follow-up questions generated by the tool.
  • Step 3: The tool performs deep research, displays progress, and saves the final report to output.md.

2. Running the Streamlit Web Application

The Streamlit app provides a graphical interface with live progress updates:

streamlit run src/streamlit_app.py
  • Step 1: Enter your research query and adjust the breadth/depth parameters if needed.
  • Step 2: Answer the generated follow-up questions.
  • Step 3: Watch the research progress in real time and view/download the final report directly from the browser.

How It Works

  1. Feedback & Query Generation:

    • The tool first asks follow-up questions based on the initial user query to clarify research direction.
    • It then generates multiple SERP queries using an assistant agent powered by a large language model.
  2. SERP Search & Processing:

    • Each generated query is run concurrently (subject to a concurrency limit).
    • The SERP results are processed to extract key learnings and URLs.
  3. Recursive Deep Research:

    • If additional depth is allowed, the tool recursively generates new queries based on follow-up questions and learnings from the current search (see the sketch after this list).
  4. Final Report Compilation:

    • All learnings and sources are combined into a detailed report.
    • The report is output as Markdown and saved locally (output.md for the console app).
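
To make steps 1-3 concrete, here is a simplified, self-contained sketch of the recursive loop. The helper functions and the breadth-halving rule are illustrative placeholders rather than the exact API of src/deep_research.py:

    import asyncio

    CONCURRENCY_LIMIT = 2  # maximum simultaneous SERP searches

    # --- Hypothetical helpers; the real logic lives in src/deep_research.py ---

    async def generate_serp_queries(query, n, learnings):
        # Placeholder: the real function asks the LLM for up to n search
        # queries, informed by the learnings gathered so far.
        return [f"{query} (angle {i + 1})" for i in range(n)]

    async def search_and_extract(serp_query):
        # Placeholder: the real code runs a Firecrawl search, then has the LLM
        # distill the results into (learnings, follow-up questions, source URLs).
        return [f"learning about {serp_query}"], [f"open question on {serp_query}"], []

    # --- The recursive deep-research loop ---

    async def deep_research(query, breadth, depth, learnings=None, urls=None):
        learnings, urls = learnings or [], urls or []
        queries = await generate_serp_queries(query, breadth, learnings)
        sem = asyncio.Semaphore(CONCURRENCY_LIMIT)

        async def run_one(q):
            async with sem:  # enforce the concurrency limit
                new_learnings, follow_ups, new_urls = await search_and_extract(q)
            if depth > 1:
                # Drill down: fold follow-up questions into the next query,
                # narrow the breadth, and spend one level of depth.
                next_query = q + " | follow-ups: " + "; ".join(follow_ups)
                return await deep_research(next_query, max(1, breadth // 2), depth - 1,
                                           learnings + new_learnings, urls + new_urls)
            return learnings + new_learnings, urls + new_urls

        results = await asyncio.gather(*(run_one(q) for q in queries))
        all_learnings = sorted({l for ls, _ in results for l in ls})  # deduplicate
        all_urls = sorted({u for _, us in results for u in us})
        return all_learnings, all_urls

    if __name__ == "__main__":
        print(asyncio.run(deep_research("history of solar power", breadth=2, depth=2)))

In the actual tool, the accumulated learnings and source URLs are then compiled into the final Markdown report (step 4).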

Troubleshooting & Notes

  • API Rate Limits:
    The tool implements exponential backoff when it encounters rate limits (a sketch of this retry pattern appears at the end of this section). If you see repeated rate-limit messages, consider reducing your API usage or reviewing the Firecrawl documentation.

  • Asynchronous Execution:
    Both the console and Streamlit apps use asynchronous programming (asyncio) to perform multiple searches concurrently; asyncio ships with the standard library in Python 3.11+, which this project already requires.

  • Customizing Prompts:
    You can adjust the research instructions and system prompt in src/deep_research.py to better fit your use case.
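
The rate-limit handling mentioned above follows the standard exponential-backoff pattern. A minimal sketch, with an illustrative exception class and retry parameters (not the tool's exact values):

    import asyncio
    import random

    class RateLimitError(Exception):
        # Stand-in for whatever rate-limit error your HTTP/LLM client raises.
        pass

    async def with_backoff(make_call, max_retries=5, base_delay=1.0):
        # Retry an async call, doubling the wait after each rate-limit error
        # and adding jitter so concurrent tasks do not retry in lockstep.
        for attempt in range(max_retries):
            try:
                return await make_call()
            except RateLimitError:
                delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
                print(f"Rate limited; retrying in {delay:.1f}s "
                      f"(attempt {attempt + 1}/{max_retries})")
                await asyncio.sleep(delay)
        raise RuntimeError("Giving up after repeated rate-limit errors")

Each search or LLM call would be wrapped in such a helper, e.g. await with_backoff(lambda: search(query)) for a hypothetical async search function.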

License

MIT License
