A conversational financial question-answering system that handles numerical reasoning over unstructured financial documents containing data tables.
This project implements a system to handle conversational queries about financial data, supporting:
- Multi-turn conversations with context memory
- Numerical reasoning across financial documents
- Table data extraction and processing
- Two types of conversations:
  - Type I: Simple conversations (decomposed multi-hop questions)
  - Type II: Hybrid conversations (combined reasoning chains)
This is a prototype; it prints log output while it runs.
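To make the Type I idea concrete, here is a minimal sketch of a decomposed multi-hop question answered one step per turn over a small table. It is not the project's implementation: the `TABLE` values are made up for illustration, and `parse_cell` and `lookup` are hypothetical helpers.

```python
import re

# Hypothetical table: row label -> {column -> raw cell text}. Values are made up.
TABLE = {
    "revenue": {"2017": "$1,200.0", "2016": "$1,000.0"},
    "net loss": {"2017": "(50.0)", "2016": "(40.0)"},
}


def parse_cell(text: str) -> float:
    """Convert a cell such as '$1,200.0', '(50.0)' or '12%' to a float.

    Parentheses are read as a negative number (accounting style); a percent
    sign is dropped, so '12%' becomes 12.0 rather than 0.12.
    """
    cleaned = text.strip()
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    cleaned = re.sub(r"[,$%()\s]", "", cleaned)
    value = float(cleaned)
    return -value if negative else value


def lookup(row: str, column: str) -> float:
    """Fetch one table cell and parse it to a number."""
    return parse_cell(TABLE[row][column])


# Type I style: each turn is one simple step that can reuse earlier answers.
answers: list[float] = []
answers.append(lookup("revenue", "2017"))      # turn 1: revenue in 2017?
answers.append(lookup("revenue", "2016"))      # turn 2: and in 2016?
answers.append(answers[0] - answers[1])        # turn 3: what is the difference?
answers.append(answers[2] / answers[1] * 100)  # turn 4: as a percentage of 2016?
```

Keeping earlier answers in a list mirrors why context memory matters here: turns 3 and 4 only make sense with the results of turns 1 and 2 available.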
- Python 3.12+
- UV environment manager
- Clone the repository
git clone git@github.com:spyrosze/financial-agent.git
- Install dependencies using UV
# Install UV if not already installed
brew install uv
# Set up environment and install dependencies
uv sync
Run the chat interface using:
uv run main chat <record_id>
Example:
uv run main chat "Single_RSG/2017/page_98.pdf-1"
Note: The record_id serves as the thread_id for conversation memory (in-memory only, not persistent across runs).
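The note on conversation memory can be illustrated with a minimal sketch of a store keyed by thread_id. This is an assumption about the general shape of such a store, not the project's actual memory code; `InMemoryConversationStore` is a hypothetical name and the message texts are placeholders.

```python
class InMemoryConversationStore:
    """Keeps per-thread message history in a plain dict.

    Nothing is written to disk, so history is lost when the process exits.
    """

    def __init__(self) -> None:
        self._threads: dict[str, list[tuple[str, str]]] = {}

    def append(self, thread_id: str, role: str, content: str) -> None:
        """Add one (role, content) message to the given thread."""
        self._threads.setdefault(thread_id, []).append((role, content))

    def history(self, thread_id: str) -> list[tuple[str, str]]:
        """Return a copy of the thread's messages (empty if the thread is unknown)."""
        return list(self._threads.get(thread_id, []))


store = InMemoryConversationStore()
record_id = "Single_RSG/2017/page_98.pdf-1"  # used directly as the thread_id
store.append(record_id, "user", "first question")
store.append(record_id, "assistant", "first answer")
store.append(record_id, "user", "follow-up question")
```

Because the record_id doubles as the key, two chats started with different record IDs never share history, and a fresh run always starts with an empty store.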
The project uses the ConvFinQA dataset, which includes:
- 3,892 conversations with 14,115 questions
- Train/Dev/Test split: 3,037/421/434 conversations
- Cleaned and preprocessed financial data
- Structured tables with numerical data
Dataset location: data/convfinqa_dataset.json
For detailed dataset information, see dataset.md.
The project uses pytest for testing. Tests are organized into:
- Unit tests
- Integration tests
- Example tests
Run tests using:
# Run all tests
uv run pytest
# Run specific test category
uv run pytest tests/unit/
uv run pytest tests/integration/
# Run with coverage
uv run pytest --cov=src tests/
Key test files:
- Integration tests:
  - tests/integration/test_multiturn_chat_flow.py
  - tests/integration/test_single_turn_chat_flow.py
- Unit tests:
  - tests/unit/test_doc_retriever.py
  - tests/unit/test_prompt_designer.py
  - tests/unit/test_table_to_dataframe.py
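For readers new to pytest, a unit test in this layout might look like the sketch below. `table_to_records` is a hypothetical stand-in defined inline so the example is self-contained; it is not one of the project's functions, and the fixture values are made up.

```python
def table_to_records(table: dict[str, dict[str, str]]) -> list[dict[str, str]]:
    """Hypothetical stand-in: flatten {row: {column: cell}} into one record per row."""
    return [{"label": label, **cells} for label, cells in table.items()]


def test_table_to_records_keeps_row_labels_and_columns() -> None:
    table = {"revenue": {"2017": "1200", "2016": "1000"}}  # made-up fixture
    records = table_to_records(table)
    assert records == [{"label": "revenue", "2017": "1200", "2016": "1000"}]


def test_table_to_records_handles_empty_table() -> None:
    assert table_to_records({}) == []
```

pytest collects functions whose names start with `test_` from files named `test_*.py`, so a file like this placed under tests/unit/ would be picked up by `uv run pytest tests/unit/`.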
├── analysis/             # Dataset Analysis notebooks and scripts
├── data/                 # Dataset files
├── src/
│   ├── agent/            # Core agent implementation
│   │   ├── models/       # Financial text analysis models
│   │   └── utils/        # Helper utilities
│   └── main.py           # CLI application
├── tests/                # Test suite
│   ├── fixtures/         # Test data including doc specific ids and types
│   ├── integration/      # Integration tests
│   └── unit/             # Unit tests
- Code formatting: The project uses ruff for linting and formatting
- Type checking: mypy is configured for static type checking
- Dependencies are managed through pyproject.toml
Copyright (c) 2024 S Zevelakis - All Rights Reserved
This code is private and confidential, created as part of a coding assessment/test. Unauthorized copying, modification, distribution, or use of this code, via any medium, is strictly prohibited.
This code is provided for evaluation purposes only and may not be used for any other purpose without explicit written permission from the author.
NO WARRANTIES ARE PROVIDED, EXPRESS OR IMPLIED.