Large Language Model (LLM)-based autonomous agents have shown strong capabilities in decision-making and handling complex tasks. However, public research on applying multi-agent systems to Product Question Answering (PQA)—a crucial area in modern e-commerce—remains limited.
PRADA-QA is a framework designed to enhance the user experience through multi-agent collaboration, enabling dynamic information retrieval from diverse sources to respond to user queries accurately.
- Multi-Agent Collaboration: Agents work together to dynamically retrieve and integrate information for more accurate product-related responses.
- Adaptive Planning Module: Guides agents’ objectives adaptively, improving task fulfillment efficiency while minimizing redundant steps and operational costs.
- Reward Model-Based Evaluation: Uses a reward model (commonly applied in RLHF for LLMs) as a proxy for human preferences, ensuring user-centric quality in evaluation.
- Generalizable Framework: While designed for PQA, the evaluation and planning strategies may extend to other open-ended QA scenarios.
- We employ a reward model-based evaluation strategy to capture user-centric quality.
- Experiments were conducted across three distinct domains to validate the framework’s effectiveness.
- Results show that PRADA-QA outperforms traditional approaches, delivering more accurate and contextually appropriate responses for PQA.
- Improves task fulfillment efficiency
- Tailored for Product Question Answering in e-commerce
- Built on LLM-powered multi-agent collaboration
- Evaluated using reward models as human preference proxies
- Demonstrates superior performance across multiple domains
PRADA-QA sets a new direction for leveraging LLM-based multi-agent systems in e-commerce and beyond.
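To make the reward-model-based evaluation concrete, here is a minimal sketch of the scoring interface it implies. The function and variable names are illustrative assumptions, and the trivial word-overlap heuristic stands in for a real learned reward model (the kind trained for RLHF as a proxy for human preference):

```python
# Sketch of reward-model-based evaluation (assumed interface, not PRADA-QA's code).
# score_response is a PLACEHOLDER for a learned reward model: it only rewards
# non-empty answers that share vocabulary with the question.

def score_response(question: str, answer: str) -> float:
    """Return a preference score in [0, 1]; higher means more preferred."""
    if not answer.strip():
        return 0.0
    overlap = len(set(question.lower().split()) & set(answer.lower().split()))
    return min(1.0, 0.5 + 0.1 * overlap)

def evaluate(responses: dict[str, tuple[str, str]]) -> dict[str, float]:
    """Score each question-id -> (question, answer) pair."""
    return {qid: score_response(q, a) for qid, (q, a) in responses.items()}

scores = evaluate({
    "Q001": ("What color is this product?", "The product is black."),
    "Q002": ("Is this waterproof?", ""),
})
print(scores)
```

In the actual framework, the placeholder would be replaced by a reward model's forward pass, so that rankings reflect human preference judgments rather than lexical overlap.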
MASEE/
├── src/ # Core application code
│ ├── core/ # Core functionality
│ │ ├── experiment.py # Experiment management
│ │ ├── evaluation.py # Evaluation engine
│ │ └── agent_factory.py # Agent creation
│ ├── agents/ # Agent implementations
│ ├── utils/ # Utility functions
│ └── main.py # Main entry point
├── config/ # Configuration files
│ ├── experiment/ # Experiment configurations
│ └── meta_agent.yaml # Agent model configurations
├── data/ # Dataset files
│ └── demo_pqa_validation_part*.csv
├── scripts/ # Automation scripts
│ ├── run_experiment.sh # Single experiment runner
│ ├── run_batch_experiments.sh # Batch processing
│ ├── monitor_experiments.sh # Monitoring tools
│ └── analyze_results.py # Results analysis
├── tests/ # Test suite
├── masee/ # Custom smolagents environment
├── archive/ # Historical data and logs
└── legacy/ # Old code structure (ignored)
- Python 3.10+
- Virtual environment (recommended)
- Required dependencies (see requirements.txt)
- Clone and setup:
git clone <repository-url>
cd MASEE
pip install -r requirements.txt
- Configure environment:
# Copy and edit environment file
cp .env.example .env
# Edit .env with your API keys (OpenAI, Tavily, etc.)
- Quick Demo (recommended first step):
python run_demo.py
- Single Experiment:
python src/main.py gpt-4o-mini base 100 demo 3
- Using Scripts (recommended for production):
# Single experiment with enhanced features
./scripts/run_experiment.sh --model gpt-4o-mini --type base --part 100
# Batch experiments
./scripts/run_batch_experiments.sh
# Monitor running experiments
./scripts/monitor_experiments.sh status
Edit config/experiment/multi_experiment.yaml:
agent: ['base', 'description'] # Agent types to use
model: 'gpt-4o-mini' # Model identifier
name: 'experiment-name' # Experiment name
part: 100 # Dataset portion (questions to process)
Edit config/meta_agent.yaml:
model-type: 'LiteLLMModel' # Model type
model-id: 'gpt-4o-mini' # Model identifier
model-api: 'OPENAI_API_KEY' # API key environment variable
api-base: 'https://api.openai.com/v1' # API base URL
question_id,question_text,question_type,asin,item_name,description
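A notable detail of the config above is that model-api stores the *name* of an environment variable, not the key itself, so secrets stay in .env. A minimal sketch of how such a config might be consumed (the loader function is an illustrative assumption; field names match the YAML above):

```python
# Sketch: resolving the API key named by the meta_agent.yaml "model-api" field.
# The dict mirrors the config shown above; loading it from YAML is omitted.
import os

config = {
    "model-type": "LiteLLMModel",
    "model-id": "gpt-4o-mini",
    "model-api": "OPENAI_API_KEY",
    "api-base": "https://api.openai.com/v1",
}

def resolve_api_key(cfg: dict) -> str:
    """Look up the secret via the env-var name stored in the config."""
    env_var = cfg["model-api"]
    key = os.environ.get(env_var)
    if key is None:
        raise RuntimeError(f"Set {env_var} in your environment or .env file")
    return key
```

Keeping only the variable name in version-controlled config means the same file works across machines while each user supplies their own key.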
Q001,What color is this product?,wh,ASIN001,Wireless Headphones,"High-quality wireless headphones with premium black finish..."
Q002,Is this waterproof?,yes-no,ASIN002,Sports Watch,"Durable sports watch with water-resistant casing..."
{"Q001": "The product is black based on the description provided."}
{"Q002": "Yes, this product is water-resistant up to 50 meters."}
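The two formats above (CSV questions in, one JSON object per line out) suggest a simple round-trip, sketched below with Python's standard csv and json modules. The answering step is a stub; in the real pipeline it would be the multi-agent system:

```python
# Sketch of the dataset round-trip: CSV question rows -> JSON answer lines.
# answer_stub is a PLACEHOLDER for the multi-agent answering pipeline.
import csv
import io
import json

CSV_SAMPLE = """question_id,question_text,question_type,asin,item_name,description
Q001,What color is this product?,wh,ASIN001,Wireless Headphones,"High-quality wireless headphones with premium black finish..."
"""

def answer_stub(row: dict) -> str:
    # A real implementation would invoke the agents with the row's fields.
    return f"(answer to {row['question_text']!r} for {row['item_name']})"

def run(csv_text: str) -> list[str]:
    """Read question rows; emit one JSON line per question, keyed by question_id."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps({row["question_id"]: answer_stub(row)}) for row in rows]

for line in run(CSV_SAMPLE):
    print(line)
```

Note that csv.DictReader handles the quoted description field (which contains commas) correctly, which is why the description column is quoted in the dataset files.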