Add Implicit Q-Learning (IQL) implementation #348

hwilner · 2025-11-02T07:10:46Z

This PR implements Implicit Q-Learning (IQL), an offline reinforcement learning algorithm, addressing issue #329.

Overview

IQL is designed for learning from fixed datasets without online interaction. Unlike many other offline RL methods, IQL never queries the value of out-of-sample actions, which helps prevent the value overestimation caused by distributional shift.
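
As a small illustration of how in-sample learning can still be optimistic: the IQL paper (Kostrikov et al., 2021) replaces the maximum over actions with an upper expectile of the Q-values of actions that appear in the dataset. The JAX sketch below is not taken from this PR; the `expectile` helper and the toy values are hypothetical. It shows the expectile of a fixed sample moving from the mean toward the largest sample as the expectile level approaches 1, without evaluating anything outside the sample.

```python
import jax.numpy as jnp


def expectile(samples, tau, num_iters=50):
  """Returns the tau-expectile of `samples` (hypothetical helper).

  The tau-expectile e minimizes sum_i |tau - 1[x_i <= e]| * (x_i - e)^2.
  Setting the gradient to zero gives a weighted mean, which we iterate.
  """
  e = jnp.mean(samples)
  for _ in range(num_iters):
    w = jnp.where(samples > e, tau, 1.0 - tau)
    e = jnp.sum(w * samples) / jnp.sum(w)
  return e


# Toy "Q-values" of the actions that appear in the dataset for one state.
q_in_sample = jnp.array([0.0, 1.0, 2.0, 3.0, 10.0])

mean_q = expectile(q_in_sample, 0.5)       # tau = 0.5 recovers the mean
upper_q = expectile(q_in_sample, 0.9)      # leans toward the best in-sample action
near_max_q = expectile(q_in_sample, 0.99)  # approaches the in-sample maximum
```

Because every quantity is computed from values already in `q_in_sample`, no out-of-sample action is ever evaluated.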

Implementation

This implementation includes:

IQL Learner: Core training logic with expectile regression for the value function, TD learning for the Q-function, and advantage-weighted regression for policy extraction
IQL Networks: Policy, Q-function, and value function networks
IQL Builder: Constructs the IQL agent following Acme's builder pattern
IQL Config: Hyperparameter configuration
Example Script: run_iql_jax.py for training on D4RL datasets
Unit Tests: agent_test.py for component verification
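
For orientation, a hyperparameter container for IQL might look like the sketch below. This is an assumption, not the PR's actual config: the class name `IQLConfigSketch` and all field names are hypothetical, and the defaults are values commonly quoted from the IQL paper for locomotion tasks, so check them against the config in the diff.

```python
import dataclasses


@dataclasses.dataclass
class IQLConfigSketch:
  """Hypothetical hyperparameter container; not the PR's actual config."""
  batch_size: int = 256
  learning_rate: float = 3e-4
  discount: float = 0.99
  tau: float = 0.005        # Polyak averaging rate for the target Q-network
  expectile: float = 0.7    # expectile level used in the value loss
  temperature: float = 3.0  # inverse temperature for the AWR policy weights


# Override selected fields, e.g. a more optimistic expectile and sharper weights.
config = IQLConfigSketch(expectile=0.9, temperature=10.0)
```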

Algorithm Details

IQL uses three key components:

Value Function (V): Trained with expectile regression so that V(s) estimates an upper expectile of Q(s, a) over the actions present in the dataset
Q-Function: Trained with standard TD learning, using the value function V(s') for next-state values instead of a maximum over actions
Policy: Extracted with advantage-weighted regression, which favors actions with high advantage Q(s, a) - V(s) while staying close to the data distribution
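
The three objectives above can be sketched in JAX as follows. This is a minimal sketch based on the IQL paper, not the code in this PR: the function names, argument names, default values, and the `max_weight` clipping constant are assumptions, and details such as which Q-network (online or target) feeds each loss are left to the caller.

```python
import jax
import jax.numpy as jnp


def expectile_loss(diff, expectile=0.7):
  """Asymmetric squared loss: positive errors weighted by `expectile`."""
  weight = jnp.where(diff > 0, expectile, 1.0 - expectile)
  return weight * diff**2


def value_loss(q, v, expectile=0.7):
  # V(s) is regressed toward an upper expectile of Q(s, a) on dataset actions.
  return jnp.mean(expectile_loss(jax.lax.stop_gradient(q) - v, expectile))


def q_loss(q, reward, discount, next_v):
  # The TD target bootstraps from V(s') rather than max over a' of Q(s', a').
  # `discount` is per-sample and should be 0 at terminal transitions.
  target = reward + discount * jax.lax.stop_gradient(next_v)
  return jnp.mean((q - target) ** 2)


def policy_loss(log_prob, q, v, temperature=3.0, max_weight=100.0):
  # Advantage-weighted regression: behavior cloning with exponentiated
  # advantage weights, clipped for numerical stability.
  adv = jax.lax.stop_gradient(q - v)
  weight = jnp.minimum(jnp.exp(temperature * adv), max_weight)
  return -jnp.mean(weight * log_prob)


# Toy batch of three transitions (made-up numbers).
q = jnp.array([1.0, 2.0, 3.0])
v = jnp.array([1.5, 1.5, 1.5])
reward = jnp.array([1.0, 1.0, 1.0])
discount = jnp.array([0.9, 0.9, 0.0])  # last transition is terminal
next_v = jnp.array([1.0, 1.0, 1.0])
log_prob = jnp.array([-1.0, -1.0, -1.0])

l_v = value_loss(q, v)
l_q = q_loss(q, reward, discount, next_v)
l_pi = policy_loss(log_prob, q, v)
```

Note that none of the three losses evaluates Q at an action sampled from the policy; each term uses only the (state, action) pairs in the batch.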

Code Quality

Follows Acme's established patterns (modeled after CQL agent)
Google-style docstrings throughout
Professional, academic writing style
1,014 lines of well-documented code
Comprehensive README with usage examples

Testing

Unit tests verify:

Network creation
Config initialization
Builder construction
Learner creation and training steps

References

Kostrikov, I., Nair, A., & Levine, S. (2021). Offline Reinforcement Learning with Implicit Q-Learning. arXiv preprint arXiv:2110.06169. https://arxiv.org/abs/2110.06169

Fixes #329

Implements IQL offline RL algorithm with:
- Expectile regression for value function
- TD learning for Q-function
- Advantage-weighted regression for policy
- Complete learner, builder, and networks
- Comprehensive documentation

Addresses issue google-deepmind#329


hwilner added 2 commits November 2, 2025 02:07

Add IQL example and tests

a915ab1

- Example script for running IQL on D4RL datasets
- Unit tests for IQL components
- Follows CQL example pattern
