A PyTorch implementation of the gradient fusion strategy for Bicriteria Policy Optimization. This approach is specifically designed for optimization problems where two objectives share a common global optimum, such as combining Reinforcement Learning (RL) and Imitation Learning (IL).
If you find our work useful, please cite our paper:
@article{zhan2025bicriteria,
title={Bicriteria policy optimization for high-accuracy reinforcement learning},
author={Zhan, Guojian and Zhang, Xiangteng and Zhang, Feihong and Tao, Letian and Li, Shengbo Eben},
journal={IEEE Transactions on Neural Networks and Learning Systems},
year={2025},
publisher={IEEE}
}