MO-MIX: Multi-Objective Multi-Agent Cooperative Decision-Making With Deep Reinforcement Learning
arXiv:2603.00730v1 Announce Type: new Abstract: Deep reinforcement learning (RL) has been applied extensively to solve complex decision-making problems. In many real-world scenarios, tasks have several conflicting objectives and may require multiple agents to cooperate; these are multi-objective multi-agent decision-making problems. However, only a few works have addressed this intersection. Existing approaches are confined to one field or the other: they handle either multi-agent decision-making with a single objective or multi-objective decision-making with a single agent. In this paper, we propose MO-MIX to solve the multi-objective multi-agent reinforcement learning (MOMARL) problem. Our approach is based on the centralized training with decentralized execution (CTDE) framework. A weight vector representing the preference over objectives is fed into each decentralized agent network as a condition for local action-value function estimation, while a mixing network with a parallel architecture estimates the joint action-value function. In addition, an exploration guide approach is applied to improve the uniformity of the final non-dominated solutions. Experiments demonstrate that the proposed method can effectively solve the multi-objective multi-agent cooperative decision-making problem and generate an approximation of the Pareto set. Our approach not only significantly outperforms the baseline method on all four evaluation metrics, but also incurs lower computational cost.
Executive Summary
This article proposes MO-MIX, a novel approach to multi-objective multi-agent cooperative decision-making using deep reinforcement learning. MO-MIX leverages the centralized training with decentralized execution framework, incorporating a weight vector to represent preference over objectives and a mixing network to estimate joint action-value functions. The approach demonstrates significant improvements over baseline methods in solving complex decision-making problems, generating an approximation of the Pareto set while reducing computational costs.
Key Points
- ▸ Introduction of MO-MIX for multi-objective multi-agent reinforcement learning
- ▸ Centralized training with decentralized execution framework
- ▸ Use of weight vector and mixing network for action-value function estimation
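The architecture described above can be sketched in miniature: each agent's network receives its local observation concatenated with the preference weight vector and outputs one Q-value per (action, objective) pair, and a monotonic mixing step combines the agents' chosen Q-vectors into a joint value that is then scalarised by the preference vector. This is a minimal numpy sketch under assumed toy dimensions; the shallow MLP, the fixed non-negative mixing weights, and all names here are illustrative stand-ins, not the paper's actual deep recurrent networks or hypernetwork-generated mixing weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; chosen only to make the sketch runnable).
N_AGENTS, N_ACTIONS, N_OBJECTIVES, OBS_DIM, HID = 2, 3, 2, 4, 8

# Per-agent parameters: a tiny MLP taking [observation; weight vector] as
# input and producing one Q-value per (action, objective) pair.
W1 = rng.normal(size=(OBS_DIM + N_OBJECTIVES, HID)) * 0.1
W2 = rng.normal(size=(HID, N_ACTIONS * N_OBJECTIVES)) * 0.1

# Fixed non-negative mixing weights, one column per objective: a stand-in
# for the parallel mixing network (non-negativity keeps mixing monotonic).
MIX_W = np.abs(rng.normal(size=(N_AGENTS, N_OBJECTIVES)))

def local_q(obs, w):
    """Local Q(obs, a | w), shape (N_ACTIONS, N_OBJECTIVES)."""
    h = np.tanh(np.concatenate([obs, w]) @ W1)
    return (h @ W2).reshape(N_ACTIONS, N_OBJECTIVES)

def act(obs, w):
    """Decentralised execution: greedy action under the w-scalarised local Q."""
    return int(np.argmax(local_q(obs, w) @ w))

def joint_q(observations, actions, w):
    """Centralised view: mix the chosen per-agent Q-vectors into a joint
    Q per objective, then scalarise with the preference vector w."""
    chosen = np.stack([local_q(o, a_w := w)[a] for o, a in zip(observations, actions)])
    q_tot = (chosen * MIX_W).sum(axis=0)  # joint Q, one entry per objective
    return float(q_tot @ w)

w = np.array([0.7, 0.3])                  # preference over the 2 objectives
obs = [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]
actions = [act(o, w) for o in obs]
```

Varying `w` across training episodes is what lets a single conditioned network cover many trade-offs instead of training one network per preference.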
Merits
Effective Solution Generation
MO-MIX effectively solves multi-objective multi-agent cooperative decision-making problems and generates a high-quality approximation of the Pareto set.
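Generating a Pareto-set approximation amounts to evaluating the policy under many preference vectors and keeping only the non-dominated return vectors. The filtering step can be illustrated with a short, self-contained helper (a generic sketch of Pareto filtering for maximisation, not code from the paper):

```python
import numpy as np

def non_dominated(points):
    """Keep the return vectors not Pareto-dominated by any other point
    (maximisation: p dominates q iff p >= q elementwise and p > q somewhere)."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = any(
            np.all(q >= p) and np.any(q > p)
            for j, q in enumerate(pts) if j != i
        )
        if not dominated:
            keep.append(tuple(p))
    return sorted(set(keep))
```

For example, `non_dominated([(1, 2), (2, 1), (0, 0), (2, 2)])` keeps only `(2.0, 2.0)`, since it weakly improves on every other point.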
Improved Efficiency
The approach incurs lower computational cost than baseline methods, making it more practical for complex decision-making scenarios.
Demerits
Limited Exploration
The exploration guide steers the search toward uniform coverage of the non-dominated front, which may constrain broader exploration of the solution space and yield suboptimal solutions in some scenarios.
Expert Commentary
The introduction of MO-MIX marks a significant advancement in the field of multi-objective multi-agent reinforcement learning. By addressing the challenges of cooperative decision-making in complex scenarios, MO-MIX has the potential to benefit a range of applications, from autonomous systems to smart infrastructure. However, further research is needed to fully explore the capabilities and limitations of this approach, particularly with regard to its scalability and adaptability to diverse problem domains.
Recommendations
- ✓ Further investigation into the scalability of MO-MIX for large-scale multi-agent systems
- ✓ Exploration of alternative exploration strategies to improve the robustness of the approach