A Novel Framework for Multi-Agent Multi-Criteria Decision-Making: Mathematical Modeling and Rastrigin-Based Benchmarking
Abstract
Multi-agent systems (MAS) are increasingly adopted in various domains, necessitating efficient decision-making strategies that balance individual agent preferences with overarching system objectives. This paper presents a novel framework for Multi-Agent Multi-Criteria Decision-Making (MAS-MCDM), which extends any MCDM algorithm to address the challenges of scalability, resource constraints, and inter-agent interactions. We propose four distinct decision-making approaches:
1. Full Enumeration, which guarantees the global optimum but is computationally prohibitive for large-scale problems.
2. Independent Agent Decomposition, which enhances scalability but neglects inter-agent synergies.
3. Iterative Refinement, where agents adjust decisions dynamically based on system-level feedback to satisfy constraints.
4. Iterative Refinement with Inter-Agent Interactions, which integrates cooperation and competition dynamics through an interaction matrix.
To validate the framework, we adapt the Rastrigin function — traditionally used for single-agent global optimization — into a discrete MAS-MCDM benchmark. By transforming the continuous optimization problem into a decision-making task, we examine how constraints and interactions influence solutions, shifting them away from the global minimum at x = 0. The experiments demonstrate that iterative refinement approaches effectively navigate resource limitations, while interaction-aware models enable the emergence of cooperative and competitive behaviors. The results highlight trade-offs between optimality, scalability, and inter-agent coordination, providing insights into designing robust MAS-MCDM strategies for real-world applications. Future directions include refining interaction models, integrating reinforcement learning, and extending applications to autonomous and resource-constrained systems.
1. Introduction
Multi-Criteria Decision-Making (MCDM) serves as an essential methodology for evaluating, selecting, or ranking alternatives in complex scenarios where conflicting objectives and diverse criteria must be systematically balanced. From infrastructure planning to autonomous systems, MCDM offers structured frameworks for synthesizing quantitative and qualitative factors, enabling stakeholders to navigate trade-offs and align decisions with strategic objectives.
As decision-making complexity escalates in group settings, Group Decision-Making (GDM) extends the MCDM paradigm to scenarios where a group of decision-makers (DMs) collaborates to reach a consensus. In such collaborative environments, MCDM algorithms play a critical role by aggregating heterogeneous preferences and information — expressed through diverse preference relations (e.g., preference orderings, utility functions, multiplicative preference relations, and fuzzy preference relations) or varying data formats (e.g., crisp values, interval numbers, fuzzy numbers, and linguistic data) — into a unified decision. However, traditional GDM approaches often suffer from significant limitations: they may ignore the complex interrelationships and interactions among experts, struggle with scalability as group size increases, and face challenges such as dominant influences, conflict resolution, and time-intensive consensus processes. These issues lead to inconsistent opinions, biases arising from heterogeneous information representations, and difficulties in achieving true consensus.
In contrast, Multi-Agent Systems (MAS) provide a decentralized framework in which autonomous agents make independent decisions while interacting within a shared environment. Unlike GDM’s collaborative consensus-seeking approach, MAS empowers each agent to achieve both individual and system-level objectives, making decisions based on local information while also accounting for interdependencies, such as shared resource constraints and competitive or cooperative interactions. Current MAS decision-making paradigms employ diverse approaches to address the challenges associated with coordination, learning, adaptability, and optimization among autonomous agents. These approaches include rule-based systems (primarily fuzzy logic), game-theoretic methods, evolutionary algorithms, multi-agent reinforcement learning (MARL), and large language models (LLMs). Although many MCDM methods have been successfully applied in both single-agent and group decision-making contexts, their extension to multi-agent systems presents unique challenges. In particular, the limitations of conventional GDM motivate the exploration of more robust, multi-layered decision frameworks that integrate MCDM into MAS.
Notably, there is no universally superior MCDM technique or algorithm, as the appropriateness of a method depends on the specific decision situation. While almost all methodologies in MCDM share common steps like problem organization and decision matrix construction, they differ in how they synthesize information. While our prior work introduced the TriMetric Fusion (TMF) algorithm for single-agent MCDM, the present framework is designed to accommodate any MCDM technique, including TOPSIS, AHP, or any other suitable method, depending on problem-specific requirements.
The proposed Multi-Agent Multi-Criteria Decision-Making (MAS-MCDM) framework is a novel approach that addresses the challenges of scalability and global optimality in decentralized decision-making systems. It enables agents to rank alternatives based on local criteria while simultaneously negotiating system-wide constraints. Four models are introduced:
First, a full enumeration strategy exhaustively evaluates all combinations to guarantee global optimality, although it is computationally infeasible for large-scale problems.
Second, a decomposition approach enables independent agent evaluations, improving scalability at the potential cost of global optimality.
Third, an iterative refinement mechanism enables agents to adjust their decisions based on system-level feedback, thereby enhancing feasibility under resource constraints.
Finally, an extended iterative framework incorporates explicit inter-agent interaction terms to model cooperative synergies and competitive conflicts among agents.
To validate the framework, we adapt the Rastrigin function — a benchmark in continuous optimization—to multi-agent decision-making by discretizing its input space. This adaptation transforms the problem into one where agents select options from predefined sets, linking the Rastrigin function’s local minima landscape to multi-agent decision dynamics. This approach tests the framework’s ability to manage resource limitations and inter-agent interactions.
The remainder of the paper is organized as follows. Section 2 details the methodology, including mathematical models and workflows for each approach. Section 3 validates the framework experimentally using the discretized Rastrigin function, with a focus on resource constraints and inter-agent dynamics. Section 4 concludes with a discussion of strengths, limitations, and future research directions.
2. Methodology
2.1. Problem Setup
We define the following parameters: $N$ agents, each of which must choose from $M$ discrete options; preference scores $P_{ij}$ and system-level contributions $C_{ij}$ for agent $i$ and option $j$; resource-usage terms $r_{ijk}$ for each shared resource $k$, with corresponding limits $R_k$; and weights that balance the individual and system-level components of the objective.
For all following approaches, the decision variable is binary: $x_{ij} = 1$ if agent $i$ selects option $j$ and $x_{ij} = 0$ otherwise, so a complete decision is represented by the matrix $X = [x_{ij}]$.
If there are $N$ agents and each has $M$ options, the number of possible joint decisions is $M^N$, which motivates the different approaches developed below.
The preference scores represent the normalized ranks of options calculated by the TMF algorithm. Based on the criteria defined for each agent, TMF assigns a preference value to each of that agent's options.
2.2. Approach 1: Full Enumeration of All Combinations
This approach systematically evaluates all possible combinations of options for $N$ agents, each with $M$ options. The total number of combinations is $M^N$; for example, 10 agents with 5 options each already yield $5^{10} \approx 9.8$ million combinations.
2.2.1. Mathematical Model
Agent-level Objective ($f_i$): each agent $i$ evaluates its options through the local preference scores $P_{ij}$ produced by the chosen MCDM method (TMF in this work).
System-Level Objective ($F_{\text{sys}}$): each option additionally carries a contribution $C_{ij}$ to the system-wide goal, and the system-level objective aggregates these contributions over the selected options.
Objective Function:
Combine agent objectives and system-level objectives:
$$F(X) = \sum_{i=1}^{N} \sum_{j=1}^{M} \big( w_1 P_{ij} + w_2 C_{ij} \big)\, x_{ij},$$
where $w_1$ and $w_2$ weight the individual and system-level components.
Ensure $w_1 + w_2 = 1$ so that the two components remain on a comparable scale.
Constraints:
Single Option Per Agent: Each agent selects exactly one option:
$$\sum_{j=1}^{M} x_{ij} = 1, \qquad \forall i \in \{1, \dots, N\}.$$
System Constraints: The system imposes constraints such as:
Resource limits:
$$\sum_{i=1}^{N} \sum_{j=1}^{M} r_{ijk}\, x_{ij} \le R_k, \qquad \forall k.$$
Set of Solutions: Each solution $X$ assigns exactly one option to every agent, so the candidate set contains $M^N$ solutions.
Optimal Solution: The optimal solution is
$$X^* = \arg\max_{X} F(X) \quad \text{subject to the constraints above.}$$
2.2.2. Strengths and Limitations of the Approach
This approach guarantees the globally optimal solution by systematically evaluating all $M^N$ combinations. However, the search space grows exponentially with the number of agents and options, making the method computationally prohibitive for large-scale problems.
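To illustrate the combinatorial structure, the following Python sketch enumerates all $M^N$ assignments under the notation above; the weighting scheme and the feasibility check are illustrative assumptions rather than the exact implementation used in our experiments.

```python
import itertools
import numpy as np

def full_enumeration(P, C, r, R, w1=0.5, w2=0.5):
    """Exhaustively evaluate all M^N assignments (illustrative sketch).

    P, C : (N, M) arrays of preference and system-level contribution scores
    r    : (N, M, K) array of per-option resource usage for K resources
    R    : (K,) array of resource limits
    """
    N, M = P.shape
    best_score, best_choice = -np.inf, None
    # Each candidate solution assigns one option index to every agent.
    for choice in itertools.product(range(M), repeat=N):
        usage = sum(r[i, j] for i, j in enumerate(choice))   # total use of each resource
        if np.any(usage > R):                                 # skip infeasible combinations
            continue
        score = sum(w1 * P[i, j] + w2 * C[i, j] for i, j in enumerate(choice))
        if score > best_score:
            best_score, best_choice = score, choice
    return best_choice, best_score
```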
2.3. Approach 2: Decomposition into Agent Contributions
This approach reduces the computational complexity of multi-agent decision-making by evaluating each agent’s decision independently and approximating the system-level objective as the sum of individual agent contributions. While it may not guarantee the global optimum and, unlike the full enumeration approach, does not account for resource constraints, it significantly improves scalability and remains practical for larger systems without such constraints.
2.3.1. Mathematical Model
2.3.1.1. Agent-Level Objective
Each agent independently evaluates its options:
$$f_i(j) = w_1 P_{ij} + w_2 C_{ij},$$
where $P_{ij}$ is the local preference of agent $i$ for option $j$ (computed by TMF) and $C_{ij}$ is the option's system-level contribution.
2.3.1.2. System-Level Objective
Approximate the system-level objective as the sum of agent-level contributions:
$$F(X) \approx \sum_{i=1}^{N} \sum_{j=1}^{M} f_i(j)\, x_{ij}.$$
2.3.1.3. Constraints
Each agent must select exactly one option:
$$\sum_{j=1}^{M} x_{ij} = 1, \qquad \forall i.$$
2.3.1.4. Optimal Solution
The optimal solution selects the best option for each agent based on its individual objective and system-level contribution:
$$j_i^* = \arg\max_{j} f_i(j), \qquad \forall i.$$
2.3.2. Step-by-Step Workflow
Step 1: Define Agent Contributions
For each agent $i$, define the contribution $C_{ij}$ that each of its options $j$ makes to the system-level objective.
Step 2: Apply Multi-Criteria Decision-Making (MCDM) Locally
For each agent $i$:
Use TMF to calculate the individual preferences ($P_{ij}$) of its options based on the agent's local criteria.
Step 3: Generate the Ranking
Generate a priority ranking for each agent by combining the individual preferences ($P_{ij}$) with the system-level contributions ($C_{ij}$) through the weighted score $f_i(j)$.
Step 4: Aggregate Agent-Level Decisions
After evaluating each agent's options locally, aggregate the top-ranked choices to approximate the system-level solution.
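A minimal sketch of this decomposition workflow, assuming the agent-level score is the same weighted combination of preference and contribution used above (the weights shown are placeholders):

```python
import numpy as np

def decomposition(P, C, w1=0.5, w2=0.5):
    """Independent per-agent selection (illustrative sketch).

    P, C : (N, M) arrays of preference and contribution scores.
    Returns one option index per agent and the approximate system objective.
    """
    scores = w1 * P + w2 * C             # agent-level score f_i(j) for every option
    choice = np.argmax(scores, axis=1)   # each agent picks its own best option
    approx_objective = scores[np.arange(len(choice)), choice].sum()
    return choice, approx_objective
```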
2.3.3. Strengths and Limitations
This approach significantly improves scalability by reducing the computational complexity from $O(M^N)$ for full enumeration to $O(N \cdot M)$, since each agent evaluates only its own $M$ options independently.
However, since agents make decisions independently, it does not guarantee a globally optimal solution, as inter-agent dependencies and constraints are not directly accounted for. This can lead to suboptimal system-wide performance, especially in cases where resource constraints or dependencies between agent choices significantly impact the overall objective. Additionally, the approximation of the system-level objective as a sum of individual agent contributions may not always be valid — some global objectives are inherently non-decomposable, meaning that the interactions between agents play a critical role in decision-making.
2.4. Approach 3: Iterative Refinement
The Iterative Refinement approach is a hybrid optimization method that balances computational efficiency and solution quality by iteratively improving agent decisions. Unlike the Full Enumeration approach, which explores all possible combinations, and the Decomposition approach, which assumes independence among agents, Iterative Refinement incorporates feedback between the system-level evaluation and agent-level decisions. This feedback loop progressively adjusts decisions to achieve a better balance between individual preferences and system-wide objectives.
2.4.1. Mathematical Model
2.4.1.1. Objectives
(a) Initial Agent-Level Objective
Each agent computes its initial score for all options $j$ based on a weighted combination of individual preference and system-level contribution, as in the Decomposition into Agent Contributions approach:
$$S_i^{(0)}(j) = w_1 P_{ij} + w_2 C_{ij}. \qquad (2.13)$$
(b) System-Level Objective
Approximate the system-level objective as the sum of agent-level contributions:
$$F(X) \approx \sum_{i=1}^{N} \sum_{j=1}^{M} S_i(j)\, x_{ij}.$$
(c) Adjusted Agent-Level Objective
Based on feedback from the system-level evaluation, agents adjust their scores by incorporating penalties for violating system constraints:
$$S_i^{(t)}(j) = S_i^{(t-1)}(j) - \lambda\, \text{pen}_i^{(t)}(j),$$
where $S_i^{(t-1)}(j)$ is the adjusted score retained from the previous iteration, $\text{pen}_i^{(t)}(j)$ quantifies how much option $j$ of agent $i$ contributes to the currently violated constraints, and $\lambda > 0$ is a penalty weight.
Key benefits of retaining the adjusted rank from the previous iteration:
- Smooth Transition Between Iterations: By retaining the previous iteration's adjusted scores, agents' rankings evolve gradually instead of being recomputed from scratch, which avoids abrupt swings in the selected options.
- Better Constraint Handling: The penalties applied in the previous iteration persist in influencing the next iteration, effectively discouraging agents from reverting to previously penalized options.
- Faster Convergence: This memory mechanism reduces oscillations in decision-making, potentially speeding up the convergence to a feasible and optimal solution.
2.4.1.2. Constraints
(a) Individual Agent Constraint
Each agent selects one and only one option:
$$\sum_{j=1}^{M} x_{ij} = 1, \qquad \forall i.$$
(b) System-Wide Constraints
System resources must remain within their limits:
$$\sum_{i=1}^{N} \sum_{j=1}^{M} r_{ijk}\, x_{ij} \le R_k, \qquad \forall k.$$
2.4.1.3. Solution
The process iteratively refines decisions until a stable, constraint-satisfying solution is reached (or the iteration limit is hit), and the resulting decision matrix is returned as the final solution.
2.4.2. Step-by-Step Workflow
Step 1: Initial Solution
Compute Initial Ranks: Each agent evaluates its options using Equation (2.13).
Make Initial Selection: Each agent selects the option with the highest initial score.
Form Initial Solution: The initial solution $X^{(0)}$ collects the selected options of all agents.
Step 2: System-Level Evaluation
Check Constraints: Evaluate the total resource usage for each constraint and compare it with the corresponding limit $R_k$.
Identify Violations: Compute penalties for constraint violations:
$$\text{Violation}_k = \max\Big(0,\; \sum_{i=1}^{N}\sum_{j=1}^{M} r_{ijk}\, x_{ij} - R_k\Big), \qquad \forall k.$$
Step 3: Feedback and Adjustment
Compute Agent Impact: For each agent, compute its contribution to the usage of each constrained resource.
Adjust Rankings: Penalize options that contribute heavily to constraint violations; the applied penalty is saved and carried over to subsequent iterations.
Re-select Options: Each agent recomputes its rankings using the adjusted scores and selects the best option:
$$j_i^{(t)} = \arg\max_{j} S_i^{(t)}(j).$$
Step 4: Repeat Until Convergence
Convergence Check: Stop the iteration when:
- The system objective no longer improves (its change between iterations falls below a tolerance).
- No agent changes its decision over several consecutive iterations.
- The iteration limit is reached.
Output Final Solution:
The final solution is the decision matrix obtained at convergence; it satisfies the system constraints (or minimizes their violation within the iteration limit) while balancing individual preferences and system-level objectives.
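The loop below sketches this workflow in Python; the specific penalty rule (accumulating in proportion to each option's usage of over-consumed resources) and the penalty weight are illustrative assumptions, not the only possible choices.

```python
import numpy as np

def iterative_refinement(P, C, r, R, w1=0.5, w2=0.5, lam=1.0, max_iter=100):
    """Penalty-based iterative refinement (illustrative sketch).

    P, C : (N, M) preference and contribution scores
    r    : (N, M, K) per-option resource usage; R : (K,) resource limits
    """
    N, M, K = r.shape
    penalty = np.zeros((N, M))
    scores = w1 * P + w2 * C
    choice = np.argmax(scores, axis=1)                 # Step 1: initial selection
    for _ in range(max_iter):
        usage = r[np.arange(N), choice].sum(axis=0)    # Step 2: total usage per resource
        violation = np.maximum(0.0, usage - R)
        if not violation.any():
            break                                      # feasible solution found
        # Step 3: penalize options in proportion to their share of violated resources
        penalty += (r * violation).sum(axis=2)
        new_choice = np.argmax(scores - lam * penalty, axis=1)
        if np.array_equal(new_choice, choice):         # Step 4: convergence check
            break
        choice = new_choice
    return choice
```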
2.4.3. Strengths and Limitations
The Iterative Refinement approach strikes a balance between computational efficiency and solution quality by iteratively improving agent decisions through a feedback mechanism. By continuously refining choices based on violations of system constraints, it achieves a more adaptive and responsive optimization process. A key advantage of this method is that it reduces computational complexity significantly compared to full enumeration while still capturing partial interdependencies between agents, making it more scalable for larger systems. Additionally, the iterative structure helps mitigate constraint violations over time, leading to a feasible and balanced solution without requiring an exhaustive search of the solution space. This process also allows for parallel updates in decentralized settings, making it applicable to multi-agent systems with distributed decision-making.
However, without proper tuning of penalty weights and feedback mechanisms, the iterative process can oscillate between solutions, causing instability. Another limitation is that this approach assumes that all agents respond rationally and iteratively adjust their decisions based on system feedback, which may not hold in dynamic or adversarial environments where agents have strategic behaviors or incomplete information.
2.5. Approach 4: Iterative Refinement with Inter-agent Interactions
While the previous approaches addressed many fundamental challenges—such as exponential complexity, balancing local vs. global objectives, and adaptive penalty mechanisms to enforce resource constraints—they did not fully capture how agents’ choices interact with each other beyond shared resources — whether they be cooperative synergies, competitive conflicts, or more nuanced relational dynamics. Building on the iterative penalty-based framework of Approach 3, Approach 4 closes this gap by incorporating a joint contribution function that quantifies synergy or conflict based on the specific combination of options chosen across agents. In doing so, each agent’s objective is no longer limited to local preference, system-level contribution, and resource penalties; it also accounts for how its choice aligns (or clashes) with the decisions of other agents. As a result, the proposed method supports a wide range of cooperative, competitive, and hybrid scenarios, providing a more realistic and dynamic decision-making framework for multi-agent systems.
2.5.1. Mathematical Model
2.5.1.1. Joint Contribution Function for Inter-agent Interactions
2.5.1.1.1. Motivation
When multiple agents coordinate (or compete), the benefit or cost of an agent's choice depends not only on its own option but also on the options selected by the other agents.
For example:
Cooperative scenario: two agents obtain an additional benefit when their chosen options are compatible (e.g., nearby locations or complementary roles).
Competitive scenario: two agents incur an additional cost when their choices conflict (e.g., both targeting the same scarce resource).
Partial synergy: the benefit or cost varies gradually with how similar or how close the chosen options are.
2.5.1.1.2. Interaction Matrix
In addition to the preference and contribution scores, we introduce an interaction matrix $W = [w_{ik}]$, where $w_{ik}$ encodes the strength and sign of the relationship between agents $i$ and $k$ (positive for cooperation, negative for competition, zero for no interaction).
Hence the total interaction term for agent $i$ choosing option $j$ is
$$I_i(j) = \sum_{k \neq i} w_{ik}\, J\big(o_{ij}, o_k\big),$$
where $J(\cdot,\cdot)$ is the joint contribution function and $o_k$ is the option currently selected by agent $k$.
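As a concrete illustration, the interaction term can be computed from the interaction matrix and a joint contribution function along the following lines; the signature of J is a hypothetical convention, not a fixed part of the framework.

```python
from typing import Callable, Sequence

def interaction_term(i: int, j: int,
                     current_choice: Sequence[int],
                     W: Sequence[Sequence[float]],
                     J: Callable[[int, int, int, int], float]) -> float:
    """Total interaction I_i(j): how option j of agent i aligns with the others.

    W[i][k] weights the relationship between agents i and k (positive =
    cooperation, negative = competition); J(i, j, k, o_k) is the joint
    contribution of agent i picking option j while agent k keeps option o_k.
    """
    return sum(W[i][k] * J(i, j, k, o_k)
               for k, o_k in enumerate(current_choice) if k != i)
```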
2.5.1.1.3. Examples of defining the joint contribution function
We define the interaction matrix entries according to the pairwise relationships described above: positive weights for cooperating pairs, negative weights for competing pairs, and zero for unrelated agents.
Simple joint function
A common choice is a binary joint function that returns a fixed bonus when two interacting agents select compatible options and a fixed penalty when they select conflicting ones:
$$J(o_{ij}, o_k) = \begin{cases} +\delta, & \text{if the two options are compatible (positive interaction)}, \\ -\delta, & \text{if the two options conflict (negative interaction)}. \end{cases}$$
Complex joint function
Below is a joint function that partially rewards (or penalizes) agents depending on how well their chosen options fit together, rather than applying a single fixed bonus or penalty.
Example: Location-Based Partial Synergy
Suppose each agent’s options correspond to possible locations (or positions in a metric space). Let $d(o_{ij}, o_k)$ denote the distance between the location chosen by agent $i$ and the location chosen by agent $k$.
We can design a piecewise function that:
- Rewards pairs of locations that are very close (e.g., good for collaboration).
- Gives a smaller bonus (or neutral effect) if they are moderately distant.
- Penalizes them if they are too far apart (e.g., high communication or transportation cost).
A sample function might take the piecewise form
$$J(o_{ij}, o_k) = \begin{cases} +\delta_{\text{close}}, & d(o_{ij}, o_k) \le d_1, \\ +\delta_{\text{mid}}, & d_1 < d(o_{ij}, o_k) \le d_2, \\ -\delta_{\text{far}}, & d(o_{ij}, o_k) > d_2, \end{cases}$$
where $d_1 < d_2$ are distance thresholds and $\delta_{\text{close}} > \delta_{\text{mid}} \ge 0$.
For intermediate distances (between $d_1$ and $d_2$), the pair receives only a small bonus or a neutral effect, while pairs farther apart than $d_2$ are penalized.
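A minimal Python sketch of such a piecewise rule, with thresholds and magnitudes chosen purely for illustration:

```python
def location_synergy(d, d_close=1.0, d_far=5.0,
                     bonus_close=2.0, bonus_mid=0.5, penalty_far=1.0):
    """Piecewise distance-based joint contribution (illustrative thresholds).

    d : distance between the locations selected by two interacting agents.
    """
    if d <= d_close:        # very close: strong collaboration bonus
        return bonus_close
    if d <= d_far:          # moderately distant: small bonus / neutral effect
        return bonus_mid
    return -penalty_far     # too far apart: communication/transport cost
```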
Other Possible Joint contribution function formulations
- Attribute Similarity: Instead of location distance, we could define a function based on the similarity of agents’ skill sets, technology types, or preferences. For instance, if two agents' attribute profiles are highly similar, the joint contribution can be made positive (mutually reinforcing choices) or negative (redundant choices), depending on the application.
- Nonlinear or Multi-Attribute: We might combine multiple factors—distance + type alignment + resource overlap—into a single function, weighting each sub-factor differently.
- Time-Dependent: For dynamic environments, let the joint contribution depend on the iteration or time step, $J^{(t)}(o_{ij}, o_k)$, so that synergies or conflicts strengthen or decay as the system evolves.
- Probabilistic: Define expected synergy or conflict, e.g., $\mathbb{E}\big[J(o_{ij}, o_k)\big] = \sum_{s} p(s)\, J(o_{ij}, o_k \mid s)$ over uncertain scenarios $s$.
2.5.1.2. Refined Objective Function
Agent-Level Objective
At iteration $t$, each agent $i$ scores every option $j$ by combining four components:
- Local preference ($P_{ij}$).
- System-level contribution ($C_{ij}$).
- Penalty for resource overuse (adaptive, carried over from iteration $t-1$).
- Interaction with other agents ($I_i^{(t)}(j)$).
The general objective for agent $i$ is therefore
$$S_i^{(t)}(j) = w_1 P_{ij} + w_2 C_{ij} - w_3\, \text{pen}_i^{(t)}(j) + w_4\, I_i^{(t)}(j).$$
Each agent $i$ selects the option that maximizes this adjusted score, subject to choosing exactly one option.
2.5.2. Step-by-step workflow: Iterative Refinement Process
The algorithm proceeds in iterations $t = 0, 1, 2, \dots$ until convergence or an iteration limit is reached.
Step 1: Initialization
- Set an initial decision matrix $X^{(0)}$, e.g., by letting each agent select its locally preferred option.
- Initialize all penalty terms to zero.
- Set the iteration counter $t = 0$.
Step 2: Compute Resource Usage & Penalties
- For each resource $k$, compute the total usage $U_k^{(t)}$ of the currently selected options.
- Determine the violation $\text{Violation}_k^{(t)} = \max\big(0,\; U_k^{(t)} - R_k\big)$.
- Allocate impacts based on each agent's share of the usage of the violated resources, and update the corresponding penalty terms.
Step 3: Compute Interaction Terms
For each agent $i$ and option $j$, compute the interaction term $I_i^{(t)}(j) = \sum_{k \neq i} w_{ik}\, J\big(o_{ij}, o_k^{(t)}\big)$ using the options currently selected by the other agents.
Step 4: Agent-Level Optimization
Each agent $i$ recomputes its adjusted scores, combining preference, contribution, penalty, and interaction terms, and selects the option with the highest score:
$$o_i^{(t+1)} = \arg\max_{j} S_i^{(t)}(j),$$
where $S_i^{(t)}(j)$ is the general agent-level objective defined in Section 2.5.1.2.
Step 5: Convergence Check
Check if:
- The total system-level objective (or the sum of all agents' adjusted scores) changes by less than a tolerance between iterations.
- All resource constraints are satisfied (or the penalty remains unchanged).
- No agent changes its decision, i.e., $X^{(t+1)} = X^{(t)}$.
If these conditions are met, stop. Otherwise, increment t and repeat from Step 2.
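Putting the pieces together, the following sketch implements one plausible version of this loop; the weights, penalty rule, and convergence test are illustrative assumptions rather than the exact configuration used in our experiments.

```python
import numpy as np

def refine_with_interactions(P, C, r, R, W, J, w=(0.4, 0.3, 0.2, 0.1),
                             max_iter=100):
    """Iterative refinement with inter-agent interactions (illustrative sketch).

    P, C : (N, M) preference / contribution scores; r : (N, M, K) resource usage
    R    : (K,) resource limits; W : (N, N) interaction matrix
    J    : J(i, j, k, o_k) -> joint contribution of agent i's option j with
           agent k's current option o_k.
    """
    w1, w2, w3, w4 = w
    N, M, K = r.shape
    penalty = np.zeros((N, M))
    choice = np.argmax(w1 * P + w2 * C, axis=1)            # Step 1: initialization
    for _ in range(max_iter):
        usage = r[np.arange(N), choice].sum(axis=0)        # Step 2: usage & penalties
        violation = np.maximum(0.0, usage - R)
        penalty += (r * violation).sum(axis=2)
        inter = np.array([[sum(W[i, k] * J(i, j, k, choice[k])
                               for k in range(N) if k != i)
                           for j in range(M)] for i in range(N)])   # Step 3
        scores = w1 * P + w2 * C - w3 * penalty + w4 * inter        # Step 4
        new_choice = np.argmax(scores, axis=1)
        if np.array_equal(new_choice, choice) and not violation.any():
            break                                          # Step 5: convergence
        choice = new_choice
    return choice
```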
2.5.3. Strengths and Limitations
Approach 4’s main advantage is its ability to capture a wide range of cooperative or competitive inter-agent dynamics by introducing explicit interaction terms beyond simple resource-sharing constraints. It adaptively refines decisions in an iterative manner, allowing for parallel agent updates and offering greater scalability than exhaustive methods, while still preserving a multi-criteria perspective by combining individual preferences, system-level contributions, penalty terms, and interaction factors into a unified objective. However, tuning the additional interaction weights and penalty parameters introduces further complexity and can lead to oscillations if not carefully calibrated. The accuracy of the results relies heavily on defining the joint function. Finally, in highly adversarial or rapidly changing environments, agents may not faithfully follow the iterative penalty updates, and more robust game-theoretic or reinforcement learning approaches may be required for stable and fair outcomes.
3. Verification using the Rastrigin Function
The Rastrigin function was selected as a representative benchmark for evaluating the performance of the proposed MAS-MCDM approaches. This function is traditionally used in single-agent global optimization to test an algorithm’s ability to escape local minima. We adapt it to a multi-agent scenario by discretizing its input space; the resulting sample points serve as each agent's options.
3.1. Implementation Overview
3.1.1. Discretized Options
For each agent $i$, the continuous domain of the Rastrigin function is sampled at a set of discrete points, and each sample point becomes one of the agent's options.
The generated points (each agent's set of options) span the Rastrigin domain around the global minimum at x = 0 (point B in Figure 1).
3.1.2. Preference, Contribution, and Resources
For each agent $i$ and option $j$, the Rastrigin value of the corresponding point is computed.
Then assign: preference and contribution scores derived from the (negated) Rastrigin values, so that options with lower function values receive higher scores, together with resource-usage terms for the constrained scenarios.

Figure 1 - The Rastrigin function with parameter A=10. Point B at x=0 is the continuous global minimum in single-agent scenarios; in multi-agent settings, additional constraints and interactions can shift the solution away from x=0 toward points A and C.
3.1.3. Weights
The weights $w_1$ and $w_2$ balance each agent's local preference against its system-level contribution when the agent-level scores are computed.
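The snippet below sketches one way to build the discretized benchmark described above: sample the 1-D Rastrigin function on an evenly spaced grid and derive normalized preference, contribution, and resource values from it. The grid, the number of agents, and the exact mapping to scores are assumptions for illustration and do not reproduce the exact values used in our runs.

```python
import numpy as np

A = 10.0

def rastrigin_1d(x):
    """One-dimensional Rastrigin function (global minimum at x = 0)."""
    return A + x**2 - A * np.cos(2 * np.pi * x)

# Illustrative setup: every agent chooses among the same evenly spaced points.
n_agents, n_options = 5, 11
options = np.linspace(-5.12, 5.12, n_options)        # discretized decision points
values = rastrigin_1d(options)

# Lower Rastrigin values are better, so preferences and contributions are
# derived from the negated function values and normalized to [0, 1] here.
raw = -values
pref = (raw - raw.min()) / (raw.max() - raw.min())
P = np.tile(pref, (n_agents, 1))                      # (N, M) preference scores
C = P.copy()                                          # contributions mirror preferences here
resource_usage = np.tile(values, (n_agents, 1))       # per-option resource cost (illustrative)
```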
3.1.4. Constraints
- Resource Constraints: Resource-usage terms are derived from the negative Rastrigin values, and a global limit is imposed on the total usage across all agents.
- The global resource threshold is set so that not all agents can simultaneously select the unconstrained optimum at x = 0, forcing part of the system toward neighboring points.
- Interaction Constraints: Reward-based interactions were incorporated in the implementation of Approach 4, which can steer agents to local minima near the global minimum (points A and C in Figure 1).
3.1.5. Inter-Agent Interactions:
3.1.5.1. Interaction Setup
We created a scenario in which the first and fifth agents form the first group and the remaining agents form the second group. Agents within the same group benefit from choosing values with matching signs (cooperation inside each group), while cross-group interactions are set as conflicts, discouraging all five agents from picking the same sign (competition between groups).
The interaction matrix assigns positive weights to same-group agent pairs and negative weights to cross-group pairs.
3.1.5.2. Normalization
The preferences and contributions are normalized onto a common range so that the components of the objective are comparable across agents and options.
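For reference, the group structure described above can be encoded as follows; the weight magnitudes and the sign-matching joint function are illustrative placeholders, since the exact values used in the experiments are not listed here.

```python
import numpy as np

n_agents = 5
group = np.array([0, 1, 1, 1, 0])    # agents 1 and 5 form group 0, the rest group 1

# Same-group pairs cooperate (positive weight); cross-group pairs compete
# (negative weight). Magnitudes are illustrative.
W = np.where(group[:, None] == group[None, :], 1.0, -1.0)
np.fill_diagonal(W, 0.0)

def sign_match_synergy(x_i, x_k):
    """Joint contribution that rewards matching signs of the chosen points."""
    return 1.0 if np.sign(x_i) == np.sign(x_k) else -1.0
```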
3.2. Results and Observations
- Approach 1 (Full Enumeration)
Outcome:
Correctly identifies the global minimum, with every agent selecting the point at x = 0 in the unconstrained case.
With Constraints:
When resource-usage rules push agents away from x = 0, the enumeration still returns the best feasible combination under the imposed limits.
Runtime: Exponential in the number of agents, since all $M^N$ combinations must be evaluated.
Despite guaranteeing the global optimum, the exponential complexity restricts this method to small numbers of agents and options.
- Approach 2 (Decomposition)
Outcome:
Each agent individually picks the best discrete point, ignoring synergy or resource constraints. All agents therefore choose the point at x = 0, which matches the unconstrained single-agent optimum but disregards the shared resource limit.
- Approach 3 (Iterative Refinement)
Outcome:
Agents iteratively adjust their choices based on feedback from resource penalties until they reach a solution that satisfies the constraints.
The resulting total resource usage remains within the global limit, with some agents shifted away from x = 0.
This demonstrates that iterative penalty-based methods handle resource constraints effectively without enumerating all solutions.
- Approach 4 (Iterative Refinement with Inter-Agent Interactions)
Outcome:
There are two combinations that satisfy both the constraints and the interaction structure. This approach found one of them: the selected combination satisfies the resource constraints, with the two groups settling on opposite signs as encouraged by the cooperative and competitive interaction terms.
As shown, incorporating inter-agent synergy and conflict can yield solutions where agent groups naturally self-organize, surpassing purely penalty-based approaches in scenarios with strong interactions.
The Rastrigin experiments validate that each proposed approach can be adapted to MAS-MCDM, handling both simpler unconstrained cases and more intricate interactions/constraints. The results underscore the trade-offs between optimality, scalability, and the capacity to model inter-agent effects — key considerations for multi-agent systems in many practical domains.
4. Conclusion
In this paper, we have presented a comprehensive framework for MAS-MCDM that effectively balances local agent preferences with system-wide objectives and constraints. We introduced and rigorously formulated four distinct approaches — full enumeration, decomposition into agent contributions, iterative refinement, and iterative refinement with inter-agent interactions — each offering a unique set of trade-offs between global optimality and computational scalability. This work systematically demonstrates that while exhaustive search guarantees a global optimum, its exponential complexity renders it impractical for large-scale systems, thereby motivating the need for more scalable methods.
A key contribution of the research is the novel iterative penalty-based refinement strategy. This mechanism dynamically adjusts agent-level decisions based on feedback from resource constraint violations and, in the most advanced approach, inter-agent interactions. By integrating these dynamic adjustments, the framework is capable of converging to robust and feasible solutions even in the presence of complex interdependencies and resource limitations. The incorporation of explicit interaction terms further allows the model to capture cooperative and competitive dynamics, thereby enhancing its applicability to a wide range of real-world scenarios — from tightly coupled, mission-critical systems to decentralized and distributed decision-making. The validation experiments using a discretized version of the Rastrigin function underscore the efficacy of the proposed methods. The experimental results highlight the strengths of each approach and illuminate the inherent trade-offs between achieving optimality and ensuring scalability. In particular, the iterative refinement strategies have shown promise in navigating challenging optimization landscapes. Future work may investigate further enhancements to the interaction models, integrate with reinforcement learning and game-theoretic strategies, and apply these methods to real-world challenges in autonomous systems, resource management, and other areas.
