Loading...
Loading...
Browse, search and filter the latest cybersecurity research papers from arXiv
Commitments play a crucial role in game theory, shaping strategic interactions by either altering a player's own payoffs or influencing the incentives of others through outcome-contingent payments. While most research has focused on using commitments to achieve efficient equilibria, their potential applications beyond this goal remain largely unexplored. In this study, we introduce a non-coercive extortion mechanism that leverages commitments to outcome-contingent payments, demonstrating how a player or external agent can extract profit by offering rewards rather than threatening punishment. At the core of the mechanism is the introduction of sequentiality into a simultaneous-move game, fundamentally reshaping the strategic interaction. We derive the conditions under which extortion is successful, identify the class of games susceptible to this scheme, and determine both the maximum extractable profit and the minimum required payment. To illustrate the extortion mechanism, we apply it to 2x2 games, highlighting how even simple strategic settings can be vulnerable to this form of manipulation. Our results reveal strategic vulnerabilities in competitive settings, with significant implications for economic markets, diplomatic relations, and multi-agent systems operating in blockchain environments. This work broadens our understanding of commitments in game theory and raises critical questions about how to safeguard strategic interactions from exploitation through non-coercive extortion.
Human verification under adversarial information flow operates as a cost-bounded decision procedure constrained by working memory limits and cognitive biases. We introduce the Verification Cost Asymmetry (VCA) coefficient, formalizing it as the ratio of expected verification work between populations under identical claim distributions. Drawing on probabilistically checkable proofs (PCP) and parameterized complexity theory, we construct dissemination protocols that reduce verification for trusted audiences to constant human effort while imposing superlinear costs on adversarial populations lacking cryptographic infrastructure. We prove theoretical guarantees for this asymmetry, validate the framework through controlled user studies measuring verification effort with and without spot-checkable provenance, and demonstrate practical encoding of real-world information campaigns. The results establish complexity-theoretic foundations for engineering democratic advantage in cognitive warfare, with immediate applications to content authentication, platform governance, and information operations doctrine.
Visualization dashboards are increasingly used in strategic settings like auctions to enhance decision-making and reduce strategic confusion. This paper presents behavioral experiments evaluating how different dashboard designs affect bid optimization in reverse first-price auctions. Additionally, we assess how dashboard designs impact the auction designer's ability to accurately infer bidders' preferences within the dashboard mechanism framework. We compare visualizations of the bid allocation rule, commonly deployed in practice, to alternatives that display expected utility. We find that utility-based visualizations significantly improve bidding by reducing cognitive demands on bidders. However, even with improved dashboards, bidders systematically under-shade their bids, driven by an implicit preference for certain wins in uncertain settings. As a result, dashboard-based mechanisms that assume fully rational or risk-neutral bidder responses to dashboards can produce significant estimation errors when inferring private preferences, which may lead to suboptimal allocations in practice. Explicitly modeling agents' behavioral responses to dashboards substantially improves inference accuracy, highlighting the need to align visualization design and econometric inference assumptions in practice.
We introduce the first implementable framework for corrigibility, with provable guarantees in multi-step, partially observed environments. Our framework replaces a single opaque reward with five *structurally separate* utility heads -- deference, switch-access preservation, truthfulness, low-impact behavior via a belief-based extension of Attainable Utility Preservation, and bounded task reward -- combined lexicographically by strict weight gaps. Theorem 1 proves exact single-round corrigibility in the partially observable off-switch game; Theorem 3 extends the guarantee to multi-step, self-spawning agents, showing that even if each head is \emph{learned} to mean-squared error $\varepsilon$ and the planner is $\varepsilon$-sub-optimal, the probability of violating \emph{any} safety property is bounded while still ensuring net human benefit. In contrast to Constitutional AI or RLHF/RLAIF, which merge all norms into one learned scalar, our separation makes obedience and impact-limits dominate even when incentives conflict. For open-ended settings where adversaries can modify the agent, we prove that deciding whether an arbitrary post-hack agent will ever violate corrigibility is undecidable by reduction to the halting problem, then carve out a finite-horizon ``decidable island'' where safety can be certified in randomized polynomial time and verified with privacy-preserving, constant-round zero-knowledge proofs. Consequently, the remaining challenge is the ordinary ML task of data coverage and generalization: reward-hacking risk is pushed into evaluation quality rather than hidden incentive leak-through, giving clearer implementation guidance for today's LLM assistants and future autonomous systems.
We study the fair allocation of indivisible goods under cardinality constraints, where each agent must receive a bundle of fixed size. This models practical scenarios, such as assigning shifts or forming equally sized teams. Recently, variants of envy-freeness up to one/any item (EF1, EFX) were introduced for this setting, based on flips or exchanges of items. Namely, one can define envy-freeness up to one/any flip (EFF1, EFFX), meaning that an agent $i$ does not envy another agent $j$ after performing one or any one-item flip between their bundles that improves the value of $i$. We explore algorithmic aspects of this notion, and our contribution is twofold: we present both algorithmic and impossibility results, highlighting a stark contrast between the classic EFX concept and its flip-based analogue. First, we explore standard techniques used in the literature and show that they fail to guarantee EFFX approximations. On the positive side, we show that we can achieve a constant factor approximation guarantee when agents share a common ranking over item values, based on the well-known envy cycle elimination technique. This idea also leads to a generalized algorithm with approximation guarantees when agents agree on the top $n$ items and their valuation functions are bounded. Finally, we show that an algorithm that maximizes the Nash welfare guarantees a 1/2-EFF1 allocation, and that this bound is tight.
When an old apartment building is demolished and rebuilt, how can we fairly redistribute the new apartments to minimize envy among residents? We reduce this question to a combinatorial optimization problem called the *Min Max Average Cycle Weight* problem. In that problem we seek to assign objects to agents in a way that minimizes the maximum average weight of directed cycles in an associated envy graph. While this problem reduces to maximum-weight matching when starting from a clean slate (achieving polynomial-time solvability), we show that this is not the case when we account for preexisting conditions, such as residents' satisfaction with their original apartments. Whether the problem is polynomial-time solvable in the general case remains an intriguing open problem.
User-generated content (UGC) on social media platforms is vulnerable to incitements and manipulations, necessitating effective regulations. To address these challenges, those platforms often deploy automated content moderators tasked with evaluating the harmfulness of UGC and filtering out content that violates established guidelines. However, such moderation inevitably gives rise to strategic responses from users, who strive to express themselves within the confines of guidelines. Such phenomena call for a careful balance between: 1. ensuring freedom of speech -- by minimizing the restriction of expression; and 2. reducing social distortion -- measured by the total amount of content manipulation. We tackle the problem of optimizing this balance through the lens of mechanism design, aiming at optimizing the trade-off between minimizing social distortion and maximizing free speech. Although determining the optimal trade-off is NP-hard, we propose practical methods to approximate the optimal solution. Additionally, we provide generalization guarantees determining the amount of finite offline data required to approximate the optimal moderator effectively.
Consider costly tasks that add up to the success of a project, and must be fitted by an agent into a given time-frame. This is an instance of the classic budgeted maximization problem, which admits an approximation scheme (FPTAS). Now assume the agent is performing these tasks on behalf of a principal, who is the one to reap the rewards if the project succeeds. The principal must design a contract to incentivize the agent. Is there still an approximation scheme? In this work, our ultimate goal is an algorithm-to-contract transformation, which transforms algorithms for combinatorial problems (like budgeted maximization) to tackle incentive constraints that arise in contract design. Our approach diverges from previous works on combinatorial contract design by avoiding an assumption of black-box access to a demand oracle. We first show how to "lift" the FPTAS for budgeted maximization to obtain the best-possible multiplicative and additive FPTAS for the contract design problem. We establish this through our "local-global" framework, in which the "local" step is to (approximately) solve a two-sided strengthened variant of the demand problem. The "global" step then utilizes the local one to find the approximately optimal contract. We apply our framework to a host of combinatorial constraints including multi-dimensional budgets, budgeted matroid, and budgeted matching constraints. In all cases we achieve an approximation essentially matching the best approximation for the purely algorithmic problem. We also develop a method to tackle multi-agent contract settings, where the team of working agents must abide to combinatorial feasibility constraints.
The Stable Roommates problems are characterized by the preferences of agents over other agents as roommates. A solution is a partition of the agents into pairs that are acceptable to each other (i.e., they are in the preference lists of each other), and the matching is stable (i.e., there do not exist any two agents who prefer each other to their roommates, and thus block the matching). Motivated by real-world applications, and considering that stable roommates problems do not always have solutions, we continue our studies to compute "good-enough" matchings. In addition to the agents' habits and habitual preferences, we consider their networks of preferred friends, and introduce a method to generate personalized solutions to stable roommates problems. We illustrate the usefulness of our method with examples and empirical evaluations.
Traditional combinatorial spectrum auctions mainly rely on fixed bidding and matching processes, which limit participants' ability to adapt their strategies and often result in suboptimal social welfare in dynamic spectrum sharing environments. To address these limitations, we propose a novel approximately truthful combinatorial forward auction scheme with a flexible bidding mechanism aimed at enhancing resource efficiency and maximizing social welfare. In the proposed scheme, each buyer submits a combinatorial bid consisting of the base spectrum demand and adjustable demand ranges, enabling the auctioneer to dynamically optimize spectrum allocation in response to market conditions. To standardize the valuation across heterogeneous frequency bands, we introduce a Spectrum Equivalent Mapping (SEM) coefficient. A greedy matching algorithm is employed to determine winning bids by sorting buyers based on their equivalent unit bid prices and allocating resources within supply constraints. Simulation results demonstrate that the proposed flexible bidding mechanism significantly outperforms existing benchmark methods, achieving notably higher social welfare in dynamic spectrum sharing scenarios.
In this paper, we explore mission assignment and task offloading in an Open Radio Access Network (Open RAN)-based intelligent transportation system (ITS), where autonomous vehicles leverage mobile edge computing for efficient processing. Existing studies often overlook the intricate interdependencies between missions and the costs associated with offloading tasks to edge servers, leading to suboptimal decision-making. To bridge this gap, we introduce Oranits, a novel system model that explicitly accounts for mission dependencies and offloading costs while optimizing performance through vehicle cooperation. To achieve this, we propose a twofold optimization approach. First, we develop a metaheuristic-based evolutionary computing algorithm, namely the Chaotic Gaussian-based Global ARO (CGG-ARO), serving as a baseline for one-slot optimization. Second, we design an enhanced reward-based deep reinforcement learning (DRL) framework, referred to as the Multi-agent Double Deep Q-Network (MA-DDQN), that integrates both multi-agent coordination and multi-action selection mechanisms, significantly reducing mission assignment time and improving adaptability over baseline methods. Extensive simulations reveal that CGG-ARO improves the number of completed missions and overall benefit by approximately 7.1% and 7.7%, respectively. Meanwhile, MA-DDQN achieves even greater improvements of 11.0% in terms of mission completions and 12.5% in terms of the overall benefit. These results highlight the effectiveness of Oranits in enabling faster, more adaptive, and more efficient task processing in dynamic ITS environments.
We develop an operator algebraic framework for infinite games with a continuum of agents and prove that regret based learning dynamics governed by a noncommutative continuity equation converge to a unique quantal response equilibrium under mild regularity assumptions. The framework unifies functional analysis, coarse geometry and game theory by assigning to every game a von Neumann algebra that represents collective strategy evolution. A reflective regret operator within this algebra drives the flow of strategy distributions and its fixed point characterises equilibrium. We introduce the ordinal folding index, a computable ordinal valued metric that measures the self referential depth of the dynamics, and show that it bounds the transfinite time needed for convergence, collapsing to zero on coarsely amenable networks. The theory yields new invariant subalgebra rigidity results, establishes existence and uniqueness of envy free and maximin share allocations in continuum economies, and links analytic properties of regret flows with empirical stability phenomena in large language models. These contributions supply a rigorous mathematical foundation for large scale multi agent systems and demonstrate the utility of ordinal metrics for equilibrium selection.
We study the fair division of indivisible chores among agents with additive disutility functions. We investigate the existence of allocations satisfying the popular fairness notion of envy-freeness up to any chore (EFX), and its multiplicative approximations. The existence of $4$-EFX allocations was recently established by Garg, Murhekar, and Qin (2025). We improve this guarantee by proving the existence of $2$-EFX allocations for all instances with additive disutilities. This approximation was previously known only for restricted instances such as bivalued disutilities (Lin, Wu, and Zhou (2025)) or three agents (Afshinmehr, Ansaripour, Danaei, and Mehlhorn (2024)). We obtain our result by providing a general framework for achieving approximate-EFX allocations. The approach begins with a suitable initial allocation and performs a sequence of local swaps between the bundles of envious and envied agents. For our main result, we begin with an initial allocation that satisfies envy-freeness up to one chore (EF1) and Pareto-optimality (PO); the existence of such an allocation was recently established in a major breakthrough by Mahara (2025). We further demonstrate the strength and generality of our framework by giving simple and unified proofs of existing results, namely (i) $2$-EFX for bivalued instances, (ii) 2-EFX for three agents, (iii) EFX when the number of chores is at most twice the number of agents, and (iv) $4$-EFX for all instances. We expect this framework to have broader applications in approximate-EFX due to its simplicity and generality.
Fair and dynamic energy allocation in community microgrids remains a critical challenge, particularly when serving socio-economically diverse participants. Static optimization and cost-sharing methods often fail to adapt to evolving inequities, leading to participant dissatisfaction and unsustainable cooperation. This paper proposes a novel framework that integrates multi-objective mixed-integer linear programming (MILP), cooperative game theory, and a dynamic equity-adjustment mechanism driven by reinforcement learning (RL). At its core, the framework utilizes a bi-level optimization model grounded in Equity-regarding Welfare Maximization (EqWM) principles, which incorporate Rawlsian fairness to prioritize the welfare of the least advantaged participants. We introduce a Proximal Policy Optimization (PPO) agent that dynamically adjusts socio-economic weights in the optimization objective based on observed inequities in cost and renewable energy access. This RL-powered feedback loop enables the system to learn and adapt, continuously striving for a more equitable state. To ensure transparency, Explainable AI (XAI) is used to interpret the benefit allocations derived from a weighted Shapley value. Validated across six realistic scenarios, the framework demonstrates peak demand reductions of up to 72.6%, and significant cooperative gains. The adaptive RL mechanism further reduces the Gini coefficient over time, showcasing a pathway to truly sustainable and fair energy communities.
We study the fair division problem of allocating $m$ indivisible goods to $n$ agents with additive personalized bi-valued utilities. Specifically, each agent $i$ assigns one of two positive values $a_i > b_i > 0$ to each good, indicating that agent $i$'s valuation of any good is either $a_i$ or $b_i$. For convenience, we denote the value ratio of agent $i$ as $r_i = a_i / b_i$. We give a characterization to all the Pareto-optimal allocations. Our characterization implies a polynomial-time algorithm to decide if a given allocation is Pareto-optimal in the case each $r_i$ is an integer. For the general case (where $r_i$ may be fractional), we show that this decision problem is coNP-complete. Our result complements the existing results: this decision problem is coNP-complete for tri-valued utilities (where each agent's value for each good belongs to $\{a,b,c\}$ for some prescribed $a>b>c\geq0$), and this decision problem belongs to P for bi-valued utilities (where $r_i$ in our model is the same for each agent). We further show that an EFX allocation always exists and can be computed in polynomial time under the personalized bi-valued utilities setting, which extends the previous result on bi-valued utilities. We propose the open problem of whether an EFX and Pareto-optimal allocation always exists (and can be computed in polynomial time).
In the context of single-winner ranked-choice elections between $m$ candidates, we explore the tradeoff between two competing goals in every democratic system: the majority principle (maximizing the social welfare) and the minority principle (safeguarding minority groups from overly bad outcomes).To measure the social welfare, we use the well-established framework of metric distortion subject to various objectives: utilitarian (i.e., total cost), $\alpha$-percentile (e.g., median cost for $\alpha = 1/2$), and egalitarian (i.e., max cost). To measure the protection of minorities, we introduce the $\ell$-mutual minority criterion, which requires that if a sufficiently large (parametrized by $\ell$) coalition $T$ of voters ranks all candidates in $S$ lower than all other candidates, then none of the candidates in $S$ should win. The highest $\ell$ for which the criterion is satisfied provides a well-defined measure of mutual minority protection (ranging from 1 to $m$). Our main contribution is the analysis of a recently proposed class of voting rules called $k$-Approval Veto, offering a comprehensive range of trade-offs between the two principles. This class spans between Plurality Veto (for $k=1$) - a simple voting rule achieving optimal metric distortion - and Vote By Veto (for $k=m$) which picks a candidate from the proportional veto core. We show that $k$-Approval Veto has minority protection at least $k$, and thus, it accommodates any desired level of minority protection. However, this comes at the price of lower social welfare. For the utilitarian objective, the metric distortion increases linearly in $k$. For the $\alpha$-percentile objective, the metric distortion is the optimal value of 5 for $\alpha \ge k/(k+1)$ and unbounded for $\alpha < k/(k+1)$. For the egalitarian objective, the metric distortion is the optimal value of 3 for all values of $k$.
Understanding and predicting the behavior of large-scale multi-agents in games remains a fundamental challenge in multi-agent systems. This paper examines the role of heterogeneity in equilibrium formation by analyzing how smooth regret-matching drives a large number of heterogeneous agents with diverse initial policies toward unified behavior. By modeling the system state as a probability distribution of regrets and analyzing its evolution through the continuity equation, we uncover a key phenomenon in diverse multi-agent settings: the variance of the regret distribution diminishes over time, leading to the disappearance of heterogeneity and the emergence of consensus among agents. This universal result enables us to prove convergence to quantal response equilibria in both competitive and cooperative multi-agent settings. Our work advances the theoretical understanding of multi-agent learning and offers a novel perspective on equilibrium selection in diverse game-theoretic scenarios.
Dynamic game theory offers a toolbox for formalizing and solving for both cooperative and non-cooperative strategies in multi-agent scenarios. However, the optimal configuration of such games remains largely unexplored. While there is existing literature on the parametrization of dynamic games, little research examines this parametrization from a strategic perspective where each agent's configuration choice is influenced by the decisions of others. In this work, we introduce the concept of a game of configuration, providing a framework for the strategic fine-tuning of differential games. We define a game of configuration as a two-stage game within the setting of finite-horizon, affine-quadratic, AQ, differential games. In the first stage, each player chooses their corresponding configuration parameter, which will impact their dynamics and costs in the second stage. We provide the subgame perfect solution concept and a method for computing first stage cost gradients over the configuration space. This then allows us to formulate a gradient-based method for searching for local solutions to the configuration game, as well as provide necessary conditions for equilibrium configurations over their downstream (second stage) trajectories. We conclude by demonstrating the effectiveness of our approach in example AQ systems, both zero-sum and general-sum.