We consider the computational complexity of computing Bayes-Nash equilibria in first-price auctions, where the bidders' values for the item are drawn from a general (possibly correlated) joint distribution. We show that when the values and the bidding space are discrete, determining the existence of a pure Bayes-Nash equilibrium is NP-hard. This is the first hardness result for this problem in the literature that does not rely on assumptions of subjective priors or convoluted tie-breaking rules. We then present two main approaches for achieving positive results, via bid sparsification and via bid densification. The former is more combinatorial and is based on enumeration techniques, whereas the latter makes use of the continuous theory of the problem developed in the economics literature. Using these approaches, we develop polynomial-time approximation algorithms for computing equilibria in symmetric settings or settings with a fixed number of bidders, for different (discrete or continuous) variants of the auction.
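To make the discrete setting concrete, here is a minimal brute-force sketch in Python: for a toy correlated prior over values and a small bid grid (all numbers are illustrative assumptions, not taken from the paper), it enumerates every pure bid-function profile and tests whether any is a pure Bayes-Nash equilibrium under uniform random tie-breaking. It only illustrates the decision problem; the paper's sparsification and densification algorithms are not reproduced.

from itertools import product

values = [1, 2]          # discrete value space
bids = [0, 1, 2]         # discrete bid space
# joint (possibly correlated) prior over the two bidders' values
prior = {(1, 1): 0.4, (1, 2): 0.1, (2, 1): 0.1, (2, 2): 0.4}

def expected_utility(i, profile):
    """Ex-ante expected utility of bidder i under a profile of pure bid functions."""
    total = 0.0
    for (v1, v2), p in prior.items():
        vals = (v1, v2)
        b = (profile[0][v1], profile[1][v2])
        top = max(b)
        winners = [j for j in range(2) if b[j] == top]
        if i in winners:
            total += p * (vals[i] - b[i]) / len(winners)   # uniform random tie-breaking
    return total

all_bid_functions = [dict(zip(values, bs)) for bs in product(bids, repeat=len(values))]

def is_pure_bne(profile):
    """No bidder can gain by deviating to another pure bid function."""
    for i in range(2):
        base = expected_utility(i, profile)
        for s in all_bid_functions:
            deviation = list(profile)
            deviation[i] = s
            if expected_utility(i, deviation) > base + 1e-12:
                return False
    return True

equilibria = [pr for pr in product(all_bid_functions, repeat=2) if is_pure_bne(pr)]
print(len(all_bid_functions) ** 2, "profiles checked,", len(equilibria), "pure equilibria found")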
Machine learning is now ubiquitous in societal decision-making, for example in evaluating job candidates or loan applications, and it is increasingly important to take into account how classified agents will react to the learning algorithms. The majority of recent literature on strategic classification has focused on reducing and countering deceptive behaviors by the classified agents, but recent work of Attias et al. identifies surprising properties of learnability when the agents genuinely improve in order to attain the desired classification, such as smaller generalization error than in standard PAC learning. In this paper we characterize so-called learnability with improvements across multiple new axes. We introduce an asymmetric variant of minimally consistent concept classes and use it to provide an exact characterization of proper learning with improvements in the realizable setting. While prior work studies learnability only under general, arbitrary agent improvement regions, we give positive results for more natural Euclidean ball improvement sets. In particular, we characterize improper learning under a mild generative assumption on the data distribution. We further show how to learn in more challenging settings, achieving lower generalization error under well-studied bounded noise models and obtaining mistake bounds in realizable and agnostic online learning. We resolve open questions posed by Attias et al. for both proper and improper learning.
We consider the cooperative elements that arise in the design of public goods, such as transportation policies and infrastructure. These involve a variety of stakeholders: governments, businesses, advocates, and users. Their eventual deployment depends on the decision maker's ability to garner sufficient support from each of these groups; we formalize these strategic requirements from the perspective of cooperative game theory. Specifically, we introduce non-transferable utility, linear production (NTU LP) games, which combine the game-theoretic tensions inherent in public decision-making with the modeling flexibility of linear programming. We derive structural properties regarding the non-emptiness, representability and complexity of the core, a solution concept that models the viability of cooperation. In particular, we provide fairly general sufficient conditions under which the core of an NTU LP game is guaranteed to be non-empty, prove that determining membership in the core is co-NP-complete, and develop a cutting plane algorithm to optimize various social welfare objectives subject to core membership. Lastly, we apply these results in a data-driven case study on service plan optimization for the Chicago bus system. As our study illustrates, cooperation is necessary for the successful deployment of transportation service plans and similar public goods, but it may also have adverse or counterintuitive distributive implications.
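As a concrete, simplified illustration of the objects involved, the sketch below sets up a small transferable-utility linear production game (a classical special case, not the paper's NTU setting) and checks core membership of a payoff vector by brute force over coalitions. The production matrix, endowments, and prices are illustrative assumptions, and the paper's cutting-plane algorithm is not reproduced.

from itertools import combinations
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0],                # resource use per unit of each product
              [3.0, 1.0]])
c = np.array([3.0, 4.0])                 # revenue per unit of each product
resources = {0: np.array([4.0, 2.0]),    # player -> resource endowment
             1: np.array([2.0, 5.0]),
             2: np.array([3.0, 3.0])}
players = list(resources)

def coalition_value(S):
    """v(S) = max c^T x subject to A x <= pooled resources of S, x >= 0."""
    b = sum(resources[i] for i in S)
    res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * len(c))
    return -res.fun

def in_core(payoff, tol=1e-7):
    """Efficient and no coalition can guarantee more on its own (TU core test)."""
    if abs(sum(payoff.values()) - coalition_value(players)) > tol:
        return False
    for r in range(1, len(players)):
        for S in combinations(players, r):
            if sum(payoff[i] for i in S) < coalition_value(S) - tol:
                return False
    return True

grand = coalition_value(players)
equal_split = {i: grand / len(players) for i in players}
print(f"grand coalition value {grand:.2f}; equal split in core: {in_core(equal_split)}")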
Recent work [Soleymani et al., 2025] introduced a variant of Optimistic Multiplicative Weights Updates (OMWU) that adaptively controls the learning pace in a dynamic, non-monotone manner, achieving new state-of-the-art regret minimization guarantees in general games. In this work, we demonstrate that no-regret learning acceleration through adaptive pacing of the learners is not an isolated phenomenon. We introduce \emph{Cautious Optimism}, a framework for substantially faster regularized learning in general games. Cautious Optimism takes as input any instance of Follow-the-Regularized-Leader (FTRL) and outputs an accelerated no-regret learning algorithm by pacing the underlying FTRL with minimal computational overhead. Importantly, we retain uncoupledness (learners do not need to know other players' utilities). Cautious Optimistic FTRL achieves near-optimal $O_T(\log T)$ regret in diverse self-play (mixing-and-matching regularizers) while preserving the optimal $O(\sqrt{T})$ regret in adversarial scenarios. In contrast to prior works (e.g. Syrgkanis et al. [2015], Daskalakis et al. [2021]), our analysis does not rely on monotonic step-sizes, showcasing a novel route for fast learning in general games.
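For context, the sketch below runs vanilla Optimistic Multiplicative Weights Updates in self-play on a small zero-sum game; it shows only the base FTRL-style dynamics that Cautious Optimism paces, and the adaptive pacing itself is not reproduced. The payoff matrix, step size, and horizon are illustrative assumptions.

import numpy as np

A = np.array([[0.0, -1.0, 1.0],    # row player's payoff (rock-paper-scissors)
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])
eta, T = 0.1, 2000

def omwu_step(x, g_prev, g_curr, eta):
    """Optimistic MWU: use the most recent gradient also as a prediction of the next one."""
    logits = np.log(x) + eta * (2 * g_curr - g_prev)
    w = np.exp(logits - logits.max())
    return w / w.sum()

x = np.ones(3) / 3
y = np.ones(3) / 3
gx_prev, gy_prev = np.zeros(3), np.zeros(3)
for t in range(T):
    gx, gy = A @ y, -A.T @ x            # each player's utility gradient
    x = omwu_step(x, gx_prev, gx, eta)
    y = omwu_step(y, gy_prev, gy, eta)
    gx_prev, gy_prev = gx, gy
print("row strategy after self-play:", np.round(x, 3))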
In budget-feasible mechanism design, there is a set of items $U$, each owned by a distinct seller. The seller of item $e$ incurs a private cost $\overline{c}_e$ for supplying her item. A buyer wishes to procure a set of items from the sellers of maximum value, where the value of a set $S\subseteq U$ of items is given by a valuation function $v:2^U\to \mathbb{R}_+$. The buyer has a budget of $B \in \mathbb{R}_+$ for the total payments made to the sellers. We wish to design a mechanism that is truthful, that is, sellers are incentivized to report their true costs, budget-feasible, that is, the sum of the payments made to the sellers is at most the budget $B$, and that outputs a set whose value is large compared to $\text{OPT}:=\max\{v(S):\overline{c}(S)\le B,S\subseteq U\}$. Budget-feasible mechanism design has been extensively studied, with the literature focussing on (classes of) subadditive valuation functions, and various polytime, budget-feasible mechanisms, achieving constant-factor approximation, have been devised for the special cases of additive, submodular, and XOS valuations. However, for general subadditive valuations, the best-known approximation factor achievable by a polytime budget-feasible mechanism (given access to demand oracles) was only $O(\log n / \log \log n)$, where $n$ is the number of items. We improve this state-of-the-art significantly by designing a budget-feasible mechanism for subadditive valuations that \emph{achieves a substantially-improved approximation factor of $O(\log\log n)$ and runs in polynomial time, given access to demand oracles.}
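The benchmark $\text{OPT}$ above can be computed by brute force on tiny instances; the sketch below does so for an additive valuation with illustrative costs, values, and budget. It computes only the benchmark, not the paper's truthful budget-feasible mechanism for subadditive valuations.

from itertools import combinations

costs = {"a": 2.0, "b": 3.0, "c": 4.0, "d": 1.0}    # sellers' costs (here taken as reported)
values = {"a": 3.0, "b": 5.0, "c": 4.0, "d": 2.0}   # additive valuation: v(S) = sum over S
B = 6.0                                             # buyer's budget

def opt(costs, values, budget):
    """max{v(S) : c(S) <= budget} by enumerating all subsets."""
    items = list(costs)
    best_value, best_set = 0.0, ()
    for r in range(len(items) + 1):
        for S in combinations(items, r):
            if sum(costs[e] for e in S) <= budget:
                v = sum(values[e] for e in S)
                if v > best_value:
                    best_value, best_set = v, S
    return best_value, best_set

print(opt(costs, values, B))   # the benchmark against which the mechanism's value is compared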
In this paper, we analyze the misère versions of two impartial combinatorial games: k-Bounded Greedy Nim and Greedy Nim. We present a complete solution to both games by giving necessary and sufficient conditions for a position to be a P-position.
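A small search makes the game concrete: in Greedy Nim a move removes any positive number of tokens from a currently largest pile, and under misère play the player who takes the last token loses. The memoized sketch below labels P-positions (positions where the player to move loses) for tiny pile configurations; it illustrates the game rather than the paper's closed-form characterization, and the pile sizes are illustrative.

from functools import lru_cache

@lru_cache(maxsize=None)
def is_p_position(piles):
    """True if the player to move loses under misère play; piles is a sorted tuple of positive ints."""
    if not piles:
        return False                        # no move available: under misère play the mover wins
    largest = piles[-1]
    for take in range(1, largest + 1):      # remove any positive number from a largest pile
        remaining = largest - take
        nxt = tuple(sorted(piles[:-1] + ((remaining,) if remaining else ())))
        if is_p_position(nxt):
            return False                    # some move reaches a P-position, so the mover wins
    return True

for pos in [(1,), (2,), (1, 1), (2, 2), (1, 3), (2, 2, 3)]:
    print(pos, "->", "P" if is_p_position(pos) else "N")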
The burgeoning growth of the esports and multiplayer online gaming community has highlighted the critical importance of evaluating the Most Valuable Player (MVP). Establishing an explainable and practical MVP evaluation method is challenging. In our study, we specifically focus on play-by-play data, which records in-game events such as assists and points. We aim to address these challenges by introducing a new MVP evaluation framework, denoted as \oursys, which leverages Shapley values. This approach encompasses feature processing, win-loss model training, Shapley value allocation, and MVP ranking determination based on players' contributions. Additionally, we optimize our algorithm to align with expert voting results from the perspective of causality. Finally, we substantiate the efficacy of our method through validation on the NBA dataset and the Dunk City Dynasty dataset, and we deploy it online in industry.
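The Shapley-value allocation step can be illustrated in isolation: the sketch below computes exact Shapley values for a toy characteristic function standing in for the trained win-loss model (the value numbers are illustrative assumptions); the feature processing, model training, and expert-vote alignment of the framework are not reproduced.

from itertools import permutations

players = ["A", "B", "C"]
# v(S): toy "win contribution" of each subset of players (stand-in for the model's output)
v = {frozenset(): 0.0,
     frozenset("A"): 0.3, frozenset("B"): 0.2, frozenset("C"): 0.1,
     frozenset("AB"): 0.7, frozenset("AC"): 0.5, frozenset("BC"): 0.4,
     frozenset("ABC"): 1.0}

def shapley(players, v):
    """Average marginal contribution of each player over all join orders."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        seen = frozenset()
        for p in order:
            phi[p] += v[seen | {p}] - v[seen]
            seen = seen | {p}
    return {p: phi[p] / len(orders) for p in players}

print(shapley(players, v))   # ranking these contributions gives the MVP order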
Users of social media platforms based on recommendation systems (RecSys) (e.g. TikTok, X, YouTube) strategically interact with platform content to influence future recommendations. On some such platforms, users have been documented to form large-scale grassroots movements encouraging others to purposefully interact with algorithmically suppressed content in order to "boost" its recommendation; we term this behavior user altruism. To capture this behavior, we study a game between users and a RecSys, where users provide the RecSys (potentially manipulated) preferences over the contents available to them, and the RecSys -- limited by data and computation constraints -- creates a low-rank approximation preference matrix, and ultimately provides each user her (approximately) most-preferred item. We compare the users' social welfare under truthful preference reporting and under a class of strategies capturing user altruism. In our theoretical analysis, we provide sufficient conditions to ensure strict increases in user social welfare under user altruism, and provide an algorithm to find an effective altruistic strategy. Interestingly, we show that for commonly assumed recommender utility functions, effectively altruistic strategies also improve the utility of the RecSys! We show that our results are robust to several model misspecifications, thus strengthening our conclusions. Our theoretical analysis is complemented by empirical results of effective altruistic strategies on the GoodReads dataset, and an online survey on how real-world users behave altruistically in RecSys. Overall, our findings serve as a proof-of-concept of the reasons why traditional RecSys may incentivize users to form collectives and/or follow altruistic strategies when interacting with them.
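A minimal sketch of the model described above: the RecSys receives a (possibly manipulated) preference matrix, forms a rank-k approximation, and recommends each user her approximately most-preferred item, after which social welfare is measured against the true preferences. The matrices and rank are illustrative assumptions, and the paper's construction of effective altruistic reports is not reproduced.

import numpy as np

rng = np.random.default_rng(0)
true_prefs = rng.random((6, 5))      # users x items, true preferences
reported = true_prefs.copy()         # truthful reports (an altruistic report would differ)
k = 2                                # rank budget of the RecSys

def recommend(reported, k):
    """Rank-k approximation of the reported matrix, then each user's top item."""
    U, s, Vt = np.linalg.svd(reported, full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k]
    return approx.argmax(axis=1)

def social_welfare(true_prefs, items):
    return sum(true_prefs[u, i] for u, i in enumerate(items))

items = recommend(reported, k)
print("recommended items:", items, "welfare:", round(social_welfare(true_prefs, items), 3))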
The rise of Large Language Models (LLMs) has driven progress in reasoning tasks -- from program synthesis to scientific hypothesis generation -- yet their ability to handle ranked preferences and structured algorithms in combinatorial domains remains underexplored. We study matching markets, a core framework behind applications like resource allocation and ride-sharing, which require reconciling individual ranked preferences to ensure stable outcomes. We evaluate several state-of-the-art models on a hierarchy of preference-based reasoning tasks -- ranging from stable-matching generation to instability detection, instability resolution, and fine-grained preference queries -- to systematically expose their logical and algorithmic limitations in handling ranked inputs. Surprisingly, even top-performing models with advanced reasoning struggle to resolve instability in large markets, often failing to identify blocking pairs or execute algorithms iteratively. We further show that parameter-efficient fine-tuning (LoRA) significantly improves performance in small markets, but fails to bring about a similar improvement on large instances, suggesting the need for more sophisticated strategies to improve LLMs' reasoning with larger-context inputs.
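As a reference for two of the tasks above, stable-matching generation and instability detection, the sketch below implements proposer-optimal deferred acceptance and a blocking-pair check on illustrative preference lists; the LLM evaluation and LoRA fine-tuning are of course not reproduced here.

def deferred_acceptance(prop_prefs, recv_prefs):
    """Gale-Shapley deferred acceptance; returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in recv_prefs.items()}
    free = list(prop_prefs)
    next_idx = {p: 0 for p in prop_prefs}
    engaged = {}                                   # receiver -> proposer
    while free:
        p = free.pop()
        r = prop_prefs[p][next_idx[p]]
        next_idx[p] += 1
        if r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:     # r prefers the new proposer
            free.append(engaged[r])
            engaged[r] = p
        else:
            free.append(p)
    return {p: r for r, p in engaged.items()}

def blocking_pair(matching, prop_prefs, recv_prefs):
    """Return a pair preferring each other to their partners, or None if the matching is stable."""
    recv_match = {r: p for p, r in matching.items()}
    for p, prefs in prop_prefs.items():
        for r in prefs:
            if r == matching[p]:
                break                              # p reached their own partner: no better r left
            if recv_prefs[r].index(p) < recv_prefs[r].index(recv_match[r]):
                return (p, r)
    return None

prop_prefs = {"p1": ["r1", "r2"], "p2": ["r1", "r2"]}
recv_prefs = {"r1": ["p2", "p1"], "r2": ["p1", "p2"]}
m = deferred_acceptance(prop_prefs, recv_prefs)
print(m, blocking_pair(m, prop_prefs, recv_prefs))   # a stable matching has no blocking pair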
Kidney Exchange Programmes (KEPs) facilitate the exchange of kidneys, and larger pools of recipient-donor pairs tend to yield proportionally more transplants, leading to the proposal of international KEPs (IKEPs). However, as studied by \citet{mincu2021ip}, practical limitations must be considered in IKEPs to ensure that countries remain willing to participate. Thus, we study IKEPs with country-specific parameters, represented by a tuple $\Gamma$, restricting the selected transplants to be feasible for the countries to conduct, e.g., imposing an upper limit on the number of consecutive exchanges within a country's borders. We provide a complete complexity dichotomy for the problem of finding a feasible (according to the constraints given by $\Gamma$) cycle packing with the maximum number of transplants, for every possible $\Gamma$. We also study the potential for countries to misreport their parameters to increase their allocation. As manipulation can harm the total number of transplants, we propose a novel individually rational and incentive compatible mechanism $\mathcal{M}_{\text{order}}$. We first give a theoretical approximation ratio for $\mathcal{M}_{\text{order}}$ in terms of the number of transplants, and show that the approximation ratio of $\mathcal{M}_{\text{order}}$ is asymptotically optimal. We then use simulations which suggest that, in practice, the performance of $\mathcal{M}_{\text{order}}$ is significantly better than this worst-case ratio.
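The underlying optimization can be made concrete on a toy instance: the sketch below enumerates directed exchange cycles of bounded length in a small compatibility graph and brute-forces a vertex-disjoint cycle packing maximizing the number of transplants. The graph is an illustrative assumption, and neither the country-specific constraints $\Gamma$ nor the mechanism $\mathcal{M}_{\text{order}}$ is modeled.

from itertools import combinations

arcs = {(1, 2), (2, 1), (2, 3), (3, 4), (4, 2), (4, 5), (5, 4)}   # pair u's donor can give to pair v
pairs = {1, 2, 3, 4, 5}
max_len = 3                                                       # maximum exchange cycle length

def short_cycles():
    """All directed simple cycles of length <= max_len, as vertex sets."""
    cycles = set()
    def extend(path):
        for (u, v) in arcs:
            if u != path[-1]:
                continue
            if v == path[0] and len(path) >= 2:
                cycles.add(frozenset(path))
            elif v not in path and len(path) < max_len:
                extend(path + [v])
    for s in pairs:
        extend([s])
    return list(cycles)

def best_packing(cycles):
    """Brute force: vertex-disjoint cycles covering the most pairs (transplants)."""
    best, best_count = (), 0
    for r in range(1, len(cycles) + 1):
        for combo in combinations(cycles, r):
            covered = set().union(*combo)
            if sum(len(c) for c in combo) == len(covered) and len(covered) > best_count:
                best, best_count = combo, len(covered)
    return best, best_count

packing, transplants = best_packing(short_cycles())
print(transplants, "transplants via cycles:", [sorted(c) for c in packing])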
The convergence of online learning algorithms in games under self-play is a fundamental question in game theory and machine learning. Among various notions of convergence, last-iterate convergence is particularly desirable, as it reflects the actual decisions made by the learners and captures the day-to-day behavior of the learning dynamics. While many algorithms are known to converge in the average-iterate, achieving last-iterate convergence typically requires considerably more effort in both the design and the analysis of the algorithm. Somewhat surprisingly, we show in this paper that for a large family of games, there exists a simple black-box reduction that transforms the average iterates of an uncoupled learning dynamics into the last iterates of a new uncoupled learning dynamics, thus also providing a reduction from last-iterate convergence to average-iterate convergence. Our reduction applies to games where each player's utility is linear in both their own strategy and the joint strategy of all opponents. This family includes two-player bimatrix games and generalizations such as multi-player polymatrix games. By applying our reduction to the Optimistic Multiplicative Weights Update algorithm, we obtain new state-of-the-art last-iterate convergence rates for uncoupled learning dynamics in two-player zero-sum normal-form games: (1) an $O(\frac{\log d}{T})$ last-iterate convergence rate under gradient feedback, representing an exponential improvement in the dependence on the dimension $d$ (i.e., the maximum number of actions available to either player); and (2) an $\widetilde{O}(d^{\frac{1}{5}} T^{-\frac{1}{5}})$ last-iterate convergence rate under bandit feedback, improving upon the previous best rates of $\widetilde{O}(\sqrt{d} T^{-\frac{1}{8}})$ and $\widetilde{O}(\sqrt{d} T^{-\frac{1}{6}})$.
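The distinction between the two convergence notions is easy to see numerically: in matching pennies, the time-averaged iterates of plain Multiplicative Weights Update approach the equilibrium while the last iterates do not converge. The sketch below measures the duality gap of both; the game, step size, and horizon are illustrative assumptions, and the paper's black-box reduction and OMWU rates are not reproduced.

import numpy as np

A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])        # matching pennies; the unique equilibrium plays (0.5, 0.5)
eta, T = 0.05, 5000

def mwu(x, grad, eta):
    w = x * np.exp(eta * grad)
    return w / w.sum()

x, y = np.array([0.7, 0.3]), np.array([0.4, 0.6])
avg_x, avg_y = np.zeros(2), np.zeros(2)
for t in range(T):
    x, y = mwu(x, A @ y, eta), mwu(y, -A.T @ x, eta)   # simultaneous self-play updates
    avg_x += x / T
    avg_y += y / T

def duality_gap(x, y):
    """How much each player could gain by best-responding; zero exactly at equilibrium."""
    return (A @ y).max() - (A.T @ x).min()

print("last-iterate gap:", round(duality_gap(x, y), 3),
      "| average-iterate gap:", round(duality_gap(avg_x, avg_y), 3))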
Strategic litigation involves bringing a legal case to court with the goal of having a broader impact beyond resolving the case itself: for example, creating precedent which will influence future rulings. In this paper, we explore strategic litigation from the perspective of machine learning theory. We consider an abstract model of a common-law legal system where a lower court decides new cases by applying a decision rule learned from a higher court's past rulings. In this model, we explore the power of a strategic litigator, who strategically brings cases to the higher court to influence the learned decision rule, thereby affecting future cases. We explore questions including: What impact can a strategic litigator have? Which cases should a strategic litigator bring to court? Does it ever make sense for a strategic litigator to bring a case when they are sure the court will rule against them?
We consider an extension of the classic position auctions in which sponsored creatives can be added within AI-generated content rather than shown in predefined slots. New challenges arise from the natural requirement that sponsored creatives should fit smoothly into the context. With the help of advanced LLM technologies, it becomes viable to accurately estimate the benefit of adding each individual sponsored creative at each potential position within the AI-generated content by properly taking the context into account. Therefore, we assume one click-through rate estimate for each position-creative pair, rather than one uniform estimate per sponsored creative across all positions as in the classic setting. As a result, the underlying optimization becomes a general matching problem, and the substitution effects must be treated more carefully than in standard position auction settings, where the slots are independent of each other. In this work, we formalize a concrete mathematical model of the extended position auction problem and study the welfare-maximization and revenue-maximization mechanism design problems. Formally, we consider two different user behavior models and solve the corresponding mechanism design problems. For the Multinomial Logit (MNL) model, which is order-insensitive, we can efficiently implement the optimal mechanisms. For the cascade model, which is order-sensitive, we provide approximately optimal solutions.
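With one click-through rate estimate per position-creative pair and clicks treated as independent across positions (a simplification that ignores the MNL and cascade user models above), welfare maximization reduces to an assignment problem; the sketch below solves a toy instance with the Hungarian method. The CTRs and per-click values are illustrative assumptions, and the paper's mechanism design results are not reproduced.

import numpy as np
from scipy.optimize import linear_sum_assignment

ctr = np.array([[0.10, 0.04, 0.02],            # creatives x positions: estimated CTR per pair
                [0.06, 0.05, 0.03],
                [0.08, 0.07, 0.01]])
value_per_click = np.array([5.0, 8.0, 6.0])    # advertiser value for a click on each creative

welfare = ctr * value_per_click[:, None]       # expected welfare of each creative-position pair
rows, cols = linear_sum_assignment(welfare, maximize=True)
print("assignment:", list(zip(rows, cols)), "welfare:", round(welfare[rows, cols].sum(), 3))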
As AI technologies improve, people are increasingly willing to delegate tasks to AI agents. In many cases, the human decision-maker chooses whether to delegate to an AI agent based on properties of the specific instance of the decision-making problem they are facing. Since humans typically lack full awareness of all the factors relevant to this choice for a given decision-making instance, they perform a kind of categorization by treating indistinguishable instances -- those that have the same observable features -- as the same. In this paper, we define the problem of designing the optimal algorithmic delegate in the presence of categories. This is an important dimension in the design of algorithms to work with humans, since we show that the optimal delegate can be an arbitrarily better teammate than the optimal standalone algorithmic agent. The solution to this optimal delegation problem is not obvious: we discover that this problem is fundamentally combinatorial, and illustrate the complex relationship between the optimal design and the properties of the decision-making task even in simple settings. Indeed, we show that finding the optimal delegate is computationally hard in general. However, we are able to find efficient algorithms for producing the optimal delegate in several broad cases of the problem, including when the optimal action may be decomposed into functions of features observed by the human and the algorithm. Finally, we run computational experiments to simulate a designer updating an algorithmic delegate over time to be optimized for when it is actually adopted by users, and show that while this process does not recover the optimal delegate in general, the resulting delegate often performs quite well.
Decentralized exchanges (DEXs) are crucial to decentralized finance (DeFi) as they enable trading without intermediaries. However, they face challenges like impermanent loss (IL), where liquidity providers (LPs) see their assets' value change unfavorably within a liquidity pool compared to outside it. To tackle these issues, we propose dynamic fee mechanisms over the traditional fixed-fee structures used in automated market makers (AMMs). Our solution includes asymmetric fees via block-adaptive and deal-adaptive fee algorithms, as well as an "ideal but unattainable" oracle-based fee algorithm that utilizes all data available to arbitrageurs, to mitigate IL. We developed a simulation-based framework to compare these fee algorithms systematically. This framework replicates trading on a DEX, considering both informed and uninformed users and a psychological relative loss factor. Results show that the adaptive algorithms outperform fixed-fee baselines in reducing IL while maintaining trading activity among uninformed users. Additionally, insights from the oracle-based performance underscore the potential of dynamic fee strategies to lower IL, boost LP profitability, and enhance overall market efficiency.
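For background on the quantities involved, the sketch below implements a constant-product AMM swap with a (here fixed) fee and the standard impermanent-loss formula for a relative price move; the block-adaptive, deal-adaptive, and oracle-based fee algorithms and the simulation framework from the paper are not reproduced, and the pool sizes, fee, and price ratio are illustrative assumptions.

def swap_x_for_y(x_reserve, y_reserve, dx, fee):
    """Trader sends dx of X; the pool keeps x*y constant on the fee-adjusted input."""
    dx_eff = dx * (1 - fee)
    dy = y_reserve * dx_eff / (x_reserve + dx_eff)
    return x_reserve + dx, y_reserve - dy, dy

def impermanent_loss(price_ratio):
    """Relative loss of an LP position vs. simply holding, for new/old price ratio r."""
    r = price_ratio
    return 2 * r ** 0.5 / (1 + r) - 1

x, y, fee = 1000.0, 1000.0, 0.003              # pool reserves of X and Y, initial price 1:1
x, y, received = swap_x_for_y(x, y, 50.0, fee)
print(f"pool after swap: x={x:.2f}, y={y:.2f}; trader received {received:.2f} Y")
print(f"IL if the external price doubles: {impermanent_loss(2.0):.2%}")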
Proportional response is a well-established distributed algorithm which has been shown to converge to competitive equilibria in both Fisher and Arrow-Debreu markets, for various sub-families of homogeneous utilities, including linear and constant elasticity of substitution utilities. We propose a natural generalization of proportional response for gross substitutes utilities, and prove that it converges to competitive equilibria in Fisher markets. This is the first convergence result of a proportional response style dynamics in Fisher markets for utilities beyond the homogeneous utilities covered by the Eisenberg-Gale convex program. We show an empirical convergence rate of $O(1/T)$ for the prices. Furthermore, we show that the allocations of a lazy version of the generalized proportional response dynamics converge to competitive equilibria in Arrow-Debreu markets.
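For reference, the sketch below runs the classic proportional response dynamics in a linear Fisher market, the homogeneous special case that the abstract generalizes to gross substitutes utilities (that generalization is not reproduced here); the budgets and valuation matrix are illustrative assumptions.

import numpy as np

u = np.array([[2.0, 1.0, 0.5],     # buyers x goods: linear utility coefficients
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 2.0]])
budgets = np.array([1.0, 1.0, 1.0])
n, m = u.shape

bids = np.full((n, m), 1.0 / m) * budgets[:, None]   # start by splitting budgets evenly
for t in range(500):
    prices = bids.sum(axis=0)                        # p_j = total bids on good j
    alloc = bids / prices                            # x_ij = b_ij / p_j (unit supply)
    utilities = (u * alloc).sum(axis=1)
    # proportional response: bid on each good in proportion to the utility it contributed
    bids = budgets[:, None] * (u * alloc) / utilities[:, None]

print("approximate equilibrium prices:", np.round(prices, 4))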
This paper investigates the role of mediators in Bayesian games by examining their impact on social welfare through the price of anarchy (PoA) and price of stability (PoS). Mediators can communicate with players to guide them toward equilibria of varying quality, and different communication protocols lead to a variety of equilibrium concepts collectively known as Bayes (coarse) correlated equilibria. To analyze these equilibrium concepts, we consider a general class of Bayesian games with submodular social welfare, which naturally extends valid utility games and their variant, basic utility games. These frameworks, introduced by Vetta (2002), have been developed to analyze the social welfare guarantees of equilibria in games such as competitive facility location, influence maximization, and other resource allocation problems. We provide upper and lower bounds on the PoA and PoS for a broad class of Bayes (coarse) correlated equilibria. Central to our analysis is the strategy representability gap, which measures the multiplicative gap between the optimal social welfare achievable with and without knowledge of other players' types. For monotone submodular social welfare functions, we show that this gap is $1-1/\mathrm{e}$ for independent priors and $\Theta(1/\sqrt{n})$ for correlated priors, where $n$ is the number of players. These bounds directly lead to upper and lower bounds on the PoA and PoS for various equilibrium concepts, while we also derive improved bounds for specific concepts by developing smoothness arguments. Notably, we identify a fundamental gap in the PoA and PoS across different classes of Bayes correlated equilibria, highlighting essential distinctions among these concepts.
This paper provides an efficient computational scheme to handle general security games from an adversarial risk analysis perspective. Two cases, concerning single-stage and multi-stage simultaneous defend-attack games, motivate our approach to general setups, which uses bi-agent influence diagrams as the underlying problem structure and augmented probability simulation as the core computational methodology. Theoretical convergence and numerical, modeling, and implementation issues are thoroughly discussed. A disinformation war case study illustrates the relevance of the proposed approach.