We consider the Top-$K$ selection problem, which aims to identify the $K$ largest elements of an array. Top-$K$ selection arises in many machine learning algorithms and often becomes a bottleneck on accelerators, which are optimized for dense matrix multiplications. To address this problem, \citet{chern2022tpuknnknearestneighbor} proposed a fast two-stage \textit{approximate} Top-$K$ algorithm: (i) partition the input array and select the top-$1$ element from each partition, (ii) sort this \textit{smaller subset} and return the top $K$ elements. In this paper, we consider a generalized version of this algorithm, where the first stage selects the top-$K'$ elements, for some $1 \leq K' \leq K$, from each partition. Our contributions are as follows: (i) we derive an expression for the expected recall of this generalized algorithm and show that choosing $K' > 1$ with fewer partitions in the first stage reduces the input size to the second stage more effectively while maintaining the same expected recall as the original algorithm, (ii) we derive a bound on the expected recall of the original algorithm in \citet{chern2022tpuknnknearestneighbor} that is provably tighter by a factor of $2$ than the one in that paper, and (iii) we implement our algorithm on Cloud TPUv5e and achieve around an order-of-magnitude speedup over the original algorithm without sacrificing recall on real-world tasks.
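To make the two-stage scheme concrete, here is a minimal NumPy sketch of the generalized algorithm: partition the array, take the top-$K'$ of each partition, then exactly sort the survivors. The function name and parameter choices are illustrative assumptions, not the authors' TPU implementation.

```python
import numpy as np

def approx_topk(x, k, num_partitions, k_prime=1):
    """Two-stage approximate Top-K: top-k' per partition, then exact top-k."""
    assert 1 <= k_prime <= k
    parts = x.reshape(num_partitions, -1)   # assumes len(x) % num_partitions == 0
    # Stage 1: top-k' per partition; argpartition avoids a full sort.
    idx = np.argpartition(parts, -k_prime, axis=1)[:, -k_prime:]
    survivors = np.take_along_axis(parts, idx, axis=1).ravel()
    # Stage 2: exact top-k of the much smaller candidate set.
    return np.sort(survivors)[-k:][::-1]

rng = np.random.default_rng(0)
print(approx_topk(rng.standard_normal(2**16), k=8, num_partitions=256, k_prime=2))
```

Larger $K'$ with fewer partitions shrinks the stage-two input (num_partitions * k_prime candidates) at the same expected recall, which is the trade-off the paper analyzes.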
We investigate the sample complexity of mutual information and conditional mutual information testing. For conditional mutual information testing, given access to independent samples of a triple of random variables $(A, B, C)$ with unknown distribution, we want to distinguish between two cases: (i) $A$ and $C$ are conditionally independent, i.e., $I(A\!:\!C|B) = 0$, and (ii) $A$ and $C$ are conditionally dependent, i.e., $I(A\!:\!C|B) \geq \varepsilon$ for some threshold $\varepsilon$. We establish an upper bound on the number of samples required to distinguish between the two cases with high confidence, as a function of $\varepsilon$ and the three alphabet sizes. We conjecture that our bound is tight and show that this is indeed the case in several parameter regimes. For the special case of mutual information testing (when $B$ is trivial), we establish the necessary and sufficient number of samples required up to polylogarithmic terms. Our technical contributions include a novel method to efficiently simulate weakly correlated samples from the conditionally independent distribution $P_{A|B} P_{C|B} P_B$ given access to samples from an unknown distribution $P_{ABC}$, and a new estimator for equivalence testing that can handle such correlated samples, which might be of independent interest.
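As a toy illustration of the testing setup in the special case of trivial $B$, the snippet below thresholds a naive plug-in estimate of $I(A\!:\!C)$ computed from paired samples; this baseline is for intuition only and is neither the paper's estimator nor sample-optimal.

```python
import numpy as np

def plugin_mi(samples_a, samples_c):
    """Plug-in estimate of I(A:C) in nats from paired discrete samples."""
    _, a_idx = np.unique(samples_a, return_inverse=True)
    _, c_idx = np.unique(samples_c, return_inverse=True)
    joint = np.zeros((a_idx.max() + 1, c_idx.max() + 1))
    np.add.at(joint, (a_idx, c_idx), 1.0)    # empirical joint distribution
    joint /= joint.sum()
    pa, pc = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pa @ pc)[nz])).sum())

rng = np.random.default_rng(1)
a = rng.integers(0, 4, 10_000)
c = (a + rng.integers(0, 2, 10_000)) % 4     # correlated with a
print(plugin_mi(a, c) > 0.1)                 # crude test at threshold eps = 0.1
```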
Tensor permutation is a fundamental operation widely applied in AI, tensor networks, and related fields. It is, however, extremely complex: different tensor shapes and permutation maps can lead to vastly different costs. SIMD permutation has been studied since 2006, but the best method at the time was to split complex permutations into multiple simpler ones amenable to SIMD, which can increase the overall complexity for very complex permutations. Subsequently, as tensor contraction gained significant attention, researchers explored the structured permutations associated with tensor contraction. Progress on general permutations has been limited, and with increasing SIMD bit widths, achieving efficient performance for these permutations has become increasingly challenging. We propose a SIMD permutation toolkit, \system, that generates optimized permutation code for arbitrary instruction sets, bit widths, tensor shapes, and permutation patterns, while maintaining low complexity. In our experiments, \system achieves up to $38\times$ speedup in special cases and $5\times$ in general cases compared to NumPy.
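For readers unfamiliar with the operation, a tensor permutation is what `np.transpose` computes for a given permutation map; this tiny snippet only fixes terminology, since the difficulty \system addresses is generating fast SIMD code for arbitrary shapes and maps.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
y = np.transpose(x, (2, 0, 1))   # permutation map (0,1,2) -> (2,0,1); shape (4, 2, 3)
print(y.shape)
```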
The Hospital Residents problem with sizes (HRS) is a generalization of the well-studied Hospital Residents (HR) problem. In the HRS problem, an agent $a$ has a size $s(a)$ and occupies $s(a)$ many positions of the hospital $h$ when assigned to $h$. The notion of stability in this setting is suitably modified, and it is known that deciding whether an HRS instance admits a stable matching is NP-hard even under severe restrictions. In this work, we explore a variation of stability, which we term occupancy-based stability. This notion was defined by McDermid and Manlove in their work; however, to the best of our knowledge, it has remained unexplored. We show that every HRS instance admits an occupancy-stable matching. We further show that computing a maximum-size occupancy-stable matching is NP-hard. We complement our hardness result by providing a linear-time 3-approximation algorithm for the max-size occupancy-stable matching problem. Given that the classical notion of stability adapted to HRS is not guaranteed to exist in general, we show a practical restriction under which a stable matching is guaranteed to exist, and we present an efficient algorithm that outputs a stable matching for such restricted HRS instances. We also provide an alternative NP-hardness proof for the decision version of the stable matching problem for HRS that holds even under a severe restriction on the number of neighbours of non-unit-sized agents.
Let $A$ and $B$ be disjoint, non-adjacent vertex sets in an undirected, connected graph $G$ whose vertices are associated with positive weights. We address the problem of identifying a minimum-weight subset of vertices $S\subseteq V(G)$ that, when removed, disconnects $A$ from $B$ while preserving the internal connectivity of both $A$ and $B$. We call such a subset of vertices a connectivity-preserving, or \textit{safe}, minimum $A,B$-separator. Deciding whether a safe $A,B$-separator exists is NP-hard by reduction from the 2-disjoint connected subgraphs problem, and remains NP-hard even for restricted graph classes, including planar graphs and $P_\ell$-free graphs for $\ell\geq 5$. In this work, we show that if $G$ is AT-free, then in polynomial time we can find a safe $A,B$-separator of minimum weight or establish that no safe $A,B$-separator exists.
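The verification side of the problem is easy to state in code. The sketch below checks a candidate set $S$ with `networkx`, under one plausible reading of "preserving internal connectivity" (each of $A$ and $B$ must remain inside a single component of $G - S$); finding a minimum-weight such $S$ is the hard part.

```python
import networkx as nx

def is_safe_separator(G, A, B, S):
    """Check that deleting S leaves A and B each in one component, and in
    different components from each other. Assumes S is disjoint from A and B."""
    H = G.copy()
    H.remove_nodes_from(S)
    comp = {v: i for i, c in enumerate(nx.connected_components(H)) for v in c}
    a_comps = {comp[a] for a in A}
    b_comps = {comp[b] for b in B}
    return len(a_comps) == 1 and len(b_comps) == 1 and a_comps != b_comps
```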
We explain how the maximum energies of the Quantum MaxCut, XY, and EPR Hamiltonians on a graph $G$ are related to the spectral radii of the token graphs of $G$. Based on numerical studies, we conjecture new bounds for these spectral radii in terms of properties of $G$. We show how these conjectures tighten the analysis of existing algorithms, implying state-of-the-art approximation ratios for all three Hamiltonians. Our conjectures also provide simple combinatorial bounds on the ground state energy of the antiferromagnetic Heisenberg model, which we prove for bipartite graphs.
$Q_{n,p}$, the random subgraph of the $n$-vertex hypercube $Q_n$, is obtained by independently retaining each edge of $Q_n$ with probability $p$. We give precise values for the cover time of $Q_{n,p}$ above the connectivity threshold.
When building Burrows-Wheeler Transforms (BWTs) of truly huge datasets, prefix-free parsing (PFP) can use an unreasonable amount of memory. In this paper we show that if a dataset can be broken down into small datasets that are not very similar to each other -- such as collections of many copies of genomes of each of several species, or collections of many copies of each of the human chromosomes -- then we can drastically reduce PFP's memory footprint by building the BWTs of the small datasets and then merging them into the BWT of the whole dataset.
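For reference, the snippet below constructs the BWT of a short string by naive rotation sorting, just to show the object being built; PFP itself and the low-memory merging step that the paper contributes are not shown.

```python
def bwt(s):
    """Naive BWT via sorted rotations; fine for illustration, O(n^2 log n)."""
    s += "$"                                      # unique end-of-string sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

print(bwt("banana"))   # -> 'annb$aa'
```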
An evaluator is trustworthy when there exists some agreed-upon way to measure its performance as a labeller. Trustworthiness can be established in two ways: by testing the evaluator, or by assuming that it somehow `knows' how to label the corpus. However, if labelled references (e.g., a development set) are unavailable, neither approach works: the former requires the data, and the latter is an assumption, not evidence. To address this, we introduce an algorithm (the `No-Data Algorithm') for establishing trust in an evaluator without any existing references. Our algorithm works by successively posing challenges to the evaluator. We show that this is sufficient to establish trustworthiness w.h.p., in the sense that when the evaluator actually knows how to label the corpus, the No-Data Algorithm accepts its output and, conversely, flags untrustworthy evaluators when they are unable to prove their trustworthiness. We present formal proofs of correctness and limited experiments.
A long line of theoretical and numerical results has established the sketch-and-precondition paradigm as a powerful approach to solving large linear regression problems in standard computing environments. Perhaps surprisingly, much less work has been done on understanding how sketch-and-precondition performs on graphics processing unit (GPU) systems. We address this gap by benchmarking an implementation of sketch-and-precondition based on sparse sign sketches on single- and multi-GPU systems. In doing so, we describe a novel, easily parallelized, rejection-sampling-based method for generating sparse sign sketches. Our approach, which is particularly well-suited for GPUs, is easily adapted to a variety of computing environments. Taken as a whole, our numerical experiments indicate that sketch-and-precondition with sparse sign sketches is particularly well-suited for GPUs, and may be suitable for use in black-box least-squares solvers.
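To fix notation, here is the standard construction of a sparse sign sketch (each column holds $k$ random signs scaled by $1/\sqrt{k}$) and its use as a preconditioner via a QR factorization of the sketched matrix. This is a hedged sketch from textbook components; the paper's GPU-oriented rejection-sampling generator is not reproduced here.

```python
import numpy as np
from scipy import sparse

def sparse_sign_sketch(d, n, k, rng):
    """d x n sketch whose columns each have k nonzero random signs / sqrt(k)."""
    rows = np.concatenate([rng.choice(d, size=k, replace=False) for _ in range(n)])
    cols = np.repeat(np.arange(n), k)
    vals = rng.choice([-1.0, 1.0], size=n * k) / np.sqrt(k)
    return sparse.csc_matrix((vals, (rows, cols)), shape=(d, n))

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))           # tall least-squares matrix
S = sparse_sign_sketch(d=200, n=1000, k=8, rng=rng)
_, R = np.linalg.qr(S @ A)                    # sketch, then factor the small matrix
print(np.linalg.cond(A @ np.linalg.inv(R)))   # preconditioned system: cond = O(1)
```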
The Lov\'asz theta function is a graph parameter that can be computed up to arbitrary precision in polynomial time. It plays a key role in algorithms that approximate graph parameters such as maximum independent set, maximum clique and chromatic number, or even compute them exactly in some models of random and semi-random graphs. For Erd\H{o}s--R\'enyi random $G_{n,1/2}$ graphs, the expected value of the theta function is known to be at most $2\sqrt{n}$ and at least $\sqrt{n}$. These bounds have not been improved in over 40 years. In this work, we introduce a new class of polynomial-time computable graph parameters, where every parameter in this class is an upper bound on the theta function. We also present heuristic arguments for determining the expected values of parameters from this class in random graphs. The values suggested by these heuristic arguments are in agreement with results that we obtain experimentally, by sampling graphs at random and computing the value of the respective parameter. Based on parameters from this new class, we feel safe in conjecturing that for $G_{n,1/2}$, the expected value of the theta function is below $1.55 \sqrt{n}$. Our paper falls short of rigorously proving such an upper bound, because our analysis makes use of unproven assumptions.
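For concreteness, one standard SDP formulation of the theta function, $\vartheta(G) = \max\{\langle J, X\rangle : X \succeq 0,\ \mathrm{tr}\,X = 1,\ X_{ij} = 0\ \forall ij \in E(G)\}$, written here with `cvxpy` and `networkx` (assumed dependencies); this is the classical program, not the new class of parameters the paper introduces.

```python
import cvxpy as cp
import networkx as nx

def lovasz_theta(G):
    """Classical SDP for the theta function; nodes must be labeled 0..n-1."""
    n = G.number_of_nodes()
    X = cp.Variable((n, n), symmetric=True)
    cons = [X >> 0, cp.trace(X) == 1]
    cons += [X[i, j] == 0 for i, j in G.edges()]
    return cp.Problem(cp.Maximize(cp.sum(X)), cons).solve()

print(lovasz_theta(nx.cycle_graph(5)))   # ~ sqrt(5) ~ 2.236 for the 5-cycle
```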
In this paper, we introduce the notion of Cartesian Forest, which generalizes Cartesian Trees, in order to deal with partially ordered sequences. We show that algorithms solving both exact and approximate Cartesian Tree Matching can be adapted to solve Cartesian Forest Matching in average linear time. We adapt the notion of Cartesian Tree Signature to Cartesian Forests and show how filters can be used to experimentally improve the algorithm for exact matching. We also show a one-to-one correspondence between Cartesian Forests and Schr\"oder Trees.
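As background, here is the classic stack-based linear-time construction of a (min-rooted) Cartesian tree, the structure that Cartesian Forests generalize to partially ordered sequences; this is standard material, not the paper's matching algorithm.

```python
def cartesian_tree(seq):
    """Return parent indices of the min-rooted Cartesian tree (-1 for the root)."""
    parent = [-1] * len(seq)
    stack = []                               # indices with non-decreasing values
    for i, v in enumerate(seq):
        last = -1
        while stack and seq[stack[-1]] > v:
            last = stack.pop()               # i becomes parent of the popped chain
        if last != -1:
            parent[last] = i
        if stack:
            parent[i] = stack[-1]
        stack.append(i)
    return parent

print(cartesian_tree([3, 1, 4, 1, 5]))   # -> [1, -1, 3, 1, 3]
```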
This paper investigates the role of mediators in Bayesian games by examining their impact on social welfare through the price of anarchy (PoA) and price of stability (PoS). Mediators can communicate with players to guide them toward equilibria of varying quality, and different communication protocols lead to a variety of equilibrium concepts collectively known as Bayes (coarse) correlated equilibria. To analyze these equilibrium concepts, we consider a general class of Bayesian games with submodular social welfare, which naturally extends valid utility games and their variant, basic utility games. These frameworks, introduced by Vetta (2002), have been developed to analyze the social welfare guarantees of equilibria in games such as competitive facility location, influence maximization, and other resource allocation problems. We provide upper and lower bounds on the PoA and PoS for a broad class of Bayes (coarse) correlated equilibria. Central to our analysis is the strategy representability gap, which measures the multiplicative gap between the optimal social welfare achievable with and without knowledge of other players' types. For monotone submodular social welfare functions, we show that this gap is $1-1/\mathrm{e}$ for independent priors and $\Theta(1/\sqrt{n})$ for correlated priors, where $n$ is the number of players. These bounds directly lead to upper and lower bounds on the PoA and PoS for various equilibrium concepts, while we also derive improved bounds for specific concepts by developing smoothness arguments. Notably, we identify a fundamental gap in the PoA and PoS across different classes of Bayes correlated equilibria, highlighting essential distinctions among these concepts.
Recently, Ko\c{c} proposed a neat and efficient algorithm for computing $x = a^{-1} \pmod{p^k}$ for a prime $p$, based on the exact solution of linear equations using $p$-adic expansions. The algorithm requires only additions and right shifts per step. In this paper, we design an algorithm that computes $x = a^{-1} \pmod{n^k}$ for any integer $n>1$. The algorithm is motivated by schoolbook multiplication and achieves both efficiency and generality. We explore the greater flexibility of our algorithm by exploiting the built-in arithmetic of computer architectures, e.g., $n=2^{64}$, and experimental results show significant improvements. This paper also contains some results on modular inverses based on an alternative proof of correctness of Ko\c{c}'s algorithm.
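For comparison, the textbook way to compute $a^{-1} \bmod n^k$ is Newton/Hensel lifting from the inverse mod $n$, sketched below; the paper's algorithm is instead derived from schoolbook multiplication, so this snippet only illustrates the problem being solved.

```python
def inv_mod_power(a, n, k):
    """Compute a^{-1} mod n^k by Hensel lifting; requires gcd(a, n) = 1."""
    x = pow(a, -1, n)                  # base case: inverse mod n (Python 3.8+)
    m, target = n, n ** k
    while m < target:
        m = min(m * m, target)
        x = x * (2 - a * x) % m        # Newton step: doubles the precision
    return x

a, n, k = 7, 10, 5
assert a * inv_mod_power(a, n, k) % n**k == 1
```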
This work proposes \textsc{H-Td}, a practical linear-time algorithm for computing an optimal-width tree decomposition of Halin graphs. Unlike state-of-the-art methods based on reduction rules or separators, \textsc{H-Td} exploits the structural properties of Halin graphs. Although two theoretical linear-time algorithms exist that can be applied to graphs of treewidth three, no practical implementation has been made publicly available. Furthermore, extending reduction-based approaches to partial $k$-trees with $k > 3$ results in increasingly complex rules that are challenging to implement. This motivates the exploration of alternative strategies that leverage structural insights specific to certain graph classes. Experimental validation against the winners of the Parameterized Algorithms and Computational Experiments Challenge (PACE) 2017 and the treewidth library \texttt{libtw} demonstrates the advantage of \textsc{H-Td} when the input is known to be a Halin graph.
We formulate an optimization problem to estimate probability densities in the context of multidimensional problems that are sampled with uneven probability. It considers detector sensitivity as a heterogeneous density and takes advantage of the computational speed and flexible boundary conditions offered by splines on a grid. We choose to regularize the Hessian of the spline via the nuclear norm to promote sparsity. As a result, the method is spatially adaptive and stable against the choice of the regularization parameter, which plays the role of the bandwidth. We test our computational pipeline on standard densities and provide software. We also present a new approach to PET rebinning as an application of our framework.
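One plausible reading of the optimization problem described above, using hypothetical symbols (spline coefficients $c$, fitted density $f_c$, detector sensitivity $w$, regularization weight $\lambda$), is a penalized likelihood of the form

```latex
\min_{c}\; -\sum_{i} \log\bigl(w(x_i)\, f_c(x_i)\bigr)
        \;+\; \lambda \int \bigl\| \nabla^2 f_c(x) \bigr\|_{*} \,\mathrm{d}x ,
```

where $\|\cdot\|_*$ is the nuclear norm applied pointwise to the Hessian; the paper's exact objective may differ.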
We study the problem of learning the optimal item pricing for a unit-demand buyer with independent item values, where the learner has query access to the buyer's value distributions. We consider two common query models from the literature: the sample access model, where the learner can obtain a sample of each item value, and the pricing query model, where the learner can set a price for an item and obtain a binary signal indicating whether the sampled value of the item exceeds the proposed price. In this work, we give nearly tight sample complexity and pricing query complexity bounds for the unit-demand pricing problem.
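The two query models are easy to state as oracles. The toy simulator below (all names illustrative) makes explicit that a pricing query reveals only one bit per fresh draw, which is why its complexity can differ from that of sample access.

```python
import random

class ItemOracle:
    """Wraps one item's value distribution behind the two query models."""
    def __init__(self, sampler):
        self.sampler = sampler          # callable returning one value draw
    def sample_query(self):
        return self.sampler()           # full sample of the item value
    def pricing_query(self, p):
        return self.sampler() > p       # binary signal only

oracle = ItemOracle(lambda: random.uniform(0, 1))
print(oracle.sample_query(), oracle.pricing_query(0.5))
```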
In many parts of the world - particularly in developing countries - the demand for electricity exceeds the available supply. In such cases, it is impossible to provide electricity to all households simultaneously. This raises a fundamental question: how should electricity be allocated fairly? In this paper, we explore this question through the lens of egalitarianism - a principle that emphasizes equality by prioritizing the welfare of the worst-off households. One natural rule that aligns with this principle is to maximize the egalitarian welfare - the smallest utility across all households. We show that computing such an allocation is NP-hard, even under strong simplifying assumptions. Leximin is a stronger fairness notion that generalizes the egalitarian welfare: it also requires maximizing the smallest utility, but then, subject to that, the second-smallest, then the third-smallest, and so on. The hardness result extends directly to leximin as well. Despite this, we present a Fully Polynomial-Time Approximation Scheme (FPTAS) for leximin in the special case where the network connectivity graph is a tree. This means that we can efficiently approximate leximin - and, in particular, the egalitarian welfare - to any desired level of accuracy.
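The leximin order itself is simple to state in code: compare sorted utility vectors lexicographically, smallest entries first. The sketch below (illustrative only, not the FPTAS) shows how leximin refines egalitarian welfare by breaking ties on the minimum with the second-smallest utility, and so on.

```python
def leximin_key(utilities):
    """Sort utilities ascending; Python compares lists lexicographically."""
    return sorted(utilities)

u, v = [2, 5, 3], [2, 2, 9]
print(max([u, v], key=leximin_key))   # -> [2, 5, 3]: equal minimum, larger 2nd-smallest
```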