We explore the information geometry of L\'evy processes. As a starting point, we derive the $\alpha$-divergence between two L\'evy processes. Subsequently, the Fisher information matrix and the $\alpha$-connection associated with the geometry of L\'evy processes are computed from the $\alpha$-divergence. In addition, we discuss statistical applications of this information geometry. As illustrative examples, we investigate the differential-geometric structures of various L\'evy processes relevant to financial modeling, including tempered stable processes, the CGMY model, and variance gamma processes.
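For reference, the standard $\alpha$-divergence between two probability densities $p$ and $q$ (in Amari's convention), of which the quantity derived here for L\'evy processes is the analogue, reads
$$ D_\alpha(p\,\|\,q) \;=\; \frac{4}{1-\alpha^2}\Bigl(1 - \int p(x)^{\frac{1-\alpha}{2}}\, q(x)^{\frac{1+\alpha}{2}}\,dx\Bigr), \qquad \alpha \neq \pm 1, $$
with the Kullback-Leibler divergence recovered in the limits $\alpha \to \pm 1$; the exact expression for L\'evy processes (in terms of their characteristic triplets) is given in the paper itself.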
Practitioners making decisions based on causal effects typically ignore structural uncertainty. We analyze when this uncertainty is consequential enough to warrant methodological solutions (Bayesian model averaging over competing causal structures). Focusing on bivariate relationships ($X \rightarrow Y$ vs. $X \leftarrow Y$), we establish that model averaging is beneficial when: (1) structural uncertainty is moderate to high, (2) causal effects differ substantially between structures, and (3) loss functions are sufficiently sensitive to the size of the causal effect. We prove optimality results for our suggested methodological solution under regularity conditions and demonstrate through simulations that modern causal discovery methods can provide, within limits, the necessary quantification. Our framework complements existing robust causal inference approaches by addressing a distinct source of uncertainty typically overlooked in practice.
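A minimal Python sketch of the bivariate model averaging described above, with hypothetical posterior structure probabilities and per-structure effect estimates (all names and numbers below are illustrative assumptions, not taken from the paper):

```python
# Hypothetical inputs: posterior probabilities of the two candidate structures
# (e.g., supplied by a causal discovery method) and the causal effect of X on Y
# estimated under each structure.
p_xy = 0.6            # P(X -> Y | data), assumed for illustration
p_yx = 1.0 - p_xy     # P(X <- Y | data)
effect_xy = 0.8       # effect of X on Y if X -> Y holds
effect_yx = 0.0       # effect of X on Y if X <- Y holds (then X has no effect on Y)

# Bayesian model averaging: weight each structure's effect by its posterior probability.
averaged_effect = p_xy * effect_xy + p_yx * effect_yx
print(f"model-averaged effect: {averaged_effect:.3f}")
```

Whether this averaging pays off, relative to committing to the more probable structure, is exactly what conditions (1)-(3) above characterize.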
Characteristic-function-based goodness-of-fit tests are suggested for multivariate observations. The test statistics, which are straightforward to compute, are defined as two-sample criteria measuring the discrepancy between multivariate ranks of the original observations and the corresponding ranks obtained from an artificial sample generated from the reference distribution under test. Multivariate ranks are constructed using the theory of optimal measure transport, thus rendering the tests of a simple null hypothesis distribution-free, while bootstrap approximations are still necessary for testing composite null hypotheses. Asymptotic theory is developed, and a simulation study, concentrating on comparisons with previously proposed tests of multivariate normality, demonstrates that the method performs well in finite samples.
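A rough Python sketch of the rank construction via empirical optimal transport (optimal assignment to a fixed reference grid) and a crude characteristic-function discrepancy between the ranks of the two samples; the reference grid, weighting, and exact test statistic used in the paper may differ, so this is only an illustration:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def ot_ranks(sample, grid):
    """Assign each observation to a reference grid point by solving the optimal
    assignment problem with squared Euclidean cost; the matched grid points
    play the role of multivariate ranks."""
    cost = cdist(sample, grid, metric="sqeuclidean")
    rows, cols = linear_sum_assignment(cost)
    ranks = np.empty_like(grid)
    ranks[rows] = grid[cols]
    return ranks

rng = np.random.default_rng(0)
n, d = 200, 2
x = rng.standard_normal((n, d))            # observed sample
y = rng.standard_normal((n, d))            # artificial sample from the reference law
grid = rng.uniform(-1, 1, size=(n, d))     # illustrative reference grid

rx, ry = ot_ranks(x, grid), ot_ranks(y, grid)
t = rng.standard_normal((50, d))           # frequencies at which to compare empirical CFs
ecf = lambda r: np.exp(1j * r @ t.T).mean(axis=0)
stat = np.mean(np.abs(ecf(rx) - ecf(ry)) ** 2)
print(f"two-sample CF discrepancy: {stat:.4f}")
```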
We study central limit theorems for linear statistics in high-dimensional Bayesian linear regression with product priors. Unlike the existing literature where the focus is on posterior contraction, we work under a non-contracting regime where neither the likelihood nor the prior dominates the other. This is motivated by modern high-dimensional datasets characterized by a bounded signal-to-noise ratio. This work takes a first step towards understanding limit distributions for one-dimensional projections of the posterior, as well as the posterior mean, in such regimes. Analogous to contractive settings, the resulting limiting distributions are Gaussian, but they heavily depend on the chosen prior and center around the Mean-Field approximation of the posterior. We study two concrete models of interest to illustrate this phenomenon -- the white noise design, and the (misspecified) Bayesian model. As an application, we construct credible intervals and compute their coverage probability under any misspecified prior. Our proofs rely on a combination of recent developments in Berry-Esseen type bounds for Random Field Ising models and both first and second order Poincar\'{e} inequalities. Notably, our results do not require any sparsity assumptions on the prior.
We consider the classical problem of estimating the mixing distribution of binomial mixtures, but under trial heterogeneity and smoothness. This problem has been studied extensively when the trial parameter is homogeneous, but only within a low-smoothness regime where the resulting rates are slow, and not under the more general scenario of heterogeneous trials. Under the assumption that the density is $s$-smooth, we derive fast error rates for the kernel density estimator under trial heterogeneity that depend on the harmonic mean of the trials. Importantly, even when reduced to the homogeneous case, our result improves on the state-of-the-art rate of Ye and Bickel (2021). We also study nonparametric estimation of the difference between two densities, which can be smoother than the individual densities, in both i.i.d. and binomial-mixture settings. Our work is motivated by an application in criminal justice: comparing conviction rates of indigent representation in Pennsylvania. We find that the estimated conviction rates for appointed counsel (court-appointed private attorneys) are generally higher than those for public defenders, potentially due to a confounding factor: appointed counsel are more likely to take on severe cases.
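The paper's estimator and rates are not reproduced here; as a point of reference, the following Python sketch sets up heterogeneous binomial trials and applies a plain Gaussian KDE to the empirical success proportions $x_i/m_i$ (an illustrative baseline only, not the proposed method):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
n = 500
m = rng.integers(5, 50, size=n)        # heterogeneous trial counts
p = rng.beta(2, 5, size=n)             # latent success probabilities from a smooth mixing density
x = rng.binomial(m, p)                 # observed counts

# Naive baseline: kernel density estimate of the observed proportions x_i / m_i.
kde = gaussian_kde(x / m)
grid = np.linspace(0, 1, 201)
print(kde(grid)[:5])                   # estimated density values near 0
```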
The displayed tree phylogenetic network model is shown to sit as a natural submodel of the graphical model associated to a directed acyclic graph (DAG). This representation allows us to derive a number of results about the displayed tree model. In particular, the concept of a local modification to a DAG model is developed and applied to the displayed tree model. As an application, some nonidentifiability issues of the displayed tree model are highlighted, as they relate to reticulation edges and stacked reticulations in the networks. We also derive rank conditions on flattenings of probability tensors for the displayed tree model, generalizing classic results for phylogenetic tree models.
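A small Python sketch of what a flattening-rank computation looks like for a probability tensor on four leaves, along the split $\{1,2\}\,|\,\{3,4\}$ (the specific rank bounds proved in the paper are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 4                          # states per leaf (e.g., DNA alphabet A, C, G, T)
P = rng.random((k, k, k, k))
P /= P.sum()                   # joint distribution over the four leaves

# Flattening along the split {1,2} | {3,4}: reshape the tensor into a k^2 x k^2
# matrix; rank conditions on such flattenings constrain tree and network models.
flat = P.reshape(k * k, k * k)
print(np.linalg.matrix_rank(flat, tol=1e-10))
```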
We consider the problem of performing inference on the number of common stochastic trends when data is generated by a cointegrated CKSVAR (a two-regime, piecewise-linear SVAR; Mavroeidis, 2021), using a modified version of the Breitung (2002) multivariate variance ratio test that is robust to the presence of nonlinear cointegration (of a known form). To derive the asymptotics of our test statistic, we prove a fundamental LLN-type result for a class of stable but nonstationary autoregressive processes, using a novel dual linear process approximation. We show that our modified test yields correct inferences regarding the number of common trends in such a system, whereas the unmodified test tends to infer a higher number of common trends than are actually present, when cointegrating relations are nonlinear.
In randomized experiments, the assumption of potential outcomes is usually accompanied by the \emph{joint exogeneity} assumption. Although joint exogeneity has faced criticism as a counterfactual assumption since its proposal, no evidence has yet demonstrated its violation in randomized experiments. In this paper, we reveal such a violation in a quantum experiment, thereby falsifying this assumption, at least in regimes where classical physics cannot provide a complete description. We further discuss its implications for potential outcome modelling, from both practical and philosophical perspectives.
Many causal and structural parameters in economics can be identified and estimated by computing the value of an optimization program over all distributions consistent with the model and the data. Existing tools apply when the data is discrete, or when only disjoint marginals of the distribution are identified, which is restrictive in many applications. We develop a general framework that yields sharp bounds on a linear functional of the unknown true distribution under i) an arbitrary collection of identified joint subdistributions and ii) structural conditions, such as (conditional) independence. We encode the identification restrictions as a continuous collection of moments of characteristic kernels, and use duality and approximation theory to rewrite the infinite-dimensional program over Borel measures as a finite-dimensional program that is simple to compute. Our approach yields a consistent estimator that is $\sqrt{n}$-uniformly valid for the sharp bounds. In the special case of empirical optimal transport with Lipschitz cost, where the minimax rate is $n^{2/d}$, our method yields a uniformly consistent estimator with an asymmetric rate, converging at $\sqrt{n}$ uniformly from one side.
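In the discrete case that existing tools already cover, the sharp-bound computation reduces to a linear program over joint distributions consistent with the identified marginals; a toy Python sketch (a hypothetical two-variable example with identified marginals only, not the paper's kernel-based formulation):

```python
import numpy as np
from scipy.optimize import linprog

# Toy example: X and Y each take values in {0, 1, 2}; only the marginals of X and Y
# are identified, and we bound E[XY] over all joint laws consistent with them.
px = np.array([0.2, 0.5, 0.3])
py = np.array([0.4, 0.4, 0.2])
vals = np.array([0.0, 1.0, 2.0])
c = np.outer(vals, vals).ravel()          # E[XY] as a linear functional of the joint pmf

A_eq, b_eq = [], []
for i in range(3):                        # row sums must match the X marginal
    a = np.zeros((3, 3)); a[i, :] = 1.0
    A_eq.append(a.ravel()); b_eq.append(px[i])
for j in range(3):                        # column sums must match the Y marginal
    a = np.zeros((3, 3)); a[:, j] = 1.0
    A_eq.append(a.ravel()); b_eq.append(py[j])

lower = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
upper = -linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
print(f"sharp bounds on E[XY]: [{lower:.3f}, {upper:.3f}]")
```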
Continuous monitoring is becoming more popular due to its significant benefits, including reducing sample sizes and reaching earlier conclusions. In general, it involves monitoring nuisance parameters (e.g., the variance of outcomes) until a specific condition is satisfied. The blinded method, which does not require revealing group assignments, has been recommended because it maintains the integrity of the experiment and mitigates potential bias. Although Friede and Miller (2012) investigated the characteristics of blinded continuous monitoring through simulation studies, its theoretical properties have not been fully explored. In this paper, we aim to fill this gap by presenting the asymptotic and finite-sample properties of blinded continuous monitoring for continuous outcomes. Furthermore, we examine the impact of using blinded versus unblinded variance estimators in the context of continuous monitoring. Simulation results are also provided to evaluate finite-sample performance and to support the theoretical findings.
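A rough Python sketch of the blinded-monitoring idea: accrue outcomes, estimate the variance without using group labels (here the plain one-sample "lumped" variance; refinements such as subtracting an assumed treatment effect are omitted), and stop once the variance-driven sample-size condition is met. All design constants below are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
delta, alpha, power = 0.5, 0.05, 0.8                  # assumed effect size and design constants
z = norm.ppf(1 - alpha / 2) + norm.ppf(power)

outcomes = []
while True:
    arm = rng.integers(2)                             # assignment stays hidden from the monitor
    outcomes.append(rng.normal(loc=arm * delta, scale=1.0))
    n = len(outcomes)
    if n < 20:
        continue
    blinded_var = np.var(outcomes, ddof=1)            # lumped variance, labels not revealed
    n_per_arm = 2 * z ** 2 * blinded_var / delta ** 2 # standard two-sample requirement
    if n >= 2 * n_per_arm:                            # total accrual meets the target
        break
print(f"stopped at n = {n}, blinded variance = {blinded_var:.3f}")
```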
Modern data analysis increasingly requires identifying shared latent structure across multiple high-dimensional datasets. A commonly used model assumes that the data matrices are noisy observations of low-rank matrices with a shared singular subspace. In this case, two primary methods have emerged for estimating this shared structure, which vary in how they integrate information across datasets. The first approach, termed Stack-SVD, concatenates all the datasets, and then performs a singular value decomposition (SVD). The second approach, termed SVD-Stack, first performs an SVD separately for each dataset, then aggregates the top singular vectors across these datasets, and finally computes a consensus amongst them. While these methods are widely used, they have not been rigorously studied in the proportional asymptotic regime, which is of great practical relevance in today's world of increasing data size and dimensionality. This lack of theoretical understanding has led to uncertainty about which method to choose and limited the ability to fully exploit their potential. To address these challenges, we derive exact expressions for the asymptotic performance and phase transitions of these two methods and develop optimal weighting schemes to further improve both methods. Our analysis reveals that while neither method uniformly dominates the other in the unweighted case, optimally weighted Stack-SVD dominates optimally weighted SVD-Stack. We extend our analysis to accommodate multiple shared components, and provide practical algorithms for estimating optimal weights from data, offering theoretical guidance for method selection in practical data integration problems. Extensive numerical simulations and semi-synthetic experiments on genomic data corroborate our theoretical findings.
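A minimal Python sketch of the two (unweighted) estimators being compared; the optimal weighting schemes derived in the paper are not reproduced here:

```python
import numpy as np

def stack_svd(datasets, r):
    """Stack-SVD: concatenate the datasets along rows, then take the top-r
    right singular vectors of the stacked matrix."""
    _, _, vt = np.linalg.svd(np.vstack(datasets), full_matrices=False)
    return vt[:r].T

def svd_stack(datasets, r):
    """SVD-Stack: take the top-r right singular vectors of each dataset separately,
    stack them, and extract a consensus subspace by a second SVD."""
    tops = [np.linalg.svd(X, full_matrices=False)[2][:r] for X in datasets]
    _, _, vt = np.linalg.svd(np.vstack(tops), full_matrices=False)
    return vt[:r].T

rng = np.random.default_rng(4)
p, r = 100, 2
v_true = np.linalg.qr(rng.standard_normal((p, r)))[0]        # shared right singular subspace
datasets = [rng.standard_normal((n, r)) @ v_true.T * 3.0 + rng.standard_normal((n, p))
            for n in (150, 200, 250)]

for name, est in (("Stack-SVD", stack_svd), ("SVD-Stack", svd_stack)):
    V = est(datasets, r)
    alignment = np.linalg.norm(V.T @ v_true, "fro") ** 2 / r  # subspace overlap in [0, 1]
    print(name, round(alignment, 3))
```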
Adequacy for estimation between an inferential method and a model can be defined through two main requirements: firstly, the inferential tool should define a well-posed problem when applied to the model; secondly, the resulting statistical procedure should produce consistent estimators. Conditions which entail these analytical and statistical issues are considered in the context where divergence-based inference is applied to smooth semiparametric models under moment restrictions. A discussion is also held on the choice of the divergence, extending classical parametric inference to the estimation of both the parameters of interest and the nuisance parameters. Arguments in favor of the omnibus $L_2$ and Kullback-Leibler choices as presented in [16] are discussed, and motivation for the class of power divergences defined in [5] is presented in the context of the present semiparametric smooth models. A short simulation study illustrates the method.
There is a growing interest in procedures for Bayesian inference that bypass the need to specify a model and prior, relying instead on a predictive rule that describes how we learn about future observations given the available ones. At the heart of the idea is a bootstrap-type scheme that allows us to move from the realm of prediction to that of inference. Which conditions the predictive rule needs to satisfy to produce valid inference is a key question. In this work, we substantially relax previous assumptions by building on a generalization of martingales, opening up the possibility of employing a much wider range of predictive rules that were previously ruled out. These include ``old" ideas in Statistics and Learning Theory, such as kernel estimators, and more novel ones, such as the parametric Bayesian bootstrap or copula-based algorithms. Our aim is not to advocate for one predictive rule over the others, but rather to showcase the benefits of working with this larger class of predictive rules.
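A small Python sketch of the predictive-resampling idea underlying such schemes: starting from the observed data, repeatedly draw a "future" observation from the current predictive rule, here the simplest possible choice of the empirical predictive, and read off a functional of the completed sample as one posterior draw. The choice of rule, horizon, and functional below are illustrative, not the paper's:

```python
import numpy as np

def predictive_resample(data, horizon, rng):
    """One forward pass: extend the sample by drawing each 'future' point from the
    empirical predictive of everything seen so far."""
    sample = list(data)
    for _ in range(horizon):
        sample.append(sample[rng.integers(len(sample))])
    return np.array(sample)

rng = np.random.default_rng(5)
data = rng.normal(1.0, 2.0, size=50)
# One posterior draw for the mean per forward pass.
draws = [predictive_resample(data, horizon=2000, rng=rng).mean() for _ in range(200)]
print(f"posterior mean ~ {np.mean(draws):.3f}, posterior sd ~ {np.std(draws):.3f}")
```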
We investigate convergence of alternating Bregman projections between non-convex sets and prove convergence to a point in the intersection, or to points realizing a gap between the two sets. The speed of convergence is generally sub-linear, but may be linear under transversality. We apply our analysis to prove convergence of versions of the expectation maximization algorithm for non-convex parameter sets.
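A compact Python sketch of alternating projections between a non-convex set (a sphere) and a line, using the squared Euclidean distance, the simplest Bregman divergence; the general Bregman setting and the EM application treated in the paper are not reproduced:

```python
import numpy as np

def proj_sphere(x, radius=1.0):
    """Closest point on the sphere of given radius (a non-convex set)."""
    return radius * x / np.linalg.norm(x)

def proj_line(x, direction=np.array([1.0, 2.0])):
    """Orthogonal projection onto the line through the origin with given direction."""
    d = direction / np.linalg.norm(direction)
    return (x @ d) * d

x = np.array([3.0, -1.0])
for _ in range(100):
    x = proj_line(proj_sphere(x))      # alternate projections between the two sets
print(x, np.linalg.norm(x))            # converges to a point in the intersection
```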
We study the problem of efficiency under $\alpha$-local differential privacy ($\alpha$-LDP) in both discrete and continuous settings. Building on a factorization lemma, which shows that any privacy mechanism can be decomposed into an extremal mechanism followed by additional randomization, we reduce the Fisher information maximization problem to a search over extremal mechanisms. The representation of extremal mechanisms requires working in infinite-dimensional spaces and invokes advanced tools from convex and functional analysis, such as Choquet's theorem. Our analysis establishes matching upper and lower bounds on the Fisher information in the high-privacy regime ($\alpha \to 0$), and proves that the maximization problem always admits a solution for any $\alpha$. As a concrete application, we consider the problem of estimating the parameter of a uniform distribution on $[0, \theta]$ under $\alpha$-LDP. Guided by our theoretical findings, we design an extremal mechanism that yields a consistent and asymptotically efficient estimator in the high-privacy regime. Numerical experiments confirm our theoretical results.
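For orientation, a Python sketch of the classical binary randomized-response mechanism, the textbook example of an extremal $\alpha$-LDP mechanism, together with the standard debiased frequency estimator; the mechanism designed in the paper for the uniform-distribution problem is different and is not reproduced here:

```python
import numpy as np

def randomized_response(bits, alpha, rng):
    """Report each bit truthfully with probability e^alpha / (1 + e^alpha) and flip it
    otherwise; the likelihood ratio of any output is at most e^alpha, i.e. alpha-LDP."""
    p_true = np.exp(alpha) / (1.0 + np.exp(alpha))
    flip = rng.random(len(bits)) > p_true
    return np.where(flip, 1 - bits, bits)

rng = np.random.default_rng(6)
alpha = 0.5                                        # high-privacy regime corresponds to alpha -> 0
bits = (rng.random(100_000) < 0.3).astype(int)     # true proportion is 0.3
reported = randomized_response(bits, alpha, rng)

p_true = np.exp(alpha) / (1.0 + np.exp(alpha))
estimate = (reported.mean() - (1 - p_true)) / (2 * p_true - 1)   # unbiased debiasing
print(f"debiased estimate of the proportion: {estimate:.3f}")
```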
In this study, we derive the exact distribution and moments of the noncentral complex Roy's largest root statistic, expressed as a product of complex zonal polynomials. We show that the linearization coefficients arising from the product of complex zonal polynomials in the distribution of Roy's test under a specific alternative hypothesis can be explicitly computed using Pieri's formula, a well-known result in combinatorics. These results are then applied to compute the power of tests in the complex multivariate analysis of variance (MANOVA).
In the setting of multiple testing, compound p-values generalize p-values by asking for superuniformity to hold only \emph{on average} across all true nulls. We study the properties of the Benjamini--Hochberg procedure applied to compound p-values. Under independence, we show that the false discovery rate (FDR) is at most $1.93\alpha$, where $\alpha$ is the nominal level, and exhibit a distribution for which the FDR is $\frac{7}{6}\alpha$. If additionally all nulls are true, then the upper bound can be improved to $\alpha + 2\alpha^2$, with a corresponding worst-case lower bound of $\alpha + \alpha^2/4$. Under positive dependence, on the other hand, we demonstrate that FDR can be inflated by a factor of $O(\log m)$, where~$m$ is the number of hypotheses. We provide numerous examples of settings where compound p-values arise in practice, either because we lack sufficient information to compute non-trivial p-values, or to facilitate a more powerful analysis.
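A short Python sketch of the Benjamini--Hochberg step-up procedure that these results concern; the procedure itself is unchanged when fed compound p-values, only its FDR guarantee changes as described above:

```python
import numpy as np
from scipy.stats import norm

def benjamini_hochberg(pvals, alpha=0.05):
    """Standard BH step-up: reject the k smallest p-values, where k is the largest
    index with p_(k) <= alpha * k / m."""
    m = len(pvals)
    order = np.argsort(pvals)
    sorted_p = np.asarray(pvals)[order]
    below = np.nonzero(sorted_p <= alpha * np.arange(1, m + 1) / m)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size:
        reject[order[: below[-1] + 1]] = True
    return reject

rng = np.random.default_rng(7)
# 900 true nulls and 100 signals (one-sided z-tests), purely for illustration.
z = np.concatenate([rng.standard_normal(900), rng.normal(3.0, 1.0, 100)])
pvals = norm.sf(z)
print(benjamini_hochberg(pvals, alpha=0.1).sum(), "rejections")
```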
In information geometry, statistical models are considered as differentiable manifolds, where each probability distribution represents a unique point on the manifold. A Riemannian metric can be systematically obtained from a divergence function using Eguchi's theory (1992); the well-known Fisher-Rao metric is obtained from the Kullback-Leibler (KL) divergence. The geometric derivation of the classical Cram\'er-Rao Lower Bound (CRLB) by Amari and Nagaoka (2000) is based on this metric. In this paper, we study a Riemannian metric obtained by applying Eguchi's theory to the Basu-Harris-Hjort-Jones (BHHJ) divergence (1998) and derive a generalized Cram\'er-Rao bound using Amari-Nagaoka's approach. There are potential applications for this bound in robust estimation.
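For reference, the BHHJ (density power) divergence of Basu, Harris, Hjort and Jones (1998) with tuning parameter $\beta > 0$, from which the Riemannian metric studied here is obtained via Eguchi's construction, is
$$ d_\beta(g, f) \;=\; \int \Bigl\{ f(x)^{1+\beta} - \Bigl(1 + \tfrac{1}{\beta}\Bigr) g(x)\, f(x)^{\beta} + \tfrac{1}{\beta}\, g(x)^{1+\beta} \Bigr\}\, dx , $$
which recovers the Kullback-Leibler divergence in the limit $\beta \to 0$.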