Browse, search, and filter preprints from arXiv—fast, readable, and built for curious security folks.
Federated Learning (FL) enables heterogeneous clients to collaboratively train a shared model without centralizing their raw data, offering an inherent level of privacy. However, gradients and model updates can still leak sensitive information, while malicious participants may mount adversarial attacks such as Byzantine manipulation. These vulnerabilities highlight the need to address differential privacy (DP) and Byzantine robustness within a unified framework. Existing approaches, however, often rely on unrealistic assumptions such as bounded gradients, require auxiliary server-side datasets, or fail to provide convergence guarantees. We address these limitations by proposing Byz-Clip21-SGD2M, a new algorithm that integrates robust aggregation with double momentum and carefully designed clipping. We prove high-probability convergence guarantees under standard $L$-smoothness and $\sigma$-sub-Gaussian gradient noise assumptions, thereby relaxing conditions that dominate prior work. Our analysis recovers state-of-the-art convergence rates in the absence of adversaries and improves utility guarantees under Byzantine and DP settings. Empirical evaluations on CNN and MLP models trained on MNIST further validate the effectiveness of our approach.
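As a rough sketch of the two core ingredients named in the abstract above -- norm clipping and a double (two-stage) momentum buffer -- the following toy optimizer minimizes a quadratic. All parameter values and names are our own assumptions; this is not Byz-Clip21-SGD2M's exact update rule, and it omits the robust aggregation and DP noise entirely.

```python
import numpy as np

def clip(v, tau):
    """Scale v down so its Euclidean norm is at most tau."""
    n = np.linalg.norm(v)
    return v if n <= tau else v * (tau / n)

def double_momentum_step(x, g, m1, m2, lr=0.1, beta1=0.9, beta2=0.9, tau=1.0):
    """One clipped double-momentum update (illustrative, not the paper's rule)."""
    g = clip(g, tau)                     # clip the raw stochastic gradient
    m1 = beta1 * m1 + (1 - beta1) * g    # first momentum buffer
    m2 = beta2 * m2 + (1 - beta2) * m1   # second (slower) momentum buffer
    x = x - lr * m2
    return x, m1, m2

# minimize f(x) = ||x||^2 / 2, whose gradient at x is x itself
x = np.array([4.0, -3.0])
m1 = np.zeros(2)
m2 = np.zeros(2)
for _ in range(500):
    x, m1, m2 = double_momentum_step(x, x.copy(), m1, m2)
```

Clipping bounds the influence of any single (possibly adversarial or heavy-tailed) gradient, while the chained momentum buffers average out noise before the iterate moves.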
AI-driven cybersecurity systems often fail under cross-environment deployment due to fragmented, event-centric telemetry representations. We introduce the Canonical Security Telemetry Substrate (CSTS), an entity-relational abstraction that enforces identity persistence, typed relationships, and temporal state invariants. Across heterogeneous environments, CSTS improves cross-topology transfer for identity-centric detection and prevents collapse under schema perturbation. For zero-day detection, CSTS isolates semantic orientation instability as a modeling, not schema, phenomenon, clarifying layered portability requirements.
The integration of machine learning (ML) algorithms into Internet of Things (IoT) applications has introduced significant advantages alongside vulnerabilities to adversarial attacks, especially within IoT-based intrusion detection systems (IDS). While theoretical adversarial attacks have been extensively studied, practical implementation constraints have often been overlooked. This research addresses this gap by evaluating the feasibility of evasion attacks on IoT network-based IDSs, employing a novel black-box adversarial attack. Our study aims to bridge theoretical vulnerabilities with real-world applicability, enhancing understanding and defense against sophisticated threats in modern IoT ecosystems. Additionally, we propose a defense scheme tailored to mitigate the impact of evasion attacks, thereby reinforcing the resilience of ML-based IDSs. Our findings demonstrate successful evasion attacks against IDSs, underscoring their susceptibility to advanced techniques. Moreover, our proposed defense mechanism exhibits robust performance, effectively detecting the majority of adversarial traffic and showing promising results compared to current state-of-the-art defenses. By addressing these critical cybersecurity challenges, our research contributes to advancing IoT security and provides insights for developing more resilient IDSs.
Industrial deployments increasingly rely on Open Platform Communications Unified Architecture (OPC UA) as a secure and platform-independent communication protocol, while private Fifth Generation (5G) networks provide low-latency and high-reliability connectivity for modern automation systems. However, their combination introduces new attack surfaces and traffic characteristics that remain insufficiently understood, particularly with respect to machine learning-based intrusion detection systems (ML-based IDS). This paper presents an experimental study on detecting cyberattacks against OPC UA applications operating over an operational private 5G network. Multiple attack scenarios are executed, and OPC UA traffic is captured and enriched with statistical flow-, packet-, and protocol-aware features. Several supervised ML models are trained and evaluated to distinguish benign and malicious traffic. The results demonstrate that the proposed ML-based IDS achieves high detection performance for a representative set of OPC UA-specific attack scenarios over an operational private 5G network.
Ring-mapping protocols need a canonical byte-to-residue layer before any algebraic encryption step can begin. This paper isolates that layer and presents the base-m length codec, a canonical map from byte strings of length less than 2^64 to lists of residues modulo m. The encoder builds on and adapts an rANS-based system proposed by Duda. Decoding is exact for all moduli satisfying the paper's parameter bounds. Because the encoding carries the byte length in its fixed-width header, decoding is also tolerant to appended valid suffix digits. The paper is accompanied by a Rust implementation of the described protocol, a Lean 4 formalization of the abstract codec with machine-checked proofs, and performance benchmarks. The Lean 4 formalization establishes fixed-width prefix inversion and payload-state bounds below 2^64, stream-level roundtrip correctness, and that every emitted symbol is a valid residue modulo m. We conclude with a complexity analysis and a discussion of practical considerations arising in real-world use of the codec.
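As a loose illustration of the byte-to-residue interface described above, here is a naive positional base-m codec with a fixed-width base-m length header. This is our own simplification, not the paper's rANS-based construction, and it does not reproduce the paper's suffix-tolerance property; all names and the header sizing are assumptions.

```python
def header_width(m: int) -> int:
    """Number of base-m digits needed to encode any byte length below 2^64."""
    w = 1
    while m ** w < 2 ** 64:
        w += 1
    return w

def encode_base_m(data: bytes, m: int) -> list[int]:
    """Map a byte string to residues mod m: a fixed-width base-m header
    holding the byte length, then the payload digits (illustrative only)."""
    assert m >= 2 and len(data) < 2 ** 64
    n, header = len(data), []
    for _ in range(header_width(m)):
        header.append(n % m)
        n //= m
    v, payload = int.from_bytes(data, "big"), []
    while v > 0:                 # interpret bytes as one big integer
        payload.append(v % m)
        v //= m
    return header + payload

def decode_base_m(digits: list[int], m: int) -> bytes:
    w = header_width(m)
    n = 0
    for d in reversed(digits[:w]):
        n = n * m + d
    v = 0
    for d in reversed(digits[w:]):
        v = v * m + d
    return v.to_bytes(n, "big")  # header length restores leading zero bytes
```

Carrying the byte length in the header is what makes decoding exact even when the payload integer has leading zero bytes, mirroring the role the abstract assigns to the fixed-width header.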
The IEEE 802.11 wireless standard includes a secure authentication protocol called SAE (Simultaneous Authentication of Equals), whose use is mandatory for WPA3-Personal networks. The protocol is specified at two separate but linked levels: a traditional cryptographic description of the communication logic between network devices, and a state machine description that realises the former within each individual device. Current formal verification efforts focus mainly on communication logic. We present detailed formal models of the protocol at both levels, provide precise specifications of its security properties, and analyse machine-checked proofs in ProVerif and ASMETA. The integrated analysis of the above two models is particularly novel, enabling us to identify and address several issues in the current IEEE 802.11 specification more thoroughly than would have been possible otherwise, leading to several official revisions of the standard.
Financial institutions face increasing cyber risk while operating under strict regulatory oversight. To manage this risk, they rely heavily on Cyber Threat Intelligence (CTI) to inform detection, response, and strategic security decisions. Artificial intelligence (AI) is widely suggested as a means to strengthen CTI. However, evidence of trustworthy production use in finance remains limited. Adoption depends not only on predictive performance, but also on governance, integration into security workflows, and analyst trust. Thus, we examine how AI is used for CTI in practice within financial institutions and what barriers prevent trustworthy deployment. We report a mixed-methods, user-centric study combining a CTI-finance-focused systematic literature review, semi-structured interviews, and an exploratory survey. Our review screened 330 publications (2019-2025) and retained 12 finance-relevant studies for analysis; we further conducted six interviews and collected 14 survey responses from banks and consultancies. Across research and practice, we identify four recurrent socio-technical failure modes that hinder trustworthy AI-driven CTI: (i) shadow use of public AI tools outside institutional controls, (ii) license-first enablement without operational integration, (iii) attacker-perception gaps that limit adversarial threat modeling, and (iv) missing security for the AI models themselves, including limited monitoring, robustness evaluation, and audit-ready evidence. Survey results provide additional insights: 71.4% of respondents expect AI to become central within five years, 57.1% report infrequent current use due to interpretability and assurance concerns, and 28.6% report direct encounters with adversarial risks. Based on these findings, we derive three security-oriented operational safeguards for AI-enabled CTI deployments.
Large Language Models (LLMs) are widely deployed, yet are vulnerable to jailbreak prompts that elicit policy-violating outputs. Although prior studies have uncovered these risks, they typically treat all tokens as equally important during prompt mutation, overlooking the varying contributions of individual tokens to triggering model refusals. Consequently, these attacks introduce substantial redundant searching under query-constrained scenarios, reducing attack efficiency and hindering comprehensive vulnerability assessment. In this work, we conduct a token-level analysis of refusal behavior and observe that token contributions are highly skewed rather than uniform. Moreover, we find strong cross-model consistency in refusal tendencies, enabling the use of a surrogate model to estimate token-level contributions to the target model's refusals. Motivated by these findings, we propose TriageFuzz, a token-aware jailbreak fuzzing framework that adapts the fuzz testing approach with a series of customized designs. TriageFuzz leverages a surrogate model to estimate the contribution of individual tokens to refusal behaviors, enabling the identification of sensitive regions within the prompt. Furthermore, it incorporates a refusal-guided evolutionary strategy that adaptively weights candidate prompts with a lightweight scorer to steer the evolution toward bypassing safety constraints. Extensive experiments on six open-source LLMs and three commercial APIs demonstrate that TriageFuzz achieves comparable attack success rates (ASR) with significantly reduced query costs. Notably, it attains a 90% ASR with over 70% fewer queries compared to baselines. Even under an extremely restrictive budget of 25 queries, TriageFuzz outperforms existing methods, improving ASR by 20-40%.
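The token-level contribution estimate described above can be sketched as a leave-one-token-out analysis. Here a toy keyword scorer stands in for the surrogate model's refusal probability; the scorer, the example prompt, and all names are our assumptions, not TriageFuzz internals.

```python
def refusal_score(tokens):
    """Placeholder surrogate: in a TriageFuzz-style setup this would be a
    surrogate LLM's refusal probability; here it is a toy keyword scorer."""
    sensitive = {"bomb", "hack", "exploit"}
    return sum(1.0 for t in tokens if t.lower() in sensitive) / max(len(tokens), 1)

def token_contributions(tokens):
    """Leave-one-token-out: a token's contribution is the drop in the
    surrogate's refusal score when that token is removed."""
    base = refusal_score(tokens)
    return [base - refusal_score(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))]

prompt = "how to hack a toy server".split()
scores = token_contributions(prompt)
# tokens with the largest contributions mark the "sensitive regions"
# that a token-aware fuzzer would prioritize for mutation
ranked = sorted(zip(prompt, scores), key=lambda p: -p[1])
```

The point of the sketch is the skew the abstract reports: most tokens contribute little or negatively, while a few dominate the refusal signal, so mutation effort can be concentrated on them.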
Fully Homomorphic Encryption (FHE) is rapidly emerging as a promising foundation for privacy-preserving cloud services, enabling computation directly on encrypted data. As FHE implementations mature and begin moving toward practical deployment in domains such as secure finance, biomedical analytics, and privacy-preserving AI, a critical question remains insufficiently explored: how reliable is FHE computation on real hardware? This question is especially important because, compared with plaintext computation, FHE incurs much higher computational overhead, making it more susceptible to transient hardware faults. Moreover, data corruptions are likely to remain silent: the FHE service has no access to the underlying plaintext, so it cannot notice that the corresponding decrypted result has already been corrupted. Motivated by these concerns, we conduct a comprehensive evaluation of silent data corruptions (SDCs) in FHE ciphertext computation. Through large-scale fault-injection experiments, we characterize the vulnerability of FHE to transient faults, and through a theoretical analysis of error-propagation behaviors, we gain deeper algorithmic insight into the mechanisms underlying this vulnerability. We further assess the effectiveness of different fault-tolerance mechanisms for mitigating these faults.
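The fault-injection methodology mentioned above has a simple generic shape: run a computation on a bit-flipped copy of its input and classify each outcome. The sketch below uses a trivial stand-in computation, not an FHE ciphertext operation; all names are our assumptions.

```python
import random

def inject_bit_flip(buf: bytearray, rng: random.Random) -> None:
    """Flip one uniformly random bit in-place, emulating a transient fault."""
    bit = rng.randrange(len(buf) * 8)
    buf[bit // 8] ^= 1 << (bit % 8)

def fault_campaign(compute, data: bytes, trials: int, seed: int = 0):
    """Toy fault-injection campaign: run `compute` on a faulty copy of `data`
    and classify each outcome (SDC = silent data corruption)."""
    rng = random.Random(seed)
    golden = compute(data)                 # fault-free reference result
    counts = {"benign": 0, "sdc": 0, "crash": 0}
    for _ in range(trials):
        faulty = bytearray(data)
        inject_bit_flip(faulty, rng)
        try:
            out = compute(bytes(faulty))
            counts["sdc" if out != golden else "benign"] += 1
        except Exception:
            counts["crash"] += 1
    return counts

# stand-in for a ciphertext operation: sum of bytes mod 256
counts = fault_campaign(lambda d: sum(d) % 256, bytes(range(64)), trials=200)
```

In the FHE setting the "sdc" bucket is the worrying one: the server computes on ciphertexts it cannot inspect, so a corrupted-but-well-formed result is only observable after decryption on the client side.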
Given two linear codes, the Linear Equivalence Problem (LEP) asks to find (if it exists) a linear isometry between them; as a special case, we have the Permutation Equivalence Problem (PEP), in which isometries must be permutations. LEP and PEP have recently gained renewed interest as the security foundations for several post-quantum schemes, including LESS. A recent paper has introduced the use of the Schur product to solve PEP, identifying many new easy-to-solve instances. In this paper, we extend this result to LEP. In particular, we generalize the approach and rely on the more general notion of power codes. Combining it with Frobenius automorphisms and Hermitian hulls, we identify many classes of easy LEP instances. To the best of our knowledge, this is the first work exploiting algebraic weaknesses for LEP. Finally, we show an improved reduction to PEP whenever the coefficients of the monomial matrix are in a subgroup of the multiplicative group of the finite field.
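The Schur product mentioned above is the componentwise product of codewords, and the span of all pairwise products is the Schur square of a code. A minimal sketch (our own, over a prime field; not the paper's algorithm) shows the kind of algebraic signal such attacks exploit: structured codes like Reed-Solomon have an abnormally small Schur square.

```python
import itertools

def rank_mod_p(rows, p):
    """Rank of an integer matrix over GF(p), via Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank, col, ncols = 0, 0, len(rows[0]) if rows else 0
    while rank < len(rows) and col < ncols:
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            col += 1
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)      # Fermat inverse of the pivot
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
        col += 1
    return rank

def schur_square_dim(G, p):
    """Dimension of the Schur (componentwise) square of the code spanned by G."""
    prods = [[(a * b) % p for a, b in zip(u, v)]
             for u, v in itertools.combinations_with_replacement(G, 2)]
    return rank_mod_p(prods, p)
```

For a dimension-3 Reed-Solomon-style generator over GF(7) (rows evaluating 1, x, x^2 at points 1..6), the Schur square has dimension 2k-1 = 5 rather than the generic k(k+1)/2 = 6, a distinguishing fingerprint of the algebraic structure.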
Private Membership Testing (PMT) protocols enable clients to verify whether a certain data item is included in a database without revealing the item to the database operator or other external parties. This paper examines Source-assisted PMT (SPMT), in which clients leverage compact information provided by the data source when the item is first submitted to the database. SPMT is relevant in applications such as certificate transparency and supply-chain auditing; yet, designing an approach that is efficient, scalable, and privacy-preserving remains a challenge. This work presents Gyokuro, which takes a different approach to conventional membership testing schemes. Instead of requesting the server to produce a proof attesting that a certain data item exists in the database, we leverage Trusted Execution Environments (TEEs) to produce proofs demonstrating that the server has made enough progress to add the data item to the database. With the help of existing monitoring services, clients can infer that no items have been removed from the database. This allows Gyokuro to provide strong privacy guarantees and achieve high efficiency, as a client's membership testing query does not include any information regarding their interests, eliminating the need for complex and inefficient protection mechanisms. Additionally, this approach enables membership testing on large-scale databases, since the communication and computation required are independent of the database size. Our evaluations show practical feasibility, achieving 7 ms membership testing latency and a throughput of around 1,400 requests/sec/core.
The European Digital Identity (EUDI) Wallet aims to provide end users with a way to obtain attested credentials from issuers and present them to different relying parties. An important property mentioned in the regulatory frameworks is the ability to revoke a previously issued credential. While it is possible to issue short-lived credentials, in some cases this may be inconvenient, and a separate revocation service that allows revoking a credential at any time may be necessary. In this work, we propose a full end-to-end description of a generic credential revocation system, which technically relies on a single server and secure transmission channels between parties. We prove the security of the proposed revocation functionality in the universal composability model and estimate its efficiency based on a proof-of-concept implementation.
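A single-server revocation registry of the general kind described above can be sketched as a bit-status list: one bit per issued credential, flipped on revocation, with a digest that relying parties can pin. This toy is our own illustration; the paper's protocol additionally handles secure channels between parties and carries a universal-composability security proof, none of which appears here.

```python
import hashlib

class RevocationServer:
    """Toy single-server revocation registry: issuers flag credential
    indices, verifiers query current status (illustrative sketch only)."""

    def __init__(self, size: int):
        # one status bit per credential index
        self.revoked = bytearray((size + 7) // 8)

    def revoke(self, index: int) -> None:
        self.revoked[index // 8] |= 1 << (index % 8)

    def is_revoked(self, index: int) -> bool:
        return bool(self.revoked[index // 8] & (1 << (index % 8)))

    def digest(self) -> str:
        """Commitment to the current status list; relying parties can
        compare digests to detect a changed (e.g. rolled-back) list."""
        return hashlib.sha256(bytes(self.revoked)).hexdigest()
```

The appeal of a status list is that lookups and storage cost O(1) per credential and the whole list compresses well when revocations are sparse, which is why this shape recurs in credential ecosystems.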
Coordination of view coverage via privacy-aware smart cameras is key to a more socially responsible urban intelligence. Rather than maximizing view coverage at any cost or over-relying on expensive cryptographic techniques, we address how cameras can coordinate to legitimately monitor public spaces while excluding privacy-sensitive regions by design. This article proposes a decentralized framework in which interactive smart cameras coordinate to autonomously select their orientation via collective learning, while eliminating privacy violations via soft and hard constraint satisfaction. The approach scales to hundreds and up to thousands of cameras without any centralized control. Experimental evidence shows 18.42% higher coverage efficiency and 85.53% lower privacy violation rates than baselines and other state-of-the-art approaches. This significant advance also yields practical guidelines for operators and policymakers: how the field of view, spatial placement, and budget of cameras operated by ethically aligned artificial intelligence jointly influence coverage efficiency and privacy protection in large-scale and sensitive urban environments.
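The interplay of coverage maximization and hard privacy constraints can be illustrated in a toy grid world: each camera picks the orientation that adds the most uncovered cells while never viewing a privacy-sensitive cell. The grid model, greedy rule, and all names are our assumptions; the paper's framework is a decentralized collective-learning scheme, not this centralized sketch.

```python
def coverage(orientation, camera_pos, grid=8, reach=3):
    """Cells covered by a camera facing one of four cardinal directions."""
    x, y = camera_pos
    dx, dy = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}[orientation]
    cells = {(x + dx * r, y + dy * r) for r in range(1, reach + 1)}
    return {(cx, cy) for cx, cy in cells if 0 <= cx < grid and 0 <= cy < grid}

def choose_orientations(cameras, private_cells):
    """Greedy pass: each camera takes the orientation with the largest
    coverage gain that views no privacy-sensitive cell (hard constraint)."""
    covered, choice = set(), {}
    for cam in cameras:
        best, best_gain = None, -1
        for o in "NSEW":
            cells = coverage(o, cam)
            if cells & private_cells:     # hard privacy constraint: forbidden
                continue
            gain = len(cells - covered)
            if gain > best_gain:
                best, best_gain = o, gain
        if best is not None:
            choice[cam] = best
            covered |= coverage(best, cam)
    return choice, covered

cams = [(4, 4), (1, 1)]
choice, covered = choose_orientations(cams, private_cells={(4, 5)})
```

Even this crude version shows the trade-off the abstract quantifies: orientations that would maximize raw coverage are discarded outright when they intersect a protected region.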
Large language models (LLMs) can be misused to reveal sensitive information, such as weapon-making instructions, or to produce malware. LLM providers rely on $\emph{monitoring}$ to detect and flag unsafe behavior during inference. An open security challenge is $\emph{adaptive}$ adversaries who craft attacks that simultaneously (i) evade detection while (ii) eliciting unsafe behavior. Adaptive attackers are a major concern as LLM providers cannot patch their security mechanisms, since they are unaware of how their models are being misused. We cast $\emph{robust}$ LLM monitoring as a security game, where adversaries who know about the monitor try to extract sensitive information, while a provider must accurately detect these adversarial queries at low false positive rates. Our work (i) shows that existing LLM monitors are vulnerable to adaptive attackers and (ii) designs improved defenses through $\emph{activation watermarking}$ by carefully introducing uncertainty for the attacker during inference. We find that $\emph{activation watermarking}$ outperforms guard baselines by up to $52\%$ under adaptive attackers who know the monitoring algorithm but not the secret key.
By integrating Chain-of-Thought (CoT) reasoning, Vision-Language-Action (VLA) models have demonstrated strong capabilities in robotic manipulation, particularly by improving generalization and interpretability. However, the security of CoT-based reasoning mechanisms remains largely unexplored. In this paper, we show that CoT reasoning introduces a novel attack vector for targeted control hijacking--for example, causing a robot to mistakenly deliver a knife to a person instead of an apple--without modifying the user's instruction. We first provide empirical evidence that CoT strongly governs action generation, even when it is semantically misaligned with the input instructions. Building on this observation, we propose TRAP, the first targeted adversarial attack framework for CoT-reasoning VLA models. TRAP uses an adversarial patch (e.g., a coaster placed on the table) to corrupt intermediate CoT reasoning and hijack the VLA's output. By optimizing the CoT adversarial loss, TRAP induces specific, adversary-defined behaviors. Extensive evaluations across 3 mainstream VLA architectures and 3 CoT reasoning paradigms validate the effectiveness of TRAP. Notably, we implemented the patch by printing it on paper in a real-world setting. Our findings highlight the urgent need to secure CoT reasoning in VLA systems.
We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered during heartbeat-driven background execution can silently pollute agent memory and subsequently influence user-facing behavior without the user's awareness. This vulnerability arises from an architectural design shared across the Claw ecosystem: heartbeat background execution runs in the same session as user-facing conversation, so content ingested from any external source monitored in the background (including email, message channels, news feeds, code repositories, and social platforms) can enter the same memory context used for foreground interaction, often with limited user visibility and without clear source provenance. We formalize this process as an Exposure (E) $\rightarrow$ Memory (M) $\rightarrow$ Behavior (B) pathway: misinformation encountered during heartbeat execution enters the agent's short-term session context, potentially gets written into long-term memory, and later shapes downstream user-facing behavior. We instantiate this pathway in an agent-native social setting using MissClaw, a controlled research replica of Moltbook. We find that (1) social credibility cues, especially perceived consensus, are the dominant driver of short-term behavioral influence, with misleading rates up to 61%; (2) routine memory-saving behavior can promote short-term pollution into durable long-term memory at rates up to 91%, with cross-session behavioral influence reaching 76%; (3) under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries. Overall, prompt injection is not required: ordinary social misinformation is sufficient to silently shape agent memory and behavior under heartbeat-driven background execution.
Critical energy infrastructures increasingly rely on information and communication technology for monitoring and control, which leads to new challenges with regard to cybersecurity. Recent advancements in this domain, including attribute-based access control (ABAC), have not been sufficiently addressed by established standards such as IEC 61850 and IEC 62351. To address this issue, we propose a novel real-time server-aided attribute-based authorization and access control scheme for time-critical applications called RTS-ABAC. We tailor RTS-ABAC to the strict timing constraints inherent to the protocols employed in substation automation systems (SAS). We extend the concept of conventional ABAC by introducing real-time attributes and time-dependent policy evaluation and enforcement. To safeguard the authenticity, integrity, and non-repudiation of SAS communication and protect an SAS against domain-typical adversarial attacks, RTS-ABAC employs mandatory authentication, authorization, and access control for any type of SAS communication using a bump-in-the-wire (BITW) approach. To evaluate RTS-ABAC, we conduct a testbed-based performance analysis and a laboratory-based demonstration of applicability. We demonstrate the applicability using intelligent electronic devices, merging units, and I/O boxes communicating via the GOOSE and SV protocols. The results show that RTS-ABAC is able to secure low-latency communication between SAS devices, as up to 99.82% of exchanged packets achieve a round-trip time below 6 ms. Moreover, the results of the evaluation indicate that RTS-ABAC is a viable solution to enhance cybersecurity not only in a newly constructed SAS but also via retrofitting of existing substations.
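The notion of time-dependent policy evaluation mentioned above can be sketched as an ABAC check where one attribute is evaluated against the current clock. The roles, attributes, and maintenance-window rule below are entirely our assumptions for illustration, not the RTS-ABAC specification.

```python
from datetime import datetime, timezone

def evaluate(subject: dict, action: str, resource: dict, now: datetime) -> bool:
    """Time-dependent ABAC check (toy): role and zone attributes must match,
    and GOOSE control writes are authorized only inside the maintenance
    window, a real-time attribute evaluated at decision time."""
    if subject.get("role") != "engineer":
        return False
    if resource.get("zone") != subject.get("zone"):
        return False
    if action == "write" and resource.get("protocol") == "GOOSE":
        start, end = resource["maintenance_window"]   # real-time attribute
        return start <= now <= end
    return action == "read"

sub = {"role": "engineer", "zone": "bay-1"}
res = {"zone": "bay-1", "protocol": "GOOSE",
       "maintenance_window": (datetime(2025, 1, 1, 8, tzinfo=timezone.utc),
                              datetime(2025, 1, 1, 17, tzinfo=timezone.utc))}
```

Passing `now` explicitly, rather than reading the clock inside the policy, keeps the decision deterministic and testable, which matters when the enforcement point sits in a bump-in-the-wire device on a tight latency budget.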
The rapid adoption of mobile graphical user interface (GUI) agents, which autonomously control applications and operating systems (OS), exposes new system-level attack surfaces. Existing backdoors against web GUI agents and general GenAI models rely on environmental injection or deceptive pop-ups to mislead the agent operation. However, these techniques do not work on screenshot-based mobile GUI agents due to the challenges of restricted trigger design spaces, OS background interference, and conflicts in multiple trigger-action mappings. We propose AgentRAE, a novel backdoor attack capable of inducing Remote Action Execution in mobile GUI agents using visually natural triggers (e.g., benign app icons in notifications). To address the underfitting caused by natural triggers and achieve accurate multi-target action redirection, we design a novel two-stage pipeline that first enhances the agent's sensitivity to subtle iconographic differences via contrastive learning, and then associates each trigger with a specific mobile GUI agent action through backdoor post-training. Our extensive evaluation reveals that the proposed backdoor preserves clean performance while achieving an attack success rate of over 90% across ten mobile operations. Furthermore, the benign-looking triggers are hard to detect visually, and the attack circumvents eight representative state-of-the-art defenses. These results expose an overlooked backdoor vector in mobile GUI agents, underscoring the need for defenses that scrutinize notification-conditioned behaviors and internal agent representations.