Browse, search, and filter preprints from arXiv—fast, readable, and built for curious security folks.
Side Channel Analysis (SCA) relaxes the black-box assumption of conventional cryptanalysis by incorporating physical measurements acquired during cryptographic operations. Electromagnetic (EM) emissions of a chip during computation often provide a valuable source of side-channel leakage. When evaluating a chip for EM side-channel emissions, one must position an EM probe advantageously relative to the chip. Previous literature focused on hot-spot finding and, to a lesser extent, on probe repositioning. Trace augmentations have been considered to aid the portability of profiling on one physical device and attacking another. This paper focuses on training a single neural network using traces from multiple EM probe positions to detect leakage from a larger area over the attacked device. We provide a dual evaluation of EM traces from two completely independent labs, profiling on data from one lab and attacking traces from the other.
Barrett reduction is the nonlinear core of every practical NTT-based post-quantum cryptography implementation. Existing composition frameworks (ISW, t-SNI, PINI, DOM) address Boolean masking over GF(2); none provides a machine-checked characterization of Barrett's leakage under first-order arithmetic masking and the first-order probing model over prime fields. Building on our prior series, QANARY [15], partial-NTT-masking margins [14], algebraic foundations [16], and butterfly composition [18], we close this gap. We prove a trichotomy: for any $q > 0$ and shift $s$, the Barrett internal wire map $f_x(m) = ((x + 2^s - m) \bmod 2^s) \bmod q$ has preimage cardinality in $\{0, 1, 2\}$, never more. We call this the 1-Bit Barrier: max-multiplicity 2 implies at most 1 bit of min-entropy loss per internal wire, universal over all moduli. The count-zero cases, unreachable output values, reveal that actual leakage is often strictly less than 1 bit, making the bound conservative. We introduce PF-PINI (Prime-Field PINI): Barrett satisfies PF-PINI(2); the Cooley-Tukey butterfly satisfies PF-PINI(1). We observe (not yet proved) that with fresh inter-stage masking, the composed pipeline has max-multiplicity $\max(k_1, k_2)$, so the 1-Bit Barrier propagates. The trichotomy, the PF-PINI instantiations, and cardinality results are machine-checked in Lean 4 with Mathlib: 12 proved results, zero sorry, universal over all $q > 0$ (the min-entropy bound follows by standard definitions). Adams Bridge lacks fresh inter-stage masking, violating PF-PINI composition and explaining why Papers 1 [15] and 2 [14] found vulnerabilities. NIST IR 8547 recommends formal methods for PQC implementation validation. The 1-Bit Barrier provides the first universal machine-checked cardinality bound for masked Barrett reduction in ML-KEM (FIPS 203) and ML-DSA (FIPS 204), with a corresponding 1-bit leakage interpretation.
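The claimed trichotomy can be checked exhaustively for small parameters. Below is a minimal sketch, assuming the mask m ranges over [0, q) with q <= 2^s (the abstract does not state the wire map's domain, so this is an assumption); the function names are illustrative, not the paper's Lean artifacts.

```python
# Exhaustive check of the "1-Bit Barrier" trichotomy for the internal
# wire map f_x(m) = ((x + 2**s - m) mod 2**s) mod q.
# Assumption: the mask m is drawn from [0, q), with q <= 2**s.
def preimage_counts(x, s, q):
    """Map each reachable output value y to its preimage cardinality."""
    counts = {}
    for m in range(q):
        y = ((x + 2 ** s - m) % 2 ** s) % q
        counts[y] = counts.get(y, 0) + 1
    return counts

def max_multiplicity(x, s, q):
    return max(preimage_counts(x, s, q).values())

# Example from the sketch: s=3, q=5, x=2 gives multiplicities in {0,1,2},
# with some output values unreachable (the "count-zero" cases).
print(preimage_counts(2, 3, 5))
```

Under this assumed domain, the set of values (x - m) mod 2^s for m in [0, q) forms at most two runs of consecutive integers, each of length at most q, so each residue mod q is hit at most twice, matching the max-multiplicity-2 claim.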
Autonomous AI agents extend large language models into full runtime systems that load skills, ingest external content, maintain memory, plan multi-step actions, and invoke privileged tools. In such systems, security failures rarely remain confined to a single interface; instead, they can propagate across initialization, input processing, memory, decision-making, and execution, often becoming apparent only when harmful effects materialize in the environment. This paper presents AgentWard, a lifecycle-oriented, defense-in-depth architecture that systematically organizes protection across these five stages. AgentWard integrates stage-specific, heterogeneous controls with cross-layer coordination, enabling threats to be intercepted along their propagation paths while safeguarding critical assets. We detail the design rationale and architecture of five coordinated protection layers, and implement a plugin-native prototype on OpenClaw to demonstrate practical feasibility. This perspective provides a concrete blueprint for structuring runtime security controls, managing trust propagation, and enforcing execution containment in autonomous AI agents. Our code is available at https://github.com/FIND-Lab/AgentWard .
Current cyber attribution approaches typically operate on a per-incident basis, leaving open whether aggregating evidence across campaigns improves adversary identification. We investigate whether cross-campaign attribution reduces ambiguity or whether structural limits persist under longitudinal data. We model adversary fingerprints as multi-dimensional feature vectors encoding behavioral, infrastructural, and temporal characteristics derived from covert beacon interactions. We introduce ARCANE (Attacker Re-identification via Cross-campaign Attribution Network), a probabilistic framework that aggregates passive telemetry across campaigns and organizations to construct persistent adversary fingerprints. These fingerprints are updated using a Bayesian belief network that integrates new evidence over time. A time-decayed confidence metric captures accumulated similarity across campaigns. Evaluation on a synthetic dataset of multiple threat profiles shows that intra-actor similarity consistently exceeds inter-actor similarity. However, separation between distinct actors remains limited due to shared operational practices among sophisticated adversaries. Results indicate that cross-campaign aggregation alone does not resolve attribution ambiguity. Performance is constrained by a structural ceiling in feature space, where inter-actor similarity remains high even without evasion. Attribution accuracy remains stable under increasing evasion, suggesting the main limitation is feature indistinguishability rather than adversarial adaptation. These findings highlight the need for additional signal classes, such as targeting patterns, temporal coordination, and infrastructure relationships, to improve attribution reliability.
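A time-decayed confidence metric of the kind described above can be sketched as a decay-weighted average of per-campaign similarity scores. This is an illustrative reconstruction, not ARCANE's actual formula: the exponential half-life, the cosine similarity over fingerprint vectors, and all names are assumptions.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors (assumed metric).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def decayed_confidence(observations, now, half_life_days=90.0):
    """Decay-weighted mean similarity across campaigns.

    observations: list of (timestamp_days, stored_fingerprint, candidate)
    tuples; older campaigns contribute exponentially less.
    """
    num = den = 0.0
    for t, fp, cand in observations:
        w = 0.5 ** ((now - t) / half_life_days)  # exponential time decay
        num += w * cosine(fp, cand)
        den += w
    return num / den if den else 0.0
```

A confidence near 1 indicates that the candidate consistently matches the stored fingerprint across campaigns, with recent campaigns dominating the score.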
Object detection (OD) is critical to real-world vision systems, yet existing backdoor attacks on detection transformers (DETRs) for OD tasks rely on patch-wise triggers optimized at fixed locations with minimal perturbations. Such attacks overlook that backdoor triggers in the real world may appear at different sizes, fields of view (FoVs), and locations in images, while minimal perturbations are difficult for cameras to capture, limiting attack practicality. We first observe that a patch-wise trigger in DETR delivers high attack effectiveness when activating the backdoor across neighboring locations, a phenomenon we term the trigger radiating effect (TRE). Meanwhile, inserting patch-wise triggers across multiple locations synergistically enhances TRE, resulting in high attack effectiveness across images. We propose DETOUR, a practical backdoor attack by using semantic triggers that are effective in real-world object detection systems. To ensure attack practicality, we rescale trigger patterns to different sizes and insert them at various predefined locations during backdoor training, enabling the model to recognize the trigger regardless of its spatial configurations. To address FoV variations in physical deployments, we extract the trigger pattern from a real-world object (e.g., a mug) captured under multiple FoVs and inject the trigger accordingly, promoting viewpoint-invariant backdoor activation and enhancing TRE across the entire image. As a result, the backdoor can be reliably activated under diverse FoVs and spatial configurations.
Large language models deployed at runtime can misbehave in ways that clean-data validation cannot anticipate: training-time backdoors lie dormant until triggered, jailbreaks subvert safety alignment, and prompt injections override the deployer's instructions. Existing runtime defenses address these threats one at a time and often assume a clean reference model, trigger knowledge, or editable weights, assumptions that rarely hold for opaque third-party artifacts. We introduce Layerwise Convergence Fingerprinting (LCF), a tuning-free runtime monitor that treats the inter-layer hidden-state trajectory as a health signal: LCF computes a diagonal Mahalanobis distance on every inter-layer difference, aggregates via Ledoit-Wolf shrinkage, and thresholds via leave-one-out calibration on 200 clean examples, with no reference model, trigger knowledge, or retraining. Evaluated on four architectures (Llama-3-8B, Qwen2.5-7B, Gemma-2-9B, Qwen2.5-14B) across backdoors, jailbreaks, and prompt injection (56 backdoor combinations, 3 jailbreak techniques, and BIPIA email + code-QA), LCF reduces mean backdoor attack success rate (ASR) below 1% on Qwen2.5-7B and Gemma-2 and to 1.3% on Qwen2.5-14B, detects 92-100% of DAN jailbreaks (62-100% for GCG and softer role-play), and flags 100% of text-payload injections across all eight (model, domain) cells, at 12-16% backdoor FPR and <0.1% inference overhead. A single aggregation score covers all three threat families without threat-specific tuning, positioning LCF as a general-purpose runtime safety layer for cloud-served and on-device LLMs.
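The scoring pipeline described above (diagonal Mahalanobis distance over inter-layer differences, thresholded by leave-one-out calibration on clean examples) can be sketched as follows. This is a simplified reconstruction: the Ledoit-Wolf shrinkage aggregation is omitted, and all function names are illustrative rather than the paper's API.

```python
import math

def fit_clean_stats(clean_diffs):
    """Per-dimension mean and variance of flattened inter-layer
    hidden-state difference vectors from clean examples."""
    n, d = len(clean_diffs), len(clean_diffs[0])
    mean = [sum(v[j] for v in clean_diffs) / n for j in range(d)]
    var = [sum((v[j] - mean[j]) ** 2 for v in clean_diffs) / n + 1e-8
           for j in range(d)]
    return mean, var

def diag_mahalanobis(x, mean, var):
    # Diagonal Mahalanobis distance: full covariance replaced by variances.
    return math.sqrt(sum((xi - m) ** 2 / s for xi, m, s in zip(x, mean, var)))

def loo_threshold(clean_diffs, quantile=0.99):
    """Leave-one-out calibration: score each clean example against
    statistics fit on the remaining examples, take a high quantile."""
    scores = []
    for i in range(len(clean_diffs)):
        rest = clean_diffs[:i] + clean_diffs[i + 1:]
        mean, var = fit_clean_stats(rest)
        scores.append(diag_mahalanobis(clean_diffs[i], mean, var))
    scores.sort()
    return scores[min(int(quantile * len(scores)), len(scores) - 1)]
```

At inference time, an input whose inter-layer trajectory scores above the calibrated threshold is flagged, with no reference model or retraining required.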
The rapid integration of Large Language Models (LLMs) into Multi-Agent Systems (MAS) has significantly enhanced their collaborative problem-solving capabilities, but it has also expanded their attack surfaces, exposing them to vulnerabilities such as prompt infection and compromised inter-agent communication. While emerging graph-based anomaly detection methods show promise in protecting these networks, the field currently lacks a standardized, reproducible environment to train these models and evaluate their efficacy. To address this gap, we introduce Gammaf (Graph-based Anomaly Monitoring for LLM Multi-Agent systems Framework), an open-source benchmarking platform. Gammaf is not a novel defense mechanism itself, but rather a comprehensive evaluation architecture designed to generate synthetic multi-agent interaction datasets and benchmark the performance of existing and future defense models. The proposed framework operates through two interdependent pipelines: a Training Data Generation stage, which simulates debates across varied network topologies to capture interactions as robust attributed graphs, and a Defense System Benchmarking stage, which actively evaluates defense models by dynamically isolating flagged adversarial nodes during live inference rounds. Through rigorous evaluation using established defense baselines (XG-Guard and BlindGuard) across multiple knowledge tasks (such as MMLU-Pro and GSM8K), we demonstrate Gammaf's high utility, topological scalability, and execution efficiency. Furthermore, our experimental results reveal that equipping an LLM-MAS with effective attack remediation not only recovers system integrity but also substantially reduces overall operational costs by facilitating early consensus and cutting off the extensive token generation typical of adversarial agents.
Fine-tuning unlocks large language models (LLMs) for specialized applications, but its high computational cost often puts it out of reach for resource-constrained organizations. While cloud platforms could provide the needed resources, data privacy concerns make sharing sensitive information with third parties risky. A promising solution is split learning for LLM fine-tuning, which divides the model between clients and a server, allowing collaborative and secure training through exchanged intermediate data, thus enabling resource-constrained participants to adapt LLMs safely. In light of this, a growing body of literature has emerged to advance this paradigm, introducing varied model methods, system optimizations, and privacy defense-attack techniques for split learning. To bring clarity and direction to the field, a comprehensive survey is needed to classify, compare, and critique these diverse approaches. This paper fills the gap by presenting the first extensive survey dedicated to split learning for LLM fine-tuning. We propose a unified, fine-grained training pipeline to pinpoint key operational components and conduct a systematic review of state-of-the-art work across three core dimensions: model-level optimization, system-level efficiency, and privacy preservation. Through this structured taxonomy, we establish a foundation for advancing scalable, robust, and secure collaborative LLM adaptation.
Public warning systems (PWS) in cellular networks enable authorities to broadcast emergency alerts to all mobile phones in a geographic region in the event of threats such as earthquakes or severe weather. If an attacker can imitate these alerts and transmit a forged warning containing fake news or phishing links, the impact could range from public panic to user compromise. In this work, we present the first open-source 5G emergency alert spoofing attack, implemented by modifying the OpenAirInterface (OAI) radio access network (RAN) code and executed using a software-defined radio, complemented by a custom network management system to automate network and warning configuration. We conduct a detailed analysis of how different smartphones behave under various conditions. Our findings show that while devices readily display spoofed alerts, the alerting mechanism enables multiple practical attack scenarios beyond simple warning display. Finally, to address this threat, we propose and implement a lightweight cross-cell verification mechanism in OAI, in which the device compares the received warning with neighboring cell broadcasts to flag single-source alerts as suspicious.
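The cross-cell verification idea above reduces, at its core, to a quorum check: a warning observed from only a single cell is suspicious, while one corroborated by neighboring cells is trusted. A minimal sketch, assuming alerts can be compared by their broadcast identifier and payload (the matching criterion and names are illustrative, not the paper's implementation):

```python
def verify_alert(received, neighbor_broadcasts):
    """Flag a warning as suspicious when no neighboring cell
    broadcasts a matching alert (single-source alert)."""
    matches = sum(1 for alert in neighbor_broadcasts if alert == received)
    return "trusted" if matches >= 1 else "suspicious"
```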
Accurate identification of vulnerability-inducing commits serves as a foundation for a series of software security tasks, such as vulnerability detection and affected-version analysis. A straightforward solution is the SZZ algorithm, which traces back through the code history to identify the earliest commit that modified the vulnerable code. Unfortunately, neither the customized V-SZZ nor the state-of-the-art LLM4SZZ performs satisfactorily, due to incorrect anchor selection and inadequate backtracking capability, leaving them far from reliable for practical use. To overcome these challenges, we propose a multi-agentic SZZ algorithm, named MAS-SZZ, that identifies vulnerability-inducing commits through collaboration among agents. Specifically, given a CVE description and its corresponding fixing commit, MAS-SZZ summarizes the root cause of the vulnerability and employs a structured step-forward prompting strategy to localize vulnerability-related statements based on the change intent of each patch hunk. These vulnerable statements serve as anchors from which MAS-SZZ autonomously traces backward through the repository's history to find the commit that first introduced the vulnerability. Extensive experiments show that MAS-SZZ outperforms state-of-the-art baselines across datasets and programming languages, achieving F1-score gains of up to 65.22% over the best-performing SZZ algorithm.
A t-private n-server Information-Theoretic Distributed Point Function ((t,n)-ITDPF) allows one to convert any point function f_{alpha,beta}: [N] -> G into n shares (secret keys), such that each server can compute an additive share of f_{alpha,beta}(x) from its key, while any coalition of at most t servers learns absolutely no information about the function. This paper constructs a novel share conversion based on the private information retrieval (PIR) scheme of Ghasemi, Kopparty, and Sudan (STOC 2025) and proposes a perfectly secure 1-private ITDPF with output group G = Z_p, where p can be any prime. Compared with existing perfectly secure ITDPFs for the same output group, the proposed ITDPF is more efficient, with asymptotically shorter secret keys.
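For intuition, the trivial baseline that DPF constructions improve on is to additively secret-share the entire truth table of f_{alpha,beta} over Z_p: this is perfectly (n-1)-private but has keys of O(N) group elements, whereas the point of ITDPF constructions is asymptotically shorter keys. The sketch below illustrates only this naive baseline (function names are ours):

```python
import secrets

def share_point_function(alpha, beta, N, n, p):
    """Additively share the truth table of f_{alpha,beta}: [N] -> Z_p.

    Returns n keys; server j evaluates input x as keys[j][x].
    Perfectly (n-1)-private, but each key holds N elements of Z_p.
    """
    table = [beta % p if i == alpha else 0 for i in range(N)]
    # n-1 uniformly random tables ...
    keys = [[secrets.randbelow(p) for _ in range(N)] for _ in range(n - 1)]
    # ... and one correction table so all shares sum to the truth table.
    last = [(table[i] - sum(k[i] for k in keys)) % p for i in range(N)]
    keys.append(last)
    return keys

def reconstruct(keys, x, p):
    # Summing the servers' additive shares recovers f_{alpha,beta}(x).
    return sum(k[x] for k in keys) % p
```

Any n-1 keys are uniformly random and independent of (alpha, beta); only the sum of all n reveals the function's output.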
Fast Adversarial Training (FAT) has attracted significant attention due to its efficiency in enhancing neural network robustness against adversarial attacks. However, FAT is prone to catastrophic overfitting (CO), wherein models overfit to the specific attack used during training and fail to generalize to others. While existing methods introduce diverse hypotheses and propose various strategies to mitigate CO, a systematic and intuitive explanation of CO remains absent. In this work, we interpret CO through the lens of backdoor attacks. Through validations on pathway division, diverse feature predictions, and universal class-distinguishable triggers in CO, we conceptualize CO as a weak-trigger variant of unlearnable tasks, unifying CO, backdoor attacks, and unlearnable tasks under a common theoretical framework. Guided by this, we leverage several backdoor-inspired strategies to mitigate CO: (i) recalibrate CO-affected model parameters using vanilla fine-tuning, linear probing, or reinitialization-based techniques; (ii) introduce a weight-outlier suppression constraint to regulate abnormal deviations in model weights. Extensive experiments support our interpretation of CO and show the efficacy of the proposed mitigation strategies.
Cross-chain bridges, the critical infrastructure of the multi-chain ecosystem, have become a primary target for attackers, resulting in over $2.8 billion in losses due to subtle implementation flaws. Existing defenses, such as bytecode-level static analysis, are ill-equipped to handle the semantic complexity of cross-chain interactions, while LLM-based approaches, which can understand source code, struggle with hallucinatory reasoning over complex, multi-contract dependencies. In this paper, we propose GoAT-X, a framework that shifts automated auditing of cross-chain smart contract codebases from heuristic pattern matching toward systematic first-principles verification. GoAT-X structures the audit process as a Graph of Auditing Thoughts, explicitly mirroring how human experts decompose, reason about, and validate security logic. By anchoring LLM reasoning in statically extracted data flows and explicitly linking abstract security properties to concrete code implementations, the framework constrains semantic reasoning within well-defined structural and state boundaries. Within this constrained space, GoAT-X treats missing constraints and adversarial bypass paths in cross-chain logic as first-class vulnerability targets and dynamically explores reasoning paths to identify exploitable semantic gaps. We evaluate GoAT-X on a comprehensive benchmark covering all known cross-chain token transaction attacks. GoAT-X achieves 92% recall on fine-grained audit points and 95% coverage of vulnerable projects, while identifying 117 confirmed risks in the wild with low operational cost, establishing a new standard for scalable, logic-driven cross-chain security.
Fast Adversarial Training (FAT) has proven effective in enhancing model robustness by encouraging networks to learn perturbation-invariant representations. However, FAT often suffers from catastrophic overfitting (CO), where the model overfits to the training attack and fails to generalize to unseen ones. Moreover, robustness-oriented optimization typically leads to notable performance degradation on clean inputs, and such degradation becomes increasingly severe as the perturbation budget grows. In this work, we conduct a comprehensive analysis of how guidance strength affects model performance by modulating perturbation and supervision levels across distinct confidence groups. The findings reveal that low-confidence samples are the primary contributors to CO and the robustness-accuracy trade-off. Building on this insight, we propose a Distribution-aware Dynamic Guidance (DDG) strategy that dynamically adjusts both the perturbation budget and the supervision signal. Specifically, DDG scales the perturbation magnitude according to the sample's confidence in the ground-truth class, thereby guiding samples toward consistent decision boundaries while mitigating the influence of spurious correlations. Simultaneously, it dynamically adjusts the supervision signal based on the prediction state of each sample, preventing overemphasis on incorrect signals. To alleviate potential gradient instability arising from dynamic guidance, we further design a weighted regularization constraint. Extensive experiments on standard benchmarks demonstrate that DDG effectively alleviates both CO and the robustness-accuracy trade-off.
The decentralization of modern energy systems is transforming consumers into prosumers who continuously exchange data with aggregators, peers, and market operators. While such data are essential for peer-to-peer trading, demand response, and distributed forecasting, they can reveal sensitive household patterns and introduce privacy risks. Existing data-sharing mechanisms rely on fixed policies or predefined differential privacy budgets, limiting their ability to adapt to variations in reliability, data sensitivity, and request purpose. As a result, prosumers rarely receive explanations for why a request is accepted, rejected, or modified, reducing trust and participation. To address these limitations, we propose X-NegoBox, an explainable negotiation framework for adaptive privacy budgeting and transparent decision making. Each prosumer's data are managed locally within a private DataBox, where raw data remain confined. Incoming requests are processed by an Autonomous Privacy Budget Negotiation Protocol (APBNP), which determines an appropriate privacy budget based on trust, feature sensitivity, declared purpose, historical behavior, and risk-aware pricing. When needed, APBNP generates privacy-preserving counter-offers, such as reduced resolution or duration. An Explainable Agreement Layer (X-Contract) produces human- and machine-readable justifications for each decision. After agreement, requester code executes locally in a sandbox, and only sanitized outputs are shared. Experiments in realistic energy market settings show reduced privacy leakage, higher acceptance rates, and improved interpretability.
The Rowhammer vulnerability poses an increasing challenge with newer generations of DRAM and aggressive technology scaling. Existing mitigation techniques, such as Graphene, Twice, and Hydra, primarily rely on tracking activation counts for each row and issuing refreshes when a row reaches a predefined tracking threshold. However, these methods have inherent limitations, including inefficiency in identifying rows genuinely at risk of bit flips. In this paper, we propose a novel framework called Rowhammer Vulnerability Count (RVC), which shifts the focus from activation-count tracking to evaluating a row's actual vulnerability to bit flips. By selectively issuing refreshes only to rows on the verge of experiencing bit flips, RVC drastically reduces unnecessary refresh operations. We also demonstrate that prior works have incorrectly set tracking thresholds, leading to security flaws. Our evaluation shows that RVC achieves a 95-99.99% reduction in mitigation-induced refreshes compared to Graphene, with no additional space overhead. Furthermore, RVC improves energy efficiency and reduces average LLC latency by up to 76.91%, making it a highly efficient and scalable solution for addressing Rowhammer in modern DRAM systems. These findings establish RVC as a superior approach for preventing Rowhammer, outperforming existing methods in both accuracy and efficiency.
Trusted Execution Environments (TEEs) on low-power microcontrollers (e.g., ARM TrustZone-M) enable isolation of Secure and Non-Secure software but still require both worlds to share resources, including interrupt controllers. In this model, real-time applications and real-time operating systems (RTOSs) execute in the Non-Secure sub-system, whereas the Secure sub-system is typically reserved for a small set of pre-defined security (e.g., cryptographic) operations referred to as trusted computing services. However, many RTOSs rely on periodic interrupts (SysTicks) to advance their own notion of time (time-keeping), and the delivery of this interrupt is essential for preserving real-time behavior. On the other hand, the security of many trusted computing services requires atomicity vis-a-vis the Non-Secure sub-system (where the RTOS resides), precluding SysTick handling. This paper first characterizes this conflict and then introduces a Secure-driven time synchronization mechanism in which the Secure World measures elapsed time and compensates the Non-Secure RTOS by unobtrusively updating the RTOS time-keeping data structures with the appropriate number of missed ticks before re-enabling interrupts and resuming execution of the Non-Secure system. This approach restores a consistent, monotonic notion of time across worlds and enables the secure coexistence of trusted computing services and RTOSs on microcontrollers. Importantly, the proposed approach requires no modifications to the underlying RTOS and incurs no significant run-time overhead.
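The compensation mechanism described above can be modeled in a few lines: the Secure world records hardware time around an atomic service and, before re-enabling Non-Secure interrupts, writes the number of missed SysTicks into the RTOS time-keeping state. This is a toy model under stated assumptions: the class names and the single `tick_count` field are illustrative, where a real port would update the actual RTOS structures.

```python
class RtosClock:
    """Non-Secure RTOS time-keeping, normally advanced by SysTick."""
    def __init__(self, tick_period_us):
        self.tick_period_us = tick_period_us
        self.tick_count = 0  # advanced by the SysTick handler

    def systick_handler(self):
        self.tick_count += 1

class SecureService:
    """Secure-world service that runs with interrupts masked."""
    def __init__(self, clock):
        self.clock = clock

    def run_atomic(self, hw_time_us_before, hw_time_us_after):
        # SysTick delivery is suppressed for the whole atomic service;
        # compensate the RTOS for the ticks it missed before resuming
        # the Non-Secure world.
        elapsed_us = hw_time_us_after - hw_time_us_before
        missed_ticks = elapsed_us // self.clock.tick_period_us
        self.clock.tick_count += missed_ticks
```

Because the Secure world only adds the elapsed whole ticks, the Non-Secure notion of time stays monotonic and consistent without any change to the RTOS code itself.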
The convergence of Internet of Things (IoT) security and Zero Trust (ZT) principles is a trending topic, demanding a comprehensive, multi-perspective analysis. We present the first multivocal literature review (MLR) on this topic, combining 68 academic and 36 industrial studies. This comprehensive review identifies two complementary yet divergent perspectives: academia focuses on IoT compliance with ZT principles through IoT modifications, while industry prioritizes practical integration within existing ZT frameworks guided by NIST standards. The analysis reveals critical research gaps in socio-technical understanding, cost-benefit evaluation, and interdisciplinary collaboration, highlighting these as key directions for future research.