Browse, search, and filter preprints from arXiv—fast, readable, and built for curious security folks.
In this review, we survey the cryptographic task of authentication from the perspective of quantum communication. We review three main flavours of authentication that are often conflated in the literature: authentication of classical messages, authentication of quantum messages, and entity authentication, also covering recent hardware-assisted approaches. We compare representative protocols for each functionality in terms of their security assumptions, set-up requirements, composability, and scalability in large or dynamic networks, and use these criteria to identify and recommend suitable candidates. Finally, applications are surveyed: we provide a detailed case study of authentication and quantum key distribution (QKD), then extend the discussion to protocols beyond QKD, where the role of authentication is more complex. Our take-home message is that an authentication requirement is not an intrinsic limitation of quantum networks: as with all secure communication, each protocol relies on a particular authentication resource, and the security claim of that protocol is meaningful only once the authentication resource and its deployment assumptions are made explicit. At the same time, the existing classical and quantum literature already offers a range of quantum-secure authentication schemes, which can support different applications when carefully matched to the required functionality, assumptions, and security guarantees.
Multi-agent systems (MAS) are increasingly used to automate complex, distributed workflows. However, their inter-agent communication channels introduce new attack surfaces that remain poorly understood and are difficult to defend against. In this paper, we address how defenders should prioritize limited security effort to protect vulnerable communication channels before attacks are observed. This is motivated by our observation that the channel-level attack impact is highly non-uniform: a single compromised edge can account for up to 75% of total attack success. We introduce Mesa, a label-free framework for proactively ranking which MAS edges are most security-critical -- that is, most likely to affect the system's decision if compromised. Mesa combines six graph-theoretic metrics and two dynamic probes (ablation and masking) without requiring attack traces. We evaluate Mesa against a dynamic misinformation attack pipeline across three diverse MAS scenarios, eight network topologies, and five open-source LLMs from the Qwen, Llama, and Gemma families. Mesa rankings correlate strongly with empirical per-edge attack success rate, achieving mean Spearman $ρ=+0.60$ (peaking at $+0.73$). In resource-constrained defense deployment, monitoring the top 10% of Mesa-ranked edges intercepts about 3x as many successful attacks as random allocation. We further test Mesa under varying attacker and defender models and LangGraph workflows and characterize its limits under adaptive attacks and high-redundancy graphs. Overall, our results show that edge-level risk in MAS is often concentrated and predictable, allowing proactive hardening of multi-agent infrastructures.
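The two evaluation quantities named in this abstract, rank correlation between edge scores and per-edge attack success, and the share of successful attacks covered by the top-ranked edges, can be illustrated with a minimal sketch. This is not Mesa's implementation: the function names and the idea of passing plain lists of per-edge scores and success counts are assumptions made for illustration only.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks.

    Assumes neither input is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def top_fraction_coverage(scores: Sequence[float],
                          successes: Sequence[int],
                          fraction: float) -> float:
    """Share of all successful attacks that hit the top `fraction` of edges by score."""
    k = max(1, int(len(scores) * fraction))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    total = sum(successes)
    return sum(successes[i] for i in top) / total if total else 0.0
```

Comparing `top_fraction_coverage` for a ranking against the same quantity for randomly chosen edges gives the kind of ratio the abstract reports for monitoring the top 10% of edges.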
Semantic communication (SemCom) aims to preserve semantic meaning and task-oriented information beyond conventional message recovery over wireless channels. The adoption of SemCom in shared-access wireless networks introduces new vulnerabilities for multi-user semantic inference. This paper considers a SemCom system for two transmitters communicating with a common receiver over a multiple access channel. Each transmitter maps source information into latent semantic representations, while the receiver jointly reconstructs and classifies the semantic information for both transmitters. A selective over-the-air backdoor (Trojan) attack is presented in which an adversary transmits a low-power trigger waveform over the air and injects it into the shared received signal during training. By transmitting the trigger again during testing, this stealthy, low-power attack selectively manipulates the semantic inference for one transmitter while minimally affecting the inference of the other transmitter. To mitigate this vulnerability, a trigger-aware defense mechanism is developed to preserve correct semantic labels under trigger-contaminated wireless observations. The results demonstrate both the vulnerability of shared-access SemCom systems to selective over-the-air backdoor attacks and the effectiveness of trigger-aware robust training for semantic protection.
Researchers and practitioners increasingly apply Large Language Models (LLMs) for automated vulnerability detection. Recent work has shown that LLMs are susceptible to the same cognitive heuristics that bias human judgment. Yet, no work has investigated whether these heuristics affect a model's assessment of code vulnerabilities. In this paper, we present the first systematic exploration of cognitive heuristics in LLM-driven code vulnerability detection. We introduce a controlled framework that holds the code fixed and only varies the surrounding context to trigger three cognitive heuristics: the halo effect through author attribution, the framing effect through task objectives and consequences, and the anchoring effect through prior analysis results. Within this framework, we evaluate eight LLMs across three programming languages and perform both quantitative and code-level analyses. Our findings demonstrate that all evaluated models are susceptible to these heuristics. Cross-model average susceptibility is highest for framing at 33.2%, followed by anchoring at 23.5% and halo at 18.4%. Code-level analysis reveals that vulnerabilities that require semantic reasoning for detection are more susceptible to cognitive heuristics than those identifiable through pattern matching. Furthermore, models often change their verdict from safe to vulnerable based on the cognitive condition, without accurately identifying the actual vulnerability. To highlight the practical impact, we demonstrate a proof-of-concept black-box cognitive attack that can suppress up to 97% of previously detected vulnerabilities. These findings indicate that cognitive susceptibility is a consistent and exploitable property of LLM-based vulnerability detection.
Most corporate workplace environments enforce policies and technical controls that limit the storage of sensitive data on client endpoints. Consequently, ransomware operators have evolved variants that expand their attack surface from local systems to network drives and shared storage resources. As traditional endpoint detection mechanisms focus primarily on local system behaviour, a compromised client can impact remote file servers, such as by encrypting shared data, without directly triggering behavioural changes on the servers themselves. In this paper, we propose a hybrid detection framework for detecting crypto-ransomware intrusion within integrated file server and client environments. The framework is based on a new technique referred to as Region of Interest (RoI) to analyse network traffic and extract Indicators of Compromise (IoCs). The IoC repository serves as an additional ruleset to enhance existing security tools such as EDRs and IDSs, while RoI-derived features are used to train an ML model to detect highly evasive variants. This study incorporates a broader set of ransomware families and carefully selected benign behaviors based on domain expertise, ensuring coverage of common user actions that could interfere with ransomware detection. Beyond IoCs, which operate in a signature-based manner, our machine learning module achieves a detection precision of 99.64%, with a 0% false negative rate (FNR) and a minimal false positive rate (FPR). Furthermore, the proposed method enables early detection, identifying ransomware intrusions before significant damage occurs, achieving an accuracy of 99.44%.
Malware classification remains a challenging problem due to its inherent heterogeneity, the presence of packed binaries, and the diverse distribution of malware families. Traditional single-model detection mechanisms often fail to generalize across such diverse data, leading to degraded performance, particularly on obfuscated and rare malware samples. In this work, we propose a unified multi-task malware analysis framework based on Mixture of Experts (MoE) architectures. The proposed system evaluates performance across two different input representations, i.e., high-dimensional EMBER feature sets and raw 1D byte arrays extracted from Portable Executable files. It simultaneously performs three critical tasks: malware family classification, packed versus unpacked detection, and malware versus benign identification. By decomposing the problem into specialized expert networks and employing adaptive gating mechanisms, the model enables effective task-specific learning while maintaining overall scalability. We investigate multiple architectural variants, including Homogeneous MoE, Heterogeneous MoE, and Multi-Gate MoE (MMoE). Performance is evaluated in both standard and adversarial settings using original and mutated samples. The obtained results demonstrate that the Multi-Gate MoE model achieves the best performance, reaching a combined detection rate of 0.9744 with only a 2.56% failure rate. Moreover, this configuration exhibits improved robustness under mutation-induced distribution shifts. Our findings highlight the effectiveness of expert specialization and task-specific routing in handling complex malware distributions, making the proposed framework a promising direction for scalable and resilient malware detection systems.
We discover a behavioral invariant in LLM agents under persistent memory poisoning: in architectures where routing information is retrieved through observable memory-tool invocations, successful attacks require calling memory_recall_fact before email_send_email, a transition that non-exfiltrating sessions rarely exhibit. Under the evaluated architecture, this invariant follows from the attack's information-retrieval dependency rather than being merely an empirical correlation, and suppressing it breaks the attack. A simple rule exploiting this invariant alone achieves AUC = 0.9563. A Random Forest classifier over 19 trajectory features refines it to AUC = 0.9904 (BCa 95% CI [0.987, 0.993], N=10,000 resamples), demonstrating that the attack imprints on multiple independent behavioral channels. The signature is overdetermined: removing all recall-related features (half the feature set) leaves AUC unchanged at 0.990, confirming that memory poisoning induces a distributed trajectory signature rather than a single observable anomaly. Cross-model hold-out on 9 models (7B-120B parameters) confirms AUC = 1.000 on 6/9 hold-out splits, with all three exceptions mechanistically explained. The invariant generalizes to frontier models (GPT-4.1, GPT-4o) without retraining. A strictly prefix-only variant achieves AUC = 0.934, suggesting that real-time blocking is feasible with moderate degradation. The boundary is forensically useful: prompt-injection attacks that bypass memory produce a distinct trajectory (score = 0.541), enabling incident responders to distinguish memory-channel attacks from prompt-injection attacks using tool-call logs alone.
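The simple rule described in this abstract can be sketched as a check over a session's ordered tool-call names. The tool names `memory_recall_fact` and `email_send_email` come from the abstract; everything else is an assumption made for illustration, including the function name, the representation of a session as a list of strings, and the reading of "before" as "anywhere earlier in the session" rather than "immediately preceding".

```python
from typing import Sequence

RECALL = "memory_recall_fact"  # tool name taken from the abstract
SEND = "email_send_email"      # tool name taken from the abstract


def recall_precedes_send(tool_calls: Sequence[str]) -> bool:
    """Return True if a recall call occurs earlier in the session than a send call.

    Assumption: any earlier recall counts, not only an adjacent one.
    """
    seen_recall = False
    for name in tool_calls:
        if name == RECALL:
            seen_recall = True
        elif name == SEND and seen_recall:
            return True
    return False


# Hypothetical session logs, for illustration only.
flagged = recall_precedes_send(["web_search", RECALL, "calendar_read", SEND])
benign = recall_precedes_send([SEND, RECALL])
```

Because the check consumes the trajectory left to right and fires at the send call, the same loop could in principle run on a prefix of a live session, which is the setting the prefix-only variant in the abstract refers to.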
Modern vehicles are cyber-physical, networked systems that may contain valuable digital traces for accident reconstruction, crime investigation, warranty analysis, and cybersecurity incident response. However, digital vehicle forensics (DVF) remains less mature than computer, mobile, and cloud forensics because relevant data is distributed across in-vehicle components, mobile devices, manufacturer back ends, third-party services, and physical evidence. This article addresses this gap through a structured synthesis of academic literature, standards, and practitioner-oriented sources. First, we define DVF as the identification, preservation, acquisition, verification, interpretation, and reporting of vehicle-related digital evidence under safety, legal, privacy, and forensic-soundness constraints. Second, we formalize the DVF triage problem as the selection and correlation of evidence sources subject to volatility, accessibility, safety, integrity, and authorization constraints. Third, we explain how eight characteristics were derived from the literature and case material: multiple users, massively networked, cyber-physical system, dependencies between components, functional data, safety implications, accessibility, and limited abstraction. Finally, we add an adversarial perspective and a characteristic-driven triage procedure that helps investigators prioritize evidence sources while documenting assumptions, limitations, and failure cases. The resulting contribution is not an algorithmic performance claim; it is a reproducible conceptual framework for understanding, planning, and communicating DVF investigations.
The absence of authenticated bootstrapping between User Equipments (UEs) and Base Stations (BSs) in 5G leaves System Information Block (SIB) broadcasts unprotected, enabling fake BS attacks, man-in-the-middle interception, and spoofed emergency alerts. Prior efforts such as Public Key Infrastructure (PKI)-based certificate chains, token-based schemes, and identity-based signatures either impose overhead exceeding 5G's strict packet-size constraints or lack post-quantum (PQ) security. Direct NIST-PQC integration is infeasible: ML-DSA requires 34 fragmented SIB1 packets and up to 5,282 ms end-to-end delay, and FN-DSA still requires 13 fragments and up to 1,920 ms. We propose EMULSION, a symmetric chained publicly verifiable authentication framework for 5G/6G BS broadcast authentication. EMULSION is the first framework to exploit native 5G architectural features: fixed SIB transmission windows, millisecond-level time synchronization, and eSIM/USIM credential management to achieve genuine PQ security at symmetric-key efficiency. It uses a TESLA-style HMAC chain anchored by a compact PQ signature (MAYO) applied once per epoch, fitting authentication within a single packet with no fragmentation and eliminating certificate transmission entirely. Unlike all prior schemes, EMULSION protects the full SIB family (SIB1-SIB21). Evaluated on a real over-the-air 5G testbed, EMULSION achieves 33x lower end-to-end delay and 31x less communication overhead than ML-DSA, and 12x lower delay and 5.4x less overhead than FN-DSA. We formally prove the security of EMULSION and open-source its implementation for public testing and adaptation.
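The general idea of a TESLA-style HMAC chain, as mentioned in this abstract, can be sketched as a one-way key chain whose keys are disclosed after the messages they authenticate. This is a generic illustration, not EMULSION's construction: the use of SHA-256, the function names, and the interval indexing are assumptions, and the once-per-epoch PQ signature (MAYO) over the anchor as well as the timing check on key disclosure are deliberately omitted.

```python
import hashlib
import hmac
from typing import List


def build_chain(seed: bytes, length: int) -> List[bytes]:
    """Return keys [K_0, ..., K_length] with K_i = SHA256(K_{i+1}).

    K_0 is the public anchor; in a real scheme it would be signed once.
    """
    keys = [seed]
    for _ in range(length):
        keys.append(hashlib.sha256(keys[-1]).digest())
    keys.reverse()
    return keys


def tag(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 tag over a broadcast message under an interval key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_disclosed(anchor: bytes, key: bytes, interval: int,
                     message: bytes, mac: bytes) -> bool:
    """Check a later-disclosed key against the anchor, then check the tag.

    Hashing K_interval exactly `interval` times must yield the anchor K_0.
    """
    k = key
    for _ in range(interval):
        k = hashlib.sha256(k).digest()
    return hmac.compare_digest(k, anchor) and hmac.compare_digest(tag(key, message), mac)


# Hypothetical usage: the sender tags a message in interval 2 and discloses K_2 later.
keys = build_chain(b"example-seed", 4)
mac = tag(keys[2], b"SIB1 payload")
ok = verify_disclosed(keys[0], keys[2], 2, b"SIB1 payload", mac)
```

In a scheme of this shape the receiver must also confirm, using synchronized time, that the message arrived before its key was disclosed; that check is outside this sketch.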
Intrusion detection in Industrial Control Systems (ICS) is typically evaluated on a small set of public benchmarks using binary "normal" versus "attack" labels, a practice that can mask the behavioral diversity of cyber-physical attacks. To address this limitation, we propose a behavioral characterization framework that maps raw multivariate process traces into five interpretable physical primitives: drift, spike, oscillation, repetition, and switching. We apply the framework to three widely used ICS benchmarks, namely, SWaT, WADI, and HAI, and show that attack windows exhibit clear behavioral shifts relative to normal operation while the three datasets occupy largely distinct regions of the behavioral space, revealing both cross-dataset bias and intra-dataset diversity. In particular, WADI is dominated by repetition, HAI emphasizes sustained drift and oscillation, and SWaT is characterized by stealthier frozen-telemetry behavior. To examine the evaluation implications, we use an indicative Random Forest baseline and show that aggregate binary metrics can limit visibility into performance across different behavioral proxies. For example, in SWaT, macro F1 drops from 85.44% under binary evaluation to 37.84% under behavior-proxy multiclass prediction, with similar degradations observed on WADI and HAI. Based on these findings, we argue for complementing conventional binary benchmarking with behavior-stratified evaluation to expose blind spots that aggregate scores leave hidden and to better support targeted incident response.
Mitigating an observed adversary in an enterprise network typically takes weeks of expert work: an analyst derives a mitigation tailored to that adversary, validates it without breaking production, and verifies it disrupts the specific attack. The procedure relies on expert judgment and cannot safely be exercised against the production network. COHORT is the first end-to-end framework to automate this procedure for deployable mitigations. A role-decomposed multi-agent LLM workflow proposes candidates, implements them as real device commands, and refines them through a critique loop, all on a high-fidelity GNS3 emulator running real vendor firmware (firewall, switch, router). Each candidate is evaluated by offensive replay: re-executing the original adversary on the mitigated network for a paired comparison against the unmitigated baseline, rather than the reward-signal or expert-judgment proxies used in prior simulation, hybrid, and configuration-generation work. Two further checks complement replay: a connectivity-regression check (LAN ping and internet HTTP probe) rejects mitigations that disrupt legitimate LAN or internet connectivity, and a cumulative evaluation stacks approved mitigations onto a persistent state to surface compound effects. Across three topologies and four attack scenarios (ransomware, lateral movement, DNS exfiltration, data theft), 46.7% of generated mitigations both disrupt the attack and preserve connectivity under replay, 4.4 times the rate of a single-agent baseline using the same model and tool access. A demo video walking through the framework is available with our released artifacts.
The increasing connectivity of modern vehicles has made securing in-vehicle communication networks a critical challenge. Intrusion Detection Systems (IDS) have been widely studied as a defense mechanism for detecting malicious activities on the Controller Area Network (CAN) bus. However, the evaluation of CAN IDS methods remains difficult due to inconsistencies in experimental setups and the lack of standardized benchmarking frameworks. As a result, reported performance often depends on dataset-specific characteristics and may not reflect how detection methods behave in different environments. This work introduces a benchmarking framework for consistent evaluation of CAN IDSs across multiple datasets. Using the proposed framework, we integrate seven publicly available CAN IDS datasets collected under different experimental conditions and perform cross-dataset evaluation of five conceptually different IDS approaches. Our results highlight how detection performance can vary significantly across datasets, demonstrating the importance of cross-dataset benchmarking for assessing the robustness and generalization capabilities of CAN IDS methods.
With the increasing adoption of Machine Learning, protecting model ownership has become an essential challenge. We initiate a formal study of Proof of Ownership for machine learning models: under what conditions can one prove that a stolen model originated from a particular creator? We model proofs of ownership as a game among three parties: a model owner, a thief, and a judge. The owner transforms the original model into a slightly perturbed model together with a proof of ownership. The thief then obtains the transformed model and attempts to minimally modify it so that it remains useful but escapes detection as owned by the model owner. Finally, the judge receives a model and a proof of ownership, and must decide whether the given model is a modified version of some model created by the model owner, or else the given model was developed independently. Our main result is a dichotomy for classifiers in the black-box setting: Under standard cryptographic assumptions, ownership of models for some concept class can be proven in the above sense if and only if the concept class is not self-correctable, in a sense close to that of Blum, Luby and Rubinfeld, STOC'90. The result is constructive and extends, with some variations, to a number of related settings.
AI-powered Applications (AI-Apps), hosted on platforms such as Hugging Face, are democratizing access to pre-trained models through online inference and fine-tuning services. While lowering AI adoption barriers, these platforms introduce an unexplored attack surface, as AI-Apps are often developed by untrusted parties with weak isolation and misconfigured security settings. In this paper, we present the first systematic security analysis of AI-Apps across three leading platforms. To structure our investigation, we map the AI-App lifecycle to established risk taxonomies (e.g., OWASP), identifying five threat categories and ten attack vectors ranging from generic web flaws to high-impact architectural issues. Our analysis reveals critical failures including broken access control, insecure resource reuse, insufficient input validation, and sensitive data exposure. Notably, we uncover three novel architectural vulnerabilities inherent to platform design and demonstrate how traditional issues (e.g., world-readable logs) are uniquely amplified in this ecosystem. To assess real-world impact, we develop an analysis framework Insightor and apply it to over 970,000 public AI-Apps. Alarmingly, we find thousands of apps leaking credentials, hundreds containing input injection vulnerabilities that allow arbitrary code execution, and tens harboring embedded backdoors -- indicating active exploitation. We have responsibly disclosed all findings to the affected platforms and developers.
A central challenge in quantum algorithms and cryptography is reasoning about algorithms with oracle access to a random group element (e.g. a random function, permutation, or unitary). Can we efficiently simulate such algorithms? Can we determine what they know after t queries? A classical tool for this is lazy sampling: the oracle does not commit to the full group element upfront, but rather samples partial information about it on the fly. We study a quantum analog of lazy sampling: compressed oracles (or recording oracles). These are quantum data structures that allow on-the-fly simulation for quantum queries, originally introduced by Zhandry (CRYPTO '19) for random functions, and generalized to unitaries by Ma-Huang (STOC '25) and permutations by Carolan (STOC '26), and used to great effect in security proofs and lower bounds due to their interpretability. We define and analyze a general-purpose and interpretable path-recording oracle, derived from first principles, that perfectly simulates random elements of any closed subgroup of $U(N)$. Our oracle stores, in superposition, t input-output pairs, with updates described in terms of the commutant of the group's tensor power representation. This transparently records the information the algorithm has learned. Our oracle builds on recent work of Grinko-Yoshida (QIP '26), who gave a different general-purpose compressed oracle without clear interpretability. One interesting application of our path-recording is allowing direct comparisons between compressed oracles of different groups, giving a new technique for proving pseudorandomness results. For example, comparing $S_N$ and $U(N)$ yields what is arguably the simplest construction to date of pseudorandom unitaries: the product PC of a pseudorandom permutation and a random Clifford, improving on the prior PFC construction (Metger-Poremba-Sinha-Yuen, FOCS '24; Ma-Huang, STOC '25).
Existing defenses are effective when harmful content is explicitly mixed into downstream fine-tuning data, but crafted samples can instead hide harmful supervision inside benign tasks. We propose Embedded Attack, where harmful QA pairs are embedded within benign training samples, and show that representative guardrails often fail to detect them at the example level. To address this, we propose Dual-Reference SFT (DR-SFT), which adapts DPO-style contrastive objective design to SFT through token-level regularization, mitigating harmful fine-tuning beyond coarse data filtering.
The problem of storing secure information on a network is studied. A formal framework for distributed secret storage is introduced, and possible applications in technological and social systems are discussed. The problem is formulated as the optimization of a robustness functional in which two competing requirements are balanced: survivability under network-degrading processes and resistance to adversarial compromise. An exact representation of survivability is derived in terms of minimal information-carrying subgraphs (MICS), which provide a reduced description of the reconstruction events relevant to the stored information. This representation is then used to construct semi-local optimization methods whose dynamics do not require global knowledge of the network structure. Finally, it is shown that, in a limiting case, the robustness functional can be mapped naturally to an effective spin Hamiltonian.
Traffic sign classification is a crucial task for autonomous vehicles, and numerous attacks against it have been identified. A majority of physical adversarial attacks involve attaching patches to traffic signs or projecting perturbations on them. While they demonstrate high effectiveness, they are perceptible to humans. At the same time, light-based attacks outside the human visible spectrum are known but have limitations in their dynamic adaptability. We propose a persistence-of-vision-based attack that operates in the near-infrared light spectrum. With the possibility of showing dynamic, remotely triggered content, this allows a stealthy physical adversarial attack against traffic sign classification. By identifying the optimal position through digital simulation, we conduct extensive real-world evaluations using two different traffic signs, 12 machine learning models from different families, multiple distances up to 20 meters, and varying illumination conditions. Our evaluation shows high attack success rates across our test scenarios. We propose near-infrared cutoff filters and a software-based detection mechanism as defenses, and tackle limitations of the near-infrared persistence of vision display by prototyping a human-visible RGB version of it.