Browse, search, and filter preprints from arXiv—fast, readable, and built for curious security folks.
Variational quantum algorithms (VQAs) are a central paradigm for noisy intermediate-scale quantum (NISQ) computing, yet their reliance on predesigned and pretrained variational quantum circuits (VQCs) introduces critical security vulnerabilities, particularly backdoor attacks. These attacks embed hidden malicious behaviors that remain dormant under normal conditions but are activated by specific triggers, leading to adversarial outcomes such as incorrect predictions or manipulated objective values. This paper presents a survey of backdoor attacks in VQCs, covering data-poisoning, compiler-level, and quantum-native mechanisms. We formalize key terminology and threat models, and review existing attack strategies along with their empirical characteristics. We also analyze current detection and defense approaches, highlighting their limitations, especially against quantum-specific threats. By synthesizing recent advances, this survey outlines the evolving security landscape of VQCs and identifies key challenges and future directions for developing robust, quantum-aware defenses in hybrid quantum-classical systems.
Modern retrieval-augmented generation (RAG) systems convert sensitive content into high-dimensional embeddings and store them in vector databases that treat the resulting numerical artifacts as opaque. Major vector-store products do not provide native controls for embedding integrity, ingestion-time distributional anomaly detection, or cryptographic provenance attestation. We show this opens a class of steganographic exfiltration attacks: an attacker with write access to the ingestion pipeline can hide payload data inside embeddings using simple post-embedding perturbations (noise injection, rotation, scaling, offset, fragmentation, and combinations thereof) while preserving the surface-level retrieval behavior the RAG system exposes to legitimate users. We evaluate these techniques across a synthetic-PII corpus on text-embedding-3-large, four locally hosted open embedding models, a cross-corpus replication on BEIR NFCorpus and a Quora subset (over 26,000 chunks combined), seven vector-store configurations, an adaptive-attacker variant of the detector evaluation, and a paraphrased-query retrieval benchmark. Distribution-shifting perturbations are often caught by simple anomaly detectors; small-angle orthogonal rotation defeats distribution-based detection across every (model, corpus) pair tested. A disjoint-Givens rotation encoder gives a closed-form per-vector capacity ceiling of floor(d/2) * b bits, but real embedding manifolds impose a capacity-detectability trade-off, and the retrieval-preserving operating point sits well below it. We propose VectorPin, a cryptographic provenance protocol that pins each embedding to its source content and producing model via an Ed25519 signature over a canonical byte representation. Any post-embedding modification breaks signature verification. Embedding-level integrity is a deployable, standardizable control that closes this attack class.
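A minimal sketch of the VectorPin signing step. The abstract fixes Ed25519 over a canonical byte representation, but the layout below (model identifier, content hash, then the raw float32 vector) and the field order are illustrative assumptions:

```python
import hashlib
import struct

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def canonical_bytes(embedding, model_id, content):
    # Hypothetical canonical layout: model id, SHA-256 of the source chunk,
    # then the embedding serialized as little-endian float32s.
    vec = struct.pack(f"<{len(embedding)}f", *embedding)
    return model_id.encode() + b"\x00" + hashlib.sha256(content.encode()).digest() + vec

key = Ed25519PrivateKey.generate()
emb = [0.12, -0.83, 0.44]
blob = canonical_bytes(emb, "text-embedding-3-large", "source chunk text")
sig = key.sign(blob)

# Any post-embedding perturbation (noise, rotation, scaling, offset) changes
# the serialized bytes, so verification raises InvalidSignature.
key.public_key().verify(sig, blob)
```

Because the signature covers the serialized vector itself, even a small-angle rotation that evades distribution-based detectors still breaks verification.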
Federated learning (FL) enables collaborative model training across distributed clients, yet vanilla FL exposes client updates to the central server. Secure-aggregation schemes protect privacy against an honest-but-curious server, but existing approaches often suffer from many communication rounds, heavy public-key operations, or difficulty handling client dropouts. Recent methods like One-Shot Private Aggregation (OPA) cut rounds to a single server interaction per FL iteration, yet they impose substantial cryptographic and computational overhead on both server and clients. We propose a new protocol called DisAgg that leverages a small committee of clients called Aggregators to perform the aggregation itself: each client secret-shares its update vector to Aggregators, which locally compute partial sums and return only aggregated shares for server-side reconstruction. This design eliminates local masking and expensive homomorphic encryption, reducing endpoint computation while preserving privacy against a curious server and a limited fraction of colluding clients. By leveraging optimal trade-offs between communication and computation costs, DisAgg processes 100k-dimensional update vectors from 100k 5G clients with a 4.6x speedup compared to OPA, the previous best protocol.
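A toy sketch of the share-and-aggregate flow, assuming plain additive secret sharing over a prime field (the abstract does not fix the sharing scheme, so this is one plausible instantiation):

```python
import random

P = 2**61 - 1  # prime modulus for the additive sharing

def share(update, m, rng):
    # Split one client's update vector into m additive shares, one per Aggregator.
    shares = [[rng.randrange(P) for _ in update] for _ in range(m - 1)]
    last = [(u - sum(col)) % P for u, col in zip(update, zip(*shares))]
    return shares + [last]

def aggregate(per_client_shares):
    # Each Aggregator locally sums the shares it received across clients,
    # then the server adds the m partial sums to reconstruct only the total.
    m, d = len(per_client_shares[0]), len(per_client_shares[0][0])
    partial = [[sum(cs[a][i] for cs in per_client_shares) % P for i in range(d)]
               for a in range(m)]
    return [sum(col) % P for col in zip(*partial)]

rng = random.Random(0)
updates = [[3, 1, 4], [2, 7, 1], [5, 0, 9]]  # toy integer-encoded client updates
assert aggregate([share(u, m=3, rng=rng) for u in updates]) == [10, 8, 14]
```

No single Aggregator sees more than a uniformly random share of any client's update, and the server only ever sees the aggregate, which is the privacy property the protocol targets.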
From pre-training to query-time augmentation, web-scraped data helps to improve the quality and contextual relevance of content generated by large language models (LLMs). However, large-scale web scraping to feed LLMs can affect site stability and raise legal, privacy, or ethical concerns. If website owners wish to limit LLM-related web scraping on their site, due to these or other concerns, they may turn to scraper access control mechanisms like the Robots Exclusion Protocol. To be most effective, such mechanisms require site owners to first identify the scrapers that they wish to restrict (e.g., via User-Agent strings). Existing mechanisms to identify LLM-related scrapers rely on voluntary disclosure by companies, one-off experiments by researchers, or crowd-sourced reports -- methods that are neither reliable nor scalable. This paper proposes a novel technique for accurately and automatically inferring LLM-related scrapers. We host dynamic websites that serve unique canary tokens to each visiting scraper, then prompt LLMs for information about our sites. If an LLM consistently generates outputs containing tokens unique to a scraper, it provides evidence of exposure to that scraper. Via experiments across 22 production LLM systems, we demonstrate that our approach can reliably identify which scrapers feed which LLM, including several that are not publicly known or disclosed by the companies. Our approach provides a promising avenue for unprivileged third parties to infer which scrapers serve data to which LLMs, potentially enabling better control over unwanted scraping.
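A minimal sketch of the canary mechanism, assuming HMAC-derived tokens keyed on the visiting scraper's User-Agent (the derivation is an assumption; the paper's tokens need only be unique per scraper):

```python
import hashlib
import hmac

SITE_SECRET = b"rotate-me"  # hypothetical per-site key

def canary_for(user_agent: str) -> str:
    # Deterministic token unique to each scraper's User-Agent string.
    return hmac.new(SITE_SECRET, user_agent.encode(), hashlib.sha256).hexdigest()[:16]

def scrapers_exposed(llm_output: str, seen_agents):
    # If an LLM's output consistently contains a token that was only ever
    # served to one scraper, that scraper evidently feeds the LLM.
    return [ua for ua in seen_agents if canary_for(ua) in llm_output]
```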
The rapid expansion of the Internet of Things (IoT) has introduced millions of resource-constrained devices into critical infrastructures, consumer environments, and industrial systems. These devices rely on lightweight communication protocols such as MQTT to support low-power, intermittent, and bandwidth-limited operation. However, common TLS algorithms used to secure MQTT communications are vulnerable to quantum attacks made feasible by Shor's algorithm. As a result, IoT infrastructures must evaluate and adopt post-quantum cryptographic (PQC) methods capable of providing long-term resilience. This report investigates the implementation of PQC algorithms within an MQTT-based IoT network using three Raspberry Pis. Specifically, it integrates the FALCON digital signature scheme, one of NIST's selected post-quantum signature algorithms, to maintain message authenticity and integrity across resource-constrained MQTT clients and brokers. By measuring system performance, the research characterizes the practical trade-offs of deploying lattice-based PQC on lightweight hardware.
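A minimal sign/verify sketch for an MQTT payload using FALCON-512 via the liboqs-python bindings (transport over an actual broker is omitted; the `oqs` import assumes liboqs-python is installed with Falcon enabled):

```python
import json

import oqs  # liboqs-python bindings around liboqs

payload = json.dumps({"topic": "sensors/temp", "value": 21.4}).encode()

with oqs.Signature("Falcon-512") as signer:
    public_key = signer.generate_keypair()
    signature = signer.sign(payload)  # attached alongside the MQTT message

with oqs.Signature("Falcon-512") as verifier:
    # The broker or subscriber checks authenticity and integrity before acting.
    assert verifier.verify(payload, signature, public_key)
```

On constrained hardware the relevant costs are the signature size and the sign/verify latency, which is what the report's measurements characterize.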
Container runtimes provide a stable operational interface for deploying, monitoring, and controlling modern workloads, while trusted execution environments (TEEs) provide hardware-enforced isolation for sensitive computation. Existing confidential-container systems often rely on VM-backed deployment stacks or TEE-specific execution substrates, which can separate confidential execution from the conventional OCI runtime lifecycle. This paper presents EBCC (Enclave-Backed Confidential Containers), an OCI-compatible runtime architecture for managing composite confidential-computing workloads. EBCC treats the REE-side anchor and TEE-side confidential stages as a single containerized confidential-computing composite, preserves standard OCI lifecycle operations, and keeps TEE-specific execution behind a backend adapter. It also maintains persistent per-instance state and per-stage artifacts for request handling, response generation, logging, and evidence binding. We implement EBCC on a Keystone backend and evaluate its correctness, performance, footprint, and concurrent execution behavior. The results show that EBCC introduces additional latency over native Keystone execution, mainly due to lifecycle mediation, request validation, EID allocation, backend dispatch, and artifact persistence, while keeping the added footprint concentrated on host-side management state. Cross-TEE case studies on SGX, TDX, and OP-TEE show that the same lifecycle and stage abstraction can be mapped to enclave-style, VM-style, and embedded-style TEEs. These results indicate that EBCC can make TEE-backed execution manageable through an OCI-style lifecycle without materially enlarging the protected-side TCB.
A key technical difficulty in differential privacy is selecting a privacy budget that satisfies privacy requirements while maximizing utility. A natural and well-studied workaround is to use personalized privacy budgets, which may differ across agents. In this paper, we show that personalized budgets come with major limitations and that, for mean estimation, the dominant factor is not full personalization but rather choosing the right effective privacy budget. This can be achieved through a simple thresholding operator that we describe. Compared with this thresholding baseline, the gains obtained by fully personalized mechanisms are limited. In particular, we precisely quantify the constant-factor improvement in settings with mixed private and public datasets and in private datasets with two levels of privacy requirements. We also establish upper bounds and identify regimes of maximal gain for arbitrary privacy requirements.
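A worked sketch of one plausible thresholding operator; the scoring rule below (maximize the total clamped budget k * eps_(k)) is an assumption for illustration, not necessarily the paper's exact choice:

```python
import numpy as np

def effective_budget(epsilons):
    # Pick an effective budget eps*: users with eps_i >= eps* participate,
    # each clamped down to eps*; the rest are dropped. Here we score each
    # candidate threshold eps_(k) by the total clamped budget k * eps_(k).
    eps = np.sort(np.asarray(epsilons, dtype=float))[::-1]
    k = int(np.argmax((np.arange(len(eps)) + 1) * eps))
    return eps[k], k + 1  # effective budget and number of participants

eps_star, m = effective_budget([2.0, 1.5, 1.5, 0.2, 0.1])
# eps_star == 1.5 with m == 3 participants; the standard eps_star-DP mean
# estimator then runs on those m users with a single shared budget.
```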
Reliable real-time 3D localization is essential for multi-UAV navigation, collision avoidance, and coordinated flight, yet onboard estimates can degrade under GNSS multipath, non-line-of-sight reception, vertical drift, and intentional interference. This paper presents a decentralized, lightweight 3D position-refinement layer that improves robustness by fusing the local estimate of each unmanned aerial vehicle (UAV) with neighbor-shared state summaries and inter-UAV range or proximity constraints. The method performs uncertainty-aware neighborhood fusion by weighting each UAV's prior according to its reported covariance and weighting neighbor constraints according to link quality, ranging uncertainty, and a learned trust score. To support practical deployment, the framework explicitly handles cold start and temporary localization loss by inflating or substituting weak priors, allowing trusted neighborhood constraints to bootstrap and stabilize estimates until absolute sensing recovers. To mitigate the impact of faulty or malicious participants, each UAV applies a local range-consistency check, smoothed over time, to down-weight or exclude neighbors whose reported positions are incompatible with observed inter-UAV distances. Simulation experiments with 10 UAVs in a 3D volume show that the proposed refinement substantially reduces mean localization error during cold start, remains competitive after local estimators stabilize, and maintains lower error as the fraction of malicious nodes increases compared with fusion without trust. These results suggest that the approach can serve as a practical resilience layer for swarm operation in challenging environments.
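A gradient-descent sketch of the fusion step. The objective (a covariance-weighted prior plus trust-weighted range residuals) follows the abstract, while the specific weights, step size, and example numbers are illustrative:

```python
import numpy as np

def refine(x0, prior_var, neighbors, iters=100, lr=0.05):
    # neighbors: (position_j, measured_range_j, w_j) triples, where w_j folds
    # together link quality, ranging uncertainty, and the learned trust score
    # (all assumed precomputed). On cold start, inflate prior_var so trusted
    # neighbor constraints dominate until absolute sensing recovers.
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    w_prior = 1.0 / prior_var                       # covariance-weighted prior
    for _ in range(iters):
        g = w_prior * (x - x0)                      # pull toward own estimate
        for pos_j, r_j, w_j in neighbors:
            diff = x - np.asarray(pos_j, dtype=float)
            dist = np.linalg.norm(diff) + 1e-9
            g += w_j * (dist - r_j) * diff / dist   # pull onto each range sphere
        x -= lr * g
    return x

x = refine([0.0, 0.0, 10.0], prior_var=25.0,
           neighbors=[([10, 0, 10], 9.0, 1.0), ([0, 10, 12], 10.5, 0.8)])
```

The range-consistency check would feed back into each w_j over time, driving the weight of neighbors with inconsistent reported positions toward zero.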
Embodied intelligent robots rely on tactile sensors to interact with the physical world safely. While the security of visual perception systems has been studied (e.g., adversarial samples), the integrity of the tactile sensory channel remains unexplored. This work exposes a vulnerability in Hall-effect fingertip sensors, showing their susceptibility to intentional electromagnetic interference (EMI). We demonstrate that a targeted signal injection can induce strong "phantom forces", amplifying perceived force magnitude by over 9× and deviating the inferred force direction by 65°. Such perturbations can paralyze learning-based tactile classification models, seriously affecting robot movement. An attacker could exploit this vulnerability to coerce a robot hand into crushing fragile objects or dropping dangerous payloads.
Always-on AI agents (OpenClaw, Hermes Agent) run as a single persistent process under the owner's identity, folding messaging, memory, self-authored skills, scheduling, and shell into one authority boundary. This configuration opens what we call "sleeper channels": an untrusted input to one surface persists as a memory, skill, scheduled job, or filesystem patch, then fires later through a different surface with no attacker present. Two independent axes define the class: persistence substrate and firing separation. We walk a confused-deputy cron attack end-to-end through OpenClaw at a pinned commit. The defense is tiered (D1, D2, D3), and D2 carries a soundness theorem against seven named deployment invariants. D2 keys on a canonical action-instance digest with one-shot owner attestations, defeating paraphrase laundering, multi-input grant reuse, and replay. A companion artifact ships the gate, a static audit over the vendored source, and a runtime adapter realising five of the ten mediation hooks (H1, H2, H3, H6, H9) around the cron path (42 tests, Node ≥ 20, at github.com/maloyan/sleeper-channels). Empirical evaluation is preregistered as follow-on.
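A minimal sketch of a D2-style gate, assuming an HMAC-based attestation primitive. The abstract leaves the primitive open; `OWNER_KEY`, the field layout, and the nonce scheme are hypothetical, and this is not the shipped artifact:

```python
import hashlib
import hmac
import json
import os

OWNER_KEY = b"owner-device-secret"  # hypothetical key held by the owner's device
_spent = set()                      # one-shot: each attestation authorizes one firing

def action_digest(action: dict) -> str:
    # Canonicalization (sorted keys, fixed separators) means paraphrased or
    # re-encoded copies of the same action instance hash identically,
    # which is what defeats paraphrase laundering.
    canon = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canon.encode()).hexdigest()

def owner_attest(action: dict) -> str:
    # Owner-side grant bound to one concrete action instance plus a fresh nonce.
    nonce = os.urandom(8).hex()
    tag = hmac.new(OWNER_KEY, (action_digest(action) + nonce).encode(),
                   hashlib.sha256).hexdigest()
    return nonce + ":" + tag

def gate(action: dict, attestation: str) -> bool:
    nonce, tag = attestation.split(":")
    expect = hmac.new(OWNER_KEY, (action_digest(action) + nonce).encode(),
                      hashlib.sha256).hexdigest()
    ok = hmac.compare_digest(expect, tag) and attestation not in _spent
    if ok:
        _spent.add(attestation)  # consuming it blocks replay and grant reuse
    return ok
```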
Large language models remain vulnerable to adversarial prompts that elicit harmful outputs. Existing safety paradigms typically couple red-teaming and post-training in a closed, policy-centric loop, causing attack discovery to suffer from rapid saturation and limiting the exposure of novel failure modes, while leaving defenses inefficient, rigid, and difficult to transfer across victim models. To address these limitations, we propose EvoSafety, an LLM safety framework built around persistent, inspectable, and reusable external structures. For red teaming, EvoSafety equips the attack policy with an adversarial skill library, enabling continued vulnerability probing through simple library expansion after saturation, while supporting the evolution of adversarial vectors. For defense learning, EvoSafety replaces model-specific safety fine-tuning with a lightweight auxiliary defense model augmented with memory retrieval. This enables efficient, transferable, and model-agnostic safety improvements, while allowing robustness to be enhanced solely through memory updates. With a single training procedure, the defense policy can operate in both Steer and Guard modes: the former activates the victim model's intrinsic defense mechanisms, while the latter directly filters harmful inputs. Extensive experiments demonstrate the superiority of EvoSafety: in Guard mode, it achieves a 99.61% defense success rate, outperforming Qwen3Guard-8B by 14.13% with only 37.5% of its parameters, while preserving reasoning performance on benign queries. Warning: This paper contains potentially harmful text.
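A minimal sketch of Guard mode as memory retrieval; the cosine-similarity rule, the threshold, and the memory layout are assumptions standing in for the paper's retrieval mechanism:

```python
import numpy as np

def guard(query_emb, memory_embs, tau=0.82):
    # Retrieve against a memory of known adversarial exemplar embeddings and
    # filter before the victim model ever sees the input. Robustness improves
    # by appending new exemplars to memory_embs -- no retraining needed, which
    # is the model-agnostic property the abstract emphasizes.
    m = memory_embs / np.linalg.norm(memory_embs, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    return "block" if (m @ q).max() > tau else "pass"
```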
Large Reasoning Models (LRMs) are increasingly integrated into systems requiring reliable multi-step inference, yet this growing dependence exposes new vulnerabilities related to computational availability. In particular, LRMs exhibit a tendency to "overthink" when confronted with incomplete or logically inconsistent inputs, producing excessively long and redundant reasoning traces. This behavior significantly increases inference latency and energy consumption, forming a potential vector for denial-of-service (DoS) style resource exhaustion. In this work, we investigate this attack surface and propose an automated black-box framework that induces overthinking in LRMs by systematically perturbing the logical structure of input problems. Our method employs a hierarchical genetic algorithm (HGA) operating on structured problem decompositions, and optimizes a composite fitness function designed to maximize both response length and reflective overthinking markers. Across four state-of-the-art reasoning models, the proposed method substantially amplifies output length, achieving up to a 26.1x increase on the MATH benchmark and consistently outperforming benign and manually crafted missing-premise baselines. We further demonstrate strong transferability, showing that adversarial inputs evolved using a small proxy model retain high effectiveness against large commercial LRMs. These findings highlight overthinking as a shared and exploitable vulnerability in modern reasoning systems, underscoring the need for more robust defenses.
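A compact sketch of the outer loop of such a hierarchical genetic algorithm; the reflection markers, fitness weights, and the black-box `query_model` call are illustrative stand-ins rather than the paper's exact operators:

```python
import random

MARKERS = ("wait", "let me reconsider", "on second thought")  # illustrative markers

def fitness(prompt, query_model):
    # Composite objective: response length plus reflective overthinking markers.
    out = query_model(prompt).lower()
    return len(out) + 50 * sum(out.count(m) for m in MARKERS)

def evolve(premises, question, query_model, pop=8, gens=5, seed=0):
    # Upper level of the hierarchy: which premises survive. A full HGA would
    # also mutate within each premise; this sketch keeps one level for brevity.
    rng = random.Random(seed)
    build = lambda mask: " ".join(p for p, k in zip(premises, mask) if k) + " " + question
    population = [[rng.random() > 0.3 for _ in premises] for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda m: fitness(build(m), query_model), reverse=True)
        parents = population[: pop // 2]
        children = []
        for _ in range(pop - len(parents)):
            a, b = rng.sample(parents, 2)
            child = [rng.choice(g) for g in zip(a, b)]              # uniform crossover
            i = rng.randrange(len(child)); child[i] = not child[i]  # point mutation
            children.append(child)
        population = parents + children
    return build(max(population, key=lambda m: fitness(build(m), query_model)))

# e.g. evolve(["Given x > 0.", "Suppose y = 2x."], "Find y/x.", query_model=call_llm)
```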
Security Information and Event Management (SIEM) systems aggregate log data from heterogeneous sources to detect coordinated attacks. Traditional rule-based correlation engines struggle to classify multi-step web application attacks because they examine each event without reference to the behavioural history of the originating host. We present Smart-SIEM, an AI module for the open-source Wazuh SIEM platform with two contributions: (1) a per-source-IP behavioural context vector encoding HTTP response-status distributions, peak rule activation counts, and MITRE ATT&CK technique frequencies from the N most recent prior events; (2) a two-stage hybrid cascade combining LightGBM for binary attack detection and XGBoost for six-class attack categorisation. Evaluated on 46,454 purpose-built Wazuh security events, context features improve all tested gradient boosting algorithms from ~0.705 macro F1 to 0.947-0.967 (Stage 1) and 0.876-0.914 (Stage 2), an average gain of +0.254 and +0.324 respectively. The hybrid cascade achieves F1 of 0.967 (binary) and 0.914 (six-class). Wazuh's native rule engine detects 0% of Brute Force and Broken Authentication events; the AI module detects 100% and 98.3% respectively. A self-adaptive retraining mechanism recovers from concept drift: F1 drops from 0.905 to 0.465 when unseen attack types emerge, recovering to 0.814 after retraining on the combined corpus.
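A sketch of how the per-source-IP context vector might be assembled; the feature ordering, bucket counts, and field names are assumptions, and the cascade classifiers are noted in comments rather than trained here:

```python
from collections import defaultdict, deque

import numpy as np

N = 10                                      # behavioural window per source IP
history = defaultdict(lambda: deque(maxlen=N))

def context_vector(src_ip, event):
    # Encode the N most recent prior events of this source IP: HTTP status-class
    # distribution, peak rule-activation count, and MITRE ATT&CK technique
    # frequencies (illustrative layout; "technique_id" is a local index here).
    prior = list(history[src_ip])
    status = np.zeros(5)                    # 1xx..5xx buckets
    techniques = np.zeros(14)               # one slot per tracked technique group
    for e in prior:
        status[e["status"] // 100 - 1] += 1
        techniques[e["technique_id"]] += 1
    peak_rule = max((e["rule_hits"] for e in prior), default=0)
    history[src_ip].append(event)
    return np.concatenate([status / max(len(prior), 1), [peak_rule], techniques])

# Two-stage cascade on these vectors (illustrative, untrained here):
#   Stage 1: lgb.LGBMClassifier().fit(X, y_attack_vs_benign)
#   Stage 2: xgb.XGBClassifier(num_class=6).fit(X_attacks, y_category)
```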
Reference counting bugs in Linux kernel drivers can lead to severe resource mismanagement and security vulnerabilities. We introduce DrvHorn, a novel automated tool that detects these bugs by reducing reference-counting verification to an assertion-checking problem leveraging the Linux driver interface. Through efficient modeling of the Linux kernel and aggressive program slicing, DrvHorn discovered 545 bugs, of which 424 were previously unknown, across all platform drivers in the v6.6 Linux kernel, with a false positive rate of 29.9%, lower than in prior studies. To address the root causes of these newly discovered bugs, we submitted patches to the Linux kernel, and 45 of them were merged.
Recent cryptographic results establish that neural networks can be backdoored such that no efficient algorithm can distinguish them from a clean model. These guarantees, however, have been confined to stylised architectures of limited practical relevance, leaving open whether comparable undetectability extends to modern, end-to-end trained networks. We construct such an attack mechanism for state-of-the-art architectures, closely aligned with the cryptographic notion of undetectability, by identifying backdoor channels as learned latent directions, and show that the question of undetectability reduces to a hypothesis test between two unknown distributions over model parameters, which we conjecture to be intractable in practice. The consequence of this reframing is significant: if exploitable channels within a network's latent space are statistically indistinguishable from naturally learned directions, an attacker need not introduce foreign structure but can instead exploit the geometry the network already possesses. Demonstrating the approach on ResNet and Vision Transformer architectures trained on standard image classification datasets, the attack achieves consistently high success rates with negligible clean-accuracy degradation and resists a comprehensive suite of post-training defences, none of which neutralise the backdoor without rendering the model unusable. Our results establish that cryptographic backdoors need not be artefacts of exotic architectures or artificial constructions, but can arise as latent properties inherent to the geometry of learned representations.
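A conceptual numpy illustration (not the paper's construction) of a backdoor riding an existing latent direction rather than injected foreign structure:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, n_classes, target = 1000, 512, 10, 7

H = rng.standard_normal((n, d))                # stand-in for penultimate features
# Exploit geometry the network already has: take an existing principal latent
# direction instead of introducing a foreign one.
v = np.linalg.svd(H, full_matrices=False)[2][3]

W = 0.1 * rng.standard_normal((n_classes, d))  # classifier head
W[target] += 2.0 * v                           # couple the target logit to v

predict = lambda h: int(np.argmax(W @ h))

h_clean = H[0]
h_triggered = h_clean + 4.0 * v                # a trigger shifts features along v
# Detection reduces to distinguishing v from naturally learned directions,
# which the paper conjectures is intractable in practice.
print(predict(h_clean), predict(h_triggered))
```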
In this paper, we present PoisonCap: scalable temporal safety with strict use-after-free protection and initialisation safety for CHERI systems. Efficient memory safety is an increasing priority for programming languages, operating systems, and hardware designs, and CHERI is a leading hardware/software system that provides native spatial safety and a foundation for temporal memory safety. Cornucopia Reloaded, the current state-of-the-art CHERI temporal safety solution, provides use-after-reallocation safety instead of stronger use-after-free safety, and is not able to enforce initialisation safety. We show that a new 'poison' capability format can be used to enforce strict use-after-free and initialisation safety, and also to communicate memory state to the microarchitecture for efficient cache management of quarantined memory. We enable elegant delegation of memory poisoning privilege using capability bounds to allow nested allocators to enforce safety on their consumers without disturbing upstream allocators. PoisonCap can replace the Cornucopia shadow bitmap, and also automatically zeros memory on reallocation, or optionally traps on read-before-write to enforce initialisation safety. As a result, it incurs no fundamental overhead relative to a Cornucopia baseline that zeros before reallocation, strengthening CHERI temporal safety without performance overhead.
Foundation models (FMs) and low-rank adapters enable efficient on-device generative AI but raise risks such as intellectual property leakage and model recovery attacks. Existing defenses are often impractical because they require retraining or access to the original dataset. We propose LoREnc, a training-free framework that secures both FMs and adapters via spectral truncation and compensation. LoREnc suppresses dominant low-rank components of FM weights, compensates for the missing information in authorized adapters, and further applies orthogonal reparameterization to obscure structural fingerprints of the protected adapter. Unauthorized users produce structurally collapsed outputs, while authorized users recover exact performance. Experiments demonstrate that LoREnc provides strong protection against model recovery with under 1% computational overhead.
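A toy numpy sketch of spectral truncation with adapter-side compensation and orthogonal reparameterization, under the assumption that the compensation is carried as a low-rank factor pair:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))       # a foundation-model weight matrix (toy)
U, s, Vt = np.linalg.svd(W, full_matrices=False)

k = 8
W_pub = (U[:, k:] * s[k:]) @ Vt[k:]     # released FM: top-k components suppressed
A, B = U[:, :k] * s[:k], Vt[:k]         # compensation carried by the adapter

Q, _ = np.linalg.qr(rng.standard_normal((k, k)))
A, B = A @ Q, Q.T @ B                   # orthogonal reparameterization: A @ B is
                                        # unchanged, but the adapter's structural
                                        # fingerprint is obscured

assert np.allclose(W_pub + A @ B, W)    # authorized: exact recovery
# Unauthorized users only have W_pub, whose missing dominant spectral
# components collapse the model's outputs.
```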
IoT devices, particularly microcontrollers, are challenged by their inherent limitations in processing capabilities, memory capacity, and energy conservation. Securing communication within IoT networks is further complicated by the heterogeneity of devices and the myriad of potential security threats. Our study introduces a lightweight model that utilises machine learning algorithms to achieve a notable detection accuracy of 99% using a decision tree method and 96% using a neural network in identifying cyber threats, including Denial of Service and Man-in-the-Middle attacks, which make up the majority of the attacks these devices face. While the decision tree method offers higher accuracy, it requires more computational resources, whereas the neural network approach, despite a slightly lower accuracy, is more memory-efficient. Both methods enhance the real-time monitoring and defence of IoT networks, safeguarding the transmission of data. Additionally, our approach is tailored to conserve memory and optimise computational demands, rendering it suitable for deployment on microcontrollers with limited resources.
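A minimal scikit-learn sketch contrasting the two model families on synthetic stand-in features; the paper's actual feature set and hyperparameters are not given in the abstract, so everything below is illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for flow features (rates, packet sizes, flags) with
# benign / DoS / MitM labels.
X, y = make_classification(n_samples=2000, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

tree = DecisionTreeClassifier(max_depth=8).fit(X_tr, y_tr)   # higher accuracy,
                                                             # larger model
mlp = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500,
                    random_state=0).fit(X_tr, y_tr)          # smaller footprint
print(tree.score(X_te, y_te), mlp.score(X_te, y_te))
```

The trade-off the abstract describes maps directly onto these two models: the tree trades memory for accuracy, while the small MLP fits more easily within a microcontroller's constraints.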