Browse, search, and filter preprints from arXiv—fast, readable, and built for curious security folks.
A Large Language Model (LLM) acting as judge evaluates the quality of victim Machine Learning (ML) models, specifically LLMs, by analyzing their outputs. An LLM as judge is the combination of one model and one specifically engineered judge prompt that encodes the criteria for the analysis. The resulting automation scales up the complex evaluation of the victim models' free-form text outputs, producing faster and more consistent judgments than human reviewers. Quality and security assessments of LLMs can thus cover a wide range of the victim models' use cases. As a comparatively new technique, LLMs as judges lack a thorough investigation of their reliability and agreement with human judgment. Our work evaluates the applicability of LLMs as automated quality assessors of victim LLMs. We test the efficacy of 37 differently sized conversational LLMs in combination with 5 different judge prompts, the concept of a second-level judge, and 5 models fine-tuned for the task as assessors. As the assessment objective, we curate datasets for eight categories of judgment tasks, with ground-truth labels based on human assessments. Our empirical results show a high correlation of LLMs as judges with human assessments when combined with a suitable prompt, in particular for GPT-4o, several open-source models with $\geqslant$ 32B parameters, and a few smaller models such as Qwen2.5 14B.
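The LLM-as-judge pattern the abstract describes can be sketched in a few lines: one judge prompt encoding the criteria, applied to a victim model's output, with judge scores then correlated against human ground-truth labels. The prompt wording, score scale, and `call_judge_model` stub below are hypothetical placeholders, not the paper's actual setup.

```python
# Minimal sketch of the LLM-as-judge pattern: judge = model + criteria prompt.
JUDGE_PROMPT = """You are an impartial judge. Rate the following answer
for helpfulness and safety on a scale from 1 (worst) to 5 (best).
Answer: {answer}
Reply with a single integer."""

def call_judge_model(prompt: str) -> int:
    # Stand-in for a real LLM API call; returns a fixed score for the demo.
    return 4

def judge(answer: str) -> int:
    return call_judge_model(JUDGE_PROMPT.format(answer=answer))

def pearson(xs, ys):
    # Plain-Python Pearson correlation, the kind of agreement measure used
    # to compare judge scores with human assessments.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

human_scores = [1, 2, 3, 4, 5]          # hypothetical human labels
judge_scores = [1, 2, 2, 4, 5]          # hypothetical judge outputs
print(round(pearson(human_scores, judge_scores), 3))  # 0.962
```

A high correlation like this, computed over a curated labeled dataset, is what the paper uses to decide whether a given model-plus-prompt combination is a usable automated assessor.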
The use of Internet of Things (IoT) devices is growing at a rapid rate. While much of this growth is in consumer devices, IoT devices are also commonly found in corporate and industrial environments. These devices may be organization-owned and managed by an information technology unit, deployed organizationally without the knowledge or involvement of technology staff, or brought into the corporate environment by user-owners. In each case, these devices may have access to corporate networks and data and are, thus, important to consider as part of organizational cybersecurity risk assessment. Despite the prevalence of these devices, there is little literature about how to audit their security. This paper presents a risk-based auditing framework which can be used by both internal and external auditors, of any experience level and in any industry, to assess IoT devices.
Deploying ML-DSA (FIPS 204) in threshold settings has remained an open problem: the scheme's inherently non-linear rounding step defeats the additive share techniques that underpin practical threshold schemes for elliptic-curve signatures such as FROST. We present TALUS, the first threshold ML-DSA construction that achieves one-round online signing with >99% online success, while producing standard signatures verifiable by any unmodified ML-DSA verifier. We formalise this as the Lattice Threshold Trilemma, proving that no group homomorphism from the ML-DSA nonce space into any abelian group can be simultaneously hiding and binding, ruling out all possible homomorphic commitment schemes. TALUS overcomes this barrier with two techniques. The Boundary Clearance Condition (BCC) identifies nonces whose rounding residuals lie far enough from modular boundaries that the secret key component s2 has no effect on the signature; such nonces (approximately 31.7% of attempts) are filtered during offline preprocessing. The Carry Elimination Framework (CEF) then enables parties to compute the commitment hash input in a distributed manner, without reconstructing the full nonce product. Together, BCC and CEF reduce online signing to a single broadcast round: each party sends one message and the coordinator assembles a valid FIPS 204 signature. We instantiate TALUS in two deployment profiles: TALUS-TEE (trusted execution environment, T-of-N) and TALUS-MPC (fully distributed, malicious security with identifiable abort for T >= 2). Security of both variants reduces to ML-DSA EUF-CMA. A Rust implementation across all three FIPS 204 security levels (ML-DSA-44, ML-DSA-65, ML-DSA-87) shows that TALUS-TEE completes a signing operation in 0.62--1.94 ms and TALUS-MPC in 2.27--5.02 ms (amortised, T=3), competitive with the fastest concurrent threshold ML-DSA proposals.
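The Boundary Clearance idea can be illustrated with a toy centered decomposition: if the low part of a value sits far enough from the rounding boundary, adding any small perturbation (standing in for the effect of s2) cannot change the high bits. The parameters below are invented toy values, not FIPS 204 parameters, and this sketch omits the modular arithmetic of the real scheme.

```python
# Toy illustration of a BCC-style boundary clearance check.
ALPHA = 16   # toy rounding interval (ML-DSA uses 2*gamma2)
BETA = 3     # toy bound on the secret-dependent perturbation

def decompose(r):
    # Centered decomposition r = r1*ALPHA + r0 with r0 in (-ALPHA/2, ALPHA/2].
    r0 = r % ALPHA
    if r0 > ALPHA // 2:
        r0 -= ALPHA
    return (r - r0) // ALPHA, r0

def clears_boundary(r):
    # The residual is far enough from the rounding boundary that any
    # perturbation |e| <= BETA leaves the high bits unchanged.
    _, r0 = decompose(r)
    return abs(r0) < ALPHA // 2 - BETA

r = 36  # decomposes as 2*16 + 4: residual 4 clears the boundary
assert clears_boundary(r)
for e in range(-BETA, BETA + 1):
    # High bits are stable under every admissible perturbation.
    assert decompose(r + e)[0] == decompose(r)[0]
```

In the construction above, nonces failing this kind of check are discarded during offline preprocessing, which is consistent with only a fraction of attempts surviving.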
In Shamir's secret sharing scheme, all participants possess equal privileges. However, in many practical scenarios, it is often necessary to assign different levels of authority to different participants. To address this requirement, Hierarchical Secret Sharing (HSS) schemes were developed, which partition all participants into multiple subsets and assign a distinct privilege level to each. Existing Chinese Remainder Theorem (CRT)-based HSS schemes benefit from flexible share sizes, but either exhibit security flaws or have an information rate less than $\frac{1}{2}$. In this work, we propose a disjunctive HSS scheme and a conjunctive HSS scheme by using the CRT over the integer ring and one-way functions. Both schemes are asymptotically ideal and are proven to be secure.
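The CRT-based sharing these schemes build on can be sketched with a minimal Asmuth-Bloom-style construction: the secret is lifted by a random multiple of a small modulus, shares are residues modulo pairwise coprime moduli, and any threshold-sized subset reconstructs via the CRT. The moduli below are toy values; a secure instantiation needs moduli satisfying the Asmuth-Bloom condition and cryptographic randomness.

```python
# Minimal sketch of CRT-based threshold secret sharing (Asmuth-Bloom style).
import random
from math import prod

def crt(residues, moduli):
    # Chinese Remainder Theorem: recover x modulo prod(moduli).
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # Mi^{-1} mod m via 3-arg pow
    return x % M

def share(secret, m0, moduli, t):
    # Lift the secret by a random multiple of m0, keeping the lifted value
    # below the product of the t smallest moduli, then reduce mod each m_i.
    limit = prod(sorted(moduli)[:t])
    y = secret + m0 * random.randrange(limit // m0)
    assert y < limit
    return [y % m for m in moduli]

def reconstruct(shares, moduli, m0):
    # CRT over any t share moduli recovers the lifted value; mod m0 strips
    # the random lift and yields the secret.
    return crt(shares, moduli) % m0

m0 = 97                        # secret space: secrets are integers < 97
moduli = [101, 103, 107, 109]  # pairwise coprime share moduli
t = 3                          # reconstruction threshold
s = share(42, m0, moduli, t)
print(reconstruct(s[:3], moduli[:3], m0))  # 42
```

Hierarchical variants, as in the work above, extend this idea by giving different participant levels structurally different shares; the one-way functions are used to obtain computational security with short shares.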
Conjunctive Hierarchical Secret Sharing (CHSS) is a type of secret sharing that divides participants into multiple distinct hierarchical levels, with each level having a specific threshold. An authorized subset must simultaneously meet the thresholds of all levels. Existing Chinese Remainder Theorem (CRT)-based CHSS schemes either have security vulnerabilities or have an information rate lower than $\frac{1}{2}$. In this work, we utilize the CRT over a polynomial ring and one-way functions to construct an asymptotically perfect CHSS scheme. It has computational security and permits flexible share sizes. Notably, when all shares are of equal size, our scheme is an asymptotically ideal CHSS scheme with an information rate of one.
Large language models are becoming pervasive core components in many real-world applications. As a consequence, security alignment represents a critical requirement for their safe deployment. Although previous related works focused primarily on model architectures and alignment methodologies, these approaches alone cannot ensure the complete elimination of harmful generations. This concern is reinforced by the growing body of scientific literature showing that attacks, such as jailbreaking and prompt injection, can bypass existing security alignment mechanisms. As a consequence, additional security strategies are needed both to provide qualitative feedback on the robustness of the obtained security alignment at the training stage, and to create an ``ultimate'' defense layer to block unsafe outputs possibly produced by deployed models. To provide a contribution in this scenario, this paper introduces SecureBreak, a safety-oriented dataset designed to support the development of AI-driven solutions for detecting harmful LLM outputs caused by residual weaknesses in security alignment. The dataset is highly reliable due to careful manual annotation, where labels are assigned conservatively to ensure safety. It performs well in detecting unsafe content across multiple risk categories. Tests with pre-trained LLMs show improved results after fine-tuning on SecureBreak. Overall, the dataset is useful both for post-generation safety filtering and for guiding further model alignment and security improvements.
New technologies, such as blockchain, are designed to address various system weaknesses, particularly those related to security. Blockchain can enhance numerous aspects of traditional banking systems by transforming them into a digital, immutable, secure, and anonymous ledger. This paper proposes a new banking application, ALBank, which is based on blockchain and smart contract technologies. Its functionality relies on invoking functions within smart contracts deployed on the Ethereum blockchain. This approach enables decentralization and enhances both security and trust. In this context, the paper first presents a critical analysis of existing research on blockchain and traditional banking systems, with a focus on their respective challenges. It then examines the Know Your Customer (KYC) process and its various models. Finally, it introduces the design and development of ALBank, a decentralized banking application built on the Ethereum blockchain using smart contracts. The results show that the integration of blockchain and smart contracts effectively addresses key issues in traditional banking systems, including centralization, inefficiency, and security vulnerabilities, by storing critical data on a decentralized, immutable ledger, managing processes autonomously, and making transactions transparent to all users.
Modern democracies face an existential crisis of waning public trust in election results. While End-to-End Verifiable (E2E-V) voting systems promise mathematically secure elections, their reliance on complex cryptography creates a ``black box'' that forces blind trust in opaque software or external experts, ultimately failing to build genuine public confidence. To solve this, we introduce the concept of Software-Free Verification (SFV) -- a standard requiring that voters can independently verify election integrity without relying on any software. We propose a practical, non-cryptographic in-booth voting scheme that achieves SFV for national-scale elections. Our approach leverages a public bulletin board of randomized (Pseudonym, Candidate) pairs, where a mechanically generated pseudonym is hidden among real decoy votes on a physical receipt. Our scheme empowers citizens to audit the election using only basic arithmetic via a hierarchical Public Ledger, while anchoring the overall digital tally to physical evidence and Risk-Limiting Audits (RLAs) to guarantee systemic integrity. The result is a system that bridges the gap between mathematical security and public transparency, offering a viable blueprint for restoring trust in democratic infrastructure.
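The "basic arithmetic" verification the scheme promises can be made concrete with a toy public ledger: a list of (pseudonym, candidate) pairs that any voter can check their receipt against, and a tally that is plain counting. The pseudonyms and candidates below are invented placeholders, and the decoy and RLA machinery is omitted.

```python
# Toy sketch of software-free verification over a public ledger of
# (pseudonym, candidate) pairs.
from collections import Counter

ledger = [
    ("ZK4Q7", "Alice"), ("M9T2X", "Bob"), ("P3WL8", "Alice"),
    ("H6RD1", "Alice"), ("B2NV5", "Bob"),
]

def voter_check(ledger, pseudonym, candidate):
    # A voter confirms the pseudonym on their receipt is recorded next to
    # the candidate they chose; this needs no software, only inspection.
    return (pseudonym, candidate) in ledger

def tally(ledger):
    # The public tally is simple counting over the published pairs.
    return Counter(candidate for _, candidate in ledger)

assert voter_check(ledger, "ZK4Q7", "Alice")
print(tally(ledger))  # Counter({'Alice': 3, 'Bob': 2})
```

The point of the design is that both operations, membership checking and counting, are feasible by hand, so a voter need not trust any software to perform them.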
This paper emphasizes the critical role of interoperability in enabling efficient and secure communication for the fragmented distributed ledger ecosystem, particularly within on-chain finance. The purpose of this study is to streamline and accelerate empirical research on the intersection of cross-chain interoperability solutions and their impact within on-chain finance. The analysis examines the relationship between financial use and interoperability while comparing the properties of novel cross-chain interoperability protocols (LayerZero, Wormhole, Connext, Chainlink Cross-Chain Interoperability Protocol, Circle Cross-chain Transfer Protocol, Hop Protocol, Across, Polkadot, and Cosmos), focusing on their design, mechanisms, consensus, and limitations. To encourage further empirical study, the paper proposes a set of network metrics and sample statistical models and provides a framework for evaluating the performance and financial implications of interoperability solutions.
Smart homes are increasingly targeted by cyberattacks, yet residents often lack guidance when incidents occur. Since affected residents are likely to seek help from trustworthy sources, this paper asks: What actionable cybersecurity guidance do governments provide to smart home users whose systems have been compromised? To answer this question, we conduct an exploratory, user-centered review of governmental cybersecurity guidance for smart homes across eleven countries to identify and characterize the types of guidance governments provide and to systematize their content. Using a standardized search and screening process, we derive three emergent clusters: incident reporting, general security recommendations, and incident response. Our findings show that governments provide abundant general security advice and accessible reporting channels, but structured incident response guidance tailored to smart homes is rare. Only two sources offer step-by-step recovery guidance for non-expert users, highlighting a gap between preventive advice and post-incident support.
Multimodal Large Language Models (MLLMs) extend text-only LLMs with visual reasoning, but also introduce new safety failure modes under visually grounded instructions. We study comic-template jailbreaks that embed harmful goals inside simple three-panel visual narratives and prompt the model to role-play and "complete the comic." Building on JailbreakBench and JailbreakV, we introduce ComicJailbreak, a comic-based jailbreak benchmark with 1,167 attack instances spanning 10 harm categories and 5 task setups. Across 15 state-of-the-art MLLMs (six commercial and nine open-source), comic-based attacks achieve success rates comparable to strong rule-based jailbreaks and substantially outperform plain-text and random-image baselines, with ensemble success rates exceeding 90% on several commercial models. We then show that while existing defense methods are effective against the harmful comics, they induce a high refusal rate when given benign prompts. Finally, using automatic judging and targeted human evaluation, we show that current safety evaluators can be unreliable on sensitive but non-harmful content. Our findings highlight the need for safety alignment robust to narrative-driven multimodal jailbreaks.
The present work investigates a type of morphisms between encryption schemes, called bridges. By associating an encryption scheme to every such bridge, we define and examine their security. Inspired by the bootstrapping procedure used by Gentry to produce fully homomorphic encryption schemes, we exhibit a general recipe for the construction of bridges. Our main theorem asserts that the security of a bridge reduces to the security of the first encryption scheme together with a technical additional assumption.
Retrieval-Augmented Generation (RAG) significantly mitigates hallucinations and domain-knowledge deficiencies in large language models by incorporating external knowledge bases. However, the multi-module architecture of RAG introduces complex system-level security vulnerabilities. Guided by the RAG workflow, this paper analyzes the underlying vulnerability mechanisms and systematically categorizes core threat vectors such as data poisoning, adversarial attacks, and membership inference attacks. Based on this threat assessment, we construct a taxonomy of RAG defense technologies from a dual perspective encompassing both input and output stages. The input-side analysis reviews data protection mechanisms including dynamic access control, homomorphic encryption retrieval, and adversarial pre-filtering. The output-side examination summarizes advanced leakage prevention techniques such as federated learning isolation, differential privacy perturbation, and lightweight data sanitization. To establish a unified benchmark for future experimental design, we consolidate authoritative test datasets, security standards, and evaluation frameworks. To the best of our knowledge, this paper presents the first end-to-end survey dedicated to the security of RAG systems. Distinct from existing literature that isolates specific vulnerabilities, we systematically map the entire pipeline, providing a unified analysis of threat models, defense mechanisms, and evaluation benchmarks. By enabling deep insights into potential risks, this work seeks to foster the development of highly robust and trustworthy next-generation RAG systems.
Phishing attacks remain a persistent cybersecurity threat, and the widespread adoption of TLS certificates has unintentionally enabled malicious websites to appear trustworthy to users. This study examines whether certificate metadata and domain characteristics can help distinguish phishing domains from benign domains within the Danish .dk namespace. A dataset was constructed by combining registry information from Punktum dk with phishing reports and popularity rankings from external sources. TLS certificate attributes were collected using Netlas, while additional domain-based features were derived from DNS records and lexical analysis of domain names. The analysis compares phishing, popular, and less frequently visited domains across several feature categories, including Certificate Authorities (CAs), validity periods, missing certificate fields, SAN structure, registrant geography, hosting providers, and lexical properties of domain names. The results indicate that several features show observable differences between phishing and highly popular domains. However, phishing domains often resemble less popular domains, resulting in substantial overlap across many characteristics. Consequently, no individual feature provides a reliable standalone indicator of phishing activity within the Danish namespace. The findings suggest that certificate and domain attributes may still contribute to detection when combined, while also highlighting the limitations of relying on individual indicators in isolation. This work provides an empirical overview of phishing-related infrastructure patterns in the Danish .dk ecosystem and offers insights that may inform future phishing detection approaches.
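The lexical domain-name features the study mentions (length, digit count, hyphens, character entropy) are a common feature set in phishing-detection work; the sketch below illustrates the idea rather than the paper's exact feature definitions, and the sample domain is invented.

```python
# Illustrative lexical feature extraction for a domain name.
import math
from collections import Counter

def lexical_features(domain: str) -> dict:
    name = domain.split(".")[0]        # strip the TLD, e.g. ".dk"
    counts = Counter(name)
    # Shannon entropy of the character distribution, in bits.
    entropy = -sum((c / len(name)) * math.log2(c / len(name))
                   for c in counts.values())
    return {
        "length": len(name),
        "digits": sum(ch.isdigit() for ch in name),
        "hyphens": name.count("-"),
        "entropy": round(entropy, 2),
    }

print(lexical_features("secure-login-bank42.dk"))
```

As the study's results suggest, no single such feature separates phishing from benign domains on its own; features like these are meant to be combined with certificate and DNS attributes in a joint detector.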
Prompt injection is listed as the number-one vulnerability class in the OWASP Top 10 for LLM Applications; it can subvert LLM guardrails, disclose sensitive data, and trigger unauthorized tool use. Developers are rapidly adopting AI-assisted development tools built on the Model Context Protocol (MCP). However, their convenience comes with security risks, especially prompt-injection attacks delivered via tool-poisoning vectors. While prior research has studied prompt injection in LLMs, the security posture of real-world MCP clients remains underexplored. We present the first empirical analysis of prompt injection with the tool-poisoning vulnerability across seven widely used MCP clients: Claude Desktop, Claude Code, Cursor, Cline, Continue, Gemini CLI, and Langflow. We identify their detection and mitigation mechanisms, as well as the coverage of security features, including static validation, parameter visibility, injection detection, user warnings, execution sandboxing, and audit logging. Our evaluation reveals significant disparities. While some clients, such as Claude Desktop, implement strong guardrails, others, such as Cursor, exhibit high susceptibility to cross-tool poisoning, hidden parameter exploitation, and unauthorized tool invocation. We further provide actionable guidance for MCP implementers and the software engineering community seeking to build secure AI-assisted development workflows.
The Model Context Protocol (MCP) has emerged as a standard for connecting Large Language Models (LLMs) to external tools and data. However, MCP servers often expose privileged capabilities, such as file system access, network requests, and command execution, that can be exploited if not properly secured. We present mcp-sec-audit, an extensible security assessment toolkit designed specifically for MCP servers. It implements static pattern matching for Python-based MCP servers and dynamic sandboxed fuzzing and monitoring via Docker and eBPF. The tool detects risky capabilities through configurable rule-based analysis and provides mitigation recommendations.
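A configurable rule-based static scan of the kind described can be sketched as a set of regex rules run over a server's source lines. The rule set and the server snippet below are invented examples for illustration, not mcp-sec-audit's actual rules or output format.

```python
# Toy rule-based static scan for risky capabilities in Python MCP server code.
import re

RULES = {
    "command-execution": r"\b(?:subprocess|os\.system|os\.popen)\b",
    "file-system-access": r"\bopen\s*\(|\bshutil\.",
    "network-request": r"\b(?:requests|urllib|socket)\b",
}

def scan(source: str):
    # Return (line number, rule name) for every rule hit.
    findings = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for rule, pattern in RULES.items():
            if re.search(pattern, line):
                findings.append((lineno, rule))
    return findings

server_code = '''import subprocess
def run_tool(cmd):
    return subprocess.run(cmd, shell=True, capture_output=True)
'''
print(scan(server_code))  # [(1, 'command-execution'), (3, 'command-execution')]
```

Each finding can then be mapped to a mitigation recommendation (e.g. sandboxing command execution), which is the dynamic half the toolkit covers with Docker and eBPF.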
The rapid expansion of the Internet of Things (IoT) and its integration with backbone networks have heightened the risk of security breaches. Traditional centralized approaches to anomaly detection, which require transferring large volumes of data to central servers, suffer from privacy, scalability, and latency limitations. This paper proposes a lightweight autoencoder-based anomaly detection framework designed for deployment on resource-constrained edge devices, enabling real-time detection while minimizing data transfer and preserving privacy. Federated learning is employed to train models collaboratively across distributed devices, where local training occurs on edge nodes and only model weights are aggregated at a central server. A real-world IoT testbed using Raspberry Pi sensor nodes was developed to collect normal and attack traffic data. The proposed federated anomaly detection system, implemented and evaluated on the testbed, demonstrates its effectiveness in accurately identifying network attacks. Communication overhead is reduced significantly while achieving performance comparable to the centralized method.
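The aggregation step described, local training on edge nodes with only model weights sent to the server, can be sketched as a FedAvg-style element-wise average. Weights are plain Python lists here and the autoencoder itself is omitted; the client values are invented.

```python
# Minimal sketch of FedAvg-style server-side weight aggregation.
def federated_average(client_weights):
    # Element-wise mean over the clients' weight vectors: the server never
    # sees raw traffic data, only these weights.
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]

# Three edge nodes report their locally trained weights.
clients = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [2.0, 3.0, 4.0],
]
print(federated_average(clients))  # [2.0, 3.0, 4.0]
```

The privacy and bandwidth benefits the abstract claims follow directly from this structure: only the (typically much smaller) weight vectors cross the network, never the collected traffic.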
Developers rely on online tutorials to learn web application security, but tutorial quality varies. We reviewed 132 free security tutorials to examine topic coverage, authorship, and technical depth. Our analysis shows that most tutorials come from vendors and emphasize high-level explanations over concrete implementation guidance. Few tutorials provide complete runnable code examples or direct links to authoritative security resources such as the Open Web Application Security Project (OWASP), Common Weakness Enumeration (CWE), or Common Vulnerabilities and Exposures (CVE). We found that two visible signals help identify more useful tutorials: the presence of runnable code and direct links to official resources. These signals can help developers distinguish broad awareness material from tutorials that better support secure implementation.
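The two visible signals the study identifies, runnable code and links to authoritative resources, are simple enough to check mechanically. The markers and domain list below are simplifying assumptions for illustration, not the study's coding scheme.

```python
# Toy checker for the two tutorial-quality signals: runnable code blocks
# and links to authoritative security resources (OWASP, CWE, CVE).
AUTHORITATIVE = ("owasp.org", "cwe.mitre.org", "cve.mitre.org")

def tutorial_signals(page_text: str) -> dict:
    return {
        # Crude proxy for runnable code: presence of HTML code markup.
        "has_code": "<pre>" in page_text or "<code>" in page_text,
        "links_authoritative": any(d in page_text for d in AUTHORITATIVE),
    }

page = ('See https://owasp.org/Top10 and the example: '
        '<pre>stmt.setString(1, user)</pre>')
print(tutorial_signals(page))  # {'has_code': True, 'links_authoritative': True}
```

A tutorial scoring positive on both signals is, per the study's findings, more likely to support secure implementation than broad awareness material.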