Loading...
Loading...
Browse, search and filter the latest cybersecurity research papers from arXiv
Federated Learning (FL) allows collaborative model training across distributed clients without sharing raw data, thus preserving privacy. However, the system remains vulnerable to privacy leakage from gradient updates and Byzantine attacks from malicious clients. Existing solutions face a critical trade-off among privacy preservation, Byzantine robustness, and computational efficiency. We propose a novel scheme that effectively balances these competing objectives by integrating homomorphic encryption with dimension compression based on the Johnson-Lindenstrauss transformation. Our approach employs a dual-server architecture that enables secure Byzantine defense in the ciphertext domain while dramatically reducing computational overhead through gradient compression. The dimension compression technique preserves the geometric relationships necessary for Byzantine defence while reducing computation complexity from $O(dn)$ to $O(kn)$ cryptographic operations, where $k \ll d$. Extensive experiments across diverse datasets demonstrate that our approach maintains model accuracy comparable to non-private FL while effectively defending against Byzantine clients comprising up to $40\%$ of the network.
Path MTU Discovery (PMTUD) and IP address sharing are integral aspects of modern Internet infrastructure. In this paper, we investigate the security vulnerabilities associated with PMTUD within the context of prevalent IP address sharing practices. We reveal that PMTUD is inadequately designed to handle IP address sharing, creating vulnerabilities that attackers can exploit to perform off-path TCP hijacking attacks. We demonstrate that by observing the path MTU value determined by a server for a public IP address (shared among multiple devices), an off-path attacker on the Internet, in collaboration with a malicious device, can infer the sequence numbers of TCP connections established by other legitimate devices sharing the same IP address. This vulnerability enables the attacker to perform off-path TCP hijacking attacks, significantly compromising the security of the affected TCP connections. Our attack involves first identifying a target TCP connection originating from the shared IP address, followed by inferring the sequence numbers of the identified connection. We thoroughly assess the impacts of our attack under various network configurations. Experimental results reveal that the attack can be executed within an average time of 220 seconds, achieving a success rate of 70%.Case studies, including SSH DoS, FTP traffic poisoning, and HTTP injection, highlight the threat it poses to various applications. Additionally, we evaluate our attack across 50 real-world networks with IP address sharing--including public Wi-Fi, VPNs, and 5G--and find 38 vulnerable. Finally, we responsibly disclose the vulnerabilities, receive recognition from organizations such as IETF, Linux, and Cisco, and propose our countermeasures.
Industrial control systems (ICSs) are widely used in industry, and their security and stability are very important. Once the ICS is attacked, it may cause serious damage. Therefore, it is very important to detect anomalies in ICSs. ICS can monitor and manage physical devices remotely using communication networks. The existing anomaly detection approaches mainly focus on analyzing the security of network traffic or sensor data. However, the behaviors of different domains (e.g., network traffic and sensor physical status) of ICSs are correlated, so it is difficult to comprehensively identify anomalies by analyzing only a single domain. In this paper, an anomaly detection approach based on cross-domain representation learning in ICSs is proposed, which can learn the joint features of multi-domain behaviors and detect anomalies within different domains. After constructing a cross-domain graph that can represent the behaviors of multiple domains in ICSs, our approach can learn the joint features of them by leveraging graph neural networks. Since anomalies behave differently in different domains, we leverage a multi-task learning approach to identify anomalies in different domains separately and perform joint training. The experimental results show that the performance of our approach is better than existing approaches for identifying anomalies in ICSs.
In Vehicle-to-Everything (V2X) networks with multi-hop communication, Road Side Units (RSUs) intend to gather location data from the vehicles to offer various location-based services. Although vehicles use the Global Positioning System (GPS) for navigation, they may refrain from sharing their exact GPS coordinates to the RSUs due to privacy considerations. Thus, to address the localization expectations of the RSUs and the privacy concerns of the vehicles, we introduce a relaxed-privacy model wherein the vehicles share their partial location information in order to avail the location-based services. To implement this notion of relaxed-privacy, we propose a low-latency protocol for spatial-provenance recovery, wherein vehicles use correlated linear Bloom filters to embed their position information. Our proposed spatial-provenance recovery process takes into account the resolution of localization, the underlying ad hoc protocol, and the coverage range of the wireless technology used by the vehicles. Through a rigorous theoretical analysis, we present extensive analysis on the underlying trade-off between relaxed-privacy and the communication-overhead of the protocol. Finally, using a wireless testbed, we show that our proposed method requires a few bits in the packet header to provide security features such as localizing a low-power jammer executing a denial-of-service attack.
Innovative solutions to cyber security issues are shaped by the ever-changing landscape of cyber threats. Automating the mitigation of these threats can be achieved through a new methodology that addresses the domain of mitigation automation, which is often overlooked. This literature overview emphasizes the lack of scholarly work focusing specifically on automated cyber threat mitigation, particularly in addressing challenges beyond detection. The proposed methodology comprise of the development of an automatic cyber threat mitigation framework tailored for Distributed Denial-of-Service (DDoS) attacks. This framework adopts a multi-layer security approach, utilizing smart devices at the device layer, and leveraging fog network and cloud computing layers for deeper understanding and technological adaptability. Initially, firewall rule-based packet inspection is conducted on simulated attack traffic to filter out DoS packets, forwarding legitimate packets to the fog. The methodology emphasizes the integration of fog detection through statistical and behavioral analysis, specification-based detection, and deep packet inspection, resulting in a comprehensive cyber protection system. Furthermore, cloud-level inspection is performed to confirm and mitigate attacks using firewalls, enhancing strategic defense and increasing robustness against cyber threats. These enhancements enhance understanding of the research framework's practical implementation and assessment strategies, substantiating its importance in addressing current cyber security challenges and shaping future automation mitigation approaches.
Anti-money laundering (AML) research is constrained by the lack of publicly shareable, regulation-aligned transaction datasets. We present AMLNet, a knowledge-based multi-agent framework with two coordinated units: a regulation-aware transaction generator and an ensemble detection pipeline. The generator produces 1,090,173 synthetic transactions (approximately 0.16\% laundering-positive) spanning core laundering phases (placement, layering, integration) and advanced typologies (e.g., structuring, adaptive threshold behavior). Regulatory alignment reaches 75\% based on AUSTRAC rule coverage (Section 4.2), while a composite technical fidelity score of 0.75 summarizes temporal, structural, and behavioral realism components (Section 4.4). The detection ensemble achieves F1 0.90 (precision 0.84, recall 0.97) on the internal test partitions of AMLNet and adapts to the external SynthAML dataset, indicating architectural generalizability across different synthetic generation paradigms. We provide multi-dimensional evaluation (regulatory, temporal, network, behavioral) and release the dataset (Version 1.0, https://doi.org/10.5281/zenodo.16736515), to advance reproducible and regulation-conscious AML experimentation.
Recent works in federated learning (FL) have shown the utility of leveraging transfer learning for balancing the benefits of FL and centralized learning. In this setting, federated training happens after a stable point has been reached through conventional training. Global model weights are first centrally pretrained by the server on a public dataset following which only the last few linear layers (the classification head) of the model are finetuned across clients. In this scenario, existing data reconstruction attacks (DRAs) in FL show two key weaknesses. First, strongly input-correlated gradient information from the initial model layers is never shared, significantly degrading reconstruction accuracy. Second, DRAs in which the server makes highly specific, handcrafted manipulations to the model structure or parameters (for e.g., layers with all zero weights, identity mappings and rows with identical weight patterns) are easily detectable by an active client. Improving on these, we propose MAUI, a stealthy DRA that does not require any overt manipulations to the model architecture or weights, and relies solely on the gradients of the classification head. MAUI first extracts "robust" feature representations of the input batch from the gradients of the classification head and subsequently inverts these representations to the original inputs. We report highly accurate reconstructions on the CIFAR10 and ImageNet datasets on a variety of model architectures including convolution networks (CNN, VGG11), ResNets (18, 50), ShuffleNet-V2 and Vision Transformer (ViT B-32), regardless of the batch size. MAUI significantly outperforms prior DRAs in reconstruction quality, achieving 40-120% higher PSNR scores.
In recent years, Rowhammer has attracted significant attention from academia and industry alike. This technique, first published in 2014, flips bits in memory by repeatedly accessing neighbouring memory locations. Since its discovery, researchers have developed a substantial body of work exploiting Rowhammer and proposing countermeasures. These works demonstrate that Rowhammer can be mounted not only through native code, but also via remote code execution, such as JavaScript in browsers, and over networks. In this work, we uncover a previously unexplored Rowhammer vector. We present Thunderhammer, an attack that induces DRAM bitflips from malicious peripherals connected via PCIe or Thunderbolt (which tunnels PCIe). On modern DDR4 systems, we observe that triggering bitflips through PCIe requests requires precisely timed access patterns tailored to the target system. We design a custom device to reverse engineer critical architectural parameters that shape PCIe request scheduling, and to execute effective hammering access patterns. Leveraging this knowledge, we successfully demonstrate Rowhammer-induced bitflips in DDR4 memory modules via both PCIe slot connections and Thunderbolt ports tunnelling PCIe.
In recent years, sequence features such as packet length have received considerable attention due to their central role in encrypted traffic analysis. Existing sequence modeling approaches can be broadly categorized into flow-level and trace-level methods: the former suffer from high feature redundancy, limiting their discriminative power, whereas the latter preserve complete information but incur substantial computational and storage overhead. To address these limitations, we propose the \textbf{U}p-\textbf{D}own \textbf{F}low \textbf{S}equence (\textbf{UDFS}) representation, which compresses an entire trace into a two-dimensional sequence and characterizes each flow by the aggregate of its upstream and downstream traffic, reducing complexity while maintaining high discriminability. Furthermore, to address the challenge of class-specific discriminability differences, we propose an adaptive threshold mechanism that dynamically adjusts training weights and rejection boundaries, enhancing the model's classification performance. Experimental results demonstrate that the proposed method achieves superior classification performance and robustness on both coarse-grained and fine-grained datasets, as well as under concept drift and open-world scenarios. Code and Dataset are available at https://github.com/kid1999/UDFS.
The rapid proliferation of network applications has led to a significant increase in network attacks. According to the OWASP Top 10 Projects report released in 2021, injection attacks rank among the top three vulnerabilities in software projects. This growing threat landscape has increased the complexity and workload of software testing, necessitating advanced tools to support agile development cycles. This paper introduces a novel test prioritization method for SQL injection vulnerabilities to enhance testing efficiency. By leveraging previous test outcomes, our method adjusts defense strength vectors for subsequent tests, optimizing the testing workflow and tailoring defense mechanisms to specific software needs. This approach aims to improve the effectiveness and efficiency of vulnerability detection and mitigation through a flexible framework that incorporates dynamic adjustments and considers the temporal aspects of vulnerability exposure.
The Open Network (TON) is a high-performance blockchain platform designed for scalability and efficiency, leveraging an asynchronous execution model and a multi-layered architecture. While TON's design offers significant advantages, it also introduces unique challenges for smart contract development and security. This paper introduces a comprehensive audit checklist for TON smart contracts, based on an analysis of 34 professional audit reports containing 233 real-world vulnerabilities. The checklist addresses TON-specific challenges, such as asynchronous message handling, and provides actionable insights for developers and auditors. We also present detailed case studies of vulnerabilities in TON smart contracts, highlighting their implications and offering lessons learned. By adopting this checklist, developers and auditors can systematically identify and mitigate vulnerabilities, enhancing the security and reliability of TON-based projects. Our work bridges the gap between Ethereum's mature audit methodologies and the emerging needs of the TON ecosystem, fostering a more secure and robust blockchain environment.
We present ORQ, a system that enables collaborative analysis of large private datasets using cryptographically secure multi-party computation (MPC). ORQ protects data against semi-honest or malicious parties and can efficiently evaluate relational queries with multi-way joins and aggregations that have been considered notoriously expensive under MPC. To do so, ORQ eliminates the quadratic cost of secure joins by leveraging the fact that, in practice, the structure of many real queries allows us to join records and apply the aggregations "on the fly" while keeping the result size bounded. On the system side, ORQ contributes generic oblivious operators, a data-parallel vectorized query engine, a communication layer that amortizes MPC network costs, and a dataflow API for expressing relational analytics -- all built from the ground up. We evaluate ORQ in LAN and WAN deployments on a diverse set of workloads, including complex queries with multiple joins and custom aggregations. When compared to state-of-the-art solutions, ORQ significantly reduces MPC execution times and can process one order of magnitude larger datasets. For our most challenging workload, the full TPC-H benchmark, we report results entirely under MPC with Scale Factor 10 -- a scale that had previously been achieved only with information leakage or the use of trusted third parties.
The Tor network offers network anonymity to its users by routing their traffic through a sequence of relays. A group of nine directory authorities maintains information about all available relay nodes using a distributed directory protocol. We observe that the current protocol makes a steep synchrony assumption, which makes it vulnerable to natural as well as adversarial non-synchronous communication scenarios over the Internet. In this paper, we show that it is possible to cause a failure in the Tor directory protocol by targeting a majority of the authorities for only five minutes using a well-executed distributed denial-of-service (DDoS) attack. We demonstrate this attack in a controlled environment and show that it is cost-effective for as little as \$53.28 per month to disrupt the protocol and to effectively bring down the entire Tor network. To mitigate this problem, we consider the popular partial synchrony assumption for the Tor directory protocol that ensures that the protocol security is hampered even when the network delays are large and unknown. We design a new Tor directory protocol that leverages any standard partial-synchronous consensus protocol to solve this problem, while also proving its security. We have implemented a prototype in Rust, demonstrating comparable performance to the current protocol while resisting similar attacks.
The multi level Bell La Padula model for secure data access and data flow control, formulated in the 1970s, was based on the theory of partial orders. Since then, another model, based on lattice theory, has prevailed. We present reasons why the partial order model is more appropriate. We also show, by example, how non lattice data flow networks can be easily implemented by using Attribute-based access control (ABAC).
Sophisticated malware families exploit the openness of the Android platform to infiltrate IoT networks, enabling large-scale disruption, data exfiltration, and denial-of-service attacks. This systematic literature review (SLR) examines cutting-edge approaches to Android malware analysis with direct implications for securing IoT infrastructures. We analyze feature extraction techniques across static, dynamic, hybrid, and graph-based methods, highlighting their trade-offs: static analysis offers efficiency but is easily evaded through obfuscation; dynamic analysis provides stronger resistance to evasive behaviors but incurs high computational costs, often unsuitable for lightweight IoT devices; hybrid approaches balance accuracy with resource considerations; and graph-based methods deliver superior semantic modeling and adversarial robustness. This survey contributes a structured comparison of existing methods, exposes research gaps, and outlines a roadmap for future directions to enhance scalability, adaptability, and long-term security in IoT-driven Android malware detection.
Malicious URL detection remains a major challenge in cybersecurity, primarily due to two factors: (1) the exponential growth of the Internet has led to an immense diversity of URLs, making generalized detection increasingly difficult; and (2) attackers are increasingly employing sophisticated obfuscation techniques to evade detection. We advocate that addressing these challenges fundamentally requires: (1) obtaining semantic understanding to improve generalization across vast and diverse URL sets, and (2) accurately modeling contextual relationships within the structural composition of URLs. In this paper, we propose a novel malicious URL detection method combining multi-granularity graph learning with semantic embedding to jointly capture semantic, character-level, and structural features for robust URL analysis. To model internal dependencies within URLs, we first construct dual-granularity URL graphs at both subword and character levels, where nodes represent URL tokens/characters and edges encode co-occurrence relationships. To obtain fine-grained embeddings, we initialize node representations using a character-level convolutional network. The two graphs are then processed through jointly trained Graph Convolutional Networks to learn consistent graph-level representations, enabling the model to capture complementary structural features that reflect co-occurrence patterns and character-level dependencies. Furthermore, we employ BERT to derive semantic representations of URLs for semantically aware understanding. Finally, we introduce a gated dynamic fusion network to combine the semantically enriched BERT representations with the jointly optimized graph vectors, further enhancing detection performance. We extensively evaluate our method across multiple challenging dimensions. Results show our method exceeds SOTA performance, including against large language models.
The success of smart contracts has made them a target for attacks, but their closed-source nature often forces vulnerability detection to work on bytecode, which is inherently more challenging than source-code-based analysis. While recent studies try to align source and bytecode embeddings during training to transfer knowledge, current methods rely on graph-level alignment that obscures fine-grained structural and semantic correlations between the two modalities. Moreover, the absence of precise vulnerability patterns and granular annotations in bytecode leads to depriving the model of crucial supervisory signals for learning discriminant features. We propose ExDoS to transfer rich semantic knowledge from source code to bytecode, effectively supplementing the source code prior in practical settings. Specifically, we construct semantic graphs from source code and control-flow graphs from bytecode. To address obscured local signals in graph-level contract embeddings, we propose a Dual-Attention Graph Network introducing a novel node attention aggregation module to enhance local pattern capture in graph embeddings. Furthermore, by summarizing existing source code vulnerability patterns and designing a corresponding set of bytecode-level patterns for each, we construct the first dataset of vulnerability pattern annotations aligned with source code definitions to facilitate fine-grained cross-modal alignment and the capture of function-level vulnerability signals. Finally, we propose a dual-focus objective for our cross-modal distillation framework, comprising: a Global Semantic Distillation Loss for transferring graph-level knowledge and a Local Semantic Distillation Loss enabling expert-guided, fine-grained vulnerability-specific distillation. Experiments on real-world contracts demonstrate that our method achieves consistent F1-score improvements (3\%--6\%) over strong baselines.
Software-based memory-erasure protocols are two-party communication protocols where a verifier instructs a computational device to erase its memory and send a proof of erasure. They aim at guaranteeing that low-cost IoT devices are free of malware by putting them back into a safe state without requiring secure hardware or physical manipulation of the device. Several software-based memory-erasure protocols have been introduced and theoretically analysed. Yet, many of them have not been tested for their feasibility, performance and security on real devices, which hinders their industry adoption. This article reports on the first empirical analysis of software-based memory-erasure protocols with respect to their security, erasure guarantees, and performance. The experimental setup consists of 3 modern IoT devices with different computational capabilities, 7 protocols, 6 hash-function implementations, and various performance and security criteria. Our results indicate that existing software-based memory-erasure protocols are feasible, although slow devices may take several seconds to erase their memory and generate a proof of erasure. We found that no protocol dominates across all empirical settings, defined by the computational power and memory size of the device, the network speed, and the required level of security. Interestingly, network speed and hidden constants within the protocol specification played a more prominent role in the performance of these protocols than anticipated based on the related literature. We provide an evaluation framework that, given a desired level of security, determines which protocols offer the best trade-off between performance and erasure guarantees.