We propose to use orthologic as the basis for designing type systems supporting intersection, union, and negation types in the presence of subtyping assumptions. We show how to extend orthologic with monotonic and antimonotonic functions, enabling the use of type constructors in such type systems. We present a proof system for orthologic with function symbols, showing that it admits partial cut elimination. Using these insights, we present an $\mathcal O(n^2(1+m))$ algorithm for deciding the subtyping relation under $m$ assumptions. We also present an $\mathcal O(n^2)$ normalization algorithm, allowing simplification of types to their minimal canonical form.
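To make the flavor of such normalization concrete, here is a minimal Python sketch that rewrites type expressions built from intersection, union, and negation using a few ortholattice laws (double negation, De Morgan, and annihilation with a complement). It only illustrates the kind of simplification involved; the constructors and rules are hypothetical, and this is not the paper's $\mathcal O(n^2)$ algorithm.

```python
# Illustrative sketch only: a tiny normalizer applying a few ortholattice laws.
# It is NOT the paper's algorithm; the constructors here are hypothetical.
from dataclasses import dataclass

class Ty: pass

@dataclass(frozen=True)
class Base(Ty): name: str

@dataclass(frozen=True)
class Neg(Ty): t: Ty

@dataclass(frozen=True)
class And(Ty): l: Ty; r: Ty

@dataclass(frozen=True)
class Or(Ty): l: Ty; r: Ty

TOP, BOT = Base("Top"), Base("Bot")

def simplify(t: Ty) -> Ty:
    if isinstance(t, Neg):
        s = simplify(t.t)
        if isinstance(s, Neg):            # !!x = x
            return s.t
        if isinstance(s, And):            # !(x & y) = !x | !y
            return simplify(Or(Neg(s.l), Neg(s.r)))
        if isinstance(s, Or):             # !(x | y) = !x & !y
            return simplify(And(Neg(s.l), Neg(s.r)))
        return Neg(s)
    if isinstance(t, And):
        l, r = simplify(t.l), simplify(t.r)
        if l == r: return l               # idempotence
        if Neg(l) == r or Neg(r) == l:    # x & !x = Bot
            return BOT
        if l == TOP: return r
        if r == TOP: return l
        return And(l, r)
    if isinstance(t, Or):
        l, r = simplify(t.l), simplify(t.r)
        if l == r: return l
        if Neg(l) == r or Neg(r) == l:    # x | !x = Top
            return TOP
        if l == BOT: return r
        if r == BOT: return l
        return Or(l, r)
    return t

assert simplify(Neg(Neg(Base("Int")))) == Base("Int")
assert simplify(And(Base("Int"), Neg(Base("Int")))) == BOT
```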
Interaction-Oriented Programming (IOP) is an approach to building a multiagent system by modeling the interactions between its roles via a flexible interaction protocol and implementing agents to realize the interactions of the roles they play in the protocol. In recent years, we have developed an extensive suite of software that enables multiagent system developers to apply IOP. These include tools for efficiently verifying protocols for properties such as liveness and safety, as well as middleware that simplifies the implementation of agents. This paper presents some of that software suite.
Effect handlers allow programmers to model and compose computational effects modularly. Effect systems statically guarantee that all effects are handled. Several recent practical effect systems are based on either row polymorphism or capabilities. However, there remains a gap in understanding the precise relationship between effect systems with such disparate foundations. The main difficulty is that in both row-based and capability-based systems, effect tracking is typically entangled with other features such as functions. We propose a uniform framework for encoding, analysing, and comparing effect systems. Our framework exploits and generalises modal effect types, a recent effect system that decouples effect tracking from functions via modalities. Modalities offer fine-grained control over when and how effects are tracked, enabling us to express different strategies for effect tracking. We give encodings as macro translations from existing row-based and capability-based effect systems into our framework and show that these encodings preserve types and semantics. Our encodings reveal the essence of effect tracking mechanisms in different effect systems, enable a direct analysis of their differences, and provide valuable insights into language design.
ReDemon UI synthesizes React applications from user demonstrations, enabling designers and non-expert programmers to create UIs that integrate with standard UI prototyping workflows. Users provide a static mockup sketch with event handler holes and demonstrate desired runtime behaviors by interacting with the rendered mockup and editing the sketch. ReDemon UI identifies reactive data and synthesizes a React program with correct state update logic. We utilize enumerative synthesis for simple UIs and LLMs for more complex UIs.
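As a rough illustration of the enumerative-synthesis idea mentioned for simple UIs, the hypothetical Python sketch below searches a tiny space of candidate state-update expressions for one consistent with a demonstrated trace. The grammar, the demonstration, and the counter example are invented and are not ReDemon UI's algorithm.

```python
# Hypothetical sketch of enumerative synthesis over a tiny space of state-update
# expressions, in the spirit of synthesizing an event handler from a demonstration.
from itertools import chain

# Demonstration: clicking "+" twice takes the counter 0 -> 1 -> 2.
demo = [(0, 1), (1, 2)]

# Candidate updates: state -> state, drawn from a small grammar s := s + c | s * c.
candidates = list(chain((("add", c) for c in range(-3, 4)),
                        (("mul", c) for c in range(-3, 4))))

def apply(cand, s):
    op, c = cand
    return s + c if op == "add" else s * c

def synthesize(demo):
    for cand in candidates:
        if all(apply(cand, s) == s2 for s, s2 in demo):
            return cand
    return None

print(synthesize(demo))   # ('add', 1): the synthesized handler increments the state
```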
eBPF is a technology that allows developers to safely extend kernel functionality without modifying kernel source code or developing loadable kernel modules. Since the kernel governs critical system operations and enforces isolation boundaries between user space and privileged data, any mechanism that modifies its behavior must meet the highest standards of safety and correctness. To this end, the eBPF toolchain includes a verifier, which statically checks safety properties such as memory access validity, bounded loops, and type correctness before loading the program into the kernel. However, the existing verifier is both overly conservative in some cases - rejecting valid programs - and unsound in others, permitting unsafe behavior that violates the intended semantics of the kernel interface. To address these challenges, we introduce BeePL, a domain-specific language for eBPF with a formally verified type system. The BeePL type system, along with the language design, statically enforces key safety properties such as type-correct memory access, safe pointer usage, absence of unbounded loops, and structured control flow. These guarantees are backed by formal type soundness proofs, ensuring that well-typed programs satisfy the safety invariants required by the eBPF execution environment. We also prove that well-typed BeePL programs meet critical eBPF-specific properties related to memory safety, termination, and control flow, enabling high-level reasoning prior to compilation. For properties not fully enforceable statically - such as dynamic bounds and undefined behavior - BeePL inserts semantics-preserving runtime checks during compilation. We develop a verified compilation strategy that extends CompCert to generate BPF bytecode from BeePL programs, establishing a principled foundation for an end-to-end verifiable toolchain for safe kernel extensions.
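As a toy illustration of one of the properties listed above (absence of unbounded loops), the sketch below checks a made-up mini-AST for loops that lack a constant bound. It is not BeePL's type system; the node names and structure are assumptions made for the example.

```python
# Toy illustration of one static check the abstract mentions (rejecting unbounded
# loops) over a made-up mini-AST. This is not BeePL's type system, only a sketch
# of the kind of property such a checker enforces.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Loop:            # loop with an explicit constant bound (or None if unknown)
    bound: Optional[int]
    body: list

@dataclass
class Stmt:            # any non-looping statement
    text: str

def loops_bounded(prog: list) -> bool:
    """Accept only programs in which every loop carries a constant bound."""
    for node in prog:
        if isinstance(node, Loop):
            if node.bound is None or not loops_bounded(node.body):
                return False
    return True

ok  = [Stmt("x = pkt.len"), Loop(16, [Stmt("acc += x")])]
bad = [Loop(None, [Stmt("spin forever")])]
assert loops_bounded(ok) and not loops_bounded(bad)
```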
Symbolic execution is a powerful technique for analyzing the behavior of software, yet scalability remains a challenge due to state explosion in control and data flow. Existing tools typically aim at managing control flow internally, often at the expense of completeness, while offloading reasoning over data flow to SMT solvers. Moreover, reasoning typically happens on source code or intermediate representation level to leverage structural information, making machine code generation part of the trust base. We are interested in changing the equation in two non-trivial ways: pushing reasoning down to machine code level, and then offloading reasoning entirely into SMT solvers and other, possibly more efficient solver technology. In more abstract terms, we are asking if bit-precise reasoning technology can be made scalable on software, and not just hardware. For this purpose, we developed two tools called rotor and bitme for model generation and bounded model checking, respectively. We chose RISC-V restricted to integer arithmetic as the modeling target for rotor since RISC-V integer semantics is essentially equivalent to established SMT semantics over bitvectors and arrays of bitvectors. While state-of-the-art SMT solvers struggle in our experiments, we have evidence that there is potential for improvement. To show the potential, we have slightly generalized and then implemented in bitme two types of binary decision diagrams (BDDs): algebraic decision diagrams (ADDs) and context-free-language ordered binary decision diagrams (CFLOBDDs). Bitme uses BDDs to propagate program input through models, essentially generalizing constant propagation to domain propagation. SMT solvers only get involved when model input cannot be propagated, significantly speeding up SMT solving. We then study the impact on state explosion of CFLOBDDs, which are potentially more scalable than ADDs.
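The following Python sketch illustrates what "generalizing constant propagation to domain propagation" can mean: a set of possible input values is pushed through straight-line operations, and a solver would only be consulted once the domain grows past a cutoff. The word size, cutoff, and program are hypothetical; this is not bitme's decision-diagram-based implementation.

```python
# Illustrative sketch of domain propagation: instead of a single constant,
# propagate the set of values an input may take through straight-line
# (here: 8-bit) operations. Names and the cutoff are made up.
MASK = 0xFF          # pretend 8-bit machine words
CUTOFF = 64          # beyond this domain size we would fall back to an SMT solver

def propagate(domain, ops):
    """domain: set of possible input values; ops: unary functions over 8-bit words."""
    for op in ops:
        domain = {op(v) & MASK for v in domain}
        if len(domain) > CUTOFF:
            return None          # give up: hand the constraint to a solver instead
    return domain

inputs = set(range(16))                      # input byte known to be 0..15
ops = [lambda v: v * 3, lambda v: v + 7]     # a tiny straight-line program
print(propagate(inputs, ops))                # the 16 still-reachable values, 7 through 52
```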
Source code segmentation, dividing code into functionally coherent segments, is crucial for knowledge retrieval and maintenance in software development. While enabling efficient navigation and comprehension of large codebases, manual and syntactic analysis approaches have become impractical as repositories grow, especially for low-resource languages like R and their research domains (e.g., social sciences, psychology). This paper introduces an automated, domain-specific approach for research R code segmentation using Large and Small Language Models (LLMs/SLMs). It presents two novel approaches and a human-annotated dataset, StatCodeSeg. We explore two distinct approaches: line-by-line analysis with context and range-based segment determination. We experiment with LLMs and fine-tuned SLMs. To support the generalizability of our approaches, we also include experiments on Python code from the computer science domain. Our results show that context-based line-by-line analysis is superior to range-based segmentation. Smaller language models like CodeBERT and an encoder-only version of CodeT5+ perform better than their LLM counterparts. Most notably, these two best-performing models, unlike the LLMs, did not see R code during pre-training and were only fine-tuned on 4,130 lines of manually annotated code.
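A minimal sketch of the line-by-line-with-context formulation, assuming a stub classifier: each line, together with a window of neighbouring lines, is labeled as beginning a new segment or continuing the current one. A real system would replace the stub with a fine-tuned model such as CodeBERT; the R snippet and the labeling rule here are invented.

```python
# Sketch of line-by-line segmentation with context. The classifier is a trivial
# stand-in for a fine-tuned model; everything here is hypothetical.
def classify(line: str, context: list) -> str:
    """Return 'B' (begins a segment) or 'I' (inside the current segment)."""
    # Stub rule: blank lines and comments start a new segment.
    return "B" if line.strip() == "" or line.lstrip().startswith("#") else "I"

def segment(lines: list, window: int = 2) -> list:
    segments, current = [], []
    for i, line in enumerate(lines):
        context = lines[max(0, i - window): i + window + 1]
        if classify(line, context) == "B" and current:
            segments.append(current)
            current = []
        current.append(line)
    if current:
        segments.append(current)
    return segments

r_code = ["# load data", "df <- read.csv('x.csv')", "", "# fit model",
          "m <- lm(y ~ x, data = df)"]
for seg in segment(r_code):
    print(seg)
```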
What should a function that extrapolates beyond known input/output examples look like? This is a tricky question to answer in general, as any function matching the outputs on those examples can in principle be a correct extrapolant. We argue that a "good" extrapolant should follow certain kinds of rules, and here we study a particularly appealing criterion for rule-following in list functions: that the function should behave predictably even when certain elements are removed. In functional programming, a standard way to express such removal operations is by using a filter function. Accordingly, our paper introduces a new semantic class of functions -- the filter equivariant functions. We show that this class contains interesting examples, prove some basic theorems about it, and relate it to the well-known class of map equivariant functions. We also present a geometric account of filter equivariants, showing how they correspond naturally to certain simplicial structures. Our highlight result is the amalgamation algorithm, which constructs any filter-equivariant function's output by first studying how it behaves on sublists of the input, in a way that extrapolates perfectly.
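Under one natural reading of the property (the paper's formal definition may differ), a list function f is filter equivariant when it commutes with filtering by any element-wise predicate. The quick empirical check below suggests that reversing and sorting behave this way while taking the first element does not.

```python
# Empirical check of one reading of filter equivariance:
# f(filter(p, xs)) == filter(p, f(xs)) for every element-wise predicate p.
def commutes_with_filter(f, xs, preds):
    return all(list(f(list(filter(p, xs)))) == list(filter(p, f(list(xs))))
               for p in preds)

xs = [5, 2, 7, 2, 8, 1]
preds = [lambda x: x % 2 == 0, lambda x: x > 3, lambda x: True, lambda x: False]

print(commutes_with_filter(lambda ys: ys[::-1], xs, preds))  # reverse: True
print(commutes_with_filter(sorted, xs, preds))               # sort:    True
print(commutes_with_filter(lambda ys: ys[:1], xs, preds))    # take-1:  False
```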
We present a novel dependent linear type theory in which the multiplicity of some variable - i.e., the number of times the variable can be used in a program - can depend on other variables. This allows us to give precise resource annotations to many higher-order functions that cannot be adequately typed in any other system. Inspired by the Dialectica translation, our typing discipline is obtained by embedding linear logic into dependent type theory and specifying how the embedded logic interacts with the host theory. We can then use a standard natural numbers type to obtain a quantitative typing system with dependent multiplicities. We characterise the semantics for our theory as a combination of standard models of dependent type theory and linear logic. Our system can be added to any dependently typed language, which we demonstrate with an implementation in Agda.
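As an illustration of what a dependent multiplicity can express, consider a replicate-style function whose argument is used exactly $n$ times, where $n$ is the value of an earlier argument. The notation below is hypothetical and is not the paper's:

```latex
% Hypothetical notation, only to illustrate a "dependent multiplicity":
% the annotation [n] on the arrow says the argument of type A may be used
% exactly n times, where n is the value of the first (dependent) argument.
\[
  \mathsf{replicate} \;:\; (n : \mathbb{N}) \to A \to^{[n]} \mathsf{Vec}\;n\;A
\]
```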
Formally specifying, let alone verifying, properties of systems involving multiple programming languages is inherently challenging. We introduce Heterogeneous Dynamic Logic (HDL), a framework for combining reasoning principles from distinct (dynamic) program logics in a modular and compositional way. HDL mirrors the architecture of satisfiability modulo theories (SMT): Individual dynamic logics, along with their calculi, are treated as dynamic theories that can be flexibly combined to reason about heterogeneous systems whose components are verified using different program logics. HDL provides two key operations: Lifting extends an individual dynamic theory with new program constructs (e.g., the havoc operation or regular programs) and automatically augments its calculus with sound reasoning principles for the new constructs; and Combination enables cross-language reasoning in a single modality via Heterogeneous Dynamic Theories, facilitating the reuse of existing proof infrastructure. We formalize dynamic theories, their lifting and combination in Isabelle, and prove the soundness of all proof rules. We also prove relative completeness theorems for lifting and combination: Under common assumptions, reasoning about lifted or combined theories is no harder than reasoning about the constituent dynamic theories and their common first-order structure (i.e., the "data theory"). We demonstrate HDL's utility by verifying an automotive case study in which a Java controller (formalized in Java dynamic logic) steers a plant model (formalized in differential dynamic logic).
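To convey the flavor of cross-language reasoning in a single modality, here is a schematic safety property, in illustrative notation rather than HDL's concrete syntax, that sequences a Java controller step with a plant evolution from differential dynamic logic over a shared variable $v$:

```latex
% Illustrative only (not HDL's concrete syntax): one heterogeneous modality
% sequencing a Java controller step alpha_Java with a plant evolution beta_dL,
% both constraining the shared first-order variable v.
\[
  v \le v_{\max} \;\rightarrow\; [\,\alpha_{\mathrm{Java}} \,;\, \beta_{\mathrm{dL}}\,]\; v \le v_{\max}
\]
```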
General program equivalence is undecidable. However, if we abstract away the semantics of statements, then this problem becomes not just decidable, but practically feasible. For instance, a program of the form "if $b$ then $e$ else $f$" should be equivalent to "if not $b$ then $f$ else $e$" - no matter what $b$, $e$ and $f$ are. This kind of equivalence is known as propositional equivalence. In this extended abstract, we discuss recent developments in propositional program equivalence from the perspective of (Guarded) Kleene Algebra with Tests, or (G)KAT.
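Written out in KAT, where tests form a Boolean algebra and "if $b$ then $e$ else $f$" is encoded as $b\,e + \bar b\,f$, the abstract's example becomes a two-step equational argument:

```latex
% The abstract's example in KAT, where "if b then e else f" is encoded as
% b e + (not b) f and tests form a Boolean algebra:
\[
  \mathbf{if}\ b\ \mathbf{then}\ e\ \mathbf{else}\ f
  \;=\; b\,e + \bar{b}\,f
  \;=\; \bar{b}\,f + \bar{\bar{b}}\,e
  \;=\; \mathbf{if}\ \neg b\ \mathbf{then}\ f\ \mathbf{else}\ e ,
\]
% using commutativity of + and the fact that test negation is an involution.
```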
Probabilistic programming has become a standard practice to model stochastic events and learn about the behavior of nature in different scientific contexts, ranging from Genetics and Ecology to Linguistics and Psychology. However, domain practitioners (such as biologists) also need to be experts in statistics in order to select which probabilistic model is suitable for a given particular problem, relying then on probabilistic inference engines such as Stan, Pyro or Edward to fine-tune the parameters of that particular model. Probabilistic programming would be more useful if model selection were made automatic, without requiring statistics expertise from the end user. Automatically selecting the model is challenging because of the large search space of probabilistic programs that needs to be explored, because most of that search space contains invalid programs, and because invalid programs may only be detected in some executions, due to their probabilistic nature. We propose a type system to statically reject invalid probabilistic programs, a type-directed synthesis algorithm that guarantees that generated programs are type-safe by construction, and a heuristic search procedure to handle the vast search space. We collect a number of probabilistic programs from the literature, and use them to compare our method with both a type-agnostic random search and a data-guided method from the literature (DaPPer). Our results show that our technique outperforms both random search and DaPPer, especially on more complex programs. This drastic performance difference in synthesis allows for fast sampling of programs and enables techniques that previously suffered from the complexity of synthesis, such as Genetic Programming, to be applied.
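The following toy Python sketch conveys the idea of type-directed pruning during enumeration: candidate model skeletons are only emitted when their parameters satisfy a simple range discipline, so ill-formed models never reach execution. The distributions, constants, and rules are invented and are not the paper's type system.

```python
# Toy sketch of type-directed generation: candidates are pruned by a simple
# "type" (value-range) discipline before any sampling happens.
from itertools import product

CONSTS = [-1.0, 0.0, 0.3, 1.0, 2.5]

RULES = {  # distribution -> predicate over its parameter tuple
    "Bernoulli": lambda ps: len(ps) == 1 and 0.0 <= ps[0] <= 1.0,
    "Normal":    lambda ps: len(ps) == 2 and ps[1] > 0.0,
}

def enumerate_models(max_params=2):
    for dist, rule in RULES.items():
        for arity in range(1, max_params + 1):
            for ps in product(CONSTS, repeat=arity):
                if rule(ps):                    # type-directed pruning
                    yield (dist, ps)

models = list(enumerate_models())
print(len(models), models[:3])   # only well-formed candidates survive
```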
Compilers play a central role in translating high-level code into executable programs, making their correctness essential for ensuring code safety and reliability. While extensive research has focused on verifying the correctness of compilers for single-language compilation, the correctness of cross-language compilation - which involves the interaction between two languages and their respective compilers - remains largely unexplored. To fill this research gap, we propose CrossLangFuzzer, a novel framework that introduces a universal intermediate representation (IR) for JVM-based languages and automatically generates cross-language test programs with diverse type parameters and complex inheritance structures. After generating the initial IR, CrossLangFuzzer applies three mutation techniques - LangShuffler, FunctionRemoval, and TypeChanger - to enhance program diversity. By evaluating both the original and mutated programs across multiple compiler versions, CrossLangFuzzer successfully uncovered 10 confirmed bugs in the Kotlin compiler, 4 confirmed bugs in the Groovy compiler, 7 confirmed bugs in the Scala 3 compiler, 2 confirmed bugs in the Scala 2 compiler, and 1 confirmed bug in the Java compiler. Among all mutators, TypeChanger is the most effective, detecting 11 of the 24 compiler bugs. Furthermore, we analyze the symptoms and root causes of cross-compilation bugs, examining the respective responsibilities of language compilers when incorrect behavior occurs during cross-language compilation. To the best of our knowledge, this is the first work specifically focused on identifying and diagnosing compiler bugs in cross-language compilation scenarios. Our research helps to understand these challenges and contributes to improving compiler correctness in multi-language environments.
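A high-level sketch of the mutate-compile-compare loop, with a TypeChanger-style mutation, is shown below. The toy IR, the mutator, and the compiler drivers are stubs introduced for illustration; this is not CrossLangFuzzer's implementation.

```python
# Stub sketch of a mutate-compile-compare fuzzing loop over a toy IR.
import random

TYPES = ["Int", "String", "List<Int>", "T"]   # toy type universe

def type_changer(ir: dict) -> dict:
    """Swap the declared type of one randomly chosen field (a TypeChanger-style mutation)."""
    field = random.choice(list(ir["fields"]))
    return {**ir, "fields": {**ir["fields"], field: random.choice(TYPES)}}

def compile_with(compiler: str, ir: dict) -> str:
    """Stub: pretend to compile the IR and return 'ok' or an error message."""
    return "ok"

def fuzz_once(ir: dict, compilers=("kotlinc", "scalac")) -> bool:
    mutant = type_changer(ir)
    results = {c: compile_with(c, mutant) for c in compilers}
    return len(set(results.values())) > 1      # disagreement signals a potential bug

seed_ir = {"class": "Box", "fields": {"value": "T", "label": "String"}}
print(fuzz_once(seed_ir))
```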
We present a system of efficient methods for traversing and combining associative collection data structures. A distinguishing feature of the system is that, like traditional sequential iterator libraries, it does not require specialized compiler infrastructure or staged compilation for efficiency and composability. By using a representation based on indexed streams, the library can express complex joins over input collections while using no intermediate allocations. We implement the library for the Lean, Morphic, and Rust programming languages and provide a mechanized proof of functional correctness in Lean.
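The core idea can be mimicked in plain Python: view each associative collection as an iterator of (key, value) pairs sorted by key, and compute an intersection join by advancing whichever side is behind, so results are produced lazily with no intermediate collections. This mirrors the idea only; it is not the paper's library or its API.

```python
# Sketch of an indexed-stream-style intersection join over two sorted
# key-value streams, producing results lazily with no intermediate collections.
def indexed(d):
    return iter(sorted(d.items()))

def join(s, t):
    """Yield (key, (v, w)) for keys present in both sorted streams s and t."""
    a, b = next(s, None), next(t, None)
    while a is not None and b is not None:
        if a[0] == b[0]:
            yield a[0], (a[1], b[1])
            a, b = next(s, None), next(t, None)
        elif a[0] < b[0]:
            a = next(s, None)
        else:
            b = next(t, None)

prices = {"apple": 3, "kiwi": 2, "pear": 4}
stock  = {"apple": 10, "pear": 0, "plum": 7}
print(list(join(indexed(prices), indexed(stock))))   # [('apple', (3, 10)), ('pear', (4, 0))]
```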
Prompt engineering for LLMs remains complex, with existing frameworks either hiding complexity behind restrictive APIs or providing inflexible canned patterns that resist customization -- making sophisticated agentic programming challenging. We present the Prompt Declaration Language (PDL), a novel approach to prompt representation that tackles this fundamental complexity by bringing prompts to the forefront, enabling manual and automatic prompt tuning while capturing the composition of LLM calls together with rule-based code and external tools. By abstracting away the plumbing for such compositions, PDL aims at improving programmer productivity while providing a declarative representation that is amenable to optimization. This paper demonstrates PDL's utility through a real-world case study of a compliance agent. Tuning the prompting pattern of this agent yielded up to 4x performance improvement compared to using a canned agent and prompt pattern.
We present Pyrosome, a generic framework for modular language metatheory that embodies a novel approach to extensible semantics and compilation, implemented in Coq. Common techniques for semantic reasoning are often tied to the specific structures of the languages and compilers that they support. In Pyrosome, verified compilers are fully extensible, meaning that to extend a language (even with a new kind of effect) simply requires defining and verifying the compilation of the new feature, reusing the old correctness theorem for all other cases. The novel enabling idea is an inductive formulation of equivalence preservation that supports the addition of new rules to the source language, target language, and compiler. Pyrosome defines a formal, deeply embedded notion of programming languages with semantics given by dependently sorted equational theories, so all compiler-correctness proofs boil down to type-checking and equational reasoning. We support vertical composition of any compilers expressed in our framework in addition to feature extension. As a case study, we present a multipass compiler from System F with simple references, through CPS translation and closure conversion. Specifically, we demonstrate how we can build such a compiler incrementally by starting with a compiler for simply typed lambda-calculus and adding natural numbers, the unit type, recursive functions, and a global heap, then extending judgments with a type environment and adding type abstraction, all while reusing the original theorems. We also present a linear version of the simply typed CPS pass and compile a small imperative language to the simply typed target to show how Pyrosome handles substructural typing and imperative features.
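Schematically, the equivalence-preservation statement at the heart of this approach can be pictured as follows (illustrative notation, not Pyrosome's Coq definitions): provably equal source terms must compile to provably equal target terms.

```latex
% Schematic equivalence preservation, where C is the compiler, S the source
% equational theory and T the target theory (notation is illustrative only):
\[
  \Gamma \vdash_{S} e_1 \equiv e_2 : A
  \;\Longrightarrow\;
  C(\Gamma) \vdash_{T} C(e_1) \equiv C(e_2) : C(A)
\]
```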
There is growing excitement about building software verifiers, synthesizers, and other Automated Reasoning (AR) tools by combining traditional symbolic algorithms and Large Language Models (LLMs). Unfortunately, the current practice for constructing such neurosymbolic AR systems is an ad hoc programming model that does not have the strong guarantees of traditional symbolic algorithms, nor a deep enough synchronization of neural networks and symbolic reasoning to unlock the full potential of LLM-powered reasoning. I propose Neurosymbolic Transition Systems as a principled computational model that can underlie infrastructure for building neurosymbolic AR tools. In this model, symbolic state is paired with intuition, and state transitions operate over symbols and intuition in parallel. I argue why this new paradigm can scale logical reasoning beyond current capabilities while retaining the strong guarantees of symbolic algorithms, and I sketch out how the computational model I propose can be reified in a logic programming language.
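A minimal Python sketch of the proposed pairing, with hypothetical names: each state couples a symbolic component (trusted for soundness) with an intuition component (heuristic guidance, e.g. from an LLM), and a transition updates both in parallel.

```python
# Minimal sketch of a state that pairs symbolic content with intuition; only the
# symbolic side is trusted for soundness. All names are hypothetical.
from dataclasses import dataclass

@dataclass
class NState:
    constraints: frozenset      # symbolic part: facts the algorithm relies on
    intuition: str              # neural part: heuristic guidance, never trusted blindly

def step(state: NState, new_fact: str, hint: str) -> NState:
    # The symbolic part grows monotonically; the intuition is merely refreshed.
    return NState(state.constraints | {new_fact}, hint)

s0 = NState(frozenset({"x > 0"}), "try induction on x")
s1 = step(s0, "x <= 10", "the bound 10 suggests case-splitting")
print(s1.constraints, "|", s1.intuition)
```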
React has become the most widely used web front-end framework, enabling the creation of user interfaces in a declarative and compositional manner. Hooks are a set of APIs that manage side effects in functional components in React. However, their semantics are often seen as opaque to developers, leading to UI bugs. In this paper, we formalize the semantics of the essence of React Hooks, which we name React-tRace, providing a framework that clarifies their behavior. We demonstrate that our model captures the behavior of React by theoretically showing that it embodies essential properties of Hooks and empirically comparing our React-tRace definitional interpreter against a test suite. Furthermore, we showcase a practical visualization tool based on the formalization to demonstrate how developers can better understand the semantics of Hooks.
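As a toy model of the behavior being formalized (this is not React-tRace itself), the sketch below replays a hypothetical component on every render and matches the i-th useState call to the i-th persisted state cell by call order, which is exactly why Hooks must not be called conditionally.

```python
# Toy model of the usual explanation of useState: state cells are matched to
# hook calls by call order across renders. The component is hypothetical.
class Renderer:
    def __init__(self):
        self.cells = []          # persisted state cells, indexed by call order
        self.cursor = 0          # resets on every render
        self.pending = []

    def use_state(self, initial):
        i = self.cursor
        self.cursor += 1
        if i == len(self.cells):
            self.cells.append(initial)
        def set_state(v, i=i):
            self.pending.append((i, v))
        return self.cells[i], set_state

    def render(self, component):
        self.cursor = 0
        out = component(self)
        for i, v in self.pending:    # apply queued updates; the caller may then re-render
            self.cells[i] = v
        self.pending.clear()
        return out

def counter(r):
    count, set_count = r.use_state(0)
    set_count(count + 1)             # schedule an update (a real app would do this in a handler)
    return f"count = {count}"

r = Renderer()
print(r.render(counter))   # count = 0
print(r.render(counter))   # count = 1
```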