Programmable Co-Transcriptional Splicing: Realizing Regular Languages via Hairpin Deletion
Abstract
RNA co-transcriptionality, where RNA is spliced or folded during transcription from DNA templates, offers promising potential for molecular programming. It enables programmable folding of nano-scale RNA structures and has recently been shown to be Turing universal. While post-transcriptional splicing is well studied, co-transcriptional splicing is gaining attention for its efficiency, though its unpredictability still remains a challenge. In this paper, we focus on engineering co-transcriptional splicing, not only as a natural phenomenon but as a programmable mechanism for generating specific RNA target sequences from DNA templates. The problem we address is whether we can encode a set of RNA sequences for a given system onto a DNA template word, ensuring that all the sequences are generated through co-transcriptional splicing. Given that finding the optimal encoding has been shown to be NP-complete under the various energy models considered, we propose a practical alternative approach under the logarithmic energy model. More specifically, we provide a construction that encodes an arbitrary nondeterministic finite automaton (NFA) into a circular DNA template from which co-transcriptional splicing produces all sequences accepted by the NFA. As all finite languages can be efficiently encoded as NFA, this framework solves the problem of finding small DNA templates for arbitrary target sets of RNA sequences. The quest to obtain the smallest possible such templates naturally leads us to consider the problem of minimizing NFA and certain practically motivated variants of it, but as we show, those minimization problems are computationally intractable.