Browse, search and filter the latest cybersecurity research papers from arXiv
The Gillespie algorithm and its extensions are commonly used for the simulation of chemical reaction networks. A limitation of these algorithms is that they have to process and update the system after every reaction, requiring significant computation. Another class of algorithms, based on the tau-leaping method, is able to simulate multiple reactions at a time at the cost of decreased accuracy. We present a new algorithm for the exact simulation of chemical reaction networks that is capable of sampling multiple reactions at a time via a first-order approximation similarly to the tau-leaping methods. We prove that the algorithm has an improved runtime complexity compared to existing methods for the exact simulation of chemical reaction networks, and present an efficient and easy to use implementation that outperforms existing methods in practice.
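For orientation, the classical baseline that this abstract refers to, Gillespie's direct method, can be sketched as follows. This is a minimal illustration of the one-reaction-at-a-time scheme, not the new multi-reaction algorithm the authors present; the function name `gillespie_direct`, its parameters, and the toy reaction A -> B with rate constant `k` are assumptions made for the example.

```python
import math
import random


def gillespie_direct(x0, stoich, propensity_fns, t_end, rng):
    """Simulate one trajectory with Gillespie's direct method.

    x0             -- initial copy numbers, one entry per species
    stoich         -- stoich[j][i] is the change of species i when reaction j fires
    propensity_fns -- one function per reaction mapping the state to its propensity
    """
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        a = [f(x) for f in propensity_fns]
        a_total = sum(a)
        if a_total <= 0.0:
            break  # no reaction can fire any more
        # The waiting time to the next reaction is exponential with rate a_total.
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Pick reaction j with probability a[j] / a_total.
        threshold = rng.random() * a_total
        cumulative = 0.0
        for j, a_j in enumerate(a):
            cumulative += a_j
            if threshold < cumulative:
                break
        # The state is updated after every single reaction event.
        x = [xi + d for xi, d in zip(x, stoich[j])]
        trajectory.append((t, tuple(x)))
    return trajectory


# Hypothetical toy system: A -> B with propensity k * A, starting from 20 copies of A.
k = 1.0
trajectory = gillespie_direct(
    x0=[20, 0],
    stoich=[[-1, +1]],
    propensity_fns=[lambda x: k * x[0]],
    t_end=1e6,
    rng=random.Random(0),
)
```

Every loop iteration recomputes all propensities and processes exactly one reaction, which is the per-reaction cost the abstract describes as the limitation of this family of methods.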
Antiterminators are essential components of bacterial transcriptional regulation, allowing the control of gene expression in response to fluctuating environmental conditions. Among them, RNA-binding antiterminator proteins play a major role in preventing transcription termination by binding to specific RNA sequences. These RNA-binding antiterminators have been extensively studied for their role in regulating various metabolic pathways. However, their function in modulating the physiology of pathogens requires further investigation. This review focuses on RNA-binding proteins displaying CAT (Co-AntiTerminator) or ANTAR (AmiR and NasR Transcription Antitermination Regulators) domains reported in model bacteria. In particular, their structures, mechanism of action, and target genes will be described. The involvement of the antitermination mechanisms in bacterial pathogenicity is also discussed. This knowledge is crucial for understanding the regulatory mechanisms that control bacterial virulence, and opens up exciting prospects for future research, and potentially new alternative strategies to combat infectious diseases.
Dynamic Flux Balance Analysis (dFBA) extends the capabilities of FBA to track changes in metabolite concentrations and fluxes over time in response to environmental conditions and cellular processes. The key idea of the method called dynamic parallel FBA (dpFBA) is to assign each species to a separate compartment, and to perform dFBA on the individual compartments while keeping track of the shared pool of external metabolites at each time interval. The software package COBRApy for Constraint-Based Modelling currently offers only an out-of-the-box dFBA implementation for the simulation of the batch growth of a single species over an interval divided in $n$ time steps. Here we describe a simple method to extend COBRApy dFBA to a community of species, by implementing dpFBA without any modification of the current package.
The dynamics of gene regulatory networks is governed by the interaction between deterministic biochemical reactions and molecular noise. To understand how gene regulatory networks process information during cell state transitions, we study stochastic dynamics derived from a Boolean network model via its representation on the parameter space of Gaussian distributions, equipped with the Fisher information metric. This reformulation reveals that the trajectories of optimal information transfer are gradient flows of the Kullback-Leibler divergence. We demonstrate that the most efficient dynamics require isotropic decay rates across all nodes and that the noise intensity quantitatively determines the potential differentiation between the initial and final states. Furthermore, we show that paths minimizing biological cost correspond to metric geodesics that require noise suppression, leading to biologically irrelevant deterministic dynamics. Our approach frames noise and decay rates as fundamental control parameters for cellular differentiation, providing a geometric principle for the analysis and design of synthetic networks.
Deciphering complex gene-gene interactions remains challenging in transcriptomics as traditional methods often miss higher-order and nonlinear dependencies. This study introduces a quantum-inspired framework leveraging tensor networks (TNs) to optimally map expression data into a lower dimensional representation preserving biological locality. Using Quantum Mutual Information (QMI), a nonparametric measure natural for tensor networks, we quantify gene dependencies and establish statistical significance via permutation testing. This constructs robust interaction networks where the edges reflect biologically meaningful relationships that are resilient to random chance. The approach effectively distinguishes true regulatory patterns from experimental noise and biological stochasticity. To test the proposed method, we recover a gene regulatory network consisting of six pathway genes from single-cell RNA sequencing data comprising over 28,000 lymphoblastoid cells. Furthermore, we unveil several triadic regulatory mechanisms. By merging quantum physics inspired techniques with computational biology, our method provides novel insights into gene regulation, with applications in disease mechanisms and precision medicine.
Competitive inhibitors can, paradoxically, stimulate an enzymatic reaction at low to moderate doses. Competitive inhibition of an enzyme occurs when an inhibitor binds to the enzyme's binding site and blocks the enzyme's target molecule from binding. We recently proposed a detailed but straightforward mass action model for competitive inhibition of phosphoglycerate kinase 1 (PGK1) by Terazosin (TZ). The full PGK1 model has two substrates and two products which can be bound and released in either order, known as a random bi-bi mechanism. This model, with no further modification, predicts an increased reaction rate at low or moderate TZ doses, suggesting that stimulation is an intrinsic feature of competitive inhibition in enzymes with two products. This mechanism can aid in the development of novel therapies, particularly since enzyme activators are more rare and difficult to design than inhibitors. Here we propose a three-timescale reduction of that detailed model and show that the resulting rate equation retains three essential attributes of competitive inhibitor stimulation. These attributes are the biphasic dose response, the dependence on the relative rates of product dissociation from the binary and ternary complexes, and the parameter region where stimulation is possible. The resulting rate equation is a rational function which is a Monod function of each substrate, but quadratic in the denominator as a function of inhibitor dose.
Biomolecular Neural Networks (BNNs), artificial neural networks with biologically synthesizable architectures, achieve universal function approximation capabilities beyond simple biological circuits. However, training BNNs remains challenging due to the lack of target data. To address this, we propose leveraging Signal Temporal Logic (STL) specifications to define training objectives for BNNs. We build on the quantitative semantics of STL, enabling gradient-based optimization of the BNN weights, and introduce a learning algorithm that enables BNNs to perform regression and control tasks in biological systems. Specifically, we investigate two regression problems in which we train BNNs to act as reporters of dysregulated states, and a feedback control problem in which we train the BNN in closed-loop with a chronic disease model, learning to reduce inflammation while avoiding adverse responses to external infections. Our numerical experiments demonstrate that STL-based learning can solve the investigated regression and control tasks efficiently.
In previous works, we introduced the notion of dominant vertices. This is a set of nodes in the underlying network whose evolution determines the whole network's dynamics after a transient time. In this paper, we focus on the case of Boolean networks. We define a reduced graph on the dominant vertices and an induced automata network on this graph, which we prove is asymptotically equivalent to the original Boolean dynamics. Asymptotic conjugacy ensures that the systems, restricted to their respective attractors, are dynamically equivalent. For a significant class of networks, the induced automata network is indeed a reduction of the original system. In these cases, the reduction, which is obtained from the structure of dominant vertices, supplies a more tractable system with the same structure of attractors as the original one. Furthermore, the structure of the induced system allows us to establish bounds on the number and period of the attractors, as well as on the reduction of the basin's sizes and transient lengths. We illustrate this reduction by considering a class of networks, which we call clover networks, whose dominant set is a singleton. To get insight into the structure of the basins of attraction of Boolean networks with a single dominant vertex, we complement this work with a numerical exploration of the behavior of a parametrized ensemble of systems of this kind.
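To make the notion of attractors concrete, the following sketch enumerates the attractors of a small synchronous Boolean network by brute force. It illustrates only the objects being compared (fixed points and cycles), not the dominant-vertex reduction itself; the function `find_attractors` and the three-node example network are hypothetical.

```python
from itertools import product


def find_attractors(update_fns):
    """Return the attractors of a synchronous Boolean network.

    update_fns[i] maps the current state (a tuple of 0/1) to the next value
    of node i. Each attractor is returned as a tuple of states, rotated so
    that its smallest state comes first.
    """
    n = len(update_fns)

    def step(state):
        return tuple(f(state) for f in update_fns)

    attractors = set()
    for state in product((0, 1), repeat=n):
        seen = {}  # state -> position along the trajectory (insertion ordered)
        while state not in seen:
            seen[state] = len(seen)
            state = step(state)
        # `state` is the first repeated state; the cycle starts at its position.
        start = seen[state]
        cycle = [s for s, i in seen.items() if i >= start]
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


# Hypothetical 3-node network: nodes 0 and 1 copy each other, node 2 is their AND.
network = [
    lambda s: s[1],
    lambda s: s[0],
    lambda s: s[0] & s[1],
]
attractors = find_attractors(network)
```

Exhaustive enumeration visits all 2^n states, so it is only practical for small n; this cost is the reason a reduced system with the same attractor structure is attractive.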
miRNA-mRNA relations are closely linked to several biological processes and disease mechanisms. In a recent study, we tested the performance of large language models (LLMs) on extracting miRNA-mRNA relations from PubMed. PubMedBERT achieved the best performance, with an F1 score of 0.783 on the miRNA-mRNA Interaction Corpus (MMIC). Here, we first applied the fine-tuned PubMedBERT model to extract miRNA-mRNA relations from PubMed for chronic obstructive pulmonary disease (COPD), Alzheimer's disease (AD), stroke, type 2 diabetes mellitus (T2DM), chronic liver disease, and cancer. Next, we retrieved miRNA-drug relations using KinderMiner, a literature mining tool for relation extraction. Then, we constructed three interaction networks: (1) a disease-centric network, (2) a drug-centric network, and (3) a miRNA-centric network, comprising 3,497 nodes and 16,417 edges organized as a directed graph to capture complex biological relationships. Finally, we validated the drugs using MIMIC-IV. Our integrative approach revealed both established and novel candidate drugs for the diseases under study through 595 miRNA-drug relations extracted from PubMed. To the best of our knowledge, this is the first study to systematically extract and visualize relationships among four distinct biomedical entities: miRNA, mRNA, drug, and disease.
Despite the thousands of genes implicated in age-related phenotypes, effective interventions for aging remain elusive, a lack of progress rooted in the multifactorial nature of longevity and the functional interconnectedness of the molecular components implicated in aging. Here, we introduce a network medicine framework that integrates 2,358 longevity-associated genes onto the human interactome to identify existing drugs that can modulate aging processes. We find that genes associated with each hallmark of aging form a connected subgraph, or hallmark module, a discovery enabling us to measure the proximity of 6,442 clinically approved or experimental compounds to each hallmark. We then introduce a transcription-based metric, $pAGE$, which evaluates whether the drug-induced expression shifts reinforce or counteract known age-related expression changes. By integrating network proximity and $pAGE$, we identify multiple drug repurposing candidates that not only target specific hallmarks but also act to reverse their aging-associated transcriptional changes. Our findings are interpretable, revealing for each drug the molecular mechanisms through which it modulates the hallmark, offering an experimentally falsifiable framework to leverage genomic discoveries to accelerate drug repurposing for longevity.
Boolean networks are a powerful and popular modeling framework in systems biology, enabling the study of complex processes underlying gene regulation, signal transduction, and cellular decision-making. Most biological networks exhibit a high degree of canalization, a property of the Boolean update rules that stabilizes network dynamics. Despite its importance, existing software packages provide hardly any support for generating Boolean networks with defined canalization properties. We present BoolForge, a Python toolbox for the analysis and random generation of Boolean functions and networks, with a particular focus on canalization. BoolForge allows users to (i) generate random Boolean functions with specified canalizing depth, layer structure, or other structural constraints; (ii) construct random Boolean networks with tunable topological and functional properties; and (iii) compute structural and dynamical features including network attractors, robustness, and modularity. BoolForge enables researchers to rapidly prototype biological Boolean network models, explore the relationship between structure and dynamics, and generate ensembles of networks for statistical analysis. It is lightweight, adaptable, and fully compatible with existing Boolean network analysis tools. BoolForge is implemented in Python (version 3.8+), with no platform-specific dependencies. The software is distributed under the MIT License and will be maintained for at least two years following publication. Source code, documentation, and tutorial notebooks are freely available at: https://github.com/ckadelka/BoolForge. BoolForge can be installed via `pip install git+https://github.com/ckadelka/BoolForge`.
The MØD computational framework implements rule-based generative chemistries as explicit transformations of graphs representing chemical structural formulae. Here, we expand MØD by a stochastic simulation module that simulates the time evolution of species concentrations using Gillespie's well-known stochastic simulation algorithm (SSA). This module distinguishes itself among competing implementations of rule-based stochastic simulation engines by its flexible network expansion mechanism and its functionality for defining custom reaction rate functions. It enables direct sampling from actual reactions instead of rules. We present methodology and implementation details followed by examples which demonstrate the capabilities of the stochastic simulation engine.
Understanding the modular structure and central elements of complex biological networks is critical for uncovering system-level mechanisms in disease. Here, we constructed weighted gene co-expression networks from bulk RNA-seq data of rheumatoid arthritis (RA) synovial tissue, using pairwise correlation and a percolation-guided thresholding strategy. Community detection with Louvain and Leiden algorithms revealed robust modules, and node-strength ranking identified the top 50 hub genes globally and within communities. To assess novelty, we integrated genome-wide association studies (GWAS) with literature-based evidence from PubMed, highlighting five high-centrality genes with little to no prior RA-specific association. Functional enrichment confirmed their roles in immune-related processes, including adaptive immune response and lymphocyte regulation. Notably, these hubs showed strong positive correlations with T- and B-cell markers and negative correlations with NK-cell markers, consistent with RA immunopathology. Overall, our framework demonstrates how correlation-based network construction, modularity-driven clustering, and centrality-guided novelty scoring can jointly reveal informative structure in omics-scale data. This generalizable approach offers a scalable path to gene prioritization in RA and other autoimmune conditions.
Biological pathways map gene-gene interactions that govern all human processes. Despite their importance, most ML models treat genes as unstructured tokens, discarding known pathway structure. The latest pathway-informed models capture pathway-pathway interactions, but still treat each pathway as a "bag of genes" via MLPs, discarding its topology and gene-gene interactions. We propose a Graph Attention Network (GAT) framework that models pathways at the gene level. We show that GATs generalize much better than MLPs, achieving an 81% reduction in MSE when predicting pathway dynamics under unseen treatment conditions. We further validate the correctness of our biological prior by encoding drug mechanisms via edge interventions, boosting model robustness. Finally, we show that our GAT model is able to correctly rediscover all five gene-gene interactions in the canonical TP53-MDM2-MDM4 feedback loop from raw time-series mRNA data, demonstrating potential to generate novel biological hypotheses directly from experimental data.
Microbiomes are a vital part of the human body, engaging in tasks like food digestion and immune defense. Their structure and function must be understood in order to promote host health and facilitate swift recovery during disease. Due to the difficulties in experimentally studying these systems in situ, more research is being conducted in the field of mathematical modeling. Visualizing spatiotemporal data is challenging, and current tools that simulate microbial communities' spatial and temporal development often only provide limited functionalities, often requiring expert knowledge to generate useful results. To overcome these limitations, we provide a user-friendly tool to interactively explore spatiotemporal simulation data, called MicroLabVR, which transfers spatial data into virtual reality (VR) while following guidelines to enhance user experience (UX). With MicroLabVR, users can import CSV datasets containing population growth, substance concentration development, and metabolic flux distribution data. The implemented visualization methods allow users to evaluate the dataset in a VR environment interactively. MicroLabVR aims to improve data analysis for the user by allowing the exploration of microbiome data in their spatial context.
Efficient information processing is crucial for both living organisms and engineered systems. The mutual information rate, a core concept of information theory, quantifies the amount of information shared between the trajectories of input and output signals, and enables the quantification of information flow in dynamic systems. A common approach for estimating the mutual information rate is the Gaussian approximation which assumes that the input and output trajectories follow Gaussian statistics. However, this method is limited to linear systems, and its accuracy in nonlinear or discrete systems remains unclear. In this work, we assess the accuracy of the Gaussian approximation for non-Gaussian systems by leveraging Path Weight Sampling (PWS), a recent technique for exactly computing the mutual information rate. In two case studies, we examine the limitations of the Gaussian approximation. First, we focus on discrete linear systems and demonstrate that, even when the system's statistics are nearly Gaussian, the Gaussian approximation fails to accurately estimate the mutual information rate. Second, we explore a continuous diffusive system with a nonlinear transfer function, revealing significant deviations between the Gaussian approximation and the exact mutual information rate as nonlinearity increases. Our results provide a quantitative evaluation of the Gaussian approximation's performance across different stochastic models and highlight when more computationally intensive methods, such as PWS, are necessary.
Computationally inferring mechanistic insights from typical biological data is a challenging pursuit. Even the highest-quality experimental data come with challenges. There are always sources of noise, a limit to how often we can measure the system, and we can rarely measure all the relevant states that participate in the underlying complexity. There are usually sources of uncertainty in model development, which give rise to multiple competing model structures. To underscore the need for further analysis of structural uncertainty in modeling, we use a meta-analysis across six journals covering mathematical biology and show that a huge number of models for biological systems are developed each year, but model selection and comparison across model structures appear to be less common. We walk through a case study involving inference of the regulatory network structure involved in a developmental decision in the nematode \textit{Pristionchus pacificus}. We use real biological data and compare across 13,824 models, each corresponding to a different regulatory network structure, to determine which regulatory features are supported by the data across three experimental conditions. We find that the best-fitting models for each experimental condition share a combination of features and identify a regulatory network that is common across the model sets for each condition. This model can describe the data across the experimental conditions we considered and exhibits a high degree of positive regulation and interconnectivity between the key regulators, \textit{eud-1}, \textit{sult-1}, and \textit{nhr-40}. While the biological results are specific to the molecular biology of development in \textit{Pristionchus pacificus}, the general modeling framework and underlying challenges we faced doing this analysis are widespread across biology, chemistry, physics, and many other scientific disciplines.
We propose a direct optimization framework for learning reduced and sparse chemical reaction networks (CRNs) from time-series trajectory data. In contrast to widely used indirect methods, such as those based on sparse identification of nonlinear dynamics (SINDy), which infer reaction dynamics by fitting numerically estimated derivatives, our approach fits entire trajectories by solving a dynamically constrained optimization problem. This formulation enables the construction of reduced CRNs that are both low-dimensional and sparse, while preserving key dynamical behaviors of the original system. We develop an accelerated proximal gradient algorithm to efficiently solve the resulting non-convex optimization problem. Through illustrative examples, including a Drosophila circadian oscillator and a glycolytic oscillator, we demonstrate the ability of our method to recover accurate and interpretable reduced-order CRNs. Notably, the direct approach avoids the derivative estimation step and mitigates error accumulation issues inherent in indirect methods, making it a robust alternative for data-driven CRN realizations.
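As a rough illustration of how an accelerated proximal gradient method produces sparse solutions, the sketch below applies a generic FISTA-style iteration with an L1 penalty to a toy convex problem. It is not the authors' trajectory-fitting formulation, which is non-convex and dynamically constrained; the names `soft_threshold` and `accelerated_proximal_gradient`, the step size, and the toy objective are all assumptions chosen for the example.

```python
import math


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied elementwise."""
    return [math.copysign(max(abs(vi) - tau, 0.0), vi) for vi in v]


def accelerated_proximal_gradient(grad, x0, lam, step, n_iter):
    """Minimise f(x) + lam * ||x||_1 given the gradient of the smooth part f.

    Generic FISTA-style iteration: a gradient step on f taken at an
    extrapolated point y, followed by soft-thresholding.
    """
    x_prev = list(x0)
    y = list(x0)
    t = 1.0
    for _ in range(n_iter):
        g = grad(y)
        x = soft_threshold([yi - step * gi for yi, gi in zip(y, g)], step * lam)
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next  # momentum weight
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev, t = x, t_next
    return x_prev


# Toy problem (an assumption): f(x) = 0.5 * ||x - b||^2, whose exact
# L1-regularised minimiser is the soft-thresholded vector b.
b = [3.0, -2.0, 0.5]
lam = 1.0
solution = accelerated_proximal_gradient(
    grad=lambda x: [xi - bi for xi, bi in zip(x, b)],
    x0=[0.0, 0.0, 0.0],
    lam=lam,
    step=0.5,
    n_iter=200,
)
```

The third coordinate is driven exactly to zero by the thresholding step, which is the mechanism by which an L1 penalty prunes terms; in a CRN setting the analogous effect would be removing reactions from the candidate network.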