Browse, search and filter the latest cybersecurity research papers from arXiv
Neural dynamics underlie behaviors from memory to sleep, yet identifying mechanisms for higher-order phenomena (e.g., social interaction) is experimentally challenging. Existing whole-brain models often fail to scale to single-neuron resolution, omit behavioral readouts, or rely on PCA/conv pipelines that miss long-range, non-linear interactions. We introduce a sparse-attention whole-brain foundation model (SBM) for larval zebrafish that forecasts neuron spike probabilities conditioned on sensory stimuli and links brain state to behavior. SBM factorizes attention across neurons and along time, enabling whole-brain scale and interpretability. On a held-out subject, it achieves mean absolute error <0.02 with calibrated predictions and stable autoregressive rollouts. Coupled to a permutation-invariant behavior head, SBM enables gradient-based synthesis of neural patterns that elicit target behaviors. This framework supports rapid, behavior-grounded exploration of complex neural phenomena.
We model sensory streams as observations from high-dimensional stochastic dynamical systems and conceptualize sensory neurons as self-supervised learners of compact representations of such dynamics. From prior experience, neurons learn coherent sets (regions of stimulus state space whose trajectories evolve cohesively over finite times) and assign membership indices to new stimuli. Coherent sets are identified via spectral clustering of the stochastic Koopman operator (SKO), where the sign pattern of a subdominant singular function partitions the state space into minimally coupled regions. For multivariate Ornstein-Uhlenbeck processes, this singular function reduces to a linear projection onto the dominant singular vector of the whitened state-transition matrix. Encoding this singular vector as a receptive field enables neurons to compute membership indices via the projection sign in a biologically plausible manner. Each neuron detects either a predictive coherent set (stimuli with common futures) or a retrospective coherent set (stimuli with common pasts), suggesting a functional dichotomy among neurons. Since neurons lack access to explicit dynamical equations, the requisite singular vectors must be estimated directly from data, for example, via past-future canonical correlation analysis on lag-vector representations, an approach that naturally extends to nonlinear dynamics. This framework provides a novel account of neuronal temporal filtering, the ubiquity of rectification in neural responses, and known functional dichotomies. Coherent-set clustering thus emerges as a fundamental computation underlying sensory processing and transferable to bio-inspired artificial systems.
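The data-driven estimate described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single time lag rather than lag-vector embeddings, a toy two-dimensional linear Gaussian system standing in for a discretized Ornstein-Uhlenbeck process, and hypothetical names (`past_future_cca`, `w_pred`, `w_retro`). Labeling the past-side filter as "predictive" and the future-side filter as "retrospective" is our reading of the abstract.

```python
import numpy as np

def inv_sqrt(C):
    # Inverse square root of a symmetric positive-definite matrix.
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def past_future_cca(X, lag=1):
    """Single-lag past-future CCA on a trajectory X of shape (T, n)."""
    past, future = X[:-lag], X[lag:]
    past = past - past.mean(axis=0)
    future = future - future.mean(axis=0)
    n = len(past)
    C00 = past.T @ past / n        # covariance of past states
    C11 = future.T @ future / n    # covariance of future states
    C01 = past.T @ future / n      # past-future cross-covariance
    W0, W1 = inv_sqrt(C00), inv_sqrt(C11)
    # SVD of the whitened cross-covariance gives the canonical directions.
    U, s, Vt = np.linalg.svd(W0 @ C01 @ W1)
    w_pred = W0 @ U[:, 0]    # filter on the past side (assumed "predictive")
    w_retro = W1 @ Vt[0]     # filter on the future side (assumed "retrospective")
    return w_pred, w_retro, s[0]

# Toy system (an assumption): one slow mode and one fast mode.
rng = np.random.default_rng(0)
A = np.array([[0.95, 0.0], [0.0, 0.3]])
X = np.zeros((20000, 2))
for t in range(1, len(X)):
    X[t] = A @ X[t - 1] + rng.normal(size=2)

w_pred, w_retro, rho = past_future_cca(X)
# Membership index: the sign of the projection onto the receptive field.
membership = np.where(X @ w_pred >= 0, 1, -1)
```

In this toy case the leading filter should align with the slow coordinate, since that is the direction whose future is most predictable from its past.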
This overview of integrated information theory (IIT) emphasizes IIT's "consciousness-first" approach to what exists. Consciousness demonstrates to each of us that something exists--experience--and reveals its essential properties--the axioms of phenomenal existence. IIT formulates these properties operationally, yielding the postulates of physical existence. To exist intrinsically or absolutely, an entity must have cause-effect power upon itself, in a specific, unitary, definite and structured manner. IIT's explanatory identity claims that an entity's cause-effect structure accounts for all properties of an experience--essential and accidental--with no additional ingredients. These include the feeling of spatial extendedness, temporal flow, of objects binding general concepts with particular configurations of features, and of qualia such as colors and sounds. IIT's intrinsic ontology has implications for understanding meaning, perception, and free will, for assessing consciousness in patients, infants, other species, and artifacts, and for reassessing our place in nature.
Reconstructing images seen by people from their fMRI brain recordings provides a non-invasive window into the human brain. Despite recent progress enabled by diffusion models, current methods often lack faithfulness to the actual seen images. We present "Brain-IT", a brain-inspired approach that addresses this challenge through a Brain Interaction Transformer (BIT), allowing effective interactions between clusters of functionally similar brain voxels. These functional clusters are shared by all subjects, serving as building blocks for integrating information both within and across brains. All model components are shared by all clusters and subjects, allowing efficient training with a limited amount of data. To guide the image reconstruction, BIT predicts two complementary localized patch-level image features: (i) high-level semantic features, which steer the diffusion model toward the correct semantic content of the image; and (ii) low-level structural features, which help to initialize the diffusion process with the correct coarse layout of the image. BIT's design enables direct flow of information from brain-voxel clusters to localized image features. Through these principles, our method achieves image reconstructions from fMRI that are faithful to the seen images and surpass current SotA approaches both visually and by standard objective metrics. Moreover, with only 1 hour of fMRI data from a new subject, we achieve results comparable to current methods trained on full 40-hour recordings.
In control problems and basic scientific modeling, it is important to compare observations with dynamical simulations. For example, comparing two neural systems can shed light on the nature of emergent computations in the brain and deep neural networks. Recently, Ostrow et al. (2023) introduced Dynamical Similarity Analysis (DSA), a method to measure the similarity of two systems based on their recurrent dynamics rather than geometry or topology. However, DSA does not consider how inputs affect the dynamics, meaning that two similar systems, if driven differently, may be classified as different. Because real-world dynamical systems are rarely autonomous, it is important to account for the effects of input drive. To this end, we introduce a novel metric for comparing both intrinsic (recurrent) and input-driven dynamics, called InputDSA (iDSA). InputDSA extends the DSA framework by estimating and comparing both input and intrinsic dynamic operators using a variant of Dynamic Mode Decomposition with control (DMDc) based on subspace identification. We demonstrate that InputDSA can successfully compare partially observed, input-driven systems from noisy data. We show that when the true inputs are unknown, surrogate inputs can be substituted without a major deterioration in similarity estimates. We apply InputDSA on Recurrent Neural Networks (RNNs) trained with Deep Reinforcement Learning, identifying that high-performing networks are dynamically similar to one another, while low-performing networks are more diverse. Lastly, we apply InputDSA to neural data recorded from rats performing a cognitive task, demonstrating that it identifies a transition from input-driven evidence accumulation to intrinsically-driven decision-making. Our work demonstrates that InputDSA is a robust and efficient method for comparing intrinsic dynamics and the effect of external input on dynamical systems.
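The operator-estimation step mentioned in this abstract can be illustrated with a plain least-squares fit of x_{t+1} = A x_t + B u_t. This is only a sketch under simplifying assumptions: it assumes full state observation and uses ordinary least squares, not the subspace-identification variant of DMDc the abstract refers to, and it omits the similarity comparison between systems. The function name `fit_dmdc` and the toy matrices are hypothetical.

```python
import numpy as np

def fit_dmdc(X, U):
    """Least-squares estimate of A and B in x[t+1] = A x[t] + B u[t].

    X: states, shape (T + 1, n). U: inputs, shape (T, m).
    """
    n = X.shape[1]
    Omega = np.hstack([X[:-1], U])                      # stacked [x_t, u_t], shape (T, n + m)
    G, *_ = np.linalg.lstsq(Omega, X[1:], rcond=None)   # G = [A^T; B^T]
    return G[:n].T, G[n:].T

# Toy input-driven system (an assumption, for illustration only).
rng = np.random.default_rng(0)
A_true = np.array([[0.9, -0.2], [0.2, 0.9]])
B_true = np.array([[1.0], [0.5]])
U = rng.normal(size=(500, 1))
X = np.zeros((501, 2))
for t in range(500):
    X[t + 1] = A_true @ X[t] + B_true @ U[t] + 0.01 * rng.normal(size=2)

A_hat, B_hat = fit_dmdc(X, U)
```

Fitting A and B jointly is what separates intrinsic dynamics from input drive: regressing on the state alone would fold the input's effect into the estimate of A.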
It is known that listeners lose the ability to discriminate the direction of motion of a revolving sound (clockwise vs. counterclockwise) beyond a critical velocity ("the upper limit"), primarily due to degraded front-back discrimination. Little is known about how this ability is affected by simultaneously present distractor sounds, despite the real-life importance of tracking moving sounds in the presence of distractors. We hypothesized that the presence of a static distractor sound would impair the perception of moving target sounds and reduce the upper limit, and show that this is indeed the case. A distractor on the right was as effective as a distractor at the front in reducing the upper limit despite the importance of resolving front-back confusions. By manipulating the spectral content of both the target and distractor, we found that the upper limit was reduced if and only if the distractor spectrally overlaps with the target in the frequency range relevant for front/back discrimination; energetic masking thus explains the upper limit reduction by the distractor. We did not find any evidence for informational masking by the distractor. Our findings form the first steps towards a better understanding of the tracking of multiple sounds in the presence of distractors.
The rapid aging of societies is intensifying demand for autonomous care robots; however, most existing systems are task-specific and rely on handcrafted preprocessing, limiting their ability to generalize across diverse scenarios. A prevailing theory in cognitive neuroscience proposes that the human brain operates through hierarchical predictive processing, which underlies flexible cognition and behavior by integrating multimodal sensory signals. Inspired by this principle, we introduce a hierarchical multimodal recurrent neural network grounded in predictive processing under the free-energy principle, capable of directly integrating over 30,000-dimensional visuo-proprioceptive inputs without dimensionality reduction. The model was able to learn two representative caregiving tasks, rigid-body repositioning and flexible-towel wiping, without task-specific feature engineering. We demonstrate three key properties: (i) self-organization of hierarchical latent dynamics that regulate task transitions, capture variability in uncertainty, and infer occluded states; (ii) robustness to degraded vision through visuo-proprioceptive integration; and (iii) asymmetric interference in multitask learning, where the more variable wiping task had little influence on repositioning, whereas learning the repositioning task led to a modest reduction in wiping performance, while the model maintained overall robustness. Although the evaluation was limited to simulation, these results establish predictive processing as a universal and scalable computational principle, pointing toward robust, flexible, and autonomous caregiving robots while offering theoretical insight into the human brain's ability to achieve flexible adaptation in uncertain real-world environments.
In this work, we examine the conditions for the emergence of chimera-like states in Ising systems. We study an Ising chain with periodic boundaries in contact with a thermal bath at temperature T, which induces stochastic changes in the spin variables. To capture the non-locality needed for chimera formation, we introduce a model setup with non-local diffusion of spin values through the whole system. More precisely, diffusion is modeled through spin-exchange interactions between units up to a distance R, using Kawasaki dynamics. This setup mimics, e.g., neural media such as the brain in the presence of electrical (diffusive) interactions. We explore the influence of such non-local dynamics on the emergence of complex spatiotemporal synchronization patterns of activity. Depending on system parameters, we report, for the first time, chimera-like states in the Ising model, characterized by relatively stable moving domains of spins with different local magnetization. We analyze the system at T=0, both analytically and via simulations, and compute its phase diagram, revealing rich behavior: regions with only chimeras, coexistence of chimeras and stable domains, and metastable chimeras that decay into uniform stable domains. This study offers fundamental insights into how coherent and incoherent synchronization patterns can arise in complex networked systems such as the brain.
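A minimal sketch of range-R Kawasaki (spin-exchange) dynamics on a periodic Ising chain is given below. Several details are assumptions, since the abstract does not specify them: a nearest-neighbor ferromagnetic energy with coupling `J`, a Metropolis acceptance rule (at T=0, only moves that do not raise the energy are accepted), and uniform sampling of the exchange distance. All function names are hypothetical.

```python
import math
import random

def bond_energy(spins, sites, J=1.0):
    # Energy of all nearest-neighbor bonds touching the given sites (periodic chain).
    n = len(spins)
    bonds = set()
    for k in sites:
        bonds.add(((k - 1) % n, k))
        bonds.add((k, (k + 1) % n))
    return -J * sum(spins[a] * spins[b] for a, b in bonds)

def total_energy(spins, J=1.0):
    n = len(spins)
    return -J * sum(spins[k] * spins[(k + 1) % n] for k in range(n))

def kawasaki_step(spins, R, T, rng, J=1.0):
    """Try to exchange two spins at distance <= R; return True if accepted."""
    n = len(spins)
    i = rng.randrange(n)
    j = (i + rng.randint(1, R) * rng.choice((-1, 1))) % n
    if spins[i] == spins[j]:
        return False  # exchanging equal spins changes nothing
    before = bond_energy(spins, (i, j), J)
    spins[i], spins[j] = spins[j], spins[i]
    dE = bond_energy(spins, (i, j), J) - before
    if dE <= 0 or (T > 0 and rng.random() < math.exp(-dE / T)):
        return True
    spins[i], spins[j] = spins[j], spins[i]  # rejected: undo the exchange
    return False

rng = random.Random(1)
spins = [rng.choice((-1, 1)) for _ in range(100)]
m0, e0 = sum(spins), total_energy(spins)
for _ in range(5000):
    kawasaki_step(spins, R=5, T=0.0, rng=rng)
```

Because spins are exchanged rather than flipped, the total magnetization is conserved, which is the defining difference from single-spin-flip (Glauber) dynamics.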
We present a unified field-theoretic framework for the dynamics of activity and connectivity in interacting neuronal systems. Building upon previous works, where a field approach to activity--connectivity dynamics, formation of collective states and effective fields of collective states were successively introduced, the present paper synthesizes and extends these results toward a general description of multiple hierarchical collective structures. Starting with the dynamical system representing collective states in terms of connections, activity levels, and internal frequencies, we analyze its stability, emphasizing the possibility of transitions between configurations. Then, turning to the field formalism of collective states, we extend this framework to include substructures (subobjects) participating in larger assemblies while retaining intrinsic properties. We define activation classes describing compatible or independent activity patterns between objects and subobjects, and study stability conditions arising from their alignment or mismatch. The global system is described as the collection of landscapes of coexisting and interacting collective states, each characterized both by continuous (activity, frequency) and discrete (class) variables. A corresponding field formalism is developed, with an action functional incorporating both internal dynamics and interaction terms. This nonlinear field model captures cascading transitions between collective states and the formation of composite structures, providing a coherent theoretical basis for emergent neuronal assemblies and their mutual couplings.
Object binding, the brain's ability to bind the many features that collectively represent an object into a coherent whole, is central to human cognition. It groups low-level perceptual features into high-level object representations, stores those objects efficiently and compositionally in memory, and supports human reasoning about individual object instances. While prior work often imposes object-centric attention (e.g., Slot Attention) explicitly to probe these benefits, it remains unclear whether this ability naturally emerges in pre-trained Vision Transformers (ViTs). Intuitively, it could: recognizing which patches belong to the same object should be useful for downstream prediction and thus guide attention. Motivated by the quadratic nature of self-attention, we hypothesize that ViTs represent whether two patches belong to the same object, a property we term IsSameObject. We decode IsSameObject from patch embeddings across ViT layers using a similarity probe, which reaches over 90% accuracy. Crucially, this object-binding capability emerges reliably in self-supervised ViTs (DINO, MAE, CLIP), but is markedly weaker in ImageNet-supervised models, suggesting that binding is not a trivial architectural artifact, but an ability acquired through specific pretraining objectives. We further discover that IsSameObject is encoded in a low-dimensional subspace on top of object features, and that this signal actively guides attention. Ablating IsSameObject from model activations degrades downstream performance and works against the learning objective, implying that emergent object binding naturally serves the pretraining objective. Our findings challenge the view that ViTs lack object binding and highlight how symbolic knowledge of "which parts belong together" emerges naturally in a connectionist system.
We ask where, and under what conditions, dyslexic reading costs arise in a large-scale naturalistic reading dataset. Using eye-tracking aligned to word-level features (word length, frequency, and predictability), we model how each feature influences dyslexic time costs. We find that all three features robustly change reading times in both typical and dyslexic readers, and that dyslexic readers show stronger sensitivities to each, especially predictability. Counterfactual manipulations of these features substantially narrow the dyslexic-control gap by about one third, with predictability showing the strongest effect, followed by length and frequency. These patterns align with dyslexia theories that posit heightened demands on linguistic working memory and phonological encoding, and they motivate further work on lexical complexity and parafoveal preview benefits to explain the remaining gap. In short, we quantify when extra dyslexic costs arise, how large they are, and offer actionable guidance for interventions and computational models for dyslexics.
Boundary Vector Cells (BVCs) are a class of neurons in the brains of vertebrates that encode environmental boundaries at specific distances and allocentric directions, playing a central role in forming place fields in the hippocampus. Most computational BVC models are restricted to two-dimensional (2D) environments, making them prone to spatial ambiguities in the presence of horizontal symmetries in the environment. To address this limitation, we incorporate vertical angular sensitivity into the BVC framework, thereby enabling robust boundary detection in three dimensions, and leading to significantly more accurate spatial localization in a biologically-inspired robot model. The proposed model processes LiDAR data to capture vertical contours, thereby disambiguating locations that would be indistinguishable under a purely 2D representation. Experimental results show that in environments with minimal vertical variation, the proposed 3D model matches the performance of a 2D baseline; yet, as 3D complexity increases, it yields substantially more distinct place fields and markedly reduces spatial aliasing. These findings show that adding a vertical dimension to BVC-based localization can significantly enhance navigation and mapping in real-world 3D spaces while retaining performance parity in simpler, near-planar scenarios.
A critical visual computation is to construct global scene properties from activities of early visual cortical neurons which have small receptive fields. Such a computation is enabled by contextual influences, through which a neuron's response to visual inputs is influenced by contextual inputs outside its classical receptive fields. Accordingly, neurons can signal global properties including visual saliencies and figure-ground relationships. Many believe that intracortical axons conduct signals too slowly to bring the contextual information from receptive fields of other neurons. A popular opinion is that much of the contextual influences arise from feedback from higher visual areas whose neurons have larger receptive fields. This paper re-examines pre-existing data to reveal these unexpected findings: the conduction speed of V1 intracortical axons increases approximately linearly with the conduction distance, and is sufficiently high for conveying the contextual influences. Recognizing the importance of intracortical contribution to critical visual computations should enable fresh progress in answering long-standing questions.
Linearly transforming stimulus representations of deep neural networks yields high-performing models of behavioral and neural responses to complex stimuli. But does the test accuracy of such predictions identify genuine representational alignment? We addressed this question through a large-scale model-recovery study. Twenty diverse vision models were linearly aligned to 4.5 million behavioral judgments from the THINGS odd-one-out dataset and calibrated to reproduce human response variability. For each model in turn, we sampled synthetic responses from its probabilistic predictions, fitted all candidate models to the synthetic data, and tested whether the data-generating model would re-emerge as the best predictor of the simulated data. Model recovery accuracy improved with training-set size but plateaued below 80%, even at millions of simulated trials. Regression analyses linked misidentification primarily to shifts in representational geometry induced by the linear transformation, as well as to the effective dimensionality of the transformed features. These findings demonstrate that, even with massive behavioral data, overly flexible alignment metrics may fail to guide us toward artificial representations that are genuinely more human-aligned. Model comparison experiments must be designed to balance the trade-off between predictive accuracy and identifiability, ensuring that the best-fitting model is also the right one.
Neural recording implants are a crucial tool for both neuroscience research and enabling new clinical applications. The power consumption of high channel count implants is dominated by the circuits used to amplify and digitize neural signals. Since circuit designers have pushed the efficiency of these circuits close to the theoretical physical limits, reducing power further requires system level optimization. Recent advances use a strategy called channel selection, in which less important channels are turned off to save power. We demonstrate resolution reconfiguration, in which the resolution of less important channels is scaled down to save power. Our approach leverages variable importance of each channel inside machine-learning-based decoders and we trial this methodology across three applications: seizure detection, gesture recognition, and force regression. With linear decoders, resolution reconfiguration saves 8.7x, 12.8x, and 23.0x power compared to a traditional recording array for each task respectively. It further saves 1.6x, 3.4x, and 5.2x power compared to channel selection. The results demonstrate the power benefits of resolution reconfigurable front-ends and their wide applicability to neural decoding problems.
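The contrast between channel selection and resolution reconfiguration can be sketched with a toy linear decoder. Everything specific here is an assumption made for illustration: channel importance is taken to be the absolute decoder weight, the quantizer is a simple uniform one, the bit depths (10 and 4) and weights are invented, and no power model is included, so this shows only the effect on decoder output, not the power savings reported in the abstract.

```python
import random

def quantize(x, bits, full_scale=1.0):
    # Uniform quantizer over [-full_scale, full_scale); bits == 0 means the channel is off.
    if bits == 0:
        return 0.0
    step = 2.0 * full_scale / (2 ** bits)
    code = round(x / step)
    code = max(-(2 ** (bits - 1)), min(2 ** (bits - 1) - 1, code))
    return code * step

def allocate_bits(weights, n_high, high_bits, low_bits):
    # The n_high most important channels (largest |weight|) keep high resolution.
    order = sorted(range(len(weights)), key=lambda c: abs(weights[c]), reverse=True)
    bits = [low_bits] * len(weights)
    for c in order[:n_high]:
        bits[c] = high_bits
    return bits

def decode(weights, sample, bits=None):
    # Linear decoder output, optionally on quantized inputs.
    if bits is None:
        return sum(w * x for w, x in zip(weights, sample))
    return sum(w * quantize(x, b) for w, x, b in zip(weights, sample, bits))

def mean_abs_error(weights, samples, bits):
    # Mean deviation of the quantized decoder output from the full-precision output.
    return sum(abs(decode(weights, s) - decode(weights, s, bits)) for s in samples) / len(samples)

rng = random.Random(0)
weights = [0.9, -0.7, 0.5, 0.1, -0.08, 0.05, 0.03, -0.02]  # hypothetical decoder weights
samples = [[rng.uniform(-1, 1) for _ in weights] for _ in range(200)]

reconfig_bits = allocate_bits(weights, n_high=3, high_bits=10, low_bits=4)   # scale resolution down
selection_bits = allocate_bits(weights, n_high=3, high_bits=10, low_bits=0)  # switch channels off
err_reconfig = mean_abs_error(weights, samples, reconfig_bits)
err_selection = mean_abs_error(weights, samples, selection_bits)
```

In this toy setup, keeping low-importance channels at reduced resolution perturbs the decoder output less than switching them off, which is the intuition behind trading resolution rather than whole channels.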
Background: Upper-limb weakness and tremor (4--12 Hz) limit activities of daily living (ADL) and reduce adherence to home rehabilitation. Objective: To assess technical feasibility and clinician-relevant signals of a sensor-fused wearable targeting the triceps brachii and extensor pollicis brevis. Methods: A lightweight node integrates surface EMG (1 kHz), IMU (100--200 Hz), and flex/force sensors with on-device INT8 inference (Tiny 1D-CNN/Transformer) and a safety-bounded assist policy (angle/torque/jerk limits; stall/time-out). Healthy adults (n = 12) performed three ADL-like tasks. Primary outcomes: Tremor Index (TI), range of motion (ROM), repetitions (Reps min$^{-1}$). Secondary: EMG median-frequency slope (fatigue trend), closed-loop latency, session completion, and device-related adverse events. Analyses used subject-level paired medians with BCa 95\% CIs; exact Wilcoxon $p$-values are reported in the Results. Results: Assistance was associated with lower tremor prominence and improved task throughput: TI changed by $-0.092$ (95\% CI [$-0.102$, $-0.079$]), ROM increased by $+12.65\%$ (95\% CI [$+8.43$, $+13.89$]), and Reps rose by $+2.99$ min$^{-1}$ (95\% CI [$+2.61$, $+3.35$]). Median on-device latency was 8.7 ms at a 100 Hz loop rate; all sessions were completed with no device-related adverse events. Conclusions: Multimodal sensing with low-latency, safety-bounded assistance produced improved movement quality (TI $\downarrow$) and throughput (ROM, Reps $\uparrow$) in a pilot technical-feasibility setting, supporting progression to IRB-approved patient studies. Trial registration: Not applicable (pilot non-clinical).
Understanding how the human brain progresses from processing simple linguistic inputs to performing high-level reasoning is a fundamental challenge in neuroscience. While modern large language models (LLMs) are increasingly used to model neural responses to language, their internal representations are highly "entangled," mixing information about lexicon, syntax, meaning, and reasoning. This entanglement biases conventional brain encoding analyses toward linguistically shallow features (e.g., lexicon and syntax), making it difficult to isolate the neural substrates of cognitively deeper processes. Here, we introduce a residual disentanglement method that computationally isolates these components. By first probing an LM to identify feature-specific layers, our method iteratively regresses out lower-level representations to produce four nearly orthogonal embeddings for lexicon, syntax, meaning, and, critically, reasoning. We used these disentangled embeddings to model intracranial (ECoG) brain recordings from neurosurgical patients listening to natural speech. We show that: 1) This isolated reasoning embedding exhibits unique predictive power, accounting for variance in neural activity not explained by other linguistic features and even extending to the recruitment of visual regions beyond classical language areas. 2) The neural signature for reasoning is temporally distinct, peaking later (~350-400ms) than signals related to lexicon, syntax, and meaning, consistent with its position atop a processing hierarchy. 3) Standard, non-disentangled LLM embeddings can be misleading, as their predictive success is primarily attributable to linguistically shallow features, masking the more subtle contributions of deeper cognitive processing.
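The idea of iteratively regressing out lower-level representations can be sketched as follows. This is an illustration under assumptions rather than the authors' method: it uses ordinary least squares with an intercept, random synthetic matrices in place of LLM layer embeddings, and hypothetical function names (`residualize`, `disentangle`). The layer-probing step that selects feature-specific layers is not shown.

```python
import numpy as np

def residualize(target, lower):
    # Remove from `target` everything linearly predictable from `lower` (plus an intercept).
    design = np.hstack([lower, np.ones((len(lower), 1))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

def disentangle(levels):
    """levels: list of (n_tokens, d) arrays ordered from lowest to highest level."""
    parts = [levels[0]]
    for k in range(1, len(levels)):
        # Regress level k on all lower levels and keep only the residual.
        parts.append(residualize(levels[k], np.hstack(levels[:k])))
    return parts

# Synthetic "entangled" embeddings (an assumption): each level mixes in the one below.
rng = np.random.default_rng(0)
lexicon = rng.normal(size=(200, 4))
syntax = lexicon @ rng.normal(size=(4, 4)) + rng.normal(size=(200, 4))
meaning = syntax @ rng.normal(size=(4, 4)) + rng.normal(size=(200, 4))
reasoning = meaning @ rng.normal(size=(4, 4)) + rng.normal(size=(200, 4))

parts = disentangle([lexicon, syntax, meaning, reasoning])
```

By construction, each residual embedding is orthogonal to every lower-level embedding on the fitted data, so any variance it explains in neural activity cannot be attributed linearly to the shallower features.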
Understanding how creativity is represented in the brain's intrinsic functional architecture remains a central challenge in cognitive neuroscience. While resting-state fMRI studies have revealed large-scale network correlates of creative potential, electroencephalography (EEG) offers a temporally precise and scalable approach to capture the fast oscillatory dynamics that underlie spontaneous neural organization. In this study, we used a data-driven network approach to examine whether resting-state EEG connectivity patterns differentiate individuals according to their creative abilities. Creativity was evaluated in 30 healthy young adults using the Inventory of Creative Activities and Achievements (ICAA), the Divergent Association Task (DAT), the Matchstick Arithmetic Puzzles Task (MAPT), and a self-rating (SR) of creative ability. Graph-theoretical analyses were applied to functional connectivity matrices, and participants were clustered based on graph similarity. Two distinct participant clusters emerged, differing systematically across multiple dimensions of creativity. Cluster 1, characterized by consistently higher performance across multiple creativity variables (ICAA, DAT, MAPT and SR), showed broad alpha-band hypoconnectivity, relatively preserved left frontal connectivity and greater network modularity. Cluster 0, associated with lower creativity scores, exhibited stronger overall connectivity strength, reduced modularity and higher local clustering. These findings suggest that resting-state EEG connectivity patterns can index stable cognitive traits such as creativity. More broadly, they point to an intrinsic neural signature of adaptive brain function marked by efficient yet flexible network organization that may support creative and adaptive cognition.