This study explored how advanced budgeting techniques and economic indicators influence funding levels and strategic alignment in California Community Colleges (CCCs). Despite widespread implementation of budgeting reforms, many CCCs continue to face challenges aligning financial planning with institutional missions, particularly in supporting diversity, equity, and inclusion (DEI) initiatives. The study used a quantitative correlational design, analyzing 30 years of publicly available economic data, including unemployment rates, GDP growth, and CPI, in relation to CCC funding trends. Results revealed a strong positive correlation between GDP growth and CCC funding levels, as well as between CPI and funding levels, underscoring the predictive value of macroeconomic indicators in budget planning. These findings emphasize the need for educational leaders to integrate economic forecasting into budget planning processes to safeguard institutional effectiveness and sustain programs serving underrepresented student populations.
We formalize the paradox of an omniscient yet lazy investor - a perfectly informed agent who trades infrequently due to execution or computational frictions. Starting from a deterministic geometric construction, we derive a closed-form expected profit function linking trading frequency, execution cost, and path roughness. We prove existence and uniqueness of the optimal trading frequency and show that this optimum can be interpreted through the fractal dimension of the price path. A stochastic extension under fractional Brownian motion provides analytical expressions for the optimal interval and comparative statics with respect to the Hurst exponent. Empirical illustrations on equity data confirm the theoretical scaling behavior.
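The paper's closed-form profit function is not reproduced here. As a hedged illustration of the scaling that drives its comparative statics, the sketch below simulates fractional Brownian motion by Cholesky factorization of its covariance and shows that the mean absolute increment over an interval dt grows roughly like dt**H, where H is the Hurst exponent; all numbers and the grid size are illustrative.

```python
import numpy as np

def simulate_fbm(n, hurst, T=1.0, seed=0):
    """Simulate fractional Brownian motion on n grid points via Cholesky factorization."""
    t = np.linspace(T / n, T, n)
    # fBm covariance: Cov(B_H(s), B_H(u)) = 0.5 * (s^2H + u^2H - |s - u|^2H)
    s, u = np.meshgrid(t, t)
    cov = 0.5 * (s ** (2 * hurst) + u ** (2 * hurst) - np.abs(s - u) ** (2 * hurst))
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))
    rng = np.random.default_rng(seed)
    return t, L @ rng.standard_normal(n)

# Mean absolute increment over an interval dt scales approximately like dt**H.
for hurst in (0.3, 0.5, 0.7):
    t, x = simulate_fbm(2000, hurst)
    for step in (1, 10, 100):
        incr = np.abs(np.diff(x[::step]))
        print(f"H={hurst}, dt={step * t[0]:.4f}: E|dX| ~ {incr.mean():.4f}")
```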
Observation-driven Dirichlet models for compositional time series often use the additive log-ratio (ALR) link and include a moving-average (MA) term built from ALR residuals. In the standard B--DARMA recursion, the usual MA regressor $\alr(\mathbf{Y}_t)-\boldsymbol{\eta}_t$ has nonzero conditional mean under the Dirichlet likelihood, which biases the mean path and blurs the interpretation of MA coefficients. We propose a minimal change: replace the raw regressor with a \emph{centered} innovation $\boldsymbol{\epsilon}_t^{\circ}=\alr(\mathbf{Y}_t)-\mathbb{E}\{\alr(\mathbf{Y}_t)\mid \boldsymbol{\eta}_t,\phi_t\}$, computable in closed form via digamma functions. Centering restores mean-zero innovations for the MA block without altering either the likelihood or the ALR link. We provide simple identities for the conditional mean and the forecast recursion, show first-order equivalence to a digamma-link DARMA while retaining a closed-form inverse to $\boldsymbol{\mu}_t$, and give ready-to-use code. A weekly application to the Federal Reserve H.8 bank-asset composition compares the original (raw-MA) and centered specifications under a fixed holdout and rolling one-step origins. The centered formulation improves log predictive scores with essentially identical point error and markedly cleaner Hamiltonian Monte Carlo diagnostics.
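A minimal sketch of the centered innovation, assuming the ALR-Dirichlet setup described: with mean mu_t = alr^{-1}(eta_t) and precision phi_t, the Dirichlet parameters are alpha_t = phi_t * mu_t, and E[log Y_j] = psi(alpha_j) - psi(sum(alpha)), so E[alr_j(Y_t) | eta_t, phi_t] = psi(alpha_j) - psi(alpha_D). Function names are illustrative, not the paper's code.

```python
import numpy as np
from scipy.special import digamma

def alr(y):
    """Additive log-ratio transform with the last component as reference."""
    return np.log(y[:-1]) - np.log(y[-1])

def inv_alr(eta):
    """Inverse ALR: softmax of (eta, 0)."""
    z = np.exp(np.append(eta, 0.0))
    return z / z.sum()

def centered_innovation(y, eta, phi):
    """epsilon_t^o = alr(Y_t) - E[alr(Y_t) | eta_t, phi_t] under a Dirichlet(phi * mu) likelihood."""
    alpha = phi * inv_alr(eta)                           # Dirichlet parameters
    mean_alr = digamma(alpha[:-1]) - digamma(alpha[-1])  # E[log(Y_j / Y_D)]
    return alr(y) - mean_alr

# Example with a 4-part composition
eta, phi = np.array([0.2, -0.1, 0.4]), 50.0
rng = np.random.default_rng(1)
y = rng.dirichlet(phi * inv_alr(eta))
print(centered_innovation(y, eta, phi))
```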
This study presents a three-step machine learning framework to predict bubbles in the S&P 500 stock market by combining financial news sentiment with macroeconomic indicators. Building on traditional econometric approaches, the proposed framework predicts bubble formation by integrating textual and quantitative data sources. In the first step, bubble periods in the S&P 500 index are identified using a right-tailed unit root test, a widely recognized real-time bubble detection method. The second step extracts sentiment features from large-scale financial news articles using natural language processing (NLP) techniques, which capture investors' expectations and behavioral patterns. In the final step, ensemble learning methods are applied to predict bubble occurrences from the sentiment-based and macroeconomic predictors. Model performance is evaluated through k-fold cross-validation and compared against benchmark machine learning algorithms. Empirical results indicate that the proposed three-step ensemble approach significantly improves predictive accuracy and robustness, providing valuable early warning insights for investors, regulators, and policymakers in mitigating systemic financial risks.
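For the first step, a hedged sketch of a right-tailed, supremum-ADF-type statistic over forward-expanding windows is shown below. The exact recursive procedure and critical values of the method used in the paper differ; here critical values would have to come from Monte Carlo simulation under a random-walk null, and the minimum window length is an illustrative choice.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def sadf_statistic(prices, min_window=40):
    """Supremum of ADF t-statistics over forward-expanding windows (right-tailed test)."""
    stats = []
    for end in range(min_window, len(prices) + 1):
        adf_t = adfuller(prices[:end], regression="c", autolag="AIC")[0]
        stats.append(adf_t)
    return max(stats)

# Large (right-tail) values of the statistic signal explosive, bubble-like behaviour.
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(200))
print("SADF statistic:", sadf_statistic(random_walk))
```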
Artificial intelligence techniques have increasingly been applied to understand the complex relationship between public sentiment and financial market behaviour. This study explores the relationship between the sentiment of news related to the Russia-Ukraine war and the volatility of the stock market. A comprehensive dataset of news articles from major US platforms, published between January 1 and July 17, 2024, was analysed using a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model adapted for financial language. We extracted sentiment scores and applied a Generalised Autoregressive Conditional Heteroscedasticity (GARCH) model, enhanced with a Student-t distribution to capture the heavy-tailed nature of financial returns data. The results reveal a statistically significant negative relationship between negative news sentiment and market stability, suggesting that pessimistic war coverage is associated with increased volatility in the S&P 500 index. This research demonstrates how artificial intelligence and natural language processing can be integrated with econometric modelling to assess real-time market dynamics, offering valuable tools for financial risk analysis during geopolitical crises.
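A hedged sketch of the econometric step using the arch package: a GARCH(1,1) with Student-t errors and a daily sentiment score entering the mean equation as an exogenous regressor. The placement of sentiment in the mean, the placeholder data, and the variable names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from arch import arch_model

# Placeholder inputs: daily S&P 500 returns and an aggregated daily
# negative-sentiment score derived from a financial-domain BERT model.
rng = np.random.default_rng(0)
returns = pd.Series(rng.standard_normal(500), name="ret")
sentiment = pd.Series(rng.uniform(0, 1, 500), name="neg_sentiment")

# GARCH(1,1) with Student-t errors; sentiment as an exogenous mean regressor.
model = arch_model(returns, x=sentiment.to_frame(), mean="LS",
                   vol="GARCH", p=1, q=1, dist="t")
result = model.fit(disp="off")
print(result.summary())
```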
Despite accounting for 96.1% of all businesses in Malaysia, access to financing remains one of the most persistent challenges faced by Micro, Small, and Medium Enterprises (MSMEs). Newly established or young businesses are often excluded from formal credit markets as traditional underwriting approaches rely heavily on credit bureau data. This study investigates the potential of bank statement data as an alternative data source for credit assessment to promote financial inclusion in emerging markets. Firstly, we propose a cash flow-based underwriting pipeline where we utilise bank statement data for end-to-end data extraction and machine learning credit scoring. Secondly, we introduce a novel dataset of 611 loan applicants from a Malaysian lending institution. Thirdly, we develop and evaluate credit scoring models based on application information and bank transaction-derived features. Empirical results show that the use of such data boosts the performance of all models on our dataset, which can improve credit scoring for new-to-lending MSMEs. Lastly, we intend to release the anonymised bank transaction dataset to facilitate further research on MSME financial inclusion within Malaysia's emerging economy.
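A hedged sketch of the kind of feature derivation and scoring model such a pipeline involves; the column names, aggregate features, synthetic data, and classifier choice are illustrative and not the paper's specification.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def transaction_features(tx: pd.DataFrame) -> pd.DataFrame:
    """Aggregate per-applicant cash-flow features from raw bank transactions.
    Expects columns: applicant_id, amount (signed, + credit / - debit), balance."""
    g = tx.groupby("applicant_id")
    return pd.DataFrame({
        "mean_inflow": g["amount"].apply(lambda a: a[a > 0].mean()),
        "mean_outflow": g["amount"].apply(lambda a: -a[a < 0].mean()),
        "net_cash_flow": g["amount"].sum(),
        "balance_volatility": g["balance"].std(),
        "n_transactions": g.size(),
    }).fillna(0.0)

# Illustrative synthetic records standing in for extracted bank-statement data.
rng = np.random.default_rng(0)
tx = pd.DataFrame({
    "applicant_id": rng.integers(0, 200, 10_000),
    "amount": rng.normal(0, 500, 10_000),
    "balance": rng.normal(5_000, 2_000, 10_000),
})
X = transaction_features(tx)
y = rng.integers(0, 2, len(X))          # placeholder default labels
print(cross_val_score(GradientBoostingClassifier(), X, y, scoring="roc_auc", cv=5).mean())
```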
We present the unified market-based description of returns and variances of the trades with shares of a particular security, of the trades with shares of all securities in the market, and of the trades with the market portfolio. We consider an investor who does not trade the shares of the portfolio he assembled at time t0 in the past. The investor observes the time series of the current trades with all securities made in the market during the averaging interval. The investor may convert these time series into the time series that model the trades with all securities as the trades with a single security and into the time series that model the trades with the market portfolio as the trades with a single security. That establishes the same description of the returns and variances of the trades with a single security, the trades with all securities in the market, and the market portfolio. We show that the market-based variance, which accounts for the impact of random change of the volumes of consecutive trades with securities, takes the form of Markowitz's (1952) portfolio variance if the volumes of consecutive trades with all market securities are assumed constant. That highlights that Markowitz's (1952) variance ignores the effects of random volumes of consecutive trades. We compare the market-based variances of the market portfolio and of the trades with all market securities, consider the importance of the duration of the averaging interval, and explain the economic obstacles that limit the accuracy of the predictions of the returns and variances at best by Gaussian distributions. The same methods describe the returns and variances of any portfolio and the trades with its securities.
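The abstract's reference point is Markowitz's (1952) portfolio variance; a minimal numpy illustration of that constant-volume limiting case is given below. The paper's market-based, volume-dependent variance is not reproduced here, and the data are placeholders.

```python
import numpy as np

# Markowitz (1952) portfolio variance: sigma_p^2 = w' Sigma w, computed from the
# covariance matrix of security returns and the portfolio weights.
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=(250, 4))   # 250 days, 4 securities (illustrative)
weights = np.array([0.4, 0.3, 0.2, 0.1])
sigma = np.cov(returns, rowvar=False)
portfolio_variance = weights @ sigma @ weights
print(portfolio_variance)
```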
Multifractality in time series analysis characterizes the presence of multiple scaling exponents, indicating heterogeneous temporal structures and complex dynamical behaviors beyond simple monofractal models. In the context of digital currency markets, multifractal properties arise due to the interplay of long-range temporal correlations and heavy-tailed distributions of returns, reflecting intricate market microstructure and trader interactions. Incorporating multifractal analysis into the modeling of cryptocurrency price dynamics enhances the understanding of market inefficiencies, may improve volatility forecasting and facilitate the detection of critical transitions or regime shifts. Building on multifractal cross-correlation analysis (MFCCA), whose special case, multifractal detrended fluctuation analysis (MFDFA), is the most commonly used practical tool for quantifying multifractality, the present contribution applies a recently proposed method of disentangling sources of multifractality in time series to the most representative instruments from the digital market. They include Bitcoin (BTC), Ethereum (ETH), decentralized exchanges (DEX) and non-fungible tokens (NFT). The results indicate the significant role of heavy tails in generating a broad multifractal spectrum. However, they also clearly demonstrate that the primary source of multifractality is the temporal correlations in the series, and without them, multifractality fades out. It appears characteristic that these temporal correlations, to a large extent, do not depend on the thickness of the tails of the fluctuation distribution. These observations, made here in the context of the digital currency market, provide a further strong argument for the validity of the proposed methodology of disentangling sources of multifractality in time series.
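A compact MFDFA sketch illustrating the kind of analysis described: generalized fluctuation functions F_q(s) and the resulting generalized Hurst exponents h(q), computed for a series and for a shuffled surrogate. Shuffling destroys temporal correlations while preserving the return distribution, which is the basic idea behind separating the two sources of multifractality; the paper's MFCCA-based procedure is more involved, and the toy series, scales, and q-values below are illustrative.

```python
import numpy as np

def mfdfa(x, scales, qs, order=1):
    """Generalized fluctuation functions F_q(s) of the profile of x (minimal MFDFA)."""
    profile = np.cumsum(x - x.mean())
    F = np.zeros((len(qs), len(scales)))
    for j, s in enumerate(scales):
        n_seg = len(profile) // s
        msq = []                                   # mean squared detrended residual per segment
        for k in range(n_seg):
            seg = profile[k * s:(k + 1) * s]
            t = np.arange(s)
            trend = np.polyval(np.polyfit(t, seg, order), t)
            msq.append(np.mean((seg - trend) ** 2))
        msq = np.array(msq)
        for i, q in enumerate(qs):
            if q == 0:
                F[i, j] = np.exp(0.5 * np.mean(np.log(msq)))
            else:
                F[i, j] = np.mean(msq ** (q / 2)) ** (1 / q)
    return F

def hq(x, scales, qs):
    """Generalized Hurst exponents h(q) from log-log slopes of F_q(s)."""
    F = mfdfa(x, scales, qs)
    return [np.polyfit(np.log(scales), np.log(F[i]), 1)[0] for i in range(len(qs))]

rng = np.random.default_rng(0)
returns = rng.standard_t(3, 20_000)                    # heavy-tailed toy series
scales = np.unique(np.logspace(1.2, 3.2, 15).astype(int))
qs = [-4, -2, 0, 2, 4]
print("original:", np.round(hq(returns, scales, qs), 3))
print("shuffled:", np.round(hq(rng.permutation(returns), scales, qs), 3))
```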
This study examines how institutional differences and external crises shape volatility dynamics in emerging Asian stock markets. Using daily stock index returns for Indonesia, Malaysia, and the Philippines from 2010 to 2024, we estimate EGARCH(1,1) and TGARCH(1,1) models in a by-window design. The sample is split into the 2013 Taper Tantrum, the 2020-2021 COVID-19 period, the 2022-2023 rate-hike cycle, and tranquil phases. Prior work typically studies a single market or a static period; to our knowledge no study unifies institutional comparison with multi-crisis dynamics within one GARCH framework. We address this gap and show that all three markets display strong volatility persistence and fat-tailed returns. During crises both persistence and asymmetry increase, while tail thickness rises, implying more frequent extreme moves. After crises, parameters revert toward pre-shock levels. Cross-country evidence indicates a buffering role of institutional maturity: Malaysia's stronger regulatory and information systems dampen amplification and speed recovery, whereas the Philippines' thinner market structure prolongs instability. We conclude that crises amplify volatility structures, while institutional robustness governs recovery speed. The results provide policy guidance on transparency, macroprudential communication, and liquidity support to reduce volatility persistence during global shocks.
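A hedged sketch of the estimation step with the arch package, where a GJR-GARCH(1,1) stands in for the threshold specification; the placeholder returns and the assumption that crisis windows would be selected on dates and refitted separately are illustrative choices, not the paper's exact setup.

```python
import numpy as np
import pandas as pd
from arch import arch_model

rng = np.random.default_rng(0)
returns = pd.Series(rng.standard_t(5, 1500), name="ret")   # placeholder daily index returns

# EGARCH(1,1) and a threshold-type GJR-GARCH(1,1), both with Student-t errors.
egarch = arch_model(returns, vol="EGARCH", p=1, o=1, q=1, dist="t").fit(disp="off")
tgarch = arch_model(returns, vol="GARCH", p=1, o=1, q=1, dist="t").fit(disp="off")

# Persistence and asymmetry comparisons across crisis and tranquil windows
# would follow by refitting on date-based subsamples.
print(egarch.params[["alpha[1]", "gamma[1]", "beta[1]"]])
print(tgarch.params[["alpha[1]", "gamma[1]", "beta[1]"]])
```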
This study presents the implementation of a short-term forecasting system for price movements in exchange markets, using market depth data and a systematic procedure to enable a fully automated trading system. The case study focuses on the UK to Win Horse Racing market during the pre-live stage on the world's leading betting exchange, Betfair. Innovative convolutional attention mechanisms are introduced and applied to multiple recurrent neural networks and bi-dimensional convolutional recurrent neural network layers. Additionally, a novel padding method for convolutional layers is proposed, specifically designed for multivariate time series processing. These innovations are thoroughly detailed, along with their execution process. The proposed architectures follow a standard supervised learning approach, involving model training and subsequent testing on new data, which requires extensive pre-processing and data analysis. The study also presents a complete end-to-end framework for automated feature engineering and market interactions using the developed models in production. The key finding of this research is that all proposed innovations positively impact the performance metrics of the classification task under examination, thereby advancing the current state-of-the-art in convolutional attention mechanisms and padding methods applied to multivariate time series problems.
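The paper's specific convolutional attention mechanisms and padding method are not reproduced here. As a generic, hedged sketch of the family of architectures involved, the PyTorch module below applies a Conv1d with time-only (causal) padding to multivariate market-depth snapshots and follows it with self-attention over time steps; all layer sizes and names are illustrative.

```python
import torch
import torch.nn as nn

class TemporalConvAttention(nn.Module):
    """Illustrative sketch: causal Conv1d over time + dot-product attention over time steps."""
    def __init__(self, n_features, n_channels=32, kernel_size=5, n_classes=3):
        super().__init__()
        self.pad = nn.ConstantPad1d((kernel_size - 1, 0), 0.0)   # pad the past only
        self.conv = nn.Conv1d(n_features, n_channels, kernel_size)
        self.attn = nn.MultiheadAttention(n_channels, num_heads=4, batch_first=True)
        self.head = nn.Linear(n_channels, n_classes)

    def forward(self, x):             # x: (batch, time, features)
        h = self.conv(self.pad(x.transpose(1, 2))).transpose(1, 2)   # (batch, time, channels)
        a, _ = self.attn(h, h, h)     # self-attention over time
        return self.head(a[:, -1])    # classify price movement from the last step

model = TemporalConvAttention(n_features=10)
logits = model(torch.randn(8, 100, 10))   # 8 sequences, 100 snapshots, 10 depth features
```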
Robust optimization provides a principled framework for decision-making under uncertainty, with broad applications in finance, engineering, and operations research. In portfolio optimization, uncertainty in expected returns and covariances demands methods that mitigate estimation error, parameter instability, and model misspecification. Traditional approaches, including parametric, bootstrap-based, and Bayesian methods, enhance stability by relying on confidence intervals or probabilistic priors but often impose restrictive assumptions. This study introduces a non-parametric bootstrap framework for robust optimization in financial decision-making. By resampling empirical data, the framework constructs flexible, data-driven confidence intervals without assuming specific distributional forms, thus capturing uncertainty in statistical estimates, model parameters, and utility functions. Treating utility as a random variable enables percentile-based optimization, naturally suited for risk-sensitive and worst-case decision-making. The approach aligns with recent advances in robust optimization, reinforcement learning, and risk-aware control, offering a unified perspective on robustness and generalization. Empirically, the framework mitigates overfitting and selection bias in trading strategy optimization and improves generalization in portfolio allocation. Results across portfolio and time-series momentum experiments demonstrate that the proposed method delivers smoother, more stable out-of-sample performance, offering a practical, distribution-free alternative to traditional robust optimization methods.
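A minimal sketch of percentile-based selection over bootstrap resamples, in the spirit of treating utility as a random variable; the Sharpe-ratio utility, candidate allocations, percentile level, and synthetic data are all illustrative assumptions.

```python
import numpy as np

def bootstrap_utility_percentile(returns, weights, n_boot=2000, pct=5, seed=0):
    """Distribution-free lower percentile of a portfolio utility (here the Sharpe ratio)
    over bootstrap resamples of the historical return matrix."""
    rng = np.random.default_rng(seed)
    n = len(returns)
    utilities = []
    for _ in range(n_boot):
        sample = returns[rng.integers(0, n, n)]          # resample rows with replacement
        port = sample @ weights
        utilities.append(port.mean() / port.std(ddof=1))
    return np.percentile(utilities, pct)

rng = np.random.default_rng(1)
returns = rng.multivariate_normal([0.0004, 0.0003, 0.0005],
                                  0.0001 * np.eye(3), size=750)
candidates = [np.array(w) for w in ([1/3, 1/3, 1/3], [0.6, 0.2, 0.2], [0.2, 0.2, 0.6])]
# Choose the allocation whose 5th-percentile bootstrap utility is highest (risk-sensitive choice).
best = max(candidates, key=lambda w: bootstrap_utility_percentile(returns, w))
print(best)
```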
Financial recommendation systems often fail to account for key behavioral and regulatory factors, leading to advice that is misaligned with user preferences, difficult to interpret, or unlikely to be followed. We present FLARKO (Financial Language-model for Asset Recommendation with Knowledge-graph Optimization), a novel framework that integrates Large Language Models (LLMs), Knowledge Graphs (KGs), and Kahneman-Tversky Optimization (KTO) to generate asset recommendations that are both profitable and behaviorally aligned. FLARKO encodes users' transaction histories and asset trends as structured KGs, providing interpretable and controllable context for the LLM. To demonstrate the adaptability of our approach, we develop and evaluate both a centralized architecture (CenFLARKO) and a federated variant (FedFLARKO). To our knowledge, this is the first demonstration of using KTO to fine-tune LLMs for financial asset recommendation. We also present the first use of structured KGs to ground LLM reasoning over behavioral financial data in a federated learning (FL) setting. Evaluated on the FAR-Trans dataset, FLARKO consistently outperforms state-of-the-art recommendation baselines on behavioral alignment and joint profitability, while remaining interpretable and resource-efficient.
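A hedged sketch of the KG-as-context idea only: encode transaction history and asset trend facts as a small labelled graph and serialize its triples into the LLM prompt. FLARKO's actual graph schema, KTO fine-tuning, and federated setup are not reproduced; the entities, relations, and prompt wording below are hypothetical.

```python
import networkx as nx

# Toy knowledge graph: user transactions and asset trend facts as labelled edges.
kg = nx.MultiDiGraph()
kg.add_edge("user_42", "AAPL", relation="bought", amount=1500, date="2024-03-01")
kg.add_edge("user_42", "TLT", relation="sold", amount=800, date="2024-04-12")
kg.add_edge("AAPL", "uptrend_30d", relation="exhibits")
kg.add_edge("TLT", "downtrend_30d", relation="exhibits")

def kg_to_prompt(graph):
    """Serialize edges as (head, relation, tail) triples for use as structured LLM context."""
    lines = []
    for head, tail, data in graph.edges(data=True):
        attrs = ", ".join(f"{k}={v}" for k, v in data.items() if k != "relation")
        lines.append(f"({head}, {data['relation']}, {tail})" + (f" [{attrs}]" if attrs else ""))
    return ("Known facts:\n" + "\n".join(lines)
            + "\nRecommend assets aligned with the user's behaviour.")

print(kg_to_prompt(kg))
```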
We address the challenges of modeling high-frequency integer price changes in financial markets using continuous distributions, particularly the Student's t-distribution. We demonstrate that traditional GARCH models, which rely on continuous distributions, are ill-suited for high-frequency data due to the discreteness of price changes. We propose a modification to the maximum likelihood estimation procedure that accounts for the discrete nature of observations while still using continuous distributions. Our approach involves modeling the log-likelihood in terms of intervals corresponding to the rounding of continuous price changes to the nearest integer. The findings highlight the importance of adjusting for discreteness in volatility analysis and provide a framework for incorporating any continuous distribution for modeling high-frequency prices.
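A minimal sketch of the rounding-interval adjustment described: treat an integer price change k as the event that a latent continuous change falls in [k - 1/2, k + 1/2), so each observation contributes log{F((k + 0.5)/sigma_t) - F((k - 0.5)/sigma_t)} for a Student-t CDF F. Here sigma is held fixed for illustration; in a GARCH setting it would follow the conditional-variance recursion.

```python
import numpy as np
from scipy.stats import t as student_t

def discrete_t_loglik(k, sigma, nu):
    """Log-likelihood of integer price changes k with conditional scale sigma,
    treating each observation as a rounding interval of a continuous Student-t variable."""
    upper = student_t.cdf((k + 0.5) / sigma, df=nu)
    lower = student_t.cdf((k - 0.5) / sigma, df=nu)
    return np.sum(np.log(upper - lower))

# Compare with the usual density-based likelihood on the same data.
rng = np.random.default_rng(0)
sigma, nu = 1.3, 4.0
k = np.rint(sigma * rng.standard_t(nu, 5000))   # simulated integer (tick-size) price changes
print("interval log-lik:", discrete_t_loglik(k, sigma, nu))
print("density log-lik :", np.sum(student_t.logpdf(k / sigma, df=nu) - np.log(sigma)))
```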
We use the discrete Ollivier-Ricci graph curvature with Ricci flow to examine the intrinsic geometry of financial markets through the empirical correlation graph of the NASDAQ 100 index. Our main result is the development of a technique to perform surgery on the neckpinch singularities that form during the Ricci flow of the empirical graph, using the behavior and the lower bound of curvature of the fully connected graph as a starting point. We construct an algorithm that uses the curvature generated by intrinsic geometric flow of the graph to detect hidden hierarchies, community behavior, and clustering in financial markets despite the underlying challenges posed by a highly connected geometry.
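A hedged sketch of the correlation-graph construction and an Ollivier-Ricci curvature computation via the open-source GraphRicciCurvature package, which is an assumption about tooling rather than the paper's implementation; the surgery procedure on neckpinch singularities during the flow is not reproduced, and the threshold and data are placeholders.

```python
import numpy as np
import networkx as nx
# pip install GraphRicciCurvature   (assumed tooling, not necessarily the paper's code)
from GraphRicciCurvature.OllivierRicci import OllivierRicci

# Build a correlation graph from a returns matrix (rows: days, columns: tickers).
rng = np.random.default_rng(0)
returns = rng.standard_normal((250, 20))                 # placeholder for NASDAQ 100 returns
corr = np.corrcoef(returns, rowvar=False)

G = nx.Graph()
n = corr.shape[0]
for i in range(n):
    for j in range(i + 1, n):
        if abs(corr[i, j]) > 0.05:                       # sparsify weak links
            G.add_edge(i, j, weight=np.sqrt(2 * (1 - corr[i, j])))   # correlation distance

orc = OllivierRicci(G, alpha=0.5, verbose="ERROR")
orc.compute_ricci_curvature()
curvatures = [d["ricciCurvature"] for _, _, d in orc.G.edges(data=True)]
print("mean edge curvature:", np.mean(curvatures))
# orc.compute_ricci_flow(iterations=20) would evolve edge weights under the flow;
# communities then emerge by cutting edges whose weights stretch most.
```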
We introduce an offline nonparametric estimator for concave multi-asset propagator models based on a dataset of correlated price trajectories and metaorders. Compared to parametric models, our framework avoids parameter explosion in the multi-asset case and yields confidence bounds for the estimator. We implement the estimator using both proprietary metaorder data from Capital Fund Management (CFM) and publicly available S&P order flow data, where we augment the former dataset using a metaorder proxy. In particular, we provide unbiased evidence that self-impact is concave and exhibits a shifted power-law decay, and show that the metaorder proxy stabilizes the calibration. Moreover, we find that introducing cross-impact provides a significant gain in explanatory power, with concave specifications outperforming linear ones, suggesting that the square-root law extends to cross-impact. We also measure asymmetric cross-impact between assets driven by relative liquidity differences. Finally, we demonstrate that a shape-constrained projection of the nonparametric kernel not only ensures interpretability but also slightly outperforms established parametric models in terms of predictive accuracy.
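As background for the calibration problem, a hedged sketch of the simplest case: a single-asset, linear propagator kernel estimated by least squares from price changes and signed order flow. The paper's estimator is nonparametric, concave, multi-asset, and comes with confidence bounds, none of which is reproduced; the toy data and kernel shape are illustrative.

```python
import numpy as np

def fit_linear_propagator(price_changes, signed_volumes, max_lag=50):
    """Least-squares estimate of a single-asset propagator kernel G(tau) in
    dp_t = sum_{tau=0}^{max_lag} G(tau) * eps_{t - tau} + noise,
    where eps_t is the signed order flow."""
    T = len(price_changes)
    X = np.zeros((T - max_lag, max_lag + 1))
    for tau in range(max_lag + 1):
        X[:, tau] = signed_volumes[max_lag - tau:T - tau]
    y = price_changes[max_lag:]
    kernel, *_ = np.linalg.lstsq(X, y, rcond=None)
    return kernel

rng = np.random.default_rng(0)
flow = np.sign(rng.standard_normal(5000)) * rng.pareto(3, 5000)   # toy signed order flow
true_kernel = 0.1 / np.sqrt(1 + np.arange(51))                    # decaying impact kernel
dp = np.convolve(flow, true_kernel)[:5000] + 0.05 * rng.standard_normal(5000)
print(fit_linear_propagator(dp, flow)[:5].round(3))                # recovers ~0.100, 0.071, ...
```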
The intricate dynamics of stock markets have led to extensive research on models that are able to effectively explain their inherent complexities. This study leverages the econometrics literature to explore the dynamic factor model as an interpretable model with sufficient predictive capabilities for capturing essential market phenomena. Although the model has been extensively applied for predictive purposes, this study focuses on analyzing the extracted loadings and common factors as an alternative framework for understanding stock price dynamics. The results reveal novel insights into traditional market theories when applied to the Philippine Stock Exchange using the Kalman method and maximum likelihood estimation, with subsequent validation against the capital asset pricing model. Notably, a one-factor model extracts a common factor representing systematic or market dynamics similar to the composite index, whereas a two-factor model extracts common factors representing market trends and volatility. Furthermore, an application of the model for nowcasting the growth rates of the Philippine gross domestic product highlights the potential of the extracted common factors as viable real-time market indicators, yielding over a 34% decrease in the out-of-sample prediction error. Overall, the results underscore the value of dynamic factor analysis in gaining a deeper understanding of market price movement dynamics.
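A hedged sketch of the estimation step using statsmodels' DynamicFactor (maximum likelihood via the Kalman filter); the synthetic return panel and the one-factor choice are illustrative stand-ins for the Philippine Stock Exchange data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Placeholder panel of stock returns driven by one common component plus noise.
rng = np.random.default_rng(0)
common = 0.1 * rng.standard_normal(300)
panel = pd.DataFrame(
    {f"stock_{i}": 0.5 * common + 0.1 * rng.standard_normal(300) for i in range(8)}
)

# One-factor dynamic factor model, estimated by maximum likelihood with the Kalman filter.
model = sm.tsa.DynamicFactor(panel, k_factors=1, factor_order=1)
result = model.fit(disp=False)
print(result.summary())
market_factor = result.factors.filtered[0]   # extracted common factor, akin to a market index
```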
We develop a statistical framework for risk estimation, inspired by the axiomatic theory of risk measures. Coherent risk estimators -- functionals of P&L samples inheriting the economic properties of risk measures -- are defined and characterized through robust representations linked to $L$-estimators. The framework provides a canonical methodology for constructing estimators with sound financial and statistical properties, unifying risk measure theory, principles for capital adequacy, and practical statistical challenges in market risk. A numerical study illustrates the approach, focusing on expected shortfall estimation under both i.i.d. and overlapping samples relevant for regulatory FRTB model applications.
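A minimal sketch of expected shortfall as an L-estimator (an average of the worst order statistics of the P&L sample), evaluated on non-overlapping and overlapping multi-day P&L as in FRTB-style horizons. The level, horizon, and synthetic P&L are illustrative; the paper's coherent-estimator characterization is not reproduced.

```python
import numpy as np

def expected_shortfall(pnl, alpha=0.025):
    """ES as an L-estimator: mean of the worst floor(alpha * n) P&L observations."""
    losses = np.sort(-np.asarray(pnl))[::-1]            # losses, largest first
    k = max(1, int(np.floor(alpha * len(losses))))
    return losses[:k].mean()

rng = np.random.default_rng(0)
daily_pnl = rng.standard_t(5, 2500)                      # ~10 years of daily P&L (illustrative)

# Non-overlapping 10-day P&L vs. overlapping 10-day rolling sums.
nonoverlap = daily_pnl.reshape(-1, 10).sum(axis=1)
overlap = np.convolve(daily_pnl, np.ones(10), mode="valid")
print("ES, non-overlapping:", expected_shortfall(nonoverlap))
print("ES, overlapping    :", expected_shortfall(overlap))
```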
In a dynamic landscape where portfolios and environments evolve, maintaining the accuracy of pricing models is critical. To the best of our knowledge, this is the first study to systematically examine concept drift in non-life insurance pricing. We (i) provide an overview of the relevant literature and commonly used methodologies, clarify the distinction between virtual drift and concept drift, and explain their implications for long-run model performance; (ii) review and formalize common performance measures, including the Gini index and deviance loss, and articulate their interpretation; (iii) derive the asymptotic distribution of the Gini index, enabling valid inference and hypothesis testing; and (iv) present a standardized monitoring procedure that indicates when refitting is warranted. We illustrate the framework using a modified real-world portfolio with induced concept drift and discuss practical considerations and pitfalls.
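A minimal sketch of two of the performance measures discussed: a Gini index computed from the Lorenz curve of actual losses ordered by predicted risk, and the Poisson deviance loss for claim counts. Sign conventions for the Gini index vary across the literature, the simulated data are placeholders, and the paper's asymptotic distribution and monitoring rule are not reproduced.

```python
import numpy as np

def gini_index(y_true, y_score):
    """Gini index from the Lorenz curve of actual losses ordered by predicted risk score."""
    order = np.argsort(y_score)                        # least risky first
    lorenz = np.cumsum(y_true[order]) / y_true.sum()
    lorenz = np.insert(lorenz, 0, 0.0)
    x = np.linspace(0, 1, len(lorenz))
    return 1.0 - 2.0 * np.trapz(lorenz, x)

def poisson_deviance(y_true, y_pred):
    """Mean Poisson deviance loss, a standard measure for claim-count models."""
    y_pred = np.clip(y_pred, 1e-12, None)
    term = np.where(y_true > 0, y_true * np.log(np.where(y_true > 0, y_true, 1.0) / y_pred), 0.0)
    return 2.0 * np.mean(term - (y_true - y_pred))

rng = np.random.default_rng(0)
risk = rng.gamma(2.0, 0.1, 10_000)                     # predicted expected claim frequency
claims = rng.poisson(risk)                             # observed claim counts
print("Gini    :", round(gini_index(claims, risk), 3))
print("Deviance:", round(poisson_deviance(claims, risk), 4))
```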