Pre-taxation analysis plays a crucial role in ensuring the fairness of public revenue collection. It can also serve as a tool to reduce the risk of tax avoidance, one of the UK government's concerns. Our report utilises pre-tax income ($PI$) and total assets ($TA$) data from 567 companies listed on the FTSE All-Share index, gathered from the Refinitiv EIKON database and covering 14 years, i.e., the period from 2009 to 2022. We also derive the $PI/TA$ ratio and distinguish between positive and negative $PI$ cases. We test the conformity of these data to Benford's Laws, specifically studying the first significant digit ($Fd$), the second significant digit ($Sd$), and the first and second significant digits ($FSd$). We use and justify two pertinent tests, the $\chi^2$ test and the Mean Absolute Deviation (MAD). We find that the two tests do not lead to conclusions in complete agreement with each other; in particular, the MAD test entirely rejects Benford's Laws conformity of the reported financial data. From a purely accounting point of view, we conclude that the findings not only cast some doubt on the reported financial data but also suggest that further investigations of closely related matters are warranted. More broadly, studying a ratio such as $PI/TA$, built from variables that may or may not be Benford's Laws compliant, adds to the literature debating whether such derived variables should (or should not) be Benford's Laws compliant.
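A minimal sketch of the two conformity tests for the first-digit ($Fd$) case, assuming a generic array of positive financial values; the data loading and the paper's exact test conventions are not reproduced here:

```python
import numpy as np
from scipy.stats import chi2

def first_digits(x):
    """First significant digit of each positive value."""
    x = np.abs(np.asarray(x, dtype=float))
    x = x[x > 0]
    return (x / 10.0 ** np.floor(np.log10(x))).astype(int)  # values in 1..9

def benford_fd_tests(x):
    digits = np.arange(1, 10)
    expected_p = np.log10(1 + 1 / digits)            # Benford first-digit law
    d = first_digits(x)
    n = d.size
    observed = np.array([(d == k).sum() for k in digits])
    chi2_stat = ((observed - n * expected_p) ** 2 / (n * expected_p)).sum()
    p_value = chi2.sf(chi2_stat, df=8)               # 9 categories -> 8 dof
    mad = np.abs(observed / n - expected_p).mean()   # Nigrini-style MAD
    return chi2_stat, p_value, mad
```

A common convention (due to Nigrini) flags a first-digit MAD above roughly 0.015 as nonconformity; the paper's exact cutoffs may differ.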
We introduce a new risk modeling framework where chaotic attractors shape the geometry of Bayesian inference. By combining heavy-tailed priors with Lorenz and Rössler dynamics, the models naturally generate volatility clustering, fat tails, and extreme events. We compare two complementary approaches: Model A, which emphasizes geometric stability, and Model B, which highlights rare bursts using Fibonacci diagnostics. Together, they provide a dual perspective for systemic risk analysis, linking Black Swan theory to practical tools for stress testing and volatility monitoring.
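The abstract does not spell out how the attractor couples to the inference; the sketch below only illustrates the general idea of letting a Lorenz state modulate the scale of heavy-tailed (Student-t) return shocks, with every parameter and the coupling itself being assumptions:

```python
import numpy as np

def lorenz_volatility_returns(n=2000, dt=0.01, sigma=10.0, rho=28.0,
                              beta=8 / 3, base_vol=0.01, nu=3, seed=0):
    rng = np.random.default_rng(seed)
    x, y, z = 1.0, 1.0, 1.0
    returns = np.empty(n)
    for t in range(n):
        # Euler step of the Lorenz system
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dx * dt, y + dy * dt, z + dz * dt
        vol = base_vol * (1 + abs(z) / 30)     # attractor state modulates volatility
        returns[t] = vol * rng.standard_t(nu)  # fat-tailed shock
    return returns
```

The chaotic wandering of the attractor produces persistent regimes of high and low `vol`, i.e. volatility clustering, while the Student-t shocks supply the fat tails.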
This study explores the behavioral dynamics of illiquid stock prices in a listed stock market. Illiquidity, characterized by wide bid-ask spreads, affects price formation by decoupling prices from standard risk-return relationships and increasing sensitivity to market sentiment. We model prices at the Uganda Securities Exchange (USE), which is illiquid in that prices remain constant much of the time, complicating price modelling. We circumvent this challenge by combining the Markov model (MM) with two models: the exponential Ornstein-Uhlenbeck model (XOU) and geometric Brownian motion (gBm). In the combined models, the MM captures the constant stretches in the stock prices while the XOU and gBm capture the stochastic price dynamics. We modelled stock prices using the combined models, as well as XOU and gBm alone. We found that USE stocks appear to have low correlation with one another. Using theoretical analysis, a simulation study, and empirical analysis, we conclude that this apparent low correlation is due to illiquidity. In particular, data simulated from the combined MM-gBm model, in which the gBm components were highly correlated, produced low measured correlation when the Markov chain had a high zero-state-to-zero-state transition probability.
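A minimal sketch of the combined MM-gBm mechanism described above, with assumed parameter names: a two-state Markov chain freezes the price in the zero (no-trade) state, and the price follows gBm otherwise.

```python
import numpy as np

def simulate_mm_gbm(n=1000, p00=0.9, p11=0.5, mu=0.05, sigma=0.2,
                    dt=1 / 252, s0=100.0, seed=0):
    rng = np.random.default_rng(seed)
    prices = np.empty(n)
    prices[0], state = s0, 1                      # state 0: frozen, state 1: trading
    for t in range(1, n):
        to_zero = p00 if state == 0 else 1 - p11  # transition probability into state 0
        state = 0 if rng.random() < to_zero else 1
        if state == 0:
            prices[t] = prices[t - 1]             # constant (illiquid) price
        else:
            z = rng.standard_normal()
            prices[t] = prices[t - 1] * np.exp(
                (mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z)
    return prices
```

Raising `p00` lengthens the constant stretches and, as the abstract notes, deflates the measured correlation between two such series even when their gBm shocks are highly correlated.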
Purpose: The article aims to visualise in a single graph fish- and meat-processing company groups in Spain with respect to long-term solvency, energy, waste and water intensity, and the gender employment gap. Design/methodology/approach: The selected financial, environmental and social indicators are ratios, which require specific statistical analysis methods to prevent severe skewness and outliers. We use the compositional data methodology and the principal-component-analysis biplot. Findings: Fish-processing companies have more homogeneous financial, environmental and social performance than their meat-processing counterparts. Specific company groups in both sectors can be identified as poor performers on some of the indicators. Firms with higher solvency tend to be less efficient in energy and water use. Two clusters of company groups with similar performance are identified. Research limitations/implications: As of now, few firms publish reports according to the EU Corporate Sustainability Reporting Directive; in future research, larger samples will be available. Social implications: Company groups can visually identify their areas of improvement in financial, environmental and social performance relative to their competitors in the sector. Originality/value: This is the first time visualization tools have combined financial, environmental and social indicators. All individual firms can be visually ordered along all indicators simultaneously.
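A minimal sketch of the statistical pipeline named above (a compositional log-ratio treatment followed by a PCA biplot), with synthetic data standing in for the indicators; the paper's exact log-ratio construction may differ:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

def clr(df):
    """Centered log-ratio transform for strictly positive indicators."""
    logs = np.log(df)
    return logs.sub(logs.mean(axis=1), axis=0)  # subtract each row's log-mean

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.lognormal(size=(30, 5)),               # 30 company groups
                  columns=[f"ratio_{i}" for i in range(5)])  # 5 positive ratios
z = clr(df)
pca = PCA(n_components=2).fit(z)
scores, loadings = pca.transform(z), pca.components_.T       # biplot coordinates
```

Plotting `scores` (company groups) together with arrows for `loadings` (indicators) yields the single-graph biplot in which groups can be ordered along each indicator.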
This paper provides robust, new evidence on the causal drivers of market troughs. We demonstrate that conclusions about these triggers are critically sensitive to model specification, moving beyond restrictive linear models to a flexible double machine learning (DML) average-partial-effect framework. Our robust estimates identify the volatility of options-implied risk appetite and market liquidity as key causal drivers, relationships misrepresented or obscured by simpler models. These findings provide high-frequency empirical support for intermediary asset pricing theories. The causal analysis is enabled by a high-performance nowcasting model that accurately identifies capitulation events in real time.
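A minimal sketch of the partialling-out estimator behind a DML average partial effect, with a random forest as the (interchangeable) nuisance learner; variable names and the learner choice are assumptions, not the paper's specification:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def dml_partial_effect(y, d, X):
    """y: outcome (e.g. a trough indicator), d: candidate driver, X: controls."""
    rf = lambda: RandomForestRegressor(n_estimators=200, random_state=0)
    ry = y - cross_val_predict(rf(), X, y, cv=5)   # residualize outcome on controls
    rd = d - cross_val_predict(rf(), X, d, cv=5)   # residualize driver on controls
    theta = (rd @ ry) / (rd @ rd)                  # Robinson-style partial effect
    psi = (ry - theta * rd) * rd                   # influence function
    se = np.sqrt((psi ** 2).mean() / len(y)) / (rd ** 2).mean()
    return theta, se
```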
Precise short-term price prediction in the highly volatile cryptocurrency market is critical for informed trading strategies. Although Temporal Fusion Transformers (TFTs) have shown potential, their direct use often struggles in the face of the market's non-stationary nature and extreme volatility. This paper introduces an adaptive TFT modeling approach leveraging dynamic subseries lengths and pattern-based categorization to enhance short-term forecasting. We propose a novel segmentation method in which each subseries ends at a relative maximum, identified when the price increase from the preceding minimum surpasses a threshold. This captures significant upward movements, which act as key markers for the end of a growth phase, while filtering some of the noise. Crucially, the fixed-length pattern ending each subseries determines the category assigned to the subsequent variable-length subseries, grouping typical market responses that follow similar preceding conditions. A distinct TFT model trained for each category specializes in predicting the evolution of these subsequent subseries from their initial steps after the preceding peak. Experimental results on ETH-USDT 10-minute data over a two-month test period demonstrate that our adaptive approach significantly outperforms baseline fixed-length TFT and LSTM models in prediction accuracy and simulated trading profitability. Our combination of adaptive segmentation and pattern-conditioned forecasting enables more robust and responsive cryptocurrency price prediction.
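A minimal sketch of the segmentation rule described above; the threshold value and the tie-breaking details are assumptions:

```python
import numpy as np

def segment_at_relative_maxima(prices, threshold=0.01):
    """Split a price series into variable-length subseries that end once the
    price has risen by `threshold` above the running minimum."""
    segments, start, run_min = [], 0, prices[0]
    for t in range(1, len(prices)):
        run_min = min(run_min, prices[t])
        if prices[t] >= run_min * (1 + threshold):  # significant upswing: close segment
            segments.append((start, t))
            start, run_min = t, prices[t]
    segments.append((start, len(prices) - 1))       # trailing, possibly open segment
    return segments
```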
The CAPM regression is typically interpreted as if the market return contemporaneously \emph{causes} individual returns, motivating beta-neutral portfolios and factor attribution. For realized equity returns, however, this interpretation is inconsistent: a same-period arrow $R_{m,t} \to R_{i,t}$ conflicts with the fact that $R_m$ is itself a value-weighted aggregate of its constituents, unless $R_m$ is lagged or leave-one-out -- the ``aggregator contradiction.'' We formalize CAPM as a structural causal model and analyze the admissible three-node graphs linking an external driver $Z$, the market $R_m$, and an asset $R_i$. The empirically plausible baseline is a \emph{fork}, $Z \to \{R_m, R_i\}$, not $R_m \to R_i$. In this setting, OLS beta reflects not a causal transmission, but an attenuated proxy for how well $R_m$ captures the underlying driver $Z$. Consequently, ``beta-neutral'' portfolios can remain exposed to macro or sectoral shocks, and hedging on $R_m$ can import index-specific noise. Using stylized models and large-cap U.S.\ equity data, we show that contemporaneous betas act like proxies rather than mechanisms; any genuine market-to-stock channel, if at all, appears only at a lag and with modest economic significance. The practical message is clear: CAPM should be read as associational. Risk management and attribution should shift from fixed factor menus to explicitly declared causal paths, with ``alpha'' reserved for what remains invariant once those causal paths are explicitly blocked.
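A minimal simulation of the fork $Z \to \{R_m, R_i\}$: no direct market-to-asset arrow exists, yet OLS recovers a sizeable beta because $R_m$ proxies the common driver (all loadings below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
z = rng.standard_normal(n)                     # external driver Z
r_m = 0.8 * z + 0.6 * rng.standard_normal(n)   # market loads on Z plus noise
r_i = 1.2 * z + 1.0 * rng.standard_normal(n)   # asset loads on Z plus noise

beta = np.cov(r_i, r_m, ddof=0)[0, 1] / np.var(r_m)
print(beta)  # ~0.96 = 1.2 * 0.8 / (0.8**2 + 0.6**2): an attenuated proxy for
             # the asset's loading on Z, not a causal transmission coefficient
```

Residualizing $R_i$ on $R_m$ here ("beta-neutralizing") leaves the portfolio exposed to $Z$ while importing the market's idiosyncratic noise, which is the paper's practical warning.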
Deep learning is evolving fast and integrating into various domains. Finance is a challenging field for deep learning, especially for interpretable artificial intelligence (AI). Although classical approaches perform very well in natural language processing, computer vision, and forecasting, they are not ideal for the financial world, in which specialists use different metrics to evaluate model performance. We first introduce financially grounded loss functions derived from key quantitative finance metrics, including the Sharpe ratio, Profit-and-Loss (PnL), and Maximum Drawdown. Additionally, we propose turnover regularization, a method that inherently constrains the turnover of generated positions within predefined limits. Our findings demonstrate that the proposed loss functions, in conjunction with turnover regularization, outperform the traditional mean-squared-error loss for return-prediction tasks when evaluated using algorithmic trading metrics. The study shows that financially grounded metrics enhance predictive performance in trading strategies and portfolio optimization.
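A minimal PyTorch sketch of one such financially grounded loss, a negative Sharpe ratio with turnover regularization; the paper's exact formulations of the losses and the limit mechanism may differ:

```python
import torch

def sharpe_loss(positions, returns, turnover_limit=0.1, lam=1.0, eps=1e-8):
    """positions, returns: tensors of shape (T,). Returns negative Sharpe
    plus a penalty on turnover in excess of the predefined limit."""
    pnl = positions * returns                           # per-period strategy P&L
    sharpe = pnl.mean() / (pnl.std() + eps)
    turnover = (positions[1:] - positions[:-1]).abs().mean()
    penalty = torch.relu(turnover - turnover_limit)     # only excess turnover penalized
    return -sharpe + lam * penalty
```

Because every operation is differentiable, the loss can replace MSE directly in a standard training loop, pushing the network toward positions that are profitable per unit of risk rather than merely accurate.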
This work builds upon the long-standing conjecture that linear diffusion models are inadequate for complex market dynamics. Specifically, it provides experimental validation for the author's prior arguments that realistic market dynamics are governed by higher-order (cubic and higher) non-linearities in the drift. Since the diffusion drift is the negative gradient of a potential function, a non-linear drift translates into a non-quadratic potential. These arguments were based both on general theoretical grounds and on a structured approach to modeling price dynamics which incorporates money flows and their impact on market prices. Here, we find direct confirmation of this view by analyzing high-frequency cryptocurrency data at different time scales ranging from minutes to months. We find that markets can be characterized by either a single-well or a double-well potential, depending on the time period and sampling frequency, where a double-well potential may signal market uncertainty or stress.
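A minimal sketch of the single- vs double-well diagnosis, assuming the drift is estimated by a simple polynomial regression of increments on levels; the paper's estimation procedure is not reproduced here:

```python
import numpy as np

def fit_potential(x, dt=1.0):
    """Fit a cubic drift mu(x) to increments and integrate to a quartic U(x)."""
    mu_coef = np.polyfit(x[:-1], np.diff(x) / dt, deg=3)  # cubic drift estimate
    U = np.poly1d(-np.polyint(mu_coef))                   # U(x) = -integral of mu(x)
    return U

# A double well corresponds to the fitted quartic U having two local minima,
# i.e. U' has three distinct real roots and the quartic coefficient is positive.
```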
Explainability in AI and ML models is critical for fostering trust, ensuring accountability, and enabling informed decision-making in high-stakes domains. Yet this objective is often unmet in practice. This paper proposes a general-purpose framework that bridges state-of-the-art explainability techniques with Malle's five-category model of behavior explanation: Knowledge Structures, Simulation/Projection, Covariation, Direct Recall, and Rationalization. The framework is designed to be applicable across AI-assisted decision-making systems, with the goal of enhancing transparency, interpretability, and user trust. We demonstrate its practical relevance through real-world case studies, including credit risk assessment and regulatory analysis powered by large language models (LLMs). By aligning technical explanations with human cognitive mechanisms, the framework lays the groundwork for more comprehensible, responsible, and ethical AI systems.
This study presents a unified, distribution-aware, and complexity-informed framework for understanding equity return dynamics in the Indian market, using 34 years (1990 to 2024) of Nifty 50 index data. Addressing a key gap in the literature, we demonstrate that the price-to-earnings (P/E) ratio, as a valuation metric, may probabilistically map return distributions across investment horizons spanning from days to decades. Return profiles exhibit strong asymmetry. One-year returns show a 74 percent probability of gain, with a modal return of 10.67 percent and a reward-to-risk ratio exceeding 5. Over long horizons, modal CAGRs surpass 13 percent, while worst-case returns remain negative for up to ten years, defining a historical trapping period. This horizon shortens to six years in the post-1999 period, reflecting growing market resilience. Conditional analysis of the P/E ratio reveals regime-dependent outcomes. Low valuations (P/E less than 13) historically show zero probability of loss across all horizons, while high valuations (P/E greater than 27) correspond to unstable returns and extended breakeven periods. To uncover deeper structure, we apply tools from complexity science. Entropy, Hurst exponents, and Lyapunov indicators reveal weak persistence, long memory, and low-dimensional chaos. Information-theoretic metrics, including mutual information and transfer entropy, confirm a directional and predictive influence of valuation on future returns. These findings offer actionable insights for asset allocation, downside risk management, and long-term investment strategy in emerging markets. Our framework bridges valuation, conditional distributions, and nonlinear dynamics in a rigorous and practically relevant manner.
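A minimal sketch of the conditional mapping from starting valuation to forward return distributions, using the abstract's P/E thresholds of 13 and 27; the data loading and the choice of horizon are placeholders:

```python
import numpy as np
import pandas as pd

def conditional_returns(prices: pd.Series, pe: pd.Series, horizon=252):
    """Tabulate forward `horizon`-day return statistics by starting P/E bucket."""
    fwd = prices.shift(-horizon) / prices - 1            # forward return
    bucket = pd.cut(pe, [0, 13, 27, np.inf], labels=["low", "mid", "high"])
    df = pd.DataFrame({"bucket": bucket, "fwd": fwd}).dropna()
    return df.groupby("bucket", observed=True)["fwd"].agg(
        prob_gain=lambda r: (r > 0).mean(),              # P(return > 0)
        median="median",
        worst="min")
```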
Financial markets are complex, non-stationary systems whose underlying data distributions can shift over time, a phenomenon known as regime change in finance and as concept drift in the machine learning literature. These shifts, often triggered by major economic events, pose a significant challenge for traditional statistical and machine learning models. A fundamental problem in developing and validating adaptive algorithms is the lack of a ground truth in real-world financial data, making it difficult to evaluate a model's ability to detect and recover from these drifts. This paper addresses this challenge by introducing a novel framework, named ProteuS, for generating semi-synthetic financial time series with pre-defined structural breaks. Our methodology involves fitting ARMA-GARCH models to real-world ETF data to capture distinct market regimes, and then simulating realistic, gradual, and abrupt transitions between them. The resulting datasets, which include a comprehensive set of technical indicators, provide a controlled environment with a known ground truth of regime changes. An analysis of the generated data confirms the complexity of the task, revealing significant overlap between the different market states. We aim to provide the research community with a tool for the rigorous evaluation of concept drift detection and adaptation mechanisms, paving the way for more robust financial forecasting models.
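A minimal sketch of the generation idea, assuming illustrative GARCH(1,1) parameters rather than values actually fitted to ETF data; the known mixing weight supplies the ground-truth location of a gradual break:

```python
import numpy as np

def garch_path(n, omega, alpha, beta, rng):
    """Simulate a GARCH(1,1) return path started at its unconditional variance."""
    r, sig2 = np.empty(n), omega / (1 - alpha - beta)
    for t in range(n):
        r[t] = np.sqrt(sig2) * rng.standard_normal()
        sig2 = omega + alpha * r[t] ** 2 + beta * sig2
    return r

rng = np.random.default_rng(0)
calm   = garch_path(1500, 1e-6, 0.05, 0.90, rng)   # regime 1: low volatility
stress = garch_path(1500, 5e-6, 0.15, 0.80, rng)   # regime 2: high volatility
w = np.clip(np.linspace(-2, 3, 1500), 0, 1)        # gradual transition weight
series = (1 - w) * calm + w * stress               # ground-truth drift where 0 < w < 1
```

Setting `w` to a step function instead of a ramp yields the abrupt-transition variant.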
Through a novel approach, this paper shows that substantial change in stock market behavior has a statistically and economically significant impact on equity risk premium predictability, both in-sample and out-of-sample. In line with Auer's ''Bullish ratio'', a ''Bullish index'' is introduced to measure changes in stock market behavior, which we describe through a ''fluctuation detrending moving average analysis'' (FDMAA) of returns. We consider 28 indicators. We find that a ''positive shock'' to the Bullish index is closely related to strong equity risk premium predictability for forecasts based on macroeconomic variables for up to six months. In contrast, a ''negative shock'' is associated with strong equity risk premium predictability, with adequate forecasts for up to nine months, when based on technical indicators.
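A minimal sketch of the moving-average detrending fluctuation function that underlies an FDMAA-style analysis; how the paper builds the Bullish index from it is not reproduced here:

```python
import numpy as np
import pandas as pd

def dma_fluctuation(returns, window):
    """Fluctuation F(n) at scale n: RMS of the integrated return series
    after detrending with a moving average of length `window`."""
    y = np.cumsum(returns - np.mean(returns))        # integrated (profile) series
    trend = pd.Series(y).rolling(window).mean().to_numpy()
    resid = y[window - 1:] - trend[window - 1:]
    return np.sqrt(np.mean(resid ** 2))
```

Evaluating `dma_fluctuation` over a range of window sizes and fitting a power law gives the scaling exponent that such analyses track over time.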
This paper proposes a novel stock selection strategy framework based on combined machine learning algorithms. Two types of weighting methods for three representative machine learning algorithms are developed to predict the returns of the stock selection strategy: one is static weighting based on model-evaluation metrics; the other is dynamic weighting based on Information Coefficients (IC). Using CSI 300 index data, we empirically evaluate the strategy's backtested performance and model predictive accuracy. The main results are as follows: (1) The combined-machine-learning strategy significantly outperforms single-model approaches in backtested returns. (2) IC-based weighting (particularly IC_Mean) demonstrates greater competitiveness than evaluation-metric-based weighting in both backtested returns and predictive performance. (3) Factor screening substantially enhances the performance of combined machine learning strategies.
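A minimal sketch of dynamic IC-based weighting: each model's weight is its (clipped) rank correlation between past predictions and realized returns. The windowing and the exact IC_Mean construction are assumptions:

```python
import numpy as np
from scipy.stats import spearmanr

def ic_weights(pred_history, realized):
    """pred_history: (T, M) past predictions of M models; realized: (T,) returns."""
    ics = np.array([spearmanr(pred_history[:, m], realized)[0]
                    for m in range(pred_history.shape[1])])
    w = np.clip(ics, 0, None)                  # drop models with negative IC
    return w / w.sum() if w.sum() > 0 else np.full(len(w), 1 / len(w))

# combined forecast for the current period:
# combined = current_preds @ ic_weights(pred_history, realized)
```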
Hawkes processes were first introduced to obtain microscopic models for the rough volatility observed in asset prices. Scaling limits of such processes lead to the rough-Heston model that describes the macroscopic behavior. Blanc et al. (2017) show that time-reversal asymmetry (TRA), or the Zumbach effect, can be modeled using Quadratic Hawkes (QHawkes) processes. Dandapani et al. (2021) obtain a super-rough-Heston model as the scaling limit of QHawkes processes in the case where the impacts of buying and selling actions are symmetric. To model asymmetry between buying and selling actions, we propose a bivariate QHawkes process and derive super-rough-Heston models as scaling limits for the price process in the stable and near-unstable regimes that preserve TRA. A new feature of the limiting process in the near-unstable regime is that the two driving Brownian motions exhibit a stochastic covariation that depends on the spot volatility.
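As a building block for readers unfamiliar with these models, the sketch below simulates a plain univariate Hawkes process with an exponential kernel via Ogata thinning; the quadratic, bivariate extensions discussed above add feedback of past price moves on the intensity and are substantially more involved:

```python
import numpy as np

def simulate_hawkes(mu=1.0, alpha=0.5, beta=1.0, T=100.0, seed=0):
    """Ogata thinning for intensity lambda(t) = mu + sum_i alpha*exp(-beta*(t - t_i));
    stability requires alpha < beta."""
    rng = np.random.default_rng(seed)
    events, t, excite = [], 0.0, 0.0         # excite: self-excitation at time t
    while True:
        lam_bar = mu + excite                # upper bound (excitation only decays)
        w = rng.exponential(1.0 / lam_bar)
        t += w
        if t > T:
            return np.array(events)
        excite *= np.exp(-beta * w)          # decay over the waiting time
        if rng.random() < (mu + excite) / lam_bar:
            events.append(t)                 # accepted event
            excite += alpha                  # event adds a jump to the intensity
```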
This paper presents a dynamic cryptocurrency portfolio optimization strategy that integrates technical indicators and sentiment analysis to enhance investment decision-making. The proposed method employs the 14-day Relative Strength Index (RSI) and 14-day Simple Moving Average (SMA) to capture market momentum, while sentiment scores are extracted from news articles using the VADER (Valence Aware Dictionary and sEntiment Reasoner) model, with compound scores quantifying overall market tone. The large language model Google Gemini is used to further verify the sentiment scores predicted by VADER and give investment decisions. These technical indicator and sentiment signals are incorporated into the expected return estimates before applying mean-variance optimization with constraints on asset weights. The strategy is evaluated through a rolling-window backtest over cryptocurrency market data, with Bitcoin (BTC) and an equal-weighted portfolio of selected cryptocurrencies serving as benchmarks. Experimental results show that the proposed approach achieves a cumulative return of 38.72, substantially exceeding Bitcoin's 8.85 and the equal-weighted portfolio's 21.65 over the same period, and delivers a higher Sharpe ratio (1.1093 vs. 0.8853 and 1.0194, respectively). However, the strategy exhibits a larger maximum drawdown (-18.52%) compared to Bitcoin (-4.48%) and the equal-weighted portfolio (-11.02%), indicating higher short-term downside risk. These results highlight the potential of combining sentiment and technical signals to improve cryptocurrency portfolio performance, while also emphasizing the need to address risk exposure in volatile markets.
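A minimal sketch of the allocation step, in which a sentiment score tilts the expected-return estimates before constrained mean-variance optimization; the tilt coefficient, risk aversion, and weight cap are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy.optimize import minimize

def mv_weights(mu_hat, cov, sentiment, tilt=0.02, risk_aversion=5.0, w_max=0.3):
    """mu_hat: baseline expected returns; sentiment: per-asset compound scores."""
    mu = mu_hat + tilt * sentiment                   # signal-adjusted expected returns
    n = len(mu)
    objective = lambda w: -(w @ mu - risk_aversion / 2 * (w @ cov @ w))
    cons = ({"type": "eq", "fun": lambda w: w.sum() - 1},)  # fully invested
    bounds = [(0.0, w_max)] * n                      # long-only, per-asset cap
    res = minimize(objective, np.full(n, 1 / n), bounds=bounds, constraints=cons)
    return res.x
```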
Cryptocurrency markets are characterized by extreme volatility, making accurate forecasts essential for effective risk management and informed trading strategies. Traditional deterministic (point) forecasting methods are inadequate for capturing the full spectrum of potential volatility outcomes, underscoring the importance of probabilistic approaches. To address this limitation, this paper introduces probabilistic forecasting methods that leverage point forecasts from a wide range of base models, including statistical (HAR, GARCH, ARFIMA) and machine learning (e.g. LASSO, SVR, MLP, Random Forest, LSTM) algorithms, to estimate conditional quantiles of cryptocurrency realized variance. To the best of our knowledge, this is the first study in the literature to propose and systematically evaluate probabilistic forecasts of variance in cryptocurrency markets based on predictions derived from multiple base models. Our empirical results for Bitcoin demonstrate that the Quantile Estimation through Residual Simulation (QRS) method, particularly when applied to linear base models operating on log-transformed realized volatility data, consistently outperforms more sophisticated alternatives. Additionally, we highlight the robustness of the probabilistic stacking framework, providing comprehensive insights into uncertainty and risk inherent in cryptocurrency volatility forecasting. This research fills a significant gap in the literature, contributing practical probabilistic forecasting methodologies tailored specifically to cryptocurrency markets.
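A minimal sketch of the QRS idea as described: conditional quantiles are obtained by adding resampled in-sample residuals to a point forecast of log realized volatility (names and the bootstrap details are assumptions):

```python
import numpy as np

def qrs_quantiles(point_forecast_log_rv, residuals,
                  probs=(0.05, 0.25, 0.5, 0.75, 0.95), n_sim=10_000, seed=0):
    """residuals: in-sample errors of the base model on log realized volatility."""
    rng = np.random.default_rng(seed)
    sims = point_forecast_log_rv + rng.choice(residuals, size=n_sim, replace=True)
    return np.exp(np.quantile(sims, probs))   # back to the volatility scale
```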
The increasing penetration of variable renewable energy and flexible demand technologies, such as electric vehicles and heat pumps, introduces significant uncertainty in power systems, resulting in greater imbalance, defined as the deviation between scheduled and actual supply or demand. Short-term power markets, such as the European continuous intraday market, play a critical role in mitigating these imbalances by enabling traders to adjust forecasts close to real time. Because of the high volatility of the continuous intraday market, traders increasingly rely on electricity price forecasting to guide trading decisions and mitigate price risk. However, most electricity price forecasting approaches in the literature simplify the forecasting task: they focus on single benchmark prices, neglect intra-product price dynamics and price signals from the limit order book, and underuse high-frequency and cross-product price data. We therefore propose a novel directional electricity price forecasting method for hourly products in the European continuous intraday market. Our method incorporates short-term features from both hourly and quarter-hourly products and is evaluated on German European Power Exchange data from 2024-2025. The results indicate that features derived from the limit order book are the most influential exogenous variables. In addition, features from neighboring products, especially those whose delivery start times overlap with the trading period of the target product, improve forecast accuracy. Finally, our evaluation of the value captured by the forecasts suggests that the proposed method has the potential to generate profit when applied in trading strategies.
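A minimal, purely illustrative sketch of the directional setup: classify the sign of the next price move from order-book and neighboring-product features. All features and data here are synthetic placeholders, not the paper's feature set:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.standard_normal(n),   # e.g. order-book imbalance of the target product
    rng.standard_normal(n),   # e.g. last-trade return of a quarter-hourly neighbor
    rng.standard_normal(n),   # e.g. spread of an overlapping-delivery product
])
# synthetic directional label driven by the features plus noise
y = (X @ np.array([0.8, 0.3, -0.2]) + rng.standard_normal(n) > 0).astype(int)

clf = GradientBoostingClassifier().fit(X[:4000], y[:4000])
p_up = clf.predict_proba(X[4000:])[:, 1]   # predicted probability of an upward move
```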