This paper investigates the interactions among consumption/savings, investment, and retirement choices in the presence of income disaster. We consider low-income people who are exposed to income disaster and therefore retire involuntarily when it occurs. The government provides extra income support to low-income retirees who suffer from significant income gaps. We demonstrate that the decision to enter retirement in the event of income disaster depends crucially on the level of income support. In particular, we quantitatively identify an income support level below which the optimal decision is to delay retirement. This implies that the availability of the government's extra income support can be particularly important for low-income people to achieve optimal retirement in the face of income disaster.
Myopic optimization (MO) outperforms reinforcement learning (RL) in portfolio management: RL yields lower or negative returns, higher variance, larger costs, heavier CVaR, lower profitability, and greater model risk. We model execution/liquidation frictions with mark-to-market accounting. Using Malliavin calculus (Clark-Ocone/BEL), we derive policy gradients and a risk shadow price, unifying HJB and KKT conditions. This yields dual-gap and convergence results: geometric convergence for MO versus convergence floors for RL. We quantify phantom profit in RL via a Malliavin policy-gradient contamination analysis and define a control-affects-dynamics (CAD) premium of RL, which we argue is plausibly positive.
In volatile financial markets, balancing risk and return remains a significant challenge. Traditional approaches often focus solely on equity allocation, overlooking the strategic advantages of options trading for dynamic risk hedging. This work presents DeltaHedge, a multi-agent framework that integrates options trading with AI-driven portfolio management. By combining advanced reinforcement learning techniques with an ensembled options-based hedging strategy, DeltaHedge enhances risk-adjusted returns and stabilizes portfolio performance across varying market conditions. Experimental results demonstrate that DeltaHedge outperforms traditional strategies and standalone models, underscoring its potential to transform practical portfolio management in complex financial environments. Building on these findings, this paper contributes to the fields of quantitative finance and AI-driven portfolio optimization by introducing a novel multi-agent system for integrating options trading strategies, addressing a gap in the existing literature.
Prediction models calibrated using historical data may forecast poorly if the dynamics of the present and future differ from observations in the past. For this reason, predictions can be improved if information such as forward-looking views about the state of the system is used to refine the forecast. We develop an approach for combining a dynamic factor model for risky asset prices calibrated on historical data with noisy expert views of future values of the factors/covariates in the model, and study the implications for dynamic portfolio choice. By exploiting the graphical structure linking factors, asset prices, and views, we derive closed-form expressions for the dynamics of the factor and price processes after conditioning on the views. For linear factor models, the price process becomes a time-inhomogeneous affine process with a new covariate formed from the views. We establish a novel theoretical connection between the conditional factor process and a process we call a Mean-Reverting Bridge (MrB), an extension of the classical Brownian bridge. We derive the investor's optimal portfolio strategy and show that views influence both the myopic mean-variance term and the intertemporal hedge. The optimal dynamic portfolio when the long-run mean of the expected return is uncertain and learned online from data is also derived. More generally, our framework offers a generalizable approach for embedding forward-looking information about covariates in a dynamic factor model.
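The Gaussian conditioning that underlies such view-adjusted forecasts can be illustrated in a toy setting: a Brownian motion observed through a noisy view of its terminal value. This is an illustrative sketch of the mechanism only, not the paper's factor model; the function name and parameterization are our own, and the formulas follow from standard Gaussian conditioning.

```python
import numpy as np

def bm_given_view(x0, T, s2, v, t):
    """Condition a Brownian motion X started at x0 on a noisy terminal view
    v = X_T + eps, eps ~ N(0, s2). Returns (mean, var) of X_t given v.
    Since Cov(X_t, v) = t and Var(v) = T + s2, Gaussian conditioning gives
    the expressions below."""
    t = np.asarray(t, dtype=float)
    mean = x0 + t / (T + s2) * (v - x0)   # prior mean pulled toward the view
    var = t - t ** 2 / (T + s2)           # prior variance reduced by the view
    return mean, var
```

With s2 = 0 this reduces to the classical Brownian bridge pinned at v; as s2 grows the view becomes uninformative and the prior law is recovered, mirroring how noisy views interpolate between these extremes.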
We revisit the problem of portfolio selection, where an investor maximizes utility subject to a risk constraint. Our framework is very general and accommodates a wide range of utility and risk functionals, including non-concave utilities such as S-shaped utilities from prospect theory and non-convex risk measures such as Value at Risk. Our main contribution is a novel and complete characterization of well-posedness for utility-risk portfolio selection in one period that takes the interplay between the utility and the risk objectives fully into account. We show that under mild regularity conditions the minimal necessary and sufficient condition for well-posedness is given by a very simple either-or criterion: either the utility functional or the risk functional needs to satisfy the axiom of sensitivity to large losses. This makes it easy to classify many utility-risk pairs as well-posed or ill-posed, which we illustrate with a large number of examples. In the special case of expected utility maximization without a risk constraint (but including non-concave utilities), we show that well-posedness is fully characterized by the asymptotic loss-gain ratio, a simple and interpretable quantity that describes the investor's asymptotic relative weighting of large losses versus large gains.
Classical portfolio models collapse under structural breaks, while modern machine-learning allocators adapt flexibly but often at the cost of transparency and interpretability. This paper introduces Causal PDE-Control Models (CPCMs), a unifying framework that integrates causal inference, nonlinear filtering, and forward-backward partial differential equations for dynamic portfolio optimization. The framework delivers three theoretical advances: (i) the existence of conditional risk-neutral measures under evolving information sets; (ii) a projection-divergence duality that quantifies the stability cost of departing from the causal driver manifold; and (iii) causal completeness, establishing that a finite driver span can capture all systematic premia. Classical methods such as Markowitz, CAPM, and Black-Litterman appear as degenerate cases, while reinforcement learning and deep-hedging policies emerge as unconstrained, symmetry-breaking approximations. Empirically, CPCM solvers implemented with physics-informed neural networks achieve higher Sharpe ratios, lower turnover, and more persistent premia than both econometric and machine-learning benchmarks, using a global equity panel with more than 300 candidate drivers. By reframing portfolio optimization around structural causality and PDE control, CPCMs provide a rigorous, interpretable, and computationally tractable foundation for robust asset allocation under nonstationary conditions.
In this article, we study optimal investment and consumption in an incomplete stochastic factor model for a power utility investor on the infinite horizon. When the state space of the stochastic factor is finite, we give a complete characterisation of the well-posedness of the problem, and provide an efficient numerical algorithm for computing the value function. When the state space is a (possibly infinite) open interval and the stochastic factor is represented by an Itô diffusion, we develop a general theory of sub- and supersolutions for second-order ordinary differential equations on open domains without boundary values to prove existence of the solution to the Hamilton-Jacobi-Bellman (HJB) equation along with explicit bounds for the solution. By characterising the asymptotic behaviour of the solution, we are also able to provide rigorous verification arguments for various models, including -- for the first time -- the Heston model. Finally, we link the discrete and continuous settings and show that the value function in the diffusion setting can be approximated very efficiently through a fast discretisation scheme.
We study the problem of designing and hedging unit-linked life policies whose benefits depend on an investment fund that incorporates environmental criteria in its selection process. Offering these products poses two key challenges: constructing a green investment fund and developing a hedging strategy for policies written on that fund. We address these two problems separately. First, we design a portfolio selection rule driven by firms' carbon intensity that endogenously selects assets and avoids ad hoc pre-screens based on ESG scores. The effectiveness of our new portfolio selection method is tested using real market data. Second, we adopt the perspective of an insurance company issuing unit-linked policies written on this fund. Such contracts are exposed to market, carbon, and mortality risk, which the insurer seeks to hedge. Due to market incompleteness, we address the hedging problem via a quadratic approach aimed at minimizing the tracking error. We also carry out a numerical analysis to assess the performance of the hedging strategy. For our simulation study, we use an efficient weak second-order scheme that allows for variance reduction.
This study applies the Hierarchical Risk Parity (HRP) portfolio allocation methodology to the NUAM market, a regional holding that integrates the markets of Chile, Colombia and Peru. As one of the first empirical analyses of HRP in this newly formed Latin American context, the paper addresses a gap in the literature on portfolio construction under cross-border, emerging market conditions. HRP leverages hierarchical clustering and recursive bisection to allocate risk in a manner that is both interpretable and robust, avoiding the need to invert the covariance matrix, a common limitation of traditional mean-variance optimization. Using daily data from 54 constituent stocks of the MSCI NUAM Index from 2019 to 2025, we compare the performance of HRP against two standard benchmarks: an equally weighted portfolio (1/N) and a maximum Sharpe ratio portfolio. Results show that while the Max Sharpe portfolio yields the highest return, the HRP portfolio delivers a smoother risk-return profile, with lower drawdowns and tracking error. These findings highlight HRP's potential as a practical and resilient asset allocation framework for investors operating in integrated, high-volatility markets like NUAM.
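The three HRP steps named above (correlation-distance clustering, quasi-diagonal ordering, recursive bisection with inverse-variance splits) can be sketched compactly, following López de Prado's construction. This is an illustrative sketch, not the paper's code; function names are our own.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

def cluster_var(cov, idx):
    """Variance of a cluster under inverse-variance weights within the cluster."""
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)

def hrp_weights(returns):
    """returns: (T, N) array of asset returns -> HRP weight vector of length N."""
    cov = np.cov(returns, rowvar=False)
    corr = np.corrcoef(returns, rowvar=False)
    # Step 1: correlation-based distance and hierarchical clustering
    dist = np.sqrt(0.5 * (1.0 - corr))
    link = linkage(squareform(dist, checks=False), method="single")
    # Step 2: quasi-diagonal ordering of assets
    order = list(leaves_list(link))
    # Step 3: recursive bisection, splitting weight by inverse cluster variance
    weights = np.ones(len(order))
    clusters = [order]
    while clusters:
        clusters = [c[j:k] for c in clusters
                    for j, k in ((0, len(c) // 2), (len(c) // 2, len(c)))
                    if len(c) > 1]
        for i in range(0, len(clusters), 2):
            left, right = clusters[i], clusters[i + 1]
            var_l, var_r = cluster_var(cov, left), cluster_var(cov, right)
            alpha = 1.0 - var_l / (var_l + var_r)
            weights[left] *= alpha          # lower-variance side gets more weight
            weights[right] *= 1.0 - alpha
    return weights
```

Because each bisection splits a cluster's weight into alpha and 1 - alpha shares, the final weights are positive and sum to one without any explicit normalization, and the covariance matrix is never inverted.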
This paper proposes a reinforcement learning framework that employs Proximal Policy Optimization (PPO) to dynamically optimize the weights of multiple large language model (LLM)-generated formulaic alphas for stock trading strategies. Formulaic alphas are mathematically defined trading signals derived from price, volume, sentiment, and other data. Although recent studies have shown that LLMs can generate diverse and effective alphas, a critical challenge lies in how to adaptively integrate them under varying market conditions. To address this gap, we leverage the deepseek-r1-distill-llama-70b model to generate fifty alphas for five major stocks: Apple, HSBC, Pepsi, Toyota, and Tencent, and then use PPO to adjust their weights in real time. Experimental results demonstrate that the PPO-optimized strategy achieves strong returns and high Sharpe ratios across most stocks, outperforming both an equal-weighted alpha portfolio and traditional benchmarks such as the Nikkei 225, S&P 500, and Hang Seng Index. The findings highlight the importance of reinforcement learning in the allocation of alpha weights and show the potential of combining LLM-generated signals with adaptive optimization for robust financial forecasting and trading.
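The weighting step that such a PPO policy learns can be sketched independently of the RL machinery: the agent's action is a logit vector mapped through a softmax to a convex combination of the alpha signals. A minimal, hypothetical sketch (the function name is ours, and the PPO training loop that produces the logits is not shown):

```python
import numpy as np

def combine_alphas(alpha_signals, logits):
    """alpha_signals: (n_alphas,) latest value of each formulaic alpha.
    logits: (n_alphas,) raw action output of the policy network.
    Returns the softmax weights and the blended trading signal."""
    w = np.exp(logits - logits.max())  # numerically stable softmax
    w /= w.sum()
    return w, float(alpha_signals @ w)
```

Because the weights lie on the simplex, the blended signal is always a convex combination of the individual alphas, which is what lets the policy reweight them under changing market conditions without ever shorting an alpha.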
This study introduces a portfolio optimization framework to minimize mixed conditional value at risk (MCVaR), incorporating a chance constraint on expected returns and limiting the number of assets via cardinality constraints. A robust MCVaR model is presented, which presumes ellipsoidal support for random returns without assuming any distribution. The model utilizes an uncertainty set grounded in a reproducing kernel Hilbert space (RKHS) to manage the chance constraint, resulting in a simplified second-order cone programming (SOCP) formulation. The performance of the robust model is tested on datasets from six distinct financial markets. The outcomes of comprehensive experiments indicate that the robust model surpasses the nominal model, market portfolio, and equal-weight portfolio with higher expected returns, lower risk metrics, enhanced reward-risk ratios, and a better value of Jensen's alpha in many cases. Furthermore, we validate the robust model across different market phases (bullish, bearish, and neutral). The robust model shows a distinct advantage in bear markets, providing better risk protection against adverse conditions. In contrast, its performance in bullish and neutral phases is somewhat similar to that of the nominal model. The robust model appears effective in volatile markets, although further research is necessary to comprehend its performance across different market conditions.
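Setting aside the chance and cardinality constraints, the nominal mixed-CVaR objective reduces to a linear program via the Rockafellar-Uryasev representation: each CVaR level contributes an auxiliary threshold variable and one slack per scenario. A minimal sketch under assumed long-only weights and equiprobable scenarios (names, levels, and solver choice are ours, not the paper's):

```python
import numpy as np
from scipy.optimize import linprog

def mixed_cvar_weights(scenarios, betas=(0.90, 0.95), lams=(0.5, 0.5)):
    """Minimize sum_k lams[k] * CVaR_{betas[k]} of portfolio loss over
    equiprobable return scenarios (S, N). Long-only, fully invested."""
    S, N = scenarios.shape
    K = len(betas)
    n = N + K + K * S            # variables: w (N), t_k (K), u_{k,s} (K*S)
    c = np.zeros(n)
    for k, (b, lam) in enumerate(zip(betas, lams)):
        c[N + k] = lam                                     # t_k term
        c[N + K + k * S: N + K + (k + 1) * S] = lam / ((1 - b) * S)
    # u_{k,s} >= -r_s.w - t_k   <=>   -r_s.w - t_k - u_{k,s} <= 0
    A_ub = np.zeros((K * S, n))
    for k in range(K):
        rows = slice(k * S, (k + 1) * S)
        A_ub[rows, :N] = -scenarios
        A_ub[rows, N + k] = -1.0
        A_ub[np.arange(k * S, (k + 1) * S),
             N + K + k * S + np.arange(S)] = -1.0
    b_ub = np.zeros(K * S)
    A_eq = np.zeros((1, n)); A_eq[0, :N] = 1.0             # sum(w) = 1
    bounds = [(0, None)] * N + [(None, None)] * K + [(0, None)] * (K * S)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    return res.x[:N]
```

The robust SOCP version in the paper replaces the scenario constraints with ellipsoidal uncertainty and the RKHS-based chance constraint; the LP above is only the nominal core of that formulation.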
In the highly volatile and uncertain global financial markets, traditional quantitative trading models relying on statistical modeling or empirical rules often fail to adapt to dynamic market changes and black swan events due to rigid assumptions and limited generalization. To address these issues, this paper proposes QTMRL (Quantitative Trading Multi-Indicator Reinforcement Learning), an intelligent trading agent combining multi-dimensional technical indicators with reinforcement learning (RL) for adaptive and stable portfolio management. We first construct a comprehensive multi-indicator dataset using 23 years of S&P 500 daily OHLCV data (2000-2022) for 16 representative stocks across 5 sectors, enriching raw data with trend, volatility, and momentum indicators to capture holistic market dynamics. Then we design a lightweight RL framework based on the Advantage Actor-Critic (A2C) algorithm, including data processing, A2C algorithm, and trading agent modules to support policy learning and actionable trading decisions. Extensive experiments compare QTMRL with 9 baselines (e.g., ARIMA, LSTM, moving average strategies) across diverse market regimes, verifying its superiority in profitability, risk adjustment, and downside risk control. The code of QTMRL is publicly available at https://github.com/ChenJiahaoJNU/QTMRL.git
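The indicator-enrichment step described above can be sketched with pandas: derive trend, volatility, and momentum features from a close-price series. Column names and window lengths below are illustrative assumptions, not QTMRL's actual feature set:

```python
import numpy as np
import pandas as pd

def add_indicators(df, window=20):
    """df: DataFrame with a 'close' column -> copy with trend, volatility,
    and momentum features appended; rows with incomplete windows dropped."""
    out = df.copy()
    ret = out["close"].pct_change()
    out["sma"] = out["close"].rolling(window).mean()            # trend level
    out["trend"] = (out["close"] - out["sma"]) / out["sma"]     # distance to SMA
    out["vol"] = ret.rolling(window).std() * np.sqrt(252)       # annualized vol
    out["mom"] = out["close"] / out["close"].shift(window) - 1  # momentum
    return out.dropna()
```

In an RL pipeline these columns, suitably normalized, would form part of the state vector fed to the actor-critic networks.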
Environmental, Social, and Governance (ESG) factors aim to provide non-financial insights into corporations. In this study, we investigate whether we can extract relevant ESG variables to assess corporate risk, as measured by logarithmic volatility. We propose a novel Hierarchical Variable Selection (HVS) algorithm to identify a parsimonious set of variables from raw data that are most relevant to risk. HVS is specifically designed for ESG datasets characterized by a tree structure with significantly more variables than observations. Our findings demonstrate that HVS achieves significantly higher performance than models using pre-aggregated ESG scores. Furthermore, when compared with traditional variable selection methods, HVS achieves superior explanatory power using a more parsimonious set of ESG variables. We illustrate the methodology using company data from various sectors of the US economy.
The financial domain poses unique challenges for knowledge graph (KG) construction at scale due to the complexity and regulatory nature of financial documents. Despite the critical importance of structured financial knowledge, the field lacks large-scale, open-source datasets capturing rich semantic relationships from corporate disclosures. We introduce an open-source, large-scale financial knowledge graph dataset built from the latest annual SEC 10-K filings of all S&P 100 companies - a comprehensive resource designed to catalyze research in financial AI. We propose a robust and generalizable knowledge graph (KG) construction framework that integrates intelligent document parsing, table-aware chunking, and schema-guided iterative extraction with a reflection-driven feedback loop. Our system incorporates a comprehensive evaluation pipeline, combining rule-based checks, statistical validation, and LLM-as-a-Judge assessments to holistically measure extraction quality. We support three extraction modes - single-pass, multi-pass, and reflection-agent-based - allowing flexible trade-offs between efficiency, accuracy, and reliability based on user requirements. Empirical evaluations demonstrate that the reflection-agent-based mode consistently achieves the best balance, attaining a 64.8 percent compliance score against all rule-based policies (CheckRules) and outperforming baseline methods (single-pass and multi-pass) across key metrics such as precision, comprehensiveness, and relevance in LLM-guided evaluations.
This paper explores the bifurcative dynamics of an artificial stock market exchange (ASME) with endogenous, myopic traders interacting through a limit order book (LOB). We show that agent-based price dynamics possess intrinsic bistability, which is not a result of randomness but an emergent property of micro-level trading rules, where even identical initial conditions lead to qualitatively different long-run price equilibria: a deterministic zero-price state and a persistent positive-price equilibrium. The study also identifies a metastable region with elevated volatility between the basins of attraction and reveals distinct transient behaviors for trajectories converging to these equilibria. Furthermore, we observe that the system is neither entirely regular nor fully chaotic. By highlighting the emergence of divergent market outcomes from uniform beginnings, this work contributes a novel perspective on the inherent path dependence and complex dynamics of artificial stock markets.
Thematic investing, which aims to construct portfolios aligned with structural trends, remains a challenging endeavor due to overlapping sector boundaries and evolving market dynamics. A promising direction is to build semantic representations of investment themes from textual data. However, despite their power, general-purpose LLM embedding models are not well-suited to capture the nuanced characteristics of financial assets, since the semantic representation of investment assets may differ fundamentally from that of general financial text. To address this, we introduce THEME, a framework that fine-tunes embeddings using hierarchical contrastive learning. THEME aligns themes and their constituent stocks using their hierarchical relationship, and subsequently refines these embeddings by incorporating stock returns. This process yields representations effective for retrieving thematically aligned assets with strong return potential. Empirical results demonstrate that THEME excels in two key areas. For thematic asset retrieval, it significantly outperforms leading large language models. Furthermore, its constructed portfolios demonstrate compelling performance. By jointly modeling thematic relationships from text and market dynamics from returns, THEME generates stock embeddings specifically tailored for a wide range of practical investment applications.
Copula-based Conditional Value at Risk (CCVaR) is a real-valued alternative to the classical Conditional Value at Risk (CVaR) for multivariate random vectors. We aim to generalize CCVaR to several dimensions (d>=2) when the dependence structure is given by an Archimedean copula. Whereas previous research focused on the bivariate case and left the multivariate version unexplored, we derive an almost closed-form expression for CCVaR under an Archimedean copula. The conditions under which this risk measure satisfies coherence are then examined. Finally, numerical experiments based on real data are conducted to estimate CCVaR, and the results are compared with classical measures of Value at Risk (VaR) and Conditional Value at Risk (CVaR).
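As a numerical companion, one can sample from a d-dimensional Archimedean copula of Clayton type via the Marshall-Olkin mixture representation (generator psi(t) = (1+t)^(-1/theta) with a Gamma(1/theta, 1) frailty) and estimate VaR/CVaR of a resulting loss empirically. This is an illustrative sketch for simulation, not the paper's CCVaR estimator; function names are ours.

```python
import numpy as np

def clayton_sample(n, d, theta, rng):
    """n samples from a d-dimensional Clayton copula (theta > 0) via the
    Marshall-Olkin frailty construction: U_i = (1 + E_i / V)^(-1/theta),
    with V ~ Gamma(1/theta, 1) and E_i ~ Exp(1) i.i.d."""
    v = rng.gamma(1.0 / theta, 1.0, size=(n, 1))
    e = rng.exponential(1.0, size=(n, d))
    return (1.0 + e / v) ** (-1.0 / theta)

def empirical_var_cvar(losses, alpha=0.95):
    """Empirical VaR and CVaR of a loss sample at level alpha."""
    var = np.quantile(losses, alpha)
    cvar = losses[losses >= var].mean()   # mean loss in the tail beyond VaR
    return var, cvar
```

Larger theta induces stronger lower-tail dependence in the Clayton copula, which is precisely the regime where copula-aware tail measures diverge most from their independence-based counterparts.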
This paper investigates the important problem of appropriate variance-covariance matrix estimation in Modern Portfolio Theory. We propose a novel framework for variance-covariance matrix estimation for portfolio optimization, based on deep learning models. We employ long short-term memory (LSTM) recurrent neural networks (RNNs) along with two probabilistic deep learning models, DeepVAR and GPVAR, for the task of one-day-ahead multivariate forecasting. We then use these forecasts to optimize portfolios of stocks and cryptocurrencies. Our analysis presents results across different combinations of observation windows and rebalancing periods to compare the performance of classical and deep learning variance-covariance estimation methods. The conclusions of the study are that although the performance of the strategies (portfolios) differed significantly across combinations of parameters, the best results in terms of the information ratio and annualized returns are generally obtained using the LSTM-RNN models. Moreover, longer observation windows translate into better performance of the deep learning models, indicating that these methods require longer windows to efficiently capture the long-term dependencies of the variance-covariance matrix structure. Strategies with less frequent rebalancing typically perform better than those with the shortest rebalancing windows across all considered methods.
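The classical baseline that such deep estimators are compared against can be sketched as a walk-forward loop: re-estimate the sample variance-covariance matrix on a rolling observation window and hold, for instance, minimum-variance weights between rebalancing dates. The window and rebalancing choices below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def min_variance_weights(cov):
    """Unconstrained minimum-variance weights: w proportional to inv(cov) @ 1,
    normalized to sum to one (solve avoids forming the explicit inverse)."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / w.sum()

def backtest(returns, window=60, rebalance=5):
    """Walk-forward backtest on a (T, N) return matrix: re-estimate the
    sample covariance every `rebalance` days on the trailing `window` days
    and record daily portfolio returns out of sample."""
    T, N = returns.shape
    w = np.ones(N) / N
    pnl = []
    for t in range(window, T):
        if (t - window) % rebalance == 0:
            cov = np.cov(returns[t - window:t], rowvar=False)
            w = min_variance_weights(cov)
        pnl.append(returns[t] @ w)
    return np.array(pnl)
```

A deep learning variant would simply replace the np.cov call with the model's one-day-ahead covariance forecast, which is what makes observation-window and rebalancing-frequency comparisons like those in the study straightforward to run.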