computational-finance
Real-time calibration of stochastic volatility models (SVMs) is computationally bottlenecked by the need to repeatedly solve coupled partial differential equations (PDEs). In this work, we propose DeepSVM, a physics-informed Deep Operator Network (PI-DeepONet) designed to learn the solution operator of the Heston model across its entire parameter space. Unlike standard data-driven deep learning (DL) approaches, DeepSVM requires no labelled training data. Rather, we employ a hard-constrained ansatz that enforces terminal payoffs and static no-arbitrage conditions by design. Furthermore, we use Residual-based Adaptive Refinement (RAR) to stabilize training in difficult regions subject to high gradients. Overall, DeepSVM achieves a final training loss of $10^{-5}$ and predicts highly accurate option prices across a range of typical market dynamics. While pricing accuracy is high, we find that the model's derivatives (Greeks) exhibit noise in the at-the-money (ATM) regime, highlighting the specific need for higher-order regularization in physics-informed operator learning.
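As an illustration of the hard-constraint idea described above (not the authors' implementation), the sketch below folds the terminal payoff into the network ansatz so that the terminal condition holds by construction; the European-call payoff, the $(T-t)$ multiplier, and the network shape are assumptions.

```python
# Minimal sketch of a hard-constrained ansatz: the surrogate equals the terminal
# payoff exactly at t = T, so the terminal condition never has to be learned
# from a penalty term.  Network architecture and payoff are illustrative only.
import torch
import torch.nn as nn

class HardConstrainedPrice(nn.Module):
    def __init__(self, strike: float, maturity: float, hidden: int = 64):
        super().__init__()
        self.K, self.T = strike, maturity
        self.net = nn.Sequential(          # hypothetical trunk network N(S, v, t)
            nn.Linear(3, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, S, v, t):
        payoff = torch.clamp(S - self.K, min=0.0)      # European call payoff
        x = torch.stack([S, v, t], dim=-1)
        # The (T - t) factor vanishes at maturity, enforcing V(S, v, T) = payoff.
        return payoff + (self.T - t) * self.net(x).squeeze(-1)

model = HardConstrainedPrice(strike=100.0, maturity=1.0)
S = torch.full((4,), 105.0); v = torch.full((4,), 0.04); t = torch.full((4,), 1.0)
print(model(S, v, t))   # equals the payoff (5.0) at t = T, for any network weights
```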
Modern high-frequency trading (HFT) environments are characterized by sudden price spikes that present both risk and opportunity, but conventional financial models often fail to capture the required fine temporal structure. Spiking Neural Networks (SNNs) offer a biologically inspired framework well-suited to these challenges due to their natural ability to process discrete events and preserve millisecond-scale timing. This work investigates the application of SNNs to high-frequency price-spike forecasting, enhancing performance via robust hyperparameter tuning with Bayesian Optimization (BO). We convert high-frequency stock data into spike trains and evaluate three architectures: an established unsupervised STDP-trained SNN, a novel SNN with explicit inhibitory competition, and a supervised backpropagation network. BO is driven by a novel objective, Penalized Spike Accuracy (PSA), designed to ensure that a network's predicted price-spike rate aligns with the empirical rate of price events. Simulated trading demonstrated that models optimized with PSA consistently outperformed their Spike Accuracy (SA)-tuned counterparts and baselines. Specifically, the extended SNN model with PSA achieved the highest cumulative return (76.8%) in simple backtesting, significantly surpassing the supervised alternative (42.54% return). These results validate the potential of spiking networks, when robustly tuned with task-specific objectives, for effective price-spike forecasting in HFT.
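The abstract does not give the exact form of PSA, so the following is only a plausible reading of it: plain spike accuracy minus a penalty on the gap between predicted and empirical spike rates, with a hypothetical weight `lam`.

```python
# Sketch of a Penalized Spike Accuracy (PSA)-style objective for Bayesian
# hyperparameter tuning.  The exact penalty used in the paper is not specified
# here; this assumes a simple rate-mismatch penalty weighted by lam.
import numpy as np

def penalized_spike_accuracy(pred_spikes, true_spikes, lam=1.0):
    pred_spikes = np.asarray(pred_spikes, dtype=float)
    true_spikes = np.asarray(true_spikes, dtype=float)
    accuracy = np.mean(pred_spikes == true_spikes)           # plain spike accuracy (SA)
    rate_gap = abs(pred_spikes.mean() - true_spikes.mean())  # predicted vs. empirical spike rate
    return accuracy - lam * rate_gap                         # BO maximizes this score

# A degenerate "never spike" predictor scores well on SA but is penalized by PSA.
true_spikes = np.random.binomial(1, 0.05, size=10_000)
print(penalized_spike_accuracy(np.zeros_like(true_spikes), true_spikes))
```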
Evaluating faithfulness of Large Language Models (LLMs) to a given task is a complex challenge. We propose two new unsupervised metrics for faithfulness evaluation using insights from information theory and thermodynamics. Our approach treats an LLM as a bipartite information engine where hidden layers act as a Maxwell demon controlling transformations of context $C$ into answer $A$ via prompt $Q$. We model Question-Context-Answer (QCA) triplets as probability distributions over shared topics. Topic transformations from $C$ to $Q$ and $A$ are modeled as transition matrices ${\bf Q}$ and ${\bf A}$ encoding the query goal and actual result, respectively. Our semantic faithfulness (SF) metric quantifies faithfulness for any given QCA triplet by the Kullback-Leibler (KL) divergence between these matrices. Both matrices are inferred simultaneously via convex optimization of this KL divergence, and the final SF metric is obtained by mapping the minimal divergence onto the unit interval [0,1], where higher scores indicate greater faithfulness. Furthermore, we propose a thermodynamics-based semantic entropy production (SEP) metric in answer generation, and show that high faithfulness generally implies low entropy production. The SF and SEP metrics can be used jointly or separately for LLM evaluation and hallucination control. We demonstrate our framework on LLM summarization of corporate SEC 10-K filings.
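A toy illustration of the SF scoring step, assuming the transition matrices have already been inferred: the joint convex-optimization step from the paper is omitted, and the exp(-KL) map onto [0,1] is an assumed choice consistent with "higher scores indicate greater faithfulness".

```python
# Illustrative only: given already-inferred row-stochastic topic-transition
# matrices Q (query goal) and A (actual answer), score faithfulness by a KL
# divergence mapped onto [0, 1].
import numpy as np

def kl_between_transition_matrices(A, Q, eps=1e-12):
    A = np.asarray(A, float) + eps
    Q = np.asarray(Q, float) + eps
    A /= A.sum(axis=1, keepdims=True)          # keep rows as topic distributions
    Q /= Q.sum(axis=1, keepdims=True)
    return float(np.sum(A * np.log(A / Q)) / A.shape[0])   # mean row-wise KL

def semantic_faithfulness(A, Q):
    return float(np.exp(-kl_between_transition_matrices(A, Q)))  # 1.0 = fully faithful

Q = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
A_faithful = [[0.68, 0.22, 0.10], [0.12, 0.78, 0.10], [0.20, 0.25, 0.55]]
A_drifting = [[0.10, 0.10, 0.80], [0.70, 0.10, 0.20], [0.30, 0.60, 0.10]]
print(semantic_faithfulness(A_faithful, Q), semantic_faithfulness(A_drifting, Q))
```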
An information-theoretic method called ECLIPSE was developed to detect AI hallucinations in finance by linking a model's uncertainty to the quality of external evidence. This approach achieved a 92% reduction in the hallucination rate on a controlled financial question-answering dataset, relying solely on API-accessible token-level log probabilities.
Volatility clustering is one of the most robust stylized facts of financial markets, yet it is typically detected using moment-based diagnostics or parametric models such as GARCH. This paper shows that clustered volatility also leaves a clear imprint on the time-reversal symmetry of horizontal visibility graphs (HVGs) constructed on absolute returns in physical time. For each time point, we compute the maximal forward and backward visibility distances, $L^{+}(t)$ and $L^{-}(t)$, and use their empirical distributions to build a visibility-asymmetry fingerprint comprising the Kolmogorov--Smirnov distance, variance difference, entropy difference, and a ratio of extreme visibility spans. In a Monte Carlo study, these HVG asymmetry features sharply separate volatility-clustered GARCH(1,1) dynamics from i.i.d.\ Gaussian noise and from randomly shuffled GARCH series that preserve the marginal distribution but destroy temporal dependence; a simple linear classifier based on the fingerprint achieves about 90\% in-sample accuracy. Applying the method to daily S\&P500 data reveals a pronounced forward--backward imbalance, including a variance difference $\Delta\mathrm{Var}$ that exceeds the simulated GARCH values by two orders of magnitude and vanishes after shuffling. Overall, the visibility-graph asymmetry fingerprint emerges as a simple, model-free, and geometrically interpretable indicator of volatility clustering and time irreversibility in financial time series.
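A minimal sketch of the central geometric quantity: the maximal forward and backward horizontal-visibility distances on absolute returns, plus the variance-difference component of the fingerprint. The naive O(n^2) scan and the synthetic Gaussian input are for illustration only.

```python
# Forward/backward maximal horizontal-visibility distances L^+(t), L^-(t):
# two points are horizontally visible if all intermediate values lie strictly
# below the smaller of the two endpoints.
import numpy as np

def max_visibility_distances(x):
    n = len(x)
    L_fwd = np.zeros(n, dtype=int)
    L_bwd = np.zeros(n, dtype=int)
    for t in range(n):
        running_max = -np.inf
        for s in range(t + 1, n):                 # forward scan
            if running_max < min(x[t], x[s]):     # horizontal visibility criterion
                L_fwd[t] = s - t
            running_max = max(running_max, x[s])
            if running_max >= x[t]:
                break                             # nothing farther can see t
        running_max = -np.inf
        for s in range(t - 1, -1, -1):            # backward scan
            if running_max < min(x[t], x[s]):
                L_bwd[t] = t - s
            running_max = max(running_max, x[s])
            if running_max >= x[t]:
                break
    return L_fwd, L_bwd

rng = np.random.default_rng(0)
abs_returns = np.abs(rng.standard_normal(2000))   # i.i.d. stand-in; no clustering
L_fwd, L_bwd = max_visibility_distances(abs_returns)
delta_var = L_fwd.var() - L_bwd.var()             # one component of the fingerprint
print(delta_var)
```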
Despite its empirical success in modeling volatility, the rough Bergomi (rBergomi) model suffers from pricing and calibration difficulties stemming from its non-Markovian structure. To address this, we propose a comprehensive computational framework that enhances both simulation and calibration. First, we develop a modified Sum-of-Exponentials (mSOE) Monte Carlo scheme which hybridizes an exact simulation of the singular kernel near the origin with a multi-factor approximation for the remainder. This method achieves high accuracy, particularly for out-of-the-money options, with an $\mathcal{O}(n)$ computational cost. Second, building on this efficient pricing engine, we propose a distribution-matching calibration scheme that uses the Wasserstein distance as the optimization objective. This leverages a minimax formulation against Lipschitz payoffs, which effectively distributes pricing errors and improves robustness. Our numerical results confirm the mSOE scheme's convergence and demonstrate that the calibration algorithm reliably identifies model parameters and generalizes well to path-dependent options, offering a powerful and generic tool for practical model fitting.
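A much-simplified sketch of a distribution-matching calibration loop: the rBergomi engine is replaced by a placeholder location-scale model, and the 1-D empirical Wasserstein-1 distance (whose dual is a supremum over 1-Lipschitz payoffs) serves as the objective; common random numbers keep the objective smooth for the optimizer.

```python
# Distribution-matching calibration sketch with a placeholder model, NOT the
# paper's rBergomi pricer: minimize the Wasserstein-1 distance between
# simulated and reference samples.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
market_sample = rng.normal(loc=-0.02, scale=0.25, size=50_000)  # stand-in for market-implied returns
base = rng.standard_normal(50_000)                              # common random numbers

def calibration_loss(params):
    mu, sigma = params
    simulated = mu + abs(sigma) * base          # placeholder "model" sample (not rBergomi)
    return wasserstein_distance(simulated, market_sample)

res = minimize(calibration_loss, x0=[0.0, 0.2], method="Nelder-Mead")
print(res.x)                                    # recovers roughly (-0.02, 0.25)
```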
The aim of this paper is the analysis and selection of stock trading systems that combine different models with heterogeneous data, such as financial and microeconomic information. Specifically, building on previous work by the authors and applying advanced Machine Learning and Deep Learning techniques, our objective is to formulate trading algorithms for the stock market with empirically tested statistical advantages, thus improving results published in the literature. Our approach integrates Long Short-Term Memory (LSTM) networks with algorithms based on decision trees, such as Random Forest and Gradient Boosting. While the former analyze price patterns of financial assets, the latter are fed with economic data of companies. Numerical simulations of algorithmic trading with data from international companies and 10-weekday predictions confirm that an approach based on both fundamental and technical variables can outperform the usual approaches, which do not combine those two types of variables. Among the tree-based methods, Random Forest turned out to be the best performer. We also discuss how the prediction performance of such a hybrid approach can be boosted by selecting the technical variables.
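To make the combination concrete, here is a toy sketch with synthetic data and assumed thresholds: a Random Forest scores fundamental features while the LSTM's technical signal is stubbed out as a probability, and a position is taken only when both agree.

```python
# Toy hybrid-signal sketch (synthetic data, hypothetical feature names and
# 0.55 agreement thresholds); the LSTM leg is a placeholder probability.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
fundamentals = rng.normal(size=(500, 6))        # e.g. P/E, ROE, debt ratio, ... (hypothetical)
labels = (fundamentals[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)  # up/down in 10 weekdays

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(fundamentals[:400], labels[:400])
p_fundamental = rf.predict_proba(fundamentals[400:])[:, 1]

p_technical = rng.uniform(size=100)             # placeholder for the LSTM's probability output

signal = (p_fundamental > 0.55) & (p_technical > 0.55)   # trade only on agreement
print(signal.mean())
```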
We present an uncertainty-aware, physics-informed neural network (PINN) for option pricing that solves the Black--Scholes (BS) partial differential equation (PDE) as a mesh-free, global surrogate over $(S,t)$. The model embeds the BS operator and boundary/terminal conditions in a residual-based objective and requires no labeled prices. For American options, early exercise is handled via an obstacle-style relaxation while retaining the BS residual in the continuation region. To quantify \emph{epistemic} uncertainty, we introduce an anchored-ensemble fine-tuning stage (AT--PINN) that regularizes each model toward a sampled anchor and yields prediction bands alongside point estimates. On European calls/puts, the approach attains low errors (e.g., MAE $\sim 5\times10^{-2}$, RMSE $\sim 7\times10^{-2}$, explained variance $\approx 0.999$ in representative settings) and tracks ground truth closely across strikes and maturities. For American puts, the method remains accurate (MAE/RMSE on the order of $10^{-1}$ with EV $\approx 0.999$) and does not exhibit the error accumulation associated with time-marching schemes. Against data-driven baselines (ANN, RNN) and a Kolmogorov--Arnold FINN variant (KAN), our PINN matches or outperforms on accuracy while training more stably; anchored ensembles provide uncertainty bands that align with observed error scales. We discuss design choices (loss balancing, sampling near the payoff kink), limitations, and extensions to higher-dimensional BS settings and alternative dynamics.
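A compact sketch of the residual part of such a loss for the European Black--Scholes PDE, using automatic differentiation for $V_t$, $V_S$, $V_{SS}$; the boundary/terminal penalties, loss balancing, and the anchored-ensemble (AT--PINN) stage are left out, and the network size and sampling ranges are arbitrary.

```python
# PDE residual of the Black-Scholes operator at interior collocation points,
# evaluated with autograd; this is only the interior term of a full PINN loss.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
r, sigma = 0.05, 0.2                              # assumed rate and volatility

def bs_residual(S, t):
    S = S.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    V = net(torch.stack([S, t], dim=-1)).squeeze(-1)
    V_S, V_t = torch.autograd.grad(V.sum(), (S, t), create_graph=True)
    V_SS = torch.autograd.grad(V_S.sum(), S, create_graph=True)[0]
    # V_t + 0.5 sigma^2 S^2 V_SS + r S V_S - r V = 0 in the continuation region
    return V_t + 0.5 * sigma**2 * S**2 * V_SS + r * S * V_S - r * V

S = torch.rand(256) * 200.0                       # collocation points in (S, t)
t = torch.rand(256)
loss_pde = bs_residual(S, t).pow(2).mean()
loss_pde.backward()                               # gradients flow into the network weights
print(float(loss_pde))
```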
An end-to-end deep learning framework from Stanford University, named the Attention Factor Model, jointly optimizes factor identification, mispricing detection, and trading policy by explicitly incorporating transaction costs. The model demonstrates an annualized net Sharpe ratio of 2.28 in out-of-sample tests on U.S. equities, substantially outperforming prior two-step statistical arbitrage approaches.
This paper presents a Multi-Agent Bitcoin Trading system that utilizes Large Language Models (LLMs) for alpha generation and portfolio management in the cryptocurrency market. Unlike equities, cryptocurrencies exhibit extreme volatility and are heavily influenced by rapidly shifting market sentiment and regulatory announcements, making them difficult to model using static regression models or neural networks trained solely on historical data. The proposed framework overcomes this by structuring LLMs into specialised agents for technical analysis, sentiment evaluation, decision-making, and performance reflection. The agents improve over time via a novel verbal feedback mechanism whereby a Reflect agent provides daily and weekly natural-language critiques of trading decisions. These textual evaluations are then injected into future prompts of the agents, allowing them to adjust allocation logic without weight updates or finetuning. Back-testing on Bitcoin price data from July 2024 to April 2025 shows consistent outperformance across market regimes: the Quantitative agent delivered over 30\% higher returns in bullish phases and 15\% overall gains versus buy-and-hold, while the sentiment-driven agent turned sideways markets from a small loss into a gain of over 100\%. Adding weekly feedback further improved total performance by 31\% and reduced bearish losses by 10\%. The results demonstrate that verbal feedback represents a new, scalable, and low-cost approach to tuning LLMs for financial goals.
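A schematic sketch of the verbal-feedback mechanism (stubbed LLM calls, assumed prompt wording and memory length): the Reflect step writes a critique that is simply prepended to subsequent decision prompts, so no weights are updated.

```python
# Verbal-feedback loop sketch; `llm` is any callable prompt -> text, so no real
# LLM API is assumed here.
from collections import deque

class ReflectiveTradingAgent:
    def __init__(self, llm, max_reflections=7):
        self.llm = llm
        self.reflections = deque(maxlen=max_reflections)  # rolling daily critiques

    def decide(self, market_summary: str) -> str:
        memory = "\n".join(self.reflections) or "None yet."
        prompt = (
            "You are a Bitcoin portfolio manager.\n"
            f"Past critiques of your decisions:\n{memory}\n\n"
            f"Today's market summary:\n{market_summary}\n"
            "Output an allocation between 0% and 100% BTC with a one-line rationale."
        )
        return self.llm(prompt)

    def reflect(self, decision: str, realized_pnl: float) -> None:
        critique = self.llm(
            f"Decision: {decision}\nRealized PnL: {realized_pnl:+.2%}\n"
            "Write a one-sentence critique to improve future allocations."
        )
        self.reflections.append(critique)                 # injected into later prompts

agent = ReflectiveTradingAgent(llm=lambda prompt: "Allocate 50% BTC; momentum is mixed.")
print(agent.decide("BTC flat, funding rates neutral."))
agent.reflect("Allocate 50% BTC", realized_pnl=-0.012)
```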
Financial news is essential for accurate market prediction, but evolving narratives across macroeconomic regimes introduce semantic and causal drift that weaken model reliability. We present an evaluation framework to quantify robustness in financial NLP under regime shifts. The framework defines four metrics: (1) Financial Causal Attribution Score (FCAS) for alignment with causal cues, (2) Patent Cliff Sensitivity (PCS) for sensitivity to semantic perturbations, (3) Temporal Semantic Volatility (TSV) for drift in latent text representations, and (4) NLI-based Logical Consistency Score (NLICS) for entailment coherence. Applied to LSTM and Transformer models across four economic periods (pre-COVID, COVID, post-COVID, and rate hike), the metrics reveal performance degradation during crises. Semantic volatility and Jensen-Shannon divergence correlate with prediction error. Transformers are more affected by drift, while feature-enhanced variants improve generalisation. A GPT-4 case study confirms that alignment-aware models better preserve causal and logical consistency. The framework supports auditability, stress testing, and adaptive retraining in financial AI systems.
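For concreteness, here is a toy version of one of the four metrics, a TSV-style drift score; the embedding model, centroid aggregation, and synthetic data are assumptions rather than the paper's definition.

```python
# Temporal Semantic Volatility (TSV)-style score: average cosine drift between
# mean document embeddings of consecutive periods.
import numpy as np

def temporal_semantic_volatility(period_embeddings):
    """period_embeddings: list of (n_docs, dim) arrays, one per period."""
    centroids = [e.mean(axis=0) for e in period_embeddings]
    drifts = []
    for a, b in zip(centroids[:-1], centroids[1:]):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        drifts.append(1.0 - cos)                 # cosine distance between periods
    return float(np.mean(drifts))

rng = np.random.default_rng(0)
stable   = [rng.normal(5, 1, (100, 384)) for _ in range(4)]
drifting = [rng.normal(5, 1, (100, 384)) for _ in range(4)]
for k, e in enumerate(drifting):
    e[:, : 50 * k] += 10.0                       # progressively shift part of the embedding space
print(temporal_semantic_volatility(stable), temporal_semantic_volatility(drifting))
```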
AlphaSAGE introduces a framework combining Relational Graph Convolutional Networks with Generative Flow Networks to discover diverse and predictive formulaic alphas. It addresses reward sparsity, structural underrepresentation, and limited diversity in automated alpha mining, demonstrating superior performance across multiple financial markets.
Text and time series data offer complementary views of financial markets: news articles provide narrative context about company events, while stock prices reflect how markets react to those events. However, despite their complementary nature, effectively integrating these interleaved modalities for improved forecasting remains challenging. In this work, we propose a unified neural architecture that models these interleaved sequences using modality-specific experts, allowing the model to learn unique time series patterns, while still enabling joint reasoning across modalities and preserving pretrained language understanding capabilities. To further improve multimodal understanding, we introduce a cross-modal alignment framework with a salient token weighting mechanism that learns to align representations across modalities with a focus on the most informative tokens. We demonstrate the effectiveness of our approach on a large-scale financial forecasting task, achieving state-of-the-art performance across a wide variety of strong unimodal and multimodal baselines. We develop an interpretability method that reveals insights into the value of time-series context and reinforces the design of our cross-modal alignment objective. Finally, we demonstrate that these improvements translate to meaningful economic gains in investment simulations.
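A minimal sketch of what a salient-token-weighted alignment objective could look like (the actual loss in the paper may differ): softmax saliency weights rescale token-level cosine similarities between text hidden states and a pooled time-series representation.

```python
# Assumed form of a salient-token-weighted alignment loss; shapes and the
# saliency parameterization are illustrative, not the paper's specification.
import torch
import torch.nn.functional as F

def weighted_alignment_loss(text_tokens, ts_repr, saliency_logits):
    """text_tokens: (B, L, D); ts_repr: (B, D); saliency_logits: (B, L)."""
    w = torch.softmax(saliency_logits, dim=-1)                              # salient-token weights
    cos = F.cosine_similarity(text_tokens, ts_repr.unsqueeze(1), dim=-1)    # (B, L)
    return (1.0 - (w * cos).sum(dim=-1)).mean()   # pull weighted tokens toward the series

B, L, D = 8, 32, 256
loss = weighted_alignment_loss(torch.randn(B, L, D), torch.randn(B, D), torch.randn(B, L))
print(float(loss))
```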
The financial domain poses unique challenges for knowledge graph (KG) construction at scale due to the complexity and regulatory nature of financial documents. Despite the critical importance of structured financial knowledge, the field lacks large-scale, open-source datasets capturing rich semantic relationships from corporate disclosures. We introduce an open-source, large-scale financial knowledge graph dataset built from the latest annual SEC 10-K filings of all S&P 100 companies - a comprehensive resource designed to catalyze research in financial AI. We propose a robust and generalizable KG construction framework that integrates intelligent document parsing, table-aware chunking, and schema-guided iterative extraction with a reflection-driven feedback loop. Our system incorporates a comprehensive evaluation pipeline, combining rule-based checks, statistical validation, and LLM-as-a-Judge assessments to holistically measure extraction quality. We support three extraction modes - single-pass, multi-pass, and reflection-agent-based - allowing flexible trade-offs between efficiency, accuracy, and reliability based on user requirements. Empirical evaluations demonstrate that the reflection-agent-based mode consistently achieves the best balance, attaining a 64.8 percent compliance score against all rule-based policies (CheckRules) and outperforming baseline methods (single-pass and multi-pass) across key metrics such as precision, comprehensiveness, and relevance in LLM-guided evaluations.
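A high-level sketch of a reflection-driven extraction loop in the spirit described above (assumed prompts, simplified JSON parsing, `llm` is any callable returning text): extract, critique against rules, and re-prompt until the critique passes or the retry budget runs out.

```python
# Reflection-style extraction loop sketch; the rules, prompts, and parsing are
# placeholders, not the paper's actual CheckRules or pipeline code.
import json

RULES = ("Every triple must have non-empty subject, relation, object; "
         "relations must come from the schema; numeric values keep their units.")

def extract_with_reflection(llm, chunk: str, schema: list[str], max_rounds: int = 3):
    critique = ""
    triples = []
    for _ in range(max_rounds):
        prompt = (f"Schema relations: {schema}\n"
                  f"Rules: {RULES}\n"
                  f"{'Previous critique: ' + critique if critique else ''}\n"
                  f"Extract (subject, relation, object) triples as JSON from:\n{chunk}")
        triples = json.loads(llm(prompt))
        critique = llm(f"Check these triples against the rules.\nRules: {RULES}\n"
                       f"Triples: {triples}\nReply 'OK' or list the violations.")
        if critique.strip().upper() == "OK":
            return triples
    return triples        # best effort after the retry budget
```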
In financial trading, large language model (LLM)-based agents demonstrate significant potential. However, high sensitivity to market noise undermines the performance of LLM-based trading systems. To address this limitation, we propose a novel multi-agent system featuring an internal competitive mechanism inspired by modern corporate management structures. The system consists of two specialized teams: (1) Data Team - responsible for processing and condensing massive market data into diversified text factors, ensuring they fit the model's constrained context. (2) Research Team - tasked with making parallelized multipath trading decisions based on deep research methods. The core innovation lies in implementing a real-time evaluation and ranking mechanism within each team, driven by authentic market feedback. Each agent's performance undergoes continuous scoring and ranking, with only outputs from top-performing agents being adopted. This design enables the system to adaptively adjust to dynamic environments, enhances robustness against market noise, and ultimately delivers superior trading performance. Experimental results demonstrate that our proposed system significantly outperforms prevailing multi-agent systems and traditional quantitative investment methods across diverse evaluation metrics. ContestTrade is open-sourced on GitHub at this https URL.
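A toy sketch of the competition mechanism (the scoring rule and window length are assumptions): agents are ranked by a rolling average of realized feedback on their past suggestions, and only the top-k outputs are adopted.

```python
# Internal-competition sketch: rank agents by rolling realized feedback and
# adopt only the top performers' outputs.  Scoring rule is illustrative.
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    name: str
    scores: list = field(default_factory=list)    # realized PnL of past suggestions

    @property
    def rank_score(self) -> float:
        recent = self.scores[-20:]                # rolling window of market feedback
        return sum(recent) / len(recent) if recent else 0.0

def adopt_top_outputs(agents, outputs, top_k=2):
    """outputs: dict of agent name -> today's proposed trade/factor."""
    ranked = sorted(agents, key=lambda a: a.rank_score, reverse=True)
    return {a.name: outputs[a.name] for a in ranked[:top_k]}

agents = [AgentRecord("momentum", [0.01, 0.03]), AgentRecord("news", [-0.02]),
          AgentRecord("value", [0.02, 0.00, 0.01])]
outputs = {"momentum": "buy AAPL", "news": "stay flat", "value": "buy MSFT"}
print(adopt_top_outputs(agents, outputs))
```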
This paper explores the interplay between transfer policies, R\&D, corruption, and economic development using a general equilibrium model with heterogeneous agents and a government. The government collects taxes, redistributes fiscal revenues, and undertakes public investment (in R\&D, infrastructure, etc.). Corruption is modeled as a fraction of tax revenues that is siphoned off and removed from the economy. We first establish the existence of a political-economic equilibrium. Then, using an analytically tractable framework with two private agents, we examine the effects of corruption and evaluate the impact of various policies, including redistribution and innovation-led strategies.
R&D-Agent-Quant (R&D-Agent(Q)) is a multi-agent framework that automates the entire quantitative financial research and development pipeline by jointly optimizing financial factors and predictive models. It achieves superior investment strategy performance with up to 2 times higher annualized returns and 70% fewer factors compared to state-of-the-art baselines, demonstrating robustness across diverse financial markets.
Researchers from Georgia Institute of Technology and collaborators developed the World Central Banks (WCB) dataset, the largest monetary policy corpus, comprising over 380,000 sentences from 25 central banks, 25,000 of which are human-annotated for stance, temporality, and uncertainty. Their framework demonstrates that models trained on this aggregated dataset achieve superior performance (e.g., RoBERTa-Large with 0.740 F1 for stance detection) compared to bank-specific models, and shows that the derived hawkishness measure tracks inflation trends.
Researchers introduce PCIE (Patched Channel Integration Encoder), combining tokenization of stock price data with adaptive temporal learning and channel mixing self-attention to improve multi-step stock price forecasting, demonstrating superior performance compared to existing models across multiple datasets and prediction intervals.
In an environment of increasingly volatile financial markets, the accurate estimation of risk remains a major challenge. Traditional econometric models, such as GARCH and its variants, are based on assumptions that are often too rigid to adapt to the complexity of current market dynamics. To overcome these limitations, we propose a hybrid framework for Value-at-Risk (VaR) estimation, combining GARCH volatility models with deep reinforcement learning. Our approach incorporates directional market forecasting using the Double Deep Q-Network (DDQN) model, treating the task as an imbalanced classification problem. This architecture enables the dynamic adjustment of risk-level forecasts according to market conditions. Empirical validation on daily Eurostoxx 50 data covering periods of crisis and high volatility shows a significant improvement in the accuracy of VaR estimates, as well as a reduction in both the number of breaches and capital requirements, while respecting regulatory risk thresholds. The model's ability to adjust risk levels in real time reinforces its relevance to modern, proactive risk management.
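As a point of reference for the GARCH leg only, the sketch below computes a one-day-ahead parametric VaR from a GARCH(1,1) forecast using the `arch` package on synthetic returns; the DDQN-based directional adjustment described above is not reproduced.

```python
# One-day-ahead parametric VaR from a GARCH(1,1) forecast (baseline only).
import numpy as np
from arch import arch_model
from scipy.stats import norm

rng = np.random.default_rng(0)
returns = rng.standard_t(df=6, size=1500)                 # synthetic daily % returns, not Eurostoxx 50 data

res = arch_model(returns, vol="GARCH", p=1, q=1, dist="normal").fit(disp="off")
fcast = res.forecast(horizon=1)
mu_next = fcast.mean.values[-1, 0]                        # next-day conditional mean
sigma_next = np.sqrt(fcast.variance.values[-1, 0])        # next-day conditional volatility

alpha = 0.01                                              # 99% one-day VaR
var_99 = -(mu_next + sigma_next * norm.ppf(alpha))
print(f"1-day 99% VaR: {var_99:.2f}% of portfolio value")
```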