Centre d’Economie de la Sorbonne, Université Paris 1 Panthéon-Sorbonne
In recent years, academics, regulators, and market practitioners have increasingly addressed liquidity issues. Amongst the numerous problems addressed, the optimal execution of large orders is probably the one that has attracted the most research effort, mainly in the case of single-asset portfolios. In practice, however, optimal execution problems often involve large portfolios comprising numerous assets, and models should consequently account for risks at the portfolio level. In this paper, we address multi-asset optimal execution in a model where prices have multivariate Ornstein-Uhlenbeck dynamics and where the agent maximizes the expected (exponential) utility of her PnL. We use the tools of stochastic optimal control and simplify the initial multidimensional Hamilton-Jacobi-Bellman equation into a system of ordinary differential equations (ODEs) involving a matrix Riccati ODE for which classical existence theorems do not apply. By using a priori estimates obtained thanks to optimal control tools, we nevertheless prove an existence and uniqueness result for the latter ODE, and then deduce a verification theorem that provides a rigorous solution to the execution problem. Using examples based on data from the foreign exchange and stock markets, we finally illustrate our results and discuss their implications for both optimal execution and statistical arbitrage.
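To fix notation, a minimal sketch of the type of setup described in this abstract (the symbols below are ours, not necessarily the paper's) is
\[
dS_t = R\,(\overline{S} - S_t)\,dt + V\,dW_t, \qquad \sup_{(v_t)_t}\ \mathbb{E}\Big[-\exp\big(-\gamma\,\mathrm{PnL}_T\big)\Big],
\]
where $S_t$ is the vector of asset prices, $R$ and $V$ are the matrices driving the multivariate Ornstein-Uhlenbeck dynamics, $v_t$ is the execution rate, and $\gamma$ is the absolute risk-aversion parameter; plugging an exponential-quadratic ansatz for the value function into the Hamilton-Jacobi-Bellman equation is the natural route leading to a matrix Riccati ODE of the kind mentioned above.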
Airborne Laser Scanning (ALS) technology has transformed modern archaeology by unveiling hidden landscapes beneath dense vegetation. However, the lack of expert-annotated, open-access resources has hindered the analysis of ALS data using advanced deep learning techniques. We address this limitation with Archaeoscape (available at this https URL), a novel large-scale archaeological ALS dataset spanning 888 km$^2$ in Cambodia with 31,141 annotated archaeological features from the Angkorian period. Archaeoscape is over four times larger than comparable datasets, and the first ALS archaeology resource with open-access data, annotations, and models. We benchmark several recent segmentation models to demonstrate the benefits of modern vision techniques for this problem and highlight the unique challenges of discovering subtle human-made structures under dense jungle canopies. By making Archaeoscape available in open access, we hope to bridge the gap between traditional archaeology and modern computer vision methods.
We consider a system of binary interacting chains describing the dynamics of a group of $N$ components that, at each time unit, either send some signal to the others or remain silent. The interactions among the chains are encoded by a directed Erdős-Rényi random graph with unknown parameter $p \in (0, 1)$. Moreover, the system is structured within two populations (excitatory chains versus inhibitory ones) which are coupled via a mean field interaction on the underlying Erdős-Rényi graph. In this paper, we address the question of inferring the connectivity parameter $p$ based only on the observation of the interacting chains over $T$ time units. In our main result, we show that the connectivity parameter $p$ can be estimated at rate $N^{-1/2}+N^{1/2}/T+(\log(T)/T)^{1/2}$ through an easy-to-compute estimator. Our analysis relies on a precise study of the spatio-temporal decay of correlations of the interacting chains. This is done through the study of coalescing random walks defining a backward regeneration representation of the system. Interestingly, we also show that this backward regeneration representation allows us to perfectly sample the system of interacting chains (conditionally on each realization of the underlying Erdős-Rényi graph) from its stationary distribution. These probabilistic results are of interest in their own right.
Over the past decade, many dealers have implemented algorithmic models to automatically respond to RFQs and manage flows originating from their electronic platforms. In parallel, building on the foundational work of Ho and Stoll, and later Avellaneda and Stoikov, the academic literature on market making has expanded to address trade size distributions, client tiering, complex price dynamics, alpha signals, and the internalization versus externalization dilemma in markets with dealer-to-client and interdealer-broker segments. In this paper, we tackle two critical dimensions: adverse selection, arising from the presence of informed traders, and price reading, whereby the market maker's own quotes inadvertently reveal the direction of its inventory. These risks are well known to practitioners, who routinely face informed flows and algorithms capable of extracting signals from quoting behavior. Yet they have received limited attention in the quantitative finance literature, beyond stylized toy models of limited practical relevance. Extending the existing literature, we propose a tractable and implementable framework that enables market makers to adjust their quotes with greater awareness of informational risk.
We provide explicit series expansions for certain stochastic path-dependent integral equations in terms of the path signature of the time-augmented driving Brownian motion. Our framework encompasses a large class of stochastic linear Volterra and delay equations, and in particular the fractional Brownian motion with Hurst index $H \in (0, 1)$. Our expressions allow us to disentangle an infinite-dimensional Markovian structure and open the door to straightforward and simple approximation schemes, which we illustrate numerically.
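For context (our notation, independent of the paper's exact conventions), the signature of the time-augmented Brownian path $\widehat{W}_s = (s, W_s)$ is the collection of iterated integrals
\[
\mathbb{S}(\widehat{W})_{0,t} \;=\; \Big(1,\ \int_0^t d\widehat{W}_{s_1},\ \int_0^t\!\!\int_0^{s_2} d\widehat{W}_{s_1} \otimes d\widehat{W}_{s_2},\ \dots \Big),
\]
and a series expansion in the signature means representing the solution as a linear functional $\sum_{w} c_w\, \langle \mathbb{S}(\widehat{W})_{0,t}, w\rangle$ over words $w$, with deterministic coefficients $c_w$.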
Large Language Models (LLMs) such as ChatGPT have demonstrated the potential to replicate human language abilities through technology, ranging from text generation to engaging in conversations. However, it remains controversial to what extent these systems truly understand language. We examine this issue by narrowing the question down to the semantics of LLMs at the word and sentence level. By examining the inner workings of LLMs and the representations of language they generate, and by drawing on classical semantic theories by Frege and Russell, we obtain a more nuanced picture of the potential semantic capabilities of LLMs.
This note is a companion article to the recent paper Löcherbach, Loukianova, Marini (2024). We consider mean field systems of interacting particles. Each particle jumps with a jump rate depending on its position. When jumping, a macroscopic quantity is added to its own position. Moreover, simultaneously, all other particles of the system receive a small random kick which is distributed according to a positive $\alpha$-stable law and scaled by $N^{-1/\alpha}$, where $0 < \alpha < 1$. In between successive jumps of the system, the particles follow a deterministic flow with drift depending on their position and on the empirical measure of the total system. In a more general framework where jumps and state space do not need to be positive, we have shown in Löcherbach, Loukianova, Marini (2024) that the mean field limit of this system is a McKean-Vlasov type process which is the solution of a non-linear SDE driven by an $\alpha$-stable process. Moreover, we have obtained in Löcherbach, Loukianova, Marini (2024) an upper bound for the strong rate of convergence with respect to a specific distance disregarding the big jumps of the limit stable process. In the present note we consider the specific situation where all jumps are positive and particles take values in $[0, +\infty[$. We show that in this case it is possible to improve upon the error bounds obtained in Löcherbach, Loukianova, Marini (2024) by using an ad hoc distance obtained after applying a concave space transform to the trajectories. The distance we propose here takes into account the big jumps of the limit $\alpha$-stable subordinator.
This paper examines the impact of temperature shocks on European Parliament elections. We combine high-resolution climate data with results from parliamentary elections between 1989 and 2019, aggregated at the NUTS-2 regional level. Exploiting exogenous variation in unusually warm and hot days during the months preceding elections, we identify the effect of short-run temperature shocks on voting behaviour. We find that temperature shocks reduce ideological polarisation and increase vote concentration, as voters consolidate around larger, more moderate parties. This aggregate pattern is explained by gains in support for liberal and, to a lesser extent, social democratic parties, while right-wing parties lose vote share. Consistent with a salience mechanism, complementary analysis of party manifestos shows greater emphasis on climate-related issues in warmer pre-electoral contexts. Overall, our findings indicate that climate shocks can shift party systems toward the centre and weaken political extremes.
The Queue-Reactive model introduced by Huang et al. (2015) has become a standard tool for limit order book modeling, widely adopted by both researchers and practitioners for its simplicity and effectiveness. We present the Multidimensional Deep Queue-Reactive (MDQR) model, which extends this framework in three ways: it relaxes the assumption of queue independence, enriches the state space with market features, and models the distribution of order sizes. Through a neural network architecture, the model learns complex dependencies between different price levels and adapts to varying market conditions, while preserving the interpretable point-process foundation of the original framework. Using data from the Bund futures market, we show that MDQR captures key market properties, including the square-root law of market impact, cross-queue correlations, and realistic order size patterns. The model demonstrates particular strength in reproducing both conditional and stationary distributions of order sizes, as well as various stylized facts of market microstructure, while maintaining the computational efficiency needed for practical applications such as strategy development through reinforcement learning or realistic backtesting.
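As a rough illustration of the kind of architecture such a model could rely on (a minimal sketch of our own; the feature set, layer sizes and event typology are invented for illustration and are not the authors' implementation):

```python
import torch
import torch.nn as nn

class QueueReactiveNet(nn.Module):
    """Toy network mapping an order-book state to event intensities and to a
    categorical distribution over order sizes, in the spirit of a
    multidimensional queue-reactive model (illustrative only)."""

    def __init__(self, n_levels=10, n_event_types=6, n_size_buckets=20, hidden=64):
        super().__init__()
        # state = bid/ask queue sizes at n_levels price levels + 4 extra market features
        self.body = nn.Sequential(
            nn.Linear(2 * n_levels + 4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.intensity_head = nn.Linear(hidden, n_event_types)          # one rate per event type
        self.size_head = nn.Linear(hidden, n_event_types * n_size_buckets)
        self.n_event_types, self.n_size_buckets = n_event_types, n_size_buckets

    def forward(self, state):
        h = self.body(state)
        intensities = torch.nn.functional.softplus(self.intensity_head(h))   # positive rates
        size_logits = self.size_head(h).view(-1, self.n_event_types, self.n_size_buckets)
        return intensities, size_logits   # arrival rates and per-event size distributions
```

Conditioning both the rates and the size distribution on the full book state is what lets such a sketch capture cross-queue dependencies that the original single-queue model rules out by construction.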
In this article, we delve into the applications and extensions of the queue-reactive model for the simulation of limit order books. Our approach emphasizes the importance of order sizes, in conjunction with their type and arrival rate, by using the current state of the order book to determine not only the intensity and type of order arrivals but also their sizes. These extensions generate simulated markets that are in line with numerous stylized facts of the market. Our empirical calibration, using futures on German bonds, reveals that the extended queue-reactive model significantly improves the description of order flow properties and the shape of queue distributions. Moreover, our findings demonstrate that the extended model produces simulated markets with a volatility comparable to historical real data, using only endogenous information from the limit order book. This research underscores the potential of the queue-reactive model and its extensions in accurately simulating market dynamics and providing valuable insights into the complex nature of limit order book modeling.
Community-based fact-checking is a promising approach to verify social media content and correct misleading posts at scale. Yet, causal evidence regarding its effectiveness in reducing the spread of misinformation on social media is missing. Here, we performed a large-scale empirical study to analyze whether community notes reduce the spread of misleading posts on X. Using a Difference-in-Differences design and repost time series data for N=237,677 (community fact-checked) cascades that had been reposted more than 431 million times, we found that exposing users to community notes reduced the spread of misleading posts by, on average, 62.0%. Furthermore, community notes increased the odds that users delete their misleading posts by 103.4%. However, our findings also suggest that community notes might be too slow to intervene in the early (and most viral) stage of the diffusion. Our work offers important implications for enhancing the effectiveness of community-based fact-checking approaches on social media.
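A minimal sketch of the kind of Difference-in-Differences regression this design suggests (variable names, the panel layout and the fixed-effects structure are our own assumptions, not the paper's exact specification):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical panel: one row per cascade and per hour relative to the note, with
# log_reposts, treated (note eventually displayed) and post (after the display time).
df = pd.read_csv("cascade_panel.csv")

# Two-way fixed-effects DiD on a manageable subsample: cascade and relative-time
# fixed effects; the coefficient on treated:post is the estimated treatment effect.
model = smf.ols(
    "log_reposts ~ treated:post + C(cascade_id) + C(hour_rel)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["cascade_id"]})
print(model.summary())
```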
Simulation methods have always been instrumental in finance, and data-driven methods with minimal model specification, commonly referred to as generative models, have attracted increasing attention, especially after the success of deep learning in a broad range of fields. However, the adoption of these models in financial applications has not matched the growing interest, probably due to the unique complexities and challenges of financial markets. This paper contributes to a deeper understanding of the limitations of generative models, particularly in portfolio and risk management. To this end, we begin by presenting theoretical results on the importance of the initial sample size, and point out the potential pitfalls of generating far more data than originally available. We then highlight the inseparable nature of model development and the intended use by touching on a paradox: standard generative models inherently care little about what matters most for constructing portfolios (in particular long-short ones). Based on these findings, we propose a pipeline for the generation of multivariate returns that meets conventional evaluation standards on a large universe of US equities while being consistent with stylized facts observed in asset returns and avoiding the pitfalls we previously identified. Moreover, we stress the need for more accurate evaluation methods, and suggest, through an example of mean-reversion strategies, a method designed to identify poor models for a given application based on regurgitative training, i.e. retraining the model on the data it has itself generated, a procedure closely connected to the statistical notion of identifiability.
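A schematic of the regurgitative-training check described above (function names and interfaces are placeholders for the reader's own generative model and evaluation metric; this is not the paper's code):

```python
import numpy as np

def regurgitative_check(fit, sample, real_returns, metric, n_rounds=3):
    """Refit a generative model on its own synthetic output several times and
    track how a downstream metric drifts from its value on the real data.
    `fit(data) -> model`, `sample(model, n) -> data`, `metric(data) -> float`
    are placeholders for the user's own model and evaluation."""
    baseline = metric(real_returns)
    model = fit(real_returns)
    drifts = []
    for _ in range(n_rounds):
        synthetic = sample(model, len(real_returns))   # data generated by the model itself
        model = fit(synthetic)                         # retrain on its own output
        drifts.append(abs(metric(synthetic) - baseline))
    return np.array(drifts)   # rapidly growing drift hints at a poorly identified model
```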
We present an empirical study examining several claims related to option prices in the rough volatility literature using SPX options data. Our results show that rough volatility models with the parameter $H \in (0,1/2)$ are inconsistent with the global shape of SPX smiles. In particular, the at-the-money SPX skew is incompatible with the power-law shape generated by these models, which increases too fast for short maturities and decays too slowly for longer maturities. For maturities between one week and three months, rough volatility models underperform one-factor Markovian models with the same number of parameters. When extended to longer maturities, rough volatility models do not consistently outperform one-factor Markovian models. Our study identifies a non-rough path-dependent model and a two-factor Markovian model that outperform their rough counterparts in capturing SPX smiles between one week and three years, with only 3 to 4 parameters.
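For reference, the power-law shape referred to above is the standard short-maturity behaviour of the at-the-money implied volatility skew in rough volatility models (our notation):
\[
\mathcal{S}(T) \;:=\; \big|\partial_K \sigma_{\mathrm{BS}}(K, T)\big|_{K=S_0} \;\sim\; c\, T^{H-1/2},
\]
which, for $H \in (0,1/2)$, blows up as $T \to 0$ and decays slowly in $T$; this is precisely the behaviour the study finds to be too steep at short maturities and too flat at long ones relative to SPX data.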
We consider the stochastic system of interacting neurons introduced in De Masi et al. (2015) and in Fournier and Löcherbach (2016) and then further studied in Erny, Löcherbach and Loukianova (2021) in a diffusive scaling. The system consists of $N$ neurons, each spiking randomly with rate depending on its membrane potential. At its spiking time, the potential of the spiking neuron is reset to 0 and all other neurons receive an additional amount of potential which is a centred random variable of order $1/\sqrt{N}$. In between successive spikes, each neuron's potential follows a deterministic flow. In a previous article we proved the convergence of the system, as $N \to \infty$, to a limit nonlinear jumping stochastic differential equation. In the present article we complete this study by establishing a strong convergence result, stated with respect to an appropriate distance, with an explicit rate of convergence. The main technical ingredient of our proof is the coupling introduced in Komlós, Major and Tusnády (1976) of the point process representing the small jumps of the particle system with the limit Brownian motion.
Social media usage is often cited as a potential driver behind rising suicide rates. However, distinguishing a causal effect, whereby social media increases the risk of suicide, from reverse causality, whereby individuals already at higher risk of suicide are more likely to use social media, remains a significant challenge. In this paper, we use an instrumental variable approach to study the quasi-exogenous geographical adoption of Twitter and its causal relationship with suicide rates. Our analysis first demonstrates that Twitter's geographical adoption was driven by the presence of certain users at the 2007 SXSW festival, which led to long-term disparities in adoption rates across counties in the United States. Then, using a two-stage least squares (2SLS) regression and controlling for a wide range of geographic, socioeconomic and demographic factors, we find no significant relationship between Twitter adoption and suicide rates.
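A bare-bones version of the two-stage least squares step described above (file name, column names and controls are hypothetical placeholders, not the paper's data):

```python
import numpy as np
import pandas as pd

df = pd.read_csv("county_panel.csv")                 # hypothetical county-level data
y = df["suicide_rate"].to_numpy()
d = df["twitter_adoption"].to_numpy()                # endogenous regressor
z = df["sxsw_exposure_2007"].to_numpy()              # instrument
X = np.column_stack([np.ones(len(df)),               # intercept + illustrative controls
                     df[["median_income", "pct_urban"]].to_numpy()])

# First stage: project the endogenous regressor on the instrument and controls.
W1 = np.column_stack([X, z])
d_hat = W1 @ np.linalg.lstsq(W1, d, rcond=None)[0]

# Second stage: regress the outcome on the fitted values and the same controls.
W2 = np.column_stack([X, d_hat])
beta = np.linalg.lstsq(W2, y, rcond=None)[0]
print("2SLS coefficient on Twitter adoption:", beta[-1])
```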
This paper introduces and examines numerical approximation schemes for computing risk budgeting portfolios associated with positive homogeneous and sub-additive risk measures. We employ Mirror Descent algorithms to determine the optimal risk budgeting weights in both deterministic and stochastic settings, establishing convergence along with an explicit non-asymptotic quantitative rate for the averaged algorithm. A comprehensive numerical analysis follows, illustrating our theoretical findings across various risk measures -- including standard deviation, Expected Shortfall, deviation measures, and Variantiles -- and comparing the performance with that of the standard stochastic gradient descent method recently proposed in the literature.
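To fix ideas, here is a toy deterministic mirror-descent iteration for risk budgeting under the standard-deviation risk measure, using the entropic mirror map on the positive orthant (step size, data and stopping rule are illustrative, not the paper's algorithm verbatim):

```python
import numpy as np

def risk_budgeting_md(Sigma, budgets, eta=0.05, n_iter=5000):
    """Mirror descent (multiplicative updates, entropic mirror map) for the convex
    reformulation  min_{y > 0}  sqrt(y' Sigma y) - sum_i b_i log y_i ;
    the normalized minimizer gives the risk budgeting weights."""
    y = np.ones(len(budgets))
    for _ in range(n_iter):
        vol = np.sqrt(y @ Sigma @ y)
        grad = Sigma @ y / vol - budgets / y   # gradient of the objective
        y = y * np.exp(-eta * grad)            # entropic mirror-descent step
    return y / y.sum()

# Toy example: 3 assets, equal risk budgets.
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
w = risk_budgeting_md(Sigma, budgets=np.ones(3) / 3)
print(w, "risk contributions:", w * (Sigma @ w) / np.sqrt(w @ Sigma @ w))
```

The multiplicative update keeps the iterates positive by construction, which is what the entropic mirror map buys over a plain projected gradient step here.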
We study in this paper the consequences of using the Mean Absolute Percentage Error (MAPE) as a measure of quality for regression models. We show that finding the best model under the MAPE is equivalent to doing weighted Mean Absolute Error (MAE) regression. We show that universal consistency of Empirical Risk Minimization remains possible using the MAPE instead of the MAE.
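The equivalence invoked in this abstract can be written explicitly (our notation):
\[
\mathrm{MAPE}(f) \;=\; \mathbb{E}\!\left[\frac{|Y - f(X)|}{|Y|}\right] \;=\; \mathbb{E}\big[\,w(Y)\,|Y - f(X)|\,\big], \qquad w(y) = \frac{1}{|y|},
\]
so minimizing the MAPE amounts to an MAE regression in which each observation is weighted by the inverse absolute value of its target.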
Building on the well-posedness of the backward Kolmogorov partial differential equation in the Wasserstein space, we analyze the strong and weak convergence rates for approximating the unique solution of a class of McKean-Vlasov stochastic differential equations via the Euler-Maruyama time discretization scheme applied to the associated system of interacting particles. We consider two distinct settings. In the first, the coefficients and test function are irregular, but the diffusion coefficient remains non-degenerate. Leveraging the smoothing properties of the underlying heat kernel, we establish the strong and weak convergence rates of the scheme in terms of the number of particles $N$ and the mesh size $h$. In the second setting, where both the coefficients and the test function are smooth, we demonstrate that the weak error rate at the level of the semigroup is optimal, achieving an error of order $N^{-1} + h$.
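For concreteness, a compact sketch of the particle-system Euler-Maruyama scheme whose convergence rates are analyzed above, with toy coefficients of our own choosing (the paper treats general, possibly irregular coefficients):

```python
import numpy as np

def particle_euler_maruyama(N=1000, T=1.0, h=0.01, x0=0.0, seed=0):
    """Euler-Maruyama discretization of an interacting particle system
    approximating a McKean-Vlasov SDE; here with the toy coefficients
    b(x, m) = -(x - m) and sigma(x) = 1 (our choice, for illustration)."""
    rng = np.random.default_rng(seed)
    X = np.full(N, x0, dtype=float)
    for _ in range(int(T / h)):
        m = X.mean()                                   # empirical-measure interaction
        drift = -(X - m)                               # b(X_i, mu_N)
        X = X + drift * h + np.sqrt(h) * rng.standard_normal(N)
    return X   # samples approximating the law of the McKean-Vlasov solution at time T

samples = particle_euler_maruyama()
print(samples.mean(), samples.std())
```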
With the emergence of decentralized finance, new trading mechanisms called Automated Market Makers have appeared. The most popular Automated Market Makers are Constant Function Market Makers. They have been studied both theoretically and empirically. In particular, the concept of impermanent loss has emerged and explains part of the profit and loss of liquidity providers in Constant Function Market Makers. In this paper, we propose another mechanism in which price discovery does not solely rely on liquidity takers but also on an external exchange rate or price oracle. We also propose to compare the different mechanisms from the point of view of liquidity providers by using a mean/variance analysis of their profit and loss compared to that of agents holding assets outside of Automated Market Makers. In particular, inspired by Markowitz's modern portfolio theory, we manage to obtain an efficient frontier for the performance of liquidity providers in the idealized case of a perfect oracle. Beyond that idealized case, we show that even when the oracle is lagged and in the presence of adverse selection by liquidity takers and systematic arbitrageurs, optimized oracle-based mechanisms perform better than popular Constant Function Market Makers.
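For reference, the best-known Constant Function Market Maker, the constant product market maker, enforces $x \, y = k$ between its two reserves, and the associated impermanent loss relative to simply holding the assets, as a function of the price ratio $r = P_T / P_0$, has the classical closed form (standard results, not specific to this paper):
\[
\mathrm{IL}(r) \;=\; \frac{2\sqrt{r}}{1 + r} - 1 \;\le\; 0 .
\]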
Stochastic volatility models based on Gaussian processes, like fractional Brownian motion, are able to reproduce important stylized facts of financial markets such as rich autocorrelation structures, persistence and roughness of sample paths. This is made possible by virtue of the flexibility introduced in the choice of the covariance function of the Gaussian process. The price to pay is that, in general, such models are no longer Markovian nor semimartingales, which limits their practical use. We derive, in two different ways, an explicit analytic expression for the joint characteristic function of the log-price and its integrated variance in general Gaussian stochastic volatility models. Such an analytic expression can be approximated by closed-form matrix expressions. This opens the door to fast approximation of the joint density and pricing of derivatives on both the stock and its realized variance using Fourier inversion techniques. In the context of rough volatility modeling, our results apply to the (rough) fractional Stein-Stein model and provide the first analytic formulae for option pricing known to date, generalizing those of Stein-Stein, Schöbel-Zhu and a special case of Heston.
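In a Stein-Stein-type specification, the object computed above is, in our notation,
\[
\Phi_T(u, v) \;=\; \mathbb{E}\Big[\exp\Big(iu \log S_T + iv \int_0^T \sigma_t^2\, dt\Big)\Big], \qquad dS_t = S_t\, \sigma_t\, dB_t,
\]
with $\sigma$ a Gaussian (for instance an Ornstein-Uhlenbeck or fractional Ornstein-Uhlenbeck) process; Fourier inversion of $\Phi_T$ in $u$ and $v$ then yields prices of derivatives on the stock and on its realized variance.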