Rewarded Soups is an efficient multi-policy strategy for aligning large foundation models: it interpolates the weights of expert models fine-tuned independently on different reward objectives, tracing an approximation of the Pareto front. This allows a posteriori customization to diverse user preferences with significantly fewer training runs than traditional multi-objective reinforcement learning.
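A minimal sketch of the interpolation step, assuming two expert checkpoints fine-tuned from the same pre-trained initialization and all-floating-point parameters (function and variable names are illustrative):

```python
import torch

def rewarded_soup(state_dict_a, state_dict_b, lam):
    """Linearly interpolate two expert checkpoints that share an
    initialization (lam=0 -> expert A, lam=1 -> expert B).
    Assumes all entries are floating-point tensors."""
    assert state_dict_a.keys() == state_dict_b.keys()
    return {
        name: (1.0 - lam) * state_dict_a[name] + lam * state_dict_b[name]
        for name in state_dict_a
    }

# Sweeping lam from 0 to 1 yields a continuum of policies trading off
# the two rewards, with no additional training runs:
# model.load_state_dict(rewarded_soup(sd_reward_a, sd_reward_b, lam=0.3))
```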
VIPER integrates frozen Vision-Language Models for perception with fine-tuned Large Language Models for reasoning, using text as an intermediate representation to enable visual instruction-based planning. The framework, developed by researchers from Sorbonne Université and CNRS, achieves state-of-the-art performance on embodied AI benchmarks while providing inherent explainability of agent decisions.
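A schematic of the perception-to-text-to-action loop this describes; `vlm_describe`, `llm_plan`, and the `env` methods are hypothetical stand-ins for the frozen VLM, the fine-tuned LLM, and the environment, not VIPER's actual API:

```python
def vlm_describe(image) -> str:
    """Frozen VLM turns the raw observation into a textual scene description."""
    ...

def llm_plan(instruction: str, scene: str, history: list[str]) -> str:
    """Fine-tuned LLM reasons over text only and emits the next action."""
    ...

def step(env, instruction: str, history: list[str]) -> str:
    scene = vlm_describe(env.observation())         # perception -> text
    action = llm_plan(instruction, scene, history)  # reasoning over text
    # The text trace doubles as a human-readable explanation of the decision.
    history.append(f"scene: {scene} | action: {action}")
    env.execute(action)
    return action
```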
Researchers from Sorbonne Université and Criteo AI Lab introduce ECHO, a framework for efficient generative transformer operators that addresses the scalability and long-horizon error accumulation issues in neural operators for PDEs. ECHO achieves high spatio-temporal compression and accurate, multi-task solutions for million-point PDE trajectories, notably enabling super-resolution forecasting on a 1024x1024 vorticity grid without the out-of-memory errors that other models hit.
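The compression-plus-rollout behavior follows an encode-rollout-decode pattern: compress each state, advance the dynamics in a small latent space, and decode only the frames that are needed. A sketch of that pattern under those assumptions (module names are illustrative, not ECHO's code):

```python
import torch
import torch.nn as nn

class LatentRollout(nn.Module):
    """Advance PDE dynamics in a compressed latent space so per-step error
    never accumulates in full physical resolution (illustrative sketch)."""
    def __init__(self, encoder: nn.Module, dynamics: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder, self.dynamics, self.decoder = encoder, dynamics, decoder

    @torch.no_grad()
    def forecast(self, u0: torch.Tensor, horizon: int) -> list[torch.Tensor]:
        z = self.encoder(u0)                 # high spatio-temporal compression
        frames = []
        for _ in range(horizon):
            z = self.dynamics(z)             # transformer step on compact tokens
            frames.append(self.decoder(z))   # decode at the queried resolution
        return frames
```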
LMU Munich and Sorbonne Université researchers investigate how code language models internally represent programming languages and English, using three interpretability methods. They find that English and dominant programming languages like C# and C++ function as central pivot languages in the model's concept space, and that language-specific neurons concentrate in the bottom layers (0-4) for general constructs and the top layers (29-31) for syntax-specific mappings. They also show that highly aligned programming languages share substantial neural representations, making truly exclusive neurons difficult to identify and suggesting opportunities for more efficient modular architectures that leverage shared representations across similar languages.
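One common way to locate such language-specific neurons is to compare per-language mean activations and flag neurons dominated by a single language; the threshold heuristic below is illustrative, not the paper's exact criterion:

```python
import numpy as np

def language_specific_neurons(acts: dict[str, np.ndarray], ratio: float = 5.0):
    """Flag neurons whose mean activation on one language dominates all others.

    acts maps a language name to an array of shape (n_examples, n_neurons)
    of recorded activations. A neuron counts as language-specific when its
    top per-language mean exceeds the runner-up by `ratio` (heuristic)."""
    langs = list(acts.keys())
    means = np.stack([acts[lang].mean(axis=0) for lang in langs])  # (n_langs, n_neurons)
    top = means.max(axis=0)
    runner_up = np.sort(means, axis=0)[-2]
    specific = top > ratio * np.maximum(runner_up, 1e-8)
    return {langs[i]: np.where(specific & (means.argmax(axis=0) == i))[0]
            for i in range(len(langs))}
```

With highly aligned languages like C# and C++, the runner-up activation stays close to the top one, so few neurons pass the threshold, which matches the paper's observation that truly exclusive neurons are rare.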
CORAL is a novel machine learning framework for solving Partial Differential Equations (PDEs) by learning mappings between function spaces on general geometries. It employs modulated Implicit Neural Representations to handle irregular spatial samplings and achieves competitive or superior performance across initial value problems, dynamics modeling, and geometry-aware inference tasks, demonstrating robustness and efficient inference.
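A minimal sketch of a shift-modulated INR in the spirit of CORAL, where a per-sample latent code conditions a shared coordinate network so irregular point sets pose no problem (a sketch, not the reference implementation):

```python
import torch
import torch.nn as nn

class ModulatedINR(nn.Module):
    """SIREN-style coordinate network whose hidden layers are shift-modulated
    by a per-sample latent code, letting one shared network represent many
    functions sampled on arbitrary point clouds (illustrative sketch)."""
    def __init__(self, coord_dim=2, hidden=256, out_dim=1, depth=4, latent_dim=128):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(coord_dim, hidden)]
            + [nn.Linear(hidden, hidden) for _ in range(depth - 1)]
        )
        # One shift modulation per hidden layer, produced from the latent code.
        self.mods = nn.ModuleList(nn.Linear(latent_dim, hidden) for _ in range(depth))
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, coords: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # coords: (n_points, coord_dim), sampled anywhere on the geometry;
        # z: (latent_dim,) code fit to the observed values (e.g. by auto-decoding).
        h = coords
        for layer, mod in zip(self.layers, self.mods):
            h = torch.sin(layer(h) + mod(z))   # sine activation + shift modulation
        return self.head(h)
```

Because the network takes raw coordinates, it can be queried at points never seen during encoding, which is what enables inference on irregular samplings and new geometries.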
Researchers from LMU Munich and Sorbonne Université developed a systematic framework to improve in-context machine translation for low-resource languages, using Manchu as a case study. They found that providing high-quality dictionary entries and relevant parallel examples significantly enhances LLM performance, achieving a BLEU score of 12.35 with DeepSeek-V3, and successfully applied this to augment data for traditional NMT models.
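In essence, the recipe concatenates dictionary hits and retrieved parallel sentences ahead of the source sentence; the template below is illustrative, not the paper's exact prompt:

```python
def build_mt_prompt(source: str,
                    dictionary_hits: list[tuple[str, str]],
                    parallel_examples: list[tuple[str, str]]) -> str:
    """Assemble an in-context translation prompt from dictionary entries
    and retrieved parallel sentences (illustrative template)."""
    lines = ["Translate the following Manchu sentence into English.", ""]
    lines.append("Relevant dictionary entries:")
    lines += [f"- {word}: {gloss}" for word, gloss in dictionary_hits]
    lines.append("")
    lines.append("Example translations:")
    lines += [f"Manchu: {src}\nEnglish: {tgt}" for src, tgt in parallel_examples]
    lines.append("")
    lines.append(f"Manchu: {source}\nEnglish:")
    return "\n".join(lines)
```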