System 2 AI
A method called Prompt Distillation (PD) enables large language models to internalize new factual knowledge from free-form documents by using self-distillation, achieving closed-book performance comparable to Retrieval-Augmented Generation (RAG). This approach consistently outperforms traditional supervised fine-tuning while exhibiting greater data and parameter efficiency.
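The core idea can be pictured as matching, closed-book, the token distribution the same model produces open-book. Below is a minimal sketch of one such self-distillation step, assuming a Hugging Face causal LM; the model name, prompt format, and QA placeholders are illustrative stand-ins, not details taken from the paper.

```python
# Minimal sketch of a prompt/self-distillation step (assumed form, not the paper's exact recipe).
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
student = AutoModelForCausalLM.from_pretrained(model_name)
teacher = AutoModelForCausalLM.from_pretrained(model_name).eval()  # frozen copy that still sees the document

document = "..."   # free-form document whose facts should be internalized
question = "..."   # a self-generated question about the document
answer = "..."     # the answer produced with the document in context

def encode(prefix, target):
    """Tokenize prefix + target and return the ids plus the index where the target starts."""
    prefix_len = tok(prefix, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prefix + target, return_tensors="pt").input_ids
    return full_ids, prefix_len

# Teacher answers open-book (document in context); the student must match it closed-book.
t_ids, t_start = encode(f"{document}\n\nQ: {question}\nA: ", answer)
s_ids, s_start = encode(f"Q: {question}\nA: ", answer)

with torch.no_grad():
    t_logits = teacher(t_ids).logits[0, t_start - 1 : -1]   # predictions over the answer tokens
s_logits = student(s_ids).logits[0, s_start - 1 : -1]

n = min(t_logits.shape[0], s_logits.shape[0])                # guard against boundary retokenization
# Forward KL with the open-book teacher as target: the usual distillation objective.
loss = F.kl_div(F.log_softmax(s_logits[-n:], dim=-1),
                F.log_softmax(t_logits[-n:], dim=-1),
                log_target=True, reduction="batchmean")
loss.backward()
```

Matching the teacher's full output distribution, rather than training on the generated answer text alone, is what distinguishes a distillation objective of this kind from ordinary supervised fine-tuning.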
The 'Memento No More' (MNM) method trains AI agents to master multiple complex tasks by internalizing human-provided feedback, or hints, directly into the model's weights. This avoids the performance degradation caused by ever-longer prompts, achieving success rates of 97.9% on ToolQA and 90.3% on OfficeBench, while also making inference 3-4 times faster and reducing token usage by 90-93%.
Deep learning is providing a wealth of new approaches to the problem of novel view synthesis, from Neural Radiance Field (NeRF)-based approaches to end-to-end style architectures. Each approach offers specific strengths but also comes with limitations in its applicability. This work introduces ViewFusion, an end-to-end generative approach to novel view synthesis with unparalleled flexibility. ViewFusion simultaneously applies a diffusion denoising step to any number of input views of a scene, then combines the noise gradients obtained for each view with an (inferred) pixel-weighting mask, ensuring that for each region of the target view only the most informative input views are taken into account. Our approach resolves several limitations of previous approaches by (1) being trainable and generalizing across multiple scenes and object classes, (2) adaptively taking in a variable number of pose-free views at both train and test time, and (3) generating plausible views even in severely underdetermined conditions (thanks to its generative nature), all while producing views of quality on par with or better than comparable methods. Limitations include the lack of a 3D embedding of the scene, which results in relatively slow inference, and evaluation only on the relatively small Neural 3D Mesh Renderer dataset. Code is available at this https URL.
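The fusion step at the heart of this approach can be pictured as a per-pixel softmax over per-view noise predictions. Below is a minimal sketch assuming a toy conditional denoiser; the `Denoiser` module, its weight head, and all tensor shapes are illustrative assumptions, not the paper's architecture.

```python
# Sketch of per-pixel weighted fusion of per-view noise predictions (assumed shapes and modules).
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    """Toy stand-in for a conditional noise-prediction network."""
    def __init__(self, channels=3):
        super().__init__()
        self.eps_head = nn.Conv2d(2 * channels, channels, 3, padding=1)  # predicts noise
        self.w_head = nn.Conv2d(2 * channels, 1, 3, padding=1)           # predicts per-pixel weight logit

    def forward(self, noisy_target, input_view):
        h = torch.cat([noisy_target, input_view], dim=1)
        return self.eps_head(h), self.w_head(h)

def fused_noise(denoiser, noisy_target, input_views):
    """
    noisy_target: (B, C, H, W) current noisy estimate of the target view
    input_views:  (B, N, C, H, W) any number N of pose-free input views
    Returns the fused noise prediction of shape (B, C, H, W).
    """
    eps_list, logit_list = [], []
    for i in range(input_views.shape[1]):
        eps_i, logit_i = denoiser(noisy_target, input_views[:, i])
        eps_list.append(eps_i)
        logit_list.append(logit_i)
    eps = torch.stack(eps_list, dim=1)        # (B, N, C, H, W)
    logits = torch.stack(logit_list, dim=1)   # (B, N, 1, H, W)
    weights = torch.softmax(logits, dim=1)    # per-pixel weights over the N views
    return (weights * eps).sum(dim=1)         # each pixel dominated by its most informative views

# Example: fuse 4 input views for a batch of 2 target views.
denoiser = Denoiser()
noisy = torch.randn(2, 3, 64, 64)
views = torch.randn(2, 4, 3, 64, 64)
print(fused_noise(denoiser, noisy, views).shape)  # torch.Size([2, 3, 64, 64])
```

Because the softmax is taken over however many views are supplied, the same model can accept a variable number of pose-free input views at both train and test time.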
As the general capabilities of artificial intelligence (AI) agents continue to evolve, their ability to learn to master multiple complex tasks through experience remains a key challenge. Current LLM agents, particularly those based on proprietary language models, typically rely on prompts to incorporate knowledge about the target tasks. This approach does not allow the agent to internalize this information and instead relies on ever-expanding prompts to sustain its functionality in diverse scenarios. This resembles a system of notes used by a person affected by anterograde amnesia, the inability to form new memories. In this paper, we propose a novel method to train AI agents to incorporate knowledge and skills for multiple tasks without the need for either cumbersome note systems or prior high-quality demonstration data. Our approach employs an iterative process where the agent collects new experiences, receives corrective feedback from humans in the form of hints, and integrates this feedback into its weights via a context distillation training procedure. We demonstrate the efficacy of our approach by implementing it in a Llama-3-based agent that, after only a few rounds of feedback, outperforms advanced models GPT-4o and DeepSeek-V3 in tasksets requiring correct sequencing of information retrieval, tool use, and question answering.
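The loop described above (collect experience, receive hints, distill them into the weights) could be sketched as follows. The rollout and feedback helpers are placeholder stubs, the task and hint strings are invented, and the KL objective is an assumed form of context distillation, so this illustrates the control flow rather than the paper's exact procedure.

```python
# Sketch of an experience -> hint -> context-distillation loop (assumed details throughout).
import copy
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper builds on a Llama-3-based agent
tok = AutoTokenizer.from_pretrained(model_name)
student = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

def run_agent(model, task):
    """Placeholder for rolling out the agent on a task (retrieval, tool use, answering)."""
    ids = tok(task, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=32, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

def get_human_hint(task, trajectory):
    """Placeholder for human corrective feedback on the trajectory."""
    return "Hint: call the calculator tool before answering."  # illustrative

for round_idx in range(3):                     # a few rounds of feedback
    teacher = copy.deepcopy(student).eval()    # frozen snapshot that is shown the hints
    for task in ["Task: how many files are in the shared folder?"]:  # illustrative task
        trajectory = run_agent(student, task)          # 1. collect new experience
        hint = get_human_hint(task, trajectory)        # 2. receive corrective feedback
        # 3. context distillation: the teacher sees the hint in context,
        #    the student is trained to match that behaviour without it.
        t_ids = tok(f"{hint}\n{task}\n{trajectory}", return_tensors="pt").input_ids
        s_ids = tok(f"{task}\n{trajectory}", return_tensors="pt").input_ids
        n = min(t_ids.shape[1], s_ids.shape[1])        # align on the shared suffix
        with torch.no_grad():
            t_logits = teacher(t_ids).logits[0, -n:]
        s_logits = student(s_ids).logits[0, -n:]
        loss = F.kl_div(F.log_softmax(s_logits, -1), F.log_softmax(t_logits, -1),
                        log_target=True, reduction="batchmean")
        optimizer.zero_grad(); loss.backward(); optimizer.step()
```

Once the hints are absorbed into the weights they no longer need to appear in the prompt, which is where the reported savings in tokens and inference time come from.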