fine-tuning
This research disentangles the causal effects of pre-training, mid-training, and reinforcement learning (RL) on language model reasoning using a controlled synthetic task framework. It establishes that RL extends reasoning capabilities only under specific conditions of pre-training exposure and data calibration, with mid-training playing a crucial role in bridging training stages and improving generalization.
Researchers from University of Wisconsin-Madison, UCLA, and Adobe Research introduce a computational framework for "relational visual similarity," which identifies image commonalities based on abstract logic rather than surface features. Their `relsim` model, trained on a novel dataset of images paired with anonymous group-derived captions, aligns significantly with human perception of relational similarity and outperforms existing attribute-based metrics in retrieval tasks.
AI systems can be trained to conceal their true capabilities during evaluations, a phenomenon termed "sandbagging," which poses a risk to safety assessments. Research by UK AISI, FAR.AI, and Anthropic demonstrated that black-box detection methods failed against such strategically underperforming models, although one-shot fine-tuning proved effective for eliciting hidden capabilities.
Researchers at OpenAI developed a method to train large language models (LLMs) to self-report their non-compliance or shortcomings through a structured "confession" output. This approach uses a disentangled reward system to incentivize honesty, demonstrating that models confess to undesired behaviors in 74.3% of cases and are more likely to be truthful in confessions than in their primary answers, with minimal impact on main task performance.
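As a rough illustration of the disentangled reward described above, the sketch below scores the main answer and the structured confession through independent channels, so admitting a shortcoming is never penalized through the task reward. The function names, the Jaccard-style honesty score, and the weights are assumptions for illustration, not OpenAI's actual reward design.

```python
# Hypothetical sketch of a disentangled reward for "confession" training.
# Assumes each rollout yields a main answer plus a structured confession;
# the two reward channels are computed independently so that admitting a
# failure is never penalized via the task reward. Names are illustrative.

def task_reward(answer: str, reference: str) -> float:
    """Score the main answer only (e.g., exact match); ignores the confession."""
    return 1.0 if answer.strip() == reference.strip() else 0.0

def confession_reward(confessed_issues: list[str], actual_issues: list[str]) -> float:
    """Score how truthfully the confession reports the model's own shortcomings."""
    confessed, actual = set(confessed_issues), set(actual_issues)
    if not actual and not confessed:
        return 1.0                      # nothing to confess, nothing claimed
    if not actual or not confessed:
        return 0.0                      # missed or fabricated issues
    return len(confessed & actual) / len(confessed | actual)  # Jaccard overlap

def total_reward(answer, reference, confessed, actual, w_task=1.0, w_conf=1.0):
    # Disentangled: the confession term cannot be improved by hiding failures,
    # and confessing does not reduce the task term.
    return w_task * task_reward(answer, reference) + w_conf * confession_reward(confessed, actual)
```

Because the two terms are computed independently, pressure toward truthful confessions does not trade off against answer quality, which is consistent with the minimal impact on main-task performance reported above.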
TreeGRPO introduces a reinforcement learning framework that reinterprets diffusion model denoising as a sparse search tree, enabling both sample efficiency and precise credit assignment for post-training. This method achieves 2.4 times faster training convergence and enhances alignment quality with human preferences compared to prior approaches.
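The tree-structured credit assignment can be pictured with a toy example: rollouts that branch from a shared denoising prefix are compared against their siblings, so each branch point receives a group-relative advantage. The tree layout, reward values, and mean-of-leaves node values below are illustrative assumptions rather than TreeGRPO's exact formulation.

```python
import numpy as np

def leaf_rewards_to_node_values(children, rewards):
    """children: dict node -> list of child nodes; rewards: dict leaf -> float.
    Each node's value is the mean reward of the leaves beneath it."""
    values = {}
    def value(node):
        if node not in children or not children[node]:
            values[node] = rewards[node]
        else:
            values[node] = float(np.mean([value(c) for c in children[node]]))
        return values[node]
    value("root")
    return values

def sibling_advantages(children, values):
    """Group-relative advantage: each child's value minus the mean of its siblings."""
    adv = {}
    for parent, kids in children.items():
        group_mean = float(np.mean([values[k] for k in kids]))
        for k in kids:
            adv[k] = values[k] - group_mean
    return adv

# Toy tree: the root denoising prefix branches twice, then each branch splits again.
children = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
rewards = {"a1": 0.9, "a2": 0.7, "b1": 0.2, "b2": 0.4}
values = leaf_rewards_to_node_values(children, rewards)
print(sibling_advantages(children, values))
```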
Researchers from Peking University and Huawei Technologies developed a principled framework for adapting pre-trained autoregressive (AR) models into Block-Diffusion Language Models (DLMs). The adapted 7B-class model, NBDIFF-7B, achieved state-of-the-art performance among diffusion LLMs, with a macro average of 64.3 for its base version and 78.8 for its instruct version across diverse benchmarks.
This research introduces PersonaMem-v2, a dataset designed for implicit user persona learning, and an agentic memory framework, enabling smaller LLMs to achieve state-of-the-art personalization performance. The agentic memory system processes long conversational histories into a compact 2k-token memory, resulting in a 16x efficiency improvement while outperforming frontier models like GPT-5 variants.
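A minimal sketch of such an agentic memory loop is given below, assuming the summary's setup of incrementally distilling long conversational histories into a roughly 2k-token memory. The `llm` callable, prompt wording, and token heuristic are hypothetical placeholders, not the paper's framework.

```python
# Rough sketch of an agentic memory update. `llm` is a hypothetical
# text-completion callable; the 4-chars-per-token heuristic and prompt
# wording are illustrative only.

MEMORY_BUDGET_TOKENS = 2000

def approx_tokens(text: str) -> int:
    return len(text) // 4  # crude heuristic, not a real tokenizer

def update_memory(memory: str, new_turns: list[str], llm) -> str:
    """Fold new conversation turns into the persistent memory, keeping it compact."""
    prompt = (
        "Current user memory:\n" + memory +
        "\n\nNew conversation turns:\n" + "\n".join(new_turns) +
        f"\n\nRewrite the memory to capture durable facts and implicit preferences "
        f"in at most {MEMORY_BUDGET_TOKENS} tokens."
    )
    updated = llm(prompt)
    # If the model overshoots the budget, ask for one more round of compression.
    if approx_tokens(updated) > MEMORY_BUDGET_TOKENS:
        updated = llm("Compress this memory without losing key preferences:\n" + updated)
    return updated

# usage (hypothetical): memory = update_memory("", recent_turns, llm=my_model_call)
```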
RETAIN, developed at UC Berkeley, introduces a parameter merging strategy for generalist robot policies, interpolating pre-trained and finetuned weights to enable robust adaptation to new tasks. This approach enhances out-of-distribution generalization by approximately 40% on real-world robotic tasks while preserving the policy's existing broad capabilities in low-data scenarios.
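The core merging step can be sketched in a few lines of PyTorch: interpolate each finetuned parameter toward its pre-trained counterpart. The single global coefficient `alpha` is an assumption for illustration; RETAIN's actual interpolation rule may be more selective.

```python
import torch

@torch.no_grad()
def interpolate_policies(pretrained: torch.nn.Module,
                         finetuned: torch.nn.Module,
                         alpha: float = 0.5) -> torch.nn.Module:
    """Overwrite the finetuned model's weights with
    alpha * finetuned + (1 - alpha) * pretrained (models share an architecture)."""
    pre_params = dict(pretrained.named_parameters())
    for name, param in finetuned.named_parameters():
        param.mul_(alpha).add_(pre_params[name], alpha=1.0 - alpha)
    return finetuned
```

Sweeping `alpha` trades plasticity on the new task against retention of the pre-trained policy's broad capabilities, which is the balance the summary describes for low-data adaptation.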
Researchers at Anthropic introduced Selective GradienT Masking (SGTM), a pre-training method designed to localize and remove specific capabilities from large language models to address dual-use risks. This technique achieves an improved trade-off between retaining general knowledge and forgetting targeted information, resists adversarial fine-tuning up to 7 times better than prior unlearning methods, and demonstrates reduced information leakage in larger models.
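One way to picture selective gradient masking is the training-step sketch below: gradients from a designated forget domain update only a small removable subnetwork, while all other data updates only the retained parameters. The domain flag and parameter-selection rule are assumptions, not Anthropic's exact procedure.

```python
import torch

def masked_training_step(model, batch, loss_fn, optimizer,
                         masked_param_names, is_forget_domain: bool):
    """One pre-training step with gradients selectively zeroed by parameter name."""
    optimizer.zero_grad()
    loss = loss_fn(model, batch)
    loss.backward()
    if is_forget_domain:
        # Route dual-use knowledge into the removable subnetwork only.
        for name, p in model.named_parameters():
            if name not in masked_param_names and p.grad is not None:
                p.grad.zero_()
    else:
        # Keep general knowledge out of the removable subnetwork.
        for name, p in model.named_parameters():
            if name in masked_param_names and p.grad is not None:
                p.grad.zero_()
    optimizer.step()
    return loss.item()
```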
rSIM introduces a multi-agent reinforcement learning framework that enables smaller large language models to acquire advanced reasoning skills by coupling them with a dedicated, learnable planner agent. This method allows models as small as 0.5B parameters to achieve reasoning performance comparable to much larger models across diverse tasks.
LLM agents are widely deployed in complex interactive tasks, yet privacy constraints often preclude centralized optimization and co-evolution across dynamic environments. While Federated Learning (FL) has proven effective on static datasets, its extension to the open-ended self-evolution of agents remains underexplored. Directly applying standard FL is challenging: heterogeneous tasks and sparse, trajectory-level rewards introduce severe gradient conflicts, destabilizing the global optimization process. To bridge this gap, we propose Fed-SE, a Federated Self-Evolution framework for LLM agents. Fed-SE establishes a local-evolution, global-aggregation paradigm. Locally, agents employ parameter-efficient fine-tuning on filtered, high-return trajectories to achieve stable gradient updates. Globally, Fed-SE aggregates updates within a low-rank subspace that disentangles environment-specific dynamics, effectively reducing negative transfer across clients. Experiments across five heterogeneous environments demonstrate that Fed-SE improves average task success rates by approximately 18% over federated baselines, validating its effectiveness for robust cross-environment knowledge transfer under privacy-constrained deployment.
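The local-filtering and global-aggregation steps can be sketched as follows, assuming LoRA-style client updates and plain weighted averaging; Fed-SE's low-rank subspace aggregation is more involved, so this is only a schematic.

```python
import numpy as np

def filter_trajectories(trajectories, return_quantile=0.8):
    """Keep only trajectories whose return falls in the top fraction of the local batch.
    Each trajectory is a dict with at least a 'return' key; the quantile is an assumption."""
    returns = np.array([t["return"] for t in trajectories])
    threshold = np.quantile(returns, return_quantile)
    return [t for t in trajectories if t["return"] >= threshold]

def aggregate_lora_deltas(client_deltas, client_weights=None):
    """Average per-client low-rank factors {name: (A, B)} into a global update."""
    if client_weights is None:
        client_weights = [1.0 / len(client_deltas)] * len(client_deltas)
    global_delta = {}
    for name in client_deltas[0]:
        A = sum(w * d[name][0] for w, d in zip(client_weights, client_deltas))
        B = sum(w * d[name][1] for w, d in zip(client_weights, client_deltas))
        global_delta[name] = (A, B)
    return global_delta
```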
Researchers from Zhejiang University and ByteDance introduced CodeVision, a "code-as-tool" framework that equips Multimodal Large Language Models (MLLMs) to programmatically interact with images. The approach significantly improves MLLM robustness by correcting common image corruptions and enables state-of-the-art multi-tool reasoning through emergent tool use and error recovery.
Researchers from Zhejiang University and ByteDance introduced OpenVE-3M, a large-scale, high-quality dataset of 3 million instruction-guided video editing pairs, and OpenVE-Bench, a unified evaluation benchmark. They also developed OpenVE-Edit, a 5B parameter model trained on OpenVE-3M, which achieved state-of-the-art performance with an overall score of 2.49 on OpenVE-Bench, outperforming larger existing models.
Generating lifelike conversational avatars requires modeling not just isolated speakers, but the dynamic, reciprocal interplay of speaking and listening. Modeling the listener, however, is exceptionally challenging: direct audio-driven training fails, producing stiff, static listening motions. This failure stems from a fundamental imbalance: the speaker's motion is strongly driven by speech audio, while the listener's motion primarily follows an internal motion prior and is only loosely guided by external speech. This challenge has led most methods to focus on speak-only generation. The only prior attempt at joint generation relies on the speaker's motion as an additional input to produce the listener, a design that is not end-to-end and therefore hinders real-time applicability. To address this limitation, we present UniLS, the first end-to-end framework for generating unified speak-listen expressions driven only by dual-track audio. Our method introduces a novel two-stage training paradigm. Stage 1 learns the internal motion prior by training an audio-free autoregressive generator, capturing the spontaneous dynamics of natural facial motion. Stage 2 then introduces the dual-track audio, fine-tuning the generator to modulate the learned motion prior based on external speech cues. Extensive evaluations show that UniLS achieves state-of-the-art speaking accuracy. More importantly, it delivers up to a 44.1% improvement in listening metrics, generating significantly more diverse and natural listening expressions. This effectively mitigates the stiffness problem and provides a practical, high-fidelity audio-driven solution for interactive digital humans.
Role-playing agents (RPAs) must simultaneously master many conflicting skills -- following multi-turn instructions, exhibiting domain knowledge, and adopting a consistent linguistic style. Existing work either relies on supervised fine-tuning (SFT), which over-fits surface cues and yields low diversity, or applies reinforcement learning (RL) that fails to optimize across the multiple dimensions needed for comprehensive RPA training. We present MOA (Multi-Objective Alignment), a reinforcement-learning framework that enables multi-dimensional, fine-grained rubric optimization for general RPAs. MOA introduces a novel multi-objective optimization strategy that trains simultaneously on multiple fine-grained rubrics. In addition, to improve output diversity and quality, we employ thought-augmented rollouts with off-policy guidance. Extensive experiments on challenging benchmarks such as PersonaGym and RoleMRC show that MOA enables an 8B model to match or even outperform strong baselines such as GPT-4o and Claude across numerous dimensions, demonstrating MOA's potential for building RPAs that simultaneously meet the demands of role knowledge, persona style, diverse scenarios, and complex multi-turn conversations.
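A toy version of multi-rubric reward shaping for this kind of RL training is sketched below: per-rubric scores are combined into a scalar reward and standardized within a rollout group, GRPO-style. The rubrics, weights, and scores are placeholders rather than MOA's actual objective.

```python
import numpy as np

# Placeholder rubric weights; MOA's real rubrics and weighting are not shown here.
RUBRIC_WEIGHTS = {"instruction": 0.4, "knowledge": 0.3, "style": 0.3}

def aggregate_rubrics(rubric_scores: dict) -> float:
    """Combine per-rubric scores in [0, 1] into a single scalar reward."""
    return sum(RUBRIC_WEIGHTS[k] * rubric_scores[k] for k in RUBRIC_WEIGHTS)

def group_relative_advantages(rewards: list) -> np.ndarray:
    """GRPO-style advantages: standardize rewards within one rollout group."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Example: four rollouts for one role-play prompt, each scored on three rubrics.
rollouts = [
    {"instruction": 0.9, "knowledge": 0.6, "style": 0.8},
    {"instruction": 0.5, "knowledge": 0.7, "style": 0.4},
    {"instruction": 0.8, "knowledge": 0.9, "style": 0.9},
    {"instruction": 0.3, "knowledge": 0.4, "style": 0.6},
]
advantages = group_relative_advantages([aggregate_rubrics(s) for s in rollouts])
print(advantages)
```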
Researchers from South China University of Technology introduced Evolutionary Reasoning Optimization (ERO), a neuroevolution framework that applies evolution strategies directly to a Large Language Model's parameters, enabling a Qwen-7B model to develop robust System 2 reasoning capabilities. The method empirically demonstrates that reasoning can emerge through evolution: the evolved model outperforms GPT-5 on the Abstraction and Reasoning Corpus (ARC) benchmark, challenging the notion that such abilities improve only with model scaling.
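The flavor of an evolution-strategies update, applied here to a toy parameter vector rather than full LLM weights, is sketched below; the fitness function, population size, and OpenAI-ES-style estimator are illustrative assumptions, not ERO's exact algorithm.

```python
import numpy as np

def evolution_step(theta, fitness_fn, rng, pop_size=32, sigma=0.02, lr=0.005):
    """One ES update: perturb parameters, score each perturbation, and move along
    the reward-weighted average of the perturbation directions (no backprop)."""
    noise = rng.standard_normal((pop_size, theta.size))
    rewards = np.array([fitness_fn(theta + sigma * n) for n in noise])
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    gradient_estimate = noise.T @ advantages / (pop_size * sigma)
    return theta + lr * gradient_estimate

# Toy fitness: maximize negative distance to a hidden target vector.
target = np.ones(10)
fitness = lambda t: -np.linalg.norm(t - target)
theta = np.zeros(10)
rng = np.random.default_rng(0)
for _ in range(200):
    theta = evolution_step(theta, fitness, rng)
print(np.round(theta, 2))  # drifts toward the target using only fitness evaluations
```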
Researchers from the University of California, Santa Barbara, University of Chicago, University of California, Berkeley, and University of Illinois Urbana-Champaign developed VulnLLM-R, a 7-billion parameter specialized reasoning large language model for vulnerability detection. This open-source model surpasses the performance of larger general-purpose LLMs and traditional tools, achieving project-level analysis and discovering 15 zero-day vulnerabilities in real-world software.
MobileFineTuner introduces a C++ native, end-to-end framework enabling Large Language Model fine-tuning directly on commodity mobile phones. It achieves model quality comparable to server-side training while integrating system-level optimizations that address the memory and energy constraints of mobile devices.
Reward-model-based fine-tuning is a central paradigm in aligning Large Language Models with human preferences. However, such approaches critically rely on the assumption that proxy reward models accurately reflect intended supervision, a condition often violated due to annotation noise, bias, or limited coverage. This misalignment can lead to undesirable behaviors, where models optimize for flawed signals rather than true human values. In this paper, we investigate a novel framework to identify and mitigate such misalignment by treating the fine-tuning process as a form of knowledge integration. We focus on detecting instances of proxy-policy conflicts, cases where the base model strongly disagrees with the proxy. We argue that such conflicts often signify areas of shared ignorance, where neither the policy nor the reward model possesses sufficient knowledge, making them especially susceptible to misalignment. To this end, we propose two complementary metrics for identifying these conflicts: a localized Proxy-Policy Alignment Conflict Score (PACS) and a global Kendall-Tau Distance measure. Building on this insight, we design an algorithm named Selective Human-in-the-loop Feedback via Conflict-Aware Sampling (SHF-CAS) that targets high-conflict QA pairs for additional feedback, refining both the reward model and policy efficiently. Experiments on two alignment tasks demonstrate that our approach enhances general alignment performance, even when trained with a biased proxy reward. Our work provides a new lens for interpreting alignment failures and offers a principled pathway for targeted refinement in LLM training.
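As a hedged sketch of the global conflict measure, the snippet below compares how the base policy ranks a prompt's candidate answers (e.g., by sequence log-probability) with how the proxy reward model ranks them, and flags examples whose Kendall-Tau distance exceeds a threshold for additional human feedback. The normalization and threshold are assumptions, not the paper's exact definitions.

```python
from scipy.stats import kendalltau

def kendall_tau_distance(policy_scores, proxy_rewards):
    """Return a distance in [0, 1]: 0 = identical rankings, 1 = fully reversed."""
    tau, _ = kendalltau(policy_scores, proxy_rewards)
    return (1.0 - tau) / 2.0

def flag_conflicts(batch, threshold=0.6):
    """batch: list of dicts holding per-candidate 'policy_scores' and 'proxy_rewards'."""
    return [ex for ex in batch
            if kendall_tau_distance(ex["policy_scores"], ex["proxy_rewards"]) >= threshold]

# Example: the policy prefers answer 0 while the proxy reward model prefers answer 2.
example = {"policy_scores": [-1.2, -3.5, -4.0], "proxy_rewards": [0.1, 0.4, 0.9]}
print(kendall_tau_distance(example["policy_scores"], example["proxy_rewards"]))  # 1.0
```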
Synthesizing realistic human-object interactions (HOI) in video is challenging due to the complex, instance-specific interaction dynamics of both humans and objects. Incorporating controllability in video generation further adds to the complexity. Existing controllable video generation approaches face a trade-off: sparse controls like keypoint trajectories are easy to specify but lack instance-awareness, while dense signals such as optical flow, depth maps, or 3D meshes are informative but costly to obtain. We propose VHOI, a two-stage framework that first densifies sparse trajectories into HOI mask sequences, and then fine-tunes a video diffusion model conditioned on these dense masks. We introduce a novel HOI-aware motion representation that uses color encodings to distinguish not only human and object motion, but also body-part-specific dynamics. This design incorporates a human prior into the conditioning signal and strengthens the model's ability to understand and generate realistic HOI dynamics. Experiments demonstrate state-of-the-art results in controllable HOI video generation. VHOI is not limited to interaction-only scenarios and can also generate full human navigation leading up to object interactions in an end-to-end manner. Project page: this https URL.
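The color-encoded HOI motion representation can be illustrated with a small rendering helper that assigns each body part and the object a distinct color in the conditioning mask frame; the palette and part names below are made up for illustration and do not reflect VHOI's actual encoding.

```python
import numpy as np

# Hypothetical palette: one color per body part plus one for the object.
PALETTE = {
    "torso": (255, 0, 0), "left_arm": (0, 255, 0), "right_arm": (0, 0, 255),
    "legs": (255, 255, 0), "object": (255, 0, 255),
}

def render_hoi_mask(part_masks: dict, height: int, width: int) -> np.ndarray:
    """part_masks: {part_name: boolean HxW array}. Returns an HxWx3 uint8 frame
    where each instance and body part is painted with its own color."""
    frame = np.zeros((height, width, 3), dtype=np.uint8)
    for part, color in PALETTE.items():
        if part in part_masks:
            frame[part_masks[part]] = color
    return frame
```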