This paper establishes a foundational theoretical link between Large Language Models (LLMs) and Algorithmic Information Theory (AIT), demonstrating that LLM training and inference can be viewed as computable approximations of the Solomonoff prior and Solomonoff induction. Leveraging this view, the work proposes a confidence-based few-shot example selection strategy that improves classification accuracy on the tested datasets, for instance raising Qwen2.5 3B's accuracy from 76.62% to 90.07% on the SMS dataset.
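To make the selection idea concrete, here is a minimal sketch (not the paper's exact criterion) that scores each candidate demonstration by the model's average token log-probability and keeps the most confident ones; `model.token_logprobs` is a hypothetical scoring interface:

```python
def score_example(model, prompt: str, completion: str) -> float:
    """Average token log-probability the model assigns to `completion`
    given `prompt` -- a simple proxy for model confidence."""
    logprobs = model.token_logprobs(prompt, completion)  # hypothetical API
    return sum(logprobs) / len(logprobs)

def select_few_shot(model, candidates, k: int = 4):
    """Keep the k (text, label) pairs the model is most confident about,
    to serve as in-context examples."""
    scored = sorted(candidates,
                    key=lambda ex: score_example(model, ex[0], ex[1]),
                    reverse=True)
    return scored[:k]
```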
Chain-of-Action (CoA) proposes a visuo-motor policy that generates robot trajectories autoregressively in reverse, starting from a task goal and reasoning backward to the current state. This approach addresses compounding errors and enhances spatial generalization, achieving an average success rate of 0.552 on 60 RLBench tasks and demonstrating improved performance on real-world Fetch robot manipulation.
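The reverse-decoding loop can be sketched as follows, assuming a hypothetical `policy` object with `predict_goal_action` and `decode_step` methods; CoA's actual architecture and stopping rule follow the paper:

```python
import torch

def generate_reverse_trajectory(policy, obs, current_state, max_steps=50, tol=1e-2):
    """Autoregressively emit waypoints starting from the predicted goal
    action and reasoning backward until the chain reaches the robot's
    current state (illustrative stopping rule)."""
    waypoints = [policy.predict_goal_action(obs)]              # goal keyframe first
    for _ in range(max_steps):
        nxt = policy.decode_step(obs, torch.stack(waypoints))  # next step, moving backward
        waypoints.append(nxt)
        if torch.linalg.norm(nxt - current_state) < tol:       # chained back to the present
            break
    return waypoints[::-1]  # flip so execution runs current state -> goal
```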
Researchers developed game-theory-inspired workflows to systematically enhance the strategic decision-making capabilities of large language models (LLMs) in various negotiation and strategic games. Integrating classical game theory principles, these workflows enabled LLM agents to achieve near-optimal allocations in incomplete-information negotiations, with up to 100% agreement and envy-freeness, and significantly improved adherence to Nash Equilibria in complete-information games compared to baseline LLM performance.
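Envy-freeness, one of the reported outcome criteria, is straightforward to verify: every agent must value its own bundle at least as highly as any other agent's bundle. The sketch below checks this for an item allocation (the negotiation workflows themselves are not shown):

```python
def is_envy_free(valuations, allocation):
    """valuations[i][item]: agent i's value for item;
    allocation[i]: set of items assigned to agent i."""
    def bundle_value(agent, bundle):
        return sum(valuations[agent][item] for item in bundle)
    n = len(allocation)
    return all(
        bundle_value(i, allocation[i]) >= bundle_value(i, allocation[j])
        for i in range(n) for j in range(n) if i != j
    )

# e.g. two agents splitting three items
vals = [{"a": 5, "b": 3, "c": 1}, {"a": 2, "b": 4, "c": 4}]
print(is_envy_free(vals, [{"a"}, {"b", "c"}]))  # True: neither agent envies the other
```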
Researchers at CAS Key Laboratory of AI Safety developed a theoretical framework to quantify the benefit and detriment of retrieved information in Retrieval-Augmented Generation (RAG) at the token level. This framework formalizes benefit as distribution completion and detriment as distribution contradiction, enabling a practical method (Tok-RAG) that improves RAG robustness and performance across diverse tasks with minimal computational overhead.
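The paper's exact formalization is not reproduced here, but in the same spirit one can contrast the model's next-token distributions with and without the retrieved passage, treating strong disagreement as potential contradiction:

```python
import torch
import torch.nn.functional as F

def token_conflict(logits_with, logits_without):
    """Per-token comparison in the spirit of Tok-RAG: agreement between
    the retrieval-augmented and parametric-only next-token distributions
    suggests benefit (completion); strong disagreement suggests detriment
    (contradiction). logits_*: [seq_len, vocab] at the same positions."""
    p = F.softmax(logits_with, dim=-1)     # with retrieved passage
    q = F.softmax(logits_without, dim=-1)  # parametric knowledge only
    logp, logq = p.clamp_min(1e-9).log(), q.clamp_min(1e-9).log()
    # symmetric KL per position: high value = the two sources contradict
    conflict = 0.5 * ((p * (logp - logq)).sum(-1) + (q * (logq - logp)).sum(-1))
    return conflict
```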
Researchers from the Chinese Academy of Sciences developed Gradient-Adaptive Policy Optimization (GAPO), a fine-tuning method for Large Language Models that employs gradient rescaling to balance multiple, potentially conflicting objectives such as helpfulness and harmlessness. GAPO (p=1) consistently outperformed existing multi-objective alignment baselines in both model-based and GPT-4o evaluations, achieving superior average helpfulness and harmlessness scores, while P-GAPO produced a better Pareto front tailored to user-preferred trade-offs.
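A minimal sketch of the general recipe, assuming leaf parameters with `requires_grad=True` and one loss per objective (GAPO's exact rescaling rule and the role of p follow the paper):

```python
import torch

def rescaled_multi_objective_step(params, losses, weights=None, lr=1e-3):
    """Compute each objective's gradient, normalize it so no single
    objective dominates, then combine with preference weights and take
    a plain SGD step (for illustration only)."""
    weights = weights or [1.0 / len(losses)] * len(losses)
    combined = [torch.zeros_like(p) for p in params]
    for w, loss in zip(weights, losses):
        grads = torch.autograd.grad(loss, params, retain_graph=True)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)) + 1e-12
        for c, g in zip(combined, grads):
            c += w * g / norm               # rescaled contribution
    with torch.no_grad():
        for p, g in zip(params, combined):
            p -= lr * g
```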
This work introduces UncertaintyRAG, a lightweight and unsupervised retrieval model for long-context Retrieval-Augmented Generation (RAG). It leverages Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate semantic similarity between text chunks, enhancing robustness to distribution shifts and achieving state-of-the-art average performance on long-context QA and summarization benchmarks while utilizing only 4% of the training data compared to baseline models.
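The generic SNR computation underlying the idea can be sketched as follows; the paper's actual span-uncertainty estimator differs, and `model.token_logprobs` is a hypothetical scoring interface:

```python
import statistics

def span_snr(logprobs):
    """Signal-to-noise ratio of a span's token log-probabilities:
    mean magnitude over standard deviation."""
    mu = statistics.mean(logprobs)
    sigma = statistics.stdev(logprobs) if len(logprobs) > 1 else 1e-9
    return abs(mu) / (sigma + 1e-9)

def chunk_similarity(model, chunk_a: str, chunk_b: str) -> float:
    """Score how stably the model continues chunk_a with chunk_b,
    using the SNR of its log-probs as an uncertainty-based proxy
    for semantic similarity."""
    return span_snr(model.token_logprobs(chunk_a, chunk_b))  # hypothetical API
```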
The VL-SAE framework introduces a Sparse Autoencoder architecture that interprets and enhances vision-language alignment in Vision-Language Models by mapping both modalities to a unified concept set. This approach improves the interpretability of cross-modal reasoning and demonstrates performance gains in zero-shot classification and hallucination reduction.
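A minimal sparse autoencoder with one concept dictionary shared by both modalities conveys the core idea; layer sizes and the top-k sparsity mechanism here are illustrative, not the paper's exact design:

```python
import torch
import torch.nn as nn

class SharedConceptSAE(nn.Module):
    """Encode an embedding from either modality into a sparse code over
    a single shared concept dictionary, then reconstruct it."""
    def __init__(self, dim=768, n_concepts=4096, k=32):
        super().__init__()
        self.encoder = nn.Linear(dim, n_concepts)
        self.decoder = nn.Linear(n_concepts, dim, bias=False)
        self.k = k  # keep only the top-k active concepts

    def forward(self, x):                      # x: image or text embedding
        acts = torch.relu(self.encoder(x))
        topk = torch.topk(acts, self.k, dim=-1)
        sparse = torch.zeros_like(acts).scatter_(-1, topk.indices, topk.values)
        return self.decoder(sparse), sparse    # reconstruction + concept code

sae = SharedConceptSAE()
img_code = sae(torch.randn(1, 768))[1]  # vision embedding -> shared concepts
txt_code = sae(torch.randn(1, 768))[1]  # language embedding -> same concept set
```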
Researchers from the Chinese Academy of Sciences and King Abdullah University of Science and Technology introduced GGFlow, the first discrete flow matching generative model that incorporates optimal transport for molecular graphs. This model achieves nearly perfect chemical validity and state-of-the-art performance in both unconditional and property-guided molecule generation with significantly fewer inference steps.
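One ingredient can be illustrated in isolation: minibatch optimal-transport coupling, which pairs noise and data samples to straighten learned flows. GGFlow applies this idea to discrete molecular graphs; the sketch below pairs flat feature vectors only:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ot_minibatch_pairing(x0, x1):
    """Pair each noise sample with the data sample that minimizes total
    squared transport cost within the minibatch (Hungarian matching).
    x0: [B, D] noise samples, x1: [B, D] data samples."""
    cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(-1)  # [B, B] pairwise cost
    rows, cols = linear_sum_assignment(cost)
    return x0[rows], x1[cols]

x0 = np.random.randn(8, 16)
x1 = np.random.randn(8, 16)
paired_noise, paired_data = ot_minibatch_pairing(x0, x1)
```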
A method for music style transfer is introduced that leverages diffusion models with time-varying textual inversion, allowing users to transfer styles from any audio example, including non-musical sounds, to existing melodies while preserving structural content. This approach demonstrates superior performance in both content preservation and style fit compared to existing state-of-the-art techniques.
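A sketch of the time-varying part: instead of one fixed learned pseudo-word embedding, the style token becomes a function of the diffusion timestep, so the prompt can emphasize coarse style early and fine texture late. Dimensions and the conditioning scheme are illustrative, not the paper's exact design:

```python
import torch
import torch.nn as nn

class TimeVaryingStyleToken(nn.Module):
    """Map a diffusion timestep to a learned token embedding that
    substitutes for the inverted style placeholder in the prompt."""
    def __init__(self, embed_dim=768, t_dim=128):
        super().__init__()
        self.t_dim = t_dim
        self.t_proj = nn.Sequential(nn.Linear(t_dim, 256), nn.SiLU(),
                                    nn.Linear(256, embed_dim))

    def timestep_embedding(self, t):
        half = self.t_dim // 2
        freqs = torch.exp(-torch.arange(half) * torch.log(torch.tensor(1e4)) / half)
        ang = t[:, None].float() * freqs[None, :]
        return torch.cat([torch.sin(ang), torch.cos(ang)], dim=-1)

    def forward(self, t):  # t: [B] diffusion timesteps
        return self.t_proj(self.timestep_embedding(t))  # [B, embed_dim]

style = TimeVaryingStyleToken()
tokens = style(torch.tensor([10, 500]))  # different embeddings at different timesteps
```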
KnowCoder, developed by ICT, CAS, enhances Universal Information Extraction (UIE) by introducing a code-style schema representation that leverages LLMs' inherent code understanding. The model demonstrates superior generalization across diverse information extraction tasks, achieving a 12.5% relative improvement in zero-shot NER F1 over leading baselines and outperforming prior state-of-the-art models in Relation and Event Extraction after fine-tuning.
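An illustrative code-style schema in this spirit: extraction targets are expressed as Python classes whose names and docstrings carry the schema semantics, and the LLM is asked to emit instantiations. The class conventions below are simplified, not the paper's exact format:

```python
from dataclasses import dataclass

@dataclass
class Entity:
    mention: str  # surface form as it appears in the text

@dataclass
class Person(Entity):
    """A named person, e.g. 'Marie Curie'."""

@dataclass
class Organization(Entity):
    """A company, agency, or other institution."""

# Given "Tim Cook announced Apple's new campus.", the model is expected
# to produce code like:
extraction = [Person(mention="Tim Cook"), Organization(mention="Apple")]
```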
DIVERSIFY is a framework that tackles out-of-distribution detection and generalization for time series data by explicitly identifying and characterizing latent distributions without relying on predefined domain labels. It consistently outperforms baseline methods on OOD detection across seven diverse datasets, demonstrating its ability to learn robust representations for non-stationary time series.
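The first step can be sketched simply: with no domain labels available, latent distributions are characterized by grouping windowed time-series features into pseudo-domains, which the full method then refines through adversarial, domain-invariant training. The clustering choice here (k-means) is illustrative only:

```python
import numpy as np
from sklearn.cluster import KMeans

def assign_latent_domains(window_features, n_domains=3):
    """Assign each time-series window a pseudo-domain label by
    clustering its features, standing in for unknown latent
    distributions."""
    return KMeans(n_clusters=n_domains, n_init=10).fit_predict(window_features)

feats = np.random.randn(100, 32)      # 100 windows, 32-dim features each
pseudo_domains = assign_latent_domains(feats)
```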