Researchers from Nanyang Technological University, Sun Yat-Sen University, and South China University of Technology developed a general-purpose 3D Vision-Language Pre-training framework that leverages 3D scene graphs to achieve multi-level alignment between 3D scenes and natural language. The framework establishes state-of-the-art or competitive performance across 3D visual grounding, question answering, and dense captioning tasks.
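To make the "multi-level alignment" idea concrete, here is a minimal sketch of a contrastive objective that aligns scene-graph objects, relations, and the whole scene with the corresponding text spans. The function names, feature shapes, and the symmetric InfoNCE form are illustrative assumptions, not the paper's exact losses.

```python
import torch
import torch.nn.functional as F

def info_nce(x, y, temperature=0.07):
    """Symmetric InfoNCE between two batches of paired embeddings."""
    x, y = F.normalize(x, dim=-1), F.normalize(y, dim=-1)
    logits = x @ y.t() / temperature                     # (B, B) similarity matrix
    targets = torch.arange(x.size(0), device=x.device)   # matched pairs on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def multi_level_alignment_loss(obj_feats, obj_text,      # object nodes <-> object phrases
                               rel_feats, rel_text,      # relation edges <-> relation phrases
                               scene_feat, scene_text,   # pooled scene <-> full sentence
                               weights=(1.0, 1.0, 1.0)):
    """Sum contrastive losses at object, relation, and scene level (illustrative)."""
    return (weights[0] * info_nce(obj_feats, obj_text) +
            weights[1] * info_nce(rel_feats, rel_text) +
            weights[2] * info_nce(scene_feat, scene_text))
```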
Researchers from Sun Yat-sen University developed optimal polynomial-time algorithms for entanglement purification scheduling and fidelity-constrained multi-flow routing in quantum networks. Their framework, built on dynamic programming and graph theory, achieves higher fidelity and lower path costs than existing methods and establishes theoretical conditions for optimal purification strategies.
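As a rough illustration of how dynamic programming can schedule purification, the sketch below maximizes the fidelity of a single output pair from a budget of raw Bell pairs. It uses the simplified textbook recurrence-purification map for Werner-like pairs and ignores success probabilities, so it is an assumption-laden toy, not the paper's algorithm.

```python
def purify(f1, f2):
    """Fidelity after one recurrence-style purification round on two Werner-like
    pairs (simplified textbook map; success probability ignored)."""
    return (f1 * f2) / (f1 * f2 + (1 - f1) * (1 - f2))

def best_fidelity(num_raw_pairs, raw_fidelity):
    """DP over the number of raw pairs consumed: best[k] is the highest
    single-pair fidelity reachable by purifying k raw pairs together."""
    best = [0.0, raw_fidelity] + [0.0] * (num_raw_pairs - 1)
    for k in range(2, num_raw_pairs + 1):
        best[k] = max(purify(best[i], best[k - i]) for i in range(1, k))
    return best[num_raw_pairs]

print(best_fidelity(4, 0.8))   # ~0.996 under this simplified map
```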
Researchers at Sun Yat-sen University and collaborators introduce Continuous Scaling Attention (CSAttn), an attention-only Transformer block that achieves state-of-the-art performance across multiple image restoration tasks without relying on Feed-Forward Networks. The architecture demonstrates substantial improvements, including a 0.41 dB PSNR increase in image deraining and a 4.22 dB PSNR gain in low-light image enhancement, while maintaining competitive model efficiency.
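The structural idea of a block with no FFN branch can be sketched in a few lines of PyTorch. This generic attention-only residual block is only meant to show the shape of the design; it does not reproduce the actual CSAttn mechanism.

```python
import torch
import torch.nn as nn

class AttentionOnlyBlock(nn.Module):
    """Transformer block with self-attention only -- no feed-forward network.
    Illustrative of the attention-only idea, not the exact CSAttn design."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                      # x: (batch, tokens, dim)
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h)
        return x + attn_out                    # residual connection, no FFN branch

tokens = torch.randn(2, 64, 96)                # e.g. an 8x8 patch grid of features
block = AttentionOnlyBlock(96)
print(block(tokens).shape)                     # torch.Size([2, 64, 96])
```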
Researchers from The Hong Kong Polytechnic University, Dartmouth College, Max Planck Institute, Google DeepMind, and others developed Prophet, a training-free adaptive decoding paradigm for Diffusion Language Models (DLMs) that leverages early answer convergence. The method achieves up to 3.4 times faster inference by dynamically committing to answers when model confidence is high, often improving output quality compared to full-step decoding.
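A minimal sketch of confidence-gated early commitment for an iterative, diffusion-style decoder is shown below. The `model_logits_fn` callable, the top-1/top-2 logit-gap criterion, and the threshold are assumptions used for illustration; Prophet's actual decision rule and schedule may differ.

```python
import torch

def early_commit_decode(model_logits_fn, seq, mask, max_steps, gap_threshold=5.0):
    """Iterative unmasking with early commitment: once the top-1/top-2 logit gap
    on every still-masked position is large enough, fill them all at once instead
    of running the remaining refinement steps."""
    for step in range(max_steps):
        if not mask.any():                                 # nothing left to decode
            return seq, step
        logits = model_logits_fn(seq)                      # (seq_len, vocab)
        top2 = logits.topk(2, dim=-1).values               # (seq_len, 2)
        gap = top2[:, 0] - top2[:, 1]                      # per-position confidence margin
        if gap[mask].min() >= gap_threshold:               # all masked slots are confident
            seq[mask] = logits[mask].argmax(-1)            # commit the answer early
            return seq, step
        # otherwise unmask only the single most confident position this step
        masked_idx = mask.nonzero(as_tuple=True)[0]
        best = masked_idx[gap[masked_idx].argmax()]
        seq[best] = logits[best].argmax()
        mask[best] = False
    return seq, max_steps
```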
MemoryBank introduces a novel long-term memory mechanism for Large Language Models, enabling them to retain and recall information across extended interactions by simulating human-like forgetting and reinforcement. The system, demonstrated through the SiliconFriend chatbot, significantly enhances contextual understanding, personalizes user interactions through dynamic user portraits, and provides empathetic responses, showing strong performance across various LLMs in both qualitative and quantitative evaluations.
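The forgetting-and-reinforcement idea can be sketched with an Ebbinghaus-style retention curve: each memory decays over time but is strengthened whenever it is recalled. The retention function and constants below are illustrative assumptions rather than MemoryBank's exact update rule.

```python
import math, time

class MemoryItem:
    """A stored memory whose retention decays over time (Ebbinghaus-style)
    and is reinforced each time it is recalled."""
    def __init__(self, text):
        self.text = text
        self.strength = 1.0                 # grows with each recall
        self.last_recall = time.time()

    def retention(self, now=None):
        now = now or time.time()
        elapsed_hours = (now - self.last_recall) / 3600.0
        return math.exp(-elapsed_hours / self.strength)   # R = exp(-t / S)

    def recall(self):
        """Reinforce the memory: bump its strength and reset the decay clock."""
        self.strength += 1.0
        self.last_recall = time.time()
        return self.text

def retrieve(memories, relevance, k=3, min_retention=0.05):
    """Drop memories that have effectively faded, then rank the rest by relevance."""
    alive = [m for m in memories if m.retention() >= min_retention]
    return sorted(alive, key=relevance, reverse=True)[:k]
```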
The MemAct framework enables Large Language Model agents to autonomously manage their working memory by treating context curation as learnable actions, addressing a critical bottleneck in long-horizon tasks. This approach achieves 59.1% accuracy on multi-objective QA while reducing average context tokens to 3,447, outperforming larger baselines and improving training efficiency by up to 40%.
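One way to picture "context curation as actions" is a small action space the agent can emit alongside its task actions, as in the sketch below. The segment structure, the keep/drop/summarize actions, and the summarizer hook are hypothetical stand-ins, not MemAct's actual interface.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    text: str
    tokens: int

def apply_curation(context, kind, i, summarizer=None):
    """Apply one curation action to the working-memory context.
    kind: 'keep' leaves segment i unchanged, 'drop' deletes it,
    'summarize' replaces it with a shorter summary."""
    if kind == "drop":
        return context[:i] + context[i + 1:]
    if kind == "summarize" and summarizer is not None:
        short = summarizer(context[i].text)
        return context[:i] + [Segment(short, len(short.split()))] + context[i + 1:]
    return context          # 'keep' (or unrecognized action): no change

def context_tokens(context):
    """Total token budget currently consumed by the working memory."""
    return sum(seg.tokens for seg in context)
```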
RoboTwin 2.0 introduces a scalable simulation framework and benchmark designed to generate high-quality, domain-randomized data for robust bimanual robotic manipulation, addressing limitations in existing synthetic datasets. Policies trained with RoboTwin 2.0 data achieved a 24.4% improvement in real-world success rates for few-shot learning and 21.0% for zero-shot generalization on unseen backgrounds.
Search Self-play (SSP) is a novel self-supervised framework enabling Large Language Model agents to autonomously generate, verify, and solve complex deep search tasks. The method consistently improves agent performance across seven question-answering benchmarks, yielding an average improvement of 26.4 points for base models and achieving state-of-the-art results on five benchmarks for larger models like Qwen2.5-32B-Instruct.
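The generate-verify-solve loop could look roughly like the round sketched below, where a proposer invents a search-grounded question, a verification gate filters out unsupported tasks, and a solver is scored against the proposer's reference. Every callable here is a hypothetical stand-in, and the adversarial reward split is only one plausible formulation of self-play, not necessarily SSP's.

```python
def search_self_play_round(proposer, solver, search_tool, judge):
    """One proposer/solver self-play round (illustrative; all components hypothetical)."""
    question, reference = proposer.generate_task(search_tool)

    # Verification gate: discard tasks the proposer cannot back with search evidence.
    if not judge.supported_by_search(question, reference, search_tool):
        return None

    prediction = solver.answer(question, search_tool)
    solver_reward = float(judge.matches(prediction, reference))
    proposer_reward = 1.0 - solver_reward      # adversarial-style signal for the proposer
    return question, reference, prediction, solver_reward, proposer_reward
```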
OmniVGGT, a 3D foundation model developed by researchers including those from HKUST and NTU, integrates an arbitrary number of geometric modalities like depth maps and camera parameters, resulting in superior performance across 3D perception tasks and improved spatial reasoning for robotic manipulation.
The WORLD-ENV framework enables effective post-training for Vision-Language-Action (VLA) models by utilizing a world model as a virtual environment and a VLM-guided 'instant reflector' that provides reward signals and dynamically determines task completion. This approach achieves an average task success rate of 79.6% on the LIBERO benchmark with only 5 expert demonstrations per task, exceeding supervised fine-tuning baselines.
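A rollout in such a setup replaces the real simulator with imagined transitions from the world model, while the reflector supplies rewards and a stop signal. The sketch below is a generic imagined-rollout loop under those assumptions; the component names and interfaces are hypothetical.

```python
def rollout_in_world_model(policy, world_model, reflector, instruction, obs, max_steps=100):
    """Post-training rollout with no real environment: the world model predicts the next
    observation from (obs, action), and a VLM-based reflector scores progress and decides
    when the task is complete. All components are hypothetical stand-ins."""
    trajectory, total_reward = [], 0.0
    for _ in range(max_steps):
        action = policy.act(obs, instruction)
        next_obs = world_model.predict(obs, action)            # imagined transition
        reward, done = reflector.judge(next_obs, instruction)  # VLM-guided reward + stop signal
        trajectory.append((obs, action, reward))
        total_reward += reward
        obs = next_obs
        if done:
            break
    return trajectory, total_reward
```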
Researchers from Nanyang Technological University and collaborators provide a comprehensive survey defining Agentic Multimodal Large Language Models (MLLMs), distinguishing them from earlier MLLM agents by their dynamic workflows and proactive behaviors. The survey establishes a three-dimensional framework covering internal intelligence, external tool invocation, and environment interaction, while also compiling open-source resources for the field.
An updated benchmark, VBench-2.0, provides a comprehensive and automatic suite for evaluating video generative models based on their intrinsic faithfulness, assessing adherence to real-world principles like physics, commonsense, and human anatomy. This framework, which leverages a hybrid approach of generalist and specialist AI models, demonstrates strong alignment with human preferences in evaluating generated video quality.
This research introduces Optimal Tool Call-controlled Policy Optimization (OTC-PO), an RL framework that enables Large Language Models (LLMs) to use external tools efficiently while maintaining answer correctness. The method reduces tool calls by up to 68.3% and boosts tool productivity by over 200% on search and code reasoning tasks across various benchmarks, effectively mitigating cognitive offloading in LLMs.
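The core intuition is a reward that keeps full credit only when a correct answer uses roughly the right number of tool calls. Below is one plausible shaping that multiplies correctness by a tool-efficiency factor; the specific cosine penalty and the notion of an "optimal" call count are assumptions for illustration and may differ from the reward actually used in OTC-PO.

```python
import math

def tool_efficiency_reward(is_correct, tool_calls, optimal_calls):
    """Scale the correctness reward by how close the trajectory's tool-call count is to
    a task-dependent optimal number of calls: a correct answer with many redundant calls
    earns less than one that uses tools sparingly (illustrative shaping)."""
    if not is_correct:
        return 0.0
    if tool_calls == optimal_calls:
        return 1.0
    # smooth penalty that decays as the call count drifts away from the optimum
    drift = min(abs(tool_calls - optimal_calls) / (tool_calls + 1), 1.0)
    return math.cos(drift * math.pi / 2)
```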
Researchers from Soochow University, Microsoft, and other institutions developed OPENTHINKIMG, an open-source framework, and V-TOOLRL, a reinforcement learning method, enabling Large Vision-Language Models to adaptively use visual tools for complex reasoning tasks. This approach improved accuracy on chart reasoning by 29.83 points over baseline models and outperformed GPT-4.1, teaching agents to efficiently invoke tools with visual feedback.
LLMDet, from Sun Yat-sen University and Alibaba Group, leverages Large Language Models to provide rich, detailed image-level and region-level captions, improving open-vocabulary object detection performance, particularly for rare categories. The approach also demonstrates mutual benefits, enhancing large multi-modal models when integrated as a vision foundation model.
View blogTREERPO enhances Large Language Model reasoning by employing a novel tree sampling mechanism to generate fine-grained, step-level reward signals without requiring a separate process reward model. This method improves Pass@1 accuracy by up to 16.5% for Qwen2.5-Math-1.5B and reduces average response length by 18.1% compared to GRPO.
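One way tree sampling can yield step-level signals without a process reward model is to estimate each tree node's value from the empirical success rate of rollouts passing through it, and credit a step by how much it shifts that rate relative to its parent. The node structure and advantage formula below are an illustrative formulation, not necessarily TREERPO's exact estimator.

```python
class Node:
    """One reasoning step in the sampling tree. `correct` counts rollouts through this
    node that reached a correct final answer; `total` counts all rollouts through it."""
    def __init__(self, step_text, parent=None):
        self.step_text, self.parent, self.children = step_text, parent, []
        self.correct, self.total = 0, 0

    def value(self):
        return self.correct / self.total if self.total else 0.0

def step_advantages(root):
    """Step-level signal without a separate process reward model: each step's advantage
    is how much it changes the empirical success rate relative to its parent node."""
    advantages, stack = [], [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            advantages.append((child.step_text, child.value() - node.value()))
            stack.append(child)
    return advantages
```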
This survey paper meticulously reviews the current state and future trajectories of Embodied Artificial Intelligence, specifically focusing on the integration of Multi-modal Large Models and World Models. It systematically categorizes advancements across robots, simulators, perception, interaction, agent architectures, and sim-to-real adaptation, while introducing the ARIO (All Robots In One) dataset standard to foster the development of general-purpose embodied agents.
Researchers from The University of Hong Kong and collaborators introduced MMOral, the first large-scale multimodal instruction dataset and benchmark specifically for panoramic X-ray interpretation. Their fine-tuned OralGPT model demonstrated a 24.73% performance improvement on the MMOral-Bench, highlighting the need for domain-specific AI in dentistry as even leading general LVLMs like GPT-4o achieved only 41.45% accuracy.