alphaXiv

History

Papers Benchmarks

IBM QuantumIBM T.J. Watson Research Center

289

27 Oct 2025

computer-science artificial-intelligence computation-and-language

A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications

Michigan State University

Microsoft

The Pennsylvania State University Amazon IBM T.J. Watson Research Center The University of Utah

Researchers from Penn State, in collaboration with industry partners, provide the first comprehensive survey of Reinforcement Learning-based agentic search, systematically organizing its foundational concepts, functional roles, optimization strategies, and applications. This work clarifies the interplay between RL and agentic LLMs, delineating current capabilities, evaluation methods, and critical future research directions.

515

30 May 2025

computer-science machine-learning deep-reinforcement-learning

Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training

IBM Research IBM Quantum MIT-IBM Watson Lab

IBM Research and MIT-IBM Watson Lab researchers formalized an off-policy extension for Group Relative Policy Optimization (GRPO), demonstrating it provides theoretical guarantees for policy improvement and achieves comparable or superior performance to on-policy GRPO on LLM alignment tasks while significantly reducing computational overhead through less frequent model updates.

399

22 Aug 2025

physics quantum-physics

Improved belief propagation is sufficient for real-time decoding of quantum memory

IBM Quantum

We introduce a new heuristic decoder, Relay-BP, targeting real-time quantum circuit decoding for large-scale quantum computers. Relay-BP achieves high accuracy across circuit-noise decoding problems: significantly outperforming BP+OSD+CS-10 for bivariate-bicycle codes and comparable to min-weight-matching for surface codes. As a lightweight message-passing decoder, Relay-BP is inherently parallel, enabling rapid low-footprint decoding with FPGA or ASIC real-time implementations, similar to standard BP. A core aspect of our decoder is its enhancement of the standard BP algorithm by incorporating disordered memory strengths. This dampens oscillations and breaks symmetries that trap traditional BP algorithms. By dynamically adjusting memory strengths in a relay approach, Relay-BP can consecutively encounter multiple valid corrections to improve decoding accuracy. We observe that a problem-dependent distribution of memory strengths that includes negative values is indispensable for good performance.

165

29 Oct 2025

computer-science artificial-intelligence computation-and-language

GradeSQL: Test-Time Inference with Outcome Reward Models for Text-to-SQL Generation from Large Language Models

IBM T.J. Watson Research Center Polytechnic University of Bari

Dario Di Palma

A framework called GradeSQL enhances Text-to-SQL generation from Large Language Models (LLMs) by leveraging Outcome Reward Models (ORMs) to select the semantically most correct query from a pool of candidates during test-time inference. This approach consistently achieved higher execution accuracy, demonstrating, for example, a +5.01% gain over the N=1 baseline on the BIRD dev set through improved semantic evaluation.

287

18 Dec 2023

computer-science artificial-intelligence

Generalized Planning in PDDL Domains with Pretrained Large Language Models

MIT IBM T.J. Watson Research Center

Soham Dan

Researchers from MIT CSAIL and IBM Research demonstrate that GPT-4 can serve as a powerful generalized planner for PDDL domains, synthesizing executable Python programs from domain descriptions and a few training tasks. Their methodology, enhanced by automated debugging, yields highly efficient and scalable solutions that often surpass traditional generalized planners in performance and runtime efficiency.

456

13 Jul 2025

other-condensed-matter physics chemical-physics

Chemistry Beyond the Scale of Exact Diagonalization on a Quantum-Centric Supercomputer

IBM Quantum University of Colorado IBM Research Cambridge IBM T.J. Watson Research Center RIKEN Center for Emergent Matter Science (CEMS)RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS)RIKEN Center for Quantum Computing (RQC)RIKEN Center for Computational Science (R-CCS)IBM France Lab

Electronic structure calculations on molecular systems previously intractable for exact classical methods were performed using a quantum-centric supercomputer integrating an IBM Heron processor with the Fugaku supercomputer. This approach, employing the Sample-Based Quantum Diagonalization (SQD) workflow, successfully calculated properties for systems up to 77 qubits and 10,570 gates, aided by a robust self-consistent configuration recovery technique.

9,337

12 Dec 2024

ai-for-health computer-science information-retrieval

MOPI-HFRS: A Multi-objective Personalized Health-aware Food Recommendation System with LLM-enhanced Interpretation

University of Notre Dame University of Connecticut Brandeis University IBM T.J. Watson Research Center

The prevalence of unhealthy eating habits has become an increasingly concerning issue in the United States. However, major food recommendation platforms (e.g., Yelp) continue to prioritize users' dietary preferences over the healthiness of their choices. Although efforts have been made to develop health-aware food recommendation systems, the personalization of such systems based on users' specific health conditions remains under-explored. In addition, few research focus on the interpretability of these systems, which hinders users from assessing the reliability of recommendations and impedes the practical deployment of these systems. In response to this gap, we first establish two large-scale personalized health-aware food recommendation benchmarks at the first attempt. We then develop a novel framework, Multi-Objective Personalized Interpretable Health-aware Food Recommendation System (MOPI-HFRS), which provides food recommendations by jointly optimizing the three objectives: user preference, personalized healthiness and nutritional diversity, along with an large language model (LLM)-enhanced reasoning module to promote healthy dietary knowledge through the interpretation of recommended results. Specifically, this holistic graph learning framework first utilizes two structure learning and a structure pooling modules to leverage both descriptive features and health data. Then it employs Pareto optimization to achieve designed multi-facet objectives. Finally, to further promote the healthy dietary knowledge and awareness, we exploit an LLM by utilizing knowledge-infusion, prompting the LLMs with knowledge obtained from the recommendation model for interpretation.

547

03 Jun 2025

physics quantum-physics

Tour de gross: A modular quantum computer based on bivariate bicycle codes

IBM Quantum

Researchers at IBM Quantum introduce the 'bicycle architecture,' a fault-tolerant quantum computing design leveraging bivariate bicycle codes that achieves an order of magnitude lower physical qubit overhead compared to surface code architectures. This modular approach significantly reduces physical qubit requirements for complex computations and enhances manufacturability.

24 Oct 2025

physics quantum-physics

Real-time decoding of the gross code memory with FPGAs

IBM Quantum

A Field Programmable Gate Array (FPGA) implementation developed by IBM Quantum researchers decodes advanced quantum low-density parity check (qLDPC) codes in real-time, achieving a 24ns belief propagation iteration time for the [[144, 12, 12]] gross code. This performance, significantly faster than typical syndrome collection rates, resolves the critical 'backlog problem' in quantum error correction while matching floating-point accuracy with 4-6 bits of integer precision.

10 Oct 2025

physics quantum-physics

Big cats: entanglement in 120 qubits and beyond

IBM Quantum

Entanglement is the quintessential quantum phenomenon and a key enabler of quantum algorithms. The ability to faithfully entangle many distinct particles is often used as a benchmark for the quality of hardware and control in a quantum computer. Greenberger-Horne-Zeilinger (GHZ) states, also known as Schrödinger cat states, are useful for this task. They are easy to verify, but difficult to prepare due to their high sensitivity to noise. In this Letter we report on the largest GHZ state prepared to date consisting of 120 superconducting qubits. We do this via a combination of optimized compilation, low-overhead error detection and temporary uncomputation. We use an automated compiler to maximize error-detection in state preparation circuits subject to arbitrary qubit connectivity constraints and variations in error rates. We measure a GHZ fidelity of 0.56(3) with a post-selection rate of 28%. We certify the fidelity of our GHZ states using multiple methods and show that they are all equivalent, albeit with different practical considerations.

350

04 Dec 2025

computer-science software-engineering

C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques

Columbia University IBM T.J. Watson Research Center

In recent years, there has been a lot of interest in converting C code to Rust, to benefit from the memory and thread safety guarantees of Rust. C2Rust is a rule-based system that can automatically convert C code to functionally identical Rust, but the Rust code that it produces is non-idiomatic, i.e., makes extensive use of unsafe Rust, a subset of the language that doesn't have memory or thread safety guarantees. At the other end of the spectrum are LLMs, which produce idiomatic Rust code, but these have the potential to make mistakes and are constrained in the length of code they can process. In this paper, we present C2SaferRust, a novel approach to translate C to Rust that combines the strengths of C2Rust and LLMs. We first use C2Rust to convert C code to non-idiomatic, unsafe Rust. We then decompose the unsafe Rust code into slices that can be individually translated to safer Rust by an LLM. After processing each slice, we run end-to-end test cases to verify that the code still functions as expected. We also contribute a benchmark of 7 real-world programs, translated from C to unsafe Rust using C2Rust. Each of these programs also comes with end-to-end test cases. On this benchmark, we are able to reduce the number of raw pointers by up to 38%, and reduce the amount of unsafe code by up to 28%, indicating an increase in safety. The resulting programs still pass all test cases. C2SaferRust also shows convincing gains in performance against two previous techniques for making Rust code safer.

25 Sep 2025

physics quantum-physics

Entanglement sharing schemes

University of Waterloo Freie Universität Berlin IBM Quantum

Perimeter Institute for Theoretical Physics National Research Council Canada Rikkyo University

T M

Researchers from Perimeter Institute, University of Waterloo, and collaborators introduced Entanglement Sharing Schemes (ESS), a theoretical framework for distributing quantum correlations among multiple parties. The work defines and characterizes both known and unknown partner ESS, providing constructions and establishing conditions for their realizability, and successfully applied the framework to resolve an open problem in entanglement summoning.

09 Oct 2025

physics quantum-physics

Transversal gates for probabilistic implementation of multi-qubit Pauli rotations

the University of Tokyo IBM Quantum

Researchers from The University of Tokyo and IBM Quantum introduce a general theoretical framework for "weak transversal gates," which enable probabilistic, in-place multi-qubit Pauli rotations in fault-tolerant quantum computation. This new approach for non-Clifford operations, integrated into a "Clifford+" architecture, demonstrates reductions in runtime by factors of tens to a hundred and circuit volume by factors of 100+ compared to conventional methods for large-scale circuits.

184

19 Jun 2024

computer-science emerging-technologies physics

Quantum computing with Qiskit

IBM Quantum IBM Research Europe IBM France Lab IBM, T.J. Watson Research Center

We describe Qiskit, a software development kit for quantum information science. We discuss the key design decisions that have shaped its development, and examine the software architecture and its core components. We demonstrate an end-to-end workflow for solving a problem in condensed matter physics on a quantum computer that serves to highlight some of Qiskit's capabilities, for example the representation and optimization of circuits at various abstraction levels, its scalability and retargetability to new gates, and the use of quantum-classical computations via dynamic circuits. Lastly, we discuss some of the ecosystem of tools and plugins that extend Qiskit for various tasks, and the future ahead.

130

27 Oct 2025

physics quantum-physics

Improved QLDPC Surgery: Logical Measurements and Bridging Codes

IBM Quantum IBM Research Cambridge IBM T.J. Watson Research Center

In this paper, we introduce the gauge-fixed QLDPC surgery scheme, an improved logical measurement scheme based on the construction of Cohen et al. (Sci. Adv. 8, eabn1717). Our scheme leverages expansion properties of the Tanner graph to substantially reduce the space overhead of QLDPC surgery. In certain cases, we only require

\Theta(w)

ancilla qubits to fault-tolerantly measure a weight

w

logical operator. We provide rigorous analysis for the code distance and fault distance of our schemes, and present a modular decoding algorithm that achieves maximal fault-distance. We further introduce a bridge system to facilitate fault-tolerant joint measurements of logical operators. Augmented by this bridge construction, our scheme can be used to connect different families of QLDPC codes into one universal architecture. Applying our toolbox, we show how to perform all logical Clifford gates on the [[144,12,12]] bivariate bicycle code. Our scheme adds 103 ancilla qubits into the connectivity graph, and one of the twelve logical qubits is used as an ancilla for gate synthesis. Logical measurements are combined with the automorphism gates studied by Bravyi et al. (Nature 627, 778-782) to implement 288 Pauli product measurements. We demonstrate the practicality of our scheme through circuit-level noise simulations, leveraging our proposed modular decoder that combines BPOSD with matching.

2,268

21 Feb 2024

computer-science emerging-technologies physics

High-threshold and low-overhead fault-tolerant quantum memory

IBM Quantum MIT-IBM Watson AI Lab IBM T.J. Watson Research Center

Researchers at IBM Quantum introduced Bivariate Bicycle (BB) Low-Density Parity-Check (LDPC) codes, which achieve an error threshold of 0.8%, on par with the surface code, while dramatically reducing the physical qubit overhead by over 10 times. These codes are designed with hardware-compatible connectivity and support fault-tolerant memory operations, making scalable quantum memory feasible for near-term processors.

06 Oct 2025

computer-science computational-complexity physics

Peaked quantum advantage using error correction

University of Chicago IBM Quantum

University of Maryland

A key issue of current quantum advantage experiments is that their verification requires a full classical simulation of the ideal computation. This limits the regime in which the experiments can be verified to precisely the regime in which they are also simulatable. An important outstanding question is therefore to find quantum advantage schemes that are also classically verifiable. We make progress on this question by designing a new quantum advantage proposal--Hidden Code Sampling--whose output distribution is conditionally peaked. These peaks enable verification in far less time than it takes for full simulation. At the same time, we show that exactly sampling from the output distribution is classically hard unless the polynomial hierarchy collapses, and we propose a plausible conjecture regarding average-case hardness. Our scheme is based on ideas from quantum error correction. The required quantum computations are closely related to quantum fault-tolerant circuits and can potentially be implemented transversally. Our proposal may thus give rise to a next generation of quantum advantage experiments en route to full quantum fault tolerance.

133

26 Feb 2025

computer-science artificial-intelligence physics

Practical and efficient quantum circuit synthesis and transpiling with Reinforcement Learning

IBM Quantum IBM T.J. Watson Research Center

IBM Quantum researchers developed a Reinforcement Learning framework for practical and efficient quantum circuit synthesis and routing, eliminating the need for large datasets and inherently incorporating device-specific constraints. This approach yields near-optimal circuits, achieving up to 60% reduction in CNOT layers for Cliffords and 40% shallower circuits for large quantum volume benchmarks, while being orders of magnitude faster than exact optimization methods.

28 Aug 2025

agents computer-science artificial-intelligence

Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant

IBM Research IBM Quantum

Qiskit is an open-source quantum computing framework that allows users to design, simulate, and run quantum circuits on real quantum hardware. We explore post-training techniques for LLMs to assist in writing Qiskit code. We introduce quantum verification as an effective method for ensuring code quality and executability on quantum hardware. To support this, we developed a synthetic data pipeline that generates quantum problem-unit test pairs and used it to create preference data for aligning LLMs with DPO. Additionally, we trained models using GRPO, leveraging quantum-verifiable rewards provided by the quantum hardware. Our best-performing model, combining DPO and GRPO, surpasses the strongest open-source baselines on the challenging Qiskit-HumanEval-hard benchmark.

103

20 Jun 2024

computer-science artificial-intelligence generative-models

Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models

IBM Research IBM Quantum

Quantum programs are typically developed using quantum Software Development Kits (SDKs). The rapid advancement of quantum computing necessitates new tools to streamline this development process, and one such tool could be Generative Artificial intelligence (GenAI). In this study, we introduce and use the Qiskit HumanEval dataset, a hand-curated collection of tasks designed to benchmark the ability of Large Language Models (LLMs) to produce quantum code using Qiskit - a quantum SDK. This dataset consists of more than 100 quantum computing tasks, each accompanied by a prompt, a canonical solution, a comprehensive test case, and a difficulty scale to evaluate the correctness of the generated solutions. We systematically assess the performance of a set of LLMs against the Qiskit HumanEval dataset's tasks and focus on the models ability in producing executable quantum code. Our findings not only demonstrate the feasibility of using LLMs for generating quantum code but also establish a new benchmark for ongoing advancements in the field and encourage further exploration and development of GenAI-driven tools for quantum code generation.

There are no more papers matching your filters at the moment.

Events

Personalize Your Feed

Install Browser Extension

We're hiring

alphaXiv

Explore

State of the Art

Sign In

Labs

Feedback

Dark mode

A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications

Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training

Improved belief propagation is sufficient for real-time decoding of quantum memory

GradeSQL: Test-Time Inference with Outcome Reward Models for Text-to-SQL Generation from Large Language Models

Generalized Planning in PDDL Domains with Pretrained Large Language Models

Chemistry Beyond the Scale of Exact Diagonalization on a Quantum-Centric Supercomputer

MOPI-HFRS: A Multi-objective Personalized Health-aware Food Recommendation System with LLM-enhanced Interpretation

Tour de gross: A modular quantum computer based on bivariate bicycle codes

Real-time decoding of the gross code memory with FPGAs

Big cats: entanglement in 120 qubits and beyond

C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques

Entanglement sharing schemes

Transversal gates for probabilistic implementation of multi-qubit Pauli rotations

Quantum computing with Qiskit

Improved QLDPC Surgery: Logical Measurements and Bridging Codes

High-threshold and low-overhead fault-tolerant quantum memory

Peaked quantum advantage using error correction

Practical and efficient quantum circuit synthesis and transpiling with Reinforcement Learning

Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant

Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models

Events

AI for Law

Personalize Your Feed