alphaXiv

INSA Rennes

02 Apr 2025

computer-science computer-vision-and-pattern-recognition generative-models

BOGausS: Better Optimized Gaussian Splatting

CNRS Orange Innovation INSA Rennes IETR-UMR 6164 Univ-Rennes

BOGausS improves 3D Gaussian Splatting optimization by addressing challenges in parameter tuning, model size reduction, and visual artifacts. It achieves higher quality scene reconstructions with up to ten times fewer Gaussians than prior methods, developed by researchers from Orange Innovation and French academic institutions.

10 Jul 2025

computer-science computer-vision-and-pattern-recognition data-curation

MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation

INSA Rennes Institute of Research and Technology b<>com Univ-Rennes

The application of methods based on Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3D GS) has steadily gained popularity in the field of 3D object segmentation in static scenes. These approaches demonstrate efficacy in a range of 3D scene understanding and editing tasks. Nevertheless, the 4D object segmentation of dynamic scenes remains an underexplored field due to the absence of a sufficiently extensive and accurately labelled multi-view video dataset. In this paper, we present MUVOD, a new multi-view video dataset for training and evaluating object segmentation in reconstructed real-world scenarios. The 17 selected scenes, describing various indoor or outdoor activities, are collected from different source datasets originating from various types of camera rigs. Each scene contains a minimum of 9 views and a maximum of 46 views. We provide 7830 RGB images (30 frames per video) with their corresponding segmentation masks in 4D motion, meaning that any object of interest in the scene can be tracked across temporal frames of a given view or across different views belonging to the same camera rig. This dataset, which contains 459 instances of 73 categories, is intended as a basic benchmark for the evaluation of multi-view video segmentation methods. We also present an evaluation metric and a baseline segmentation approach to encourage and evaluate progress in this evolving field. Additionally, we propose a new benchmark for the 3D object segmentation task with a subset of annotated multi-view images selected from our MUVOD dataset. This subset contains 50 objects under different conditions in different scenarios, providing a more comprehensive analysis of state-of-the-art 3D object segmentation methods. Our proposed MUVOD dataset is available at this https URL.

26 Mar 2025

computer-science artificial-intelligence computation-and-language

EuroBERT: Scaling Multilingual Encoders for European Languages

CNRS

Carnegie Mellon University Universidade de Lisboa Instituto de Telecomunicações IRISA

Université Paris-Saclay Instituto Superior Técnico LIG Grenoble-INP INSA Rennes Unbabel Illuin Technology Equall IRT Saint-Exupéry Diabolocom CINES Artefact ISIA Lab Université Grenoble Alpes

General-purpose multilingual vector representations, used in retrieval, regression and classification, are traditionally obtained from bidirectional encoder models. Despite their wide applicability, encoders have been recently overshadowed by advances in generative decoder-only models. However, many innovations driving this progress are not inherently tied to decoders. In this paper, we revisit the development of multilingual encoders through the lens of these advances, and introduce EuroBERT, a family of multilingual encoders covering European and widely spoken global languages. Our models outperform existing alternatives across a diverse range of tasks, spanning multilingual capabilities, mathematics, and coding, and natively support sequences of up to 8,192 tokens. We also examine the design decisions behind EuroBERT, offering insights into our dataset composition and training pipeline. We publicly release the EuroBERT models, including intermediate training checkpoints, together with our training framework.

29 Sep 2025

computer-science computation-and-language information-extraction

Nested Named Entity Recognition as Single-Pass Sequence Labeling

Universidade da Coruña INSA Rennes

We cast nested named entity recognition (NNER) as a sequence labeling task by leveraging prior work that linearizes constituency structures, effectively reducing the complexity of this structured prediction problem to straightforward token classification. By combining these constituency linearizations with pretrained encoders, our method captures nested entities while performing exactly n tagging actions. Our approach achieves competitive performance compared to less efficient systems, and it can be trained using any off-the-shelf sequence labeling library.

21 Oct 2025

computer-science artificial-intelligence information-theory

Model-based Implicit Neural Representation for sub-wavelength Radio Localization

CNRS

Chalmers University of Technology Mitsubishi Electric R&D Centre Europe INSA Rennes b<>com Univ-Rennes

The increasing deployment of large antenna arrays at base stations has significantly improved the spatial resolution and localization accuracy of radio-localization methods. However, traditional signal processing techniques struggle in complex radio environments, particularly in scenarios dominated by non-line-of-sight (NLoS) propagation paths, resulting in degraded localization accuracy. Recent advances in machine learning have enabled machine learning-assisted localization techniques that enhance localization accuracy in complex radio environments. However, these methods often involve substantial computational complexity during both the training and inference phases. This work extends the well-established fingerprinting-based localization framework by simultaneously reducing its memory requirements and improving its accuracy. Specifically, a model-based neural network is used to learn the location-to-channel mapping, and then serves as a generative neural channel model. This generative model augments the fingerprinting comparison dictionary while reducing the memory requirements. The proposed method outperforms fingerprinting baselines by achieving sub-wavelength localization accuracy, even in complex static NLoS environments. Remarkably, it offers an improvement of several orders of magnitude in localization accuracy, while simultaneously reducing memory requirements by an order of magnitude compared to classical fingerprinting methods.
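
Classical fingerprinting, the framework this work extends, can be sketched as a nearest-neighbor lookup: a measured channel is compared against a dictionary of (location, channel) pairs, and the location of the closest entry is returned. The sketch below is only an illustration of that baseline. The dictionary contents, the squared-Euclidean comparison, and every name in it are assumptions; the paper's generative channel model and its actual comparison metric are not reproduced here.

```python
def localize(measured, dictionary):
    """Return the location whose stored channel fingerprint is closest to `measured`.

    `dictionary` is a list of (location, fingerprint) pairs, where each
    fingerprint is a list of complex channel coefficients (hypothetical data).
    """
    def sq_dist(a, b):
        # Squared Euclidean distance between two complex vectors.
        return sum(abs(x - y) ** 2 for x, y in zip(a, b))

    best_location, _ = min(dictionary, key=lambda entry: sq_dist(measured, entry[1]))
    return best_location


# Toy dictionary: three known locations with made-up two-antenna channels.
dictionary = [
    ((0.0, 0.0), [1 + 0j, 0 + 1j]),
    ((1.0, 0.0), [0 + 1j, -1 + 0j]),
    ((0.0, 1.0), [-1 + 0j, 0 - 1j]),
]

# A noisy measurement taken near the second location.
measured = [0.1 + 0.9j, -0.9 + 0.1j]
print(localize(measured, dictionary))
```

In this baseline, memory grows with the number of stored entries, which is the cost the abstract says a learned location-to-channel model is meant to reduce.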

11 Jul 2025

adversarial-attacks computer-science computer-vision-security

VIP: Visual Information Protection through Adversarial Attacks on Vision-Language Models

CNRS Technology Innovation Institute INSA Rennes National Higher School of Telecommunications and ICT Univ-Rennes

Recent years have witnessed remarkable progress in developing Vision-Language Models (VLMs) capable of processing both textual and visual inputs. These models have demonstrated impressive performance, leading to their widespread adoption in various applications. However, this widespread adoption raises serious concerns regarding user privacy, particularly when models inadvertently process or expose private visual information. In this work, we frame the preservation of privacy in VLMs as an adversarial attack problem. We propose a novel attack strategy that selectively conceals information within designated Regions of Interest (ROIs) in an image, effectively preventing VLMs from accessing sensitive content while preserving the semantic integrity of the remaining image. Unlike conventional adversarial attacks that often disrupt the entire image, our method maintains high coherence in unmasked areas. Experimental results across three state-of-the-art VLMs, namely LLaVA, Instruct-BLIP, and BLIP2-T5, demonstrate up to a 98% reduction in detecting targeted ROIs, while keeping global image semantics intact, as confirmed by high similarity scores between clean and adversarial outputs. We believe that this work contributes to a more privacy-conscious use of multimodal models and offers a practical tool for further research, with the source code publicly available at: this https URL.

16 Dec 2024

computer-science information-theory machine-learning

CSI Compression using Channel Charting

CNRS IETR Mitsubishi Electric R&D Centre Europe INSA Rennes Univ-Rennes

Reaping the benefits of multi-antenna communication systems in frequency division duplex (FDD) requires channel state information (CSI) reporting from mobile users to the base station (BS). Over the last decades, collecting the required amount of CSI has become very challenging owing to the dramatic increase in the number of antennas at BSs. To mitigate the overhead associated with CSI reporting, compressed CSI techniques have been proposed with the idea of recovering the original CSI at the BS from its compressed version sent by the mobile users. Channel charting is an unsupervised dimensionality reduction method that consists in building a radio-environment map from CSIs. Such a method can be considered in the context of the CSI compression problem, since a chart location is, by definition, a low-dimensional representation of the CSI. In this paper, the performance of channel charting for a task-based CSI compression application is studied. A comparison of the proposed method against baselines on realistic synthetic data is presented, showing promising results.

12 Dec 2024

computer-science computation-and-language domain-adaptation

Few-Shot Domain Adaptation for Named-Entity Recognition via Joint Constrained k-Means and Subspace Selection

CNRS IRISA

Inria INSA Rennes LMBA SCOR Université de Brest Université de Rennes Université Paris-Saclay

Named-entity recognition (NER) is a task that typically requires large annotated datasets, which limits its applicability across domains with varying entity definitions. This paper addresses few-shot NER, aiming to transfer knowledge to new domains with minimal supervision. Unlike previous approaches that rely solely on limited annotated data, we propose a weakly supervised algorithm that combines small labeled datasets with large amounts of unlabeled data. Our method extends the k-means algorithm with label supervision, cluster size constraints and domain-specific discriminative subspace selection. This unified framework achieves state-of-the-art results in few-shot NER on several English datasets.

12 Dec 2024

computer-science computation-and-language domain-adaptation

Training LayoutLM from Scratch for Efficient Named-Entity Recognition in the Insurance Domain

CNRS IRISA

Inria Univ Brest Université de Rennes INSA Rennes LMBA SCOR

Researchers at SCOR and collaborating universities trained LayoutLM from scratch on domain-specific data to improve Named-Entity Recognition (NER) in insurance documents, yielding enhanced performance and stability while demonstrating significant computational efficiency gains with reduced model complexity.

28 Aug 2025

computer-science computer-vision-and-pattern-recognition fine-tuning

Adam SLAM - the last mile of camera calibration with 3DGS

CNRS Orange Innovation INSA Rennes Univ-Rennes

The quality of the camera calibration is of major importance for evaluating progress in novel view synthesis, as a 1-pixel calibration error has a significant impact on the reconstruction quality. Since there is no ground truth for real scenes, the quality of the calibration is assessed by the quality of the novel view synthesis. This paper proposes to use a 3DGS model to fine-tune the calibration by backpropagating the novel view color loss with respect to the camera parameters. The new calibration alone brings an average improvement of 0.4 dB PSNR on the dataset used as reference by 3DGS. Fine-tuning can take a long time, and its suitability depends on how critical training time is, but for the calibration of reference scenes, such as Mip-NeRF 360, novel view quality matters most.
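
To put the 0.4 dB figure in perspective, the sketch below uses the standard PSNR definition, 10·log10(peak² / MSE), to convert a PSNR gain into the implied change in mean squared error. The function names and the peak value of 1.0 are illustrative assumptions and are not taken from the paper.

```python
import math


def psnr(mse: float, peak: float = 1.0) -> float:
    """Standard PSNR in dB for a given mean squared error and peak signal value."""
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which the MSE is multiplied when PSNR improves by `gain_db`."""
    return 10.0 ** (-gain_db / 10.0)


ratio = mse_ratio_for_gain(0.4)
print(f"A 0.4 dB PSNR gain multiplies the MSE by about {ratio:.3f}")
```

Under this definition, a 0.4 dB gain corresponds to roughly a 9% lower mean squared error, independent of the starting PSNR.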

31 Jul 2025

attention-mechanisms computer-science computation-and-language

DocPolarBERT: A Pre-trained Model for Document Understanding with Relative Polar Coordinate Encoding of Layout Structures

CNRS IRISA

Inria Univ Brest Université de Rennes INSA Rennes LMBA SCOR

We introduce DocPolarBERT, a layout-aware BERT model for document understanding that eliminates the need for absolute 2D positional embeddings. We extend self-attention to take into account text block positions in a relative polar coordinate system rather than a Cartesian one. Despite being pre-trained on a dataset more than six times smaller than the widely used IIT-CDIP corpus, DocPolarBERT achieves state-of-the-art results. These results demonstrate that a carefully designed attention mechanism can compensate for reduced pre-training data, offering an efficient and effective alternative for document understanding.
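
As a rough illustration of what a relative polar description of two text blocks could look like, the sketch below converts the offset between two block positions into a distance and an angle. How DocPolarBERT discretizes or embeds these quantities inside self-attention is not stated in the abstract; the function name, the use of a single (x, y) point per block, and the example coordinates are all assumptions.

```python
import math


def relative_polar(src, dst):
    """Return (distance, angle) of block `dst` as seen from block `src`.

    `src` and `dst` are hypothetical (x, y) block positions. The angle is in
    radians, measured from the positive x-axis, in the range (-pi, pi].
    """
    dx = dst[0] - src[0]
    dy = dst[1] - src[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)


# The same pair of blocks shifted elsewhere on the page gives the same result,
# which is why no absolute position is needed.
print(relative_polar((0, 0), (3, 4)))
print(relative_polar((10, 10), (13, 14)))
```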

07 Jan 2022

adversarial-attacks adversarial-robustness computer-science

Adversarial Example Detection for DNN Models: A Review and Experimental Comparison

CNRS University of Rennes INSA Rennes National Institute of Telecommunications and ICT

Deep learning (DL) has shown great success in many human-related tasks, which has led to its adoption in many computer vision based applications, such as security surveillance systems, autonomous vehicles and healthcare. Such safety-critical applications can only be deployed successfully once they are able to overcome safety-critical challenges. Among these challenges are the defense against and/or the detection of adversarial examples (AEs). Adversaries can carefully craft small, often imperceptible, noise called perturbations to be added to the clean image to generate the AE. The aim of an AE is to fool the DL model, which makes it a potential risk for DL applications. Many test-time evasion attacks and countermeasures, i.e., defense or detection methods, have been proposed in the literature. Moreover, a few reviews and surveys have been published that present taxonomies of the threats and the countermeasure methods, but with little focus on AE detection methods. In this paper, we focus on the image classification task and attempt to provide a survey of detection methods for test-time evasion attacks on neural network classifiers. A detailed discussion of such methods is provided, with experimental results for eight state-of-the-art detectors under different scenarios on four datasets. We also provide potential challenges and future perspectives for this research direction.

20 Mar 2025

computer-science artificial-intelligence information-theory

Physically Parameterized Differentiable MUSIC for DoA Estimation with Uncalibrated Arrays

CNRS

Chalmers University of Technology Mitsubishi Electric R&D Centre Europe INSA Rennes IETR-UMR 6164 Univ-Rennes

Direction of arrival (DoA) estimation is a common sensing problem in radar, sonar, audio, and wireless communication systems. It has gained renewed importance with the advent of the integrated sensing and communication paradigm. To fully exploit the potential of such sensing systems, it is crucial to take into account potential hardware impairments that can negatively impact the obtained performance. This study introduces a joint DoA estimation and hardware impairment learning scheme following a model-based approach. Specifically, a differentiable version of the multiple signal classification (MUSIC) algorithm is derived, allowing efficient learning of the considered impairments. The proposed approach supports both supervised and unsupervised learning strategies, showcasing its practical potential. Simulation results indicate that the proposed method successfully learns significant inaccuracies in both antenna locations and complex gains. Additionally, the proposed method outperforms the classical MUSIC algorithm in the DoA estimation task.

14 Jan 2025

adversarial-attacks computer-science computer-vision-and-pattern-recognition

Energy Backdoor Attack to Deep Neural Networks

CNRS Khalifa University IETR INSA Rennes National Higher School of Telecommunications and ICT National Institute of Health and Medical Research Univ-Rennes

The rise of deep learning (DL) has increased computing complexity and energy use, prompting the adoption of application specific integrated circuits (ASICs) for energy-efficient edge and mobile deployment. However, recent studies have demonstrated the vulnerability of these accelerators to energy attacks. Despite the development of various inference time energy attacks in prior research, backdoor energy attacks remain unexplored. In this paper, we design an innovative energy backdoor attack against deep neural networks (DNNs) operating on sparsity-based accelerators. Our attack is carried out in two distinct phases: backdoor injection and backdoor stealthiness. Experimental results using ResNet-18 and MobileNet-V2 models trained on CIFAR-10 and Tiny ImageNet datasets show the effectiveness of our proposed attack in increasing energy consumption on trigger samples while preserving the model's performance for clean/regular inputs. This demonstrates the vulnerability of DNNs to energy backdoor attacks. The source code of our attack is available at: this https URL.

04 Oct 2024

computer-science hardware-architecture cryptography-and-security

Do Not Trust Power Management: A Survey on Internal Energy-based Attacks Circumventing Trusted Execution Environments Security Properties

CNRS INSA Rennes National Cybersecurity Agency of France (ANSSI) Univ-Rennes Nantes Université

Over the past few years, several research groups have introduced innovative hardware designs for Trusted Execution Environments (TEEs), aiming to secure applications against potentially compromised privileged software, including the kernel. Since 2015, a new class of software-enabled hardware attacks leveraging energy management mechanisms has emerged. These internal energy-based attacks comprise fault, side-channel and covert channel attacks. Their aim is to bypass TEE security guarantees and expose sensitive information such as cryptographic keys. They have increased in prevalence in the past few years. Popular TEE implementations, such as ARM TrustZone and Intel SGX, incorporate countermeasures against these attacks. However, these countermeasures either hinder the capabilities of the power management mechanisms or have been shown to provide insufficient system protection. This article presents the first comprehensive knowledge survey of these attacks, along with an evaluation of countermeasures from the literature. We believe that this study will spur further community efforts towards this increasingly important type of attack.

25 Sep 2019

materials-science physics

Band-edge Exciton Fine Structure and Exciton Recombination Dynamics in Single crystals of Layered Hybrid Perovskites

CNRS

University of Groningen INSA Rennes Institut FOTON Zernike Institute for Advanced Materials Univ-Rennes

Two-dimensional (2D) perovskite materials have recently re-attracted intense research interest for applications in photovoltaics and optoelectronics. As a consequence of the dielectric and quantum confinement effects, they show strongly bound and stable excitons at room temperature. In this report, the band-edge exciton fine structure, and in particular the exciton and biexciton dynamics, in high-quality crystals of (PEA)2PbI4 are investigated. A comparison of bulk and surface exciton lifetimes yields a room-temperature surface recombination velocity of 2×10^3 cm/s and an intrinsic lifetime of 185 ns. Biexciton emission is evidenced at room temperature, with a binding energy of about 45 meV and a lifetime of 80 ps. At low temperature, exciton state splitting is observed, which is caused by the electron-hole exchange interaction. Transient photoluminescence resolves the low-lying dark exciton state, with a bright/dark splitting energy estimated to be 10 meV. This work contributes to understanding the complex scenario of the elementary photoexcitations in 2D perovskites.

06 Mar 2025

computer-science artificial-intelligence cryptography-and-security

Energy-Latency Attacks: A New Adversarial Threat to Deep Learning

CNRS Khalifa University INSA Rennes National Higher School of Telecommunications and ICT IETR-UMR 6164 Univ-Rennes

The growing computational demand for deep neural networks (DNNs) has raised concerns about their energy consumption and carbon footprint, particularly as the size and complexity of the models continue to increase. To address these challenges, energy-efficient hardware and custom accelerators have become essential. Additionally, adaptable DNNs are being developed to dynamically balance performance and efficiency. These strategies have become more common as a way to enable sustainable AI deployment. However, these efficiency-focused designs may also introduce vulnerabilities, as attackers can potentially exploit them to increase latency and energy usage by triggering their worst-case-performance scenarios. This new type of attack, called energy-latency attacks, has recently gained significant research attention, focusing on the vulnerability of DNNs to this emerging attack paradigm, which can trigger denial-of-service (DoS) attacks. This paper provides a comprehensive overview of current research on energy-latency attacks, categorizing them using the established taxonomy for traditional adversarial attacks. We explore different metrics used to measure the success of these attacks and provide an analysis and comparison of existing attack strategies. We also analyze existing defense mechanisms and highlight current challenges and potential areas for future research in this developing field. The GitHub page for this work can be accessed at this https URL.

20 Jan 2025

materials-science physics

Flexible and Efficient Semi-Empirical DFTB Parameters for Electronic Structure Prediction of 3D, 2D Iodide Perovskites and Heterostructures

CNRS University of Bremen INSA Rennes University of Mons ENSC Rennes Univ-Rennes

Density Functional Tight-Binding (DFTB), an approximate approach derived from Density Functional Theory (DFT), has the potential to pave the way for simulations of large periodic or non-periodic systems. We have specifically tailored DFTB parameters to enhance the accuracy of electronic band gap calculations in both 3D and 2D lead-iodide perovskites, at a significantly reduced computational cost relative to state-of-the-art ab initio calculations. Our electronic DFTB parameters allow computing not only the band gap but also effective masses of perovskite materials with reasonable accuracy compared to existing experimental data and state-of-the-art DFT calculations. The electronic band structures of vacancy-ordered and of lead- and iodide-deficient perovskites are also explored. Additionally, we demonstrate the efficiency of DFTB in computing electronic band alignments in perovskite heterostructures. The DFTB-based approach is anticipated to be beneficial for studying large-scale systems such as heterostructures and nanocrystals.

16 Jul 2025

computer-science computer-vision-and-pattern-recognition domain-adaptation

Prototypical Progressive Alignment and Reweighting for Generalizable Semantic Segmentation

CNRS Guangzhou University Shenzhen University INSA Rennes IETR-UMR 6164 Univ-Rennes

Generalizable semantic segmentation aims to perform well on unseen target domains, a critical challenge due to real-world applications requiring high generalizability. Class-wise prototypes, representing class centroids, serve as domain-invariant cues that benefit generalization due to their stability and semantic consistency. However, this approach faces three challenges. First, existing methods often adopt coarse prototypical alignment strategies, which may hinder performance. Second, naive prototypes computed by averaging source batch features are prone to overfitting and may be negatively affected by unrelated source data. Third, most methods treat all source samples equally, ignoring the fact that different features have varying adaptation difficulties. To address these limitations, we propose a novel framework for generalizable semantic segmentation: Prototypical Progressive Alignment and Reweighting (PPAR), leveraging the strong generalization ability of the CLIP model. Specifically, we define two prototypes: the Original Text Prototype (OTP) and Visual Text Prototype (VTP), generated via CLIP to serve as a solid base for alignment. We then introduce a progressive alignment strategy that aligns features in an easy-to-difficult manner, reducing domain gaps gradually. Furthermore, we propose a prototypical reweighting mechanism that estimates the reliability of source data and adjusts its contribution, mitigating the effect of irrelevant or harmful features (i.e., reducing negative transfer). We also provide a theoretical analysis showing the alignment between our method and domain generalization theory. Extensive experiments across multiple benchmarks demonstrate that PPAR achieves state-of-the-art performance, validating its effectiveness.
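
The "naive prototype" the abstract refers to, a class centroid obtained by averaging features, can be sketched as follows. This illustrates only that baseline, not PPAR's CLIP-generated OTP and VTP prototypes or its reweighting mechanism; the feature vectors, labels, and function name are made up for the example.

```python
from collections import defaultdict


def class_prototypes(features, labels):
    """Average the feature vectors of each class to get one centroid per class.

    `features` is a list of equal-length numeric lists and `labels` gives the
    class of each feature vector (hypothetical toy data).
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


features = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
labels = ["road", "road", "sky"]
print(class_prototypes(features, labels))
```

Because every sample contributes equally to the mean, a single unrelated or noisy source feature shifts the centroid, which is the weakness the abstract attributes to naive prototypes.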

22 Jan 2025

attention-mechanisms computer-science computation-and-language

Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation

CNRS

Inria INSA Rennes University of Rennes 2 Univ-Rennes

The attention mechanism underlies the majority of recent advances in machine learning for natural language processing. Additionally, it produces an attention map that shows the proportional influence of each input on the model's decision. Empirical studies postulate that attention maps can be provided as an explanation for model output. However, it remains an open question whether this explanation helps regular people to understand and accept the model output (the plausibility of the explanation). Recent studies show that attention weights in RNN encoders are hardly plausible because they are spread across input tokens. We thus propose three additional constraints on the learning objective function to improve the plausibility of the attention map: regularization to increase the attention weight sparsity, semi-supervision to supervise the map by a heuristic, and supervision by human annotation. Results show that all techniques can improve the attention map plausibility to some degree. We also observe that specific instructions for human annotation might have a negative effect on classification performance. Beyond the attention map, the results of experiments on text classification tasks also show that, regardless of how much gain the constraint brings, the contextualization layer plays a crucial role in finding the right space for finding plausible tokens.
