ENSICAEN
Generative models, such as DALL-E, Midjourney, and Stable Diffusion, have societal implications that extend beyond the field of computer science. These models require large image databases such as LAION-2B, which contains two billion images. At this scale, manual inspection is difficult and automated analysis is challenging. Moreover, recent studies show that duplicated images pose copyright problems for models trained on LAION-2B, which hinders its usability. This paper proposes an algorithmic chain that runs with modest compute resources and compresses CLIP features to enable efficient duplicate detection, even for vast image volumes. Our approach shows that roughly 700 million images, or about 30% of LAION-2B, are likely duplicates. Our method also provides histograms of duplication for this dataset, which we use to reveal more examples of verbatim copies by Stable Diffusion and to further justify the approach. The current version of the de-duplicated set will be distributed online.
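The compress-then-match step can be sketched with random sign projections. The CLIP embeddings are simulated here by random vectors, and the feature dimension, 64-bit code length, and Hamming threshold are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for CLIP image embeddings (a real pipeline would run a CLIP encoder);
# the feature dimension, code length, and threshold below are illustrative.
n, d, bits = 500, 512, 64
feats = rng.normal(size=(n, d))
feats[1] = feats[0] + 0.01 * rng.normal(size=d)   # plant a near-duplicate of image 0
feats /= np.linalg.norm(feats, axis=1, keepdims=True)

# Compress: project onto random hyperplanes and keep only the signs (64 bits/image).
planes = rng.normal(size=(d, bits))
codes = feats @ planes > 0

# Duplicate candidates = pairs of images whose binary codes almost agree.
hamming = (codes[:, None, :] != codes[None, :, :]).sum(axis=-1)
dup_pairs = np.argwhere(np.triu(hamming <= 4, k=1))
print(dup_pairs)   # recovers the planted near-duplicate pair
```

At LAION scale the all-pairs comparison would of course be replaced by an index over the compact codes; the point here is only that tiny codes preserve near-duplicate structure.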
This paper addresses the problem of exemplar-based texture synthesis. We introduce NIFTY, a hybrid framework that combines recent insights on diffusion models trained with convolutional neural networks with classical patch-based texture optimization techniques. NIFTY is a non-parametric flow-matching model built on non-local patch matching, which avoids the need for neural network training while alleviating common shortcomings of patch-based methods, such as poor initialization or visual artifacts. Experimental results demonstrate the effectiveness of the proposed approach compared to representative methods from the literature. Code is available at this https URL
Counterfactual explanations have shown promising results as a post-hoc framework for making image classifiers more explainable. In this paper, we propose DiME, a method allowing the generation of counterfactual images using recent diffusion models. By leveraging the guided generative diffusion process, our proposed methodology shows how to use the gradients of the target classifier to generate counterfactual explanations of input instances. Further, we analyze current approaches to evaluating spurious correlations and extend the evaluation measurements by proposing a new metric: Correlation Difference. Our experimental validation shows that the proposed algorithm surpasses previous state-of-the-art results on 5 out of 6 metrics on CelebA.
Recent state-of-the-art algorithms in photometric stereo rely on neural networks and operate either through prior learning or inverse rendering optimization. Here, we revisit the problem of calibrated photometric stereo by leveraging recent advances in 3D inverse rendering using the Gaussian Splatting formalism. This allows us to parameterize the 3D scene to be reconstructed and optimize it in a more interpretable manner. Our approach incorporates a simplified model for light representation and demonstrates the potential of the Gaussian Splatting rendering engine for the photometric stereo problem.
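As background, the classical calibrated Lambertian baseline such methods build on fits in a few lines: with known light directions, albedo-scaled normals follow from per-pixel least squares. The light configuration and albedo below are toy values, and the Gaussian Splatting machinery itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Ground truth for one pixel: unit normal and albedo (toy values).
n_true = np.array([0.3, -0.2, 0.93])
n_true /= np.linalg.norm(n_true)
rho = 0.7

# Known (calibrated), mostly front-facing directional lights.
lights = rng.normal(size=(12, 3)) + np.array([0.0, 0.0, 2.0])
lights /= np.linalg.norm(lights, axis=1, keepdims=True)

# Lambertian image formation: I_k = rho * max(0, n . l_k).
I = rho * np.clip(lights @ n_true, 0.0, None)

# Discard shadowed observations, then solve I = L (rho n) by least squares.
mask = I > 0
b, *_ = np.linalg.lstsq(lights[mask], I[mask], rcond=None)
n_est = b / np.linalg.norm(b)                 # recovered normal
print(np.round(n_est, 3), round(np.linalg.norm(b), 3))   # normal and albedo
```

The paper replaces this per-pixel inversion with a differentiable 3D scene parameterization, but the image-formation model being inverted is the same starting point.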
Achieving high-fidelity 3D surface reconstruction while preserving fine details remains challenging, especially in the presence of materials with complex reflectance properties and without a dense-view setup. In this paper, we introduce a versatile framework that incorporates multi-view normal and optionally reflectance maps into radiance-based surface reconstruction. Our approach employs a pixel-wise joint re-parametrization of reflectance and surface normals, representing them as a vector of radiances under simulated, varying illumination. This formulation enables seamless incorporation into standard surface reconstruction pipelines, such as traditional multi-view stereo (MVS) frameworks or modern neural volume rendering (NVR) ones. Combined with the latter, our approach achieves state-of-the-art performance on multi-view photometric stereo (MVPS) benchmark datasets, including DiLiGenT-MV, LUCES-MV and Skoltech3D. In particular, our method excels in reconstructing fine-grained details and handling challenging visibility conditions. The present paper is an extended version of the earlier conference paper by Brument et al. (in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024), featuring an accelerated and more robust algorithm as well as a broader empirical evaluation. The code and data related to this article are available at this https URL.
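The pixel-wise re-parametrization can be illustrated under a simple Lambertian assumption: a (normal, albedo) pair is encoded as the vector of radiances it would produce under a set of simulated directional lights. The light count and the reflectance model here are illustrative stand-ins, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# One pixel's attributes: a unit surface normal and a scalar albedo (toy values).
n = np.array([0.0, 0.0, 1.0])
albedo = 0.8

# Simulated varying illumination: 16 random directional lights.
lights = rng.normal(size=(16, 3))
lights /= np.linalg.norm(lights, axis=1, keepdims=True)

# Re-parametrization: (normal, albedo) -> vector of Lambertian radiances.
radiances = albedo * np.clip(lights @ n, 0.0, None)
print(radiances.shape)   # one radiance per simulated light
```

The appeal of this encoding is that a radiance vector is exactly what radiance-based reconstruction pipelines already consume, so normals and reflectance plug in without changing the pipeline.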
Counterfactual explanations and adversarial attacks have a related goal: flipping output labels with minimal perturbations, regardless of their characteristics. Yet, adversarial attacks cannot be used directly in a counterfactual explanation perspective, as such perturbations are perceived as noise and not as actionable and understandable image modifications. Building on the robust learning literature, this paper proposes an elegant method to turn adversarial attacks into semantically meaningful perturbations, without modifying the classifiers to explain. The proposed approach hypothesizes that Denoising Diffusion Probabilistic Models are excellent regularizers for avoiding high-frequency and out-of-distribution perturbations when generating adversarial attacks. The paper's key idea is to build attacks through a diffusion model to polish them. This allows studying the target model regardless of its robustification level. Extensive experimentation shows the advantages of our counterfactual explanation approach over the current state of the art in multiple testbeds.
In recent years, Transformer-based self-attention mechanisms have been successfully applied to the analysis of a variety of context-reliant data types, from texts to images and beyond, including data from non-Euclidean geometries. In this paper, we present such a mechanism, designed to classify sequences of Symmetric Positive Definite (SPD) matrices while preserving their Riemannian geometry throughout the analysis. We apply our method to automatic sleep staging on time series of EEG-derived covariance matrices from a standard dataset, obtaining high levels of stage-wise performance.
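One simple way to make attention weights respect SPD geometry, sketched here with log-Euclidean distances in place of dot products of flattened matrices (the paper's exact construction differs; the EEG covariances are synthetic):

```python
import numpy as np

def sym_logm(S):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

rng = np.random.default_rng(0)

# Toy sequence of 5 EEG-derived covariance matrices (4 channels, 30 samples each).
seq = [np.cov(rng.normal(size=(4, 30))) for _ in range(5)]
logs = np.stack([sym_logm(S) for S in seq])   # map SPD matrices to tangent space

# Pairwise squared log-Euclidean distances -> softmax-style attention weights.
d2 = ((logs[:, None] - logs[None, :]) ** 2).sum(axis=(2, 3))
att = np.exp(-d2)
att /= att.sum(axis=1, keepdims=True)
print(att.shape)   # (5, 5); each row is a distribution over the sequence
```

Because distances are computed after the matrix logarithm, similar covariance structures attend to each other in a way consistent with the manifold, rather than with the ambient Euclidean metric.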
Researchers from CentraleSupélec and Université Paris-Saclay developed hard negative mining strategies to improve metric learning for Zero-Shot Classification. Their Uncertainty/Correlation-based sampling method achieves 81.2% accuracy on the AwA dataset, outperforming random negative sampling and existing state-of-the-art methods while also exhibiting faster convergence.
In a real Hilbert space setting, we study the convergence properties of an inexact gradient algorithm featuring both viscous and Hessian-driven damping for convex differentiable optimization. In this algorithm, the gradient evaluation can be subject to deterministic and stochastic perturbations. In the deterministic case, we show that, under appropriate summability assumptions on the perturbation, our algorithm enjoys fast convergence of the objective values and of the gradients, as well as weak convergence of the iterates toward a minimizer of the objective. In the stochastic case, assuming the perturbation is zero-mean, we can weaken our summability assumptions on the error variance and establish fast convergence of the values both in expectation and almost surely. We also improve the convergence rates from $\mathcal{O}(\cdot)$ to $o(\cdot)$ in the almost sure sense, and prove an almost sure summability property of the gradients, which implies their almost sure fast convergence towards zero. We highlight the trade-off between fast convergence and the admissible regime for the sequence of errors in the gradient computations. We finally report numerical results that support our findings.
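A minimal numerical sketch of such a scheme on a toy quadratic: Nesterov-type viscous damping via the factor k/(k+3), a Hessian-driven damping term built from gradient differences, and a summable deterministic perturbation of the gradient. The step size, damping coefficient, and perturbation size are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Strongly convex quadratic test objective f(x) = 0.5 x^T A x (minimal value 0).
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x

# Inexact oracle: exact gradient plus a perturbation of size O(1/k^2),
# one simple choice satisfying a summability assumption.
def inexact_grad(x, k):
    return A @ x + rng.normal(size=2) / (k + 1) ** 2

s, beta = 0.05, 1.0                 # step size and Hessian-damping coefficient (assumed)
x_prev = x = np.array([5.0, -3.0])
g_prev = inexact_grad(x, 0)
for k in range(1, 2000):
    g = inexact_grad(x, k)
    # Viscous damping via the factor k/(k+3); Hessian-driven damping via the
    # gradient-difference correction beta * sqrt(s) * (g_k - g_{k-1}).
    y = x + k / (k + 3) * (x - x_prev) - beta * np.sqrt(s) * (g - g_prev)
    x_prev, x = x, y - s * inexact_grad(y, k)
    g_prev = g
print(f(x))   # close to the minimal value 0 despite the perturbed gradients
```

The gradient-difference term is the standard discrete surrogate for Hessian-driven damping: on this quadratic it equals beta*sqrt(s)*A(x_k - x_{k-1}) and visibly suppresses the oscillations of plain inertial dynamics.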
Magnetic fields are ubiquitous across physical systems of current interest, from the early Universe, compact astrophysical objects, and heavy-ion collisions to condensed matter systems. A proper treatment of the effects produced by magnetic fields during the dynamical evolution of these systems can help explain observables that otherwise show puzzling behavior. Furthermore, when these fields are comparable to or stronger than $\Lambda_{\mathrm{QCD}}$, they serve as excellent probes to help elucidate the physics of strongly interacting matter under extreme conditions of temperature and density. In this work we provide a comprehensive review of recent developments in the description of QED and QCD systems where magnetic-field-driven effects are important. These include the modification of static meson properties such as masses and form factors, the chiral magnetic effect, the description of anomalous transport coefficients, superconductivity in extreme magnetic fields, the properties of neutron stars, the evolution of heavy-ion collisions, as well as effects on the QCD phase diagram. We describe recent theoretical and phenomenological developments using effective models as well as LQCD methods. The work represents a state-of-the-art review of the field, motivated by presentations and discussions during the "Workshop on Strongly Interacting Matter in Strong Electromagnetic Fields" held at the European Centre for Theoretical Studies in Nuclear Physics and Related Areas (ECT*) in Trento, Italy, September 25-29, 2023.
In data-scarce scenarios, deep learning models often overfit to noise and irrelevant patterns, which limits their ability to generalize to unseen samples. To address these challenges in medical image segmentation, we introduce Diff-UMamba, a novel architecture that combines the UNet framework with the Mamba mechanism to model long-range dependencies. At the heart of Diff-UMamba is a noise-reduction module, which employs a signal-differencing strategy to suppress noisy or irrelevant activations within the encoder. This encourages the model to filter out spurious features and enhance task-relevant representations, thereby improving its focus on clinically significant regions. As a result, the architecture achieves improved segmentation accuracy and robustness, particularly in low-data settings. Diff-UMamba is evaluated on multiple public datasets, including the Medical Segmentation Decathlon (lung and pancreas) and AIIB23, demonstrating consistent performance gains of 1-3% over baseline methods across various segmentation tasks. To further assess performance under limited data conditions, additional experiments are conducted on the BraTS-21 dataset by varying the proportion of available training samples. The approach is also validated on a small internal non-small-cell lung cancer dataset for segmentation of the gross tumor volume in cone-beam CT, where it achieves a 4-5% improvement over the baseline.
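The signal-differencing idea can be caricatured in a few lines: subtract a low-frequency estimate of a feature map from the map itself, so that localized activations stand out over a slowly varying noise floor. In Diff-UMamba the subtracted branch is learned; the fixed local mean below is only a stand-in.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def noise_reduce(feat, k=3):
    """Subtract a k x k local-mean estimate from a 2-D feature map."""
    pad = k // 2
    padded = np.pad(feat, pad, mode="edge")
    low = sliding_window_view(padded, (k, k)).mean(axis=(2, 3))
    return feat - low   # keep localized activations, suppress slowly varying ones

# One sharp, task-relevant activation sitting on a constant "noise floor".
feat = np.zeros((7, 7))
feat[3, 3] = 1.0
out = noise_reduce(feat + 0.5)
print(out[3, 3], out[0, 0])   # the impulse survives; the flat floor is removed
```

A learned branch can of course model richer nuisance structure than a box filter; the sketch only shows why differencing against a low-capacity estimate sharpens task-relevant responses.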
The use of optimal transport cost for learning generative models has become popular with Wasserstein Generative Adversarial Networks (WGANs). Training of WGANs relies on a theoretical foundation: the calculation of the gradient of the optimal transport cost with respect to the generative model parameters. We first demonstrate that such a gradient may not be defined, which can result in numerical instabilities during gradient-based optimization. We address this issue by stating a valid differentiation theorem in the case of entropy-regularized transport and specify conditions under which existence is ensured. By exploiting the discrete nature of empirical data, we formulate the gradient in a semi-discrete setting and propose an algorithm for the optimization of the generative model parameters. Finally, we numerically illustrate the advantage of the proposed framework.
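A toy semi-discrete instance of this program: a one-parameter generator g_theta(z) = z + theta is fitted to empirical target samples by descending the entropic OT cost, with the gradient obtained envelope-style from the Sinkhorn transport plan. The sample sizes, regularization strength, and learning rate are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

z = rng.normal(size=200)                 # latent samples fed to the generator
y = rng.normal(size=200) + 3.0           # empirical target: shifted Gaussian

def sinkhorn_plan(x, y, eps=0.5, iters=200):
    """Entropic OT plan between two uniform empirical measures (quadratic cost)."""
    C = (x[:, None] - y[None, :]) ** 2
    K = np.exp(-C / eps)
    a = b = np.full(len(x), 1.0 / len(x))
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

theta, lr = 0.0, 0.5
for _ in range(50):
    x = z + theta
    pi = sinkhorn_plan(x, y)
    # Envelope-style gradient: d/dx_i of the transport cost is
    # sum_j pi_ij * 2 (x_i - y_j); the chain rule through g_theta sums over i.
    theta -= lr * np.sum(pi * 2.0 * (x[:, None] - y[None, :]))
print(theta)   # approaches the true shift between the two distributions
```

Here differentiability is unproblematic because of the entropic regularization; the paper's point is precisely to establish when such gradient formulas are valid.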
The progress and integration of intelligent transport systems (ITS) have been central to creating safer and more efficient transport networks. The Internet of Vehicles (IoV) has the potential to improve road safety and provide comfort to travelers. However, this technology is exposed to a variety of security vulnerabilities that malicious actors could exploit. One of the most serious threats to IoV is the Distributed Denial of Service (DDoS) attack, which could be used to disrupt traffic flow, disable communication between vehicles, or even cause accidents. In this paper, we propose a novel Deep Multimodal Learning (DML) approach for detecting DDoS attacks in IoV, addressing a critical aspect of cybersecurity in intelligent transport systems. Our proposed DML model integrates Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, enhanced by attention and gating mechanisms, with a Multi-Layer Perceptron (MLP) in a multimodal intermediate-fusion architecture. This method effectively identifies and mitigates DDoS attacks in real time by utilizing the Framework for Misbehavior Detection (F2MD) to generate a synthetic dataset, thereby overcoming the limitations of the existing Vehicular Reference Misbehavior (VeReMi) extension dataset. The proposed approach is evaluated in real time across different simulated real-world scenarios with 10%, 30%, and 50% attacker densities. The proposed DML model achieves an average accuracy of 96.63%, outperforming classical Machine Learning (ML) approaches and state-of-the-art methods, demonstrating significant efficacy and reliability in protecting vehicular networks from malicious cyber-attacks.
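At the shape level, intermediate fusion means concatenating per-modality embeddings before a shared head. The sketch below replaces the LSTM/GRU branches with plain dense layers and uses made-up, untrained weights and sizes, so it illustrates only the fusion wiring, not the trained detector.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda a: np.maximum(a, 0.0)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

x_seq = rng.normal(size=32)   # modality 1: flattened temporal traffic features
x_ctx = rng.normal(size=16)   # modality 2: contextual/positional features

# Per-modality encoders (dense stand-ins for the LSTM/GRU branches), then
# intermediate fusion: concatenate the two embeddings before the shared head.
W1 = 0.1 * rng.normal(size=(32, 8))
W2 = 0.1 * rng.normal(size=(16, 8))
h = np.concatenate([relu(x_seq @ W1), relu(x_ctx @ W2)])

# Shared MLP head producing the attack probability for this window.
W3 = 0.1 * rng.normal(size=(16, 1))
p_attack = sigmoid(h @ W3)[0]
print(h.shape, p_attack)
```

Fusing at this intermediate level (rather than at the raw input or at the final decision) lets each branch specialize before the joint representation is classified, which is the design choice the abstract describes.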
This paper addresses the challenge of generating Counterfactual Explanations (CEs): identifying and modifying the fewest features necessary to alter a classifier's prediction for a given image. Our proposed method, Text-to-Image Models for Counterfactual Explanations (TIME), is a black-box counterfactual technique based on distillation. Unlike previous methods, this approach requires only the image and its prediction, without access to the classifier's structure, parameters, or gradients. Before generating the counterfactuals, TIME introduces two distinct biases into Stable Diffusion in the form of textual embeddings: the context bias, associated with the image's structure, and the class bias, linked to class-specific features learned by the target classifier. After learning these biases, we find the optimal latent code by applying the classifier's predicted class token and regenerate the image using the target embedding as conditioning, producing the counterfactual explanation. Extensive empirical studies validate that TIME can generate explanations of comparable effectiveness even when operating in a black-box setting.
The research investigates the fragmentation mechanisms of dications of the nitrogen-substituted polycyclic aromatic hydrocarbons (PANHs) quinoline and isoquinoline under energetic ion impact, revealing isomer-specific neutral-loss channels. This work identifies HCN and HCNH$^+$ as dominant decay products, which are relevant to the formation of N-bearing species in astrochemical environments like Titan.
We propose GOTEX, a general framework for texture synthesis by optimization that constrains the statistical distribution of local features. While our model encompasses several existing texture models, we focus on the case where the comparison between feature distributions relies on optimal transport distances. We show that the semi-dual formulation of optimal transport makes it possible to control the distribution of various possible features, even when these features live in a high-dimensional space. We then study the resulting minimax optimization problem, which corresponds to a Wasserstein generative model and whose inner concave maximization problem can be solved with standard stochastic gradient methods. The alternate optimization algorithm is shown to be versatile in terms of applications, features, and architecture; in particular, it can produce high-quality synthesized textures with different sets of features. We analyze the results obtained by constraining the distribution of patches or the distribution of responses to a pre-trained VGG neural network. We show that the patch representation can retrieve the desired textural aspect in a more precise manner. We also provide a detailed comparison with state-of-the-art texture synthesis methods. The GOTEX model based on patch features is also adapted to texture inpainting and texture interpolation. Finally, we show how to use our framework to learn a feed-forward neural network that can synthesize new textures of arbitrary size on the fly. Experimental results and comparisons with mainstream methods from the literature illustrate the relevance of the generative models learned with GOTEX.
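The patch features constrained by such models are simply all p x p patches of an image, viewed as a point cloud in p²-dimensional space; extracting them is a one-liner (the image and patch size below are illustrative):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Toy 6 x 6 grayscale image; any real image array works the same way.
img = np.arange(36, dtype=float).reshape(6, 6)
p = 3

# Every p x p patch, flattened into a vector: a cloud of (6-p+1)^2 points in R^(p*p).
patches = sliding_window_view(img, (p, p)).reshape(-1, p * p)
print(patches.shape)   # (16, 9)
```

The synthesis loop would then minimize an optimal transport distance between this patch cloud and the exemplar's, which is the distribution constraint the abstract describes.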
In a recent paper [F.M. Marques et al., PRC 65 (2002) 044006], a new approach to the production and detection of free neutron clusters was proposed and applied to data acquired for the breakup of $^{14}$Be. Six events that exhibited characteristics consistent with a bound tetraneutron were observed in coincidence with $^{10}$Be fragments. Here, two issues that were not considered in the original paper are addressed: namely, the signal expected from a low-energy 4n resonance, and the detection of a bound 4n through processes other than elastic scattering by a proton. Searches complementary to the original study are also briefly noted.
The equation of state (EoS) is a necessary input for determining neutron-star global properties and relating them to one another. It is thus important to provide consistent and unified EoSs, to avoid possible biases in analyses arising from the use of inconsistent EoSs. We propose a numerical tool, CUTER, allowing the user to consistently match a nuclear-physics-informed crust to an arbitrary higher-density EoS. We present here the second version of this tool, CUTER v2. Two functionalities are available with the CUTER v2 tool, allowing the user to reconstruct either the whole (outer and inner) crust, or the outer crust only. We show that the code, which has been tested and validated for use by the astrophysical community, efficiently performs both tasks, allowing the computation of neutron-star global properties in a consistent way.
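The matching problem can be caricatured with two polytropes standing in for the crust and core EoS (arbitrary units, purely illustrative of what a consistent match means; CUTER's actual nuclear-physics treatment is far richer):

```python
import numpy as np

# Illustrative polytropic pressure laws for the crust and the core (toy units).
P_crust = lambda n: 2.0 * n ** (4.0 / 3.0)
P_core = lambda n: n ** 3

# Transition density: where the two pressures coincide, so the unified EoS is
# continuous across the crust-core boundary.
n = np.linspace(0.01, 2.0, 200000)
n_t = n[np.argmin(np.abs(P_crust(n) - P_core(n)))]
P_unified = np.where(n < n_t, P_crust(n), P_core(n))
print(n_t)   # analytic crossing of these two toy laws: 2**(3/5)
```

An inconsistent match (a jump in pressure at the boundary) is exactly the kind of artifact that biases derived global properties such as radii, which motivates a dedicated matching tool.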
In this paper, we propose a new hand gesture recognition method based on skeletal data by learning SPD matrices with neural networks. We model the hand skeleton as a graph and introduce a neural network for SPD matrix learning, taking as input the 3D coordinates of hand joints. The proposed network is based on two newly designed layers that transform a set of SPD matrices into a SPD matrix. For gesture recognition, we train a linear SVM classifier using features extracted from our network. Experimental results on a challenging dataset (Dynamic Hand Gesture dataset from the SHREC 2017 3D Shape Retrieval Contest) show that the proposed method outperforms state-of-the-art methods.
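One plausible design for a layer that maps a set of SPD matrices to a single SPD matrix is a log-Euclidean weighted mean, sketched below; the paper's actual layers differ, and the input covariances here are synthetic stand-ins for hand-joint features.

```python
import numpy as np

def sym_logm(S):
    """Matrix log of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def sym_expm(S):
    """Matrix exp of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def spd_pool(mats, weights):
    """Weighted log-Euclidean mean: set of SPD matrices -> one SPD matrix."""
    L = sum(w * sym_logm(M) for w, M in zip(weights, mats))
    return sym_expm(L)

rng = np.random.default_rng(0)
mats = [np.cov(rng.normal(size=(3, 20))) for _ in range(4)]  # toy joint covariances
out = spd_pool(mats, np.full(4, 0.25))
print(np.linalg.eigvalsh(out))   # all positive: the output is again SPD
```

Averaging in the log domain guarantees the result stays on the SPD manifold, which is the structural property such layers must preserve so that downstream SPD-based classifiers remain applicable.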
The SoLid experiment is a very-short-baseline experiment searching for oscillations of nuclear-reactor-produced antineutrinos into a sterile state. The detection principle is based on the pairing of two types of solid scintillator, polyvinyl toluene and $^6$LiF:ZnS(Ag), a new technology in this field of physics. In addition to good neutron-gamma discrimination, this setup allows the detector to be highly segmented (the basic detection unit is a cube of 5 cm side). High segmentation provides numerous advantages, including the precise localization of the Inverse Beta Decay (IBD) products, the derivation of an accurate antineutrino energy estimator, and a powerful background-reduction tool based on the topological signature of the signal. Finally, the system is read out by a network of wavelength-shifting fibres coupled to photodetectors (MPPCs). This paper describes the design of the reconstruction algorithm that allows maximum use of the granularity of the detector. The goal of the algorithm is to convert the output of the optical-fibre readout into the list of detection units from which it originated. This paper provides a performance comparison of three methods and concludes with the choice of a baseline approach for the experiment.
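The reconstruction task can be caricatured in one detector plane: each cube is crossed by one horizontal and one vertical fibre, so candidate cubes are intersections of fired fibres, and with several simultaneous energy deposits "ghost" intersections appear. Resolving such ambiguities is what the compared methods must do; the geometry below is heavily simplified and the positions are made up.

```python
# Two cubes with true energy deposits in a single (row, col) plane.
true_cubes = {(4, 11), (9, 2)}

# Each deposit lights one horizontal and one vertical wavelength-shifting fibre.
fired_h = {r for r, _ in true_cubes}   # horizontal fibres seeing light
fired_v = {c for _, c in true_cubes}   # vertical fibres seeing light

# Naive reconstruction: every intersection of fired fibres is a candidate cube.
candidates = {(r, c) for r in fired_h for c in fired_v}
ghosts = candidates - true_cubes
print(sorted(candidates))   # 4 candidates, of which 2 are ghost intersections
```

With a single deposit the intersection is unambiguous; the combinatorial growth of ghosts with multiplicity (and with light sharing between channels) is what makes the full 3D reconstruction algorithm non-trivial.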