Nantes Université
We present a publicly available multimodal dataset for head and neck cancer research, comprising 1123 annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies from patients with histologically confirmed disease, acquired from 10 international medical centers. All studies contain co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversity from a long-term, multi-institution retrospective collection. Primary gross tumor volumes (GTVp) and involved lymph nodes (GTVn) were manually segmented by experienced radiation oncologists and radiologists following established guidelines. We provide anonymized NIfTI files, expert-annotated segmentation masks, comprehensive clinical metadata, and radiotherapy dose distributions for a patient subset. The metadata include TNM staging, HPV status, demographics, long-term follow-up outcomes, survival times, censoring indicators, and treatment information. To demonstrate its utility, we benchmark three key clinical tasks: automated tumor segmentation, recurrence-free survival prediction, and HPV status classification, using state-of-the-art deep learning models such as UNet, SegResNet, and multimodal prognostic frameworks.
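To illustrate how the released NIfTI files and expert masks can be used, here is a minimal sketch of scoring a predicted segmentation against a ground-truth mask with the Dice coefficient, the usual metric for segmentation benchmarks like the one above. The file names are hypothetical placeholders, not the dataset's actual naming scheme.

```python
# Minimal Dice evaluation of a segmentation against an expert mask.
# File names below are hypothetical placeholders.
import nibabel as nib
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice overlap between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0

truth = nib.load("patient_0001_gtvp_mask.nii.gz").get_fdata() > 0.5
pred = nib.load("patient_0001_prediction.nii.gz").get_fdata() > 0.5
print(f"Dice: {dice_score(pred, truth):.3f}")
```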
MoreHopQA is a question answering dataset designed to evaluate multi-hop reasoning in large language models by requiring generative answers and integrating arithmetic, commonsense, and symbolic reasoning. Evaluations reveal a substantial performance gap between LLMs and humans, highlighting that models often rely on shortcuts rather than performing genuine multi-step reasoning.
Large Language Models (LLMs) have demonstrated remarkable versatility in recent years, offering potential applications across specialized domains such as healthcare and medicine. Despite the availability of various open-source LLMs tailored for health contexts, adapting general-purpose LLMs to the medical domain presents significant challenges. In this paper, we introduce BioMistral, an open-source LLM tailored for the biomedical domain, utilizing Mistral as its foundation model and further pre-trained on PubMed Central. We conduct a comprehensive evaluation of BioMistral on a benchmark comprising 10 established medical question-answering (QA) tasks in English. We also explore lightweight models obtained through quantization and model merging approaches. Our results demonstrate BioMistral's superior performance compared to existing open-source medical models and its competitive edge against proprietary counterparts. Finally, to address the limited availability of data beyond English and to assess the multilingual generalization of medical LLMs, we automatically translated this benchmark into 7 other languages and evaluated it. This marks the first large-scale multilingual evaluation of LLMs in the medical domain. Datasets, multilingual evaluation benchmarks, scripts, and all the models obtained during our experiments are freely released.
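The models are stated to be freely released; assuming a Hugging Face checkpoint named BioMistral/BioMistral-7B (the repository id is our assumption), a minimal query could look like this sketch:

```python
# Hedged sketch of querying BioMistral with Hugging Face Transformers.
# The repository id "BioMistral/BioMistral-7B" is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BioMistral/BioMistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Question: What is the first-line treatment for type 2 diabetes?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```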
STONE, the current self-supervised method for tonality estimation in music signals, cannot distinguish relative keys, such as C major versus A minor. In this article, we extend the neural network architecture and learning objective of STONE to perform self-supervised learning of major and minor keys (S-KEY). Our main contribution is an auxiliary pretext task for STONE, formulated using transposition-invariant chroma features as a source of pseudo-labels. S-KEY matches the supervised state of the art in tonality estimation on the FMAKv2 and GTZAN datasets while requiring no human annotation and having the same parameter budget as STONE. We build upon this result and expand the training set of S-KEY to a million songs, thus showing the potential of large-scale self-supervised learning in music information retrieval.
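As a rough illustration of why chroma features lend themselves to transposition-invariant pseudo-labels (the exact S-KEY pretext task is more involved than this), consider the sketch below: transposing audio by k semitones circularly shifts the 12-bin chroma profile, so any shift-invariant statistic of that profile is transposition-invariant by construction.

```python
# Illustrative only: a transposition-invariant statistic of a chroma profile.
import librosa
import numpy as np

y, sr = librosa.load(librosa.ex("trumpet"))
chroma = librosa.feature.chroma_cqt(y=y, sr=sr)  # shape (12, frames)
profile = chroma.mean(axis=1)                    # time-averaged pitch-class profile

# A k-semitone transposition corresponds to np.roll(profile, k), so the
# sorted bin values form one simple transposition-invariant representation.
invariant = np.sort(profile)
print(invariant)
```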
LLM-as-a-judge provides a more accurate evaluation of extractive Question Answering (QA) performance, demonstrating an average Pearson correlation of 0.847 with human judgments, whereas traditional metrics significantly underestimate model capabilities.
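For concreteness, the reported statistic is an ordinary Pearson correlation between judge scores and human judgments; a toy computation with made-up numbers:

```python
# Toy illustration of the correlation statistic behind the claim.
# The scores below are invented for demonstration purposes.
from scipy.stats import pearsonr

human = [1, 0, 1, 1, 0, 1, 0, 1]            # binary human correctness labels
judge = [0.9, 0.2, 0.8, 1.0, 0.1, 0.7, 0.3, 0.95]  # judge-assigned scores
r, p = pearsonr(human, judge)
print(f"Pearson r = {r:.3f} (p = {p:.3g})")
```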
We report on a search for weakly interacting massive particle (WIMP) dark matter (DM) via elastic DM-xenon-nucleus interactions in the XENONnT experiment. We combine datasets from the first and second science campaigns, resulting in a total exposure of $3.1\;\text{tonne}\times\text{year}$. In a blind analysis of nuclear recoil events with energies above $3.8\,\mathrm{keV_{NR}}$, we find no significant excess above background. We set new upper limits on the spin-independent WIMP-nucleon scattering cross-section for WIMP masses above $10\,\mathrm{GeV}/c^2$, with a minimum of $1.7\times10^{-47}\,\mathrm{cm^2}$ at $90\,\%$ confidence level for a WIMP mass of $30\,\mathrm{GeV}/c^2$. We achieve a best median sensitivity of $1.4\times10^{-47}\,\mathrm{cm^2}$ for a $41\,\mathrm{GeV}/c^2$ WIMP. Compared to the result from the first XENONnT science dataset, we improve our sensitivity by a factor of up to 1.8.
We report on a blinded search for dark matter with single- and few-electron signals in the first science run of XENONnT, relying on a novel detector response framework that is physics-model-dependent. We derive 90% confidence upper limits for dark matter-electron interactions. Heavy and light mediator cases are considered for the standard halo model and for dark matter up-scattered in the Sun. We set stringent new limits on dark matter-electron scattering via a heavy mediator with a mass within $10\text{--}20\,\mathrm{MeV}/c^2$, and on electron absorption of axion-like particles and dark photons for $m_\chi$ below $0.186\,\mathrm{keV}/c^2$.
Affiliations: CNRS, University of Amsterdam, National Central University, New York University, Nikhef, University of Melbourne, INFN, University of Warsaw, Joint Institute for Nuclear Research, University of Granada, University of Genoa, Sorbonne Université, Technical University of Munich, Leiden University, University of Sheffield, Utrecht University, Cadi Ayyad University, University of Johannesburg, INAF, United Arab Emirates University, University of South Dakota, NCSR Demokritos, Lebedev Physical Institute, University of Valencia, Eberhard-Karls-Universität Tübingen, Comenius University, Georgian Technical University, Università di Bari, National Centre for Nuclear Research, Western Sydney University, Universitat Politècnica de València, Mohammed V University, Institut de Physique des 2 Infinis de Lyon, Università di Firenze, University of Salento, IFIC, University of Athens, Università degli Studi di Bari Aldo Moro, Pushchino Radio Astronomy Observatory, LUPM, LPC-Caen, IFIN-HH, Chouaïb Doukkali University, Institute of Experimental Physics, Technical University of Košice, Università di Catania, Université Sidi Mohamed Ben Abdellah, Royal Netherlands Institute for Sea Research, Université Mohammed Ier, Institut universitaire de technologie de Nantes, North-West University, Università degli Studi di Ferrara, Université de Paris, Université Grenoble Alpes, Università degli Studi di Genova, Aix-Marseille Université, Università di Salerno, Università Roma Tre, Université Paris Cité, Università La Sapienza, Université de Strasbourg, Nantes Université, Università di Padova, Università degli Studi di Firenze, Università degli Studi di Napoli Federico II, Università di Bologna
Context: The detection of the highest-energy neutrino observed to date by KM3NeT, with an estimated energy of 220 PeV, opens up new possibilities for the study and identification of the astrophysical sources responsible for a diffuse flux of such ultra-high-energy neutrinos, among which gamma-ray bursts are longstanding candidates. Aims: Based on the event KM3-230213A, we derive constraints on the baryon loading and density of the surrounding environment in models of blastwaves in long-duration gamma-ray bursts. Methods: We compute the diffuse flux from gamma-ray burst blastwaves, either expanding in a constant-density interstellar medium or developing in the radially decreasing density of a wind-like environment surrounding the gamma-ray burst progenitor star, taking into account the expected neutrino spectra and luminosity function. We use a Poisson likelihood method to constrain the blastwave model parameters by calculating the expected number of neutrino events within the 90% confidence level energy range of KM3-230213A and by using the joint exposure of KM3NeT/ARCA, IceCube and Pierre Auger. Results: We constrain the baryon loading to be $\leq \{392, 131, 39, 13\}$ at 90% confidence level, inversely proportional to a varying interstellar medium particle density of $\{1, 3, 10, 30\}\,\mathrm{cm}^{-3}$. In the wind-like environment case, the baryon loading is $\leq \{20, 50, 100\}$ at 90% confidence level, proportional to the sixth power of a varying density parameter of $\{0.05, 0.06, 0.07\}$.
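A minimal sketch of the kind of Poisson-likelihood scan described in the Methods (illustrative, not the authors' pipeline): expected counts are assumed linear in the constrained normalization, and the classical 90% CL upper limit is the largest normalization whose probability of yielding at most the observed count stays above 10%. The numbers in the example are hypothetical.

```python
# Classical Poisson upper-limit scan over a signal normalization f,
# with expected counts mu(f) = f * mu_signal_unit + mu_bkg.
import numpy as np
from scipy.stats import poisson

def poisson_upper_limit(n_obs: int, mu_signal_unit: float,
                        mu_bkg: float = 0.0, cl: float = 0.90) -> float:
    """Largest f such that P(N <= n_obs | mu(f)) >= 1 - cl."""
    fs = np.linspace(0.0, 1e3, 200001)
    mu = fs * mu_signal_unit + mu_bkg
    allowed = poisson.cdf(n_obs, mu) >= 1.0 - cl
    return float(fs[allowed].max())

# Hypothetical inputs: one observed event, 0.01 expected events per unit f.
print(poisson_upper_limit(n_obs=1, mu_signal_unit=0.01))  # ~389
```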
Due to the black-box nature of deep learning models, a range of methods has recently been developed to provide visual explanations of CNNs. Given the high cost of user studies, metrics are necessary to compare and evaluate these different methods. In this paper, we critically analyze the Deletion Area Under Curve (DAUC) and Insertion Area Under Curve (IAUC) metrics proposed by Petsiuk et al. (2018). These metrics were designed to evaluate the faithfulness of saliency maps generated by generic methods such as Grad-CAM or RISE. First, we show that the actual saliency score values given by the saliency map are ignored, as only the ranking of the scores is taken into account. This shows that these metrics are insufficient by themselves, since the visual appearance of a saliency map can change significantly without the ranking of the scores being modified. Secondly, we argue that during the computation of DAUC and IAUC, the model is presented with images that are out of the training distribution, which may lead to unreliable behavior of the model being explained. To complement DAUC/IAUC, we propose new metrics that quantify the sparsity and the calibration of explanation methods, two previously unstudied properties. Finally, we give general remarks about the metrics studied in this paper and discuss how to evaluate them in a user study.
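For reference, a minimal sketch of the deletion metric under discussion (DAUC); `model` is a placeholder callable returning the class score for an image. Note how only the ordering of saliency values enters the computation, which is exactly the first criticism raised above.

```python
# Sketch of Deletion AUC: delete pixels in decreasing saliency order and
# integrate the model's class score; a faithful map yields a low AUC.
import numpy as np

def deletion_auc(model, image: np.ndarray, saliency: np.ndarray,
                 n_steps: int = 50, baseline: float = 0.0) -> float:
    order = np.argsort(saliency.ravel())[::-1]   # only the ranking matters
    deleted = image.copy().ravel()
    scores = [model(image)]
    step = max(1, len(order) // n_steps)
    for i in range(0, len(order), step):
        deleted[order[i:i + step]] = baseline    # out-of-distribution inputs
        scores.append(model(deleted.reshape(image.shape)))
    xs = np.linspace(0.0, 1.0, len(scores))
    return float(np.trapz(scores, xs))
```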
Cosmological determinations of the number of relativistic neutrino species, $N_{\rm eff}$, are becoming increasingly accurate, and further improvements are expected both from CMB and BBN data. Given this context, we update the evaluation of $N_{\rm eff}$ and the current entropy density via the momentum-averaged approach. This allows for a numerically fast description of neutrino decoupling, easily portable to an array of new physics scenarios. We revisit all aspects of this approach, including collision terms with full electron mass dependence, finite-temperature QED corrections to the equation of state, neutrino oscillations, and the modelling of neutrino ensembles with effective chemical potentials. For integrated observables, our results differ by less than $0.04\%$ from the solution of the momentum-dependent evolution equation. We outline how to extend the approach to BSM settings, and will highlight its power in Part II. To facilitate the practical implementation, we release a Mathematica and Python code within nudec_BSM_v2, easily linkable to BBN codes.
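For reference, the quantity being computed is conventionally defined through the radiation energy density after electron-positron annihilation:

```latex
% Standard definition of N_eff via the total radiation energy density
% after e+e- annihilation, with T_gamma the photon temperature:
\rho_{\rm rad} \;=\; \rho_\gamma
\left[\,1 + \frac{7}{8}\left(\frac{4}{11}\right)^{4/3} N_{\rm eff}\right],
\qquad
\rho_\gamma \;=\; \frac{\pi^2}{15}\,T_\gamma^4 .
```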
In this paper, we introduce the Extreme Metal Vocals Dataset, which comprises a collection of recordings of extreme vocal techniques performed within the realm of heavy metal music. The dataset consists of 760 audio excerpts ranging from 1 to 30 seconds in length, totaling about 100 minutes of audio material: roughly 60 minutes of distorted voice and 40 minutes of clear voice recordings. These vocal recordings come from 27 different singers and are provided without accompanying musical instruments or post-processing effects. The distortion taxonomy within this dataset encompasses four distinct distortion techniques and three vocal effects, all performed across different pitch ranges. The performance of a state-of-the-art deep learning model is evaluated on two different classification tasks related to vocal techniques, demonstrating the potential of this resource for the audio processing community.
A collaborative effort from ENS Paris-Saclay, University of Cambridge, and Nantes Université established AdapTT, a dependent type theory demonstrating that structural type casts emerge from functorial type formers. This framework provides a unified approach that automatically derives cast laws for a broad range of inductive types, including Π, Σ, List, and W-types.
Scientific claim verification against tables typically requires predicting whether a claim is supported or refuted given a table. However, we argue that predicting the final label alone is insufficient: it reveals little about the model's reasoning and offers limited interpretability. To address this, we reframe table-text alignment as an explanation task, requiring models to identify the table cells essential for claim verification. We build a new dataset by extending the SciTab benchmark with human-annotated cell-level rationales. Annotators verify the claim label and highlight the minimal set of cells needed to support their decision. Building on the collected annotations, we propose a taxonomy for handling ambiguous cases. Our experiments show that (i) incorporating table alignment information improves claim verification performance, and (ii) most LLMs, while often predicting correct labels, fail to recover human-aligned rationales, suggesting that their predictions do not stem from faithful reasoning.
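A sketch of how cell-level rationales can be scored against the human annotations described above; representing a cell as a (row, column) pair is our assumption for illustration, not necessarily the benchmark's format.

```python
# Set-based F1 between model-predicted and gold rationale cells.
# Cells are represented as hypothetical (row, column) index pairs.
def rationale_f1(pred_cells: set, gold_cells: set) -> float:
    if not pred_cells and not gold_cells:
        return 1.0
    tp = len(pred_cells & gold_cells)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_cells)
    recall = tp / len(gold_cells)
    return 2 * precision * recall / (precision + recall)

gold = {(0, 1), (0, 3), (2, 3)}   # cells highlighted by annotators
pred = {(0, 1), (2, 3), (4, 2)}   # cells the model points to
print(f"Rationale F1: {rationale_f1(pred, gold):.2f}")
```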
The theory of representations of a crossed module is a direct generalization of the theory of representations of groups. For a finite group G, the Drinfeld quantum double of G is a Hopf algebra associated with a special case of a crossed module of finite groups. Here we study how to extend the construction of the Drinfeld quantum double to any other kind of crossed module of finite groups. This leads to a Hopf algebra D(G, H) that presents similarities with a Drinfeld double. We then study simple subalgebras of D(G, H) and give two isomorphisms for the decomposition into a product of simple subalgebras. We then study the category D(G, H)-Mod_fd of finite-dimensional modules over D(G, H), which turns out to be isomorphic to the category of finite-dimensional representations of finite crossed modules of groups. These categories being monoidal, we also study links between direct sums of simple objects and tensor products of simple objects, and give some results towards a Clebsch-Gordan formula. In this context, we present and develop the character theory for representations of crossed modules of finite groups, with detailed proofs. We then study the category itself, which leads to some ribbon invariants.
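For orientation, recall the multiplication of the ordinary Drinfeld double D(G) of a finite group on its standard basis, which the algebra D(G, H) generalizes (the product in D(G, H) itself is as defined in the paper):

```latex
% Multiplication in D(G) on the basis {\delta_g \otimes h : g, h \in G},
% where \delta_g is the indicator function of g:
(\delta_g \otimes h)\,(\delta_{g'} \otimes h')
  \;=\; \delta_{g,\; h g' h^{-1}}\;\, \delta_g \otimes h h' .
```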
Generating realistic listener facial motions in dyadic conversations remains challenging due to the high-dimensional action space and temporal dependency requirements. Existing approaches usually extract 3D Morphable Model (3DMM) coefficients and model listener motion in the 3DMM space. However, this makes the computational speed of the 3DMM a bottleneck, making it difficult to achieve real-time interactive responses. To tackle this problem, we propose Facial Action Diffusion (FAD), which introduces diffusion methods from the field of image generation to achieve efficient facial action generation. We further build the Efficient Listener Network (ELNet), specially designed to accommodate both the visual and audio information of the speaker as input. Combining FAD and ELNet, the proposed method learns effective listener facial motion representations and improves performance over state-of-the-art methods while reducing computational time by 99%.
In music information retrieval (MIR), contrastive self-supervised learning for general-purpose representation models is effective for global tasks such as automatic tagging. However, for local tasks such as chord estimation, it is widely assumed that contrastively trained general-purpose self-supervised models are inadequate and that more sophisticated SSL is necessary, e.g., masked modeling. Our paper challenges this assumption by revealing the potential of contrastive SSL paired with a transformer in local MIR tasks. We consider a lightweight vision transformer with one-dimensional patches in the time-frequency domain (ViT-1D) and train it with simple contrastive SSL through the normalized temperature-scaled cross-entropy loss (NT-Xent). Although NT-Xent operates only over the class token, we observe that, potentially thanks to weight sharing, informative musical properties emerge in ViT-1D's sequence tokens. On global tasks, the temporal average of class and sequence tokens offers a performance increase compared to the class token alone, showing useful properties in the sequence tokens. On local tasks, sequence tokens perform unexpectedly well, despite not being specifically trained for them. Furthermore, high-level musical features such as onsets emerge from layer-wise attention maps, and self-similarity matrices show that different layers capture different musical dimensions. Our paper does not focus on improving performance but advances the musical interpretation of transformers and sheds light on some overlooked abilities of contrastive SSL paired with transformers for sequence modeling in MIR.
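For reference, a minimal PyTorch rendering of the NT-Xent loss named above, in its standard SimCLR-style form (not the authors' training code): each z1[i], z2[i] is a positive pair, and all other batch entries serve as negatives.

```python
# Standard NT-Xent: cross-entropy over temperature-scaled cosine similarities.
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d)
    sim = z @ z.t() / tau                                # scaled cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))           # exclude self-pairs
    # Positive of sample i (first half) is i + n, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

loss = nt_xent(torch.randn(16, 128), torch.randn(16, 128))
```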
We develop a variant of Stein's method of comparison of generators to bound the Kolmogorov, total variation, and Wasserstein-1 distances between distributions on the real line. Our discrepancy is expressed in terms of the ratio of reverse hazard rates; it therefore remains tractable even when density derivatives are intractable. Our main application concerns the approximation of normalized extremes by Fréchet laws. In this setting, the new discrepancy provides a quantitative measure of distributional proximity in terms of the average regular variation at infinity of the underlying cumulative distribution function. We illustrate the approach through explicit computations for maxima of Pareto, Cauchy, and Burr XII distributions.
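For reference, the two standard ingredients behind this discrepancy: the reverse hazard rate attached to a cdf F with density f, and the Fréchet law with its explicit reverse hazard rate (the precise bound combining them is the paper's contribution):

```latex
% Reverse hazard rate of a distribution with cdf F and density f,
% and the Fréchet target with shape parameter alpha > 0:
\tilde r(x) \;=\; \frac{f(x)}{F(x)} \;=\; \frac{\mathrm d}{\mathrm dx}\log F(x),
\qquad
\Phi_\alpha(x) \;=\; e^{-x^{-\alpha}}, \quad x > 0,
\qquad
\tilde r_{\Phi_\alpha}(x) \;=\; \alpha\, x^{-\alpha - 1}.
```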
Implicit neural representations (INRs) have demonstrated strong capabilities in various medical imaging tasks, such as denoising, registration, and segmentation, by representing images as continuous functions, allowing complex details to be captured. For image reconstruction problems, INRs can also reduce artifacts typically introduced by conventional reconstruction algorithms. However, to the best of our knowledge, INRs have not been studied in the context of PET reconstruction. In this paper, we propose an unsupervised PET image reconstruction method based on the implicit SIREN neural network architecture using sinusoidal activation functions. Our method incorporates a forward projection model and a loss function adapted to perform PET image reconstruction directly from sinograms, without the need for large training datasets. The performance of the proposed approach was compared with that of conventional penalized likelihood methods and deep image prior (DIP) based reconstruction using brain phantom data and realistically simulated sinograms. The results show that the INR-based approach can reconstruct high-quality images with a simpler, more efficient model, offering improvements in PET image reconstruction, particularly in terms of contrast, activity recovery, and relative bias.
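A minimal sketch of the SIREN building block underlying the proposed representation, with the sinusoidal activation and initialization of Sitzmann et al.; the forward projection model and sinogram loss of the actual method are not reproduced here.

```python
# Minimal SIREN: sine activations with frequency w0 and the matching init.
import math
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    def __init__(self, in_f: int, out_f: int, w0: float = 30.0, first: bool = False):
        super().__init__()
        self.w0 = w0
        self.linear = nn.Linear(in_f, out_f)
        # First layer: U(-1/in, 1/in); hidden layers: U(-sqrt(6/in)/w0, ...).
        bound = 1.0 / in_f if first else math.sqrt(6.0 / in_f) / w0
        nn.init.uniform_(self.linear.weight, -bound, bound)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sin(self.w0 * self.linear(x))

# Map (x, y) coordinates in [-1, 1]^2 to an activity value at that point.
siren = nn.Sequential(SineLayer(2, 256, first=True),
                      SineLayer(256, 256),
                      nn.Linear(256, 1))
```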
Researchers from Nantes Université and Amazon Prime Video developed a novel metric, Resolution Cross-Over Quality Loss (RCQL), and the Live Sport Cross-Over (LSCO) dataset to precisely identify optimal resolution switching points in adaptive bitrate streaming for live sports content. Their work demonstrates that pairwise comparison subjective tests provide more accurate cross-over determinations than traditional Absolute Category Rating, and highlights that VQMs designed for overall quality do not reliably predict these specific transition points.
We report on a search for sub-GeV dark matter (DM) particles interacting with electrons using the DAMIC-M prototype detector at the Modane Underground Laboratory. The data feature a significantly lower detector single-$e^-$ rate (a factor of 50) compared to our previous search, while also accumulating a ten times larger exposure of $\sim$1.3 kg-day. DM interactions in the skipper charge-coupled devices (CCDs) are searched for as patterns of two or three consecutive pixels with a total charge between 2 and 4 $e^-$. We find 144 candidates with 2 $e^-$ and 1 candidate with 4 $e^-$, where 141.5 and 0.071, respectively, are expected from background. With no evidence of a DM signal, we place stringent constraints on DM particles with masses between 1 and 1000 $\mathrm{MeV}/c^2$ interacting with electrons through an ultra-light or heavy mediator. For large ranges of DM masses below $1\,\mathrm{GeV}/c^2$, we exclude theoretically motivated benchmark scenarios where hidden-sector particles are produced as a major component of DM in the Universe through the freeze-in or freeze-out mechanisms.