Universidad Nacional de Educación a Distancia
The ability to summarize long documents succinctly is increasingly important in daily life due to information overload, yet there is a notable lack of such summaries for Spanish documents in general, and in the legal domain in particular. In this work, we present BOE-XSUM, a curated dataset comprising 3,648 concise, plain-language summaries of documents sourced from Spain's "Boletín Oficial del Estado" (BOE), the State Official Gazette. Each entry in the dataset includes a short summary, the original text, and its document type label. We evaluate the performance of medium-sized large language models (LLMs) fine-tuned on BOE-XSUM, comparing them to general-purpose generative models in a zero-shot setting. Results show that fine-tuned models significantly outperform their non-specialized counterparts. Notably, the best-performing model, BERTIN GPT-J 6B (32-bit precision), achieves a 24% performance gain over the top zero-shot model, DeepSeek-R1 (accuracies of 41.6% vs. 33.5%).
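As a rough illustration of the fine-tuning setup described above, here is a minimal causal-LM sketch using Hugging Face transformers; the model id, dataset field names, and hyperparameters are assumptions for illustration, not the authors' exact configuration.

```python
# Minimal sketch: fine-tune a causal LM for abstractive summarization.
# Model id, field names, and hyperparameters are illustrative assumptions.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tok = AutoTokenizer.from_pretrained("bertin-project/bertin-gpt-j-6B")
tok.pad_token = tok.eos_token  # GPT-J tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained("bertin-project/bertin-gpt-j-6B")

def to_features(ex):
    # Hypothetical field names for a BOE-XSUM entry.
    prompt = f"Documento: {ex['text']}\nResumen: {ex['summary']}{tok.eos_token}"
    return tok(prompt, truncation=True, max_length=1024)

train = load_dataset("json", data_files="boe_xsum.jsonl")["train"]
train = train.map(to_features, remove_columns=train.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments("boe-xsum-ft", per_device_train_batch_size=1,
                           gradient_accumulation_steps=16, num_train_epochs=3),
    train_dataset=train,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```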
In this paper, we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part of the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires; however, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods that reduce the involvement of human labour. In this survey, we present the main concepts and methods. To this end, we differentiate between the various classes of dialogue systems (task-oriented dialogue systems, conversational dialogue systems, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for its dialogue systems and then presenting the evaluation methods for that class.
BACKGROUND: Medical large language models (LLMs) have demonstrated remarkable performance in answering medical examinations. However, the extent to which this high performance transfers to medical questions in Spanish and from a Latin American country remains unexplored. This knowledge is crucial as LLM-based medical applications gain traction in Latin America. AIMS: To build a dataset of questions from medical examinations taken by Peruvian physicians pursuing specialty training; to fine-tune an LLM on this dataset; and to evaluate and compare the accuracy of vanilla LLMs and the fine-tuned LLM. METHODS: We curated PeruMedQA, a multiple-choice question-answering (MCQA) dataset containing 8,380 questions spanning 12 medical domains (2018-2025). We selected eight medical LLMs, including medgemma-4b-it and medgemma-27b-text-it, and developed zero-shot task-specific prompts to answer the questions appropriately. We employed parameter-efficient fine-tuning (PEFT) and low-rank adaptation (LoRA) to fine-tune medgemma-4b-it on all questions except those from 2025 (test set). RESULTS: medgemma-27b-text-it outperformed all other models, achieving a proportion of correct answers exceeding 90% in several instances. LLMs with <10 billion parameters exhibited <60% correct answers, and some exams yielded results <50%. The fine-tuned version of medgemma-4b-it outperformed all LLMs with <10 billion parameters and rivaled an LLM with 70 billion parameters across various examinations. CONCLUSIONS: For medical AI applications and research that require knowledge bases from Spanish-speaking countries, or from countries with epidemiological profiles similar to Peru's, interested parties should use medgemma-27b-text-it or a fine-tuned version of medgemma-4b-it.
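The PEFT/LoRA step described in METHODS might look like the following sketch; the target modules and hyperparameters are illustrative assumptions, not the authors' settings.

```python
# Sketch of LoRA fine-tuning with the peft library; target modules and
# hyperparameters are illustrative assumptions, not the authors' settings.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("google/medgemma-4b-it")
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the low-rank adapters are trained
```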
In recent years, Large Language Models (LLMs) have rapidly gained popularity across all parts of society, including education. After initial skepticism and bans, many schools have chosen to embrace this new technology by integrating it into their curricula in the form of virtual tutors and teaching assistants. However, neither the companies developing this technology nor the public institutions involved in its implementation have set up a formal system to collect feedback from the stakeholders impacted by it. In this paper, we argue that understanding the perceptions of those directly or indirectly impacted by LLMs in the classroom, including parents and school staff, is essential for ensuring responsible use of AI in this critical domain. Our contributions are twofold. First, we propose the Contextualized Perceptions for the Adoption of LLMs in Education (Co-PALE) framework, which can be used to systematically elicit perceptions and inform whether and how LLM-based tools should be designed, developed, and deployed in the classroom. Second, we explain how our framework can be used to ground specific rubrics for eliciting the perceptions of the relevant stakeholders in view of the specific goals and context of implementation. Overall, Co-PALE is a practical step toward helping educational agents, policymakers, researchers, and technologists ensure the responsible and effective deployment of LLM-based systems across diverse learning contexts.
We combine Kirchheim's metric differentials with Cheeger charts in order to establish a non-embeddability principle for any collection $\mathcal{C}$ of Banach (or metric) spaces: if a metric measure space $X$ bi-Lipschitz embeds in some element of $\mathcal{C}$, and if every Lipschitz map $X \to Y \in \mathcal{C}$ is differentiable, then $X$ is rectifiable. This gives a simple proof of the rectifiability of Lipschitz differentiability spaces that are bi-Lipschitz embeddable in Euclidean space, due to Kell-Mondino. Our principle also implies a converse to Kirchheim's theorem: if all Lipschitz maps from a domain space to arbitrary targets are metrically differentiable, the domain is rectifiable. We moreover establish the compatibility of metric and $w^*$-differentials of maps from metric spaces in the spirit of Ambrosio-Kirchheim.
This article presents the experiments and results obtained by the GRESEL team in the IberLEF 2025 shared task PastReader: Transcribing Texts from the Past. Three types of experiments were conducted with the dual aim of participating in the task and enabling comparisons across different approaches. These included the use of a web-based OCR service, a traditional OCR engine, and a compact multimodal model. All experiments were run on consumer-grade hardware, which, despite lacking high-performance computing capacity, provided sufficient storage and stability. The results, while satisfactory, leave room for further improvement. Future work will focus on exploring new techniques and ideas using the Spanish-language dataset provided by the shared task, in collaboration with Biblioteca Nacional de España (BNE).
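For concreteness, the "traditional OCR engine" run could be sketched as below, assuming Tesseract via pytesseract as a stand-in; the abstract does not name the engine, and the file name and language pack are assumptions.

```python
# Minimal OCR sketch, assuming Tesseract via pytesseract as a stand-in
# for the unnamed "traditional OCR engine"; file name is hypothetical.
from PIL import Image
import pytesseract

page = Image.open("scan_0001.png")                     # hypothetical scanned page
text = pytesseract.image_to_string(page, lang="spa")   # Spanish language pack
print(text)
```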
We study the maximization of logarithmic utility for an insider using different anticipating techniques. Our aim is to compare the use of Russo-Vallois forward and Skorokhod integrals in this context. Theoretical analysis and illustrative numerical examples show that the Skorokhod insider outperforms the forward insider. This remarkable observation stands in contrast to the scenario involving risk-neutral traders. Furthermore, an ordinary trader could surpass both insiders if a significant negative fluctuation in the driving stochastic process leads to a sufficiently negative final value. These findings underline the intricate interplay between anticipating stochastic calculus and nonlinear utilities, which may yield non-intuitive results from the financial viewpoint.
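For orientation, a minimal statement of the underlying problem in notation of our own choosing (an assumption, not the paper's exact setup): with portfolio fraction $\pi_t$, risk-free rate $r$, and price $S_t$, the insider maximizes the expected logarithmic utility of terminal wealth,
\[
\max_{\pi}\; \mathbb{E}\!\left[\ln V_T^{\pi}\right],
\qquad
dV_t^{\pi} = V_t^{\pi}\left((1-\pi_t)\,r\,dt + \pi_t\,\frac{d^{-}S_t}{S_t}\right),
\]
where $d^{-}$ denotes the Russo-Vallois forward integral; the Skorokhod insider replaces the forward integral with the Skorokhod integral.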
This work focuses on the mathematical study of constant function market makers. We rigorously establish the conditions for optimal trading under the assumption of a quasilinear, but not necessarily convex (or concave), trade function. This generalizes previous results that relied on convexity, and also guarantees that automatic market makers designed in this way are robust against arbitrage. The theoretical results are illustrated by families of examples given by generalized means, and by numerical simulations in certain concrete cases. These simulations, along with the mathematical analysis, suggest that automatic market makers based on quasilinear trade functions might replicate the functioning of those based on convex functions, in particular regarding their resilience to arbitrage.
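As a concrete (hedged) instance of the generalized-mean examples mentioned above: a constant function market maker holds reserves $(x, y)$ and accepts a trade only if it does not decrease a trade function, such as the weighted power mean
\[
\varphi_p(x,y) = \left(w_1 x^{p} + w_2 y^{p}\right)^{1/p}, \qquad w_1 + w_2 = 1,
\]
so a trade of $\Delta x$ in for $\Delta y$ out is admissible iff $\varphi_p(x+\Delta x,\, y-\Delta y) \ge \varphi_p(x, y)$; the limit $p \to 0$ recovers the weighted geometric mean $x^{w_1} y^{w_2}$, i.e. the classic constant-product rule. Whether this family coincides with the paper's quasilinear trade functions is not asserted here.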
Free-floating planetary-mass objects (FFPs) have been detected through direct imaging within several young, nearby star-forming regions. The properties of circumstellar disks around these objects may provide a valuable probe into their origin, but such studies are currently limited by the small size of the samples explored. We aim to perform a statistical study of the occurrence of circumstellar disks down to the planetary-mass regime. We performed a systematic survey of disks among the population identified in the 5-10 Myr-old Upper Scorpius association (USC), restricted to members outside the younger, embedded Ophiuchus region and with estimated masses below 105 M_Jup. We took advantage of unWISE photometry to search for mid-infrared excesses in the WISE (W1-W2) color. We implemented a Bayesian outlier detection method that models the photospheric sequence and computes excess probabilities for each object, enabling statistically sound estimation of disk fractions. We explore disk fractions across an unprecedentedly fine mass grid, reaching down to objects as low as ~6 M_Jup assuming an age of 5 Myr, or ~8 M_Jup assuming 10 Myr, thus extending the previous lower boundary of disk fraction studies. Depending on the age, our sample includes between 17 and 40 FFPs. We confirm that the disk fraction steadily rises with decreasing mass and exceeds 30% near the substellar-to-planetary mass boundary at ~13 M_Jup. We find hints of a possible flattening of this trend around 25-45 M_Jup, potentially signaling a transition in the dominant formation processes. This change of trend should be treated with caution and needs to be confirmed with more sensitive observations. Our results are consistent with the gradual dispersal of disks over time, as disk fractions in Upper Scorpius appear systematically lower than those in younger regions.
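A minimal sketch of such a Bayesian excess-probability computation is given below, using a two-component mixture over color residuals; variable names and the specific mixture are assumptions, and the published model is likely more careful.

```python
import numpy as np

def excess_probabilities(w1, color, color_err,
                         f_out=0.1, excess_shift=0.3, deg=2):
    """Posterior probability that each object's (W1-W2) color is an excess outlier.

    Two-component mixture: a 'photosphere' Gaussian centred on a polynomial
    sequence mu(W1), and a broader 'excess' Gaussian shifted redward.
    Sketch only: the published method models the sequence more carefully.
    """
    coeffs = np.polyfit(w1, color, deg)           # crude photospheric sequence
    resid = color - np.polyval(coeffs, w1)
    sig = np.sqrt(color_err**2 + np.var(resid))   # inflate by intrinsic scatter

    def gauss(x, m, s):
        return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    like_phot = gauss(resid, 0.0, sig)
    like_exc = gauss(resid, excess_shift, 2 * sig)  # redward, broader component
    # Bayes: responsibility of the excess component given a prior outlier rate.
    return f_out * like_exc / (f_out * like_exc + (1 - f_out) * like_phot)
```

The disk fraction then follows by summing these posterior probabilities over the sample, which is what makes the estimate statistically sound rather than a hard cut in color.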
This work introduces a companion reproducible paper with the aim of allowing the exact replication of the methods, experiments, and results discussed in a previous work [5]. In that parent paper, we proposed many and varied techniques for compressing indexes that exploit the fact that highly repetitive collections consist mostly of documents that are near-copies of one another. More concretely, we describe a replication framework, called uiHRDC (universal indexes for Highly Repetitive Document Collections), that allows our original experimental setup to be easily replicated using various document collections. The corresponding experimentation is carefully explained, providing precise details about the parameters that can be tuned for each indexing solution. Finally, note that we also provide uiHRDC as a reproducibility package.
This registered report introduces the largest, and for the first time reproducible, experimental survey on biomedical sentence similarity, with the following aims: (1) to elucidate the state of the art of the problem; (2) to solve some reproducibility problems preventing the evaluation of most current methods; (3) to evaluate several unexplored sentence similarity methods; (4) to evaluate an unexplored benchmark, called Corpus-Transcriptional-Regulation; (5) to carry out a study on the impact of the pre-processing stages and Named Entity Recognition (NER) tools on the performance of the sentence similarity methods; and finally, (6) to remedy the lack of reproducibility resources for methods and experiments in this line of research. Our experimental survey is based on a single software platform that is provided with a detailed reproducibility protocol and dataset as supplementary material to allow the exact replication of all our experiments. In addition, we introduce a new aggregated string-based sentence similarity method, called LiBlock, together with eight variants of current ontology-based methods and a new pre-trained word embedding model trained on the full-text articles in the PMC-BioC corpus. Our experiments show that our novel string-based measure sets the new state of the art on the sentence similarity task in the biomedical domain and significantly outperforms all the methods evaluated herein, except one ontology-based method. Likewise, our experiments confirm that the pre-processing stages and the choice of the NER tool have a significant impact on the performance of the sentence similarity methods. We also detail some drawbacks and limitations of current methods, and warn of the need to refine the current benchmarks. Finally, a noticeable finding is that our new string-based method significantly outperforms all state-of-the-art Machine Learning models evaluated herein.
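As a generic illustration of a string-based measure (not the published LiBlock formulation, which is not reproduced here), a block (L1) distance similarity over token counts can be sketched as:

```python
from collections import Counter

def block_distance_similarity(s1: str, s2: str) -> float:
    """Block (L1/Manhattan) distance similarity over token counts.

    Illustrative stand-in only: the paper's LiBlock measure aggregates
    several string-based components and is not reproduced here.
    """
    t1, t2 = Counter(s1.lower().split()), Counter(s2.lower().split())
    vocab = set(t1) | set(t2)
    l1 = sum(abs(t1[w] - t2[w]) for w in vocab)  # total count disagreement
    total = sum(t1.values()) + sum(t2.values())
    return 1.0 - l1 / total if total else 1.0    # similarity in [0, 1]
```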
ETH Zurich; University of Cambridge; University College London; University of Edinburgh; Universität Heidelberg; Uppsala University; University of Zagreb; University of Vienna; Universitat de Barcelona; Consejo Superior de Investigaciones Científicas; University of Leicester; Universidad Complutense de Madrid; Universiteit Leiden; Observatoire de Paris (Université PSL); Université de Liège; INAF - Osservatorio Astrofisico di Torino; University of Groningen; Instituto de Astrofísica de Canarias; Universidad de Chile; European Space Agency; European Southern Observatory; Instituto de Astronomía, Universidad Nacional Autónoma de México; Observatoire de la Côte d'Azur; Instituto de Astrofísica de Andalucía-CSIC; Université de Franche-Comté; Leibniz-Institut für Astrophysik Potsdam; Katholieke Universiteit Leuven; Universidade da Coruña; INAF - Osservatorio Astrofisico di Catania; Universidade de Vigo; Royal Observatory of Belgium; Universität Bremen; Tartu Observatory; Lund Observatory; Hungarian Academy of Sciences; Observatoire de Genève; INAF - Osservatorio di Astrofisica e Scienza dello Spazio di Bologna; Warsaw University Observatory; Universidad Nacional de Educación a Distancia; Université de Besançon; Space Science Data Center - Italian Space Agency; Université catholique de Louvain; Université de Bordeaux; Université de Strasbourg; Université de Lyon; INAF - Osservatorio Astrofisico di Arcetri; INAF - Osservatorio Astronomico di Padova; Astronomisches Rechen-Institut; Université de Montpellier; Università degli Studi di Torino
Gaia Data Release 3 provides novel flux-calibrated low-resolution spectrophotometry for about 220 million sources in the wavelength range 330-1050 nm (XP spectra). Synthetic photometry directly tied to a flux in physical units can be obtained from these spectra for any passband fully enclosed in this wavelength range. We describe how synthetic photometry can be obtained from XP spectra, illustrating the performance that can be achieved under a range of different conditions - for example passband width and wavelength range - as well as the limits and the problems affecting it. Existing top-quality photometry can be reproduced to within a few per cent over a wide range of magnitudes and colours for wide and medium bands, and with up to millimag accuracy when synthetic photometry is standardised with respect to these external sources. Some examples of potential scientific application are presented, including the detection of multiple populations in globular clusters, the estimation of metallicity extended to the very metal-poor regime, and the classification of white dwarfs. A catalogue providing standardised photometry for ~220 million sources in several wide bands of widely used photometric systems is provided (Gaia Synthetic Photometry Catalogue; GSPC), as well as a catalogue of $\simeq 10^5$ white dwarfs with DA/non-DA classification obtained with a Random Forest algorithm (Gaia Synthetic Photometry Catalogue for White Dwarfs; GSPC-WD).
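The core synthetic-photometry integral is standard and can be sketched as follows; the Gaia XP machinery adds calibration and standardisation steps not reproduced here, and the units and grids below are assumptions.

```python
import numpy as np

def _trapz(y, x):
    # explicit trapezoidal rule, kept simple for NumPy-version independence
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def synthetic_flux(wl, flux, band_wl, band_T):
    """Photon-weighted mean flux of a spectrum through a passband.

    Standard synthetic-photometry integral for a photon-counting detector;
    the Gaia XP pipeline involves calibration steps not reproduced here.
    wl, flux: spectrum wavelength [nm] and flux density [W m^-2 nm^-1]
    band_wl, band_T: passband wavelength grid and transmission curve
    """
    T = np.interp(wl, band_wl, band_T, left=0.0, right=0.0)
    return _trapz(wl * T * flux, wl) / _trapz(wl * T, wl)

# A magnitude then follows as mag = -2.5 * np.log10(synthetic_flux(...)) + ZP,
# with the zero point ZP set by the photometric system (e.g. Vega or AB).
```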
The computational analysis of poetry is limited by the scarcity of tools to automatically analyze and scan poems. In multilingual settings, the problem is exacerbated as scansion and rhyme systems exist only for individual languages, making comparative studies very challenging and time-consuming. In this work, we present Alberti, the first multilingual pre-trained large language model for poetry. Through domain-specific pre-training (DSP), we further trained multilingual BERT on a corpus of over 12 million verses from 12 languages. We evaluated its performance on two structural poetry tasks: Spanish stanza type classification, and metrical pattern prediction for Spanish, English, and German. In both cases, Alberti outperforms multilingual BERT and other transformer-based models of similar size, and even achieves state-of-the-art results for German when compared to rule-based systems, demonstrating the feasibility and effectiveness of DSP in the poetry domain.
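Domain-specific pre-training of this kind is essentially continued masked-language-model training; a minimal sketch with Hugging Face transformers follows, where the corpus file and hyperparameters are illustrative assumptions.

```python
# Hypothetical continued (domain-specific) MLM pre-training on a verse corpus,
# in the spirit of the paper; file name and hyperparameters are illustrative.
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

verses = load_dataset("text", data_files={"train": "verses.txt"})["train"]
verses = verses.map(lambda b: tok(b["text"], truncation=True, max_length=128),
                    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments("alberti-dsp", per_device_train_batch_size=32,
                           num_train_epochs=3),
    train_dataset=verses,
    data_collator=DataCollatorForLanguageModeling(tok, mlm_probability=0.15),
)
trainer.train()
```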
Starting from an arbitrary full-rank state of a lattice quantum spin system, we define a "canonical purified Hamiltonian" and characterize its spectral gap in terms of a spatial mixing condition (or correlation decay) of the state. When the state considered is a Gibbs state of a local, commuting Hamiltonian at positive temperature, we show that the spectral gap of the canonical purified Hamiltonian provides a lower bound to the spectral gap of a large class of reversible generators of quantum Markov semigroup, including local and ergodic Davies generators. As an application of our construction, we show that the mixing condition is always satisfied for any finite-range 1D model, as well as by Kitaev's quantum double models.
Campisi, Zhan, Talkner and Hänggi have recently proposed [Campisi] the use of the logarithmic oscillator as an ideal Hamiltonian thermostat, both in simulations and actual experiments. However, the system exhibits several theoretical drawbacks which must be addressed if this thermostat is to be implemented effectively.
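For reference, the thermostat in question is, up to notation, the logarithmic oscillator
\[
H(x,p) = \frac{p^{2}}{2m} + T \,\ln\frac{|x|}{b},
\]
whose microcanonical temperature equals the parameter $T$ independently of the energy, with $b$ a length scale; weakly coupling a system to this oscillator is what is proposed to realize the thermostat.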
Today, it would be difficult to live a full life without polymers, especially in medicine, where their applicability is constantly expanding, giving satisfactory results without harmful effects on health. This study focused on the formation of hexagonal domains doped with AgNPs using a KrF excimer laser (λ = 248 nm) on the polyetheretherketone (PEEK) surface, which acts as an unfailing source of the antibacterial agent silver. The hexagonal structure was formed with a grid placed in front of the incident laser beam. Surfaces with immobilized silver nanoparticles (AgNPs) were observed by AFM and SEM. Changes in surface chemistry were studied by XPS. To determine the concentration of released Ag+ ions, ICP-MS analysis was used. The antibacterial tests proved the antibacterial efficacy of Ag-doped PEEK composites against Escherichia coli and Staphylococcus aureus, the most common pathogens. Because AgNPs are also known for their strong toxicity, we also included cytotoxicity tests in this study. The findings presented here contribute to the advancement of materials design in the biomedical field, offering a novel starting point for combating bacterial infections through the innovative integration of AgNPs into inert synthetic polymers.
In recent works by L. Drewnowski and I. Labuda and J. Martínez et al., non-pathological analytic $P$-ideals and non-pathological $F_\sigma$-ideals have been characterized and studied in terms of their representations by a sequence $(x_n)_n$ in a Banach space, as $\mathcal{C}((x_n)_n)$ and $\mathcal{B}((x_n)_n)$. The ideal $\mathcal{C}((x_n)_n)$ consists of the sets $A$ for which the series $\sum_{n \in A} x_n$ is unconditionally convergent, while $\mathcal{B}((x_n)_n)$ involves weak unconditional convergence. In this paper, we further study these representations and provide effective descriptions of $\mathcal{B}$- and $\mathcal{C}$-ideals in the universal spaces $C([0,1])$ and $C(2^{\mathbb{N}})$, addressing a question posed by Borodulin-Nadzieja et al. A key aspect of our study is the role of the space $c_0$ in these representations. We focus particularly on $\mathcal{B}$-representations in spaces containing many copies of $c_0$, such as $c_0$-saturated spaces of continuous functions. A central tool in our analysis is the concept of $c$-coloring ideals, which arise from homogeneous sets of continuous colorings. These ideals, generated by homogeneous sets of 2-colorings, exhibit a rich combinatorial structure. Among our results, we prove that for $d \geq 3$, the random $d$-homogeneous ideal is pathological, we construct hereditarily non-pathological universal $c$-coloring ideals, and we show that every $\mathcal{B}$-ideal represented in $C(K)$, for $K$ countable, contains a $c$-coloring ideal. Furthermore, by leveraging $c$-coloring ideals, we provide examples of $\mathcal{B}$-ideals that are not $\mathcal{B}$-representable in $c_0$. These findings highlight the interplay between combinatorial properties of ideals and their representations in Banach spaces.
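Restating the two representations from above in display form (reading the $\mathcal{B}$ condition, as is standard, as the series being weakly unconditionally Cauchy):
\[
\mathcal{C}((x_n)_n) = \Big\{ A \subseteq \mathbb{N} : \sum_{n \in A} x_n \ \text{converges unconditionally} \Big\},
\qquad
\mathcal{B}((x_n)_n) = \Big\{ A \subseteq \mathbb{N} : \sum_{n \in A} x_n \ \text{is weakly unconditionally Cauchy} \Big\}.
\]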
We extend classical work by Janusz Czelakowski on the closure properties of the class of matrix models of entailment relations - nowadays more commonly called multiple-conclusion logics - to the setting of non-deterministic matrices (Nmatrices), characterizing the Nmatrix models of an arbitrary logic through a generalization of the standard class operators to the non-deterministic setting. We highlight the main differences that appear in this more general setting, in particular: the possibility to obtain Nmatrix quotients using any compatible equivalence relation (not necessarily a congruence); the problem of determining when strict homomorphisms preserve the logic of a given Nmatrix; the fact that the operations of taking images and preimages cannot be swapped, which determines the exact sequence of operators that generates, from any complete semantics, the class of all Nmatrix models of a logic. Many results, on the other hand, generalize smoothly to the non-deterministic setting: we show for instance that a logic is finitely based if and only if both the class of its Nmatrix models and its complement are closed under ultraproducts. We conclude by mentioning possible developments in adapting the Abstract Algebraic Logic approach to logics induced by Nmatrices and the associated equational reasoning over non-deterministic algebras.
Context. Substellar objects, including brown dwarfs and free-floating planetary-mass objects, are a significant product of star formation. Their sensitivity to initial conditions and early dynamical evolution makes them especially valuable for studying planetary and stellar formation processes. Aims. We search for brown dwarfs and isolated planetary-mass objects in a young star-forming region to better constrain their formation mechanisms. Methods. We took advantage of Euclid's unprecedented sensitivity, spatial resolution, and wide field of view to search for brown dwarfs and free-floating planetary-mass objects in the LDN 1495 region of the Taurus molecular clouds. We combined the recent Euclid Early Release Observations with older, very deep ground-based images obtained over more than 20 yr to derive proper motions and multiwavelength photometry, and to select members based on their morphology and their position in a proper-motion diagram and in nine color-magnitude diagrams. Results. We identified 15 point sources whose proper motions, colors, and luminosity are consistent with membership in LDN 1495. Six of these objects were already known M9-L1 members. The remaining nine are newly identified sources whose spectral types might range from late M to early T, with masses potentially as low as 1-2 M_Jup based on their luminosity and according to evolutionary models. However, follow-up observations are needed to confirm their nature, spectral type, and membership. When extrapolated to the entire Taurus star-forming region, this result suggests the potential presence of several dozen free-floating planetary-mass objects.
Context. Gaia Data Release 3 contains astrometry and photometry results for about 1.8 billion sources based on observations collected by the European Space Agency (ESA) Gaia satellite during the first 34 months of its operational phase (the same period covered by Gaia Early Data Release 3; Gaia EDR3). Low-resolution spectra for 220 million sources are one of the important new data products included in this release. Aims. In this paper, we focus on the external calibration of the low-resolution spectroscopic content, describing the input data, algorithms, data processing, and the validation of the results. Particular attention is given to the quality of the data and to a number of features that users may need to take into account to make the best use of the catalogue. Methods. We calibrated an instrument model to relate mean Gaia spectra to the corresponding spectral energy distributions using an extended set of calibrators: this includes modelling of the instrument dispersion relation, transmission, and line spread functions. Optimisation of the model is achieved through total least-squares regression, accounting for errors in both Gaia and external spectra. Results. The resulting instrument model can be used for forward modelling of Gaia spectra or for inverse modelling of externally calibrated spectra in absolute flux units. Conclusions. The absolute calibration derived in this paper provides an essential ingredient for users of BP/RP spectra: it allows them to connect BP/RP spectra to absolute fluxes and physical wavelengths.
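The total least-squares step admits a compact classical sketch (the SVD construction of Golub and Van Loan); the actual instrument-model fit is a much richer, regularized version of this idea, and the code below is only a minimal stand-in.

```python
import numpy as np

def tls(A, b):
    """Total least-squares solution of A x ~ b (errors in both A and b).

    Classic SVD construction (Golub & Van Loan); the Gaia instrument-model
    optimisation is a far richer, regularized version of this idea.
    """
    n = A.shape[1]
    Z = np.column_stack([A, b])          # augmented data matrix [A | b]
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    v = Vt[-1]                           # right singular vector of smallest sigma
    if abs(v[n]) < 1e-12:
        raise np.linalg.LinAlgError("TLS solution does not exist")
    return -v[:n] / v[n]                 # normalize so the b-component is -1
```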