University of Porto
When the available data for a target domain is limited, transfer learning (TL) methods can be used to develop models on related data-rich domains before deploying them on the target domain. However, these TL methods are typically designed with specific, static assumptions about the amount of available labeled and unlabeled target data. This contrasts with many real-world applications, where the availability of data and corresponding labels varies over time. Since the evaluation of TL methods is typically performed under the same static data availability assumptions, it can lead to unrealistic expectations of their performance in real-world settings. To support a more realistic evaluation and comparison of TL algorithms and models, we propose a data manipulation framework that (1) simulates varying data availability scenarios over time, (2) creates multiple domains through resampling of a given dataset, and (3) introduces inter-domain variability by applying realistic domain transformations, e.g., creating a variety of potentially time-dependent covariate and concept shifts. These capabilities enable the simulation of a large number of realistic variants of the experiments, in turn providing more information about the potential behavior of algorithms when deployed in dynamic settings. We demonstrate the usefulness of the proposed framework with a case study on a proprietary real-world suite of card payment datasets. Given the confidential nature of the case study, we also illustrate the use of the framework on the publicly available Bank Account Fraud (BAF) dataset. By providing a methodology for evaluating TL methods over time and in realistic data availability scenarios, our framework facilitates understanding of the behavior of models and algorithms. This leads to better decision making when deploying models for new domains in real-world environments.
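A toy sketch of the three capabilities listed above (our own illustration, not the paper's code): make_domains, covariate_shift, and availability_schedule are hypothetical helpers, and the "label" column name is an assumption about the input DataFrame.

```python
# Hypothetical helpers illustrating (1) time-varying label availability,
# (2) domain creation by resampling, and (3) time-dependent covariate shift.
import numpy as np
import pandas as pd

def make_domains(df: pd.DataFrame, n_domains: int, seed: int = 0) -> list:
    """Capability (2): create related domains by resampling the source dataset."""
    rng = np.random.default_rng(seed)
    return [df.sample(frac=0.5, replace=True, random_state=int(rng.integers(1_000_000)))
            for _ in range(n_domains)]

def covariate_shift(df: pd.DataFrame, column: str, drift: float, t: int) -> pd.DataFrame:
    """Capability (3): a simple time-dependent covariate shift on one feature."""
    out = df.copy()
    out[column] = out[column] + drift * t
    return out

def availability_schedule(df: pd.DataFrame, t: int, labels_per_step: int) -> pd.DataFrame:
    """Capability (1): at time step t, only the first t*labels_per_step labels are known."""
    out = df.copy()
    out.loc[out.index[t * labels_per_step:], "label"] = np.nan
    return out

# Example: three resampled domains, with one drifting feature and growing label availability.
source = pd.DataFrame({"amount": np.random.rand(1000),
                       "label": np.random.randint(0, 2, 1000).astype(float)})
domains = make_domains(source, n_domains=3)
snapshots = [availability_schedule(covariate_shift(d, "amount", drift=0.1, t=t), t, 50)
             for t, d in enumerate(domains)]
```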
Federated Learning (FL) is a distributed machine learning approach that promises privacy by keeping data on the device. However, gradient reconstruction and membership-inference attacks show that model updates still leak information. Fully Homomorphic Encryption (FHE) can address these privacy concerns, but it suffers from ciphertext expansion and imposes prohibitive overhead on resource-constrained devices. We propose the first Hybrid Homomorphic Encryption (HHE) framework for FL that pairs the PASTA symmetric cipher with the BFV FHE scheme. Clients encrypt local model updates with PASTA and send both the lightweight ciphertexts and the PASTA key (itself BFV-encrypted) to the server, which homomorphically evaluates the PASTA decryption circuit and aggregates the resulting BFV ciphertexts. A prototype implementation, developed on top of the Flower FL framework, shows that on an independently and identically distributed MNIST dataset with 12 clients and 10 training rounds, the proposed HHE system achieves 97.6% accuracy, just 1.3% below the plaintext baseline, while reducing client upload bandwidth by over 2,000x and cutting client runtime by 30% compared to a system based solely on the BFV FHE scheme. However, server computational cost increases by roughly 15,621x for each client participating in the training phase, a challenge to be addressed in future work.
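A structural sketch of the client/server flow described above, written here for illustration only; pasta_encrypt, bfv_encrypt, bfv_eval_pasta_decryption, and bfv_add are hypothetical stand-ins for a real HHE library, not actual APIs.

```python
# Placeholder primitives: a real system would use an implementation of PASTA and BFV.
from dataclasses import dataclass
from typing import Any, List

def pasta_encrypt(key: bytes, update: Any) -> bytes:
    raise NotImplementedError("placeholder for the PASTA symmetric cipher")

def bfv_encrypt(public_key: Any, data: Any) -> bytes:
    raise NotImplementedError("placeholder for BFV encryption")

def bfv_eval_pasta_decryption(encrypted_key: bytes, sym_ciphertext: bytes) -> bytes:
    raise NotImplementedError("placeholder for homomorphic evaluation of PASTA decryption")

def bfv_add(ct_a: bytes, ct_b: bytes) -> bytes:
    raise NotImplementedError("placeholder for homomorphic addition of BFV ciphertexts")

@dataclass
class ClientMessage:
    sym_ciphertext: bytes      # model update encrypted with PASTA (compact upload)
    encrypted_sym_key: bytes   # PASTA key encrypted under the BFV public key

def client_round(update: Any, pasta_key: bytes, bfv_public_key: Any) -> ClientMessage:
    # Cheap symmetric encryption on the device keeps client cost and bandwidth low.
    return ClientMessage(pasta_encrypt(pasta_key, update),
                         bfv_encrypt(bfv_public_key, pasta_key))

def server_round(messages: List[ClientMessage]) -> bytes:
    # The server transciphers each update into the BFV domain (the expensive step),
    # then aggregates the resulting ciphertexts without ever seeing plaintext updates.
    bfv_updates = [bfv_eval_pasta_decryption(m.encrypted_sym_key, m.sym_ciphertext)
                   for m in messages]
    aggregate = bfv_updates[0]
    for ct in bfv_updates[1:]:
        aggregate = bfv_add(aggregate, ct)
    return aggregate
```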
We present our shared task on text-based emotion detection, covering more than 30 languages from seven distinct language families. These languages are predominantly low-resource and are spoken across various continents. The data instances are multi-labeled with six emotion classes, with additional datasets in 11 languages annotated for emotion intensity. Participants were asked to predict labels in three tracks: (a) multi-label emotion detection, (b) emotion intensity score detection, and (c) cross-lingual emotion detection. The task attracted over 700 participants; we received final submissions from more than 200 teams and 93 system description papers. We report baseline results, along with findings on the best-performing systems, the most common approaches, and the most effective methods across different tracks and languages. The datasets for this task are publicly available (SemEval-2025 Task 11: this https URL).
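A minimal sketch of how a multi-label emotion detection submission could be scored; the six-class label set and the macro-averaged F1 choice are our assumptions about the setup, and the example arrays are invented.

```python
# Binary indicator matrices: rows are instances, columns are emotion classes.
import numpy as np
from sklearn.metrics import f1_score

EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]  # assumed label set

y_true = np.array([[1, 0, 0, 1, 0, 0],
                   [0, 0, 1, 0, 1, 0]])
y_pred = np.array([[1, 0, 0, 0, 0, 0],
                   [0, 0, 1, 0, 1, 1]])

print("macro-F1:", f1_score(y_true, y_pred, average="macro", zero_division=0))
```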
This forecasting study demonstrates that combining data from the Square Kilometre Array Observatory (SKAO) and European Southern Observatory (ESO) facilities offers significantly tighter and more robust constraints on cosmological parameters than individual surveys. Leveraging multi-tracer analyses, these synergies are projected to reduce systematic errors and parameter degeneracies, advancing precision cosmology and the exploration of physics beyond the ΛCDM model.
Virtual staining is a promising technique that uses deep generative models to recreate histological stains, providing a faster and more cost-effective alternative to traditional chemical tissue staining. Specifically for H&E-HER2 staining transfer, despite a rising trend in publications, the lack of sufficient public datasets has hindered progress on the topic. Additionally, it is currently unclear which model frameworks perform best for this particular task. In this paper, we introduce the HER2match dataset, the first publicly available dataset with the same breast cancer tissue sections stained with both H&E and HER2. Furthermore, we compare the performance of several Generative Adversarial Networks (GANs) and Diffusion Models (DMs), and implement a novel Brownian Bridge Diffusion Model (BBDM) for H&E-HER2 translation. Our findings indicate that, overall, GANs perform better than DMs, with only the BBDM achieving comparable results. We also emphasize the importance of data alignment, as all models trained on HER2match produced vastly improved visuals compared to those trained on the widely used consecutive-slide BCI dataset. This research provides a new high-quality dataset ([available upon publication acceptance]), improving both model training and evaluation. In addition, our comparison of frameworks offers valuable guidance for researchers working on the topic.
As the statistical precision of cosmological measurements increases, the accuracy of the theoretical description of these measurements needs to increase correspondingly in order to infer the underlying cosmology that governs the Universe. To this end, we have created the Cosmology Likelihood for Observables in Euclid (CLOE), which is a novel cosmological parameter inference pipeline developed within the Euclid Consortium to translate measurements and covariances into cosmological parameter constraints. In this first in a series of six papers, we describe the theoretical recipe of this code for the Euclid primary probes. These probes are composed of the photometric 3x2pt observables of cosmic shear, galaxy-galaxy lensing, and galaxy clustering, along with spectroscopic galaxy clustering. We provide this description in both Fourier and configuration space for standard and extended summary statistics, including the wide range of systematic uncertainties that affect them. This includes systematic uncertainties such as intrinsic galaxy alignments, baryonic feedback, photometric and spectroscopic redshift uncertainties, shear calibration uncertainties, sample impurities, photometric and spectroscopic galaxy biases, as well as magnification bias. The theoretical descriptions are further able to accommodate both Gaussian and non-Gaussian likelihoods and extended cosmologies with non-zero curvature, massive neutrinos, evolving dark energy, and simple forms of modified gravity. These theoretical descriptions that underpin CLOE will form a crucial component in revealing the true nature of the Universe with next-generation cosmological surveys such as Euclid.
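For reference, the Gaussian option mentioned above corresponds to the standard multivariate form below, written in generic notation (data vector d, theory prediction t(θ), covariance C); this is our illustration, not an excerpt from CLOE.

```latex
% Standard Gaussian likelihood for a data vector and its theoretical prediction;
% generic notation, not taken from the CLOE papers.
\ln \mathcal{L}(\theta) = -\tfrac{1}{2}\,
  \left[\mathbf{d} - \mathbf{t}(\theta)\right]^{\mathsf{T}}
  \mathsf{C}^{-1}
  \left[\mathbf{d} - \mathbf{t}(\theta)\right]
  + \mathrm{const.}
```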
This paper offers a systematic review of long-tailed learning methods and proposes a new, comprehensive taxonomy that categorizes techniques based on their role in the machine learning workflow. The review highlights that while progress has been made, substantial challenges remain for complex tasks and real-world applications, noting that different methods often complement each other.
This industrial case study by Critical TechWorks and the University of Porto demonstrates using Large Language Models to automate acceptance test generation for web applications in an automotive context. The two-step process, converting user stories to Gherkin scenarios and then to executable Cypress scripts, achieved 95% user-rated helpfulness for scenarios and 60% fully valid executable scripts in a real-world setting.
This literature review explores continual learning methods for on-device training, in the context of neural networks (NNs) and decision trees (DTs) for classification tasks in smart environments. We highlight key constraints, such as data architecture (batch vs. stream) and network capacity (cloud vs. edge), which shape TinyML algorithm design given the uncontrolled, natural arrival of data streams. The survey details the challenges of deploying deep learners on resource-constrained edge devices, including catastrophic forgetting, data inefficiency, and the difficulty of handling IoT tabular data in open-world settings. While decision trees are more memory-efficient for on-device training, they are limited in expressiveness and require dynamic adaptations, such as pruning and meta-learning, to handle complex patterns and concept drifts. We emphasize the importance of multi-criteria performance evaluation tailored to edge applications, assessing both output-based and internal-representation metrics. The key challenge lies in integrating these building blocks into autonomous online systems, taking into account stability-plasticity trade-offs, forward-backward transfer, and model convergence.
Deep generative models, e.g. diffusion models, produce data according to a learned representation through a process of approximation that computes possible samples. Approximation can be understood as reconstruction, and the large datasets used to train models as sets of records in which we represent the physical world with some data structure (photographs, audio recordings, manuscripts). During reconstruction, image frames, for example, develop at each timestep towards a textual input description. While moving forward in time, frame sets are shaped by learned bias, and their production, we argue here, can be considered as going back in time; not by analogy with the backward diffusion process, but because culture is specifically marked in the records. Futures of generative modelling, namely in film and the audiovisual arts, can benefit from treating diffusion systems as a process that computes the future while inevitably being tied to the past, provided the records are acknowledged as capturing fields of view at a specific time and as correlating with our own finite memory ideals. Models that generate new data distributions can target video production as signal processors, and by developing sequences through timelines we ourselves also go back to decades-old algorithmic and multi-track methodologies, revealing the predictive failure of contemporary approaches to synthesis in the moving image, relevant to composition rather than explanation.
Speckle-based fiber optic sensors are well known to offer high sensitivity, but they are strongly limited on the interrogation side by low camera frame rates and dynamic range. To address this limitation, we present a novel interrogation framework that exploits event-based vision to achieve high-throughput, high-bandwidth, and low-latency speckle analysis of a multimode optical fiber sensor. In addition, by leveraging a tensor-based decomposition of the raw event streams through multi-point calibration and machine-learning optimization, our approach also proves capable of isolating simultaneous deformations applied at distinct points. The experimental results validate the methodology by separating the signals of four piezoelectric actuators, applied at distances varying from 3 cm to 75 cm, over a 400 Hz-20 kHz range with minimal crosstalk. Finally, extending the impact of the work with an acoustic sensing proof of concept, we coupled the fiber to two plastic enclosures and recovered separable audio signals between 400 Hz and 1.8 kHz with minimal waveform distortion. Overall, these results establish event-driven speckle interrogation as a versatile platform for real-time, multi-point acoustic sensing and pave the way for its application in complex and unstructured environments in future work.
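A simplified stand-in for the decomposition step: instead of a full tensor factorization of the event stream, this sketch unmixes a (time x pixel) event-count matrix with non-negative matrix factorization on synthetic data; the real pipeline, calibration, and machine-learning optimization are not reproduced here.

```python
# Synthetic demo: each source excites its own speckle "fingerprint" over the pixels,
# and NMF recovers per-source time courses and spatial patterns.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
n_frames, n_pixels, n_sources = 2000, 256, 4

fingerprints = rng.random((n_sources, n_pixels))
activations = np.abs(np.sin(2 * np.pi * rng.uniform(0.01, 0.1, n_sources)[None, :]
                            * np.arange(n_frames)[:, None]))
events = activations @ fingerprints + 0.01 * rng.random((n_frames, n_pixels))

model = NMF(n_components=n_sources, init="nndsvda", max_iter=500, random_state=0)
recovered_activations = model.fit_transform(events)   # per-source time courses
recovered_fingerprints = model.components_            # per-source speckle patterns
print(recovered_activations.shape, recovered_fingerprints.shape)
```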
AToMIC is an LLM-driven framework automating the generation of acceptance test artifacts for industrial Flutter mobile applications, demonstrated on BMW's MyBMW app. It produces Gherkin scenarios, Page Objects, and executable UI test scripts from requirements and code changes, achieving over 95% time savings for practitioners and generating artifacts with high correctness (e.g., 93.3% valid Gherkin, 100% executable UI tests).
In everyday conversations, humans can take on different roles and adapt their vocabulary to their chosen roles. We explore whether LLMs can take on, that is impersonate, different roles when they generate text in-context. We ask LLMs to assume different personas before solving vision and language tasks. We do this by prefixing the prompt with a persona that is associated either with a social identity or domain expertise. In a multi-armed bandit task, we find that LLMs pretending to be children of different ages recover human-like developmental stages of exploration. In a language-based reasoning task, we find that LLMs impersonating domain experts perform better than LLMs impersonating non-domain experts. Finally, we test whether LLMs' impersonations are complementary to visual information when describing different categories. We find that impersonation can improve performance: an LLM prompted to be a bird expert describes birds better than one prompted to be a car expert. However, impersonation can also uncover LLMs' biases: an LLM prompted to be a man describes cars better than one prompted to be a woman. These findings demonstrate that LLMs are capable of taking on diverse roles and that this in-context impersonation can be used to uncover their hidden strengths and biases.
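A minimal sketch of the persona-prefixing setup described above; the prompt wording is illustrative and `query_llm` is a hypothetical stand-in for whatever chat/completion API is used.

```python
# Prepend a persona so the model answers "in character", then compare outputs across personas.
def impersonation_prompt(persona: str, task: str) -> str:
    return f"If you were {persona}, how would you answer the following?\n\n{task}"

def query_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a call to an actual LLM API")

task = "Describe the key visual features of a sparrow."
for persona in ["an ornithologist", "a car mechanic"]:
    print(impersonation_prompt(persona, task))
    # answer = query_llm(impersonation_prompt(persona, task))
```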
The paper describes the MetroPT data set, an outcome of an eXplainable Predictive Maintenance (XPM) project with an urban metro public transportation service in Porto, Portugal. The data was collected in 2022 to evaluate machine learning methods for online anomaly detection and failure prediction. By capturing several analog sensor signals (pressure, temperature, current consumption), digital signals (control signals, discrete signals), and GPS information (latitude, longitude, and speed), we provide a dataset that can be easily used to evaluate online machine learning methods. The dataset exhibits several interesting characteristics and can serve as a good benchmark for predictive maintenance models.
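A toy illustration of the kind of online check the dataset supports: a rolling z-score on one analog signal. The file name and the "timestamp"/"TP2" column names are assumptions about the released CSV, not guaranteed to match it exactly.

```python
# Flag readings that deviate strongly from a rolling baseline of one pressure signal.
import pandas as pd

WINDOW = 600          # samples in the sliding window
THRESHOLD = 4.0       # flag readings more than 4 sigma from the rolling mean

df = pd.read_csv("MetroPT.csv", parse_dates=["timestamp"])   # assumed file/column names
signal = df["TP2"]                                           # assumed pressure column

rolling_mean = signal.rolling(WINDOW, min_periods=WINDOW).mean()
rolling_std = signal.rolling(WINDOW, min_periods=WINDOW).std()
zscore = (signal - rolling_mean) / rolling_std

df["anomaly"] = zscore.abs() > THRESHOLD
print(df.loc[df["anomaly"], ["timestamp", "TP2"]].head())
```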
We introduce the Robustness of Hierarchically Organized Time Series (RHiOTS) framework, designed to assess the robustness of hierarchical time series forecasting models and algorithms on real-world datasets. Hierarchical time series, where lower-level forecasts must sum to upper-level ones, are prevalent in various contexts, such as retail sales across countries. Current empirical evaluations of forecasting methods are often limited to a small set of benchmark datasets, offering a narrow view of algorithm behavior. RHiOTS addresses this gap by systematically altering existing datasets and modifying the characteristics of individual series and their interrelations. It uses a set of parameterizable transformations to simulate those changes in the data distribution. Additionally, RHiOTS incorporates an innovative visualization component, turning complex, multidimensional robustness evaluation results into intuitive, easily interpretable visuals. This approach allows an in-depth analysis of algorithm and model behavior under diverse conditions. We illustrate the use of RHiOTS by analyzing the predictive performance of several algorithms. Our findings show that traditional statistical methods are more robust than state-of-the-art deep learning algorithms, except when the transformation effect is highly disruptive. Furthermore, we found no significant differences in the robustness of the algorithms when applying specific reconciliation methods, such as MinT. RHiOTS provides researchers with a comprehensive tool for understanding the nuanced behavior of forecasting algorithms, offering a more reliable basis for selecting the most appropriate method for a given problem.
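A sketch of the core idea behind the framework's transformations (our own toy version, not RHiOTS code): perturb the bottom-level series with a parameterizable transformation, then re-aggregate so the hierarchy stays coherent, i.e. lower levels still sum to upper levels.

```python
# Four bottom-level series over 100 steps; the top level is recomputed after the change.
import numpy as np

rng = np.random.default_rng(42)
bottom = rng.poisson(lam=20, size=(4, 100)).astype(float)

def jitter(series: np.ndarray, strength: float) -> np.ndarray:
    """A parameterizable transformation: multiplicative noise of a given strength."""
    return series * (1 + strength * rng.standard_normal(series.shape))

transformed_bottom = jitter(bottom, strength=0.3)
total = transformed_bottom.sum(axis=0)

# Coherence holds by construction: the aggregate equals the sum of its children.
assert np.allclose(total, transformed_bottom.sum(axis=0))
```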
Lung cancer is the deadliest type of cancer worldwide, and late detection is the major factor behind the low survival rate of patients. Low-dose computed tomography has been suggested as a potential screening tool, but manual screening is costly, time-consuming, and prone to variability. This has fueled the development of automatic methods for the detection, segmentation, and characterization of pulmonary nodules, but their application in clinical routine remains challenging. In this study, a new database for the development and testing of pulmonary nodule computer-aided strategies is presented, intended to complement current databases by giving additional focus to radiologist variability and local clinical reality. State-of-the-art nodule detection, segmentation, and characterization methods are tested and compared to manual annotations, as well as to collaborative strategies combining multiple radiologists, and radiologists with computer-aided systems. It is shown that state-of-the-art methodologies can determine a patient's follow-up recommendation as accurately as a radiologist, although the nodule detection method used shows decreased performance on this database.
Temporal information extraction (TIE) has attracted a great deal of interest over the last two decades, leading to the development of a significant number of datasets. Despite its benefits, having access to a large volume of corpora makes it difficult to benchmark TIE systems. On the one hand, different datasets have different annotation schemes, hindering the comparison of competitors across corpora. On the other hand, the fact that each corpus is commonly disseminated in a different format requires a considerable engineering effort for a researcher or practitioner to develop parsers for all of them. This constraint forces researchers to select a limited number of datasets to evaluate their systems, which consequently limits the comparability of the systems. Yet another obstacle to the comparability of TIE systems is the evaluation metric employed. While most research works adopt traditional metrics such as precision, recall, and F1, a few others prefer temporal awareness, a metric tailored to be more comprehensive in the evaluation of temporal systems. Although the reason for the absence of temporal awareness in the evaluation of most systems is not clear, one factor that certainly weighs on this decision is the need to implement the temporal closure algorithm in order to compute temporal awareness, which is neither straightforward to implement nor readily available. All in all, these problems have limited the fair comparison between approaches and, consequently, the development of temporal extraction systems. To mitigate these problems, we have developed tieval, a Python library that provides a concise interface for importing different corpora and facilitates system evaluation. In this paper, we present the first public release of tieval and highlight its most relevant features.
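A generic illustration of the "traditional" metrics discussed above, computed over sets of predicted versus gold temporal relations; this is not the tieval API, and temporal awareness (which requires the temporal closure) is deliberately omitted.

```python
# Each relation is an (entity, relation-type, entity) triple; metrics are set-based.
gold = {("e1", "BEFORE", "e2"), ("e2", "INCLUDES", "t1"), ("e3", "AFTER", "t1")}
pred = {("e1", "BEFORE", "e2"), ("e2", "INCLUDES", "t1"), ("e1", "BEFORE", "t1")}

tp = len(gold & pred)
precision = tp / len(pred) if pred else 0.0
recall = tp / len(gold) if gold else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```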
Meta-learning is increasingly used to support the recommendation of machine learning algorithms and their configurations. Such recommendations are made based on meta-data, consisting of performance evaluations of algorithms on prior datasets, as well as characterizations of these datasets. These characterizations, also called meta-features, describe properties of the data that are predictive of the performance of machine learning algorithms trained on them. Unfortunately, despite being used in a large number of studies, meta-features are not uniformly described, organized, and computed, making many empirical studies irreproducible and hard to compare. This paper aims to address this by systematizing and standardizing data characterization measures for classification datasets used in meta-learning. Moreover, it presents MFE, a new tool for extracting meta-features from datasets, identifies more subtle reproducibility issues in the literature, and proposes guidelines for data characterization that strengthen reproducible empirical research in meta-learning.
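A usage sketch, assuming the MFE tool corresponds to the pymfe Python package; if the released tool differs, treat this only as an illustration of standardized meta-feature extraction.

```python
# Extract a standardized set of meta-features from a classification dataset.
from sklearn.datasets import load_iris
from pymfe.mfe import MFE

X, y = load_iris(return_X_y=True)

mfe = MFE(groups=["general", "statistical", "info-theory"])
mfe.fit(X, y)
names, values = mfe.extract()

for name, value in zip(names, values):
    print(f"{name}: {value}")
```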
We present the discovery and characterization of two warm mini-Neptunes transiting the K3V star TOI-815 in a K-M binary system. Analysis of the spectra and rotation period reveals it to be a young star with an age of $200^{+400}_{-200}$ Myr. TOI-815b has an 11.2-day period and a radius of $2.94\pm0.05\,R_\oplus$, with transits observed by TESS, CHEOPS, ASTEP, and LCOGT. The outer planet, TOI-815c, has a radius of $2.62\pm0.10\,R_\oplus$, based on observations of three non-consecutive transits with TESS, while targeted CHEOPS photometry and radial velocity follow-up with ESPRESSO were required to confirm the 35-day period. ESPRESSO confirmed the planetary nature of both planets and measured masses of $7.6\pm1.5\,M_\oplus$ ($\rho_\mathrm{P}=1.64^{+0.33}_{-0.31}$ g cm$^{-3}$) and $23.5\pm2.4\,M_\oplus$ ($\rho_\mathrm{P}=7.2^{+1.1}_{-1.0}$ g cm$^{-3}$), respectively. Thus, the planets have very different masses, unlike the usual similarity of masses in compact multi-planet systems. Moreover, our statistical analysis of mini-Neptunes orbiting FGK stars suggests that weakly irradiated planets tend to have higher bulk densities compared to those suffering strong irradiation. This could be ascribed to their cooler atmospheres, which are more compressed and denser. Internal structure modeling of TOI-815b suggests it likely has a H-He atmosphere constituting a few percent of the total planet mass, or higher if the planet is assumed to have no water. In contrast, the measured mass and radius of TOI-815c can be explained without invoking any atmosphere, challenging planetary formation theories. Finally, we infer from our measurements that the star is viewed close to pole-on, which implies a spin-orbit misalignment at the $3\sigma$ level.
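As a quick consistency check, the quoted densities follow from the quoted masses and radii via the standard bulk-density relation, scaled by Earth's mean density (about 5.51 g cm$^{-3}$, a standard value supplied here, not taken from the abstract):

```latex
% Bulk density from the reported masses and radii, scaled to Earth values.
\rho_\mathrm{P} = \rho_\oplus\,\frac{M_\mathrm{P}/M_\oplus}{\left(R_\mathrm{P}/R_\oplus\right)^{3}}
\quad\Rightarrow\quad
\rho_{b} \approx 5.51 \times \frac{7.6}{2.94^{3}} \approx 1.6\ \mathrm{g\,cm^{-3}},
\qquad
\rho_{c} \approx 5.51 \times \frac{23.5}{2.62^{3}} \approx 7.2\ \mathrm{g\,cm^{-3}}.
```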