A new framework introduces 18 interpretable 'general scales' to evaluate AI capabilities and task demands, using automated LLM annotation to provide robust, non-saturating ability profiles for models. This methodology improves out-of-distribution performance prediction while offering transparency into what current benchmarks truly measure.
The vulnerability of machine learning models to adversarial attacks remains a critical security challenge. Traditional defenses, such as adversarial training, typically robustify models by minimizing a worst-case loss. However, these deterministic approaches do not account for uncertainty in the adversary's attack. While stochastic defenses placing a probability distribution on the adversary exist, they often lack statistical rigor and fail to make explicit their underlying assumptions. To resolve these issues, we introduce a formal Bayesian framework that models adversarial uncertainty through a stochastic channel, articulating all probabilistic assumptions. This yields two robustification strategies: a proactive defense enacted during training, aligned with adversarial training, and a reactive defense enacted during operations, aligned with adversarial purification. Several previous defenses can be recovered as limiting cases of our model. We empirically validate our methodology, showcasing the benefits of explicitly modeling adversarial uncertainty.
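As a rough illustration of the proactive strategy, the sketch below trains a logistic-regression classifier while averaging the adversarial loss over a prior on the attack strength, instead of minimizing a single worst-case loss. Everything here (the Exponential prior, the FGSM-style perturbation, the toy data) is an assumption for illustration, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-Gaussian-blob binary classification toy problem.
n = 200
X = np.vstack([rng.normal(-1.0, 1.0, (n // 2, 2)),
               rng.normal(+1.0, 1.0, (n // 2, 2))])
y = np.repeat([0.0, 1.0], n // 2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grads(w, X, y):
    """Gradients of the mean logistic loss w.r.t. weights and inputs."""
    p = sigmoid(X @ w)
    gw = X.T @ (p - y) / len(y)   # d loss / d w
    gx = np.outer(p - y, w)       # per-sample d loss / d x (up to 1/n)
    return gw, gx

# Proactive (training-time) defense: rather than a single worst-case attack
# budget, average the adversarial loss over a prior on the adversary's
# strength (an Exponential prior here -- an illustrative assumption).
w = np.zeros(2)
lr, n_mc = 0.5, 5
for _ in range(300):
    gw_avg = np.zeros(2)
    for _ in range(n_mc):
        eps = rng.exponential(0.3)        # sampled attack strength
        _, gx = grads(w, X, y)
        X_adv = X + eps * np.sign(gx)     # FGSM-style perturbation
        gw, _ = grads(w, X_adv, y)
        gw_avg += gw / n_mc
    w -= lr * gw_avg

acc = float(((sigmoid(X @ w) > 0.5) == y).mean())
print(f"clean accuracy after averaging over attack strengths: {acc:.2f}")
```

Deterministic defenses correspond to collapsing the prior on `eps` to a point mass at the worst-case budget, which is how earlier methods appear as limiting cases.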
The paper "Rethinking the Illusion of Thinking" meticulously replicates and refines the controversial Towers of Hanoi and River Crossing benchmarks from Apple's "The Illusion of Thinking" study. It demonstrates that Large Reasoning Models (LRMs) excel at River Crossing when configurations are solvable but still struggle with the intrinsic complexity of Towers of Hanoi, even with stepwise prompting, offering a nuanced view of their reasoning capabilities.
Despite recent progress in training spiking neural networks (SNNs) for classification, their application to continuous motor control remains limited. Here, we demonstrate that fully spiking architectures can be trained end-to-end to control robotic arms with multiple degrees of freedom in continuous environments. Our predictive-control framework combines Leaky Integrate-and-Fire dynamics with surrogate gradients, jointly optimizing a forward model for dynamics prediction and a policy network for goal-directed action. We evaluate this approach on both a planar 2D reaching task and a simulated 6-DOF Franka Emika Panda robot. Results show that SNNs can achieve stable training and accurate torque control, establishing their viability for high-dimensional motor tasks. An extensive ablation study highlights the role of initialization, learnable time constants, and regularization in shaping training dynamics. We conclude that while stable and effective control can be achieved, recurrent spiking networks remain highly sensitive to hyperparameter settings, underscoring the importance of principled design choices.
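A minimal sketch of the Leaky Integrate-and-Fire dynamics at the core of such networks (arbitrary parameters and forward simulation only; the surrogate-gradient training and control loop are omitted):

```python
import numpy as np

def lif_simulate(input_current, dt=1e-3, tau=0.02, v_th=1.0, v_reset=0.0):
    """Simulate a single Leaky Integrate-and-Fire neuron.

    Membrane potential follows dv/dt = (-v + I) / tau; a spike is emitted
    and the potential reset whenever v crosses the threshold v_th.
    """
    v = 0.0
    spikes, trace = [], []
    for I in input_current:
        v += dt / tau * (-v + I)
        if v >= v_th:
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
        trace.append(v)
    return np.array(spikes), np.array(trace)

# Constant supra-threshold drive produces regular spiking.
I = np.full(1000, 1.5)   # 1 s of input at dt = 1 ms
spikes, trace = lif_simulate(I)
print(f"spike count over 1 s: {spikes.sum()}")
```

In surrogate-gradient training, the non-differentiable threshold crossing is replaced by a smooth surrogate (e.g. a fast sigmoid) on the backward pass so the whole network can be optimized end-to-end.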
Neurons communicate with downstream systems via sparse and incredibly brief electrical pulses, or spikes. Using these events, they control various targets such as neuromuscular units, neurosecretory systems, and other neurons in connected circuits. This gave rise to the idea of spiking neurons as controllers, in which spikes are the control signal. Using instantaneous events directly as the control inputs, also called 'impulse control', is challenging as it does not scale well to larger networks and has low analytical tractability. Therefore, current spiking control usually relies on filtering the spike signal to approximate analog control. This ultimately means spiking neural networks (SNNs) have to output a continuous control signal, necessitating continuous energy input into downstream systems. Here, we circumvent the need for rate-based representations, providing a scalable method for task-specific spiking control with sparse neural activity. In doing so, we take inspiration from both optimal control and neuroscience theory, and define a spiking rule where spikes are only emitted if they bring a dynamical system closer to a target. From this principle, we derive the required connectivity for an SNN, and show that it can successfully control linear systems. We show that for physically constrained systems, predictive control is required, and the control signal ends up exploiting the passive dynamics of the downstream system to reach a target. Finally, we show that the control method scales to both high-dimensional networks and systems. Importantly, in all cases, we maintain a closed-form mathematical derivation of the network connectivity, the network dynamics and the control objective. This work advances the understanding of SNNs as biologically-inspired controllers, providing insight into how real neurons could exert control, and enabling applications in neuromorphic hardware design.
Social connections are conduits through which individuals communicate, information propagates, and diseases spread. Identifying individuals who are more likely to adopt ideas and spread them is essential in order to develop effective information campaigns, maximize the reach of resources, and fight epidemics. Influence maximization algorithms are used to identify sets of influencers. Based on extensive computer simulations on synthetic and ten diverse real-world social networks we show that seeding information using these methods creates information gaps. Our results show that these algorithms select influencers who do not disseminate information equitably, threatening to create an increasingly unequal society. To overcome this issue we devise a multi-objective algorithm which maximizes influence and information equity. Our results demonstrate it is possible to reduce vulnerability at a relatively low trade-off with respect to spread. This highlights that in our search for maximizing information we do not need to compromise on information equality.
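For context, here is a minimal sketch of standard greedy influence maximization under the independent cascade model, the kind of seeding procedure whose equity the abstract critiques. The toy graph and parameters are assumptions, and the multi-objective equity extension is not shown:

```python
import random
from collections import defaultdict

def independent_cascade(graph, seeds, p, rng):
    """One stochastic spread: each newly active node tries to activate
    each neighbor once with probability p (independent cascade model)."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

def greedy_seeds(graph, k, p=0.1, n_sims=200, seed=0):
    """Greedy seed selection: repeatedly add the node with the largest
    marginal gain in expected spread (Monte Carlo estimate)."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        best, best_spread = None, -1.0
        for cand in graph:
            if cand in chosen:
                continue
            spread = sum(len(independent_cascade(graph, chosen + [cand], p, rng))
                         for _ in range(n_sims)) / n_sims
            if spread > best_spread:
                best, best_spread = cand, spread
        chosen.append(best)
    return chosen

# Small toy graph: two loosely connected communities.
edges = [(0,1),(0,2),(1,2),(2,3),(3,4),(4,5),(3,5),(2,6),(6,7),(7,8),(6,8)]
graph = defaultdict(list)
for u, v in edges:
    graph[u].append(v); graph[v].append(u)

seeds = greedy_seeds(graph, k=2, p=0.3)
print(seeds)
```

Because the objective counts only total reach, nothing stops both seeds from landing in the best-connected community, which is precisely the equity gap the multi-objective variant is designed to close.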
Purpose: Low-field MRI systems operate at single MHz-range frequencies, where signal losses are primarily dominated by thermal noise from the radio-frequency (RF) receive coils. Achieving operation close to this limit is essential for maximizing imaging performance and signal-to-noise ratio (SNR). However, electromagnetic interference (EMI) from cabling, electronics, and patient loading often degrades system performance. Our goal is to develop and validate a practical protocol that guides users in identifying and suppressing electromagnetic noise in low-field MRI systems, enabling operation near the thermal noise limit. Methods: We present a systematic, stepwise methodology that includes diagnostic measurements, hardware isolation strategies, and good practices for cabling and shielding. Each step is validated with corresponding noise measurements under increasingly complex system configurations, both unloaded and with a human subject present. Results: Noise levels were monitored through the incremental assembly of a low-field MRI system, revealing key sources of EMI and quantifying their impact. Final configurations achieved noise within 1.5x the theoretical thermal bound with a subject in the scanner. Image reconstructions illustrate the direct relationship between system noise and image quality. Conclusion: The proposed protocol enables low-field MRI systems to operate close to fundamental noise limits in realistic conditions. The framework also provides actionable guidance for the integration of additional system components, such as gradient drivers and automatic tuning networks, without compromising SNR.
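The thermal noise limit referenced here follows the Johnson-Nyquist formula for a resistive coil. A quick estimate with illustrative coil resistance, temperature, and bandwidth (not values from the paper):

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

def thermal_noise_vrms(R, T, bandwidth):
    """Johnson-Nyquist RMS noise voltage of a resistance R (ohms)
    at temperature T (kelvin) over the given bandwidth (Hz)."""
    return math.sqrt(4 * k_B * T * R * bandwidth)

# Illustrative numbers: a 5-ohm receive coil at room temperature
# with a 10 kHz acquisition bandwidth.
v_n = thermal_noise_vrms(R=5.0, T=300.0, bandwidth=10e3)
print(f"thermal noise floor: {v_n * 1e9:.1f} nV rms")  # ~28.8 nV rms
```

Measured noise within 1.5x this floor, as the protocol achieves with a subject loaded, means EMI contributes at most a modest fraction of the total noise power.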
We present the measurement of Baryon Acoustic Oscillations (BAO) from the Lyman-$\alpha$ (Ly$\alpha$) forest of high-redshift quasars with the first-year dataset of the Dark Energy Spectroscopic Instrument (DESI). Our analysis uses over $420\,000$ Ly$\alpha$ forest spectra and their correlation with the spatial distribution of more than $700\,000$ quasars. An essential facet of this work is the development of a new analysis methodology on a blinded dataset. We conducted rigorous tests using synthetic data to ensure the reliability of our methodology and findings before unblinding. Additionally, we conducted multiple data splits to assess the consistency of the results and scrutinized various analysis approaches to confirm their robustness. For a given value of the sound horizon ($r_d$), we measure the expansion rate at $z_{\rm eff}=2.33$ with 2% precision, $H(z_{\rm eff}) = (239.2 \pm 4.8)\,(147.09~{\rm Mpc}/r_d)$ km/s/Mpc. Similarly, we present a 2.4% measurement of the transverse comoving distance to the same redshift, $D_M(z_{\rm eff}) = (5.84 \pm 0.14)\,(r_d/147.09~{\rm Mpc})$ Gpc. Together with other DESI BAO measurements at lower redshifts, these results are used in a companion paper to constrain cosmological parameters.
Advancements in multimodal Large Language Models (LLMs), such as OpenAI's GPT-4o, offer significant potential for mediating human interactions across various contexts. However, their use in areas such as persuasion, influence, and recruitment raises ethical and security concerns. To evaluate these models ethically in public influence and persuasion scenarios, we developed a prompting strategy using "Where's Waldo?" images as proxies for complex, crowded gatherings. This approach provides a controlled, replicable environment to assess the model's ability to process intricate visual information, interpret social dynamics, and propose engagement strategies while avoiding privacy concerns. By positioning Waldo as a hypothetical agent tasked with face-to-face mobilization, we analyzed the model's performance in identifying key individuals and formulating mobilization tactics. Our results show that while the model generates vivid descriptions and creative strategies, it cannot accurately identify individuals or reliably assess social dynamics in these scenarios. Nevertheless, this methodology provides a valuable framework for testing and benchmarking the evolving capabilities of multimodal LLMs in social contexts.
Spasticity is a common movement disorder symptom in individuals with cerebral palsy, hereditary spastic paraplegia, spinal cord injury and stroke, being one of the most disabling features in the progression of these diseases. Despite the potential benefit of using wearable robots to treat spasticity, their use is not currently recommended for subjects with a level of spasticity above $1^+$ on the Modified Ashworth Scale. The varying dynamics of this velocity-dependent tonic stretch reflex make it difficult to deploy safe personalized controllers. Here, we describe a novel adaptive torque controller via deep reinforcement learning (RL) for a knee exoskeleton under joint spasticity conditions, which accounts for task performance and reduction of interaction forces. To train the RL agent, we developed a digital twin, including a musculoskeletal-exoskeleton system with joint misalignment and a differentiable model of spastic reflexes for muscle activation. Results for a simulated knee extension movement showed that the agent learns to control the exoskeleton for individuals with different levels of spasticity. The proposed controller reduced the maximum torques applied to the human joint under spastic conditions by an average of 10.6% and the root-mean-square torque up to the settling time by 8.9%, compared to a conventional compliant controller.
At least half of a protostar's mass is accreted in the Class 0 phase, when the central protostar is deeply embedded in a dense, infalling envelope. We present the first systematic search for outbursts from Class 0 protostars in the Orion clouds. Using photometry from Spitzer/IRAC spanning 2004 to 2017, we detect three outbursts from Class 0 protostars with $\ge 2$ mag changes at 3.6 or 4.5 $\mu$m. This is comparable to the magnitude change of a known protostellar FU Ori outburst. Two are newly detected bursts from the protostars HOPS 12 and 124. The number of detections implies that Class 0 protostars burst every 438 yr, with a 95% confidence interval of 161 to 1884 yr. Combining Spitzer and WISE/NEOWISE data spanning 2004-2019, we show that the bursts persist for more than nine years with significant variability during each burst. Finally, we use 19-100 $\mu$m photometry from SOFIA, Spitzer and Herschel to measure the amplitudes of the bursts. Based on the burst interval, a duration of 15 yr, and the range of observed amplitudes, 3-100% of the mass accretion during the Class 0 phase occurs during bursts. In total, we show that bursts from Class 0 protostars are as frequent as, or even more frequent than, those from more evolved protostars. This is consistent with bursts being driven by instabilities in disks triggered by rapid mass infall. Furthermore, we find that bursts may be a significant, if not dominant, mode of mass accretion during the Class 0 phase.
Characterizing hydrodynamic transport in fractured rocks is essential for carbon storage and geothermal energy production. Multiscale heterogeneities lead to anomalous solute transport, with breakthrough-curve (BTC) tailing and nonlinear growth of plume moments. We study purely advective transport in synthetic fractures with prescribed relative closure $\sigma_a/\langle a \rangle$ and correlation length $L_c$. For each geometry we generate multiple realizations and solve steady, depth-averaged Stokes flow under the lubrication approximation. Flow heterogeneity persists up to $L_c$. The ensemble-averaged velocity PDFs are insensitive to $L_c$ but strongly affected by $\sigma_a/\langle a \rangle$, particularly their low-velocity power-law scaling. A time-domain random walk (TDRW) yields plume moments and outlet BTCs: the mean longitudinal position grows linearly in time, while the variance shows early ballistic scaling and a late-time regime controlled by the low-velocity power law with exponent $\alpha$, which depends on $\sigma_a/\langle a \rangle$. BTC properties, including peak broadening and tail scaling, are likewise governed by $\alpha$. We further model advection with a one-dimensional continuous-time random walk (CTRW) that uses only the velocity PDF, flow tortuosity, and $L_c$. CTRW results closely match TDRW and enable analytical predictions of asymptotic transport scalings.
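A minimal sketch of the time-domain random walk idea: particles traverse one correlation length per step, with transit times set by velocities drawn from a power-law PDF whose low-velocity exponent controls the late-time behavior. The specific PDF, parameters, and 1D setup are illustrative assumptions, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(1)

def tdrw(n_particles=20000, n_hops=200, L_c=1.0, alpha=0.5):
    """1D time-domain random walk: each hop covers a fixed correlation
    length L_c in time L_c / v, with v sampled from p(v) ~ v^(alpha-1)
    on (0, 1] via inverse-CDF sampling."""
    u = rng.random((n_particles, n_hops))
    v = u ** (1.0 / alpha)        # inverse CDF of p(v) = alpha * v^(alpha-1)
    dt = L_c / v
    t = np.cumsum(dt, axis=1)     # arrival time of each particle at each hop
    x = L_c * np.arange(1, n_hops + 1)
    return x, t

x, t = tdrw()
# The low-velocity power law makes transit times heavy-tailed: at a fixed
# outlet distance, the mean arrival time far exceeds the median, the
# signature of breakthrough-curve tailing.
print(f"mean / median arrival time at outlet: "
      f"{t[:, -1].mean():.1f} / {np.median(t[:, -1]):.1f}")
```

For `alpha < 1` the single-hop transit time has a diverging mean, which is what drives the anomalous variance growth and BTC tails the abstract describes.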
Let $S_{\alpha}$ be the multilinear square function defined on the cone with aperture $\alpha \geq 1$. In this paper, we investigate several kinds of weighted norm inequalities for $S_{\alpha}$. We first obtain a sharp weighted estimate in terms of the aperture $\alpha$ and $\vec{w} \in A_{\vec{p}}$. By means of some pointwise estimates, we also establish two-weight inequalities, including bump and entropy bump estimates, and Fefferman-Stein inequalities with arbitrary weights. Beyond that, we consider the mixed weak type estimates corresponding to Sawyer's conjecture, for which a Coifman-Fefferman inequality with the precise $A_{\infty}$ norm is proved. Finally, we present the local decay estimates using extrapolation techniques and dyadic analysis, respectively. All the conclusions aforementioned hold for the Littlewood-Paley $g^*_{\lambda}$ function. Some results are new even in the linear case.
Autonomous intelligent agents must bridge computational challenges at disparate levels of abstraction, from the low-level spaces of sensory input and motor commands to the high-level domain of abstract reasoning and planning. A key question in designing such agents is how best to instantiate the representational space that will interface between these two levels -- ideally without requiring supervision in the form of expensive data annotations. These objectives can be efficiently achieved by representing the world in terms of objects (grounded in perception and action). In this work, we present a novel, brain-inspired, deep-learning architecture that learns from pixels to interpret, control, and reason about its environment, using object-centric representations. We show the utility of our approach through tasks in synthetic environments that require a combination of (high-level) logical reasoning and (low-level) continuous control. Results show that the agent can learn emergent conditional behavioural reasoning, such as $(A \to B) \land (\neg A \to C)$, as well as logical composition $(A \to B) \land (A \to C) \vdash A \to (B \land C)$ and XOR operations, and successfully controls its environment to satisfy objectives deduced from these logical rules. The agent can adapt online to unexpected changes in its environment and is robust to mild violations of its world model, thanks to dynamic internal desired goal generation. While the present results are limited to synthetic settings (2D and 3D activated versions of dSprites), which fall short of real-world levels of complexity, the proposed architecture shows how to manipulate grounded object representations, as a key inductive bias for unsupervised learning, to enable behavioral reasoning.
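The logical-composition claim, that $(A \to B) \land (A \to C)$ entails $A \to (B \land C)$, can be checked mechanically with a truth table; a short sketch:

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q is false only when p is true and q false."""
    return (not p) or q

# Exhaustively verify the entailment (A -> B) ∧ (A -> C)  ⊢  A -> (B ∧ C)
# over all eight truth assignments of A, B, C.
entailment_holds = all(
    implies(implies(A, B) and implies(A, C), implies(A, B and C))
    for A, B, C in product([False, True], repeat=3)
)
print(entailment_holds)  # True: the composition is a tautology
```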
Online platforms have transformed the formal job market but continue to struggle with effectively engaging passive candidates: individuals not actively seeking employment but open to compelling opportunities. We introduce the Independent Halting Cascade (IHC) model, a novel framework that integrates complex network diffusion dynamics with economic game theory to address this challenge. Unlike traditional models that focus solely on information propagation, the IHC model empowers network agents to either disseminate a job posting or halt its spread by applying for the position themselves. By embedding economic incentives into agent decision-making processes, the model creates a dynamic interplay between maximizing information spread and promoting applications. Our analysis uncovers distinct behavioral regimes within the IHC model, characterized by critical thresholds in recommendation and application probabilities. Extensive simulations on both synthetic and real-world network topologies demonstrate that the IHC model significantly outperforms traditional direct-recommendation systems in recruiting suitable passive candidates. Specifically, the model achieves up to a 30% higher hiring success rate compared to baseline methods. These findings offer strategic insights into leveraging economic incentives and network structures to enhance recruitment efficiency. The IHC model thus provides a robust framework for modernizing recruitment strategies, particularly in engaging the vast pool of passive candidates in the job market.
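A minimal sketch of the halting idea, assuming illustrative application/recommendation probabilities and a toy network (not the paper's parameterization): each reached agent either applies, halting further spread through them, or forwards the posting to neighbors.

```python
import random
from collections import defaultdict

def ihc_cascade(graph, source, p_apply, p_recommend, rng):
    """One halting-cascade run: each agent who sees the posting either
    applies (stopping spread through them) with probability p_apply,
    or tries to forward it to each neighbor with probability p_recommend."""
    seen = {source}
    applicants = set()
    frontier = [source]
    while frontier:
        nxt = []
        for u in frontier:
            if rng.random() < p_apply:
                applicants.add(u)   # agent applies and stops spreading
                continue
            for v in graph[u]:
                if v not in seen and rng.random() < p_recommend:
                    seen.add(v)
                    nxt.append(v)
        frontier = nxt
    return seen, applicants

# Toy ring-with-chords network of 50 agents.
n = 50
graph = defaultdict(list)
for i in range(n):
    for j in (i + 1, i + 7):
        graph[i].append(j % n); graph[j % n].append(i)

rng = random.Random(0)
reach = [len(ihc_cascade(graph, 0, 0.2, 0.6, rng)[0]) for _ in range(500)]
mean_reach = sum(reach) / len(reach)
print(f"mean reach over 500 cascades: {mean_reach:.1f} nodes")
```

Sweeping `p_apply` and `p_recommend` in a setup like this is how one would probe the critical thresholds between regimes where the posting dies out, saturates the network, or converts quickly into applications.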
Optimization is key to solving many problems in computational biology. Global optimization methods provide a robust methodology, and metaheuristics in particular have proven to be the most efficient methods for many applications. Despite their utility, there is limited availability of metaheuristic tools. We present MEIGO, an R and Matlab optimization toolbox (also available in Python via a wrapper of the R version) that implements metaheuristics capable of solving diverse problems arising in systems biology and bioinformatics: the enhanced scatter search method (eSS) for continuous nonlinear programming (cNLP) and mixed-integer programming (MINLP) problems, and variable neighborhood search (VNS) for Integer Programming (IP) problems. Both methods can be run on a single thread or in parallel using a cooperative strategy. The code is supplied under GPLv3 and is available at \url{this http URL}. Documentation and examples are included. The R package has been submitted to Bioconductor. We evaluate MEIGO against optimization benchmarks, and illustrate its applicability to a series of case studies in bioinformatics and systems biology, outperforming other state-of-the-art methods. MEIGO provides a free, open-source platform for optimization that can be applied to multiple domains of systems biology and bioinformatics. It includes efficient state-of-the-art metaheuristics, and its open and modular structure allows the addition of further methods.
Attributing outputs from Large Language Models (LLMs) in adversarial settings, such as cyberattacks and disinformation campaigns, presents significant challenges that are likely to grow in importance. We approach this attribution problem from both a theoretical and an empirical perspective, drawing on formal language theory (identification in the limit) and data-driven analysis of the expanding LLM ecosystem. By modeling an LLM's set of possible outputs as a formal language, we analyze whether finite samples of text can uniquely pinpoint the originating model. Our results show that, under mild assumptions of overlapping capabilities among models, certain classes of LLMs are fundamentally non-identifiable from their outputs alone. We delineate four regimes of theoretical identifiability: (1) an infinite class of deterministic (discrete) LLM languages is not identifiable (Gold's classical result from 1967); (2) an infinite class of probabilistic LLMs is also not identifiable (by extension of the deterministic case); (3) a finite class of deterministic LLMs is identifiable (consistent with Angluin's tell-tale criterion); and (4) even a finite class of probabilistic LLMs can be non-identifiable (we provide a new counterexample establishing this negative result). Complementing these theoretical insights, we quantify the explosion in the number of plausible model origins (hypothesis space) for a given output in recent years. Even under conservative assumptions (each open-source model fine-tuned on at most one new dataset), the count of distinct candidate models doubles approximately every 0.5 years, and allowing multi-dataset fine-tuning combinations yields doubling times as short as 0.28 years. This combinatorial growth, alongside the extraordinary computational cost of brute-force likelihood attribution across all models and potential users, renders exhaustive attribution infeasible in practice.
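The doubling-time arithmetic is standard exponential growth; a sketch with hypothetical counts (not the paper's data):

```python
import math

def doubling_time(count_start, count_end, years):
    """Doubling time implied by exponential growth between two counts:
    count_end = count_start * exp(rate * years), T_double = ln(2) / rate."""
    rate = math.log(count_end / count_start) / years
    return math.log(2) / rate

# Hypothetical illustration: a candidate-model count growing 16-fold over
# two years implies exactly a 0.5-year doubling time.
print(f"{doubling_time(1000, 16000, 2.0):.2f} years")  # 0.50 years
```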
At all scales, porous materials stir interstitial fluids as they are advected, leading to complex distributions of matter and energy. Of particular interest is whether porous media naturally induce chaotic advection at the Darcy scale, as these stirring kinematics profoundly impact basic processes such as solute transport and mixing, colloid transport and deposition and chemical, geochemical and biological reactivity. While many studies report complex transport phenomena characteristic of chaotic advection in heterogeneous Darcy flow, it has also been shown that chaotic dynamics are prohibited in a large class of Darcy flows. In this study we rigorously establish that chaotic advection is inherent to steady 3D Darcy flow in all realistic models of heterogeneous porous media. Anisotropic and heterogeneous 3D hydraulic conductivity fields generate non-trivial braiding of streamlines, leading to both chaotic advection and (purely advective) transverse dispersion. We establish that steady 3D Darcy flow has the same topology as unsteady 2D flow, and so use braid theory to establish a quantitative link between transverse dispersivity and Lyapunov exponent in heterogeneous Darcy flow. We show that chaotic advection and transverse dispersion occur both in anisotropic, weakly heterogeneous and in heterogeneous, weakly anisotropic conductivity fields, and that the quantitative link between these phenomena persists across a broad range of conductivity anisotropy and heterogeneity. The ubiquity of macroscopic chaotic advection has profound implications for the myriad of processes hosted in heterogeneous porous media and calls for a re-evaluation of transport and reaction methods in these systems.
The rapid proliferation and deployment of General-Purpose AI (GPAI) models, including large language models (LLMs), present unprecedented challenges for AI supervisory entities. We hypothesize that these entities will need to navigate an emergent ecosystem of risk and incident reporting, likely to exceed their supervision capacity. To investigate this, we develop a simulation framework parameterized by features extracted from the diverse landscape of risk, incident, or hazard reporting ecosystems, including community-driven platforms, crowdsourcing initiatives, and expert assessments. We evaluate four supervision policies: non-prioritized (first-come, first-served), random selection, priority-based (addressing the highest-priority risks first), and diversity-prioritized (balancing high-priority risks with comprehensive coverage across risk types). Our results indicate that while priority-based and diversity-prioritized policies are more effective at mitigating high-impact risks, particularly those identified by experts, they may inadvertently neglect systemic issues reported by the broader community. This oversight can create feedback loops that amplify certain types of reporting while discouraging others, leading to a skewed perception of the overall risk landscape. We validate our simulation results with several real-world datasets, including one with over a million ChatGPT interactions, of which more than 150,000 conversations were identified as risky. This validation underscores the complex trade-offs inherent in AI risk supervision and highlights how the choice of risk management policies can shape the future landscape of AI risks across diverse GPAI models used in society.
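A minimal sketch contrasting two of the four policies, first-come-first-served versus priority-based, on a synthetic report stream. The uniform priority scores and fixed review capacity are illustrative assumptions:

```python
import heapq
import random

rng = random.Random(0)

# Synthetic stream of risk reports: (priority, risk_type). Higher priority
# proxies higher expected impact; capacity limits how many get reviewed.
reports = [(rng.random(), rng.randrange(5)) for _ in range(1000)]
capacity = 100

# Policy 1: non-prioritized (first-come, first-served).
fifo = reports[:capacity]

# Policy 2: priority-based (highest-priority reports reviewed first).
prio = heapq.nlargest(capacity, reports)

def high_impact_covered(handled, threshold=0.9):
    """Count reviewed reports above a high-impact priority threshold."""
    return sum(1 for p, _ in handled if p >= threshold)

print(f"high-impact reports reviewed: "
      f"FIFO={high_impact_covered(fifo)}, priority={high_impact_covered(prio)}")
```

The priority policy reviews nearly all high-impact reports while FIFO catches only the fraction that happens to arrive early, yet, as the abstract notes, pure prioritization can systematically starve lower-priority report types, which is what the diversity-prioritized variant is meant to correct.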
Huntington's disease (HD) is an inherited neurodegenerative disorder caused by an expanded CAG repeat in the coding sequence of the huntingtin protein. Initially, it predominantly affects medium-sized spiny neurons (MSSNs) of the corpus striatum. No effective treatment is available, thus urging the identification of potential therapeutic targets. While evidence of mitochondrial structural alterations in HD exists, previous studies mainly employed 2D approaches and were performed outside the strictly native brain context. In this study, we adopted a novel multiscale approach to conduct a comprehensive 3D in situ structural analysis of mitochondrial disturbances in a mouse model of HD. We investigated MSSNs within brain tissue under optimal structural conditions utilizing state-of-the-art 3D imaging technologies, specifically FIB/SEM for the complete imaging of neuronal somas and Electron Tomography for detailed morphological examination and image processing-based quantitative analysis. Our findings suggest a disruption of the mitochondrial network towards fragmentation in HD. The network of interlaced, slim, and long mitochondria observed in healthy conditions transforms into isolated, swollen, and short entities, with internal cristae disorganization, cavities, and abnormally large matrix granules.