The Gravity Spy project aims to uncover the origins of glitches, transient bursts of noise that hamper analysis of gravitational-wave data. By using both the work of citizen-science volunteers and machine-learning algorithms, the Gravity Spy project enables reliable classification of glitches. Citizen science and machine learning are intrinsically coupled within the Gravity Spy framework, with machine-learning classifications providing a rapid first-pass classification of the dataset and enabling tiered volunteer training, and volunteer-based classifications verifying the machine classifications, bolstering the machine-learning training set and identifying new morphological classes of glitches. These classifications are now routinely used in studies characterizing the performance of the LIGO gravitational-wave detectors. Providing the volunteers with a training framework that teaches them to classify a wide range of glitches, as well as additional tools to aid their investigations of interesting glitches, empowers them to make discoveries of new classes of glitches. This demonstrates that, when given suitable support, volunteers can go beyond simple classification tasks to identify new features in data at a level comparable to domain experts. The Gravity Spy project is now providing volunteers with more complicated data that includes auxiliary monitors of the detector to identify the root cause of glitches.
Gamma-ray bursts are the most luminous electromagnetic events in the universe. Their prompt gamma-ray emission has typical durations between a fraction of a second and several minutes. A rare subset of these events have durations in excess of a thousand seconds, referred to as ultra-long gamma-ray bursts. Here, we report the discovery of GRB 250702B, the longest gamma-ray burst yet seen, with a gamma-ray duration of ~25,000 s, and characterize this event using data from four instruments in the InterPlanetary Network and the Monitor of All-sky X-ray Image. We find a hard spectrum, subsecond variability, and high total energy, which are only known to arise from ultrarelativistic jets powered by a rapidly spinning stellar-mass central engine. These properties and the extreme duration are together incompatible with all confirmed gamma-ray burst progenitors and nearly all models in the literature. This burst is naturally explained by the helium merger model, in which a field binary ends when a black hole falls into a stripped star and proceeds to consume and explode it from within. Under this paradigm, GRB 250702B adds to the growing evidence that helium stars expand and that some ultra-long GRBs share evolutionary pathways with collapsars, stellar-mass gravitational-wave sources, and potentially rare types of supernovae.
A key challenge in federated learning (FL) is the statistical heterogeneity that impairs the generalization of the global model on each client. To address this, we propose Federated learning with Adaptive Local Aggregation (FedALA), a personalized FL method that captures the information each client model needs from the global model. The key component of FedALA is an Adaptive Local Aggregation (ALA) module, which can adaptively aggregate the downloaded global model and local model towards the local objective on each client to initialize the local model before training in each iteration. To evaluate the effectiveness of FedALA, we conduct extensive experiments with five benchmark datasets in computer vision and natural language processing domains. FedALA outperforms eleven state-of-the-art baselines by up to 3.27% in test accuracy. Furthermore, we also apply the ALA module to other federated learning methods and achieve up to 24.19% improvement in test accuracy.
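To make the ALA idea concrete, the following is a minimal PyTorch sketch of how such element-wise adaptive aggregation could be implemented. It illustrates the mechanism described in the abstract, not the authors' released code; the function name `ala_initialize`, the step size `eta`, and the number of weight-learning steps are placeholder choices.

```python
# Minimal sketch of adaptive local aggregation (illustrative, not the paper's code).
# The local model is re-initialized as theta_local + (theta_global - theta_local) * w,
# with element-wise weights w in [0, 1] learned on the client's own data.
import copy
import itertools
import torch

def ala_initialize(local_model, global_model, loader, loss_fn, steps=5, eta=1.0):
    local_params = [p.detach().clone() for p in local_model.parameters()]
    global_params = [p.detach().clone() for p in global_model.parameters()]
    weights = [torch.ones_like(p) for p in local_params]   # w = 1 starts from the global model
    temp_model = copy.deepcopy(local_model)

    for _, (x, y) in zip(range(steps), itertools.cycle(loader)):
        # Build the candidate initialization for the current aggregation weights.
        with torch.no_grad():
            for p, pl, pg, w in zip(temp_model.parameters(), local_params, global_params, weights):
                p.copy_(pl + (pg - pl) * w)
        loss = loss_fn(temp_model(x), y)
        grads = torch.autograd.grad(loss, list(temp_model.parameters()))
        # Chain rule: dL/dw = dL/dtheta * (theta_global - theta_local); keep w in [0, 1].
        with torch.no_grad():
            for w, g, pl, pg in zip(weights, grads, local_params, global_params):
                w.sub_(eta * g * (pg - pl)).clamp_(0.0, 1.0)

    # Write the learned blend back before the usual local training of this round.
    with torch.no_grad():
        for p, pl, pg, w in zip(local_model.parameters(), local_params, global_params, weights):
            p.copy_(pl + (pg - pl) * w)
    return local_model
```

In each federated round, a client would call something like this before its usual local training, so the initialization leans on the global model only where that helps the local objective.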
Researchers developed a framework to evaluate and improve Large Language Model (LLM) adherence to formal Role-Based Access Control (RBAC) policies in Text-to-SQL systems. They found that a two-step generator-verifier pipeline significantly enhances reliable refusals, while fine-tuning can internalize permission awareness, both critical for preventing data leakage.
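As a rough illustration of the two-step generator-verifier idea (a sketch under stated assumptions, not the authors' pipeline), the verifier below parses the candidate SQL and refuses it whenever a table outside the caller's role permissions is referenced. `generate_sql` and the `ROLE_PERMISSIONS` policy are hypothetical placeholders, and sqlglot is used only as one convenient SQL parser.

```python
# Sketch of a generator-verifier pipeline: an LLM proposes SQL (step 1), and a
# deterministic RBAC verifier checks it before execution (step 2), refusing when a
# non-permitted table is referenced. Policy and generator are illustrative placeholders.
import sqlglot
from sqlglot import exp

ROLE_PERMISSIONS = {                      # hypothetical policy: role -> readable tables
    "analyst": {"orders", "products"},
    "hr": {"employees"},
}

def verify_sql(sql: str, role: str) -> tuple[bool, str]:
    allowed = ROLE_PERMISSIONS.get(role, set())
    tree = sqlglot.parse_one(sql)
    referenced = {t.name for t in tree.find_all(exp.Table)}
    forbidden = referenced - allowed
    if forbidden:
        return False, f"Refused: role '{role}' may not access {sorted(forbidden)}."
    return True, "OK"

def answer(question: str, role: str, generate_sql) -> str:
    sql = generate_sql(question)          # step 1: LLM generator (placeholder callable)
    ok, message = verify_sql(sql, role)   # step 2: policy-aware verifier
    return sql if ok else message
```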
Applications of Generative AI (Gen AI) are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about the potential risks of the technology, and resulted in calls for tighter regulation, in particular from some of the major tech companies that lead AI development. This regulation is likely to put at risk the budding field of open-source generative AI. Using a three-stage framework for Gen AI development (near-, mid- and long-term), we analyze the risks and opportunities of open-source generative AI models with similar capabilities to the ones currently available (near to mid-term) and with greater capabilities (long-term). We argue that, overall, the benefits of open-source Gen AI outweigh its risks. As such, we encourage the open sourcing of models, training and evaluation data, and provide a set of recommendations and best practices for managing risks associated with open-source generative AI.
Here, we present the outcomes from the second Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry, which engaged participants across global hybrid locations, resulting in 34 team submissions. The submissions spanned seven key application areas and demonstrated the diverse utility of LLMs for applications in (1) molecular and material property prediction; (2) molecular and material design; (3) automation and novel interfaces; (4) scientific communication and education; (5) research data management and automation; (6) hypothesis generation and evaluation; and (7) knowledge extraction and reasoning from scientific literature. Each team submission is presented in a summary table with links to the code and as brief papers in the appendix. Beyond team results, we discuss the hackathon event and its hybrid format, which included physical hubs in Toronto, Montreal, San Francisco, Berlin, Lausanne, and Tokyo, alongside a global online hub to enable local and virtual collaboration. Overall, the event highlighted significant improvements in LLM capabilities since the previous year's hackathon, suggesting continued expansion of LLMs for applications in materials science and chemistry research. These outcomes demonstrate the dual utility of LLMs as both multipurpose models for diverse machine learning tasks and platforms for rapidly prototyping custom applications in scientific research.
PFLlib provides an open-source library and benchmark for personalized Federated Learning, integrating 37 state-of-the-art algorithms and 24 datasets to facilitate standardized implementation and evaluation. The platform improves research reproducibility and accelerates development by offering a unified, beginner-friendly environment.
Graph Neural Networks (GNNs) have recently gained traction in transportation, bioinformatics, language and image processing, but research on their application to supply chain management (SCM) remains limited. Supply chains are inherently graph-like, making them ideal for GNN methodologies, which can optimize and solve complex problems. The barriers include a lack of proper conceptual foundations, limited familiarity with graph applications in SCM, and the absence of real-world benchmark datasets for GNN-based supply chain research. To address this, we discuss and connect supply chains with graph structures for effective GNN application, providing detailed formulations, examples, mathematical definitions, and task guidelines. Additionally, we present a multi-perspective real-world benchmark dataset from a leading FMCG company in Bangladesh, focusing on supply chain planning. We discuss various supply chain tasks using GNNs and benchmark several state-of-the-art models on homogeneous and heterogeneous graphs across six supply chain analytics tasks. Our analysis shows that GNN-based models consistently outperform statistical Machine Learning and other Deep Learning models by around 10-30% in regression, 10-30% in classification and detection tasks, and 15-40% in anomaly detection tasks on designated metrics. With this work, we lay the groundwork for solving supply chain problems using GNNs, supported by conceptual discussions, methodological insights, and a comprehensive dataset.
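For readers new to GNNs in this setting, the sketch below shows one minimal setup, assuming PyTorch Geometric: a two-layer GCN regressing a node-level quantity (say, product demand) on a toy homogeneous product graph. The features, edges, and target are placeholders, and the paper benchmarks a range of architectures beyond this simple GCN.

```python
# Illustrative sketch (not the paper's benchmark code): a two-layer GCN that regresses a
# node-level target on a homogeneous supply chain graph built with PyTorch Geometric.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

class DemandGCN(torch.nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        return self.head(h).squeeze(-1)

# Toy graph: 4 products with 3 features each; edges could encode shared plants or storage.
x = torch.randn(4, 3)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 0, 3, 2]])        # [2, num_edges] edge list
y = torch.randn(4)                                             # e.g., next-week demand
data = Data(x=x, edge_index=edge_index, y=y)

model = DemandGCN(in_dim=3)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for _ in range(100):
    opt.zero_grad()
    loss = F.mse_loss(model(data.x, data.edge_index), data.y)
    loss.backward()
    opt.step()
```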
The Nancy Grace Roman Space Telescope (Roman) will conduct a Galactic Exoplanet Survey (RGES) to discover bound and free-floating exoplanets using gravitational microlensing. Roman should be sensitive to lenses with mass down to ~0.02 $M_{\oplus}$, or roughly the mass of Ganymede. Thus the detection of moons with masses similar to the giant moons in our Solar System is possible with Roman. Measuring the demographics of exomoons will provide constraints on both moon and planet formation. We conduct simulations of Roman microlensing events to determine the effects of exomoons on microlensing light curves, and whether these effects are detectable with Roman. We focus on giant planets from 30 $M_{\oplus}$ to 10 $M_{\rm Jup}$ on orbits from 0.3 to 30 AU, and assume that each planet is orbited by a moon with moon-planet mass ratio from $10^{-4}$ to $10^{-2}$ and separations from 0.1 to 0.5 planet Hill radii. We find that Roman is sensitive to exomoons, although the number of expected detections is only of order one over the duration of the survey, unless exomoons are more common or massive than we assumed. We argue that changes in the survey strategy, in particular focusing on a few fields with higher cadence, may allow for the detection of more exomoons with Roman. Regardless, the ability to detect exomoons reinforces the need to develop robust methods for modeling triple lens microlensing events to fully utilize the capabilities of Roman.
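To give a feel for the moon separations quoted above, the short calculation below evaluates the Hill radius $R_{\rm H} = a\,(m_p/3M_*)^{1/3}$ for an assumed Jupiter-mass planet at 5 AU around a 0.5 $M_{\odot}$ host (example values, not taken from the paper) and converts the 0.1-0.5 $R_{\rm H}$ range into AU.

```python
# Quick illustration of the quoted moon separations: Hill radius of an example planet
# and the 0.1-0.5 R_Hill range considered for the moon's orbit. The host-star mass
# (0.5 M_sun, a plausible microlensing host) is an assumed example value.
M_SUN_IN_MEARTH = 332946.0          # solar mass in Earth masses
M_JUP_IN_MEARTH = 317.8             # Jupiter mass in Earth masses

def hill_radius_au(a_au, m_planet_mearth, m_star_msun=0.5):
    m_star_mearth = m_star_msun * M_SUN_IN_MEARTH
    return a_au * (m_planet_mearth / (3.0 * m_star_mearth)) ** (1.0 / 3.0)

r_hill = hill_radius_au(a_au=5.0, m_planet_mearth=M_JUP_IN_MEARTH)
print(f"R_Hill ~ {r_hill:.2f} AU; moon separations 0.1-0.5 R_Hill = "
      f"{0.1 * r_hill:.2f}-{0.5 * r_hill:.2f} AU")
```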
We present a sub-threshold search for gravitational-wave inspirals from binary neutron stars using data from the first part of the fourth observing run of the LIGO-Virgo-KAGRA Collaboration. To enhance sensitivity to this targeted population, we incorporate a redshift-corrected population model based on radio observations of Galactic double neutron star systems. The search identifies a significant trigger with a false-alarm rate of about one per 50 years and a network signal-to-noise ratio of 9.7, which was first reported by the LVK in low-latency processing as S231109ci and subsequently in the GWTC-4.0 catalog as GW231109_235456, a sub-threshold candidate. Accounting for a trials factor of five from the four LVK searches in GWTC-4.0 and this new search, the false-alarm rate of the reported candidate is approximately one per 10 years. If this event is of astrophysical origin, the inferred source properties indicate component masses of 1.40 to 2.24 solar masses for the primary and 0.97 to 1.49 solar masses for the secondary, yielding a total mass of 2.95 (+0.38, -0.07) solar masses. The event was localized to a region of 450 square degrees (90 percent probability) at a luminosity distance of 165 (+70, -69) megaparsecs.
The most recent Linux kernels have a new feature for securing applications: Landlock. Like Seccomp before it, Landlock makes it possible for a running process to give up access to resources. For applications running as Science Gateways, network access is required while starting up MPI, but for the sake of security, it should be taken away prior to the reading of user-supplied parameter files. We explore the usefulness of Landlock by modifying and locking down three mature scientific codes: The Einstein Toolkit (a code that studies the dynamics of relativistic astrophysics, e.g. neutron star collisions), Octo-Tiger (a code for studying the dynamics of non-relativistic astrophysics, e.g. white dwarfs), and FUKA (an initial data solver for relativistic codes). Finally, we implement a fully-functioning FUKA science gateway that relies on Landlock (instead of user authentication) for security.
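The pattern the abstract describes, keeping network access through MPI startup and then dropping it before user-supplied files are read, can be sketched as follows. This is an illustrative sketch rather than the gateway's actual code: it assumes Linux 6.7+ (Landlock ABI 4, which added network rules) and x86_64 syscall numbers, and calls the raw syscalls through ctypes because Python has no standard Landlock binding.

```python
# Hedged sketch: revoke TCP bind/connect rights for the rest of the process's lifetime
# using Landlock, e.g. after MPI startup and before reading user-supplied parameter files.
import ctypes
import os

libc = ctypes.CDLL(None, use_errno=True)

# x86_64 syscall numbers (assumption; check <asm/unistd_64.h> on the target system).
SYS_landlock_create_ruleset = 444
SYS_landlock_restrict_self = 446
PR_SET_NO_NEW_PRIVS = 38

# LANDLOCK_ACCESS_NET_* flags from <linux/landlock.h> (Landlock ABI >= 4).
LANDLOCK_ACCESS_NET_BIND_TCP = 1 << 0
LANDLOCK_ACCESS_NET_CONNECT_TCP = 1 << 1

class LandlockRulesetAttr(ctypes.Structure):
    # Mirrors struct landlock_ruleset_attr; the net field exists only on ABI >= 4 kernels.
    _fields_ = [("handled_access_fs", ctypes.c_uint64),
                ("handled_access_net", ctypes.c_uint64)]

def drop_network_access():
    attr = LandlockRulesetAttr(
        handled_access_fs=0,
        handled_access_net=LANDLOCK_ACCESS_NET_BIND_TCP | LANDLOCK_ACCESS_NET_CONNECT_TCP,
    )
    ruleset_fd = libc.syscall(SYS_landlock_create_ruleset,
                              ctypes.byref(attr), ctypes.sizeof(attr), 0)
    if ruleset_fd < 0:
        raise OSError(ctypes.get_errno(), "landlock_create_ruleset failed")
    # Required before self-restriction when the process lacks CAP_SYS_ADMIN.
    libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)
    if libc.syscall(SYS_landlock_restrict_self, ruleset_fd, 0) != 0:
        raise OSError(ctypes.get_errno(), "landlock_restrict_self failed")
    os.close(ruleset_fd)
```

Because no allow-rules are added for the handled network accesses, all TCP bind and connect attempts fail after `drop_network_access()` returns, and the restriction cannot be undone by the process.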
The transformer architecture has become a cornerstone of modern AI, fueling remarkable progress across applications in natural language processing, computer vision, and multimodal learning. As these models continue to scale explosively for performance, implementation efficiency remains a critical challenge. Mixture of Experts (MoE) architectures, selectively activating specialized subnetworks (experts), offer a unique balance between model accuracy and computational cost. However, the adaptive routing in MoE architectures, where input tokens are dynamically directed to specialized experts based on their semantic meaning, inadvertently opens up a new attack surface for privacy breaches. These input-dependent activation patterns leave distinctive temporal and spatial traces in hardware execution, which adversaries could exploit to deduce sensitive user data. In this work, we propose MoEcho, a side-channel analysis that exposes an attack surface compromising user privacy on MoE-based systems. Specifically, MoEcho introduces four novel architectural side channels on different computing platforms: Cache Occupancy Channels and Pageout+Reload on CPUs, and Performance Counter and TLB Evict+Reload on GPUs. Exploiting these vulnerabilities, we propose four attacks that effectively breach user privacy in large language models (LLMs) and vision language models (VLMs) based on MoE architectures: Prompt Inference Attack, Response Reconstruction Attack, Visual Inference Attack, and Visual Reconstruction Attack. MoEcho is the first runtime, architecture-level security analysis of the popular MoE structure common in modern transformers, highlighting a serious security and privacy threat and calling for effective and timely safeguards when harnessing MoE-based models for developing efficient large-scale AI services.
A new hybrid method combines the time Finite Element Method with Physics-Informed Neural Networks to solve time-dependent partial differential equations, effectively mitigating statistical errors and causality issues often found in pure PINNs. This approach, enhanced with deep adaptive sampling, demonstrated up to 100 times faster computation for convection and 40 times faster for Allen-Cahn equations compared to Causal PINNs, while achieving comparable or higher accuracy, particularly for high-dimensional and low-regularity problems.
We study two-dimensional massless Dirac fermions at neutrality, coupled to bosonic modes through a Yukawa interaction. We then examine the intriguing possibility that such a system, devoid of carriers at zero temperature, might nevertheless exhibit superconductivity. Remarkably, we find that superconductivity emerges in the vicinity of Gross-Neveu quantum criticality, provided the fermions cease to behave as well-defined quasiparticles, that is, once their anomalous dimension in the normal state becomes sufficiently large. In other words, well-defined fermions do not superconduct, whereas ill-defined ones do. We analyze four symmetry-distinct bosonic modes, each capable of driving normal-state criticality and, in three of the four cases, giving rise to a distinct superconducting phase. While phase fluctuations are strong in this regime, we argue that they do not destroy the superconducting state. We further characterize the resulting pairing states for a concrete Dirac model of spin-orbit coupled systems with orbitals of different parity. Our results are obtained using the SYK-inspired framework for Dirac systems introduced by Kim et al.[1], which provides a controlled approach to the strongly coupled regime of Dirac fluids near Gross-Neveu criticality.
The adaptation of large language models (LLMs) to time series forecasting poses unique challenges, as time series data is continuous in nature, while LLMs operate on discrete tokens. Despite the success of LLMs in natural language processing (NLP) and other structured domains, aligning time series data with language-based representations while maintaining both predictive accuracy and interpretability remains a significant hurdle. Existing methods have attempted to reprogram time series data into text-based forms, but these often fall short in delivering meaningful, interpretable results. In this paper, we propose a multi-level text alignment framework for time series forecasting using LLMs that not only improves prediction accuracy but also enhances the interpretability of time series representations. Our method decomposes time series into trend, seasonal, and residual components, which are then reprogrammed into component-specific text representations. We introduce a multi-level alignment mechanism, where component-specific embeddings are aligned with pre-trained word tokens, enabling more interpretable forecasts. Experiments on multiple datasets demonstrate that our method outperforms state-of-the-art models in accuracy while providing good interpretability.
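The sketch below illustrates, under simplifying assumptions, the two ingredients named in the abstract: an additive decomposition into trend, seasonal, and residual components, and a cross-attention layer that aligns a component's embedding with a frozen bank of word-token embeddings. It is not the authors' architecture; the naive moving-average decomposition, the random `word_bank` standing in for pre-trained LLM token embeddings, and all dimensions are placeholder choices.

```python
# Sketch (illustrative assumptions, not the paper's model): decompose a series into
# trend/seasonal/residual parts and align each component with frozen word-token embeddings.
import torch
import torch.nn as nn

def decompose(series: torch.Tensor, period: int):
    """Naive additive decomposition: moving-average trend, periodic-mean seasonal, residual.
    Assumes len(series) is a multiple of `period`."""
    window = period + 1                                      # odd window keeps the trend centered
    kernel = torch.ones(1, 1, window) / window
    trend = nn.functional.conv1d(series.view(1, 1, -1), kernel, padding=window // 2).flatten()
    seasonal = (series - trend).view(-1, period).mean(dim=0).repeat(len(series) // period)
    residual = series - trend - seasonal
    return trend, seasonal, residual

class ComponentAligner(nn.Module):
    """Aligns a component embedding against frozen word-token embeddings via cross-attention."""
    def __init__(self, d_model, word_embeddings: torch.Tensor):
        super().__init__()
        self.register_buffer("words", word_embeddings)       # (vocab_subset, d_model), frozen
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

    def forward(self, component_emb):                         # (batch, seq, d_model)
        words = self.words.unsqueeze(0).expand(component_emb.size(0), -1, -1)
        aligned, attn_weights = self.attn(component_emb, words, words)
        return aligned, attn_weights                          # weights hint at which "words" were used

# Toy usage: a 96-step series with daily period 24, projected to d_model = 32.
series = torch.sin(torch.arange(96) * 2 * torch.pi / 24) + 0.1 * torch.randn(96)
trend, seasonal, residual = decompose(series, period=24)
proj = nn.Linear(1, 32)
word_bank = torch.randn(100, 32)                              # stand-in for pre-trained token embeddings
aligner = ComponentAligner(32, word_bank)
aligned, weights = aligner(proj(trend.unsqueeze(-1)).unsqueeze(0))
```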
Fairness in ranking models is crucial, as disparities in exposure can disproportionately affect protected groups. Most fairness-aware ranking systems focus on ensuring comparable average exposure for groups across the entire ranked list, which may not fully address real-world concerns. For example, when a ranking model is used for allocating resources among candidates or disaster hotspots, decision-makers often prioritize only the top-$K$ ranked items, while the ranking beyond top-$K$ becomes less relevant. In this paper, we propose a list-wise learning-to-rank framework that addresses the issues of inequalities in top-$K$ rankings at training time. Specifically, we propose a top-$K$ exposure disparity measure that extends the classic exposure disparity metric in a ranked list. We then learn a ranker to balance relevance and fairness in top-$K$ rankings. Since direct top-$K$ selection is computationally expensive for a large number of items, we transform the non-differentiable selection process into a differentiable objective function and develop efficient stochastic optimization algorithms to achieve both high accuracy and sufficient fairness. Extensive experiments demonstrate that our method outperforms existing methods.
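One way to make the non-differentiable top-$K$ selection trainable, shown here as a sketch rather than the paper's exact formulation, is to build a smooth "in the top-$K$" indicator from pairwise sigmoid comparisons of the predicted scores and then penalize the gap in that soft membership between protected and non-protected items; the temperature and the membership-based exposure approximation are illustrative choices.

```python
# Illustrative sketch: a differentiable surrogate for top-K membership and a group
# exposure-disparity penalty restricted to that soft top-K (not the paper's formulation).
import torch

def soft_topk_membership(scores, k, temperature=1.0):
    # Soft rank of item i: 1 + sum_{j != i} sigmoid((s_j - s_i)/t); the self term
    # sigmoid(0) = 0.5 is absorbed. Membership is a smooth "rank <= k" indicator;
    # smaller temperatures approach the hard top-K.
    diff = (scores.unsqueeze(0) - scores.unsqueeze(1)) / temperature   # diff[i, j] = (s_j - s_i)/t
    soft_rank = 0.5 + torch.sigmoid(diff).sum(dim=1)
    return torch.sigmoid((k + 0.5 - soft_rank) / temperature)

def topk_exposure_disparity(scores, group, k):
    """group: boolean mask of protected items; exposure is approximated by soft top-K membership."""
    membership = soft_topk_membership(scores, k)
    exp_protected = membership[group].mean()
    exp_rest = membership[~group].mean()
    return (exp_protected - exp_rest).abs()

# Usage inside a listwise loss: add this penalty to the relevance term for each query.
scores = torch.randn(20, requires_grad=True)
group = torch.zeros(20, dtype=torch.bool); group[:8] = True
fairness = topk_exposure_disparity(scores, group, k=5)
fairness.backward()        # gradients flow to the scores, so a ranker can be trained end-to-end
```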
Large Language Models (LLMs) such as ChatGPT-4, Claude 3, and LLaMA 4 are increasingly embedded in software/application development, supporting tasks from code generation to debugging. Yet, their real-world effectiveness in detecting diverse software bugs, particularly complex, security-relevant vulnerabilities, remains underexplored. This study presents a systematic, empirical evaluation of these three leading LLMs using a benchmark of foundational programming errors, classic security flaws, and advanced, production-grade bugs in C++ and Python. The dataset integrates real code from SEED Labs, OpenSSL (via the Suresoft GLaDOS database), and PyBugHive, validated through local compilation and testing pipelines. A novel multi-stage, context-aware prompting protocol simulates realistic debugging scenarios, while a graded rubric measures detection accuracy, reasoning depth, and remediation quality. Our results show that all models excel at identifying syntactic and semantic issues in well-scoped code, making them promising for educational use and as first-pass reviewers in automated code auditing. Performance diminishes in scenarios involving complex security vulnerabilities and large-scale production code, with ChatGPT-4 and Claude 3 generally providing more nuanced contextual analyses than LLaMA 4. This highlights both the promise and the present constraints of LLMs in serving as reliable code analysis tools.
Recurrent nova U Scorpii (U Sco) is one of the prototypes for a Type Ia supernova progenitor. The logic is that the white dwarf is near the Chandrasekhar mass and gas is accumulating onto its surface at a near-maximal accretion rate, so it will soon increase its mass to the supernova trigger. But the white dwarf loses mass every nova eruption, so the issue is balancing the mass ejected ($M_{\rm ejecta}$) against the mass accreted between eruptions ($M_{\rm accreted}$). Measuring $M_{\rm accreted}$ can be done in several ways to usable accuracy. But the old methods for measuring $M_{\rm ejecta}$ (involving the flux in hydrogen emission lines) all carry real error bars of 2--3 orders of magnitude. The only solution is to measure the change of the orbital period across the nova eruption ($\Delta P$). But this solution requires a vast photometric program of eclipse timings stretching decades. For U Sco, a program started in 1989 now reaches its culmination with measures of $\Delta P$ for the eruptions of 1999, 2010, 2016, and 2022. This paper reports on 52 new eclipse times (for a total of 218 eclipses 1945--2025), plus a new theory result allowing for the confident calculation of $M_{\rm ejecta}$ from $\Delta P$. The four eruptions ejected a total of $(103\pm14)\times10^{-6}\,M_{\odot}$, while the white dwarf accreted $4\times10^{-6}\,M_{\odot}$ over the four previous eruption cycles. With $M_{\rm ejecta} = 26\times M_{\rm accreted}$, the U Sco white dwarf is losing large masses each eruption cycle, so U Sco can never produce a Type Ia supernova.
Autonomous driving systems (ADS) increasingly rely on deep learning-based perception models, which remain vulnerable to adversarial attacks. In this paper, we revisit adversarial attacks and defense methods, focusing on road sign recognition and lead object detection and prediction (e.g., relative distance). Using a Level-2 production ADS, OpenPilot by Comma.ai, and the widely adopted YOLO model, we systematically examine the impact of adversarial perturbations and assess defense techniques, including adversarial training, image processing, contrastive learning, and diffusion models. Our experiments highlight both the strengths and limitations of these methods in mitigating complex attacks. Through targeted evaluations of model robustness, we aim to provide deeper insights into the vulnerabilities of ADS perception systems and contribute guidance for developing more resilient defense strategies.
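For concreteness, the snippet below generates one classic white-box perturbation, the fast gradient sign method (FGSM); it is a generic example of the kind of adversarial input studied in this space, not one of the paper's specific attacks on OpenPilot or YOLO, and the `epsilon` budget is an arbitrary example value.

```python
# Generic FGSM example of an adversarial perturbation (illustrative only).
import torch

def fgsm_perturb(model, images, labels, loss_fn, epsilon=8 / 255):
    """Return adversarially perturbed copies of `images` (pixel values assumed in [0, 1])."""
    images = images.clone().detach().requires_grad_(True)
    loss = loss_fn(model(images), labels)
    loss.backward()
    # Step in the direction that increases the loss, then clip back to the valid pixel range.
    adversarial = images + epsilon * images.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```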