Northern Arizona University
This research introduces DA-Fusion, a data augmentation strategy that leverages pre-trained text-to-image diffusion models to generate semantically diverse training data. The method improved few-shot image classification accuracy by up to 15 percentage points, particularly excelling on fine-grained and novel visual concepts such as specific weed species, and its effectiveness was validated even when rigorously preventing data leakage from the generative model's training set.
University of Washington; CNRS; University of Toronto; University of Illinois at Urbana-Champaign; University of Pittsburgh; Monash University; University of California, Santa Barbara; Harvard University; University of Utah; Chinese Academy of Sciences; Vanderbilt University; University of Oklahoma; New York University; Tel Aviv University; Universidad de Concepcion; University of Edinburgh; The University of Texas at Austin; Peking University; IEEC; KU Leuven; Columbia University; University of Florida; Space Telescope Science Institute; York University; Johns Hopkins University; Universidad Diego Portales; University of Wisconsin-Madison; The Pennsylvania State University; University of Arizona; Australian National University; Leiden University; University of Warwick; The Ohio State University; Universitat de Barcelona; Carnegie Observatories; University of Connecticut; University of St Andrews; University of Colorado Boulder; Flatiron Institute; Lomonosov Moscow State University; George Mason University; Universidade Federal do Rio de Janeiro; Northern Arizona University; Russian Academy of Sciences; Universidad de Chile; University of Kentucky; University of Texas at Arlington; New Mexico State University; Centro de Astrobiología (CAB); Southwest Research Institute; Embry-Riddle Aeronautical University; Drexel University; Universidade Federal de Sergipe; Montana State University; Towson University; Universidad de La Laguna (ULL); Instituto de Astrofísica de Canarias (IAC); Universidade de Sao Paulo; Universidad Andres Bello; Universitat Politècnica de Catalunya; Pontificia Universidad Catolica de Chile; European Space Agency (ESA); Konkoly Observatory; Universidad de La Serena; CONACyT; Eotvos Lorand University; Texas Christian University; Western Washington University; Universidad Nacional Autonoma de Mexico; Université Côte d'Azur; Universidad Católica del Norte; Universität Potsdam; Université de Montpellier; Valparaiso University; University of Nebraska at Omaha; Centro de Astrofísica y Tecnologías Afines (CATA); Max-Planck-Institut für extraterrestrische Physik (MPE); Western Carolina University; Max-Planck-Institut für Sonnensystemforschung; Observatórios Nacionais; Eötvös Loránd Research Network (ELKH); Universidad Autónoma del Estado de Morelos (UAEM); Centre de Recerca en Ciències de la Terra (Geo3BCN)-CSIC; Universität Heidelberg; Max Planck Institut für Astronomie; Center for Astrophysics | Harvard & Smithsonian; Leibniz-Institut für Astrophysik Potsdam (AIP); Institut de Ciències del Cosmos (ICCUB)
Mapping the local and distant Universe is key to our understanding of it. For decades, the Sloan Digital Sky Survey (SDSS) has made a concerted effort to map millions of celestial objects to constrain the physical processes that govern our Universe. The most recent and fifth generation of SDSS (SDSS-V) is organized into three scientific "mappers": Milky Way Mapper (MWM), which aims to chart the various components of the Milky Way and constrain its formation and assembly; Black Hole Mapper (BHM), which focuses on understanding supermassive black holes in distant galaxies across the Universe; and Local Volume Mapper (LVM), which uses integral field spectroscopy to map the ionized interstellar medium in the Local Group. This paper describes the scope and content of the nineteenth data release (DR19) of SDSS, the most substantial to date in SDSS-V. DR19 is the first to contain data from all three mappers. We also describe nine value-added catalogs (VACs) that enhance the science that can be conducted with the SDSS-V data. Finally, we discuss how to access SDSS DR19 and provide illustrative examples and tutorials.
University of Washington; CNRS; California Institute of Technology; University of Illinois at Urbana-Champaign; SLAC National Accelerator Laboratory; National Central University; UCLA; Carnegie Mellon University; Imperial College London; DESY; University of Chicago; UC Berkeley; University College London; University of Oxford; the University of Tokyo; Stanford University; University of Edinburgh; INFN; ETH Zürich; University of California, San Diego; University of British Columbia; NASA Goddard Space Flight Center; University of Texas at Austin; Kavli Institute for the Physics and Mathematics of the Universe; Curtin University; CERN; Space Telescope Science Institute; Johns Hopkins University; Arizona State University; University of Maryland; The Alan Turing Institute; University of North Carolina at Chapel Hill; Purdue University; University of Helsinki; Politecnico di Milano; University of California, Davis; Duke University; MIT; CEA; Princeton University; Univ. Lille; University of Central Florida; University of Colorado Boulder; Université Côte d’Azur; Universidade Federal do Rio de Janeiro; Northern Arizona University; Jet Propulsion Laboratory; Universidad de Chile; European Space Agency; University of Montenegro; CNES; Adam Mickiewicz University; PSL Research University; Southwest Research Institute; SETI Institute; University of North Dakota; The Johns Hopkins University Applied Physics Laboratory; Observatoire de la Côte d’Azur; University of Hawai’i; California State Polytechnic University, Pomona; The University of Arizona; MIT Kavli Institute for Astrophysics and Space Research; Universidade Federal de Sergipe; Kavli Institute for Cosmological Physics; The Open University; Carnegie Institution for Science; Universidad Nacional de Colombia; Vera C. Rubin Observatory; CEA Saclay; CNRS/IN2P3; Queen's University Belfast; Instituto de Astrofísica de Canarias (IAC); Lowell Observatory; IPAC; LAPP; Univ Grenoble Alpes; IJCLab; U.S. Naval Observatory; Planetary Science Institute; NSF’s National Optical-Infrared Astronomy Research Laboratory; Pontificia Universidad Catolica de Chile; Universidad Mayor; LPNHE; Universities Space Research Association; Academia Sinica Institute of Astronomy and Astrophysics (ASIAA); California Polytechnic State University - San Luis Obispo; Mullard Space Science Laboratory; ELTE Gothard Astrophysical Observatory; Paris Observatory; Astroparticule et Cosmologie (APC); Università degli Studi di Urbino ‘Carlo Bo’; Université Paris Diderot; IMCCE; ELTE Eotvos Lorand University; Aix-Marseille Université; UK ATC; Laboratoire d’Astrophysique de Marseille (LAM); Observatorio Astronomico Nacional; Instituto Nacional de Astrofísica Optica y Electronica; Observatorio do Valongo; Earth and Planets Laboratory; Université Paris Cité; LSST Discovery Alliance; UTFPR - Universidade Tecnológica Federal do Paraná; Instituto de Ciencias Planetarias y Exoplanetarias (ICPE); CONICET-IAR; Laboratório Nacional de Astrofísica (LNA); The Exploratorium; ELKH-CSFK Konkoly Observatory; Observatório Nacional, MCTI; Ludwig-Maximilians-Universität München; NASA, Ames Research Center; Université Paris-Saclay; Center for Astrophysics | Harvard & Smithsonian; INAF - Osservatorio Astronomico di Trieste; Sorbonne Université
We report on the observation and measurement of astrometry, photometry, morphology, and activity of the interstellar object 3I/ATLAS, also designated C/2025 N1 (ATLAS), with the NSF-DOE Vera C. Rubin Observatory. The third interstellar object, comet 3I/ATLAS, was first discovered on UT 2025 July 1. Serendipitously, the Rubin Observatory collected imaging in the area of the sky inhabited by the object during regular commissioning activities. We successfully recovered object detections from Rubin visits spanning UT 2025 June 21 (10 days before discovery) to UT 2025 July 7. Facilitated by Rubin's high resolution and large aperture, we report on the detection of cometary activity as early as June 21st, and observe it throughout. We measure the location and magnitude of the object on 37 Rubin images in r, i, and z bands, with typical precision of about 20 mas (100 mas, systematic) and about 10 mmag, respectively. We use these to derive improved orbit solutions, and to show there is no detectable photometric variability on hourly timescales. We derive a V-band absolute magnitude of H_V = (13.7 +/- 0.2) mag, and an equivalent effective nucleus radius of around (5.6 +/- 0.7) km. These data represent the earliest observations of this object by a large (8-meter class) telescope reported to date, and illustrate the type of measurements (and discoveries) Rubin's Legacy Survey of Space and Time (LSST) will begin to provide once operational later this year.
Researchers systematically applied the GenderMag method to GitHub's interface, identifying and redesigning features that created inclusivity bugs for newcomers. The redesigned interface, implemented as a browser plugin, significantly increased task completion rates for users with Abi-like cognitive styles from 67% to 95% and substantially improved self-efficacy for all newcomers when performing initial contribution tasks.
This paper presents LProtector, an automated vulnerability detection system for C/C++ codebases driven by the large language model (LLM) GPT-4o and Retrieval-Augmented Generation (RAG). As software complexity grows, traditional methods face challenges in detecting vulnerabilities effectively. LProtector leverages GPT-4o's powerful code comprehension and generation capabilities to perform binary classification and identify vulnerabilities within target codebases. We conducted experiments on the Big-Vul dataset, showing that LProtector outperforms two state-of-the-art baselines in terms of F1 score, demonstrating the potential of integrating LLMs with vulnerability detection.
[Context] Newcomers joining an unfamiliar software project face numerous barriers; therefore, effective onboarding is essential to help them engage with the team and develop the behaviors, attitudes, and skills needed to excel in their roles. However, onboarding can be a lengthy, costly, and error-prone process. Software solutions can help mitigate these barriers and streamline the process without overloading senior members. [Objective] This study aims to identify the state-of-the-art software solutions for onboarding newcomers. [Method] We conducted a systematic literature review (SLR) to answer six research questions. [Results] We analyzed 32 studies about software solutions for onboarding newcomers and identified several key findings: (1) a range of strategies exists, with recommendation systems being the most prevalent; (2) most solutions are web-based; (3) solutions target a variety of onboarding aspects, with a focus on process; (4) many onboarding barriers remain unaddressed by existing solutions; (5) laboratory experiments are the most commonly used method for evaluating these solutions; and (6) diversity and inclusion aspects primarily address experience level. [Conclusion] We shed light on current technological support and identify research opportunities to develop more inclusive software solutions for onboarding. These insights may also guide practitioners in refining existing platforms and onboarding programs to promote smoother integration of newcomers into software projects.
New contributors often struggle to find tasks that they can tackle when onboarding onto a new Open Source Software (OSS) project. One reason for this difficulty is that issue trackers lack explanations about the knowledge or skills needed to complete a given task successfully. These explanations can be complex and time-consuming to produce. Past research has partially addressed this problem by labeling issues with issue types, issue difficulty level, and issue skills. However, current approaches are limited to a small set of labels and lack in-depth details about their semantics, which may not sufficiently help contributors identify suitable issues. To surmount this limitation, this paper explores large language models (LLMs) and Random Forest (RF) to predict the multilevel skills required to solve the open issues. We introduce a novel tool, SkillScope, which retrieves current issues from Java projects hosted on GitHub and predicts the multilevel programming skills required to resolve these issues. In a case study, we demonstrate that SkillScope could predict 217 multilevel skills for tasks with 91% precision, 88% recall, and 89% F-measure on average. Practitioners can use this tool to better delegate or choose tasks to solve in OSS projects.
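As a quick sanity check on the metrics quoted above, the F-measure is the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f_measure` is hypothetical, and it plugs in the averaged precision and recall from the abstract. An average of per-skill F-measures need not equal the F-measure of the averaged values, so the close agreement here with the reported 89% is indicative and does not reproduce the authors' computation.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score: weighted harmonic mean of precision and recall."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Averaged precision and recall quoted in the abstract above.
f1 = f_measure(0.91, 0.88)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.895"
```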
A new taxonomy details eleven distinct types of human-AI interaction in software engineering, providing a structured framework to categorize how developers engage with AI tools and guiding future research into optimizing these collaborations.
Baryon acoustic oscillation data from the first year of the Dark Energy Spectroscopic Instrument (DESI) provide near percent-level precision of cosmic distances in seven bins over the redshift range $z=0.1$-$4.2$. We use this data, together with other distance probes, to constrain the cosmic expansion history using some well-motivated physical classes of dark energy. In particular, we explore three physics-focused behaviors of dark energy from the equation of state and energy density perspectives: the thawing class (matching many simple quintessence potentials), emergent class (where dark energy comes into being recently, as in phase transition models), and mirage class (where phenomenologically the distance to CMB last scattering is close to that from a cosmological constant $\Lambda$ despite dark energy dynamics). All three classes fit the data at least as well as $\Lambda$CDM, and indeed can improve on it by $\Delta\chi^2\approx -5$ to $-17$ for the combination of DESI BAO with CMB and supernova data, while having one more parameter. The mirage class does essentially as well as $w_0w_a$CDM while having one less parameter. These classes of dynamical behaviors highlight worthwhile avenues for further exploration into the nature of dark energy.
The wide spread of rumors on social media has had a negative impact on people's daily lives, leading to potential panic, fear, and mental health problems for the public. How to debunk rumors as early as possible remains a challenging problem. Existing studies mainly leverage information propagation structure to detect rumors, while very few works focus on correlations among users, who may coordinate to spread rumors in order to gain popularity. In this paper, we propose a new detection model that jointly learns the representations of user correlation and information propagation to detect rumors on social media. Specifically, we leverage graph neural networks to learn the representations of user correlation from a bipartite graph that describes the correlations between users and source tweets, and the representations of information propagation with a tree structure. Then we combine the learned representations from these two modules to classify the rumors. Since malicious users may try to subvert our model after deployment, we further develop a greedy attack scheme to analyze the cost of three adversarial attacks: graph attack, comment attack, and joint attack. Evaluation results on two public datasets show that the proposed model outperforms the state-of-the-art rumor detection models. We also demonstrate that our method performs well for early rumor detection. Moreover, the proposed detection method is more robust to adversarial attacks than the best existing method. Importantly, we show that subverting the user correlation pattern comes at a high cost to attackers, demonstrating the importance of considering user correlation for rumor detection.
Researchers from the University of Washington, SETI Institute, and other institutions developed a comprehensive roadmap for systematically searching interstellar objects (ISOs) for technosignatures. The framework categorizes potential technosignatures into four types and details observational strategies, emphasizing rigorous comparison with natural phenomena in anticipation of increased ISO discoveries from the Rubin Observatory.
Current leading mispronunciation detection and diagnosis (MDD) systems achieve promising performance via end-to-end phoneme recognition. One challenge of such end-to-end solutions is the scarcity of human-annotated phonemes on natural L2 speech. In this work, we leverage unlabeled L2 speech via a pseudo-labeling (PL) procedure and extend the fine-tuning approach based on pre-trained self-supervised learning (SSL) models. Specifically, we use Wav2vec 2.0 as our SSL model, and fine-tune it using original labeled L2 speech samples plus the created pseudo-labeled L2 speech samples. Our pseudo labels are dynamic and are produced by an ensemble of the online model on-the-fly, which ensures that our model is robust to pseudo label noise. We show that fine-tuning with pseudo labels achieves a 5.35% phoneme error rate reduction and 2.48% MDD F1 score improvement over a labeled-samples-only fine-tuning baseline. The proposed PL method is also shown to outperform conventional offline PL methods. Compared to the state-of-the-art MDD systems, our MDD solution produces a more accurate and consistent phonetic error diagnosis. In addition, we conduct an open test on a separate UTD-4Accents dataset, where our system recognition outputs show a strong correlation with human perception, based on accentedness and intelligibility.
Wildfires have emerged as one of the most destructive natural disasters worldwide, causing catastrophic losses in both human lives and forest wildlife. Recently, the use of Artificial Intelligence (AI) in wildfires, propelled by the integration of Unmanned Aerial Vehicles (UAVs) and deep learning models, has created an unprecedented momentum to implement and develop more effective wildfire management. Although some of the existing survey papers have explored various learning-based approaches, a comprehensive review emphasizing the application of AI-enabled UAV systems and their subsequent impact on multi-stage wildfire management is notably lacking. This survey aims to bridge this gap by offering a systematic review of the recent state-of-the-art technologies, highlighting the advancements of UAV systems and AI models from pre-fire, through the active-fire stage, to post-fire management. To this end, we provide an extensive analysis of the existing remote sensing systems with a particular focus on the UAV advancements, device specifications, and sensor technologies relevant to wildfire management. We also examine the pre-fire and post-fire management approaches, including fuel monitoring, prevention strategies, evacuation planning, damage assessment, and operation strategies. Additionally, we review and summarize a wide range of computer vision techniques in active-fire management, with an emphasis on Machine Learning (ML), Reinforcement Learning (RL), and Deep Learning (DL) algorithms for wildfire classification, segmentation, detection, and monitoring tasks. Ultimately, we underscore the substantial advancement in wildfire modeling through the integration of cutting-edge AI techniques and UAV-based data, providing novel insights and enhanced predictive capabilities to understand dynamic wildfire behavior.
In recent years, the expansion of internet technology and advancements in automation have brought significant attention to autonomous driving technology. Major automobile manufacturers, including Volvo, Mercedes-Benz, and Tesla, have progressively introduced products ranging from assisted-driving vehicles to semi-autonomous vehicles. However, this period has also witnessed several traffic safety incidents involving self-driving vehicles. For instance, in March 2016, a Google self-driving car was involved in a minor collision with a bus. At the time of the accident, the autonomous vehicle was attempting to merge into the right lane but failed to dynamically respond to the real-time environmental information during the lane change. It incorrectly assumed that the approaching bus would slow down to avoid it, leading to a low-speed collision with the bus. This incident highlights the current technological shortcomings and safety concerns associated with autonomous lane-changing behavior, despite the rapid advancements in autonomous driving technology. Lane-changing is among the most common and hazardous behaviors in highway driving, significantly impacting traffic safety and flow. Therefore, lane-changing is crucial for traffic safety, and accurately predicting drivers' lane change intentions can markedly enhance driving safety. This paper introduces a deep learning-based prediction method for autonomous driving lane change behavior, aiming to facilitate safe lane changes and thereby improve road safety.
We report the discovery of activity emanating from (18916) 2000 OG44 (alternately designated 1977 SD), a minor planet previously reported to be either an extinct comet or an asteroid on a cometary orbit. We observed 2000 OG44 with a thin tail oriented towards the coincident anti-solar and anti-motion vectors (as projected on the sky) in images we acquired on UT 2023 July 24 and 26 with the Apache Point Observatory 3.5-meter Astrophysical Research Consortium telescope (New Mexico, USA). We also include observations made in Arizona with the Vatican Advanced Technology Telescope at the Mount Graham International Observatory and the Lowell Observatory Lowell Discovery Telescope near Happy Jack. We performed dynamical simulations that reveal 2000 OG44 most likely originated in the Oort cloud, arriving within the last 4 Myr. We find 2000 OG44, which crosses the orbits of both Jupiter and Mars, is at present on an orbit consistent with a Jupiter-family comet (JFC). We carried out thermodynamical modeling that informed our broader diagnosis that the observed activity is most likely due to volatile sublimation.
FaSTED, a new algorithm from Northern Arizona University, efficiently computes Euclidean distances on GPUs by fully leveraging mixed-precision Tensor Cores, achieving up to 150 TFLOPS. It delivers 2.5x to 51x speedups over existing GPU methods on real-world datasets with negligible accuracy loss.
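The summary above does not spell out how FaSTED formulates the problem, so the following is only a generic illustration of why distance computation maps well onto matrix-multiply hardware. It uses the standard expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b, where the dot products form a Gram matrix (a matrix product). The function name `pairwise_sq_dists` is hypothetical, and the code is plain CPU Python with no GPU or mixed-precision component.

```python
import math


def pairwise_sq_dists(points):
    """Squared Euclidean distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b."""
    norms = [sum(x * x for x in p) for p in points]
    n = len(points)
    dists = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # One entry of the Gram matrix; on a GPU this is the matmul part.
            dot = sum(a * b for a, b in zip(points[i], points[j]))
            # Clamp at zero so rounding error cannot produce a negative value.
            dists[i][j] = max(norms[i] + norms[j] - 2.0 * dot, 0.0)
    return dists


pts = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
d2 = pairwise_sq_dists(pts)
print(math.sqrt(d2[0][1]), math.sqrt(d2[0][2]))  # prints "5.0 10.0"
```

In reduced precision the subtraction can lose accuracy when points are close together, which is one reason accuracy is usually reported alongside speed for this kind of approach.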
Federated Learning (FL) is a distributed machine learning paradigm that allows clients to train models on their data while preserving their privacy. FL algorithms, such as Federated Averaging (FedAvg) and its variants, have been shown to converge well in many scenarios. However, these methods require clients to upload their local updates to the server in a synchronous manner, which can be slow and unreliable in realistic FL settings. To address this issue, researchers have developed asynchronous FL methods that allow clients to continue training on their local data using a stale global model. However, most of these methods simply aggregate all of the received updates without considering their relative contributions, which can slow down convergence. In this paper, we propose a contribution-aware asynchronous FL method that takes into account the staleness and statistical heterogeneity of the received updates. Our method dynamically adjusts the contribution of each update based on these factors, which can speed up convergence compared to existing methods.
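The abstract does not give the weighting rule, so the sketch below is not the authors' method. It only illustrates the general idea of discounting stale updates in asynchronous aggregation, using a polynomial staleness discount chosen here as an assumption. The names `staleness_weight` and `aggregate`, the `exponent` and `base_lr` parameters, and the list-of-floats model representation are all hypothetical, and statistical heterogeneity is not modeled.

```python
def staleness_weight(staleness: int, exponent: float = 0.5) -> float:
    """Hypothetical polynomial discount: older updates count less."""
    return (1.0 + staleness) ** (-exponent)


def aggregate(global_model, updates, base_lr: float = 0.5):
    """Mix client models into the global model one at a time.

    updates: list of (client_model, staleness) pairs; models are lists of floats.
    """
    model = list(global_model)
    for client_model, staleness in updates:
        # Mixing rate shrinks as the update gets staler (assumed rule).
        alpha = base_lr * staleness_weight(staleness)
        model = [(1.0 - alpha) * g + alpha * c for g, c in zip(model, client_model)]
    return model


fresh = aggregate([0.0, 0.0], [([1.0, 1.0], 0)])  # update based on the current model
stale = aggregate([0.0, 0.0], [([1.0, 1.0], 3)])  # update three rounds out of date
print(fresh, stale)  # prints "[0.5, 0.5] [0.25, 0.25]"
```

The same update moves the global model half as far when it is three rounds stale, which is the kind of per-update contribution adjustment the abstract describes in general terms.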
In recent years, there have been frequent incidents of foreign objects intruding onto railways and airport runways. These objects can include pedestrians, vehicles, animals, and debris. This paper introduces an improved YOLOv5 architecture incorporating FasterNet and attention mechanisms to enhance the detection of foreign objects on railways and airport runways. This study proposes a new dataset, AARFOD (Aero and Rail Foreign Object Detection), which combines two public datasets for detecting foreign objects in aviation and railway systems. The dataset aims to improve the recognition of foreign object targets. Experimental results on this large dataset demonstrate significant performance improvements of the proposed model over the baseline YOLOv5 model while reducing computational requirements. The improved YOLO model increases precision by 1.2%, recall by 1.0%, and mAP@.5 by 0.6%, while mAP@.5-.95 remains unchanged. The number of parameters is reduced by approximately 25.12%, and GFLOPs by about 10.63%. The ablation experiment shows that the FasterNet module significantly reduces the number of model parameters, and that introducing the attention mechanism mitigates the performance loss caused by the lightweight design.