Magnetic Resonance Imaging with tagging (tMRI) has long been utilized for quantifying tissue motion and strain during deformation. However, a phenomenon known as tag fading, a gradual decrease in tag visibility over time, often complicates post-processing. The first contribution of this study is to model tag fading by considering the interplay between $T_1$ relaxation and the repeated application of radio frequency (RF) pulses during serial imaging sequences, a factor that has been overlooked in prior research on tMRI post-processing. Further, we have observed an emerging trend of using raw tagged MRI within deep learning-based (DL) registration frameworks for motion estimation. In this work, we evaluate and analyze the impact of commonly used image similarity objectives when training DL registration models on raw tMRI, and compare them with the Harmonic Phase-based approach, a traditional method claimed to be robust to tag fading. Our findings, derived from both simulated images and an actual phantom scan, reveal the limitations of various similarity losses on raw tMRI and emphasize caution in registration tasks where image intensity changes over time.
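The interplay between $T_1$ relaxation and repeated RF pulses can be sketched with the standard fading recursion for a tagged magnetization component: each imaging pulse of flip angle $\alpha$ scales the tag modulation by $\cos\alpha$, and $T_1$ relaxation shrinks it by $e^{-TR/T_1}$ between pulses. This is a simplified textbook stand-in, not necessarily the paper's exact model:

```python
import math

def tag_contrast(n, flip_deg, tr, t1, a0=1.0):
    """Tag modulation amplitude after n imaging RF pulses.
    Each pulse scales the tagged component by cos(flip angle);
    T1 relaxation shrinks it by exp(-TR/T1) between pulses.
    A simplified stand-in for the paper's tag-fading model."""
    decay_per_pulse = math.cos(math.radians(flip_deg)) * math.exp(-tr / t1)
    return a0 * decay_per_pulse ** n

# tag visibility fades as the serial acquisition proceeds
fade = [tag_contrast(n, flip_deg=10, tr=0.03, t1=0.85) for n in (0, 20, 80)]
```

With typical myocardial values (here $T_1 \approx 0.85$ s, $TR = 30$ ms, $10^\circ$ flips, all illustrative numbers), the modulation decays monotonically, which is exactly the intensity drift that trips up intensity-based similarity losses.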
This paper offers a systematic overview of Neurosymbolic AI, categorizing methods based on how they integrate neural networks with symbolic AI to combine perception and reasoning. It analyzes various approaches and highlights how this paradigm can enhance explainability, incorporate domain knowledge, and improve safety in AI systems, demonstrating an increase in expert agreement in mental health assessments using knowledge-infused learning.
In the "Beyond Moore's Law" era, with increasing edge intelligence, domain-specific computing embracing unconventional approaches will become increasingly prevalent. At the same time, adopting a variety of nanotechnologies will offer benefits in energy cost, computational speed, reduced footprint, cyber resilience, and processing power. The time is ripe for a roadmap for unconventional computing with nanotechnologies to guide future research, and this collection aims to fill that need. The authors provide a comprehensive roadmap for neuromorphic computing using electron spins, memristive devices, two-dimensional nanomaterials, nanomagnets, and various dynamical systems. They also address other paradigms such as Ising machines, Bayesian inference engines, probabilistic computing with p-bits, processing in memory, quantum memories and algorithms, computing with skyrmions and spin waves, and brain-inspired computing for incremental learning and problem-solving in severely resource-constrained environments. These approaches have advantages over traditional Boolean computing based on von Neumann architecture. As the computational requirements for artificial intelligence grow 50 times faster than Moore's Law for electronics, more unconventional approaches to computing and signal processing will appear on the horizon, and this roadmap will help identify future needs and challenges. In this fertile field, experts aim to present some of the dominant and most promising technologies for unconventional computing that will be around for some time to come. Within a holistic approach, the goal is to provide pathways for solidifying the field and guiding future impactful discoveries.
Large language models (LLMs) excel in speed and adaptability across various reasoning tasks, but they often struggle when strict logic or constraint enforcement is required. In contrast, Large Reasoning Models (LRMs) are specifically designed for complex, step-by-step reasoning, although they come with significant computational costs and slower inference times. To address these trade-offs, we employ and generalize the SOFAI (Slow and Fast AI) cognitive architecture into SOFAI-LM, which coordinates a fast LLM with a slower but more powerful LRM through metacognition. The metacognitive module actively monitors the LLM's performance and provides targeted, iterative feedback with relevant examples. This enables the LLM to progressively refine its solutions without additional model fine-tuning. Extensive experiments on graph coloring and code debugging problems demonstrate that our feedback-driven approach significantly enhances the problem-solving capabilities of the LLM. In many instances, it achieves performance levels that match or even exceed those of standalone LRMs while requiring considerably less time. Additionally, when the LLM and feedback mechanism alone are insufficient, we engage the LRM by providing appropriate information collected during the LLM's feedback loop, tailored to the specific characteristics of the problem domain, which leads to improved overall performance. Evaluations on two contrasting domains, graph coloring (requiring globally consistent solutions) and code debugging (demanding localized fixes), demonstrate that SOFAI-LM enables LLMs to match or outperform standalone LRMs in accuracy while maintaining significantly lower inference time.
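The fast/slow coordination described above can be sketched as a simple control loop: a metacognitive checker gives the fast solver iterative feedback, and escalates to the slow solver with the accumulated trace when the fast path stalls. All names here are hypothetical placeholders, not the SOFAI-LM API:

```python
def sofai_solve(problem, fast_solve, slow_solve, check, max_feedback_rounds=3):
    """Sketch of a SOFAI-style fast/slow loop (names hypothetical):
    the fast solver (LLM) retries with verifier feedback; if it still
    fails after a few rounds, the slow solver (LRM) is invoked with
    the collected failure trace."""
    trace = []
    feedback = None
    for _ in range(max_feedback_rounds):
        candidate = fast_solve(problem, feedback)
        ok, feedback = check(problem, candidate)
        trace.append((candidate, feedback))
        if ok:
            return candidate, "fast"
    # escalate: hand the slow solver the problem plus the failure trace
    return slow_solve(problem, trace), "slow"

# toy demo: the verifier wants problem + 1; the fast solver gets it
# right once it sees any feedback
sol, route = sofai_solve(
    3,
    fast_solve=lambda p, fb: p + 1 if fb else p,
    slow_solve=lambda p, trace: p + 1,
    check=lambda p, c: (c == p + 1, "add one"),
)

# toy demo: a fast solver that never succeeds forces escalation
sol2, route2 = sofai_solve(
    3,
    fast_solve=lambda p, fb: 0,
    slow_solve=lambda p, trace: p + 1,
    check=lambda p, c: (c == p + 1, "try again"),
)
```

The design choice worth noting is that the trace is passed to the slow solver, so the expensive model starts from the fast model's failures rather than from scratch.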
This survey provides the first comprehensive overview of "hallucination" across all major modalities of Foundation Models (text, image, video, and audio), offering a structured understanding of its manifestations, detection, and mitigation strategies. It consolidates research on factual deviations in AI-generated content, highlighting common challenges and emerging solutions across diverse data types.
We tackle a new problem of multi-view camera and subject registration in the bird's eye view (BEV) without pre-given camera calibration. This is a very challenging problem since its only input is several RGB images from different first-person views (FPVs) of a multi-person scene, without the BEV image or the calibration of the FPVs, while the output is a unified plane with the localization and orientation of both the subjects and cameras in a BEV. We propose an end-to-end framework solving this problem, whose main idea can be divided into the following parts: i) a view-transform subject detection module that transforms each FPV into a virtual BEV, including the localization and orientation of each pedestrian; ii) a geometric-transformation-based method to estimate camera localization and view direction, i.e., camera registration in a unified BEV; iii) aggregation of the subjects into the unified BEV using spatial and appearance information. We collect a new large-scale synthetic dataset with rich annotations for evaluation. The experimental results show the remarkable effectiveness of our proposed method.
Assessing the effectiveness of large language models (LLMs) in performing different tasks is crucial for understanding their strengths and weaknesses. This paper presents the Hierarchical Prompting Taxonomy (HPT), grounded in human cognitive principles and designed to assess LLMs by examining the cognitive demands of various tasks. The HPT utilizes the Hierarchical Prompting Framework (HPF), which structures five unique prompting strategies in a hierarchical order based on the cognitive requirements they place on LLMs relative to human mental capabilities. It assesses task complexity with the Hierarchical Prompting Index (HPI), which captures the cognitive competencies of LLMs across diverse datasets and offers insights into the cognitive demands that datasets place on different LLMs. This approach enables a comprehensive evaluation of an LLM's problem-solving abilities and the intricacy of a dataset, offering a standardized metric for task complexity. Extensive experiments with multiple datasets and LLMs show that HPF enhances LLM performance by 2% to 63% compared to baseline performance, with GSM8k being the most cognitively complex task among reasoning and coding tasks (average HPI of 3.20), confirming the effectiveness of HPT. To support future research and reproducibility in this domain, the implementations of HPT and HPF are available here.
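One plausible reading of such an index (an assumption on our part; the abstract does not give the formula) is the lowest prompting level at which the model answers correctly, averaged over a dataset, with a penalty for unsolved tasks:

```python
def hierarchical_prompting_index(results, unsolved_penalty=6):
    """Hypothetical reading of an HPI-style metric (the abstract does
    not define the formula): score each task by the lowest prompting
    level (1 = least cognitively demanding ... 5 = most) at which the
    LLM answers correctly, penalize unsolved tasks, and average over
    the dataset.  `results` maps task id -> list of 5 booleans,
    one per level."""
    scores = []
    for levels in results.values():
        level = next((i + 1 for i, ok in enumerate(levels) if ok),
                     unsolved_penalty)
        scores.append(level)
    return sum(scores) / len(scores)

# two tasks: one solved at level 2, one at level 1 -> index of 1.5
hpi = hierarchical_prompting_index({
    "task_a": [False, True, False, False, False],
    "task_b": [True, False, False, False, False],
})
```

Under this reading, a higher average (such as GSM8k's reported 3.20) means the dataset forces models into the more cognitively demanding prompting strategies.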
The receiver operating characteristic (ROC) curve and its summary measure, the Area Under the Curve (AUC), are well-established tools for evaluating the efficacy of biomarkers in biomedical studies. Compared to the traditional ROC curve, the covariate-adjusted ROC curve allows for individualized evaluation of the biomarker. However, the use of machine learning models has rarely been explored in this context, despite their potential to develop more powerful and sophisticated approaches for biomarker evaluation. The goal of this paper is to propose a framework for neural network-based covariate-adjusted ROC modeling that allows flexible and nonlinear evaluation of the effectiveness of a biomarker to discriminate between two reference populations. The finite-sample performance of our method is investigated through extensive simulation tests under varying dependency structures between biomarkers, covariates, and reference populations. The methodology is further illustrated in a clinical case study that assesses daily physical activity, measured as total activity time (TAC), a proxy for daily step count, as a biomarker to predict mortality at three, five, and eight years. Analyses stratified by sex and adjusted for age and BMI reveal distinct covariate effects on mortality outcomes. These results underscore the importance of covariate-adjusted modeling in biomarker evaluation and highlight TAC's potential as a functional capacity biomarker based on specific individual characteristics.
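The AUC summary measure mentioned above has a simple empirical form: the probability that a randomly chosen diseased subject's biomarker value exceeds a healthy subject's. This is the standard Mann-Whitney estimator, shown here for orientation; it is not the paper's covariate-adjusted estimator:

```python
def empirical_auc(diseased, healthy):
    """Empirical AUC (Mann-Whitney form): the fraction of
    (diseased, healthy) pairs in which the diseased subject's
    biomarker value is higher, counting ties as 1/2."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))
```

The covariate-adjusted version conditions both populations on covariates (here, age, sex, and BMI) before this comparison, which is where the paper's neural-network modeling enters.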
Stein variational gradient descent (SVGD) is a kernel-based and non-parametric particle method for sampling from a target distribution, such as in Bayesian inference and other machine learning tasks. Unlike other particle methods, SVGD does not require estimating the score, which is the gradient of the log-density. However, in practice, SVGD can be slow compared to score-estimation-based sampling algorithms. To design a fast and efficient high-dimensional sampling algorithm with the advantages of SVGD, we introduce accelerated SVGD (ASVGD), based on an accelerated gradient flow in a metric space of probability densities following Nesterov's method. We then derive a momentum-based discrete-time sampling algorithm, which evolves a set of particles deterministically. To stabilize the particles' position update, we also include a Wasserstein metric regularization. This paper extends the conference version \cite{SL2025}. For the bilinear kernel and Gaussian target distributions, we study the kernel and damping parameters that yield an optimal convergence rate for the proposed dynamics. This is achieved by analyzing the linearized accelerated gradient flow at the equilibrium. Interestingly, the optimal parameter is a constant that does not depend on the covariance of the target distribution. For generalized kernel functions, such as the Gaussian kernel, numerical examples with varied target distributions demonstrate the effectiveness of ASVGD compared to SVGD and other popular sampling methods. Furthermore, we show that in the setting of Bayesian neural networks, ASVGD significantly outperforms SVGD in terms of log-likelihood and total iteration time.
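A minimal 1-D sketch of the SVGD direction and a heavy-ball accelerated variant may help fix ideas. This illustrates the momentum principle only; it is not the paper's exact ASVGD discretization and omits its Wasserstein regularization:

```python
import math
import random

def svgd_direction(x, score, h=0.5):
    """SVGD update direction for 1-D particles with an RBF kernel:
    phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * score(x_j)       (attraction)
                               + d/dx_j k(x_j, x_i) ]          (repulsion)"""
    n = len(x)
    phi = []
    for i in range(n):
        s = 0.0
        for j in range(n):
            k = math.exp(-((x[j] - x[i]) ** 2) / (2 * h * h))
            s += k * score(x[j]) + k * (x[i] - x[j]) / (h * h)
        phi.append(s / n)
    return phi

def asvgd(x, score, steps=300, lr=0.1, beta=0.9):
    """Heavy-ball momentum on the SVGD direction -- a sketch of the
    acceleration idea only, not the paper's exact scheme."""
    v = [0.0] * len(x)
    for _ in range(steps):
        phi = svgd_direction(x, score)
        v = [beta * vi + lr * pi for vi, pi in zip(v, phi)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x

# target N(2, 1): score(x) = d/dx log p(x) = -(x - 2)
random.seed(0)
particles = asvgd([random.uniform(-5.0, 5.0) for _ in range(30)],
                  lambda t: -(t - 2.0))
```

Note that only the score function appears, never the density itself, which is what makes SVGD attractive for unnormalized targets; the repulsion term keeps the particles from collapsing to the mode.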
Dwarf galaxies provide powerful laboratories for studying galaxy formation physics. Their early assembly, shallow gravitational potentials, and bursty, clustered star formation histories make them especially sensitive to the processes that regulate baryons through multi-phase outflows. Using high-resolution, cosmological zoom-in simulations of a dwarf galaxy from \textit{the Pandora suite}, we explore the impact of stellar radiation, magnetic fields, and cosmic ray feedback on star formation, outflows, and metal retention. We find that our purely hydrodynamical model without non-thermal physics - in which supernova feedback is boosted to reproduce realistic stellar mass assembly - drives violent, overly enriched outflows that suppress the metal content of the host galaxy. Including radiation reduces the clustering of star formation and weakens feedback. However, the additional incorporation of cosmic rays produces fast, mass-loaded, multi-phase outflows consisting of both ionized and neutral gas components, in better agreement with observations. These outflows, which entrain a denser, more temperate ISM, exhibit broad metallicity distributions while preserving metals within the galaxy. Furthermore, the star formation history becomes more bursty, in agreement with recent JWST findings. These results highlight the essential role of non-thermal physics in galaxy evolution and the need to incorporate it in future galaxy formation models.
We present a novel and practically significant problem, Geo-Contextual Soundscape-to-Landscape (GeoS2L) generation, which aims to synthesize geographically realistic landscape images from environmental soundscapes. Prior audio-to-image generation methods typically rely on general-purpose datasets and overlook geographic and environmental contexts, resulting in unrealistic images that are misaligned with real-world environmental settings. To address this limitation, we introduce a novel geo-contextual computational framework that explicitly integrates geographic knowledge into multimodal generative modeling. We construct two large-scale geo-contextual multimodal datasets, SoundingSVI and SonicUrban, pairing diverse soundscapes with real-world landscape images. We propose SounDiT, a novel Diffusion Transformer (DiT)-based model that incorporates geo-contextual scene conditioning to synthesize geographically coherent landscape images. Furthermore, we propose a practically-informed geo-contextual evaluation framework, the Place Similarity Score (PSS), spanning element-, scene-, and human perception-levels, to measure consistency between input soundscapes and generated landscape images. Extensive experiments demonstrate that SounDiT outperforms existing baselines in both visual fidelity and geographic consistency. Our work not only establishes foundational benchmarks for GeoS2L generation but also highlights the importance of incorporating geographic domain knowledge in advancing multimodal generative models, opening new directions at the intersection of generative AI, geography, urban planning, and environmental sciences.
FDS-GS improves 3D Gaussian Splatting by establishing an explicit relationship between Gaussian density and scale, enhancing the representation of high-frequency details. This approach achieves higher rendering quality with fewer Gaussian primitives by more efficiently allocating them based on scene frequency content.
DPSeg introduces a dual-prompt framework for open-vocabulary semantic segmentation, combining visual and text prompts to mitigate the image-text domain gap. The model employs a cost volume-guided decoder with multi-level feature guidance and a semantic-guided prompt refinement strategy, achieving new state-of-the-art performance across various benchmarks by improving accuracy and fine-grained detail segmentation.
This paper defines and proposes a comprehensive conceptual framework for "Autonomous GIS," an AI-powered next-generation system that leverages generative AI and Large Language Models to automate geospatial problem-solving. It outlines specific autonomous goals, functional components, levels of autonomy, and operational scales, while presenting proof-of-concept GIS agents that demonstrate automated data retrieval, spatial analysis, and cartographic design.
Implicit Neural Representation (INR) has demonstrated remarkable advances in image representation but demands substantial GPU resources. GaussianImage recently pioneered the use of Gaussian Splatting to mitigate this cost; however, its slow training process limits its practicality, and the fixed number of Gaussians per image limits its adaptability to varying information entropy. To address these issues, we propose a generalizable and self-adaptive image representation framework based on 2D Gaussian Splatting. Our method employs a network to quickly generate a coarse Gaussian representation, followed by minimal fine-tuning steps, achieving rendering quality comparable to GaussianImage while significantly reducing training time. Moreover, our approach dynamically adjusts the number of Gaussian points based on image complexity to further enhance flexibility and efficiency in practice. Experiments on the DIV2K and Kodak datasets show that our method matches or exceeds GaussianImage's rendering performance with far fewer iterations and shorter training times. Specifically, our method reduces training time by up to an order of magnitude while achieving superior rendering performance with the same number of Gaussians.
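The underlying representation, an image expressed as a sum of 2-D Gaussians, can be sketched as below. This toy version is isotropic and grayscale; the actual method uses anisotropic covariances, color, and gradient-optimized parameters:

```python
import math

def render_gaussians(h, w, gaussians):
    """Render an h x w grayscale image as a sum of isotropic 2-D
    Gaussians given as (cx, cy, sigma, weight) tuples.  A toy version
    of the 2-D Gaussian Splatting representation."""
    img = [[0.0] * w for _ in range(h)]
    for cx, cy, sigma, weight in gaussians:
        for y in range(h):
            for x in range(w):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                img[y][x] += weight * math.exp(-d2 / (2.0 * sigma * sigma))
    return img

# a single Gaussian centered mid-image
img = render_gaussians(5, 5, [(2.0, 2.0, 1.0, 1.0)])
```

Fitting means optimizing the tuples against a target image; the paper's contribution is predicting a good initial set with a network and letting the count vary with image complexity, instead of optimizing a fixed-size set from scratch.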
The ever-increasing number of detections of gravitational waves (GWs) from compact binaries by the Advanced LIGO and Advanced Virgo detectors allows us to perform ever-more sensitive tests of general relativity (GR) in the dynamical and strong-field regime of gravity. We perform a suite of tests of GR using the compact binary signals observed during the second half of the third observing run of those detectors. We restrict our analysis to the 15 confident signals that have false alarm rates $\leq 10^{-3}\,\mathrm{yr}^{-1}$. In addition to signals consistent with binary black hole (BH) mergers, the new events include GW200115_042309, a signal consistent with a neutron star--BH merger. We find the residual power, after subtracting the best fit waveform from the data for each event, to be consistent with the detector noise. Additionally, we find all the post-Newtonian deformation coefficients to be consistent with the predictions from GR, with an improvement by a factor of ~2 in the -1PN parameter. We also find that the spin-induced quadrupole moments of the binary BH constituents are consistent with those of Kerr BHs in GR. We find no evidence for dispersion of GWs, non-GR modes of polarization, or post-merger echoes in the events that were analyzed. We update the bound on the mass of the graviton, at 90% credibility, to $m_g \leq 2.42 \times 10^{-23}\,\mathrm{eV}/c^2$. The final mass and final spin as inferred from the pre-merger and post-merger parts of the waveform are consistent with each other. The studies of the properties of the remnant BHs, including deviations of the quasi-normal mode frequencies and damping times, show consistency with the predictions of GR. In addition to considering signals individually, we also combine results from the catalog of GW signals to calculate more precise population constraints. We find no evidence in support of physics beyond GR.
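The graviton-mass bound quoted above comes from constraining dispersion of the observed signals; the standard massive-graviton dispersion relation underlying such tests (written from the familiar special-relativistic form) is:

```latex
% A nonzero graviton mass m_g makes the group velocity frequency
% dependent, so low-frequency GW components would lag behind.
E^2 = p^2 c^2 + m_g^2 c^4
\quad\Longrightarrow\quad
\frac{v_g}{c} = \sqrt{1 - \frac{m_g^2 c^4}{E^2}}
\approx 1 - \frac{1}{2}\,\frac{m_g^2 c^4}{E^2}.
```

Observing no such frequency-dependent arrival-time structure across the signal bandwidth is what translates into the upper bound on $m_g$.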
Representational Similarity Analysis (RSA) is a popular method for analyzing neuroimaging and behavioral data. Here we evaluate the accuracy and reliability of RSA in the context of model selection, and compare it to that of regression. Although RSA offers flexibility in handling high-dimensional, cross-modal, and cross-species data, its reliance on a transformation of raw data into similarity structures may result in the loss of critical stimulus-response information. Across extensive simulation studies and empirical analyses, we show that RSA leads to lower model selection accuracy, regardless of sample size, noise level, feature dimensionality, or multicollinearity, relative to regression. While principal component analysis and feature reweighting mitigate RSA's deficits driven by multicollinearity, regression remains superior in accurately distinguishing between models. Empirical data and a follow-up fMRI simulation further support these conclusions. Our findings suggest that researchers should carefully consider which approach to use: RSA is less effective than linear regression for model selection and fitting when direct stimulus-response mappings are available.
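The RSA pipeline that this comparison rests on, transforming raw features into representational dissimilarity matrices (RDMs) and rank-correlating them, can be sketched as follows. This is a generic textbook form, not the authors' exact implementation:

```python
import math

def rdm(features):
    """Representational dissimilarity matrix: pairwise Euclidean
    distances between per-condition feature vectors."""
    n = len(features)
    return [[math.dist(features[i], features[j]) for j in range(n)]
            for i in range(n)]

def upper_triangle(m):
    """Flatten the strict upper triangle (the unique dissimilarities)."""
    return [m[i][j] for i in range(len(m)) for j in range(i + 1, len(m))]

def spearman(a, b):
    """Spearman correlation: rank-transform, then Pearson.
    (No tie handling -- fine for illustration.)"""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = math.sqrt(sum((x - ma) ** 2 for x in ra))
    sb = math.sqrt(sum((y - mb) ** 2 for y in rb))
    return cov / (sa * sb)

def rsa_score(model_feats, neural_feats):
    """RSA model-selection statistic: correlate the two RDMs'
    upper triangles."""
    return spearman(upper_triangle(rdm(model_feats)),
                    upper_triangle(rdm(neural_feats)))

# identical representations give a perfect RSA score
feats = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
perfect = rsa_score(feats, feats)
```

The `rdm` step is the transformation the abstract flags: pairwise distances discard the stimulus-to-response mapping itself, which is why regression, which fits that mapping directly, can select among models more accurately.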
The internet gives the world an open platform to express views and share stories. While this is very valuable, it also makes fake news one of our society's most pressing problems. The manual fact-checking process is time-consuming, which makes it challenging to disprove misleading assertions before they cause significant harm. This is driving interest in automatic fact or claim verification. Some existing datasets aim to support the development of automated fact-checking techniques, but most of them are text-based; multi-modal fact verification has received relatively scant attention. In this paper, we provide a multi-modal fact-checking dataset called FACTIFY 2, improving on FACTIFY 1 by using new data sources and adding satire articles. FACTIFY 2 has 50,000 new data instances. Similar to FACTIFY 1.0, we have three broad categories - support, no-evidence, and refute - with sub-categories based on the entailment of visual and textual data. We also provide a BERT- and Vision Transformer-based baseline, which achieves a 65% F1 score on the test set. The baseline code and the dataset will be made available at this https URL.
In the era of Generative AI, Neurosymbolic AI is emerging as a powerful approach for tasks spanning from perception to cognition. The use of Neurosymbolic AI has been shown to achieve enhanced capabilities, including improved grounding, alignment, explainability, and reliability. However, due to its nascent stage, there is a lack of widely available real-world benchmark datasets tailored to Neurosymbolic AI tasks. To address this gap and support the evaluation of current and future methods, we introduce DSceneKG -- a suite of knowledge graphs of driving scenes built from real-world, high-quality scenes from multiple open autonomous driving datasets. In this article, we detail the construction process of DSceneKG and highlight its application in seven different tasks. DSceneKG is publicly accessible at: this https URL
This paper presents two industry-grade datasets captured during an 8-hour continuous operation of the manufacturing assembly line at the Future Factories Lab, University of South Carolina, on 08/13/2024. The datasets adhere to industry standards, covering communication protocols, actuators, control mechanisms, transducers, sensors, and cameras. Data collection utilized both integrated and external sensors throughout the laboratory, including sensors embedded within the actuators and externally installed devices. Additionally, high-performance cameras captured key aspects of the operation. In a prior experiment [1], a 30-hour continuous run was conducted, during which all anomalies were documented. Maintenance procedures were subsequently implemented to reduce potential errors and operational disruptions. The two datasets include: (1) a time-series analog dataset, and (2) a multi-modal time-series dataset containing synchronized system data and images. These datasets aim to support future research in advancing manufacturing processes by providing a platform for testing novel algorithms without the need to recreate physical manufacturing environments. Moreover, the datasets are open-source and designed to facilitate the training of artificial intelligence models, streamlining research by offering comprehensive, ready-to-use resources for various applications and projects.