This paper defines and proposes a comprehensive conceptual framework for "Autonomous GIS," an AI-powered next-generation system that leverages generative AI and Large Language Models to automate geospatial problem-solving. It outlines specific autonomous goals, functional components, levels of autonomy, and operational scales, while presenting proof-of-concept GIS agents that demonstrate automated data retrieval, spatial analysis, and cartographic design.
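A minimal sketch of the agent loop such a framework implies: generate code with an LLM, execute it, and feed errors back for self-correction. The `call_llm` helper is a hypothetical placeholder, not an API from the paper:

```python
import os, subprocess, tempfile

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a generative-AI call that returns Python code."""
    raise NotImplementedError("wire this to an LLM provider")

def run_gis_task(task: str, max_attempts: int = 3) -> str:
    prompt = f"Write a standalone Python script that solves this GIS task:\n{task}"
    for _ in range(max_attempts):
        code = call_llm(prompt)
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run(["python", path], capture_output=True, text=True)
        os.unlink(path)
        if result.returncode == 0:    # success: return the script's output
            return result.stdout
        # Feed the error back so the model can self-correct on the next attempt.
        prompt += f"\nThe previous script failed with:\n{result.stderr}\nFix it."
    raise RuntimeError("agent did not produce a working script")
```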
Global challenges such as food supply chain disruptions, public health crises, and natural hazard responses require access to and integration of diverse datasets, many of which are geospatial. Over the past few years, a growing number of (geo)portals have been developed to address this need. However, most existing (geo)portals are built from separate or sparsely connected data "silos," impeding effective data consolidation. A new way of sharing and reusing geospatial data is therefore urgently needed. In this work, we introduce KnowWhereGraph, a knowledge graph-based data integration, enrichment, and synthesis framework that not only includes schemas and data related to human and environmental systems but also provides a suite of supporting tools for accessing this information. KnowWhereGraph aims to address the challenge of data integration by building a large-scale, cross-domain, pre-integrated, FAIR-principles-based, and AI-ready data warehouse rooted in knowledge graphs. We highlight the design principles of KnowWhereGraph, emphasizing the roles of space, place, and time in bridging various data "silos." Additionally, we demonstrate multiple use cases where the proposed geospatial knowledge graph and its associated tools empower decision-makers to uncover insights that are often hidden within complex and poorly interoperable datasets.
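A toy illustration of the bridging idea (not KnowWhereGraph's actual schema or tooling): once records from different silos carry a shared spatial key and time span, cross-domain questions become simple joins:

```python
crop_losses = [   # agricultural silo
    {"cell": "grid-8842", "year": 2021, "crop": "corn", "loss_pct": 35},
]
wildfires = [     # hazards silo
    {"cell": "grid-8842", "year": 2021, "name": "Dixie Fire", "acres": 963309},
]

# "Pre-integration": index every record by its (space, time) bridge key.
graph = {}
for silo in (crop_losses, wildfires):
    for rec in silo:
        graph.setdefault((rec["cell"], rec["year"]), []).append(rec)

# A cross-domain question now reduces to a lookup on the shared key.
for (cell, year), records in graph.items():
    if len(records) > 1:
        print(f"{cell}, {year}: facts from {len(records)} silos:", records)
```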
Process-Based Modeling (PBM) and Machine Learning (ML) are often perceived as distinct paradigms in the geosciences. Here we present differentiable geoscientific modeling as a powerful pathway toward dissolving the perceived barrier between them and ushering in a paradigm shift. For decades, PBM offered benefits in interpretability and physical consistency but struggled to efficiently leverage large datasets. ML methods, especially deep networks, presented strong predictive skills yet lacked the ability to answer specific scientific questions. While various methods have been proposed for ML-physics integration, an important underlying theme -- differentiable modeling -- is not sufficiently recognized. Here we outline the concepts, applicability, and significance of differentiable geoscientific modeling (DG). "Differentiable" refers to accurately and efficiently calculating gradients with respect to model variables, critically enabling the learning of high-dimensional unknown relationships. DG refers to a range of methods connecting varying amounts of prior knowledge to neural networks and training them together, capturing a different scope than physics-guided machine learning and emphasizing first principles. Preliminary evidence suggests DG offers better interpretability and causality than ML, improved generalizability and extrapolation capability, and strong potential for knowledge discovery, while approaching the performance of purely data-driven ML. DG models require less training data while scaling favorably in performance and efficiency with increasing amounts of data. With DG, geoscientists may be better able to frame and investigate questions, test hypotheses, and discover unrecognized linkages.
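A minimal, self-contained sketch of the core DG idea — exact gradients through a process-based simulation — using hand-rolled forward-mode dual numbers and a toy linear-reservoir model. Real DG work uses autodiff frameworks and neural-network components; this only shows why differentiability lets data calibrate an unknown inside a physical model:

```python
class Dual:
    """Carries a value and its derivative w.r.t. one chosen parameter."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def _lift(self, o):
        return o if isinstance(o, Dual) else Dual(o)
    def __add__(self, o):
        o = self._lift(o); return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __sub__(self, o):
        o = self._lift(o); return Dual(self.val - o.val, self.dot - o.dot)
    def __mul__(self, o):
        o = self._lift(o)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def simulate(k, precip):
    """Storage s fills with precipitation and drains as q = k * s."""
    s, flows = Dual(0.0), []
    for p in precip:
        s = s + p
        q = k * s
        s = s - q
        flows.append(q)
    return flows

precip = [1.0, 0.5, 0.0, 2.0, 0.3]
target = [q.val for q in simulate(Dual(0.3), precip)]  # synthetic "observations"

k = 0.8                                    # wrong initial guess for the parameter
for _ in range(500):
    L = Dual(0.0)
    for q, y in zip(simulate(Dual(k, 1.0), precip), target):  # seed dk = 1
        r = q - y
        L = L + r * r
    k -= 0.02 * L.dot                      # gradient descent with exact gradients
print(round(k, 3))                         # recovers ~0.3
```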
We employ a domain decomposition approach with Lagrange multipliers to implement fault slip in a finite-element code, PyLith, for use in both quasi-static and dynamic crustal deformation applications. This integrated approach to solving both quasi-static and dynamic simulations leverages common finite-element data structures and implementations of various boundary conditions, discretization schemes, and bulk and fault rheologies. We have developed a custom preconditioner for the Lagrange multiplier portion of the system of equations that provides excellent scalability with problem size compared to conventional additive Schwarz methods. We demonstrate application of this approach using benchmarks for both quasi-static viscoelastic deformation and dynamic spontaneous rupture propagation that verify the numerical implementation in PyLith.
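For intuition, a tiny numpy sketch of the algebraic structure that Lagrange-multiplier fault slip produces: a saddle-point system in which the multipliers play the role of fault tractions. This illustrates the structure only, not PyLith's finite-element implementation or its custom preconditioner:

```python
import numpy as np

# Two 1-DOF rock "blocks" anchored by springs, with a fault between them on
# which we prescribe relative slip d via the constraint C u = d.
K = np.array([[2.0, 0.0],
              [0.0, 2.0]])      # stiffness of the anchoring springs
C = np.array([[-1.0, 1.0]])     # constraint: u2 - u1 = d  (fault slip)
f = np.zeros(2)                 # no external loads
d = np.array([0.5])             # prescribed slip

# Saddle-point system:  [K  C^T] [u]   [f]
#                       [C   0 ] [l] = [d]
A = np.block([[K, C.T], [C, np.zeros((1, 1))]])
sol = np.linalg.solve(A, np.concatenate([f, d]))
u, lam = sol[:2], sol[2:]
print("displacements:", u)                 # blocks move oppositely: [-0.25, 0.25]
print("fault traction (multiplier):", lam)
```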
We mapped current and future temperature suitability for malaria transmission in Africa using a published model that incorporates nonlinear physiological responses to temperature of the mosquito vector Anopheles gambiae and the malaria parasite Plasmodium falciparum. We found that a larger area of Africa currently experiences the ideal temperature for transmission than previously supposed. Under future climate projections, we predicted a modest increase in the overall area suitable for malaria transmission, but a net decrease in the most suitable area. Combined with population density projections, our maps suggest that areas with temperatures suitable for year-round, highest risk transmission will shift from coastal West Africa to the Albertine Rift between Democratic Republic of Congo and Uganda, while areas with seasonal transmission suitability will shift toward sub-Saharan coastal areas. Mapping temperature suitability places important bounds on malaria transmissibility and, along with local level demographic, socioeconomic, and ecological factors, can indicate where resources may be best spent on malaria control.
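A sketch of the kind of Brière thermal-response curve such physiological models build on. The constants below are placeholders (loosely echoing published limits of roughly 17-34 °C), not the paper's fitted values; the full model multiplies several trait responses, which shifts the joint optimum cooler than any single curve's peak:

```python
import math

def briere(T, c=1.0, T0=17.0, Tm=34.0):
    """Unimodal thermal response: zero outside (T0, Tm), skewed warm."""
    if T <= T0 or T >= Tm:
        return 0.0
    return c * T * (T - T0) * math.sqrt(Tm - T)

# Normalize to the curve's own maximum to get relative suitability.
peak = max(briere(t / 10.0) for t in range(170, 340))
for T in (15, 20, 25, 30, 33):
    print(f"{T} C: relative suitability {briere(T) / peak:.2f}")
```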
This work introduces a novel graph neural network (GNN)-based method to predict stream water temperature and reduce model bias across locations of different income and education levels. Traditional physics-based models often have limited accuracy because they are necessarily approximations of reality. Recently, there has been increasing interest in using GNNs to model complex water dynamics in stream networks. Despite their promise in improving accuracy, GNNs can introduce additional model bias through the aggregation process, in which node features are updated by aggregating neighboring nodes. The bias can be especially pronounced when nodes with similar sensitive attributes are frequently connected. We introduce a new method that leverages physical knowledge to represent node influence in GNNs and then uses this physics-based influence to refine the selection of, and weights over, neighbors. The objective is to facilitate equitable treatment of different sensitive groups in the graph aggregation, which helps reduce spatial bias across locations, especially for those in underprivileged groups. Results on the Delaware River Basin demonstrate the effectiveness of the proposed method in preserving equitable performance across locations in different sensitive groups.
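A plausible reading of the aggregation step in miniature (an assumed formulation, not the authors' exact method): neighbor weights come from a physical quantity such as flow travel time rather than purely learned similarity, so aggregation need not over-weight neighbors that happen to share sensitive attributes:

```python
import numpy as np

h = np.array([[1.0, 0.2],          # node features (e.g., hidden stream states)
              [0.4, 0.9],
              [0.7, 0.5]])
travel_time = {(2, 0): 5.0,        # hours of flow travel j -> i (hypothetical):
               (2, 1): 1.0}        # nodes 0 and 1 both drain into node 2

def aggregate(i, h, travel_time, tau=2.0):
    """Update node i from upstream neighbors, weighted by physical influence."""
    nbrs = [j for (dst, j) in travel_time if dst == i]
    if not nbrs:
        return h[i]
    w = np.array([np.exp(-travel_time[(i, j)] / tau) for j in nbrs])
    w /= w.sum()                   # faster-arriving water counts more
    return h[i] + w @ h[nbrs]

print(aggregate(2, h, travel_time))
```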
This paper presents GeoFlood, a new open-source software package for solving the shallow-water equations (SWE) on a quadtree hierarchy of mapped, logically Cartesian grids managed by the parallel, adaptive library ForestClaw (Calhoun and Burstedde, 2017). The GeoFlood model is validated using standard benchmark tests from Neelz and Pender (2013) as well as the historical Malpasset dam failure. The benchmark test results are compared against those obtained from GeoClaw (Clawpack Development Team, 2020) and the software package HEC-RAS (Hydraulic Engineering Center River Analysis System, Army Corps of Engineers) (Brunner, 2018). The Malpasset outburst flood results are compared with those presented in George (2011) (obtained from the GeoClaw software), model results from Hervouet and Petitjean (1999), and empirical data. The comparisons validate GeoFlood's capabilities for idealized benchmarks compared to other commonly used models as well as its ability to efficiently simulate highly dynamic floods in complex terrain, consistent with historical field data. Because it is massively parallel and scalable, GeoFlood may be a valuable tool for efficiently computing large-scale flooding problems at very high resolutions.
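For orientation, a minimal 1-D shallow-water dam-break solver (first-order Lax-Friedrichs on a single uniform grid). GeoFlood itself solves the 2-D SWE with wave-propagation Riemann solvers on adaptive quadtree grids; this sketch only illustrates the equations being solved:

```python
import numpy as np

g, nx, dx = 9.81, 200, 0.5
h = np.where(np.arange(nx) * dx < 50.0, 2.0, 1.0)   # dam-break initial depth
hu = np.zeros(nx)                                    # fluid initially at rest

def flux(h, hu):
    u = hu / h
    return np.array([hu, hu * u + 0.5 * g * h * h])

t, t_end = 0.0, 5.0
while t < t_end:
    u = hu / h
    dt = 0.4 * dx / np.max(np.abs(u) + np.sqrt(g * h))   # CFL condition
    U = np.array([h, hu])
    F = flux(h, hu)
    # Lax-Friedrichs interface fluxes between cells i and i+1
    Fi = 0.5 * (F[:, :-1] + F[:, 1:]) - 0.5 * (dx / dt) * (U[:, 1:] - U[:, :-1])
    U[:, 1:-1] -= (dt / dx) * (Fi[:, 1:] - Fi[:, :-1])   # interior update
    h, hu = U[0], U[1]                                   # end cells held fixed
    t += dt

print("max depth:", h.max(), "min depth:", h.min())
```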
Water temperature can vary substantially even across short distances within the same sub-watershed. Accurate prediction of stream water temperature at fine spatial resolutions (i.e., fine scales, ≤1 km) enables precise interventions to maintain water quality and protect aquatic habitats. Although spatiotemporal models have made substantial progress in spatially coarse time series modeling, challenges persist in predicting at fine spatial scales due to the lack of data at that scale. To address the problem of insufficient fine-scale data, we propose a Multi-Scale Graph Learning (MSGL) method. This method employs a multi-task learning framework where coarse-scale graph learning, bolstered by larger datasets, simultaneously enhances fine-scale graph learning. Although existing multi-scale or multi-resolution methods integrate data from different spatial scales, they often overlook the spatial correspondences across graph structures at various scales. To address this, our MSGL introduces an additional learning task, cross-scale interpolation learning, which leverages the hydrological connectedness of stream locations across coarse- and fine-scale graphs to establish cross-scale connections, thereby enhancing overall model performance. Furthermore, we move beyond the assumption that multi-scale learning must be synchronous by proposing an Asynchronous Multi-Scale Graph Learning method (ASYNC-MSGL). Extensive experiments demonstrate the state-of-the-art performance of our method for anti-sparse downscaling of daily stream temperatures in the Delaware River Basin, USA, highlighting its potential utility for water resources monitoring and management.
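A conceptual sketch of what such a multi-task objective could look like (an assumed form, for intuition only): fine-scale and coarse-scale losses are trained jointly, plus a cross-scale term tying hydrologically connected fine nodes to their coarse counterpart:

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

pred_fine = np.array([14.2, 15.1, 13.8])     # fine-scale reach predictions (C)
obs_fine = np.array([14.0, 15.4, 13.5])      # sparse fine-scale observations
pred_coarse = np.array([14.6])               # coarse-scale prediction
obs_coarse = np.array([14.4])                # abundant coarse-scale data

# Cross-scale interpolation: the coarse node's value should be recoverable
# from the hydrologically connected fine nodes (here, a simple mean).
fine_to_coarse = pred_fine.mean(keepdims=True)

loss = (mse(pred_fine, obs_fine)
        + mse(pred_coarse, obs_coarse)
        + 0.5 * mse(fine_to_coarse, pred_coarse))   # cross-scale consistency
print(f"multi-task loss: {loss:.4f}")
```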
Choices in scientific research and management require balancing multiple, often competing objectives. Multiple-objective optimization (MOO) provides a unifying framework for solving such problems. Model selection is a critical component of scientific inference and prediction and concerns balancing the competing objectives of model fit and model complexity. The tradeoff between model fit and model complexity provides a basis for describing the model-selection problem within the MOO framework. We discuss MOO and two strategies for solving the MOO problem: modeling preferences pre-optimization and post-optimization. Most model selection methods are consistent with solving MOO problems via specification of preferences pre-optimization, and we reconcile these methods within the MOO framework. We also consider model selection using post-optimization specification of preferences, that is, by first identifying Pareto-optimal solutions and then selecting among them. We demonstrate these concepts with an ecological application of model selection using avian species richness data in the continental United States.
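A compact sketch of the two routes, using hypothetical models scored by parameter count and log-likelihood: AIC acts as a pre-optimization preference (a scalarization of fit and complexity), while the post-optimization route first extracts the Pareto set:

```python
# models: (name, number of parameters k, maximized log-likelihood logL)
models = [("M1", 2, -120.0), ("M2", 5, -110.0),
          ("M3", 9, -109.5), ("M4", 12, -109.6)]

# Pre-optimization preference: AIC = 2k - 2 logL scalarizes the two objectives.
aic_best = min(models, key=lambda m: 2 * m[1] - 2 * m[2])
print("AIC choice:", aic_best[0])

# Post-optimization: keep models no other model beats on both objectives.
def dominated(m, others):
    return any(o[1] <= m[1] and o[2] >= m[2] and (o[1], o[2]) != (m[1], m[2])
               for o in others)

pareto = [m for m in models if not dominated(m, models)]
print("Pareto set:", [m[0] for m in pareto])   # M4 is dominated by M3
```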
Mobile gravimetry is important in metrology, navigation, geodesy, and geophysics. Atomic gravimeters could be among the most accurate mobile gravimeters, but are currently constrained by being complex and fragile. Here, we demonstrate a mobile atomic gravimeter, measuring tidal gravity variations in the laboratory as well as surveying gravity in the field. The tidal gravity measurements achieve a sensitivity of 37 μGal/√Hz and a long-term stability of better than 2 μGal, revealing ocean tidal loading effects and recording several distant earthquakes. We survey gravity in the Berkeley Hills with an accuracy of around 0.04 mGal and determine the density of the subsurface rocks from the vertical gravity gradient. With simplicity and sensitivity, our instrument paves the way for bringing atomic gravimeters to field applications.
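As a back-of-envelope illustration of the density step, here is the standard Nettleton/Bouguer-slab reduction (not necessarily the authors' exact procedure, and the measured gradient below is made up):

```python
import math

G = 6.674e-11                    # gravitational constant, m^3 kg^-1 s^-2
FREE_AIR = -0.3086               # free-air gradient, mGal per meter of height

measured_gradient = -0.200       # hypothetical surveyed gradient, mGal/m

# Going uphill, gravity falls at the free-air rate, but the extra slab of rock
# below adds 2*pi*G*rho per meter:  measured = FREE_AIR + 2*pi*G*rho.
slab_per_density = 2 * math.pi * G * 1e5    # mGal/m per (kg/m^3)
rho = (measured_gradient - FREE_AIR) / slab_per_density
print(f"inferred density: {rho:.0f} kg/m^3 (~{rho / 1000:.2f} g/cm^3)")
```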
In order to search for evidence of hydration on M-type asteroid (16) Psyche, we observed this object in the 3 micron spectral region using the long-wavelength cross-dispersed (LXD: 1.9-4.2 micron) mode of the SpeX spectrograph/imager at the NASA Infrared Telescope Facility (IRTF). Our observations show that Psyche exhibits a 3 micron absorption feature, attributed to water or hydroxyl. The 3 micron absorption feature is consistent with the hydration features found on the surfaces of water-rich asteroids, attributed to OH- and/or H2O-bearing phases (phyllosilicates). The detection of a 3 micron hydration absorption band on Psyche suggests that this asteroid may not be a metallic core, or that it is a metallic core that has been impacted by carbonaceous material over the past 4.5 Gyr. Our results also indicate rotational spectral variations, which we suggest reflect heterogeneity in the metal/silicate ratio on the surface of Psyche.
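The standard continuum-relative band-depth calculation used to quantify such hydration features, with made-up reflectance values:

```python
def band_depth(r_band, r_continuum):
    """Continuum-relative absorption: 0 = no feature, larger = deeper band."""
    return 1.0 - r_band / r_continuum

r_cont = 0.145   # hypothetical continuum reflectance interpolated across 3 um
r_band = 0.138   # hypothetical reflectance at the band center
print(f"3-um band depth: {band_depth(r_band, r_cont):.1%}")
```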
Point counts (PCs) are widely used in biodiversity surveys, but despite numerous advantages, simple PCs suffer from several problems: detectability, and therefore abundance, is unknown; systematic spatiotemporal variation in detectability produces biased inferences, and unknown survey area prevents formal density estimation and scaling-up to the landscape level. We introduce integrated distance sampling (IDS) models that combine distance sampling (DS) with simple PC or detection/nondetection (DND) data and capitalize on the strengths and mitigate the weaknesses of each data type. Key to IDS models is the view of simple PC and DND data as aggregations of latent DS surveys that observe the same underlying density process. This enables estimation of separate detection functions, along with distinct covariate effects, for all data types. Additional information from repeat or time-removal surveys, or variable survey duration, enables separate estimation of the availability and perceptibility components of detectability. IDS models reconcile spatial and temporal mismatches among data sets and solve the above-mentioned problems of simple PC and DND data. To fit IDS models, we provide JAGS code and the new IDS() function in the R package unmarked. Extant citizen-science data generally lack adjustments for detection biases, but IDS models address this shortcoming, thus greatly extending the utility and reach of these data. In addition, they enable formal density estimation in hybrid designs, which efficiently combine distance sampling with distance-free, point-based PC or DND surveys. We believe that IDS models have considerable scope in ecology, management, and monitoring.
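A sketch of the distance-sampling machinery IDS models share across data types: a half-normal detection function and the average detection probability it implies within a point-count circle (illustrative numbers, not values from the paper):

```python
import math

def p_detect(d, sigma):
    """Half-normal detection probability at distance d."""
    return math.exp(-d * d / (2 * sigma * sigma))

def pbar(B, sigma):
    """Average detectability in a circle of radius B: p_detect integrated
    against the triangular distance density 2d / B^2 (closed form)."""
    return (2 * sigma**2 / B**2) * (1 - math.exp(-(B**2) / (2 * sigma**2)))

sigma, B = 70.0, 150.0           # detection scale and survey radius, meters
print(f"mean detectability: {pbar(B, sigma):.2f}")
```

Simple PC and DND data lack distances, but viewed as latent DS surveys they share this same detection scale, which is what lets IDS models borrow strength across data types.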
Survival is a key life history parameter that can inform management decisions and life history research. Because true survival is often confounded with permanent and temporary emigration from the study area, many studies must estimate apparent survival (i.e., the probability of surviving and remaining inside the study area), which can be much lower than true survival for highly mobile species. One method for estimating true survival is the Barker joint live-recapture/live-resight (JLRLR) model, which combines capture data from a study area (hereafter the capture site) with resighting data from a broader geographic area. This model assumes that live resights occur throughout the entire area to which animals can disperse, an assumption often not met in practice. Here we use simulation to evaluate survival bias from a JLRLR model under study design scenarios that differ in the site selection for resights: global, random, fixed including the capture site, and fixed excluding the capture site. Simulation results indicate that fixed designs that included the capture site showed negative survival bias, whereas fixed designs that excluded the capture site exhibited positive survival bias. The magnitude of the bias depended on movement and survival; scenarios with high survival and frequent movement had minimal bias. In an effort to help minimize bias, we developed a multistate version of the JLRLR model and demonstrated reductions in survival bias compared to the single-state version for most designs. Our results suggest bias can be minimized by: 1) using a random resight design when feasible and global sampling is not possible, 2) using the multistate JLRLR model when appropriate, 3) including the capture site in the resight sampling frame when possible, and 4) reporting survival as apparent survival if fixed sites are used for resights with the single-state JLRLR model.
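A toy simulation of the underlying problem — apparent survival understating true survival when animals emigrate — in a deliberately simplified setting (not the Barker JLRLR model itself):

```python
import random

random.seed(1)
TRUE_S, EMIGRATE, N = 0.85, 0.20, 5000   # survival, emigration, marked animals

survived = resighted_on_site = 0
for _ in range(N):
    if random.random() < TRUE_S:          # animal survives the interval...
        survived += 1
        if random.random() > EMIGRATE:    # ...and stays on the study area
            resighted_on_site += 1

print(f"true survival:     {survived / N:.3f}")
print(f"apparent survival: {resighted_on_site / N:.3f}")   # ~ S * (1 - E)
```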
Planets are expected to conclude their growth through a series of giant impacts: energetic, global events that significantly alter planetary composition and evolution. Computer models and theory have elucidated the diverse outcomes of giant impacts in detail, improving our ability to interpret collision conditions from observations of their remnants. However, many open questions remain, as even the formation of the Moon, a widely suspected giant-impact product for which we have the most information, is still debated. We review giant-impact theory, the diverse nature of giant-impact outcomes, and the governing physical processes. We discuss the importance of computer simulations, informed by experiments, for accurately modeling the impact process. Finally, we outline how the application of probability theory and computational advancements can assist in inferring collision histories from observations, and we identify promising opportunities for advancing giant-impact theory in the future.
• Giant impacts exhibit diverse possible outcomes leading to changes in planetary mass, composition, and thermal history depending on the conditions.
• Improvements to computer simulation methodologies and new laboratory experiments provide critical insights into the detailed outcomes of giant impacts.
• When colliding planets are similar in size, they can merge or escape one another with roughly equal probability, but with different effects on their resulting masses, densities, and orbits.
• Different sequences of giant impacts can produce similar planets, encouraging the use of probability theory to evaluate distinct formation hypotheses.
The California Community Earth Models for Seismic Hazard Assessments Workshop (this https URL, accessed December 16, 2024) was held online on March 4-5, 2024, with more than 200 participants over two days. In this report, we provide a summary of the key points from the presentations and discussions. We highlight three use cases that drive the development of community Earth models, present an inventory of existing community Earth models in California, summarize a few techniques for integrating and merging models, discuss potential connections with the Cascadia Region Earthquake Science Center (CRESCENT), and discuss what "community" means in community Earth models. Appendix B contains the workshop agenda and Appendix C contains a list of participants.
Accurate prediction of water temperature in streams is critical for monitoring and understanding biogeochemical and ecological processes in streams. Stream temperature is affected by weather patterns (such as solar radiation) and water flowing through the stream network. Additionally, stream temperature can be substantially affected by water releases from man-made reservoirs to downstream segments. In this paper, we propose a heterogeneous recurrent graph model to represent these interacting processes that underlie stream-reservoir networks and improve the prediction of water temperature in all river segments within a network. Because reservoir release data may be unavailable for certain reservoirs, we further develop a data assimilation mechanism to adjust the deep learning model states to correct for the prediction bias caused by reservoir releases. A well-trained temporal modeling component is needed in order to use adjusted states to improve future predictions. Hence, we also introduce a simulation-based pre-training strategy to enhance the model training. Our evaluation for the Delaware River Basin has demonstrated the superiority of our proposed method over multiple existing methods. We have extensively studied the effect of the data assimilation mechanism under different scenarios. Moreover, we show that the proposed method using the pre-training strategy can still produce good predictions even with limited training data.
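A sketch of the data-assimilation idea in its simplest form (an assumed formulation, not the paper's exact mechanism): nudge the model's hidden state to shrink the observed prediction error before issuing the next forecast:

```python
import numpy as np

state = np.array([0.40, -0.10])    # hidden state of the temporal model
readout = np.array([10.0, 4.0])    # linear map from state to temperature (C)

y_pred = readout @ state           # model expects 3.6 C
y_obs = 2.9                        # observed: cooler, e.g. a reservoir release
gain = 0.1                         # assimilation step size

# Move the state along the readout direction to shrink the prediction error.
state += gain * (y_obs - y_pred) * readout / (readout @ readout)
print("adjusted prediction:", readout @ state)   # 3.53 C, closer to observed
```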
We discuss the DISORT-based radiative transfer pipeline ('CRISM_LambertAlb') for atmospheric and thermal correction of MRO/CRISM data acquired in multispectral mapping mode (~200 m/pixel, 72 spectral channels). Currently, in this phase-one version of the system, we use aerosol optical depths, surface temperatures, and lower-atmospheric temperatures, all from climatology derived from Mars Global Surveyor Thermal Emission Spectrometer (MGS-TES) data, and surface altimetry derived from the MGS Mars Orbiter Laser Altimeter (MOLA). The DISORT-based model takes as input the dust and ice aerosol optical depths (scaled to the CRISM wavelength range), the surface pressures (computed from MOLA altimetry, MGS-TES lower-atmospheric thermometry, and Viking-based pressure climatology), the surface temperatures, the reconstructed instrumental photometric angles, and the measured I/F spectrum, and then outputs a Lambertian albedo spectrum. The Lambertian albedo spectrum is valuable geologically since it allows the mineralogical composition to be estimated. Here, I/F is defined as the ratio of the radiance measured by CRISM to the solar irradiance at Mars divided by π. After discussing the capabilities and limitations of the pipeline software system, we demonstrate its application on several multispectral data cubes: the outer northern ice cap of Mars, Tyrrhena Terra, and near the landing site for the Phoenix mission. For the icy spectra near the northern polar cap, aerosols need to be included in order to properly correct for the CO2 absorption in the H2O ice bands at wavelengths near 2.0 μm. In future phases of software development, we intend to use CRISM data directly in order to retrieve spatiotemporal maps of aerosol optical depths, surface pressure, and surface temperature.
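The I/F definition from the abstract as a one-line computation, with hypothetical channel values:

```python
import math

def i_over_f(radiance, solar_irradiance_at_mars):
    """radiance: W m^-2 sr^-1 um^-1; irradiance: W m^-2 um^-1."""
    return radiance / (solar_irradiance_at_mars / math.pi)

print(f"I/F = {i_over_f(12.0, 260.0):.3f}")   # hypothetical channel values
```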
A fundamental question during the outbreak of a novel disease or invasion of an exotic pest is: At what location and date was it first introduced? With this information, future introductions can be anticipated and perhaps avoided. Point process models are commonly used for mapping species distribution and disease occurrence. If the time and location of introductions were known, then point process models could be used to map and understand the factors that influence introductions; however, rarely is the process of introduction directly observed. We propose embedding a point process within hierarchical Bayesian models commonly used to understand the spatio-temporal dynamics of invasion. Including a point process within a hierarchical Bayesian model enables inference regarding the location and date of introduction from indirect observation of the process such as species or disease occurrence records. We illustrate our approach using disease surveillance data collected to monitor white-nose syndrome, which is a fungal disease that threatens many North American species of bats. We use our model and surveillance data to estimate the location and date that the pathogen was introduced into the United States. Finally, we compare forecasts from our model to forecasts obtained from state-of-the-art regression-based statistical and machine learning methods. Our results show that the pathogen causing white-nose syndrome was most likely introduced into the United States 4 years prior to the first detection, but there is a moderate level of uncertainty in this estimate. The location of introduction could be up to 510 km east of the location of first discovery, but our results indicate that there is a relatively high probability the location of first detection could be the location of introduction.
In the last few years Cassini-VIMS, the Visible and Infrared Mapping Spectrometer, has returned a comprehensive view of Saturn's icy satellites and rings. After having analyzed the satellites' spectral properties (Filacchione et al. (2007a)) and their distribution across the satellites' hemispheres (Filacchione et al. (2010)), we proceed in this paper to investigate the radial variability of the average spectral properties of the icy satellites (principal and minor) and main rings. This analysis is done by using 2,264 disk-integrated observations of the satellites and a 12x700-pixel radial mosaic of the rings acquired with a spatial resolution of about 125 km/pixel. The comparative analysis of these data allows us to retrieve the amount of both water ice and red contaminant materials distributed across Saturn's system and the typical surface regolith grain sizes. These measurements highlight striking differences within the population analyzed here, which varies from the almost uncontaminated, water ice-rich surfaces of Enceladus and Calypso to the metal/organic-rich, red surfaces of Iapetus' leading hemisphere and Phoebe. Ring spectra appear redder than those of the icy satellites in the visible range but show more intense 1.5-2.0 micron band depths. The correlations among spectral slopes, band depths, visual albedo, and phase permit us to cluster the Saturnian population into different spectral classes, which are detected not only among the principal satellites and rings but among the co-orbital minor moons as well. Finally, we have applied Hapke's theory to retrieve the best spectral fits to Saturn's inner regular satellites, using the same methodology applied previously to the Rhea data discussed in Ciarniello et al. (2011).
Accurately quantifying sediment transport rates in rivers remains an important goal for geomorphologists, hydraulic engineers, and environmental scientists. However, current techniques for measuring transport rates are laborious, and formulae to predict transport are notoriously inaccurate. Here, we attempt to estimate sediment transport rates using luminescence, a property of common sedimentary minerals that is used by the geoscience community for geochronology. This method is advantageous because of the ease of measurement on ubiquitous quartz and feldspar sand. We develop a model based on conservation of energy and sediment mass to explain the patterns of luminescence in river channel sediment from a first-principles perspective. We show that the model can accurately reproduce the luminescence observed in previously published field measurements from two rivers with very different sediment transport styles. The parameters from the model can then be used to estimate the time-averaged virtual velocity, characteristic transport lengthscales, storage timescales, and floodplain exchange rates of fine sand-sized sediment in a fluvial system. The values obtained from the luminescence method appear to fall within expected ranges based on published compilations. However, caution is warranted when applying the model as the complex nature of sediment transport can sometimes invalidate underlying simplifications.
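A toy version of the kind of first-order balance such a model implies (an assumed form for illustration, not the paper's actual equations): sunlight bleaches luminescence during transport over a characteristic length, while storage lets the signal regrow toward an equilibrium:

```python
import math

def luminescence(x_km, L0=1.0, L_eq=0.2, x_star=15.0):
    """Relative signal after x_km of transport, relaxing from L0 toward the
    equilibrium L_eq set by bleaching in transit vs. dosing in storage."""
    return L_eq + (L0 - L_eq) * math.exp(-x_km / x_star)

for x in (0, 5, 15, 50):
    print(f"{x:>3} km downstream: L = {luminescence(x):.2f}")
```

Fitting the characteristic length (here the hypothetical x_star) and storage terms to downstream luminescence profiles is what yields transport lengthscales, virtual velocities, and storage timescales.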