Mapping the local and distant Universe is key to our understanding of it. For decades, the Sloan Digital Sky Survey (SDSS) has made a concerted effort to map millions of celestial objects to constrain the physical processes that govern our Universe. The most recent and fifth generation of SDSS (SDSS-V) is organized into three scientific ``mappers'': the Milky Way Mapper (MWM), which aims to chart the various components of the Milky Way and constrain its formation and assembly; the Black Hole Mapper (BHM), which focuses on understanding supermassive black holes in distant galaxies across the Universe; and the Local Volume Mapper (LVM), which uses integral field spectroscopy to map the ionized interstellar medium in the Local Group. This paper outlines the scope and content of the nineteenth data release (DR19) of SDSS, the most substantial SDSS-V release to date and the first to contain data from all three mappers. We also describe nine value added catalogs (VACs) that enhance the science that can be conducted with the SDSS-V data. Finally, we discuss how to access SDSS DR19 and provide illustrative examples and tutorials.
A novel Risk Severity Index (RSI) quantifies the safety and security profiles of nine prominent Large Language Models across 24 harm categories. The empirical analysis reveals widespread vulnerabilities and low refusal rates in LLM safety filters, particularly for cybersecurity-related prompts, with most models exhibiting "Severe" harm potential.
Increasingly, large language models (LLMs) are being used to automate workplace processes requiring a high degree of creativity. While much prior work has examined the creativity of LLMs, there has been little research on whether they can generate valid creativity assessments for humans despite the increasingly central role of creativity in modern economies. We develop a psychometrically inspired framework for creating test items (questions) for a classic free-response creativity test: the creative problem-solving (CPS) task. Our framework, the creative psychometric item generator (CPIG), uses a mixture of LLM-based item generators and evaluators to iteratively develop new prompts for writing CPS items, such that items from later iterations will elicit more creative responses from test takers. We find strong empirical evidence that CPIG generates valid and reliable items and that this effect is not attributable to known biases in the evaluation process. Our findings have implications for employing LLMs to automatically generate valid and reliable creativity tests for humans and AI.
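To make the iterative generator-evaluator idea concrete, here is a minimal sketch of such a loop in Python. It is not the authors' CPIG implementation: the function names (generate_item, score_creativity), the LLM-as-judge scoring proxy, and the selection strategy are all illustrative assumptions.

```python
# Illustrative sketch only: a minimal generator/evaluator loop in the spirit of CPIG.
# generate_item, score_creativity, and the selection strategy are hypothetical placeholders.

def generate_item(llm, seed_items):
    """Ask an LLM to write a new CPS prompt, conditioned on high-scoring seed items."""
    prompt = "Write a new creative problem-solving scenario similar to:\n" + "\n".join(seed_items)
    return llm(prompt)

def score_creativity(llm, item, n_responses=5):
    """Proxy for item quality: average judged creativity of responses the item elicits."""
    responses = [llm(f"Respond creatively to: {item}") for _ in range(n_responses)]
    scores = [float(llm(f"Rate the creativity of this response from 0 to 1: {r}")) for r in responses]
    return sum(scores) / len(scores)

def cpig_loop(llm, seed_items, iterations=3, pool_size=8, keep=3):
    pool = list(seed_items)
    for _ in range(iterations):
        candidates = [generate_item(llm, pool) for _ in range(pool_size)]
        ranked = sorted(candidates, key=lambda it: score_creativity(llm, it), reverse=True)
        pool = ranked[:keep]   # later iterations are seeded only from the strongest items
    return pool

# Usage, given any callable mapping a prompt string to a completion string:
# best_items = cpig_loop(my_llm, ["Your office has banned meetings. How do teams coordinate?"])
```

The design point this illustrates is that item quality is measured indirectly, via the creativity of the responses an item elicits, so each iteration seeds the generator with the items that provoked the most creative answers.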
Internet of Things (IoT) is considered a key enabler of health informatics. IoT-enabled devices are used for in-hospital and in-home patient monitoring to collect and transfer biomedical data pertaining to blood pressure, electrocardiography (ECG), blood sugar levels, body temperature, etc. Among these devices, wearables have found their presence in a wide range of healthcare applications. These devices generate data in real-time and transmit them to nearby gateways and remote servers for processing and visualization. The data transmitted by these devices are vulnerable to a range of adversarial threats, and as such, privacy and integrity need to be preserved. In this paper, we present LightIoT, a lightweight and secure communication approach for data exchanged among the devices of a healthcare infrastructure. LightIoT operates in three phases: initialization, pairing, and authentication. These phases ensure the reliable transmission of data by establishing secure sessions among the communicating entities (wearables, gateways, and a remote server). Statistical results show that our scheme is lightweight and robust, resilient against a wide range of adversarial attacks, and incurs much lower computational and communication overhead for the transmitted data than existing approaches.
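As a point of reference for what a hash-based, computationally light handshake can look like, the sketch below shows a generic HMAC challenge-response between a wearable and a gateway in Python. This is not the LightIoT protocol: the message layout, key handling, and phase mapping are hypothetical, and a real deployment would also need key provisioning, replay protection, and freshness guarantees.

```python
# Generic illustration of a lightweight challenge-response authentication between a
# wearable and a gateway, using only hashing primitives. NOT the LightIoT protocol;
# the key names and message layout are hypothetical.
import hashlib, hmac, os

PSK = os.urandom(32)  # pre-shared key, assumed installed during an initialization phase

def wearable_respond(challenge: bytes, device_id: bytes) -> bytes:
    """Wearable proves knowledge of the pre-shared key without transmitting it."""
    return hmac.new(PSK, challenge + device_id, hashlib.sha256).digest()

def gateway_verify(challenge: bytes, device_id: bytes, response: bytes) -> bool:
    expected = hmac.new(PSK, challenge + device_id, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

def derive_session_key(challenge: bytes, response: bytes) -> bytes:
    """Both sides derive the same session key for the subsequent data transfer."""
    return hashlib.sha256(PSK + challenge + response).digest()

# Pairing/authentication flow (gateway side):
challenge = os.urandom(16)
device_id = b"wearable-01"
response = wearable_respond(challenge, device_id)   # computed on the wearable
assert gateway_verify(challenge, device_id, response)
session_key = derive_session_key(challenge, response)
```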
Let $L/K$ be a Galois extension of fields with Galois group $\Gamma$, and suppose $L/K$ is also an $H$-Hopf Galois extension. Using the recently uncovered connection between Hopf Galois structures and skew left braces, we introduce a method to quantify the failure of surjectivity of the Galois correspondence from sub-Hopf algebras of $H$ to intermediate subfields of $L/K$, given by the Fundamental Theorem of Hopf Galois Theory. Suppose $L \otimes_K H = LN$ where $N \cong (G, \star)$. Then there exists a skew left brace $(G, \star, \circ)$ where $(G, \circ) \cong \Gamma$. We show that there is a bijective correspondence between intermediate fields $E$ between $K$ and $L$ and certain sub-skew left braces of $G$, which we call the $\circ$-stable subgroups of $(G, \star)$. Counting these subgroups and comparing that number with the number of subgroups of $\Gamma \cong (G, \circ)$ describes how far the Galois correspondence for the $H$-Hopf Galois structure is from being surjective. The method is illustrated by a variety of examples.
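For readers unfamiliar with skew left braces, the defining compatibility condition linking the two group operations is the standard one from the skew brace literature, stated here as background rather than quoted from the paper:

```latex
% Compatibility axiom of a skew left brace (G, \star, \circ),
% where a^{-1} denotes the inverse of a in (G, \star):
a \circ (b \star c) \;=\; (a \circ b) \star a^{-1} \star (a \circ c)
\qquad \text{for all } a, b, c \in G .
```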
This study looks at water quality monitoring and management as a new form of community engagement. Through a series of `design hackathons', a unique research method, we engaged with a hyperlocal community of citizens who are actively involved in monitoring and managing their local watershed. These design hackathons sought to understand the motivations, practices, collaboration, and experiences of these citizens. Qualitative analysis of the data revealed the nature of the complex stakeholder network, workflow practices, initiatives to engage with a larger community, the current state of the technological infrastructure in use, and innovative design scenarios proposed by the hackathon participants. Based on this comprehensive analysis, we conceptualize water quality monitoring and management as community-based monitoring and management, and water data as community data. Such a conceptualization sheds light on how these practices can help preempt water crises by empowering citizens through increased awareness, active participation, and informal learning around water data and resources.
Multifractal formalisms provide an apt framework to study random cascades in which multifractal spectrum width $\Delta\alpha$ fluctuates depending on the number of estimable power-law relationships. Then again, multifractality without surrogate comparison can be ambiguous: the original measurement series' multifractal spectrum width $\Delta\alpha_\mathrm{Orig}$ can be sensitive to the series length, ergodicity-breaking linear temporal correlations (e.g., fractional Gaussian noise, $fGn$), or additive cascade dynamics. To test these threats, we built a suite of random cascades that differ by the length, type of noise (i.e., additive white Gaussian noise, $awGn$, or $fGn$), mixtures of $awGn$ or $fGn$ across generations (progressively more $awGn$, progressively more $fGn$, and a random sampling by generation), and operations applying noise (i.e., addition vs. multiplication). The so-called ``multifractal nonlinearity'' $t_\mathrm{MF}$ (i.e., a $t$-statistic comparing $\Delta\alpha_\mathrm{Orig}$ with the multifractal spectrum width of phase-randomized linear surrogates, $\Delta\alpha_\mathrm{Surr}$) is a robust indicator of random multiplicative rather than random additive cascade processes, irrespective of the series length or type of noise. $t_\mathrm{MF}$ is more sensitive to the number of generations than to the series length. Furthermore, the random additive cascades exhibited much stronger ergodicity breaking than all multiplicative analogs. Instead, ergodicity breaking in random multiplicative cascades more closely followed the ergodicity breaking of the constituent noise types -- breaking ergodicity much less when arising from ergodic $awGn$ and more so for noise incorporating relatively more correlated $fGn$. Hence, $t_\mathrm{MF}$ is a robust multifractal indicator of multiplicative cascade processes and not spuriously sensitive to ergodicity breaking.
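For concreteness, one common way to form such a surrogate-based $t$-statistic is shown below; this is a generic construction stated as an assumption, not a quotation of the paper's exact estimator.

```latex
% One common form of the surrogate-comparison t-statistic (assumed, not quoted):
t_\mathrm{MF} \;=\;
\frac{\Delta\alpha_\mathrm{Orig} - \langle \Delta\alpha_\mathrm{Surr} \rangle}
     {\sigma_{\Delta\alpha_\mathrm{Surr}} / \sqrt{N_s}}
```
where $\langle \Delta\alpha_\mathrm{Surr} \rangle$ and $\sigma_{\Delta\alpha_\mathrm{Surr}}$ are the mean and standard deviation of the spectrum widths of $N_s$ phase-randomized surrogates.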
We present TAO, a software testing tool performing automated test and oracle generation based on a semantic approach. TAO entangles grammar-based test generation with automated semantics evaluation using a denotational semantics framework. We show how TAO can be incorporated with the Selenium automation tool for automated web testing, and how TAO can be further extended to support automated delta debugging, where a failing web test script can be systematically reduced based on grammar-directed strategies. A real-life parking website is adopted throughout the paper to demonstrate the effectiveness of our semantics-based web testing approach.
The goal of community detection algorithms is to identify densely connected units within large networks. An implicit assumption is that all the constituent nodes belong equally to their associated community. However, some nodes are more important in the community than others. To date, efforts have primarily been directed at identifying communities as a whole, rather than understanding to what extent an individual node belongs to its community. Therefore, most metrics for evaluating communities, for example modularity, are global: they produce a score for each community, not for each individual node. In this paper, we argue that the belongingness of nodes in a community is not uniform, and we capture it with a vertex-level metric called permanence. The central idea of permanence is based on the observation that the strength of membership of a vertex to a community depends upon two factors: (i) the extent of connections of the vertex within its community versus outside its community, and (ii) how tightly the vertex is connected internally. We discuss how permanence can help us understand and utilize the structure and evolution of communities by demonstrating that it can be used to -- (i) measure the persistence of a vertex in a community, (ii) design strategies to strengthen the community structure, (iii) explore the core-periphery structure within a community, and (iv) select suitable initiators for message spreading. We demonstrate that the process of maximizing permanence produces meaningful communities that concur with the ground-truth community structure of the networks more accurately than eight other popular community detection algorithms. Finally, we show that the communities obtained by this method are (i) less affected by changes in vertex ordering, and (ii) more resilient to the resolution limit, degeneracy of solutions, and asymptotic growth of values.
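As background, a commonly cited closed form for permanence (paraphrased here; the paper should be consulted for the exact definition) combines exactly these two factors:

```latex
% Commonly cited form of vertex permanence (paraphrase, not quoted from the paper):
\mathrm{Perm}(v) \;=\; \frac{I(v)}{E_{\max}(v)} \cdot \frac{1}{D(v)}
\;-\; \bigl(1 - C_{\mathrm{in}}(v)\bigr)
```
where $I(v)$ is the number of neighbors of $v$ inside its community, $E_{\max}(v)$ the maximum number of connections $v$ has to any single external community, $D(v)$ the degree of $v$, and $C_{\mathrm{in}}(v)$ the clustering coefficient among $v$'s internal neighbors.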
Although artificial intelligence is currently one of the most interesting areas of scientific research, the potential threats posed by emerging AI systems remain a source of persistent controversy. To address the issue of AI threat, this study proposes a standard intelligence model that unifies AI and human characteristics in terms of four aspects of knowledge: input, output, mastery, and creation. Using this model, we address three challenges: expanding the von Neumann architecture; testing and ranking the intelligence quotient of naturally and artificially intelligent systems, including humans, Google, Bing, Baidu, and Siri; and dividing artificially intelligent systems into seven grades, from robots to Google Brain. On this basis, we conclude that AlphaGo belongs to the third grade.
ThangDLU's participation in #SMM4H 2024 involved classifying social media text related to social anxiety and childhood medical disorders. The team fine-tuned pre-trained BART and T5 encoder-decoder models and systematically investigated data augmentation techniques, demonstrating improved classification performance over the workshop's average F1 scores for both tasks.
Magnetic fields with strengths ranging from 300 to 500 kG have recently been discovered in a group of four extremely similar helium-enriched hot subdwarf (He-sdO) stars. Besides their strong magnetic fields, these He-sdO stars are characterised by common atmospheric parameters, clustering around $T_\mathrm{eff}$ = 46500 K, $\log g$ close to 6, and intermediate helium abundances. Here we present the discovery of three additional magnetic hot subdwarfs, J123359.44-674929.11, J125611.42-575333.45, and J144405.79-674400.93. These stars are again almost identical in terms of atmospheric parameters but, at $B \approx 200$ kG, their magnetic fields are somewhat weaker than those previously known. The close similarity of all known He-sdOs implies a finely-tuned origin. We propose the merging of an He white dwarf with a H+He white dwarf. Differential rotation at the merger interface may initiate a toroidal magnetic field that evolves by a magnetic dynamo to produce a poloidal field. This field is either directly visible at the surface or may diffuse towards the surface if initially buried. We further discuss a broad absorption line centred at about 4630 Å that is common to all magnetic He-sdOs. This feature may not be related to the magnetic field but instead to the intermediate helium abundances in these He-sdO stars, allowing the strong He II 4686 Å line to be perturbed by collisions with hydrogen atoms.
The spreadsheet application is among the most widely used computing tools in modern society. It provides excellent usability and usefulness, and it easily enables a non-programmer to perform programming-like tasks in a visual tabular "pen and paper" approach. However, spreadsheets are mostly limited to bookkeeping-like applications due to their mono-directional data flow. This paper shows how the spreadsheet computing paradigm is extended to break this limitation for solving constraint satisfaction problems. We present an enhanced spreadsheet system where finite-domain constraint solving is well supported in a visual environment. Furthermore, a spreadsheet-specific constraint language is constructed for general users to specify constraints among data cells in a declarative and scalable way. The new spreadsheet system significantly simplifies the development of many constraint-based applications using a visual tabular interface. Examples are given to illustrate the usability and usefulness of the extended spreadsheet paradigm. KEYWORDS: Spreadsheet computing, Finite-domain constraint satisfaction, Constraint logic programming
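As a rough conceptual analog of what finite-domain constraints over cells look like, the Python snippet below uses the python-constraint library with cell-like variable names; the paper's spreadsheet-specific constraint language and visual interface are, of course, different.

```python
# Conceptual analog only: finite-domain constraints over "cells" via python-constraint.
# Cell names and constraints are illustrative, not the paper's spreadsheet language.
from constraint import Problem

problem = Problem()
problem.addVariables(["A1", "B1", "C1"], range(1, 10))                 # each cell holds a digit 1-9
problem.addConstraint(lambda a, b, c: a + b == c, ("A1", "B1", "C1"))  # C1 = A1 + B1
problem.addConstraint(lambda a, b: a < b, ("A1", "B1"))                # keep A1 below B1

for solution in problem.getSolutions()[:3]:
    print(solution)   # e.g. {'A1': 1, 'B1': 2, 'C1': 3}
```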
We take up an idea from the folklore of Answer Set Programming, namely that choices and integrity constraints, along with a restricted rule format, are sufficient for Answer Set Programming. We elaborate upon the foundations of this idea in the context of the logic of Here-and-There and show how it can be derived from the logical principle of extension by definition. We then provide an austere form of logic programs that may serve as a normal form for logic programs, similar to conjunctive normal form in classical logic. Finally, we take the key ideas and propose a modeling methodology for ASP beginners, illustrating how it can be used.
In edge computing use cases (e.g., smart cities), where several users and devices may be in close proximity to each other, computational tasks with similar input data for the same services (e.g., image or video annotation) may be offloaded to the edge. The execution of such tasks often yields the same results (output) and thus duplicate (redundant) computation. Based on this observation, prior work has advocated for "computation reuse", a paradigm where the results of previously executed tasks are stored at the edge and are reused to satisfy incoming tasks with similar input data, instead of executing these incoming tasks from scratch. However, realizing computation reuse in practical edge computing deployments, where services may be offered by multiple (distributed) edge nodes (servers) for scalability and fault tolerance, is still largely unexplored. To tackle this challenge, in this paper, we present Reservoir, a framework that enables pervasive computation reuse at the edge while imposing marginal overheads on user devices and the operation of the edge network infrastructure. Reservoir takes advantage of Locality Sensitive Hashing (LSH) and runs on top of Named-Data Networking (NDN), extending the NDN architecture to realize the computation reuse semantics in the network. Our evaluation demonstrates that Reservoir can reuse computation with almost perfect accuracy, achieving 4.25-21.34x lower task completion times than cases without computation reuse.
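The snippet below is a minimal sketch, in Python, of the general idea behind LSH-driven computation reuse: tasks whose input feature vectors hash to the same bucket can share a cached result. It is not Reservoir's implementation, and the hash width, table layout, and feature extraction are illustrative assumptions.

```python
# Minimal sketch of LSH-based computation reuse (not Reservoir's actual implementation):
# similar input feature vectors hash to the same bucket, so a stored result can be
# returned instead of re-executing the task.
import numpy as np

class ReuseIndex:
    def __init__(self, dim, n_planes=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_planes, dim))  # random hyperplanes for sign hashing
        self.table = {}                                  # bucket -> cached result

    def _bucket(self, x):
        signs = (self.planes @ x) > 0
        return signs.tobytes()

    def lookup(self, x):
        return self.table.get(self._bucket(x))           # None on a reuse miss

    def store(self, x, result):
        self.table[self._bucket(x)] = result

# Usage: try to reuse, otherwise execute the task and cache its result.
index = ReuseIndex(dim=128)
features = np.random.rand(128)
result = index.lookup(features)
if result is None:
    result = "annotation produced by running the service"  # placeholder for real execution
    index.store(features, result)
```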
A double pivot algorithm that combines features of two recently published papers by these authors is proposed. The proposed algorithm is implemented in MATLAB. The MATLAB code is tested, along with a MATLAB implementation of Dantzig's algorithm, on several test sets, including a set of cycling LP problems, Klee-Minty problems, randomly generated linear programming (LP) problems, and Netlib benchmark problems. The test results show that the proposed algorithm is (a) degeneracy-tolerant, as expected, and (b) more efficient than Dantzig's algorithm, in terms of CPU time, for large randomly generated LP problems, but less efficient for Netlib benchmark problems and small randomly generated problems.
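For context, the baseline being compared against selects pivots with Dantzig's classic rule; a minimal sketch of that rule on a simplex tableau is shown below (this is the baseline only, not the proposed double pivot algorithm).

```python
# Sketch of Dantzig's classic pivot rule on a simplex tableau: the entering column has
# the most negative reduced cost, and the leaving row wins the minimum-ratio test.
import numpy as np

def dantzig_pivot(tableau):
    """tableau: last row = reduced costs, last column = right-hand side."""
    costs = tableau[-1, :-1]
    col = int(np.argmin(costs))
    if costs[col] >= 0:
        return None                      # current basis is optimal
    column = tableau[:-1, col]
    rhs = tableau[:-1, -1]
    ratios = np.where(column > 1e-12, rhs / column, np.inf)
    if np.all(np.isinf(ratios)):
        raise ValueError("problem is unbounded")
    row = int(np.argmin(ratios))
    return row, col                      # pivot position for the next basis exchange
```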
This paper presents a rich knowledge representation language aimed at formalizing causal knowledge. This language is used for accurately and directly formalizing common benchmark examples from the literature of actual causality. A definition of cause is presented and used to analyze the actual causes of changes with respect to sequences of actions representing those examples.
Tor is the most popular anonymous communication overlay network which hides clients' identities from servers by passing packets through multiple relays. To provide anonymity to both clients and servers, Tor onion services were introduced by increasing the number of relays between a client and a server. Because of the limited bandwidth of Tor relays, large numbers of users, and multiple layers of encryption at relays, onion services suffer from high end-to-end latency and low data transfer rates, which degrade user experiences, making onion services unsuitable for latency-sensitive applications. In this paper, we present a UDP-based framework, called DarkHorse, that improves the end-to-end latency and the data transfer overhead of Tor onion services by exploiting the connectionless nature of UDP. Our evaluation results demonstrate that DarkHorse is up to 3.62x faster than regular TCP-based Tor onion services and reduces the Tor network overhead by up to 47%.
Many data analytics systems have adopted a newly emerging compute resource, serverless (SL), to handle data analytics queries in a timely and cost-efficient manner, i.e., serverless data analytics. While these systems can start processing queries quickly thanks to the agility and scalability of SL, they may encounter performance and cost bottlenecks, depending on the workload, because SL offers worse performance at a higher cost than traditional compute resources, e.g., virtual machines (VMs). In this project, we introduce Smartpick, an SL-enabled scalable data analytics system that exploits SL and VM together to realize composite benefits: agility from SL and better performance with reduced cost from VM. Smartpick uses a machine learning prediction scheme, a decision-tree-based Random Forest with a Bayesian optimizer, to determine SL and VM configurations, i.e., how many SL and VM instances to use for queries, that meet cost-performance goals. Smartpick offers a knob for applications to explore the richer cost-performance tradeoff space opened up by exploiting SL and VM together. To maximize the benefits of SL, Smartpick supports a simple but strong mechanism called relay-instances. Smartpick also supports event-driven retraining of the prediction model to deal with workload dynamics. A Smartpick prototype was implemented on Spark and deployed on live test-beds, Amazon AWS and Google Cloud Platform. Evaluation results indicate 97.05% and 83.49% prediction accuracy, respectively, with up to 50% cost reduction compared to the baselines. The results also confirm that Smartpick allows data analytics applications to navigate the richer cost-performance tradeoff space efficiently and to handle workload dynamics effectively and automatically.
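The sketch below illustrates, in Python with scikit-learn, the general shape of such a configuration predictor: a Random Forest mapping query features to (SL, VM) instance counts. It is not Smartpick's code; the features, training data, and targets are hypothetical, and the Bayesian optimizer used for tuning is omitted.

```python
# Illustrative sketch only (not Smartpick's code): a Random Forest that maps query
# features to a (serverless, VM) instance configuration expected to meet a goal.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical training data: [input_size_GB, query_complexity, deadline_s, cost_budget]
X = np.array([[ 5, 1,  30, 0.5],
              [50, 3, 120, 2.0],
              [20, 2,  60, 1.0],
              [80, 4, 300, 5.0]])
# Targets: [n_serverless_instances, n_vm_instances] observed to satisfy the goal.
y = np.array([[40, 0],
              [20, 4],
              [30, 2],
              [10, 8]])

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

new_query = np.array([[35, 2, 90, 1.5]])
n_sl, n_vm = model.predict(new_query)[0]
print(f"launch ~{round(n_sl)} serverless functions and ~{round(n_vm)} VMs")
```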
An efficient and flexible engine for computing fixed points is critical for many practical applications. In this paper, we first present a goal-directed fixed point computation strategy in the logic programming paradigm. The strategy adopts a tabled resolution (or memoized resolution) to mimic the efficient semi-naive bottom-up computation. Its main idea is to dynamically identify and record those clauses that will lead to recursive variant calls, and then repetitively apply those alternatives incrementally until the fixed point is reached. Second, there are many situations in which a fixed point contains a large, or even infinite, number of solutions. In these cases, a fixed point computation engine may not be efficient enough or feasible at all. We present a mode-declaration scheme which provides the capability to reduce a fixed point from a big solution set to a preferred small one, or from an infeasible infinite set to a finite one. The mode-declaration scheme can be characterized as a meta-level operation over the original fixed point, and we show its correctness. Third, the mode-declaration scheme provides a new declarative method for dynamic programming, which is typically used for solving optimization problems. There is no need to define the value of an optimal solution recursively; instead, defining a general solution suffices. The optimal value, as well as its corresponding concrete solution, can be derived implicitly and automatically using a mode-directed fixed point computation engine. Finally, this fixed point computation engine has been successfully implemented in a commercial Prolog system. Experimental results indicate that the mode declaration improves both time and space performance in solving dynamic programming problems.
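A loose Python analog of the mode-declaration idea is sketched below: the recursion defines general solutions (all coin sequences summing to an amount), and a separate "mode-like" step reduces that solution set to the preferred one, instead of defining the optimum recursively. In the actual mode-directed tabling engine this reduction happens inside the tables, so non-preferred answers are never enumerated; the sketch enumerates them only for clarity.

```python
# Conceptual Python analog of the mode-declaration idea (not the Prolog mechanism):
# the recursion defines *general* solutions, and a separate step keeps the preferred one.
from functools import lru_cache

COINS = (1, 3, 4)

@lru_cache(maxsize=None)                  # tabling analog: each subgoal is solved once
def all_ways(amount):
    """Every coin sequence summing to `amount` (the general relation)."""
    if amount == 0:
        return ((),)
    ways = []
    for c in COINS:
        if c <= amount:
            ways.extend((c,) + rest for rest in all_ways(amount - c))
    return tuple(ways)

def preferred(amount):
    """Mode-like reduction: keep only the solution with the fewest coins."""
    return min(all_ways(amount), key=len)

print(preferred(6))   # e.g. (3, 3)
```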