A novel method for constructing a nonlinear fractal histopolation function associated with a given histogram is introduced in this paper. In contrast to classical fractal interpolation methods, which produce continuous, interpolatory functions, the proposed approach constructs a bounded, Riemann integrable function that is not necessarily continuous but preserves the areas of the given histogram. An iterated function system based on Rakotch contractions, a generalisation of Banach contractions, is utilised, thereby extending the theoretical framework for fractal histopolation. Unlike previous formulations, the proposed construction of nonlinear fractal functions allows vertical scaling factors greater than one. Conditions under which the nonlinear fractal function solves the histopolation problem are derived.
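For orientation, a construction in this spirit can be sketched as follows (index and notation conventions are assumed here, not taken from the paper): the histopolation function arises as the attractor of an iterated function system whose maps contract in the vertical coordinate in the Rakotch sense.

```latex
% Sketch with assumed conventions: an IFS on I \times \mathbb{R}, I = [x_0, x_N],
% whose attractor is the graph of the histopolation function f.
\begin{align*}
  w_n(x,y) &= \bigl(L_n(x),\, F_n(x,y)\bigr), \qquad n = 1,\dots,N, \\
  \lvert F_n(x,y) - F_n(x,y')\rvert &\le \alpha\bigl(\lvert y-y'\rvert\bigr)\,\lvert y-y'\rvert
  \quad \text{(Rakotch contraction in } y\text{)}, \\
  f\bigl(L_n(x)\bigr) &= F_n\bigl(x, f(x)\bigr)
  \quad \text{(self-referential equation for the attractor)}, \\
  \int_{x_{n-1}}^{x_n} f(t)\,\mathrm{d}t &= a_n
  \quad \text{(area of the $n$-th histogram bar is matched)},
\end{align*}
% where L_n maps I affinely onto [x_{n-1}, x_n] and \alpha : (0,\infty) \to [0,1)
% is non-increasing; a constant \alpha < 1 recovers the Banach case.
```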
High-performance computing (HPC) systems are becoming increasingly water-intensive due to their reliance on water-based cooling and on the water consumed in generating their electricity. However, the water footprint of HPC remains relatively underexplored, especially in contrast to the growing focus on carbon emissions. In this paper, we present ThirstyFLOPS, a comprehensive water footprint analysis framework for HPC systems. Our approach incorporates region-specific metrics, including Water Usage Effectiveness, Power Usage Effectiveness, and Energy Water Factor, to quantify water consumption using real-world data. Using four representative HPC systems (Marconi, Fugaku, Polaris, and Frontier) as examples, we discuss implications for HPC system planning and management. We explore the impact of regional water scarcity and nuclear-based energy strategies on HPC sustainability. Our findings aim to advance the development of water-aware, environmentally responsible computing infrastructures.
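As a hedged illustration of how such metrics combine, the sketch below uses a common accounting convention (assumed here, not necessarily the paper's exact model): on-site cooling water scales with IT energy through WUE, while off-site water embedded in electricity scales with facility energy (IT energy times PUE) through EWF.

```python
def water_footprint_liters(it_energy_kwh: float,
                           wue_l_per_kwh: float,   # on-site Water Usage Effectiveness
                           pue: float,             # Power Usage Effectiveness
                           ewf_l_per_kwh: float):  # off-site Energy Water Factor
    """Hedged sketch: operational water footprint of an HPC workload.

    on-site  = IT energy * WUE        (cooling water evaporated at the facility)
    off-site = IT energy * PUE * EWF  (water consumed generating the electricity)
    """
    onsite = it_energy_kwh * wue_l_per_kwh
    offsite = it_energy_kwh * pue * ewf_l_per_kwh
    return onsite + offsite

# Example with illustrative (not measured) numbers:
# a 1 MWh job at WUE = 1.8 L/kWh, PUE = 1.2, EWF = 2.0 L/kWh
print(water_footprint_liters(1000, 1.8, 1.2, 2.0))  # 4200.0 litres for this made-up case
```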
Despite being proposed as early as 1959, COBOL (Common Business-Oriented
Language) remains an integral part of the operations of many financial,
banking, and governmental organizations. To
support the inevitable modernization and maintenance of legacy systems written
in COBOL, it is essential for organizations, researchers, and developers to
understand the nature and source code of COBOL programs. However, to the best
of our knowledge, no existing dataset provides data on COBOL software projects,
motivating the need for such a dataset. Thus, to aid empirical
research on comprehending COBOL in open-source repositories, we constructed a
dataset of 84 COBOL repositories mined from GitHub, containing rich metadata on
the development cycle of the projects. We envision that researchers can utilize
our dataset to study the evolution and code properties of COBOL projects, and
to develop tools to support their development. Our dataset also provides 1255
COBOL files
present inside the mined repositories. The dataset and artifacts are available
at https://doi.org/10.5281/zenodo.7968845.
This work from Banaras Hindu University and Indian Institute of Technology Tirupati demonstrates that topological phase transitions in a Hermitian system systematically induce corresponding changes in the knot topology of a derived non-Hermitian Hamiltonian's complex eigenvalues. It introduces a "first-order knot transition" characterized by a discrete jump in eigenvalues at the transition point, occurring without the presence of Exceptional Points.
Generative Adversarial Networks (GANs) have swiftly evolved to imitate
increasingly complex image distributions. However, the majority of developments
focus on the performance of GANs on balanced datasets. We find that existing
GANs and their training regimes, which work well on balanced datasets, fail to
be effective in the case of imbalanced (i.e. long-tailed) datasets. In this work, we
introduce a novel theoretically motivated Class Balancing regularizer for
training GANs. Our regularizer makes use of the knowledge from a pre-trained
classifier to ensure balanced learning of all the classes in the dataset. This
is achieved via modelling the effective class frequency based on the
exponential forgetting observed in neural networks and encouraging the GAN to
focus on underrepresented classes. We demonstrate the utility of our
regularizer in learning representations for long-tailed distributions by
achieving better performance than existing approaches over multiple datasets.
Specifically, when applied to an unconditional GAN, it improves the FID from
13.03 to 9.01 on the long-tailed iNaturalist-2019 dataset.
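As a hedged sketch of the mechanism (the paper's exact regularizer may differ), one can track an effective class frequency with exponential forgetting from a pre-trained classifier's predictions on generated batches and penalize deviation from an inverse-frequency target distribution:

```python
import torch

class ClassBalanceRegularizer:
    """Illustrative sketch, not the paper's exact formulation.

    Maintains an exponentially forgotten 'effective' class frequency from a
    pre-trained classifier's predictions on generated samples, and pushes the
    generator toward classes that are currently under-represented.
    """

    def __init__(self, num_classes: int, forgetting: float = 0.99):
        self.freq = torch.full((num_classes,), 1.0 / num_classes)
        self.forgetting = forgetting  # exponential forgetting factor

    def __call__(self, classifier_logits: torch.Tensor) -> torch.Tensor:
        probs = torch.softmax(classifier_logits, dim=1)   # (batch, num_classes)
        batch_dist = probs.mean(dim=0)                    # predicted class mix of the batch
        with torch.no_grad():                             # update effective frequency (no grad)
            self.freq = self.forgetting * self.freq + (1 - self.forgetting) * batch_dist
        target = 1.0 / (self.freq + 1e-8)                 # inverse effective frequency
        target = target / target.sum()                    # renormalize to a distribution
        # KL(target || batch_dist): small when the generator covers rare classes
        return torch.sum(target * (torch.log(target + 1e-8) - torch.log(batch_dist + 1e-8)))

# usage (assumed names): loss_G = adversarial_loss + lam * reg(classifier(fake_images))
```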
Public transport administrators rely on efficient algorithms for various
problems that arise in public transport networks. In particular, our study
focuses on designing linear-time algorithms for two fundamental path problems:
the earliest arrival time (\textsc{eat}) and the fastest path duration
(\textsc{fpd}) on public transportation data. We conduct a comparative analysis
with state-of-the-art algorithms. The results are quite promising, indicating
substantial efficiency improvements. Specifically, the fastest path problem
shows a remarkable 34-fold speedup, while the earliest arrival time problem
exhibits an even more impressive 183-fold speedup. These findings highlight the
effectiveness of our algorithms in solving \textsc{eat} and \textsc{fpd}
problems in public transport, and can ultimately help public transport
administrators enrich the urban transport experience.
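For intuition, an earliest-arrival-time query over a timetable can be answered in linear time by a single pass over connections sorted by departure time, in the spirit of connection-scan approaches; the sketch below is illustrative and not necessarily the algorithm proposed in the paper.

```python
import math

def earliest_arrival(connections, source, target, start_time):
    """Single pass over connections sorted by departure time.

    connections: iterable of (dep_stop, arr_stop, dep_time, arr_time),
                 pre-sorted by dep_time. Runs in O(number of connections).
    """
    eat = {source: start_time}  # best known arrival time per stop
    for dep_stop, arr_stop, dep_time, arr_time in connections:
        # the connection is reachable if we can be at dep_stop before it departs
        if eat.get(dep_stop, math.inf) <= dep_time and arr_time < eat.get(arr_stop, math.inf):
            eat[arr_stop] = arr_time
    return eat.get(target, math.inf)

# Example: two connections A->B->C, departing at t=10 and t=25
conns = [("A", "B", 10, 20), ("B", "C", 25, 40)]
print(earliest_arrival(conns, "A", "C", 8))  # 40
```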
This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this critical gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an average F1-score of 84.23% across seven languages, leveraging a dataset of 567k training instances and 30k manually verified test instances. UnityAI-Guard advances multilingual content moderation for linguistically diverse regions and provides public API access to foster broader adoption and application.
Knowledge graphs (KGs) have attracted significant attention in recent years, particularly in the Semantic Web, while also gaining popularity in other application domains such as data mining and search engines. Simultaneously, there has been enormous progress in the development of different types of heterogeneous hardware, impacting the way KGs are processed. The aim of this paper is to provide a systematic literature review of knowledge graph hardware acceleration. To this end, we present a classification of the primary areas in knowledge graph technology that harness different hardware units for accelerating certain knowledge graph functionalities. We then extensively describe the respective works, focusing on how KG-related schemes harness modern hardware accelerators. Based on our review, we identify various research gaps and future exploratory directions that are anticipated to be of significant value to both academics and industry practitioners.
Driving a quantum system periodically in time can profoundly alter its long-time correlations and give rise to exotic quantum states of matter. The combination of many-body correlations and dynamic manipulation has the potential to uncover a whole field of new phenomena, but its complexity makes the theoretical and numerical understanding extremely difficult. We propose a promising numerical method by generalizing the density matrix renormalization group to a superposition of Fourier components of periodically driven many-body systems using Floquet theory. With this method we can study the full time-dependent quantum solution in a large parameter range for all evolution times, beyond the commonly used high-frequency approximations. Numerical results are presented for the isotropic Heisenberg antiferromagnetic spin-1/2 chain under both local (edge) and global driving for spin-spin correlations and temporal fluctuations. As the frequency is lowered, we demonstrate that more and more Fourier components become relevant and determine strong length- and frequency-dependent changes of the quantum correlations that cannot be described by effective static models.
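For orientation, the Fourier-component construction rests on standard Floquet theory (stated here with assumed conventions): a time-periodic Hamiltonian couples the harmonics of the periodic part of the Floquet states, and the generalized DMRG represents the superposition of these components.

```latex
% Standard Floquet setup for H(t+T) = H(t), \omega = 2\pi/T (conventions assumed):
\begin{align*}
  H(t) &= \sum_{m} H_m\, e^{i m \omega t}, \\
  |\psi_\alpha(t)\rangle &= e^{-i \epsilon_\alpha t}\,|\phi_\alpha(t)\rangle,
  \qquad |\phi_\alpha(t)\rangle = \sum_{n} e^{-i n \omega t}\,|\phi_\alpha^{(n)}\rangle, \\
  \bigl(H(t) - i\,\partial_t\bigr)\,|\phi_\alpha(t)\rangle &= \epsilon_\alpha\,|\phi_\alpha(t)\rangle
  \;\;\Longrightarrow\;\;
  \sum_{m} H_m\,|\phi_\alpha^{(n+m)}\rangle - n\omega\,|\phi_\alpha^{(n)}\rangle
  = \epsilon_\alpha\,|\phi_\alpha^{(n)}\rangle .
\end{align*}
% The lower the frequency \omega, the more Fourier components |\phi^{(n)}\rangle
% must be retained in the variational (DMRG) representation.
```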
While quantum circuits built from two-particle dual-unitary (maximally entangled) operators serve as minimal models of typically nonintegrable many-body systems, the construction and characterization of dual-unitary operators themselves are only partially understood. A nonlinear map on the space of unitary operators, proposed in Phys. Rev. Lett. 125, 070501 (2020), results in operators arbitrarily close to dual unitaries. Here we study the map analytically for the two-qubit case, describing the basins of attraction, fixed points, and rates of approach to dual unitaries. A subset of dual-unitary operators with maximum entangling power are 2-unitary operators or perfect tensors, which are equivalent to four-party absolutely maximally entangled states and are known to exist only if the local dimension is larger than d=2. We use the nonlinear map, and introduce stochastic variants of it, to construct explicit examples of new dual and 2-unitary operators. A necessary criterion for local unitary equivalence, used to distinguish classes, is also introduced and applied to obtain various concrete results and a conjecture in d=3. Orthogonal Latin squares are known to provide a ``classical combinatorial design'' for constructing permutations that are 2-unitary. We extend the underlying design from classical to genuinely quantum ones for general dual-unitary operators and give an example of what might be the smallest genuinely quantum design of a 2-unitary in d=4.
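For reference, with one common index convention (assumed here; the paper's convention may differ), dual unitarity and 2-unitarity are unitarity conditions on rearrangements of the gate's matrix elements, and the nonlinear map can be read, roughly, as "realign, then project back to the closest unitary via polar decomposition".

```latex
% One common convention (assumed). For U on C^d \otimes C^d with elements
% U^{ij}_{kl} = \langle ij|U|kl\rangle, define the realignment U^R and the
% partial transpose U^{\Gamma} by
\begin{equation*}
  (U^{R})^{ik}_{\;jl} = U^{ij}_{\;kl}, \qquad
  (U^{\Gamma})^{ij}_{\;kl} = U^{il}_{\;kj}.
\end{equation*}
% U is dual unitary if both U and U^R are unitary; it is 2-unitary (a perfect
% tensor, equivalent to a four-party AME state) if U, U^R, and U^{\Gamma} are
% all unitary. The nonlinear map iterates, roughly, U \mapsto the unitary factor
% of the polar decomposition of U^R, driving generic U toward dual unitarity.
```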
Learning effective representations of source code is critical for any Machine Learning for Software Engineering (ML4SE) system. Inspired by natural language processing, large language models (LLMs) like Codex and CodeGen treat code as generic sequences of text and are trained on huge corpora of code data, achieving state-of-the-art performance on several software engineering (SE) tasks. However, valid source code, unlike natural language, follows a strict structure and pattern governed by the underlying grammar of the programming language. Current LLMs do not exploit this property: they treat code as a sequence of tokens and overlook key structural and semantic properties that can be extracted from code-views such as the Control Flow Graph (CFG), Data Flow Graph (DFG), and Abstract Syntax Tree (AST). Unfortunately, the process of generating and integrating code-views for every programming language is cumbersome and time-consuming. To overcome this barrier, we propose COMEX, a framework that allows researchers and developers to create and combine multiple code-views which can be used by machine learning (ML) models for various SE tasks. Some salient features of our tool are: (i) it works directly on source code (which need not be compilable), (ii) it currently supports Java and C#, (iii) it can analyze both method-level and program-level snippets using both intra-procedural and inter-procedural analysis, and (iv) it is easily extendable to other languages as it is built on tree-sitter, a widely used incremental parser that supports over 40 languages. We believe this easy-to-use code-view generation and customization tool will give impetus to research in source code representation learning methods and ML4SE.
Tool: this https URL - GitHub: this https URL - Demo: this https URL
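COMEX itself builds its code-views for Java and C# on tree-sitter; purely as a language-agnostic illustration of what a structural code-view is, the sketch below uses Python's built-in ast module (an assumption for demonstration, not COMEX's implementation) to extract parent-child AST edges from a method-level snippet.

```python
import ast

def ast_edges(source: str):
    """Toy 'code view': parent -> child AST edges from a source snippet.

    Illustrative stand-in for COMEX's tree-sitter-based views (Java/C#);
    uses Python's ast module purely for demonstration.
    """
    tree = ast.parse(source)
    edges = []
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges.append((type(parent).__name__, type(child).__name__))
    return edges

snippet = "def f(x):\n    if x > 0:\n        return x\n    return -x\n"
for edge in ast_edges(snippet):
    print(edge)  # e.g. ('FunctionDef', 'If'), ('If', 'Return'), ...
```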
In the rapidly evolving landscape of modern data-driven technologies, software relies on large datasets and constant data center operations, using various database systems to support computation-intensive tasks. As energy consumption in software systems becomes a growing concern, selecting the right database from an energy-efficiency perspective is also critical. To address this, we introduce \textbf{\textit{DBJoules}}, a tool that measures the energy consumption of activities in database systems. \textit{DBJoules} supports energy measurement of CRUD operations for four popular databases. Through evaluations on two widely-used datasets, we identify disparities of 7\% to 38\% in the energy consumption of these databases. Hence, the goal is to raise developer awareness about the energy impact of running queries on different databases, enabling developers to select an appropriate database for sustainable usage. The tool's demonstration is available at \url{this https URL} and related artifacts at \url{this https URL}.
Readme files in GitHub repositories serve as a preliminary source of information and thus help developers understand projects for reuse or extension. Different types of contextual and structural content, which we refer to as categories of the content and features in the content respectively, are present in readme files and could determine the extent to which a project is comprehended. Consequently, the structural and contextual aspects of the content could impact project popularity. Studying the correlation between readme content and project popularity could help in focusing, while designing readme files, on the aspects that could improve popularity. However, existing studies explore the categories of content and the types of features in readme files, but do not explore their usefulness towards project popularity. Hence, we present an empirical study to understand the correlation between readme file content and project popularity. We perform the study on 1950 readme files of public GitHub projects spanning ten programming languages, and observe that readme files in the majority of popular projects are well organised using lists and images and comprise links to external sources. Also, repositories with readme files containing contribution guidelines and references were observed to be associated with higher popularity.
In this work, we present a detailed thermodynamic analysis of a bound quantum system, the Morse oscillator, within the framework of Tsallis nonextensive statistics. Using the fact that the bound spectrum of the Morse potential is limited from above by the bond dissociation energy, we analytically derive the generalized partition function. We present results for both the high- and low-temperature limits. We propose the effective number of accessible states as a measure of nonextensivity. The calculation shows that the nonextensive framework further restricts the number of accessible states. We also derive the generalized internal energy and entropy and examine their dependence on temperature and the nonextensivity parameter q. Numerical results confirm the strong effect of nonextensive behavior in the low- to moderate-temperature regime, where the ratio of the generalized internal energy to the internal energy calculated from the Boltzmann-Gibbs (BG) formula develops a nontrivial dip structure for q < 1. Moreover, the generalized specific heat shows a Schottky-type anomaly. We extend our study by deriving the specific heat of solids with BG and Tsallis statistics using the anharmonic energy levels of the Morse oscillator. This study suggests that the Morse oscillator is a solvable and physically meaningful testing ground for exploring the thermodynamics of quantum systems governed by nonextensive statistics, with implications for the vibrational thermodynamics of non-equilibrium molecular systems, especially diatomic molecules.
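For reference, the standard ingredients behind such an analysis (notation assumed; the paper's conventions may differ) are the anharmonic Morse spectrum, truncated at the dissociation energy, and the Tsallis q-generalized Boltzmann weight:

```latex
\begin{align*}
  E_n &= \hbar\omega_e\Bigl(n+\tfrac{1}{2}\Bigr)
        - \hbar\omega_e x_e\Bigl(n+\tfrac{1}{2}\Bigr)^{2},
        \qquad n = 0,1,\dots,n_{\max}, \\
  Z_q &= \sum_{n=0}^{n_{\max}}
        \bigl[\,1-(1-q)\,\beta E_n\,\bigr]_{+}^{\frac{1}{1-q}}
        \;\xrightarrow{\;q\to 1\;}\;
        \sum_{n=0}^{n_{\max}} e^{-\beta E_n},
\end{align*}
% where the cutoff n_max is fixed by the dissociation energy D_e (only states
% with E_n < D_e are bound) and [x]_+ = max(x,0) enforces the Tsallis cutoff;
% the q-generalized internal energy and entropy follow from Z_q as usual.
```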
Fluorinated compounds, often referred to as forever chemicals, are critical
in various steps of semiconductor fabrication like lithography, etching,
chamber cleaning, and others. Forever chemical emissions can exhibit global
warming potentials thousands of times greater than carbon dioxide and persist
in the atmosphere for millennia. Despite their severe impact, most
sustainability work in computer systems has focused on carbon emissions
alone. We address this gap by introducing ForgetMeNot, a modeling tool that
quantifies fluorinated compound emissions by integrating fabrication
facility-specific practices and hardware specifications, and validate its
accuracy using real-world emission data from fabrication facilities. We show
how ForgetMeNot can enable fabrication facilities to optimize design and
material usage decisions for emission reduction and provide researchers with a
methodology to calibrate emission estimates for hardware designs. When
ForgetMeNot is applied to analyze emissions for manufacturing CPUs, DRAM, and
storage, it illustrates how hardware generations, lithography techniques, and
capacities impact fluorinated compound emissions. Finally, we demonstrate how
datacenter operators can assemble low-emission servers while balancing
performance demands. By factoring fluorinated emissions into manufacturing
decisions, ForgetMeNot paves the way for building more sustainable systems.
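To give a sense of the accounting involved, the sketch below follows a generic tiered fab-gas method (an assumption for illustration, not ForgetMeNot's exact model): per-gas emissions combine purchased gas, the fraction consumed by the process, abatement, and the gas's global warming potential.

```python
# Illustrative per-gas accounting; GWP values are rounded placeholders, not real data.
GWP_100YR = {"CF4": 7000, "C2F6": 12000, "NF3": 17000, "SF6": 24000}

def fluorinated_emissions_kg_co2e(gas: str,
                                  purchased_kg: float,
                                  process_utilization: float,  # fraction consumed by the process
                                  abated_fraction: float,      # fraction routed to abatement
                                  abatement_efficiency: float) -> float:
    """Hedged sketch of tiered fab-gas accounting, not ForgetMeNot's exact model."""
    released = purchased_kg * (1 - process_utilization)          # gas surviving the process
    released *= (1 - abated_fraction * abatement_efficiency)     # gas surviving abatement
    return released * GWP_100YR[gas]                             # convert to CO2-equivalent

# Example with made-up numbers: 10 kg of CF4, 60% consumed, 70% abated at 90% efficiency
print(fluorinated_emissions_kg_co2e("CF4", 10, 0.6, 0.7, 0.9))  # ~1.0e4 kg CO2e
```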
Link prediction is one of the central problems in graph mining. However,
recent studies highlight the importance of higher-order network analysis, where
complex structures called motifs are first-class citizens. We first show
that existing link prediction schemes fail to effectively predict motifs. To
alleviate this, we establish a general motif prediction problem and we propose
several heuristics that assess the chances for a specified motif to appear. To
make the scores realistic, our heuristics consider - among others -
correlations between links, i.e., the potential impact of some arriving links
on the appearance of other links in a given motif. Finally, for the highest
accuracy, we develop a graph neural network (GNN) architecture for motif
prediction. Our architecture offers vertex features and sampling schemes that
capture the rich structural properties of motifs. While our heuristics are fast
and do not need any training, GNNs ensure the highest accuracy in predicting
motifs, both for dense motifs (e.g., k-cliques) and for sparse ones (e.g., k-stars).
We consistently outperform the best available competitor by more than 10% on
average and up to 32% in area under the curve. Importantly, the advantages of
our approach over schemes based on uncorrelated link prediction increase with
motif size and complexity. We also successfully apply our architecture to
predicting arbitrary clusters and communities,
illustrating its potential for graph mining beyond motif analysis.
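To make the link-correlation idea concrete, here is an illustrative heuristic (not one of the paper's specific heuristics): score a motif's missing links sequentially and tentatively insert each one before scoring the next, so that earlier links can raise the scores of later ones.

```python
import networkx as nx

def correlated_motif_score(G: nx.Graph, motif_nodes) -> float:
    """Illustrative correlation-aware motif score (not the paper's exact heuristic).

    Scores the links missing inside `motif_nodes` one by one with a
    common-neighbors score, inserting each scored link into a working copy
    so it can boost the scores of the remaining links (link correlation).
    """
    H = G.copy()
    nodes = list(motif_nodes)
    missing = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]
               if not H.has_edge(u, v)]
    score = 1.0
    for u, v in missing:
        common = len(list(nx.common_neighbors(H, u, v)))
        score *= common / (common + 1.0)   # map the count into (0, 1)
        H.add_edge(u, v)                   # correlation: later links see this one
    return score

# Example: how likely is {0, 1, 2, 3} to close into a 4-clique?
G = nx.Graph([(0, 1), (0, 2), (1, 2), (2, 3), (0, 4), (1, 4), (3, 4)])
print(correlated_motif_score(G, [0, 1, 2, 3]))
```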
We propose a novel variant of the UCB algorithm (referred to as Efficient-UCB-Variance (EUCBV)) for minimizing cumulative regret in the stochastic multi-armed bandit (MAB) setting. EUCBV incorporates the arm elimination strategy proposed in UCB-Improved \citep{auer2010ucb}, while taking into account the variance estimates to compute the arms' confidence bounds, similar to UCBV \citep{audibert2009exploration}. Through a theoretical analysis we establish that EUCBV incurs a \emph{gap-dependent} regret bound of {\scriptsize $O\left(\frac{K\sigma^2_{\max}}{\Delta}\log\left(T\Delta^2/K\right)\right)$} after $T$ trials, where $\Delta$ is the minimal gap between optimal and sub-optimal arms; the above bound is an improvement over that of existing state-of-the-art UCB algorithms (such as UCB1, UCB-Improved, UCBV, MOSS). Further, EUCBV incurs a \emph{gap-independent} regret bound of {\scriptsize $O(\sqrt{KT})$} which is an improvement over that of UCB1, UCBV and UCB-Improved, while being comparable with that of MOSS and OCUCB. Through an extensive numerical study we show that EUCBV significantly outperforms the popular UCB variants (like MOSS, OCUCB, etc.) as well as Thompson sampling and Bayes-UCB algorithms.
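As a hedged sketch of the ingredients (constants, schedule, and elimination rule are illustrative, not the paper's exact pseudocode), a variance-aware elimination strategy maintains a set of surviving arms and discards any arm whose variance-based upper confidence bound falls below the best lower confidence bound:

```python
import math, random

def variance_aware_elimination(arms, horizon, explore_const=2.0):
    """Hedged sketch of variance-estimate UCB with arm elimination
    (in the spirit of EUCBV; constants and schedule are illustrative)."""
    stats = {a: {"n": 0, "mean": 0.0, "m2": 0.0} for a in range(len(arms))}
    active = set(stats)
    for t in range(1, horizon + 1):
        a = (t - 1) % len(arms)
        if a not in active:
            a = random.choice(tuple(active))
        r = arms[a]()                          # pull the arm: reward in [0, 1]
        s = stats[a]
        s["n"] += 1
        delta = r - s["mean"]
        s["mean"] += delta / s["n"]
        s["m2"] += delta * (r - s["mean"])     # Welford update for the variance
        if all(v["n"] > 1 for v in stats.values()) and len(active) > 1:
            def radius(v):
                var = v["m2"] / (v["n"] - 1)
                log_t = math.log(max(t, 2))
                return math.sqrt(explore_const * var * log_t / v["n"]) + explore_const * log_t / v["n"]
            best_lcb = max(stats[b]["mean"] - radius(stats[b]) for b in active)
            active = {b for b in active if stats[b]["mean"] + radius(stats[b]) >= best_lcb}
    return max(active, key=lambda b: stats[b]["mean"])

# toy usage: three Bernoulli arms with means 0.3, 0.5, 0.6
arms = [lambda p=p: float(random.random() < p) for p in (0.3, 0.5, 0.6)]
print(variance_aware_elimination(arms, 5000))
```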
The dynamics of quantum many-body systems in the chaotic regime are of particular interest due to the associated phenomena of information scrambling and entanglement generation within the system. While these systems are typically intractable using traditional numerical methods, an effective framework can be implemented based on dual-unitary circuits, which have emerged as a minimal model for maximally chaotic dynamics. In this work, we investigate how individual two-body operators influence the global dynamics of circuits composed of dual-unitaries. We study their effect on entanglement generation while examining it from both bipartite and multipartite perspectives. Here we also highlight the significant role of local unitaries in the dynamics when paired with operators from the dual-unitary class, showing that systems with identical entangling power can exhibit a range of differing entanglement growth rates. Furthermore, we present calculations establishing time-step-dependent lower bounds, which depend on both the initial state and the entangling power of the constituent operators. Finally, we find that time-evolving an initial state composed of pair products generates a state with nearly maximal multipartite entanglement content, approaching the bounds established by Absolutely Maximally Entangled (AME) states.
The increasing vulnerability of electrical distribution systems to extreme weather events and cyber threats necessitates the development of economically viable frameworks for resilience enhancement. While existing approaches focus primarily on technical resilience metrics and enhancement strategies, there remains a significant gap in establishing market-driven mechanisms that can effectively commercialize resilience features while optimizing their deployment through intelligent decision-making. Moreover, traditional optimization approaches for distribution network reconfiguration often fail to adapt dynamically to both normal and emergency conditions. This paper introduces a novel framework integrating dual-agent Proximal Policy Optimization (PPO) with market-based mechanisms, achieving an average resilience score of 0.85 ± 0.08 over 10 test episodes. The proposed architecture leverages a dual-agent PPO scheme, where a strategic agent selects optimal DER-driven switching configurations, while a tactical agent fine-tunes individual switch states and grid preferences under budget and weather constraints. These agents interact within a custom-built dynamic simulation environment that models stochastic calamity events, budget limits, and resilience-cost trade-offs. A comprehensive reward function is designed that balances resilience enhancement objectives with market profitability (with up to 200x reward incentives, resulting in 85% of actions during calamity steps selecting configurations with 4 DERs), incorporating factors such as load recovery speed, system robustness, and customer satisfaction. Over 10 test episodes, the framework achieved a benefit-cost ratio of 0.12 ± 0.01, demonstrating sustainable market incentives for resilience investment.
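A minimal sketch of the kind of composite reward described (terms and weights below are illustrative assumptions, not the paper's exact function) balances resilience and load recovery against action cost and budget, with extra emphasis during calamity steps:

```python
def resilience_reward(resilience_score: float,     # 0..1 system resilience after the action
                      load_recovered_frac: float,  # fraction of lost load restored this step
                      market_profit: float,        # revenue from commercialized resilience features
                      action_cost: float,          # switching + DER dispatch cost
                      budget_left: float,
                      calamity: bool,
                      w=(1.0, 0.5, 0.3, 0.4)) -> float:
    """Illustrative composite reward, not the paper's exact formulation."""
    w_res, w_rec, w_profit, w_cost = w
    reward = (w_res * resilience_score
              + w_rec * load_recovered_frac
              + w_profit * market_profit
              - w_cost * action_cost)
    if budget_left < 0:
        reward -= 10.0     # hard penalty for overspending the budget
    if calamity:
        reward *= 5.0      # illustrative multiplier emphasizing calamity steps
    return reward
```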
Code review is a crucial process before deploying code to production, as it validates the code, provides suggestions for improvements, and identifies errors such as missed edge cases. In projects with regular production releases, the effort required for peer code reviews remains high. Consequently, there has been significant interest from software engineering (SE) researchers in automating the code review process. Previous research on code review automation has typically approached the task as three independent sub-tasks: review necessity prediction, review comment generation, and code refinement. Our study attempts to (i) leverage the relationships between the sub-tasks of code review automation by developing a multi-task model that addresses all tasks in an integrated manner, and (ii) increase model robustness on unseen data via collaborative large language model (LLM) modeling, while retaining the proprietary nature of code, by using federated learning (FL). The study explores five simple techniques for multi-task training, including two sequential methods, one parallel method, and two cumulative methods. The results indicate that sequentially training a federated LLM (FedLLM) for our code review multi-task use case is less efficient in terms of time, computation, and performance metrics than training separate models for each task. Since sequential training exhibits catastrophic forgetting, cumulative fine-tuning is the better alternative for multi-task training and performs better than training models for individual tasks. This study highlights the need for research focused on effective fine-tuning of multi-task FedLLMs for SE tasks.
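For context, the federated side of such a setup can be as simple as FedAvg-style weighted averaging of client-fine-tuned weights between rounds; the sketch below is a generic FedAvg aggregation step (a standard technique, not the paper's specific multi-task training schedule).

```python
from collections import OrderedDict
import torch

def fedavg(client_state_dicts, client_num_examples):
    """Weighted FedAvg aggregation of client model weights (generic sketch)."""
    total = float(sum(client_num_examples))
    global_state = OrderedDict()
    for key in client_state_dicts[0]:
        # weight each client's tensor by its share of the training examples
        global_state[key] = sum(
            sd[key].float() * (n / total)
            for sd, n in zip(client_state_dicts, client_num_examples)
        )
    return global_state

# usage (assumed names): each client fine-tunes the shared model on its private
# code-review data, then the server aggregates and redistributes:
# global_weights = fedavg([m.state_dict() for m in client_models], samples_per_client)
# for m in client_models: m.load_state_dict(global_weights)
```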