University of Malaya
This paper presents VulBERTa, a deep learning approach to detect security vulnerabilities in source code. Our approach pre-trains a RoBERTa model with a custom tokenisation pipeline on real-world code from open-source C/C++ projects. The model learns a deep knowledge representation of the code syntax and semantics, which we leverage to train vulnerability detection classifiers. We evaluate our approach on binary and multi-class vulnerability detection tasks across several datasets (Vuldeepecker, Draper, REVEAL and muVuldeepecker) and benchmarks (CodeXGLUE and D2A). The evaluation results show that VulBERTa achieves state-of-the-art performance and outperforms existing approaches across different datasets, despite its conceptual simplicity, and limited cost in terms of size of training data and number of model parameters.
A large-scale investigation into perplexity-based LLM code detection methods reveals they offer robust generalization across different LLMs, though their accuracy and efficiency vary significantly with programming language, code length, and the use of perturbation techniques.
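As a rough illustration of the quantity these detection methods build on, the sketch below computes perplexity from per-token log-probabilities and applies a simple threshold. It is a minimal sketch, not the method evaluated in the study: the function names and the threshold value are hypothetical, and the log-probabilities are assumed to come from some language model that scores the code.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(-(mean natural-log probability per token))."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def looks_machine_generated(token_logprobs, threshold=5.0):
    # Assumption: code the scoring model finds highly predictable (low
    # perplexity) is flagged as likely LLM-written. The threshold of 5.0
    # is a hypothetical value and would need tuning per language and model.
    return perplexity(token_logprobs) < threshold


# Four tokens, each assigned probability 0.5 by the scoring model.
sample = [math.log(0.5)] * 4
print(perplexity(sample))
print(looks_machine_generated(sample))
```

A fixed threshold is the simplest possible decision rule; the variation across programming languages and code lengths noted above suggests that any such cut-off would have to be calibrated per setting.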
The Massive Multitask Language Understanding (MMLU) benchmark has been widely used to evaluate language models across various domains. However, existing MMLU datasets primarily focus on high-resource languages such as English, which leaves low-resource languages like Bengali underrepresented. In this paper, we introduce BnMMLU, a benchmark to evaluate the Bengali multitask language understanding capabilities of language models. The dataset spans 23 domains, including science, humanities, mathematics and general knowledge, and is structured in a multiple-choice format to assess the factual knowledge, application-based problem-solving and reasoning abilities of language models. It consists of 138,949 question-option pairs. We benchmark several proprietary and open-source large language models (LLMs) on the BnMMLU test set. Additionally, we annotate the test set with three cognitive categories (factual knowledge, procedural application and reasoning) to gain deeper insights into model strengths and weaknesses across various cognitive tasks. The results reveal significant performance gaps, highlighting the need for improved pre-training and fine-tuning strategies tailored to Bengali data. We release the dataset and benchmark results to facilitate further research in this area.
Power grid fault diagnosis is a critical task for ensuring the reliability and stability of electrical infrastructure. Traditional diagnostic systems often struggle with the complexity and variability of power grid data. This paper proposes a novel approach that leverages Large Language Models (LLMs), specifically ChatGPT and GPT-4, combined with advanced prompt engineering to enhance fault diagnosis accuracy and explainability. We designed comprehensive, context-aware prompts to guide the LLMs in interpreting complex data and providing detailed, actionable insights. Our method was evaluated against baseline techniques, including standard prompting, Chain-of-Thought (CoT), and Tree-of-Thought (ToT) methods, using a newly constructed dataset comprising real-time sensor data, historical fault records, and component descriptions. Experimental results demonstrate significant improvements in diagnostic accuracy, explainability quality, response coherence, and contextual understanding, underscoring the effectiveness of our approach. These findings suggest that prompt-engineered LLMs offer a promising solution for robust and reliable power grid fault diagnosis.
StreetViewLLM, developed by researchers at the Spatial Sciences Institute at USC, introduces a chain-of-thought multimodal large language model framework for extracting geographic information. It significantly improves the precision and granularity of geospatial predictions by integrating street view imagery, geographic coordinates, and textual data, outperforming traditional machine learning and deep learning baselines by at least 49.43% across various urban indicators.
In this paper, a novel bio-inspired optimization algorithm called the Bombardier Beetle Optimizer (BBO) is proposed. This species is regarded as very intelligent and is able to defend itself against, and escape from, predators. The first principle of the algorithm is inspired by the bombardier beetle's defense mechanism: the beetle releases a toxic chemical spray when it feels threatened. This reaction occurs in a specialized reaction chamber inside its abdomen and involves a well-regulated enzymatic mechanism that produces hot water vapor, oxygen, and irritating substances such as p-benzoquinones. In addition, the proposed BBO simulates the beetle's escape mechanism, in which it calculates its distance from the predator and flies away. The BBO is tested on the Congress on Evolutionary Computation (CEC 2017) test suite. Moreover, it is compared against well-known metaheuristic optimization algorithms, including the Chernobyl Disaster Optimizer (CDO), Grey Wolf Optimizer (GWO), Particle Swarm Optimization (PSO), Bermuda Triangle Optimizer (BTO), Sperm Swarm Optimization (SSO) and Gravitational Search Algorithm (GSA). The results show that the BBO outperforms the other algorithms in terms of convergence rate and quality of results.
In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.
Feature pyramids have been widely adopted in convolutional neural networks and transformers for medical image segmentation tasks. However, existing models generally focus on the encoder-side Transformer for feature extraction. We further explore the potential of improving the feature decoder with a well-designed architecture. We propose the Cross Feature Pyramid Transformer decoder (CFPFormer), a novel decoder block that integrates feature pyramids and transformers. Even though transformer-like architectures achieve outstanding performance in segmentation, concerns about redundancy and training costs remain. Specifically, by leveraging patch embedding and cross-layer feature concatenation mechanisms, CFPFormer enhances feature extraction capabilities, while the complexity issue is mitigated by our Gaussian Attention. Benefiting from the Transformer structure and U-shaped connections, our work is capable of capturing long-range dependencies and effectively up-sampling feature maps. Experimental results on medical image segmentation datasets demonstrate the effectiveness of CFPFormer. With a ResNet50 backbone, our method achieves a 92.02% Dice Score. Notably, our VGG-based model outperformed baselines with more complex ViT and Swin Transformer backbones.
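For readers unfamiliar with the Dice Score quoted above, the following is a minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), applied to flattened binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration and are not taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient for two flattened binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


pred = [1, 1, 0, 0]
target = [1, 0, 0, 0]
print(dice_score(pred, target))  # one shared pixel: 2*1 / (2+1)
```

In practice the score is usually averaged over classes and images, so a single reported percentage summarizes many such per-mask values.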
According to the Maxwell demon paradigm, additional work can be extracted from a classical or quantum system by exploiting information obtained through measurements on a correlated ancillary system. In the quantum setting, the maximum work extractable via unitary operations in such measurement-assisted protocols is referred to as daemonic ergotropy. In this work, we explore this concept in the context of continuous-variable quantum systems, focusing on Gaussian states and general-dyne (Gaussian) measurements. We derive a general expression for the daemonic ergotropy and examine two key scenarios: (i) bipartite Gaussian states where a general-dyne measurement is performed on one of the two parties, and (ii) open Gaussian quantum systems under continuous general-dyne monitoring of the environment. Remarkably, we show that for single-mode Gaussian states, the ergotropy depends solely on the state's energy and purity. This enables us to express the daemonic ergotropy as a simple function of the unconditional energy and the purity of the conditional states, revealing that enhanced daemonic work extraction is directly linked to measurement-induced purification. We illustrate our findings through two paradigmatic examples: extracting daemonic work from a two-mode squeezed thermal state and from a continuously monitored optical parametric oscillator. In both cases we identify the optimal general-dyne strategies that maximize the conditional purity and, in turn, the daemonic ergotropy.
This paper reports on the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT), which consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting. A total of 78 submissions from 46 unique teams/individuals were received for this competition. The top performing score of each challenge is as follows: i) T1 - 82.65%, ii) T2.1 - 74.3%, iii) T2.2 - 85.32%, iv) T3.1 - 53.86%, and v) T3.2 - 54.91%. Apart from the results, this paper also details the ArT dataset, task descriptions, evaluation metrics and participants' methods. The dataset, the evaluation kit as well as the results are publicly available at this https URL
Robust text reading from street view images provides valuable information for various applications. Performance improvement of existing methods in such a challenging scenario relies heavily on the amount of fully annotated training data, which is costly and inefficient to obtain. To scale up the amount of training data while keeping the labeling procedure cost-effective, this competition introduces a new challenge on Large-scale Street View Text with Partial Labeling (LSVT), providing 50,000 and 400,000 images with full and weak annotations, respectively. This competition aims to explore the abilities of state-of-the-art methods to detect and recognize text instances from large-scale street view images, closing the gap between research benchmarks and real applications. During the competition period, a total of 41 teams participated in the two proposed tasks, i.e., text detection and end-to-end text spotting, with 132 valid submissions. This paper includes dataset descriptions, task definitions, evaluation protocols and result summaries of the ICDAR 2019-LSVT challenge.
Large Vision-Language Models (LVLMs) have shown remarkable progress in various multimodal tasks, yet they often struggle with complex visual reasoning that requires multi-step inference. To address this limitation, we propose MF-SQ-LLaVA, a novel approach that enhances LVLMs by enabling implicit self-questioning through end-to-end training. Our method involves augmenting visual question answering datasets with reasoning chains consisting of sub-question and answer pairs, and training the LVLM with a multi-task loss that encourages the generation and answering of these intermediate steps, as well as the prediction of the final answer. We conduct extensive experiments on the ScienceQA and VQAv2 datasets, demonstrating that MF-SQ-LLaVA significantly outperforms existing state-of-the-art models, including the base LLaVA and the original SQ-LLaVA. Ablation studies further validate the contribution of each component of our approach, and human evaluation confirms the improved accuracy and coherence of the reasoning process enabled by our method.
This paper presents an MFG-based decision-making framework for autonomous driving in heterogeneous traffic. To capture diverse human behaviors, we propose a quantitative driving style representation that maps abstract traits to parameters such as speed, safety factors, and reaction time. These parameters are embedded into the MFG through a spatial influence field model. To ensure safe operation in dense traffic, we introduce a safety-critical lane-changing algorithm that leverages dynamic safety margins, time-to-collision analysis, and multi-layered constraints. Real-world NGSIM data is employed for style calibration and empirical validation. Experimental results demonstrate zero collisions across six style combinations, two 15-vehicle scenarios, and NGSIM-based trials, consistently outperforming conventional game-theoretic baselines. Overall, our approach provides a scalable, interpretable, and behavior-aware planning framework for real-world autonomous driving applications.
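The time-to-collision analysis mentioned above can be illustrated with the textbook definition: the gap to the vehicle ahead divided by the closing speed. This is only a sketch under stated assumptions; the function names, the 3-second threshold, and the single-lead-vehicle check are hypothetical and far simpler than the multi-layered constraints the abstract describes.

```python
import math


def time_to_collision(gap_m, follower_speed_mps, leader_speed_mps):
    """Seconds until the follower reaches the leader at constant speeds."""
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0:
        # The follower is not closing in, so no collision is predicted.
        return math.inf
    return gap_m / closing_speed


def lane_change_is_safe(gap_m, follower_speed_mps, leader_speed_mps,
                        min_ttc_s=3.0):
    # Assumption: a single hypothetical TTC threshold stands in for the
    # dynamic safety margins used in the actual framework.
    return time_to_collision(gap_m, follower_speed_mps,
                             leader_speed_mps) > min_ttc_s


print(time_to_collision(30.0, 20.0, 15.0))    # 30 m gap closing at 5 m/s
print(lane_change_is_safe(30.0, 20.0, 15.0))
```

A driving-style-dependent planner would presumably vary the threshold with parameters such as reaction time, which is where the style representation described above would enter.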
We investigate dimuon production in the context of a first-order phase transition in QCD matter using a chiral fluid dynamics model. This approach incorporates non-equilibrium effects such as entropy production and reheating, which emerge during the dynamical evolution through a first-order phase transition. By comparing equilibrium and non-equilibrium scenarios across a range of beam energies ($\sqrt{s_{NN}} = 2.2$-$6.2$ GeV), we analyze the resulting invariant mass spectra. Our results reveal a substantial enhancement of dilepton yields in the non-equilibrium scenario, particularly pronounced at lower beam energies, where reheating leads to a prolonged lifetime of the fireball and increased emission. The enhancement persists even after normalizing to pion multiplicities, indicating sensitivity beyond effects of entropy production.
Researchers present the first comprehensive review of eXplainable Goal-Driven Agents and Robots (XGDAIs), synthesizing existing techniques for explaining perceptual functions and cognitive reasoning in autonomous systems. The work identifies prevalent explanation generation and communication methods, categorizes approaches by behavioral architecture (deliberative, reactive, hybrid), and outlines critical gaps and future directions for developing more transparent and understandable intelligent agents.
The advancement of the Event Horizon Telescope has enabled the study of relativistic jets in active galactic nuclei down to sub-parsec linear scales even at high redshift. Quasi-simultaneous multifrequency observations provide insights into the physical conditions in compact regions and allow testing accretion theories. Initially we aimed at measuring the magnetic field strength close to the central supermassive black hole in NRAO 530 (1730-130) by studying frequency-dependent opacity of the jet matter, Faraday rotation and the spectral index in the mm-radio bands. NRAO 530 was observed quasi-simultaneously at 15, 22, 43, 86, and 227 GHz with four different very long baseline interferometer (VLBI) networks. By means of imaging and model-fitting, we aligned the images taken at different frequencies. We explored opacity along the jet and the distribution of the linearly polarized emission in it. Our findings reveal that the jet of NRAO 530 at 86 and 227 GHz is transparent down to its origin, with 70 mJy emission detected at 227 GHz potentially originating from the accretion disk. The magnetic field strength near the black hole, estimated at $5r_\mathrm{g}$, is $3\times10^3$-$3\times10^4$ G (depending on the central black hole mass). These values represent some of the highest magnetic field strengths reported for active galaxies. We also report the first ever VLBI measurement of the Faraday rotation at 43-227 GHz, which reveals rotation measure values as high as $-48000$ rad/m$^2$, consistent with higher particle density and stronger magnetic fields at the jet's outset. The complex shape of the jet in NRAO 530 is in line with the expected behavior of a precessing jet, with a period estimated to be around $6\pm4$ years.
The development of Artificial Intelligence Generated Content (AIGC) technology has facilitated the creation of rumors containing misinformation, impacting societal, economic, and political ecosystems and challenging democracy. Current rumor detection efforts fall short by merely labeling potential misinformation (a classification task), which inadequately addresses the issue, and it is unrealistic to expect authoritative institutions to debunk every piece of information on social media. Our proposed comprehensive debunking process not only detects rumors but also provides explanatory generated content to refute the authenticity of the information. The Expert-Citizen Collective Wisdom (ECCW) module we designed ensures high-precision assessment of the credibility of information, and the retrieval module retrieves relevant knowledge from a real-time updated debunking database based on information keywords. By using prompt engineering techniques, we feed the results and knowledge into a Large Language Model (LLM), achieving satisfactory discrimination and explanatory effects while eliminating the need for fine-tuning, saving computational costs, and contributing to debunking efforts.
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data, e.g., images, text, and audio. Their promising performance has led to GAN-based adversarial attack methods in both white-box and black-box attack scenarios. The importance of transferable black-box attacks lies in their ability to be effective across different models and settings, more closely aligning with real-world applications. However, it remains challenging for such methods to retain their performance in terms of transferable adversarial examples. Meanwhile, we observe that some enhanced gradient-based transferable adversarial attack algorithms require prolonged time for adversarial sample generation. Thus, in this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples whilst improving the algorithm's efficiency. The main approach is to optimise the training process of the generator parameters. With the functional and characteristic similarity analysis, we introduce a novel gradient editing (GE) mechanism and verify its feasibility in generating transferable samples on various models. Moreover, by exploring the frequency domain information to determine the gradient editing direction, GE-AdvGAN can generate highly transferable adversarial samples while minimizing the execution time in comparison to the state-of-the-art transferable adversarial attack algorithms. The performance of GE-AdvGAN is comprehensively evaluated by large-scale experiments on different datasets, whose results demonstrate the superiority of our algorithm. The code for our algorithm is available at: this https URL
The advancement of smart grid technologies necessitates the integration of cutting-edge computational methods to enhance predictive energy optimization. This study proposes a multi-faceted approach by incorporating (1) Deep Reinforcement Learning (DRL) agents trained using data from Digital Twins (DTs) to optimize energy consumption in real time, (2) Physics-Informed Neural Networks (PINNs) to seamlessly embed physical laws within the optimization process, ensuring model accuracy and interpretability, and (3) Blockchain (BC) technology to facilitate secure and transparent communication across the smart grid infrastructure. The model was trained and validated using comprehensive datasets, including smart meter energy consumption data, renewable energy outputs, dynamic pricing, and user preferences collected from IoT devices. The proposed framework achieved superior predictive performance with a Mean Absolute Error (MAE) of 0.237 kWh, Root Mean Square Error (RMSE) of 0.298 kWh, and an R-squared (R2) value of 0.978, indicating a 97.8% explanation of data variance. Classification metrics further demonstrated the model's robustness, achieving 97.7% accuracy, 97.8% precision, 97.6% recall, and an F1 Score of 97.7%. Comparative analysis with traditional models like Linear Regression, Random Forest, SVM, LSTM, and XGBoost revealed the superior accuracy and real-time adaptability of the proposed method. In addition to enhancing energy efficiency, the model reduced energy costs by 35%, maintained a 96% user comfort index, and increased renewable energy utilization to 40%. This study demonstrates the transformative potential of integrating PINNs, DT, and Blockchain technologies to optimize energy consumption in smart grids, paving the way for sustainable, secure, and efficient energy management systems.
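The error metrics quoted above have standard definitions, sketched below in plain Python together with the F1 score as the harmonic mean of precision and recall. The function names and the toy data are illustrative assumptions; only the precision (97.8%) and recall (97.6%) fed into the last line come from the abstract.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Toy consumption values in kWh (made up for illustration).
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r_squared(y_true, y_pred))

# Precision and recall as reported in the abstract.
print(round(f1_score(0.978, 0.976), 3))
```

Under these definitions, the reported precision and recall yield an F1 score of about 0.977, matching the 97.7% stated in the abstract.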
Opinion mining refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in textual material. Opinion mining, also known as sentiment analysis, has received a lot of attention in recent times, as it provides a number of tools to analyse public opinion on a number of different topics. Comparative opinion mining is a subfield of opinion mining that deals with identifying and extracting information that is expressed in a comparative form (e.g. "paper X is better than paper Y"). Comparative opinion mining plays a very important role when one tries to evaluate something, as it provides a reference point for the comparison. This paper provides a review of the area of comparative opinion mining. It is the first review that specifically covers this topic, as all previous reviews dealt mostly with general opinion mining. This survey covers comparative opinion mining from two different angles: one from the perspective of techniques and the other from the perspective of comparative opinion elements. It also covers the preprocessing tools and datasets used by past researchers, which can be useful to future researchers in the field of comparative opinion mining.