University of Greenwich
Multi-modal object Re-Identification (ReID) aims to obtain accurate identity features across heterogeneous modalities. However, most existing methods rely on implicit feature fusion modules, making it difficult to model fine-grained recognition patterns under the varied challenges of real-world conditions. Benefiting from powerful Multi-modal Large Language Models (MLLMs), object appearances can be effectively translated into descriptive captions. In this paper, we propose a reliable caption generation pipeline based on attribute confidence, which significantly reduces the unknown recognition rate of MLLMs and improves the quality of the generated text. Additionally, to model diverse identity patterns, we propose a novel ReID framework named NEXT, a Multi-grained Mixture of Experts via Text-Modulation for Multi-modal Object Re-Identification. Specifically, we decouple the recognition problem into semantic and structural branches that separately capture fine-grained appearance features and coarse-grained structure features. For semantic recognition, we first propose Text-Modulated Semantic Experts (TMSE), which randomly sample high-quality captions to modulate the experts, capturing semantic features and mining inter-modality complementary cues. Second, to recognize structure features, we propose Context-Shared Structure Experts (CSSE), which focus on the holistic object structure and maintain identity structural consistency via a soft routing mechanism. Finally, we propose Multi-Grained Features Aggregation (MGFA), which adopts a unified fusion strategy to effectively integrate multi-grained experts into the final identity representations. Extensive experiments on four public datasets demonstrate the effectiveness of our method, which significantly outperforms existing state-of-the-art methods.
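The abstract above gives no implementation details; as a purely illustrative sketch, the snippet below shows a generic soft-routing mixture-of-experts layer of the kind the CSSE routing mechanism suggests. All class names, layer sizes, and the MLP expert design are assumptions, not the paper's architecture.

```python
# Minimal sketch of soft routing over experts, assuming each expert is a
# small MLP over pooled object features; names and sizes are hypothetical.
import torch
import torch.nn as nn

class SoftRoutedExperts(nn.Module):
    def __init__(self, dim: int, num_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
             for _ in range(num_experts)]
        )
        self.router = nn.Linear(dim, num_experts)  # produces soft routing weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) pooled object features
        weights = torch.softmax(self.router(x), dim=-1)           # (batch, E)
        outs = torch.stack([e(x) for e in self.experts], dim=1)   # (batch, E, dim)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)          # weighted fusion

feats = torch.randn(8, 256)
fused = SoftRoutedExperts(256)(feats)  # (8, 256)
```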
This article presents a comprehensive dataset featuring ten distinct hen breeds, sourced from various regions, capturing the unique characteristics and traits of each breed. The dataset encompasses Bielefeld, Blackorpington, Brahma, Buckeye, Fayoumi, Leghorn, Newhampshire, Plymouthrock, Sussex, and Turken breeds, offering a diverse representation of poultry commonly bred worldwide. A total of 1010 original JPG images were meticulously collected, showcasing the physical attributes, feather patterns, and distinctive features of each hen breed. These images were subsequently standardized, resized, and converted to PNG format for consistency within the dataset. The compilation, although unevenly distributed across the breeds, provides a rich resource, serving as a foundation for research and applications in poultry science, genetics, and agricultural studies. This dataset holds significant potential to contribute to various fields by enabling the exploration and analysis of unique characteristics and genetic traits across different hen breeds, thereby supporting advancements in poultry breeding, farming, and genetic research.
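As a minimal sketch of the standardization step described above (resizing the collected JPG images and converting them to PNG), assuming a 224×224 target size and a simple source/destination folder layout not specified in the abstract:

```python
# Hypothetical reproduction of the described preprocessing: resize the
# collected JPG images to a common resolution and save them as PNG.
from pathlib import Path
from PIL import Image

SRC, DST, SIZE = Path("raw_jpg"), Path("standardized_png"), (224, 224)

for jpg in SRC.rglob("*.jpg"):
    out = DST / jpg.relative_to(SRC).with_suffix(".png")
    out.parent.mkdir(parents=True, exist_ok=True)
    Image.open(jpg).convert("RGB").resize(SIZE).save(out, format="PNG")
```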
This paper investigates the critical issue of data poisoning attacks on AI models, a growing concern in the ever-evolving landscape of artificial intelligence and cybersecurity. As advanced technology systems become increasingly prevalent across various sectors, the need for robust defence mechanisms against adversarial attacks becomes paramount. The study aims to develop and evaluate novel techniques for detecting and preventing data poisoning attacks, focusing on both theoretical frameworks and practical applications. Through a comprehensive literature review, experimental validation using the CIFAR-10 and Insurance Claims datasets, and the development of innovative algorithms, this paper seeks to enhance the resilience of AI models against malicious data manipulation. The study explores various methods, including anomaly detection, robust optimization strategies, and ensemble learning, to identify and mitigate the effects of poisoned data during model training. Experimental results indicate that data poisoning significantly degrades model performance, reducing classification accuracy by up to 27% in image recognition tasks (CIFAR-10) and 22% in fraud detection models (Insurance Claims dataset). The proposed defence mechanisms, including statistical anomaly detection and adversarial training, successfully mitigated poisoning effects, improving model robustness and restoring accuracy levels by an average of 15-20%. The findings further demonstrate that ensemble learning techniques provide an additional layer of resilience, reducing false positives and false negatives caused by adversarial data injections.
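As one hedged illustration of the statistical anomaly detection idea mentioned above, the sketch below discards training samples whose per-class feature z-scores are extreme; the threshold and feature-space filtering are assumptions, not details from the paper.

```python
# Illustrative defence sketch: flag training points whose feature statistics
# deviate strongly from their class mean (mean absolute z-score rule).
import numpy as np

def filter_poisoned(X: np.ndarray, y: np.ndarray, threshold: float = 3.0):
    keep = np.ones(len(X), dtype=bool)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        mu, sigma = X[idx].mean(axis=0), X[idx].std(axis=0) + 1e-8
        z = np.abs((X[idx] - mu) / sigma).mean(axis=1)  # mean |z| per sample
        keep[idx] = z < threshold
    return X[keep], y[keep]

X = np.random.randn(1000, 32)
X[:10] += 8.0                          # crude stand-in for poisoned samples
y = np.random.randint(0, 10, 1000)
X_clean, y_clean = filter_poisoned(X, y)
```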
This paper presents a novel methodology that integrates trustworthy artificial intelligence (AI) with an energy-efficient robotic arm for intelligent waste classification and sorting. By utilizing a convolutional neural network (CNN) enhanced through transfer learning with MobileNetV2, the system accurately classifies waste into six categories: plastic, glass, metal, paper, cardboard, and trash. The model achieved a high training accuracy of 99.8% and a validation accuracy of 80.5%, demonstrating strong learning and generalization. A robotic arm simulator is implemented to perform virtual sorting, calculating the energy cost for each action using Euclidean distance to ensure optimal and efficient movement. The framework incorporates key elements of trustworthy AI, such as transparency, robustness, fairness, and safety, making it a reliable and scalable solution for smart waste management systems in urban settings.
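A minimal sketch of the described transfer-learning setup: a pretrained MobileNetV2 backbone with a new six-way classification head. The 224×224 input size, frozen backbone, and head design are assumptions; the paper's exact training configuration is not given in the abstract.

```python
# Transfer learning with MobileNetV2 for six waste categories.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze pretrained ImageNet features

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(6, activation="softmax"),  # plastic ... trash
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```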
To tackle the challenge of vehicle re-identification (Re-ID) in complex lighting environments and diverse scenes, multi-spectral sources such as visible and infrared information are taken into consideration due to their excellent complementary advantages. However, multi-spectral vehicle Re-ID suffers from cross-modality discrepancy caused by the heterogeneous properties of different modalities, as well as the challenge of diverse appearances across different views of each identity. Meanwhile, diverse environmental interference leads to heavy sample distributional discrepancy in each modality. In this work, we propose a novel cross-directional consistency network to simultaneously overcome the discrepancies from both the modality and sample aspects. In particular, we design a new cross-directional center loss that pulls the modality centers of each identity close to mitigate cross-modality discrepancy, while pulling the sample centers of each identity close to alleviate the sample discrepancy. Such a strategy generates discriminative multi-spectral feature representations for vehicle Re-ID. In addition, we design an adaptive layer normalization unit to dynamically adjust individual feature distributions, handling the distributional discrepancy of intra-modality features for robust learning. To provide a comprehensive evaluation platform, we create a high-quality RGB-NIR-TIR multi-spectral vehicle Re-ID benchmark (MSVR310), including 310 different vehicles captured over a broad range of viewpoints, time spans, and environmental complexities. Comprehensive experiments on both the created and public datasets demonstrate the effectiveness of the proposed approach compared to state-of-the-art methods.
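The paper defines the exact loss; the sketch below only illustrates the center-pulling idea for the modality direction, under assumed tensor shapes (three aligned modalities per batch). It is an interpretation of the abstract, not the paper's formula.

```python
# Per identity, compute a feature center per modality and pull the modality
# centers toward their mean; the sample-direction term would be analogous.
import torch

def modality_center_loss(feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # feats: (M, B, D) features from M modalities for the same B samples
    loss, ids = feats.new_zeros(()), labels.unique()
    for i in ids:
        centers = feats[:, labels == i].mean(dim=1)       # (M, D) modality centers
        loss = loss + (centers - centers.mean(0)).pow(2).sum(-1).mean()
    return loss / len(ids)

feats = torch.randn(3, 16, 128)                # RGB / NIR / TIR features
labels = torch.randint(0, 4, (16,))
print(modality_center_loss(feats, labels))
```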
The MITRE ATT&CK framework, a comprehensive knowledge base of adversary tactics and techniques, has been widely adopted by the cybersecurity industry as well as by academic researchers. Its broad range of industry applications includes threat intelligence, threat detection, and incident response, some of which go beyond what it was originally designed for. Despite its popularity, there is a lack of a systematic review of the applications of and the research on ATT&CK. This systematization work aims to fill this gap. To this end, it introduces the first taxonomic systematization of the research literature on ATT&CK, studies its degree of usefulness in different applications, and identifies important gaps and discrepancies in the literature in order to identify key directions for future work. The results of this work provide valuable insights for academics and practitioners alike, highlighting the need for more research on the practical implementation and evaluation of ATT&CK.
Deep neural networks have become increasingly of interest in dynamical system prediction, but out-of-distribution (OOD) generalization and long-term stability remain challenging. In this work, we treat the domain parameters of dynamical systems as factors of variation of the data-generating process. By leveraging ideas from supervised disentanglement and causal factorization, we aim to separate the domain parameters from the dynamics in the latent space of generative models. In our experiments, we model dynamics both in phase space and in video sequences and conduct rigorous OOD evaluations. Results indicate that disentangled VAEs adapt better to regions of the domain-parameter space that were not present in the training data. At the same time, disentanglement can improve the long-term and out-of-distribution predictions of state-of-the-art models on video sequences.
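A minimal sketch of the supervised-disentanglement idea described above: reserve a block of VAE latents for the known domain parameters and supervise it directly. Network sizes, the latent split, and the unweighted loss sum are assumptions, not the paper's model.

```python
# Toy disentangled VAE: the last z_dom latents are supervised to match the
# known domain parameters, separating them from the dynamics latents.
import torch
import torch.nn as nn

class DisentangledVAE(nn.Module):
    def __init__(self, x_dim=32, z_dyn=14, z_dom=2):
        super().__init__()
        z = z_dyn + z_dom
        self.enc = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(), nn.Linear(128, 2 * z))
        self.dec = nn.Sequential(nn.Linear(z, 128), nn.ReLU(), nn.Linear(128, x_dim))
        self.z_dom = z_dom

    def forward(self, x, domain_params):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()    # reparameterization
        recon = ((self.dec(z) - x) ** 2).mean()
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        # supervise the reserved latents with the known domain parameters
        sup = ((z[:, -self.z_dom:] - domain_params) ** 2).mean()
        return recon + kl + sup

x, dom = torch.randn(8, 32), torch.randn(8, 2)
loss = DisentangledVAE()(x, dom)
```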
This paper presents the development of a process automation architecture leveraging Radio Frequency Identification (RFID) technology for secure, transparent and efficient voting systems. The proposed architecture automates the voting workflow through RFID-enabled voter identification, encrypted vote casting, and secure data transmission. Each eligible voter receives a smart RFID card containing a uniquely encrypted identifier, which is verified using an RC522 reader interfaced with a microcontroller. Upon successful verification, the voter interacts with a touchscreen interface to cast a vote, which is then encrypted using AES-128 and securely stored on a local SD card or transmitted via GSM to a central server. A tamper-proof monitoring mechanism records each session with time-stamped digital signatures, ensuring auditability and data integrity. The architecture is designed to function in both online and offline modes, with an automated batch synchronization mechanism that updates vote records once network connectivity is restored. System testing in simulated environments confirmed 100% voter authentication accuracy, minimized latency (average voting time of 11.5 seconds), and robustness against cloning, double voting, and data interception. The integration of real-time monitoring and secure process control modules enables electoral authorities to automate data logging, detect anomalies, and validate system integrity dynamically. This work demonstrates a scalable, automation-driven solution for voting infrastructure, offering enhanced transparency, resilience, and deployment flexibility, especially in environments where digital transformation of electoral processes is critically needed.
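As a hedged sketch of the AES-128 vote-encryption step, using pycryptodome: the abstract specifies only AES-128, so the GCM mode, the JSON record layout, and the key handling below are assumptions.

```python
# Encrypt a vote record with AES-128; nonce and tag are stored with the
# ciphertext so the record can be decrypted and integrity-checked later.
import json
from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes

key = get_random_bytes(16)  # 128-bit key, provisioned per deployment

def encrypt_vote(voter_id: str, candidate: str, key: bytes) -> bytes:
    record = json.dumps({"voter": voter_id, "vote": candidate}).encode()
    cipher = AES.new(key, AES.MODE_GCM)
    ciphertext, tag = cipher.encrypt_and_digest(record)
    return cipher.nonce + tag + ciphertext

blob = encrypt_vote("RFID-0042", "candidate_A", key)
```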
Customisation of food properties is a challenging task that involves optimising the production process while supporting computational creativity, which is geared towards ensuring the presence of alternatives. This paper addresses the personalisation of beer properties in the specific case of craft beers, where the production process is more flexible. We investigate the problem by using three swarm intelligence and evolutionary computation techniques that enable brewers to map physico-chemical properties to target organoleptic properties to design a specific brew. While there are several tools, based on the original mathematical and chemistry formulas or on machine learning models, that determine beer properties from pre-determined quantities of ingredients, the next step is to investigate an automated quantitative ingredient selection approach. The process is illustrated by a number of experiments designing craft beers, where the results are investigated by "cloning" popular commercial brands based on their known properties. The algorithms' performance is evaluated using accuracy, efficiency, reliability, population diversity, iteration-based improvements, and solution diversity. The proposed approach allows for the discovery of new recipes, personalisation, and alternative high-fidelity reproduction of existing ones.
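To make the ingredient-selection step concrete, here is a generic particle swarm optimisation sketch that searches for ingredient quantities whose predicted properties match a target profile. The `predict_properties` function is a placeholder for the paper's chemistry formulas or machine learning models, and all hyperparameters are illustrative.

```python
# PSO over ingredient quantities: minimize the distance between predicted
# and target beer properties. The property model here is a stand-in.
import numpy as np

W = np.random.RandomState(0).rand(6, 3)   # placeholder ingredient->property map

def predict_properties(q: np.ndarray) -> np.ndarray:
    return q @ W

def pso(target, n=30, dims=6, iters=200, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(1)
    x = rng.uniform(0, 1, (n, dims))
    v = np.zeros_like(x)
    cost = lambda q: np.linalg.norm(predict_properties(q) - target)
    pbest, pcost = x.copy(), np.array([cost(p) for p in x])
    g = pbest[pcost.argmin()]
    for _ in range(iters):
        r1, r2 = rng.random((2, n, dims))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, 0, 1)               # keep quantities feasible
        c = np.array([cost(p) for p in x])
        improved = c < pcost
        pbest[improved], pcost[improved] = x[improved], c[improved]
        g = pbest[pcost.argmin()]
    return g

best_recipe = pso(target=np.array([0.5, 1.2, 0.8]))
```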
Transforming food systems is essential to bring about a healthier, equitable, sustainable, and resilient future, including achieving global development and sustainability goals. To date, no comprehensive framework exists to track food systems transformation and their contributions to global goals. In 2021, the Food Systems Countdown to 2030 Initiative (FSCI) articulated an architecture to monitor food systems across five themes: (1) diets, nutrition, and health; (2) environment, natural resources, and production; (3) livelihoods, poverty, and equity; (4) governance; and (5) resilience and sustainability. Each theme comprises three-to-five indicator domains. This paper builds on that architecture, presenting the inclusive, consultative process used to select indicators and an application of the indicator framework using the latest available data, constructing the first global food systems baseline to track transformation. While data are available to cover most themes and domains, critical indicator gaps exist such as off-farm livelihoods, food loss and waste, and governance. Baseline results demonstrate every region or country can claim positive outcomes in some parts of food systems, but none are optimal across all domains, and some indicators are independent of national income. These results underscore the need for dedicated monitoring and transformation agendas specific to food systems. Tracking these indicators to 2030 and beyond will allow for data-driven food systems governance at all scales and increase accountability for urgently needed progress toward achieving global goals.
Market illiquidity, feedback effects, the presence of transaction costs, risk from an unprotected portfolio, and other nonlinear effects in PDE-based option pricing models can be described by solutions to the generalized Black-Scholes parabolic equation with a diffusion term that depends nonlinearly on the option price itself. Different linearization techniques, such as Newton's method and an analytic asymptotic approximation formula, are adopted and compared for a wide class of nonlinear Black-Scholes equations including, in particular, the market illiquidity model and the risk-adjusted pricing model. The accuracy and time complexity of both numerical methods are compared. Furthermore, market quote data were used to calibrate the model parameters.
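For concreteness, equations in this class can be written schematically as below. The exact diffusion function depends on the chosen model; the risk-adjusted pricing form in the comment reflects the standard literature on that model rather than a formula quoted from this paper.

```latex
% Generalized Black-Scholes equation with price-dependent diffusion:
\[
  \partial_t V
  + \tfrac{1}{2}\,\hat\sigma^2\!\left(S,\partial_S^2 V\right) S^2\,\partial_S^2 V
  + r S\,\partial_S V - r V = 0 .
\]
% E.g., in the risk-adjusted pricing model the adjusted volatility reads
% \hat\sigma^2 = \sigma^2\bigl(1 - \mu\,(S\,\partial_S^2 V)^{1/3}\bigr).
```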
Due to digitalization in everyday life, the need for automatically recognizing handwritten digits is increasing. Handwritten digit recognition is essential for numerous applications in various industries. Bengali ranks as the fifth-largest language in the world, with 265 million speakers (native and non-native combined), and 4 percent of the world's population speaks Bengali. Due to the complexity of Bengali writing in terms of variety in shape, size, and writing style, researchers have not achieved high accuracy using supervised machine learning algorithms to date. Moreover, few studies have been done on Bangla handwritten digit recognition (BHwDR). In this paper, we propose a novel CNN-based pre-trained handwritten digit recognition model, evaluating ResNet-50, Inception-v3, and EfficientNetB0 on the NumtaDB dataset of 17 thousand instances across 10 classes. The results outperform those of other models to date, with 97% accuracy on the 10 digit classes. Furthermore, we compare the results of our model with other research studies and suggest directions for future study.
In the rapidly evolving landscape of 5G and B5G (beyond 5G) networks, efficient resource optimization is critical to addressing the escalating demands for high-speed, low-latency, and energy-efficient communication. This study explores the integration of Radio Frequency Identification (RFID) technology as a novel approach to enhance resource management in 5G/B5G networks. The motivation behind this research lies in overcoming persistent challenges such as spectrum congestion, high latency, and inefficient load balancing, which impede the performance of traditional resource allocation methods. To achieve this, RFID tags were embedded in critical network components, including user devices, base stations, and Internet of Things (IoT) nodes, enabling the collection of real-time data on device status, location, and resource utilization. RFID readers strategically placed across the network continuously captured this data, which was processed by a centralized controller using a custom-designed optimization algorithm. This algorithm dynamically managed key network resources, including spectrum allocation, load balancing, and energy consumption, ensuring efficient operation under varying network conditions. Simulations were conducted to evaluate the performance of the RFID-based model against traditional 4G dynamic resource allocation techniques. The results demonstrated substantial improvements in key performance metrics.
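The paper's optimization algorithm is custom and not specified in the abstract; as a purely illustrative baseline of the load-balancing step it describes, the sketch below assigns RFID-reported devices to the base station with the most spare capacity. All names and numbers are hypothetical.

```python
# Greedy load balancing driven by RFID-reported device demands.
def assign_devices(devices, stations):
    # devices: list of (device_id, demand); stations: dict station -> capacity
    load = {s: 0 for s in stations}
    plan = {}
    for dev, demand in sorted(devices, key=lambda d: -d[1]):  # big demands first
        best = max(stations, key=lambda s: stations[s] - load[s])
        plan[dev] = best
        load[best] += demand
    return plan

plan = assign_devices([("ue1", 5), ("ue2", 3), ("iot1", 1)],
                      {"bs_a": 10, "bs_b": 8})
```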
Integrated Access and Backhaul (IAB) has recently been proposed by 3GPP to enable network operators to deploy fifth-generation (5G) mobile networks with reduced costs. In this paper, we propose to use IAB to build a dynamic wireless backhaul network capable of providing additional capacity to those Base Stations (BSs) experiencing momentary congestion. As the mobile traffic demand varies across time and space, and the number of slice combinations deployed in a BS can be prohibitively high, we propose to use Deep Reinforcement Learning (DRL) to select, from a set of candidate BSs, the one that can provide backhaul capacity for each of the slices deployed in a congested BS. Our results show that a Double Deep Q-Network (DDQN) agent using a fully connected neural network with only one hidden layer and the Rectified Linear Unit (ReLU) activation function can perform the BS selection task successfully, without any failure during the test phase, after being trained for around 20 episodes.
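The Q-network described above is specific enough to sketch directly: fully connected, a single hidden layer, ReLU activation. The hidden width, state features, and the DDQN training loop are not given in the abstract and are assumed or omitted here.

```python
# Q-network for candidate-BS selection, as described: one hidden layer, ReLU.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, n_features: int, n_candidate_bs: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_candidate_bs),  # one Q-value per candidate BS
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

online, target = QNetwork(10, 5), QNetwork(10, 5)
target.load_state_dict(online.state_dict())  # DDQN keeps a separate target net
```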
Online marketplaces are the main engines of legal and illegal e-commerce, yet their empirical properties are poorly understood due to the absence of large-scale data. We analyze two comprehensive datasets containing 245M transactions (16B USD) that took place on online marketplaces between 2010 and 2021, covering 28 dark web marketplaces, i.e., unregulated markets whose main currency is Bitcoin, and 144 product markets of one popular regulated e-commerce platform. We show that transactions in online marketplaces exhibit strikingly similar patterns despite significant differences in language, lifetimes, products, regulation, and technology. Specifically, we find remarkable regularities in the distributions of transaction amounts, number of transactions, inter-event times and time between first and last transactions. We show that buyer behavior is affected by the memory of past interactions and use this insight to propose a model of network formation reproducing our main empirical observations. Our findings have implications for understanding market power on online marketplaces as well as inter-marketplace competition, and provide empirical foundation for theoretical economic models of online marketplaces.
This report details the development of a networked distributed system named Group Communication System (GCS), implemented in Java to exemplify socket programming and communication protocols. GCS facilitates group-based client-server communication through a command-line interface (CLI), enabling seamless group interaction and management. The project emphasizes fault tolerance, design patterns, and version control system (VCS) utilization. The report offers insights into system architecture, implementation, and practical considerations, providing a comprehensive understanding of distributed systems' technical background and operational aspects.
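GCS itself is implemented in Java; the following Python analogue sketches the same socket pattern the report describes, a server relaying each client's messages to the other members of a group. The port, framing, and single-group simplification are assumptions.

```python
# Minimal group-broadcast server: each received message is forwarded to
# every other connected client, one handler thread per connection.
import socket
import threading

clients = []

def handle(conn: socket.socket):
    clients.append(conn)
    try:
        while data := conn.recv(1024):
            for other in clients:
                if other is not conn:
                    other.sendall(data)   # relay to the rest of the group
    finally:
        clients.remove(conn)
        conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 5000))
server.listen()
while True:
    conn, _ = server.accept()
    threading.Thread(target=handle, args=(conn,), daemon=True).start()
```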
Modern technology has drastically changed the way we interact and consume information. For example, online social platforms allow for seamless communication exchanges at an unprecedented scale. However, we are still bound by cognitive and temporal constraints. Our attention is limited and extremely valuable. Algorithmic personalisation has become a standard approach to tackling the information overload problem. As a result, our exposure to our friends' opinions and our perception of important issues might be distorted. However, the effects of algorithmic gatekeeping on our hyper-connected society are poorly understood. Here, we devise an opinion dynamics model where individuals are connected through a social network and adopt opinions as a function of the viewpoints they are exposed to. We apply various filtering algorithms that select the opinions shown to users (i) at random, (ii) considering time ordering, or (iii) according to users' current beliefs. Furthermore, we investigate the interplay between such mechanisms and crucial features of real networks. We find that algorithmic filtering can influence the share and distribution of opinions, especially when information is biased towards the current opinion of each user. These effects are reinforced in networks featuring topological and spatial correlations, where echo chambers and polarisation emerge. Conversely, heterogeneity in connectivity patterns reduces this tendency. We also consider a scenario where one opinion is centrally pushed to all users through nudging. Interestingly, even minimal nudging is able to change the status quo, moving it towards the desired viewpoint. Our findings suggest that simple filtering algorithms might be powerful tools for regulating the opinion dynamics taking place on social networks.
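A toy version of such a model is easy to simulate: agents on a network nudge their opinion toward a filtered sample of their neighbours' views. The update rule, network, and parameters below are illustrative assumptions; only the random and belief-based filters are sketched, since recency filtering would require message timestamps.

```python
# Toy opinion dynamics with algorithmic filtering of exposed opinions.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
G = nx.watts_strogatz_graph(200, 6, 0.1)
opinion = rng.uniform(-1, 1, G.number_of_nodes())

def filtered(i, mode="similarity", k=3):
    nbrs = list(G.neighbors(i))
    if mode == "random":                      # (i) random exposure
        return list(rng.choice(nbrs, size=min(k, len(nbrs)), replace=False))
    # (iii) belief-based filtering: show the most similar opinions first
    return sorted(nbrs, key=lambda j: abs(opinion[j] - opinion[i]))[:k]

for _ in range(100):
    i = int(rng.integers(G.number_of_nodes()))
    shown = filtered(i)
    opinion[i] += 0.1 * (opinion[shown].mean() - opinion[i])
```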
This paper presents the design and characterization of a rectangular microstrip patch antenna array optimized for operation within the Ku-band frequency range. The antenna array is impedance-matched to 50 Ohms and utilizes a microstrip line feeding mechanism for excitation. The design maintains compact dimensions, with the overall antenna occupying an area of 29.5 × 7 mm. The antenna structure is modelled on an RO3003 substrate material, featuring a dielectric constant of 3, a low loss tangent of 0.0009, and a thickness of 1.574 mm. The substrate is backed by a conducting ground plane, and the array consists of six radiating patch elements positioned on top. Evaluation of the designed antenna array reveals a resonant frequency of 18 GHz, with a -10 dB impedance bandwidth extending over 700 MHz. The antenna demonstrates a high gain of 7.51 dBi, making it well-suited for applications in 5G and future communication systems. Its compact form factor, cost-effectiveness, and broad impedance and radiation coverage further underscore its potential in these domains.
Energy consumption in robotic arms is a significant concern in industrial automation due to rising operational costs and environmental impact. This study investigates the use of a local reduction method to optimize energy efficiency in robotic systems without compromising performance. The approach refines movement parameters, minimizing energy use while maintaining precision and operational reliability. A three-joint robotic arm model was tested using simulation over a 30-second period for various tasks, including pick-and-place and trajectory-following operations. The results revealed that the local reduction method reduced energy consumption by up to 25% compared to traditional techniques such as Model Predictive Control (MPC) and Genetic Algorithms (GA). Unlike MPC, which requires significant computational resources, and GA, which has slow convergence rates, the local reduction method demonstrated superior adaptability and computational efficiency in real-time applications. The study highlights the scalability and simplicity of the local reduction approach, making it an attractive option for industries seeking sustainable and cost-effective solutions. Additionally, this method can integrate seamlessly with emerging technologies like Artificial Intelligence (AI), further enhancing its application in dynamic and complex environments. This research underscores the potential of the local reduction method as a practical tool for optimizing robotic arm operations, reducing energy demands, and contributing to sustainability in industrial automation. Future work will focus on extending the approach to real-world scenarios and incorporating AI-driven adjustments for more dynamic adaptability.
Recent studies have experimentally shown that we can achieve effective and efficient graph embedding in non-Euclidean metric space, which aims to obtain the vertices' representations reflecting the graph's structure in the metric space. Specifically, graph embedding in hyperbolic space has experimentally succeeded in embedding graphs with a hierarchical tree structure, e.g., data in natural languages, social networks, and knowledge bases. However, recent theoretical analyses have shown a much higher upper bound on non-Euclidean graph embedding's generalization error than the Euclidean one's, where a high generalization error indicates that the incompleteness and noise in the data can significantly damage learning performance. It implies that the existing bound cannot guarantee the success of graph embedding in non-Euclidean metric space at a practical training data size, which can prevent non-Euclidean graph embedding's application to real problems. This paper provides a novel upper bound on graph embedding's generalization error by evaluating the local Rademacher complexity of the model as a function set of the distances of representation couples. Our bound clarifies that the performance of graph embedding in non-Euclidean metric space, including hyperbolic space, is better than the existing upper bounds suggest. Specifically, our new upper bound is polynomial in the metric space's geometric radius R and can be O(1/S) at the fastest, where S is the training data size. Our bound is significantly tighter and faster than the existing one, which can be exponential in R and is O(1/√S) at the fastest. Specific calculations on example cases show that graph embedding in non-Euclidean metric space can outperform that in Euclidean space with much smaller training data than the existing bound has suggested.
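Schematically, suppressing constants and logarithmic factors, the improvement stated in the abstract can be summarized as follows (the exponential form of the existing bound is a schematic reading of "can be exponential in R"):

```latex
\[
  \text{existing bound: } \tilde O\!\left(\frac{e^{R}}{\sqrt{S}}\right)
  \qquad \text{vs.} \qquad
  \text{proposed bound: } \tilde O\!\left(\frac{\operatorname{poly}(R)}{S}\right),
\]
% where R is the geometric radius of the metric space and S the training data size.
```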