Rare-earth doping is an effective way to convert chemically stable oxides into multifunctional materials with coupled electronic, optical, and magnetic properties. We present first-principles calculations of pristine and Tm3+-doped Ca2SnO4 to understand how localized 4f states change the structural, electronic, magnetic, and optical behavior of the host. Pristine Ca2SnO4 is a mechanically stable, wide-band-gap insulator with predominantly mixed ionic-covalent bonding and diamagnetic character. Replacing Ca2+ with Tm3+ introduces several key changes: (i) localized Tm 4f states create intermediate levels inside the wide gap, reducing the optical band gap; (ii) exchange and spin-orbit interactions generate strong local magnetic moments and spin asymmetry near the conduction band; (iii) electron localization function analysis shows enhanced covalency and electron pockets that stabilize luminescent centers; and (iv) the optical response exhibits visible-range absorption, refractive-index features, and low-energy plasmon peaks while maintaining high-energy dielectric stability. These effects make Tm-doped Ca2SnO4 a mechanically robust, optically tunable, and magnetically active oxide phosphor suitable for red emission, intermediate-band photovoltaics, and spin-photon coupling. More broadly, our results show how targeted rare-earth substitution can enable multifunctionality in wide-gap stannates and guide the design of next-generation spintronic-photonic oxides.
Stroke is one of the leading causes of death globally, making early and accurate diagnosis essential for improving patient outcomes, particularly in emergency settings where timely intervention is critical. CT scans are the key imaging modality because of their speed, accessibility, and cost-effectiveness. This study proposes an artificial intelligence framework for multiclass stroke classification (ischemic, hemorrhagic, and no stroke) using CT scan images from a dataset provided by the Republic of Turkey's Ministry of Health. The proposed method adopts MaxViT, a state-of-the-art Vision Transformer, as the primary deep learning model for image-based stroke classification, with additional models evaluated for comparison (Vision Transformer, Transformer-in-Transformer, and ConvNeXt). To enhance model generalization and address class imbalance, we applied data augmentation techniques, including synthetic image generation. The MaxViT model trained with augmentation achieved the best performance, reaching an accuracy and F1-score of 98.00% and outperforming all other evaluated models and the baseline methods. The primary goal of this study was to distinguish between stroke types with high accuracy while addressing crucial issues of transparency and trust in artificial intelligence models. To achieve this, Explainable Artificial Intelligence (XAI), in particular Grad-CAM++, was integrated into the framework: it provides visual explanations of the model's decisions by highlighting relevant stroke regions in the CT scans, establishing an accurate, interpretable, and clinically applicable solution for early stroke detection. This research contributes to the development of a trustworthy AI-assisted diagnostic tool for stroke, facilitating its integration into clinical practice and enhancing access to timely and optimal stroke diagnosis in emergency departments, thereby saving more lives.
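A minimal usage sketch of the Grad-CAM++ step described above, assuming the open-source pytorch-grad-cam package; a torchvision ResNet-18 stands in for the study's MaxViT classifier, and the input tensor is a placeholder for a preprocessed CT slice.

```python
import torch
from torchvision.models import resnet18
from pytorch_grad_cam import GradCAMPlusPlus
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

# Stand-in classifier; the study's MaxViT weights and CT preprocessing are not reproduced here.
model = resnet18(weights=None).eval()
target_layers = [model.layer4[-1]]          # last convolutional block
input_tensor = torch.randn(1, 3, 224, 224)  # placeholder for one preprocessed CT slice

cam = GradCAMPlusPlus(model=model, target_layers=target_layers)
heatmap = cam(input_tensor=input_tensor,
              targets=[ClassifierOutputTarget(1)])  # e.g. the index of one stroke class
print(heatmap.shape)  # (1, 224, 224): per-pixel relevance over the input slice
```

The resulting map can be overlaid on the CT slice so a clinician can check whether the regions driving the prediction coincide with plausible stroke findings.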
Melanoma is a malignant tumor that originates from skin cell lesions. Accurate and efficient segmentation of skin lesions is essential for quantitative analysis but remains a challenge due to blurred lesion boundaries, gradual color changes, and irregular shapes. To address this, we propose ScaleFusionNet, a hybrid model that integrates a Cross-Attention Transformer Module (CATM) and an Adaptive Fusion Block (AFB) to enhance feature extraction and fusion by capturing both local and global features. The CATM utilizes Swin Transformer blocks and Cross-Attention Fusion (CAF) to adaptively refine feature fusion and reduce semantic gaps between the encoder and decoder, improving segmentation accuracy. Additionally, the AFB uses Swin Transformer-based attention and deformable convolution-based adaptive feature extraction to help the model gather local and global contextual information through parallel pathways. This refines the lesion boundaries and preserves fine-grained details. ScaleFusionNet achieves Dice scores of 92.94% and 91.80% on the ISIC-2016 and ISIC-2018 datasets, respectively, demonstrating its effectiveness in skin lesion analysis. In addition, independent validation experiments were conducted on the PH2 dataset using the pretrained model weights; the results show that ScaleFusionNet delivers significant performance improvements compared with other state-of-the-art methods. Our code implementation is publicly available on GitHub.
This survey paper by researchers from Queen Mary University of London provides a comprehensive review of automated fact-checking, focusing on the pipelines of claim detection and claim validation. It systematically analyzes current methodologies and available datasets, and identifies prevailing challenges and future research directions in the field.
Biometric authentication has garnered significant attention as a secure and efficient method of identity verification. Among the various modalities, hand vein biometrics, including finger vein, palm vein, and dorsal hand vein recognition, offer unique advantages due to their high accuracy, low susceptibility to forgery, and non-intrusiveness. The vein patterns within the hand are highly complex and distinct for each individual, making them an ideal biometric identifier. Additionally, hand vein recognition is contactless, enhancing user convenience and hygiene compared to other modalities such as fingerprint or iris recognition. Furthermore, the veins are internally located, rendering them less susceptible to damage or alteration, thus enhancing the security and reliability of the biometric system. The combination of these factors makes hand vein biometrics a highly effective and secure method for identity verification. This review delves into the latest advancements in deep learning techniques applied to finger vein, palm vein, and dorsal hand vein recognition. It covers the essential fundamentals of hand vein biometrics, summarizes publicly available datasets, and discusses the state-of-the-art metrics used to evaluate the three modalities. Moreover, it provides a comprehensive overview of proposed approaches for finger, palm, dorsal, and multimodal vein techniques, offering insights into the best performance achieved, data augmentation techniques, and effective transfer learning methods, along with the associated pretrained deep learning models. Additionally, the review addresses the research challenges faced and outlines future directions and perspectives, encouraging researchers to enhance existing methods and propose innovative techniques.
Engineering quantum resources that survive environmental temperature is of great interest for modern quantum technologies. However, synthesizing such quantum states is a tricky task. Here, we propose a scheme to generate tripartite entanglement and quantum coherence that are highly resilient against thermal fluctuations. Our benchmark model consists of a mechanical resonator driven by two electromagnetic fields, which are optically coupled. A modulated photon-hopping parameter J captures the optical coupling, and each optical cavity hosts saturable gain or loss. When the saturable gain/loss is off, we observe a slight enhancement of both tripartite entanglement and quantum coherence for an appropriate tuning of the phase modulation. When the saturation effects are turned on, we observe a significant enhancement of the tripartite entanglement, up to one order of magnitude, together with a moderate improvement of the quantum coherence. More interestingly, our results show that the threshold thermal phonon number for preserving tripartite entanglement in our proposal is pushed up by as much as two orders of magnitude compared with the case where the saturation effects are not accounted for. The inclusion of saturable gain/loss in our proposal induces noise-tolerant quantum resources and may lead to room-temperature quantum applications such as quantum information processing and quantum computational tasks. Our findings are quite general and suggest nonlinear saturation effects as a tool for engineering thermal-immune quantum correlations.
The process of data mining produces various patterns from a given data source. The most recognized data mining tasks are the discovery of frequent itemsets, frequent sequential patterns, frequent sequential rules, and frequent association rules, and numerous efficient algorithms have been proposed for these tasks. Frequent pattern mining has been a focal topic in data mining research with a large body of literature, and important progress has been made, ranging from efficient algorithms for frequent itemset mining in transaction databases to more complex problems such as sequential pattern mining, structured pattern mining, and correlation mining. Association Rule Mining (ARM) is one of the most widely used data mining techniques; it groups objects together from large databases with the aim of extracting interesting correlations and relations among huge amounts of data. In this article, we provide a brief review and analysis of the current status of frequent pattern mining and discuss some promising research directions. Additionally, this paper includes a comparative study of the performance of the described approaches.
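To make the core notion of frequent itemset mining concrete, here is a tiny, naive enumeration in Python that counts itemset support against a minimum threshold; it illustrates the concept only, not one of the optimized algorithms (e.g., Apriori or FP-Growth) surveyed in the paper, and the toy transactions are invented.

```python
from itertools import combinations

transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk"}]
min_support = 2  # an itemset is "frequent" if it appears in at least 2 transactions

items = sorted(set().union(*transactions))
frequent = {}
for size in (1, 2):
    for itemset in combinations(items, size):
        support = sum(set(itemset) <= t for t in transactions)  # count containing transactions
        if support >= min_support:
            frequent[itemset] = support

print(frequent)
# {('bread',): 3, ('butter',): 2, ('milk',): 3, ('bread', 'butter'): 2, ('bread', 'milk'): 2}
```

Association rules such as {bread} -> {butter} are then derived from these frequent itemsets by comparing their supports (confidence = support of the pair divided by support of the antecedent).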
We propose a novel method for generating high-resolution videos of talking heads from speech audio and a single 'identity' image. Our method is based on a convolutional neural network model that incorporates a pre-trained StyleGAN generator. We model each frame as a point in the latent space of StyleGAN, so that a video corresponds to a trajectory through the latent space. The network is trained in two stages. The first stage models trajectories in the latent space conditioned on speech utterances. To do this, we use an existing encoder to invert the generator, mapping each video frame into the latent space. We train a recurrent neural network to map speech utterances to displacements in the latent space of the image generator. These displacements are relative to the back-projection into the latent space of an identity image chosen from the individuals depicted in the training dataset. In the second stage, we improve the visual quality of the generated videos by tuning the image generator on a single image or a short video of any chosen identity. We evaluate our model on standard measures (PSNR, SSIM, FID, and LMD) and show that it significantly outperforms recent state-of-the-art methods on one of two commonly used datasets and gives comparable performance on the other. Finally, we report on ablation experiments that validate the components of the model. The code and videos from the experiments can be found at this https URL
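As a rough illustration of the first training stage, the sketch below (not the authors' code; dimensions and feature choices are assumptions) shows a recurrent network that maps per-frame speech features to latent displacements added to the inverted latent code of the identity image; the StyleGAN generator and the inversion encoder themselves are omitted.

```python
import torch
import torch.nn as nn

class Speech2Latent(nn.Module):
    def __init__(self, audio_dim=80, latent_dim=512, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(audio_dim, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, latent_dim)

    def forward(self, audio_feats, identity_latent):
        # audio_feats: (batch, frames, audio_dim), e.g. mel-spectrogram frames
        # identity_latent: (batch, latent_dim), inverted latent code of the identity image
        h, _ = self.rnn(audio_feats)
        displacements = self.proj(h)                          # (batch, frames, latent_dim)
        return identity_latent.unsqueeze(1) + displacements  # latent trajectory per frame

model = Speech2Latent()
trajectory = model(torch.randn(2, 100, 80), torch.randn(2, 512))  # (2, 100, 512)
```

Each point of the predicted trajectory would then be decoded by the frozen StyleGAN generator to produce one video frame.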
Abusive language detection has become an increasingly important task as a means to tackle harmful content in social media. A substantial body of research has developed models for determining whether a social media post is abusive; however, this research has primarily focused on social media posts in isolation, overlooking the additional context that can be derived from surrounding posts. In this study, we look at conversational exchanges, where a user replies to an earlier post by another user (the parent tweet). We ask: does leveraging context from the parent tweet help determine whether a reply post is abusive, and which features contribute the most? We study a range of content-based and account-based features derived from the context, and compare them to the more widely studied approach of using only features from the reply tweet. For a more generalizable study, we test four different classification models on a dataset of conversational exchanges (parent-reply tweet pairs) with replies labeled as abusive or not. Our experiments show that incorporating contextual features leads to substantial improvements compared to using features derived from the reply tweet only, confirming the importance of leveraging context. We observe that, among the features under study, it is especially the content-based features (what is being posted) that contribute to classification performance, rather than the account-based features (who is posting it). When using content-based features, it is best to combine a range of different features rather than being more selective and using fewer, as the combination ensures improved performance. Our study provides insights into the development of contextualized abusive language detection models in realistic settings involving conversations.
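A minimal sketch of the comparison described above, using scikit-learn with invented toy examples: one model sees only the reply's content features, while the other concatenates the parent tweet's features as context.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

parents = ["you should read the article first", "great thread, thanks"]
replies = ["nobody asked for your opinion, idiot", "agreed, very helpful"]
labels = [1, 0]  # 1 = abusive reply, 0 = not abusive

vec_parent, vec_reply = TfidfVectorizer(), TfidfVectorizer()
Xp = vec_parent.fit_transform(parents)   # context (parent tweet) content features
Xr = vec_reply.fit_transform(replies)    # reply-only content features

reply_only = LogisticRegression().fit(Xr, labels)                 # baseline
with_context = LogisticRegression().fit(hstack([Xr, Xp]), labels) # reply + parent context
```

In the study, richer content- and account-based features and several classifiers are compared; this sketch only shows the structural difference between the reply-only and contextualized setups.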
The performance of text classification methods has improved greatly over the last decade for text instances of fewer than 512 tokens. This limit has been adopted by most state-of-the-art transformer models due to the high computational cost of analyzing longer text instances. To mitigate this problem and to improve classification for longer texts, researchers have sought to resolve the underlying causes of the computational cost and have proposed optimizations for the attention mechanism, which is the key element of every transformer model. In our study, we do not pursue the ultimate goal of long text classification, i.e., the ability to analyze entire text instances at one time while preserving high performance at a reasonable computational cost. Instead, we propose a text truncation method called Text Guide, in which the original text length is reduced to a predefined limit in a manner that improves performance over naive and semi-naive approaches while preserving low computational costs. Text Guide builds on the concept of feature importance, a notion from the explainable artificial intelligence domain. We demonstrate that Text Guide can be used to improve the performance of recent language models specifically designed for long text classification, such as Longformer. Moreover, we found that parameter optimization is the key to Text Guide's performance and must be conducted before the method is deployed. Future experiments may reveal additional benefits provided by this new method.
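The following sketch illustrates importance-guided truncation in the spirit of Text Guide (it is not the authors' implementation; the sentence-level granularity, the importance scores, and the tokenizer are placeholder assumptions): the highest-importance pieces of text are kept, in their original order, until the token budget is filled.

```python
def truncate_by_importance(sentences, importance, tokenizer, max_tokens=512):
    """sentences: list of str; importance: list of float (same length)."""
    order = sorted(range(len(sentences)), key=lambda i: importance[i], reverse=True)
    kept, used = set(), 0
    for i in order:
        n = len(tokenizer(sentences[i]))
        if used + n > max_tokens:
            continue          # skip pieces that would exceed the budget
        kept.add(i)
        used += n
    # restore the original order so the truncated text stays readable
    return " ".join(sentences[i] for i in sorted(kept))

# Toy example with a whitespace tokenizer standing in for a model tokenizer
text = truncate_by_importance(
    ["Background info.", "Key finding about the label.", "Unrelated aside."],
    importance=[0.1, 0.9, 0.05],
    tokenizer=str.split,
    max_tokens=6,
)
print(text)  # "Key finding about the label."
```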
Swarm behavior emerges from the local interactions of agents with each other and with their environment, often encoded as simple rules. Extracting these rules by watching a video of the overall swarm behavior could help us study and control swarm behavior in nature, or artificial swarms designed by external actors; it could also serve as a new source of inspiration for swarm robotics. Yet extracting such rules is challenging, as there is often no visible link between the emergent properties of the swarm and the local interactions. To this end, we develop a method to automatically extract understandable swarm controllers from video demonstrations. The method uses evolutionary algorithms driven by a fitness function that compares eight high-level swarm metrics. It is able to extract many controllers (behavior trees) in a simple collective movement task. We then provide a qualitative analysis of cases that resulted in different trees but similar behaviors. This provides a first step toward the automatic extraction of swarm controllers based on observations.
In semantic technologies, a shared common understanding of the structure of information among agents (people or software) can be realized by building an ontology. To do this, it is imperative for an ontology builder to answer several questions: a) What are the main components of an ontology? b) What does an ontology look like and how does it work? c) Is it necessary to consider reusing existing ontologies? d) What is the complexity of the ontology to be developed? e) What are the principles of ontology design and development? f) How is an ontology evaluated? This paper answers all of these key questions. Its aim is to present a set of guiding principles to help ontology developers, as well as inexperienced users, answer such questions.
Deep convolutional neural networks (deep CNNs) have achieved promising performance for single-image super-resolution. In particular, the Deep CNN with Skip Connection and Network in Network (DCSCN) architecture has been successfully applied to natural image super-resolution. In this work we propose an approach called SDT-DCSCN that jointly performs super-resolution and deblurring of low-resolution blurry text images based on DCSCN. Our approach uses subsampled blurry images as input and the original sharp images as ground truth. The architecture uses a higher number of filters in the input CNN layer for a better analysis of the text details. The quantitative and qualitative evaluation on different datasets proves the high performance of our model in reconstructing high-resolution and sharp text images. In addition, in terms of computational time, our proposed method gives competitive performance compared to state-of-the-art methods.
The tasks of semantic web services (discovery, selection, composition, and execution) are intended to enable seamless interoperation between systems while keeping human intervention to a minimum. In Web service description research, exploiting service descriptions through semantics provides better support for the Web service life-cycle. The large number of ontologies, representation languages, and integrated frameworks developed to support the discovery, composition, and invocation of services is a good indicator that research in the field of Semantic Web Services (SWS) has been considerably active. In this paper we provide a detailed classification of the approaches and solutions, indicating their core characteristics and objectives, and we provide pointers for the interested reader to follow up on further insights and details about these solutions and related software.
Running parallel applications requires special and expensive processing resources to obtain the required results within a reasonable time. Before parallelizing serial applications, it is recommended to carry out some analysis to decide whether they will benefit from parallelization or not. In this paper we discuss the speedup gained from parallelization using the Message Passing Interface (MPI), weighing the overhead of parallelization against the parallel speedup gained. We also propose an experimental method to predict the speedup of MPI applications.
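The paper's own prediction procedure is experimental, so as a hedged illustration of the trade-off it studies, the sketch below uses an Amdahl-style estimate with an explicit communication-overhead term; all parameter names and values are invented.

```python
def predicted_speedup(serial_time, parallel_fraction, n_procs, comm_overhead):
    """Rough speedup estimate for an MPI run on n_procs processes.

    serial_time       : measured runtime of the serial application (seconds)
    parallel_fraction : fraction of serial_time that can be parallelized (0..1)
    comm_overhead     : estimated MPI communication/synchronization cost (seconds)
    """
    parallel_time = (serial_time * (1 - parallel_fraction)      # serial part
                     + serial_time * parallel_fraction / n_procs  # parallelized part
                     + comm_overhead)                              # parallelization cost
    return serial_time / parallel_time

# Example: 100 s serial run, 90% parallelizable, 8 processes, 3 s overhead
print(predicted_speedup(100.0, 0.9, 8, 3.0))  # ~4.1x, well below the ideal 8x
```

Estimates like this make explicit when the communication overhead outweighs the gain, which is exactly the compromise the paper examines.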
This paper explores the efficacy of Mel-frequency cepstral coefficients (MFCCs) in detecting abnormal heart sounds using two classification strategies: a single-classifier and an ensemble-classifier approach. Heart sounds were first pre-processed to remove noise and then segmented into S1, systole, S2, and diastole intervals, with thirteen MFCCs estimated from each segment, yielding 52 MFCCs per beat. Finally, the MFCCs were used for heart sound classification. In the single-classifier strategy, the MFCCs from nine consecutive beats were averaged and classified by a single classifier (either a support vector machine (SVM), k-nearest neighbors (kNN), or a decision tree (DT)). Conversely, the ensemble-classifier strategy employed nine classifiers (either nine SVMs, nine kNN classifiers, or nine DTs) to individually assess beats as normal or abnormal, with the overall classification based on a majority vote. Both methods were tested on a publicly available phonocardiogram database. The heart sound classification accuracy was 91.95% for the SVM, 91.9% for the kNN, and 87.33% for the DT in the single-classifier strategy, and 93.59% for the SVM, 91.84% for the kNN, and 92.22% for the DT in the ensemble-classifier strategy. Overall, the results demonstrate that the ensemble-classifier strategy improved the accuracies of the DT and the SVM by 4.89% and 1.64%, respectively, establishing MFCCs as more effective than other features, including time, time-frequency, and statistical features, evaluated in similar studies.
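A condensed sketch of the two strategies described above (not the authors' code; the sampling rate, the per-segment averaging of MFCC frames, and the helper names are assumptions), using librosa for MFCC extraction and any fitted scikit-learn classifier in place of the SVM/kNN/DT.

```python
import numpy as np
import librosa

def beat_features(segments, sr=2000, n_mfcc=13):
    """Concatenate one 13-dim MFCC vector per segment (S1, systole, S2, diastole)
    into a 52-dim beat descriptor; each segment is a 1-D array of audio samples."""
    feats = []
    for seg in segments:
        mfcc = librosa.feature.mfcc(y=seg, sr=sr, n_mfcc=n_mfcc)  # (13, frames)
        feats.append(mfcc.mean(axis=1))                           # assumed frame averaging
    return np.concatenate(feats)                                  # shape (52,)

# Single-classifier strategy: average the 52-dim vectors of nine beats, classify once.
def classify_single(clf, nine_beats):
    x = np.mean([beat_features(b) for b in nine_beats], axis=0)
    return clf.predict(x.reshape(1, -1))[0]

# Ensemble strategy: one classifier per beat, final label by majority vote (0/1 labels).
def classify_ensemble(clfs, nine_beats):
    votes = [clf.predict(beat_features(b).reshape(1, -1))[0]
             for clf, b in zip(clfs, nine_beats)]
    return int(np.round(np.mean(votes)))
```

Here `clf` and `clfs` stand for classifiers already trained on labeled beats (e.g., sklearn's SVC, KNeighborsClassifier, or DecisionTreeClassifier).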
PCANet and its variants provided good accuracy results for classification tasks. However, despite the importance of network depth in achieving good classification accuracy, these networks were trained with a maximum of nine layers. In this paper, we introduce a residual compensation convolutional network, which is the first PCANet-like network trained with hundreds of layers while improving classification accuracy. The design of the proposed network consists of several convolutional layers, each followed by post-processing steps and a classifier. To correct the classification errors and significantly increase the network's depth, we train each layer with new labels derived from the residual information of all its preceding layers. This learning mechanism is accomplished by traversing the network's layers in a single forward pass without backpropagation or gradient computations. Our experiments on four distinct classification benchmarks (MNIST, CIFAR-10, CIFAR-100, and TinyImageNet) show that our deep network outperforms all existing PCANet-like networks and is competitive with several traditional gradient-based models.
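The stage-wise residual idea can be sketched as follows (a heavily simplified analogy, not the paper's PCANet-style architecture: features are taken as given and each "layer" is a closed-form ridge regressor): every stage is fit to the residual between the one-hot labels and the accumulated outputs of all preceding stages, so depth grows without backpropagation or gradient computations.

```python
import numpy as np
from sklearn.linear_model import Ridge

def train_residual_stack(X, y, n_classes, n_stages=5, alpha=1.0):
    """X: (n_samples, n_features); y: integer labels in [0, n_classes)."""
    Y = np.eye(n_classes)[y]                 # one-hot targets
    stages, accumulated = [], np.zeros_like(Y)
    for _ in range(n_stages):
        residual = Y - accumulated           # what earlier stages failed to explain
        stage = Ridge(alpha=alpha).fit(X, residual)
        accumulated += stage.predict(X)      # single forward pass, no gradients
        stages.append(stage)
    return stages

def predict(stages, X):
    scores = sum(stage.predict(X) for stage in stages)
    return scores.argmax(axis=1)
```

In the paper, each stage is a full convolutional layer with post-processing and a classifier; this sketch only conveys how residual-derived labels let stages be stacked one after another.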
Selecting check-worthy claims for fact-checking is a crucial part of expediting the fact-checking process: it filters out and ranks the check-worthy claims to be validated among the vast number of claims found online. The check-worthy claim detection task, however, becomes more challenging when the model needs to deal with new topics that differ from those seen earlier. In this study, we propose a domain-adaptation framework for check-worthy claim detection across topics for the Arabic language, adapting to a new topic and thereby mimicking the real-life scenario of events emerging daily around the world. We propose the Gradual Topic Learning (GTL) model, which learns gradually and emphasizes the check-worthy claims of the target topic during several stages of the learning process. In addition, we introduce the Similarity-driven Gradual Topic Learning (SGTL) model, which combines gradual learning with a similarity-based strategy for the target topic. Our experiments demonstrate the effectiveness of our proposed model, showing an overall tendency to improve performance over the state-of-the-art baseline across 11 of the 14 topics under study.
Researchers from Queen Mary University of London and collaborators developed a predictive system to estimate the volume of abusive replies a social media post will receive *before* it is published. Their ExtraTreesRegressor model achieved an MSE of 0.62 and R² of 0.89, demonstrating that the content and characteristics of a message are the primary factors in predicting abuse, rather than the identity of the user posting it.
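A minimal sketch of the regression setup summarized above; the feature names and numbers are invented stand-ins for the real message-level features used in the study.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Hypothetical per-post features: [length, hashtag_count, toxicity_score, follower_count]
X = np.array([[120, 2, 0.8, 1500],
              [40,  0, 0.1, 300],
              [300, 5, 0.6, 90000]])
y = np.array([35, 1, 12])  # number of abusive replies each post later received

model = ExtraTreesRegressor(n_estimators=200, random_state=0).fit(X, y)
print(model.predict([[150, 1, 0.7, 2000]]))  # predicted abusive-reply volume for a new post
```

The finding that content features dominate means that, in such a model, columns describing the message itself carry most of the predictive weight, rather than columns describing the author.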
An important component of an automated fact-checking system is the claim check-worthiness detection system, which ranks sentences by prioritising them based on their need to be checked. Despite a body of research tackling the task, previous research has overlooked the challenging nature of identifying check-worthy claims across different topics. In this paper, we assess and quantify the challenge of detecting check-worthy claims for new, unseen topics. After highlighting the problem, we propose the AraCWA model to mitigate the performance deterioration when detecting check-worthy claims across topics. The AraCWA model enables boosting the performance for new topics by incorporating two components for few-shot learning and data augmentation. Using a publicly available dataset of Arabic tweets consisting of 14 different topics, we demonstrate that our proposed data augmentation strategy achieves substantial improvements across topics overall, where the extent of the improvement varies across topics. Further, we analyse the semantic similarities between topics, suggesting that the similarity metric could be used as a proxy to determine the difficulty level of an unseen topic prior to undertaking the task of labelling the underlying sentences.