University of Kurdistan Hewler
Offline Handwritten Text Recognition (HTR) systems play a crucial role in applications such as historical document digitization, automatic form processing, and biometric authentication. However, their performance is often hindered by the limited availability of annotated training data, particularly for low-resource languages and complex scripts. This paper presents a comprehensive survey of offline handwritten data augmentation and generation techniques designed to improve the accuracy and robustness of HTR systems. We systematically examine traditional augmentation methods alongside recent advances in deep learning, including Generative Adversarial Networks (GANs), diffusion models, and transformer-based approaches. Furthermore, we explore the challenges associated with generating diverse and realistic handwriting samples, particularly in preserving script authenticity and addressing data scarcity. This survey follows the PRISMA methodology, ensuring a structured and rigorous selection process. Our analysis began with 1,302 primary studies, which were filtered down to 848 after removing duplicates, drawing from key academic sources such as IEEE Digital Library, Springer Link, Science Direct, and ACM Digital Library. By evaluating existing datasets, assessment metrics, and state-of-the-art methodologies, this survey identifies key research gaps and proposes future directions to advance the field of handwritten text generation across diverse linguistic and stylistic landscapes.
The rapid proliferation of tourism data across sectors, including accommodations, cultural sites, and events, has made it increasingly challenging for travelers to identify relevant and personalized recommendations. While traditional recommender systems such as collaborative, content-based, and context-aware systems offer partial solutions, they often struggle with issues like data sparsity and overspecialization. This study proposes a novel hybrid recommender system that combines evolutionary Apriori and K-means clustering algorithms to improve recommendation accuracy and efficiency in the tourism domain. Designed specifically to address the diverse and dynamic tourism landscape in Iraq, the system provides personalized recommendations and clusters of tourist destinations tailored to user preferences and contextual information. To evaluate the system's performance, experiments were conducted on an augmented dataset representative of Iraq's tourism activity, comparing the proposed system with existing methods. Results indicate that the proposed hybrid system reduces execution time by 27-56% and space consumption by 24-31%, while achieving consistently lower Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, thereby enhancing prediction accuracy. This approach offers a scalable, context-aware framework that is well-suited for application in regions where tourism data is limited, such as Iraq, ultimately advancing tourism recommender systems by addressing their limitations in complex and data-scarce environments.
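For readers unfamiliar with the two error metrics named in the abstract above, the following is a minimal sketch of how RMSE and MAE are computed for predicted ratings. The function names and the sample ratings are hypothetical and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical user ratings versus a recommender's predictions.
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 2.5]
print(rmse(actual_ratings, predicted_ratings), mae(actual_ratings, predicted_ratings))
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE does, which is why the two are usually reported together.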
Natural Language Processing (NLP) has become part of many people's daily routines, with applications ranging from speech understanding, translation, named entity recognition (NER), and text categorization to generative text models such as ChatGPT. Thanks to big data and the resulting large corpora for widely used languages such as English, Spanish, Turkish, and Persian, these applications have reached high accuracy for those languages. However, the Kurdish language still requires more corpora and larger datasets before it can be included in NLP applications. Kurdish has a rich linguistic structure, varied dialects, and limited datasets, which together pose unique challenges for Kurdish NLP (KNLP) application development. While several studies have been conducted in KNLP for various applications, Kurdish NER (KNER) remains a challenge for many KNLP tasks, including text analysis and classification. In this work, we address this limitation by proposing a methodology for fine-tuning the pre-trained RoBERTa model for KNER. To this end, we first create a Kurdish corpus, then design a modified model architecture and implement the training procedures. To evaluate the trained model, a set of experiments is conducted to measure the performance of the KNER model using different tokenization methods and trained models. The experimental results show that fine-tuned RoBERTa with the SentencePiece tokenization method substantially improves KNER performance, achieving a 12.8% improvement in F1-score compared to traditional models, and consequently establishes a new benchmark for KNLP.
Speaker diarization is a fundamental task in speech processing that involves dividing an audio stream by speaker. Although state-of-the-art models have advanced performance in high-resource languages, low-resource languages such as Kurdish pose unique challenges due to limited annotated data, multiple dialects, and frequent code-switching. In this study, we address these issues by training the Wav2Vec 2.0 self-supervised learning model on a dedicated Kurdish corpus. By leveraging transfer learning, we adapted multilingual representations learned from other languages to capture the phonetic and acoustic characteristics of Kurdish speech. Relative to a baseline method, our approach reduced the diarization error rate by 7.2% and improved cluster purity by 13%. These findings demonstrate that enhancements to existing models can significantly improve diarization performance for under-resourced languages. Our work has practical implications for developing transcription services for Kurdish-language media and for speaker segmentation in multilingual call centers, teleconferencing, and video-conferencing systems. The results establish a foundation for building effective diarization systems in other understudied languages, contributing to greater equity in speech technology.
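The two diarization metrics reported above can be illustrated with a short sketch. It assumes the common definitions: diarization error rate as the sum of false-alarm, missed-speech, and speaker-confusion time divided by total reference speech time, and cluster purity as the fraction of frames whose true speaker is the majority speaker of their cluster. The function names and sample values are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from collections import Counter, defaultdict


def diarization_error_rate(false_alarm, missed, confusion, total_speech):
    """DER under the common definition: erroneous time over reference speech time."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (false_alarm + missed + confusion) / total_speech


def cluster_purity(cluster_ids, speaker_ids):
    """Fraction of frames that belong to the majority speaker of their cluster."""
    if len(cluster_ids) != len(speaker_ids) or not cluster_ids:
        raise ValueError("inputs must be non-empty and of equal length")
    speakers_per_cluster = defaultdict(Counter)
    for cluster, speaker in zip(cluster_ids, speaker_ids):
        speakers_per_cluster[cluster][speaker] += 1
    majority_frames = sum(max(c.values()) for c in speakers_per_cluster.values())
    return majority_frames / len(cluster_ids)


# Hypothetical frame-level labels: predicted cluster and true speaker per frame.
clusters = [0, 0, 0, 1, 1, 1]
speakers = ["A", "A", "B", "B", "B", "B"]
print(cluster_purity(clusters, speakers))
print(diarization_error_rate(1.0, 2.0, 3.0, 60.0))
```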
Analyzing large datasets to select optimal features is one of the most important research areas in machine learning and data mining. Feature selection involves dimensionality reduction, which is crucial for enhancing model performance and reducing model complexity. Recently, several types of attribute selection methods have been proposed that use different approaches to obtain representative subsets of the attributes. Population-based evolutionary algorithms such as Genetic Algorithms (GAs) have been proposed to remedy the drawbacks of these methods by avoiding local optima and improving the selection process itself. This manuscript presents a broad review of GA-based feature selection techniques, their applications, and their effectiveness across different domains. The review was conducted using the PRISMA methodology, so relevant literature was identified, screened, and analyzed systematically. Our results suggest that hybrid GA methodologies, including but not limited to the GA-Wrapper feature selector and HGA-neural networks, have substantially improved the potential of the field by resolving problems such as exploration of unnecessary search space, limited accuracy, and complexity. The paper concludes by discussing the potential of GAs in feature selection and future research directions for enhancing their applicability and performance.
Data clustering involves identifying latent similarities within a dataset and organizing them into clusters or groups. The outcomes of various clustering algorithms differ as they are susceptible to the intrinsic characteristics of the original dataset, including noise and dimensionality. The effectiveness of such clustering procedures directly impacts the homogeneity of clusters, underscoring the significance of evaluating algorithmic outcomes. Consequently, the assessment of clustering quality is a significant and complex endeavor. A pivotal aspect affecting clustering validation is the cluster validity metric, which aids in determining the optimal number of clusters. The main goal of this study is to review and explain the mathematical operation of a selection of internal and external cluster validity indices, to categorize these indices, and to suggest directions for future advancement of clustering validation research. In addition, we review and evaluate the performance of internal and external clustering validation indices on the most common clustering algorithms, such as the evolutionary clustering algorithm star (ECA*). Finally, we suggest a classification framework for examining the functionality of both internal and external clustering validation measures regarding their ideal values, user-friendliness, responsiveness to input data, and appropriateness across various fields. This classification aids researchers in selecting the appropriate clustering validation measure to suit their specific requirements.
The paper by Rasul et al. offers a comprehensive review of in silico methods used throughout the drug discovery pipeline, spanning target identification, virtual screening, lead optimization, and ADMET prediction. It serves as an educational resource for researchers across various scientific disciplines by systematically exploring computational approaches from molecular docking to artificial intelligence applications.
In this paper, we propose and demonstrate an adaptive sliding mode control for trajectory tracking of robot manipulators subject to uncertain dynamics, vibration disturbance, and payload variation disturbance. Throughout this work we seek a controller that is robust to uncertainty and disturbance, accurate, and implementable. To meet these requirements, we use a nonlinear Lyapunov-based approach for designing the controller and guaranteeing its stability. MATLAB-SIMULINK software is used to validate the approach and demonstrate the performance of the controller. Simulation results show that the derived controller is stable, robust to the disturbance and uncertainties, accurate, and implementable.
This paper presents the Multi-Objective Ant Nesting Algorithm (MOANA), a novel extension of the Ant Nesting Algorithm (ANA), specifically designed to address multi-objective optimization problems (MOPs). MOANA incorporates adaptive mechanisms, such as deposition weight parameters, to balance exploration and exploitation, while a polynomial mutation strategy ensures diverse and high-quality solutions. The algorithm is evaluated on standard benchmark datasets, including ZDT functions and the IEEE Congress on Evolutionary Computation (CEC) 2019 multi-modal benchmarks. Comparative analysis against state-of-the-art algorithms like MOPSO, MOFDO, MODA, and NSGA-III demonstrates MOANA's superior performance in terms of convergence speed and Pareto front coverage. Furthermore, MOANA's applicability to real-world engineering optimization, such as welded beam design, showcases its ability to generate a broad range of optimal solutions, making it a practical tool for decision-makers. MOANA addresses key limitations of traditional evolutionary algorithms by improving scalability and diversity in multi-objective scenarios, positioning it as a robust solution for complex optimization tasks.
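The notion of Pareto front coverage used in the abstract above rests on Pareto dominance. The sketch below shows only that general concept for a minimization problem; it is not MOANA itself, and the function names and sample objective vectors are hypothetical.

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (f1, f2), both to be minimized.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
print(pareto_front(candidates))
```

Here `(3, 4)` is dominated by `(2, 3)` and `(5, 5)` is dominated by `(1, 5)`, so three trade-off solutions remain. A multi-objective optimizer aims to return a set like this that is both close to the true front and well spread along it.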
Idiom detection using Natural Language Processing (NLP) is the computerized process of recognizing figurative expressions within a text that convey meanings beyond the literal interpretation of the words. While idiom detection has seen significant progress across various languages, the Kurdish language faces a considerable research gap in this area despite the importance of idioms in tasks like machine translation and sentiment analysis. This study addresses idiom detection in Sorani Kurdish by approaching it as a text classification task using deep learning techniques. To tackle this, we developed a dataset containing 10,580 sentences embedding 101 Sorani Kurdish idioms across diverse contexts. Using this dataset, we developed and evaluated three deep learning models: KuBERT-based transformer sequence classification, a Recurrent Convolutional Neural Network (RCNN), and a BiLSTM model with an attention mechanism. The evaluations revealed that the transformer model, the fine-tuned BERT, consistently outperformed the others, achieving nearly 99% accuracy while the RCNN achieved 96.5% and the BiLSTM 80%. These results highlight the effectiveness of Transformer-based architectures in low-resource languages like Kurdish. This research provides a dataset, three optimized models, and insights into idiom detection, laying a foundation for advancing Kurdish NLP.
Swarm Intelligence is a metaheuristic optimization approach that has become prominent over the last few decades. These algorithms are inspired by animals' physical behaviors and their evolutionary perceptions. The simplicity of these algorithms allows researchers to simulate different natural phenomena to solve various real-world problems. This paper proposes a novel algorithm called the Donkey and Smuggler Optimization Algorithm (DSO). The DSO is inspired by the searching behavior of donkeys. The algorithm imitates transportation behavior, such as how donkeys search for and select routes in the real world. Two modes are established for implementing the search behavior and route selection in this algorithm: the Smuggler and the Donkeys. In the Smuggler mode, all the possible paths are discovered and the shortest path is then found. In the Donkeys mode, several donkey behaviors are utilized, such as Run, Face & Suicide, and Face & Support. Real-world data and applications are used to test the algorithm. The experimental results consist of two parts. First, we used standard benchmark test functions to evaluate the performance of the algorithm against the most popular and state-of-the-art algorithms. Second, the DSO is adapted and implemented on three real-world applications, namely the traveling salesman problem, packet routing, and ambulance routing. The experimental results of DSO on these real-world problems are very promising. The results indicate that the proposed DSO is suitable for tackling other unfamiliar search spaces and complex problems.
The dragonfly algorithm was developed in 2016. It is one of the algorithms researchers use to optimize a wide range of applications in various areas. At times, it offers superior performance compared to the most well-known optimization techniques. However, this algorithm faces several difficulties when it is applied to complex optimization problems. This work addresses both the robustness of the method in solving real-world optimization problems and its deficiencies on complex optimization problems. This review paper presents a comprehensive investigation of the dragonfly algorithm in the engineering area. First, an overview of the algorithm is given. Next, we examine the modifications of the algorithm, covering both its hybridized forms with other techniques and the changes made to improve its performance. Additionally, a survey of engineering applications that used the dragonfly algorithm is offered. The applications covered are in mechanical engineering problems, electrical engineering problems, optimal parameters, economic load dispatch, and loss reduction. The algorithm is tested and evaluated against the particle swarm optimization algorithm and the firefly algorithm. To evaluate the ability of the dragonfly algorithm and the other participating algorithms, a set of traditional benchmarks (TF1-TF23) was utilized. Moreover, to examine the ability of the algorithm to optimize large-scale optimization problems, the CEC-C2019 benchmarks were utilized. A comparison is made between the algorithm and other metaheuristic techniques to show its ability to handle various problems.
Extracting concise information from scientific documents aids learners, researchers, and practitioners. Automatic Text Summarization (ATS), a key Natural Language Processing (NLP) application, automates this process. While ATS methods exist for many languages, Kurdish remains underdeveloped due to limited resources. This study develops a dataset and language model based on 231 scientific papers in Sorani Kurdish, collected from four academic departments in two universities in the Kurdistan Region of Iraq (KRI), averaging 26 pages per document. Using Sentence Weighting and Term Frequency-Inverse Document Frequency (TF-IDF) algorithms, two experiments were conducted, differing in whether the conclusions were included. The average word count was 5,492.3 in the first experiment and 5,266.96 in the second. Results were evaluated manually and automatically using ROUGE-1, ROUGE-2, and ROUGE-L metrics, with the best accuracy reaching 19.58%. Six experts conducted manual evaluations using three criteria, with results varying by document. This research provides valuable resources for Kurdish NLP researchers to advance ATS and related fields.
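To make the TF-IDF approach mentioned above concrete, here is a minimal sketch of extractive summarization by TF-IDF sentence scoring. It is an illustration under stated assumptions, not the authors' pipeline: each sentence is treated as its own document for the inverse document frequency, tokenization is a simple regular expression, and the names `tokenize` and `summarize` and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens; \\w matches Unicode letters, including Arabic script."""
    return re.findall(r"\w+", sentence.lower())


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences, kept in their original order."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: in how many sentences each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = sum(
            (count / len(tokens)) * math.log(n / doc_freq[term])
            for term, count in term_freq.items()
        )
        scores.append(score)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Hypothetical input: two near-duplicate sentences and one distinctive sentence.
sample = [
    "the cat sat",
    "the cat sat",
    "quantum optimizers converge quickly",
]
print(summarize(sample, k=1))
```

Terms that occur in every sentence receive an IDF of zero, so sentences made up only of common words score low and distinctive sentences rise to the top.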
This study proposes the GOOSE algorithm as a novel metaheuristic algorithm based on the goose's behavior during rest and foraging. The goose stands on one leg and keeps its balance to guard and protect other individuals in the flock. The GOOSE algorithm is benchmarked on 19 well-known benchmark test functions, and the results are verified by a comparative study with genetic algorithm (GA), particle swarm optimization (PSO), dragonfly algorithm (DA), and fitness dependent optimizer (FDO). In addition, the proposed algorithm is tested on 10 modern benchmark functions, and the results are compared with three recent algorithms: the dragonfly algorithm, whale optimization algorithm (WOA), and salp swarm algorithm (SSA). Moreover, the GOOSE algorithm is tested on 5 classical benchmark functions, and the results are compared with six algorithms: fitness dependent optimizer (FDO), FOX optimizer, butterfly optimization algorithm (BOA), whale optimization algorithm, dragonfly algorithm, and chimp optimization algorithm (ChOA). The findings attest to the proposed algorithm's superior performance compared to the other algorithms used in the current study. The technique is then used to optimize three renowned real-world engineering challenges: welded beam design, the economic load dispatch problem, and the pathological IgG fraction in the nervous system. The outcomes of the engineering case studies illustrate how well the proposed approach can optimize problems that arise in the real world.
The rapid advancement of intelligent technology has led to the development of optimization algorithms that leverage natural behaviors to address complex issues. Among these, the Rat Swarm Optimizer (RSO), inspired by rats' social and behavioral characteristics, has demonstrated potential in various domains, although its convergence precision and exploration capabilities are limited. To address these shortcomings, this study introduces the Modified Rat Swarm Optimizer (MRSO), designed to enhance the balance between exploration and exploitation. MRSO incorporates unique modifications to improve search efficiency and durability, making it suitable for challenging engineering problems such as welded beam, pressure vessel, and gear train design. Extensive testing with classical benchmark functions shows that MRSO significantly improves performance, avoiding local optima and achieving higher accuracy in six out of nine multimodal functions and in all seven fixed-dimension multimodal functions. In the CEC 2019 benchmarks, MRSO outperforms the standard RSO in six out of ten functions, demonstrating superior global search capabilities. When applied to engineering design problems, MRSO consistently delivers better average results than RSO, proving its effectiveness. Additionally, we compared our approach with eight recent and well-known algorithms using both classical and CEC-2019 benchmarks. MRSO outperforms each of these algorithms, achieving superior results in six out of 23 classical benchmark functions and in four out of ten CEC-2019 benchmark functions. These results further demonstrate MRSO's significant contributions as a reliable and efficient tool for optimization tasks in engineering applications.
Automatic Speech Recognition (ASR) for low-resource languages remains a challenging task due to limited training data. This paper introduces a comprehensive study exploring the effectiveness of Whisper, a pre-trained ASR model, for Northern Kurdish (Kurmanji), an under-resourced language spoken in the Middle East. We investigate three fine-tuning strategies: vanilla, specific parameters, and additional modules. Using a Northern Kurdish fine-tuning speech corpus containing approximately 68 hours of validated transcribed data, our experiments demonstrate that the additional module fine-tuning strategy significantly improves ASR accuracy on a specialized test set, achieving a Word Error Rate (WER) of 10.5% and Character Error Rate (CER) of 5.7% with Whisper version 3. These results underscore the potential of sophisticated transformer models for low-resource ASR and emphasize the importance of tailored fine-tuning techniques for optimal performance.
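The WER and CER figures above are both edit-distance ratios. The following minimal sketch shows the standard computation: the Levenshtein distance between reference and hypothesis, divided by the reference length in words or characters. The function names and sample strings are hypothetical, and the paper's text normalization steps are not reproduced here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical transcripts: one substituted word and one deleted word.
print(wer("the cat sat on the mat", "the cat sit on mat"))
print(cer("kitten", "sitting"))
```

Because insertions count as errors, both rates can exceed 100% when the hypothesis is much longer than the reference.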
Purpose: The development of metaheuristic algorithms has accelerated as researchers use them extensively in business, science, and engineering. One of the common metaheuristic optimization algorithms is Grey Wolf Optimization (GWO). The algorithm imitates the searching and attacking behavior of grey wolves. The main purpose of this paper is to overcome a known GWO problem, namely becoming trapped in local optima. Design/methodology/approach: In this paper, the K-means clustering algorithm is used to enhance the performance of the original Grey Wolf Optimization by dividing the population into different parts. The proposed algorithm is called K-means clustering Grey Wolf Optimization (KMGWO). Findings: Results illustrate that the efficiency of KMGWO is superior to that of GWO. To evaluate its performance, KMGWO was applied to 10 CEC2019 benchmark test functions. Results show that KMGWO performs better than GWO. KMGWO is also compared to Cat Swarm Optimization (CSO), Whale Optimization Algorithm-Bat Algorithm (WOA-BAT), and WOA, and KMGWO achieves the first rank in terms of performance. Statistical results showed that KMGWO achieved a higher significant value than the compared algorithms. KMGWO is also used to solve a pressure vessel design problem, where it produced better results. Originality/value: Results show that KMGWO is superior to GWO. KMGWO is also compared to cat swarm optimization (CSO), whale optimization algorithm-bat algorithm (WOA-BAT), WOA, and GWO, and KMGWO achieved the first rank in terms of performance. Also, the KMGWO is used to solve a classical engineering problem and it is superior
Identifying university students' weaknesses results in better learning and can function as an early warning system to enable students to improve. However, the satisfaction level of existing systems is not promising. New and dynamic hybrid systems are needed to imitate this mechanism. A hybrid system (a modified Recurrent Neural Network with an adapted Grey Wolf Optimizer) is used to forecast students' outcomes. This proposed system would improve instruction by the faculty and enhance the students' learning experiences. The results show that a modified recurrent neural network with an adapted Grey Wolf Optimizer has the best accuracy when compared with other models.
This paper presents an in-depth survey and performance evaluation of the Cat Swarm Optimization (CSO) Algorithm. CSO is a robust and powerful metaheuristic swarm-based optimization approach that has received very positive feedback since its emergence. It has been tackling many optimization problems and many variants of it have been introduced. However, the literature lacks a detailed survey or a performance evaluation in this regard. Therefore, this paper is an attempt to review all these works, including its developments and applications, and group them accordingly. In addition, CSO is tested on 23 classical benchmark functions and 10 modern benchmark functions (CEC 2019). The results are then compared against three novel and powerful optimization algorithms, namely Dragonfly algorithm (DA), Butterfly optimization algorithm (BOA) and Fitness Dependent Optimizer (FDO). These algorithms are then ranked according to Friedman test and the results show that CSO ranks first on the whole. Finally, statistical approaches are employed to further confirm the outperformance of CSO algorithm.
Automated brain tumor detection is becoming an important area of medical diagnosis research. In recent medical diagnosis work, detection and classification increasingly employ machine learning and deep learning techniques. Nevertheless, the accuracy and performance of current models need to be improved to support suitable treatments. In this paper, deep convolutional learning is improved by adopting enhanced optimization algorithms. Specifically, a Deep Convolutional Neural Network (DCNN) based on an improved Harris Hawks Optimization (HHO), called G-HHO, is considered. This hybridization combines Grey Wolf Optimization (GWO) and HHO to give better results, limiting the convergence rate and enhancing performance. Moreover, Otsu thresholding is adopted to segment the tumor portion, which supports brain tumor detection. Experimental studies are conducted to validate the performance of the suggested method on a total of 2073 augmented MRI images. The technique's performance was assessed by comparing it with nine existing algorithms on a large set of augmented MRI images in terms of accuracy, precision, recall, f-measure, execution time, and memory usage. The performance comparison shows that the DCNN-G-HHO is much more successful than existing methods, reaching an accuracy of 97%. Additionally, the statistical performance analysis indicates that the suggested approach is faster and uses less memory when identifying and categorizing brain tumors on the MR images. The validation is implemented in Python. The relevant codes for the proposed approach are available at: this https URL.
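The Otsu thresholding step mentioned above can be sketched in a few lines. This is the textbook method, which picks the gray level that maximizes the between-class variance of background and foreground; it is not the authors' code. The function name, the flat pixel list, and the sample intensities are hypothetical, and a real pipeline would typically call an image library instead.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.
    Pixels with intensity <= t are treated as background."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already background
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


# Hypothetical image: dark tissue around intensity 10, a bright region around 200.
image_pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(image_pixels)
mask = [p > threshold for p in image_pixels]  # True marks the bright region
print(threshold, sum(mask))
```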