Korea Aerospace University
The threats posed by AI-generated media, particularly deepfakes, now raise significant challenges for multimedia forensics, misinformation detection, and biometric systems, resulting in the erosion of public trust in the legal system and a significant increase in fraud and social engineering attacks. Although several forensic methods have been proposed, they suffer from three critical gaps: (i) use of non-standardized benchmarks with GAN- or diffusion-generated images, (ii) inconsistent training protocols (e.g., scratch, frozen, fine-tuning), and (iii) limited evaluation metrics that fail to capture generalization and explainability. These limitations hinder fair comparison, obscure true robustness, and restrict deployment in security-critical applications. This paper introduces a unified benchmarking framework for systematic evaluation of forensic methods under controlled and reproducible conditions. We benchmark ten SoTA forensic methods (scratch, frozen, and fine-tuned) on seven publicly available datasets (GAN and diffusion). We evaluate performance using multiple metrics, including accuracy, average precision, ROC-AUC, error rate, and class-wise sensitivity. We further analyze model interpretability using confidence curves and Grad-CAM heatmaps. Our evaluations demonstrate substantial variability in generalization, with certain methods exhibiting strong in-distribution performance but degraded cross-model transferability. This study aims to guide the research community toward a deeper understanding of the strengths and limitations of current forensic approaches, and to inspire the development of more robust, generalizable, and explainable solutions.
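As a rough illustration of three of the metrics named above (accuracy, class-wise sensitivity, and ROC-AUC), the following minimal sketch computes them for a binary real/fake detector. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for this example and are not taken from the paper.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label (1 = fake)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def sensitivity(labels, scores, positive=1, threshold=0.5):
    """Class-wise sensitivity: share of one class's samples predicted as that class."""
    preds = [1 if s >= threshold else 0 for s in scores]
    hits = sum(1 for p, y in zip(preds, labels) if y == positive and p == positive)
    total = sum(1 for y in labels if y == positive)
    return hits / total


def roc_auc(labels, scores):
    """Probability that a random fake scores above a random real (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three fakes (label 1) and three reals (label 0).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

print(accuracy(labels, scores))
print(sensitivity(labels, scores, positive=1))
print(sensitivity(labels, scores, positive=0))
print(roc_auc(labels, scores))
```

Reporting sensitivity per class alongside a threshold-free score such as ROC-AUC helps separate a poorly chosen threshold from a genuinely weak ranking of fakes over reals.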
Using a relativistic mean-field model calibrated to finite-nucleus observables and bulk properties of dense nuclear matter, we investigate hyperonic neutron-star matter within an SU(3) flavor-symmetry scheme. To retain SU(6)-based couplings within SU(3) flavor symmetry, we add a quartic $\phi$ self-interaction and $\phi$-$\rho$ mixing. We demonstrate the roles of $\alpha_{v}$ ($F/(F+D)$ ratio), $\theta_{v}$ (mixing angle), and $z_{v}$ (singlet-to-octet coupling ratio) in SU(3)-invariant vector-meson couplings. It is found that $z_{v}$ predominantly controls the maximum mass of a neutron star, and $2M_{\odot}$ neutron stars can be supported for $z_{v}\le0.15$. The $\alpha_{v}$ also helps sustain large masses, whereas $\theta_{v}$ has a smaller effect on neutron-star properties. This SU(3) framework reconciles nuclear and astrophysical constraints, and offers a plausible resolution to the hyperon puzzle.
Flight trajectory prediction for multiple aircraft is essential and provides critical insights into how aircraft navigate within current air traffic flows. However, predicting multi-agent flight trajectories is inherently challenging. One of the major difficulties is modeling both the individual aircraft behaviors over time and the complex interactions between flights. Generating explainable prediction outcomes is also a challenge. Therefore, we propose a Multi-Agent Inverted Transformer, MAIFormer, as a novel neural architecture that predicts multi-agent flight trajectories. The proposed framework features two key attention modules: (i) masked multivariate attention, which captures spatio-temporal patterns of individual aircraft, and (ii) agent attention, which models the social patterns among multiple agents in complex air traffic scenes. We evaluated MAIFormer using a real-world automatic dependent surveillance-broadcast flight trajectory dataset from the terminal airspace of Incheon International Airport in South Korea. The experimental results show that MAIFormer achieves the best performance across multiple metrics and outperforms other methods. In addition, MAIFormer produces prediction outcomes that are interpretable from a human perspective, which improves both the transparency of the model and its practical utility in air traffic control.
Researchers from Sungkyunkwan University and Korea Aerospace University introduce VIEW-QA, the first egocentric 360-degree video question answering dataset designed for visually impaired persons. The dataset includes 1,030 videos and 4,120 QA pairs spanning five real-life categories, and while state-of-the-art models like ViTis and EgoVLPv2 outperform baselines, their performance indicates a current gap for practical real-world assistive applications.
Spiking neural networks (SNNs) have recently been attracting significant attention for their biological plausibility and energy efficiency, but semi-supervised learning (SSL) methods for SNN-based models remain underexplored compared to those for artificial neural networks (ANNs). In this paper, we introduce SpikeMatch, the first SSL framework for SNNs that leverages the temporal dynamics through the leakage factor of SNNs for diverse pseudo-labeling within a co-training framework. By utilizing agreement among multiple predictions from a single SNN, SpikeMatch generates reliable pseudo-labels from weakly-augmented unlabeled samples to train on strongly-augmented ones, effectively mitigating confirmation bias by capturing discriminative features with limited labels. Experiments show that SpikeMatch outperforms existing SSL methods adapted to SNN backbones across various standard benchmarks.
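The agreement idea described above can be sketched generically as follows. This is not SpikeMatch's actual procedure: the function names, the 0.8 confidence threshold, and the rule of requiring unanimous agreement plus a mean-confidence check are assumptions made for this example, and the probability vectors stand in for multiple predictions obtained from a single SNN.

```python
def argmax(probs):
    """Index of the largest probability in a list."""
    return max(range(len(probs)), key=lambda i: probs[i])


def agreed_pseudo_label(predictions, threshold=0.8):
    """Return a class index if all predictions agree and are confident, else None.

    predictions: list of probability vectors for one weakly augmented sample,
    e.g. one vector per prediction view (hypothetical setup).
    """
    votes = {argmax(p) for p in predictions}
    if len(votes) != 1:
        return None  # views disagree: skip this sample
    label = votes.pop()
    mean_conf = sum(p[label] for p in predictions) / len(predictions)
    return label if mean_conf >= threshold else None


# Hypothetical predictions over three classes.
confident = [[0.05, 0.90, 0.05], [0.10, 0.85, 0.05]]
disagree = [[0.70, 0.20, 0.10], [0.20, 0.70, 0.10]]
unsure = [[0.50, 0.30, 0.20], [0.45, 0.35, 0.20]]

print(agreed_pseudo_label(confident))
print(agreed_pseudo_label(disagree))
print(agreed_pseudo_label(unsure))
```

In a FixMatch-style loop, an accepted label would then supervise the prediction on the strongly augmented version of the same sample.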
Aircraft trajectory modeling plays a crucial role in Air Traffic Management (ATM) and is important for various downstream tasks, including conflict detection and landing time prediction. Dataset augmentation through the addition of synthetically generated trajectory data is necessary to develop a more robust aircraft trajectory model and ensure that the trajectory dataset is sufficient and balanced. In this work, we propose a novel framework called ATRADA for aircraft trajectory dataset augmentation. In the proposed framework, a Transformer encoder learns the underlying patterns in the original trajectory dataset and converts each data point into a context vector in the learned latent space. The converted dataset in the latent space is projected into reduced dimensions using principal component analysis (PCA), and a Gaussian mixture model (GMM) is applied to fit the probability distribution of the data points in the reduced-dimensional space. Finally, new samples are drawn from the fitted GMM, the dimension of the samples is reverted to the original dimension, and they are decoded with a Multi-Layer Perceptron (MLP). Experiments comparing the framework against several baselines demonstrate that it effectively generates new, high-quality synthetic aircraft trajectory data.
Swarm robotics explores the coordination of multiple robots to achieve collective goals, with collective decision-making being a central focus. This process involves decentralized robots autonomously making local decisions and communicating them, which influences the overall emergent behavior. Testing such decentralized algorithms in real-world scenarios with hundreds or more robots is often impractical, underscoring the need for effective simulation tools. We propose SPACE (Swarm Planning and Control Evaluation), a Python-based simulator designed to support the research, evaluation, and comparison of decentralized Multi-Robot Task Allocation (MRTA) algorithms. SPACE streamlines core algorithmic development by allowing users to implement decision-making algorithms as Python plug-ins, easily construct agent behavior trees via an intuitive GUI, and leverage built-in support for inter-agent communication and local task awareness. To demonstrate its practical utility, we implement and evaluate CBBA and GRAPE within the simulator, comparing their performance across different metrics, particularly in scenarios with dynamically introduced tasks. This evaluation shows the usefulness of SPACE in conducting rigorous and standardized comparisons of MRTA algorithms, helping to support future research in the field.
Parallelizing Gated Recurrent Unit (GRU) networks is a challenging task, as the training procedure of GRU is inherently sequential. Prior efforts to parallelize GRU have largely focused on conventional parallelization strategies such as data-parallel and model-parallel training algorithms. However, when the given sequences are very long, existing approaches remain limited in terms of training time. In this paper, we present a novel parallel training scheme (called parallel-in-time) for GRU based on a multigrid reduction in time (MGRIT) solver. MGRIT partitions a sequence into multiple shorter sub-sequences and trains the sub-sequences on different processors in parallel. The key to achieving speedup is a hierarchical correction of the hidden state to accelerate end-to-end communication in both the forward and backward propagation phases of gradient descent. Experimental results on the HMDB51 dataset, where each video is an image sequence, demonstrate that the new parallel training scheme achieves up to 6.5$\times$ speedup over a serial approach. As the efficiency of our new parallelization strategy is associated with the sequence length, our parallel GRU algorithm achieves significant performance improvement as the sequence length increases.
It is well known that query-based attacks tend to have relatively higher success rates in adversarial black-box attacks. While research on black-box attacks is actively being conducted, relatively few studies have focused on pixel attacks that target only a limited number of pixels. In image classification, query-based pixel attacks often rely on patches, which heavily depend on randomness and neglect the fact that scattered pixels are more suitable for adversarial attacks. Moreover, to the best of our knowledge, query-based pixel attacks have not been explored in the field of object detection. To address these issues, we propose a novel pixel-based black-box attack called Remember and Forget Pixel Attack using Reinforcement Learning (RFPAR), consisting of two main components: the Remember and Forget processes. RFPAR mitigates randomness and avoids patch dependency by leveraging rewards generated through a one-step RL algorithm to perturb pixels. RFPAR effectively creates perturbed images that minimize the confidence scores while adhering to limited pixel constraints. Furthermore, we advance our proposed attack beyond image classification to object detection, where RFPAR reduces the confidence scores of detected objects to avoid detection. Experiments on the ImageNet-1K dataset for classification show that RFPAR outperforms state-of-the-art query-based pixel attacks. For object detection, using the MSCOCO dataset with YOLOv8 and DDQ, RFPAR demonstrates mAP reduction comparable to state-of-the-art query-based attacks while requiring fewer queries. Further experiments on the Argoverse dataset using YOLOv8 confirm that RFPAR effectively removes objects on a larger-scale dataset. Our code is available at this https URL
A novel system using an egocentric 360-degree camera detects subtle, short-duration physical safety anomalies for people with visual impairments, achieving 86.00% AUC-ROC on a new VIEW360 dataset and accurately predicting the anomaly's direction with 75.04% accuracy. The Frame and Direction Prediction Network (FDPN) employs saliency-driven masking and coarse-to-fine learning to enable precise, frame-level anomaly detection and directional localization.
There is a growing interest in custom spatial accelerators for machine learning applications. These accelerators employ a spatial array of processing elements (PEs) interacting via custom buffer hierarchies and networks-on-chip. The efficiency of these accelerators comes from employing optimized dataflow (i.e., spatial/temporal partitioning of data across the PEs and fine-grained scheduling) strategies to optimize data reuse. The focus of this work is to evaluate these accelerator architectures using a tiled general matrix-matrix multiplication (GEMM) kernel. To do so, we develop a framework that finds optimized mappings (dataflow and tile sizes) for a tiled GEMM for a given spatial accelerator and workload combination, leveraging an analytical cost model for runtime and energy. Our evaluations over five spatial accelerators demonstrate that the tiled GEMM mappings systematically generated by our framework achieve high performance on various GEMM workloads and accelerators.
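For readers unfamiliar with the kernel, a tiled GEMM can be sketched as the loop nest below. This is a plain-Python illustration only: the function name and the default tile sizes are hypothetical, and it models neither the paper's mapping framework nor its analytical cost model for runtime and energy.

```python
def tiled_gemm(A, B, tile_m=2, tile_n=2, tile_k=2):
    """Compute C = A @ B by iterating over (tile_m x tile_n x tile_k) blocks.

    A is M x K and B is K x N, both given as lists of rows.
    """
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    # Outer loops walk over tiles; on an accelerator these would be
    # distributed across processing elements or scheduled over time.
    for i0 in range(0, M, tile_m):
        for j0 in range(0, N, tile_n):
            for k0 in range(0, K, tile_k):
                # Inner loops work on one tile, whose operands fit in a
                # small local buffer and are reused before moving on.
                for i in range(i0, min(i0 + tile_m, M)):
                    for j in range(j0, min(j0 + tile_n, N)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + tile_k, K)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C


A = [[1, 2, 3], [4, 5, 6]]          # 2 x 3
B = [[7, 8], [9, 10], [11, 12]]     # 3 x 2
print(tiled_gemm(A, B))
```

Changing the tile sizes or reordering the three outer loops leaves the result unchanged but alters which operand stays resident in the local buffer, which is the kind of choice a mapping search explores.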
The primordial lithium abundance inferred from observations of metal-poor stars is ~3 times smaller than the theoretical value in the standard big bang nucleosynthesis (BBN) model. We assume a simple model including a sterile neutrino nu_H with a mass of O(10) MeV which decays long after BBN. We then investigate cosmological effects of a sterile neutrino decay. We formulate the injection spectrum of nonthermal photons induced by electrons and positrons generated at the nu_H decay, as a function of the nu_H mass and the photon temperature. We then consistently solve (1) the cosmic thermal history, (2) nonthermal nucleosynthesis induced by the nonthermal photons, (3) the baryon-to-photon ratio eta, and (4) the effective neutrino number N_eff. Amounts of energy injection at the nu_H decay are constrained from limits on primordial D and 7Li abundances, the N_eff value, and the cosmic microwave background energy spectrum. We find that 7Be is photodisintegrated and the Li problem is partially solved for the lifetime 10^4-10^5 s and the mass >~ 14 MeV. 7Be destruction by more than a factor of 3 is not possible because of an associated D over-destruction. In the parameter region, the eta value is decreased slightly, while the N_eff value is increased by a factor of <~ 1. In this study, errors in photodisintegration cross sections of 7Be(g, a)3He and 7Li(g, a)3H that have propagated through the literature are corrected. It is then found that the new photodisintegration rates are 2.3 to 2.5 times smaller than the old rates, so that efficiencies of 7Be and 7Li photodisintegration are significantly smaller.
We address the problem of few-shot semantic segmentation (FSS), which aims to segment novel class objects in a target image with a few annotated samples. Though recent advances have been made by incorporating prototype-based metric learning, existing methods still show limited performance under extreme intra-class object variations and semantically similar inter-class objects due to their poor feature representation. To tackle this problem, we propose a dual prototypical contrastive learning approach tailored to the FSS task to capture the representative semantic features effectively. The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in prototype feature space. To this end, we first present a class-specific contrastive loss with a dynamic prototype dictionary that stores the class-aware prototypes during training, thus encouraging same-class prototypes to be similar and different-class prototypes to be dissimilar. Furthermore, we introduce a class-agnostic contrastive loss to enhance the generalization ability to unseen classes by compressing the feature distribution of semantic classes within each episode. We demonstrate that the proposed dual prototypical contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets. The code is available at: this https URL.
Accurate prediction of turbine blade fatigue life is essential for ensuring the safety and reliability of aircraft engines. A significant challenge in this domain is uncovering the intrinsic relationship between mechanical properties and fatigue life. This paper introduces Reinforced Symbolic Learning (RSL), a method that derives predictive formulas linking these properties to fatigue life. RSL incorporates logical constraints during symbolic optimization, ensuring that the generated formulas are both physically meaningful and interpretable. The optimization process is further enhanced using deep reinforcement learning, which efficiently guides the symbolic regression towards more accurate models. The proposed RSL method was evaluated on two turbine blade materials, GH4169 and TC4, to identify optimal fatigue life prediction models. When compared with six empirical formulas and five machine learning algorithms, RSL not only produces more interpretable formulas but also achieves superior or comparable predictive accuracy. Additionally, finite element simulations were conducted to assess mechanical properties at critical points on the blade, which were then used to predict fatigue life under various operating conditions.
We address the challenge of single-image de-raining, a task that involves recovering rain-free background information from a single rain image. While recent advancements have utilized real-world time-lapse data for training, enabling the estimation of consistent backgrounds and realistic rain streaks, these methods often suffer from high computational and memory consumption, limiting their applicability in real-world scenarios. In this paper, we introduce a novel solution: the Rain Streak Prototype Unit (RsPU). The RsPU efficiently encodes rain streak-relevant features as real-time prototypes derived from time-lapse data, eliminating the need for excessive memory resources. Our de-raining network combines encoder-decoder networks with the RsPU, allowing us to learn and encapsulate diverse rain streak-relevant features as concise prototypes, employing an attention-based approach. To ensure the effectiveness of our approach, we propose a feature prototype loss encompassing cohesion and divergence components. This loss function captures both the compactness and diversity aspects of the prototypical rain streak features within the RsPU. We evaluate our method on various de-raining benchmarks, accompanied by comprehensive ablation studies. We show that it achieves competitive results on various rain images compared to state-of-the-art methods.
Understanding how air traffic controllers construct a mental 'picture' of complex air traffic situations is crucial but remains a challenge due to the inherently intricate, high-dimensional interactions between aircraft, pilots, and controllers. Previous work on modeling the strategies of air traffic controllers and their mental image of traffic situations often centers on specific air traffic control tasks or pairwise interactions between aircraft, neglecting to capture the comprehensive dynamics of an air traffic situation. To address this issue, we propose a machine learning-based framework for explaining air traffic situations. Specifically, we employ a Transformer-based multi-agent trajectory model that encapsulates both the spatio-temporal movement of aircraft and social interaction between them. By deriving attention scores from the model, we can quantify the influence of individual aircraft on overall traffic dynamics. This provides explainable insights into how air traffic controllers perceive and understand the traffic situation. Trained on real-world air traffic surveillance data collected from the terminal airspace around Incheon International Airport in South Korea, our framework effectively explicates air traffic situations. This could potentially support and enhance the decision-making and situational awareness of air traffic controllers.
A 28nm dense 6T-SRAM Digital (D)/Analog (A) Hybrid compute-in-memory (CIM) macro supporting complex number MAC operation is presented. By introducing a 2D-weighted Capacitor Array, a hybrid configuration is adopted where digital CIM is applied only to the upper bits and analog CIM is applied to the rest, without the need for input DACs, resulting in improved accuracy and lower area overhead. The CIM prototype macro achieves 1.80 Mb/mm² memory density and 0.435% RMS error. The complex CIM unit outputs the real and imaginary parts with a single conversion to reduce latency.
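For reference, a complex-number MAC reduces to four real multiply-accumulates per input-weight pair, two feeding the real part and two feeding the imaginary part. The sketch below shows only this arithmetic identity; the function name and the toy values are assumptions for illustration, and nothing here models the macro's digital/analog partitioning or its conversion scheme.

```python
def complex_mac(inputs, weights):
    """Accumulate sum(x * w) over complex pairs given as (real, imag) tuples."""
    real = imag = 0
    for (xr, xi), (wr, wi) in zip(inputs, weights):
        # (xr + j*xi) * (wr + j*wi) = (xr*wr - xi*wi) + j*(xr*wi + xi*wr)
        real += xr * wr - xi * wi
        imag += xr * wi + xi * wr
    return real, imag


# Hypothetical operands: (1 + 2j)(2 + 1j) + (3 - 1j)(0 + 4j)
inputs = [(1, 2), (3, -1)]
weights = [(2, 1), (0, 4)]
print(complex_mac(inputs, weights))
```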
Plasma simulations are powerful tools for understanding fundamental plasma science phenomena and for process optimization in applications. To ensure their quantitative accuracy, they must be validated against experiments. In this work, such an experimental validation is performed for a 1d3v particle-in-cell simulation complemented with a Monte Carlo treatment of collision processes of a capacitively coupled radio frequency plasma driven at 13.56 MHz and operated in neon gas. In a geometrically symmetric reactor, the electron density in the discharge center and the spatio-temporal distribution of the electron impact excitation rate from the ground state into the Ne 2p$_1$ state are measured by a microwave cutoff probe and phase resolved optical emission spectroscopy, respectively. The measurements are conducted for electrode gaps between 50 mm and 90 mm, neutral gas pressures between 20 mTorr and 50 mTorr, and peak-to-peak values of the driving voltage waveform between 250 V and 650 V. Simulations are performed under identical discharge conditions. In the simulations, various combinations of surface coefficients characterising the interactions of electrons and heavy particles with the anodized aluminium electrode surfaces are adopted. We find that the simulations using a constant effective heavy-particle-induced secondary electron emission coefficient of 0.3 and a realistic electron-surface interaction model (which considers energy-dependent and material-specific elastic and inelastic electron reflection, as well as the emission of true secondary electrons from the surface) yield results which are in good quantitative agreement with the experimental data.
Within the framework of the KIDS (Korea-IBS-Daegu-SKKU) density functional model, the isoscalar and isovector effective masses of the nucleon and the effect of the symmetry energy in the nuclear medium are investigated in the inclusive $(e,e')$ reaction in the quasielastic region. The effective masses are varied in the range $(0.7 \sim 1.0)M$ with free nucleon mass $M$, and the symmetry energy is varied within the uncertainty allowed by nuclear data and neutron star observation. The wave functions of nucleons inside the target nucleus are generated by solving the Hartree-Fock equation while adjusting the equation of state, the binding energy and radius of various stable nuclei, and the effective mass of the nucleon in the KIDS model. With the obtained wave functions, we calculate the differential cross section for the inclusive $(e,e')$ reaction and compare the theoretical results with Bates, Saclay, and SLAC experimental data. Our model describes experimental data better at SLAC-type high incident electron energy than those measured from Bates and Saclay. The influence of the effective mass and symmetry energy appears to be precise on the longitudinal cross section.
We calculate muon-neutrino ($\nu_{\mu}$) scattering off $^{12}$C via charged current (CC) by exploiting the 236 MeV $\nu_{\mu}$ from kaon decay at rest (KDAR). In this energy region, since both inelastic scattering below the quasielastic (QE) region and the QE scattering contribute simultaneously, we combine the inelastic scattering obtained by the QRPA and the QE scattering obtained by the distorted-wave Born approximation (DWBA) based on the relativistic mean field (RMF) theory. We compare the results to the data from MiniBooNE. Further, since the KDAR $\nu_{\mu}$ CC scattering may depend on the angle of the outgoing muon, we investigate the angle-dependent differential cross section in the $\nu_{\mu}$-$^{12}$C scattering and compare it to the results for $\nu_e$-$^{12}$C scattering. These results could be useful for the calibration of the forthcoming KDAR neutrino cross section experiments.
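The quoted 236 MeV follows from two-body kinematics of $K^{+} \to \mu^{+} \nu_{\mu}$ for a kaon at rest. The short derivation below is added for orientation; it neglects the neutrino mass, uses natural units ($c = 1$), and takes rounded mass values $m_K \approx 493.68$ MeV and $m_\mu \approx 105.66$ MeV, which are not stated in the abstract.

```latex
% Energy and momentum conservation for K^+ -> mu^+ nu_mu with the kaon at rest:
%   m_K = E_\mu + E_\nu ,  |\vec{p}_\mu| = |\vec{p}_\nu| = E_\nu  (massless neutrino)
\begin{align}
  E_\mu^2 &= E_\nu^2 + m_\mu^2 , \qquad E_\mu = m_K - E_\nu \\
  \Rightarrow \quad E_\nu &= \frac{m_K^2 - m_\mu^2}{2\, m_K}
    \approx \frac{(493.68)^2 - (105.66)^2}{2 \times 493.68}\ \text{MeV}
    \approx 235.5\ \text{MeV}
\end{align}
```

Because the decay is two-body, the neutrino is monoenergetic, which is why KDAR serves as a fixed-energy source; the result rounds to the 236 MeV quoted above.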