Yeungnam University
Recent progress in generative AI, primarily through diffusion models, presents significant challenges for real-world deepfake detection. The increased realism in image details, diverse content, and widespread accessibility to the general public complicates the identification of these sophisticated deepfakes. Acknowledging the urgency of addressing the vulnerability of current deepfake detectors to this evolving threat, our paper introduces two extensive deepfake datasets generated by state-of-the-art diffusion models, addressing the limited diversity and quality of existing datasets. Our extensive experiments show that these datasets are more challenging than existing face deepfake datasets. Our strategic dataset creation not only challenges deepfake detectors but also sets a new benchmark for evaluation. Our comprehensive evaluation reveals that existing detection methods, often optimized for specific image domains and manipulations, struggle to adapt to the intricate nature of diffusion deepfakes, limiting their practical utility. To address this critical issue, we investigate the impact of enhancing training data diversity on representative detection methods, expanding the diversity of both manipulation techniques and image domains. Our findings underscore that increasing training data diversity improves generalizability. Moreover, we propose a novel momentum difficulty boosting strategy to tackle the additional challenge posed by training data heterogeneity. This strategy dynamically assigns appropriate sample weights based on learning difficulty, enhancing the model's adaptability to both easy and challenging samples. Extensive experiments on both existing and newly proposed benchmarks demonstrate that our model optimization approach significantly surpasses prior alternatives.
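The abstract does not spell out the weighting rule, so the following is a minimal sketch of one plausible reading: an exponential moving average of per-sample loss serves as a difficulty score that boosts the weights of harder samples. The class name, the EMA form, and the boosting exponent are assumptions, not the paper's exact method.

```python
import torch

class MomentumDifficultyWeighter:
    """Tracks per-sample difficulty as an EMA of past losses and converts it
    into a sample weight (a sketch, not the paper's exact rule)."""

    def __init__(self, num_samples: int, momentum: float = 0.9, gamma: float = 1.0):
        self.difficulty = torch.zeros(num_samples)  # EMA of per-sample loss
        self.momentum = momentum
        self.gamma = gamma  # how strongly difficulty boosts the weight

    def update(self, indices: torch.Tensor, losses: torch.Tensor) -> torch.Tensor:
        # Update the running difficulty estimate for the samples in this batch.
        old = self.difficulty[indices]
        new = self.momentum * old + (1.0 - self.momentum) * losses.detach()
        self.difficulty[indices] = new
        # Normalize within the batch and boost harder samples.
        weights = (new / (new.mean() + 1e-8)) ** self.gamma
        return weights / weights.sum() * len(indices)

# Usage inside a training loop (criterion must use reduction="none"):
#   per_sample = criterion(model(x), y)
#   w = weighter.update(batch_indices, per_sample)
#   loss = (w * per_sample).mean(); loss.backward()
```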
The advent of Non-Terrestrial Networks (NTN) represents a compelling response to the International Mobile Telecommunications 2030 (IMT-2030) framework, enabling the delivery of advanced, seamless connectivity that supports reliable, sustainable, and resilient communication systems. Nevertheless, the integration of NTN with Terrestrial Networks (TN) necessitates considerable alterations to the existing cellular infrastructure in order to address the challenges intrinsic to NTN implementation. Additionally, Ambient Backscatter Communication (AmBC), which utilizes ambient Radio Frequency (RF) signals to transmit data to the intended recipient by altering and reflecting these signals, exhibits considerable potential for the effective integration of NTN and TN. However, AmBC is constrained by limitations in power, interference, and other related factors. In contrast, the application of Artificial Intelligence (AI) within wireless networks demonstrates significant potential for predictive analytics through the use of extensive datasets. AI techniques enable the real-time optimization of network parameters, mitigating interference and power limitations in AmBC. These predictive models also enhance the adaptive integration of NTN and TN, driving significant improvements in network reliability and Energy Efficiency (EE). In this paper, we present a comprehensive examination of how the combination of AI, AmBC, and NTN can facilitate the integration of NTN and TN. We also provide a thorough analysis indicating a marked enhancement in EE predicated on this triadic relationship.
This paper presents MIS-LSTM, a hybrid framework that joins CNN encoders with an LSTM sequence model for day-level sleep quality and stress prediction from multimodal lifelog data. Continuous sensor streams are first partitioned into N-hour blocks and rendered as multi-channel images, while sparse discrete events are encoded with a dedicated 1D-CNN. A Convolutional Block Attention Module fuses the two modalities into refined block embeddings, which an LSTM then aggregates to capture long-range temporal dependencies. To further boost robustness, we introduce UALRE, an uncertainty-aware ensemble that overrides low-confidence majority votes with high-confidence individual predictions. Experiments on the 2025 ETRI Lifelog Challenge dataset show that our base MIS-LSTM achieves a Macro-F1 of 0.615; with the UALRE ensemble, the score improves to 0.647, outperforming strong LSTM, 1D-CNN, and CNN baselines. Ablations confirm (i) the superiority of multi-channel over stacked-vertical imaging, (ii) the benefit of a 4-hour block granularity, and (iii) the efficacy of modality-specific discrete encoding.
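As a rough illustration of the described pipeline, a PyTorch sketch might look as follows; layer sizes, tensor shapes, and the simplified attention gate standing in for CBAM are all assumptions.

```python
import torch
import torch.nn as nn

class MISLSTMSketch(nn.Module):
    """Rough sketch of the MIS-LSTM pipeline described in the abstract."""

    def __init__(self, img_ch=6, evt_ch=8, d=64, num_classes=3):
        super().__init__()
        # CNN encoder for the multi-channel "sensor image" of one N-hour block.
        self.img_enc = nn.Sequential(
            nn.Conv2d(img_ch, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(32 * 16, d),
        )
        # 1D-CNN encoder for sparse discrete events within the same block.
        self.evt_enc = nn.Sequential(
            nn.Conv1d(evt_ch, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(4), nn.Flatten(), nn.Linear(32 * 4, d),
        )
        # Simplified attention gate standing in for CBAM fusion.
        self.gate = nn.Sequential(nn.Linear(2 * d, 2 * d), nn.Sigmoid())
        self.fuse = nn.Linear(2 * d, d)
        self.lstm = nn.LSTM(d, d, batch_first=True)  # aggregates blocks over a day
        self.head = nn.Linear(d, num_classes)

    def forward(self, imgs, events):
        # imgs: (B, T, img_ch, H, W); events: (B, T, evt_ch, L); T = blocks per day
        B, T = imgs.shape[:2]
        zi = self.img_enc(imgs.flatten(0, 1))           # (B*T, d)
        ze = self.evt_enc(events.flatten(0, 1))         # (B*T, d)
        z = torch.cat([zi, ze], dim=-1)
        z = self.fuse(z * self.gate(z)).view(B, T, -1)  # refined block embeddings
        _, (h, _) = self.lstm(z)                        # long-range daily context
        return self.head(h[-1])
```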
The Hawkes model is suitable for describing self- and mutually exciting random events. In addition, the exponential decay in the Hawkes process allows us to calculate the moment properties of the model. However, due to the complexity of the model and its formulas, few studies have examined the performance of Hawkes volatility. In this study, we derive a variance formula that is directly applicable under the general settings of both unmarked and marked Hawkes models for tick-level price dynamics. In the marked model, a linear impact function and possible dependency between the marks and underlying processes are considered. Applying the Hawkes volatility to the mid-price process filtered at 0.1-second intervals yields reliable results; furthermore, intraday estimation is expected to be highly useful in real-time risk management. We also note the increasing predictive power of intraday Hawkes volatility over time and examine the relationship between futures and stock volatilities.
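For reference, the unmarked exponential-kernel case admits standard closed-form moments; the paper's formula additionally covers marks, but the classical one-dimensional version reads:

```latex
% Intensity of a 1-D Hawkes process with exponential decay:
%   \lambda(t) = \mu + \sum_{t_i < t} \alpha\, e^{-\beta (t - t_i)}, \quad \alpha < \beta.
% Stationary mean intensity and long-run count variance rate:
\bar\lambda = \frac{\mu}{1 - \alpha/\beta},
\qquad
\lim_{t\to\infty} \frac{\operatorname{Var}(N_t)}{t}
  = \frac{\bar\lambda}{\left(1 - \alpha/\beta\right)^{2}}.
```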
This study proposes a versatile model for the dynamics of the best bid and ask prices using an extended Hawkes process. The model incorporates the zero intensities of the spread-narrowing processes at the minimum bid-ask spread, spread-dependent intensities, possible negative excitement, and nonnegative intensities. We apply the model to high-frequency best bid and ask price data from US stock markets. The empirical findings demonstrate a spread-narrowing tendency, excitations of the intensities caused by previous events, the impact of flash crashes, characteristic trends in fast trading over time, and the different features of market participants in the various exchanges.
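The abstract names the ingredients but not the exact specification; one plausible reading of the intensity of a spread-narrowing event, with all notation assumed, is:

```latex
% s_{t-}: prevailing bid-ask spread; \delta: minimum spread. The narrowing
% intensity vanishes at the minimum spread, excitation coefficients may be
% negative, and the intensity is floored at zero:
\lambda^{\mathrm{narrow}}(t)
  = \mathbf{1}\{ s_{t-} > \delta \}
    \Big( \mu(s_{t-}) + \sum_{t_i < t} \alpha_i\, e^{-\beta (t - t_i)} \Big)^{\!+}.
```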
This study explores the prediction of high-frequency price changes using deep learning models. Although state-of-the-art methods perform well, their complexity impedes the understanding of successful predictions. We find that an inadequately defined target price process may render predictions meaningless by incorporating past information. The commonly used three-class problem in asset price prediction can generally be divided into volatility and directional prediction. When relying solely on the price process, directional prediction performance is not substantial. However, incorporating volume imbalance improves directional prediction performance.
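As an illustration of the two ingredients discussed, here is a sketch of the usual three-class mid-price label and a best-quote volume-imbalance feature; the column names and the flat-move threshold are assumptions.

```python
import pandas as pd

def make_features(df: pd.DataFrame, horizon: int = 10, tick: float = 0.01):
    """df columns assumed: bid, ask, bid_size, ask_size (one row per snapshot).
    Builds a three-class mid-price move label and a volume-imbalance feature."""
    mid = (df["bid"] + df["ask"]) / 2
    future_move = mid.shift(-horizon) - mid
    # Three-class target: down / flat / up, thresholded at half a tick (assumed).
    label = pd.cut(future_move,
                   bins=[-float("inf"), -tick / 2, tick / 2, float("inf")],
                   labels=["down", "flat", "up"])
    # Volume imbalance at the best quotes, in [-1, 1].
    imbalance = (df["bid_size"] - df["ask_size"]) / (df["bid_size"] + df["ask_size"])
    return pd.DataFrame({"imbalance": imbalance, "label": label}).dropna()
```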
UAVs are increasingly becoming vital tools in various wireless communication applications, including the Internet of Things (IoT) and sensor networks, thanks to their rapid and agile non-terrestrial mobility. Despite recent research, planning three-dimensional (3D) UAV trajectories over a continuous temporal-spatial domain remains challenging due to the need to solve computationally intensive optimization problems. In this paper, we study UAV-assisted IoT data collection aimed at minimizing total energy consumption while accounting for the UAV's physical capabilities, the heterogeneous data demands of IoT nodes, and 3D terrain. We propose matrix-based differential evolution with constraint handling (MDE-CH), a computation-efficient evolutionary algorithm designed to address non-convex constrained optimization problems with several different types of constraints. Numerical evaluations demonstrate that the proposed MDE-CH algorithm provides a continuous 3D temporal-spatial UAV trajectory capable of efficiently minimizing energy consumption under various practical constraints and outperforms the conventional fly-hover-fly model for both two-dimensional (2D) and 3D trajectory planning.
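MDE-CH itself is not reproduced here; the following is a generic vectorized ("matrix-based") differential evolution sketch with a feasibility-first constraint-handling rule in the same spirit, with all parameters illustrative.

```python
import numpy as np

def de_constrained(obj, cons, bounds, pop=50, gens=200, F=0.7, CR=0.9, seed=0):
    """Generic sketch, not MDE-CH itself. obj(X) -> (pop,) objective values;
    cons(X) -> (pop,) total constraint violation >= 0 (0 means feasible);
    bounds: (dim, 2) array of [low, high] per decision variable."""
    rng = np.random.default_rng(seed)
    low, high = bounds[:, 0], bounds[:, 1]
    dim = len(bounds)
    X = rng.uniform(low, high, size=(pop, dim))
    f, v = obj(X), cons(X)
    for _ in range(gens):
        idx = rng.integers(pop, size=(pop, 3))            # random donors per row
        a, b, c = X[idx[:, 0]], X[idx[:, 1]], X[idx[:, 2]]
        mutant = np.clip(a + F * (b - c), low, high)      # DE/rand/1 mutation
        cross = rng.random((pop, dim)) < CR               # binomial crossover
        trial = np.where(cross, mutant, X)
        ft, vt = obj(trial), cons(trial)
        # Feasibility rule: feasible beats infeasible; ties broken by objective
        # among feasible points and by violation among infeasible ones.
        better = ((vt == 0) & (v > 0)) | ((vt == 0) & (v == 0) & (ft < f)) \
               | ((vt > 0) & (v > 0) & (vt < v))
        X[better], f[better], v[better] = trial[better], ft[better], vt[better]
    best = np.lexsort((f, v))[0]    # minimal violation first, then objective
    return X[best], f[best], v[best]
```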
Snackjack is a highly simplified version of blackjack that was proposed by Ethier (2010) and given its name by Epstein (2013). The eight-card deck comprises two aces, two deuces, and four treys, with aces having value either 1 or 4, and deuces and treys having values 2 and 3, respectively. The target total is 7 (vs. 21 in blackjack), and ace-trey is a natural. The dealer stands on 6 and 7, including soft totals, and otherwise hits. The player can stand, hit, double, or split, but split pairs receive only one card per paircard (like split aces in blackjack), and there is no insurance. We analyze the game, both single and multiple deck, deriving basic strategy and one-parameter card-counting systems. Unlike in blackjack, these derivations can be done by hand, though it may nevertheless be easier and more reliable to use a computer. More importantly, the simplicity of snackjack allows us to do computations that would be prohibitively time-consuming at blackjack. We can thereby enhance our understanding of blackjack by thoroughly exploring snackjack.
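The rules in the abstract suffice to compute, for example, the dealer's final-total distribution with a short exact recursion; a sketch for a single eight-card deck (dealer play in isolation, player cards not removed from the deck):

```python
from fractions import Fraction
from collections import Counter

FULL_DECK = {1: 2, 2: 2, 3: 4}   # two aces (1 or 4), two deuces, four treys

def best_total(cards):
    """Highest total <= 7, counting each ace as 1 or 4; None if busted."""
    total = sum(cards)                       # all aces counted as 1
    best = None
    for promoted in range(cards.count(1) + 1):   # promote some aces to 4
        t = total + 3 * promoted
        if t <= 7:
            best = t if best is None else max(best, t)
    return best

def dealer_dist(deck, cards=(), p=Fraction(1)):
    """Distribution of the dealer's final total ('bust' included), drawing
    without replacement; stands on 6 and 7 (soft included), hits otherwise."""
    t = best_total(cards)
    if t is None:
        return Counter({"bust": p})
    if t >= 6:
        return Counter({t: p})
    out = Counter()
    n = sum(deck.values())
    for v, cnt in deck.items():
        if cnt:
            d2 = dict(deck); d2[v] -= 1
            out += dealer_dist(d2, cards + (v,), p * Fraction(cnt, n))
    return out

print(dealer_dist(FULL_DECK))   # exact distribution for a single 8-card deck
```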
Noisier2Inverse presents a self-supervised deep learning method for image reconstruction in general inverse problems, specifically designed to handle statistically correlated measurement noise. The approach operates in one step and avoids extrapolation during inference, consistently outperforming existing self-supervised techniques like Noisier2Noise and Noise2Inverse in 2D CT reconstruction tasks.
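The paper's exact objective is not given in the abstract; the general recipe of this family of methods, sketched with a generic forward operator and synthetic correlated noise (all specifics assumed), is roughly:

```python
import torch

def noisier_step(net, y, forward_op, noise_sampler, opt):
    """One self-supervised step in the spirit of Noisier2Noise-type training:
    re-noise the measurement with synthetic correlated noise and ask the
    network to reconstruct, in one step, an image whose forward projection
    matches the original measurement. A generic sketch, not the paper's loss."""
    e = noise_sampler(y.shape)          # synthetic correlated noise sample
    z = y + e                           # noisier measurement
    x_hat = net(z)                      # one-step reconstruction
    loss = ((forward_op(x_hat) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```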
This study examines the use of a recurrent neural network for estimating the parameters of a Hawkes model from high-frequency financial data and, subsequently, for computing volatility. Neural networks have shown promising results in various fields, and interest in finance is also growing. Our approach demonstrates significantly faster computational performance than traditional maximum likelihood estimation while yielding comparable accuracy in both simulation and empirical studies. Furthermore, we demonstrate the application of this method to real-time volatility measurement, enabling the continuous estimation of financial volatility as new price data arrive from the market.
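As a sketch of the setup (the paper's architecture and training details are not given in the abstract), one can simulate exponential-kernel Hawkes sequences by Ogata thinning and train a small GRU to regress the parameters:

```python
import numpy as np
import torch
import torch.nn as nn

def simulate_hawkes(mu, alpha, beta, T, rng):
    """Ogata thinning for a 1-D Hawkes process with exponential kernel."""
    t, events = 0.0, []
    while True:
        lam_star = mu + sum(alpha * np.exp(-beta * (t - s)) for s in events)
        t += rng.exponential(1.0 / lam_star)       # candidate event time
        if t > T:
            return np.array(events)
        lam_t = mu + sum(alpha * np.exp(-beta * (t - s)) for s in events)
        if rng.uniform() <= lam_t / lam_star:      # accept with prob lam_t/lam*
            events.append(t)

class HawkesRNN(nn.Module):
    """GRU mapping a sequence of inter-event times to (mu, alpha, beta)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)
    def forward(self, dt):                         # dt: (B, L, 1) inter-event times
        _, h = self.rnn(dt)
        return torch.nn.functional.softplus(self.head(h[-1]))  # positive params

# Training idea: sample (mu, alpha, beta), simulate, regress with MSE on params.
```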
Blind deconvolution aims to recover an original image from a blurred version when the blurring kernel is unknown. It has wide applications in diverse fields such as astronomy, microscopy, and medical imaging. Blind deconvolution is a challenging ill-posed problem that suffers from significant non-uniqueness; solution methods therefore require the integration of appropriate prior information. Early approaches rely on hand-crafted priors for the original image and the kernel. Recently, deep learning methods have shown excellent performance in addressing this challenge. However, most existing learning methods for blind deconvolution require a paired dataset of original and blurred images, which is often difficult to obtain. In this paper, we present a novel unsupervised learning approach named ECALL (Expectation-CALibrated Learning) that uses separate unpaired collections of original and blurred images. Key features of the proposed loss function are cycle consistency involving the kernel and associated reconstruction operator, and terms that use expectation values of the data distributions to obtain information about the kernel. Numerical results are presented to support the effectiveness of ECALL.
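A rough sketch of the loss ingredients named in the abstract, with the exact forms and weights assumed rather than taken from the paper:

```python
import torch
import torch.nn.functional as F

def ecall_losses(R, k, x, y):
    """R: learned reconstruction network; k: learned blur kernel (1, 1, ks, ks);
    x: batch of unpaired originals (B, 1, H, W); y: batch of unpaired blurs.
    Sketch of the two loss families named in the abstract; forms are assumed."""
    blur = lambda img: F.conv2d(img, k, padding=k.shape[-1] // 2)
    # Cycle consistency: reconstructing a synthetically blurred original should
    # return the original; blurring a reconstruction should return the blur.
    cyc = F.mse_loss(R(blur(x)), x) + F.mse_loss(blur(R(y)), y)
    # Expectation calibration: first moments of blur(x) and y should match,
    # which constrains the kernel even though x and y are unpaired.
    cal = (blur(x).mean() - y.mean()) ** 2
    return cyc, cal
```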
This study examines the theoretical and empirical perspectives of the symmetric Hawkes model of price tick structure. Combined with maximum likelihood estimation, the model provides a volatility estimation method specialized for ultra-high-frequency analysis. Empirical studies based on the model are performed using ultra-high-frequency data for stocks in the S\&P 500. The performance of the volatility measure, intraday estimation, and the dynamics of the parameters are discussed. A new diffusion analogy to the symmetric Hawkes model is proposed, with distributional properties very close to those of the Hawkes model. As a diffusion process, the model provides more analytical simplicity when computing the variance formula, incorporating skewness, and examining probabilistic properties. Estimation of the diffusion model is performed using the simulated maximum likelihood method and shows patterns similar to those of the Hawkes model.
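For the one-dimensional building block, the exponential-kernel log-likelihood used in such MLE studies has a well-known O(n) form; a sketch follows (the paper's model is bivariate and symmetric, which generalizes this directly):

```python
import numpy as np

def hawkes_loglik(params, times, T):
    """Exact log-likelihood of a 1-D Hawkes process with exponential kernel
    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta (t - t_i)), using the
    standard O(n) recursion; maximize over (mu, alpha, beta) for the MLE."""
    mu, alpha, beta = params
    if mu <= 0 or alpha < 0 or beta <= alpha:   # stationarity: alpha/beta < 1
        return -np.inf
    A, ll, prev = 0.0, 0.0, None
    for t in times:                              # times: sorted event times
        if prev is not None:
            A = np.exp(-beta * (t - prev)) * (A + 1.0)   # recursive excitation sum
        ll += np.log(mu + alpha * A)
        prev = t
    compensator = mu * T + (alpha / beta) * np.sum(1.0 - np.exp(-beta * (T - times)))
    return ll - compensator

# Usage: scipy.optimize.minimize(lambda p: -hawkes_loglik(p, times, T), x0)
```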
We propose the Hawkes flocking model, which assesses systemic risk in high-frequency processes from two perspectives: endogeneity and interactivity. We examine the futures markets of WTI crude oil and gasoline over the past decade and perform a comparative analysis with conditional value-at-risk as a benchmark measure. In terms of high-frequency structure, we report the following empirical findings: the endogenous systemic risk in WTI was significantly higher than that in gasoline, and the influence of gasoline on WTI was consistently stronger than the reverse. Moreover, although the degree of relative influence was asymmetric, the gap has gradually narrowed.
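In exponential-kernel notation (assumed here), endogeneity and interactivity are naturally read off the branching-ratio matrix of a bivariate Hawkes model:

```latex
% Intensities for markets i = 1 (WTI), 2 (gasoline); notation assumed:
\lambda_i(t) = \mu_i + \sum_{j=1}^{2} \int_0^t \alpha_{ij}\, e^{-\beta_{ij}(t-s)}\, dN_j(s),
\qquad
n_{ij} = \frac{\alpha_{ij}}{\beta_{ij}},
% where n_{ii} measures endogeneity (self-excitation) of market i and
% n_{ij}, i \neq j, measures cross-market interactivity.
```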
Synthetic aperture radar technology is crucial for high-resolution imaging under various conditions; however, the acquisition of real-world synthetic aperture radar data for deep learning-based automatic target recognition remains challenging due to high costs and data availability issues. To overcome these challenges, synthetic data generated through simulations have been employed, although discrepancies between synthetic and real data can degrade model performance. In this study, we introduce a novel framework, soft segmented randomization, designed to reduce domain discrepancy and improve the generalizability of synthetic aperture radar automatic target recognition models. The soft segmented randomization framework applies a Gaussian mixture model to softly segment target and clutter regions, introducing randomized variations that align the synthetic data's statistical properties more closely with those of real-world data. Experimental results demonstrate that the proposed framework significantly enhances model performance on measured synthetic aperture radar data, making it a promising approach for robust automatic target recognition in scenarios with limited or no access to measured data.
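A minimal sketch of the GMM-based idea, with the two-component choice and the perturbation form assumed rather than taken from the paper:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def soft_segmented_randomization(img, rng, jitter=0.2):
    """Softly separate bright target pixels from darker clutter with a
    two-component GMM, then randomize each region's statistics and blend
    by the soft mask. A sketch of the idea, not the paper's exact recipe."""
    x = img.reshape(-1, 1).astype(float)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(x)
    resp = gmm.predict_proba(x)                       # soft memberships
    target = resp[:, np.argmax(gmm.means_.ravel())]   # brighter component = target
    w = target.reshape(img.shape)
    gain_t, gain_c = 1 + jitter * rng.standard_normal(2)  # region-wise random gains
    return w * (gain_t * img) + (1 - w) * (gain_c * img)

# rng = np.random.default_rng(0); augmented = soft_segmented_randomization(sar_img, rng)
```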
In this article, we are concerned with a nonlinear inverse problem whose forward operator involves an unknown function. The problem arises in diverse applications and is ill-posed due to the presence of the unknown function. Additionally, the nonlinear nature of the problem makes traditional methods difficult to apply, so previous studies have addressed simplified versions by either linearizing the problem or assuming knowledge of the unknown function. Here, we propose self-supervised learning to directly tackle a nonlinear inverse problem involving an unknown function. In particular, we focus on an inverse problem derived from photoacoustic tomography (PAT), a hybrid medical imaging modality offering high resolution and contrast. PAT can be modeled by the wave equation: the measured data are the solution of the equation restricted to the surface, and the initial pressure contains biological information about the object of interest. The sound speed in the equation is unknown. Our goal is to determine the initial pressure and the sound speed simultaneously. Under the simple assumption that the sound speed is a function of the initial pressure, the problem becomes a nonlinear inverse problem involving an unknown function. The experimental results demonstrate that the proposed framework performs successfully.
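A toy sketch of the self-supervised idea under the stated assumption that sound speed is a function of the initial pressure; the 1-D leapfrog solver, the affine sound-speed model, and all parameters are stand-ins, not the paper's setup.

```python
import torch

def wave_forward(p0, c, steps=200, dx=1.0, dt=0.4):
    """Toy differentiable 1-D wave solver (leapfrog): u_tt = c^2 u_xx with
    initial pressure p0 and zero initial velocity; records the left boundary,
    standing in for the surface measurement in PAT."""
    u_prev, u = p0, p0.clone()
    lap = lambda v: torch.cat([v[:1] * 0, v[2:] - 2 * v[1:-1] + v[:-2], v[-1:] * 0])
    rec = []
    for _ in range(steps):
        u_next = 2 * u - u_prev + (c * dt / dx) ** 2 * lap(u)
        u_prev, u = u, u_next
        rec.append(u[0])
    return torch.stack(rec)

# Jointly fit the initial pressure and the sound-speed map c = f(p0),
# here an assumed affine model with learnable a, b.
p_meas = torch.randn(200)                  # placeholder boundary measurement
p0 = torch.zeros(64, requires_grad=True)
a = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(0.1, requires_grad=True)
opt = torch.optim.Adam([p0, a, b], lr=1e-2)
for _ in range(100):
    c = a + b * p0                         # assumed sound-speed model
    loss = ((wave_forward(p0, c) - p_meas) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```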
A Brownian ratchet is a one-dimensional diffusion process that drifts toward a minimum of a periodic asymmetric sawtooth potential. A flashing Brownian ratchet is a process that alternates between two regimes, a one-dimensional Brownian motion and a Brownian ratchet, producing directed motion. These processes have been of interest to physicists and biologists for nearly 25 years. The flashing Brownian ratchet is the process that motivated Parrondo's paradox, in which two fair games of chance, when alternated, produce a winning game. Parrondo's games are relatively simple, being discrete in time and space. The flashing Brownian ratchet is rather more complicated. We show how one can study the latter process numerically using a random walk approximation.
The flashing Brownian ratchet is a stochastic process that alternates between two regimes, a one-dimensional Brownian motion and a Brownian ratchet, the latter being a one-dimensional diffusion process that drifts towards a minimum of a periodic asymmetric sawtooth potential. The result is directed motion. In the presence of a static homogeneous force that acts in the direction opposite that of the directed motion, there is a reduction (or even a reversal) of the directed motion effect. Such a process may be called a tilted flashing Brownian ratchet. We show how one can study this process numerically, using a random walk approximation or, equivalently, using numerical solution of the Fokker-Planck equation. Stochastic simulation is another viable method.
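Both of the preceding abstracts rest on the same random walk approximation: a diffusion dX = b(X)dt + dW is mimicked by steps of size ±h taken every h² time units with probabilities (1 ± b(x)h)/2. A sketch with an illustrative sawtooth potential and an optional opposing force (all parameters are illustrative):

```python
import numpy as np

def sawtooth_drift(x, a=1/3, v=1.0):
    """Minus the derivative of a periodic sawtooth potential with period 1,
    minima at the integers, and a peak of height v at fractional position a."""
    return -v / a if (x % 1.0) < a else v / (1.0 - a)

def tilted_flashing_ratchet(n_steps, h=0.01, flash=0.5, force=0.0, seed=0):
    """Random walk approximation of the (tilted) flashing Brownian ratchet:
    alternate between pure Brownian motion (b = 0) and the ratchet drift plus
    a constant force; requires |b| * h <= 1 for valid step probabilities."""
    rng = np.random.default_rng(seed)
    x, t, path = 0.0, 0.0, [0.0]
    for _ in range(n_steps):
        ratchet_on = (t % (2 * flash)) < flash      # alternate the two regimes
        b = (sawtooth_drift(x) + force) if ratchet_on else 0.0
        p_up = 0.5 * (1.0 + b * h)                  # step +h with this probability
        x += h if rng.random() < p_up else -h
        t += h * h                                  # time step matched to h^2
        path.append(x)
    return np.array(path)

# Positive mean displacement with force=0.0; a sufficiently strong opposing
# force reduces or reverses the directed motion, as described in the abstract.
print(tilted_flashing_ratchet(200_000, force=0.0)[-1])
```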
Developing a robust algorithm to diagnose and quantify the severity of COVID-19 from chest X-rays (CXR) requires a large number of well-curated COVID-19 datasets, which are difficult to collect under the global COVID-19 pandemic. On the other hand, CXR data with other findings are abundant. This situation is ideally suited for the Vision Transformer (ViT) architecture, where large amounts of unlabeled data can be exploited through structural modeling via the self-attention mechanism. However, the use of existing ViTs is not optimal, since feature embedding through direct patch flattening or a ResNet backbone in the standard ViT is not intended for CXR. To address this problem, here we propose a novel Vision Transformer that utilizes a low-level CXR feature corpus obtained from a backbone network that extracts common CXR findings. Specifically, the backbone network is first trained on large public datasets to detect common abnormal findings such as consolidation, opacity, and edema. Then, the embedded features from the backbone network are used as a corpus for a Transformer model for the diagnosis and severity quantification of COVID-19. We evaluate our model on various external test datasets from entirely different institutions to assess its generalization capability. The experimental results confirm that our model achieves state-of-the-art performance in both diagnosis and severity quantification tasks, with superior generalization capability, which is a sine qua non for widespread deployment.
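As a sketch of the two-stage design (the backbone architecture, feature dimensions, and head are assumptions):

```python
import torch
import torch.nn as nn

class CXRCorpusViT(nn.Module):
    """Sketch of the abstract's design: a frozen backbone pre-trained on common
    CXR findings supplies low-level feature maps, which become the token
    "corpus" for a Transformer classifier. Sizes are assumptions."""

    def __init__(self, backbone, feat_dim=256, d=384, heads=6, layers=4, n_cls=3):
        super().__init__()
        self.backbone = backbone.eval()               # pre-trained, kept frozen
        for p in self.backbone.parameters():
            p.requires_grad_(False)
        self.proj = nn.Linear(feat_dim, d)            # feature-to-token projection
        self.cls = nn.Parameter(torch.zeros(1, 1, d)) # learnable [CLS] token
        enc_layer = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, layers)
        self.head = nn.Linear(d, n_cls)

    def forward(self, x):
        with torch.no_grad():
            f = self.backbone(x)                      # (B, feat_dim, H', W')
        tokens = self.proj(f.flatten(2).transpose(1, 2))      # (B, H'W', d)
        tokens = torch.cat([self.cls.expand(len(x), -1, -1), tokens], dim=1)
        return self.head(self.encoder(tokens)[:, 0])  # classify from [CLS]
```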
The mixed-state Hall resistivity and the longitudinal resistivity in $\mathrm{HgBa_2CaCu_2O_6}$, $\mathrm{HgBa_2Ca_2Cu_3O_8}$, and $\mathrm{Tl_2Ba_2CaCu_2O_8}$ thin films have been investigated as functions of the magnetic field up to 18 T. We observe universal scaling behavior between $\rho_{xy}$ and $\rho_{xx}$ in the clean and moderately clean limits. The scaling exponent $\beta$ is 1.9 in the clean limit at high field and low temperature, whereas $\beta$ is 1.0 in the moderately clean limit at low field and high temperature, consistent with a theory based on the midgap states in the vortex cores. This finding implies that the Hall conductivity is also universal in Hg- and Tl-based superconductors.
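The reported scaling law, in the abstract's notation:

```latex
\rho_{xy} = A\,\rho_{xx}^{\beta},
\qquad
\beta \approx 1.9 \ \text{(clean limit: high field, low temperature)},
\qquad
\beta \approx 1.0 \ \text{(moderately clean limit: low field, high temperature)}.
```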