University of the Philippines Diliman
PARSeq, a Transformer-based model from the University of the Philippines, Diliman, adapts Permutation Language Modeling to Scene Text Recognition, achieving state-of-the-art accuracy on various benchmarks while maintaining superior computational efficiency. The model integrates context-aware and context-free inference within a unified architecture, demonstrating enhanced robustness across diverse text orientations and conditions.
We present cosmo_learn, an open-source Python-based software package designed to simulate cosmological data and perform data-driven inference using a range of modern statistical and machine learning techniques. Motivated by the growing complexity of cosmological models and the emergence of observational tensions, cosmo_learn provides a standardized and flexible framework for benchmarking cosmological inference methods. The package supports realistic noise modeling for key observables in the late Universe, including cosmic chronometers, supernovae Ia, baryon acoustic oscillations, redshift space distortions, and gravitational wave bright sirens. We demonstrate the internal consistency of the simulated data with the input cosmology via residuals and parameter recovery using a fiducial wCDM model. Built-in learning and inference modules include traditional Markov Chain Monte Carlo, as well as more recent approaches such as genetic algorithms, Gaussian processes, Bayesian ridge regression, and artificial neural networks. These methods are implemented in a modular and extensible architecture designed to facilitate comparisons across inference strategies in a common pipeline. By providing a flexible and transparent simulation and learning environment, cosmo_learn supports both educational and research efforts at the intersection of cosmology, statistics, and machine learning.
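The parameter-recovery check described in the abstract can be sketched without the package itself. The snippet below is a minimal stand-in, not cosmo_learn's actual API: it simulates noisy cosmic-chronometer H(z) data under a fiducial wCDM cosmology and recovers (H0, Omega_m) by chi-square minimisation on a grid, with w held at its fiducial value of -1.

```python
# Illustrative sketch (NOT cosmo_learn's actual API): simulate noisy
# cosmic-chronometer H(z) data under a fiducial wCDM cosmology, then
# recover (H0, Omega_m) by chi-square minimisation on a coarse grid.
import math
import random

def hubble(z, H0, Om, w=-1.0):
    """wCDM expansion rate H(z) = H0*sqrt(Om(1+z)^3 + (1-Om)(1+z)^(3(1+w)))."""
    return H0 * math.sqrt(Om * (1 + z)**3 + (1 - Om) * (1 + z)**(3 * (1 + w)))

random.seed(42)
H0_true, Om_true, sigma = 70.0, 0.3, 5.0          # fiducial cosmology + noise level
zs = [0.1 + 1.9 * i / 29 for i in range(30)]       # 30 redshifts in [0.1, 2.0]
data = [hubble(z, H0_true, Om_true) + random.gauss(0, sigma) for z in zs]

def chi2(H0, Om):
    """Gaussian chi-square of the mock data against a trial (H0, Om)."""
    return sum((d - hubble(z, H0, Om))**2 / sigma**2 for z, d in zip(zs, data))

# Grid search: H0 in [60, 80] (step 0.5), Omega_m in [0.20, 0.40] (step 0.01).
best = min((chi2(h / 2, om / 100), h / 2, om / 100)
           for h in range(120, 161)
           for om in range(20, 41))
_, H0_fit, Om_fit = best
```

With the seeded noise realisation, the grid minimum lands close to the input cosmology, which is the "internal consistency" check the abstract refers to.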
We reconstruct the Hubble function from compiled cosmic chronometer, supernova, and baryon acoustic oscillation data sets via the Gaussian process (GP) method and use it to draw out Horndeski theories that are fully anchored on expansion history data. In particular, we consider three well-established formalisms of Horndeski gravity which single out a potential through the expansion data, namely the quintessence potential, designer Horndeski, and tailoring Horndeski approaches. We discuss each method in detail and complement it with the GP-reconstructed Hubble function to obtain predictive constraints on the potentials and the dark energy equation of state.
We propose Reverse Contrast Attention (RCA), a plug-in method that enhances object localization in vision-language transformers without retraining. RCA reweights final-layer attention by suppressing extremes and amplifying mid-level activations to let semantically relevant but subdued tokens guide predictions. We evaluate it on Open Vocabulary Referring Object Detection (OV-RefOD), introducing FitAP, a confidence-free average precision metric based on IoU and box area. RCA improves FitAP in 11 out of 15 open-source VLMs, with gains up to +26.6%. Effectiveness aligns with attention sharpness and fusion timing; while late-fusion models benefit consistently, models like DeepSeek-VL2 also improve, pointing to capacity and disentanglement as key factors. RCA offers both interpretability and performance gains for multimodal transformers. Codes and dataset are available from this https URL
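The reweighting idea can be illustrated with a toy transform; the paper's actual RCA formula is not reproduced here. One simple way to "suppress extremes and amplify mid-level activations" is a power compression of each attention row followed by renormalisation:

```python
# Toy stand-in for the RCA idea (NOT the paper's exact formula):
# raising weights to a power gamma < 1 compresses the dominant
# weights and boosts the subdued ones; renormalising keeps the row
# a valid attention distribution.
import math

def rca_reweight(attn, gamma=0.5):
    """Map each weight w -> w**gamma, then renormalise to sum to 1."""
    powered = [w ** gamma for w in attn]
    total = sum(powered)
    return [w / total for w in powered]

attn = [0.70, 0.15, 0.10, 0.05]   # one row of final-layer attention
rew = rca_reweight(attn)           # dominant token shrinks, tail tokens grow
```

After the transform the top token's share falls while the subdued tokens' shares rise, which is the qualitative behaviour the abstract attributes to RCA.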
Factor analysis is a way to characterize the relationships between many observable variables in terms of a smaller number of unobservable random variables. However, the application of factor models and their success can be subjective or difficult to gauge, since the factor model is not identifiable. Thus, there is a need to operationalize a criterion that measures how meaningful or "interpretable" a factor model is. While there are already techniques that address interpretability, new indices and methods are proposed here to measure it. The proposed methods can directly incorporate both loadings and semantics, and are generalized to incorporate any "prior information". Moreover, the indices allow for complete or partial specification of relationships at a pairwise level. Two other main benefits of the proposed methods are that they do not require the estimation of factor scores, which avoids the factor score indeterminacy problem, and that no additional explanatory variables are necessary. The implementation of the proposed methods is written in Python 3 and is made available together with several helper functions through the package interpretablefa on the Python Package Index. The methods' application is demonstrated here using data on the Experiences in Close Relationships Scale, obtained from the Open-Source Psychometrics Project.
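One generic way to build a pairwise, prior-aware index of this kind can be sketched as follows. This is an illustration only, not the actual index implemented in interpretablefa: it correlates factor-implied variable similarity (cosine similarity of loading rows) with a hypothetical prior matrix encoding which variable pairs should belong together.

```python
# Illustrative pairwise interpretability score (NOT interpretablefa's
# actual index): compare loading-implied variable similarity against a
# prior pairwise matrix P of which variables "should" go together.
import math

L = [[0.9, 0.1],    # variables 1-2 load mostly on factor 1
     [0.8, 0.2],
     [0.1, 0.85],   # variables 3-4 load mostly on factor 2
     [0.2, 0.9]]
P = [[1, 1, 0, 0],  # hypothetical prior: {1,2} and {3,4} are related pairs
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu)**2 for a in u) * sum((b - mv)**2 for b in v))
    return num / den

n = len(L)
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
implied = [cosine(L[i], L[j]) for i, j in pairs]  # loading-implied similarity
prior = [P[i][j] for i, j in pairs]               # prior pairwise specification

index = pearson(implied, prior)   # near 1 => loadings agree with the prior
```

Note that, as the abstract emphasises, a score built this way needs only the loading matrix and the prior, with no factor scores or extra explanatory variables.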
In the present work, we further study the computational power of virus machines (VMs for short). VMs provide a computing paradigm inspired by the transmission and replication networks of viruses. VMs consist of process units (called hosts) structured by a directed graph whose arcs are called channels, and an instruction graph that controls the transmissions of virus objects among hosts. The present work complements our understanding of the computing power of VMs by introducing normal forms; these expressions restrict the features in a given computing model. Some of the features that we restrict in our normal forms include (a) the number of hosts, (b) the number of instructions, and (c) the number of virus objects in each host. After recalling some known results on the computing power of VMs, we give our series of normal forms, restricting features such as the size of the loops in the network, and prove new characterisations of families of sets, such as the finite sets, semilinear sets, and recursively enumerable sets (NRE).
We study the minimum number of minimal codewords in linear codes from the point of view of projective geometry. We derive bounds and in some cases determine the exact values. We also present an extension to minimal subcode supports.
Multilayer graphene with different stacking sequences has emerged as a powerful setting for correlated and topological phases. In parallel, progress in graphene heterostructures with magnetic or correlated materials, most notably the Kitaev candidate α-RuCl₃, has demonstrated charge transfer, magnetic proximity effects, and interfacial reconstruction, creating new opportunities for engineered quantum systems. Motivated by these developments, we explore a three-dimensional analogue in which α-RuCl₃ layers are inserted directly into the van der Waals gaps of graphite, forming an intercalated system. Here, we report the successful synthesis and comprehensive characterization of graphite intercalated with α-RuCl₃. Using a combination of X-ray diffraction, quantum oscillation measurements, and first-principles electronic structure calculations, we study the structural and electronic properties of these intercalated crystals. Our results demonstrate that graphite intercalated with α-RuCl₃ offers a robust route to develop three-dimensional materials with access to novel correlated and topological states.
We extend the classical Keplerian framework of existing analytic TDE models by incorporating the gravitational potential of a spherically symmetric galactic mass distribution. We then demonstrate that this broader structure imprints light curve features beyond the predictive scope of traditional models, such as phases of shallower-than-standard decay and late-time rebrightening episodes. Importantly, our framework predicts the occurrence of environment-induced rebrightenings but only on very long timescales, unless the host environment is unrealistically ultra-compact. This means the early evolution of TDEs occurring in typical galaxies is essentially untouched by the host potential, which explains why Keplerian models have been so successful in describing the first few years after disruption. To illustrate, we apply our model to the TDE candidate eRASSt J133157.9-324321 (J1331), the event with the longest reported rebrightening interval, and find that even matching its ~30-year rebrightening would demand an implausibly dense host. This demonstrates the limits of environmental effects as an explanation for early rebrightenings reported in the literature. More broadly, our work shows that while the host galaxy leaves TDEs nearly Keplerian at early times, it actively shapes their long-term evolution and can drive departures from the canonical t^{-5/3} decay law. These delayed signals give us a testable way to see how the host galaxy shapes the event, and they may even offer clues about the galaxy's underlying structure.
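For context, the canonical t^{-5/3} law mentioned in the abstract follows from a short Keplerian argument (a textbook sketch, not the paper's generalised derivation with the galactic potential): debris with orbital energy E < 0 returns to pericenter after one Keplerian period, and the energy distribution dM/dE is approximately flat near E = 0.

```latex
% Keplerian fallback: return time of a debris element with energy E < 0,
% from t = 2\pi a^{3/2}/\sqrt{G M_\bullet} and a = G M_\bullet / (2|E|):
t(E) = \frac{2\pi G M_\bullet}{(2|E|)^{3/2}}
\quad\Longrightarrow\quad
|E| = \frac{1}{2}\left(\frac{2\pi G M_\bullet}{t}\right)^{2/3},
% so |E| \propto t^{-2/3}, giving |dE/dt| \propto t^{-5/3}, and
\dot{M} = \frac{dM}{dE}\,\left|\frac{dE}{dt}\right| \propto t^{-5/3}
\quad \text{for a flat } \frac{dM}{dE}.
```

The paper's point is that adding an extended host potential modifies t(E), which is what drives the late-time departures from this scaling.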
The Rényi entanglement entropy is calculated exactly for mode-partitioned isolated systems such as the two-mode squeezed state and the multi-mode Silbey-Harris polaron ansatz state. Effective thermodynamic descriptions of the correlated partitions are constructed to present quantum information theory concepts in the language of thermodynamics. Boltzmann weights are obtained from the entanglement spectrum by deriving the exact relationship between an effective temperature and the physical entanglement parameters. The partition function of the resulting effective thermal theory can be obtained directly from the single-copy entanglement.
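The two-mode squeezed state case can be checked in a few lines. Its reduced state is thermal with entanglement spectrum p_n = (1-λ)λ^n, λ = tanh²r, a standard textbook result used here only to illustrate the effective-temperature mapping the abstract describes (the specific parameter values below are arbitrary).

```python
# Two-mode squeezed vacuum with squeezing r: the reduced single-mode
# state is thermal, p_n = (1 - lam) * lam**n with lam = tanh(r)**2.
# We compute the Renyi-2 entropy numerically, compare with its closed
# form, and read off an effective temperature from lam = exp(-w/T_eff).
import math

r, w = 1.2, 1.0                    # squeezing parameter, mode frequency (arbitrary)
lam = math.tanh(r)**2
probs = [(1 - lam) * lam**n for n in range(200)]   # entanglement spectrum

# Renyi-2 entropy: S_2 = -ln(sum_n p_n^2) = -ln((1-lam)/(1+lam)).
renyi2_num = -math.log(sum(p**2 for p in probs))
renyi2_exact = -math.log((1 - lam) / (1 + lam))

# Effective thermal description: Boltzmann weights e^{-w n / T_eff}
# reproduce the entanglement spectrum when lam = exp(-w / T_eff).
T_eff = -w / math.log(lam)
Z = 1.0 / (1.0 - math.exp(-w / T_eff))             # effective partition function
boltzmann = [math.exp(-w * n / T_eff) / Z for n in range(200)]
```

The exact agreement between `probs` and `boltzmann` is the sense in which the correlated partition admits an effective thermodynamic description.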
More than 30 years ago, Delandtsheer and Doyen showed that the automorphism group of a block-transitive 2-design, with blocks of size k, could leave invariant a nontrivial point-partition, but only if the number of points was bounded in terms of k. Since then examples have been found where there are two nontrivial point partitions, either forming a chain of partitions, or forming a grid structure on the point set. We show, by construction of infinite families of designs, that there is no limit on the length of a chain of invariant point partitions for a block-transitive 2-design. We introduce the notion of an 'array' of a set of points which describes how the set interacts with parts of the various partitions, and we obtain necessary and sufficient conditions in terms of the 'array' of a point set, relative to a partition chain, for it to be a block of such a design.
Precise and accurate estimation of cosmological parameters is crucial for understanding the Universe's dynamics and addressing cosmological tensions. In this methods paper, we explore bio-inspired metaheuristic algorithms, including the Improved Multi-Operator Differential Evolution scheme and the Philippine Eagle Optimization Algorithm (PEOA), alongside the more widely known genetic algorithm, for cosmological parameter estimation. Using mock data that underlay a true fiducial cosmology, we test the viability of each optimization method to recover the input cosmological parameters with confidence regions generated by bootstrapping on top of optimization. We compare the results with Markov chain Monte Carlo (MCMC) in terms of accuracy and precision, and show that PEOA performs comparably well under the specific circumstances provided. Understandably, Bayesian inference and optimization serve distinct purposes, but comparing them highlights the potential of nature-inspired algorithms in cosmological analysis, offering alternative pathways to explore parameter spaces and validate standard results.
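The "bootstrapping on top of optimization" procedure can be sketched generically. In the snippet below a toy stochastic optimiser stands in for PEOA or differential evolution, and a one-parameter linear model stands in for the cosmology; none of this is the paper's actual pipeline.

```python
# Generic sketch of bootstrap-on-top-of-optimization (toy optimiser,
# NOT PEOA): refit a one-parameter model on resampled data to build a
# confidence region around the point estimate.
import random

random.seed(7)
a_true = 2.5
xs = [i / 10 for i in range(50)]
ys = [a_true * x + random.gauss(0, 0.3) for x in xs]   # mock data
data = list(zip(xs, ys))

def loss(a, pairs):
    """Sum of squared residuals of y = a*x on the given pairs."""
    return sum((y - a * x)**2 for x, y in pairs)

def optimise(pairs, iters=200):
    """Toy stochastic optimiser: accept-if-better random perturbations."""
    best_a = random.uniform(0, 5)
    best_l = loss(best_a, pairs)
    for _ in range(iters):
        cand = best_a + random.gauss(0, 0.1)
        cand_l = loss(cand, pairs)
        if cand_l < best_l:
            best_a, best_l = cand, cand_l
    return best_a

# Bootstrap: re-optimise on resampled data; the spread of the refits
# plays the role of the confidence region.
fits = sorted(optimise([random.choice(data) for _ in data]) for _ in range(100))
lo, hi = fits[2], fits[97]        # approximate 95% bootstrap interval
```

The spread of `fits` (here summarised by `lo` and `hi`) is the optimization analogue of an MCMC posterior width, which is what the paper compares against.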
The online assignment problem plays an important role in operations research and computer science, which is why immense attention has been given to improving its solution quality. Because the input is revealed incompletely, it is difficult for online algorithms to produce the optimal solution; the quality of an online algorithm's solution is measured by its competitive ratio. No deterministic online algorithm can achieve a competitive ratio better than (2n-1). It has been shown that advice in online computation improves the lower bound of the competitive ratio of online problems; such advice can be interpreted as additional information that compensates for the lack of knowledge about the whole input sequence. In this study, we investigate how introducing machine-learned advice could improve the competitive ratio for this problem. We provide an online algorithm for the online assignment problem by simulating a machine learning algorithm that predicts the whole input in advance, and we utilize an optimal offline algorithm to produce a matching from the predicted input. Furthermore, we investigate how the prediction error of the machine learning model affects the competitive ratio of the online algorithm, using a benchmark data set for our empirical analysis. We show that the solution quality decreases as the prediction error increases, and that the magnitude of the error is directly proportional to the size of the input. This result is analogous to the competitive ratio of the best deterministic algorithm for the online assignment problem, which also depends on the parameter n.
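The online-versus-offline gap measured by the competitive ratio can be demonstrated on a small instance. The snippet uses a generic greedy online rule against a brute-force offline optimum; it is not the paper's advice-augmented algorithm, just the baseline comparison that motivates it.

```python
# Online vs. offline assignment on a random cost matrix: a greedy
# online rule (each arriving task takes the cheapest free server)
# against the offline optimum found by brute force over permutations.
import itertools
import random

random.seed(1)
n = 6
cost = [[random.randint(1, 100) for _ in range(n)] for _ in range(n)]

def online_greedy(cost):
    """Tasks arrive one by one; each takes the cheapest remaining server."""
    free = set(range(n))
    total = 0
    for task in range(n):
        server = min(free, key=lambda s: cost[task][s])
        free.remove(server)
        total += cost[task][server]
    return total

def offline_optimal(cost):
    """Brute-force minimum-cost perfect matching (fine for small n)."""
    return min(sum(cost[t][perm[t]] for t in range(n))
               for perm in itertools.permutations(range(n)))

ratio = online_greedy(cost) / offline_optimal(cost)   # always >= 1
```

An advice-augmented algorithm as described in the abstract would replace the greedy rule with the offline optimum computed on the *predicted* input, so its cost (and hence its ratio) degrades gracefully with the prediction error.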
We reintroduce the parafermion-paraboson classification of R-paraparticles in terms of their average occupation numbers, analogous to Green's parastatistics. The notion of p-order in R-parafermions is also redefined as the maximum number of particles that can occupy a quantum state. An example of an order-2 R-parafermion with m = 2 internal degrees of freedom is presented, which obeys an exclusion principle that is not Pauli's. Interacting R-parafermions are studied in the context of bosonization. Specifically, we show that while density waves are generally bosonic in nature and flavor-charge separation naturally occurs for any one-dimensional R-parafermion system described by the Luttinger model, flavor waves do not always satisfy Bose statistics. Comparison of the partition functions further shows that only (p=1)-ordered R-parafermions are compatible with the bosonization procedure in the low-energy limit. Based on these results, we discuss a potential realization of parafermion signatures in one-dimensional systems.
Gaussian processes offer a convenient way to perform nonparametric reconstructions of observational data, assuming only a kernel that describes the covariance between neighbouring points in a data set. We approach the ambiguity in the choice of kernel with two methods -- (a) approximate Bayesian computation with sequential Monte Carlo sampling and (b) a genetic algorithm -- and use the resulting method to reconstruct the cosmic chronometers and type Ia supernovae data sets. The results show that the Matérn(ν = 5/2) kernel emerges on top of the two-hyperparameter family of kernels for both cosmological data sets. We then use the genetic algorithm to select the most natural kernel from a competitive pool drawn from a ten-hyperparameter class of kernels. With a fitness measure inspired by the Bayesian information criterion, the results show that a hybrid of the Radial Basis Function and Matérn(ν = 5/2) kernels best represents both data sets. The kernel selection problem is not fully closed and may benefit from further analysis using other strategies to determine an optimal kernel for a particular data set.
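A minimal Matérn(ν = 5/2) GP reconstruction can be written with no dependencies. The snippet below uses toy data standing in for the cosmic-chronometer compilation, with hyperparameters fixed rather than optimised; it shows only the kernel and posterior mean, not the paper's kernel-selection machinery.

```python
# Minimal GP regression with the Matern(nu=5/2) kernel singled out in
# the abstract (pure-Python linear algebra; toy data, fixed
# hyperparameters sigma_f = ell = 1).
import math

def matern52(x1, x2, sigma_f=1.0, ell=1.0):
    """Matern(nu=5/2): sigma_f^2 (1 + s + s^2/3) e^{-s}, s = sqrt(5)|dx|/ell."""
    s = math.sqrt(5.0) * abs(x1 - x2) / ell
    return sigma_f**2 * (1.0 + s + s**2 / 3.0) * math.exp(-s)

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    m = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for i in range(m):
        piv = max(range(i, m), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, m):
            f = M[r][i] / M[i][i]
            for c in range(i, m + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        x[i] = (M[i][m] - sum(M[i][c] * x[c] for c in range(i + 1, m))) / M[i][i]
    return x

xs = [0.25 * i for i in range(9)]         # toy inputs on [0, 2]
ys = [math.sin(x) for x in xs]            # toy "observations"
noise = 1e-4                              # observational noise variance
K = [[matern52(xi, xj) + (noise if i == j else 0.0)
      for j, xj in enumerate(xs)] for i, xi in enumerate(xs)]
alpha = solve(K, ys)                      # alpha = (K + noise*I)^{-1} y

def predict(x):
    """GP posterior mean at x: k(x, X) @ alpha."""
    return sum(matern52(x, xi) * a for xi, a in zip(xs, alpha))
```

Swapping `matern52` for another covariance function is all that changes between candidate kernels, which is why the choice can be handed to a search procedure such as a genetic algorithm.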
There is a desire to observe the Sun's poles to further deepen our understanding of solar activity. However, because of the large speeds needed for out-of-ecliptic maneuvers, chemical and electric rocket propulsion has proven costly and impractical, leaving alternative systems such as solar sails to be considered for these applications. In this paper, we study the possibility of using a solar sail as a probe observing the Sun. We design and optimize the trajectories of the solar sail probe through the surface constraint approach, under the assumption that the sail moves on a displaced spherical surface. We first review the surface constraint approach, focusing on its important assumptions and limitations. We then solve for a family of radial and azimuthal trajectory equations by choosing the correct constraint equation, and characterize the trajectories by the functional dependence of the sail's orientation on the polar angle. Finally, we determine the probe trajectories that minimize the flight time. Results show that increasing the number of mission stages decreases the total flight time, with minimal changes in the sail's radial and polar velocities. Furthermore, changing the functional dependence of the clock angle resets the azimuthal velocity, so that the sail does not reverse direction and instead approaches the Sun directly along the spherical surface.
This paper presents UD-NewsCrawl, the largest Tagalog treebank to date, containing 15.6k trees manually annotated according to the Universal Dependencies framework. We detail our treebank development process, including data collection, pre-processing, manual annotation, and quality assurance procedures. We provide baseline evaluations using multiple transformer-based models to assess the performance of state-of-the-art dependency parsers on Tagalog. We also highlight challenges in the syntactic analysis of Tagalog given its distinctive grammatical properties, and discuss its implications for the annotation of this treebank. We anticipate that UD-NewsCrawl and our baseline model implementations will serve as valuable resources for advancing computational linguistics research in underrepresented languages like Tagalog.
This paper presents a framework that uses generative AI agents to simulate human sentiment responses by embedding them with psychological profiles derived from real survey data. The approach shifts from retrospective sentiment classification to prospective sentiment simulation, achieving up to 92% accuracy in reproducing human survey responses and 81-86% accuracy in predicting sentiment towards new scenarios.
Road transport infrastructure is critical for safe, fast, economical, and reliable mobility across the whole country, which in turn supports a productive society. However, roads deteriorate over time due to natural environmental causes and repeated traffic loads. Pavement Distress (PD) detection is essential for monitoring the current condition of public roads to enable targeted rehabilitation and preventive maintenance. Nonetheless, distress detection surveys are still done via manual inspection in developing countries such as the Philippines. This study proposes the use of deep learning for two ways of recording pavement distresses from 2D RGB images: detection and segmentation. YOLOv4 is used for pavement distress detection, while DeepLabv3 is employed for pavement distress segmentation, on a small dataset of pavement images from the Philippines. This study aims to provide a basis for building a cheap, scalable, and automated end-to-end solution for PD detection in the country.